CSCE641: Computer Graphics
Image Formation
Jinxiang Chai
Are They Images?
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic function
Color Representation
• Why do we use RGB to encode pixel color?
• Can we use RGB to represent all colors?
• What are other color representations?
Human Vision
Model of human vision
Vision components:
• Incoming light
• Human eye
Electromagnetic Spectrum
Visible light frequencies range between:
– Red: 4.3×10¹⁴ hertz (700 nm)
– Violet: 7.5×10¹⁴ hertz (400 nm)
Visible Light
The human eye can see “visible” light in the wavelength range between 400 nm and 700 nm
- Not strict boundary
- Some colors are absent (brown, pink)
Spectral Energy Distribution
Three different types of lights
Can we use the spectral energy distribution to represent color?
- Not really: different distributions can produce the same perceived color (metamers)!
Spectral Energy Distribution
The six spectra below look like the same purple to people with normal color vision.
Color Representation?
Why aren't all ranges of the light spectrum perceived as distinct colors?
So how should we represent color? A good representation is:
- unique
- compact
- works for as many visible lights as possible
Human Vision
Photoreceptor cells in the retina:
- Rods
- Cones
Light Detection: Rods and Cones
Rods:
- ~120 million rods in the retina
- ~1000× more light-sensitive than cones
- Discriminate black/white brightness in low illumination
- Short-wavelength sensitive

Cones:
- 6-7 million cones in the retina
- Responsible for high-resolution vision
- Discriminate colors
- Three types of color sensors (64% red, 32% green, 2% blue)
- Sensitive to any combination of the three colors
Tristimulus of Color Theory
Spectral-response functions of each of the three types of cones
Can we use them to match any spectral color?
Color matching functions based on RGB:
- any spectral color can be represented as a linear combination of these primary colors
Tristimulus Color Theory
So, color is psychological:
- Representing color as a linear combination of red, green, and blue is tied to the cones, not to physics
- Most people have the same cones, but some people don't: the sky might not look blue to them (although they will call it "blue" nonetheless)
- Many people (mostly men) are colorblind, missing 1, 2, or 3 cone types (and can buy cheaper TVs)
Additive and Subtractive Color
RGB color model CMY color model
Complementary color models: R = 1-C; G = 1-M; B = 1-Y

RGB: White = [1 1 1]ᵀ, Green = [0 1 0]ᵀ
CMY: White = [0 0 0]ᵀ, Green = [1 0 1]ᵀ
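The complementary relation above can be sketched in Python (a minimal illustration of the model, not tied to any particular graphics library):

```python
# RGB (additive) <-> CMY (subtractive) via the complementary relation C = 1 - R, etc.
def rgb_to_cmy(r, g, b):
    """Convert RGB (each channel in [0, 1]) to CMY."""
    return (1 - r, 1 - g, 1 - b)

def cmy_to_rgb(c, m, y):
    """Convert CMY (each channel in [0, 1]) back to RGB."""
    return (1 - c, 1 - m, 1 - y)

print(rgb_to_cmy(1, 1, 1))  # white in RGB -> (0, 0, 0) in CMY
print(rgb_to_cmy(0, 1, 0))  # green in RGB -> (1, 0, 1) in CMY
```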
RGB Color Space
RGB cube (axes: red, green, blue)
– Easy for devices
– Can it represent all the colors?
– But not perceptual
– Where are brightness, hue, and saturation?
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic function
Image Representation
An image is a 2D rectilinear array of pixels
- a width × height array where each entry of the array stores a single pixel
(figure: a 5×5 picture, with one pixel highlighted)
Image Representation
A pixel stores color information

Luminance pixels:
- gray-scale (intensity) images
- values 0-1.0 or 0-255
- 8 bits per pixel

Red, green, blue (RGB) pixels:
- color images
- each channel: 0-1.0 or 0-255
- 24 bits per pixel
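The two pixel formats above can be sketched with NumPy arrays (a minimal illustration; the 5×5 image size is an arbitrary assumption):

```python
import numpy as np

# Gray-scale image: one 8-bit luminance value per pixel (0-255).
gray = np.zeros((5, 5), dtype=np.uint8)   # height x width
gray[2, 3] = 255                          # one white pixel

# RGB image: three 8-bit channels per pixel (24 bits total).
rgb = np.zeros((5, 5, 3), dtype=np.uint8)
rgb[2, 3] = (255, 255, 255)               # same pixel, now stored as (R, G, B)

print(gray.nbytes)  # 25 bytes: 8 bits per pixel
print(rgb.nbytes)   # 75 bytes: 24 bits per pixel
```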
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic Function
How Do We See the World?
Let’s design a camera. Idea 1: put a piece of film in front of an object.
Do we get a reasonable picture?
Pin-hole Camera
• Add a barrier to block off most of the rays
– This reduces blurring
– The opening is known as the aperture
– How does this transform the image?
Camera Obscura
• The first camera
– Known to Aristotle
– Depth of the room is the focal length
– Pencil of rays: all rays through a point
Camera Obscura
How does the aperture size affect the image?
Shrinking the Aperture
• Why not make the aperture as small as possible?
– Less light gets through
– Diffraction effects…
Shrink the Aperture: Diffraction
A diffuse circular disc appears!
The Reason for Lenses
Adding a Lens
• A lens focuses light onto the film
– There is a specific distance at which objects are “in focus”
• other points project to a “circle of confusion” in the image
– Changing the shape of the lens changes this distance
Changing Lenses
(example shots at focal lengths of 28 mm, 50 mm, 70 mm, and 210 mm)
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic Function
Projection Matrix
• What’s the geometric relationship between 3D objects and 2D images?
Modeling Projection: 3D->2D
The coordinate system:
– We will use the pin-hole model as an approximation
– Put the optical center (Center Of Projection, COP) at the origin
– Put the image plane (Projection Plane, PP) in front of the COP
– The camera looks down the negative z axis
Modeling Projection: 3D->2D
Projection equations:
– Compute the intersection with the PP of the ray from (x, y, z) to the COP
– Derived using similar triangles (on board)
– We get the projection by throwing out the last coordinate:
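The equations themselves appear only as figures in the original slides; assuming the image plane sits at z = -d (with the camera looking down the negative z axis, as set up above), the similar-triangles result is:

```latex
(x, y, z) \;\mapsto\; \Big(-d\,\tfrac{x}{z},\; -d\,\tfrac{y}{z},\; -d\Big)
\;\mapsto\; \Big(-d\,\tfrac{x}{z},\; -d\,\tfrac{y}{z}\Big)
```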
Homogeneous Coordinates
Is this a linear transformation?
– No: division by z is nonlinear
Trick: add one more coordinate (homogeneous image coordinates; homogeneous scene coordinates)
Converting from homogeneous coordinates
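The coordinate definitions are figures in the original; the standard forms, including the conversion back by dividing through by the added coordinate, are:

```latex
(x, y) \Rightarrow \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad
(x, y, z) \Rightarrow \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}, \qquad
\begin{bmatrix} x \\ y \\ w \end{bmatrix} \Rightarrow \big(x/w,\; y/w\big)
```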
Perspective Projection
Projection is a matrix multiply using homogeneous coordinates, followed by a divide by the third coordinate:
This is known as perspective projection
– The matrix is the projection matrix
– Can also be formulated as a 4×4 matrix (then divide by the fourth coordinate)
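A minimal numeric sketch of the 3×4 projection matrix described above (the focal distance d and the test point are assumptions; the -z viewing convention matches the slides):

```python
import numpy as np

d = 1.0  # distance from COP to the image plane (assumed)

# 3x4 perspective projection matrix: maps homogeneous scene
# coordinates [x, y, z, 1] to homogeneous image coordinates.
P = np.array([[1, 0,    0,  0],
              [0, 1,    0,  0],
              [0, 0, -1/d,  0]], dtype=float)

point = np.array([2.0, 4.0, -8.0, 1.0])  # scene point in camera coordinates
u, v, w = P @ point                      # homogeneous image coordinates
x_img, y_img = u / w, v / w              # divide by the third coordinate

print(x_img, y_img)  # 0.25 0.5, i.e. (-d*x/z, -d*y/z)
```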
Perspective Effects
Distant objects appear smaller
Items viewed at an angle appear distorted (spatial foreshortening)
Orthographic Projection
Special case of perspective projection:
– Distance from the COP to the PP is infinite
– Also called “parallel projection”
– What’s the projection matrix?
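The matrix the slide asks for does not survive in this extract; the standard orthographic projection in homogeneous coordinates simply drops z:

```latex
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \sim
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
```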
Weak-perspective Projection
Scaled orthographic projection:
- object size is small compared to the average distance z₀ from the camera (e.g., σ_z < z₀/20)
- so d/z ≈ d/z₀ (constant)
Projection matrix (with λ = d/z₀):

[ λ  0  0  0 ]
[ 0  λ  0  0 ]
[ 0  0  0  1 ]
View Transformation
From world coordinate to camera coordinate
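The transformation itself is a figure in the original; in the usual rigid-body form, a world point P_w maps into camera coordinates via a rotation R and translation t (symbols assumed here):

```latex
P_c = R\,P_w + t
\qquad\text{or, in homogeneous form,}\qquad
\begin{bmatrix} P_c \\ 1 \end{bmatrix} =
\begin{bmatrix} R & t \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\begin{bmatrix} P_w \\ 1 \end{bmatrix}
```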
Viewport Transformation
From projection coordinates (x, y) to image coordinates (u, v), with image-space origin (u0, v0):

[ u ]   [ sx   0   u0 ] [ x ]
[ v ] = [ 0   -sy  v0 ] [ y ]
[ 1 ]   [ 0    0   1  ] [ 1 ]
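A minimal sketch of the viewport mapping (the scales sx, sy and origin (u0, v0) are illustrative assumptions):

```python
import numpy as np

# Viewport transformation: projection-plane coords (x, y) -> pixel coords (u, v).
# sx, sy are pixels-per-unit scales; (u0, v0) is the image-space origin.
sx, sy, u0, v0 = 100.0, 100.0, 320.0, 240.0

V = np.array([[sx,   0.0, u0],
              [0.0, -sy,  v0],   # y is flipped: image rows grow downward
              [0.0,  0.0, 1.0]])

x, y = 0.5, 0.25                 # point on the projection plane
u, v, _ = V @ np.array([x, y, 1.0])
print(u, v)  # 370.0 215.0
```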
Putting It Together
From world coordinates to image coordinates:

[ u ]   [ sx   0   u0 ]
[ v ] = [ 0   -sy  v0 ] × (perspective projection) × (view transformation) × P_world
[ 1 ]   [ 0    0   1  ]
      viewport transformation

– Viewport transformation: determined by image resolution and aspect ratio
– Perspective projection: determined by focal length
– View transformation: the relative position & orientation between camera and objects
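The full pipeline above can be sketched end to end (all numeric values, including R, t, d, and the viewport parameters, are illustrative assumptions):

```python
import numpy as np

# --- View transformation: world -> camera coordinates (R = identity assumed).
R = np.eye(3)
t = np.array([0.0, 0.0, -5.0])        # push the scene 5 units down -z
p_world = np.array([1.0, 2.0, 0.0])
p_cam = R @ p_world + t               # -> [1, 2, -5]

# --- Perspective projection onto the plane z = -d.
d = 1.0
P = np.array([[1, 0,    0,  0],
              [0, 1,    0,  0],
              [0, 0, -1/d,  0]], dtype=float)
uh = P @ np.append(p_cam, 1.0)        # homogeneous image coordinates
x, y = uh[0] / uh[2], uh[1] / uh[2]   # divide by the third coordinate

# --- Viewport transformation: projection plane -> pixel coordinates.
sx, sy, u0, v0 = 100.0, 100.0, 320.0, 240.0
V = np.array([[sx,   0.0, u0],
              [0.0, -sy,  v0],
              [0.0,  0.0, 1.0]])
u, v, _ = V @ np.array([x, y, 1.0])
print(u, v)  # 340.0 200.0
```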
Camera Parameters
In total, 11 parameters:
– Intrinsic camera parameters (5): the viewport and projection terms (sx, sy, u0, v0, focal length)
– Extrinsic camera parameters (6): the rotation and translation of the view transformation
How about this image?
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic function
Plenoptic Function
What is the set of all things that we can ever see?
- The Plenoptic Function (Adelson & Bergen)
Let’s start with a stationary person and try to parameterize everything that he can see…
Plenoptic Function
Any ray seen from a single view point can be parameterized by (θ,φ).
Color Image
P(θ, φ, λ) is the intensity of light
– seen from a single viewpoint (θ, φ)
– at a single time t
– as a function of wavelength λ
Dynamic Scene
P(θ, φ, λ, t) is the intensity of light
– seen from a single viewpoint (θ, φ)
– over time t
– as a function of wavelength λ
Moving around A Static Scene
P(x, y, z, θ, φ, λ) is the intensity of light
– seen from an arbitrary viewing direction (θ, φ)
– at an arbitrary location (x, y, z)
– at a single time t
– as a function of wavelength λ
Moving around A Dynamic Scene
P(x, y, z, θ, φ, λ, t) is the intensity of light
– seen from an arbitrary viewing direction (θ, φ)
– at an arbitrary location (x, y, z)
– over time t
– as a function of wavelength λ
Plenoptic Function
The plenoptic function can reconstruct every possible view, at every moment, from every position, at every wavelength.
It contains every photograph, every movie, everything that anyone has ever seen. It completely captures our visual reality!
An image is a 2D sample of the plenoptic function!
P(x,y,z,θ,φ,λ,t)
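To make “an image is a 2D sample of the plenoptic function” concrete, here is a toy sketch: a hypothetical P that varies only with direction, sampled over a grid of (θ, φ) from one fixed position, wavelength, and time. The function and the grid size are assumptions for illustration.

```python
import math

# Toy plenoptic function: intensity depends only on viewing direction here
# (a stand-in; the real P also varies with position, wavelength, and time).
def P(x, y, z, theta, phi, lam, t):
    return 0.5 + 0.5 * math.sin(3 * theta) * math.cos(2 * phi)

# An image is a 2D sample of P: fix (x, y, z), lambda, and t,
# and sweep a grid of directions (theta, phi).
W, H = 8, 6
image = [[P(0, 0, 0,
            theta=math.pi * col / W,
            phi=math.pi * row / H,
            lam=550e-9, t=0.0)
          for col in range(W)]
         for row in range(H)]

print(len(image), len(image[0]))  # 6 8
```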
How to “Capture” Orthographic Images
Rebinning rays forms orthographic images
Multi-perspective Images
Rebinning rays forms multi-perspective images
Outline
• Color representation
• Image representation
• Pin-hole Camera
• Projection matrix
• Plenoptic Function
They Are All Images
Next lecture:
• Image sampling theory
• Fourier analysis
Recommended