Computer Vision: Intro

Computer Vision: Intro
What are its goals?
What are the applications?
What are some ways of using images?
(Later: methods and programming)

Finding ships in an aerial photograph

A corresponding map

Dock area
Registered map and image

Finding a kidney in computer-aided tomographic scan

Prototype kidney model and model fitting

Resulting kidney and spinal cord instances

Goal of computer vision
Make useful decisions about real physical objects and scenes based on sensed images.
Alternative (Aloimonos and Rosenfeld): goal is the construction of scene descriptions from images. (Read reference S1)
How do you find the door to leave?
How do you determine if a person is friendly or hostile? ... an elder? ... a possible mate?

Critical Issues
Sensing: how do sensors obtain images of the world?
Information: how do we obtain color, texture, shape, motion, etc.?
Representations: what representations should/does a computer [or brain] use?
Algorithms: what algorithms process image information and construct scene descriptions?

Images: 2D projections of 3D
3D world has color, texture, surfaces, volumes, light sources, objects, motion, betweenness, adjacency, connections, etc.
2D image is a projection of a scene from a specific viewpoint; many 3D features are captured, some not.
Brightness or color = g(x,y) or f(row, column) for a certain instant of time
Images indicate familiar people, moving objects or animals, health of people or machines

Image receives reflections
Light reaches surfaces in 3D
Surfaces reflect
Sensor element receives light energy
Intensity counts
Angles count
Material counts

CCD camera has discrete elements
Lens collects light rays
CCD elements replace chemicals of film
Number of elements is less than with film (so far)

Resolution is “pixels per unit of length”

Resolution decreases by one half in cases at left
Human faces can be recognized at 64 x 64 pixels per face
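As a rough illustration of what halving the resolution does to the pixel data, the sketch below replaces each 2 x 2 block of a grayscale image with its average. The function name `halve_resolution` and the choice of block averaging (rather than simply keeping every other pixel) are assumptions made for this example; the slides do not say how the reduced images were produced.

```python
def halve_resolution(image):
    """Return a copy of a grayscale image with half the rows and columns.

    `image` is a list of equal-length rows of integer intensities.
    Each output pixel is the integer average of one 2 x 2 input block;
    an odd trailing row or column is dropped.
    """
    out_rows = len(image) // 2
    out_cols = len(image[0]) // 2
    reduced = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            block_sum = (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
                         + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1])
            row.append(block_sum // 4)
        reduced.append(row)
    return reduced


sample = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 150, 8, 8],
    [50, 150, 8, 8],
]
half = halve_resolution(sample)
```

Applying the function repeatedly gives the kind of sequence shown on the slide: a 256 x 256 face image becomes 128 x 128, then 64 x 64, and so on, with fine detail averaged away at each step.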

Features detected depend on the resolution

Can tell hearts from diamonds
Can tell face value
Generally need 2 pixels across line or small region (such as eye)

Camera + Programs = Display
Camera inputs to frame buffer
Program can interpret data
Program can add graphics
Program can add imagery

Computer Vision-Rosenfeld

Vision
The most powerful sense for many living organisms
Fields related to visual perception: physiology, psychology, computational and robot vision, and engineering
Large part of the human brain is devoted to visual perception

Computational algorithms

Do you see it as it is?

Totally straight line

How many black dots?

The Goal of Image Understanding
We use vision to interact with our environments and survive
Perform visual tasks: engage in many kinds of behaviors that are guided by visual inputs
David Marr: construction of a detailed representation of the physical world

Transform 2D data into a description of the 3D spatiotemporal world

Scene Recovery
What properties of a scene can be recovered by means of vision?
Inverse optics problem

Optics: maps the world into the image
Vision: attempts to invert the optical map
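One way to see why inverting the optical map is hard is to look at a simple pinhole camera. The pinhole model, the function name `project`, and the `focal_length` parameter are assumptions introduced for this sketch; the slides do not commit to a particular camera model. Under that model, every 3D point along a ray through the camera center lands on the same image point, so depth cannot be read back from a single image without extra knowledge.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera looking along the +z axis.
    Returns the 2D image coordinates (u, v).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)   # twice as far away along the same ray

# Both points map to the same image location, so the forward map
# (optics) is many-to-one and has no unique inverse.
same_image_point = project(near_point) == project(far_point)
```

Here both points project to (0.25, 0.5), which is why scene recovery needs additional constraints such as prior knowledge of shape, multiple views, or motion.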

Computer vision and other fields

http://en.wikipedia.org/wiki/Computer_vision

A Range of representations
Generalized images

Iconic (image-like)
Low-level processing

A Range of representations

Segmented images

Edge segmentation

A Range of representations
Geometric representation

3D shape
Prior knowledge

A Range of representations
Relational models

Semantic nets
High-level processing

Look at some CV applications

Graphics or image retrieval systems; Geographical: GIS;

Medical image analysis; manufacturing

Image Database Search
Company wants a new logo
Make several designs
Search logo database for infringement

Aerial images & GIS
Aerial image of Wenatchee River watershed
Can correspond to map; can inventory snow coverage

Medical imaging is critical

Visible human project at NLM
Atlas for comparison
Testbed for methods

Medical Imaging

CT image of a patient’s abdomen

3D reconstruction of Blood Vessel Tree

Medical Imaging

Cardiac tagged MRIs

Manufacturing case
100% inspection needed
Quality demanded by major buyer
Assembly line updated for visual inspection well before today’s powerful computers

Simple Hole Counting Algorithm
Customer needs 100% inspection
About 100 holes
Big problem if any hole missing
Implementation in the 70’s
Algorithm also good for counting objects

Imaging added to line
Camera placed above conveyor line
Back lighting added
One image dimension comes from the motion of the object past the camera

Critical “corner patterns”
An “external corner” is a 2 x 2 pixel pattern with three 1s and one 0
An “internal corner” is a 2 x 2 pixel pattern with three 0s and one 1
Holes are computed from only these patterns!

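Hole (Object) Counting Algorithm

#holes = (#e - #i)/4, where #e is the number of external corner patterns and #i is the number of internal corner patterns found in the image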
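The corner-counting formula can be sketched in a few lines. The sketch assumes a binary image stored as a list of rows, with 1 for object material and 0 for hole pixels, holes that do not touch the image border, and no two holes touching diagonally; the function name `count_holes` is chosen for this example only.

```python
def count_holes(image):
    """Count holes in a binary image using 2 x 2 corner patterns.

    A window with three 1s and one 0 is an external corner (e);
    a window with three 0s and one 1 is an internal corner (i).
    The number of holes is (e - i) / 4.
    """
    external = 0
    internal = 0
    for r in range(len(image) - 1):
        for c in range(len(image[0]) - 1):
            ones = (image[r][c] + image[r][c + 1]
                    + image[r + 1][c] + image[r + 1][c + 1])
            if ones == 3:
                external += 1
            elif ones == 1:
                internal += 1
    return (external - internal) // 4


plate = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
holes_found = count_holes(plate)
```

For `plate`, the square hole and the single-pixel hole each contribute four external corners and no internal corners, so the result is (8 - 0) / 4 = 2. An L-shaped hole adds one internal corner and one extra external corner, which cancel, so it still counts as one hole.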

Variations on Algorithm
Easy if entire image is in memory
Only need to have 2 rows in memory at any time

* used in the 1970’s
* can allow special hardware
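The two-row variation suits a line-scan setup in which rows arrive one at a time as the part moves past the camera. The sketch below is one possible way to write it, assuming rows are delivered by any Python iterable of equal-length lists of 0s and 1s; the name `count_holes_streaming` is hypothetical. Only the previous and current rows are held at any moment.

```python
def count_holes_streaming(rows):
    """Count holes from a stream of binary image rows.

    Keeps just two rows in memory: each new row is paired with the
    previous one and every 2 x 2 window across the pair is classified
    as an external corner (three 1s), an internal corner (one 1),
    or neither.
    """
    external = 0
    internal = 0
    previous = None
    for current in rows:
        if previous is not None:
            for c in range(len(current) - 1):
                ones = (previous[c] + previous[c + 1]
                        + current[c] + current[c + 1])
                if ones == 3:
                    external += 1
                elif ones == 1:
                    internal += 1
        previous = current
    return (external - internal) // 4


def scan_lines():
    """Stand-in for a camera that yields one scan line at a time."""
    yield [1, 1, 1, 1, 1, 1, 1]
    yield [1, 0, 0, 1, 1, 1, 1]
    yield [1, 0, 0, 1, 0, 1, 1]
    yield [1, 1, 1, 1, 1, 1, 1]


streamed_count = count_holes_streaming(scan_lines())
```

Because each window needs only simple comparisons and two counters, the same logic maps naturally onto fixed-function hardware, which is consistent with the note about special hardware.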

Some other methods

Finding contrast in an image; using neighborhoods of pixels;

detecting motion across 2 images

Differentiate to find object edges

For each pixel, compute its contrast
Can use max difference of its 8 neighbors
Detects intensity change across boundary of adjacent regions
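A minimal sketch of this contrast measure follows. It reads "max difference" as the largest absolute difference between a pixel and any of its 8 neighbors, which is one plausible interpretation; the function name `contrast_map` and the handling of border pixels (neighbors outside the image are skipped) are assumptions of this example.

```python
def contrast_map(image):
    """Return, for each pixel, the maximum absolute intensity difference
    between that pixel and its neighbors in the surrounding 3 x 3 window.

    `image` is a list of equal-length rows of integer intensities.
    Neighbors that fall outside the image are ignored.
    """
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the center pixel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        best = max(best, abs(image[r][c] - image[rr][cc]))
            result[r][c] = best
    return result


# A dark region (10) next to a bright region (50).
two_regions = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]
edges = contrast_map(two_regions)
```

In `edges`, the two columns on either side of the region boundary hold the value 40 and the outer columns hold 0, so thresholding the contrast map picks out the boundary between the adjacent regions.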

4 and 8 neighbors of a pixel
4 neighbors are at multiples of 90 degrees

.  N  .
W  *  E
.  S  .

8 neighbors are at every multiple of 45 degrees

NW N  NE
W  *  E
SW S  SE
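The two neighborhoods are often written as lists of (row, column) offsets. The sketch below shows one such encoding; the names `OFFSETS_4`, `OFFSETS_8`, and `neighbors` are chosen for this example, and rows are assumed to increase downward so that north is row - 1.

```python
# Offsets as (row change, column change); north is one row up.
OFFSETS_4 = [(-1, 0), (0, 1), (1, 0), (0, -1)]                   # N, E, S, W
OFFSETS_8 = OFFSETS_4 + [(-1, 1), (1, 1), (1, -1), (-1, -1)]     # + NE, SE, SW, NW


def neighbors(row, col, rows, cols, connectivity=4):
    """List the in-bounds 4- or 8-neighbors of pixel (row, col)
    in an image with `rows` rows and `cols` columns."""
    if connectivity == 4:
        offsets = OFFSETS_4
    elif connectivity == 8:
        offsets = OFFSETS_8
    else:
        raise ValueError("connectivity must be 4 or 8")
    found = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        if 0 <= r < rows and 0 <= c < cols:
            found.append((r, c))
    return found


center_4 = neighbors(1, 1, 3, 3, connectivity=4)
corner_8 = neighbors(0, 0, 3, 3, connectivity=8)
```

An interior pixel has all 4 (or 8) neighbors, while a corner pixel such as (0, 0) keeps only those that lie inside the image: east, south, and southeast.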

Detect Motion via Subtraction
Constant background
Moving object
Produces pixel differences at boundary
Reveals moving object and its shape
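Subtraction of two frames can be sketched as follows, assuming two grayscale frames of the same size taken with a fixed camera. The function name `motion_mask` and the `threshold` parameter (used to ignore small differences caused by noise) are assumptions of this example.

```python
def motion_mask(frame_a, frame_b, threshold=20):
    """Mark pixels whose intensity changed by more than `threshold`
    between two equally sized grayscale frames.

    Returns a binary image: 1 where a change was detected, 0 elsewhere.
    """
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# A uniform bright object (100) moves one pixel to the right
# over a constant dark background (0).
before = [[0, 100, 100, 0, 0]]
after = [[0, 0, 100, 100, 0]]
changed = motion_mask(before, after)
```

The result is `[[0, 1, 0, 1, 0]]`: the background and the interior of the uniform object cancel out, and differences appear only at the trailing and leading edges, which is what outlines the moving object.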

Some image format issues

Spatial resolution; intensity resolution; image file format

Many different image file formats

Portable gray map (PGM) is an older format
GIF was an early commercial format
JPEG (JPG) is a modern format
Many others exist
Do they handle color?
Do they provide for compression?
Need to have size & parameters & pixels

PGM image with ASCII info.
P2 means ASCII gray
Comments
W=16; H=8
192 is max intensity
Can be made with editor

P1: bilevel (black-and-white) image in ASCII; P3: RGB color in ASCII; P4, P5, and P6: raw binary versions of P1, P2, and P3
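Because a P2 file is plain text, it can be read with a few lines of code. The sketch below is a minimal reader, not a full implementation of the format: it assumes well-formed input, treats everything from `#` to the end of a line as a comment, and uses a function name (`parse_ascii_pgm`) and a tiny sample image invented for this example.

```python
def parse_ascii_pgm(text):
    """Parse the text of an ASCII (P2) PGM file.

    Returns (width, height, max_value, pixels), where `pixels` is a
    list of `height` rows, each a list of `width` integers.
    """
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]   # drop comments
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM (P2) file")
    width, height, max_value = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:4 + width * height]]
    if len(values) != width * height:
        raise ValueError("pixel count does not match width * height")
    pixels = [values[r * width:(r + 1) * width] for r in range(height)]
    return width, height, max_value, pixels


# Hypothetical 4 x 2 image with a maximum gray value of 192.
sample_pgm = """P2
# tiny example image
4 2
192
0 64 128 192
192 128 64 0
"""
width, height, max_value, pixels = parse_ascii_pgm(sample_pgm)
```

The header carries exactly the items the slides say any format needs: the size (width and height), the parameters (here the maximum intensity), and then the pixels themselves.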

JPG is the current popular format
Public, not private, standard
Allows for image compression; ratios of 10:1 or 30:1 are often easily possible
8x8 intensity regions are fit with a basis of cosines
Error in cosine fit coded as well
Parameters then compressed with Huffman coding
VERY TECHNICAL!
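The cosine-basis step can be illustrated with a direct two-dimensional discrete cosine transform (DCT-II) of one 8 x 8 block. This is only a sketch of that single step under the usual orthonormal scaling; it leaves out everything else a JPEG encoder does (level shifting, quantization, Huffman coding), and the function name `dct_8x8` is chosen for this example.

```python
import math


def dct_8x8(block):
    """Compute the 2D DCT-II of an 8 x 8 block of intensities.

    coeffs[u][v] is the weight of the cosine basis image with vertical
    frequency u and horizontal frequency v; coeffs[0][0] is the
    (scaled) average intensity of the block.
    """
    n = 8

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


flat_block = [[100] * 8 for _ in range(8)]
flat_coeffs = dct_8x8(flat_block)
```

For a perfectly uniform block, all of the energy lands in the single coefficient `flat_coeffs[0][0]` (800 for an intensity of 100) and the other 63 coefficients are zero up to rounding. Smooth image regions behave similarly, which is why most coefficients can be stored cheaply and high compression ratios are possible.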

First day course business
Syllabus on web (read for next time)
Course web pages (www.research.rutgers.edu/~chansu/CS580Web/)
Textbook by Shapiro and Stockman
Read Chapters 1 and 2
Read Chapter 1 of the Ballard Computer Vision book (available online: http://homepages.inf.ed.ac.uk/rbf/BOOKS/BANDB/bandb.htm)
