
Computer Vision Introduction


Page 1: Computer Vision Introduction

Computer Vision – Intro

Images are taken from: Computer Vision : Algorithms and Applications / Richard Szeliski

Page 2: Computer Vision Introduction

Timeline

Page 3: Computer Vision Introduction

Standard Computer Vision Tasks

Page 4: Computer Vision Introduction

OpenCV

OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source, BSD-licensed library that includes several hundred computer vision algorithms.

Page 5: Computer Vision Introduction

OpenCV – hard facts

• OpenCV is released under a BSD license.
• Free for both academic and commercial use.
• C++, C, Python and Java interfaces.
• Supports Windows, Linux, Mac OS, iOS and Android.
• Written in optimized C/C++.
• Can take advantage of multi-core processing.
• Downloads exceeding 6 million.
• Latest version: 2.4.6.

Page 6: Computer Vision Introduction

OpenCV – intro (1/2)

OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:

• core - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.
• imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
• video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.
• calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.

Page 7: Computer Vision Introduction

OpenCV – intro (2/2)

• features2d - salient feature detectors, descriptors, and descriptor matchers.
• objdetect - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on).
• highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.
• gpu - GPU-accelerated algorithms from different OpenCV modules.
• ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.

http://docs.opencv.org/doc/tutorials/tutorials.html

Page 8: Computer Vision Introduction

Android programming - steps

Minimum skills:
• Java for Android / Objective-C for iOS
• OpenCV C++ for native code

Minimum installation for Android (we will learn and apply later today):
• Eclipse IDE
• Android ADT
• OpenCV
• OpenCV C++/native
• Simulator

Page 9: Computer Vision Introduction

Canny Edge Detector

• void Canny(InputArray image, OutputArray edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false )

• Parameters:
image – single-channel 8-bit input image.
edges – output edge map; it has the same size and type as image.
threshold1 – first threshold for the hysteresis procedure.
threshold2 – second threshold for the hysteresis procedure.
apertureSize – aperture size for the Sobel() operator.
L2gradient – a flag indicating whether the more accurate $L_2$ norm $= \sqrt{(dI/dx)^2 + (dI/dy)^2}$ should be used to calculate the image gradient magnitude ( L2gradient=true ), or whether the default $L_1$ norm $= |dI/dx| + |dI/dy|$ is enough ( L2gradient=false ).
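
To make the L2gradient flag concrete, here is a minimal standalone sketch of the two magnitude formulas for a single pixel. It does not use OpenCV; the function name gradientMagnitude and its useL2 parameter are hypothetical and only mirror the meaning of the flag.

```cpp
#include <cassert>
#include <cmath>

// Gradient magnitude of one pixel from its x and y derivatives.
// gradientMagnitude is a hypothetical helper, not an OpenCV function;
// useL2 plays the role of Canny's L2gradient flag.
double gradientMagnitude(double dx, double dy, bool useL2) {
    assert(std::isfinite(dx) && std::isfinite(dy));
    if (useL2) {
        return std::sqrt(dx * dx + dy * dy);  // L2 norm: more accurate, costs a sqrt
    }
    return std::fabs(dx) + std::fabs(dy);     // L1 norm: cheaper approximation
}
```

For a gradient of (3, 4), the L2 norm is 5 while the L1 norm is 7, so the same pair of thresholds behaves differently depending on the flag.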

Page 10: Computer Vision Introduction

Canny Edge Detector - code

Mat src, src_gray;
Mat dst, detected_edges;

int edgeThresh = 1;
int lowThreshold = 1;
int const max_lowThreshold = 100;
int ratio = 3;   // upper threshold = lowThreshold * ratio
int kernel_size = 3;
const char* window_name = "Edge Map";

/// Reduce noise with a kernel 3x3. Assume src_gray is already read
blur( src_gray, detected_edges, Size(3,3) );

/// Canny detector (the upper threshold must exceed the lower one for hysteresis to have any effect)
Canny( detected_edges, detected_edges, lowThreshold, lowThreshold*ratio, kernel_size );

/// Using Canny's output as a mask, we display our result
dst = Scalar::all(0);
src.copyTo( dst, detected_edges);
imshow( window_name, dst );
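
The two thresholds passed to Canny drive the hysteresis step. The sketch below illustrates that idea on a single row of gradient magnitudes, under simplifying assumptions: it is one-dimensional, it skips non-maximum suppression, and hysteresis1D is a hypothetical name rather than an OpenCV call. A sample above the high threshold is always an edge; a sample above only the low threshold is kept when it is connected to such a strong sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 1-D illustration of hysteresis thresholding (hypothetical helper, not OpenCV).
// Returns true for every sample that ends up classified as an edge.
std::vector<bool> hysteresis1D(const std::vector<double>& mag, double low, double high) {
    assert(low <= high);
    const std::size_t n = mag.size();
    std::vector<bool> edge(n, false);

    // Pass 1: strong samples are edges unconditionally.
    for (std::size_t i = 0; i < n; ++i) {
        if (mag[i] > high) edge[i] = true;
    }
    // Pass 2: grow edges to the right through weak samples.
    for (std::size_t i = 1; i < n; ++i) {
        if (!edge[i] && edge[i - 1] && mag[i] > low) edge[i] = true;
    }
    // Pass 3: grow edges to the left through weak samples.
    for (std::size_t i = n; i-- > 0;) {
        if (i + 1 < n && !edge[i] && edge[i + 1] && mag[i] > low) edge[i] = true;
    }
    return edge;
}
```

With magnitudes {10, 40, 90, 40, 10, 40, 10}, low = 30 and high = 80, the samples at indices 1 to 3 are kept because they touch the strong sample at index 2, while the isolated weak sample at index 5 is discarded.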

Page 11: Computer Vision Introduction

Hough Transform

• void HoughLines(InputArray image, OutputArray lines, double rho, double theta, int threshold, double srn=0, double stn=0 )

• Parameters:
image – 8-bit, single-channel binary source image.
lines – Output vector of lines. Each line is represented by a two-element vector (rho, theta).
rho – Distance resolution of the accumulator in pixels.
theta – Angle resolution of the accumulator in radians.
threshold – Accumulator threshold parameter. Only lines that get more votes than threshold are returned.
srn – For the multi-scale Hough transform, it is a divisor for the distance resolution rho.
stn – For the multi-scale Hough transform, it is a divisor for the angle resolution theta.
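
The role of rho, theta and the accumulator can be sketched with a brute-force voting loop. This is an illustration under stated assumptions, not OpenCV's implementation: HoughPeak and houghPeak are hypothetical names, maxRho is an assumed bound on the distance range, and the multi-scale parameters srn and stn are ignored. Every point votes for each (rho, theta) bin satisfying rho = x cos(theta) + y sin(theta), and the bin with the most votes is returned.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct HoughPeak { double rho; double theta; int votes; };

// Brute-force Hough voting over a list of edge points (x, y).
// Hypothetical helper for illustration; not part of OpenCV.
HoughPeak houghPeak(const std::vector<std::pair<int, int>>& points,
                    double rhoStep, double thetaStep, double maxRho) {
    assert(rhoStep > 0 && thetaStep > 0 && maxRho > 0);
    const double pi = std::acos(-1.0);
    const int numTheta = static_cast<int>(std::lround(pi / thetaStep));
    const int rhoOffset = static_cast<int>(std::ceil(maxRho / rhoStep));
    const int numRho = 2 * rhoOffset + 1;  // bins cover [-maxRho, +maxRho]
    std::vector<int> acc(static_cast<std::size_t>(numTheta) * numRho, 0);

    // Each point votes once per angle bin.
    for (const auto& p : points) {
        for (int t = 0; t < numTheta; ++t) {
            const double theta = t * thetaStep;
            const double rho = p.first * std::cos(theta) + p.second * std::sin(theta);
            const int r = static_cast<int>(std::lround(rho / rhoStep)) + rhoOffset;
            if (r >= 0 && r < numRho) ++acc[static_cast<std::size_t>(t) * numRho + r];
        }
    }

    // Find the accumulator cell with the most votes.
    HoughPeak best{0.0, 0.0, 0};
    for (int t = 0; t < numTheta; ++t) {
        for (int r = 0; r < numRho; ++r) {
            const int v = acc[static_cast<std::size_t>(t) * numRho + r];
            if (v > best.votes) best = HoughPeak{(r - rhoOffset) * rhoStep, t * thetaStep, v};
        }
    }
    return best;
}
```

For 100 points on the horizontal line y = 5 (x from 0 to 99), with rhoStep = 1 and thetaStep = pi/180, the peak lands at rho = 5, theta = pi/2 with 100 votes. In HoughLines, the threshold parameter decides how many votes a cell needs before it is reported as a line.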

Page 12: Computer Vision Introduction

Hough Transform - code

Mat dst, cdst;

/// Detect edges, then make a color copy to draw on. Assume src is already read
Canny(src, dst, 50, 200, 3);
cvtColor(dst, cdst, CV_GRAY2BGR);

vector<Vec2f> lines;
HoughLines(dst, lines, 1, CV_PI/180, 100, 0, 0 );

// Draw the lines
for( size_t i = 0; i < lines.size(); i++ )
{
    float rho = lines[i][0], theta = lines[i][1];
    Point pt1, pt2;
    double a = cos(theta), b = sin(theta);
    double x0 = a*rho, y0 = b*rho;   // point on the line closest to the origin
    // Step 1000 pixels in both directions along the line direction (-b, a)
    pt1.x = cvRound(x0 + 1000*(-b));
    pt1.y = cvRound(y0 + 1000*(a));
    pt2.x = cvRound(x0 - 1000*(-b));
    pt2.y = cvRound(y0 - 1000*(a));
    line( cdst, pt1, pt2, Scalar(0,0,255), 3, CV_AA);
}
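
The drawing loop turns each (rho, theta) pair into two far-apart endpoints. The standalone sketch below repeats that arithmetic without OpenCV so it can be checked in isolation; Pt, Segment and lineEndpoints are hypothetical names, and halfLength stands in for the constant 1000 used in the slide code.

```cpp
#include <cassert>
#include <cmath>

struct Pt { double x; double y; };
struct Segment { Pt a; Pt b; };

// Endpoints of a segment lying on the line x*cos(theta) + y*sin(theta) = rho.
// Hypothetical helper for illustration; not part of OpenCV.
Segment lineEndpoints(double rho, double theta, double halfLength) {
    assert(halfLength > 0);
    const double a = std::cos(theta), b = std::sin(theta);
    const double x0 = a * rho, y0 = b * rho;  // point on the line closest to the origin
    // (-b, a) is a unit vector along the line, so step along it in both directions.
    return Segment{Pt{x0 - halfLength * b, y0 + halfLength * a},
                   Pt{x0 + halfLength * b, y0 - halfLength * a}};
}
```

For rho = 5 and theta = pi/2 this gives the endpoints (-1000, 5) and (1000, 5) up to rounding error, that is, the horizontal line y = 5. Both endpoints satisfy the line equation, which is a quick sanity check for the signs.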

Page 13: Computer Vision Introduction

Cascade classifier

• void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size())

• Parameters:
cascade – Haar classifier cascade (OpenCV 1.x API only). It can be loaded from XML or YAML file using Load().
image – Matrix of the type CV_8U containing an image where objects are detected.
objects – Vector of rectangles where each rectangle contains the detected object.
scaleFactor – Parameter specifying how much the image size is reduced at each image scale.
minNeighbors – Parameter specifying how many neighbors each candidate rectangle should have to retain it.
flags – Parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
minSize – Minimum possible object size. Objects smaller than that are ignored.
maxSize – Maximum possible object size. Objects larger than that are ignored.
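
The effect of scaleFactor can be sketched by listing the scales a multi-scale search would visit. This is a rough illustration under assumptions, not OpenCV's internal loop: detectionScales is a hypothetical helper, and the 24x24 window in the usage note is an assumed training window size. The search starts at scale 1 and multiplies by scaleFactor until the scaled window no longer fits in the image.

```cpp
#include <cassert>
#include <vector>

// Scales visited by a multi-scale sliding-window search.
// Hypothetical helper for illustration; not part of OpenCV.
std::vector<double> detectionScales(int imageWidth, int imageHeight,
                                    int windowWidth, int windowHeight,
                                    double scaleFactor) {
    assert(scaleFactor > 1.0);  // otherwise the loop would never terminate
    assert(windowWidth > 0 && windowHeight > 0);
    std::vector<double> scales;
    for (double s = 1.0;
         windowWidth * s <= imageWidth && windowHeight * s <= imageHeight;
         s *= scaleFactor) {
        scales.push_back(s);
    }
    return scales;
}
```

For a 100x100 image and a 24x24 window, scaleFactor = 2.0 yields 3 scales (1, 2, 4) while scaleFactor = 1.1 yields 15. A smaller factor therefore searches more finely at the cost of more work per frame.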

Page 14: Computer Vision Introduction

Cascade classifier - code

String face_cascade_name = "haarcascade_frontalface_alt.xml";
CascadeClassifier face_cascade;

// Load cascades (eyes_cascade and eyes_cascade_name are assumed to be declared like the face ones)
face_cascade.load( face_cascade_name );
eyes_cascade.load( eyes_cascade_name );

// Convert to grayscale and equalize the histogram. Assume frame is already read
vector<Rect> faces;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
equalizeHist( frame_gray, frame_gray );

// Detect faces
face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );

// Draw an ellipse around each detected face
for( size_t i = 0; i < faces.size(); i++ )
{
    Point center( faces[i].x + faces[i].width*0.5, faces[i].y + faces[i].height*0.5 );
    ellipse( frame, center, Size( faces[i].width*0.5, faces[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
}