Have you ever used computer vision? - Brown...

View
4
Download
0
Category

Documents

Preview:

Citation preview

Have you ever used computer vision?

How? Where?

Think-Pair-Share

Jitendra Malik, UC Berkeley

Three ‘R’s of Computer Vision

Jitendra Malik, UC Berkeley

Three ‘R’s of Computer Vision

“The classic problems of computational vision:

reconstruction

recognition

(re)organization.”

Have you ever used computer vision?

How? Where?

Reconstruction? Recognition? (Re)organization?

Think-Pair-Share

• Laptop: Biometrics auto-login (face recognition, 3D), OCR

• Smartphones: QR codes, computational photography (Android Lens Blur, iPhone Portrait Mode), panorama construction (Google Photo Spheres), face detection, expression detection (smile), Snapchat filters (face tracking), Google Tango (3D reconstruction),

• Web: Image search, Google photos (face recognition, object recognition, scene recognition, geolocalization from vision), Facebook (image captioning), Google maps aerial imaging (image stitching), YouTube (content categorization)

• VR/AR: outside-in tracking (HTC VIVE), inside out tracking (simultaneous localization and mapping, HoloLens)

• Xbox: Kinect, full body tracking of skeleton, gesture recognition

• Medical imaging: CAT / MRI reconstruction, assisted diagnosis, automatic pathology, connectomics, endoscopic surgery

• Industry: vision-based robotics (marker-based), machine-assisted router (jig), automated post, ANPR (number plates), surveillance, drones

• Transportation: assisted driving (everything), face tracking/iris dilation for drunkeness, drowsiness

• Media: Visual effects for film, TV (reconstruction), virtual sports replay (reconstruction), semantics-based auto edits (reconstruction, recognition)

Optical character recognition (OCR)

Digit recognition, AT&T labs

http://www.research.att.com/~yann/

Technology to convert scanned docs to text• If you have a scanner, it probably came with OCR software

License plate readershttp://en.wikipedia.org/wiki/Automatic_number_plate_recognition

JH

http://www.research.att.com/~yann

http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Face detection

• Almost all digital cameras detect faces

• Snapchat face filters

Smile detection

Sony Cyber-shot® T70 Digital Still Camera JH

http://www.sonystyle.com/webapp/wcs/stores/servlet/ProductDisplay?catalogId=10551&storeId=10151&productId=8198552921665200469&langId=-1

Object recognition (in supermarkets)

LaneHawk by EvolutionRobotics

“A smart camera is flush-mounted in the checkout lane, continuously

watching for items. When an item is detected and recognized, the

cashier verifies the quantity of items that were found under the basket,

and continues to close the transaction. The item can remain under the

basket, and with LaneHawk,you are assured to get paid for it… “

JH

http://www.evolution.com/products/lanehawk/

Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the story

wikipedia

JH

http://www.cl.cam.ac.uk/~jgd1000/afghan.html

http://en.wikipedia.org/wiki/Afghan_Girl_(photo)

Login without a password…

Login without a password…

Login without a password…

Object recognition (in mobile phones)

Google GogglesWikipedia: “In a May 2014 update to Google Mobile for iOS, the Google

Goggles feature was removed due to being ‘of no clear use to too many people.’“ : (

http://www.google.com/mobile/goggles/#text

3D from images

Building Rome in a Day: Agarwal et al. 2009

Human shape capture

Human shape capture

Human shape capture

Human shape capture

Star Wars: Rogue One – Peter Cushing / Admiral Tarkin

Special effects: shape capture

Special effects: shape capture

Special effects: motion capture

Interactive Games: Kinect

• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o

• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg

• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A

• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY

JH

http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o

http://www.youtube.com/watch?v=8CTJL5lUjHg

http://www.youtube.com/watch?v=7QrnwoO1-8A

http://www.youtube.com/watch?v=w8BmgtMKFbY

Sports

Sportvision first down line

Nice explanation on www.howstuffworks.com

http://www.sportvision.com/video.html

JH

http://www.howstuffworks.com/first-down-line.htm

http://www.howstuffworks.com/

http://www.sportvision.com/video.html

Medical imaging

Image guided surgery

Grimson et al., MIT3D imaging

MRI, CT

JH

http://groups.csail.mit.edu/vision/medical-vision/surgery/surgical_navigation.html

AutoCars - Uber bought CMU’s lab

Industrial robots

Vision-guided robots position nut runners on wheels

JH

Vision in space

Vision systems (JPL) used for several tasks• Panorama stitching

• 3D terrain modeling

• Obstacle detection, position tracking

• For more, read “Computer Vision on Mars” by Matthies et al.

NASA'S Mars Exploration Rover Spirit captured this westward view from atop

a low plateau where Spirit spent the closing months of 2007.

JH

http://www.ri.cmu.edu/pubs/pub_5719.html

http://marsrovers.jpl.nasa.gov/gallery/images.html

Mobile robots

http://www.robocup.org/NASA’s Mars Spirit Rover

http://en.wikipedia.org/wiki/Spirit_rover

Saxena et al. 2008

STAIR at StanfordJH

http://www.robocup.org/

http://upload.wikimedia.org/wikipedia/commons/d/d8/NASA_Mars_Rover.jpg

http://en.wikipedia.org/wiki/Spirit_rover

http://stair.stanford.edu/

Augmented Reality and Virtual Reality

MS HoloLens, Oculus, Magic Leap,

Jitendra Malik, UC Berkeley

Three ‘R’s of Computer Vision

“[Further progress in] the classic problems of computational vision:

reconstruction

recognition

(re)organization

[requires us to study the interaction among these processes].”

State of the art today?

With enough training data, computer vision nearly

matches human vision at most recognition tasks

Deep learning has been an enormous disruption to

the field. Many techniques are being “deepified”.

JH

Computer Vision and Nearby Fields

Derogatory summary of computer vision:

“Machine learning applied to visual data.”

• Computer Graphics: Models to Images

• Computer Vision: Images to Models

JH

Scope of CSCI 1430

Computer Vision Robotics

Neuroscience

Graphics

Computational Photography

Machine Learning

Medical Imaging

Human Computer Interaction

Optics

Image ProcessingGeometric Reasoning

RecognitionDeep Learning

JH

COURSE STAFF

I work in here.

Instructor: James Tompkin

Max Planck InstituteGermany

University College LondonUK

james_tompkin@brown.eduOffice: CIT 547

Render pixels?Capture pixels?Interact with pixels?

Watch my research overview video!

I am probably interested.

mailto:james_tompkin@brown.edu

TA introduction

Eric XiaoDaniel NurieliEleanor TursmanMartin Zhu

Jackson (Jack) Gibbons

Prerequisites

• Linear algebra, basic calculus, and probability

• Programming, data structures

CS 143 – James Hays

• Continuing his course – many materials, the courseworks, from him + previous staff – serious thanks!

• If you see ‘JH’ on a slide, then it is one of his.

Contact

• Course runs quiet hours – 9pm to 9am.– No contact! We will ignore you.

• Piazza first– TAs are not on there all the time; set hours.

• cs1430tas@lists.brown.edu second

• TA office hours: Moon Lab 227, Tues-Thurs 7-9pm

• James: Tues 1-2pm

mailto:cs1430tas@lists.brown.edu

Textbook

http://szeliski.org/Book/

JH

http://szeliski.org/Book/

Textbook

Projects / Grading

• 100% programming projects (6 total)

Project 0 (do this immediately)• Install MATLAB• Go Help, go through ‘Getting Started with

MATLAB’ tutorial. Familiarize.• Go through second tutorial (image operands)• Come to us in office hours (next week) if you have

trouble.

Proj 1: Image Filtering and Hybrid Images

• Implement image filtering to separate high and low frequencies.

• Combine high frequencies and low frequencies from different images to create a scale-dependent image.

JH

Proj 2: Local Feature Matching

• Implement interest point detector, SIFT-like local feature descriptor, and simple matching algorithm.

JH

Proj 3: Multi-view Geometry

• Recover camera calibration from feature point matches.

• Foundation for almost all measurement in computer vision.

JH

Proj 4: Scene Recognition with Bag of Words

• Quantize local features into a “vocabulary”, describe images as histograms of “visual words”, train classifiers to recognize scenes based on these histograms.

JH

Proj 5: Object Detection with a Sliding Window

• Train a face detector based on positive examples and “mined” hard negatives, detect faces at multiple scales and suppress duplicate detections.

JH

Proj 6: Convolutional Neural Net

• Proj 4 again, but state of the art.

JH

Any questions?!

• Friday: Light and Color

Recommended

Introduction to Computer Vision - Brown Universitycs.brown.edu/courses/cs143/2009/lecture10.pdf · CS143 Intro to Computer Vision ©Michael J. Black One more connection… • Problem

Documents

JavaScript - Kangwoncs.kangwon.ac.kr/~whcho/2017_Spring/practice/lecture7.pdf · 2017. 4. 28. · JavaScript HTML 페이지의사용자반응을처리 HTML의정적인페이지를동적으로수행하게함

Documents

CS143 Relational Model - oak.cs.ucla.edu

Documents

technology with know-how for your success - BMC Grocery Solutions Brochure.pdf · NCR Fastlane SelfServ LaneHawk •Reduced shrink and rapid ROI boost profits per lane, per day by

Documents

CV as a social good bad?cs.brown.edu/courses/cs143/2017_Spring/lectures_Spring...CV / ML ‘human factors’ •Computer vision / machine learning is a tool. •Tools are used under

Documents

Emergency Management’s Role in the Opiate Crisisema.ohio.gov/Documents/DirectorsConference/2017_Spring...2015 –554 total O’s • 2014 –354 total O’s • 2013 –175 total

Documents

37th Korean College of Rheumatology Annual Scienti˜c ...rheum.thewithin.kr/register/2017_spring/file/Program_Book.pdf · Considering latest strategy beyond EULAR 2016 recommendation

Documents

lec00 intro web - Cornell University€¦ · Objectrecognion(insupermarkets) LaneHawk by EvolutionRobotics “A smart camera is flush-mounted in the checkout lane, continuously watching

Documents

CS143 Handout 20 Summer 2011 July 15th CS143 Practice ... · CS143 Handout 20 Summer 2011 July 15th, 2011 CS143 Practice Midterm and Solution Exam Facts Wednesday, July 20th from

Documents

HTML5 멀티미디어태그 - Kangwoncs.kangwon.ac.kr/~whcho/2017_Spring/practice/lecture3.pdf · 2017. 3. 24. · 공간분할태그 웹페이지의공간을분할하는태그 공간을분할하여CSS을통해원하는레이아웃을구성

Documents

CS 143: Introduction to Computer Visioncs.brown.edu/courses/cs143/2011/lectures/01.pdf · Object recognition (in supermarkets) LaneHawk by EvolutionRobotics “A smart camera is flush-mounted

Documents

22958 PEPERIKSAAN AKHIR JUN 2018 … · 22/06/2018 jumaat pagi 8:30 am 11:30 am s cs143 mgt162 rcs1434e 27 d/cengal 22/06/2018 jumaat pagi 8:30 am 11:30 am s cs143 mgt162 rcs1434f

Documents

Introduction to Computer Visioncs.brown.edu/courses/cs143/2009/lecture3.pdf · CS143 Intro to Computer Vision ©Michael J. Black Very powerful effect Liu and Todd, Vision Research

Documents

Last lecture: Edges primer - Brown Universitycs.brown.edu/courses/cs143/2017_Spring/lectures_Spring... · 2017. 2. 13. · Last lecture: Edges primer • Edge detection to identify

Documents

Frost Cardiovascular Stress Testing - c.ymcdn.comc.ymcdn.com/.../resmgr/2017_spring/Frost_Cardiovascular_Stress_.pdf · • Normal stress test does not mean arteries are clean or

Documents

Compilers - Stanford Universityweb.stanford.edu › class › cs143 › lectures › lecture01.pdfCompilers CS143 3:00-4:20 TT Lectures on Zoom scheduled through Canvas 1 Instructor:

Documents

CS143, Brown James Hays

Documents

2017년 - 더 위드인ons.thewithin.kr/register/2017_spring/program/roomA.pdf · 2017-02-28 · 2017년 대한비만체형학회 제22차 학술대회 CONTENTS Room A, 그랜드볼룸

Documents

HTML5 기본구조와작성법 - Kangwoncs.kangwon.ac.kr/~whcho/2017_Spring/practice/lecture1.pdf · 2017-03-17 · WWW(World Wide Web)의웹표준기술, 웹마크업언어 웹페이지의뼈대를이루고있으며,

Documents

Multi-stable Perception - Brown Universitycs.brown.edu/courses/cs143/lectures_Fall2017/32_Fall2017_OpticalFlow.pdf · Multi-stable Perception Necker Cube. Spinning dancer illusion,

Documents