Content-based image and video analysis – Tools and Libraries (11.07.2011)

CBIR 13 Tools Libraries v2 - KIT

Page 1:

Content-based image and video analysis

Tools and Libraries

11.07.2011

Page 2:

Lecture overview

Labeled data is needed in almost all discussed approaches
But: labeling data is tedious and expensive work
We will discuss different approaches to solve this problem

Software libraries for standard tasks can drastically reduce development time
We will discuss software libraries for image processing and machine learning

Content-based Image and Video Retrieval 2

Page 3:

Labels

Diverse types of data annotations are needed:
Face recognition: face bounding box, facial landmarks, identity
Object recognition: bounding box, object class
High-level features: images depicting a concept
Genre classification: videos of a certain genre
…

The only way to get the labels is to have people label the data manually
It is a very boring task, so people need to be compensated somehow
Most common way: just pay them!
This can be quite expensive for large datasets

Page 4:

LabelMe

Collaborative labeling
To download the database, a certain number of images have to be annotated
Label quality is probably good, since the annotators have to use the labels themselves
Public (everyone can join)
A large database (460,000 labeled objects of all kinds)
http://labelme.csail.mit.edu/

Credit: Franziska Kraus

Page 5:

LabelMe

Page 6:

Annotation Tool

Draw polygons and name the labeled object
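An annotation produced this way is essentially a named polygon. A minimal Python sketch of such a record (the dict layout and helper names are illustrative, not LabelMe's actual XML schema):

```python
# A LabelMe-style annotation: a named polygon per object.
# Record layout is illustrative, not LabelMe's actual schema.

def polygon_area(points):
    """Absolute polygon area via the shoelace formula."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def make_annotation(name, points):
    """Bundle an object name with its polygon and derived area."""
    return {"name": name, "polygon": points, "area": polygon_area(points)}

car = make_annotation("car", [(0, 0), (4, 0), (4, 4), (0, 4)])
print(car["area"])  # 16.0
```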

Page 7:

Annotation Tool

Quality of polygons and names is not ensured by supervision → still sufficient
A lot of images have more than 80% of their pixels labeled
Many images contain several different object categories

Page 8:

Annotation Tool

Objects are mostly labeled completely despite partial occlusions
Object-part hierarchies
Depth ordering

Page 9:

Browse Database

Page 10:

Query Database

Page 11:

Query Database

Label names are up to the user → no consistency

Use WordNet (a lexical dictionary) to group categories
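Grouping free-form label names can be sketched as a canonicalization step; the synonym table below is a hand-rolled stand-in for the actual WordNet lookup:

```python
# Stand-in for a WordNet query: map free-form label names to a canonical
# category via a small synonym table (the real system uses WordNet synsets).
SYNONYMS = {
    "auto": "car", "automobile": "car", "car": "car",
    "person": "person", "human": "person", "pedestrian": "person",
}

def canonical_label(raw):
    """Normalize a user-entered label to its canonical category name."""
    g = raw.strip().lower()
    return SYNONYMS.get(g, g)

print(canonical_label("Automobile"))  # car
```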

Page 12:

Object-parts hierarchies

Polygons with high overlap indicate either an object-part hierarchy (head/body) or an occlusion

For a given query (e.g. "car"), check for polygons that often have a high overlap (e.g. "wheel")

Compute a score as the percentage of images where the part ("wheel") has high overlap with the object ("car")

→ List of object-part candidates
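The scoring step above can be sketched as follows; for simplicity this uses bounding boxes and a plain overlap ratio instead of full polygon intersection (the function names and the 0.9 threshold are illustrative):

```python
def overlap_ratio(box_a, box_b):
    """Fraction of box_b's area that lies inside box_a (boxes: x0, y0, x1, y1)."""
    x0 = max(box_a[0], box_b[0]); y0 = max(box_a[1], box_b[1])
    x1 = min(box_a[2], box_b[2]); y1 = min(box_a[3], box_b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / area_b if area_b else 0.0

def part_score(images, obj="car", part="wheel", thresh=0.9):
    """Percentage of images where a 'part' box lies mostly inside an 'obj' box."""
    hits = sum(
        1 for img in images
        if any(overlap_ratio(o, p) >= thresh
               for o in img.get(obj, []) for p in img.get(part, []))
    )
    return hits / len(images)
```

For example, an image where the "wheel" box sits entirely inside the "car" box counts as a hit; the score is the hit fraction over all images returned for the query.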

Page 13:

Occlusion / Depth-ordering

Simple heuristics:
Some things can never occlude other objects (sky)
An object completely contained in another is on top
May be wrong if the containing object is transparent, etc.
The polygon with more control points in the intersecting area is on top
Use color histograms: compare the histogram in the overlapping region to the two other regions; the more similar region is on top

The combined heuristic achieves a 2.9% error rate
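The color-histogram heuristic can be sketched like this; gray-level histograms and histogram intersection are assumptions here, the original comparison may differ:

```python
import numpy as np

def hist(pixels, bins=8):
    """Normalized gray-level histogram of a flat pixel list (values 0..255)."""
    h, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def on_top(overlap_px, region_a_px, region_b_px):
    """Depth heuristic: the region whose histogram better matches the
    overlap area is assumed to be the occluding (front) object."""
    h_o = hist(overlap_px)
    sim_a = np.minimum(h_o, hist(region_a_px)).sum()  # histogram intersection
    sim_b = np.minimum(h_o, hist(region_b_px)).sum()
    return "a" if sim_a >= sim_b else "b"
```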

Page 14:

Semi-automatic labeling

Use available labels to train detectors
Run the detector on unlabeled data
Let a user verify the generated labels
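The loop can be sketched as follows; `train`, `detect`, and `ask_user_to_verify` are hypothetical placeholders for a real detector and verification UI:

```python
# Sketch of the semi-automatic labeling loop: train on what is labeled,
# propose labels for the rest, and keep only the user-verified ones.
def semi_automatic_labeling(labeled, unlabeled, train, detect, ask_user_to_verify):
    detector = train(labeled)
    for item in list(unlabeled):
        candidate = detect(detector, item)
        if candidate is not None and ask_user_to_verify(item, candidate):
            labeled.append((item, candidate))   # verified label joins the pool
            unlabeled.remove(item)
    return labeled, unlabeled
```

In practice the loop is repeated: each round of verified labels makes the next detector better.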

Page 15:

Summary of LabelMe

Advantages:
Motivation to label images is given
To download the database, a user first has to label a certain number of images → data quality is probably good

Disadvantages:
Only a few people are interested in the labels, yet they are the only ones providing them

Is it possible to get people with no interest in the labels to annotate the images (without paying them)?

Page 16:

Human Computation

Humans can solve problems that computers can't solve yet
Simple example: CAPTCHAs

Humans are way better than computers at labeling and tagging images
How do we get them to do that? Do people on the Internet have enough time?

Page 17:

Human Computation

“People all over the world spent 9 billion hours playing solitaire in 2003” Luis von Ahn, 2006

Constructing the Empire State Building (7 million human-hours) corresponds to about 6.8 hours of worldwide solitaire playing

Building the Panama Canal (20 million human-hours) corresponds to less than one day of solitaire playing

Page 18:

Human Computation

Humans have a lot of time
Humans can easily label and tag images
BUT: you would have to pay them to do that for you

Solution: create a game that encourages people to label images (or to perform any other task you want them to do)

Page 19:

Games With A Purpose

http://www.gwap.com/

Page 20:

ESP-Game

Game for tagging images from the Web

Users see a picture and have to agree on a tag for what the picture contains

→ A lot of image-tag pairs, but no information about location or size of objects in the image

Page 21:

ESP-Game

Page 22:

User statistics

In the first four months after release 13,630 people played the game

80% of them played more than once

1,271,451 labels for 293,760 images were generated

33 people played more than 1,000 games (> 50 h)

Extrapolating:

5000 people playing 24h/d could label all images in Google image search in one month!

In popular online gaming sites, many more players are online at a time (>100,000 players)

Page 23:

Label quality

Search for labels:
For 10 random labels, images with this label were displayed
For all returned images, the label made sense → very high precision

Manually labeled images:
For all images, at least 83% of the ESP Game tags were also used by the manual annotators
For all images, the three most common tags used by the manual annotators were also among the ESP Game tags

Manual quality assessment:
People would use 85% of the ESP Game tags to describe the corresponding image
Only 1.7% of the tags don't make sense as a description of the image

Page 24:

Peekaboom

Game for locating objects in images
Improves data collected by the ESP Game
Two random players are paired up:
Boom (player 1) reveals parts of the image
Peek (player 2) guesses the associated word
Roles switch on a successful guess
Playing "bots" step in for an uneven number of players or when people quit the game

Page 25:

Game Overview

Page 26:

Pings

Boom can “ping” parts of the revealed object to point out particular parts

Page 27:

Word – Image Relation

Boom can give hints about how the word relates to the image

Page 28:

Image Metadata

For each image-word pair, metadata is collected:
How the word relates to the image (hints)
The pixels necessary to guess the word (the area that is revealed, the pixels inside the specified object)
The most salient aspects of objects in the image (pings, the sequence of Boom's clicks)
Elimination of poor image-word pairs (many pairs of players click "pass")
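From the revealed-area metadata, an object bounding box can be derived roughly as the extent of Boom's reveals. Modeling each reveal as a click point with a fixed radius is a simplification for illustration:

```python
# Sketch: derive an object bounding box from the areas Boom revealed.
# Each reveal is modeled as a click point with a fixed radius (an
# assumption; the real game reveals fixed-size circular regions).
def bounding_box(clicks, radius=20):
    """Axis-aligned box (x0, y0, x1, y1) enclosing all revealed areas."""
    xs = [x for x, _ in clicks]
    ys = [y for _, y in clicks]
    return (min(xs) - radius, min(ys) - radius,
            max(xs) + radius, max(ys) + radius)

print(bounding_box([(50, 60), (90, 80)], radius=10))  # (40, 50, 100, 90)
```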

Page 29:

Cheating

People could try to cheat by logging in at the same time and telling each other which words to type
→ reveal the wrong parts and type the right word anyway

Multiple anti-cheating mechanisms are in place

Page 30:

Anti-Cheating Mechanisms

Player queue: a player has to wait n seconds until being paired up
IP address checks
Seed images: images with hand-verified metadata are mixed into the game
Limited freedom to enter guesses (the guess field could be abused for communication): only letters, only words in the dictionary
Aggregating data from multiple players
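The restricted guess field can be sketched as a simple filter; the word list below is a tiny stand-in for a real dictionary:

```python
# Sketch of the restricted guess field: accept only alphabetic
# dictionary words, so the field cannot be abused as a chat channel.
DICTIONARY = {"car", "dog", "house", "tree"}   # stand-in word list

def accept_guess(guess):
    """True iff the guess is a single alphabetic dictionary word."""
    g = guess.strip().lower()
    return g.isalpha() and g in DICTIONARY

print(accept_guess("dog"))         # True
print(accept_guess("meet me @5"))  # False
```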

Page 31:

Implementation

Spelling check: incorrect words are displayed in a different color
Inappropriate word replacement: substituted with words like "ILuvPeekaboom"
Top-scores list and ranks: top scores of the day and of all time; users get a rank based on their total number of points

Page 32:

Additional Applications

Improving image-search results: Peekaboom estimates the fraction of the image that is related to the word → use this fraction to order image results
Use ping data for pointing to objects
Object bounding boxes
Image search engine with highlighted results
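Re-ranking by the word-related fraction is a plain sort; the file names and fractions below are made up for illustration:

```python
# Sketch: re-rank image search results by the fraction of the image
# that Peekaboom associated with the query word (largest first).
results = [("img1.jpg", 0.15), ("img2.jpg", 0.80), ("img3.jpg", 0.40)]
ranked = sorted(results, key=lambda r: r[1], reverse=True)
print([name for name, _ in ranked])  # ['img2.jpg', 'img3.jpg', 'img1.jpg']
```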

Page 33:

Ping Accuracy

Use ping data for pointing: arrow lines pointing to the objects
Pings selected at random
100% accuracy shown in the experiment

Page 34:

Bounding Box Accuracy

Bounding boxes created with Peekaboom: the lowest overlap with a user-created box was 50%

Page 35:

User Statistics

How enjoyable is the game? 14,000 different people contributed 1.1 million pieces of data during the first month

User comments:


“One unfortunate side effect of playing so much in such a short time was a mild case of carpal tunnel syndrome in my right hand and forearm, but that dissipated quickly.”

“This game is like crack. I've been Peekaboom-free for 32 hours. Unlike other games, Peekaboom is cooperative.”

“[...] I would say that it gives the same gut feeling as combining gambling with charades while riding on a roller coaster. The good points are that you increase and stimulate your intelligence, you don't lose all your money and you don't fall off the ride. The bad point is that you look at your watch and eight hours have just disappeared!”

Page 36:

Summary of Peekaboom

Peekaboom works because people like to play games

Experiments show that results are sufficiently accurate

In addition to the ESP data, information about location and size of objects in the image is retrieved

Page 37:

Other “Games with a purpose”

Tag a Tune: a piece of music is played to you and your partner; you must describe the music with words; based on your partner's description, you decide whether you two are listening to the same song

Verbosity: describe a word to your partner using other words; the partner must guess the secret word

Squigl: trace the outline of an object in the same way as your partner, in very limited time (5-10 seconds)

Matchin: decide which of two images you like best; points when your partner agrees

Popvideo: you and your partner are shown a video clip and enter tags describing the video (and audio!); points when tags match

Page 38:

Publicly available labeled datasets

Some datasets are available for free in order to advance research: Caltech 101/256, AR, …

Some evaluation campaigns generate a lot of labeled data and provide it for everybody (PASCAL VOC challenge) or for participants (TRECVID, ImageCLEF)

Page 39:

Caltech 101/256

Freely available:
http://vision.caltech.edu/Image_Datasets/Caltech101
http://vision.caltech.edu/Image_Datasets/Caltech256

Pictures of 101/256 object categories
Labels and outlines of the objects

Page 40:

PASCAL

Collection of object recognition databases with ground truth
The VOC challenge uses it
Freely available:

http://pascallin.ecs.soton.ac.uk/challenges/VOC/databases.html

Page 41:

Software libraries

Many systems use standard algorithms:
Linear algebra
Image processing (image filters, image features, etc.)
Machine learning (SVMs, PCA, LDA, etc.)

Advantages of using standard libraries:
Avoid errors in your own implementation
Leverage the know-how of other people (the implementation details of simple algorithms can sometimes be quite tricky!)
Save a lot of development time → more time to work on your own algorithm

Page 42:

OpenCV

Open Computer Vision library
http://sourceforge.net/projects/opencvlibrary/
http://opencv.willowgarage.com/

Features:
C-like interface (the upcoming 2.0 is more C++)
Linear algebra
Image processing functions (filters, FFT, DCT, etc.)
Image features (SURF, HOG, Haar-like features; mainly in 2.0/SVN)
Detectors: Haar cascades (training & detection)
Machine learning: SVMs, neural nets, decision trees, GMMs, …
Much, much more …

Page 43:

OKAPI

Open Karlsruhe library for processing of images
http://isl.ira.uka.de/msmmi/okapi/doc/
Not public at the moment; if you want to use it, come work with us

Features:
C++
Cameras (V4L, FireWire, VfW)
Videos (frame-accurate random access)
Image features (DCT, Gabor, LBP, MCT, …)
Linear projections (PCA, LDA, RCA)
Detector (using MCT features, very fast)
SVMs (libsvm, liblinear)
3D geometry functions
Simple GUI for prototypes
Many utility functions (timing, XML, in-memory image I/O, …)

Page 44:

SVMs

Libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/): simple and standard

Liblinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/): much faster for linear SVMs

SVMlight (http://svmlight.joachims.org/): also very popular

Shogun toolbox (http://www.shogun-toolbox.org/): many kernel functions and Multiple Kernel Learning (MKL); uses libsvm or SVMlight in the background
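libsvm and Liblinear read the same sparse text format: one example per line, a label followed by 1-based index:value pairs with zero entries omitted. A small serializer:

```python
# Serialize a dense feature vector into LIBSVM's sparse text format:
# "<label> <index1>:<value1> <index2>:<value2> ...", 1-based indices,
# zero entries omitted.
def to_libsvm_line(label, features):
    parts = [str(label)]
    parts += [f"{i + 1}:{v:g}" for i, v in enumerate(features) if v != 0]
    return " ".join(parts)

print(to_libsvm_line(1, [0.5, 0.0, 2.0]))  # 1 1:0.5 3:2
```

Writing one such line per training example produces a file that both `svm-train` (libsvm) and `train` (liblinear) accept directly.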

Page 45:

Machine Learning

Weka (Java): http://www.cs.waikato.ac.nz/ml/weka/
Java-ML (Java): http://java-ml.sourceforge.net/
Torch (C/Lua): http://torch5.sourceforge.net/
MLC++ (C++): http://www.sgi.com/tech/mlc/
MLPACK / FASTlib: http://mloss.org/software/view/152/ and http://fastlib.analytics1305.com/
Spider (Matlab): http://www.kyb.tuebingen.mpg.de/bs/people/spider/
FLANN (approximate k-nearest neighbors): http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN

List of open-source machine-learning software: http://mloss.org/

Page 46:

References

Russell, Torralba, Murphy: LabelMe: A Database and Web-Based Tool for Image Annotation. International Journal of Computer Vision, vol. 77, issue 1, May 2008

von Ahn, Dabbish: Labeling Images with a Computer Game. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2004

von Ahn, Liu, Blum: Peekaboom: A Game for Locating Objects in Images. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2006

Page 47:

Lecture Overview

Introduction
Visual Descriptors
Image Segmentation
Classification
Shot Boundary Detection & Genre Classification
High-level Feature Detection
High-level Feature Detection II
Person Identification
Copy Detection
Semantics
Search
Tools and Libraries

(Lecture blocks: Computer Vision & Machine Learning overview; Indexing & Matching; High-level topics; Intro. to video processing)

Page 48:

Page 49:

Development of superpixel-based features for object recognition (SA/Bachelor, DA/Master)

Superpixels compute an oversegmentation of an image

An object is segmented into multiple superpixels that can be used as atomic building blocks to describe the object

Develop a descriptor based on single superpixels and apply it in an object recognition setting

Evaluate the object recognition performance and compare it to state-of-the-art algorithms

Contact: Alexander, [email protected]

Page 50:

Foreground-background segmentation with superpixels (SA/Bachelor)

Superpixels significantly reduce the number of image elements from hundreds of thousands of pixels to a couple of hundred superpixels

Apply existing foreground-background segmentation algorithms to superpixel segmentations of video streams

Compare and evaluate superpixel-based segmentation against pixel-based approaches with respect to quality and speed-up

Contact: Alexander, [email protected]

Page 51:

Part-based person detection using the Modified Census Transform (MCT)

Bachelor thesis: extend an existing holistic MCT person detector to a part-based detector.

Tasks:
Determine suitable person parts
Train part detectors
Fuse part detections
Evaluate detection performance

Requirements:
C++ programming experience
Ability to work independently

Part-based person detections.

Contact: Martin Bäuml [email protected], Arne Schumann [email protected]

Page 52:

Hiwi: Face Tracking Dataset

There is no standard benchmark for face tracking; we are working on one

Tasks:
Search for suitable videos (TV series, YouTube, news, …)
Label the data

Contact: Martin Bäuml <[email protected]>, Geb. 50.20, R228

Page 53:

Computer Vision for the Blind and Human-Robot Interaction (MA/BA theses and HiWi jobs)

Help blind people and robots to visually explore and investigate their environment

Various topics available, for example:
Who is looking at me, at whom, or at what?
Describe objects (e.g., their color) or read text (e.g., street and door signs)
Allow following spoken path descriptions, e.g. “go to the crossroads, turn left and after the church go right”
How to present the information to the blind?

Integrated in SFB 588 “Humanoide Roboter”; collaboration with Fraunhofer IOSB possible
Contact: B. Schauerte <[email protected]>

Page 54:

Situation Recognition in the SmartControlRoom

Task:
Situation recognition in image sequences, using a SmartControlRoom as an example
Modeling of situations and further development of novel logic-based methods
Fusion of tracking, head-pose estimation, gesture recognition, speech recognition, and room description

Example situations:
Thomas and Alex are editing the map together
S1 and S2 are editing the map
M and S3 are listening to EL
S4 is doing individual work

Contact: [email protected]

Page 55:

Student Projects at Fraunhofer IOSB: Analysis of Work Processes in the Command Room

Task:
Analyze the work processes in the command room based on multiple camera streams with audio track and message traffic
Further develop a corresponding analysis tool
Lay the groundwork for automatic situation recognition by means of computer vision

Example data:

Contact: [email protected]

Page 56:

Hiwi: Web Admin

Tasks:
Take care of our website
Upload lecture slides!

Requirements:
Good knowledge of Joomla
Reliable, timely work

Contact: Martin Bäuml <[email protected]>, Geb. 50.20, R228

Page 57:

Page 58:

Page 59:

Studienarbeit

• Previous system (open-set face recognition for a visitor interface)
– Challenges:
• Mismatch in pose and illumination
• Unbalanced unknown and known samples for training

• Personalize the TV program for different family members according to identity
– New challenges:
• Multi-resolution due to various distances

Page 60:

Studienarbeit

• Goal
– Improve the previous system
– Evaluate performance with different person-camera distances
– Scale (distance) invariance

Contact: Hua, [email protected]

Page 61:

Open Hiwi position

Interested in event recognition? We are looking for a student to teach ARMAR how to recognize events within rooms (e.g. cooking, cleaning)

Motivated by Laptev et al., “Learning realistic human actions from movies”

Options for SA/DA!

Required skills:
Interest in computer vision
C++ programming experience under Linux

Email [email protected] for more details

(Pipeline sketch: STIP interest points → HOG/HOF descriptors → bag-of-features → classification)

Page 62:

Studienarbeit/Hiwi: Multiview ISM for Localization and Segmentation of Humans

Task:
• Extend the probabilistic formulation of the standard Implicit Shape Model to 3D/multiview
• Annotate existing smart-room data for training the model
• Implement and evaluate your approach

Requirements:
• Knowledge of computer vision and machine learning
• Programming experience in C++ and Python

Contact:
• Martin Bäuml <[email protected]>

Page 63:

Studienarbeit/Hiwi: Clothing-based Person Recognition on ISM-segmented Data

Task
• Use ISM to localize and segment people entering the door
• Identify persons based on their clothing using the automatically segmented data
• Compare your approach against a discriminative body detector-based approach
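Clothing-based identification is often approached by comparing color statistics of the segmented region; one common baseline is a joint RGB histogram compared with the chi-square distance. A minimal sketch with synthetic "clothing pixels" — the bin count, distance measure, and data are illustrative assumptions, not the prescribed approach:

```python
import numpy as np

def color_histogram(pixels, bins=8):
    """Joint RGB histogram of a segmented clothing region, L1-normalized.
    `pixels` is an (N, 3) array of RGB values inside the segmentation mask."""
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def chi2(h1, h2, eps=1e-10):
    """Chi-square distance, a common histogram comparison measure."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

rng = np.random.default_rng(1)
red_shirt  = rng.normal([200, 30, 30], 10, size=(500, 3)).clip(0, 255)
blue_shirt = rng.normal([30, 30, 200], 10, size=(500, 3)).clip(0, 255)
red_again  = rng.normal([200, 30, 30], 10, size=(500, 3)).clip(0, 255)

gallery = {"person_A": color_histogram(red_shirt),
           "person_B": color_histogram(blue_shirt)}
query = color_histogram(red_again)
best = min(gallery, key=lambda p: chi2(query, gallery[p]))
```

With real data, the ISM segmentation mask determines which pixels enter the histogram, which is exactly why clean segmentation matters for this task.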

Requirements
• Knowledge of computer vision and machine learning
• Programming experience in C++ and Python

Contact: Martin Bäuml <[email protected]>



Person Detection in a Bird's-Eye View

• Person detection works well in most perspectives, but bird's-eye views are not yet well explored
– Difficulties arise from appearance distortions that depend on the person's position

• Nevertheless, bird's-eye views offer a great deal of information we could and should use!

• Goal: Build a person detector that also responds to different body orientations seen from above
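One simple way to obtain a detector that responds to different body orientations in overhead views is to apply a single-orientation detector to rotated copies of the image and keep the strongest response. A toy sketch where the template-matching "detector" is purely a placeholder for a trained model:

```python
import numpy as np

def detector_score(patch):
    """Placeholder for a trained single-orientation person detector.
    Here it simply responds to a bright horizontal bar (a stand-in template)."""
    template = np.zeros((8, 8))
    template[3:5, :] = 1.0
    return float((patch * template).sum())

def detect_any_orientation(patch):
    """Run the detector on all four 90-degree rotations of an overhead patch
    and return (best_score, rotation_index) -- a crude orientation estimate."""
    scores = [detector_score(np.rot90(patch, k)) for k in range(4)]
    k = int(np.argmax(scores))
    return scores[k], k

# synthetic overhead patch: a person "lying" vertically in the image
patch = np.zeros((8, 8))
patch[:, 3:5] = 1.0
score, rot = detect_any_orientation(patch)
```

A real system would use finer rotation steps (or rotation-invariant features), but the winning rotation index already gives the body orientation for free.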

Contact: Florian van de Camp, [email protected], phone: 6091-449
Prof. Dr.-Ing. Rainer Stiefelhagen, Institut für Anthropomatik, Forschungsbereich Maschinensehen für Mensch-Maschine Interaktion, Fakultät für Informatik, Universität Karlsruhe (TH)



Head Pose in Active Camera Views

• Head pose allows deducing a coarse gaze approximation
– Passive cameras that are set up far away only deliver low-resolution captures
– Much work tries to cope with this by using multiple cameras, and hence different views, for a more stable estimate

• But: active cameras can zoom in on a person to deliver a high-resolution capture

• Goal: Build a system that uses an active camera to zoom in on a person and outputs a detailed head orientation estimate

Contact: Michael Voit, [email protected], phone: 6091-449



3D Voxel Coloring (Studienarbeit)

Current situation:

• 3D reconstruction of multiple people in a SmartControlRoom using voxel carving

• Multiple camera views of the same scene from different angles are available

• Voxel coloring is challenging due to noise in 3D data

Goal:

• Develop voxel coloring algorithm that computes the correct color for each voxel considering all camera images
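The usual starting point for voxel coloring is a photo-consistency test: project the voxel into every camera image, collect the observed colors, and assign the mean color only if the views agree. A minimal sketch, in which the cameras, threshold, and images are made-up toy values:

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to integer pixel coords."""
    x = P @ np.append(X, 1.0)
    return (x[:2] / x[2]).astype(int)

def color_voxel(voxel, cameras, images, max_std=20.0):
    """Photo-consistency check: sample the voxel's color in every view that
    sees it and accept the mean color only if the observations roughly agree."""
    samples = []
    for P, img in zip(cameras, images):
        u, v = project(P, voxel)
        if 0 <= v < img.shape[0] and 0 <= u < img.shape[1]:
            samples.append(img[v, u].astype(float))
    samples = np.asarray(samples)
    if len(samples) >= 2 and samples.std(axis=0).max() <= max_std:
        return samples.mean(axis=0)      # consistent -> color the voxel
    return None                          # inconsistent (occlusion / noise)

# two toy cameras that simply drop the z coordinate, and two agreeing views
P = np.array([[1., 0., 0., 0.], [0., 1., 0., 0.], [0., 0., 0., 1.]])
view_a = np.full((10, 10, 3), [100, 150, 200], dtype=np.uint8)
view_b = np.full((10, 10, 3), [110, 150, 200], dtype=np.uint8)
color = color_voxel(np.array([4., 5., 2.]), [P, P], [view_a, view_b])
```

The thesis challenge is exactly the part this sketch glosses over: handling occlusion and the noise in the carved 3D data when deciding which cameras actually see a voxel.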

Contact: Alexander Schick, 0721-6091-348, [email protected]




Visualization for Human-Machine Interaction (Hiwi)

Current situation:
• Multiple computer vision components extract information about people working in our SmartControlRoom:
– Tracking and identification
– Body model and gestures
– Head pose and focus of attention

• Each component has its own visualization

Goal:
• Build a framework that uses all provided information to create one integrated visual representation of the whole scene

Requirements:
• Experience with visualization using, for example, OpenGL or VTK (we are open to suggestions)
• Very good C++ or Python skills under Linux

Contact: Alexander Schick, 0721-6091-348, [email protected]



MA/DA/Hiwi: Person Retrieval in a camera network

Person retrieval using faces alone works quite well in well-defined settings. For more general settings, incorporating more features and global knowledge is required.

Task
• Extend existing face retrieval to a camera network (~10%)
• Incorporate non-biometric features into the retrieval (~40%)
• Use global model knowledge to improve the retrieval (~30%)
• Evaluate your approach on an appropriate dataset (~20%)
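Incorporating non-biometric features next to face similarity typically comes down to score-level fusion; a simple weighted sum is a common baseline. A sketch with hypothetical similarity values — the weight and the cues are illustrative, not part of the task specification:

```python
import numpy as np

def fuse_scores(face_sim, clothing_sim, w_face=0.7):
    """Weighted-sum fusion of similarity scores from two cues.
    Scores are assumed normalized to [0, 1]; the weight 0.7 is a guess
    that would normally be tuned on validation data."""
    return w_face * np.asarray(face_sim) + (1.0 - w_face) * np.asarray(clothing_sim)

# hypothetical similarities of one query person to three tracks in the network
face_sim     = [0.9, 0.2, 0.5]   # face matcher (weak at low resolution)
clothing_sim = [0.8, 0.3, 0.9]   # non-biometric cue, e.g. clothing color
fused = fuse_scores(face_sim, clothing_sim)
ranking = np.argsort(-fused)     # best-matching tracks first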

Requirements
• Knowledge of computer vision and machine learning
• Very good programming experience in C++ & Python
• Experience with distributed systems is a plus

Contact: Martin Bäuml <[email protected]>
Geb. 50.20, R228