Rogerio Feris, April 11, 2013 EECS 6890 – Topics in Information Processing
Spring 2013, Columbia University http://rogerioferis.com/VisualRecognitionAndSearch
Class 11: Smart Surveillance
Video Analytics for Smart Surveillance
Video Capture/Encoding & Management
DVR: records & streams video
Real-time alerts
• Perimeter violation
• Tailgating attempt
• Red car on service road
User-driven queries
• Find red cars
• Find tailgating incidents involving this person
Sensors & Transactions
Analytics & Framework
Watches the video for alerts & events
• Analytics modules:
- Object tracking and classification
- Face capture and recognition
- License Plate Recognition
- Many others
• Gathers event meta-data & makes it searchable
• Provides plug and play framework for analytics
IBM Smart Vision Suite (SVS)
Visual Recognition And Search Columbia University, Spring 2013
Mode of Operation: Real-time Alerts
Examples of user-configurable real-time alerts:
• Tripwire: triggers on the cat crossing the blue line
• Directional Motion: triggers on right turns, when the cars move in the direction of the arrow
• Removed Object: triggers when the object outlined in blue is removed from its position
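A tripwire alert can be illustrated with a segment-intersection test on a tracked object's centroid. This is only a minimal sketch under assumptions: the slides do not say how SVS implements the alert, and the names `crosses_tripwire` and `tripwire_alerts` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _side(a: Point, b: Point, p: Point) -> float:
    """Signed area test: > 0 if p lies left of the directed line a->b, < 0 if right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crosses_tripwire(wire_a: Point, wire_b: Point,
                     prev_pos: Point, curr_pos: Point) -> bool:
    """True if the move prev_pos -> curr_pos strictly crosses the wire segment."""
    d1 = _side(wire_a, wire_b, prev_pos)
    d2 = _side(wire_a, wire_b, curr_pos)
    d3 = _side(prev_pos, curr_pos, wire_a)
    d4 = _side(prev_pos, curr_pos, wire_b)
    # The endpoints of each segment must lie on opposite sides of the other one.
    return d1 * d2 < 0 and d3 * d4 < 0


def tripwire_alerts(wire_a: Point, wire_b: Point, track: List[Point]) -> List[int]:
    """Indices of the track steps at which the centroid crosses the wire."""
    return [i for i in range(1, len(track))
            if crosses_tripwire(wire_a, wire_b, track[i - 1], track[i])]
```

The sign of the first side test tells which way the wire was crossed, so the same test could be extended to a one-way tripwire or a simple directional alert.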
Mode of Operation: Search After the Fact
“Show me all large vehicles with yellow color that crossed this road in the past 5 days” [Finds DHL delivery trucks]
“Show me all events with duration greater than 30 seconds” [Finds people loitering]
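The two queries above amount to filters over stored event metadata. The sketch below assumes a hypothetical `Event` record and `search` helper; the actual SVS metadata schema and query interface are not described in these slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Event:
    # Hypothetical metadata fields, chosen to match the example queries.
    camera: str
    start: datetime
    duration_s: float
    object_class: str
    size: str
    color: str


def search(events: List[Event], *, now: datetime,
           max_age: Optional[timedelta] = None,
           min_duration_s: Optional[float] = None,
           **attrs: str) -> List[Event]:
    """Return events matching every given constraint."""
    results = []
    for e in events:
        if max_age is not None and now - e.start > max_age:
            continue  # older than the requested time window
        if min_duration_s is not None and e.duration_s <= min_duration_s:
            continue  # duration must be strictly greater than the bound
        if any(getattr(e, key) != value for key, value in attrs.items()):
            continue  # an attribute such as color or size does not match
        results.append(e)
    return results
```

With this sketch, the first query would read `search(events, now=now, max_age=timedelta(days=5), size="large", color="yellow")` and the second `search(events, now=now, min_duration_s=30)`.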
Traditional Pipeline: Blob-Based Analytics
Background Subtraction → Blob Tracking → High-Level Processing
Background Subtraction: Moving Object Detection (Blobs)
Most existing smart surveillance systems in the market rely on blob-based video analysis. They are efficient and work well in low-activity scenarios.
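As a rough illustration of the blob-based pipeline, the sketch below keeps a running-average background, thresholds the difference to get a foreground mask, and groups foreground pixels into blobs with a flood fill. The running-average model, the threshold value, and all function names are assumptions for illustration; commercial systems typically use more robust background models.

```python
from typing import List, Tuple

Image = List[List[float]]


def update_background(bg: Image, frame: Image, alpha: float = 0.05) -> Image:
    """Running-average background: blend a little of the new frame into the model."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]


def foreground_mask(bg: Image, frame: Image, thresh: float = 25) -> List[List[bool]]:
    """A pixel is foreground if it differs from the background by more than thresh."""
    return [[abs(f - b) > thresh for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]


def find_blobs(mask: List[List[bool]]) -> List[Tuple[int, int, int, int]]:
    """4-connected components of the mask, as (x_min, y_min, x_max, y_max) boxes."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, xs, ys = [(y, x)], [], []
            seen[y][x] = True
            while stack:  # iterative flood fill from the seed pixel
                cy, cx = stack.pop()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append((min(xs), min(ys), max(xs), max(ys)))
    return blobs
```

Because blobs are just connected foreground regions, two objects whose foreground pixels touch come out as a single box, and any spurious foreground pixel becomes its own blob.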
Blob-Based Analytics: Limitations
Dealing with Crowded Scenes
Objects close to each other are clustered into a single blob
Environmental Conditions
Rain, snow, reflections, and shadows cause spurious blobs
[Figure panels: Original Video, Background Subtraction, Tracking]
Object-Centric Video Analytics
Vehicle Detection in Crowded Scenes
Object-Centric Video Analytics
Limitations / Challenges
Detector accuracy: dealing with appearance variations
Different object poses, lighting changes, etc.
Detector efficiency / cost
State-of-the-art approaches usually run at low frame rates
How many object classes are needed?
Large-Scale Detector Learning
[Feris et al, Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos, IEEE Transactions on Multimedia, 2012]
Automatic Training Data Collection
User-defined Region of Interest (ROI)
Prior information about motion direction and size of cars in the region
A classifier based on motion direction and blob shape (obtained via background modeling, not appearance) is applied, and only high-confidence samples are selected
[Figure panels: Original Video, Captured Samples]
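The selection rule described above can be sketched as a conservative filter on blob geometry and motion inside the ROI. All thresholds, the `Blob` fields, and the function names are hypothetical; the slide states only that motion direction and blob shape are used and that low-confidence samples are discarded.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Blob:
    x: float   # bounding box, top-left corner
    y: float
    w: float   # bounding box size
    h: float
    dx: float  # motion vector (pixels per frame)
    dy: float


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def is_confident_car(blob: Blob, roi: Tuple[float, float, float, float],
                     expected_dir_deg: float, max_dir_err_deg: float = 20.0,
                     width_range: Tuple[float, float] = (40.0, 120.0),
                     aspect_range: Tuple[float, float] = (1.2, 3.0)) -> bool:
    """Keep a blob only if it matches the ROI's motion and size priors."""
    rx0, ry0, rx1, ry1 = roi
    cx, cy = blob.x + blob.w / 2, blob.y + blob.h / 2
    if not (rx0 <= cx <= rx1 and ry0 <= cy <= ry1):
        return False  # centroid outside the user-defined ROI
    if blob.h <= 0 or (blob.dx == 0 and blob.dy == 0):
        return False  # degenerate box or stationary blob
    direction = math.degrees(math.atan2(blob.dy, blob.dx))
    if angle_diff(direction, expected_dir_deg) > max_dir_err_deg:
        return False  # not moving the way cars move in this region
    if not (width_range[0] <= blob.w <= width_range[1]):
        return False  # wrong size for a car at this location
    return aspect_range[0] <= blob.w / blob.h <= aspect_range[1]
```

Strict thresholds discard many true cars, but that is acceptable here: the goal is clean training samples, and hours of video supply plenty of candidates.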
[Training Data: samples collected from ~5 hours of video]
Synthetic Occlusion Generator
Huge Vehicle Dataset
Nearly one million images (50+ cameras). The largest public dataset to date has ~5000 images.
Automatic Dataset Semantic Partitioning
Large variations in pose cause drastic appearance variations that make learning difficult
Clustering based on motion direction (related to vehicle pose) yields motionlet clusters
Multiple detectors are learned (for each motionlet cluster) rather than a single monolithic detector
Clustering Based on Motionlets
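The partitioning idea can be sketched by grouping training samples according to their motion direction, so that each group holds vehicles in a similar pose. This sketch swaps in fixed angular bins for the motionlet clustering used in the cited work, which is a simplification; `motion_bin`, `partition_by_motion`, and the bin count are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def motion_bin(dx: float, dy: float, n_bins: int = 12) -> int:
    """Quantize a motion vector's direction into one of n_bins angular sectors.

    Sector 0 is centered on 0 degrees, so directions just above and just
    below the positive x-axis fall into the same bin.
    """
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    width = 360.0 / n_bins
    return int(((angle + width / 2) % 360.0) // width)


def partition_by_motion(samples: Iterable[Tuple[Hashable, Tuple[float, float]]],
                        n_bins: int = 12) -> Dict[int, List[Hashable]]:
    """Group sample ids by motion-direction bin; one detector is trained per group."""
    clusters: Dict[int, List[Hashable]] = defaultdict(list)
    for sample_id, (dx, dy) in samples:
        clusters[motion_bin(dx, dy, n_bins)].append(sample_id)
    return dict(clusters)
```

Each resulting group would then be used to train its own detector, in place of a single monolithic detector over all poses.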
Core Detector Model
Cascade of AdaBoost Classifiers with Haar-like Features
A feature pool containing a huge set (on the order of millions) of feature configurations is generated over multiple feature planes
Similar to integral channel features (Dollár et al.), but instead of randomization, we use massively parallel feature selection to select a compact set of discriminative features through AdaBoost learning
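Haar-like features are cheap because any rectangle sum over a feature plane can be read from an integral image with four lookups. The sketch below shows that mechanism for a two-rectangle feature on one 2D plane; the function names are illustrative, and the feature pool and selection procedure of the actual detector are not reproduced here.

```python
from typing import List

Plane = List[List[float]]


def integral_image(img: Plane) -> Plane:
    """ii[y][x] holds the sum of img over rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Plane, x: int, y: int, w: int, h: int) -> float:
    """Sum of the w x h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii: Plane, x: int, y: int, w: int, h: int) -> float:
    """Left half minus right half of a w x h window (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A feature configuration is then just a choice of plane, rectangle layout, position, and size, which is why a pool of millions of configurations can be enumerated and evaluated quickly.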
Deep Cascade Detectors
Significant accuracy improvement by training deep cascades with a huge number of bootstrapped negative samples [200,000 negative samples]
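Bootstrapping in a cascade means that each new stage is trained on negatives the current stages still accept, that is, on the cascade's own false positives. The sketch below shows early-rejection evaluation and hard-negative collection with generic stage scoring functions; the names and the stage representation are assumptions, not the training code behind the slide.

```python
from typing import Callable, Iterable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Stage = Tuple[Callable[[T], float], float]  # (scoring function, threshold)


def cascade_accepts(stages: Sequence[Stage], x: T) -> bool:
    """Run stages in order and reject as soon as one scores below its threshold."""
    for score_fn, threshold in stages:
        if score_fn(x) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def bootstrap_negatives(stages: Sequence[Stage], negative_pool: Iterable[T],
                        max_samples: int) -> List[T]:
    """Collect negatives that the current cascade wrongly accepts."""
    hard: List[T] = []
    for x in negative_pool:
        if cascade_accepts(stages, x):
            hard.append(x)
            if len(hard) >= max_samples:
                break
    return hard
```

As stages are added, fewer negatives survive, so a deeper cascade has to scan a much larger pool of negative windows to gather enough hard examples for its next stage.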
Large-Scale Multi-Pose Vehicle Detection
100+ frames per second!
Other Visual Analytics Modules
© 2007 IBM Corporation
Real-Time Alerts
Basic Alerts
User Configurable
Composite Alerts
Real-time Alerts: Match plates against license plate watch-list
Search: Event attributes & Object Appearance
Search: Partial license plates
Face Capture: Frontal and profile face views of people to create face catalog
Face Recognition: Match captured faces against a face watch-list
Sensors Events
Transaction Logs
911 Call Logs
RFID Events
GPS Metadata
Retail
Loss Prevention
Marketing & Operations
Public Sector
City Surveillance
Core Technologies:
Background Modeling
Tracking
Limit access to camera/functions
Redact information from video
Fuzzy meta-data representation
Object Classification
Color Classification
Cross Sector Solution
Retail Solution
Sweethearting Detection
[Fan et al, Recognition of Repetitive Sequential Human Activity, CVPR 2009]
Attribute-based People Search
[Feris et al, "Indexing and searching according to attributes of a person", US Patent 20100106707, 2008]
[B. Siddiquie, R. S. Feris, and L. Davis, "Image Ranking and Retrieval Based on Multi-Attribute Queries", CVPR 2011 (Oral), USA, 2011]
[D. Vaquero, R. S. Feris, D. Tran, L. Brown, A. Hampapur, and M. Turk, "Attribute-based people search in surveillance environments", WACV 2009]
Query Example: “Show me all people entering IBM last month with a beard, dark skin, sunglasses, a red jacket, and blue pants”
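A multi-attribute query of this kind can be sketched as a filter and ranking over per-person attribute confidences. The record layout, the attribute names, and the min-score ranking rule below are all hypothetical; the cited papers use learned attribute detectors and more elaborate ranking models.

```python
from typing import Dict, List, Sequence


def search_people(records: List[Dict], required: Sequence[str],
                  min_score: float = 0.5) -> List[str]:
    """Return ids of people whose required attributes all score >= min_score.

    Results are ranked by their weakest required attribute, best first.
    A missing attribute counts as a score of 0.0.
    """
    hits = []
    for rec in records:
        scores = [rec["attributes"].get(attr, 0.0) for attr in required]
        if scores and min(scores) >= min_score:
            hits.append((min(scores), rec["id"]))
    hits.sort(reverse=True)
    return [rec_id for _, rec_id in hits]
```

Ranking by the weakest attribute is one simple way to prefer candidates that satisfy every part of the query over those that match a few attributes strongly and miss the rest.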
Attribute-based Vehicle Search
[Feris et al, ICMR 2011]
Abandoned Object Detection
[Fan and Pankanti, Robust Foreground and Abandonment Analysis for Large-scale Abandoned Object Detection in Complex Surveillance Videos, AVSS 2012]
[Y. L. Tian, R. S. Feris, and A. Hampapur, Real-time detection of abandoned and removed objects in complex environments, VS 2008]
Main Issues
Approach
Surveillance Event Detection (SED)
[Qiang Chen et al, CMU-IBM-NUS@TRECVID 2012: Surveillance Event Detection, 2012]
We ranked 1st in 4 out of 7 surveillance event detection tasks
Demo