E-V: Efficient Visual Surveillance with Electronic Footprints
Jin Teng, Junda Zhu, Boying Zhang, Dong Xuan and Yuan F. Zheng
IEEE Infocom 2012
Outline
- Deficiency of Visual Surveillance Systems
- A Brief Overview of Our E-V System
- A Case Study
- A Broader View of Our E-V System
- Final Remarks
Visual Surveillance
Failure Examples
Chicago police installed 10,000 surveillance cameras in the city, yet only 1 in 200 crimes is captured by visual surveillance [2].
One of the bombers in the London bombing (July 2005) was not identified by the surveillance system and escaped [3].
Why do they fail?
- Large volume of video data
  - Temporal: 2.07 × 10^6 frames per camera per day
  - Spatial: a very large number of surveillance cameras in a city, e.g., New York has 4,176 video cameras in the lower Manhattan area [1]
- Monitored objects may be visually occluded or have multiple inconsistent appearances
Visual technologies alone are not efficient or accurate enough for automatic localization and tracking, so a lot of human effort is still needed.
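As a rough check on the temporal figure on this slide, the per-camera frame count follows directly from the frame rate. The 24 frames-per-second rate below is an assumption (the slides do not state it); it is simply a rate that reproduces roughly 2.07 × 10^6 frames per day.

```python
# Back-of-the-envelope frame count for one camera over one day.
# ASSUMPTION: 24 frames per second; the slides do not state the frame rate.
FPS = 24
SECONDS_PER_DAY = 24 * 60 * 60

frames_per_day = FPS * SECONDS_PER_DAY
print(f"{frames_per_day:.2e} frames per camera per day")
```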
Our Methodology: E-V Integration
Combining electronic and visual signals for efficient surveillance
E-V Integration makes it possible to efficiently and accurately localize and identify objects in a large volume of video data
            Indexing & Sorting    Localization Accuracy
E-Signal    Easy                  Low
V-Signal    Hard                  High
Electronic Signals
Name        Distance        Frequency                                Data Rate (down)
GSM         35 km           850, 900, 1800, 1900 MHz                 80 kb/s (GPRS), 236 kb/s (EDGE)
LTE         30 km–100 km    700 MHz–2.6 GHz                          >100 Mb/s
WiFi        100 m           2.4 GHz (802.11b/g), 5 GHz (802.11a)     54 Mb/s
                            2.4 GHz, 5 GHz                           450 Mb/s
Bluetooth   10 m            2.4 GHz, frequency hopping               2.1 Mb/s (up to 24 Mb/s)
NFC         < 4 cm          13.56 MHz                                106 kb/s–424 kb/s
Pervasiveness of Electronic Signals
- Electronic signals are emitted by many mobile devices
- Mobile devices are increasingly popular
  - Smartphones as an example: 302.6 million shipped in 2010
- Wireless channels: wireless addresses (such as WiFi MAC addresses), content, etc.
Our E-V System: A Bird’s Eye View
Our E-V System: Layers
- Specific Applications: Surveillance, Health, Training
- Technologies: Localization, Identification, Other Technologies
- Sensing Methods: Electronic, Visual, Other Signals
Related Work on E-V Integration
- Fuse multiple sensors for tracking [4]
- Visual camera + RFID for monitoring [5]
- Existing work cannot achieve accuracy and efficiency for visual surveillance at the same time
A Typical Surveillance Scenario
Find a specific person given some vague visual information, i.e., retrieve his or her appearances in videos covering a long period of time.
If we depend on videos alone, we may need to:
- Extract all human figures in each frame, which may number in the thousands, and compare them with the designated vague picture
- Spend a large amount of human effort watching videos from a number of cameras, which may last several hours or even days
With E-V integration, how can we do better?
Problem Formulation: Notations
- V-sensing: V-ID and V frame
  - V-ID: visual identity, such as a human figure
  - VID*: our target V-ID
  - V frame: a set of V-IDs, with some background, captured by visual sensors (cameras) in a certain area and time
- E-sensing: E-ID and E frame
  - E-ID: electronic identity, such as a MAC address
  - EID*: our target E-ID
  - E frame: a set of E-IDs captured by electronic sensors in a certain area and time
- Vagueness and completeness
  - Vagueness: reflects how clearly a V-ID/E-ID can be identified
  - Completeness: reflects whether the V-IDs/E-IDs in a V/E frame are complete
Problem Formulation: Cases
                            Input Target      Input Frames
                            EID*    VID*      EIDs    VIDs
Vagueness      Clear        √                 √
               Vague                √                 √
Completeness   Complete                       √       √
               Incomplete
(√ marks the baseline case)
- Baseline case (√)
  - Input: clear EID* (and vague VID*), and a set of E frames with clear and complete EIDs and V frames with vague and complete VIDs
  - Output: VID* in video frames (VID* may differ from the given vague VID*)
- General case
  - Input: EID* (and VID*), and a set of E frames and corresponding V frames
  - Output: VID* in video frames
A Naïve Solution to the Baseline Case
- Two steps
  - Step 1: Find all E frames that include EID*
  - Step 2: Identify VID* in their corresponding V frames
- Comments: fewer V frames to process, because V frames without VID* are filtered out, but many V frames may still remain
Example: suppose we have three E/V frames and go through them one by one.
  E frame 1: EID*, EID1
  E frame 2: EID*, EID2, EID3
  E frame 3: EID*, EID2
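Step 1 of the naïve solution can be sketched in a few lines. This is only an illustration: the function name `naive_filter`, the frame labels, and the extra `frame4` (which lacks EID*) are hypothetical and not from the slides.

```python
from typing import Dict, List, Set


def naive_filter(e_frames: Dict[str, Set[str]], target: str) -> List[str]:
    """Step 1: keep every E frame that contains the target E-ID."""
    return [name for name, ids in e_frames.items() if target in ids]


# The three E frames from the slide, plus one hypothetical frame without EID*.
e_frames = {
    "frame1": {"EID*", "EID1"},
    "frame2": {"EID*", "EID2", "EID3"},
    "frame3": {"EID*", "EID2"},
    "frame4": {"EID1", "EID3"},  # hypothetical, added to show the filtering
}

kept = naive_filter(e_frames, "EID*")
print(kept)  # frames whose corresponding V frames still need visual processing
```

Only `frame4` is dropped here, which illustrates the comment on the slide: every frame containing EID* survives, so many V frames may still need to be processed.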
Our Solution
- E-Filtering
  - Find the minimum number of E frames whose intersection is the given E-ID, i.e., EID*
  - Far fewer frames for further V-side processing
  - We formulate it as the Element Distinguishing Problem (EDP)
- V-Retrieval
  - Retrieve the V-ID from the filtered frames through intersection to determine VID*
  - We formulate it as the n-partite Best Match Problem (nBM)
E-Filtering Overview
  E frame 1: EID*, EID1
  E frame 2: EID*, EID2, EID3
  E frame 3: EID*, EID2
Two E frames are enough to identify EID* through intersection: E frames 1 and 2 share only EID*.
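The intersection idea can be checked directly with Python sets. The variable names are illustrative; the frame contents are the ones from the overview slide.

```python
# E frames from the overview slide, modeled as plain sets of E-IDs.
frame1 = {"EID*", "EID1"}
frame2 = {"EID*", "EID2", "EID3"}
frame3 = {"EID*", "EID2"}

pair_12 = frame1 & frame2  # only EID* is common to both frames
pair_23 = frame2 & frame3  # EID2 also survives, so this pair is not enough

print(pair_12)
print(pair_23)
```

Not every pair of frames works, which is why choosing which frames to intersect is the core of E-Filtering.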
Nature of E-Filtering
- Find the minimum number of frames whose intersection is EID*
- NP-complete: equivalent to the set cover problem
  - Whether each E-ID appears in each E frame is summarized in a matrix, with 1 meaning "appears" and 0 "does not appear"
  - There is at least one 0 in each non-EID* column
  - Use these 0s to "cover" all non-EID* columns

      EID*  EID1  EID2  EID3
  e1  1     1     0     0
  e2  1     0     1     1
  e3  1     0     1     0
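To make the set cover view concrete, the sketch below turns the matrix on this slide into "cover" sets (the non-target E-IDs with a 0 in each frame) and finds the smallest covers by exhaustive search. The exhaustive search is only viable for tiny instances like this one and is not the authors' method; the helper name `minimum_frames` is hypothetical.

```python
from itertools import combinations

ids = ["EID*", "EID1", "EID2", "EID3"]
matrix = {
    "e1": [1, 1, 0, 0],
    "e2": [1, 0, 1, 1],
    "e3": [1, 0, 1, 0],
}
target = "EID*"
others = {eid for eid in ids if eid != target}

# For each frame, the non-target E-IDs it "covers", i.e. those with a 0 entry.
covers = {
    frame: {eid for eid, bit in zip(ids, row) if bit == 0 and eid != target}
    for frame, row in matrix.items()
}


def minimum_frames(covers, others):
    """Exhaustive search for the smallest frame subsets whose 0s cover all non-target E-IDs."""
    frames = sorted(covers)
    for k in range(1, len(frames) + 1):
        hits = [
            combo
            for combo in combinations(frames, k)
            if set().union(*(covers[f] for f in combo)) == others
        ]
        if hits:
            return hits
    return []


best = minimum_frames(covers, others)
print(covers)
print(best)
```

In this example no single frame covers all three non-target columns, while the pairs (e1, e2) and (e1, e3) do, so the minimum is two frames.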
Solution: EDP Algorithm
- Element Distinguishing Problem (EDP)
  - The element to be distinguished is EID*
- Greedily select the E frame in which the largest number of E-IDs can be told apart from EID*
  - In the example, the greedy algorithm selects e1 or e3 first, because each of them tells us that two E-IDs are not EID*
  - Repeat the greedy selection until EID* is distinguishable

      EID*  EID1  EID2  EID3
  e1  1     1     0     0
  e2  1     0     1     1
  e3  1     0     1     0

EDP (cont'd)
- Approximation results can be achieved with the greedy heuristic algorithm for the set cover problem
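The greedy selection described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `greedy_edp`, the set-based frame representation, and the tie-breaking rule (first frame in insertion order) are all choices made here.

```python
from typing import Dict, List, Set


def greedy_edp(e_frames: Dict[str, Set[str]], target: str) -> List[str]:
    """Greedily pick E frames until the target is the only E-ID common to all of them."""
    # Only frames that contain the target can take part in the intersection.
    candidates = {name: ids for name, ids in e_frames.items() if target in ids}
    # E-IDs that cannot yet be told apart from the target.
    confusable: Set[str] = set().union(*candidates.values()) - {target}
    chosen: List[str] = []
    while confusable and candidates:
        # Frame that excludes the most still-confusable E-IDs (ties: first in order).
        best = max(candidates, key=lambda name: len(confusable - candidates[name]))
        gained = confusable - candidates[best]
        if not gained:
            break  # the remaining E-IDs co-occur with the target in every frame
        chosen.append(best)
        confusable -= gained
        del candidates[best]
    return chosen


e_frames = {
    "e1": {"EID*", "EID1"},
    "e2": {"EID*", "EID2", "EID3"},
    "e3": {"EID*", "EID2"},
}
chosen = greedy_edp(e_frames, "EID*")
print(chosen)
```

On the slide's example the first pick is e1 (it rules out EID2 and EID3), and one more frame is then enough to rule out EID1. If some E-ID appears together with EID* in every frame, the loop stops early and the caller should check whether the intersection of the chosen frames is really just the target.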
V-Retrieval
- General problem
  - Find the corresponding VID* from the frames selected by E-Filtering
  - VID* is the only V-ID that should appear in all the frames after E-Filtering, so an intersection operation can give VID*
- Largest challenge
  - Indistinct V-IDs: we do not know for sure which person is which in different frames
- Solution
  - nBM algorithm: find the VID with the largest probability of appearing in all V frames
The nBM Algorithm
- n-partite Best Match Problem (nBM)
  - Find the VID* that best matches the visual appearance of EID*
- Put all VIDs in different frames in n different circles
  - n-partite graph (figure)
- Decide whether a VID appears in each V frame based on similarity scores
- Use the Maximum Likelihood criterion to choose the VID whose appearance/disappearance agrees best with EID*
[Figure: n-partite graph over V frames v1, v2 and v3 with nodes VID1 to VID9 and edges labeled "Similar?". A dummy VID in a frame indicates that VID1 is not similar to any VID in that frame.]
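One way to read the Maximum Likelihood criterion is sketched below. It rests on assumptions that are not in the slides: similarity scores lie in [0, 1], the best score of a candidate in a frame is treated as the probability that the candidate appears there, frames are independent, and "no match" (the dummy VID) corresponds to the complementary probability. The function name `nbm_best_match` and all scores are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def nbm_best_match(
    similarity: Dict[str, List[List[float]]],
    eid_present: Sequence[bool],
    eps: float = 1e-6,
) -> Optional[str]:
    """Return the candidate whose appearance pattern agrees best with EID*.

    similarity[c][j] holds the similarity scores between candidate c and
    every V-ID detected in V frame j; eid_present[j] says whether EID*
    was sensed in the matching E frame.
    """
    best, best_ll = None, -math.inf
    for cand, per_frame in similarity.items():
        log_likelihood = 0.0
        for scores, present in zip(per_frame, eid_present):
            # ASSUMPTION: best similarity approximates P(candidate is in this frame).
            p = max(scores, default=0.0)
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            log_likelihood += math.log(p if present else 1.0 - p)
        if log_likelihood > best_ll:
            best, best_ll = cand, log_likelihood
    return best


# Hypothetical scores for two candidates over three V frames.
similarity = {
    "A": [[0.9, 0.2], [0.8, 0.1], [0.1, 0.2]],
    "B": [[0.3, 0.9], [0.2, 0.3], [0.7]],
}
eid_present = [True, True, False]  # EID* seen in frames 1 and 2, absent in frame 3
winner = nbm_best_match(similarity, eid_present)
print(winner)
```

Candidate A is present where EID* is present and absent where EID* is absent, so it scores higher than B, whose pattern disagrees in two frames.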
Practical Considerations
- In the baseline case, we assumed that the information about E-IDs and V-IDs is complete
- However, in realistic cases, we may have
  - Ghost V-IDs or missing V-IDs
  - Missing E-IDs

                            Input Target      Input Frames
                            EID*    VID*      EIDs    VIDs
Vagueness      Clear        √ □               √ □
               Vague                √ □               √ □
Completeness   Complete                       √       √
               Incomplete                     □       □
(√ the baseline case we have studied; □ the practical case of our focus)
Solutions to Practical Problems
- Careful deployment
  - Make sure that the coverage of the cameras and that of the wireless detectors are roughly the same
- nBM is probability-based, so it is naturally resistant to noise
  - Select an appropriate threshold in nBM for a better tradeoff between noise resistance and performance
- Generalized EDP (GEDP)
  - Handles missing/ghost E-IDs
  - Introduces fuzzy logic to improve the robustness of EDP
  - Uses RSSI for estimation and smoothing

       EID*   EID1   EID2   EID3   EID4
  e1'  0.98   0.95   0.1    0.01   0.06
  e2'  0.9    0.01   1      0.94   0.04
  e3'  0.88   0.99   0.03   0.1    0.12
  e4'  0.99   0.02   0.89   0.27   0.23

[Figure: presence of EIDi (1 or 0) over time, before and after smoothing.]
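To illustrate how fuzzy membership values can replace the 0/1 matrix, the sketch below intersects frames with the minimum operator, a common fuzzy AND. The slides do not say which operator GEDP uses, so the minimum and the 0.5 threshold are assumptions; the membership values are the ones in the table above.

```python
ids = ["EID*", "EID1", "EID2", "EID3", "EID4"]
membership = {
    "e1'": [0.98, 0.95, 0.10, 0.01, 0.06],
    "e2'": [0.90, 0.01, 1.00, 0.94, 0.04],
    "e3'": [0.88, 0.99, 0.03, 0.10, 0.12],
    "e4'": [0.99, 0.02, 0.89, 0.27, 0.23],
}


def fuzzy_intersection(frames):
    """Fuzzy AND (minimum) of the membership degrees across the chosen frames."""
    rows = [membership[f] for f in frames]
    return {eid: min(column) for eid, column in zip(ids, zip(*rows))}


THRESHOLD = 0.5  # ASSUMPTION: illustrative cut-off, not a value from the slides

result = fuzzy_intersection(["e1'", "e2'"])
survivors = [eid for eid, degree in result.items() if degree >= THRESHOLD]
print(result)
print(survivors)
```

With frames e1' and e2', only EID* keeps a high membership degree, mirroring the crisp case where two frames were enough.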
A Quick Recap of Our Solutions
                       ID Complete    ID Incomplete
E-Filtering on EIDs    EDP            GEDP
V-Retrieval on VIDs    nBM            nBM + Deployment
Implementation
- Real-world implementation
  - One camera viewing from above to collect V frames
  - 1-3 laptops nearby sniffing the WiFi traffic to collect E frames
- Tested on campus
  - Gymnasium
  - Library
Experimental Evaluations
- Real-world experiments
  - Successfully found VID*
  - Scenario 1 (gymnasium, 6 people, 28 frames): the minimum number of frames needed is 3, and we achieve 3
  - Scenario 2 (library, 8 people, 40 frames): the minimum number of frames needed is 3, and we achieve 4
Large Scale Simulation-based Evaluations
- Evaluation settings
  - Networks of cameras and wireless detectors at three locations
  - ~120 people moving randomly
- Far fewer video frames to process (left plot)
- High accuracy (right plot)
E-V Surveillance: Problem Space
[Figure: problem space with the labels Tracking, Onsite, Offline, Cooperative and Uncooperative.]
Final Remarks
- Existing visual surveillance systems are not efficient
- Our E-V system
  - Integrates E signals and V signals for efficient visual surveillance
  - Has been implemented in the real world
- Many open issues remain; there is still a long way to go
References
[1] Big Apple is Watching You: http://www.slate.com/articles/news_and_politics/explainer/2010/05/big_apple_is_watching_you.html
[2] http://articles.chicagotribune.com/2010-05-06/news/ct-oped-0506-chapman-20100506_1_surveillance-cameras-vandalism-effect-on-violent-crime
[3] http://news.bbc.co.uk/2/hi/4659093.stm
[4] D. Smith, et al., "Approaches to Multisensor Data Fusion in Target Tracking: A Survey", IEEE Transactions on Knowledge and Data Engineering, 2006.
[5] S. Cho, et al., "Association and Identification in Heterogeneous Sensors Environment with Coverage Uncertainty", IEEE Advanced Video and Signal Based Surveillance, 2009.
Backup Slides
A Case Study
- A typical surveillance scenario
- Problem formulation in E-V integration
- Our solution
- Implementation and evaluations
GEDP Algorithm
- Clearly NP-hard
  - We can reduce EDP to GEDP
- Heuristic algorithm based on the subset sum approximation algorithm
The nBM Algorithm
- n-partite Best Match Problem (nBM)
  - Find the VID* that best matches the visual appearance of EID*
- Put all VIDs in different frames in n different circles
  - n-partite graph (figure)
- Similarity matrix for all V-IDs that have appeared
[Figure: n-partite graph over V frames v1, v2 and v3 with nodes VID1 to VID9.]
nBM (cont'd)
- Maximum Likelihood matching
  - Given the observed VID1 … VIDm, which VID is the best candidate?
  - Calculate the probability of each VIDi across all V frames
  - Select the VID with the largest probability
[Figure: two hypotheses for VID1 in frame v2: "VID1 is not in v2", or "VID1 is in v2, and appears as VID2".]
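The maximum likelihood selection can be written compactly under some assumptions that the slides do not spell out: p_{i,j} denotes an estimated probability that candidate VID_i appears in V frame v_j (for example, derived from similarity scores), a_j is 1 if EID* was sensed in the matching E frame and 0 otherwise, and frames are treated as independent. Under those assumptions, one possible formulation is:

```latex
% ASSUMPTIONS: p_{i,j}, a_j and frame independence are introduced here for
% illustration; they are not notation from the slides.
\[
  L(\mathrm{VID}_i) \;=\; \prod_{j=1}^{n} p_{i,j}^{\,a_j}\,\bigl(1 - p_{i,j}\bigr)^{1 - a_j},
  \qquad
  \mathrm{VID}^{*} \;=\; \arg\max_{i}\; L(\mathrm{VID}_i)
\]
```

The two hypotheses in the figure correspond to the two factors: "VID1 is in v2" contributes p_{1,2}, and "VID1 is not in v2" contributes 1 - p_{1,2}.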