Sequential Adaptive Sensor Management – A. Hero
• Sequential: only one sensor deployed at a time
• Adaptive: next sensor selection based on present and past measurements
• Multi-modality: sensor modes can be switched at each time
• Detection/Classification/Tracking: task is to minimize decision error
• Centralized decision making: sensor has access to entire set of previous measurements
• Smart targets: may hide from active sensor
Single-target state vector:x
y
System Block Diagram
[Figure: block diagram. Blocks: Sensor Scheduler, Pre-processor, Feature Selector/Extractor, Feature Mapper, Performance Monitor/Predictor, Detector/Classifier. Signal labels: a1, a3, Actions, Prediction, Confidence, Decisions.]
Adaptive Sequential Acquisition
• Sensor acquires data having density
• Adaptive sensor scheduling
• Sensor selection criteria: design to
– Minimize predicted MSE, Pe, (Pm, Pf), time-to-detect, etc.
– Maximize predicted information gain (Kreucher&etal:IPSN03)
[Figure: panels labeled k=1, k=2, k=3, k=3]
Time update: evolve density according to the Chapman-Kolmogorov equation
Progress (since June 04)
• Developed novel multitarget particle filter to represent the JMPD and propagate through time
• Developed method of adaptively factorizing the JMPD when applicable to allow for computationally tractable proposals
• Developed interacting multiple model formulation
• Studied the effect of mismatch in target motion models on filter performance
• Developed an importance density method for simultaneous detection and tracking that accounts for target arrival and removal
• Developed sensor models based on realistic GMTI, ATR, and SAR sensors
• Developed model for multimodality sensor that provides both kinematic and identification information and used it for simultaneous detection, tracking, and ID of 10 real targets
Multitarget Tracking via a Particle Filter Representation of the JMPD
Measurement Update density via Bayes’ Rule
Propagate Particles Forward in Time
Add/Remove Partitions to Particles to account for target birth/death
Update particle weights based on measurements z
Resample
Predict information gain for each possible sensing action
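The propagate / weight-update / resample cycle listed above can be illustrated with a deliberately simplified sketch. It is not the multitarget JMPD filter from these slides: it tracks a single 1-D target, omits the add/remove-partition step for target birth and death, and uses plain multinomial resampling. The function names (`pf_step`, `run_demo`), the random-walk motion model, the Gaussian likelihood, and the noise levels `q` and `r` are all assumptions made for illustration.

```python
import math
import random


def pf_step(particles, weights, z, rng, q=0.2, r=1.0):
    """One propagate / reweight / resample cycle for a 1-D random-walk target.

    particles: list of float positions; weights: normalized list of floats;
    z: scalar position measurement; q, r: assumed process and measurement
    standard deviations (hypothetical values, not from the slides).
    """
    # Time update: push each particle through the assumed diffusive motion model.
    particles = [x + rng.gauss(0.0, q) for x in particles]
    # Measurement update: multiply each weight by the likelihood p(z | x).
    weights = [w * math.exp(-0.5 * ((z - x) / r) ** 2)
               for x, w in zip(particles, weights)]
    total = sum(weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0 / len(particles)] * len(particles)
    else:
        weights = [w / total for w in weights]
    # Resample: draw particles in proportion to weight, then reset the weights.
    particles = rng.choices(particles, weights=weights, k=len(particles))
    weights = [1.0 / len(particles)] * len(particles)
    return particles, weights


def run_demo(seed=0, n=500, steps=20, truth=3.0):
    """Track a stationary target from noisy measurements; return the estimate."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    weights = [1.0 / n] * n
    for _ in range(steps):
        z = truth + rng.gauss(0.0, 1.0)
        particles, weights = pf_step(particles, weights, z, rng)
    # Posterior-mean estimate of the target position.
    return sum(x * w for x, w in zip(particles, weights))
```

Calling `run_demo()` returns a posterior-mean position estimate that should settle near the assumed true position of 3.0. Extending this to the JMPD would mean each particle carries a target count plus one state partition per target.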
Progress (since June 04)
• Developed a method of information prediction based on computing the Expected Renyi Divergence between prior and posterior JMPD
• Implemented method using particle filter representation of the JMPD
• Studied the effect of mismatch in target motion models on filter performance
• Compared “task-driven” optimization to “information-driven” optimization
• Developed value-to-go approximation for tractable approximate non-myopic scheduling
• Developed reinforcement learning methods for non-myopic scheduling and applied to “smart” target problem using a multi-modality sensor
• Simulated sensor management for simultaneous detect/track and ID with multi-modality sensor
Information Based Sensor Resource Allocation
Time update the JMPD
Compute expected information gain between time updated
JMPD and time/measurement updated JMPD
Make best observation
Measurement update the JMPD
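The allocation loop above (time update, compute expected gain per action, make the best observation, measurement update) can be sketched on a small discrete grid. This is a toy under stated assumptions, not the particle-based JMPD computation from the slides: there is a single target, each action looks at one cell and returns a thresholded detection z in {0, 1}, and the values `pd=0.9`, `pf=0.01`, and `alpha=0.5` are hypothetical. The divergence used is the standard Rényi form D_α(p1‖p0) = (1/(α−1)) ln Σ p1^α p0^(1−α), with p1 the posterior and p0 the prior.

```python
import math


def renyi_divergence(p1, p0, alpha=0.5):
    """Renyi alpha-divergence of p1 from p0 on a discrete grid (0 < alpha < 1)."""
    s = sum((a ** alpha) * (b ** (1.0 - alpha))
            for a, b in zip(p1, p0) if a > 0.0 and b > 0.0)
    return math.log(s) / (alpha - 1.0)


def posterior(prior, cell, z, pd, pf):
    """Bayes update after looking at `cell` and observing z in {0, 1}.

    Returns the posterior over cells and the predicted probability of z.
    """
    likelihood = []
    for i in range(len(prior)):
        p_detect = pd if i == cell else pf  # chance of z = 1 if target is in cell i
        likelihood.append(p_detect if z == 1 else 1.0 - p_detect)
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    p_z = sum(unnormalized)
    return [u / p_z for u in unnormalized], p_z


def expected_gain(prior, cell, pd=0.9, pf=0.01, alpha=0.5):
    """Expected Renyi divergence between posterior and prior for one action."""
    gain = 0.0
    for z in (0, 1):
        post, p_z = posterior(prior, cell, z, pd, pf)
        gain += p_z * renyi_divergence(post, prior, alpha)
    return gain


def best_action(prior, pd=0.9, pf=0.01, alpha=0.5):
    """Index of the cell whose measurement has the largest expected gain."""
    return max(range(len(prior)),
               key=lambda c: expected_gain(prior, c, pd, pf, alpha))
```

With a prior such as `[0.05, 0.15, 0.5, 0.3]`, the largest expected gain in this toy model comes from the cell holding the most uncertain mass, so `best_action` selects index 2.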
Progress Highlighted Today
1. Particle Filtering for simultaneous detection, tracking, and identification (Kreucher&Etal Aerospace2005)
2. Investigation of sensitivity to model mismatch
3. Multi-modality non-myopic sensor management via Reinforcement Learning and Value-to-go Approximation (Kreucher&Hero:ICASSP2005)
4. Optimal multi-stage design of experiments for adaptive waveform design (Rangarajan&etal:ICASSP2005)
Progress 1: PF for Simultaneous Detection, Tracking and Identification
• JMPD formulation simultaneously addresses detection, tracking and identification
• Until recently, our PF implementation has ignored the detection problem
– Problem becomes significantly more complicated when target number is unknown and time varying
– There is a non-zero probability for a new target arriving at each position within the surveillance area (leads to exponential explosion of possibilities)
– Particle filter implementation must use an importance density that efficiently samples from distributions on target number and target state
• Solution is a measurement-directed importance density that is biased towards proposing new targets in areas of high (accumulated) likelihood and is biased toward removing targets in areas of low likelihood
• This extension allows us to solve the complete problem (target detection, tracking and identification via sensor management) with no initial knowledge about the number and states of the targets.
Simultaneous Detection, Tracking and Identification
• Simulation result
– No tip-offs at startup
• Unknown number of targets
• Unknown position & velocities
– Goal is to detect and track the ten real targets
• Monte Carlo testing on the algorithm
– Performance measured in two ways:
• The number of targets correctly detected and tracked versus time (true number of targets is 10)
• The filter estimate of target number versus time (true number of targets is 10)
Simultaneous Detection, Tracking and Identification
• Simulation result
– No tip-offs about anything at startup
• Unknown number of targets
• Unknown position, velocity, ID
– Goal is to detect, track and identify the ten real targets
• Performance measured in two ways:
– The number of targets correctly detected and tracked versus time (truth is 10)
– The filter estimate of target number versus time (truth is 10)
Progress 2: Effect of model mismatch
Approach
• We investigate the effect of mismatch between the filter estimate of SNR and the actual SNR
• Experiment: 10 (real) targets with myopic SM.
• CFAR detection with pf = 0.001 and $p_d = p_f^{1/(1+\mathrm{SNR}\cdot M)}$
– i.e., Rayleigh-distributed energy returns from both background and signal. Threshold set for pf = 0.001.
– For a constant pf, SNR determines pd
• Filter has an estimate of SNR (and hence pd) and uses this for SM and filtering. What is the effect on tracking of erroneous SNR info?
• Bottom line: Filter appears quite robust to mismatch in SNR, pd, pf, target model.
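As a worked illustration of the CFAR relation used in this study, the sketch below evaluates pd from pf and SNR, and shows how a filter that misestimates SNR ends up with a wrong pd. The function names are hypothetical. The factor `m` mirrors the `M` that multiplies SNR on the slide; its meaning is not defined there, so it is left at 1 here as an assumption. SNR is taken to be a linear ratio, not dB.

```python
def detection_probability(pf, snr, m=1.0):
    """pd = pf ** (1 / (1 + snr * m)) for Rayleigh-distributed returns.

    pf: false-alarm probability set by the CFAR threshold.
    snr: signal-to-noise ratio as a linear ratio (assumption: not in dB).
    m: multiplier on SNR copied from the slide; undefined there, defaults to 1.
    """
    return pf ** (1.0 / (1.0 + snr * m))


def pd_mismatch(pf, true_snr, assumed_snr, m=1.0):
    """Return (true pd, pd the filter believes) under an SNR mismatch."""
    return (detection_probability(pf, true_snr, m),
            detection_probability(pf, assumed_snr, m))
```

For example, with pf = 0.001 a linear SNR of 9 gives pd = 0.001^(1/10), roughly 0.50, while an SNR of 0 gives pd = pf. A filter that assumes a higher SNR than the true one therefore overestimates pd.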
Effect of Pd, Pf mismatch
• We use a sensor model: p(y|S,a)
– For thresholded GMTI returns, this is characterized by Pd and Pf
• Simulation: 10 (real) targets tracking and (myopic) sensor management.
– How does misestimating Pd & Pf affect performance?
Effect of dynamic model mismatch
• Diffusive target model p(Sk,Tk|Sk-1,Tk-1) includes models of how individual targets move and how targets arrive and leave surveillance region
– We have been in a mismatch scenario all along since we use real targets
– This study quantifies how mismatch in motion model affects performance
[Figure: normalized tracking error (ratio of tracking error with mismatch to tracking error when matched), plotted against true diffusivity of the targets and mismatch of the filter (measured as amount of overestimation)]
Progress 3: Non-Myopic Sensor Management
• There are many situations where long-term planning provides benefit
– Sensor platform motion creates time varying sensor/target visibility
• Sensor/target line of sight may change resulting in targets becoming obscured
• Delay measuring targets that will remain visible in order to interrogate targets that are predicted to become obscured
– Convoy movement may involve targets that overtake/pass one another
• Targets may become closely spaced (and unresolvable to the sensor)
• Plan ahead to measure targets before they become unresolvable to the sensor
– Crossing targets become unresolvable to the sensor
• Sensor resolution may prohibit successful target identification if targets are too close together
• Plan ahead to identify targets before they become too close
• Planning ahead in these situations allows better prediction of reemergence point, target trajectory, target intention
Relevant Multi-target Tracking Scenario
[Figure: six panels, Time 1 through Time 6, showing sensor position, region of interest, shadowed target, and visible target. Annotation: extra dwells at time 1 help predict where target reemerges at time 6; these dwells are not made by myopic strategy.]
Non-myopic strategy scans regions that will become obscured while deferring regions that will remain visible in the future.
Value Function Approximation
The Bellman equation describes the value of an action in terms of the immediate (myopic) benefit and the long-term (non-myopic) benefit.
Bellman equation (terms labeled on the slide: value of state; myopic part of V under action a; non-myopic correction under a):
I. VTG approximation:
II. Linear Q-learning approximation:
– Generates s, a, s’, r
– Calculates s, a, s’, Qest, with $Q_{est} = r + \max_{a'} Q_k(s', a')$
– Update Qk to Qk+1
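The Q-learning step outlined above (form the target Qest from r and the best next-state value, then move Qk toward it) can be sketched with one linear Q-function per action. This is an incremental-update sketch under stated assumptions, not the batch training procedure the slides describe: the feature vectors, the learning rate `lr`, and the discount factor `gamma` are hypothetical, and `gamma=1.0` corresponds to a target with no discounting.

```python
def q_value(theta, features, action):
    """Linear Q-function: Q(s, a) = theta[a] . phi(s)."""
    return sum(t * f for t, f in zip(theta[action], features))


def q_target(theta, reward, next_features, gamma=1.0):
    """Qest = r + gamma * max over a' of Q_k(s', a')."""
    best_next = max(q_value(theta, next_features, a) for a in range(len(theta)))
    return reward + gamma * best_next


def q_update(theta, features, action, reward, next_features, gamma=1.0, lr=0.1):
    """Return Q_{k+1} weights after one step toward Qest for the action taken."""
    error = q_target(theta, reward, next_features, gamma) - q_value(theta, features, action)
    new_theta = [row[:] for row in theta]  # copy so Q_k is left unmodified
    new_theta[action] = [t + lr * error * f
                         for t, f in zip(theta[action], features)]
    return new_theta
```

Only the weight vector of the action actually taken changes in each update, which matches the idea of estimating one Q-function per possible action.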
Example: Two Real Targets
• Target trajectories taken from real, recorded data
– 2 moving ground targets
– Need to estimate the position and velocity in x and y (4-d state vector for each target)
• Time varying visibility taken from real elevation map & simulated platform trajectory
• Sensor decides where to steer an agile antenna and illuminates a 100m x 100m patch on the ground. Thresholded measurements indicate the presence or absence of a target (with pd and pfa)
• At initialization, the filter knows only that the target position is in a 300m x 500m area on the ground (i.e. the prior for target position is uniform over this region)
Comparing the Management Strategies
Algorithm             Time for Training   Time for Testing
Random                -                   0.04s / second
Myopic                -                   0.12s / second
Non-myopic via VTG    -                   0.37s / second
Non-myopic via RL     ~50 hours           0.60s / second
Non-myopic via RL timing
• Generate training episodes:
– (50 timesteps x 0.5s/second + 10s fixed cost per episode) * 2000 episodes ≈ 1200 minutes
• Batch training:
– 36 possible actions (Q-functions to estimate) x 20 minutes per action = 720 minutes
• Update value of Q function (i.e. 2nd pass): 500 minutes
• Batch train on second pass: 720 minutes
We suspect that the training time for the RL algorithm could be reduced (perhaps by even an order of magnitude) with a C-based implementation.
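The timing figures above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given on the slide. The variable names are hypothetical, and it assumes each timestep costs 0.5 s of computation.

```python
# Episode generation: 50 timesteps at 0.5 s each plus a 10 s fixed cost, 2000 episodes.
seconds_per_episode = 50 * 0.5 + 10                      # 35 s per episode
generate_minutes_exact = seconds_per_episode * 2000 / 60  # about 1167 minutes
generate_minutes_slide = 1200                             # rounded figure quoted on the slide

# Batch training: one Q-function per action, 36 actions at 20 minutes each.
batch_minutes = 36 * 20                                   # 720 minutes

second_pass_update_minutes = 500                          # updating Q values (2nd pass)
second_pass_batch_minutes = 720                           # batch training on the 2nd pass

total_minutes = (generate_minutes_slide + batch_minutes
                 + second_pass_update_minutes + second_pass_batch_minutes)
total_hours = total_minutes / 60                          # 3140 minutes, about 52 hours
```

The total of 3140 minutes is a little over 52 hours, consistent with the "~50 hours" training time listed for the RL strategy.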
Example : Multiple Modality Sensor
• A sensor has two waveforms
– Waveform 1 (X-band) has good detection performance but is susceptible to line of sight visibility
– Waveform 2 (HF) has poorer detection performance but is not susceptible to visibility
• The platform is moving and so sensor to ground visibility changes with time
• The filter is to detect and track a target in the surveillance area
– No information about target location a priori
– Q-learning used to learn the best non-myopic policy
Progress 4: Optimal Experimental Design
• Upper left box - Beam scheduling, waveform selection, beam steering operator, and transmission into the medium, denoted by channel function
• Right side box - Processes received signals and retransmits.
• Lower left box - Processes output after reinsertion.
Motivation
• Imaging a medium using an array of sensors.
• Widely studied in mine detection, ultrasonic medical imaging, foliage penetrating radar, nondestructive testing, and active audio.
• GOAL: Optimally design a sequence of measurements to image a medium of multiple scatterers using an array of transducers.
• Four signal processing steps:
1. Transmission of time varying signals into the medium.
2. Recording of backscattered field from medium.
3. Transmission of the processed backscatter signals.
4. Measurement and spatial filtering of backscattered signals.
Mathematical Description
• Channel between transmitted field and received backscattered field,
• Four signal processing steps
• where receiver noises are i.i.d
• Design objective: minimize MSE under transmitted energy constraint
Analytical Results
• Constraint:
• Nearly optimal design:
• MSE improvement factor:
Comments and Extensions
• Results are robust to variation of estimator error residual, especially at low SNR
• Results apply to 2-stage min MSE design under average energy constraint when Green's function is known and non-random
• Analytical results for multi-stage (>2) waveform design?
• Random (Rayleigh/Rician) media?
• Extension to non-quadratic objective functions?
• Classification, detection, regularized image reconstruction?
Pubs Since June 2004
• Sequential adaptive sensor management
– “Adaptive Multi-modality Sensor Scheduling for Detection and Tracking of Smart Targets,” C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Nov. 2004
– “Sensor Management Using An Active Sensing Approach,” C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Oct. 2004
– “Multitarget tracking using a particle filter representation of the joint multi-target probability density,” C. Kreucher, K. Kastella, and A. Hero, accepted for publication, Sept. 2004
– “Efficient methods of non-myopic sensor management for multitarget tracking,” C. Kreucher, A. Hero, K. Kastella, and D. Chang, 43rd IEEE Conference on Decision and Control, December 2004
– “Multiplatform Information based Sensor Management,” C. Kreucher, A. Hero, and K. Kastella, to appear at SPIE Defense and Security Symposium, March 2005
– “Non-myopic Approaches to Scheduling Agile Sensors for Multitarget Detection, Tracking, and Identification,” C. Kreucher and A. Hero, to appear at IEEE ICASSP, March 2005
– “Particle Filtering for Multitarget Detection and Tracking,” C. Kreucher, M. Morelande, A. Hero, and K. Kastella, to appear at IEEE Aerospace Conference, March 2005
Pubs Since June 2004 (ctd)
• Iterative function optimization
– “A convergent incremental gradient algorithm with constant stepsize,” D. Blatt, A. Hero, H. Gauchman, SIAM Optimization, submitted Sept. 2004
– “Convergent incremental optimization transfer algorithms,” S. Ahn, J. Fessler, D. Blatt, A. Hero, IEEE Trans. on Medical Imaging, submitted Oct. 2004
• Predicting model mismatch
– “Tests for global maximum of the likelihood function,” D. Blatt and A. O. Hero, Proc. of ICASSP, Philadelphia, March 2005
– “On tests for global maximum of the log-likelihood function,” D. Blatt and A. O. Hero, IEEE Trans. on Info Theory, submitted Jan. 2005
• Sequential waveform scheduling
– “Optimal experimental design for an inverse scattering problem,” R. Rangarajan, R. Raich, and A. O. Hero, to appear in Proc. of ICASSP, Philadelphia, March 2005
Synergistic Activities and Awards (2003-2004)
• General Dynamics Medal Paper Award
– C. Kreucher, K. Kastella, and A. O. Hero, “Multitarget sensor management using alpha divergence measures,” Proc. First IEEE Conference on Information Processing in Sensor Networks, Palo Alto, April 2003
• A. Hero plenary speaker: EMM-CVPR-03, ASP-03, EUSIPCO-04, ICASSP-05, SSP-05
• General Dynamics, Inc.
– K. Kastella: collaboration with A. Hero in sensor management, July 2002-
– C. Kreucher: doctoral student of A. Hero, Sept. 2002-2004
• ARL
– ARLTAB oversight: A. Hero is member 2004-
– ARL SEDD: A. Hero is member of yearly review panel, May 2002-
– NAS-Robotics: A. Hero chaired cross-cutting review panel, May 2004
– B. Sadler: N. Patwari (doctoral student of A. Hero) internship in distributed sensor information processing, summer 2003
• ERIM Intl.
– B. Thelen & N. Subotic: H. Neemuchwala (Hero’s PhD student) internship in applying entropic graphs to pattern classification, summer 2003
• Chalmers Univ., Sweden
– M. Viberg: A. Hero was Opponent on multimodality landmine detection doctoral thesis, Aug 2003
Transitions
• PF/SM to ISP Phase II (Schmidt at Raytheon)
• MRF backscatter modeling to GD (Kastella/Onstott)
• SM to NSF-ITR (UM, UW, BU)
• SM approaches integrated into
– Dynamic Machine Learning (Prof. Satinder Singh/Chris Kreucher)
– Generalization error (Prof. Susan Murphy/Doron Blatt)
• Collaboration with Prof. Hillel Gauchman (UIUC Math) on distributed optimization
• Collaboration with GD on Willow Run experiment for multi-modal tracking of dismounts and vehicles
Personnel on A. Hero’s sub-Project (2003-2004)
• Chris Kreucher, 4th year grad student
– UM-Dearborn
– General Dynamics Sponsorship
• Neal Patwari, 3rd year doctoral student
– Virginia Tech
– NSF Graduate Fellowship/MURI GSRA
• Doron Blatt, 3rd year doctoral student
– Univ. Tel Aviv
– Dept. Fellowship/MURI GSRA
• Raghuram Rangarajan, 3rd year doctoral student
– IIT Madras
– Dept. Fellowship/MURI GSRA