Sequential Adaptive Sensor Management – A. Hero

• Sequential: only one sensor deployed at a time

• Adaptive: next sensor selection based on present and past measurements

• Multi-modality: sensor modes can be switched at each time

• Detection/Classification/Tracking: task is to minimize decision error

• Centralized decision making: sensor has access to entire set of previous measurements

• Smart targets: may hide from active sensor

Single-target state vector: x

System Block Diagram

[Figure: block diagram. Blocks: Sensor Scheduler, Pre-processor, Feature Selector/Extractor, Feature Mapper, Performance Monitor/Predictor, Detector/Classifier. Labeled signals: measurements y; actions a1, a3; prediction; confidence; decisions.]

Adaptive Sequential Acquisition

• Sensor acquires data having density

• Adaptive sensor scheduling

• Sensor selection criteria: design to

– Minimize predicted MSE, Pe, (Pm, Pf), time-to-detect, etc.

– Maximize predicted information gain (Kreucher&etal:IPSN03):

[Figure: panels labeled k=1, k=2, k=3, k=3]

Time update: evolve the density according to the Chapman-Kolmogorov equation
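The Chapman-Kolmogorov time update predicts the density forward as p(x_k | Z_{k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}. A minimal sketch on a discrete 1-D grid is given here for illustration only: the grid, the random-walk motion model, and the names `time_update` and `random_walk_transition` are assumptions, not taken from the slides.

```python
def time_update(prior, transition):
    """Discrete Chapman-Kolmogorov step.

    predicted[j] = sum_i transition[i][j] * prior[i],
    where transition[i][j] = P(next cell = j | current cell = i).
    """
    n = len(prior)
    predicted = [0.0] * n
    for i, p_i in enumerate(prior):
        for j in range(n):
            predicted[j] += transition[i][j] * p_i
    return predicted


def random_walk_transition(n, stay=0.6):
    """Hypothetical motion model: stay put with probability `stay`,
    otherwise move one cell left or right with equal probability."""
    move = (1.0 - stay) / 2.0
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] += stay
        T[i][max(i - 1, 0)] += move      # mass that would leave the grid stays at the left edge
        T[i][min(i + 1, n - 1)] += move  # same at the right edge
    return T


# Target known to be in the middle cell; after one step the density spreads out.
prior = [0.0, 0.0, 1.0, 0.0, 0.0]
predicted = time_update(prior, random_walk_transition(5))
```

Each row of the transition matrix sums to one, so the predicted density remains normalized.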

Progress (since June 04)

• Developed a novel multitarget particle filter to represent the joint multitarget probability density (JMPD) and propagate it through time

• Developed method of adaptively factorizing the JMPD when applicable to allow for computationally tractable proposals

• Developed interacting multiple model formulation

• Studied the effect of mismatch in target motion models on filter performance

• Developed an importance density method for simultaneous detection and tracking that accounts for target arrival and removal

• Developed sensor models based on realistic GMTI, ATR, and SAR sensors

• Developed a model for a multimodality sensor that provides both kinematic and identification information, and used it for simultaneous detection, tracking, and identification of 10 real targets

Multitarget Tracking via a Particle Filter Representation of the JMPD

Measurement Update density via Bayes’ Rule

Propagate Particles Forward in Time

Add/Remove Partitions to Particles to account for target birth/death

Update particle weights based on measurements z

Resample

Predict information gain for each possible sensing action
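The propagate / reweight / resample cycle listed above can be sketched for a single target in one dimension. This is a generic bootstrap particle filter, not the multitarget JMPD filter itself: it omits partitions and target birth/death, and the random-walk motion model, Gaussian measurement likelihood, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def propagate(particles, step_std, rng):
    """Time update: diffuse each particle with an assumed random-walk motion model."""
    return [x + rng.gauss(0.0, step_std) for x in particles]


def reweight(particles, weights, z, meas_std):
    """Measurement update: scale weights by an assumed Gaussian likelihood p(z | x)."""
    new = [w * math.exp(-0.5 * ((z - x) / meas_std) ** 2)
           for x, w in zip(particles, weights)]
    total = sum(new)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        return [1.0 / len(new)] * len(new)
    return [w / total for w in new]


def resample(particles, weights, rng):
    """Multinomial resampling: draw N particles with probability proportional to weight."""
    n = len(particles)
    return rng.choices(particles, weights=weights, k=n), [1.0 / n] * n


def pf_step(particles, weights, z, rng, step_std=1.0, meas_std=2.0):
    particles = propagate(particles, step_std, rng)
    weights = reweight(particles, weights, z, meas_std)
    return resample(particles, weights, rng)


rng = random.Random(0)
particles = [rng.uniform(-50.0, 50.0) for _ in range(2000)]  # uniform prior on position
weights = [1.0 / len(particles)] * len(particles)
for z in [10.0, 10.5, 11.0, 11.5]:  # made-up position measurements
    particles, weights = pf_step(particles, weights, z, rng)
estimate = sum(particles) / len(particles)  # posterior mean position
```

Resampling at every step keeps the sketch short; resampling only when the effective sample size drops is a common alternative.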

Progress (since June 04)

• Developed a method of information prediction based on computing the expected Rényi divergence between the prior and posterior JMPD

• Implemented method using particle filter representation of the JMPD

• Studied the effect of mismatch in target motion models on filter performance

• Compared “task-driven” optimization to “information-driven” optimization

• Developed value-to-go approximation for tractable approximate non-myopic scheduling

• Developed reinforcement learning methods for non-myopic scheduling and applied to “smart” target problem using a multi-modality sensor

• Simulated sensor management for simultaneous detect/track and ID with multi modality sensor

Information Based Sensor Resource Allocation

Time update the JMPD

Compute expected information gain between the time-updated JMPD and the time/measurement-updated JMPD

Make best observation

Measurement update the JMPD
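The loop above (predict the gain of each action, then take the best one) can be sketched for one target on a small grid. In this sketch the gain is the expected Rényi divergence D_α(posterior || prior) = (1/(α-1)) ln Σ posterior^α prior^(1-α), averaged over the two possible measurement outcomes. The grid, the binary detect/no-detect sensor with `pd` and `pf`, the choice α = 0.5, and all function names are assumptions for illustration.

```python
import math


def likelihood(y, target_in_cell, pd, pf):
    """P(y | target present or absent in the interrogated cell), y in {0, 1}."""
    p_detect = pd if target_in_cell else pf
    return p_detect if y == 1 else 1.0 - p_detect


def posterior(prior, cell, y, pd, pf):
    """Bayes update of the cell-occupancy density after looking at `cell`."""
    unnorm = [p * likelihood(y, i == cell, pd, pf) for i, p in enumerate(prior)]
    evidence = sum(unnorm)  # P(y | action)
    if evidence == 0.0:
        return None, 0.0
    return [u / evidence for u in unnorm], evidence


def renyi_divergence(p, q, alpha=0.5):
    """Renyi alpha-divergence between discrete densities p and q."""
    s = sum((pi ** alpha) * (qi ** (1.0 - alpha)) for pi, qi in zip(p, q) if qi > 0.0)
    return math.log(s) / (alpha - 1.0)


def expected_gain(prior, cell, pd, pf, alpha=0.5):
    """Average the divergence over both measurement outcomes, weighted by P(y | action)."""
    gain = 0.0
    for y in (0, 1):
        post, evidence = posterior(prior, cell, y, pd, pf)
        if post is not None:
            gain += evidence * renyi_divergence(post, prior, alpha)
    return gain


def best_action(prior, pd=0.9, pf=0.01):
    gains = [expected_gain(prior, c, pd, pf) for c in range(len(prior))]
    return max(range(len(prior)), key=gains.__getitem__), gains


action, gains = best_action([0.1, 0.8, 0.1])
```

This is a myopic (one-step) rule; it scores each action only by its immediate expected gain.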

Progress Highlighted Today

1. Particle Filtering for simultaneous detection, tracking, and identification (Kreucher&Etal Aerospace2005)

2. Investigation of sensitivity to model mismatch

3. Multi-modality non-myopic sensor management via Reinforcement Learning and Value-to-go Approximation (Kreucher&Hero:ICASSP2005)

4. Optimal multi-stage design of experiments for adaptive waveform design (Rangarajan&etal:ICASSP2005)

Progress 1: PF for Simultaneous Detection, Tracking and Identification

• JMPD formulation simultaneously addresses detection, tracking and identification

• Until recently, our PF implementation has ignored the detection problem

– Problem becomes significantly more complicated when target number is unknown and time varying

– There is a non-zero probability for a new target arriving at each position within the surveillance area (leads to exponential explosion of possibilities)

– Particle filter implementation must use an importance density that efficiently samples from distributions on target number and target state

• Solution is a measurement-directed importance density that is biased towards proposing new targets in areas of high (accumulated) likelihood and is biased toward removing targets in areas of low likelihood

• This extension allows us to solve the complete problem – target detection, tracking and identification via sensor management with no initial knowledge about the number and states of the targets.
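As a rough illustration of a measurement-directed proposal, the sketch below draws the cell for a proposed new target in proportion to a per-cell accumulated likelihood, then applies the standard importance-sampling correction (target density divided by proposal density). The grid, the accumulated-likelihood values, and the function names are hypothetical; the importance density actually used in the work may differ in detail.

```python
import random


def birth_proposal(accumulated_likelihood):
    """Normalize per-cell accumulated likelihood into a proposal over birth cells."""
    total = sum(accumulated_likelihood)
    n = len(accumulated_likelihood)
    if total == 0.0:
        return [1.0 / n] * n  # no evidence anywhere: propose uniformly
    return [v / total for v in accumulated_likelihood]


def propose_birth(accumulated_likelihood, rng):
    """Sample a birth cell and return it with its proposal probability."""
    q = birth_proposal(accumulated_likelihood)
    cell = rng.choices(range(len(q)), weights=q, k=1)[0]
    return cell, q[cell]


def importance_weight(prior_birth_prob, likelihood, proposal_prob):
    """Correct for the biased proposal: weight = prior * likelihood / proposal."""
    return prior_birth_prob * likelihood / proposal_prob


# Made-up evidence: cell 2 has accumulated far more likelihood than the others.
accumulated = [0.1, 0.1, 5.0, 0.1]
rng = random.Random(1)
draws = [propose_birth(accumulated, rng)[0] for _ in range(1000)]
```

Because the weight divides by the proposal probability, biasing proposals toward high-likelihood cells changes where samples are spent without changing the density being represented.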

Simultaneous Detection, Tracking and Identification

• Simulation result

– No tip-offs at startup

• Unknown number of targets

• Unknown position & velocities

– Goal is to detect and track the ten real targets

• Monte Carlo testing on the algorithm

– Performance measured in two ways:

• The number of targets correctly detected and tracked versus time (true number of targets is 10)

• The filter estimate of target number versus time (true number of targets is 10)

Simultaneous Detection, Tracking and Identification

• Simulation result

– No tip-offs about anything at startup

• Unknown number of targets

• Unknown position, velocity, ID

– Goal is to detect, track and identify the ten real targets

• Performance measured in two ways:

– The number of targets correctly detected and tracked versus time (truth is 10)

– The filter estimate of target number versus time (truth is 10)

Approach

• We investigate the effect of mismatch between the filter estimate of SNR and the actual SNR

• Experiment: 10 (real) targets with myopic SM.

• CFAR detection with pf = 0.001 and $p_d = p_f^{1/(1+\mathrm{SNR}\cdot M)}$

– i.e. Rayleigh-distributed energy returns from both background and signal. Threshold set for pf = 0.001.

– For a constant pf, SNR determines pd

• Filter has an estimate of SNR (and hence pd) and uses this for SM and filtering. What is the effect on tracking of erroneous SNR information?

• Bottom line: Filter appears quite robust to mismatch in SNR, pd, pf, target model.
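A short numerical sketch of the CFAR relation pd = pf^(1/(1 + SNR·M)) shows how an erroneous SNR estimate turns into an erroneous pd. It assumes SNR is a linear ratio (not dB) and treats M as the parameter named on the slide without interpreting it; the function name and the example SNR values are illustrative.

```python
def detection_probability(pf, snr, m=1):
    """pd = pf ** (1 / (1 + snr * m)) for a fixed false-alarm probability pf.

    snr is assumed to be a linear power ratio; m is the slide's parameter M.
    """
    if not 0.0 < pf < 1.0:
        raise ValueError("pf must lie strictly between 0 and 1")
    if snr < 0.0 or m <= 0:
        raise ValueError("snr must be non-negative and m positive")
    return pf ** (1.0 / (1.0 + snr * m))


PF = 0.001
pd_true = detection_probability(PF, snr=2.0)     # about 0.10
pd_assumed = detection_probability(PF, snr=4.0)  # about 0.25: the filter is too optimistic
pd_no_signal = detection_probability(PF, snr=0.0)  # equals pf when there is no signal
```

With pf held constant, pd increases monotonically with SNR, so overestimating SNR always means overestimating pd.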

Progress 2: Effect of model mismatch

Effect of Pd, Pf mismatch

• We use a sensor model: p(y|S,a)

– For thresholded GMTI returns, this is characterized by Pd and Pf

• Simulation: 10 (real) targets, tracking and (myopic) sensor management.

– How does misestimating Pd and Pf affect performance?
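To make the Pd/Pf mismatch question concrete, the sketch below generates thresholded returns for one cell with the true (Pd, Pf) while the Bayes update uses the filter's assumed (Pd, Pf). It is a single-cell toy, not the 10-target simulation; the prior of 0.5, the parameter values, and the function names are assumptions.

```python
import random


def bayes_update(p_occupied, y, pd, pf):
    """Posterior probability that the cell holds a target after return y in {0, 1}."""
    like_target = pd if y else 1.0 - pd
    like_empty = pf if y else 1.0 - pf
    num = like_target * p_occupied
    return num / (num + like_empty * (1.0 - p_occupied))


def run(true_pd, true_pf, assumed_pd, assumed_pf, occupied, looks, rng):
    """Simulate `looks` dwells on one cell; data follow the true model,
    the filter updates with the assumed model."""
    p = 0.5  # assumed prior occupancy probability
    for _ in range(looks):
        p_fire = true_pd if occupied else true_pf
        y = 1 if rng.random() < p_fire else 0
        p = bayes_update(p, y, assumed_pd, assumed_pf)
    return p


matched = run(0.8, 0.05, 0.8, 0.05, occupied=True, looks=20, rng=random.Random(0))
mismatched = run(0.8, 0.05, 0.5, 0.05, occupied=True, looks=20, rng=random.Random(0))
```

Both runs see the same simulated returns (same seed), so any difference between `matched` and `mismatched` comes only from the assumed Pd.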

Effect of dynamic model mismatch

• Diffusive target model $p(S_k, T_k \mid S_{k-1}, T_{k-1})$ includes models of how individual targets move and how targets arrive and leave the surveillance region

– We have been in a mismatch scenario all along since we use real targets

– This study quantifies how mismatch in the motion model affects performance

[Figure: normalized tracking error (ratio of tracking error with mismatch to tracking error when matched), plotted against the true diffusivity of the targets and the mismatch of the filter (measured as amount of overestimation)]

Progress 3: Non-Myopic Sensor Management

• There are many situations where long-term planning provides benefit

– Sensor platform motion creates time-varying sensor/target visibility

• Sensor/target line of sight may change, resulting in targets becoming obscured

• Delay measuring targets that will remain visible in order to interrogate targets that are predicted to become obscured

– Convoy movement may involve targets that overtake/pass one another

• Targets may become closely spaced (and unresolvable to the sensor)

• Plan ahead to measure targets before they become unresolvable to the sensor

– Crossing targets become unresolvable to the sensor

• Sensor resolution may prohibit successful target identification if targets are too close together

• Plan ahead to identify targets before they become too close

• Planning ahead in these situations allows better prediction of reemergence point, target trajectory, target intention

Relevant Multi-target Tracking Scenario

[Figure: six panels (Time 1 through Time 6) showing sensor position, region of interest, shadowed target, and visible target. Annotation: extra dwells at time 1 help predict where the target reemerges at time 6; these dwells are not made by the myopic strategy.]

Non-myopic strategy scans regions that will become obscured while deferring regions that will remain visible in the future.

Value Function Approximation

Equation term labels: value of state; myopic part of V under action a; non-myopic correction under a

Bellman equation:

I. VTG approximation:

II. Linear Q-learning approximation:

The Bellman equation describes the value of an action in terms of the immediate (myopic) benefit and the long-term (non-myopic) benefit.

Generate s, a, s’, r

Calculate s, a, s’, Qest

Update Qk to Qk+1

$Q_{est} = r + \gamma \max_{a'} Q_k(s', a')$
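The generate / calculate / update loop can be sketched as linear Q-learning, with one weight vector per action and the target Qest = r + γ max over a' of Qk(s', a'). Everything concrete here is an assumption for illustration: the two-state toy simulator, the one-hot features, the discount factor `GAMMA`, and the learning rate. The sensor-management problem uses a far larger state (the JMPD) and 36 actions.

```python
import random

GAMMA = 0.5  # assumed discount factor


def features(s):
    """One-hot features, so the linear model can represent any Q on this toy problem."""
    return [1.0, 0.0] if s == 0 else [0.0, 1.0]


def q_value(theta, s, a):
    """Linear Q-function: Q(s, a) = theta[a] . features(s)."""
    return sum(t * f for t, f in zip(theta[a], features(s)))


def step(s, a):
    """Toy simulator: action 0 stays, action 1 switches state;
    reward is 1 for landing in state 1, else 0."""
    s_next = s if a == 0 else 1 - s
    return s_next, (1.0 if s_next == 1 else 0.0)


def train(samples=4000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # one weight vector per action
    for _ in range(samples):
        s, a = rng.randrange(2), rng.randrange(2)  # generate s, a
        s_next, r = step(s, a)                     # ... and s', r
        q_est = r + GAMMA * max(q_value(theta, s_next, b) for b in (0, 1))
        err = q_est - q_value(theta, s, a)
        # Gradient step on the squared error: moves Q_k toward Q_est, giving Q_{k+1}.
        theta[a] = [t + lr * err * f for t, f in zip(theta[a], features(s))]
    return theta


theta = train()
policy = [max((0, 1), key=lambda a, s=s: q_value(theta, s, a)) for s in (0, 1)]
```

On this toy problem the learned policy switches out of state 0 and stays in state 1, which is the behaviour that maximizes discounted reward.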

Example: Two Real Targets

• Target trajectories taken from real, recorded data

– 2 moving ground targets

– Need to estimate the position and velocity in x and y (4-d state vector for each target)

• Time-varying visibility taken from a real elevation map and a simulated platform trajectory

• Sensor decides where to steer an agile antenna and illuminates a 100m x 100m patch on the ground. Thresholded measurements indicate the presence or absence of a target (with pd and pfa)

• At filter initialization, the target position is known to be in a 300m x 500m area on the ground (i.e. the prior for target position is uniform over this region)

Comparing the Management Strategies

Algorithm             Time for Training   Time for Testing
Random                -                   0.04s / second
Myopic                -                   0.12s / second
Non-myopic via VTG    -                   0.37s / second
Non-myopic via RL     ~50 hours           0.60s / second

Non-myopic via RL timing

• Generate training episodes:

– (50 timesteps x 0.5s/second + 10s fixed cost per episode) * 2000 episodes = 1200 minutes

• Batch training:

– 36 possible actions (Q-functions to estimate) x 20 minutes per action = 720 minutes

• Update value of Q function (i.e. 2nd pass): 500 minutes

• Batch train on second pass: 720 minutes

We suspect that the training time for the RL algorithm could be reduced (perhaps by even an order of magnitude) with a C-based implementation.
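As a quick arithmetic check, the itemized RL training costs can be summed and compared with the "~50 hours" entry in the table. The figures are the ones quoted on the slide; only the variable names are introduced here.

```python
# Episode generation: (50 timesteps x 0.5 s + 10 s fixed cost) x 2000 episodes
episode_generation_s = (50 * 0.5 + 10) * 2000
episode_generation_min = episode_generation_s / 60  # about 1167, quoted as 1200

# Batch training: 36 Q-functions x 20 minutes each
batch_training_min = 36 * 20  # 720

# Total using the rounded figures quoted on the slide
total_min = 1200 + batch_training_min + 500 + 720  # 3140 minutes
total_hours = total_min / 60                       # about 52.3 hours
```

The total of roughly 52 hours is consistent with the "~50 hours" training time listed for the RL approach.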

Example: Multiple Modality Sensor

• A sensor has two waveforms

– Waveform 1 (X-band) has good detection performance but is susceptible to line of sight visibility

– Waveform 2 (HF) has poorer detection performance but is not susceptible to visibility

• The platform is moving and so sensor to ground visibility changes with time

• The filter is to detect and track a target in the surveillance area

– No information about target location a priori

– Q-learning used to learn the best non-myopic policy

Progress 4: Optimal Experimental Design

• Upper left box - Beam scheduling, waveform selection, beam steering operator, and transmission into the medium, denoted by channel function

• Right side box - Processes received signals and retransmits.

• Lower left box - Processes output after reinsertion.

Motivation

• Imaging a medium using an array of sensors.

• Widely studied in mine detection, ultrasonic medical imaging, foliage penetrating radar, nondestructive testing, and active audio.

• GOAL: Optimally design a sequence of measurements to image a medium of multiple scatterers using an array of transducers.

• Four signal processing steps:

1. Transmission of time varying signals into the medium.

2. Recording of backscattered field from medium.

3. Transmission of the processed backscatter signals.

4. Measurement and spatial filtering of backscattered signals.

Mathematical Description

• Channel between transmitted field and received backscattered field,

• Four signal processing steps

• where receiver noises are i.i.d.

• Design objective: minimize MSE under transmitted energy constraint

Analytical Results

• Constraint:

• Nearly optimal design:

• MSE improvement factor:

Comments and Extensions

• Results are robust to variation of the estimator error residual, especially at low SNR

• Results apply to 2-stage min MSE design under an average energy constraint when the Green's function is known and non-random

• Analytical results for multi-stage (>2) waveform design?

• Random (Rayleigh/Rician) media?

• Extension to non-quadratic objective functions?

• Classification, detection, regularized image reconstruction?

Pubs Since June 2004

• Sequential adaptive sensor management

– “Adaptive Multi-modality Sensor Scheduling for Detection and Tracking of Smart Targets”, C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Nov. 2004

– “Sensor Management Using An Active Sensing Approach”, C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Oct. 2004

– “Multitarget tracking using a particle filter representation of the joint multi-target probability density,” C. Kreucher, K. Kastella, and A. Hero, accepted for publication, Sept. 2004

– “Efficient methods of non-myopic sensor management for multitarget tracking,” C. Kreucher, A. Hero, K. Kastella, and D. Chang, 43rd IEEE Conference on Decision and Control, December 2004.

– “Multiplatform Information based Sensor Management,” C. Kreucher, A. Hero, and K. Kastella, to appear at SPIE Defense and Security Symposium, March 2005

– “Non-myopic Approaches to Scheduling Agile Sensors for Multitarget Detection, Tracking, and Identification,” C. Kreucher, and A. Hero, to appear at IEEE ICASSP March 2005

– “Particle Filtering for Multitarget Detection and Tracking,” C. Kreucher, M. Morelande, A. Hero and K. Kastella, to appear at IEEE Aerospace Conference, March 2005

Pubs Since June 2004 (ctd)

• Iterative function optimization

– “A convergent incremental gradient algorithm with constant stepsize,” D. Blatt, A. Hero, H. Gauchman, SIAM Optimization, submitted Sept. 2004

– “Convergent incremental optimization transfer algorithms,” S. Ahn, J. Fessler, D. Blatt, A. Hero. IEEE Trans. on Medical Imaging, submitted Oct. 2004

• Predicting model mismatch

– "Tests for global maximum of the likelihood function," D. Blatt and A. O. Hero, Proc. of ICASSP, Philadelphia, March 2005.

– "On tests for global maximum of the log-likelihood function," D. Blatt and A. O. Hero, IEEE Trans. on Info Theory, submitted Jan. 2005.

• Sequential waveform scheduling

– "Optimal experimental design for an inverse scattering problem," R. Rangarajan, R. Raich and A. O. Hero, to appear in Proc. of ICASSP, Philadelphia, March 2005.

Synergistic Activities and Awards (2003-2004)

• General Dynamics Medal Paper Award

– C. Kreucher, K. Kastella, and A. O. Hero, "Multitarget sensor management using alpha divergence measures," Proc. First IEEE Conference on Information Processing in Sensor Networks, Palo Alto, April 2003

• EMM-CVPR-03, ASP-03, EUSIPCO-04, ICASSP-05, SSP-05: A. Hero plenary speaker

• General Dynamics, Inc.

– K. Kastella: collaboration with A. Hero in sensor management, July 2002-

– C. Kreucher: doctoral student of A. Hero, Sept. 2002-2004

• ARL

– ARLTAB oversight: A. Hero is member 2004-

– ARL SEDD: A. Hero is member of yearly review panel, May 2002-

– NAS-Robotics: A. Hero chaired cross-cutting review panel, May 2004.

– B. Sadler: N. Patwari (doctoral student of A. Hero) internship in distributed sensor information processing, summer 2003

• ERIM Intl.

– B. Thelen & N. Subotic: H. Neemuchwala (Hero’s PhD student) internship in applying entropic graphs to pattern classification, summer 2003

• Chalmers Univ., Sweden

– M. Viberg: A. Hero was Opponent on multimodality landmine detection doctoral thesis, Aug 2003

Transitions

• PF/SM to ISP Phase II (Schmidt at Raytheon)

• MRF backscatter modeling to GD (Kastella/Onstott)

• SM to NSF-ITR (UM, UW, BU)

• SM approaches integrated into

– Dynamic Machine Learning (Prof. Satinder Singh/Chris Kreucher)

– Generalization error (Prof. Susan Murphy/Doron Blatt)

• Collaboration with Prof. Hillel Gauchman (UIUC Math) on distributed optimization

• Collaboration with GD on Willow Run experiment for multi-modal tracking of dismounts and vehicles

Personnel on A. Hero’s sub-Project (2003-2004)

• Chris Kreucher, 4th year grad student

– UM-Dearborn

– General Dynamics Sponsorship

• Neal Patwari, 3rd year doctoral student

– Virginia Tech

– NSF Graduate Fellowship/MURI GSRA

• Doron Blatt, 3rd year doctoral student

– Univ. Tel Aviv

– Dept. Fellowship/MURI GSRA

• Raghuram Rangarajan, 3rd year doctoral student

– IIT Madras

– Dept. Fellowship/MURI GSRA
