Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding

Martin Sachenbacher
July 1, 2003
Exact vs. Approximate ME

Problems of ME with incomplete belief state:
– Dead ends (no solutions)
– Incorrect leading solutions
– Incorrect probabilities of solutions

Usefulness of ME with complete belief state:
– As accuracy reference
– As performance reference
– As a starting point for approximations

Key: Compact representation of belief state
– Map to semiring-based CSP
– Decompose Hypergraph into Hypertree
– Encode Tree Nodes symbolically as ADDs
Outline

- SCSPs (Semiring-based CSPs)
- Mapping State Constraints to SCSPs
- Mapping Transition Constraints to SCSPs
- ADDs (Algebraic Decision Diagrams)
- Hypertree Decompositions of SCSPs
- Solving Tree-structured SCSPs
- Exact Mode Estimation for POMDPs as Decomposition/ADD-based SCSP Solving
- Demonstration: Two Switches Example
SCSPs (Semiring-based CSPs)

- Generalization of CSPs [Bistarelli et al. 97]
- Domain D, Variables V, Set S, Type T ⊆ V
- Constraints are mappings D^k → S
- Operations ⊗ (for join) and ⊕ (for projection) on S
- (S, ⊕, ⊗, 0, 1) must form a c-semiring
- Dynamic Programming applicable to all SCSPs
- Examples:
  – ({0,1}, ∨, ∧, 0, 1): Classical CSPs
  – (R+, min, +, +∞, 0): Weighted CSPs
  – ([0,1], max, *, 0, 1): Probabilistic CSPs
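The three example semirings can be written down directly. The following is a minimal Python sketch, not code from the talk: the `Semiring` class, the constant names, and `check_identities` are hypothetical names introduced for illustration. It spot-checks a few c-semiring identities on sample values rather than proving them.

```python
import math
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    """Hypothetical container for (S, plus, times, zero, one)."""
    plus: Callable[[T, T], T]   # used for projection
    times: Callable[[T, T], T]  # used for join
    zero: T
    one: T


# The three examples listed on the slide.
CLASSICAL = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
WEIGHTED = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
PROBABILISTIC = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def check_identities(sr, samples):
    """Spot-check unit, absorbing, and idempotency laws on sample values."""
    return all(
        sr.plus(a, sr.zero) == a            # zero is the unit of plus
        and sr.times(a, sr.one) == a        # one is the unit of times
        and sr.times(a, sr.zero) == sr.zero  # zero absorbs under times
        and sr.plus(a, sr.one) == sr.one    # one absorbs under plus
        and sr.plus(a, a) == a              # plus is idempotent
        for a in samples
    )
```

Calling `check_identities(PROBABILISTIC, [0.0, 0.25, 1.0])` exercises the laws for the probabilistic case; the same call works for the other two semirings with samples from their respective sets.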
Encoding States as SCSPs

Example: Or-Gate, P(Or=ok) = 99%, P(Or=fty) = 1%

xt   in1  in2  out  f
ok   lo   lo   lo   0.99
ok   lo   hi   hi   0.99
ok   hi   lo   hi   0.99
ok   hi   hi   hi   0.99
fty  *    *    *    0.01

[Figure: Or-gate symbol (≥ 1).]
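As an illustration, the Or-gate state constraint can be expressed as a function from tuples to values of the probabilistic semiring. This is a sketch under one stated assumption: tuples that do not appear in the table (an ok gate with a wrong output) are mapped to the semiring zero, 0.0. The names `or_gate_constraint` and `table` are hypothetical.

```python
from itertools import product

MODES = ("ok", "fty")
LEVELS = ("lo", "hi")


def or_gate_constraint(mode, in1, in2, out):
    """Value f of one tuple (xt, in1, in2, out) of the Or-gate constraint."""
    if mode == "ok":
        expected = "hi" if "hi" in (in1, in2) else "lo"
        # Assumption: tuples inconsistent with the ok behaviour get 0.0.
        return 0.99 if out == expected else 0.0
    # fty: the row "fty * * *" allows any input/output combination.
    return 0.01


# Explicit (extensional) form of the constraint over all 16 tuples.
table = {t: or_gate_constraint(*t) for t in product(MODES, LEVELS, LEVELS, LEVELS)}
```

The explicit `table` is what a symbolic encoding such as an ADD would represent compactly; here it is enumerated only because the example is tiny.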
Encoding Observations as SCSPs

Example: (Probabilistic) Observation

Distribution over values for xi:

xi  f
0   0.6
1   0.9
2   0.3
3   0.0

[Figure: bar chart of P over the values 0 to 3 of xi.]
Encoding Transitions as SCSPs

Example: (Probabilistic) CCA

[Figure: automaton with states 0 and 1; transitions labelled cmd=on and cmd=off, each with probability 0.9.]

Transition Function:

xt  cmd  xt+1  f
0   off  0     0.9
0   on   0     0.1
0   off  1     0.1
0   on   1     0.9
1   off  0     0.9
1   on   0     0.1
1   off  1     0.1
1   on   1     0.9
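The transition constraint is again just a mapping from tuples (xt, cmd, xt+1) to values. The sketch below stores the eight rows of the example in a dictionary; `TRANSITION` and `successors` are hypothetical names used only for illustration.

```python
# Rows (xt, cmd, xt+1) -> f, copied from the transition function example.
TRANSITION = {
    (0, "off", 0): 0.9, (0, "on", 0): 0.1,
    (0, "off", 1): 0.1, (0, "on", 1): 0.9,
    (1, "off", 0): 0.9, (1, "on", 0): 0.1,
    (1, "off", 1): 0.1, (1, "on", 1): 0.9,
}


def successors(x, cmd):
    """Restrict the constraint to a given state and command."""
    return {
        x_next: p
        for (x0, c, x_next), p in TRANSITION.items()
        if x0 == x and c == cmd
    }
```

For every pair of state and command, the values returned by `successors` form a distribution over successor states, which is what makes the constraint a probabilistic transition model.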
Algebraic Decision Diagrams

- ADDs: Symbolic (graph-based) representation of functions {0,1}^n → R
- Generalization of BDDs (functions {0,1}^n → {0,1})
- Canonicity of representation (as for BDDs)
- Efficient package: CUDD

[Figure: ADD over the variables A, B, C with leaves 0, 1, 2, 3.]
ADD Join Operations

- Multiplication, addition, maximum, …
- Generalization of BDD operations

A B C   f   g   f+g  f*g  5*f  f>1  max(f,g)
0 0 0   0   3   3    0    0    0    3
0 0 1   1   2   3    2    5    0    2
0 1 0   1   0   1    0    5    0    1
0 1 1   2   1   3    2    10   1    2
1 0 0   1   0   1    0    5    0    1
1 0 1   2   0   2    0    10   1    2
1 1 0   2   0   2    0    10   1    2
1 1 1   3   1   4    3    15   1    3
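The join operations act pointwise on the function values. The following sketch reproduces them on explicit tables for the example functions f and g over A, B, C; it illustrates the semantics only and does not use ADDs or CUDD. The helper name `pointwise` is hypothetical.

```python
import operator
from itertools import product

# Assignments to (A, B, C) in the order 000, 001, ..., 111.
ASSIGNMENTS = list(product((0, 1), repeat=3))
f = dict(zip(ASSIGNMENTS, [0, 1, 1, 2, 1, 2, 2, 3]))
g = dict(zip(ASSIGNMENTS, [3, 2, 0, 1, 0, 0, 0, 1]))


def pointwise(op, left, right):
    """Combine two functions over the same variables value by value."""
    return {key: op(left[key], right[key]) for key in left}


f_plus_g = pointwise(operator.add, f, g)
f_times_g = pointwise(operator.mul, f, g)
f_max_g = pointwise(max, f, g)
five_f = {key: 5 * value for key, value in f.items()}    # scalar multiple
f_gt_1 = {key: int(value > 1) for key, value in f.items()}  # 0/1 threshold
```

An ADD package performs the same computation recursively on the shared graph structure, so that identical subgraphs are processed only once.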
Example

Summation of ADD f, ADD g

[Figure: two ADDs over the variables A, B, C, one with leaves 3, 2, 1, 0 and one with leaves 0, 1, 2, 3; their sum is an ADD with leaves 4, 3, 2, 1.]
ADD Projection Operations

Σ(f,X) (and Π(f,X)) are obtained by summing (multiplying) the values of tuples that differ only w.r.t. X.

A B C   f
0 0 0   0
0 0 1   1
0 1 0   1
0 1 1   2
1 0 0   1
1 0 1   2
1 1 0   2
1 1 1   3

A B   Σ(f,{C})  Π(f,{C})
0 0   1         0
0 1   3         2
1 0   3         2
1 1   5         6
ADD Projection Operations

For optimization, we require an operation max(f,X) that yields the maximum value of the tuples differing only w.r.t. X.

A B C   f
0 0 0   0
0 0 1   1
0 1 0   1
0 1 1   2
1 0 0   1
1 0 1   2
1 1 0   2
1 1 1   3

A B   Σ(f,{C})  Π(f,{C})  max(f,{C})
0 0   1         0         1
0 1   3         2         2
1 0   3         2         2
1 1   5         6         3

Not part of CUDD, but easy to implement as a variant of Σ(f,X) or Π(f,X).
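All three projections follow the same pattern and differ only in the binary operation used to combine the values of tuples that agree on the remaining variables. The sketch below shows this on an explicit table; `project_out` is a hypothetical helper, not a CUDD function.

```python
import operator
from functools import reduce
from itertools import product

VARIABLES = ("A", "B", "C")
f = dict(zip(product((0, 1), repeat=3), [0, 1, 1, 2, 1, 2, 2, 3]))


def project_out(op, table, variables, drop):
    """Eliminate the variables in `drop`, combining values with `op`."""
    keep = [i for i, name in enumerate(variables) if name not in drop]
    groups = {}
    for assignment, value in table.items():
        key = tuple(assignment[i] for i in keep)
        groups.setdefault(key, []).append(value)
    return {key: reduce(op, values) for key, values in groups.items()}


sum_c = project_out(operator.add, f, VARIABLES, {"C"})
prod_c = project_out(operator.mul, f, VARIABLES, {"C"})
max_c = project_out(max, f, VARIABLES, {"C"})
```

Passing `max` instead of addition or multiplication is the only change needed for the optimization variant, which matches the remark that it is a variant of the existing projections.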
Solving SCSPs using Decomposition

- Transform SCSPs into Hypertree H = (T, χ, λ)
- Compute constraint φ(v) for each node v
- Bottom-up phase for computing values
- Top-down phase for extracting solutions
Pseudocode for Bottom-Up Phase

Function solve(v)
  For Each child ∈ children(v)
    φ(v) ← φ(v) ⊗ max(solve(child), χ(child) \ χ(v))
  Next child
  Return φ(v)

Generalization of (Semi-)Join Operation
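The bottom-up phase can be sketched on explicit tables as follows. This is an illustration under stated assumptions, not the ADD-based implementation from the talk: all variables are Boolean, the semiring is ([0,1], max, *, 0, 1), and `Factor`, `join`, `max_out`, and the two-node example tree are hypothetical.

```python
from itertools import product

DOMAIN = (0, 1)  # assumption: every variable is Boolean


class Factor:
    """Explicit table over a tuple of variables (stand-in for an ADD)."""

    def __init__(self, variables, fn):
        self.vars = tuple(variables)
        self.table = {
            t: fn(dict(zip(self.vars, t)))
            for t in product(DOMAIN, repeat=len(self.vars))
        }

    def value(self, env):
        return self.table[tuple(env[v] for v in self.vars)]


def join(a, b):
    """Semiring product of two factors over the union of their variables."""
    variables = a.vars + tuple(v for v in b.vars if v not in a.vars)
    return Factor(variables, lambda env: a.value(env) * b.value(env))


def max_out(a, drop):
    """Eliminate the variables in `drop` by maximization."""
    keep = tuple(v for v in a.vars if v not in drop)
    best = {}
    for t, val in a.table.items():
        env = dict(zip(a.vars, t))
        key = tuple(env[v] for v in keep)
        best[key] = max(best.get(key, 0.0), val)  # values are non-negative
    return Factor(keep, lambda env: best[tuple(env[v] for v in keep)])


def solve(node, children, factors):
    """Bottom-up phase: absorb each child's message into the node."""
    result = factors[node]
    for child in children.get(node, ()):
        message = solve(child, children, factors)
        private = set(message.vars) - set(factors[node].vars)
        result = join(result, max_out(message, private))
    return result


# Hypothetical two-node tree: root over (X, Y), leaf over (Y, Z).
factors = {
    "root": Factor(("X", "Y"), lambda e: 0.9 if e["X"] == e["Y"] else 0.1),
    "leaf": Factor(("Y", "Z"), lambda e: [0.2, 0.5, 0.3, 0.7][2 * e["Y"] + e["Z"]]),
}
children = {"root": ["leaf"]}
root_table = solve("root", children, factors)
best = max(root_table.table.values())
```

After the call, the maximum entry of `root_table` equals the value of the best assignment to all variables of the tree, even though no table over all variables was ever built.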
Example

Boolean Polycell

[Figure: circuit with the gates And1, And2, Or1, Or2, Or3 and the internal variables X, Y, Z; observed values A = 1, B = 1, C = 1, D = 1, E = 0, F = 0, G = 1.]
Example

Hypertree Decomposition of Boolean Polycell

[Figure: hypertree with root v0 over {O3, A1, C, E, F, X, Y, Z} and children v1 over {A2, G, Y, Z}, v2 over {O2, B, D, Y}, and v3 over {O1, A, C, X}. The edges to v1, v2, v3 share {Y, Z}, {Y}, and {C, X}, respectively. Each node carries its constraint table with utilities U; values shown: .98505, .995, .005, .99, .01.]
Example

Initial φ(v0), over the variables O3, A1, C, E, F, X, Y, Z

[Figure: ADD with 20 nodes and 5 leaves. Utilities shown: U=.98505, U=.00995, U=.00495, U=.00005.]
Example

After multiplication with max(φ(v1),{A2,G}), over the variables O3, A1, C, E, F, X, Y, Z

[Figure: ADD with 28 nodes and 7 leaves. Utilities shown: U=.98012, U=.00990, U=.00492, U=4.9E-5, U=2.4E-5, U=2.5E-7.]
Example

After multiplication with max(φ(v2),{O2,B,D}), over the variables O3, A1, C, E, F, X, Y, Z

[Figure: ADD with 30 nodes and 8 leaves. Utilities shown: U=.97032, U=.00980, U=.00487, U=4.9E-5, U=4.9E-7, U=2.4E-7, U=2.5E-9.]
Example

After multiplication with max(φ(v3),{O1,A}), over the variables O3, A1, C, E, F, X, Y, Z

[Figure: ADD with 35 nodes and 10 leaves. Utilities shown: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11.]

Best Solution: Umax = .0097
Pseudocode for Top-Down Phase

Function extractSolutions(vroot)
  E ← edges(vroot)
  σ ← φ(vroot)
  σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))    (restrict to decision and shared variables)
  While E ≠ ∅ Do
    e ← choose(E)
    v ← son-node(e)
    E ← (E \ e) ∪ edges(v)
    σ0-1 ← (σ ≠ 0)
    div ← max(σ0-1 ⊗ φ(v), vars(σ))    (“divisor”)
    σ ← (σ ⊗ φ(v)) ⊗⁻¹ div
    σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))    (restrict to decision and shared variables)
  End While

No search queue necessary.
Example

Initial σ = max(φ(vroot),{E,F}), over the variables O3, A1, C, X, Y, Z

[Figure: ADD with 21 tuples, 33 nodes, 10 leaves. Utilities shown: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11.]
Example

After processing edge(v0,v3), over the variables O1, O3, A1, Y, Z

[Figure: ADD with 21 tuples, 32 nodes, 10 leaves. Utilities shown: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11.]
Example

After processing edge(v0,v2), over the variables O1, O2, O3, A1, Y, Z

[Figure: ADD with 30 tuples, 47 nodes, 11 leaves. Utilities shown: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=9.9E-7, U=4.9E-7, U=2.5E-11.]
Example

After processing edge(v0,v1), over the variables O1, O2, O3, A1, A2

[Figure: ADD with 26 tuples, 35 nodes, 12 leaves. Utilities shown: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=2.4E-5, U=9.9E-7, U=2.5E-11. Tuples shown (O1 O2 O3 A1 A2): fty ok ok ok ok; ok ok ok fty ok; fty fty ok ok ok; fty ok fty ok ok.]

#Solutions = 26

Easy to focus on leading solutions.
Application: Exact ME for POMDPs

- Given: POMDP (Feasible States, Observables, Control Actions, Transitions), Observations
- Approach: Complete representation of belief state (through decomposition and symbolic encoding)
- Benefit: Allows for exploiting Markov property

[Figure: belief state over the states S0, S1, …, Sn at time t and at time t+1.]
Algorithm: Exact ME for POMDPs

- Construct Hypertree (offline)
- Construct State-ADDs for each node (offline)
- Construct Transition-ADDs for each node (offline)
- Repeat for each time step:
  – Multiply nodes with Obs-ADDs (“Condition on Observations”)
  – Establish consistency in the tree (Bottom-up)
  – Extract leading solution(s) from the tree (Top-down)
  – Multiply nodes with Transition-ADDs, project on xt+1, set xt = xt+1, multiply with State-ADDs (“Transition Expansion”)
- Complexity: Polynomial for bounded width of the Hypertree (exponential in the width)
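The per-time-step loop can be illustrated on a single component with an explicitly enumerated belief state. The sketch below is deliberately not the decomposition/ADD-based method: it replaces the tree of ADDs by one dictionary, uses summation as the projection onto xt+1, and leaves the belief unnormalized. The switch transition values follow the Switch Model table of the two-switches example; the function names and the observation used in the call are hypothetical.

```python
def switch_transition(mode, cmd):
    """Successor distribution of one switch; cmd is "on", "off" or "idl"."""
    if mode == "fty":
        return {"fty": 1.0}
    target = mode if cmd == "idl" else cmd
    return {target: 0.95, "fty": 0.05}


def estimation_step(belief, likelihood, cmd):
    # Condition on observations: multiply each state by its observation value
    # (states missing from `likelihood` are treated as unconstrained, 1.0).
    conditioned = {x: p * likelihood.get(x, 1.0) for x, p in belief.items()}
    # Leading solution: the most likely state after conditioning.
    leading = max(conditioned, key=conditioned.get)
    # Transition expansion: join with the transition constraint and
    # project onto the successor state by summation.
    successor = {}
    for x, p in conditioned.items():
        for x_next, q in switch_transition(x, cmd).items():
            successor[x_next] = successor.get(x_next, 0.0) + p * q
    return leading, successor


# Hypothetical step: an observation that rules out "off", then command "idl".
belief = {"on": 0.475, "off": 0.475, "fty": 0.05}
leading, belief_next = estimation_step(belief, {"off": 0.0}, "idl")
```

In the decomposition-based algorithm the same three operations (multiply, maximize or sum out, rename xt+1 to xt) are applied per tree node, so no table over all state variables is ever enumerated.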
Example

Adapted from Jim Kurien’s thesis

- t0: Sw1.cmd = on
- t1: Or.out = lo, Sw1.cmd = idl, Sw2.cmd = on
- t2: Or.out = lo

[Figure: two switches Sw1 and Sw2, each with input hi, feeding an Or-gate (≥ 1).]

Switches more likely to fail than Or-Gate
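Example

Switch Model

[Figure: state diagram of the switch with the modes on, off, and fty over the terminals t1 and t2. Transitions between on and off are labelled cmd=on, cmd=off, and idl with probability 0.95; transitions into fty have probability 0.05; fty has a self-loop with probability 1.0.]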
Example

Switch Model

xt   t1  t2  f
on   lo  lo  1.0
on   hi  hi  1.0
off  *   *   1.0
fty  *   *   1.0

xt   cmd  xt+1  f
on   on   on    0.95
on   off  off   0.95
on   idl  on    0.95
on   *    fty   0.05
off  on   on    0.95
off  off  off   0.95
off  idl  off   0.95
off  *    fty   0.05
fty  *    fty   1.0
Example

Or-Gate Model

[Figure: state diagram of the Or-gate with the modes ok and fty; ok stays ok with probability 0.99 and fails with probability 0.01; fty has a self-loop with probability 1.0.]

xt   in1  in2  out  f
ok   lo   lo   lo   1.0
ok   lo   hi   hi   1.0
ok   hi   lo   hi   1.0
ok   hi   hi   hi   1.0
fty  *    *    *    1.0

xt   xt+1  f
ok   ok    0.99
ok   fty   0.01
fty  fty   1.0
Example

Initial belief state (chosen):
– p(Sw=on) = p(Sw=off) = 0.475, p(Sw=fty) = 0.05
– p(Or=ok) = 0.99, p(Or=fty) = 0.01

Observations/Commands:
– t0: Sw1.cmd=on
– t1: Or.out=lo, Sw1.cmd=idl, Sw2.cmd=on
– t2: Or.out=lo

Leading Solutions:
– t0: Sw1=on/off, Sw2=on/off, Or=ok
– t1: Sw1=fty, Sw2=off, Or=ok
– t2: Sw1=on, Sw2=on, Or=fty
Conclusion

- SCSPs: elegant and general representation
- ADD encoding of SCSPs: efficient in the average case, exponential in the number of variables in the worst case
- Decomposition factors the problem into a set of ADDs, each confined to a small number of variables
- The two methods complement each other well
- How far can we get with this combination?