Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
1
CSE473:ArtificialIntelligence
Bayes’Nets:D-Separation
DanielWeld[MostslideswerecreatedbyDanKleinandPieterAbbeelforCS188IntrotoAIatUCBerkeley.AllCS188materialsareavailableathttp://ai.berkeley.edu.]
Bayes’ Nets:BigPicture
§ Twoproblemswithusingfulljointdistributiontablesasourprobabilisticmodels:§ Unlessthereareonlyafewvariables,thejointisWAYtoo
bigtorepresentexplicitly§ Hardtolearn(estimate)anythingempiricallyaboutmore
thanafewvariablesatatime
§ Bayes’ nets:atechniquefordescribingcomplexjointdistributions(models)usingsimple,localdistributions(conditionalprobabilities)§ Moreproperly… akaprobabilisticgraphicalmodel§ Wedescribehowvariableslocallyinteract§ Localinteractionschaintogethertogiveglobal,indirect
interactions§ Forabout10min,we’ll bevagueabouthowthese
interactionsarespecified
2
Bayes’ NetSemantics
§ Asetofnodes,onepervariableX
§ Adirected,acyclic graph
§ Aconditionaldistributionforeachnode
§ AcollectionofdistributionsoverX,oneforeachcombinationofparents’ values
§ CPT:conditionalprobabilitytable
§ Descriptionofanoisy“causal” process
A1
X
An
ABayesnet=Topology(graph)+LocalConditionalProbabilities
P(A1 ) …. P(An )
Example:AlarmNetwork
Burglary Earthqk
Alarm
Johncalls
Marycalls
B P(B)
+b 0.001
-b 0.999
E P(E)
+e 0.002
-e 0.998
B E A P(A|B,E)
+b +e +a 0.95+b +e -a 0.05+b -e +a 0.94+b -e -a 0.06-b +e +a 0.29-b +e -a 0.71-b -e +a 0.001-b -e -a 0.999
A J P(J|A)
+a +j 0.9+a -j 0.1-a +j 0.05-a -j 0.95
A M P(M|A)
+a +m 0.7+a -m 0.3-a +m 0.01-a -m 0.99
3
Example:AlarmNetworkB P(B)
+b 0.001
-b 0.999
E P(E)
+e 0.002
-e 0.998
B E A P(A|B,E)
+b +e +a 0.95+b +e -a 0.05+b -e +a 0.94+b -e -a 0.06-b +e +a 0.29-b +e -a 0.71-b -e +a 0.001-b -e -a 0.999
A J P(J|A)
+a +j 0.9+a -j 0.1-a +j 0.05-a -j 0.95
A M P(M|A)
+a +m 0.7+a -m 0.3-a +m 0.01-a -m 0.99
B E
A
MJ
What’sNextwithBayes’Nets
Questionswecanask:
§ Definition:P(X=x)
§ Representation:givenaBNgraph,whatkindsofdistributionscanitencode?
§ Modeling:whatBNismostappropriateforagivendomain?
§ Inference:givenafixedBN,whatisP(X|e)?
§ Learning:Givendata,whatisbestBNencoding?
4
Remember….
§ X,Yindependentifandonlyif:
§ XandYareconditionallyindependentgivenZ:ifandonlyif:
ConditionalIndependenceinaBNImportantquestionaboutaBN:
§ Aretwonodesindependentgivencertainevidence?§ Ifyes,mustproveusingalgebra(tediousingeneral)§ Ifno,canprovewithacounterexample(includingCPTs)§ Forexample:
§ Question1:areXandZnecessarily independent?§ Answer:NO.Example:lowpressurecausesrain,whichcausestraffic.§ InformationaboutXmaychangeourbeliefinZ(viaY)§ Similarly,infoaboutZmaychangeourbeliefinX
X Y Z
P(Y|X) = 1P(Y|#X) = 1
P(Z|Y) = .5P(Z|#Y) = .5
P(X) = .9
5
ConditionalIndependenceinaBN§ ImportantquestionaboutaBN:
§ Aretwonodesindependentgivencertainevidence?§ Ifyes,canproveusingalgebra(tediousingeneral)§ Ifno,canprovewithacounterexample§ Forexample:
§ Question1:areXandZnecessarily independent?§ Answer:no.
§ Question2:areXandZconditionallyindependent,givenY?§ Answer:yes,aswe’llsee
X Y Z
D-separation
6
D-separation:Outline
§ Studyindependencepropertiesforsubgraphs(connectedtriples)
§ Analyzecomplexcasesintermsoftriplesalongpathsbetweenvars
§ D-separation:acondition/algorithmforansweringsuchqueries
CausalChains
§ Thisconfigurationisa“causalchain” § GuaranteedXindependentofZgivenY?
§ Evidencealongthechain“blocks” theinfluence(makes“inactive”)
Yes!
X:LowpressureY:RainZ:Traffic
7
CommonCause
§ Thisconfigurationisa“commoncause” § GuaranteedXandZindependentgivenY?
§ Observingthecauseblocksinfluencebetweeneffects.(makesinactive)
Yes!
Y:Projectdue
X:Forumsbusy Z:Labfull
CommonEffect§ Lastconfiguration:twocausesofone
effect(v-structures)
Z:Traffic
§ AreXandYindependent?
§ Yes:theballgameandtheraincausetraffic,buttheyarenotcorrelated
§ Stillneedtoprovetheymustbe(tryit!)
§ AreXandYindependentgivenZ?
§ No:seeingtrafficputstherainandtheballgameincompetitionasexplanation.
§ Thisisbackwards fromtheothercases
§ Observinganeffectactivatesinfluencebetweenpossiblecauses.(makesactive!)
X:Raining Y:Ballgame
8
CommonEffect
Z:Traffic
X:Raining Y:Ballgame
P(X) = 0.8 P(Y) = 0.1
X Y P(Z)T T 0.95T F 0.8F T 0.9F F 0.5
X Y Z P
T T T 0.076T T F 0.004T F T 0.576T F F 0.144F T T 0.162F T F 0.002F F T 0.090F F F 0.009
CommonEffect
Z:Traffic
X:Raining Y:Ballgame
P(X) = 0.8 P(Y) = 0.1
X Y P(Z)T T 0.95T F 0.8F T 0.9F F 0.5
X Y Z P
T T T 0.076T T F 0.004T F T 0.576T F F 0.144F T T 0.162F T F 0.002F F T 0.090F F F 0.009
9
CommonEffect
Z:Traffic
X:Raining Y:Ballgame
P(X) = 0.8 P(Y) = 0.1
X Y P(Z)T T 0.95T F 0.8F T 0.9F F 0.5
X Y Z P
T T T 0.076T T F 0.004T F T 0.576T F F 0.144F T T 0.018F T F 0.002F F T 0.090F F F 0.009
P(X|Y) =
= 0.08 / 0.1= 0.8
0.076+0.004
0.076+0.004+0.018+0.002
ButSupposeAlsoKnowZ=T
Z:Traffic
X:Raining Y:Ballgame
P(X) = 0.8 P(Y) = 0.1
X Y P(Z)T T 0.95T F 0.8F T 0.9F F 0.5
X Y Z P
T T T 0.076T T F 0.004T F T 0.576T F F 0.144F T T 0.018F T F 0.002F F T 0.090F F F 0.009
P(X|Y,Z) =
= 0.8085
0.076
0.076+ 0.018
P(X|Z) = .076+.576 .076+.576+.018+.090
= 0.652/0.76= 0.858
10
TheGeneralCase
TheGeneralCase
§ Generalquestion:inagivenBN,aretwovariablesindependent(givenevidence)?
§ Solution:analyzethegraph
§ Anycomplexexamplecanbebrokenintorepetitionsofthethreecanonicalcases
11
§ Query:
§ Checkall(undirected!)pathsbetweenand§ Ifoneormorepathsisactive,thenindependencenotguaranteed
§ Otherwise(i.e.ifallpathsareinactive),then“D-separated”=independenceis guaranteed
D-Separation
Xi �� Xj |{Xk1 , ..., Xkn}
Xi �� Xj |{Xk1 , ..., Xkn}
?
Xi �� Xj |{Xk1 , ..., Xkn}
§ Query:
Example
X BR T
12
Active/InactivePaths
§ Question:AreXandYconditionallyindependentgivenevidencevariables{Z}?§ Yes,ifXandY“d-separated” byZ§ Considerall(undirected)pathsfromXtoY§ Ifnopathisactiveà independence!
§ Apath isactiveifeverytripleinpathisactive:§ CausalchainA® B® CwhereBisunobserved(eitherdirection)§ CommoncauseA¬ B® CwhereBisunobserved§ Commoneffect(akav-structure)
A® B¬ CwhereBoroneofitsdescendants isobserved
§ Allittakestoblockapathisasingle inactivesegment§ (Butall pathsmustbeinactive)
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
Example
Yes,Independent! R
T
B
T’
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
13
Example
Yes,Independent! R
T
B
T’
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
No
Example
Yes,Independent! R
T
B
T’
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
No
No
14
Example
R
T
B
D
L
T’
Yes,Independent
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
Example
R
T
B
D
L
T’
Yes,Independent
Yes,Independent
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
15
Example
R
T
B
D
L
T’
Yes,Independent
Yes,Independent
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
Example
R
T
B
D
L
T’
Yes,Independent
Yes,Independent
No
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
16
Example
R
T
B
D
L
T’
Yes,Independent
Yes,Independent
Yes,Independent
No
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
Example
R
T
B
D
L
T’
Yes,Independent
Yes,Independent
Yes,Independent
No
No
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
R T’ | L, B
17
Example
§ Variables:§ R:Raining§ T:Traffic§ D:Roofdrips§ S:I’msad
§ Questions:
T
S
D
R
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
Example
§ Variables:§ R:Raining§ T:Traffic§ D:Roofdrips§ S:I’msad
§ Questions:
T
S
D
R
Yes,Independent
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
18
Example
§ Variables:§ R:Raining§ T:Traffic§ D:Roofdrips§ S:I’msad
§ Questions:
T
S
D
R
Yes,Independent
No
No
ActiveTriples InactiveTriples
X
X
X
X
X
X
X
Y
Y
Y
Y
Y
Y
Y
TwoDegenerateCases
DR
Unconnected à Always Independent There are no paths between R and D, so definitely no active paths)
19
TwoDegenerateCases
D
R
Directly connected à Dependent (for some values of the CPT)
StructureImplications
§ GivenaBayesnetstructure,canrund-separationalgorithmtobuildacompletelistofconditionalindependencesthatarenecessarilytrueoftheform
§ Thislistdeterminesthesetofprobabilitydistributionsthatcanberepresented
Xi �� Xj |{Xk1 , ..., Xkn}
20
ComputingAllIndependences
X
Y
Z
X
Y
Z
X
Y
Z
X
Y
Z
XY
Z
{X �� Y,X �� Z, Y �� Z,
X �� Z | Y,X �� Y | Z, Y �� Z | X}
TopologyLimitsDistributions§ Givensomegraphtopology
G,onlycertainjointdistributionscanbeencoded
§ Thegraphstructureguaranteescertain(conditional)independences
§ (Theremightbemoreindependence)
§ Addingarcsincreasesthesetofdistributions,buthasseveralcosts
§ Fullconditioningcanencodeanydistribution
X
Y
Z
X
Y
Z
X
Y
Z
{X �� Z | Y }
X
Y
Z X
Y
Z X
Y
Z
X
Y
Z X
Y
Z X
Y
Z
{}
21
BayesNetsRepresentationSummary
§ Bayesnetscompactlyencodejointdistributions
§ GuaranteedindependenciesofdistributionscanbededucedfromBNgraphstructure
§ D-separationgivespreciseconditionalindependenceguaranteesfromgraphalone
§ ABayes’ net’sjointdistributionmayhavefurther(conditional)independencethatisnotdetectableuntilyouinspectitsspecificdistribution