Page 1: DAG discovery - Network Analysis 2017

DAG discovery
Network Analysis 2017

Sacha Epskamp

04-12-2017

Page 2: DAG discovery - Network Analysis 2017

Last week

• Regularization controls for spurious connections
  • LASSO regularization
  • EBIC model selection
• Bootstrap methods assess accuracy and stability of results
  • Non-parametric bootstrap
  • Case-drop bootstrap
• Comparing networks takes three steps
  • Visually inspect; correlate weights; permutation test (NetworkComparisonTest)
• Non-normal data
  • Non-paranormal transformation
  • Polychoric correlations

Page 3: DAG discovery - Network Analysis 2017

Bootnet estimation

Page 4: DAG discovery - Network Analysis 2017

Directed Acyclic Graphs

Page 5: DAG discovery - Network Analysis 2017

Building blocks of a DAG

Common cause: A ← B → C
Example: Disease (B) causes two symptoms (A and C).
A ⊥̸⊥ C    A ⊥⊥ C | B

Chain: A → B → C
Example: Insomnia (A) causes fatigue (B), which in turn causes concentration problems (C).
A ⊥̸⊥ C    A ⊥⊥ C | B

Collider: A → B ← C
Example: Difficulty of class (A) and intelligence of student (C) cause grade on a test (B).
A ⊥⊥ C    A ⊥̸⊥ C | B
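
A quick simulation makes the collider case concrete. A minimal base-R sketch (not from the slides; the coefficients are arbitrary):

set.seed(1)
n <- 10000
A <- rnorm(n)                                # independent causes
C <- rnorm(n)
B <- A + C + rnorm(n)                        # common effect (collider)

cor(A, C)                                    # ~ 0: marginally independent
cor(resid(lm(A ~ B)), resid(lm(C ~ B)))      # clearly nonzero: dependent given B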

Page 6: DAG discovery - Network Analysis 2017

To identify whether two variables (e.g., B and F) are conditionally independent given a third (e.g., C) or a set of multiple variables:

• List all paths between the two variables (ignoring edge direction)
• For each path, check if the variable conditioned on is:
  • the middle node in a chain or common-cause structure, or
  • not the middle node (common effect) in a collider structure, nor an effect of such a common effect
• If so, then the path is blocked
• If all such paths are blocked, the two variables are d-separated and thus conditionally independent
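
Such checks can also be automated. A minimal sketch using bnlearn's dsep() (assuming the package is available; the toy DAG below is made up for illustration):

library(bnlearn)

dag <- model2network("[A][C][B|A:C]")        # collider: A -> B <- C
dsep(dag, x = "A", y = "C")                  # TRUE: d-separated by the empty set
dsep(dag, x = "A", y = "C", z = "B")         # FALSE: conditioning on the collider opens the path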

Page 7: DAG discovery - Network Analysis 2017

• A ⊥⊥ B
• A ⊥⊥ D | C
• B ⊥⊥ G | C, E
• ...

Testing this causal model involves testing whether all these conditional independence relations hold

Page 8: DAG discovery - Network Analysis 2017

However, if this model fits:

• A → B → C

Then so do these:

• A ← B → C
• A ← B ← C

Because these models imply the same conditional independence relationships and are therefore equivalent

Page 9: DAG discovery - Network Analysis 2017

DAGS & Probability

• A key problem in statistics is characterizing the joint likelihood function of all data
  • A function that tells you how likely your observed data are given some parameters
  • Pr(A, B, C, D, ...)
• This function is used in estimating parameters
  • Parameters are selected that maximize the likelihood function
• Obtaining the joint likelihood may be complicated though
• DAGs make this much simpler!

Page 10: DAG discovery - Network Analysis 2017

DAGS & Probability

Normally, to obtain the joint likelihood we need to factorize (chain rule):

Pr(A, B, C, D, E) = Pr(A) Pr(B | A) Pr(C | A, B) Pr(D | A, B, C) Pr(E | A, B, C, D)

But if we know the DAG:

A → B → C → D → E

Then we know, e.g., Pr(E | A, B, C, D) = Pr(E | D) (any node depends only on its “parents”), and thus:

Pr(A, B, C, D, E) = Pr(A) Pr(B | A) Pr(C | B) Pr(D | C) Pr(E | D)

Much simpler!
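
To illustrate how this factorization is used, here is a minimal sketch for a simulated linear-Gaussian chain A → B → C (the chain length and coefficients are assumptions of the example):

set.seed(1)
n <- 500
A <- rnorm(n)
B <- 0.5 * A + rnorm(n)                      # B depends only on its parent A
C <- 0.5 * B + rnorm(n)                      # C depends only on its parent B

# Joint log-likelihood = one marginal term plus one conditional term per node:
ll <- sum(dnorm(A, mean(A), sd(A), log = TRUE)) +   # log Pr(A)
  as.numeric(logLik(lm(B ~ A))) +                   # log Pr(B | A)
  as.numeric(logLik(lm(C ~ B)))                     # log Pr(C | B)
ll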

Page 11: DAG discovery - Network Analysis 2017

Joint Likelihood of Multiple Realizations

[Figure: Y1, Y2, Y3, Y4, Y5 with no edges between occasions (lag-0)]

Simplest: independent cases (e.g., cross-sectional data):

Pr(Y) = Pr(Y1) Pr(Y2) Pr(Y3) Pr(Y4) Pr(Y5)

Estimable if all probability distributions are assumed identical

Page 12: DAG discovery - Network Analysis 2017

Joint Likelihood of Multiple Realizations

[Figure: Y1 → Y2 → Y3 → Y4 → Y5 (lag-1)]

Lag-1 factorization (time series):

Pr(Y) = Pr(Y1) Pr(Y2 | Y1) Pr(Y3 | Y2) Pr(Y4 | Y3) Pr(Y5 | Y4)

Estimable if all probability distributions are assumed identical
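
This lag-1 factorization is the likelihood that a first-order autoregressive model maximizes. A minimal sketch (simulated series; the AR coefficient is arbitrary):

set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))
arima(y, order = c(1, 0, 0))                 # ML fit of the lag-1 dependence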

Page 13: DAG discovery - Network Analysis 2017

Statistical models can often be portrayed as DAGs, in which case they are called graphical models. For example:

Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.

• Powerful method for showing how the parameters of a complex model interact with one another
• Bayesian software packages (e.g., WinBUGS, JAGS, Stan) use this DAG in sampling from the posterior distribution

Page 14: DAG discovery - Network Analysis 2017

DAG Discovery

• DAG search algorithms intend to identify an equivalence class
  • List equally plausible DAGs
• Two types of algorithms:
  • Constraint-based algorithms
    • (1) identify edge locations, (2) identify colliders, (3) orient edges under the acyclicity assumption
  • Score-based algorithms
    • Find the optimal DAG by model selection/search
• Prior knowledge can be used in both cases to greatly help the algorithm
  • E.g., causation cannot go backward in time

Page 23: DAG discovery - Network Analysis 2017

Assumptions

• Causal Sufficiency Assumption
  • “There exist no common unobserved (also known as hidden or latent) variables in the domain that are parent of one or more observed variables of the domain”.
  • tl;dr: No latent variables
• Markov Assumption
  • “Given a Bayesian network model B, any variable is independent of all its nondescendants in B, given its parents”.
  • tl;dr: Acyclicity
• Faithfulness Assumption
  • “A BN graph G and a probability distribution P are faithful to one another iff every one and all independence relations valid in P are those entailed by the Markov assumption on G”.
  • tl;dr: No weird stuff

Source: Margaritis, D. (2003). Learning Bayesian network model structure from data. Thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh.

Page 32: DAG discovery - Network Analysis 2017

Score-based algorithms

• Score-based algorithms fit several DAGs, score each against some criterion, and select the best
  • Possible criteria are posterior model fit and AIC/BIC
• Searching all possible DAGs is intractable, so some search strategy is needed
• Examples
  • Hill Climbing; Tabu Search
• Used, e.g., by McNally, R. J., Mair, P., Mugno, B. L., & Riemann, B. C. (2017). Co-morbid obsessive-compulsive disorder and depression: a Bayesian network approach. Psychological Medicine, 1-11.

Page 33: DAG discovery - Network Analysis 2017

Hill Climbing

1. Start at an empty, full, or random network
2. Add, remove, or reverse all possible edges
3. Select the best-fitting model that performs better than the current model
4. Go to 2

Page 34: DAG discovery - Network Analysis 2017

Hill Climbing

• Hill Climbing results in a local optimum
  • Random restarts and perturbations can be used to find a global optimum
• No control for overfitting
  • Bootstrapping and only retaining stable edges is highly recommended (see the sketch below)
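
A minimal bnlearn sketch of this workflow, assuming a data frame df of continuous variables (the number of bootstrap samples is an arbitrary choice):

library(bnlearn)

fit  <- hc(df)                               # greedy hill climbing (score-based)
boot <- boot.strength(df, R = 200, algorithm = "hc")
avg  <- averaged.network(boot)               # keep only edges that are stable across bootstraps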

Page 35: DAG discovery - Network Analysis 2017

Constraint-based algorithms

• Structure estimated based on conditional independence relationships
• E.g., the Inductive Causation (IC) algorithm:
  1. For each pair a and b, look for a set Sab such that (a ⊥⊥ b | Sab). If no such Sab exists, then a and b are dependent: connect them with an undirected edge.
  2. For each trio (a, b, c) such that a − c − b, check if c belongs to Sab. If so, then nothing. If c is not in Sab, then make a collider at c, i.e., a → c ← b.
  3. Orient as many of the undirected edges as possible, subject to: (i) no new v-structures and (ii) no cycles.
• Examples:
  • IC algorithm; PC algorithm; Grow-Shrink; Incremental Association Markov Blanket
• Used, e.g., by Borsboom, D., & Cramer, A. O. (2013). Network analysis: an integrative approach to the structure of psychopathology. Annual Review of Clinical Psychology, 9, 91-121.
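
A minimal pcalg sketch of the PC algorithm on continuous data, assuming a numeric data matrix X (the alpha level is an arbitrary choice):

library(pcalg)

suffStat <- list(C = cor(X), n = nrow(X))    # sufficient statistics for the Gaussian CI test
fit <- pc(suffStat, indepTest = gaussCItest,
          alpha = 0.05, labels = colnames(X))
fit                                          # a CPDAG: the estimated equivalence class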

Page 43: DAG discovery - Network Analysis 2017

[Figure: a DAG over Easiness of Class, Intelligence, Grade, IQ, and Diploma]

Page 44: DAG discovery - Network Analysis 2017

[Figure: the same DAG over Easiness of Class, Intelligence, Grade, IQ, and Diploma]

What if we don’t know the structure?

Page 45: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Are the two nodes (Easiness of Class and Intelligence) independent given *any* set of other nodes (including the empty set)?

• Yes! They are independent to begin with!
• Draw no edge between Easiness of Class and Intelligence

Page 48: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Are the two nodes (Easiness of Class and Grade) independent given *any* set of other nodes (including the empty set)?

• No!
• Draw an edge between Easiness of Class and Grade

Page 51: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Are the two nodes (Grade and Intelligence) independent given *any* set of other nodes (including the empty set)?

• No!
• Draw an edge between Grade and Intelligence

Page 54: DAG discovery - Network Analysis 2017

[Figure: the undirected edges found so far among Easiness of Class, Intelligence, Grade, IQ, and Diploma]

Page 55: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Is the middle node in the set that separated the other two nodes?

• Yes!
• Do nothing

Page 58: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Is the middle node (Grade) in the set that separated Easiness of Class and Intelligence?

• No! They were separated by the empty set
• Grade is a collider between Easiness of Class and Intelligence: Easiness of Class → Grade ← Intelligence

Page 61: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Do we now know the direction of the edge between Grade and Diploma?

• Yes! Grade was not a common effect of Diploma and another variable, so the edge must point Grade → Diploma (orienting it the other way would create a new v-structure)

Page 63: DAG discovery - Network Analysis 2017

[Figure: Easiness of Class, Intelligence, Grade, IQ, Diploma]

Do we now know the direction of the edge between Intelligence and IQ?

• No!

Page 65: DAG discovery - Network Analysis 2017

[Figure: the two equivalent DAGs, differing only in the direction of the Intelligence-IQ edge]

Page 66: DAG discovery - Network Analysis 2017

[Figure: the resulting equivalence class over Easiness of Class, Intelligence, Grade, IQ, and Diploma]

Page 67: DAG discovery - Network Analysis 2017

Constraint-based vs. score-based algorithms

• Constraint-based algorithms are more specific and detailed and allow a more confident causal interpretation, but they are also sensitive to error (if one test is wrong, everything fails!)
• Score-based methods provide a metric of confidence in the returned model and are useful in approximating the joint probability distribution
• Hybrid methods that aim to take the best from both worlds have also been developed!
  • E.g., Max-Min Hill Climbing

Page 71: DAG discovery - Network Analysis 2017

Directed Acyclic Graphs

• A DAG implies a set of independence relationships, which can be tested
• If the data are assumed multivariate Gaussian:
  • Each variable normally distributed
  • Linear relationships between variables
• Then the correlation or covariance can be used to test for dependencies, and the partial correlation or partial covariance can be used to test for conditional dependencies

Page 76: DAG discovery - Network Analysis 2017

[Figure: a DAG over A, B, and C in which B blocks the path from A to C]

• Cov(A, C) ≠ 0
• Cov(A, C | B) = 0

Page 77: DAG discovery - Network Analysis 2017

Structural Equation Modeling

• In SEM, the variance-covariance matrix is modeled and compared to the observed variance-covariance matrix
• If multivariate normality holds, then the Schur complement shows that any partial covariance can be expressed solely in terms of variances and covariances:

  Cov(Yi, Yj | X = x) = Cov(Yi, Yj) − Cov(Yi, X) Var(X)⁻¹ Cov(X, Yj)

• Thus, a specific structure of the correlation matrix also implies a model for all possible partial correlations
• If the implied covariance matrix of the SEM exactly matches the observed covariance matrix, then the data contain all d-separations implied by the causal model
  • In that case, the model could have generated the data!
• But this does not mean the model is correct
  • Equivalent models could have generated the same data!
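
A numeric check of this identity in base R (simulated chain, so X is a single conditioning variable and Var(X)⁻¹ is a scalar):

set.seed(1)
n <- 1e5
A <- rnorm(n); B <- 0.7 * A + rnorm(n); C <- 0.7 * B + rnorm(n)
S <- cov(cbind(A, B, C))

# Partial covariance of A and C given B via the Schur complement:
S["A", "C"] - S["A", "B"] * S["B", "C"] / S["B", "B"]
# ~ 0, as the chain A -> B -> C implies Cov(A, C | B) = 0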

Page 84: DAG discovery - Network Analysis 2017

Doosje, B., Loseman, A., & Bos, K. (2013). Determinants of radicalization of Islamic youth in the Netherlands: Personal uncertainty, perceived injustice, and perceived group threat. Journal of Social Issues, 69(3), 586-604.

Page 88: DAG discovery - Network Analysis 2017

What does pcalg come up with?

[Figure: the estimated graph over In-group Identification, Individual Deprivation, Collective Deprivation, Intergroup Anxiety, Symbolic Threat, Realistic Threat, Personal Emotional Uncertainty, Perceived Injustice, Perceived Illegitimacy of Authorities, Perceived In-group Superiority, Distance to Other People, Societal Disconnectedness, Attitude towards Muslim Violence, Own Violent Intentions]

Page 89: DAG discovery - Network Analysis 2017

Does it fit?

##          chisq             df         pvalue            cfi            nfi
##          80.52          39.00           0.00           0.89           0.82
##          rmsea rmsea.ci.lower rmsea.ci.upper
##           0.09           0.06           0.12

• Not really...
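
These are the kind of fit measures lavaan reports. A hedged sketch of how a discovered DAG could be checked, with a made-up two-line model string (df and the variable names are placeholders, not the actual Doosje et al. model):

library(lavaan)

model <- '
  Attitude ~ Superiority + Injustice         # hypothetical directed edges from the DAG
  Violence ~ Attitude
'
fit <- sem(model, data = df)
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "nfi",
                   "rmsea", "rmsea.ci.lower", "rmsea.ci.upper"))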

Page 90: DAG discovery - Network Analysis 2017

DAG Discovery

Discovering an equivalence set of DAGs is possible under some assumptions:

• Causal Sufficiency
• Markov Assumption
• Faithfulness

Two general methods:

• Score-based algorithms
• Constraint-based algorithms

DAGs provide useful characterisations of the joint likelihood and can be fitted to the data (e.g., SEM)

Page 98: DAG discovery - Network Analysis 2017

But...

• Assumptions often not plausible
  • Latent variables or acyclicity
• Prone to errors
  • Often edges are estimated in a different direction than you would expect
• Exploratory estimation may suffer from low power
• Confirmatory fit may suffer from many equivalent models

Page 104: DAG discovery - Network Analysis 2017

Software

Several R packages, but mainly:

• pcalg
  • Implements the PC algorithm (a faster variant of the IC algorithm)
• bnlearn
  • Implements everything *but* the PC algorithm

We will see these in the assignment!

Page 105: DAG discovery - Network Analysis 2017

Thank you for your attention!