A Strategy for Making Predictions under Manipulation

Ioannis Tsamardinos, Assistant Professor, Computer Science Department, University of Crete; ICS, Foundation for Research and Technology - Hellas

Laura E. Brown, Ph.D. Candidate, Dept. Biomedical Inf., Vanderbilt Univ.

Page 1: A Strategy for Making Predictions under Manipulation

A Strategy for Making Predictions under Manipulation

Ioannis Tsamardinos, Assistant Professor, Computer Science Department, University of Crete; ICS, Foundation for Research and Technology - Hellas

Laura E. Brown, Ph.D. Candidate, Dept. Biomedical Inf., Vanderbilt Univ.

Page 2: A Strategy for Making Predictions under Manipulation

5/10/2007 I. Tsamardinos, CSD, University of Crete

Selecting a Formulation of Causality

  • Causal Bayesian Networks
  • Cross-sectional data
  • No explicit notion of time
  • No feedback cycles allowed
  • Edges express causal relations
  • Distribution expressed as $P(V) = \prod_i P(V_i \mid Pa(V_i))$

[Figure: causal Bayesian network over T and V1 to V6]
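
The factorization above can be illustrated with a minimal sketch. The three-node chain V1 -> T -> V5 and all probabilities below are assumptions made for illustration; they are not the network drawn on the slide.

```python
from itertools import product

# Hypothetical chain V1 -> T -> V5 (structure and numbers are assumed).
parents = {"V1": [], "T": ["V1"], "V5": ["T"]}

# cpt[node][parent_values] = P(node = 1 | parents = parent_values)
cpt = {
    "V1": {(): 0.3},
    "T": {(0,): 0.2, (1,): 0.7},
    "V5": {(0,): 0.1, (1,): 0.9},
}

def joint(assignment):
    """P(V) as the product over all nodes of P(V_i | Pa(V_i))."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

names = list(parents)
# Full joint table, keyed by (V1, T, V5) values.
table = {
    vals: joint(dict(zip(names, vals)))
    for vals in product((0, 1), repeat=len(names))
}
total = sum(table.values())
```

Here `table[(1, 1, 1)]` is 0.3 * 0.7 * 0.9, and `total` sums to 1 up to floating-point rounding, as a valid joint distribution must.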

Page 3: A Strategy for Making Predictions under Manipulation

Effect of Manipulation

Manipulate V1, V5

[Figure: the causal network over T and V1 to V6]

Page 4: A Strategy for Making Predictions under Manipulation

Effect of Manipulation

Manipulate V1, V5

[Figure: the original network over T and V1 to V6, next to the manipulated network with an added node E, the External Manipulator]

Page 5: A Strategy for Making Predictions under Manipulation

Effect of Manipulation

Manipulate V1, V5

[Figure: the original network over T and V1 to V6, next to the manipulated network with the external manipulator E]

Other parents of the manipulated variables are removed

Page 6: A Strategy for Making Predictions under Manipulation

Effect of Manipulation

[Figure: the manipulated network over T and V1 to V6 with the external manipulator E]

$P(V) = \prod_i P(V_i \mid Pa(V_i))$

$P_M(V) = \prod_{V_i \notin M} P(V_i \mid Pa(V_i)) \prod_{V_i \in M} P_M(V_i \mid E)$

M: the set of manipulated variables

J. Pearl. Causality: Models, Reasoning, and Inference, 2000.
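
A minimal sketch of the manipulated factorization, under assumed numbers: in a hypothetical chain V1 -> T -> V5, the factor of the manipulated variable V5 is replaced by a distribution set by the manipulator, while every other factor is kept. The chain, the probabilities, and the value of P_M(V5 = 1 | E) are illustrative assumptions.

```python
# Hypothetical chain V1 -> T -> V5 (structure and numbers are assumed).
parents = {"V1": [], "T": ["V1"], "V5": ["T"]}
cpt = {
    "V1": {(): 0.3},
    "T": {(0,): 0.2, (1,): 0.7},
    "V5": {(0,): 0.1, (1,): 0.9},
}
# Assumed P_M(V5 = 1 | E): the manipulator sets V5 by a fair coin flip.
manipulated = {"V5": 0.5}

def joint(a, manipulated=None):
    """P(V), or P_M(V) when a dict of manipulated variables is given."""
    manipulated = manipulated or {}
    p = 1.0
    for node, pa in parents.items():
        if node in manipulated:
            p_one = manipulated[node]  # factor P_M(V_i | E)
        else:
            p_one = cpt[node][tuple(a[q] for q in pa)]  # P(V_i | Pa(V_i))
        p *= p_one if a[node] == 1 else 1.0 - p_one
    return p

def p_t1_given_v5(v5, manipulated=None):
    """P(T = 1 | V5 = v5), computed by summing the joint over V1."""
    num = sum(joint({"V1": v1, "T": 1, "V5": v5}, manipulated) for v1 in (0, 1))
    den = sum(
        joint({"V1": v1, "T": t, "V5": v5}, manipulated)
        for v1 in (0, 1)
        for t in (0, 1)
    )
    return num / den

before = p_t1_given_v5(1)               # V5 is informative about T
after = p_t1_given_v5(1, manipulated)   # V5 no longer depends on T
p_t1 = sum(joint({"V1": v1, "T": 1, "V5": v5}) for v1 in (0, 1) for v5 in (0, 1))
```

With these numbers `before` is about 0.83, whereas `after` equals the marginal P(T = 1) = 0.35 up to rounding: once V5 is manipulated, observing it says nothing about T.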

Page 7: A Strategy for Making Predictions under Manipulation

Types of Predictive Tasks

A. No manipulations

B. Known set of manipulated variables M
  • From data following P(V)
  • Predict data following P_M(V)
  • The way manipulations are performed is unknown, i.e. the P_M(V_i | E) are unknown

C. Unknown M

Page 8: A Strategy for Making Predictions under Manipulation

The Markov Blanket of T

The set of direct causes, direct effects, and direct causes of direct effects of T

[Figure: the causal network over T and V1 to V6]
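
The definition can be sketched as a small function over a DAG stored as a map from each node to its direct causes. The graph below is a hypothetical example chosen for illustration; the edges in the slide's figure are not recoverable from the text.

```python
def markov_blanket(parents, target):
    """Return MB(target): direct causes, direct effects, and the other
    direct causes of those direct effects."""
    children = {v for v, pa in parents.items() if target in pa}
    spouses = {p for c in children for p in parents[c]} - {target}
    return set(parents[target]) | children | spouses

# Hypothetical DAG: node -> list of its direct causes (assumed structure).
dag = {
    "V1": ["V2"],
    "V2": [],
    "V3": [],
    "T": ["V1", "V3"],
    "V4": ["T"],
    "V5": ["T", "V4"],
    "V6": ["V5"],
}

mb = markov_blanket(dag, "T")
print(sorted(mb))  # ['V1', 'V3', 'V4', 'V5']
```

In this assumed graph V4 is both a direct effect of T and a direct cause of the direct effect V5, while V2 and V6 fall outside the blanket.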

Page 9: A Strategy for Making Predictions under Manipulation

The Manipulated Markov Blanket of T

The set of direct causes, direct effects, and direct causes of direct effects of T in the manipulated distribution

E.g. when V1 and V5 are manipulated

[Figure: the causal network over T and V1 to V6]
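
As a sketch, the manipulated Markov blanket can be computed by first cutting the incoming edges of every manipulated node and then reading off the blanket in the resulting graph. The DAG below is a hypothetical example, not the slide's figure, and the external manipulator E is left out because it is not one of the modeled variables.

```python
def markov_blanket(parents, target):
    """Parents, children, and the children's other parents of target."""
    children = {v for v, pa in parents.items() if target in pa}
    spouses = {p for c in children for p in parents[c]} - {target}
    return set(parents[target]) | children | spouses

def manipulate(parents, manipulated):
    """Graph surgery: a manipulated node keeps none of its modeled parents."""
    return {v: ([] if v in manipulated else list(pa)) for v, pa in parents.items()}

# Hypothetical DAG: node -> list of its direct causes (assumed structure).
dag = {
    "V1": ["V2"],
    "V2": [],
    "V3": [],
    "T": ["V1", "V3"],
    "V4": ["T"],
    "V5": ["T", "V4"],
    "V6": ["V5"],
}

mb_m = markov_blanket(manipulate(dag, {"V1", "V5"}), "T")
print(sorted(mb_m))  # ['V1', 'V3', 'V4']
```

In this assumed graph the manipulated effect V5 drops out of the blanket, while the manipulated cause V1 stays in it because its edge into T is untouched.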

Page 10: A Strategy for Making Predictions under Manipulation

Properties of MB(T)

The smallest-size, most-predictive subset of variables

All and only the variables we need for building optimal predictive models

I. Tsamardinos and C. F. Aliferis. Towards principled feature selection: Relevancy, Filters and Wrappers. AI & Statistics, 2003.

Page 11: A Strategy for Making Predictions under Manipulation

A. No Manipulations

  • Find the MB(T)
  • Fit a model from training data for P(T | MB(T)), using only the variables of the MB(T)

Page 12: A Strategy for Making Predictions under Manipulation

B. Known M

  • Find the MB_M(T)
  • Fit a model from training data, using only the variables of the MB_M(T)

Proposition: P_M(T | MB_M(T)) = P(T | MB_M(T)), provided no manipulated spouse of T is a descendant of T in the unmanipulated distribution
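
The side condition of the proposition can be checked mechanically on a known graph. The sketch below assumes that "spouse" refers to the manipulated graph and uses a hypothetical DAG; both are illustrative assumptions, not the authors' implementation.

```python
def descendants(parents, node):
    """All nodes reachable from node by following cause -> effect edges."""
    children = {}
    for v, pa in parents.items():
        for p in pa:
            children.setdefault(p, set()).add(v)
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def spouses(parents, target):
    """Other direct causes of the direct effects of target."""
    kids = [v for v, pa in parents.items() if target in pa]
    return {p for c in kids for p in parents[c]} - {target}

def can_fit_from_unmanipulated(parents, target, manipulated):
    """True if no manipulated spouse of target descends from target."""
    after = {v: ([] if v in manipulated else list(pa)) for v, pa in parents.items()}
    offending = (
        spouses(after, target) & set(manipulated) & descendants(parents, target)
    )
    return not offending

# Hypothetical DAG: node -> list of its direct causes (assumed structure).
dag = {
    "V1": ["V2"],
    "V2": [],
    "V3": [],
    "T": ["V1", "V3"],
    "V4": ["T"],
    "V5": ["T", "V4"],
    "V6": ["V5"],
}

ok = can_fit_from_unmanipulated(dag, "T", {"V1", "V5"})
not_ok = can_fit_from_unmanipulated(dag, "T", {"V1", "V4"})
```

In this assumed graph `ok` is true for M = {V1, V5}, and `not_ok` is false for M = {V1, V4}, because V4 is a manipulated spouse of T that was an effect of T before the manipulation.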

Page 13: A Strategy for Making Predictions under Manipulation

Can Be Fit From Unmanipulated Data

[Figure: the causal network over T and V1 to V6]

M = {V1, V5}

P_M(T | MB_M(T)) = P(T | MB_M(T))

Page 14: A Strategy for Making Predictions under Manipulation

Cannot Be Fit From Unmanipulated Data

[Figure: the causal network over T and V1 to V6]

M = {V1, V4}

P_M(T | MB_M(T)) ≠ P(T | MB_M(T))

Page 15: A Strategy for Making Predictions under Manipulation

Unknown Manipulations M

  • Find the direct causes of T
  • Fit a model from training data, using only the variables that are direct causes of T

Only the direct causes of T remain in MB_M(T) under every possible manipulation
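
This claim can be checked by brute force on a small graph: intersect MB_M(T) over every possible set M of manipulated variables other than T. The DAG is a hypothetical example, and T itself is assumed never to be manipulated.

```python
from itertools import combinations

def markov_blanket(parents, target):
    """Parents, children, and the children's other parents of target."""
    children = {v for v, pa in parents.items() if target in pa}
    spouses = {p for c in children for p in parents[c]} - {target}
    return set(parents[target]) | children | spouses

def manipulate(parents, manipulated):
    """Graph surgery: a manipulated node keeps none of its modeled parents."""
    return {v: ([] if v in manipulated else list(pa)) for v, pa in parents.items()}

# Hypothetical DAG: node -> list of its direct causes (assumed structure).
dag = {
    "V1": ["V2"],
    "V2": [],
    "V3": [],
    "T": ["V1", "V3"],
    "V4": ["T"],
    "V5": ["T", "V4"],
    "V6": ["V5"],
}

others = [v for v in dag if v != "T"]
stable = set(others)
# Intersect MB_M(T) over all 2^6 choices of M.
for k in range(len(others) + 1):
    for m in combinations(others, k):
        stable &= markov_blanket(manipulate(dag, set(m)), "T")

print(sorted(stable))  # ['V1', 'V3']
```

The intersection is exactly the set of direct causes of T in this graph; any effect or spouse can be pushed out of the blanket by some manipulation.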

Page 16: A Strategy for Making Predictions under Manipulation

Learning Bayesian Networks

  • Many algorithms exist that can learn the network
      • Discrete data: MMHC [1]
      • Mixed data: Bach and Jordan [2]
  • Find the graph, find the MB_M(T), fit a model and you are done
  • … or are you?

1. I. Tsamardinos, L. E. Brown, and C. F. Aliferis. Machine Learning, 65(1):31, 2006.
2. F. R. Bach and M. I. Jordan. NIPS-02.

Page 17: A Strategy for Making Predictions under Manipulation

Faithfulness and Parity Functions

  • All BN methods assume Faithfulness
  • Causes and effects have detectable conditional pairwise associations with T
  • T = V1 XOR V3
  • No pairwise association between T and V1

[Figure: network over T, V1, and V3]
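
A minimal sketch of the parity problem, assuming V1 and V3 are independent fair coins (an assumption; the slide does not state their distribution). Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

# Rows are (V1, V3, T) with T = V1 XOR V3; under the fair-coin assumption
# the four rows are equally likely.
rows = [(v1, v3, v1 ^ v3) for v1, v3 in product((0, 1), repeat=2)]

def p_t1(rows, **given):
    """P(T = 1 | given), where given fixes V1 and/or V3 by keyword."""
    index = {"V1": 0, "V3": 1}
    selected = [r for r in rows if all(r[index[k]] == v for k, v in given.items())]
    return Fraction(sum(r[2] for r in selected), len(selected))

# Knowing V1 alone does not change the distribution of T ...
marginal = [p_t1(rows, V1=v) for v in (0, 1)]
# ... but once V3 is also known, V1 determines T completely.
conditional = [p_t1(rows, V1=v, V3=0) for v in (0, 1)]
```

Here `marginal` is [1/2, 1/2], so T and V1 look unrelated in a pairwise test, while `conditional` is [0, 1]: the dependence only appears when V3 is taken into account.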

Page 18: A Strategy for Making Predictions under Manipulation

Parity Functions in Feature Space

  • T = V1 XOR V2
  • No pairwise association between T and V1
  • Construct the new feature V1V2
  • Pairwise associations become apparent

[Figure: network over T, V1, and V2 in the original space, and network over T, V1, V2, and the constructed feature V1V2 in feature space]
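
A small sketch of why a constructed feature helps, assuming the feature is the product V1*V2 with 0/1 coding and that V1 and V2 are independent fair coins. Both are assumptions; the operator between V1 and V2 did not survive in the slide text.

```python
from fractions import Fraction
from itertools import product

# Rows are (V1, V2, F, T) with F = V1 * V2 (assumed) and T = V1 XOR V2.
rows = [(v1, v2, v1 * v2, v1 ^ v2) for v1, v2 in product((0, 1), repeat=2)]

def cov(rows, i, j):
    """Exact covariance of columns i and j over equally likely rows."""
    n = len(rows)

    def mean(k):
        return Fraction(sum(r[k] for r in rows), n)

    return Fraction(sum(r[i] * r[j] for r in rows), n) - mean(i) * mean(j)

cov_v1_t = cov(rows, 0, 3)  # original variable against T
cov_f_t = cov(rows, 2, 3)   # constructed feature against T
```

The covariance between V1 and T is exactly 0, whereas the covariance between the constructed feature and T is -1/8, so a pairwise test can now detect an association.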

Page 19: A Strategy for Making Predictions under Manipulation

Feature Space Markov Blanket

  • Map Data to Feature Space
  • Learn the Markov Blanket in Feature Space

Page 20: A Strategy for Making Predictions under Manipulation

Feature Space Markov Blanket

  • Map Data to Feature Space
      • Brute force is inefficient
      • Indirectly map to feature space using an SVM
      • Assume: low SVM weight of a feature implies low association of the feature with T
      • Produce only the top-weighted features (recently developed heuristic method)
  • Learn the Markov Blanket in Feature Space
      • Run HITON [1]

1. C. F. Aliferis, I. Tsamardinos, and A. Statnikov. AMIA 2003.
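
To give a sense of why the brute-force mapping is inefficient, the sketch below counts the constructed features when every product of up to d distinct variables is enumerated. The choice of product features and the figure of 1000 variables are assumptions for illustration; the SVM-based heuristic itself is not reproduced here.

```python
from math import comb

def n_product_features(n_vars, max_degree):
    """Number of products of 1 to max_degree distinct variables."""
    return sum(comb(n_vars, k) for k in range(1, max_degree + 1))

# Feature counts for a hypothetical data set with 1000 variables.
counts = {d: n_product_features(1000, d) for d in (1, 2, 3)}
print(counts)  # {1: 1000, 2: 500500, 3: 166667500}
```

Already at degree 3 the explicit feature space has over 166 million columns, which motivates producing only the top-weighted features instead of all of them.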

Page 21: A Strategy for Making Predictions under Manipulation

Inducing the MB(T)

  • Run MMMB [1], RFE [2], FSMB [3], and no feature selection
  • Build predictive models
  • If there is a large discrepancy in predictive performance, consult FSMB
  • If there are “parity”-like variables, add the corresponding constructed features to the data before learning the network

1. I. Tsamardinos, C. F. Aliferis, and A. Statnikov. KDD 2003.
2. I. Guyon et al. Machine Learning, 46(1-3):389-422, 2002.
3. Submitted for publication.

Page 22: A Strategy for Making Predictions under Manipulation

Hidden Variables and Confounding

[Figure: network over T and V1 to V6 with hidden variables H1 and H2]

  • H1, H2 are hidden variables
  • Dashed edges appear in the marginal network
  • Marginal MB(T) shown in green

Page 23: A Strategy for Making Predictions under Manipulation

Hidden Variables and Confounding

[Figure: network over T and V1 to V6 with hidden variables H1 and H2]

  • H1, H2 are hidden variables
  • Dashed edges appear in the marginal network
  • Reddish edges are “removed” by manipulations
  • Manipulations of V5, V3 lead to errors in estimating MB_M(T) (bluish nodes)

Page 24: A Strategy for Making Predictions under Manipulation

Finding Non-Confounded Edges

[Figure: network over T, V1, V2, V3, V5, and V6]

Proposition: Let V = O ∪ H, where the variables in O are observable and those in H are not, and let P(V) be faithful to a Causal Bayesian Network. If

1. ∀ S ⊆ O, ¬I(V1 ; T | S)
2. ∀ S ⊆ O, ¬I(V3 ; T | S)
3. ∀ S ⊆ O, ¬I(V5 ; T | S)
4. ∃ Z1 ⊆ O, s.t. I(V1 ; V3 | Z1)
5. ∃ Z2 ⊆ O, s.t. I(V1 ; V5 | Z2)
6. ¬I(V1 ; V3 | Z1 ∪ {T})
7. I(V1 ; V5 | Z2 ∪ {T})

then there is a causal path from T to V5 (the edge T → V5 is causal)

Page 25: A Strategy for Making Predictions under Manipulation

Finding Non-Confounded Edges

[Figure: the network of the previous slide over T, V1, V2, V3, V5, and V6, with a hidden variable H added]

Proposition: as stated on the previous slide

Page 26: A Strategy for Making Predictions under Manipulation

Finding Non-Confounded Edges

  • Use the test to
      • Orient some edges
      • Find truly causal (non-confounded) edges
  • Extension of the basic idea presented in [1]

1. S. Mani, P. Spirtes, and G. F. Cooper. UAI 2006.

Page 27: A Strategy for Making Predictions under Manipulation

Finding the MB_M(T)

  • Edge existence: BN learning algorithm
  • Edge orientation:
      • Learn the network, convert to PDAG, obtain compelled edges
      • Confounding test
  • Edge confounding:
      • Confounding test
  • Weigh the evidence and decide on orientation and absence of confounding

Page 28: A Strategy for Making Predictions under Manipulation

Finding the MB_M(T)

[Figure: network over T and V1 to V7. Legend: non-confounded edges; oriented edges that could be confounded; undirected edges; manipulated nodes]

  • Are V7, V3 part of MB_M(T)?
  • Is V4 part of MB_M(T)?

Page 29: A Strategy for Making Predictions under Manipulation

Results

Page 30: A Strategy for Making Predictions under Manipulation

Limitations

  • Most time spent on REGED
  • Conditional independence tests were sometimes inappropriate
  • New methods not optimized or fully tested
  • Model averaging should be used
  • Formal methods for weighing the evidence are needed

Page 31: A Strategy for Making Predictions under Manipulation

Conclusions

  • General basis of theory and algorithms for predictions under manipulation
  • New algorithms for addressing lack of faithfulness and hidden confounding variables
  • The strategy can be implemented using the new and existing algorithms
  • Many open directions/problems
      • Faithfulness
      • Acyclicity
      • Hidden variables
      • Timed data