Fast and Accurate MRFs through Evidence-Specific Structures. Veselin Stoyanov and Jason Eisner, HLT/COE and CLSP, Johns Hopkins University


Page 1: Fast and Accurate MRFs through Evidence-Specific Structures 

Fast and Accurate MRFs through Evidence-Specific Structures

Veselin Stoyanov and Jason Eisner

HLT/COE and CLSP

Johns Hopkins University

Page 2: Fast and Accurate MRFs through Evidence-Specific Structures 

Why Evidence-Specific MRFs?

• Data may be missing heterogeneously.
  – For example, the relational case.

• We want to learn evidence-specific structures in order to:
  – Improve accuracy (under approximate inference)
  – Speed up inference (while keeping accuracy high)

Page 3: Fast and Accurate MRFs through Evidence-Specific Structures 

Markov Random Fields

[Figure: example MRF graph; node labels A, B, C, D, E]

Page 4: Fast and Accurate MRFs through Evidence-Specific Structures 

Heterogeneous Evidence

[Figure: the example MRF graph; node labels A, B, C, D, E]

Page 5: Fast and Accurate MRFs through Evidence-Specific Structures 

Heterogeneous Evidence

[Figure: the example MRF graph; node labels A, B, C, D, E]

Page 6: Fast and Accurate MRFs through Evidence-Specific Structures 

Evidence Specific Structures

[Figure: the example MRF graph; node labels A, B, C, D, E]

Page 7: Fast and Accurate MRFs through Evidence-Specific Structures 

Evidence Specific Structures through Gates

• Gates [Minka and Winn, 2008]

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 8: Fast and Accurate MRFs through Evidence-Specific Structures 

Evidence Specific Structures through Gates

• Conditioned on the inputs

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 9: Fast and Accurate MRFs through Evidence-Specific Structures 

Evidence Specific Structures through Gates

• Gates [Minka and Winn, 2008]
  – A formalism for expressing contextual independence
• In vanilla MRF:
• Including a gate g ∈ {0,1} to control each factor:
• A soft version:
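The three cases can be written out in standard notation. This is a sketch that assumes the usual gated-factor form, in which a gate acts as an exponent on its factor. The symbols are notational assumptions rather than the slide's own: \psi_\alpha is the potential of factor \alpha, x_\alpha the variables it touches, g_\alpha its gate, and Z the normalizer.

```latex
% Vanilla MRF: every factor always participates.
p(x) = \frac{1}{Z} \prod_{\alpha} \psi_\alpha(x_\alpha)

% Hard gates: a factor with g_\alpha = 0 contributes a constant 1, i.e. it is switched off.
p(x \mid g) = \frac{1}{Z(g)} \prod_{\alpha} \psi_\alpha(x_\alpha)^{g_\alpha},
\qquad g_\alpha \in \{0, 1\}

% Soft gates: the same form with the exponent relaxed to the unit interval.
p(x \mid g) = \frac{1}{Z(g)} \prod_{\alpha} \psi_\alpha(x_\alpha)^{g_\alpha},
\qquad g_\alpha \in [0, 1]
```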

Page 10: Fast and Accurate MRFs through Evidence-Specific Structures 

Evidence Specific Structures through Gates

• Each gate is a classifier that decides whether the corresponding factor should be on or off
  – Using features of the observation pattern and/or the observed values
• Inference time can be expressed as a function of the number of gates that are on
• Gates can also be partially on:
  – Damp messages over the corresponding factors
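A gate of this kind can be sketched as a small logistic classifier over binary features. Everything here is illustrative: the function names, the weights, and the damping rule (message raised to the power of the gate value, then renormalized) are assumptions, since the slides do not specify how messages are damped.

```python
import math

def gate_value(weights, active_features, bias=0.0):
    """Logistic classifier: degree (0..1) to which the factor is on."""
    score = bias + sum(weights.get(name, 0.0) for name in active_features)
    return 1.0 / (1.0 + math.exp(-score))

def damp_message(message, gate):
    """Assumed damping rule: exponentiate by the gate value and renormalize.

    gate = 1 leaves the message unchanged; gate = 0 makes it uniform,
    which is the same as switching the factor off.
    """
    powered = [entry ** gate for entry in message]
    total = sum(powered)
    return [entry / total for entry in powered]

# Hypothetical learned weights for the gate of one factor.
weights = {"fab_1_o": 2.0, "fab_h_h": -1.5}

print(round(gate_value(weights, ["fab_1_o"]), 3))  # 0.881
print(damp_message([0.9, 0.1], 0.0))               # [0.5, 0.5]
print(damp_message([0.9, 0.1], 0.5))               # somewhere between uniform and the original
```

Thresholding `gate_value` gives the hard on/off decision; using it directly in `damp_message` gives the partially-on behavior.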

Page 11: Fast and Accurate MRFs through Evidence-Specific Structures 

A Two-step test-time process

1. Compute gate values

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 12: Fast and Accurate MRFs through Evidence-Specific Structures 

A Two-step test-time process

2. Run inference on the active factors

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 13: Fast and Accurate MRFs through Evidence-Specific Structures 

A Two-step test-time process

2. Run inference on the active factors

[Figure: the same graph with only gates G1, G2, G3, G4 and G7 remaining]

Page 14: Fast and Accurate MRFs through Evidence-Specific Structures 

A Two-step test-time process

2. Run inference on the active factors

[Figure: the same graph with only gates G1, G3 and G4 remaining]

Page 15: Fast and Accurate MRFs through Evidence-Specific Structures 

A Two-step test-time process

2. Run inference on the active factors

[Figure: the resulting graph over nodes A, B, C, D, E, F, with no gates shown]
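The two steps can be sketched on a toy graph. This is a minimal illustration under stated assumptions: gate values are passed in rather than computed, a fixed threshold of 0.5 decides which factors are active, and brute-force enumeration stands in for the approximate inference routine. All names and numbers are hypothetical.

```python
from itertools import product

def exact_marginals(variables, factors):
    """P(v = 1) for each binary variable, by brute-force enumeration.

    Stands in for the inference routine; only feasible on tiny graphs.
    """
    totals = {v: 0.0 for v in variables}
    z = 0.0
    for values in product([0, 1], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = 1.0
        for scope, potential in factors.items():
            weight *= potential(*(assignment[v] for v in scope))
        z += weight
        for v in variables:
            if assignment[v] == 1:
                totals[v] += weight
    return {v: totals[v] / z for v in variables}

def two_step_inference(variables, factors, gate_values, threshold=0.5):
    # Step 1: gate values come from the gate classifiers (passed in here).
    active = {scope: pot for scope, pot in factors.items()
              if gate_values[scope] >= threshold}
    # Step 2: inference sees only the factors whose gates are on.
    return exact_marginals(variables, active), sorted(active)

def agree(a, b):
    """Pairwise potential that favors equal values."""
    return 2.0 if a == b else 1.0

factors = {
    ("A",): lambda a: 3.0 if a == 1 else 1.0,  # bias on A
    ("A", "B"): agree,
    ("B", "C"): agree,
}
gate_values = {("A",): 1.0, ("A", "B"): 0.9, ("B", "C"): 0.1}  # hypothetical

marginals, active = two_step_inference(["A", "B", "C"], factors, gate_values)
print(active)     # [('A',), ('A', 'B')]
print(marginals)  # C stays at 0.5 because its only factor is gated off
```

With the B-C factor switched off, C is disconnected and no work is spent passing information to it, which is where the speed-up comes from.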

Page 16: Fast and Accurate MRFs through Evidence-Specific Structures 

Gates are Random Variables

We can define an MRF over the gates:

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 17: Fast and Accurate MRFs through Evidence-Specific Structures 

Gates are Random Variables

We can define an MRF over the gates:

[Figure: MRF graph with nodes A, B, C, D, E, F and gates G1 to G7 attached to its factors]

Page 18: Fast and Accurate MRFs through Evidence-Specific Structures 

Training

• The objective:
  – Learn MRF parameters and gate feature weights that minimize a loss combining error and inference cost on training data.
  – We will use ERMA [Stoyanov, Ropson and Eisner, 2011] to jointly learn the MRF and gate parameters
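One way to write such an objective is a weighted sum of a task loss and a runtime term. The form below is an illustrative assumption, not the slide's formula: \theta are the MRF parameters, w the gate parameters, \ell a task loss such as MSE, T a runtime measure such as the number of messages sent, and \lambda a trade-off weight.

```latex
\min_{\theta,\, w} \;\; \frac{1}{N} \sum_{i=1}^{N}
  \Big[ \ell\big(\hat{y}_{\theta, w}(x^{(i)}),\; y^{(i)}\big)
        \;+\; \lambda \, T_{w}\big(x^{(i)}\big) \Big]
```

Setting \lambda = 0 optimizes accuracy alone; larger values push more gates off in exchange for faster inference.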

Page 19: Fast and Accurate MRFs through Evidence-Specific Structures 

ERMA

• Empirical Risk Minimization under Approximations
  – An algorithm for learning in graphical models that will be used with approximations
  – Uses back-propagation of error to compute gradients of loss with respect to parameters
  – A local optimizer to find parameters that (locally) minimize loss
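The idea of minimizing loss through the approximate inference procedure itself can be sketched on a two-variable model. This is not ERMA's implementation: central finite differences stand in for back-propagation, a fixed number of mean-field updates stands in for the approximate inference, and plain gradient descent is the local optimizer. The model, names, and data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def approx_marginal_b(theta, iterations=3):
    """Approximate P(B = 1) with a fixed number of mean-field updates.

    theta = [bias on A, coupling strength between A and B].
    """
    theta_a, theta_ab = theta
    q_a, q_b = 0.5, 0.5
    for _ in range(iterations):
        q_a = sigmoid(theta_a + theta_ab * (2.0 * q_b - 1.0))
        q_b = sigmoid(theta_ab * (2.0 * q_a - 1.0))
    return q_b

def empirical_risk(theta, targets):
    """Mean squared error of the approximate marginal against observed B values."""
    prediction = approx_marginal_b(theta)
    return sum((prediction - t) ** 2 for t in targets) / len(targets)

def finite_difference_gradient(f, theta, eps=1e-5):
    """Central differences; a stand-in for back-propagation."""
    gradient = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        gradient.append((f(up) - f(down)) / (2.0 * eps))
    return gradient

def train(targets, theta, steps=200, learning_rate=0.5):
    """Gradient descent on the risk of the approximate predictor."""
    for _ in range(steps):
        gradient = finite_difference_gradient(
            lambda th: empirical_risk(th, targets), theta)
        theta = [t - learning_rate * g for t, g in zip(theta, gradient)]
    return theta

targets = [1, 1, 1, 0]   # hypothetical observed values of B
initial = [0.5, 0.5]
learned = train(targets, initial)
print(empirical_risk(initial, targets), empirical_risk(learned, targets))
```

The point of the sketch is that the loss is measured on the output of the truncated inference loop, so the learned parameters compensate for the approximation rather than assuming exact inference.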

Page 20: Fast and Accurate MRFs through Evidence-Specific Structures 

Experiments

• Synthetic data
  – Sample a random 4-ary MRF.
  – Sample training and test data.
  – Forget the structure.
  – Start learning with a fully connected but binary graph.

Page 21: Fast and Accurate MRFs through Evidence-Specific Structures 

Experiments

• Gate features
  – A conjunction of factor id and r.v. values
• Say GAB controls factor FAB between r.v.’s A and B.
• A (and B) can be {1, 0, hidden, output}
• Features for GAB are:
  – fab_{1,0,h,o}_{1,0,h,o}
• For instance, if A = 1 and B is output:
  – Feature fab_1_o will fire
• Total number of features for a gate is 4×4 − 2×2 = 12 (we can exclude factors for which both r.v.’s are observed)
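The feature scheme above can be enumerated directly. A minimal sketch, assuming the naming pattern from the slide; the function names are made up for illustration.

```python
from itertools import product

STATES = ["1", "0", "h", "o"]   # observed 1, observed 0, hidden, output
OBSERVED = {"1", "0"}

def gate_feature_names(factor_id):
    """All features for one gate, skipping cases where both r.v.'s are observed."""
    names = []
    for a_state, b_state in product(STATES, repeat=2):
        if a_state in OBSERVED and b_state in OBSERVED:
            continue
        names.append(f"{factor_id}_{a_state}_{b_state}")
    return names

def firing_feature(factor_id, a_state, b_state):
    """The single feature that fires for a given pair of variable states."""
    return f"{factor_id}_{a_state}_{b_state}"

features = gate_feature_names("fab")
print(len(features))                    # 12
print(firing_feature("fab", "1", "o"))  # fab_1_o
```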

Page 22: Fast and Accurate MRFs through Evidence-Specific Structures 

Results

[Figure: MSE vs. Number of Messages. x-axis: number of messages, 0 to 5,000,000; y-axis: MSE, 0.19 to 0.26; series: L1 and ES]

Page 23: Fast and Accurate MRFs through Evidence-Specific Structures 

Conclusions and Future Work

• Evidence-specific structures can result in more accurate and/or faster models.

• Gates and ERMA can be utilized for representing and learning conditional independences.

• We will be applying our model to relational data.