Semi-supervised enhancer prediction using the Segway framework Orion J. Buske, Tzitziki Lemus, Michael M. Hoffman, Jeff A. Bilmes, William Stafford Noble Department of Genome Sciences University of Washington

Semi-supervised enhancer prediction using the Segway framework · buske_201003_ENCODE_presentation Author: Orion Buske Created Date: 3/11/2010 4:21:29 AM

Page 1

Semi-supervised enhancer prediction using the Segway framework

Orion J. Buske, Tzitziki Lemus, Michael M. Hoffman, Jeff A. Bilmes, William Stafford Noble

Department of Genome Sciences, University of Washington

Page 2

[Figure: unsupervised segmentation, shown as a sequence of integer labels: … 1 2 0 2 1 0 0 …]

Page 3

[Figure: label sequences for the two training modes]

Unsupervised: … 1 2 0 2 1 0 0 …

Semi-supervised: … 2 0 2 1 0 0 …

Legend: novel; known p300 peaks (Heintzman et al. 2009)

Page 4

[Figure: precision-recall plot, with arrows marking the better and worse directions]

Recall: fraction of p300 sites overlapped by predictions

Precision: fraction of predictions that overlap p300 sites
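
The two overlap-based definitions on this slide can be sketched in a few lines. This is a minimal illustration, not the evaluation code behind the presentation: the half-open interval convention, the single-chromosome simplification, the function names, and the toy coordinates are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals on a single chromosome.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall(predictions: List[Interval],
                     sites: List[Interval]) -> Tuple[float, float]:
    """Overlap-based precision and recall.

    precision = fraction of predictions that overlap any site
    recall    = fraction of sites overlapped by any prediction
    """
    hit_predictions = sum(any(overlaps(p, s) for s in sites) for p in predictions)
    hit_sites = sum(any(overlaps(s, p) for p in predictions) for s in sites)
    precision = hit_predictions / len(predictions) if predictions else 0.0
    recall = hit_sites / len(sites) if sites else 0.0
    return precision, recall


# Toy coordinates, made up for illustration (not data from the presentation).
predictions = [(100, 200), (500, 600), (900, 1000), (1500, 1600)]
p300_sites = [(150, 250), (950, 1050), (2000, 2100)]
precision, recall = precision_recall(predictions, p300_sites)
print(f"precision: {precision:.2f}  recall: {recall:.2f}")
```

Note that the two fractions use different denominators (predictions versus p300 sites), so a single wide prediction covering several sites raises recall without adding more than one hit to precision.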

Page 5

[Figure: precision-recall plot]

Page 6

Tracks: CTCF, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me1, H3K27ac, H3K27me3, H3K36me3, H4K20me1, DNaseI, Pol2

Legend: predicted, observed

Higher: H3K4me1, H3K9me1, H3K36me3, H4K20me1, Input, BDP1, BRF1, GATA1, JunD

Lower: H3K4me3, DNaseI, CTCF, Pol2, TAF1

Example

Semi-supervised label (P-SS): precision 0.27, recall 0.56

Page 7

Segway hypothesizes more than one type of p300 site

Semi-supervised label (P-SS): precision 0.27, recall 0.56

Page 8

[Figure: comparison of labels P-SS, P-2 and P-3; scale runs from lower to higher]

Semi-supervised label (P-SS): precision 0.27, recall 0.56

Unsupervised labels: P-2, P-3

Page 9

[Figure: labels P-SS, P-2 and P-3; scale runs from lower to higher; annotation: similar]

P-SS: H3K4me3, TAF1, Pol2, H3K27ac, ZNF267

P-2: H3K4me3, TAF1, Pol3, H3K4me1, cFos

P-3: H3K4me3, TAF1, Pol2, H3K9ac, CTCF

Subtypes correspond to active/repressed chromatin states

Page 10

Combined P-SS, P-2, P-3: precision 0.21, recall 0.91

With combined labels, we achieve comparable precision with excellent recall

[Figure: labels P-SS, P-2, P-3]
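
One way to picture the combined prediction set is to pool the segments carrying any of the three labels and merge those that overlap or touch. The slides do not say how the combination was actually done, so the sketch below is only an assumption; the function name and the coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: half-open [start, end) intervals on a single chromosome.
Interval = Tuple[int, int]


def combine_labels(segments_by_label: Dict[str, List[Interval]]) -> List[Interval]:
    """Pool the segments of all labels and merge overlapping or touching ones."""
    pooled = sorted(seg for segs in segments_by_label.values() for seg in segs)
    merged: List[Interval] = []
    for start, end in pooled:
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy segments, made up for illustration (not data from the presentation).
segments_by_label = {
    "P-SS": [(100, 200), (900, 1000)],
    "P-2": [(200, 300), (5000, 5100)],
    "P-3": [(950, 1100)],
}
combined = combine_labels(segments_by_label)
print(combined)
```

Pooling can only add predictions, which is consistent with the direction of the change reported on the slide: recall rises (0.56 to 0.91) while precision drops somewhat (0.27 to 0.21).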

Page 11

At least two segments within 1 kb (P-SS, P-2, P-3): precision 0.31, recall 0.77

With multiple predicted sites in close proximity, we improve precision with good recall

[Figure: labels P-SS, P-2, P-3]
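
The proximity criterion can be sketched as a filter that keeps a predicted segment only when another predicted segment lies within 1 kb of it. The slide does not define exactly how "within 1 kb" was measured, so the edge-to-edge gap used here, the function names, and the coordinates are assumptions for illustration only.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals on a single chromosome.
Interval = Tuple[int, int]


def gap(a: Interval, b: Interval) -> int:
    """Edge-to-edge distance in bases; 0 if the intervals overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def clustered(segments: List[Interval], max_gap: int = 1000) -> List[Interval]:
    """Keep segments that have at least one other segment within max_gap bases."""
    kept = []
    for i, seg in enumerate(segments):
        has_neighbor = any(
            j != i and gap(seg, other) <= max_gap
            for j, other in enumerate(segments)
        )
        if has_neighbor:
            kept.append(seg)
    return kept


# Toy segments, made up for illustration (not data from the presentation).
segments = [(100, 300), (900, 1100), (5000, 5100), (9000, 9200), (9800, 9900)]
filtered = clustered(segments)
print(filtered)
```

The pairwise scan is quadratic in the number of segments, which is fine for a sketch; sorted input would allow checking only adjacent neighbors.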

Page 12

Acknowledgements

Availability

Segway: http://noble.gs.washington.edu/proj/segway

Segtools: http://noble.gs.washington.edu/proj/segtools

ENCODE Project Consortium

NHGRI

Page 13

Tracks: CTCF, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me1, H3K27ac, H3K27me3, H3K36me3, H4K20me1, DNase, Pol2

Legend: predicted (TP), observed (TP), false negative

False negatives have higher mean signal than true positives

Lower: GATA1

Higher: H3K4me3, Pol2, Pol3, TAF1

Example