Epigenomics The many garments of the genome sequence Winterschool Brisbane, 2014 Dr Fabian Buske Garvan Institute of Medical Research

Fabian Buske - Epigenomics - The many garments of the genome sequence


DESCRIPTION

Epigenetic modifications are reversible modifications on the DNA that affect gene expression without changing the actual genome sequence. The spectrum of modifications ranges from DNA methylation, histone modification and nucleosome positioning to DNA packaging and chromatin organization in three-dimensional space. This presentation will highlight different assays and bioinformatic approaches used to query epigenetic modifications genome-wide as well as how these layers of information can be integrated into meaningful models. First presented at the 2014 Winter School in Mathematical and Computational Biology http://bioinformatics.org.au/ws14/program/


Page 1: Fabian Buske - Epigenomics - The many garments of the genome sequence

Epigenomics: The many garments of the genome sequence

Winterschool Brisbane, 2014

Dr Fabian Buske Garvan Institute of Medical Research

Page 2

Sequencing has revolutionised life sciences

Epigenetics: ChIP-Seq, WGBS, HiC, DNaseHS, Repli-seq, …

Transcriptomics: RNA-seq, CAGE-seq, Capture-seq, …

Genomics: WGS, ExonCapture, …

Page 3

Epigenetics

the study of heritable changes that occur without a change in the DNA sequence

Page 4

Epigenetics

http://www.youtube.com/watch?v=Tj_6DcUTRnM

Page 5

Outline

• DNA methylation

- Whole Genome Bisulphite Sequencing

• Histone modification

- Chromatin Immunoprecipitation Sequencing

• DNA looping

- Chromosome Conformation Capture (HiC)

Page 6

DNA methylation

• Addition of a methyl group to the 5-carbon of cytosine in DNA (5mC)

• In mammals, almost exclusively occurs at CpG dinucleotides in a strand symmetrical manner

- Strand symmetry allows for stable inheritance through cell divisions via DNMT1 maintenance

- ~28M CpG sites in the human genome

- Majority are methylated

- Except the 3.9M in/adjacent to CpG islands
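The strand symmetry of CpG sites can be illustrated with a small sketch. The toy sequence and the helper names `cpg_positions` and `reverse_complement` are assumptions made for illustration; the point is only that the reverse complement of "CG" is again "CG", so every CpG on one strand has a partner CpG on the other.

```python
def cpg_positions(seq: str) -> list:
    """Return 0-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


seq = "ATCGGCGTACG"  # toy sequence, not real genomic data
forward = cpg_positions(seq)
reverse = cpg_positions(reverse_complement(seq))

# Both strands carry the same number of CpG sites.
print(forward, reverse)
```

Because each CpG pairs with a CpG on the opposite strand, the methylation state of the parental strand can serve as a template for the newly synthesised strand after replication.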

Page 7

Why study DNA methylation?

• Has demonstrated roles in

- Cellular programming

- dynamic during development/differentiation

- Genomic imprinting/X-inactivation

• 5mC presence is anti-correlated with “activity” of a DNA sequence

- Promoters, gene bodies, distal regulatory elements, insulators

- MBPs bind 5mC to repress the surrounding chromatin

• Is stable and relatively easily assayable

- Covalent modification of the DNA

Page 8

DNA methylation & cancer

• Aberrant promoter methylation in cancer is associated with tumour suppressor gene silencing

- Occurs at enhancers/insulators as well

• Alterations in other diseases are relatively poorly studied

Page 9

How do we study DNA methylation?

• Bisulfite treatment deaminates unmethylated cytosines to uracil

- Uracil is converted to thymine via PCR

- 5mC is unaffected, therefore remains as cytosine after PCR

‣ Methylation is then assayable as a SNP

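[Figure: sheared, methylated DNA before and after bisulfite treatment and PCR. Example: CGTCT (first C methylated), CGTUT after bisulfite treatment, CGTTT after PCR]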
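The conversion on this slide can be reproduced in a few lines. This is a minimal sketch: the function names `bisulfite_treat` and `pcr_amplify` are made up for illustration, and methylated positions are passed in as a set of 0-based indices.

```python
def bisulfite_treat(seq: str, methylated: set) -> str:
    """Deaminate every unmethylated C to U; methylated Cs (5mC) are protected."""
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def pcr_amplify(seq: str) -> str:
    """During PCR, uracil is copied as thymine."""
    return seq.replace("U", "T")


template = "CGTCT"  # sequence from the slide; the C at index 0 is methylated
treated = bisulfite_treat(template, methylated={0})
amplified = pcr_amplify(treated)

print(template, treated, amplified)  # CGTCT CGTUT CGTTT
```

Comparing `amplified` with `template` shows a C-to-T difference only at the unmethylated cytosine, which is why methylation can be read out like a SNP.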

Page 10

Whole genome bisulphite sequencing

Benefits

• Assays all mappable CpG sites (~27M)

• Get a “free” genome sequence at the same time

Caveats

• Quantitation ability is proportional to depth of sequencing (count Cs vs Ts)

- Detecting a 10% change in 5mC at a single site requires lots of coverage

- Pooling possible as adjacent CpG sites are correlated

• Expensive, low throughput, µgs of DNA needed

• Analysis is not straightforward, few methods are available

Library preparation is basically the same as for WGS but with a bisulfite step and a different polymerase (uracil-tolerant proofreader)
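Why quantitation depends on depth can be sketched numerically. The methylation level at a site is estimated as C / (C + T). The Wilson score interval used here is one common choice for a binomial proportion and is an assumption of this sketch, not a method named in the talk.

```python
import math


def methylation_level(c_count: int, t_count: int) -> float:
    """Fraction of reads supporting methylation at one cytosine."""
    return c_count / (c_count + t_count)


def wilson_interval(c_count: int, t_count: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for the methylation level."""
    n = c_count + t_count
    p = c_count / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Same underlying level (80% methylated) at increasing coverage.
for coverage in (10, 30, 100, 1000):
    c = round(0.8 * coverage)
    low, high = wilson_interval(c, coverage - c)
    print(f"{coverage:>5}x  level={methylation_level(c, coverage - c):.2f}  "
          f"interval=({low:.2f}, {high:.2f})")
```

At low coverage the interval is far wider than a 10% difference, which is the motivation for pooling counts across adjacent, correlated CpG sites.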

Page 11

Data analysis of methylated regions

• Mapping against an in-silico bisulfite-treated genome (Bismark)

• Discovery of active regulatory regions de novo (MethylSeekR - HMM)

• Differentially Methylated Regions between patient cohorts/treatments/conditions (bioconductor bsseq)
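The idea behind mapping against an in-silico bisulfite-treated genome can be shown with a toy example. This is a forward-strand-only sketch using exact string search; it is not Bismark's implementation, which also handles the G-to-A converted strand and relies on a real short-read aligner. The function names and sequences are illustrative assumptions.

```python
def c_to_t(seq: str) -> str:
    """Fully convert a sequence in silico (every C becomes T)."""
    return seq.upper().replace("C", "T")


def align_and_call(read: str, reference: str):
    """Locate a read on the converted reference, then call methylation
    by comparing the ORIGINAL read with the ORIGINAL reference."""
    pos = c_to_t(reference).find(c_to_t(read))
    if pos < 0:
        return None  # no exact match in converted space
    calls = {}
    for i, base in enumerate(read.upper()):
        if reference[pos + i].upper() != "C":
            continue
        if base == "C":
            calls[pos + i] = True   # C retained: methylated
        elif base == "T":
            calls[pos + i] = False  # C read as T: unmethylated
    return pos, calls


reference = "AACGTCTGA"  # toy reference
read = "CGTTT"           # bisulfite read: first C protected, second converted
result = align_and_call(read, reference)
print(result)  # (2, {2: True, 5: False})
```

Converting both read and reference removes the C/T mismatches that would otherwise confuse the aligner; the methylation information is recovered afterwards from the unconverted sequences.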

Page 12

Histones

The nucleosome is composed of two copies of each of the four core histones (i.e., H2A, H2B, H3, and H4), around which 146 bp of DNA are wrapped

The N-terminal tails of histone polypeptides can be modified by more than 100 different post-translational modifications including methylation, acetylation, phosphorylation, and ubiquitination

Page 13

Why study Histone modifications?

important epigenetic mechanism in transcriptional regulation through modification of the chromatin structure or through chromatin condensation

interplay between histone modifications and DNA methylation defines the developmental potential of a cell

chromatin profiling is especially well suited to the characterisation of non-coding portions of the genome in a tissue-specific manner

Page 14

How do we study Histones?

• Chromatin Immunoprecipitation with subsequent sequencing (ChIP-Seq)

- crosslinking of proteins to DNA

- enrichment with specific antibody

- sequencing

‣ Analysis of histone mark deposition via read density

[Figure: ChIP workflow. Labels: DNA-protein complex, crosslink proteins and DNA, sample fragmentation, immunoprecipitate, DNA extraction]
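"Read density" can be made concrete by counting mapped reads in fixed-width bins along a chromosome. The read positions and the bin size below are invented for illustration; real pipelines work from aligned reads in BAM files rather than from a Python list.

```python
from collections import Counter


def read_density(read_starts, bin_size: int) -> dict:
    """Count reads per fixed-width bin, keyed by the bin's start coordinate."""
    counts = Counter((pos // bin_size) * bin_size for pos in read_starts)
    return dict(sorted(counts.items()))


# Hypothetical 5' positions of mapped ChIP reads on one chromosome.
ip_reads = [105, 110, 130, 180, 950, 1210, 1220, 1250, 1260, 1290]
density = read_density(ip_reads, bin_size=200)
print(density)  # {0: 4, 800: 1, 1200: 5}
```

Bins with many reads are candidate regions where the histone mark is deposited; whether they are truly enriched has to be judged against a control.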

Page 15

ChIP-Seq

Benefits

• Captures genome-wide tissue-specific protein-DNA interactions

• Relatively cheap compared to WGBS, HiC

Caveats

• Highly dependent on an available antibody and its specificity

• ~20-60M reads depending on the fraction of the genome anticipated to be bound

• Controls (input) need to be sequenced deeper than the actual IP library
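One way to see why the input control matters is a fold-enrichment calculation for a single region. Scaling by library size and adding a pseudocount are common conventions assumed here, not steps prescribed in the talk; the function name and all counts are hypothetical.

```python
def fold_enrichment(ip_count: int, input_count: int,
                    ip_total: int, input_total: int,
                    pseudocount: float = 1.0) -> float:
    """IP signal over input background for one region,
    with the input scaled to the IP library size."""
    scale = ip_total / input_total
    return (ip_count + pseudocount) / (input_count * scale + pseudocount)


# Hypothetical region: 50 IP reads vs 10 input reads,
# with the input library sequenced twice as deep as the IP.
enrichment = fold_enrichment(50, 10, ip_total=1_000_000, input_total=2_000_000)
print(enrichment)  # 8.5
```

A deeper input library gives larger background counts per region, so the denominator is less affected by sampling noise and by the pseudocount.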

Page 16

ChIP-Seq data analysis

Page 17

Mapping to the sequence space

Transcribed in cancer cells

Transcribed in normal cells

Page 18

DNA looping

• The DNA fiber is a flexible polymer

• DNA looping enables genomic regions that are distant in sequence space to come into close physical proximity and thus relay signals (e.g. enhancers and promoters)

Page 19

1D vs 3D

Page 20

Sequencing based 3D assays

[Figure: assays (3C, 4C, 5C, ChIA-PET, HiC, Capture-C) arranged by cardinality (one-to-one to all-to-all) and resolution (high, bp, to low, Mb), with cost ranging from $ to $$$ and a marked “sweet spot”]

quadratic nature of “all versus all” data
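The quadratic growth can be quantified by counting the cells of a genome-wide contact matrix at different bin sizes. The genome size of 3.1 Gb is an approximate figure assumed for illustration, and the helper names are made up.

```python
def n_bins(genome_size: int, bin_size: int) -> int:
    """Number of fixed-width bins needed to cover the genome (ceiling division)."""
    return (genome_size + bin_size - 1) // bin_size


def n_bin_pairs(genome_size: int, bin_size: int) -> int:
    """Unordered bin pairs, including each bin paired with itself."""
    n = n_bins(genome_size, bin_size)
    return n * (n + 1) // 2


GENOME_SIZE = 3_100_000_000  # assumed approximate human genome size in bp

for bin_size in (1_000_000, 100_000, 10_000):
    print(f"{bin_size:>9} bp bins: {n_bins(GENOME_SIZE, bin_size):>7} bins, "
          f"{n_bin_pairs(GENOME_SIZE, bin_size):>15} pairs")
```

Each tenfold gain in resolution multiplies the number of bin pairs by roughly one hundred, so the sequencing depth needed to fill the matrix grows accordingly.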

Page 21

3C/HiC protocol

• HiC: Before ligation, the restriction ends are filled in with biotin-labeled nucleotides.

[Figure: protocol steps. DNA, crosslink proteins and DNA, sample fragmentation via restriction enzymes, ligation, PCR amplify ligated junctions]
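Because fragmentation is done with a restriction enzyme, the possible fragment ends are known in advance and can be computed from the reference sequence. The sketch below assumes HindIII (recognition site AAGCTT, cutting after the first A) purely as an example; the talk does not say which enzyme was used, and the sequence is a toy.

```python
def digest(sequence: str, site: str = "AAGCTT", cut_offset: int = 1) -> list:
    """In-silico digest: return (start, end) coordinates of restriction
    fragments, 0-based and end-exclusive."""
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


toy_sequence = "GGGAAGCTTCCCCAAGCTTTT"  # two HindIII sites
fragments = digest(toy_sequence)
print(fragments)  # [(0, 4), (4, 14), (14, 21)]
```

Each end of a sequenced ligation junction can then be assigned to the fragment it falls in, so a read pair becomes evidence for contact between two restriction fragments.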

Page 22

HiC data processing

http://www.bioinformatics.babraham.ac.uk/projects/hicup/

Page 23

Page 24

Take Home Messages

• Epigenetics: the study of heritable changes that occur without a change in the DNA sequence

• Variety of assays available for the interrogation of the epigenetic state genome-wide

• Lots of public data available (ENCODE, Epigenome Roadmap, GEO)

• Understand the biological question and the wet-lab protocol… choose your tools accordingly!

• Check out Illumina’s poster http://bit.ly/1kxGdzz