LOCATION ANALYSIS (ChIP-on-chip): Regulation of Human ES Cells, June 2006. Mammalian Development. Richard Young, Professor of Biology, MIT. Cell 125, 301-313, April 21, 2006. Overview: Control of embryonic stem cells. ES cells must remain pluripotent until signalled to differentiate.
June 2006Page 1
LOCATION ANALYSIS (ChIP-on-chip)
Regulation of Human ES Cells, June 2006
Mammalian Development
Cell 125, 301-313, April 21, 2006.
Richard Young, Professor of Biology, MIT
Overview: Control of embryonic stem cells
• ES cells must remain pluripotent until signalled to differentiate
• Polycomb group (PcG) proteins, first identified in Drosophila as repressors of genes that control segment identity, repress their targets by modifying chromatin
• PcG proteins assemble into Polycomb Repressive Complexes (PRCs), which are required to repress developmental genes so that cells remain pluripotent
• Specifically, PRC2 plays a role in histone methylation for gene silencing
The role of PRC2 and its components
What does PRC2 do?
• Represses developmental genes in ES cells to maintain pluripotency
• Catalyzes methylation of histone H3 lysine-27 in nucleosomes: associated genes are thus silenced through a repressed chromatin state
• Contains the subunits EED, EZH2, and SUZ12, which are critical for PRC2 methyltransferase activity
[Figure: PRC1 (RING1, BMI1, hPc2) and PRC2 (EED, EZH2, SUZ12) acting on the N-terminal tail of histone H3, ARTKQTARKSTGGKAPRKQLATKAARKSAPATG]
Goals of methodology to confirm PRC2 role
• DNA segments bound by RNA polymerase II or SUZ12 were isolated using ChIP-on-chip (Agilent)
• SUZ12 was mapped genome-wide to understand how PRC2 contributes to self-renewal and pluripotency
• RNA polymerase II was also mapped, as a control and as a reference for PRC2 occupancy
[Figure: workflow from human embryonic stem cells (H9) through ChIP to whole genome arrays (4.6 million features) and a scatter plot of ChIP vs. reference signal. Example trace at SMCX: fold enrichment (IP/WCE) against chromosomal position (bp). Result: promoters bound by RNA polymerase II or Suz12.]
Genome-wide binding of SUZ12 and RNA polymerase II
RNA pol II enrichment ratios
• Bound at 87% of known genes
• 4% false positives (PCR-confirmed)
SUZ12 enrichment ratios
• Associated with 1,893 promoters of 22,500 genes (8%)
• 95% of bound sites within 1 kb of known transcriptional start sites
• 40% within 1 kb of CpG islands
• 3% false positives
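The 8% figure follows directly from the two counts stated above. A minimal Python check, using only those numbers (the variable names are illustrative):

```python
# Counts taken from the slide above; variable names are illustrative.
suz12_bound_promoters = 1_893
genes_on_array = 22_500

fraction_bound = suz12_bound_promoters / genes_on_array
print(f"SUZ12-bound fraction: {fraction_bound:.1%}")
```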
Suz12 & Pol II are mutually exclusive
[Figure: fold enrichment of Pol II and Suz12 plotted against chromosomal position at the NEUROD1 and HNRPA3 loci.]
[Pie chart of promoter occupancy: Pol II 30%, Suz12 6.5%, Both 1.9%, Neither (value not shown).]
Lee et al., Cell 2006
PRC2 occupies key developmental regulators, as confirmed by SUZ12 occupancy
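[Figure: enrichment of functional categories (p-value axis from 1E0 to 1E-60) among genes bound by Suz12 vs. RNA Pol II. Categories shown: Development; Regulation of transcription; Morphogenesis; Organogenesis; Neurogenesis; Cell-cell signaling; Protein transport; Cell cycle; Response to DNA damage; DNA metabolism; Protein biosynthesis; RNA metabolism.]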
• SUZ12 mainly occupies genes that control development and transcription
• RNA pol II occupies genes controlling broader cell proliferation functions
SUZ12 binds a multitude of developmental transcription factor families
Transcription factor families include:
• HOX
• HOX co-factors (MEIS/EVX)
• FOX
• NEUROD
• Myogenic basic domain (MYO)
• GATA binding protein
• LIM homeobox (LHX)
• Distal-less homeobox
• SRY box (SOX)
• RUNX, PAX, SIX, POU
[Figure: transcription factor family members occupied by PRC2. Families shown: GATA, ATOH, LHX, POU, IRX, DLX, SIX, NEUROD, BHLHB, PAX, FOX, HOX, SOX, TBX, NKX, HES, EBF, RUNX, MYO, CDX, MEIS/EVX. Labeled examples: CDX2, MYOD1, NEUROG2, NEUROD1, MEIS1, EBF, PAX3, SOX21, T, OLIG2, GATA4, FOXA1. Red circles: individual TFs. White ovals: TFs with a defined role in development.]
Suz12 covers large domains over HOX clusters
[Figure: Pol II and Suz12 binding profiles across the HOX clusters]
• A high proportion of developmental regulator genes are bound by SUZ12 over extended regions
• SUZ12 binding spans ~100 kb across the HOX A-D clusters
• Unrelated genes are not bound
• Thus, PRC2 favors binding to developmental regulator genes
Targets of PRC2 shared with key ES cell regulators
[Figure legend: RNAP2, SUZ12]
• OCT4, SOX2, and NANOG were previously reported to play a critical role in ES cell pluripotency (Boyer et al. 2005)
• The subset of developmental regulator genes they occupy is almost entirely occupied by PRC2 as well
• This further supports a link between PRC2 binding, repression of developmental regulators, and ES cell pluripotency
Suz12 occupied genes in ES cells are poised for expression during differentiation
• Genes bound by SUZ12 more likely to be activated during differentiation than genes that are not bound
• This indicates that genes bound by SUZ12 in ES cells are preferentially activated as the cells differentiate
Loss of PRC2 in differentiated cells
In muscle, PRC2 is lost from genes encoding regulators of muscle development…
…but maintained at genes encoding developmental regulators for other cell types.
Regulation of ES cell pluripotency by Polycomb
PRC2 maintains ES cell pluripotency by repressing key developmental regulators.
• PRC2 localizes to the promoters of hundreds of genes encoding known developmental regulators.
• The SUZ12 subunit was mapped by ChIP-on-chip as a marker of PRC2 occupancy.
• PRC2 is associated with methylation at H3K27 and transcriptional repression genome-wide.
• Genes bound by PRC2 become activated as ES cells differentiate.
• In differentiated cells, there is a loss of PRC2 at genes that play a role in specifying the identity of that tissue.
Master Regulators of Mammalian Transcription
Embryonic stem cells: OCT4, NANOG, SOX2, Polycomb
Brain and Spinal Cord (cerebrum, cerebellum, ganglia & nerves): SOX1-18, OCT6, MeCP2, CBP, NGN, NEUROD
Circulatory System (heart, vascular system): Myocardin, GATA4, TBX5, NKX2.5, MEF2, HAND
Digestive System (esophagus, stomach, intestines, liver, pancreas): HNF1, HNF4, HNF6, CBP, PGC1, FOXA, PDX1, GATA
Urinary System (kidney, urinary tract): HNF1B, HNF4, CDX, FTF, C/EBP, FOXA, GATA
Respiratory System (airways, lungs): HNF-3, NKX2.1 and GATA6
Reproductive Organs (ovary, uterus, breast, testis): ER, SF1, DAX1, C/EBP
Skeletal and Muscular (bone, muscle, cartilage): MYOD, MEF2, MRF4, MYF5, Myogenin
Hematopoietic System (bone marrow, blood, embryonic liver): TAL1, LMO1, LMO2, E2A, XBP1, AML1, MLL1, PU.1, C/EBP
Immune System (thymus, spleen, lymph nodes): STAT1, STAT3, STAT5, NF-κB family, IRF1, IRF3, IRF5
Sensory Organs (eye, ear, olfactory, skin, tongue): SOX1-18, OCT6, PAX3, PAX6, NGN, SKIN1
ChIP-on-chip: The Agilent Advantage
Flexibility
• Rapid custom design iteration and turnaround with inkjet technology
• Over 9 species for whole genome and focused microarrays
• Wide range of array formats
• Analysis software and visualization tools and support
Commitment
• Dedicated ChIP-on-chip in-house expertise and support
Quality
• Unique, unmatched probe design: a Tm-balanced format and optimal spacing deliver enhanced specificity and sensitivity
Microarray format flexibility
[Figure: array formats on a standard 1 x 3 slide (15k, 44k, 105k, 244k), with multi-pak capability. Example: C. elegans whole genome on 244k arrays, shown as a table of number of slides (3, 2, 1) against probe spacing (100 bp, 200 bp, 300 bp).]
Agilent ChIP-on-chip: Human, Mouse, Rat Arrays
Promoter arrays: approx. -5.5 kb to +2.5 kb around transcription start sites for ~17,000 RefSeq genes
- Human hg17
- Mouse mm7
Whole genome sets
- Human
Databases (customization): 100 bp tiled probes across genome
- Human hg17, hg18 available soon
- Mouse mm7
- Rat, Rn 3.1
Analyze protein-DNA binding events and protein structure/function in mammalian systems
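As a rough illustration of the promoter window quoted above, the sketch below computes the -5.5 kb to +2.5 kb interval around a transcription start site, taking strand into account. The function name, the 1-based inclusive coordinate convention, and the example TSS position are assumptions made for illustration; they do not describe how the arrays were actually built.

```python
def promoter_window(tss, strand, upstream=5500, downstream=2500):
    """Return (start, end) of the promoter window around a TSS.

    Coordinates are treated as 1-based and inclusive (an assumption).
    On the minus strand, "upstream" lies at higher coordinates.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp at the start of the chromosome.
    return max(1, start), end


# Hypothetical TSS position, not taken from any annotation.
print(promoter_window(100_000, "+"))  # (94500, 102500)
print(promoter_window(100_000, "-"))  # (97500, 105500)
```

Overlapping windows from neighboring genes would still need to be merged before probes are selected.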
Agilent ChIP-on-chip: Model Organism Arrays
Model whole genome tiled arrays and tiled databases
- Yeast (S. cerevisiae)
- Plant (A. thaliana), ath1
- Round Worm (C. elegans), Ce2
- Fly (D. melanogaster), dm2
Shared design:
- Yeast (S. pombe)
Model promoter arrays
- Zebrafish (D. rerio), zv4
Compare protein-DNA binding events and protein structure/function between model organisms and human systems
Agilent CpG ChIP-on-chip: Exploring DNA methylation
[Figure: CpG sites along a genomic region]
Tiled arrays with 60-oligomer probes spaced ~100 bp apart
Uses the CpG island probes defined by Gardiner-Garden and Frommer from UCSC hg17/NCBI release 35 (May 2004 build)
Compatible with the methylated DNA immunoprecipitation method (Keshet et al., 2006) and possibly other methods
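For readers unfamiliar with the Gardiner-Garden and Frommer definition, the sketch below applies its three commonly quoted thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) to a single sequence. This is a simplified illustration: the function names are hypothetical, and a real island finder scans a genome with a moving window rather than testing one pre-cut sequence.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds to one candidate sequence."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


# Synthetic sequences, chosen only to exercise the thresholds.
print(looks_like_cpg_island("CG" * 100))  # True
print(looks_like_cpg_island("AT" * 100))  # False
```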
Genome wide detection of DNA methylation
Restriction enzyme based approaches
Advantages
• Specific
• Sensitive
Disadvantages
• Limited to RE sites
• Complex data analysis
• Reference
Antibody based method (mDIP)
Advantages
• Unbiased
• Simpler data analysis
• Genomic reference
Disadvantages
• Cross reactivity
• Sensitivity ?
[Figure: methylated (CH3) DNA processed by restriction digest vs. fragmentation]
eArray: Self-service custom array design
Upload and print your own designs (60-mers)
Select Agilent-designed probes for your genomic regions
RAPID Turnaround: 2-3 week delivery
http://earray.chem.agilent.com
Software: ChIP Analytics
Process spot intensities to determine binding sites
Features:
User control over analysis steps and parameters
Multiple output formats & reports
Quality Control report
Peak detection visualization
Support for replicates
Compatibility with UCSC Genome Browser
Windows or Macintosh platforms
Coming soon: ChIP/CGH Analytics plus support for methylation
[Figure: Agilent Scan with Agilent FE, or Axon Scan with GenePix, feeding ChIP Analytics Ver 1.2]
Summary
Dedication
Expertise
Flexibility
Conduct your experiments today
Agilent’s Strength
Commitment
• Acquired seminal intellectual property
• Provide local application scientist support
• Customer trainings & workshops (EMBL, CSHL, and Agilent)
Quality
• Highest sensitivity in industry
• Feature quality (size, physical oligo, optimized 60-mers)
• Validated probe selection process
• NO complex statistical manipulations
• Easy to use analysis software
Flexibility
• Conduct experiments TODAY
• Rapid custom array design leveraging eArray
• Multiple array formats on 1x3 glass slide
• Density of up to 440,000 (future) features on a single slide
APPENDIX
Where the proteome meets the genome
Binding Protein
Chromatin Immunoprecipitation (ChIP)
1. Cross-link protein (TF) to DNA with formaldehyde
2. Randomly shear DNA by sonication
3. Precipitate the DNA-protein complex with anti-TF antibody
4. Reverse cross-linking
5. Purify DNA
Result: enriched, TF-bound DNA
ChIP-enrichment of DNA vs. total DNA input
[Figure: TF bound near GENE X and GENE Y; total DNA input (WCE) compared with enriched DNA (IP)]
Note: chromatin DNA fragments are ~100-500 bp
“ChIP on chip” / Location Analysis
[Figure: enrichment (Cy5 / Cy3 ratio, baseline at 1x) plotted against chromosome position, with TF bound near GENE X and GENE Y]
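The enrichment axis in location analysis is a per-probe ratio of the IP channel to the whole-cell-extract (WCE) channel. The sketch below shows that calculation on made-up intensities. It assumes Cy5 carries the IP sample and Cy3 the WCE reference; the probe positions and signal values are hypothetical, and real pipelines also normalize the channels and model noise before calling peaks.

```python
import math


def fold_enrichment(ip_signal, wce_signal):
    """Ratio of IP to WCE intensity for one probe; 1.0 means no enrichment."""
    if wce_signal <= 0:
        raise ValueError("WCE signal must be positive")
    return ip_signal / wce_signal


# Hypothetical probes: position (bp) -> (Cy5 IP intensity, Cy3 WCE intensity).
probes = {
    1000: (520.0, 500.0),
    1100: (2100.0, 525.0),
    1200: (4000.0, 500.0),
    1300: (600.0, 600.0),
}

for position, (ip, wce) in sorted(probes.items()):
    ratio = fold_enrichment(ip, wce)
    print(position, f"{ratio:.2f}x", f"log2={math.log2(ratio):+.2f}")
```

Probes far from a binding site sit near 1x, while probes close to the bound fragment rise above it.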
High Accuracy Location Analysis
Start with optimal probe design
Goals:
• Construct a database of high-quality probes spanning the genome.
• Provide tools to select probes onto arrays for particular applications.
Methodology
• Tile 60-mers at 1-bp spacing across non-RepeatMasked genome (1.3B probes).
• Reduce 10-fold by thermodynamic scores (130M probes).
• Homology search against the genome using ProbeSpec (custom homology search tool designed for probe matching).
• Reduce 10-fold using thermodynamic and homology scores (13M probes).
• Re-score homology using MegaBlast (catches gapped alignments).
Notes
• Probe scoring model is trained on XY/XX or other model systems.
• Down-selection uses Pairwise Reduction to balance probe spacing with probe quality.
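The first two steps of the methodology above (tile every 60-mer outside masked sequence, then keep the best tenth) can be sketched as follows. This is only an illustration under stated assumptions: masked bases are taken to be lowercase or N, as in soft-masked RepeatMasker output, and `gc_score` is a hypothetical stand-in for the thermodynamic score, which the slides do not specify. ProbeSpec and MegaBlast homology scoring are not modeled.

```python
def tile_probes(sequence, probe_len=60, step=1):
    """Yield (start, probe) for every window that contains no masked base.

    Masked bases are assumed to be lowercase or 'N'; start is 0-based.
    """
    for start in range(0, len(sequence) - probe_len + 1, step):
        probe = sequence[start:start + probe_len]
        if probe.isupper() and "N" not in probe:
            yield start, probe


def gc_score(probe):
    """Stand-in score: 1.0 at 50% GC, falling to 0.0 at 0% or 100% GC."""
    gc = sum(base in "GC" for base in probe) / len(probe)
    return 1.0 - abs(gc - 0.5) * 2


def keep_top_fraction(probes, score, fraction=0.1):
    """Keep the best-scoring fraction of probes, returned in genome order."""
    ranked = sorted(probes, key=lambda item: score(item[1]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return sorted(ranked[:keep])


# Synthetic 124-bp sequence with a 4-bp masked stretch in the middle.
sequence = "ACGT" * 15 + "acgt" + "ACGT" * 15
candidates = list(tile_probes(sequence))
selected = keep_top_fraction(candidates, gc_score)
print(len(candidates), len(selected))
```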
Probe Design
Pairwise Reduction
[Flowchart: list of probes, find smallest interval, remove worst of pair, repeat until the probe count is below N]
Advantages
• Balances spacing with performance
• Scoring is easily tuned
• Robust to genome perturbations
Location Analysis: select regions based on TSS, merge overlapping regions.
CGH: select entire chromosomes, apply spatial bias to over-represent desirable sub-regions (genes, promoters, CpG islands, etc.).
ARRAY DESIGN
1. Select regions
2. Select all probes in regions
3. Apply Pairwise Reduction to achieve desired probe count or coverage
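The Pairwise Reduction loop described above (find the smallest interval, remove the worse probe of that pair, repeat) can be sketched in a few lines. This is an interpretation of the flowchart, not Agilent's implementation: the `(position, score)` tuple layout, the "higher score is better" convention, the tie-breaking rule, and stopping at exactly `target_count` probes are all assumptions.

```python
def pairwise_reduction(probes, target_count):
    """Thin probes until at most target_count remain.

    probes: list of (position, score) tuples; higher score = better probe.
    Each round finds the two adjacent probes that sit closest together and
    removes the lower-scoring one of that pair (the right one on a tie).
    """
    kept = sorted(probes)
    while len(kept) > target_count and len(kept) > 1:
        # Index of the left probe in the closest adjacent pair.
        i = min(range(len(kept) - 1), key=lambda k: kept[k + 1][0] - kept[k][0])
        left, right = kept[i], kept[i + 1]
        worst = i if left[1] < right[1] else i + 1
        del kept[worst]
    return kept


# Hypothetical probes: (genomic position, quality score).
probes = [(0, 0.9), (50, 0.4), (100, 0.8), (110, 0.95), (300, 0.7)]
print(pairwise_reduction(probes, 3))  # [(0, 0.9), (110, 0.95), (300, 0.7)]
```

The crowded pair at 100/110 loses its weaker member first, then the low-scoring probe at 50 goes, which is how the method trades spacing against probe quality. The loop rescans all gaps each round, so a heap of gaps would be a better fit for genome-scale probe lists.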
Probe Design: Choosing the best probes
• Agilent’s proven platform
• 60-mers, robust hybridization
• Probe optimization criteria: uniqueness (homology), Tm, self-structure
[Figure: probes placed in an increasingly constrained region, where the chances of finding a well-behaved probe are limited]
Net effect: constrained regions lead to restricted probe performance and a noisier system
More probes ≠ more accurate measurements. One well-chosen 60-mer gives greater measurement accuracy than the statistical average of multiple “noisy” probes.
Recent presentations
Graves Lab, Huntsman Institute. Systems Biology, Cold Spring Harbor Laboratory, Mar. 23-26, 2006
Myers Lab, Stanford University. The Biology of Genomes, Cold Spring Harbor Laboratory, May 10-14, 2006
Huang Lab, Ohio State University. AACR, Washington DC, Apr. 1-5, 2006
Working with leaders
50+ Customers & Growing…
Agilent Technologies expands microarray technology agreement to National Cancer Institute extramural researchers
Access to emerging techniques enables more scientists to study cancer from multiple perspectives, speed progress toward 2015 goal
PALO ALTO, Calif., Dec. 1, 2005 Citing growing demand for emerging microarray applications that complement gene expression studies, Agilent Technologies Inc. (NYSE: A) today announced the extension of its technology access program to National Cancer Institute (NCI) extramural researchers. The NCI funds approximately 4,500 research grants a year.
NCI extramural researchers can now obtain Agilent's microarray solutions for comparative genomic hybridization (CGH), ChIP-on-chip (also known as location analysis), and gene expression under the agreement. NCI's intramural Center for Cancer Research (CCR) already has access to the Agilent microarray technology.
"The NCI's goal is to eliminate the suffering and death due to cancer by 2015, and growing numbers of researchers realize that this can't be achieved using gene expression data alone," said Fran DiNuzzo, Agilent Life Science and Chemical Analysis vice president and general manager, Integrated Biology Solutions. "We're focused on providing the genomics tools to help scientists study pathways from multiple perspectives, link applications with well-designed bioinformatics systems, and thus reach useful discoveries faster."
Microarray-based CGH is being recognized as a powerful technique for pinpointing genomic gains and losses associated with cancer and other genetic-based diseases. A paper in the December 2004 issue of the Proceedings of the National Academy of Sciences (PNAS) demonstrates that oligonucleotide arrays designed for CGH provide a robust and precise platform for detecting chromosomal alterations with high sensitivity, even in complex samples such as those used by oncology investigators.
ChIP-on-chip is an emerging microarray application for determining where proteins bind to regulatory regions of DNA. The September 2005 issue of Cell published a paper by professor Richard Young's laboratory at the Whitehead Institute, validating the effectiveness of this technique by examining key transcriptional regulators of stem cells. The prior month's issue described the utility of location analysis in producing high-resolution maps of histone acetylation and methylation in yeast. This technique is important, as changes in chromatin structure play an important role in the silencing of certain genes in cancer, and histone deacetylase inhibitors have demonstrated anti-cancer effect.
"We observe very impressive enrichment upon immunoprecipitation with these microarrays, and the dynamic range of the signal in the IP channel is excellent -- the background signal is extremely low," said Dr. Brian Dynlacht, director of Genomics Program for New York University's Cancer Institute, an NCI extramural researcher referring to his use of Agilent's mammalian location analysis microarrays.
The technology access program includes Agilent reagents, catalog and custom microarrays, instrumentation, and software. In addition to providing promotional pricing, the program encourages broad publication of scientific results. It is designed to facilitate collaborations between academic, governmental and commercial researchers.
FRANK HOLSTEGE
STEPHEN BELL
RICK YOUNG
ERIN O’SHEA, ANDREW McMAHON
TREY IDEKER, CHRIS GLASS
BRIAN DYNLACHT
BONITA BREWER
RICK MYERS
NAFTALI KAMINSKI
TIM HUANG
FRANCOIS ROBERT
KLAUS KAESTNER
BRAD CAIRNS
GUOPING FAN
WING WONG
JEAN-PIERRE ISSA
DNA methylation
Methylation of C5 of cytosine in CG dinucleotide
– DNA methyltransferases
– Post-replication maintenance (DNMT1)
– de novo (DNMT3A & DNMT3B)
Gene regulation
– embryonic development
– genomic imprinting
– gene silencing - cancer
CpG islands
– regions of high CG, generally un-methylated, 1% of human genome
– promoter associated
Chromatin stability
Array Roadmap
‘MASTER’ TILING DATABASES (~100 bp), Agilent Designed
Species | # Probes
Human | 13M
Mouse | 13M
Rat | 10M
Arabidopsis | ~750K
C. elegans | ~725K
P. aeruginosa | TBD
Cryptococcus | TBD
OTHER GENOME DATASETS, Customer Designed
Species | Coverage
S. cerevisiae | 40K (~266 bp)
Drosophila | 13.8M genome (~250 bp)
Zebrafish | 11K annotated genes (~250 bp)
S. pombe | TBD
All Genome Databases will be loaded in eArray v4.5
Timeline (DRAFT), months shown: MAR, APR, MAY, JUN, JUL, AUG, DEC
• Catalog & custom 44K designs (100µ)
• Early Access 185K designs (60µ): Hu promoter, Mo promoter, Hu CpG Island, Custom
• Catalog 244K (60µ): Hu promoter, Mo promoter, Hu CpG Island; Hu, Mo, & model organism WG; Yeast WG (4x44)
• Custom 244k & multi-array formats: 1x244K, 2x110K, 4x44K
• Early Access 440K (30µ) & multi-array formats
*Feature size in parentheses
Agilent microarray formats timeline
[Figure: timeline from 2001 to 2007 of existing and future array formats, including multi-pak formats. Feature counts shown: 1.9k, 11k, 15k, 22k, 27k, 44k, 79k, 105k, 185k, 189k, 244k, 440k. Future milestones marked at July ‘06 and early ‘07.]
Agilent ChIP-on-chip: Major Applications
• Identify transcription factor and DNA-binding protein targets
• Characterize transcription, DNA replication, and DNA repair events
• Map chromatin modifications such as DNA methylation
• Determine modality and interactions between therapeutic compounds and target genes
• Validate and augment existing gene expression data with true binding events
Available 244K Array Designs: Summary
Array Type | Part Number | # slides | # arrays per slide | # samples | Description
Yeast | G4491A | 1 | 1 | 5 | Probes for ~12 Mb of the genome; probe spacing 50 nt on average
Yeast 4-pack (4x44K) | G4493A | 1 | 4 | 20 | Probes for ~12 Mb of the genome; probe spacing 290 nt on average
Human Expanded Promoter 2-set | G4489A | 2 | 1 | 5 | ~17,000 best defined human transcripts; coverage: -5.5k to +2.5k region; approx. 25 probes/gene
Mouse Expanded Promoter 2-set | G4490A | 2 | 1 | 5 | ~17,000 best defined mouse transcripts; coverage: -5.5k to +2.5k region; approx. 25 probes/gene
Human CpG Islands | G4492A | 1 | 1 | 5 | 27,800 CpG Islands on 1 array
MADE TO ORDER: Agilent Unrestricted AMADIDS | G4495A | 1 | 1 | 1 | Order single slides from sets; order Human ENCODE, Drosophila, Arabidopsis, Zebrafish, C. elegans
Custom 1x244K | G4496A | 1 | 1 | 1 | Create your own array
Custom 4x44K | G4497A | 1 | 4 | 4 | Create your own array
Model organism and made-to-order arrays
Array Type | # slides in set | Source | Format | Part number | Design IDs | Description
Drosophila WG | 2 | dm2 | 244k | G4495A | 14816-17 | Drosophila whole genome on 2-array set with 233 nt average tiling density
C. elegans WG | 2 | ce2 | 244k | G4495A | 14793-94 | C. elegans whole genome on 2-array set with 182 nt average tiling density
Arabidopsis WG | 2 | ath1 | 244k | G4495A | 14798-99 | Arabidopsis whole genome on 2-array set with 212 nt average tiling density
Zebrafish (share) proximal promoter | 2 | zv4 | 44k | G4475A | 013834-35 | Probes cover -1.5 kb to +0.5 kb from the TSS and represent 11,000 transcripts
Zebrafish (share) expanded promoter | 9 | zv4 | 44k | G4475A | 013824-32 | Probes cover -9 kb to +3 kb from the TSS and represent 11,000 transcripts
S. pombe WG (share) | 1 | Sanger, Sep 2004 | 44k | TBD | TBD | Whole genome 44K array, shared
Human ENCODE | 1 | Human ENCODE | 244k | G4495A | 14792 | 153K probes covering ENCODE regions on single slide