Synthesis & Analysis on Molecular Arrays


Synthesis & Analysis on Molecular Arrays. CHI Microarrays in Medicine, 4-May-2005, 9:20-9:50 AM. Thanks to: Washington U, Harvard-MIT Broad Inst., DARPA-BioSpice, DOE-GTL, EU-MolTools, NHGRI-CEGS, NHLBI-PGA, NIGMS-SysBio, PhRMA, Lipper Foundation


Thanks to: Washington U, Harvard-MIT Broad Inst., DARPA-BioSpice, DOE-GTL, EU-MolTools, NHGRI-CEGS, NHLBI-PGA, NIGMS-SysBio, PhRMA, Lipper Foundation

Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, SynBioCorp, ThermoFinnigan, Xeotron/Invitrogen

For more info see: arep.med.harvard.edu

CHI Microarrays in Medicine, 4-May-2005, 9:20-9:50 AM

Synthesis & Analysis on Molecular Arrays

Systems Biology Loop

• Syntheses & Perturbations
• Models
• Experimental designs (Systematic)
• Data

Analysis & Synthesis Tools: Genome engineering; DNA & RNA Polony Sequencing

Why Synthetic Genomes & Proteomes?

• Test array hypotheses, e.g. cis-DNA/RNA-elements
• Multi-epitopes, vaccines, protein design
• Mass spectrometry & array standards
• Access to any protein (complex) including post-transcriptional modifications
• Utility of molecular biology DNA-RNA-Protein in vitro "kits" (e.g. PCR, T7, Roche)

Whole genome or part? Whole if major redesign, e.g. changing the genetic code and stability.

Up to 760K Oligos/Chip: 18 Mbp for $1K raw (6-18K genes)

Oligos/chip   Source                       Chemistry (authors)
<1K           Oxamer                       Electrolytic acid/base
8K            Atactic/Xeotron/Invitrogen   Photo-Generated Acid (Sheng, Zhou, Gulari, Gao; U. Houston)
24K           Agilent                      Ink-jet standard reagents
48K           Febit
100K          Metrigen
380K          Nimblegen                    Photolabile 5' protection (Nuwaysir, Smith, Albert)

Tian, Gong, Church

Improve DNA Synthesis Cost

Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12), and bimolecular kinetics slow with the square of the concentration decrease!

Solution: Amplify the oligos, then release them.

10 + 50 + 10 => ss-70-mer (chip)
20-mer PCR primers with restriction sites at the 50-mer junctions => ds-90-mer
=> ds-50-mer

Tian, Gong, Sheng, Zhou, Gulari, Gao, Church
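The arithmetic behind this slide can be sketched in a few lines of Python. This is an illustrative calculation, not the authors' code. It assumes equal reaction volumes (so the molecule ratio equals the concentration ratio), and it assumes each 20-mer primer anneals to a 10-nt flank and contributes a 10-nt overhang, which is one way to get from the 70-mer to the 90-mer. All names are hypothetical.

```python
# Rough arithmetic for chip-synthesized oligo pools (illustrative sketch only).

def bimolecular_rate_factor(conc_fold_decrease: float) -> float:
    """Relative rate of a second-order reaction (A + B -> AB) when both
    reactants are diluted by the same fold: rate scales as concentration^2."""
    return 1.0 / conc_fold_decrease ** 2

chip_molecules = 1e6      # per oligo from a chip (from the slide)
column_molecules = 1e12   # per oligo from conventional synthesis (from the slide)

# Assumption: same reaction volume, so this is also the concentration ratio.
dilution_fold = column_molecules / chip_molecules
relative_rate = bimolecular_rate_factor(dilution_fold)

# Construct lengths. The 10-nt primer overhang is an assumption.
FLANK, PAYLOAD, PRIMER = 10, 50, 20
ss_chip_oligo = FLANK + PAYLOAD + FLANK                 # ss-70-mer on the chip
ds_amplified = ss_chip_oligo + 2 * (PRIMER - FLANK)     # ds-90-mer after PCR
ds_released = PAYLOAD                                   # ds-50-mer after restriction

print(f"dilution: {dilution_fold:.0e}-fold, relative rate: {relative_rate:.0e}")
print(ss_chip_oligo, ds_amplified, ds_released)
```

A million-fold lower concentration slows a bimolecular annealing step by roughly a factor of 1e12, which is why amplification before assembly matters.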

Improve DNA Synthesis Accuracy via mismatch selection

Tian & Church. Other mismatch methods: MutS (&H,L)

Improving DNA synthesis accuracy

Method                     Bp/error
Chip assembly only         160
Hybridization-selection    1,400
MutS-gel-shift             10,000
PCR 35 cycles              10,000
MutHLS cleavage            100,000

Tian & Church 2004 Nature; Carr & Jacobson 2004 NAR; Smith & Modrich 1997 PNAS
http://www.invitrogen.com/content.cfm?pageid=453
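To see what these bp/error figures mean for a finished construct, the sketch below estimates the fraction of error-free molecules at a given length. It assumes errors are independent and uniformly distributed along the sequence, which is a simplification not stated on the slide. The function name is hypothetical.

```python
# Fraction of error-free constructs under an independent-error model (sketch).

BP_PER_ERROR = {
    "Chip assembly only": 160,
    "Hybridization-selection": 1_400,
    "MutS-gel-shift": 10_000,
    "PCR 35 cycles": 10_000,
    "MutHLS cleavage": 100_000,
}

def fraction_error_free(length_bp: int, bp_per_error: float) -> float:
    """Probability that all length_bp positions are correct when each
    position independently has error probability 1 / bp_per_error."""
    return (1.0 - 1.0 / bp_per_error) ** length_bp

if __name__ == "__main__":
    gene_length = 1_000  # an illustrative 1 kb construct
    for method, bpe in BP_PER_ERROR.items():
        print(f"{method:26s} {fraction_error_free(gene_length, bpe):.4f}")
```

Under this model, raw chip assembly leaves well under 1% of 1 kb constructs perfect, while the mismatch-selection methods bring most molecules to full correctness.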

Computer Aided Design Polymerase Assembly Multiplexing (CAD-PAM)

For tandem, inverted and dispersed repeats: focus on 3' ends, hierarchical assembly, size-selection and scaffolding.

Mullis 1986 CSHSQB; Dillon 1990 BioTech; Stemmer 1995 Gene; Tian et al. 2004 Nature; Kodumal et al. 2004 PNAS

Fragment length (bp) by assembly step: 50, 75, 125, 225, 425, 825 … 100*2^(n-1)
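The length series 50, 75, 125, 225, 425, 825 can be reproduced by a simple recurrence in which each assembly cycle joins two fragments through a shared overlap. The 25-bp overlap is inferred from the numbers themselves and is not stated on the slide. The function name is hypothetical.

```python
# Reproducing the assembly length series with a pairwise-join recurrence (sketch).

def pam_lengths(start_bp: int = 50, overlap_bp: int = 25, cycles: int = 5) -> list[int]:
    """Lengths after each cycle when two equal-length fragments are joined
    through an overlap of overlap_bp: L(n+1) = 2 * L(n) - overlap_bp."""
    lengths = [start_bp]
    for _ in range(cycles):
        lengths.append(2 * lengths[-1] - overlap_bp)
    return lengths

print(pam_lengths())  # [50, 75, 125, 225, 425, 825]
```

Length roughly doubles per cycle, so about 17 such cycles would span megabase scale under this idealized model.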

Genome assembly

PAM cycle #:   0    1    2    3    4    5
# bp:          50   75   125  225  425  825

50 → HS → PAM → 425 → MutS → PAM → 10K → anneal → 100K → red → 5 Mbp

[Figure labels: USER, USER-S1, USER-5' only; one pool, 480 pools, 480 genomic, 48, 1; of 117K; universal primers, primer pairs; PCR, in vitro]

HS = Hybridization-Selection
USER = Uracil DNA glycosylase & EndoVIII, which remove flanking primer pairs

All 30S ribosomal protein DNAs (codon re-optimized)

[Figure: gel; size labels 1.7 kb and 0.3 kb; s19 at 0.3 kb; Nimblegen 95K chip; Atactic <4K chip]

Tian, Gong, Sheng, Zhou, Gulari, Gao, Church

Extreme mRNA makeover for protein expression in vitro

RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially.
RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable.

Solution: Iteratively resynthesize all mRNAs with less mRNA structure.

[Figure: Western blot based on His-tags; lanes 20w, 20m, 17w, 17m, 16w, 16m; 10 kd marker. W: wild-type; M: modified]

Tian & Church

3 Exponential technologies

[Figure: log-scale plot; y-axis 1E-3 to 1E+13, x-axis years 1830 to 2010. Curves: Computation & communication (bits/sec); Synthesis (daltons); Analysis (bp/$). Labeled points: telegraph, urea, B12, tRNA, operons, E. coli]

Shendure J, Mitra R, Varma C, Church GM, 2004 Nature Reviews Genetics. Carlson 2003; Kurzweil 2002

Why Personal Genomics?

• Pathogen rapid response: emerging disease & biowarfare
• B & T-cell diversity: clinical temporal profiling
• Proteomics: antibodies & aptamers
• RNA & methylation: quantitate splicing & chromatin
• Preventative medicine: genotype–phenotype association
• Cancer: drug targets, loss-of-heterozygosity
• Synthetic biology: laboratory selections
• Phylogenetic: footprinting, biodiversity

Shendure et al. 2004 Nature Reviews Genetics

Cancer Genome Project: diagnosis, prognosis, therapies

EGFR mutations in lung cancer: correlation with clinical response to gefitinib [Iressa] therapy. Mutations G719S, L858R, Del746ELREA in red.

Paez, … Meyerson (Apr 2004) Science 304:1497.
Lynch, … Haber (Apr 2004) New Engl J Med. 350:2129.
Pao, … Mardis, Wilson, Varmus H (Aug 2004) PNAS 101:13306-11.
Dulbecco R. (1986) A turning point in cancer research: sequencing the human genome. Science 231:1055-6.

Polymerase colony (polony) PCR in a gel

[Figure: panels "Single Molecule From Library", "1st Round of PCR", "Primer is Extended by Polymerase"; strands and primers labeled A, A' and B]

Primer A has 5' immobilizing Acrydite

Mitra & Church Nucleic Acids Res. 27: e34

Polymerase clones: Plone sequencing

Polony-slides vs. Plone-beads
• 1 vs. 2 immobilized primers
• dNTP extension vs. ligation
• Single molecule vs. multi-molecule detection

Cleavable dNTP-Fluorophore (& terminators): reduce or photo-cleave

Mitra RD, Shendure J, Olejnik J, Olejnik EK, and Church GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65

Polony Bead Sequencing Pipeline

• In vitro libraries via paired tag manipulation
• Bead polonies via emulsion PCR [Dre03]
• Monolayered immobilization in acrylamide
• Enrichment of amplified beads
• SOFTWARE: Images → Tag Sequences; Tag Sequences → Genome
• FISSEQ or "wobble" sequencing
• Epifluorescence Scope with Integrated Flow Cell

Mitra, Shendure, Porreca, Rosenbaum, Church unpub.

Haplotypes inferred by pedigree vs. direct single molecule measures

[Figure: six SNPs, homozygous in the parents (GM12248, GM12249) and heterozygous in the son (GM10835)]

SNP          Position   Son (GM10835) haplotype 1   Son (GM10835) haplotype 2
rs3778973    1.8 Mb     C                           T
rs997906     79.9 Mb    A                           T
rs1557917    88.2 Mb    G                           A
rs39284      89.4 Mb    C                           T
rs10500042   114 Mb     G                           A
rs4717028    155 Mb     C                           T

Single-molecule haplotype counts in GM10835:

Span     Counts
1 Mb     AT=198  GT=0   GC=45
75 Mb    TT=8    TC=0   AC=23
153 Mb   TT=72   CT=15  CC=28
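One way to read the haplotype counts above is as a check of single-molecule phasing against the pedigree. The sketch below computes, for each span, the fraction of molecules whose two-SNP combination does not match either pedigree-inferred haplotype. The sets of expected haplotypes are an assumption, derived here from the son's two haplotypes (C-A-G-C-G-C and T-T-A-T-A-T) as they appear in the figure residue. The function name is hypothetical.

```python
# Discordance between single-molecule SNP pairs and pedigree haplotypes (sketch).

# Counts as given on the slides, keyed by approximate distance between the SNP pair.
COUNTS = {
    "1 Mb": {"AT": 198, "GT": 0, "GC": 45},
    "75 Mb": {"TT": 8, "TC": 0, "AC": 23},
    "153 Mb": {"TT": 72, "CT": 15, "CC": 28},
}

# Assumption: allele pairs expected from the pedigree-inferred haplotypes.
EXPECTED = {
    "1 Mb": {"AT", "GC"},
    "75 Mb": {"TT", "AC"},
    "153 Mb": {"TT", "CC"},
}

def discordant_fraction(counts: dict[str, int], expected: set[str]) -> float:
    """Fraction of observed molecules carrying an unexpected allele pair."""
    total = sum(counts.values())
    discordant = sum(n for pair, n in counts.items() if pair not in expected)
    return discordant / total

for span, counts in COUNTS.items():
    frac = discordant_fraction(counts, EXPECTED[span])
    print(f"{span:>7s}: {frac:.3f} discordant of {sum(counts.values())} molecules")
```

With these counts, the 1 Mb and 75 Mb pairs show no discordant molecules, while the 153 Mb pair shows 15 of 115.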

Plone-bead Fluorescent In Situ Sequencing in vitro Libraries

Greg Porreca, Abraham Rosenbaum

[Figure: 1 to 100 kb genomic fragments; 2x20 bp after MmeI; elements labeled L, M, R; PCR bead; selector bead; sequencing primers]

Dressman et al. PNAS 2003 emulsion

Plone-FISSeq: up to 1 billion beads/slide

White = Fe-core pixels; Cy5 primer (570nm); Cy3 dNTP (666nm)

Jay Shendure, Greg Porreca

Plone-bead FISSeq, '04 vs. '05
Consider amplification, homopolymer, context errors?

Metric                                '04        '05
# of bases sequenced (total Mbp)      23 (no)    10.8 (yes)
# of bases sequenced (unique)         73 b       4.7 Mb (72%)
Avg fold coverage                     324K       2.3
Pixels used per bead (analysis)       3.6        3.6
Read length (bp)                      14         24
Indels                                0.6%       ?
Substitutions (raw error rate)        4e-5       1e-2
Throughput (kb/min)                   360        10
Speed/cost ratio (see note)           1100       32

Note: speed/cost ratio is relative to current ABI capillary sequencing @ 0.75 kb/min/device.

Shendure & Porreca
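The fold-coverage row follows from the two rows above it: total bases sequenced divided by unique bases covered. A minimal sketch using the slide's rounded figures follows. The '04 and '05 column assignment is an assumption read from the slide title, and the function name is hypothetical.

```python
# Average fold coverage = total bases sequenced / unique bases covered (sketch).

def fold_coverage(total_bases: float, unique_bases: float) -> float:
    """Average number of times each unique base was sequenced."""
    return total_bases / unique_bases

# Rounded inputs from the slide; results only approximate the slide's figures.
coverage_04 = fold_coverage(23e6, 73)        # 23 Mbp over a 73-base target
coverage_05 = fold_coverage(10.8e6, 4.7e6)   # 10.8 Mbp over 4.7 Mb unique

print(f"'04: about {coverage_04:,.0f}-fold")
print(f"'05: about {coverage_05:.1f}-fold")
```

The second value rounds to the slide's 2.3; the first comes out near 315K rather than 324K, presumably because the slide's inputs are rounded.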

CD44 Exon Combinatorics (Zhu & Shendure)

• Alternatively spliced cell adhesion molecule
• Specific variable exons are up- or down-regulated in various cancers (>2000 papers)
• v6 & v7 enable direct binding to chondroitin sulfate, heparin…

Zhu J, Shendure J, Mitra RD, Church GM. Single molecule profiling of alternative pre-mRNA splicing. Science 301:836-8.

EXON PATTERN           Eph4   Eph4bDD   TOTAL   RATIO    P-VALUE
------------7-8-9-10   609    764       1373     1.17    1E-4
--------------8-9-10   320    390       710      1.13    3E-2
----------6-7-8-9-10   431    251       682     -1.85    4E-18
------4-5-6-7-8-9-10   218    216       434     -1.08    2E-1
----------------9-10   68     143       211      1.96    7E-7
--------5-6-7-8-9-10   86     39        125     -2.37    2E-6
----3-4-5-6-7-8-9-10   40     56        96       1.30    9E-2
------4-5---7-8-9-10   16     74        90       4.30    2E-9
--2-3-4-5-6-7-8-9-10   44     28        72      -1.69    1E-2
1-2-3-4-5-6-7-8-9-10   22     5         27      -4.73    3E-4
--------5---7-8-9-10   5      19        24       3.53    3E-3
----3-4-5---7-8-9-10   1      15        16      13.95    4E-4
--2-3-4-5---7-8-9-10   1      10        11       9.30    5E-3

Eph4 = murine mammary epithelial cell line

Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
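The RATIO column in the isoform table appears to be a signed fold change of Eph4bDD over Eph4 after normalizing for the total number of molecules counted in each cell line. That reading is an inference from the numbers, not something the slide states. The sketch below recomputes it from the 13 rows shown; because the original table may include isoforms not listed here, the results only approximate the slide's values. Names are hypothetical.

```python
# Signed, library-size-normalized fold change per CD44 isoform (sketch).

# (Eph4 count, Eph4bDD count, ratio reported on the slide)
ROWS = [
    (609, 764, 1.17), (320, 390, 1.13), (431, 251, -1.85), (218, 216, -1.08),
    (68, 143, 1.96), (86, 39, -2.37), (40, 56, 1.30), (16, 74, 4.30),
    (44, 28, -1.69), (22, 5, -4.73), (5, 19, 3.53), (1, 15, 13.95), (1, 10, 9.30),
]

def signed_fold_change(a: int, b: int, total_a: int, total_b: int) -> float:
    """Fold change of b over a after scaling each count by its library total.
    Values below 1 are reported as negative reciprocals (e.g. 0.5 -> -2.0)."""
    fold = (b / total_b) / (a / total_a)
    return fold if fold >= 1 else -1.0 / fold

TOTAL_EPH4 = sum(a for a, _, _ in ROWS)
TOTAL_BDD = sum(b for _, b, _ in ROWS)
COMPUTED = [signed_fold_change(a, b, TOTAL_EPH4, TOTAL_BDD) for a, b, _ in ROWS]

for (a, b, reported), value in zip(ROWS, COMPUTED):
    print(f"{a:4d} {b:4d}  computed {value:7.2f}  slide {reported:7.2f}")
```

The sign convention makes up- and down-regulation symmetric, so a twofold decrease reads as -2 rather than 0.5.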

CD44 RNA splicing isoforms

Soluble CD44

Zhu & Varma

Systems Biology Loop

• Syntheses & Perturbations
• Models
• Experimental designs (Systematic)
• Data

Analysis & Synthesis Tools: Genome engineering; DNA & RNA Polony Sequencing

Molecular Systems Biology

Transcriptomics, Proteomics, Metabolomics, Functional genomics, Structural genomics, Computational biology, Theoretical biology, Mathematical biology, Synthetic biology

An open access journal: www.nature.com/msb/

