Next Generation Sequencing
Mette Voldby Larsen, PhD, associate professor

NextGenerationSequencing - CBS · Presentation1.pptx Author: Mette Voldby Larsen Created Date: 6/12/2013 8:43:05 AM



Page 2

Outline

•  DNA sequencing
   – Exercise: Assembling NGS data

•  From DNA to protein
   – Exercise: Predicting genes using Prodigal

•  PathogenFinder
   – Exercise: Identifying immunologically relevant target proteins using PathogenFinder

Page 3

DNA: Deoxyribonucleic acid

Deoxyribose, phosphate, base (adenine, thymine, guanine, cytosine)

Double helix: via hydrogen bonds, A pairs with T and C pairs with G.

The two strands are antiparallel.
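The pairing rules (A with T, C with G) and the antiparallel orientation can be illustrated with a small sketch. The function name `reverse_complement` is chosen here for illustration and is not part of the slides; the sketch assumes the input contains only the letters A, C, G and T.

```python
# Map each base to its hydrogen-bonding partner: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3'.

    Complementing applies the pairing rules; reversing accounts for the
    two strands running in opposite (antiparallel) directions.
    """
    return strand.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.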

Page 4

Organization of DNA

(Figure; surviving label: Ribosome)

Page 5

History of sequencing

1977  Fred Sanger develops sequencing by enzymatic synthesis using chain-terminating inhibitors. First generation sequencing.
1982  GenBank is established as a collection of all publicly available DNA sequences.
1990  The Human Genome Project is launched, planned to take 15 years.
1995  The first genome of a free-living organism, the bacterium Haemophilus influenzae (1.8 Mb), is sequenced (PMID: 7542800).
1996  The first genome of a eukaryote, Saccharomyces cerevisiae (12.1 Mb), is sequenced (PMID: 8849441).
1996  Pyrosequencing is developed. Next (second) generation sequencing.
1998  The first genome of an animal, the nematode Caenorhabditis elegans (97 Mb), is sequenced (PMID: 9851916).
2001  The first drafts of the human genome (3 Gb) are published (PMID: 11237011 and PMID: 11181995).
2005  Launch of the GS20 sequencer (454/Roche) using pyrosequencing.
2006  Launch of the Genome Analyzer (Solexa/Illumina) using cyclic reversible terminator sequencing. Next (second) generation sequencing.
2010  Ion Torrent bench-top sequencing machines. Next (second) generation sequencing.
2011  Pacific Biosciences RS machine capable of single-molecule sequencing. Third generation sequencing.

Page 6

Sanger sequencing

No -OH at the 3’ position – additional nucleotides cannot be added.

Template strand

Primer

DNA with unknown sequence is mixed with primers, DNA polymerase, dNTPs and fluorescent ddNTPs.

Each of the four ddNTPs is bound to a different fluorescent dye.

Synthesis stops when a fluorescent ddNTP is incorporated. Fragments of different lengths are made, each ending with a ddNTP.

The fragments fluoresce in different colors, identifying the ddNTP that terminated each fragment. Sorted by length, the terminal colors read out the sequence.
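The chain-termination idea can be mimicked in a few lines. The sketch below is a toy simulation, not a model of real chemistry. It assumes each growing strand has a fixed chance (`dd_fraction`, a made-up parameter) of incorporating a ddNTP at every position, and that `product` is the strand the polymerase would synthesize from the primer. All names are hypothetical.

```python
import random


def sanger_fragments(product, n_molecules=5000, dd_fraction=0.2, seed=1):
    """Simulate chain-terminated fragments.

    Each molecule is extended base by base; with probability dd_fraction
    a ddNTP is incorporated and synthesis stops. Returns a list of
    (fragment_length, terminal_base) pairs. Molecules that never pick up
    a ddNTP are ignored, since they carry no dye.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for position, base in enumerate(product, start=1):
            if rng.random() < dd_fraction:
                fragments.append((position, base))
                break
    return fragments


def read_sequence(fragments):
    """Order fragments by length and read the terminal base of each length.

    This plays the role of size separation followed by dye detection.
    A length that was never sampled would simply be missing from the read.
    """
    terminal_base = {}
    for length, base in fragments:
        terminal_base[length] = base
    return "".join(terminal_base[length] for length in sorted(terminal_base))


if __name__ == "__main__":
    fragments = sanger_fragments("GATTACAGGC")
    print(read_sequence(fragments))
```

With enough molecules every fragment length is represented, so the read recovers the synthesized strand; with too few, the longest fragments tend to drop out first.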

Page 7

Next generation sequencing

Pyrosequencing (Roche/454)

Page 8

Cyclic reversible terminator sequencing

Page 9

Comparison of sequencing methods

Page 10

Example of data file from next gen sequencer: short (raw) reads in FASTQ format

@M10_0139:1:2:18915:1321#ATCACG/1
TATCAAGAAAGATTTTAACAGCATTGACTCTGTTATCGAGTTTCATTTTAAACATAGTTTCCAGTGGT
+M10_0139:1:2:18915:1321#ATCACG/1
_bbeeeccgfgecgiiiihfhchiiiiiiiiihhfhhh^dghhhhf_fffghhhhhhacgeeghgbb]
@M10_0139:1:2:18915:1321#ATCACG/2
AGTTCATAGTGACAAGGTAATATTTGTCAAATTATATCGACCTAAAACGGTAGGATATATAACAAAAT
+M10_0139:1:2:18915:1321#ATCACG/2
a__eceeeeggffhihe^bhfiifh_edeg_agbgd]dd`g`fgdhedffaedadhhchhfhiicfhX
@M10_0139:1:2:12256:1321#ATCACG/1
ACGGGTGAACTGTACGGCATCGAAGCCCTTGCGCGCTGGCACGATCCCCAGCATGGTCATGCCCCCTC
+M10_0139:1:2:12256:1321#ATCACG/1
___`c_c`egge[bfghdeghfhhhhhfiii_ffhhN`ghhfddbcddadcddbccb_bbbcbc^aac
@M10_0139:1:2:12256:1321#ATCACG/2
AATCCGGAAAAGCCCGTACCAAAATCATCTACCGATAAGCCCACGCCCATATCACGCAGGATGAATCG
+M10_0139:1:2:12256:1321#ATCACG/2
a_ZcccWHO_bgadgc_WbaceZefda^f`egd`HO[ega\G\b`F_dggeca_cad`Y]^b__bKYZ

.  

.  

.  
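A FASTQ record spans four lines: a header starting with `@`, the bases, a separator starting with `+`, and one quality character per base. The sketch below reads such records from any text handle. The names `FastqRecord`, `read_fastq` and `phred_scores` are made up for illustration. The default quality offset of 64 is an assumption based on the character range in the example reads (older Illumina encoding); many current files use an offset of 33 instead. The sketch also assumes unwrapped, four-line records.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ file with exactly four lines per read."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        sequence = handle.readline().strip()
        separator = handle.readline().strip()
        quality = handle.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 64) -> List[int]:
    """Convert quality characters to integer scores (offset is an assumption)."""
    return [ord(char) - offset for char in quality]


if __name__ == "__main__":
    # A tiny made-up record, not taken from the slide.
    example = io.StringIO("@read1/1\nACGT\n+read1/1\nhhhB\n")
    for record in read_fastq(example):
        print(record.name, record.sequence, phred_scores(record.quality))
```

The `/1` and `/2` suffixes in the read names mark the two mates of a read pair, so a real pipeline would usually keep them together.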

Page 11

Reference assembly

(Figure: short reads placed along a reference sequence; labels: Contig, Coverage/depth)
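The notion of coverage (depth) in a reference assembly can be made concrete with a toy example. The sketch below places each read at its first exact match in the reference and counts how many reads cover each position. Real mappers tolerate mismatches and handle repeats, which this sketch does not. The function name `coverage_depth`, the reference and the reads are all invented for illustration.

```python
from typing import List, Sequence, Tuple


def coverage_depth(reference: str, reads: Sequence[str]) -> Tuple[List[int], List[str]]:
    """Return (depth per reference position, reads that did not map).

    Simplification: a read maps only by exact match, and only its first
    occurrence in the reference is used.
    """
    depth = [0] * len(reference)
    unmapped = []
    for read in reads:
        start = reference.find(read) if read else -1
        if start == -1:
            unmapped.append(read)
            continue
        for position in range(start, start + len(read)):
            depth[position] += 1
    return depth, unmapped


if __name__ == "__main__":
    reference = "GATTACAGGCT"
    reads = ["GATTA", "TTACAG", "CAGGCT", "AAAA"]
    depth, unmapped = coverage_depth(reference, reads)
    print(depth)      # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1]
    print(unmapped)   # ['AAAA']
    print(sum(depth) / len(reference))  # mean depth
```

Positions with depth 0 would be gaps in the assembly; higher depth gives more confidence in the base called at that position.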

Page 12

De novo assembly

(Figure: short reads joined by their overlaps, without a reference sequence)
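Without a reference, reads have to be joined through their overlaps. The sketch below uses a simple greedy strategy: repeatedly merge the pair of sequences with the longest exact suffix-prefix overlap. This is a teaching simplification and an assumption of this example, not the method used in the exercise. Production assemblers typically build overlap or de Bruijn graphs and handle sequencing errors, repeats and reverse-complement reads. The names `overlap` and `greedy_assemble` are hypothetical.

```python
from typing import List, Sequence


def overlap(left: str, right: str, min_length: int = 3) -> int:
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no overlap of at least `min_length` bases exists.
    """
    for size in range(min(len(left), len(right)), min_length - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: Sequence[str], min_length: int = 3) -> List[str]:
    """Merge reads greedily by longest overlap; return the resulting contigs."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    # Drop reads fully contained in another read; they add no new sequence.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_length)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["GATTAC", "TTACAGG", "CAGGCT"]))  # ['GATTACAGGCT']
```

Reads that share no sufficiently long overlap stay as separate contigs, which is why uneven coverage leads to fragmented assemblies.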

Page 13

Exercise: Assembling NGS data