VarDetect: a nucleotide sequence variation exploratory tool

Chumpol Ngamphiw 1, Supasak Kulawonganunchai 2, Anunchai Assawamakin 3, Ekachai Jenwitheesuk 1 and Sissades Tongsima 1

1 Genome Institute, National Center for Genetic Engineering and Biotechnology, Thailand

2 Department of Computer Science, School of Engineering and Technology, Asian Institute of Technology, Thailand

3 Division of Medical Genetics, Siriraj Hospital, Mahidol University, Thailand



Outline

• Nucleotide sequence variation
• Common sequencing artifacts
• VarDetect: algorithms overview
• Experimental results
• Conclusions


http://urgi.versailles.inra.fr/projects/GnpSNP/general_documentation.php

Nucleotide sequence variation


Common sequencing artifacts

http://seqcore.brcf.med.umich.edu/doc/dnaseq/trouble/badseq.html


VarDetect: algorithms

• Reading nucleotide traces: base-calling, re-sampling
• Alignment of input sequences to the reference sequence: pre-alignment, alignment enhancement
• SNP identification: CodeMap


Chromatogram trace: base-calling

Reading nucleotide traces: base-calling

Base-calling with BioJava


Calculate peak intensity ratio

Reading nucleotide traces: intensity ratio

Qv − Qo (δ) for increasing the confidence of SNP detection
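The slides name the peak intensity ratio and δ without defining them numerically. As a minimal sketch, assuming the ratio is the second-highest over the highest channel intensity at a called base, and that Qv is the mean quality of the neighbouring bases while Qo is the quality at the observed position, the two quantities could be computed as below. The function names, the window size, and the input layout are illustrative assumptions, not part of VarDetect.

```python
from statistics import mean


def peak_intensity_ratio(intensities):
    """Ratio of the second-highest to the highest channel intensity.

    `intensities` maps each base (A, C, G, T) to its trace intensity at one
    called position. This definition of the ratio is an assumption.
    """
    ranked = sorted(intensities.values(), reverse=True)
    if ranked[0] == 0:
        return 0.0
    return ranked[1] / ranked[0]


def quality_delta(qualities, index, window=5):
    """delta = Qv - Qo for the base at `index`.

    Qv is taken as the mean quality of up to `window` bases on each side
    (an assumed definition of "vicinity"); Qo is the quality at `index`.
    """
    lo = max(0, index - window)
    hi = min(len(qualities), index + window + 1)
    neighbours = qualities[lo:index] + qualities[index + 1:hi]
    if not neighbours:
        return 0.0
    return mean(neighbours) - qualities[index]


if __name__ == "__main__":
    print(peak_intensity_ratio({"A": 1000, "C": 20, "G": 600, "T": 10}))
    print(quality_delta([40, 40, 40, 20, 40, 40, 40], 3, window=2))
```

A ratio near 1 combined with a large positive δ may suggest a genuine double peak in an otherwise clean region; the cut-offs VarDetect applies are not given in the slides.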


Partitioning and Re-sampling (PnR) technique

Reading nucleotide traces: partition and re-sampling
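The slides give only the name of the PnR technique and, in the conclusions, its purpose: turning point (bell shape) detection. The sketch below is a generic illustration of that idea, not VarDetect's actual procedure: it assumes the trace is cut into fixed-size partitions, each partition is re-sampled to its mean, and a turning point is any re-sampled value that rises from its left neighbour and does not rise again to its right. The function names and the partition size are hypothetical.

```python
def resample(trace, partition_size):
    """Split `trace` into consecutive partitions and return each one's mean."""
    means = []
    for start in range(0, len(trace), partition_size):
        part = trace[start:start + partition_size]
        means.append(sum(part) / len(part))
    return means


def turning_points(values):
    """Indices where the series rises and then stops rising (peak tops)."""
    peaks = []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] and values[i] >= values[i + 1]:
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    # Two synthetic bell-shaped peaks in a single-channel trace.
    trace = [0, 1, 3, 7, 12, 7, 3, 1, 0, 0, 2, 6, 11, 6, 2, 0]
    smoothed = resample(trace, 2)
    print(smoothed)
    print(turning_points(smoothed))
```

Averaging within partitions smooths small fluctuations before the slope test, which is one plausible reason to re-sample before looking for peaks.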


Pooled DNA: possible biallelic pattern

Base-call parameters setting in VarDetect


Pre-alignment & alignment enhancement

Alignment


CodeMap analysis

SNP identification: CodeMap


VarDetect: main graphical user interface

http://www.biotec.or.th/GI/tools/vardetect


Experimental results

- Tocharoentanaphol C, et al.: Evaluation of resequencing on number of tag SNPs of 13 atherosclerosis-related genes in Thai population. J Hum Genet 2008, 53:74–86.
- Thailand SNP discovery project: http://www.biotec.or.th/thaisnp


Conclusions

• We presented a novel algorithm that interprets fluorescence-based chromatograms automatically; the implementation is platform independent (Java).

• Three main heuristic procedures are employed:
  • Turning point (bell shape) detection (PnR algorithm).
  • Increasing SNP detection confidence by checking the difference between vicinity and observed quality values (Qv − Qo).
  • CodeMap, introduced to detect SNP and indel patterns.

• VarDetect offers the most features, including the ability to detect SNPs from pooled DNA samples.

• VarDetect uses an XML-annotated reference sequence to cross-check SNP discovery results within the tool, without external applications.

• VarDetect's heuristics minimize both false positive and false negative errors, reducing the effort needed to detect and validate SNPs and making it the tool of choice for automatic SNP detection.


Acknowledgements

• Dr. Mazazumi Takahashi, Centre National de Genotypage (CNG), France
• Dr. Philip Shaw
• Dr. Prasit Palittapolgarnpim
• Dr. Chintana Tocharoentanaphol, Chulabhorn Research Institute
• Dr. Chanin Limwongse, Siriraj Hospital, Mahidol University
• Thailand SNP discovery project
• National Center for Genetic Engineering and Biotechnology (BIOTEC)

Thank You For Your Attention
