Transcript
Page 1: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

1

SNPDetectionAnalysisManpreetS.Katari

PTCTasting

• Phenylthiocarbamideisachemicalthatnoteveryonecantaste.• Generallyabout70%ofindividualscantastethebitterness.• Canweusegeneticstoexplainthisphenomenon?

• Inordertokeepthedatasetsmallwearegoingtofocusononegenethatisknowntoimportantinthis.

• Thesequenceforthisgenewasobtainedfromfourindividualswhoalsoperformedthetastetest.

Singlenucleotidepolymorphisms(SNPs)l Single-nucleotidepolymorphisms

SNP (Genotypes G/G, reference A/A)

Typesofvariants

Typesofbasesubstitutions

• Transition(Ti):• purine->purinemutation• Or• pyrimidine->pyrimidinemutation

• Transversion (Tv):• Purine->pyrimidinemutation• Or• Pyrimidine->purinemutation

Smallinsertiondeletions(INDELS)

Page 2: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

2

l Single-nucleotidepolymorphisms

l Smallinsertion/deletionpolymorphisms

SNP (Genotypes G/G, reference A/A)

Insertion allele in reference

TypesofvariantsMutationdetectionpipeline

• Alignment:SAMformat(BAM=binary)

• SNPdetection:VCFformat(BCF=binary)

• Functionalassessment

Variant Call Format (VCFs): summary of variants in a genome(s) Variant Call Format (VCFs): representation of variants

VCFlineexample• CHROM:PTC_Human• POS:245• ID:.

• REF:T• ALT:C• QUAL:999

• FILTER:.• INFO:DP=7609;VDB=0.0040;AF1=0.625;AC1=5;DP4=1985,855,2346,2276;MQ=59;FQ=999;PV4=2e-60,1,4.6e-

05,1• FORMAT:GT:PL:GQ GT=genotype,PL=genotypelikelihood,GQ=genotypequality• 0/1:244,0,248:99• 0/0:0,255,255:99

• 1/1:255,255,0:99• 1/1:255,255,0:99

Wholegenome-resequencing:snp-calling

Nielsen, Nat. Rev. Genetics 2011

Requires assembled genome

Steps to reduce the “false discovery rate” (FDR)Multi-sample algorithms improve SNP-calling in population studies

Hard-filtering of suspect SNPs is required in most snp-calling and genotyping applications

Page 3: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

3

l WhatisaPCRduplicate?

PCR duplicates

Not a duplicate

Not a duplicate

Reference genome

Duplicate handling: PCR Duplicates Indel Re-alignment

l Whyisitnecessarytore-alignreads?

l Outcomeisrefinementofinsertion-deletionpositioningandreductioninfalsepositiveSNPs

l Example

DINDEL

GATK IndelRealignerTargetCreator / IndelRealigner Before:

After:

BaseQualityScoreRecalibration Resequencing workflow

l QCofsequencingdata

l Alignreads- generatesSAM/BAMalignment

l Coordinatesortreads

l Markduplicatereads

l Re-alignmentaroundinsertions/deletions

l SNP-calling

l Filtering/Qualitycontrol

l AssesspredictedeffectsofSNPs

GATKframework AnnotatingVariants

Page 4: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

4

Dotheanalysis

IGV:IntegrativeGenomicsViewer

• http://www.broadinstitute.org/igv/• Standalonejavaprogram

• Doesnotrequireamysql databaseserveroranapachewebserver• Limitedtotheresourcesofthemachinethatitisrunningon.• MoreinteractivecomparedtoGbrowse.• BothIGVandGbrowse canuseGFFfileformat.

Loadthefasta fileasthegenome(referencesequence) SelectthePTC_Human.fa file

Noticethesequenceatthebottom NowloadtheAlignmentfiles(.bam)

Page 5: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

5

NowloadtheAlignmentfiles(.bam)Coverageistheconsensusofallsequencestogether

Graymeanssequenceisthesameasthegenome,colorshowsachangefromreference.

Loadallthebamfiles

Youcanmovethetracksbyclickingonthenameanddraggingthem.

Zoomintoregionofinterest

Putyourmouseovereachbasetogetmorestatisticsabouteachbase.

VCFFormat LoadtheVCFfile

Page 6: SNP Detection Analysis - CGIARhpc.ilri.cgiar.org/beca/training/AdvancedBFX2016/content/SNPDetect… · PTC Tasting • Phenylthiocarbamide is a chemical that not everyone can taste

8/11/16

6

LoadtheVCFfile DetailsofSNP

PlaceyourmouseovertheSNPtogetthedetails.Aretheconclusionsthesame?Whatadditionalfilteringshouldweapply?


Recommended