Supporting Information (SI) Appendix
Predictable allele frequency changes due to habitat fragmentation in the Glanville fritillary butterfly
Toby Fountain, Marko Nieminen, Jukka Sirén, Swee Chong Wong and Ilkka Hanski
DNA extraction. In the case of field-collected specimens, larval tissue was homogenized prior to
extraction using a TissueLyser (Qiagen) at 30/s for 1.5 min with 3 mm Tungsten Carbide Beads
(Qiagen). DNA was extracted using the NucleoSpin 96 Tissue Core Kit (Macherey-Nagel). Where
DNA yield was low, extracted DNA underwent two rounds of Whole Genome Amplification (WGA)
(LGC Genomics). In the case of museum specimens, DNA was extracted from a leg (in a few cases
from a small wing biopsy) using the QIAamp Micro kit (Qiagen). Where DNA yield was low, an
additional extraction was performed and the products of the two extractions were pooled. To
further increase DNA yield, WGA was performed using the GenomePlex Complete Whole Genome
Amplification Kit (Sigma). After WGA, samples were cleaned using the QIAquick PCR purification kit
(Qiagen) to optimize downstream PCR performance. Sterile methods, and positive and negative
controls, were used to ensure no cross-contamination between museum and contemporary
samples.
Initial SNP selection and validation. SNP markers were selected from candidate genes, putatively
neutral regions of the genome, and otherwise uncovered chromosomes (see main text). SNP
calling was performed on the RNA-seq data, including 40 unrelated individuals sampled across the
Åland Islands, using the "mpileup" function from the "SAMtools" package (1) with the default
parameter values. These 40 Åland individuals were only used for SNP validation and were not
included in the subsequent analyses. SNPs with minor allele frequency (MAF) > 0.2, call rate > 0.9,
and SNP quality score > 100 were retained as candidate SNPs, which were mapped to the genome
(2) to obtain the corresponding genes and exons. A SNP was excluded if there were fewer than 60
nucleotides flanking both upstream and downstream from the SNP in the exon. The gap-filling
SNPs were from coding regions, and they were subjected to the same filtering criteria as the
candidate genes. The neutral SNPs were obtained from the SOLiD mate pair-1 genome sequences
(2) using an in-house SNP calling method (3, 4). As the genomic sequences were obtained by
sequencing only one male individual, only heterozygous SNPs from non-coding regions, spanning
all 31 chromosomes, were selected. SNPs from candidate genes and non-coding
regions were combined into the initial candidate SNP set. A 121 base-pair (bp) flanking region was
extracted for each SNP in the initial SNP set. If there were any flanking SNPs located within both the
upstream and the downstream regions of the candidate SNP, the SNP was excluded. This step was
followed to ensure a clean region for primer and probe design. The filtered SNPs were submitted
to LGC Genomics for primer design and in silico testing. BLAST was performed for the primer pairs
against the reference genome to confirm that the primers bind only to the region where the
corresponding SNP is located. SNPs that did not fulfill this criterion were excluded. The SNPs that
passed all the quality control steps were selected for validation. A validation panel was
constructed using 48 individuals from eight families sampled across the Åland Islands in 2007-11.
A SNP passed the validation step if it produced clearly defined genotype clusters in a scatterplot
with fewer than two Mendelian errors, and had a high SNP call rate (> 0.9). Following the validation
process, 320 SNPs (18 SNPs from prior studies, 182 SNPs from 164 candidate genes, 15 SNPs from
sex chromosomal scaffolds, 45 neutral SNPs, and 60 gap-filling SNPs) were selected as the final
genotyping panel implementing KASP chemistry (LGC Genomics).
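The candidate-SNP thresholds described above can be expressed as a simple filter. The following is a minimal Python sketch, not the authors' pipeline: the record type `CandidateSnp`, its field names, and the example values are hypothetical, and the flank rule is read here as requiring at least 60 exonic nucleotides on each side of the SNP.

```python
from dataclasses import dataclass


@dataclass
class CandidateSnp:
    """Summary statistics for one candidate SNP (field names are illustrative)."""
    snp_id: str
    maf: float             # minor allele frequency
    call_rate: float       # fraction of individuals with a genotype call
    quality: float         # SNP quality score
    upstream_flank: int    # exonic nucleotides upstream of the SNP
    downstream_flank: int  # exonic nucleotides downstream of the SNP


def passes_filters(snp, min_maf=0.2, min_call_rate=0.9, min_quality=100, min_flank=60):
    """Return True if the SNP meets the thresholds quoted in the text."""
    return (
        snp.maf > min_maf
        and snp.call_rate > min_call_rate
        and snp.quality > min_quality
        and snp.upstream_flank >= min_flank      # assumed reading of the flank rule
        and snp.downstream_flank >= min_flank
    )


# Hypothetical candidates, for illustration only.
candidates = [
    CandidateSnp("snp_a", 0.35, 0.97, 180.0, 75, 90),   # passes every threshold
    CandidateSnp("snp_b", 0.15, 0.99, 250.0, 80, 80),   # MAF too low
    CandidateSnp("snp_c", 0.40, 0.95, 120.0, 40, 100),  # upstream flank too short
]
retained = [snp.snp_id for snp in candidates if passes_filters(snp)]
print(retained)  # ['snp_a']
```

The later steps (exclusion of SNPs with flanking variants, primer BLAST, and family-based validation) depend on external data and are not covered by this sketch.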
Potential ascertainment bias in museum samples. One potential limitation of this dataset is that
the candidate SNPs were validated with the contemporary Åland samples only, introducing the
possibility of ascertainment bias in the museum samples. The reason is that genetic differences
between the other samples and the Åland sample could reduce marker performance and levels of
polymorphism in more distant populations. However, such a bias is very unlikely in the present
case, for several reasons. All markers were polymorphic in the Åland museum samples, and > 94%
of the markers were polymorphic in SW Finnish museum samples. The alternative allele at loci
that were monomorphic in the SW Finnish samples tended to be at very low frequency in the
museum Åland samples. Moreover, the SW Finnish samples share ancestral genetic variation with
the contemporary Åland metapopulation, and the now extinct SW Finnish populations were
isolated.
Validation of museum sample genotypes. To test the repeatability of genotyping, a subset of the
museum samples (n = 8) was genotyped for a subset of SNPs using the Sequenom iPLEX Gold
genotyping platform. One individual was genotyped at 20 SNPs, six individuals at 14 SNPs, and one
individual at seven SNPs. In cases where a sample was successfully called in both Sequenom and
KASP, genotype concordance was 92.1% (70/76 genotypes identical). Moreover, we examined the
relationship between the call rate and the level of heterozygosity to ensure that reduced
heterozygosity in museum samples is not a result of lower sample quality in museum than in
contemporary samples (Fig. S7). There is a general reduction of heterozygosity when call rate
decreases in both contemporary and museum samples, but at the highest call rates (> 0.8), SW
Finnish museum samples have lower heterozygosity than museum Åland (Tukey
test, P < 0.0001) and contemporary Åland (Tukey test, P < 0.0001) samples. As an alternative
analysis, we tested the differences between the populations in the full dataset while including call
rate as a covariate. In this analysis, the population effect was highly significant (population:
F = 33.09, P < 0.0001; call rate: F = 74.29, P < 0.0001), with SW Finland having significantly lower
heterozygosity than both contemporary Åland (Tukey test, P < 0.0001) and museum Åland
samples (Tukey test, P < 0.0001). Nonetheless, to avoid introducing any potential bias due to low
sample quality we only retained individuals with an average call rate > 0.8 across all the 222 SNPs.
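The concordance figure above is a count of matching genotypes over sample/SNP pairs that were called on both platforms. A minimal Python sketch of that bookkeeping follows; the data layout (dictionaries keyed by sample and SNP), the function names, and the example calls are hypothetical and are not taken from the study.

```python
def normalize(genotype):
    """Sort alleles so that 'GA' and 'AG' compare as the same genotype."""
    return "".join(sorted(genotype))


def concordance(calls_a, calls_b):
    """Return (matching, compared) over sample/SNP pairs called on both platforms.

    Each argument maps a (sample, snp) key to a genotype string such as 'AG',
    or to None when the call failed.
    """
    matching = compared = 0
    for key, genotype_a in calls_a.items():
        genotype_b = calls_b.get(key)
        if genotype_a is None or genotype_b is None:
            continue  # skip pairs not called on both platforms
        compared += 1
        if normalize(genotype_a) == normalize(genotype_b):
            matching += 1
    return matching, compared


# Hypothetical calls for two specimens at two SNPs.
kasp = {("m1", "s1"): "AG", ("m1", "s2"): "CC", ("m2", "s1"): None, ("m2", "s2"): "CT"}
sequenom = {("m1", "s1"): "GA", ("m1", "s2"): "CT", ("m2", "s1"): "AA", ("m2", "s2"): "TC"}
matching, compared = concordance(kasp, sequenom)
print(matching, compared)  # 2 3

# The figure quoted in the text: 70 matching genotypes out of 76 compared.
print(round(100 * 70 / 76, 1))  # 92.1
```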
FST values. Using the 222 loci, we calculated the FST value (5) between each museum specimen and
the contemporary Saltvik population, sampled in 2007 (n = 530). The FST value was used to
compute the probability for each museum specimen (genotype) separately, using the observed
allele frequencies in the Åland population as the expectation and a uniform prior distribution
between 0 and 1. The posterior mean of the FST was used as an estimate of the evolutionary
distance of the specimen from the reference population. The effects of year of sampling,
population type and marker type (candidate vs neutral) were tested using linear models in R. We
selected a priori a set of biologically plausible models and used the function model.sel in the
package MuMIn v. 1.13.4 to rank the models based on their Akaike information criterion for finite
sample sizes (AICc) (6, 7).
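To make the specimen-level calculation concrete, the sketch below evaluates a posterior mean of FST on a grid under a uniform prior on (0, 1). It is an illustrative simplification, not the F-model implementation cited in (5). It assumes biallelic loci treated as independent, reference allele frequencies known without error, and genotype probabilities of the form p^2 + FST p(1-p), 2p(1-p)(1-FST) and (1-p)^2 + FST p(1-p). The function names and the example data are hypothetical.

```python
import math


def genotype_prob(copies, p, fst):
    """Probability of carrying 0, 1 or 2 copies of an allele with reference frequency p.

    Assumed form: Hardy-Weinberg proportions with heterozygotes reduced by (1 - fst).
    """
    if copies == 2:
        return p * p + fst * p * (1 - p)
    if copies == 1:
        return 2 * p * (1 - p) * (1 - fst)
    return (1 - p) ** 2 + fst * p * (1 - p)


def posterior_mean_fst(genotypes, ref_freqs, grid_size=999):
    """Posterior mean of FST for one specimen, uniform prior, evaluated on a grid."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    log_liks = []
    for fst in grid:
        total = 0.0
        for copies, p in zip(genotypes, ref_freqs):
            if copies is None:
                continue  # missing genotype call
            prob = genotype_prob(copies, p, fst)
            total += math.log(prob) if prob > 0 else float("-inf")
        log_liks.append(total)
    peak = max(log_liks)
    if peak == float("-inf"):
        raise ValueError("genotypes are impossible under the reference frequencies")
    weights = [math.exp(ll - peak) for ll in log_liks]  # rescaled to avoid underflow
    return sum(f * w for f, w in zip(grid, weights)) / sum(weights)


# Hypothetical specimens scored at 20 loci with reference frequency 0.5.
ref = [0.5] * 20
all_het = posterior_mean_fst([1] * 20, ref)  # many heterozygotes: small estimate
all_hom = posterior_mean_fst([2] * 20, ref)  # no heterozygotes: large estimate
print(all_het < all_hom)  # True
```

A specimen with fewer heterozygous loci than expected from the reference frequencies receives a larger posterior mean, which is the sense in which the estimate measures distance from the reference population.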
Allele frequency changes in the outlier loci. To characterize allele frequency (AF) changes due to
population turnover we calculated the difference in allele frequencies between newly-established,
isolated populations (AF(new)) and old, well-connected populations (AF(old)). The difference
AF(new) - AF(old) is correlated with AF(old) (P = 0.06; Fig. S6). We therefore repeated the analysis
after removing this bias by regressing AF(new) - AF(old) against AF(old), and used the residual from
this regression as the measure of allele frequency differences between the population types.
Similarly, we calculated corresponding residuals for the variables AF(Sottunga) - AF(old) and AF(SW
Finland) - AF(old) in Fig. 4 and Fig. S5. In all cases, the results were only little affected by this
correction and the conclusions remained unchanged (Table S6). For simplicity, we show the results
based on the uncorrected values in the main text, with the exception of the analysis associated
with Fig. S5, in which corrected values were used due to very small sample size.
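The correction described above amounts to an ordinary least-squares regression of the allele frequency difference on AF(old), followed by taking residuals. A minimal Python sketch is given below; the function name and the allele frequencies are hypothetical, chosen only to show the mechanics.

```python
def ols_residuals(x, y):
    """Residuals of y after a least-squares linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical allele frequencies at five loci.
af_old = [0.20, 0.35, 0.50, 0.65, 0.80]
af_new = [0.30, 0.40, 0.45, 0.60, 0.70]

difference = [new - old for new, old in zip(af_new, af_old)]
corrected = ols_residuals(af_old, difference)  # AF(new) - AF(old) with the AF(old) trend removed
print([round(value, 3) for value in corrected])
```

By construction the residuals are uncorrelated with AF(old), so any remaining association with another variable cannot be driven by the starting allele frequency.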
To characterize allele frequencies in the outlier loci in butterflies from fragmented vs
continuous landscapes, we extracted the allele frequencies from an RNA-seq dataset (8) for the
four regional populations in Fig. 1. As there were four regional populations, two of each landscape
type, we summarized variation in allele frequencies with a principal component analysis (Table
S4). PC2, which explained 33% of total variance, was strongly correlated with landscape type
(Table S4). PC2 was correlated, though not significantly, with corrected AF(Sottunga) - AF(old) (R2
= 0.14, P = 0.17) and corrected AF(SW Finland) - AF(old) (R2 = 0.29, P = 0.08). To combine these
two analyses, we ran a principal component analysis on AF(SW Finland) - AF(old) and AF(Sottunga) -
AF(old) and used PC1 as the average allele frequency change in the two populations. PC1
accounted for 79% of the total variance. Three outlier loci were not available from the RNA-seq
dataset and were therefore excluded from this analysis.
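For a principal component analysis of just two variables, the share of variance carried by PC1 has a closed form. The Python sketch below computes it from the 2 x 2 covariance matrix, or from the correlation matrix when the variables are standardized. Whether the authors standardized the two allele frequency differences is not stated, so the `standardize` option, the function name, and the example data are assumptions.

```python
import math


def pc1_variance_share(x, y, standardize=True):
    """Proportion of total variance explained by PC1 in a two-variable PCA."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    if standardize:
        # Work on the correlation matrix: unit variances, covariance becomes r.
        cov_xy /= math.sqrt(var_x * var_y)
        var_x = var_y = 1.0
    half_trace = (var_x + var_y) / 2
    radius = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    largest_eigenvalue = half_trace + radius
    return largest_eigenvalue / (var_x + var_y)


# Hypothetical allele frequency differences at six loci.
diff_sw_finland = [0.46, 0.52, -0.38, -0.05, -0.58, 0.20]
diff_sottunga = [0.07, 0.01, 0.08, -0.26, -0.44, -0.30]
print(round(pc1_variance_share(diff_sw_finland, diff_sottunga), 2))
```

With standardized variables the share reduces to (1 + |r|) / 2, where r is the correlation between the two variables, so under that assumption PC1 carries between 50% and 100% of the variance.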
References
1. Li H, et al. (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25(16):2078–2079.
2. Ahola V, et al. (2014) The Glanville fritillary genome retains an ancient karyotype and reveals selective chromosomal fusions in Lepidoptera. Nat Commun 5:1–9.
3. Rastas P, Paulin L, Hanski I, Lehtonen R, Auvinen P (2013) Lep-MAP: fast and accurate linkage map construction for large SNP datasets. Bioinformatics 29(24):3128–3134.
4. Kvist J, et al. (2015) Flight-induced changes in gene expression in the Glanville fritillary butterfly. Mol Ecol 24(19):4886–4900.
5. Gaggiotti OE, Foll M (2010) Quantifying population structure using the F-model. Mol Ecol Resour 10(5):821–830.
6. Bartoń K (2015) MuMIn: Multi-model inference. R package v. 1.13.4. https://cran.r-project.org/web/packages/MuMIn/MuMIn.pdf.
7. Johnson JB, Omland KS (2004) Model selection in ecology and evolution. Trends Ecol Evol 19(2):101–108.
8. Somervuo P, et al. (2014) Transcriptome analysis reveals signature of adaptation to landscape fragmentation. PLoS One 9(7):e101467.
Table S1. FST values for the spatio-temporally pooled samples (see Material and Methods in main text) calculated using all the 222 SNPs (lower diagonal). Significant values (P < 0.05) are highlighted in bold. Values based on the 34 neutral SNPs (upper diagonal) gave qualitatively similar results.
                 Åland 1905       Åland 1945       Åland 1965       Saltvik 2010     Sottunga 2010    SW Finland 1900  SW Finland 1940  SW Finland 1965
Åland 1905       -                0.033            0.043            0.014            0.074            0.013            0.148            0.227
Åland 1945       0.012            -                0.014            0.035            0.052            -0.003           0.142            0.177
Åland 1965       0.021            0.013            -                0.054            0.094            0.009            0.168            0.204
Saltvik 2010     0.014            0.026            0.040            -                0.075            -0.022           0.134            0.197
Sottunga 2010    0.051            0.059            0.074            0.056            -                -0.053           0.147            0.191
SW Finland 1900  0.060            0.041            0.050            0.050            0.045            -                -0.043           -0.015
SW Finland 1940  0.131            0.116            0.122            0.125            0.149            0.129            -                0.059
SW Finland 1965  0.204            0.165            0.190            0.182            0.198            0.226            0.136            -
Table S2. The set of a priori selected models explaining FST values in Fig. S1. K is the number of parameters in the model. Models are ranked based on the difference in AICc values (ΔAICc) between the focal model and the best model in the set. Akaike weights reflect the likelihood of a model relative to all other models in the set.
model                                                       K    AICc      ΔAICc    Akaike weight
year + pool + marker type + marker type*pool                6    -238.7    0        0.409
year + pool + marker type + marker type*pool + year*pool    7    -238.3    0.44     0.328
year + pool + marker type                                   5    -236.2    2.49     0.118
year + pool + marker type + year*pool                       6    -235.8    2.95     0.094
pool + marker type                                          4    -234.5    4.16     0.051
year + pool                                                 4    -223.1    15.59    0.000
pool                                                        3    -221.7    16.99    0.000
type                                                        3    -181.1    57.61    0.000
year + marker type                                          4    -179.7    58.99    0.000
intercept only                                              2    -172.0    66.72    0.000
year                                                        3    -170.6    68.12    0.000
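
The Akaike weights in Table S2 follow from the ΔAICc column through the standard relation w_i = exp(-ΔAICc_i / 2) / Σ_j exp(-ΔAICc_j / 2). The short Python sketch below applies it to the tabulated ΔAICc values. It is an illustration of the formula only; the analysis itself was run with model.sel in MuMIn, and the function name here is hypothetical.

```python
import math


def akaike_weights(delta_aicc):
    """Convert AICc differences from the best model into Akaike weights."""
    relative_likelihoods = [math.exp(-0.5 * delta) for delta in delta_aicc]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


# ΔAICc values copied from Table S2, in table order.
delta_aicc = [0, 0.44, 2.49, 2.95, 4.16, 15.59, 16.99, 57.61, 58.99, 66.72, 68.12]
weights = akaike_weights(delta_aicc)
print([round(w, 3) for w in weights[:5]])  # [0.409, 0.328, 0.118, 0.094, 0.051]
```

The five best models together carry essentially all of the weight, and each of them includes both pool and marker type.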
Table S3. A summary of the results from the three analyses involving the outlier loci. In the case of the repeatability analysis and comparison with population turnover rate, allele frequency difference between the focal population and the old local populations from the Saltvik reference population are given. In the comparison involving the degree of fragmentation at the landscape level, the two principal components used in the analysis are given. Positive values are highlighted in green, negative values are highlighted in red.
                    Repeatability (Fig. 3)    Population turnover (Fig. 4)           Habitat fragmentation (Fig. S5)
Outlier SNPs        SW Finland   Sottunga     SW Finland   New          Sottunga     SW Finland/Sottunga (pc1)   Fragmented landscapes (pc2)
Mc1:1041:122591     0.462        0.066        0.462        -0.040       0.066        1.176                       1.870
Mc1:2666:34531      0.518        0.010        0.518        -0.040       0.010        0.852                       -0.663
Mc1:2966:24907      -0.384       0.076        -0.384       -0.135       0.076        0.604                       0.011
Mc1:1061:35594      -0.052       -0.259       -0.052       0.049        -0.259       -1.199                      0.534
Mc1:1873:36910      -0.585       -0.439       -0.585       -0.171       -0.439       -2.006                      -1.853
Mc1:2025:177786     -0.382       -0.197       -0.382       -0.073       -0.197       -0.448                      0.639
Mc1:1124:71239      -0.244       -0.055       -0.244       0.040        -0.055       -1.115                      -1.375
Mc1:1206:26737      0.238        0.096        0.238        -0.009       0.096        NA                          NA
Mc1:752:33517       0.196        -0.296       0.196        0.008        -0.296       0.351                       0.199
Mc1:2673:141336     0.745        0.323        0.745        0.136        0.323        1.785                       0.639
Mc1:1129:26699      -0.252       -0.144       -0.252       -0.195       -0.144       NA                          NA
Mc1:658:68226       0.091        -0.193       0.091        -0.092       -0.193       NA                          NA
Table S4. Correlations between the first four principal components and the allele frequencies in the four regional populations inhabiting either fragmented (bold) or continuous landscapes.
                          PC1      PC2      PC3      PC4
Åland                     -0.43    0.61     -0.45    -0.49
Uppland                   -0.46    0.43     0.73     0.27
Öland                     -0.62    -0.28    -0.44    0.59
Saaremaa                  -0.47    -0.61    0.27     -0.58
Proportion of variance    0.47     0.33     0.16     0.04
Table S5. Pairs of SNPs with significant LD (P < 0.05 after FDR). KASP ID, chromosome and position are shown.
Locus #1    Chromosome    Position    Locus #2    Chromosome    Position
Åland
K3-82       1             1620        K4-5        24            100694
K2-111      2             194473      K5-7        2             192234
K2-25       3             5112        K3-17       3             15952
K3-12       13            16716       K3-62       8             103903
K2-60       13            37262       K3-150      13            33875
K3-185      15            10977       K8-82       10            47209
K3-192      15            73713       K5-133      15            70120
K2-79       17            168063      K2-80       17            166435
K2-86       25            24287       K2-88       25            19949
K2-54       25            15219       K2-86       25            24287
K3-162      NA            1329        K5-128      10            13465
SW Finland
K3-134      4             2340        K3-137      4             3654
Table S6. Results from linear models of allele frequency differences using values corrected versus not corrected for a weak and non-significant correlation with AF(old) (see Section Allele frequency changes in the outlier loci in SI Appendix).
               Repeatability (Fig. 3)    Population turnover (Fig. 4)                Habitat fragmentation (Fig. S5)
               SW Finland-Sottunga       Sottunga-New          SW Finland-New        SW Finland/Sottunga
               R2      P                 R2      P             R2      P             R2      P
Uncorrected    0.36    0.02              0.14    0.13          0.36    0.02          0.15    0.16
Corrected      0.30    0.04              0.05    0.24          0.30    0.04          0.30    0.07
Fig. S1. FST values of each museum specimen compared to the Saltvik reference population (in 2007) plotted against the time of sampling, separately for candidate (solid triangles) and neutral markers (open triangles) for (a) Åland and (b) SW Finnish museum samples. The regression lines of the FST values against time are plotted separately for the two marker types (solid line for candidate markers, dashed line for neutral markers). For the analysis see Table S2.
[Fig. S1 plot. Panels: (a) Åland, (b) SW Finland. Horizontal axis: Year (1880 to 2000). Vertical axis: FST (0.0 to 1.0).]
Fig. S2. The allele frequency shifts of outlier loci (n = 12) in the SW Finnish population in relation to contemporary Åland samples are significantly correlated with the corresponding allele frequency shifts in the Sottunga population in relation to contemporary Åland samples (R2 = 0.36, P = 0.02).
[Fig. S2 plot. Horizontal axis: Allele frequency difference between Sottunga and Åland. Vertical axis: Allele frequency difference between SW Finland and Åland.]
Fig. S3. Allele frequency differences of neutral loci (n = 34) between samples from now extinct populations in SW Finland (black) and the introduced metapopulation in Sottunga (white), compared to old, well-connected local populations in Saltvik. Dashed lines show loci with allele frequencies less than 0.2 or greater than 0.8 in the old local populations.
[Fig. S3 plot. Horizontal axis: Allele frequency in contemporary Åland. Vertical axis: Allele frequency difference from contemporary Åland.]
Fig. S4. Allele frequency shifts of neutral loci (n = 20) in the SW Finnish museum samples and in Sottunga in relation to the contemporary Åland samples. As there was a highly significant relationship between the allele frequency difference AF(Sottunga) - AF(old) and AF(old) in the neutral markers (R2 = 0.22, P = 0.003), allele frequency changes are expressed as residuals from a linear regression of AF(Sottunga) - AF(old) against AF(old) and of AF(SW Finland) - AF(old) against AF(old) (see text in SI). Loci that had allele frequencies greater than 0.2 and less than 0.8 in old well-connected populations are included (Fig. S3). The remaining markers were excluded as they fell outside the minor allele frequency cutoffs used in the selection of candidate markers. The correlation is not significant (R2 = -0.004, P = 0.35).
[Fig. S4 plot. Horizontal axis: Allele frequency difference between Sottunga and Åland. Vertical axis: Allele frequency difference between SW Finland and Åland.]
Fig. S5. The second principal component of the allele frequencies in the outliers (n = 9) in four regional populations (Fig. 1) plotted against the first principal component of the allele frequency differences between Sottunga and SW Finland from old local populations in the Saltvik reference population (for the calculation see Allele frequency changes in the outlier loci in SI text). Data for three outliers were not available for all the four regional populations in the RNA-seq dataset, hence n = 9. PC2 on the vertical axis is positively correlated with habitat fragmentation in the regional populations (Table S4). The regression is close to significant (R2 = 0.30, P = 0.07).
[Fig. S5 plot. Horizontal axis: Allele frequency change in isolated metapopulation and extinct populations (PC1). Vertical axis: Allele frequencies characterizing fragmented landscapes (PC2).]
Fig. S6. There is a negative relationship in the outlier loci (n = 13) between the allele frequency difference AF(new) - AF(old) and AF(old) (R2 = 0.22, P = 0.06). To assess whether this was influencing our results we repeated the analyses on allele frequencies using residuals from this regression (see text in SI). The results remained qualitatively the same and the conclusions were not affected (Table S6).
[Fig. S6 plot. Horizontal axis: Allele frequency in contemporary Åland. Vertical axis: Allele frequency difference in old compared to new populations.]
Fig. S7. The relationship between the call rate and the proportion of heterozygous loci. Samples are split amongst the two KASP genotyping plates that contained museum samples. For the analyses see the text in SI.
[Fig. S7 plot. Horizontal axis: Call rate. Vertical axis: Proportion of heterozygous loci. Legend: Plate 15, museum Åland; Plate 15, museum SW Finland; Plate 16, museum Åland; Plate 16, museum SW Finland; Plate 16, contemporary Sottunga; Plate 16, contemporary Åland.]