Measurably evolving populations
Alexei J Drummond and Remco Bouckaert
December 13, 2012
1 Time-stamped data
This tutorial estimates the rate of evolution from a set of virus sequences which have been isolated at different points in time (heterochronous or time-stamped data). The data are 129 sequences from the G (attachment protein) gene of human respiratory syncytial virus subgroup A (RSVA) from various parts of the world, with isolation dates ranging from 1956 to 2002 [1, 2]. RSVA infects the lower respiratory tract, causing symptoms that are often indistinguishable from the common cold. By age 3, nearly all children will have been infected, and a small percentage (< 3%) will develop more serious inflammation of the bronchioles requiring hospitalisation.
The aim of this tutorial is to obtain estimates for:
the rate of molecular evolution;
the date of the most recent common ancestor;
the phylogenetic relationships, with measures of statistical support.
The following software will be used in this tutorial:
BEAST - this package contains the BEAST program, BEAUti, DensiTree, TreeAnnotator and other utility programs. This tutorial is written for BEAST v2.0, which has support for multiple partitions. It is available for download from http://beast2.cs.auckland.ac.nz/.
Tracer - this program is used to explore the output of BEAST (and other Bayesian MCMC programs). It graphically and quantitatively summarizes the distributions of continuous parameters and provides diagnostic information. At the time of writing, the current version is v1.5. It is available for download from http://beast.bio.ed.ac.uk/.
FigTree - this is an application for displaying and printing molecular phylogenies, in particular those obtained using BEAST. At the time of writing, the current version is v1.3.1. It is available for download from http://tree.bio.ed.ac.uk/.
The NEXUS alignment
The data is in a file called RSV2.nex, which you can find in the examples/nexus directory inside the directory where BEAST was installed. This file contains an alignment of 129 sequences from the G gene of RSVA virus, 629 nucleotides in length. Import this alignment into BEAUti. Because this is a protein-coding gene, we are going to split the alignment into three partitions representing each of the three codon positions. To do this, click the Split button at the bottom of the Partitions panel and then select "1 + 2 + 3 frame 3" from the drop-down menu. This signifies that the first full codon starts at the third nucleotide in the alignment. This will create three rows in the Partitions panel. You will have to re-link the tree and clock models across the three partitions (and name them tree and clock respectively) before continuing to the next step.
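The codon-position split can be pictured with a short Python sketch. It is only an illustration of the idea, not BEAUti's code: the function names are hypothetical, and it assumes that the two sites before the first full codon are treated as the second and third positions of a partial codon.

```python
def codon_position(site, first_codon_start=3):
    """Return the codon position (1, 2 or 3) of a 1-based alignment site.

    first_codon_start is the 1-based site at which the first full codon
    begins (3 for the "frame 3" option). Assumption: sites before it
    belong to a partial leading codon.
    """
    return (site - first_codon_start) % 3 + 1


def split_by_codon_position(n_sites, first_codon_start=3):
    """Group 1-based site indices into three lists, one per codon position."""
    partitions = {1: [], 2: [], 3: []}
    for site in range(1, n_sites + 1):
        partitions[codon_position(site, first_codon_start)].append(site)
    return partitions


# The RSVA alignment is 629 nucleotides long.
partitions = split_by_codon_position(629)
for position, sites in partitions.items():
    print(position, len(sites), sites[:3])
```

Under these assumptions the three partitions are nearly equal in size, which is why the relative rates of the three positions can be compared directly later on.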
The partition panel should now look something like this:
By default all the taxa are assumed to have a date of zero (i.e. the sequences are assumed to be sampled at the same time). In this case, the RSVA sequences have been sampled at various dates going back to the 1950s. The actual year of sampling is given in the name of each taxon and we could simply edit the value in the Date column of the table to reflect these. However, if the taxa names contain the calibration information, then a convenient way to specify the dates of the sequences in BEAUti is to click the checkbox Use tip dates and then use the Guess button at the top of the Tip Dates panel. Clicking this will make a dialog box appear.
Select the option to use everything, choose "after last" from the drop-down box and type s into the corresponding text box. This will extract the trailing numbers from the taxon names after the last lowercase s, which are interpreted as the year (in this case since 1900) in which the sample was isolated.
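The rule applied by the dialog can be mimicked in a few lines of Python. This is a minimal sketch of the parsing rule as described here, not BEAUti's implementation; guess_year is a hypothetical helper name, and the taxon names are taken from this data set.

```python
def guess_year(taxon_name, delimiter="s", offset=1900):
    """Read the number after the last delimiter and add the offset (1900)."""
    _, _, suffix = taxon_name.rpartition(delimiter)
    return offset + int(suffix)


for name in ["USALongs56", "BE8078s92", "BE332s102"]:
    print(name, guess_year(name))
```

Note that a suffix above 99, as in BE332s102, still works because the number is counted in years since 1900.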
The dates panel should now look something like this:
Setting the substitution model
We will use the HKY model with empirical base frequencies for all three partitions. To do this, first link the site partitions and then choose HKY and Empirical from the Subst Model and Frequencies drop-down boxes. Also check the estimate box for the Mutation Rate and finally check the Fix mean mutation rate box. Then go back to the Partitions panel and unlink the site partitions again.
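To make "empirical base frequencies" concrete, the following Python sketch counts the observed proportions of A, C, G and T in an alignment. It is a simplified illustration: the function name and the toy alignment are hypothetical, and skipping gaps and ambiguity codes is an assumption that may differ from what BEAST does internally.

```python
from collections import Counter


def empirical_base_frequencies(sequences):
    """Return the observed proportions of A, C, G and T in the sequences.

    Assumption: gaps and ambiguity codes are simply ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous nucleotides found")
    return {base: counts[base] / total for base in "ACGT"}


# Hypothetical toy alignment, not taken from RSV2.nex.
toy_alignment = ["ACGTAC", "ACGAAC", "AC-TAC"]
print(empirical_base_frequencies(toy_alignment))
```

With empirical frequencies the four values are fixed from the data in this way rather than estimated during the MCMC run.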
1.0.1 Priors
To set up the priors, select the Priors tab. Choose Coalescent Constant Population for the tree prior. Set the prior on the clockRate parameter to a log-normal with M = -5 and S = 1.25.
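To see what such a log-normal prior implies for the clock rate, the sketch below computes its median and central 95% interval. It assumes that M and S are the mean and standard deviation of the logarithm of the rate, and it takes M = -5 (negative, so that the prior is centred on a small rate); lognormal_quantile is a hypothetical helper, not a BEAST function.

```python
import math
from statistics import NormalDist


def lognormal_quantile(p, m, s):
    """Quantile of a log-normal whose log has mean m and std. deviation s."""
    return math.exp(m + s * NormalDist().inv_cdf(p))


M, S = -5.0, 1.25  # assumed to be log-space parameters

median = lognormal_quantile(0.5, M, S)
lower = lognormal_quantile(0.025, M, S)
upper = lognormal_quantile(0.975, M, S)
print(f"median {median:.4g}, 95% interval {lower:.3g} to {upper:.3g}")
```

Under these assumptions the prior is centred a little below 0.01 substitutions per site per year and spans roughly two orders of magnitude, so it is only weakly informative.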
1.1 Setting the MCMC options
For this dataset let's initially set the chain length to 2,000,000, as this will run reasonably quickly on most modern computers. Set the sampling frequency for the screen to 1000, for the trace log file to 400 and for the trees file to 400.
Running BEAST
Save the BEAST file (e.g. RSV2.xml) and run it in BEAST.
Analysing the BEAST output
Load the log file produced by BEAST into Tracer. Note that the effective sample sizes (ESSs) for many of the logged quantities are small (ESSs less than 100 will be highlighted in red by Tracer). This is not good. A low ESS means that the trace contains a lot of correlated samples and thus may not represent the posterior distribution well. In the bottom right of the window is a frequency plot of the samples which, as expected given the low ESSs, is extremely rough.
If we select the tab on the right-hand side labelled Trace, we can view the raw trace, that is, the sampled values against the step in the MCMC chain.
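The link between correlated samples and a low ESS can be illustrated with a simple estimator. The Python sketch below divides the number of samples by an autocorrelation time summed up to the first non-positive lag; this is one common textbook approach and is not claimed to be the algorithm Tracer uses. Both toy traces are simulated, not BEAST output.

```python
import random


def effective_sample_size(trace):
    """Estimate ESS as n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first non-positive autocorrelation
    (a simplifying assumption of this sketch).
    """
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    variance = sum(c * c for c in centred) / n
    if variance == 0.0:
        raise ValueError("trace has zero variance")
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


rng = random.Random(1)
independent = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# A strongly autocorrelated toy chain: each value depends on the previous one.
correlated, x = [], 0.0
for _ in range(2000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    correlated.append(x)

print(round(effective_sample_size(independent)))
print(round(effective_sample_size(correlated)))
```

Both traces contain the same number of samples, but the autocorrelated one carries far less independent information, which is what a low ESS in Tracer signals.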
Here you can see how the samples are correlated. There are 5000 samples in the trace (we ran the MCMC for 2,000,000 steps, sampling every 400), but adjacent samples often tend to have similar values. The ESS for the absolute rate of evolution (clockRate) is about 138, so we are getting only 1 independent sample for every 36 (= 5000/138) actual samples. With a short run such as this one, it may also be the case that the default burn-in of 10% of the chain length is inadequate. Not excluding enough of the start of the chain as burn-in will render estimates of ESS unreliable.
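The arithmetic in this paragraph can be checked directly. The variable names below are illustrative only; the numbers are the ones quoted in the text.

```python
chain_length = 2_000_000   # MCMC steps
log_every = 400            # trace log sampling interval
ess_clock_rate = 138       # approximate ESS reported for clockRate

n_samples = chain_length // log_every
samples_per_independent_draw = n_samples / ess_clock_rate

print(n_samples)                               # 5000
print(round(samples_per_independent_draw, 1))  # 36.2
```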
The simple response to this situation is that we need to run the chain for longer. Given that the lowest ESS (for the constant coalescent prior) is 93, we would have to run the chain for at least twice as long to get reasonable ESSs above 200. However, it would be better to aim higher, so let's go for a chain length of 10,000,000. Go back to the MCMC options section in BEAUti and create a new BEAST XML file with a longer chain length. Now run BEAST and load the new log file into Tracer (you can leave the old one loaded for comparison).
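A rough rule of thumb behind this reasoning is that ESS grows approximately in proportion to chain length. The sketch below applies that assumption (which holds only once the chain has converged) to the numbers above; required_chain_length is a hypothetical helper.

```python
import math


def required_chain_length(current_length, current_ess, target_ess):
    """Scale the chain length, assuming ESS is proportional to chain length."""
    return math.ceil(current_length * target_ess / current_ess)


needed = required_chain_length(2_000_000, current_ess=93, target_ess=200)
print(needed)
```

The result is a little over twice the original length, so a chain of 10,000,000 steps leaves a comfortable margin.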
Click on the Trace tab and look at the raw trace plot.
Again we have chosen options that produce 5000 samples. With an ESS of about 419 there is still auto-correlation between the samples, but more than 400 effectively independent samples will now provide a very good estimate of the posterior distribution. There are no obvious trends in the plot (which would suggest that the MCMC has not yet converged), and there are no significant long-range fluctuations in the trace (which would suggest poor mixing).
As we are satisfied with the mixing, we can now move on to one of the parameters of interest: the substitution rate. Select clockRate in the left-hand table. This is the average substitution rate across all sites in the alignment. Now choose the density plot by selecting the tab labelled Marginal Density. This shows a plot of the marginal posterior probability density of this parameter. You should see a plot similar to this:
As you can see, the posterior probability density is roughly bell-shaped. There is some sampling noise, which would be reduced if we ran the chain for longer or sampled more often, but we already have a good estimate of the mean and HPD interval. You can overlay the density plots of multiple traces in order to compare them (it is up to the user to determine whether they are comparable on the same axis or not). Select the relative substitution rates for all three codon positions in the table to the left (labelled mutationRate.1, mutationRate.2 and mutationRate.3). You will now see the posterior probability densities for the relative substitution rate at all three codon positions overlaid:
Summarizing the trees
Use the program TreeAnnotator to summarize the trees and view the result in FigTree (Figure 1).
Figure 1: The maximum clade credibility tree for the G gene of 129 RSVA-2 viral samples.
DensiTree with clade height bars for clades with over 50% support. Root canal tree represents maximum clade credibility tree.
Questions
In what year did the common ancestor of all RSVA viruses sampled live? What is the 95% HPD?
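When answering from a log file rather than from Tracer, two small calculations are involved: finding the 95% HPD of the sampled root heights and converting heights to calendar years. The Python sketch below shows one way to do both. It assumes that heights are measured in years before the youngest tip (2002 in this data set); the function names are hypothetical and the toy heights are made up, not results of this analysis.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples")
    k = min(n, max(1, math.ceil(mass * n)))  # samples inside the interval
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def height_to_year(height, youngest_tip_year=2002):
    """Convert a node height (years before the youngest tip) to a year."""
    return youngest_tip_year - height


# Hypothetical root-height samples, for illustration only.
toy_heights = list(range(45, 65))
low, high = hpd_interval(toy_heights)
# A larger height means an older date, so the bounds swap when converted.
print(height_to_year(high), height_to_year(low))
```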
Bonus section: Bayesian Skyline plot
We can reconstruct the population history using the Bayesian Skyline plot. In order to do so, load the XML file into BEAUti, select the Priors tab and change the tree prior from coalescent with constant population size to coalescent with Bayesian skyline. Note that an extra item, called Markov chained population sizes, is added to the priors; this is a prior that ensures dependence between population sizes.
By default the number of groups used in the skyline analysis is set to 5. To change this, select the menu View/Show Initialization panel and a list of parameters is shown. Select bPopSizes.t:tree and change the dimension to 3. Likewise, select bGroupSizes.t:tree and change its dimension to 3. The dimensions of the two parameters should be the same. More groups mean that more population changes can be detected, but also that more parameters need to be estimated and the chain has to run longer. The extended Bayesian skyline plot automatically detects the number of changes, so it could be used as an alternative tree prior.
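The reason the two dimensions must match can be seen from how a skyline is built up. In the sketch below, each group covers a number of consecutive coalescent intervals and has one population size; the sketch assumes that the group sizes sum to the number of coalescent events (one fewer than the number of taxa). The function name and all numeric values are hypothetical.

```python
def expand_skyline(group_sizes, pop_sizes, n_taxa):
    """Return one population size per coalescent interval.

    Assumption: group_sizes[i] consecutive intervals share pop_sizes[i],
    and the group sizes sum to n_taxa - 1 coalescent events.
    """
    if len(group_sizes) != len(pop_sizes):
        raise ValueError("group sizes and population sizes differ in dimension")
    if sum(group_sizes) != n_taxa - 1:
        raise ValueError("group sizes must sum to n_taxa - 1")
    per_interval = []
    for size, pop in zip(group_sizes, pop_sizes):
        per_interval.extend([pop] * size)
    return per_interval


# Hypothetical values for 3 groups and 129 taxa (128 coalescent events).
group_sizes = [43, 43, 42]
pop_sizes = [10.0, 25.0, 40.0]
skyline = expand_skyline(group_sizes, pop_sizes, n_taxa=129)
print(len(skyline), skyline[0], skyline[-1])
```

With three groups the population size can change at most twice along the tree, which is the trade-off between resolution and the number of parameters described above.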
This analysis requires a bit longer to converge, so change the MCMC chain length to 10 million, and the log intervals for the trace log and tree log to 10 thousand. Then save the file and run BEAST.
To plot the population history, load the log file in Tracer and select the menu Analysis/Bayesian Skyline Reconstruction.
A dialog is shown where you can specify the tree file associated with the log file. Also, since the youngest sample is from 2002, change the entry for age of youngest tip to 2002.
After some calculation, a graph appears showing the population history, where the median and 95% HPD intervals are plotted. After selecting the solid interval checkbox, the graph should look something like this.
Questions
By what amount did the effective population size of RSVA grow from 1970 to 2002 according to the BSP?
What are the underlying assumptions of the BSP? Are they violated by this data set?
1.2 Exercise
Change the Bayesian skyline prior to the extended Bayesian skyline plot (EBSP) prior and run until convergence. EBSP produces an extra log file, called EBSP.$(seed).log, where $(seed) is replaced by the seed you used to run BEAST. A plot can be created by running the EBSPAnalyser utility and loading the output file in a spreadsheet.
How many groups are indicated by the EBSP analysis? This is much lower than for BSP. How does this affect the population history plots?
References
[1] Kalina T Zlateva, Philippe Lemey, Elien Moes, Anne-Mieke Vandamme, and Marc Van Ranst, Genetic variability and molecular evolution of the human respiratory syncytial virus subgroup B attachment G protein, J Virol 79 (2005), no. 14, 9157-67.
[2] Kalina T Zlateva, Philippe Lemey, Anne-Mieke Vandamme, and Marc Van Ranst, Molecular evolution and circulation patterns of human respiratory syncytial virus subgroup A: positively selected sites in the attachment G glycoprotein, J Virol 78 (2004), no. 9, 4675-83.