11
RESEARCH ARTICLE SUMMARY MOSQUITO GENOMICS Highly evolvable malaria vectors: The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*Robert M. Waterhouse,* et al. INTRODUCTION: Control of mosquito vectors has historically proven to be an effective means of eliminating malaria. Human malaria is transmitted only by mosquitoes in the genus Anopheles, but not all species within the genus, or even all members of each vector species, are efficient malaria vectors. Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history. RATIONALE: This variation in vectorial ca- pacity suggests an underlying genetic/genomic plasticity that results in variation of key traits determining vectorial capacity within the genus. Sequencing the genome of Anopheles gambiae, the most important malaria vector in sub-Saharan Africa, has offered numerous insights into how that species became highly specialized to live among and feed upon hu- mans and how susceptibility to mosquito control strategies is determined. Until very recently, similar genomic resources have not existed for other anophelines, limiting comparisons to individual genes or sets of ge- nomic markers with no genome-wide data to investigate attributes associated with vectorial capacity across the genus. RESULTS: We sequenced and assembled the genomes and transcriptomes of 16 anophe- lines from Africa, Asia, Europe, and Latin America, spanning ~100 million years of evo- lution and chosen to represent a range of evolutionary distances from An. gambiae,a variety of geographic locations and ecological conditions, and varying degrees of vectorial capacity. Genome assembly quality reflected DNA template quality and homozygosity. De- spite variation in contiguity, the assemblies were remarkably complete and searches for arthropod-wide single-copy orthologs gener- ally revealed few missing genes. Genome an- notation supported with RNA sequencing transcriptomes yielded between 10,738 and 16,149 protein-coding genes for each species. Relative to Drosophila, the closest dipteran genus for which equivalent genomic resources exist, Anopheles exhibits a dynamic genomic evolutionary profile. Comparative analyses show a fivefold faster rate of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses in Anopheles. Some de- terminants of vectorial capacity, such as chemo- sensory genes, do not show elevated turnover but in- stead diversify through protein-sequence changes. We also document evidence of variation in important reproductive phenotypes, genes controlling immunity to Plasmodium ma- laria parasites and other microbes, genes en- coding cuticular and salivary proteins, and genes conferring metabolic insecticide resist- ance. This dynamism of anopheline genes and genomes may contribute to their flexible capacity to take advantage of new ecological niches, in- cluding adapting to humans as primary hosts. CONCLUSIONS: Anopheline mosquitoes ex- hibit a molecular evolutionary profile very dis- tinct from Drosophila, and their genomes harbor strong evidence of functional variation in traits that determine vectorial capacity. These 16 new reference genome assemblies provide a founda- tion for hypothesis generation and testing to further our understanding of the diverse bio- logical traits that determine vectorial capacity. RESEARCH SCIENCE sciencemag.org 2 JANUARY 2015 VOL 347 ISSUE 6217 43 The complete list of authors and affiliations is available in the full article online. *These authors contributed equally to this work. Corresponding author. E-mail: [email protected] (D.E.N.); [email protected] (N.J.B.) Cite this article as D. E. Neafsey et al., Science 347, 1258522 (2015). DOI: 10.1126/science.1258522 Geography, vector status, and molecular phylogeny of the 16 newly sequenced anopheline mosquitoes and selected other dipterans. The maximum likelihood molecular phylogeny of all sequenced anophelines and two mosquito outgroups was constructed from the aligned protein sequences of 1085 single-copy orthologs. Shapes between branch termini and species names indicate vector status and are colored according to geographic ranges depicted on the map. Ma, million years ago. ON OUR WEB SITE Read the full article at http://dx.doi. org/10.1126/ science.1258522 .................................................. on June 10, 2018 http://science.sciencemag.org/ Downloaded from

MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

  • Upload
    lekiet

  • View
    220

  • Download
    2

Embed Size (px)

Citation preview

Page 1: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

RESEARCH ARTICLE SUMMARY

MOSQUITO GENOMICS

Highly evolvable malaria vectors Thegenomes of 16 AnophelesmosquitoesDaniel E Neafseydagger Robert M Waterhouse et al

INTRODUCTION Control of mosquito vectorshas historically proven to be an effective meansof eliminating malaria Human malaria istransmitted only by mosquitoes in the genusAnopheles but not all species within the genusor even all members of each vector species areefficient malaria vectors Variation in vectorialcapacity for human malaria among Anophelesmosquito species is determined bymany factorsincluding behavior immunity and life history

RATIONALE This variation in vectorial ca-pacity suggests an underlying geneticgenomicplasticity that results in variation of key traitsdetermining vectorial capacity within thegenus Sequencing the genome of Anophelesgambiae the most important malaria vectorin sub-Saharan Africa has offered numerousinsights into how that species became highlyspecialized to live among and feed upon hu-mans and how susceptibility to mosquitocontrol strategies is determined Until veryrecently similar genomic resources havenot existed for other anophelines limiting

comparisons to individual genes or sets of ge-nomic markers with no genome-wide data toinvestigate attributes associated with vectorialcapacity across the genus

RESULTSWe sequenced and assembled thegenomes and transcriptomes of 16 anophe-lines from Africa Asia Europe and LatinAmerica spanning ~100million years of evo-lution and chosen to represent a range ofevolutionary distances from An gambiae avariety of geographic locations and ecologicalconditions and varying degrees of vectorialcapacity Genome assembly quality reflectedDNA template quality and homozygosity De-spite variation in contiguity the assemblieswere remarkably complete and searches forarthropod-wide single-copy orthologs gener-ally revealed few missing genes Genome an-notation supported with RNA sequencingtranscriptomes yielded between 10738 and16149 protein-coding genes for each speciesRelative to Drosophila the closest dipterangenus for which equivalent genomic resources

exist Anopheles exhibits a dynamic genomicevolutionary profile Comparative analyses showa fivefold faster rate of gene gain and losselevated gene shuffling on the X chromosomeand more intron losses in Anopheles Some de-terminants of vectorial capacity such as chemo-

sensory genes donot showelevated turnover but in-stead diversify throughprotein-sequence changesWealsodocumentevidenceof variation in importantreproductive phenotypes

genes controlling immunity to Plasmodium ma-laria parasites and other microbes genes en-coding cuticular and salivary proteins andgenes conferring metabolic insecticide resist-ance This dynamism of anopheline genes andgenomesmay contribute to their flexible capacityto take advantage of new ecological niches in-cluding adapting to humans as primary hosts

CONCLUSIONS Anopheline mosquitoes ex-hibit a molecular evolutionary profile very dis-tinct fromDrosophila and their genomes harborstrong evidence of functional variation in traitsthat determine vectorial capacity These 16 newreference genome assemblies provide a founda-tion for hypothesis generation and testing tofurther our understanding of the diverse bio-logical traits that determine vectorial capacity

RESEARCH

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 43

The complete list of authors and affiliations is available inthe full article onlineThese authors contributed equally to this workdaggerCorresponding author E-mail neafseybroadinstituteorg(DEN) nbesanskndedu (NJB)Cite this article as D E Neafsey et al Science 3471258522 (2015) DOI 101126science1258522

Geography vector status and molecular phylogeny of the 16 newly sequenced anopheline mosquitoes and selected other dipteransThe maximum likelihood molecular phylogeny of all sequenced anophelines and two mosquito outgroups was constructed from the alignedprotein sequences of 1085 single-copy orthologs Shapes between branch termini and species names indicate vector status and are coloredaccording to geographic ranges depicted on the map Ma million years ago

ON OUR WEB SITE

Read the full articleat httpdxdoiorg101126science1258522

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

RESEARCH ARTICLE

MOSQUITO GENOMICS

Highly evolvable malaria vectors Thegenomes of 16 AnophelesmosquitoesDaniel E Neafsey1dagger Robert M Waterhouse2345 Mohammad R Abai6

Sergey S Aganezov7 Max A Alekseyev7 James E Allen8 James Amon9 Bruno Arcagrave10

Peter Arensburger11 Gleb Artemov12 Lauren A Assour13 Hamidreza Basseri6

Aaron Berlin1 Bruce W Birren1 Stephanie A Blandin1415 Andrew I Brockman16

Thomas R Burkot17 Austin Burt18 Clara S Chan23 Cedric Chauve19 Joanna C Chiu20

Mikkel Christensen8 Carlo Costantini21 Victoria L M Davidson22 Elena Deligianni23

Tania Dottorini16 Vicky Dritsou24 Stacey B Gabriel25 Wamdaogo M Guelbeogo26

Andrew B Hall27 Mira V Han28 Thaung Hlaing29 Daniel S T Hughes830

Adam M Jenkins31 Xiaofang Jiang3227 Irwin Jungreis23 Evdoxia G Kakani3334

Maryam Kamali35 Petri Kemppainen36 Ryan C Kennedy37 Ioannis K Kirmitzoglou1638

Lizette L Koekemoer39 Njoroge Laban40 Nicholas Langridge8 Mara K N Lawniczak16

Manolis Lirakis41 Neil F Lobo42 Ernesto Lowy8 Robert M MacCallum16

Chunhong Mao43 Gareth Maslen8 Charles Mbogo44 Jenny McCarthy11 Kristin Michel22

Sara N Mitchell33 Wendy Moore45 Katherine A Murphy20 Anastasia N Naumenko35

Tony Nolan16 Eva M Novoa23 Samantha OrsquoLoughlin18 Chioma Oringanje45

Mohammad A Oshaghi6 Nazzy Pakpour46 Philippos A Papathanos1624

Ashley N Peery35 Michael Povelones47 Anil Prakash48 David P Price4950

Ashok Rajaraman19 Lisa J Reimer51 David C Rinker52 Antonis Rokas5253

Tanya L Russell17 NrsquoFale Sagnon26 Maria V Sharakhova35 Terrance Shea1

Felipe A Simatildeo45 Frederic Simard21 Michel A Slotman54 Pradya Somboon55

Vladimir Stegniy12 Claudio J Struchiner5657 Gregg W C Thomas58 Marta Tojo59

Pantelis Topalis23 Joseacute M C Tubio60 Maria F Unger42 John Vontas41

Catherine Walton36 Craig S Wilding61 Judith H Willis62 Yi-Chieh Wu2363

Guiyun Yan64 Evgeny M Zdobnov45 Xiaofan Zhou53 Flaminia Catteruccia3334

George K Christophides16 Frank H Collins42 Robert S Cornman62 Andrea Crisanti1624

Martin J Donnelly5165 Scott J Emrich13 Michael C Fontaine4266 William Gelbart67

Matthew W Hahn6858 Immo A Hansen4950 Paul I Howell69 Fotis C Kafatos16

Manolis Kellis23 Daniel Lawson8 Christos Louis412324 Shirley Luckhart46

Marc A T Muskavitch3170 Joseacute M Ribeiro71 Michael A Riehle45 Igor V Sharakhov3527

Zhijian Tu2732 Laurence J Zwiebel72 Nora J Besansky42dagger

Variation in vectorial capacity for human malaria among Anopheles mosquito species isdetermined by many factors including behavior immunity and life history To investigatethe genomic basis of vectorial capacity and explore new avenues for vector control wesequenced the genomes of 16 anopheline mosquito species from diverse locationsspanning ~100 million years of evolution Comparative analyses show faster rates of genegain and loss elevated gene shuffling on the X chromosome and more intron lossesrelative to Drosophila Some determinants of vectorial capacity such as chemosensorygenes do not show elevated turnover but instead diversify through protein-sequencechanges This dynamism of anopheline genes and genomes may contribute to their flexiblecapacity to take advantage of new ecological niches including adapting to humans asprimary hosts

Malaria is a complex disease mediated byobligate eukaryotic parasites with a lifecycle requiring adaption to both verte-brate hosts and mosquito vectors Theserelationships create a rich coevolution-

ary triangle Just as Plasmodium parasites haveadapted to their diverse hosts and vectors infec-tion by Plasmodium parasites has reciprocally in-duced adaptive evolutionary responses in humansand other vertebrates (1) and has also influenced

mosquito evolution (2) Human malaria is trans-mitted only bymosquitoes in the genus Anophelesbut not all species within the genus or even allmembers of each vector species are efficientmalaria vectors This suggests an underlyinggeneticgenomic plasticity that results in varia-tion of key traits determining vectorial capacitywithin the genusIn all five species of Plasmodium have adapted

to infect humans and are transmitted by ~60 of

the 450 known species of anopheline mosqui-toes (3) Sequencing the genome of Anophelesgambiae the most important malaria vector insub-Saharan Africa has offered numerous in-sights into how that species became highly spe-cialized to live among and feed upon humansand how susceptibility to mosquito control strat-egies is determined (4) Until very recently (5ndash7)similar genomic resources have not existed forother anophelines limiting comparisons to in-dividual genes or sets of genomic markers withno genome-wide data to investigate attributes as-sociated with vectorial capacity across the genusThus we sequenced and assembled the ge-

nomes and transcriptomes of 16 anophelinesfrom Africa Asia Europe and Latin AmericaWe chose these 16 species to represent a rangeof evolutionary distances from An gambiae avariety of geographic locations and ecologicalconditions and varying degrees of vectorialcapacity (8) (Fig 1 A and B) For example Anquadriannulatus although extremely closelyrelated to An gambiae feeds preferentially onbovines rather than humans limiting its poten-tial to transmit human malaria An merus Anmelas An farauti and An albimanus femalescan lay eggs in salty or brackish water insteadof the freshwater sites required by other speciesWith a focus on species most closely related toAn gambiae (9) the sampled anophelines spanthe three main subgenera that shared a commonancestor ~100 million years ago (Ma) (10)

Materials and methods summary

Genomic DNA and whole-body RNA were ob-tained from laboratory colonies and wild-caughtspecimens (tables S1 and S2) with samples fornine species procured from newly establishedisofemale colonies to reduce heterozygosityIllumina sequencing libraries spanning a rangeof insert sizes were constructed with ~100-foldpaired-end 101ndashbase pair (bp) coverage gener-ated for small (180 bp) and medium (15 kb) in-sert libraries and lower coverage for large (38 kb)insert libraries (table S3) DNA template for thesmall and medium input libraries was sourcedfrom single femalemosquitoes from each speciesto further reduce heterozygosity High-molecular-weight DNA template for each large insert li-brary was derived from pooled DNA obtainedfrom several hundred mosquitoes ALLPATHS-LG(11) genome assemblies were produced using theldquohaploidifyrdquo option to reduce haplotype assem-blies caused by high heterozygosity Assemblyquality reflectedDNA template quality and homo-zygosity with a mean scaffold N50 of 36 Mbranging to 181 Mb for An albimanus (table S4)Despite variation in contiguity the assemblies wereremarkably complete and searches for arthropod-wide single-copy orthologs generally revealedfew missing genes (fig S1) (12)Genome annotation with MAKER (13) sup-

ported with RNA sequencing (RNAseq) of tran-scriptomes (produced from pooled male andfemale larvae pupae and adults) (table S5) andcomprehensive noncoding RNA gene prediction(fig S2) yielded relatively complete gene sets

RESEARCH

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-1

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

(fig S3) with between 10738 and 16149 protein-coding genes identified for each species Genecount was generally commensurate with assem-bly contiguity (table S6) Some of this variation intotal gene counts may be attributed to the chal-lenges of gene annotations with variable levelsof assembly contiguity and supporting RNAseqdata To estimate the prevalence of erroneousgene model fusions andor fragmentations wecompared the new gene annotations to An gam-biae gene models and found an average of 33and 97 potentially fused and fragmented genemodels respectively For analyses described be-low that may be sensitive to variation in genemodel accuracy or gene set completeness wehave conducted sensitivity analyses to rule outconfounding results from these factors (12)

Rapidly evolving genes and genomes

Orthology delineation identified lineage-restrictedand species-specific genes as well as ancient genesfound across insect taxa of which universal single-copy orthologs were employed to estimate themolecular species phylogeny (Fig 1 B and C andfig S4) Analysis of codon frequencies in theseorthologs revealed that anophelines unlike dro-sophilids exhibit relatively uniform codon usagepreferences (fig S5)Polytene chromosomes have provided a glimpse

into anopheline chromosome evolution (14) Ourgenome-sequencendashbased view confirmed the

cytological observations and offers many newinsights At the base-pair level ~90 of the non-gapped and nonmasked An gambiae genome(ie excluding transposable elements as detailedin table S7) is alignable to the most closely re-lated species whereas only ~13 aligns to themost distant (Fig 1D fig S6 and table S8) withreduced alignability in centromeres and on theX chromosome (Fig 1D) At chromosomal lev-els mapping data anchored 35 to 76 of the Anstephensi An funestus An atroparvus and Analbimanus genome assemblies to chromosomalarms (tables S9 to S12) Analysis of genes in an-chored regions showed that synteny at the whole-arm level is highly conserved despite severalwhole-arm translocations (Fig 2A and table S13)In contrast small-scale rearrangements disruptgene colinearity within arms over time leadingto extensive shuffling of gene order over a timescale of 29 million years or more (10 15) (Fig2B and fig S7) As in Drosophila rearrangementrates are higher on the X chromosome than onautosomes (Fig 2C and tables S14 to S16)However the difference is significantly morepronounced in Anopheles where X chromosomerearrangements are more frequent by a factorof 27 than autosomal rearrangements in Dro-sophila the corresponding ratio is only 12 (t testt10 = 73 P lt 1 times 10minus5) (fig S8) The X chromo-some is also notable for a significant degree ofobserved gene movement to other chromosomes

relative to Drosophila (one sample proportiontest P lt 22 times 10minus16) (Fig 2D and tables S17 andS18) as was previously noted for Anophelesrelative to Aedes (16) further underscoring itsdistinctive evolutionary profile in Anopheles com-pared with other dipteran generaSuch dynamic gene shuffling and movement

may be facilitated by themultiple families of DNAtransposons and long terminal repeat (LTR) andnon-LTR retroelements found in all genomes(table S7) as well as a weaker dosage compen-sation phenotype in Anopheles compared withDrosophila (17) Despite such shuffling com-paring genomic locations of orthologs can besuccessfully employed to reconstruct ancestralchromosomal arrangements (fig S9) and to con-fidently improve assembly contiguity (tables S19to S21)Copy-number variation in homologous gene

families also reveals striking evolutionary dy-namism Analysis of 11636 gene families withCAFE 3 (18) indicates a rate of gene gainlosshigher by a factor of at least 5 than that ob-served for 12 Drosophila genomes (19) Overallthese Anopheles genomes exhibit a rate of gainor loss per gene per million years of 312 times 10minus3

compared with 590 times 10minus4 for Drosophila sug-gesting substantially higher gene turnover with-in anophelines relative to fruit flies This fivefoldgreater gainloss rate in anophelines holds trueunder models that account for uncertainty in

1258522-2 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

1Genome Sequencing and Analysis Program Broad Institute 415 Main Street Cambridge MA 02142 USA 2Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology32 Vassar Street Cambridge MA 02139 USA 3The Broad Institute of Massachusetts Institute of Technology and Harvard 415 Main Street Cambridge MA 02142 USA 4Department of Genetic Medicineand Development University of Geneva Medical School Rue Michel-Servet 1 1211 Geneva Switzerland 5Swiss Institute of Bioinformatics Rue Michel-Servet 1 1211 Geneva Switzerland 6Department ofMedical Entomology and Vector Control School of Public Health and Institute of Health Researches Tehran University of Medical Sciences Tehran Iran 7George Washington University Department ofMathematics and Computational Biology Institute 45085 University Drive Ashburn VA 20147 USA 8European Molecular Biology Laboratory European Bioinformatics Institute EMBL-EBI Wellcome TrustGenome Campus Hinxton Cambridge CB10 1SD UK 9National Vector Borne Disease Control Programme Ministry of Health Tafea Province Vanuatu 10Department of Public Health and InfectiousDiseases Division of Parasitology Sapienza University of Rome Piazzale Aldo Moro 5 00185 Rome Italy 11Department of Biological Sciences California State PolytechnicndashPomona 3801 West TempleAvenue Pomona CA 91768 USA 12Tomsk State University 36 Lenina Avenue Tomsk Russia 13Department of Computer Science and Engineering Eck Institute for Global Health 211B Cushing HallUniversity of Notre Dame Notre Dame IN 46556 USA 14Inserm U963 F-67084 Strasbourg France 15CNRS UPR9022 IBMC F-67084 Strasbourg France 16Department of Life Sciences Imperial CollegeLondon South Kensington Campus London SW7 2AZ UK 17Faculty of Medicine Health and Molecular Science Australian Institute of Tropical Health Medicine James Cook University Cairns 4870Australia 18Department of Life Sciences Imperial College London Silwood Park Campus Ascot SL5 7PY UK 19Department of Mathematics Simon Fraser University 8888 University Drive Burnaby BCV5A 1S6 Canada 20Department of Entomology and Nematology One Shields Avenue University of CaliforniandashDavis Davis CA 95616 USA 21Institut de Recherche pour le Deacuteveloppement Uniteacutes Mixtesde Recherche Maladies Infectieuses et Vecteurs Eacutecologie Geacuteneacutetique Eacutevolution et Controcircle 911 Avenue Agropolis BP 64501 Montpellier France 22Division of Biology Kansas State University 271 ChalmersHall Manhattan KS 66506 USA 23Institute of Molecular Biology and Biotechnology Foundation for Research and Technology Hellas Nikolaou Plastira 100 GR-70013 Heraklion Crete Greece 24Centre ofFunctional Genomics University of Perugia Perugia Italy 25Genomics Platform Broad Institute 415 Main Street Cambridge MA 02142 USA 26Centre National de Recherche et de Formation sur lePaludisme Ouagadougou 01 BP 2208 Burkina Faso 27Program of Genetics Bioinformatics and Computational Biology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA28School of Life Sciences University of Nevada Las Vegas NV 89154 USA 29Department of Medical Research No 5 Ziwaka Road Dagon Township Yangon 11191 Myanmar 30Baylor College of Medicine1 Baylor Plaza Houston TX 77030 USA 31Boston College 140 Commonwealth Avenue Chestnut Hill MA 02467 USA 32Department of Biochemistry Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 33Harvard School of Public Health Department of Immunology and Infectious Diseases Boston MA 02115 USA 34Dipartimento di Medicina Sperimentale e ScienzeBiochimiche Universitagrave degli Studi di Perugia Perugia Italy 35Department of Entomology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA 36Computational EvolutionaryBiology Group Faculty of Life Sciences University of Manchester Oxford Road Manchester M13 9PT UK 37Department of Bioengineering and Therapeutic Sciences University of California San FranciscoCA 94143 USA 38Bioinformatics Research Laboratory Department of Biological Sciences New Campus University of Cyprus CY 1678 Nicosia Cyprus 39Wits Research Institute for Malaria Faculty ofHealth Sciences and Vector Control Reference Unit National Institute for Communicable Diseases of the National Health Laboratory Service Sandringham 2131 Johannesburg South Africa 40NationalMuseums of Kenya PO Box 40658-00100 Nairobi Kenya 41Department of Biology University of Crete 700 13 Heraklion Greece 42Eck Institute for Global Health and Department of Biological SciencesUniversity of Notre Dame 317 Galvin Life Sciences Building Notre Dame IN 46556 USA 43Virginia Bioinformatics Institute 1015 Life Science Circle Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 44Kenya Medical Research Institute-Wellcome Trust Research Programme Centre for Geographic Medicine Research - Coast PO Box 230-80108 Kilifi Kenya 45Departmentof Entomology 1140 East South Campus Drive Forbes 410 University of Arizona Tucson AZ 85721 USA 46Department of Medical Microbiology and Immunology School of Medicine University ofCalifornia Davis One Shields Avenue Davis CA 95616 USA 47Department of Pathobiology University of Pennsylvania School of Veterinary Medicine 3800 Spruce Street Philadelphia PA 19104 USA48Regional Medical Research Centre NE Indian Council of Medical Research PO Box 105 Dibrugarh-786 001 Assam India 49Department of Biology New Mexico State University Las Cruces NM 88003USA 50Molecular Biology Program New Mexico State University Las Cruces NM 88003 USA 51Department of Vector Biology Liverpool School of Tropical Medicine Pembroke Place Liverpool L3 5QAUK 52Center for Human Genetics Research Vanderbilt University Medical Center Nashville TN 37235 USA 53Department of Biological Sciences Vanderbilt University Nashville TN 37235 USA54Department of Entomology Texas AampM University College Station TX 77807 USA 55Department of Parasitology Faculty of Medicine Chiang Mai University Chiang Mai 50200 Thailand 56FundaccedilatildeoOswaldo Cruz Avenida Brasil 4365 RJ Brazil 57Instituto de Medicina Social Universidade do Estado do Rio de Janeiro Rio de Janeiro Brazil 58School of Informatics and Computing Indiana UniversityBloomington IN 47405 USA 59Department of Physiology School of Medicine Center for Research in Molecular Medicine and Chronic Diseases Instituto de Investigaciones Sanitarias University ofSantiago de Compostela Santiago de Compostela A Coruntildea Spain 60Wellcome Trust Sanger Institute Hinxton Cambridgeshire CB10 1SA UK 61School of Natural Sciences and Psychology LiverpoolJohn Moores University Liverpool L3 3AF UK 62Department of Cellular Biology University of Georgia Athens GA 30602 USA 63Department of Computer Science Harvey Mudd College Claremont CA91711 USA 64Program in Public Health College of Health Sciences University of California Irvine Hewitt Hall Irvine CA 92697 USA 65Malaria Programme Wellcome Trust Sanger Institute CambridgeCB10 1SJ UK 66Centre of Evolutionary and Ecological Studies (Marine Evolution and Conservation group) University of Groningen Nijenborgh 7 NL-9747 AG Groningen Netherlands 67Department ofMolecular and Cellular Biology Harvard University 16 Divinity Avenue Cambridge MA 02138 USA 68Department of Biology Indiana University Bloomington IN 47405 USA 69Centers for Disease Controland Prevention 1600 Clifton Road NE MSG49 Atlanta GA 30329 USA 70Biogen Idec 14 Cambridge Center Cambridge MA 02142 USA 71Laboratory of Malaria and Vector Research National Institute ofAllergy and Infectious Diseases 12735 Twinbrook Parkway Rockville MD 20852 USA 72Departments of Biological Sciences and Pharmacology Institutes for Chemical Biology Genetics and Global HealthVanderbilt University and Medical Center Nashville TN 37235 USAThese authors contributed equally to this work daggerCorresponding author E-mail neafseybroadinstituteorg (DEN) nbesanskndedu (NJB)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene family sizes at the tips of the species treedue to annotation or assembly errors and is notsensitive to inclusion or exclusion of taxa affect-ing the root age of the tree nor to the exclusionof taxa with the poorest assemblies and genesets (fig S10 and tables S22 and S23) Examplesinclude expansions of cuticular proteins in Anarabiensis and neurotransmitter-gated ion chan-nels in An albimanus (table S24)The evolutionary dynamismofAnopheles genes

extends to their architecture Comparisons ofsingle-copy orthologs at deeper phylogeneticdepths showed losses of introns at the root ofthe true fly order Diptera and revealed con-tinued losses as the group diversified into thelineages leading to fruit flies and mosquitoesHowever anopheline orthologs have sustainedgreater intron loss than drosophilids leadingto a relative paucity of introns in the genes ofextant anophelines (fig S11 and table S25) Com-parative analysis also revealed that gene fu-sion and fission played a substantial role in theevolution of mosquito genes with apparent re-arrangements affecting an average of 101 of

all genes in the genomes of the 10 specieswith the most contiguous assemblies (fig S12)Furthermore gene boundaries can be flexiblewhole genome alignments identified 325 can-didates for stop-codon readthrough (fig S13 andtable S26)Becausemolecular evolution of protein-coding

sequences is a well-known source of phenotypicchange we compared evolutionary rates amongdifferent functional categories of anopheline ortho-logs We quantified evolutionary divergence intermsof protein sequence identity of alignedortho-logs and the dNdS statistic (ratio of nonsynon-ymous to synonymous substitutions) computedusing PAML (12 20) Among curated sets of geneslinked to vectorial capacity or species-specific traitsagainst a background of functional categories de-fined by Gene Ontology or InterPro annotationsodorant and gustatory receptors showhigh evolu-tionary rates and male accessory gland proteinsexhibit exceptionally high dNdS ratios (Fig 3 figsS14 and S15 and tables S27 to S29) Rapid diver-gence in functional categories related to malariatransmission andor mosquito control strategies

led us to examine the genomic basis of severalfacets of anopheline biology in closer detail

Insights into mosquito biology andvectorial capacity

Mosquito reproductive biology evolves rapidlyand presents a compelling target for vector con-trol This is exemplified by the An gambiaemaleaccessory gland protein (Acp) cluster on chromo-some 3R (21 22) where conservation is mostlylost outside the An gambiae species complex(fig S16) In Drosophila male-biased genes suchas Acps tend to evolve faster than loci withoutmale-biased expression (23ndash25) We looked fora similar pattern in anophelines after assessingeach gene for sex-biased expression using micro-array and RNAseq data sets for An gambiae (12)In contrast to Drosophila female-biased genesshowdramatically faster rates of evolution acrossthe genus thanmale-biased genes (Wilcoxon ranksum test P = 5 times 10minus4) (fig S17)Differences in reproductive genes among

anophelines may provide insight into the or-igin and function of sex-related traits During

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-3

An gambiae

An christyi

An epiroticus

An stephensi

An maculatus

An culicifacies

An minimus

An funestus

An dirus

An farauti

An atroparvus

An sinensis

An albimanus

An darlingi

Culex quinquefasciatus

Aedes aegypti

An arabiensis

Culicinae

Anophelinae

Cellia

D melanogaster

D pseudoobscuraD mojavensis

Glossina morsitans

10000 200000

MuscomorphaDrosophilidae

Culicidae

An merus

An melas

An quadri-annulatus

5000 15000

GeneOrthology

Anophelinae

Culicinae

Culicidae

Diptera

Widespread

Universal

None

Number of Genes

100 Ma

260 Ma

005ss

MolecularPhylogeny

Nyssorhynchus

Anopheles

Pyretophorus

050 1

2R3R2L3LX

Genome Alignability

30 Ma

gambiaecomplex

minor

Vector Statusmajor

non

Geography

A B C

D

Fig 1 Geography vector status molecular phylogeny gene orthologyand genome alignability of the 16 newly sequenced anopheline mosqui-toes and selected other dipterans (A) Global geographic distributions ofthe 16 sampled anophelines and the previously sequenced An gambiae andAn darlingi Ranges are colored for each species or group of species as shownin (B) eg light blue for An farauti (B) The maximum likelihood molecularphylogeny of all sequenced anophelines and selected dipteran outgroupsShapes between branch termini and species names indicate vector status

(rectangles major vectors ellipses minor vectors triangles nonvectors) andare colored according to geographic ranges shown in (A) (C) Bar plots showtotal gene counts for each species partitioned according to their orthologyprofiles from ancient genes found across insects to lineage-restricted andspecies-specific genes (D) Heat map illustrating the density (in 2-kb slidingwindows) of whole-genome alignments along the lengths of An gambiaechromosomal arms fromwhite where An gambiae aligns to no other speciesto red where An gambiae aligns to all the other anophelines

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

copulation An gambiae males transfer a gela-tinous mating plug a complex of seminal pro-teins lipids and hormones that are essentialfor successful sperm storage by females and forreproductive success (26ndash28) Coagulation ofthe plug is mediated by a seminal transgluta-minase (TG3) which is found in anophelinesbut is absent in other mosquito genera that donot form a mating plug (26) We examined TG3and its two paralogs (TG1 and TG2) in the se-quenced anophelines and investigated the rateof evolution of each gene (Fig 4A) Silent siteswere saturated at the whole-genus level makingdS difficult to estimate reliably but TG1 (thegene presumed to be ancestral owing to broad-est taxonomic representation) exhibited the lowestrate of amino acid change (dN = 020) TG2 ex-hibited an intermediate rate (dN = 093) and theanopheline-specific TG3 has evolved even morerapidly (dN = 150) perhaps because of malemale or malefemale evolutionary conflict Notethat plug formation appears to be a derived traitwithin anophelines because it is not exhibited byAn albimanus and intermediate poorly coagu-lated plugs were observed in taxa descendingfrom early-branching lineages within the genus(table S30) Functional studies of mating plugs

will be necessary to understand what drove theorigin and rapid evolution of TG3Proteins that constitute themosquito cuticular

exoskeleton play important roles in diverse as-pects of anopheline biology including develop-ment ecology and insecticide resistance andconstitute approximately 2 of all protein-codinggenes (29) Comparisons among dipterans haverevealed numerous amplifications of cuticularprotein (CP) genes undergoing concerted evo-lution at physically clustered loci (30ndash33) Weinvestigated the extent and time scale of genecluster homogenization within anophelines bygenerating phylogenies of orthologous geneclusters (fig S18 and table S31) Throughout thegenus these gene clusters often group phylo-genetically by species rather than by positionwithin tandem arrays particularly in a subsetof clusters These include the 3RB and 3RC clus-ters of CP genes (30) the CPLCG group A andCPLCW clusters found elsewhere on 3R (32) andsix tandemly arrayed genes on 3L designatedCPFL2 through CPFL7 (34) CPLCW genes occurin a head-to-head arrangementwithCPLCGgroupA genes and exhibit highly conserved intergenicsequences (fig S19) Furthermore transcript lo-calization studies using in situ hybridization

revealed identical spatial expression patterns forCLPCW and CPLCG group A gene pairs sugges-tive of coregulation (fig S19) For these five geneclusters complete grouping by organismal lin-eage was observed for most deep nodes as wellas formany individual species outside the shallowAn gambiae species complex (Fig 4B) consistentwith a relatively rapid (less than 20 million years)homogenization of sequences via concerted evo-lution The emerging pattern of anopheline CPevolution is thus one of relative stasis for a ma-jority of single-copy orthologs juxtaposed withconsistent concerted evolution of a subset of genesAnophelines identify hosts oviposition sites

and other environmental cues through special-ized chemosensory membrane-bound receptorsWe examined three of the major gene familiesthat encode these molecules the odorant recep-tors (ORs) gustatory receptors (GRs) and variantionotropic glutamate receptors (IRs) Given therapid chemosensory gene turnover observed inmany other insects we explored whether vary-ing host preferences of anopheline mosquitoescould be attributed to chemosensory gene gainsand losses Unexpectedly in light of the elevatedgenome-wide rate of gene turnover we found thatthe overall size and content of the chemosensory

1258522-4 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 2 Patterns of anopheline chromosomal evolution (A) Anopheline ge-nomes have conserved gene membership on chromosome arms (ldquoelementsrdquocolored and labeled 1 to 5) UnlikeDrosophila chromosome elements reshufflebetween chromosomes via translocations as intact elements and do notshow fissions or fusionsThe tree depicts the supported molecular topologyfor the species studied (B) Conserved synteny blocks decay rapidly withinchromosomal arms as the phylogenetic distance increases between spe-cies Moving left to right the dot-plot panels show gene-level synteny betweenchromosome 2R of An gambiae (x axis) and inferred ancestral sequences(y axes inferred using PATHGROUPS) at increasing evolutionary time scales(million years ago) estimated by an ultrametric phylogeny Gray horizontal

lines represent scaffold breaks Discontinuity of the red linesdots indicatesrearrangement (C) Anopheline X chromosomes exhibit higher rates of re-arrangement (P lt 1 times 10minus5) measured as breaks per Mb per million yearscompared with autosomes despite a paucity of polymorphic inversions onthe X (D) The anopheline X chromosome also displays a higher rate of genemovement to other chromosomal arms than any of the autosomes Chromo-somal elements are labeled around the perimeter internal bands are coloredaccording to the chromosomal element source and match element colors in(A) and (C) Bands are sized to indicate the relative ratio of genes importedversus exported for each chromosomal element and the relative allocation ofexported genes to other elements

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 2: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

RESEARCH ARTICLE

MOSQUITO GENOMICS

Highly evolvable malaria vectors Thegenomes of 16 AnophelesmosquitoesDaniel E Neafsey1dagger Robert M Waterhouse2345 Mohammad R Abai6

Sergey S Aganezov7 Max A Alekseyev7 James E Allen8 James Amon9 Bruno Arcagrave10

Peter Arensburger11 Gleb Artemov12 Lauren A Assour13 Hamidreza Basseri6

Aaron Berlin1 Bruce W Birren1 Stephanie A Blandin1415 Andrew I Brockman16

Thomas R Burkot17 Austin Burt18 Clara S Chan23 Cedric Chauve19 Joanna C Chiu20

Mikkel Christensen8 Carlo Costantini21 Victoria L M Davidson22 Elena Deligianni23

Tania Dottorini16 Vicky Dritsou24 Stacey B Gabriel25 Wamdaogo M Guelbeogo26

Andrew B Hall27 Mira V Han28 Thaung Hlaing29 Daniel S T Hughes830

Adam M Jenkins31 Xiaofang Jiang3227 Irwin Jungreis23 Evdoxia G Kakani3334

Maryam Kamali35 Petri Kemppainen36 Ryan C Kennedy37 Ioannis K Kirmitzoglou1638

Lizette L Koekemoer39 Njoroge Laban40 Nicholas Langridge8 Mara K N Lawniczak16

Manolis Lirakis41 Neil F Lobo42 Ernesto Lowy8 Robert M MacCallum16

Chunhong Mao43 Gareth Maslen8 Charles Mbogo44 Jenny McCarthy11 Kristin Michel22

Sara N Mitchell33 Wendy Moore45 Katherine A Murphy20 Anastasia N Naumenko35

Tony Nolan16 Eva M Novoa23 Samantha OrsquoLoughlin18 Chioma Oringanje45

Mohammad A Oshaghi6 Nazzy Pakpour46 Philippos A Papathanos1624

Ashley N Peery35 Michael Povelones47 Anil Prakash48 David P Price4950

Ashok Rajaraman19 Lisa J Reimer51 David C Rinker52 Antonis Rokas5253

Tanya L Russell17 NrsquoFale Sagnon26 Maria V Sharakhova35 Terrance Shea1

Felipe A Simatildeo45 Frederic Simard21 Michel A Slotman54 Pradya Somboon55

Vladimir Stegniy12 Claudio J Struchiner5657 Gregg W C Thomas58 Marta Tojo59

Pantelis Topalis23 Joseacute M C Tubio60 Maria F Unger42 John Vontas41

Catherine Walton36 Craig S Wilding61 Judith H Willis62 Yi-Chieh Wu2363

Guiyun Yan64 Evgeny M Zdobnov45 Xiaofan Zhou53 Flaminia Catteruccia3334

George K Christophides16 Frank H Collins42 Robert S Cornman62 Andrea Crisanti1624

Martin J Donnelly5165 Scott J Emrich13 Michael C Fontaine4266 William Gelbart67

Matthew W Hahn6858 Immo A Hansen4950 Paul I Howell69 Fotis C Kafatos16

Manolis Kellis23 Daniel Lawson8 Christos Louis412324 Shirley Luckhart46

Marc A T Muskavitch3170 Joseacute M Ribeiro71 Michael A Riehle45 Igor V Sharakhov3527

Zhijian Tu2732 Laurence J Zwiebel72 Nora J Besansky42dagger

Variation in vectorial capacity for human malaria among Anopheles mosquito species isdetermined by many factors including behavior immunity and life history To investigatethe genomic basis of vectorial capacity and explore new avenues for vector control wesequenced the genomes of 16 anopheline mosquito species from diverse locationsspanning ~100 million years of evolution Comparative analyses show faster rates of genegain and loss elevated gene shuffling on the X chromosome and more intron lossesrelative to Drosophila Some determinants of vectorial capacity such as chemosensorygenes do not show elevated turnover but instead diversify through protein-sequencechanges This dynamism of anopheline genes and genomes may contribute to their flexiblecapacity to take advantage of new ecological niches including adapting to humans asprimary hosts

Malaria is a complex disease mediated byobligate eukaryotic parasites with a lifecycle requiring adaption to both verte-brate hosts and mosquito vectors Theserelationships create a rich coevolution-

ary triangle Just as Plasmodium parasites haveadapted to their diverse hosts and vectors infec-tion by Plasmodium parasites has reciprocally in-duced adaptive evolutionary responses in humansand other vertebrates (1) and has also influenced

mosquito evolution (2) Human malaria is trans-mitted only bymosquitoes in the genus Anophelesbut not all species within the genus or even allmembers of each vector species are efficientmalaria vectors This suggests an underlyinggeneticgenomic plasticity that results in varia-tion of key traits determining vectorial capacitywithin the genusIn all five species of Plasmodium have adapted

to infect humans and are transmitted by ~60 of

the 450 known species of anopheline mosqui-toes (3) Sequencing the genome of Anophelesgambiae the most important malaria vector insub-Saharan Africa has offered numerous in-sights into how that species became highly spe-cialized to live among and feed upon humansand how susceptibility to mosquito control strat-egies is determined (4) Until very recently (5ndash7)similar genomic resources have not existed forother anophelines limiting comparisons to in-dividual genes or sets of genomic markers withno genome-wide data to investigate attributes as-sociated with vectorial capacity across the genusThus we sequenced and assembled the ge-

nomes and transcriptomes of 16 anophelinesfrom Africa Asia Europe and Latin AmericaWe chose these 16 species to represent a rangeof evolutionary distances from An gambiae avariety of geographic locations and ecologicalconditions and varying degrees of vectorialcapacity (8) (Fig 1 A and B) For example Anquadriannulatus although extremely closelyrelated to An gambiae feeds preferentially onbovines rather than humans limiting its poten-tial to transmit human malaria An merus Anmelas An farauti and An albimanus femalescan lay eggs in salty or brackish water insteadof the freshwater sites required by other speciesWith a focus on species most closely related toAn gambiae (9) the sampled anophelines spanthe three main subgenera that shared a commonancestor ~100 million years ago (Ma) (10)

Materials and methods summary

Genomic DNA and whole-body RNA were ob-tained from laboratory colonies and wild-caughtspecimens (tables S1 and S2) with samples fornine species procured from newly establishedisofemale colonies to reduce heterozygosityIllumina sequencing libraries spanning a rangeof insert sizes were constructed with ~100-foldpaired-end 101ndashbase pair (bp) coverage gener-ated for small (180 bp) and medium (15 kb) in-sert libraries and lower coverage for large (38 kb)insert libraries (table S3) DNA template for thesmall and medium input libraries was sourcedfrom single femalemosquitoes from each speciesto further reduce heterozygosity High-molecular-weight DNA template for each large insert li-brary was derived from pooled DNA obtainedfrom several hundred mosquitoes ALLPATHS-LG(11) genome assemblies were produced using theldquohaploidifyrdquo option to reduce haplotype assem-blies caused by high heterozygosity Assemblyquality reflectedDNA template quality and homo-zygosity with a mean scaffold N50 of 36 Mbranging to 181 Mb for An albimanus (table S4)Despite variation in contiguity the assemblies wereremarkably complete and searches for arthropod-wide single-copy orthologs generally revealedfew missing genes (fig S1) (12)Genome annotation with MAKER (13) sup-

ported with RNA sequencing (RNAseq) of tran-scriptomes (produced from pooled male andfemale larvae pupae and adults) (table S5) andcomprehensive noncoding RNA gene prediction(fig S2) yielded relatively complete gene sets

RESEARCH

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-1

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

(fig S3) with between 10738 and 16149 protein-coding genes identified for each species Genecount was generally commensurate with assem-bly contiguity (table S6) Some of this variation intotal gene counts may be attributed to the chal-lenges of gene annotations with variable levelsof assembly contiguity and supporting RNAseqdata To estimate the prevalence of erroneousgene model fusions andor fragmentations wecompared the new gene annotations to An gam-biae gene models and found an average of 33and 97 potentially fused and fragmented genemodels respectively For analyses described be-low that may be sensitive to variation in genemodel accuracy or gene set completeness wehave conducted sensitivity analyses to rule outconfounding results from these factors (12)

Rapidly evolving genes and genomes

Orthology delineation identified lineage-restrictedand species-specific genes as well as ancient genesfound across insect taxa of which universal single-copy orthologs were employed to estimate themolecular species phylogeny (Fig 1 B and C andfig S4) Analysis of codon frequencies in theseorthologs revealed that anophelines unlike dro-sophilids exhibit relatively uniform codon usagepreferences (fig S5)Polytene chromosomes have provided a glimpse

into anopheline chromosome evolution (14) Ourgenome-sequencendashbased view confirmed the

cytological observations and offers many newinsights At the base-pair level ~90 of the non-gapped and nonmasked An gambiae genome(ie excluding transposable elements as detailedin table S7) is alignable to the most closely re-lated species whereas only ~13 aligns to themost distant (Fig 1D fig S6 and table S8) withreduced alignability in centromeres and on theX chromosome (Fig 1D) At chromosomal lev-els mapping data anchored 35 to 76 of the Anstephensi An funestus An atroparvus and Analbimanus genome assemblies to chromosomalarms (tables S9 to S12) Analysis of genes in an-chored regions showed that synteny at the whole-arm level is highly conserved despite severalwhole-arm translocations (Fig 2A and table S13)In contrast small-scale rearrangements disruptgene colinearity within arms over time leadingto extensive shuffling of gene order over a timescale of 29 million years or more (10 15) (Fig2B and fig S7) As in Drosophila rearrangementrates are higher on the X chromosome than onautosomes (Fig 2C and tables S14 to S16)However the difference is significantly morepronounced in Anopheles where X chromosomerearrangements are more frequent by a factorof 27 than autosomal rearrangements in Dro-sophila the corresponding ratio is only 12 (t testt10 = 73 P lt 1 times 10minus5) (fig S8) The X chromo-some is also notable for a significant degree ofobserved gene movement to other chromosomes

relative to Drosophila (one sample proportiontest P lt 22 times 10minus16) (Fig 2D and tables S17 andS18) as was previously noted for Anophelesrelative to Aedes (16) further underscoring itsdistinctive evolutionary profile in Anopheles com-pared with other dipteran generaSuch dynamic gene shuffling and movement

may be facilitated by themultiple families of DNAtransposons and long terminal repeat (LTR) andnon-LTR retroelements found in all genomes(table S7) as well as a weaker dosage compen-sation phenotype in Anopheles compared withDrosophila (17) Despite such shuffling com-paring genomic locations of orthologs can besuccessfully employed to reconstruct ancestralchromosomal arrangements (fig S9) and to con-fidently improve assembly contiguity (tables S19to S21)Copy-number variation in homologous gene

families also reveals striking evolutionary dy-namism Analysis of 11636 gene families withCAFE 3 (18) indicates a rate of gene gainlosshigher by a factor of at least 5 than that ob-served for 12 Drosophila genomes (19) Overallthese Anopheles genomes exhibit a rate of gainor loss per gene per million years of 312 times 10minus3

compared with 590 times 10minus4 for Drosophila sug-gesting substantially higher gene turnover with-in anophelines relative to fruit flies This fivefoldgreater gainloss rate in anophelines holds trueunder models that account for uncertainty in

1258522-2 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

1Genome Sequencing and Analysis Program Broad Institute 415 Main Street Cambridge MA 02142 USA 2Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology32 Vassar Street Cambridge MA 02139 USA 3The Broad Institute of Massachusetts Institute of Technology and Harvard 415 Main Street Cambridge MA 02142 USA 4Department of Genetic Medicineand Development University of Geneva Medical School Rue Michel-Servet 1 1211 Geneva Switzerland 5Swiss Institute of Bioinformatics Rue Michel-Servet 1 1211 Geneva Switzerland 6Department ofMedical Entomology and Vector Control School of Public Health and Institute of Health Researches Tehran University of Medical Sciences Tehran Iran 7George Washington University Department ofMathematics and Computational Biology Institute 45085 University Drive Ashburn VA 20147 USA 8European Molecular Biology Laboratory European Bioinformatics Institute EMBL-EBI Wellcome TrustGenome Campus Hinxton Cambridge CB10 1SD UK 9National Vector Borne Disease Control Programme Ministry of Health Tafea Province Vanuatu 10Department of Public Health and InfectiousDiseases Division of Parasitology Sapienza University of Rome Piazzale Aldo Moro 5 00185 Rome Italy 11Department of Biological Sciences California State PolytechnicndashPomona 3801 West TempleAvenue Pomona CA 91768 USA 12Tomsk State University 36 Lenina Avenue Tomsk Russia 13Department of Computer Science and Engineering Eck Institute for Global Health 211B Cushing HallUniversity of Notre Dame Notre Dame IN 46556 USA 14Inserm U963 F-67084 Strasbourg France 15CNRS UPR9022 IBMC F-67084 Strasbourg France 16Department of Life Sciences Imperial CollegeLondon South Kensington Campus London SW7 2AZ UK 17Faculty of Medicine Health and Molecular Science Australian Institute of Tropical Health Medicine James Cook University Cairns 4870Australia 18Department of Life Sciences Imperial College London Silwood Park Campus Ascot SL5 7PY UK 19Department of Mathematics Simon Fraser University 8888 University Drive Burnaby BCV5A 1S6 Canada 20Department of Entomology and Nematology One Shields Avenue University of CaliforniandashDavis Davis CA 95616 USA 21Institut de Recherche pour le Deacuteveloppement Uniteacutes Mixtesde Recherche Maladies Infectieuses et Vecteurs Eacutecologie Geacuteneacutetique Eacutevolution et Controcircle 911 Avenue Agropolis BP 64501 Montpellier France 22Division of Biology Kansas State University 271 ChalmersHall Manhattan KS 66506 USA 23Institute of Molecular Biology and Biotechnology Foundation for Research and Technology Hellas Nikolaou Plastira 100 GR-70013 Heraklion Crete Greece 24Centre ofFunctional Genomics University of Perugia Perugia Italy 25Genomics Platform Broad Institute 415 Main Street Cambridge MA 02142 USA 26Centre National de Recherche et de Formation sur lePaludisme Ouagadougou 01 BP 2208 Burkina Faso 27Program of Genetics Bioinformatics and Computational Biology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA28School of Life Sciences University of Nevada Las Vegas NV 89154 USA 29Department of Medical Research No 5 Ziwaka Road Dagon Township Yangon 11191 Myanmar 30Baylor College of Medicine1 Baylor Plaza Houston TX 77030 USA 31Boston College 140 Commonwealth Avenue Chestnut Hill MA 02467 USA 32Department of Biochemistry Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 33Harvard School of Public Health Department of Immunology and Infectious Diseases Boston MA 02115 USA 34Dipartimento di Medicina Sperimentale e ScienzeBiochimiche Universitagrave degli Studi di Perugia Perugia Italy 35Department of Entomology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA 36Computational EvolutionaryBiology Group Faculty of Life Sciences University of Manchester Oxford Road Manchester M13 9PT UK 37Department of Bioengineering and Therapeutic Sciences University of California San FranciscoCA 94143 USA 38Bioinformatics Research Laboratory Department of Biological Sciences New Campus University of Cyprus CY 1678 Nicosia Cyprus 39Wits Research Institute for Malaria Faculty ofHealth Sciences and Vector Control Reference Unit National Institute for Communicable Diseases of the National Health Laboratory Service Sandringham 2131 Johannesburg South Africa 40NationalMuseums of Kenya PO Box 40658-00100 Nairobi Kenya 41Department of Biology University of Crete 700 13 Heraklion Greece 42Eck Institute for Global Health and Department of Biological SciencesUniversity of Notre Dame 317 Galvin Life Sciences Building Notre Dame IN 46556 USA 43Virginia Bioinformatics Institute 1015 Life Science Circle Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 44Kenya Medical Research Institute-Wellcome Trust Research Programme Centre for Geographic Medicine Research - Coast PO Box 230-80108 Kilifi Kenya 45Departmentof Entomology 1140 East South Campus Drive Forbes 410 University of Arizona Tucson AZ 85721 USA 46Department of Medical Microbiology and Immunology School of Medicine University ofCalifornia Davis One Shields Avenue Davis CA 95616 USA 47Department of Pathobiology University of Pennsylvania School of Veterinary Medicine 3800 Spruce Street Philadelphia PA 19104 USA48Regional Medical Research Centre NE Indian Council of Medical Research PO Box 105 Dibrugarh-786 001 Assam India 49Department of Biology New Mexico State University Las Cruces NM 88003USA 50Molecular Biology Program New Mexico State University Las Cruces NM 88003 USA 51Department of Vector Biology Liverpool School of Tropical Medicine Pembroke Place Liverpool L3 5QAUK 52Center for Human Genetics Research Vanderbilt University Medical Center Nashville TN 37235 USA 53Department of Biological Sciences Vanderbilt University Nashville TN 37235 USA54Department of Entomology Texas AampM University College Station TX 77807 USA 55Department of Parasitology Faculty of Medicine Chiang Mai University Chiang Mai 50200 Thailand 56FundaccedilatildeoOswaldo Cruz Avenida Brasil 4365 RJ Brazil 57Instituto de Medicina Social Universidade do Estado do Rio de Janeiro Rio de Janeiro Brazil 58School of Informatics and Computing Indiana UniversityBloomington IN 47405 USA 59Department of Physiology School of Medicine Center for Research in Molecular Medicine and Chronic Diseases Instituto de Investigaciones Sanitarias University ofSantiago de Compostela Santiago de Compostela A Coruntildea Spain 60Wellcome Trust Sanger Institute Hinxton Cambridgeshire CB10 1SA UK 61School of Natural Sciences and Psychology LiverpoolJohn Moores University Liverpool L3 3AF UK 62Department of Cellular Biology University of Georgia Athens GA 30602 USA 63Department of Computer Science Harvey Mudd College Claremont CA91711 USA 64Program in Public Health College of Health Sciences University of California Irvine Hewitt Hall Irvine CA 92697 USA 65Malaria Programme Wellcome Trust Sanger Institute CambridgeCB10 1SJ UK 66Centre of Evolutionary and Ecological Studies (Marine Evolution and Conservation group) University of Groningen Nijenborgh 7 NL-9747 AG Groningen Netherlands 67Department ofMolecular and Cellular Biology Harvard University 16 Divinity Avenue Cambridge MA 02138 USA 68Department of Biology Indiana University Bloomington IN 47405 USA 69Centers for Disease Controland Prevention 1600 Clifton Road NE MSG49 Atlanta GA 30329 USA 70Biogen Idec 14 Cambridge Center Cambridge MA 02142 USA 71Laboratory of Malaria and Vector Research National Institute ofAllergy and Infectious Diseases 12735 Twinbrook Parkway Rockville MD 20852 USA 72Departments of Biological Sciences and Pharmacology Institutes for Chemical Biology Genetics and Global HealthVanderbilt University and Medical Center Nashville TN 37235 USAThese authors contributed equally to this work daggerCorresponding author E-mail neafseybroadinstituteorg (DEN) nbesanskndedu (NJB)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene family sizes at the tips of the species treedue to annotation or assembly errors and is notsensitive to inclusion or exclusion of taxa affect-ing the root age of the tree nor to the exclusionof taxa with the poorest assemblies and genesets (fig S10 and tables S22 and S23) Examplesinclude expansions of cuticular proteins in Anarabiensis and neurotransmitter-gated ion chan-nels in An albimanus (table S24)The evolutionary dynamismofAnopheles genes

extends to their architecture Comparisons ofsingle-copy orthologs at deeper phylogeneticdepths showed losses of introns at the root ofthe true fly order Diptera and revealed con-tinued losses as the group diversified into thelineages leading to fruit flies and mosquitoesHowever anopheline orthologs have sustainedgreater intron loss than drosophilids leadingto a relative paucity of introns in the genes ofextant anophelines (fig S11 and table S25) Com-parative analysis also revealed that gene fu-sion and fission played a substantial role in theevolution of mosquito genes with apparent re-arrangements affecting an average of 101 of

all genes in the genomes of the 10 specieswith the most contiguous assemblies (fig S12)Furthermore gene boundaries can be flexiblewhole genome alignments identified 325 can-didates for stop-codon readthrough (fig S13 andtable S26)Becausemolecular evolution of protein-coding

sequences is a well-known source of phenotypicchange we compared evolutionary rates amongdifferent functional categories of anopheline ortho-logs We quantified evolutionary divergence intermsof protein sequence identity of alignedortho-logs and the dNdS statistic (ratio of nonsynon-ymous to synonymous substitutions) computedusing PAML (12 20) Among curated sets of geneslinked to vectorial capacity or species-specific traitsagainst a background of functional categories de-fined by Gene Ontology or InterPro annotationsodorant and gustatory receptors showhigh evolu-tionary rates and male accessory gland proteinsexhibit exceptionally high dNdS ratios (Fig 3 figsS14 and S15 and tables S27 to S29) Rapid diver-gence in functional categories related to malariatransmission andor mosquito control strategies

led us to examine the genomic basis of severalfacets of anopheline biology in closer detail

Insights into mosquito biology andvectorial capacity

Mosquito reproductive biology evolves rapidlyand presents a compelling target for vector con-trol This is exemplified by the An gambiaemaleaccessory gland protein (Acp) cluster on chromo-some 3R (21 22) where conservation is mostlylost outside the An gambiae species complex(fig S16) In Drosophila male-biased genes suchas Acps tend to evolve faster than loci withoutmale-biased expression (23ndash25) We looked fora similar pattern in anophelines after assessingeach gene for sex-biased expression using micro-array and RNAseq data sets for An gambiae (12)In contrast to Drosophila female-biased genesshowdramatically faster rates of evolution acrossthe genus thanmale-biased genes (Wilcoxon ranksum test P = 5 times 10minus4) (fig S17)Differences in reproductive genes among

anophelines may provide insight into the or-igin and function of sex-related traits During

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-3

An gambiae

An christyi

An epiroticus

An stephensi

An maculatus

An culicifacies

An minimus

An funestus

An dirus

An farauti

An atroparvus

An sinensis

An albimanus

An darlingi

Culex quinquefasciatus

Aedes aegypti

An arabiensis

Culicinae

Anophelinae

Cellia

D melanogaster

D pseudoobscuraD mojavensis

Glossina morsitans

10000 200000

MuscomorphaDrosophilidae

Culicidae

An merus

An melas

An quadri-annulatus

5000 15000

GeneOrthology

Anophelinae

Culicinae

Culicidae

Diptera

Widespread

Universal

None

Number of Genes

100 Ma

260 Ma

005ss

MolecularPhylogeny

Nyssorhynchus

Anopheles

Pyretophorus

050 1

2R3R2L3LX

Genome Alignability

30 Ma

gambiaecomplex

minor

Vector Statusmajor

non

Geography

A B C

D

Fig 1 Geography vector status molecular phylogeny gene orthologyand genome alignability of the 16 newly sequenced anopheline mosqui-toes and selected other dipterans (A) Global geographic distributions ofthe 16 sampled anophelines and the previously sequenced An gambiae andAn darlingi Ranges are colored for each species or group of species as shownin (B) eg light blue for An farauti (B) The maximum likelihood molecularphylogeny of all sequenced anophelines and selected dipteran outgroupsShapes between branch termini and species names indicate vector status

(rectangles major vectors ellipses minor vectors triangles nonvectors) andare colored according to geographic ranges shown in (A) (C) Bar plots showtotal gene counts for each species partitioned according to their orthologyprofiles from ancient genes found across insects to lineage-restricted andspecies-specific genes (D) Heat map illustrating the density (in 2-kb slidingwindows) of whole-genome alignments along the lengths of An gambiaechromosomal arms fromwhite where An gambiae aligns to no other speciesto red where An gambiae aligns to all the other anophelines

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

copulation An gambiae males transfer a gela-tinous mating plug a complex of seminal pro-teins lipids and hormones that are essentialfor successful sperm storage by females and forreproductive success (26ndash28) Coagulation ofthe plug is mediated by a seminal transgluta-minase (TG3) which is found in anophelinesbut is absent in other mosquito genera that donot form a mating plug (26) We examined TG3and its two paralogs (TG1 and TG2) in the se-quenced anophelines and investigated the rateof evolution of each gene (Fig 4A) Silent siteswere saturated at the whole-genus level makingdS difficult to estimate reliably but TG1 (thegene presumed to be ancestral owing to broad-est taxonomic representation) exhibited the lowestrate of amino acid change (dN = 020) TG2 ex-hibited an intermediate rate (dN = 093) and theanopheline-specific TG3 has evolved even morerapidly (dN = 150) perhaps because of malemale or malefemale evolutionary conflict Notethat plug formation appears to be a derived traitwithin anophelines because it is not exhibited byAn albimanus and intermediate poorly coagu-lated plugs were observed in taxa descendingfrom early-branching lineages within the genus(table S30) Functional studies of mating plugs

will be necessary to understand what drove theorigin and rapid evolution of TG3Proteins that constitute themosquito cuticular

exoskeleton play important roles in diverse as-pects of anopheline biology including develop-ment ecology and insecticide resistance andconstitute approximately 2 of all protein-codinggenes (29) Comparisons among dipterans haverevealed numerous amplifications of cuticularprotein (CP) genes undergoing concerted evo-lution at physically clustered loci (30ndash33) Weinvestigated the extent and time scale of genecluster homogenization within anophelines bygenerating phylogenies of orthologous geneclusters (fig S18 and table S31) Throughout thegenus these gene clusters often group phylo-genetically by species rather than by positionwithin tandem arrays particularly in a subsetof clusters These include the 3RB and 3RC clus-ters of CP genes (30) the CPLCG group A andCPLCW clusters found elsewhere on 3R (32) andsix tandemly arrayed genes on 3L designatedCPFL2 through CPFL7 (34) CPLCW genes occurin a head-to-head arrangementwithCPLCGgroupA genes and exhibit highly conserved intergenicsequences (fig S19) Furthermore transcript lo-calization studies using in situ hybridization

revealed identical spatial expression patterns forCLPCW and CPLCG group A gene pairs sugges-tive of coregulation (fig S19) For these five geneclusters complete grouping by organismal lin-eage was observed for most deep nodes as wellas formany individual species outside the shallowAn gambiae species complex (Fig 4B) consistentwith a relatively rapid (less than 20 million years)homogenization of sequences via concerted evo-lution The emerging pattern of anopheline CPevolution is thus one of relative stasis for a ma-jority of single-copy orthologs juxtaposed withconsistent concerted evolution of a subset of genesAnophelines identify hosts oviposition sites

and other environmental cues through special-ized chemosensory membrane-bound receptorsWe examined three of the major gene familiesthat encode these molecules the odorant recep-tors (ORs) gustatory receptors (GRs) and variantionotropic glutamate receptors (IRs) Given therapid chemosensory gene turnover observed inmany other insects we explored whether vary-ing host preferences of anopheline mosquitoescould be attributed to chemosensory gene gainsand losses Unexpectedly in light of the elevatedgenome-wide rate of gene turnover we found thatthe overall size and content of the chemosensory

1258522-4 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 2 Patterns of anopheline chromosomal evolution (A) Anopheline ge-nomes have conserved gene membership on chromosome arms (ldquoelementsrdquocolored and labeled 1 to 5) UnlikeDrosophila chromosome elements reshufflebetween chromosomes via translocations as intact elements and do notshow fissions or fusionsThe tree depicts the supported molecular topologyfor the species studied (B) Conserved synteny blocks decay rapidly withinchromosomal arms as the phylogenetic distance increases between spe-cies Moving left to right the dot-plot panels show gene-level synteny betweenchromosome 2R of An gambiae (x axis) and inferred ancestral sequences(y axes inferred using PATHGROUPS) at increasing evolutionary time scales(million years ago) estimated by an ultrametric phylogeny Gray horizontal

lines represent scaffold breaks Discontinuity of the red linesdots indicatesrearrangement (C) Anopheline X chromosomes exhibit higher rates of re-arrangement (P lt 1 times 10minus5) measured as breaks per Mb per million yearscompared with autosomes despite a paucity of polymorphic inversions onthe X (D) The anopheline X chromosome also displays a higher rate of genemovement to other chromosomal arms than any of the autosomes Chromo-somal elements are labeled around the perimeter internal bands are coloredaccording to the chromosomal element source and match element colors in(A) and (C) Bands are sized to indicate the relative ratio of genes importedversus exported for each chromosomal element and the relative allocation ofexported genes to other elements

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 3: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

(fig S3) with between 10738 and 16149 protein-coding genes identified for each species Genecount was generally commensurate with assem-bly contiguity (table S6) Some of this variation intotal gene counts may be attributed to the chal-lenges of gene annotations with variable levelsof assembly contiguity and supporting RNAseqdata To estimate the prevalence of erroneousgene model fusions andor fragmentations wecompared the new gene annotations to An gam-biae gene models and found an average of 33and 97 potentially fused and fragmented genemodels respectively For analyses described be-low that may be sensitive to variation in genemodel accuracy or gene set completeness wehave conducted sensitivity analyses to rule outconfounding results from these factors (12)

Rapidly evolving genes and genomes

Orthology delineation identified lineage-restrictedand species-specific genes as well as ancient genesfound across insect taxa of which universal single-copy orthologs were employed to estimate themolecular species phylogeny (Fig 1 B and C andfig S4) Analysis of codon frequencies in theseorthologs revealed that anophelines unlike dro-sophilids exhibit relatively uniform codon usagepreferences (fig S5)Polytene chromosomes have provided a glimpse

into anopheline chromosome evolution (14) Ourgenome-sequencendashbased view confirmed the

cytological observations and offers many newinsights At the base-pair level ~90 of the non-gapped and nonmasked An gambiae genome(ie excluding transposable elements as detailedin table S7) is alignable to the most closely re-lated species whereas only ~13 aligns to themost distant (Fig 1D fig S6 and table S8) withreduced alignability in centromeres and on theX chromosome (Fig 1D) At chromosomal lev-els mapping data anchored 35 to 76 of the Anstephensi An funestus An atroparvus and Analbimanus genome assemblies to chromosomalarms (tables S9 to S12) Analysis of genes in an-chored regions showed that synteny at the whole-arm level is highly conserved despite severalwhole-arm translocations (Fig 2A and table S13)In contrast small-scale rearrangements disruptgene colinearity within arms over time leadingto extensive shuffling of gene order over a timescale of 29 million years or more (10 15) (Fig2B and fig S7) As in Drosophila rearrangementrates are higher on the X chromosome than onautosomes (Fig 2C and tables S14 to S16)However the difference is significantly morepronounced in Anopheles where X chromosomerearrangements are more frequent by a factorof 27 than autosomal rearrangements in Dro-sophila the corresponding ratio is only 12 (t testt10 = 73 P lt 1 times 10minus5) (fig S8) The X chromo-some is also notable for a significant degree ofobserved gene movement to other chromosomes

relative to Drosophila (one sample proportiontest P lt 22 times 10minus16) (Fig 2D and tables S17 andS18) as was previously noted for Anophelesrelative to Aedes (16) further underscoring itsdistinctive evolutionary profile in Anopheles com-pared with other dipteran generaSuch dynamic gene shuffling and movement

may be facilitated by themultiple families of DNAtransposons and long terminal repeat (LTR) andnon-LTR retroelements found in all genomes(table S7) as well as a weaker dosage compen-sation phenotype in Anopheles compared withDrosophila (17) Despite such shuffling com-paring genomic locations of orthologs can besuccessfully employed to reconstruct ancestralchromosomal arrangements (fig S9) and to con-fidently improve assembly contiguity (tables S19to S21)Copy-number variation in homologous gene

families also reveals striking evolutionary dy-namism Analysis of 11636 gene families withCAFE 3 (18) indicates a rate of gene gainlosshigher by a factor of at least 5 than that ob-served for 12 Drosophila genomes (19) Overallthese Anopheles genomes exhibit a rate of gainor loss per gene per million years of 312 times 10minus3

compared with 590 times 10minus4 for Drosophila sug-gesting substantially higher gene turnover with-in anophelines relative to fruit flies This fivefoldgreater gainloss rate in anophelines holds trueunder models that account for uncertainty in

1258522-2 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

1Genome Sequencing and Analysis Program Broad Institute 415 Main Street Cambridge MA 02142 USA 2Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology32 Vassar Street Cambridge MA 02139 USA 3The Broad Institute of Massachusetts Institute of Technology and Harvard 415 Main Street Cambridge MA 02142 USA 4Department of Genetic Medicineand Development University of Geneva Medical School Rue Michel-Servet 1 1211 Geneva Switzerland 5Swiss Institute of Bioinformatics Rue Michel-Servet 1 1211 Geneva Switzerland 6Department ofMedical Entomology and Vector Control School of Public Health and Institute of Health Researches Tehran University of Medical Sciences Tehran Iran 7George Washington University Department ofMathematics and Computational Biology Institute 45085 University Drive Ashburn VA 20147 USA 8European Molecular Biology Laboratory European Bioinformatics Institute EMBL-EBI Wellcome TrustGenome Campus Hinxton Cambridge CB10 1SD UK 9National Vector Borne Disease Control Programme Ministry of Health Tafea Province Vanuatu 10Department of Public Health and InfectiousDiseases Division of Parasitology Sapienza University of Rome Piazzale Aldo Moro 5 00185 Rome Italy 11Department of Biological Sciences California State PolytechnicndashPomona 3801 West TempleAvenue Pomona CA 91768 USA 12Tomsk State University 36 Lenina Avenue Tomsk Russia 13Department of Computer Science and Engineering Eck Institute for Global Health 211B Cushing HallUniversity of Notre Dame Notre Dame IN 46556 USA 14Inserm U963 F-67084 Strasbourg France 15CNRS UPR9022 IBMC F-67084 Strasbourg France 16Department of Life Sciences Imperial CollegeLondon South Kensington Campus London SW7 2AZ UK 17Faculty of Medicine Health and Molecular Science Australian Institute of Tropical Health Medicine James Cook University Cairns 4870Australia 18Department of Life Sciences Imperial College London Silwood Park Campus Ascot SL5 7PY UK 19Department of Mathematics Simon Fraser University 8888 University Drive Burnaby BCV5A 1S6 Canada 20Department of Entomology and Nematology One Shields Avenue University of CaliforniandashDavis Davis CA 95616 USA 21Institut de Recherche pour le Deacuteveloppement Uniteacutes Mixtesde Recherche Maladies Infectieuses et Vecteurs Eacutecologie Geacuteneacutetique Eacutevolution et Controcircle 911 Avenue Agropolis BP 64501 Montpellier France 22Division of Biology Kansas State University 271 ChalmersHall Manhattan KS 66506 USA 23Institute of Molecular Biology and Biotechnology Foundation for Research and Technology Hellas Nikolaou Plastira 100 GR-70013 Heraklion Crete Greece 24Centre ofFunctional Genomics University of Perugia Perugia Italy 25Genomics Platform Broad Institute 415 Main Street Cambridge MA 02142 USA 26Centre National de Recherche et de Formation sur lePaludisme Ouagadougou 01 BP 2208 Burkina Faso 27Program of Genetics Bioinformatics and Computational Biology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA28School of Life Sciences University of Nevada Las Vegas NV 89154 USA 29Department of Medical Research No 5 Ziwaka Road Dagon Township Yangon 11191 Myanmar 30Baylor College of Medicine1 Baylor Plaza Houston TX 77030 USA 31Boston College 140 Commonwealth Avenue Chestnut Hill MA 02467 USA 32Department of Biochemistry Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 33Harvard School of Public Health Department of Immunology and Infectious Diseases Boston MA 02115 USA 34Dipartimento di Medicina Sperimentale e ScienzeBiochimiche Universitagrave degli Studi di Perugia Perugia Italy 35Department of Entomology Virginia Polytechnic Institute and State University Blacksburg VA 24061 USA 36Computational EvolutionaryBiology Group Faculty of Life Sciences University of Manchester Oxford Road Manchester M13 9PT UK 37Department of Bioengineering and Therapeutic Sciences University of California San FranciscoCA 94143 USA 38Bioinformatics Research Laboratory Department of Biological Sciences New Campus University of Cyprus CY 1678 Nicosia Cyprus 39Wits Research Institute for Malaria Faculty ofHealth Sciences and Vector Control Reference Unit National Institute for Communicable Diseases of the National Health Laboratory Service Sandringham 2131 Johannesburg South Africa 40NationalMuseums of Kenya PO Box 40658-00100 Nairobi Kenya 41Department of Biology University of Crete 700 13 Heraklion Greece 42Eck Institute for Global Health and Department of Biological SciencesUniversity of Notre Dame 317 Galvin Life Sciences Building Notre Dame IN 46556 USA 43Virginia Bioinformatics Institute 1015 Life Science Circle Virginia Polytechnic Institute and State UniversityBlacksburg VA 24061 USA 44Kenya Medical Research Institute-Wellcome Trust Research Programme Centre for Geographic Medicine Research - Coast PO Box 230-80108 Kilifi Kenya 45Departmentof Entomology 1140 East South Campus Drive Forbes 410 University of Arizona Tucson AZ 85721 USA 46Department of Medical Microbiology and Immunology School of Medicine University ofCalifornia Davis One Shields Avenue Davis CA 95616 USA 47Department of Pathobiology University of Pennsylvania School of Veterinary Medicine 3800 Spruce Street Philadelphia PA 19104 USA48Regional Medical Research Centre NE Indian Council of Medical Research PO Box 105 Dibrugarh-786 001 Assam India 49Department of Biology New Mexico State University Las Cruces NM 88003USA 50Molecular Biology Program New Mexico State University Las Cruces NM 88003 USA 51Department of Vector Biology Liverpool School of Tropical Medicine Pembroke Place Liverpool L3 5QAUK 52Center for Human Genetics Research Vanderbilt University Medical Center Nashville TN 37235 USA 53Department of Biological Sciences Vanderbilt University Nashville TN 37235 USA54Department of Entomology Texas AampM University College Station TX 77807 USA 55Department of Parasitology Faculty of Medicine Chiang Mai University Chiang Mai 50200 Thailand 56FundaccedilatildeoOswaldo Cruz Avenida Brasil 4365 RJ Brazil 57Instituto de Medicina Social Universidade do Estado do Rio de Janeiro Rio de Janeiro Brazil 58School of Informatics and Computing Indiana UniversityBloomington IN 47405 USA 59Department of Physiology School of Medicine Center for Research in Molecular Medicine and Chronic Diseases Instituto de Investigaciones Sanitarias University ofSantiago de Compostela Santiago de Compostela A Coruntildea Spain 60Wellcome Trust Sanger Institute Hinxton Cambridgeshire CB10 1SA UK 61School of Natural Sciences and Psychology LiverpoolJohn Moores University Liverpool L3 3AF UK 62Department of Cellular Biology University of Georgia Athens GA 30602 USA 63Department of Computer Science Harvey Mudd College Claremont CA91711 USA 64Program in Public Health College of Health Sciences University of California Irvine Hewitt Hall Irvine CA 92697 USA 65Malaria Programme Wellcome Trust Sanger Institute CambridgeCB10 1SJ UK 66Centre of Evolutionary and Ecological Studies (Marine Evolution and Conservation group) University of Groningen Nijenborgh 7 NL-9747 AG Groningen Netherlands 67Department ofMolecular and Cellular Biology Harvard University 16 Divinity Avenue Cambridge MA 02138 USA 68Department of Biology Indiana University Bloomington IN 47405 USA 69Centers for Disease Controland Prevention 1600 Clifton Road NE MSG49 Atlanta GA 30329 USA 70Biogen Idec 14 Cambridge Center Cambridge MA 02142 USA 71Laboratory of Malaria and Vector Research National Institute ofAllergy and Infectious Diseases 12735 Twinbrook Parkway Rockville MD 20852 USA 72Departments of Biological Sciences and Pharmacology Institutes for Chemical Biology Genetics and Global HealthVanderbilt University and Medical Center Nashville TN 37235 USAThese authors contributed equally to this work daggerCorresponding author E-mail neafseybroadinstituteorg (DEN) nbesanskndedu (NJB)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene family sizes at the tips of the species treedue to annotation or assembly errors and is notsensitive to inclusion or exclusion of taxa affect-ing the root age of the tree nor to the exclusionof taxa with the poorest assemblies and genesets (fig S10 and tables S22 and S23) Examplesinclude expansions of cuticular proteins in Anarabiensis and neurotransmitter-gated ion chan-nels in An albimanus (table S24)The evolutionary dynamismofAnopheles genes

extends to their architecture Comparisons ofsingle-copy orthologs at deeper phylogeneticdepths showed losses of introns at the root ofthe true fly order Diptera and revealed con-tinued losses as the group diversified into thelineages leading to fruit flies and mosquitoesHowever anopheline orthologs have sustainedgreater intron loss than drosophilids leadingto a relative paucity of introns in the genes ofextant anophelines (fig S11 and table S25) Com-parative analysis also revealed that gene fu-sion and fission played a substantial role in theevolution of mosquito genes with apparent re-arrangements affecting an average of 101 of

all genes in the genomes of the 10 specieswith the most contiguous assemblies (fig S12)Furthermore gene boundaries can be flexiblewhole genome alignments identified 325 can-didates for stop-codon readthrough (fig S13 andtable S26)Becausemolecular evolution of protein-coding

sequences is a well-known source of phenotypicchange we compared evolutionary rates amongdifferent functional categories of anopheline ortho-logs We quantified evolutionary divergence intermsof protein sequence identity of alignedortho-logs and the dNdS statistic (ratio of nonsynon-ymous to synonymous substitutions) computedusing PAML (12 20) Among curated sets of geneslinked to vectorial capacity or species-specific traitsagainst a background of functional categories de-fined by Gene Ontology or InterPro annotationsodorant and gustatory receptors showhigh evolu-tionary rates and male accessory gland proteinsexhibit exceptionally high dNdS ratios (Fig 3 figsS14 and S15 and tables S27 to S29) Rapid diver-gence in functional categories related to malariatransmission andor mosquito control strategies

led us to examine the genomic basis of severalfacets of anopheline biology in closer detail

Insights into mosquito biology andvectorial capacity

Mosquito reproductive biology evolves rapidlyand presents a compelling target for vector con-trol This is exemplified by the An gambiaemaleaccessory gland protein (Acp) cluster on chromo-some 3R (21 22) where conservation is mostlylost outside the An gambiae species complex(fig S16) In Drosophila male-biased genes suchas Acps tend to evolve faster than loci withoutmale-biased expression (23ndash25) We looked fora similar pattern in anophelines after assessingeach gene for sex-biased expression using micro-array and RNAseq data sets for An gambiae (12)In contrast to Drosophila female-biased genesshowdramatically faster rates of evolution acrossthe genus thanmale-biased genes (Wilcoxon ranksum test P = 5 times 10minus4) (fig S17)Differences in reproductive genes among

anophelines may provide insight into the or-igin and function of sex-related traits During

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-3

An gambiae

An christyi

An epiroticus

An stephensi

An maculatus

An culicifacies

An minimus

An funestus

An dirus

An farauti

An atroparvus

An sinensis

An albimanus

An darlingi

Culex quinquefasciatus

Aedes aegypti

An arabiensis

Culicinae

Anophelinae

Cellia

D melanogaster

D pseudoobscuraD mojavensis

Glossina morsitans

10000 200000

MuscomorphaDrosophilidae

Culicidae

An merus

An melas

An quadri-annulatus

5000 15000

GeneOrthology

Anophelinae

Culicinae

Culicidae

Diptera

Widespread

Universal

None

Number of Genes

100 Ma

260 Ma

005ss

MolecularPhylogeny

Nyssorhynchus

Anopheles

Pyretophorus

050 1

2R3R2L3LX

Genome Alignability

30 Ma

gambiaecomplex

minor

Vector Statusmajor

non

Geography

A B C

D

Fig 1 Geography vector status molecular phylogeny gene orthologyand genome alignability of the 16 newly sequenced anopheline mosqui-toes and selected other dipterans (A) Global geographic distributions ofthe 16 sampled anophelines and the previously sequenced An gambiae andAn darlingi Ranges are colored for each species or group of species as shownin (B) eg light blue for An farauti (B) The maximum likelihood molecularphylogeny of all sequenced anophelines and selected dipteran outgroupsShapes between branch termini and species names indicate vector status

(rectangles major vectors ellipses minor vectors triangles nonvectors) andare colored according to geographic ranges shown in (A) (C) Bar plots showtotal gene counts for each species partitioned according to their orthologyprofiles from ancient genes found across insects to lineage-restricted andspecies-specific genes (D) Heat map illustrating the density (in 2-kb slidingwindows) of whole-genome alignments along the lengths of An gambiaechromosomal arms fromwhite where An gambiae aligns to no other speciesto red where An gambiae aligns to all the other anophelines

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

copulation An gambiae males transfer a gela-tinous mating plug a complex of seminal pro-teins lipids and hormones that are essentialfor successful sperm storage by females and forreproductive success (26ndash28) Coagulation ofthe plug is mediated by a seminal transgluta-minase (TG3) which is found in anophelinesbut is absent in other mosquito genera that donot form a mating plug (26) We examined TG3and its two paralogs (TG1 and TG2) in the se-quenced anophelines and investigated the rateof evolution of each gene (Fig 4A) Silent siteswere saturated at the whole-genus level makingdS difficult to estimate reliably but TG1 (thegene presumed to be ancestral owing to broad-est taxonomic representation) exhibited the lowestrate of amino acid change (dN = 020) TG2 ex-hibited an intermediate rate (dN = 093) and theanopheline-specific TG3 has evolved even morerapidly (dN = 150) perhaps because of malemale or malefemale evolutionary conflict Notethat plug formation appears to be a derived traitwithin anophelines because it is not exhibited byAn albimanus and intermediate poorly coagu-lated plugs were observed in taxa descendingfrom early-branching lineages within the genus(table S30) Functional studies of mating plugs

will be necessary to understand what drove theorigin and rapid evolution of TG3Proteins that constitute themosquito cuticular

exoskeleton play important roles in diverse as-pects of anopheline biology including develop-ment ecology and insecticide resistance andconstitute approximately 2 of all protein-codinggenes (29) Comparisons among dipterans haverevealed numerous amplifications of cuticularprotein (CP) genes undergoing concerted evo-lution at physically clustered loci (30ndash33) Weinvestigated the extent and time scale of genecluster homogenization within anophelines bygenerating phylogenies of orthologous geneclusters (fig S18 and table S31) Throughout thegenus these gene clusters often group phylo-genetically by species rather than by positionwithin tandem arrays particularly in a subsetof clusters These include the 3RB and 3RC clus-ters of CP genes (30) the CPLCG group A andCPLCW clusters found elsewhere on 3R (32) andsix tandemly arrayed genes on 3L designatedCPFL2 through CPFL7 (34) CPLCW genes occurin a head-to-head arrangementwithCPLCGgroupA genes and exhibit highly conserved intergenicsequences (fig S19) Furthermore transcript lo-calization studies using in situ hybridization

revealed identical spatial expression patterns forCLPCW and CPLCG group A gene pairs sugges-tive of coregulation (fig S19) For these five geneclusters complete grouping by organismal lin-eage was observed for most deep nodes as wellas formany individual species outside the shallowAn gambiae species complex (Fig 4B) consistentwith a relatively rapid (less than 20 million years)homogenization of sequences via concerted evo-lution The emerging pattern of anopheline CPevolution is thus one of relative stasis for a ma-jority of single-copy orthologs juxtaposed withconsistent concerted evolution of a subset of genesAnophelines identify hosts oviposition sites

and other environmental cues through special-ized chemosensory membrane-bound receptorsWe examined three of the major gene familiesthat encode these molecules the odorant recep-tors (ORs) gustatory receptors (GRs) and variantionotropic glutamate receptors (IRs) Given therapid chemosensory gene turnover observed inmany other insects we explored whether vary-ing host preferences of anopheline mosquitoescould be attributed to chemosensory gene gainsand losses Unexpectedly in light of the elevatedgenome-wide rate of gene turnover we found thatthe overall size and content of the chemosensory

1258522-4 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 2 Patterns of anopheline chromosomal evolution (A) Anopheline ge-nomes have conserved gene membership on chromosome arms (ldquoelementsrdquocolored and labeled 1 to 5) UnlikeDrosophila chromosome elements reshufflebetween chromosomes via translocations as intact elements and do notshow fissions or fusionsThe tree depicts the supported molecular topologyfor the species studied (B) Conserved synteny blocks decay rapidly withinchromosomal arms as the phylogenetic distance increases between spe-cies Moving left to right the dot-plot panels show gene-level synteny betweenchromosome 2R of An gambiae (x axis) and inferred ancestral sequences(y axes inferred using PATHGROUPS) at increasing evolutionary time scales(million years ago) estimated by an ultrametric phylogeny Gray horizontal

lines represent scaffold breaks Discontinuity of the red linesdots indicatesrearrangement (C) Anopheline X chromosomes exhibit higher rates of re-arrangement (P lt 1 times 10minus5) measured as breaks per Mb per million yearscompared with autosomes despite a paucity of polymorphic inversions onthe X (D) The anopheline X chromosome also displays a higher rate of genemovement to other chromosomal arms than any of the autosomes Chromo-somal elements are labeled around the perimeter internal bands are coloredaccording to the chromosomal element source and match element colors in(A) and (C) Bands are sized to indicate the relative ratio of genes importedversus exported for each chromosomal element and the relative allocation ofexported genes to other elements

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 4: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

gene family sizes at the tips of the species treedue to annotation or assembly errors and is notsensitive to inclusion or exclusion of taxa affect-ing the root age of the tree nor to the exclusionof taxa with the poorest assemblies and genesets (fig S10 and tables S22 and S23) Examplesinclude expansions of cuticular proteins in Anarabiensis and neurotransmitter-gated ion chan-nels in An albimanus (table S24)The evolutionary dynamismofAnopheles genes

extends to their architecture Comparisons ofsingle-copy orthologs at deeper phylogeneticdepths showed losses of introns at the root ofthe true fly order Diptera and revealed con-tinued losses as the group diversified into thelineages leading to fruit flies and mosquitoesHowever anopheline orthologs have sustainedgreater intron loss than drosophilids leadingto a relative paucity of introns in the genes ofextant anophelines (fig S11 and table S25) Com-parative analysis also revealed that gene fu-sion and fission played a substantial role in theevolution of mosquito genes with apparent re-arrangements affecting an average of 101 of

all genes in the genomes of the 10 specieswith the most contiguous assemblies (fig S12)Furthermore gene boundaries can be flexiblewhole genome alignments identified 325 can-didates for stop-codon readthrough (fig S13 andtable S26)Becausemolecular evolution of protein-coding

sequences is a well-known source of phenotypicchange we compared evolutionary rates amongdifferent functional categories of anopheline ortho-logs We quantified evolutionary divergence intermsof protein sequence identity of alignedortho-logs and the dNdS statistic (ratio of nonsynon-ymous to synonymous substitutions) computedusing PAML (12 20) Among curated sets of geneslinked to vectorial capacity or species-specific traitsagainst a background of functional categories de-fined by Gene Ontology or InterPro annotationsodorant and gustatory receptors showhigh evolu-tionary rates and male accessory gland proteinsexhibit exceptionally high dNdS ratios (Fig 3 figsS14 and S15 and tables S27 to S29) Rapid diver-gence in functional categories related to malariatransmission andor mosquito control strategies

led us to examine the genomic basis of severalfacets of anopheline biology in closer detail

Insights into mosquito biology andvectorial capacity

Mosquito reproductive biology evolves rapidlyand presents a compelling target for vector con-trol This is exemplified by the An gambiaemaleaccessory gland protein (Acp) cluster on chromo-some 3R (21 22) where conservation is mostlylost outside the An gambiae species complex(fig S16) In Drosophila male-biased genes suchas Acps tend to evolve faster than loci withoutmale-biased expression (23ndash25) We looked fora similar pattern in anophelines after assessingeach gene for sex-biased expression using micro-array and RNAseq data sets for An gambiae (12)In contrast to Drosophila female-biased genesshowdramatically faster rates of evolution acrossthe genus thanmale-biased genes (Wilcoxon ranksum test P = 5 times 10minus4) (fig S17)Differences in reproductive genes among

anophelines may provide insight into the or-igin and function of sex-related traits During

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-3

An gambiae

An christyi

An epiroticus

An stephensi

An maculatus

An culicifacies

An minimus

An funestus

An dirus

An farauti

An atroparvus

An sinensis

An albimanus

An darlingi

Culex quinquefasciatus

Aedes aegypti

An arabiensis

Culicinae

Anophelinae

Cellia

D melanogaster

D pseudoobscuraD mojavensis

Glossina morsitans

10000 200000

MuscomorphaDrosophilidae

Culicidae

An merus

An melas

An quadri-annulatus

5000 15000

GeneOrthology

Anophelinae

Culicinae

Culicidae

Diptera

Widespread

Universal

None

Number of Genes

100 Ma

260 Ma

005ss

MolecularPhylogeny

Nyssorhynchus

Anopheles

Pyretophorus

050 1

2R3R2L3LX

Genome Alignability

30 Ma

gambiaecomplex

minor

Vector Statusmajor

non

Geography

A B C

D

Fig 1 Geography vector status molecular phylogeny gene orthologyand genome alignability of the 16 newly sequenced anopheline mosqui-toes and selected other dipterans (A) Global geographic distributions ofthe 16 sampled anophelines and the previously sequenced An gambiae andAn darlingi Ranges are colored for each species or group of species as shownin (B) eg light blue for An farauti (B) The maximum likelihood molecularphylogeny of all sequenced anophelines and selected dipteran outgroupsShapes between branch termini and species names indicate vector status

(rectangles major vectors ellipses minor vectors triangles nonvectors) andare colored according to geographic ranges shown in (A) (C) Bar plots showtotal gene counts for each species partitioned according to their orthologyprofiles from ancient genes found across insects to lineage-restricted andspecies-specific genes (D) Heat map illustrating the density (in 2-kb slidingwindows) of whole-genome alignments along the lengths of An gambiaechromosomal arms fromwhite where An gambiae aligns to no other speciesto red where An gambiae aligns to all the other anophelines

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

copulation An gambiae males transfer a gela-tinous mating plug a complex of seminal pro-teins lipids and hormones that are essentialfor successful sperm storage by females and forreproductive success (26ndash28) Coagulation ofthe plug is mediated by a seminal transgluta-minase (TG3) which is found in anophelinesbut is absent in other mosquito genera that donot form a mating plug (26) We examined TG3and its two paralogs (TG1 and TG2) in the se-quenced anophelines and investigated the rateof evolution of each gene (Fig 4A) Silent siteswere saturated at the whole-genus level makingdS difficult to estimate reliably but TG1 (thegene presumed to be ancestral owing to broad-est taxonomic representation) exhibited the lowestrate of amino acid change (dN = 020) TG2 ex-hibited an intermediate rate (dN = 093) and theanopheline-specific TG3 has evolved even morerapidly (dN = 150) perhaps because of malemale or malefemale evolutionary conflict Notethat plug formation appears to be a derived traitwithin anophelines because it is not exhibited byAn albimanus and intermediate poorly coagu-lated plugs were observed in taxa descendingfrom early-branching lineages within the genus(table S30) Functional studies of mating plugs

will be necessary to understand what drove theorigin and rapid evolution of TG3Proteins that constitute themosquito cuticular

exoskeleton play important roles in diverse as-pects of anopheline biology including develop-ment ecology and insecticide resistance andconstitute approximately 2 of all protein-codinggenes (29) Comparisons among dipterans haverevealed numerous amplifications of cuticularprotein (CP) genes undergoing concerted evo-lution at physically clustered loci (30ndash33) Weinvestigated the extent and time scale of genecluster homogenization within anophelines bygenerating phylogenies of orthologous geneclusters (fig S18 and table S31) Throughout thegenus these gene clusters often group phylo-genetically by species rather than by positionwithin tandem arrays particularly in a subsetof clusters These include the 3RB and 3RC clus-ters of CP genes (30) the CPLCG group A andCPLCW clusters found elsewhere on 3R (32) andsix tandemly arrayed genes on 3L designatedCPFL2 through CPFL7 (34) CPLCW genes occurin a head-to-head arrangementwithCPLCGgroupA genes and exhibit highly conserved intergenicsequences (fig S19) Furthermore transcript lo-calization studies using in situ hybridization

revealed identical spatial expression patterns forCLPCW and CPLCG group A gene pairs sugges-tive of coregulation (fig S19) For these five geneclusters complete grouping by organismal lin-eage was observed for most deep nodes as wellas formany individual species outside the shallowAn gambiae species complex (Fig 4B) consistentwith a relatively rapid (less than 20 million years)homogenization of sequences via concerted evo-lution The emerging pattern of anopheline CPevolution is thus one of relative stasis for a ma-jority of single-copy orthologs juxtaposed withconsistent concerted evolution of a subset of genesAnophelines identify hosts oviposition sites

and other environmental cues through special-ized chemosensory membrane-bound receptorsWe examined three of the major gene familiesthat encode these molecules the odorant recep-tors (ORs) gustatory receptors (GRs) and variantionotropic glutamate receptors (IRs) Given therapid chemosensory gene turnover observed inmany other insects we explored whether vary-ing host preferences of anopheline mosquitoescould be attributed to chemosensory gene gainsand losses Unexpectedly in light of the elevatedgenome-wide rate of gene turnover we found thatthe overall size and content of the chemosensory

1258522-4 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 2 Patterns of anopheline chromosomal evolution (A) Anopheline ge-nomes have conserved gene membership on chromosome arms (ldquoelementsrdquocolored and labeled 1 to 5) UnlikeDrosophila chromosome elements reshufflebetween chromosomes via translocations as intact elements and do notshow fissions or fusionsThe tree depicts the supported molecular topologyfor the species studied (B) Conserved synteny blocks decay rapidly withinchromosomal arms as the phylogenetic distance increases between spe-cies Moving left to right the dot-plot panels show gene-level synteny betweenchromosome 2R of An gambiae (x axis) and inferred ancestral sequences(y axes inferred using PATHGROUPS) at increasing evolutionary time scales(million years ago) estimated by an ultrametric phylogeny Gray horizontal

lines represent scaffold breaks Discontinuity of the red linesdots indicatesrearrangement (C) Anopheline X chromosomes exhibit higher rates of re-arrangement (P lt 1 times 10minus5) measured as breaks per Mb per million yearscompared with autosomes despite a paucity of polymorphic inversions onthe X (D) The anopheline X chromosome also displays a higher rate of genemovement to other chromosomal arms than any of the autosomes Chromo-somal elements are labeled around the perimeter internal bands are coloredaccording to the chromosomal element source and match element colors in(A) and (C) Bands are sized to indicate the relative ratio of genes importedversus exported for each chromosomal element and the relative allocation ofexported genes to other elements

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 5: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

copulation An gambiae males transfer a gela-tinous mating plug a complex of seminal pro-teins lipids and hormones that are essentialfor successful sperm storage by females and forreproductive success (26ndash28) Coagulation ofthe plug is mediated by a seminal transgluta-minase (TG3) which is found in anophelinesbut is absent in other mosquito genera that donot form a mating plug (26) We examined TG3and its two paralogs (TG1 and TG2) in the se-quenced anophelines and investigated the rateof evolution of each gene (Fig 4A) Silent siteswere saturated at the whole-genus level makingdS difficult to estimate reliably but TG1 (thegene presumed to be ancestral owing to broad-est taxonomic representation) exhibited the lowestrate of amino acid change (dN = 020) TG2 ex-hibited an intermediate rate (dN = 093) and theanopheline-specific TG3 has evolved even morerapidly (dN = 150) perhaps because of malemale or malefemale evolutionary conflict Notethat plug formation appears to be a derived traitwithin anophelines because it is not exhibited byAn albimanus and intermediate poorly coagu-lated plugs were observed in taxa descendingfrom early-branching lineages within the genus(table S30) Functional studies of mating plugs

will be necessary to understand what drove theorigin and rapid evolution of TG3Proteins that constitute themosquito cuticular

exoskeleton play important roles in diverse as-pects of anopheline biology including develop-ment ecology and insecticide resistance andconstitute approximately 2 of all protein-codinggenes (29) Comparisons among dipterans haverevealed numerous amplifications of cuticularprotein (CP) genes undergoing concerted evo-lution at physically clustered loci (30ndash33) Weinvestigated the extent and time scale of genecluster homogenization within anophelines bygenerating phylogenies of orthologous geneclusters (fig S18 and table S31) Throughout thegenus these gene clusters often group phylo-genetically by species rather than by positionwithin tandem arrays particularly in a subsetof clusters These include the 3RB and 3RC clus-ters of CP genes (30) the CPLCG group A andCPLCW clusters found elsewhere on 3R (32) andsix tandemly arrayed genes on 3L designatedCPFL2 through CPFL7 (34) CPLCW genes occurin a head-to-head arrangementwithCPLCGgroupA genes and exhibit highly conserved intergenicsequences (fig S19) Furthermore transcript lo-calization studies using in situ hybridization

revealed identical spatial expression patterns forCLPCW and CPLCG group A gene pairs sugges-tive of coregulation (fig S19) For these five geneclusters complete grouping by organismal lin-eage was observed for most deep nodes as wellas formany individual species outside the shallowAn gambiae species complex (Fig 4B) consistentwith a relatively rapid (less than 20 million years)homogenization of sequences via concerted evo-lution The emerging pattern of anopheline CPevolution is thus one of relative stasis for a ma-jority of single-copy orthologs juxtaposed withconsistent concerted evolution of a subset of genesAnophelines identify hosts oviposition sites

and other environmental cues through special-ized chemosensory membrane-bound receptorsWe examined three of the major gene familiesthat encode these molecules the odorant recep-tors (ORs) gustatory receptors (GRs) and variantionotropic glutamate receptors (IRs) Given therapid chemosensory gene turnover observed inmany other insects we explored whether vary-ing host preferences of anopheline mosquitoescould be attributed to chemosensory gene gainsand losses Unexpectedly in light of the elevatedgenome-wide rate of gene turnover we found thatthe overall size and content of the chemosensory

1258522-4 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 2 Patterns of anopheline chromosomal evolution (A) Anopheline ge-nomes have conserved gene membership on chromosome arms (ldquoelementsrdquocolored and labeled 1 to 5) UnlikeDrosophila chromosome elements reshufflebetween chromosomes via translocations as intact elements and do notshow fissions or fusionsThe tree depicts the supported molecular topologyfor the species studied (B) Conserved synteny blocks decay rapidly withinchromosomal arms as the phylogenetic distance increases between spe-cies Moving left to right the dot-plot panels show gene-level synteny betweenchromosome 2R of An gambiae (x axis) and inferred ancestral sequences(y axes inferred using PATHGROUPS) at increasing evolutionary time scales(million years ago) estimated by an ultrametric phylogeny Gray horizontal

lines represent scaffold breaks Discontinuity of the red linesdots indicatesrearrangement (C) Anopheline X chromosomes exhibit higher rates of re-arrangement (P lt 1 times 10minus5) measured as breaks per Mb per million yearscompared with autosomes despite a paucity of polymorphic inversions onthe X (D) The anopheline X chromosome also displays a higher rate of genemovement to other chromosomal arms than any of the autosomes Chromo-somal elements are labeled around the perimeter internal bands are coloredaccording to the chromosomal element source and match element colors in(A) and (C) Bands are sized to indicate the relative ratio of genes importedversus exported for each chromosomal element and the relative allocation ofexported genes to other elements

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 6: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

gene repertoire are relatively conserved across thegenus CAFE 3 (18) analyses estimated that themost recent common ancestor of the anophelineshad approximately 60 genes in each of the ORand GR families similar to most extant anophe-lines (Fig 4C and fig S20) Estimated gainlossrates of OR and GR genes per million years(error-corrected l = 13 times 10minus3 for ORs and 20 times10minus4 for GRs) were much lower than the overalllevel of anopheline gene families Similarly wefound almost the same number of antennae-expressed IRs (~20) in all anopheline genomesDespite overall conservation in chemosensorygene numbers we observed several examples ofgene gain and loss in specific lineages Notablythere was a net gain of at least 12 ORs in thecommon ancestor of the An gambiae complex(Fig 4C)OR and GR gene repertoire stability may de-

rive from their roles in several critical behaviorsHost preference differences are likely to be gov-erned by a combination of functional divergenceand transcriptional modulation of orthologs Thismodel is supported by studies of antennal tran-scriptomes in the major malaria vector Angambiae (35) and comparisons between thisvector and its morphologically identical siblingAn quadriannulatus (36) a very closely relatedspecies that plays no role inmalaria transmission(despite vectorial competence) because it does

not specialize on human hosts Furthermore wefound that many subfamilies of ORs and GRsshowed evidence of positive selection (19 of 53ORs 17 of 59 GRs) across the genus suggestingpotential functional divergenceSeveral blood feedingndashrelated behaviors in

mosquitoes are also regulated by peptide hor-mones (37) These peptides are synthesized pro-cessed and released fromnervous and endocrinesystems and elicit their effects through bindingappropriate receptors in target tissues (38) Intotal 39 peptide hormones were identified fromeach of the sequenced anophelines (fig S21)Notably no ortholog of the well-characterizedhead peptide (HP) hormone of the culicine mos-quito Aedes aegypti was identified in any of theassemblies In Ae aegypti HP is responsible forinhibiting host-seeking behavior after a bloodmeal (39) Because anophelines broadly exhibitsimilar behavior (40) the absence ofHP from theentire clade suggests they may have evolved anovelmechanism to inhibit excess blood feedingSimilarly no ortholog of insulin growth factor 1(IGF1) was identified in any anophelines eventhough IGF1 orthologs have been identified inother dipterans includingDmelanogaster (41)and Ae aegypti (42) IGF1 is a key componentof the insulininsulin growth factor 1 signaling(IIS) cascade which regulates processes includ-ing innate immunity reproduction metabolism

and life span (43) Nevertheless other membersof the IIS cascade are present and four insulin-like peptides are found in a compact cluster withgene arrangements conserved across anophelines(fig S22) This raises questions regarding themodification of IIS signaling in the absence ofIGF1 and the functional importance of this con-served genomic arrangementEpigenetic mechanisms affect many biological

processes by modulation of chromatin structuretelomere remodeling and transcriptional con-trol Of the 215 epigenetic regulatory genes inD melanogaster (44) we identified 169 putativeAn gambiae orthologs (table S32) which sug-gested the presence of mechanisms of epigeneticcontrol in Anopheles and Drosophila We findhowever that retrotransposition may have con-tributed to the functional divergence of at leastone gene associated with epigenetic regulationThe ubiquitin-conjugating enzyme E2D (ortholo-gous to effete (45) in D melanogaster) duplicatedvia retrotransposition in an early anopheline an-cestor and the retrotransposed copy is main-tained in a subset of anophelines Although theentire amino acid sequence of E2D is perfectlyconserved between An gambiae and D mela-nogaster the retrogenes are highly divergent(Fig 5A) and may contribute to functional di-versification within the genusSaliva is integral to blood feeding it impairs

host hemostasis and also affects inflammationand immunity In An gambiae the salivary pro-teome is estimated to contain the products ofat least 75 genes most being expressed solely inthe adult female salivary glands Comparativeanalyses indicate that anopheline salivary pro-teins are subject to strong evolutionary pres-sures and these genes exhibit an acceleratedpace of evolution as well as a very high rate ofgainloss (Fig 3 and fig S23) Polymorphismswithin An gambiae populations from limitedsets of salivary genes were previously found tocarry signatures of positive selection (46) Se-quence analysis across the anophelines showsthat salivary genes have the highest incidenceof positively selected codons among the sevengene classes (fig S24) indicating that coevolu-tion with vertebrate hosts is a powerful driverof natural selection in salivary proteomes More-over salivary proteins also exhibit functionaldiversification through new gene creation Se-quence similarity intron-exon boundaries andsecondary structure prediction point to the birthof the SG7SG7-2 inflammation-inhibiting (47)gene family from the genomic region encodingthe C terminus of the 30-kD protein (Fig 5B) acollagen-binding platelet inhibitor already presentin the blood-feeding ancestor of mosquitoes andblack flies (48) Based on phylogenetic representa-tion these events must have occurred before theradiation of anophelines but after separation fromthe culicinesResistance to insecticides and other xeno-

biotics has arisen independently inmany anoph-eline species fostered directly and indirectly byanthropogenic environmentalmodificationMeta-bolic resistance to insecticides is mediated by

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-5

Fig 3 Contrasting evolutionaryproperties of selected genefunctional categories Examinedevolutionary properties of ortholo-gous groups of genes include ameasure of amino acid conservationdivergence (evolutionary rate) ameasure of selective pressure(dNdS) a measure of geneduplication in terms of mean genecopy-number per species (number ofgenes) and a measure of orthologuniversality in terms of number ofspecies with orthologs (number ofspecies) Notched box plots showmedians and extend to the first andthird quartiles their widths areproportional to the number oforthologous groups in each func-tional category Functional categoriesderive from curated lists associatedwith various functionsprocesses aswell as annotated Gene Ontology orInterPro categories (denoted byasterisks)

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 7: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

multiple gene families including cytochromeP450s and glutathione S-transferases (GSTs)which serve to generally protect against all en-vironment stresses both natural and anthropo-genic We manually characterized these genefamilies in seven anophelines spanning the genusDespite their large size gene numbers (87 to 104P450 genes 27 to 30 GST genes) within bothgene families are highly conserved across all spe-cies although lineage-specific gene duplicationsand losses are often seen (tables S33 and S34) Aswith the OR and GR olfaction-related gene fam-ilies P450 and GST repertoires may be relativelyconstant due to the large number of roles theyplay in anopheline biology Orthologs of genesassociated with insecticide resistance either viaup-regulation or coding variation [eg Cyp6m2Cyp6p3 (Cyp6p9 in An funestus) Gste2 andGste4] were found in all species which suggestedthat virtually all anophelines likely have genescapable of conferring insecticide resistance throughsimilar mechanisms Unexpectedly one memberof the P450 family (Cyp18a1) with a conservedrole in ecdysteroid catabolism [and consequentlydevelopment and metamorphosis (49)] appears

to have been lost from the ancestor of the Angambiae species complex but is found in the ge-nome and transcriptome assemblies of otherspecies which indicates that the An gambiaecomplex may have recently evolved an alternatemechanism for catabolizing ecdysoneSusceptibility to malaria parasites is a key

determinant of vectorial capacity Dissectingthe immune repertoire (50 51) (table S35) intoits constituent phases reveals that classical rec-ognition genes and genes encoding effector en-zymes exhibit relatively low levels of sequencedivergence Signal transducers are more diver-gent in sequence but are conserved in repre-sentation across species and rarely duplicatedCascade modulators although also divergentare more lineage-specific and generally havemore gene duplications (Fig 3 and fig S25) Arare duplication of an immune signal transduc-tion gene occurred through the retrotranspo-sition of the signal transducer and activator oftranscription STAT2 to form the intronless STAT1after the divergence ofAn dirus and An farautifrom the rest of the subgenus Cellia (Fig 5C andtable S36) It is interesting that an independent

retrotransposition event appears to have cre-ated another intronless STAT gene in the Anatroparvus lineage In An gambiae STAT1 con-trols the expression of STAT2 and is activated inresponse to bacterial challenge (52 53) and theSTAT pathway has been demonstrated to me-diate immunity to Plasmodium (53 54) so thepresence of these relatively new immune signaltransducers may have allowed for rewiring of reg-ulatory networks governing immune responsesin this subset of anophelines

Conclusion

Since the discovery over a century ago by RonaldRoss and Giovanni Battista Grassi that humanmalaria is transmitted by a narrow range of blood-feeding female mosquitoes the biological basisof malarial vectorial capacity has been a matterof intense interest Inasmuch as previous suc-cesses in the local elimination of malaria havealways been accomplished wholly or in partthrough effective vector control an increased un-derstanding of vector biology is crucial for con-tinued progress against malarial disease These16 new reference genome assemblies provide a

1258522-6 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

Fig 4 Phylogeny-based insights into anopheline biology (A) Maximum-likelihood amino acidndashbased phylogenetic tree of three transglutaminase en-zymes (TG1 green TG2 yellow and TG3 red) in 14 anopheline species with Culexquinquefasciatus (Cxqu) Ae aegypti (Aeae) and D melanogaster (Dmel) servingas outgroups TG3 is the enzyme responsible for the formation of themale matingplug in An gambiae acting upon the substrate Plugin the most abundant matingplug protein Higher rates of evolution for plug-forming TG3 are supported byelevated levels of dN Mating plug phenotypes are noted where known within theTG3 clade (B) Concerted evolution in CPFL cuticular proteins Species symbols used are the same as in (A) In contrast to the TG1TG2TG3 phylogeny CPFLparalogs cluster by subgeneric clades rather than individually recapitulating the species phylogeny Gene family size variation among speciesmay reflect bothgene gainloss and variation in gene set completeness (C) OR observed gene counts and inferred ancestral gene counts on an ultrametric phylogeny At least10 OR genes were gained on the branch leading to the common ancestor of the An gambiae species complex although the overall number of OR genes doesnot vary dramatically across the genus

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 8: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

foundation for additional hypothesis generationand testing to further our understanding of thediverse biological traits that determine vectorialcapacity

REFERENCES AND NOTES

1 D P Kwiatkowski How malaria has affected the humangenome and what human genetics can teach us about malariaAm J Hum Genet 77 171ndash192 (2005) doi 101086432519pmid 16001361

2 A Cohuet C Harris V Robert D Fontenille Evolutionaryforces on Anopheles What makes a malaria vector TrendsParasitol 26 130ndash136 (2010) doi 101016jpt200912001pmid 20056485

3 S Manguin P Carnivale J Mouchet Biodiversity of Malaria inthe World (John Libbey Eurotext Paris 2008)

4 R A Holt et al The genome sequence of the malaria mosquitoAnopheles gambiae Science 298 129ndash149 (2002)doi 101126science1076181 pmid 12364791

5 O Marinotti et al The genome of Anopheles darlingi the mainneotropical malaria vector Nucleic Acids Res 41 7387ndash7400(2013) doi 101093nargkt484 pmid 23761445

6 D Zhou et al Genome sequence of Anopheles sinensisprovides insight into genetics basis of mosquito competencefor malaria parasites BMC Genomics 15 42 (2014)pmid 24438588

7 X Jiang et al Genome analysis of a major urban malariavector mosquito Anopheles stephensi Genome Biol 15 459(2014) doi 101186s13059-014-0459-2 pmid 25244985

8 D E Neafsey et al The evolution of the Anopheles 16 genomesproject Genes Genomes Genet 3 1191ndash1194 (2013)doi 101534g3113006247 pmid 23708298

9 M C Fontaine et al Extensive introgression in a malariavector species complex revealed by phylogenomics Science346 1258524 (2014)

10 M Moreno et al Complete mtDNA genomes of Anophelesdarlingi and an approach to anopheline divergence timeMalar J 9 127 (2010) doi 1011861475-2875-9-127pmid 20470395

11 S Gnerre et al High-quality draft assemblies of mammaliangenomes from massively parallel sequence data Proc NatlAcad Sci USA 108 1513ndash1518 (2011) doi 101073pnas1017351108 pmid 21187386

12 Materials and methods are available as supplementarymaterials on Science Online

13 C Holt M Yandell MAKER2 An annotation pipeline andgenome-database management tool for second-generationgenome projects BMC Bioinformatics 12 491 (2011)doi 1011861471-2105-12-491 pmid 22192575

14 M Coluzzi A Sabatini A della Torre M A Di DecoV Petrarca A polytene chromosome analysis of the Anophelesgambiae species complex Science 298 1415ndash1418 (2002)doi 101126science1077769 pmid 12364623

15 M Kamali et al Multigene phylogenetics reveals temporaldiversification of major African malaria vectors PLOS ONE 9e93580 (2014) doi 101371journalpone0093580 pmid 24705448

16 M A Toups M W Hahn Retrogenes reveal the direction ofsex-chromosome evolution in mosquitoes Genetics 186763ndash766 (2010) doi 101534genetics110118794pmid 20660646

17 D A Baker S Russell Role of testis-specific gene expressionin sex-chromosome evolution of Anopheles gambiae Genetics189 1117ndash1120 (2011) doi 101534genetics111133157pmid 21890740

18 M V Han G W C Thomas J Lugo-Martinez M W HahnEstimating gene gain and loss rates in the presence of error ingenome assembly and annotation using CAFE 3 Mol Biol Evol30 1987ndash1997 (2013) doi 101093molbevmst100pmid 23709260

19 M W Hahn M V Han S-G Han Gene family evolution across 12Drosophila genomes PLOS Genet 3 e197 (2007) doi 101371journalpgen0030197 pmid 17997610

20 Z Yang PAML 4 Phylogenetic analysis by maximumlikelihood Mol Biol Evol 24 1586ndash1591 (2007) doi 101093molbevmsm088 pmid 17483113

21 T Dottorini et al A genome-wide analysis in Anophelesgambiae mosquitoes reveals 46 male accessory gland genespossible modulators of female behavior Proc Natl AcadSci USA 104 16215ndash16220 (2007) doi 101073pnas0703904104 pmid 17901209

22 F Baldini P Gabrieli D W Rogers F Catteruccia Functionand composition of male accessory gland secretions inAnopheles gambiae A comparison with other insect vectors ofinfectious diseases Pathog Glob Health 106 82ndash93 (2012)doi 1011792047773212Y0000000016 pmid 22943543

23 R Assis Q Zhou D Bachtrog Sex-biased transcriptomeevolution in Drosophila Genome Biol Evol 4 1189ndash1200(2012) doi 101093gbeevs093 pmid 23097318

24 S Grath J Parsch Rate of amino acid substitution isinfluenced by the degree and conservation of male-biasedtranscription over 50 myr of Drosophila evolutionGenome Biol Evol 4 346ndash359 (2012) doi 101093gbeevs012 pmid 22321769

25 J C Perry P W Harrison J E Mank The ontogeny andevolution of sex-biased gene expression in Drosophilamelanogaster Mol Biol Evol 31 1206ndash1219 (2014)doi 101093molbevmsu072 pmid 24526011

26 D W Rogers et al Transglutaminase-mediated semencoagulation controls sperm storage in the malariamosquito PLOS Biol 7 e1000272 (2009) doi 101371journalpbio1000272 pmid 20027206

27 F Baldini et al The interaction between a sexually transferredsteroid hormone and a female protein regulates oogenesisin the malaria mosquito Anopheles gambiae PLOS Biol 11e1001695 (2013) doi 101371journalpbio1001695pmid 24204210

28 W R Shaw et al Mating activates the heme peroxidase HPX15in the sperm storage organ to ensure fertility in Anophelesgambiae Proc Natl Acad Sci USA 111 5854ndash5859 (2014)doi 101073pnas1401715111 pmid 24711401

29 J H Willis Structural cuticular proteins from arthropodsAnnotation nomenclature and sequence characteristics in thegenomics era Insect Biochem Mol Biol 40 189ndash204(2010) doi 101016jibmb201002001 pmid 20171281

30 R S Cornman et al Annotation and analysis of a largecuticular protein family with the RampR Consensus inAnopheles gambiae BMC Genomics 9 22 (2008) doi 1011861471-2164-9-22 pmid 18205929

31 R S Cornman J H Willis Extensive gene amplification andconcerted evolution within the CPR family of cuticularproteins in mosquitoes Insect Biochem Mol Biol 38661ndash676 (2008) doi 101016jibmb200804001pmid 18510978

32 R S Cornman J H Willis Annotation and analysis oflow-complexity protein families of Anopheles gambiae thatare associated with cuticle Insect Mol Biol 18 607ndash622(2009) doi 101111j1365-2583200900902xpmid 19754739

33 R S Cornman Molecular evolution of Drosophila cuticularprotein genes PLOS ONE 4 e8345 (2009) doi 101371journalpone0008345 pmid 20019874

34 T Togawa W A Dunn A C Emmons J Nagao J H WillisDevelopmental expression patterns of cuticular protein genes

SCIENCE sciencemagorg 2 JANUARY 2015 bull VOL 347 ISSUE 6217 1258522-7

Fig 5 Genesis of novel anopheline genes (A) Retrotransposition of theE2Deffete gene generated a ubiquitin-conjugating enzyme at the base of thegenus which exhibits much higher sequence divergence than the originalmultiexon gene WebLogo plots contrast the amino acid conservation of theoriginal effete gene with the diversification of the retrotransposed copy (residues38 to 75 species represented areAnminimusAn dirusAn funestusAn farautiAn atroparvusAn sinensisAn darlingi andAn albimanus) (B) TheSG7 salivaryproteinndashencodinggenewas generated from theC-terminal half of the 30-kDgene

SG7 then underwent tandem duplication and intron loss to generate anothersalivary protein SG7-2 Numerals indicate lengths of segments in base pairs (C)The origin of STAT1 a signal transducer and activator of transcription geneinvolved in immunity occurred through a retrotransposition event in the Celliaancestor after divergence from An dirus and An farauti The intronless STAT1 ismuch more divergent than its multiexon progenitor STAT2 and has been main-tained in all descendant species An independent retrotransposition event createda retrogene copy inAn atroparvuswhich is alsomoredivergent than its progenitor

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 9: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

with the RampR Consensus from Anopheles gambiae InsectBiochem Mol Biol 38 508ndash519 (2008) doi 101016jibmb200712008 pmid 18405829

35 D C Rinker et al Blood meal-induced changes to antennaltranscriptome profiles reveal shifts in odor sensitivities inAnopheles gambiae Proc Natl Acad Sci USA 110 8260ndash8265(2013) doi 101073pnas1302562110 pmid 23630291

36 D C Rinker et al Antennal transcriptome profiles ofanopheline mosquitoes reveal human host olfactoryspecialization in Anopheles gambiae BMC Genomics 14749 (2013) doi 1011861471-2164-14-749 pmid 24182346

37 M Altstein D R Naumlssel Neuropeptide signaling in insectsAdv Exp Med Biol 692 155ndash165 (2010) doi 101007978-1-4419-6902-6_8 pmid 21189678

38 J P Goetze I Hunter S K Lippert L Bardram J F RehfeldProcessing-independent analysis of peptide hormones andprohormones in plasma Front Biosci (Landmark Ed) 171804ndash1815 (2012) doi 1027414020 pmid 22201837

39 T H Stracker S Thompson G L Grossman M A RiehleM R Brown Characterization of the AeaHP gene and itsexpression in the mosquito Aedes aegypti (Diptera Culicidae)J Med Entomol 39 331ndash342 (2002) doi 1016030022-2585-392331 pmid 11931033

40 C D de Oliveira W P Tadei F C Abdalla P F PaolucciPimenta O Marinotti Multiple blood meals in Anophelesdarlingi (Diptera Culicidae) J Vector Ecol 37 351ndash358(2012) doi 101111j1948-7134201200238x pmid 23181859

41 N Okamoto et al A fat body-derived IGF-like peptide regulatespostfeeding growth in Drosophila Dev Cell 17 885ndash891(2009) doi 101016jdevcel200910008 pmid 20059957

42 M A Riehle Y Fan C Cao M R Brown Molecularcharacterization of insulin-like peptides in the yellow fevermosquito Aedes aegypti Expression cellular localization andphylogeny Peptides 27 2547ndash2560 (2006) doi 101016jpeptides200607016 pmid 16934367

43 Y Antonova A J Arik W Moore M M Riehle M R Brown inInsect Endocrinology L I Gilbert Ed (Academic Press 2012)pp 63ndash92

44 A Swaminathan A Gajan L A Pile Epigenetic regulation oftranscription in Drosophila Front Biosci (Landmark Ed) 17909ndash937 (2012) doi 1027413964 pmid 22201781

45 F Cipressa et al Effete a Drosophila chromatin-associatedubiquitin-conjugating enzyme that affects telomeric andheterochromatic position effect variegation Genetics195 147ndash158 (2013) doi 101534genetics113153320pmid 23821599

46 B Arcagrave et al Positive selection drives accelerated evolutionof mosquito salivary genes associated with blood-feedingInsect Mol Biol 23 122ndash131 (2014) doi 101111imb12068pmid 24237399

47 H Isawa M Yuda Y Orito Y Chinzei A mosquito salivaryprotein inhibits activation of the plasma contact system bybinding to factor XII and high molecular weight kininogenJ Biol Chem 277 27651ndash27658 (2002) doi 101074jbcM203505200 pmid 12011093

48 E Calvo et al Aegyptin a novel mosquito salivary glandprotein specifically binds to collagen and prevents itsinteraction with platelet glycoprotein VI integrinalpha2beta1 and von Willebrand factor J Biol Chem 28226928ndash26938 (2007) doi 101074jbcM705669200pmid 17650501

49 E Guittard et al CYP18A1 a key enzyme of Drosophilasteroid hormone inactivation is essential for metamorphosisDev Biol 349 35ndash45 (2011) doi 101016jydbio201009023pmid 20932968

50 R M Waterhouse et al Evolutionary dynamics of immune-relatedgenes and pathways in disease-vector mosquitoes Science 3161738ndash1743 (2007) doi 101126science1139862pmid 17588928

51 L C Bartholomay et al Pathogenomics of Culexquinquefasciatus and meta-analysis of infection responses todiverse pathogens Science 330 88ndash90 (2010) doi 101126science1193162 pmid 20929811

52 C Barillas-Mury Y S Han D Seeley F C KafatosAnopheles gambiae Ag-STAT a new insect member of theSTAT family is activated in response to bacterial infection

EMBO J 18 959ndash967 (1999) doi 101093emboj184959pmid 10022838

53 L Gupta et al The STAT pathway mediates late-phaseimmunity against Plasmodium in the mosquito Anophelesgambiae Cell Host Microbe 5 498ndash507 (2009) doi 101016jchom200904003 pmid 19454353

54 A C Bahia et al The JAK-STAT pathway controls Plasmodiumvivax load in early stages of Anopheles aquasalis infectionPLOS Negl Trop Dis 5 e1317 (2011) doi 101371journalpntd0001317 pmid 22069502

ACKNOWLEDGMENTS

All sequencing reads and genome assemblies have been submittedto the National Center for Biotechnology Information (umbrellaBioProject ID PRJNA67511) Genome and transcriptomeassemblies are also available from VectorBase (httpsvectorbaseorg) and the Broad Institute (httpsolivebroadinstituteorgcollectionsanopheles4) The authorsacknowledge the NIH Eukaryotic Pathogen and Disease VectorSequencing Project Working Group for guidance anddevelopment of this project Sequence data generation wassupported at the Broad Institute by the National HumanGenome Research Institute (U54 HG003067) We thank themany members of the Broad Institute Genomics Platform andGenome Sequencing and Analysis Program who contributed tosequencing data generation and analysis

SUPPLEMENTARY MATERIALS

wwwsciencemagorgcontent34762171258522supplDC1Materials and MethodsSupplementary TextFigs S1 to S25Tables S1 to S36References (55ndash248)

9 July 2014 accepted 14 November 2014Published online 27 November 2014101126science1258522

1258522-8 2 JANUARY 2015 bull VOL 347 ISSUE 6217 sciencemagorg SCIENCE

RESEARCH | RESEARCH ARTICLEon June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 10: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

mosquitoesAnophelesHighly evolvable malaria vectors The genomes of 16

Riehle Igor V Sharakhov Zhijian Tu Laurence J Zwiebel and Nora J BesanskyKafatos Manolis Kellis Daniel Lawson Christos Louis Shirley Luckhart Marc A T Muskavitch Joseacute M Ribeiro Michael ADonnelly Scott J Emrich Michael C Fontaine William Gelbart Matthew W Hahn Immo A Hansen Paul I Howell Fotis C Zhou Flaminia Catteruccia George K Christophides Frank H Collins Robert S Cornman Andrea Crisanti Martin JJohn Vontas Catherine Walton Craig S Wilding Judith H Willis Yi-Chieh Wu Guiyun Yan Evgeny M Zdobnov Xiaofan Vladimir Stegniy Claudio J Struchiner Gregg W C Thomas Marta Tojo Pantelis Topalis Joseacute M C Tubio Maria F UngerSagnon Maria V Sharakhova Terrance Shea Felipe A Simatildeo Frederic Simard Michel A Slotman Pradya Somboon Anil Prakash David P Price Ashok Rajaraman Lisa J Reimer David C Rinker Antonis Rokas Tanya L Russell NFaleChioma Oringanje Mohammad A Oshaghi Nazzy Pakpour Philippos A Papathanos Ashley N Peery Michael Povelones N Mitchell Wendy Moore Katherine A Murphy Anastasia N Naumenko Tony Nolan Eva M Novoa Samantha OLoughlinErnesto Lowy Robert M MacCallum Chunhong Mao Gareth Maslen Charles Mbogo Jenny McCarthy Kristin Michel Sara Kirmitzoglou Lizette L Koekemoer Njoroge Laban Nicholas Langridge Mara K N Lawniczak Manolis Lirakis Neil F LoboXiaofang Jiang Irwin Jungreis Evdoxia G Kakani Maryam Kamali Petri Kemppainen Ryan C Kennedy Ioannis K Gabriel Wamdaogo M Guelbeogo Andrew B Hall Mira V Han Thaung Hlaing Daniel S T Hughes Adam M JenkinsChiu Mikkel Christensen Carlo Costantini Victoria L M Davidson Elena Deligianni Tania Dottorini Vicky Dritsou Stacey B Birren Stephanie A Blandin Andrew I Brockman Thomas R Burkot Austin Burt Clara S Chan Cedric Chauve Joanna CJames Amon Bruno Arcagrave Peter Arensburger Gleb Artemov Lauren A Assour Hamidreza Basseri Aaron Berlin Bruce W Daniel E Neafsey Robert M Waterhouse Mohammad R Abai Sergey S Aganezov Max A Alekseyev James E Allen

originally published online November 27 2014DOI 101126science1258522 (6217) 1258522347Science

this issue 101126science1258524 and 101126science1258522 see also p 27Sciencehuman malaria (see the Perspective by Clark and Messer)reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting

et al and Fontaine et allife-threatening infection By sequencing the genomes of several mosquitoes in depth Neafsey among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially

Virtually everyone has first-hand experience with mosquitoes Few recognize the subtle biological distinctionsMosquito adaptability across genomes

ARTICLE TOOLS httpsciencesciencemagorgcontent34762171258522

MATERIALSSUPPLEMENTARY httpsciencesciencemagorgcontentsuppl20141125science1258522DC1

CONTENTRELATED httpsciencesciencemagorgcontentsci347621727full

REFERENCES

httpsciencesciencemagorgcontent34762171258522BIBLThis article cites 238 articles 62 of which you can access for free

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from

Page 11: MOSQUITO GENOMICS Highlyevolvable … ARTICLE SUMMARY MOSQUITO GENOMICS Highlyevolvable malariavectors:The genomes of 16 Anopheles mosquitoes Daniel E. Neafsey,*† Robert M. Waterhouse,*

PERMISSIONS httpwwwsciencemagorghelpreprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAASSciencelicensee American Association for the Advancement of Science No claim to original US Government Works The title Science 1200 New York Avenue NW Washington DC 20005 2017 copy The Authors some rights reserved exclusive

(print ISSN 0036-8075 online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

on June 10 2018

httpsciencesciencemagorg

Dow

nloaded from