Comparative Genomics of Aspergilli William Nierman TIGR

Page 1: Comparative Genomics of Aspergilli William Nierman TIGR

Comparative Genomics of Aspergilli

William Nierman, TIGR

Page 2: Comparative Genomics of Aspergilli William Nierman TIGR

Electrophoretic Karyotyping, 5 day run

CHEF DRII, 1.2% CGA, 1x TAE, 14 °C, 1.8 V/cm: 2200 s, 48 h; 2200-1800 s, 68 h. Sizes in Mb.

[Gel image. Lane labels: Sc, Sp, Af. Band size labels as listed: 5.7, 4.6, 3.5; 5.0, 1x 4.0, 3.5, 1.8.]

Page 3: Comparative Genomics of Aspergilli William Nierman TIGR

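A. fumigatus Chromosomes

[Figure: chromosome diagram with centromeric areas and telomeres marked; label "~35 copies rDNA" next to chromosome 4.]

Chromosome labels as listed: 3, 2, 6, 5, 8, 7, 1, 4

Size (Mb) as listed: 4.891, 4.834, 4.018, 3.933, 3.922, 3.779, 2.021, 1.789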

Page 4: Comparative Genomics of Aspergilli William Nierman TIGR

Centromeres and Telomeres

• Telomere repeat TTAGGG, 7-21 repeat units

– Subtelomeric regions: identical sequences for several kb, helicase pseudogenes, 7 secondary metabolite clusters, niche adaptation role? (Mark Farman)

• Centromeres

– Uncloned in shotgun libraries; 36.2 - 55.9 kb

– Flanked on each side by low complexity AT rich repeat region

– Chromosome 2 centromere: 12 kb PCR product 75% AT, overall centromeric AT of 63%, 40 kb.
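The two kinds of numbers on this slide (telomere repeat unit counts and AT percentages) can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical and are not taken from the A. fumigatus assembly.

```python
def count_terminal_repeats(seq: str, unit: str = "TTAGGG") -> int:
    """Count consecutive copies of `unit` at the 3' end of `seq`."""
    seq = seq.upper()
    count = 0
    end = len(seq)
    # Walk backwards one repeat unit at a time while the unit matches.
    while end >= len(unit) and seq[end - len(unit):end] == unit:
        count += 1
        end -= len(unit)
    return count


def at_percent(seq: str) -> float:
    """Return the percentage of A and T bases in `seq`."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# Toy chromosome end (hypothetical): arbitrary sequence plus 7 telomere units.
toy_end = "ACGTACGT" + "TTAGGG" * 7
print(count_terminal_repeats(toy_end))
print(at_percent("AATTAATTGC"))
```

On real data the same AT calculation would be run over a sliding window to locate AT rich regions such as the centromere flanks; the window size would be a choice for the analyst.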

Page 5: Comparative Genomics of Aspergilli William Nierman TIGR

Annotation Pipeline

Eukaryotic Genome Control (EGC) is the annotation pipeline responsible for processing genomic sequence.

Finished chromosome sequences

Masked genomic sequence

Gene prediction, Protein alignments, EST alignments

Optimize Predictions

Page 6: Comparative Genomics of Aspergilli William Nierman TIGR

Training Data

Gene and splice site predictors, including Glimmer, Exonomy, Unveil, Phat and GeneSplicer, were trained with the following experimental data:

– 625 full-length cDNAs and 42 partials from 589 loci in 19 Aspergillus species

– 2,633 A. fumigatus ESTs from UK and Spanish collaborators

Page 7: Comparative Genomics of Aspergilli William Nierman TIGR

Optimize Predictions

Combiner combines gene model evidence from:

• Gene prediction programs

• Splice site prediction programs

• Alignments from protein, cDNA and EST databases

and generates the final gene model.

All genes were manually reviewed, and the observed splits and merges were corrected.

Page 8: Comparative Genomics of Aspergilli William Nierman TIGR

Annotation Station Screenshot

Brown 2, Brown 1, Yellowish-green

1,3,6,8-tetrahydroxynaphthalene reductase

Scytalone dehydratase

Polyketide synthetase

Page 9: Comparative Genomics of Aspergilli William Nierman TIGR

Gene Summary Statistics

Chromosome                           AFU        ANA        AOA
Size                            28635699   30068514   36746653
GC Content                          49.9       50.3       48.3
# of Genes                          9746       9967      14063
Mean Gene Length                  1442.4     1535.9     1177.5
Gene Density                      2938.2     3016.8       2613
Percent of Coding                   49.1       50.9       45.1
Percent Genes with Introns          75.8       88.7       80.7

Exons                                AFU        ANA        AOA
Number                             26181      36249      40133
Mean # per Gene                      2.7        3.6        2.9
GC Content                            54       53.4         52
Mean Length (bp)                   536.9      422.3      412.6
Total Length (bp)               14057166   15308196   16559586

Introns                              AFU        ANA        AOA
Number                             16432      26282      26070
GC Content                          46.3       46.1       45.5
Mean Length (bp)                   121.8      104.6      129.7
Total Length (bp)                2000799    2748240    3380731

Intergenic Regions                   AFU        ANA        AOA
GC Content                            46       47.5       45.3
Mean Length (bp)                  1276.4     1159.5     1174.3

Functional Annotation                AFU        ANA        AOA
# of Genes w/PFAM Hits              4403       4512       5306
# of Genes with Computed Families   4603       4536       6263
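Several derived rows of the Gene Summary Statistics table appear to follow directly from the raw counts. The sketch below recomputes them; the reading of each row (for example, that gene density is genome size divided by gene count, and that mean gene length counts exonic bases only) is inferred from the numbers and is not stated on the slide.

```python
# Raw values copied from the Gene Summary Statistics table.
genomes = {
    "AFU": {"size": 28635699, "genes": 9746, "exons": 26181, "exon_bp": 14057166},
    "ANA": {"size": 30068514, "genes": 9967, "exons": 36249, "exon_bp": 15308196},
    "AOA": {"size": 36746653, "genes": 14063, "exons": 40133, "exon_bp": 16559586},
}


def derived_stats(g):
    """Recompute derived rows under the assumed definitions."""
    return {
        "gene_density": round(g["size"] / g["genes"], 1),        # bp per gene
        "percent_coding": round(100 * g["exon_bp"] / g["size"], 1),
        "mean_gene_length": round(g["exon_bp"] / g["genes"], 1),  # exonic bp per gene
        "mean_exons_per_gene": round(g["exons"] / g["genes"], 1),
        "mean_exon_length": round(g["exon_bp"] / g["exons"], 1),
    }


for name, values in genomes.items():
    print(name, derived_stats(values))
```

Under these assumed definitions the recomputed values agree with the table for all three genomes, which suggests that "Mean Gene Length" refers to coding length rather than the genomic span including introns.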

Page 10: Comparative Genomics of Aspergilli William Nierman TIGR

Most Common Domains in A. fumigatus

Domain   Domain name                                  # Proteins
PF00172  Fungal Zn(2)-Cys(6) binuclear cluster dom.   147
PF00083  Major facilitator superfamily                109
PF00400  WD domain G-beta repeat                      105
PF00069  Protein kinase domain                        105
PF00106  Oxidoreductase, sh. chain dehydro./reduc.    95
PF00271  Helicase conserved C-terminal domain         75
PF00023  Ankyrin repeat                               64
PF00067  Cytochrome P450                              65
PF00096  Zinc finger C2H2 type                        61
PF00107  Oxidoreductase, Zn-binding dehydrogenase     61
PF00076  RNA recognition motif                        59
PF00005  ABC transporter                              51
PF00501  AMP-binding enzyme                           44
PF00270  DEAD/DEAH box helicase                       39
PF01360  Monooxygenase                                39

Page 11: Comparative Genomics of Aspergilli William Nierman TIGR

Synteny Map of A. fumigatus and A. nidulans

Page 12: Comparative Genomics of Aspergilli William Nierman TIGR

Synteny Map of A. fumigatus and A. oryzae

Page 13: Comparative Genomics of Aspergilli William Nierman TIGR

Synteny Map of A. fumigatus, A. nidulans, A. oryzae

Page 14: Comparative Genomics of Aspergilli William Nierman TIGR

Overview – Comparative Statistics

Orthologs were computed by performing an all vs. all BlastP of the three proteomes with an E-value cut-off of 1e-15 (no length requirement). The mutual best hits were then organized into clusters based on shared protein nodes.

COG        A. fumigatus  A. oryzae  A. nidulans  avg_pctid  avg_coverage  num_cogs
3 member   +             +          +            70%        86%           5899
2 member   +             +                       65%        84%           967
2 member   +                        +            61%        79%           533
2 member                 +          +            61%        80%           936

Species        # genes included in COG   Percent of predicted proteome
A. fumigatus   7507                      79%
A. nidulans    7429                      75%
A. oryzae      7988                      57%
Total          22924                     68% (22924/33552)
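The mutual-best-hit clustering described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the TIGR implementation: the `species|protein` ID convention, the use of bit score to pick the best hit, and all function names are hypothetical.

```python
from collections import defaultdict


def species_of(protein_id):
    # Hypothetical ID convention: "<species>|<protein>".
    return protein_id.split("|")[0]


def mutual_best_hits(hits, max_evalue=1e-15):
    """Return mutual best-hit pairs between proteins of different species.

    hits: iterable of (query, subject, evalue, bitscore) tuples.
    """
    best = {}  # (query, subject species) -> (bitscore, subject)
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue or species_of(query) == species_of(subject):
            continue
        key = (query, species_of(subject))
        if key not in best or bitscore > best[key][0]:
            best[key] = (bitscore, subject)

    pairs = set()
    for (query, _), (_, subject) in best.items():
        back = best.get((subject, species_of(query)))
        if back is not None and back[1] == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def cluster(pairs):
    """Group proteins into clusters that share nodes (connected components)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair in pairs:
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy hit table (hypothetical IDs and scores).
hits = [
    ("AFU|a1", "ANA|n1", 1e-50, 200), ("ANA|n1", "AFU|a1", 1e-50, 200),
    ("AFU|a1", "AOA|o1", 1e-40, 180), ("AOA|o1", "AFU|a1", 1e-40, 180),
    ("ANA|n1", "AOA|o1", 1e-45, 190), ("AOA|o1", "ANA|n1", 1e-45, 190),
    ("AFU|a2", "ANA|n2", 1e-20, 90), ("ANA|n2", "AFU|a2", 1e-20, 90),
    ("AFU|a3", "ANA|n2", 1e-5, 40),  # fails the E-value cut-off
]
clusters = cluster(mutual_best_hits(hits))
print(clusters)
```

In this toy input the first three proteins form a 3 member cluster, the next two a 2 member cluster, and `AFU|a3` stays a singleton because its only hit does not pass the cut-off.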

Page 15: Comparative Genomics of Aspergilli William Nierman TIGR

TIGR Autoannotation vs Sanger Curated Annotation

Status                                        Count
Total Sanger genes analyzed                   360
Same gene structure                           137
Different gene structure                      177
Sanger missing in TIGR annotation             37
Sanger matches multiple TIGR annotations      2
Sanger, TIGR annotations opposite strands     7
TIGR missing in Sanger annotation             12
TIGR matches multiple Sanger annotations      9
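A comparison of this kind can be sketched by classifying each gene model from one annotation against the other set. The category definitions below (for example, "same gene structure" meaning identical exon coordinates on the same strand) are assumptions made for illustration; the slide does not say how the categories were computed, and the data layout is hypothetical.

```python
def overlaps(a, b):
    """True if the genomic spans of two gene models overlap.

    Each model is a dict with "strand" and "exons", where exons is a
    coordinate-sorted list of (start, end) tuples.
    """
    a_start, a_end = a["exons"][0][0], a["exons"][-1][1]
    b_start, b_end = b["exons"][0][0], b["exons"][-1][1]
    return a_start <= b_end and b_start <= a_end


def classify(ref_gene, other_genes):
    """Classify one reference gene model against another annotation set."""
    hits = [g for g in other_genes if overlaps(ref_gene, g)]
    if not hits:
        return "missing in other annotation"
    if len(hits) > 1:
        return "matches multiple annotations"
    hit = hits[0]
    if hit["strand"] != ref_gene["strand"]:
        return "opposite strands"
    if hit["exons"] == ref_gene["exons"]:
        return "same gene structure"
    return "different gene structure"


# Toy example (hypothetical coordinates): second exon start differs.
sanger_gene = {"strand": "+", "exons": [(100, 200), (300, 400)]}
tigr_genes = [{"strand": "+", "exons": [(100, 200), (310, 400)]}]
print(classify(sanger_gene, tigr_genes))
```

Note that the five Sanger-side categories in the table (137 + 177 + 37 + 2 + 7) add up to the 360 genes analyzed.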

Page 16: Comparative Genomics of Aspergilli William Nierman TIGR
Page 17: Comparative Genomics of Aspergilli William Nierman TIGR

Using Ortholog Clusters to Identify Potential Annotation Problems

Page 18: Comparative Genomics of Aspergilli William Nierman TIGR

Using Ortholog Clusters to Identify Potential Annotation Problems

Different exon number due to annotation discrepancy

Page 19: Comparative Genomics of Aspergilli William Nierman TIGR

We need to be able to distinguish annotation inconsistencies from real, interesting phenomena

In some cases, differences in exon number are real
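One simple way to surface candidates for this kind of review is to flag ortholog clusters whose members differ in exon number. The sketch below assumes clusters and exon counts are already available; the gene IDs and the function name are hypothetical. A flag only marks a cluster for inspection, since the difference may be real.

```python
def flag_exon_discrepancies(clusters, exon_counts):
    """Return, for each cluster whose members differ in exon number,
    a dict mapping gene ID to exon count."""
    flagged = []
    for members in clusters:
        counts = {gene: exon_counts[gene] for gene in members}
        if len(set(counts.values())) > 1:
            flagged.append(counts)
    return flagged


# Toy data (hypothetical gene IDs and exon counts).
toy_clusters = [["afu_g1", "ana_g1", "aoa_g1"], ["afu_g2", "ana_g2"]]
toy_exon_counts = {"afu_g1": 3, "ana_g1": 3, "aoa_g1": 5, "afu_g2": 2, "ana_g2": 2}
print(flag_exon_discrepancies(toy_clusters, toy_exon_counts))
```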

Page 20: Comparative Genomics of Aspergilli William Nierman TIGR

Apoptosis in Fungi

• Apoptosis-like process detected in S. cerevisiae, S. pombe, and Aspergilli.

• Fungal genomes lack metazoan upstream machinery.

• Metacaspase-dependent phenotype observed in A. fumigatus and A. nidulans.

• Analysis by Geoff Robson

Page 21: Comparative Genomics of Aspergilli William Nierman TIGR

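Apoptosis in Fungi

DOMAINS

NB-ARC
S. cerevisiae: X
S. pombe: X
A. fumigatus: 57.m05394, 56.m02424, 72.m19821, 66.m04653, asfu05688
A. nidulans: 10025.m00126, 10051.m00442, 10115.m00081, 10157.m00054, 10176.m00005, 10016.m00178, 10150.m00052, 10062.m00136, 10153.m00210
A. oryzae: 20175.m00427, 20175.m00347, 20116.m00078, 20180.m00891, 20167.m00347, 20122.m00102, 20168.m00299

Caspase-activated nuclease
S. cerevisiae: X
S. pombe: X
A. fumigatus: X
A. nidulans: X
A. oryzae: X

CAS/CSE
S. cerevisiae: CSE-1
S. pombe: CSE-1
A. fumigatus: X
A. nidulans: X
A. oryzae: X

MATH
S. cerevisiae: UBPF
S. pombe: UBPF, UBP5
A. fumigatus: 53.m03780, 53.m04162
A. nidulans: 10139.m00184
A. oryzae: 20147.m00277

PROTEIN FAMILY

Metacaspase
S. cerevisiae: MCA1
S. pombe: AL031179
A. fumigatus: 59.m08486, 54.m06827
A. nidulans: 10098.m00299, 10042.m00047, 10062.m00137
A. oryzae: 20149.m0027, 20166.m00204, 20161.m00321

Anti silencing protein 1
S. cerevisiae: ASF1
S. pombe: ASF1
A. fumigatus: 59m.08789
A. nidulans: 10084.m00239
A. oryzae: 20175m.00377

STM1
S. cerevisiae: STM1/MPT4
S. pombe: Q42914
A. fumigatus: X
A. nidulans: X
A. oryzae: X

CDC48p
S. cerevisiae: CDC48
S. pombe: CDC48
A. fumigatus: 72.m19795
A. nidulans: 10124.m00023
A. oryzae: 20134.m00118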

Page 22: Comparative Genomics of Aspergilli William Nierman TIGR

Aspergillus fumigatus Secondary Metabolites

• Heterogeneous group of low molecular weight products.

• Toxic, antibiotic, and immunosuppressant activities.

– fumagillin, gliotoxin (apoptosis and phagocyte dysfunction), fumitremorgin, verruculogen, fumigaclavine, helvolic acid, phthioic acid (granulomas when injected into mice) and sphingofungins

• Virulence properties may be augmented by the numerous secondary metabolites of A. fumigatus.

Page 23: Comparative Genomics of Aspergilli William Nierman TIGR

Secondary Metabolite Genes

Gene type              A. oryzae  A. fumigatus  A. nidulans
PKS                    30         14            27
NRPS                   18         14            14
FAS                    5          1             6
Sesquiterpene cyclase  1          (1)           (1)
DMATS                  2          7             2

Analysis by G. Turner, N. Keller, Dr. Kitamoto, and R. Kulkarni

Page 24: Comparative Genomics of Aspergilli William Nierman TIGR

Precursors and candidate enzymes:

Tryptophan, Proline: NRPS? DMAT synthetase

Tryptophan: DMAT synthetase (X2)

Serine, Phenylalanine: 2 module NRPS?

Terpene: Sesquiterpene cyclase

Products: Gliotoxin, Fumagillin, Fumigaclavines, Fumitremorgins

Page 25: Comparative Genomics of Aspergilli William Nierman TIGR

Gene type              A. oryzae  A. fumigatus  A. nidulans
PKS2                   30         14            27
NRPS                   18         14            14
FAS                    5          1             6
Sesquiterpene cyclase  1          (1)           (1)
DMATS                  2          7             2

Five 2-module NRPS

Page 26: Comparative Genomics of Aspergilli William Nierman TIGR

A. fumigatus Secondary Metabolite Genes

• Few true orthologues across the genus Aspergillus. Each species has its own repertoire.

• Gene/product relationship requires functional analysis in most cases

• Indole alkaloid pathway in A. fumigatus only. Closely related to Claviceps purpurea ergotamine pathway

• Penicillin and aflatoxin pathways are absent.

• A hybrid PKS/monomodular NRPS seems to be present in several fungi.

Page 27: Comparative Genomics of Aspergilli William Nierman TIGR

Identify A. fumigatus specific genes

• A. fumigatus genes (9746)

• All vs. all BlastP of the AFU1, ANA1, AOA1 proteomes, cut-off E value: 1 x e-15, filtering the results for mutual best hits between genomes.

• A. fumigatus singletons (2075)

• BLASTP vs ANA1 and AOA1 proteomes

– E-value > e-5 (808)

– e-5 > E-value > e-10 (203)

• A. fumigatus singletons E-value > e-10 (1081)

• Extend 50bp on both ends of the gene in the genome, Tblastx the genomic seq of the gene vs ana and aoa genomic seq

• A. fumigatus specific gene candidates E-value > e-50

– E-value > e-5 (552)

– e-5 > E-value > e-10 (75)

– e-50 < E-value < e-10 (181)

(1011)
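The E-value binning used in this filtering scheme can be sketched as below. The thresholds (e-5, e-10, e-50) come from the slide; whether each boundary is inclusive, the treatment of genes with no reported hit, and the gene IDs are assumptions made for illustration.

```python
from collections import Counter


def evalue_bin(best_evalue):
    """Assign a gene's best-hit E-value to one of the bins on the slide.

    None means no hit was reported at all (assumed to count as E > e-5).
    """
    if best_evalue is None or best_evalue > 1e-5:
        return "E > e-5"
    if best_evalue > 1e-10:
        return "e-5 > E > e-10"
    if best_evalue > 1e-50:
        return "e-10 > E > e-50"
    return "E <= e-50"


# Toy best-hit table (hypothetical gene IDs and E-values).
best_hits = {"g1": None, "g2": 3e-7, "g3": 1e-30, "g4": 1e-80, "g5": 0.5}
bin_counts = Counter(evalue_bin(e) for e in best_hits.values())
print(bin_counts)
```

Genes falling in the weakest bins after both the protein-level and the translated genomic search would be the candidates for species-specific genes.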

Page 28: Comparative Genomics of Aspergilli William Nierman TIGR

Aspergillus fumigatus Unique Genes

• Vast majority are hypothetical

• Includes

– Several transcriptional regulators

– A chaperonin

– An hsp 70 related protein

Page 29: Comparative Genomics of Aspergilli William Nierman TIGR

Arsenic Fungi

• 19th century poisonings associated with green pigments.

• 1892 B. Gosio, certain fungi could metabolize arsenic pigments producing toxic trimethylarsine (Gosio gas).

• Screen in the 1930s (Thom & Raper) found A. fumigatus to be an arsenic fungus.

• Napoleon, imperial colors green and gold, copper arsenite (Jones 1982).

• Analysis of history and genome by J. Bennett, N. Hall, J. Wortman, C. Lu.

Page 30: Comparative Genomics of Aspergilli William Nierman TIGR

A. fumigatus Arsenate Genes

• Arsenite efflux pump

• Arsenite translocating ATPase

• Two possibly duplicated clusters

– arsC – arsenate reductase (A. fumigatus unique)

– arsB – arsenite symporter

– arsH – Methyltransferase

Page 31: Comparative Genomics of Aspergilli William Nierman TIGR

Chromosome 1

Chromosome 5

Page 32: Comparative Genomics of Aspergilli William Nierman TIGR

[Figure: gene order in the two ars clusters. Labels as listed: arsB, arsC, arsH Methyltransferase; arsH Methyltransferase, arsB, arsC]

Page 33: Comparative Genomics of Aspergilli William Nierman TIGR

A. fumigatus Teichoic Acid Biosynthesis Protein

• Good homology to the full length of the Streptomyces griseus protein.

• Secretion signal peptide may direct it to the cell wall.

• Teichoic acids demonstrated to be a virulence factor for Staphylococcus aureus.

• No intervening sequences in gene.

Analysis by Neil Hall

Page 34: Comparative Genomics of Aspergilli William Nierman TIGR

A. fumigatus Thermotolerance

[Figure labels: More highly expressed at 48 °C; More highly expressed at 37 °C]

Page 35: Comparative Genomics of Aspergilli William Nierman TIGR

A. fumigatus Thermotolerance

• Relatively few genes altered

• Some HSPs transiently or stably induced (weakly) and repressed at 37 °C.

• HSPs induced throughout 180 min 48 °C period

• Transposases induced at 48 °C (Mariner 4).

• Stress related genes up regulated at 48 °C.

• Metabolic proteins down regulated at 48 °C

"This fungus likes it hot." J. Bennett

Page 36: Comparative Genomics of Aspergilli William Nierman TIGR

Microarray Detection of Clusters

Page 37: Comparative Genomics of Aspergilli William Nierman TIGR

Aspergillus fumigatus AF293 Project Participants

• The University of Manchester, UK

• The Wellcome Trust Sanger Centre, UK

• The Institute for Genomic Research, USA

• The University of Salamanca, Spain

• Complutense University, Spain

• Centro de Investigaciones Biológicas, Spain

Page 38: Comparative Genomics of Aspergilli William Nierman TIGR

Aspergillus fumigatus AF293

David Denning, Michael Anderson, Arnab Pain, Geoff Robson, Javier Arroyo, Geoff Turner, David Archer

Joan Bennett, Matt Berriman, Jean Paul Latge, Paul Dyer, Paul Bowyer, Neil Hall

Aspergillus nidulans – James Galagan

Aspergillus oryzae – Masayuki Machida

Page 39: Comparative Genomics of Aspergilli William Nierman TIGR

TIGR

Sequencing and Closure: Tamara Feldblyum, Hoda Khouri

Annotation: Jennifer Wortman, Jiaqi Huang, Resham Kulkarni, Natalie Fedorova, Charles Lu

Claire Fraser

Lab Group: Heenam Kim, Dan Chen

NIAID and Dennis Dixon