PNAS

Supporting Information

Genomic and transcriptomic analyses of the medicinal fungus Antrodia cinnamomea for its metabolite biosynthesis and sexual development.

Lu et al.

Supplementary figures and text.

Supplementary text Supplementary methods

Supplementary Figure S1 De novo genome assembly, gene annotation and transcriptomic analysis.

Supplementary Figure S2 Blastp search for protein homologues with E-value at 1e-05 in NCBI nr database.

Supplementary Figure S3 Differential gene expression analysis of the four transcriptomes.

Supplementary Figure S4 Clusters of triterpenoid and P450 genes and domain organization of polyketide synthases.

Supplementary Figure S5 tRNA predictions using tRNAscan-SE.

Supplementary Figure S6 Inter-strain variants identified between strains S27 and S32.

Supplementary Figure S7 S27 and S32 are pure monokaryons.

Supplementary Figure S8 Representative secondary metabolites and biosynthesis pathways of triterpenoids and antrocamphin.

Supplementary Table S1 Summary of the Roche 454 and Illumina data used for assembling the S27 genome.

Supplementary Table S2 Summary of the NGS data obtained for the various A. cinnamomea strains for transcriptome analyses.

Supplementary Table S3 Metrics of the transcriptome assembly of mycelium tissues from two Antrodia strains.

Supplementary Table S4 Statistics of the raw PE-reads from preprocessing to transcriptome single reads mapping.

Supplementary Table S5 Statistics of the gene models of the A. cinnamomea genome with various functional and expression support.

Supplementary Table S6 The core meiotic genes in the genome of A. cinnamomea.

Supplementary Table S7 Genes and enzymes in KEGG pathways.

Supplementary Table S8 Triterpenoid biosynthesis related enzymes in A. cinnamomea S27 genome.

Supplementary Table S9 Classification of 96 putative CYP proteins into 39 families.

Supplementary Table S10 Sequences of the putative CYP protein coding genes.

Supplementary Table S11 Enzymes for ubiquinone biosynthesis.

Supplementary Table S12 Number of genes in different functional category related to terpenoid biosynthesis.

Supplementary Table S13 Functional description of the genes uniquely expressed in AT and AM.

Supplementary Table S14 Clusters of gene counts of differentially expressed genes of the triterpenoid pathway.

Supplementary Table S15 IDs and definitions of the GO terms enriched in only one DEG cluster of the triterpenoid pathway.

Supplementary Table S16 Number of genes for the GO terms enriched within the differentially expressed gene clusters of triterpenoid genes.

Supplementary Table S17 Classification of putative carbohydrate metabolism proteins of A. cinnamomea based on the CAZy database.

Supplementary Table S18 rRNA and tRNA prediction of A. cinnamomea and comparison with G. lucidum.

Supplementary Table S19 A. cinnamomea S27 ribosomal RNA loci and conservation regions.

Supplementary Table S20 Statistics of repeat sequence of the S27 scaffolds from RepeatModeler.

Supplementary Table S21 Statistics of the genomic Illumina reads after pre-processing steps.

Supplementary Table S22 SNP analysis and occurrence of SNPs in genes uniquely expressed in S27 and/or S32.

Supplementary Table S23 Expression of the triterpenoid pathway and P450 genes in the six clusters identified on different scaffolds.

Supplementary methods.

Methods of DNA/RNA extraction. The genomic DNA of S27 was extracted using the protocol of Chang et al. (1) with modifications. Briefly, frozen tissue was ground to a fine powder in liquid nitrogen. Samples were combined into 2 g of powder per extraction in 20 ml of pre-warmed extraction buffer at 60℃ and incubated for 1 hr with occasional mixing. The solution was subjected to two rounds of chloroform extraction, and DNA was precipitated with cold isopropanol. The pellet was resuspended in 3 ml of SSTE and treated with RNase A (3 mg total) at 37℃ overnight. The suspension was then treated with Proteinase K (3 mg) at 50℃ for 30 min, and extracted once with phenol:chloroform:isoamyl alcohol (25:24:1) and once more with chloroform. DNA was then precipitated with ethanol and the pellet was dissolved in TE. The genomic DNA was further purified with AMPure XP beads (Agencourt) to remove residual RNA. The final samples were checked for purity and quality by NanoDrop and gel electrophoresis, and quantified by Qubit (Invitrogen).

Total RNA samples of S27 and S32 were extracted from frozen mycelia using the hot acidic phenol method (2) with modifications. Finely ground mycelium powder was mixed with 10 ml of pre-warmed TES buffer and an equal volume of acidic phenol (pH 4.3). The mixture was incubated at 65℃ for 1 hr with occasional shaking. After chilling on ice for 5 min, the solution was centrifuged and the aqueous phase was extracted twice with an equal volume of phenol:chloroform:isoamyl alcohol, followed by a final extraction with chloroform. RNA was then precipitated with ethanol, the pellet was dissolved in water containing RNaseOUT (Invitrogen), and DNase I (New England Biolabs, 2 U/μl) was added to digest residual DNA. The RNA solution was subjected to two rounds of phenol:chloroform extraction and one round of chloroform extraction. RNA was precipitated with isopropanol in the presence of a high-salt precipitation solution (0.8 M sodium citrate and 1.2 M NaCl) at -20℃. The resulting RNA pellet was then dissolved in water with RNaseOUT at 37℃ for 15 min and stored in aliquots at -80℃.

Annotation of non-protein coding genes. rRNAs were identified using RNAmmer v1.2 (3) with an E-value cutoff of 1e-05. tRNA regions and structures were predicted using tRNAscan-SE v1.3.1 (4) with strict parameters and a relaxed cutoff (-32.1) for the EufindtRNA parameter.

Transcriptome analysis and differential gene expression. The raw sequence reads were trimmed of low-quality bases and adapters using Trimmomatic v0.30 (5), with a minimum read length of 36 bp. For transcriptome profiling by RNA-seq analysis (6), we aligned reads to the S27 reference genome using Bowtie2 v2.1.0 (7) and TopHat v2.0.8b (8, 9). Gene expression was quantified using Cufflinks v2.1.1 (10) (http://cufflinks.cbcb.umd.edu/), which calculated RPKM (6) (reads per kilobase of gene length per million mapped reads) for all gene models, and the values were normalized by the upper-quartile method. A gene was considered not expressed if its RPKM was less than 0.5 in all four samples. Only the expressed genes of the normalized datasets from the different RNA samples were analyzed for differential gene expression, using NOISeq (11) from Bioconductor with q = 0.8 (http://www.bioconductor.org/packages/2.12/bioc/html/NOISeq.html).
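The quantification and filtering steps described above can be sketched as follows. This is a minimal Python illustration, not the Cufflinks implementation: the function names, the use of NumPy, and the particular form of upper-quartile scaling (each sample rescaled so that the 75th percentile of its non-zero values matches the mean across samples) are assumptions made for the example.

```python
import numpy as np

def rpkm(read_counts, gene_lengths_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    counts = np.asarray(read_counts, dtype=float)
    length_kb = np.asarray(gene_lengths_bp, dtype=float) / 1e3
    return counts / length_kb / (total_mapped_reads / 1e6)

def upper_quartile_normalize(values):
    """Rescale each sample (column) to a common upper quartile.

    Assumption: the quartile is taken over non-zero values and the
    common target is the mean of the per-sample upper quartiles.
    """
    values = np.asarray(values, dtype=float)
    uq = np.array([np.percentile(col[col > 0], 75) for col in values.T])
    return values * (uq.mean() / uq)

def expressed(values, threshold=0.5):
    """True for genes with RPKM >= threshold in at least one sample."""
    return (np.asarray(values, dtype=float) >= threshold).any(axis=1)
```

With a genes-by-samples matrix for S27, S32, AM and AT, `expressed(upper_quartile_normalize(matrix))` would give the mask of genes passed on to the differential expression step.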

Repeat analysis. Genome repeat sequences were analyzed using RepeatModeler version open-1.0.7 (12) on the S27 scaffolds.

Genome variations between S27 and S32. For genomic mapping, the first bases of the Illumina PE reads were used as input. The S27 and S32 reads were preprocessed with Trimmomatic v0.30 to remove adapters and bases with quality < 15. Trimmed reads shorter than 36 bp were removed. For variant identification, we aligned the Illumina paired-end reads of the S27 and S32 samples using BWA version 0.6.2 (13), obtaining a mapping rate > 93% (SI Appendix, Table S18). We employed SAMtools v0.1.18 (14) to detect single nucleotide polymorphisms (SNPs) and small INDELs using two criteria: (a) read depth > 60% of the average depth, and (b) allele frequency = 1. In addition, the SNP/INDEL sites present in the S27 background and those located in repeat sequences were removed. The remaining S32-specific mutations were then subjected to functional analysis using snpEff (15).
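The filtering criteria above can be sketched in a few lines of Python. This is an illustration only, not the SAMtools-based pipeline used here: the record layout (`scaffold`, `pos`, `depth`, `alt_reads`), the function name, and the representation of repeats as per-scaffold interval lists are all hypothetical.

```python
def filter_s32_variants(s32_calls, s27_sites, repeats, average_depth,
                        min_depth_fraction=0.6):
    """Keep S32 calls that pass the depth, allele-frequency,
    background and repeat filters described in the text."""
    kept = []
    for call in s32_calls:
        site = (call["scaffold"], call["pos"])
        # (a) read depth must exceed 60% of the average depth
        if call["depth"] <= min_depth_fraction * average_depth:
            continue
        # (b) allele frequency = 1: every read supports the variant
        if call["alt_reads"] != call["depth"]:
            continue
        # drop sites also called in the S27 background
        if site in s27_sites:
            continue
        # drop sites inside annotated repeats (1-based, inclusive)
        if any(start <= call["pos"] <= end
               for start, end in repeats.get(call["scaffold"], [])):
            continue
        kept.append(call)
    return kept
```

Comparing `alt_reads` with `depth` as integers avoids a floating-point equality test on the allele frequency.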

References

1. Chang S, Puryear J, & Cairney J (1993) A simple and efficient method for isolating RNA from pine trees. Plant Molecular Biology Reporter 11(2):113-116.

2. Kohrer K & Domdey H (1991) Preparation of high molecular weight RNA. Methods in enzymology 194:398-405.

3. Lagesen K, et al. (2007) RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic acids research 35(9):3100-3108.

4. Schattner P, Brooks AN, & Lowe TM (2005) The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic acids research 33(Web Server issue):W686-689.

5. Bolger AM, Lohse M, & Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114-2120.

6. Mortazavi A, Williams BA, McCue K, Schaeffer L, & Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature methods 5(7):621-628.

7. Langmead B & Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nature methods 9(4):357-359.

8. Kim D & Salzberg SL (2011) TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol 12(8):R72.

9. Kim D, et al. (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14(4):R36.

10. Trapnell C, et al. (2013) Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology 31(1):46-53.

11. Tarazona S, Furió-Tarí P, Ferrer A, & Conesa A (2012) NOISeq: Exploratory analysis and differential expression for RNA-seq data. R package version 2.0.0.

12. Smit A & Hubley R (2008-2010) RepeatModeler Open-1.0 (http://www.repeatmasker.org).

13. Li H & Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754-1760.

14. Li H, et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078-2079.

15. Cingolani P, et al. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6(2):80-92.

[Figure S1, panels A-E: flowchart images not reproduced.]

Figure S1. De novo genome assembly, gene annotation and transcriptomic analysis. A, Flowchart of the de novo genome assembly pipeline used to assemble the scaffolds of the single-nucleated genome of A. cinnamomea strain S27. B, Pipeline of gene model integration. C, First integration. GeneMark and FGENESH models were classified by genome position. Overlapping gene models were resolved by applying a filter score based on the isotig alignment and BLAST score. Unique gene models were selected if they had BLAST hits or RPKM > 1. D, Second integration. The first integrated gene set and the MAKER genes were classified as overlapping when the shared region covered more than 10% of the shorter model. The MAKER gene was selected if both genes of an overlapping pair had the same BLAST hit; otherwise, both genes were selected. Unique genes were selected by BLAST hits with an E-value threshold of 1e-05. E, Transcriptome analysis workflow for the RNA-seq data. Reads with base quality scores lower than Q15 were removed by Trimmomatic v0.30, and trimmed reads shorter than 36 nt were discarded. Bowtie2 v2.1.0, TopHat2 v2.0.8b and Cufflinks v2.1.1 were used to calculate RPKMs, which were normalized by the upper-quartile method. Genes with a normalized RPKM < 0.5 in all four samples were considered non-expressed and removed. We only analyzed the genes differentially expressed between any two of the four samples as determined by NOISeq (q = 0.8).
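The overlap rule used in the second integration (shared region of more than 10% of the shorter gene model) can be illustrated with a short Python sketch. The function names and the 1-based inclusive coordinate convention are assumptions for the example; the actual integration scripts are not described here.

```python
def overlap_fraction_of_shorter(model_a, model_b):
    """Overlap length divided by the length of the shorter model.

    Each model is a (start, end) pair on the same scaffold,
    1-based and inclusive (an assumed convention).
    """
    (a_start, a_end), (b_start, b_end) = model_a, model_b
    overlap = min(a_end, b_end) - max(a_start, b_start) + 1
    if overlap <= 0:
        return 0.0
    shorter = min(a_end - a_start + 1, b_end - b_start + 1)
    return overlap / shorter

def is_overlapping_pair(model_a, model_b, min_fraction=0.10):
    """True when the models share more than 10% of the shorter one."""
    return overlap_fraction_of_shorter(model_a, model_b) > min_fraction
```

Normalizing by the shorter model means a short gene nested inside a long one counts as fully overlapping, whereas two long models that touch by only a few bases do not.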

[Figure S2: bar charts not reproduced. Panel A, "Top-Hit species distribution"; x-axis, BLAST Top-Hits (0 to 4,000). Species labels in extracted order: others, Macrophomina phaseolina, Plasmodium yoelii, Tuber melanosporum, Grifola frondosa, Methanobrevibacter ruminantium, Nematostella vectensis, Lentinula edodes, Dacryopinax sp., Rhizoctonia solani, Auricularia delicata, Piriformospora indica, Schizophyllum commune, Fomitiporia mediterranea, Agaricus bisporus, Phanerochaete chrysosporium, Punctularia strigosozonata, Taiwanofungus camphoratus, Coniophora puteana, Coprinopsis cinerea, Laccaria bicolor, Stereum hirsutum, Phanerochaete carnosa, Serpula lacrymans, Moniliophthora perniciosa, Dichomitus squalens, Trametes versicolor, Postia placenta, Ceriporiopsis subvermispora, Fibroporia radiculosa. Panel B, x-axis 0 to 8,000. Species labels in extracted order: Ganoderma lucidum, Taiwanofungus camphoratus, Moniliophthora perniciosa, Postia placenta, Laccaria bicolor, Coniophora puteana, Coprinopsis cinerea, Ceriporiopsis subvermispora, Stereum hirsutum, Phanerochaete carnosa, Serpula lacrymans, Dichomitus squalens, Trametes versicolor, Fibroporia radiculosa. Bar values in extracted order (pairing with species not recoverable): 87; 104; 4,257; 4,611; 6,528; 6,552; 6,745; 6,939; 7,077; 7,170; 7,183; 7,329; 7,696; 7,776.]

Figure S2. Blastp search for protein homologues with an E-value cutoff of 1e-05 in the NCBI nr database. A, Distribution of counts of top-hit species from Blast2GO. B, One-to-one hit distribution for the species above Taiwanofungus in Fig. S2A plus Ganoderma lucidum.

[Figure S3: plots not reproduced. Panel A, six pairwise plots (a-f) on log-scaled axes (10 to 1,000) comparing S27, S32, AM and AT. Panel B, heat maps with a row Z-score color key, titled "P450 (92 genes)" and "AC all gene set (4,898 genes)".]

Figure S3. Differential gene expression analysis of the four transcriptomes. A, Summary plot of the expression values for pairwise comparisons (black), with differentially expressed genes (q=0.8) highlighted (red). a, MD plot in S27 against S32; b, S27 against AM; c, S27 against AT; d, S32 against AM; e, S32 against AT; f, AM against AT. B, Expression profile clustering of differentially expressed genes. RPKMs are transformed to z-values, and only genes showing differential expression between any two of the four transcriptomes are plotted. Top, clustering of putative P450 gene models with differential expression (92 out of 119). Bottom, clustering of all differentially expressed gene models (4,898 out of 9,255).
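The transformation of RPKMs to z-values for the heat maps can be sketched as a per-gene (row) standardization. This is an assumption-laden illustration: the caption does not state whether values were log-transformed first or which standard deviation estimator was used, so the sample standard deviation (ddof=1) and the function name below are choices made for the example.

```python
import numpy as np

def row_zscores(rpkm_matrix):
    """Standardize each gene (row) across samples: (x - mean) / sd."""
    m = np.asarray(rpkm_matrix, dtype=float)
    mean = m.mean(axis=1, keepdims=True)
    sd = m.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # constant rows become all zeros instead of NaN
    return (m - mean) / sd
```

Row-wise scaling puts genes with very different absolute expression on a common scale, so the clustering reflects the pattern across S27, S32, AM and AT rather than expression magnitude.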

[Figure S4: diagrams not reproduced. Panel A scaffold labels and cluster sizes, as extracted: scaf2 (67.3 Kb, 17.8 Kb), scaf4 (26.6 Kb), scaf5 (6.4 Kb, 8.7 Kb, 7 Kb), scaf14 (91 Kb), scaf17 (12.7 Kb), scaf24 (53 Kb). Legend: P450 genes on forward strand; triterpenoid pathway genes on forward strand; P450 genes on reverse strand; triterpenoid pathway genes on reverse strand. Panel B proteins: ACg003224, ACg005327, ACg005636, ACg008849.]

Figure S4. Clusters of triterpenoid and P450 genes and domain organization of polyketide synthases. A, Triterpenoid pathway and P450 gene clusters in the A. cinnamomea genome. Red arrows and green arrows show P450 genes and triterpenoid biosynthetic (non-P450) pathway genes, respectively. Solid and open arrows represent genes on the forward and reverse strands, respectively. B, Prediction of the domain organization of four putative multi-modular polyketide synthases in the A. cinnamomea genome. All four proteins contain acyl-carrier protein domains.

[Figure S5: chart not reproduced. Title: tRNA of A. cinnamomea S27. Legend entries recovered: Ala (9), Gly (15), Pro (6), Thr (7), Val (8), Ser (9), Arg (10), Leu (9), Phe (5), Asn (5), Lys (7), Asp (8), Glu (7), His (3).]

Figure S5. tRNA predictions using tRNAscan-SE. A total of 134 tRNA genes and 14 pseudogenes were predicted.

[Figure S6: pie charts not reproduced; the values recovered from the chart labels are listed below.]

Panel A (SNPs), by location: intergenic 49,615 (50.15%); intronic 15,764 (15.93%); exonic 33,551 (33.91%).
Panel A (SNPs), exonic effects: missense 14,663 (43.70%); silent 18,666 (55.63%); nonsense 222 (0.66%).
Panel B (INDELs), by location: intergenic 5,845 (75.76%); intronic 1,208 (15.66%); exonic 662 (8.58%).
Panel B (INDELs), exonic effects: frame shift 332 (50.15%); non-frame shift 330 (49.85%).

Figure S6. Inter-strain variants identified between strains S27 and S32. A, Distribution of SNPs and their functional effects in coding sequences. (SNPs/INDELs were called at sites with DP ≥ 60% of the average depth (140) and allele frequency = 1, after scanning for repeat regions and removing background noise.) B, Distribution of INDELs and their functional effects in coding sequences.

[Figure S7: images not reproduced. Panel A, micrographs labeled B496 dikaryon, S27 and S32. Panel B, bar chart of the RPKM of clp1 by strain (y-axis 0 to 350); bar labels as extracted, in the axis order S27, S32, AM, AT: 10.7, 13.7, 155.4, 297.3.]

Figure S7. S27 and S32 are pure monokaryons. A, Photos of mycelia of the dikaryon B496 and the monokaryons S27 and S32. Arrowheads indicate clamp or pseudoclamp structures. B, Expression of the clamp formation gene clp1 was much higher in AT and AM than in S27 and S32.

Figure S8. Representative secondary metabolites and biosynthesis pathways of triterpenoids and antrocamphin. Representative secondary metabolites of A. cinnamomea: (1) ubiquinone: antroquinonol; (2) benzenoids: 1,4-dimethoxy-2,3-methylenedioxy-5-methylbenzene; (3) maleic anhydride derivatives: antrocinnamomin C; (4) lanostane-type triterpenoids: dehydroeburicoic acid; (5) ergostane-type triterpenoids: antcin C; and (6) polyketides: antrocamphin A.

Table S1. Summary of the Roche 454 and Illumina data used for assembling the S27 genome.

Data type      No. of reads  Read length (bp)  Fragment length¹ (bp)  No. of bases (Mb)  Coverage² (x)  No. of files
454 SR         715,689       571.8             NA                     409                12.7           2
454 SR         392,462       325.8             NA                     128                4              1
454 SR         450,780       391.7             NA                     177                5.5            1
454 PE (20kb)  592,222       313.9             NA                     186                5.8            1
GAIIx PE       52,242,470    151               221                    7,888              245            2
GAIIx PE       61,066,692    151               475                    9,221              286            2
GAIIx MP       62,876,846    80                3,672                  5,030              156            2
GAIIx MP       65,609,524    80                5,424                  5,249              163            2

¹Evaluation of fragment length was based on mapping result.

²Coverage is calculated based on the estimated genome size of 32.2 Mb.

Abbreviations: SR: shotgun read; PE: paired end; LPE, long insert PE; MP: mate pair.
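The coverage column follows directly from the footnote: total bases divided by the estimated genome size. A short Python check, with variable and function names chosen for illustration, reproduces the tabulated values from the "No. of bases (Mb)" column.

```python
GENOME_SIZE_MB = 32.2  # estimated genome size from the table footnote

def coverage(total_bases_mb, genome_size_mb=GENOME_SIZE_MB):
    """Sequencing depth as total bases divided by genome size."""
    return total_bases_mb / genome_size_mb

# Total bases (Mb) for each library, in the row order of Table S1.
bases_mb = [409, 128, 177, 186, 7888, 9221, 5030, 5249]
depths = [round(coverage(b), 1) for b in bases_mb]
```

For example, the first 454 SR library gives 409 / 32.2, which rounds to the 12.7x shown in the table.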

Table S2. Summary of the NGS data obtained for the various A. cinnamomea strains for transcriptome analyses.

Strain  Data type  Total read counts  Read length (bp)  Fragment size (bp)  Total bases (Mb)
S27     454 SE     1,005,759          448               NA                  451
S27     HiSeq PE   144,843,514        100               ~300                14,484
S32     454 SE     1,023,220          482               NA                  493
S32     HiSeq PE   166,670,862        100               ~300                16,667
AM      HiSeq PE   147,671,870        100               ~300                14,767
AT      HiSeq PE   171,657,456        100               ~300                12,763

Table S3. Metrics of the transcriptome assembly of mycelium tissues from two Antrodia strains.

Metric                    S27                  S32

Input
Number of reads           1,005,759            1,023,220
Number of bases           450,592,501          493,156,641
Number of reads trimmed   1,005,743 (100.00%)  1,023,197 (100.00%)
Number of bases trimmed   450,046,957 (99.90%) 492,278,798 (99.80%)

Isogroup# metrics
Number of isogroups       7,503                7,541
Average contig count      1.9                  2.9
Largest contig count      264                  212
Number with one contig    5,558                4,230
Average isotig count      1.5                  2.6
Largest isotig count      79                   96
Number with one isotig    5,592                4,268

Isotig* metrics
Number of isotigs         11,622               19,478
Average contig count      2.6                  4.6
Largest contig count      16                   16
Number with one contig    5,558                4,230
Number of bases           20,541,573           49,517,068
Average isotig size       1,767                2,542
N50 isotig size           2,105                3,329
Largest isotig            9,236                13,371

Large contig† metrics
Number of contigs         8,091                9,391
Number of bases           11,876,562           13,125,720
Average contig size       1,467                1,397
N50 contig size           1,721                1,656
Largest contig size       8,187                10,357

All contig† metrics
Number of contigs         14,135               22,225
Number of bases           12,972,625           15,371,737
Average contig size       918                  692

†Contigs containing reads which are shared by other contigs. *Isotigs sharing reads may be splicing variants of a single gene region. #Isogroup may contain two or more isotigs.
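The N50 values reported above summarize assembly contiguity. As a reminder of the usual definition (the assembler's own implementation may differ in edge cases), a minimal Python sketch with a hypothetical function name is:

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least
    half of the total assembled bases (common N50 definition)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")
```

Because N50 weights long sequences more heavily than the mean does, it is typically larger than the average size, as in the isotig and contig rows of the table.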

Table S4. Statistics of the raw PE-reads from preprocessing to transcriptome single reads mapping.

HiSeq sequencing stats          S27             S32             AM              AT

# Raw reads
Sequencing read length          PE 2*100        PE 2*100        PE 2*100        PE 2*100
Read count                      144,834,514     166,670,862     147,671,870     171,657,456
Total bases (bp)                14,483,451,400  16,667,086,200  14,767,187,000  17,165,754,600

# Read trimming (by Trimmomatic v0.30)
Average processed read length   86              84              85              86
Read count remained             94.85%          96.86%          95.86%          94.71%
Total bases remained            86.97%          87.80%          91.05%          89.80%

# Mapping onto S27 scaffold genome
Input reads                     140,070,951     161,433,327     13,749,113      158,330,798
Mapped                          91.50%          87.31%          74.97%          69.63%

Table S5. Statistics of the gene models of the A. cinnamomea genome with various functional and expression support.

Annotation category                          Counts  Percentage
Total gene model                             9,254   100.00%
Gene models with NR blast hit (E < 1e-05)    8,911   96.29%
Gene models with NR blast hit (E < 1e-10)    8,717   94.20%
Gene models with GO terms                    6,396   69.12%
Gene models with EC numbers                  1,135   12.26%
Gene models with Pfam support                6,226   67.28%
Gene models with CDD support                 2,648   28.61%
Gene models with S27 isotig support          6,415   69.32%
Gene models with S32 isotig support          6,505   70.29%
Gene models with all isotig support          7,165   77.43%
Gene models considered "expressed"¹          8,079   87.30%
Total number of gene models with support     9,248   99.94%

¹Gene models with RPKM ≥ 0.5 in at least one of the four RNA samples.

Table S6. The core meiotic genes in the genome of Antrodia cinnamomea. The search for homologs was carried out using a collection of query proteins (Refs. 1-4).

Gene | Function | A. cinnamomea protein I.D. | E value
IME1 | Master inducer of meiosis in S. cerevisiae. | - | -
IME2 | A serine/threonine protein kinase involved in activation of S. cerevisiae meiosis. IME2 expression is positively regulated by Ime1. | ACg001560 | 2e-48
STE12 | Mating transcriptional factor required for meiosis initiation in S. cerevisiae. | ACg003227 | 2e-48
SPO7 | Putative regulatory subunit of Nem1p-Spo7p phosphatase holoenzyme; required for normal nuclear envelope morphology, premeiotic replication, and sporulation. | ACg007802 | 5e-04
SPO11 | Meiosis-specific protein that catalyzes meiotic DNA DSBs. | ACg004129 | 4e-23
REC102 | Spo11 accessory factor. | - | -
REC103 (SKI8) | Spo11 accessory factor. | - | -
REC114 | Spo11 accessory factor. | - | -
MRE11 | Nuclease subunit of the MRX complex for DSB resection and DNA damage checkpoint activation. | ACg007863 | 3e-80
RAD50 | Subunit of the MRX complex. | ACg002432 | e-161
XRS2 (NBS1) | Subunit of the MRX complex. Mutations in human NBS1, a XRS2 homolog, are linked to the autosomal recessive disorder. C. cinerea Nbs1 protein [AFN01892.1]. | ACg007172 | e-116
SAE2 (COM1) | S. cerevisiae Sae2, Schizosaccharomyces pombe CtIP and Cryptococcus neoformans Sae2 [JGI gene ID: 6285] are involved in meiotic and mitotic DSB repair. | ACg005805 | 6e-22
EXO1 | Protein involved in DNA repair and processing of meiotic DSB. | ACg005778 | 2e-48
SGS1 | A RecQ family nucleolar DNA helicase that forms a heterodimer with Top3 and regulates chromosome synapsis and meiotic joint molecule/crossover formation. | ACg007726 | e-128
DMC1 | Meiosis-specific RecA-like strand exchange protein. | ACg002890 | 1e-91
RAD51 | RecA-like strand exchange protein. | ACg001504 | e-119
HOP2 | Meiosis-specific; complexes with Mnd1 to promote Dmc1 function. | ACg006321 | 1e-05
MND1 | Meiosis-specific; complexes with Hop2 to promote Dmc1 function. | ACg003486 | 1e-12
RAD52 | Protein that stimulates Rad51-mediated strand exchange & | ACg006850 | 1e-47
RAD54 | DNA-dependent ATPase that stimulates Rad51-mediated strand exchange. | ACg008410 | 0
RDH54 | RAD54 homologue that stimulates Rad51 and Dmc1. | ACg003420 | 2e-79
CDC48 | An AAA-ATPase that functions as a SUMO-targeted segregase curbing Rad51-Rad52 interaction. | ACg005594 | e-115
BRH2 | Ustilago maydis Rad51-associated protein that is similar to human BRCA2 [AAM92489.1]. No ortholog in S. cerevisiae and S. pombe. | ACg002912 | 7e-56
REC8 | Meiosis-specific component of sister chromatid cohesion complex. | ACg001555 | 4e-127
SCC1 (MCD1) | The mitotic paralog of Rec8 in yeast. An essential subunit of the sister chromatid cohesin complex in mitosis. Dissociation of Scc1 from chromatin requires Esp1, a separin that stimulates proteolysis of Scc1. | ACg004294 | 1e-04
SCC3 (IRR1) | Subunit of the cohesin complex. | ACg005915 | 1e-26
SMC1 | An essential SMC chromosomal ATPase subunit of the cohesin complex in mitosis and meiosis. Smc1 forms a dimer with Smc3. | ACg008589 | e-173
SMC3 | An essential SMC chromosomal ATPase subunit of the sister chromatid cohesin complex in mitosis and meiosis. | ACg004133 | 0
ECO1 (CTF7) | An acetyltransferase that modifies Smc3 at replication forks and Scc1 in response to DSBs. | ACg007474 | 0.015
PDS5 | Protein that prevents polySUMO-dependent separation of sister chromatids. | ACg000799 | 1e-87
RED1 | S. cerevisiae SC axial element protein that activates Mec1 and Tel1 for Hop1 phosphorylation. | - | -
rec10 | S. pombe axial element protein. | - | -
HOP1 | S. cerevisiae SC axial element protein that is phosphorylated by Mec1/ATR and Tel1/ATM checkpoint kinases. | ACg003938 | 2e-12
MEK1 | Meiosis-specific serine/threonine protein kinase that promotes interhomolog recombination and stabilizes Hop1-Thr318 phosphorylation. | ACg005487 | 6e-38
PCH2 | A hexameric ring ATPase that remodels chromosome axis protein Hop1 and represses Red1-independent Hop1-Thr318 phosphorylation. | ACg000557 | 2e-40
MER3 | Meiosis-specific DNA helicase involved in the conversion of DSBs to later recombination intermediates and in crossover control. | ACg003801 | e-124
MSH4 | Meiosis-specific Msh4/5 heterodimeric complex required for chromosome synapsis and normal levels of crossing over. | ACg005373 | 3e-48
MSH5 | Meiosis-specific Msh4/5 heterodimeric complex required for chromosome synapsis and normal levels of crossing over. | ACg002475 | 3e-24
ZIP1 | S. cerevisiae central element protein of synaptonemal complex. | - | -
ZIP2 | S. cerevisiae meiosis-specific protein involved in normal synaptonemal complex formation and pairing between homologous chromosomes during meiosis. | - | -
ZIP3 (CST9) | S. cerevisiae SUMO E3 ligase required for synaptonemal complex formation. | ACg003668 | 9e-04
ZIP4 (SPO22) | S. cerevisiae meiosis-specific protein essential for synaptonemal complex assembly. | ACg005454 | 5e-51
SPO16 | S. cerevisiae meiosis-specific protein essential for chromosome synapsis. | - | -
Ecm11 | S. cerevisiae meiosis-specific protein; component of the synaptonemal complex (SC) along with Gmc2. | - | -
Gmc2 | S. cerevisiae meiosis-specific protein; component of the synaptonemal complex (SC) along with Ecm11. | - | -
TEL1 (ATM) | Yeast homolog of human ataxia-telangiectasia mutated (ATM) protein kinase that controls the levels of meiotic DSBs, DSB resection and interhomolog recombination. | ACg004775 | 1e-81
MEC1 (ATR) | Yeast homolog of human ATR protein kinase that controls DNA recombination and cell cycle progression in mitosis and meiosis. | ACg006123 | e-123
RAD17 | A component of the 9-1-1 complex (Ddc1-Mec3-Rad17) that is involved in the DNA damage and meiotic pachytene checkpoints. Homolog of human and S. pombe Rad1 and U. maydis Rec1 proteins. | ACg001217 | 1e-14 (Rec1)
RAD24 | A subunit of a clamp loader that loads the 9-1-1 complex onto DNA. | ACg007076 | 3e-08
KU70 | A subunit of the Ku70-Ku80 complex that binds DSB ends to mediate nonhomologous end joining (NHEJ) and telomere length maintenance. | ACg004653 | 0
KU80 | A subunit of the Ku70-Ku80 complex. | ACg005396 | 0
LIG4 (DNL4) | The catalytic subunit of DNA ligase 4 complex (Lig4-Lif1-Nej1) required for NHEJ. | ACg004756 | 2e-70
LIF1 (XRCC4) | A component of the DNA ligase IV complex that is homologous to mammalian XRCC4 protein. | ACg002996 | 0.035
DNA2 | Protein involved in DNA repair and processing of meiotic DNA double strand breaks. | ACg007097 | e-135
FEN1 (RAD27) | Fen1 is a multi-functional nuclease involved in processing Okazaki fragments during DNA replication, base excision repair and maintaining genome stability. | ACg002509 | e-111
MMS4 (Slx2) | A subunit of structure-specific Mms4p-Mus81p endonuclease that mediates SC-independent class II crossover pathway. | - | -
MUS81 (Slx3) | The catalytic subunit of structure-specific Mms4p-Mus81p endonuclease. | - | -
MLH1 | The Mlh1/Pms1 and Mlh1/Mlh3 complexes are required for mismatch repair in mitosis and meiosis. The Mlh1-Mlh3 heterodimer is required for crossing over during meiosis. | ACg000922 | e-133
MLH3 | The catalytic subunit of Mlh1/Mlh3 endonuclease complex involved in DNA mismatch repair and meiotic recombination. | ACg003290 | e-160
PMS1 | Pms1 forms a dimer with Mlh1. | ACg006283 | 8e-71
YEN1 (GEN1) | Holliday junction resolvase. | ACg001983 | 9e-08
MPH1 | A 3'-5' DNA helicase similar to the human Fanconi anemia complementation group protein FANCM; involved in error-free bypass of DNA lesions to stimulate Dna2 and Rad27. | ACg001734 | 1e-81
MSH2 | Msh2 forms heterodimers with Msh3 and Msh6 that bind to DNA mismatches to initiate the mismatch repair process. | ACg000996 | 0
MSH3 | Msh3 forms a dimer with Msh2. | ACg001134 | e-107
MSH6 | Msh6 forms a dimer with Msh2. | ACg005751 | e-144
PMS1 | Pms1 forms a dimer with Mlh1. | ACg006283 | 8e-71
SLX1 | The catalytic subunit of Slx1-Slx4 endonuclease complex involved in DNA recombination and repair; its function overlaps with that of the Sgs1-Top3 complex. | ACg005838 | 3e-04
SLX4 | The accessory subunit of Slx1-Slx4 endonuclease. | - | -
SMC5 | The Smc5-Smc6 complex, a novel non-structural maintenance of chromosomes (SMC) component, perturbs meiotic joint molecule formation and resolution without significantly changing crossover or non-crossover levels. | ACg007949 | 0
SMC6 | Smc6 forms a dimer with Smc5. | ACg002747 | 0
NDJ1 (TAM1) | Meiosis-specific telomere protein; required for bouquet formation and telomere-led rapid prophase movement. | - | -
CSM4 | Meiosis-specific protein required for accurate chromosome segregation during meiosis, bouquet formation and telomere-led rapid prophase movement. | - | -
MPS3 | Nuclear envelope and SPB protein; required with Ndj1p and Csm4p for meiotic bouquet formation and telomere-led rapid prophase movement. Member of the SUN protein family, including S. pombe Sad1. | - | -

References

1. Burns C, et al. (2010) Analysis of the Basidiomycete Coprinopsis cinerea reveals conservation of the core meiotic expression program over half a billion years of evolution. PLoS Genet 6(9):e1001135.

2. Chi J, Mahe F, Loidl J, Logsdon J, & Dunthorn M (2014) Meiosis gene inventory of four ciliates reveals the prevalence of a synaptonemal complex-independent crossover pathway. Mol Biol Evol 31(3):660-672.

3. Holloman WK, Schirawski J, & Holliday R (2008) The homologous recombination system of Ustilago maydis. Fungal Genet Biol 45 Suppl 1(1):S31-39.

4. Schurko AM & Logsdon JM, Jr. (2008) Using a meiosis detection toolkit to investigate ancient asexual "scandals" and the evolution of sex. Bioessays 30(6):579-589.

Table S7. Genes and enzymes in KEGG pathways. (A) Terpenoid backbone biosynthesis. (B) Sesquiterpenoid and triterpenoid biosynthesis. (C) Ubiquinone and other terpenoid-quinone biosynthesis. (D) Metabolism of xenobiotics by cytochrome P450. (E) Drug metabolism - cytochrome P450. (F) Fructose and mannose metabolism. (G) Starch and sucrose metabolism. Numbers of genes are given as total counts vs counts of “expressed genes” (RPKM ≥ 0.5 in at least one transcriptome).

(A) Terpenoid backbone biosynthesis. (Total = 20)

Enzyme | Enzyme ID | No. of seqs | Seqs
acetyl-CoA acyltransferase | ec:2.3.1.9 | 1 | ACg000346
HMG-CoA synthase | ec:2.3.3.10 | 1 | ACg004857
reductase (NADPH) | ec:1.1.1.34 | 1 | ACg000822
Kinase | ec:2.7.1.36 | 1 | ACg002061
mevalonate phosphate kinase | ec:2.7.4.2 | 1 | ACg009255
Decarboxylase | ec:4.1.1.33 | 1 | ACg004385
delta-isomerase | ec:5.3.3.2 | 1 | ACg001555
geranyl-diphosphate synthase | ec:2.5.1.1 | 2 | ACg005547, ACg008849
synthase [geranyl-diphosphate specific] | ec:2.5.1.84 | 1 | ACg004020
diphosphate synthase | ec:2.5.1.68 | 1 | ACg004020
diphosphate synthase | ec:2.5.1.10 | 1 | ACg005547
synthase [(2E,6E)-farnesyl-diphosphate specific] | ec:2.5.1.31 | 2 | ACg003004, ACg004020
diphosphate synthase | ec:2.5.1.29 | 2 | ACg007626, ACg008277
diphosphate synthase [geranylgeranyl-diphosphate specific] | ec:2.5.1.85 | 1 | ACg004020
O-methyltransferase | ec:2.1.1.100 | 7 | ACg000386, ACg002078, ACg003108, ACg006939, ACg007070, ACg008408, ACg008797

(B) Sesquiterpenoid and triterpenoid biosynthesis. (Total = 9)

Enzyme | Enzyme ID | No. of seqs | Seqs
Squalene monooxygenase | ec:1.14.13.132 | 1 | ACg008894
sesquiterpene cyclase | ec:4.2.3.6 | 7 | ACg001223, ACg003119, ACg003147, ACg003159, ACg004670, ACg005213, ACg007731
squalene synthase | ec:2.5.1.21 | 1 | ACg004800

(C) Ubiquinone and other terpenoid-quinone biosynthesis. (Total = 7)

Enzyme Enzyme ID Seqs of Enzyme Seqs

methyltransferase ec:2.1.1.114 1 ACg004603

reductase (warfarin-sensitive) ec:1.1.4.1 1 ACg005537

dehydrogenase (quinone) ec:1.6.5.2 1 ACg001532

polyprenyltransferase ec:2.5.1.39 3 ACg007085, ACg007895, ACg008524

Transaminase ec:2.6.1.5 1 ACg006206

(D) Metabolism of xenobiotics by cytochrome P450. (Total = 24)


Enzyme Enzyme ID Seqs of Enzyme Seqs

epoxide hydrolase ec:3.3.2.9 1 ACg005717

reductase (NADPH) ec:1.1.1.184 2 ACg002146, ACg005929

Dehydrogenase ec:1.1.1.1 2 ACg008552, ACg009092

dehydrogenase [NAD(P)+] ec:1.2.1.5 5 ACg000306, ACg000817, ACg000847, ACg004671, ACg007732

Transferase ec:2.5.1.18 10 ACg000589, ACg000736, ACg001212, ACg002721, ACg005183, ACg005800, ACg007596, ACg008163, ACg008167, ACg008769

Monooxygenase ec:1.14.14.1 4 ACg001959, ACg005537, ACg005671, ACg007676

(E) Drug metabolism - cytochrome P450. (Total = 22)


Enzyme Enzyme ID Seqs of Enzyme Seqs

Dehydrogenase ec:1.1.1.1 2 ACg008552, ACg009092

Monooxygenase ec:1.14.13.8 1 ACg005678

dehydrogenase [NAD(P)+] ec:1.2.1.5 5 ACg000306, ACg000817, ACg000847, ACg004671, ACg007732

Transferase ec:2.5.1.18 10 ACg000589, ACg000736, ACg001212, ACg002721, ACg005183, ACg005800, ACg007596, ACg008163, ACg008167, ACg008769

Monooxygenase ec:1.14.14.1 4 ACg001959, ACg005537, ACg005671, ACg007676

(F) Fructose and mannose metabolism. (Total = 24)


Enzyme Enzyme ID Seqs of Enzyme Seqs

phosphohexokinase ec:2.7.1.11 1 ACg001977

Isomerase ec:5.3.1.8 2 ACg007002, ACg008758

Isomerase ec:5.3.1.1 1 ACg007001

hexokinase type IV glucokinase ec:2.7.1.1 1 ACg005983

endo-1,4-beta-mannosidase ec:3.2.1.78 3 ACg001443, ACg003497, ACg005772

Lyase ec:4.2.2.3 1 ACg001671

hexose diphosphatase ec:3.1.3.11 1 ACg003779

guanylyltransferase (GDP) ec:2.7.7.22 2 ACg003462, ACg007434

Synthase ec:1.1.1.271 1 ACg007872

Aldolase ec:4.1.2.13 1 ACg004401

phosphofructokinase 2 ec:2.7.1.105 2 ACg006912, ACg007738

Reductase ec:1.1.1.21 6 ACg000461, ACg005285, ACg005624, ACg005631, ACg006938, ACg008158

2-dehydrogenase ec:1.1.1.14 1 ACg000383

mannose phosphomutase ec:5.4.2.8 1 ACg008678

(G) Starch and sucrose metabolism. (Total = 49)

Enzyme Enzyme ID Seqs of Enzyme Seqs

Phosphorylase ec:2.4.1.1 1 ACg000144

Isomerase ec:5.3.1.9 1 ACg008705

hexokinase type IV glucokinase ec:2.7.1.1 1 ACg005983

endo-1,4-beta-D-glucanase ec:3.2.1.4 1 ACg005614

1,4-alpha-glucosidase ec:3.2.1.3 1 ACg002651

Glycogenase ec:3.2.1.1 2 ACg007389, ACg007704

1,3-beta-glucosidase ec:3.2.1.58 8 ACg000047, ACg000185, ACg001971, ACg002915, ACg004862, ACg007444, ACg007475, ACg008469

endo-1,3-beta-D-glucosidase ec:3.2.1.39 1 ACg001852

amylo-1,6-glucosidase ec:3.2.1.33 1 ACg004750

Trehalase ec:3.2.1.28 2 ACg001892, ACg005648

Gentiobiase ec:3.2.1.21 2 ACg004065, ACg007761

Maltase ec:3.2.1.20 2 ACg000270, ACg004084

pectin depolymerase ec:3.2.1.15 3 ACg005952, ACg005954, ACg006785

Uridylyltransferase ec:2.7.7.9 2 ACg003321, ACg003322

glucose 6-phosphate phosphatase ec:3.1.3.9 2 ACg000754, ACg005727

4-alpha-galacturonosyltransferase ec:2.4.1.43 1 ACg000224

Synthase ec:2.4.1.34 11 ACg003679, ACg005906, ACg006298, ACg006299, ACg007479, ACg008254, ACg008255, ACg008256, ACg008350, ACg008642, ACg009206

disproportionating enzyme ec:2.4.1.25 1 ACg004750

branching enzyme ec:2.4.1.18 1 ACg000352

synthase (UDP-forming) ec:2.4.1.15 3 ACg002148, ACg003195, ACg008198

Synthase ec:2.4.1.11 1 ACg007169

6-dehydrogenase ec:1.1.1.22 1 ACg007842

(alpha-D-glucose-1,6-bisphosphate-dependent) ec:5.4.2.2 1 ACg005867

Table S8. Triterpenoid biosynthesis related enzymes in A. cinnamomea S27 genome.

Num. of sequences Triterpenoid biosynthesis related enzymes

1 AACT: acetyl-CoA acetyltransferase, [EC:2.3.1.9], K00626;

1 HMGS: 3-hydroxy-3-methylglutaryl-CoA synthase, [EC:2.3.3.10], K01641;

1 HMGR: 3-hydroxy-3-methylglutaryl-CoA reductase, [EC:1.1.1.34], K00021;

1 MVK: mevalonate kinase, [EC:2.7.1.36], K00869;

1 MPK: phosphomevalonate kinase, [EC:2.7.4.2], K00938;

1 MVD: pyrophosphomevalonate decarboxylase, [EC:4.1.1.33], K01597;

1 IDI: isopentenyl-diphosphate isomerase, [EC: 5.3.3.2], K01823;

2 GPPs: geranyl diphosphate synthase, [EC: 2.5.1.1], K00787, K00804;

1 FPPs: (2Z,6E)-farnesyl diphosphate synthase, [EC: 2.5.1.68]

1 FPPs: farnesyl diphosphate synthase, [EC: 2.5.1.10], K00787, K00804;

2 GGPPs: geranylgeranyl diphosphate synthase; farnesyltransferase [EC 2.5.1.29]

1 SPS1/2: all-trans-nonaprenyl diphosphate synthase [geranylgeranyl-diphosphate specific][EC 2.5.1.85]

1 SPS2: all-trans-nonaprenyl-diphosphate synthase [EC 2.5.1.84]

7 STE14: protein-S-isoprenylcysteine O-methyltransferase [EC 2.1.1.100]

2 UPPs: Ditrans,polycis-undecaprenyl-diphosphate synthase ((2E,6E)-farnesyl-diphosphate specific) [EC 2.5.1.31]

1 SQS: squalene synthase, [EC: 2.5.1.21], K00801;

1 SE: squalene monooxygenase, Squalene epoxidase [EC:1.14.13.132] [KO:K00511]

1 OSC/LSS: Oxidosqualene cyclase/Lanosterol synthase [EC:5.4.99.7] [ACg008680]

24 Mono-TPS: monoterpene synthase [EC:4.2.3.-]

7 sesquiterpene cyclase/ trichodiene synthase [EC: 4.2.3.6] [Q6WP50.1, P13513.1, 2OA6_A, 2OA6_B, 2OA6_C, 2OA6_D] {map00909} [ACg001223, ACg003119, ACg003147, ACg003159, ACg004670, ACg005213, ACg007731]

3 COQ2: 4-hydroxybenzoate polyprenyltransferase (EC:2.5.1.39) [ACg007085, ACg007895, ACg008524]

1 COQ3: polyprenyldihydroxybenzoate methyltransferase (EC2.1.1.114) [ACg004603] map00130

14 PKS: polyketide synthase [XP_001876029.1][EC:2.3.1.206] [ACP-domain containing: ACg003224, ACg005327, ACg005636 and ACg008449; no ACP domain: ACg000077, ACg000788, ACg000930, ACg002831, ACg003519, ACg003904, ACg005691, ACg006323, ACg008872, ACg009092]

1 OAC: Polyketide cyclase [EC:4.4.1.26] [ACg002677]

1 Lignin peroxidase [EC: 1.11.1.14] [ACg002413]

1 Manganese peroxidase [EC:1.11.1.13] [ACg008146]

2 Lactoperoxidase [EC: 1.11.1.7] [ACg004767, ACg007568]

1 Sterol 14-α-demethylase, CypX, CYP51F1 [ACg002141]

Table S9. Classification of 96 putative CYP proteins into 39 families.

CYP_family A. cinnamomea G. lucidum P. placenta P. chrysosporium

CYP51 1 2 1 1

CYP53 2 1 7 1

CYP61 0 1 1 1

CYP63 6 6 5 7

CYP66 1 0 0 0

CYP502 0 1 4 1

CYP504 10 0 0 0

CYP505 5 4 2 7

CYP512 6 22 14 14

CYP530 1 0 0 0

CYP537 1 1 2 0

CYP539 1 0 0 0

CYP609 1 0 0 0

CYP619 1 0 0 0

CYP620 1 0 0 0

CYP642 0 1 0 0

CYP645 2 0 0 0

CYP5027 0 0 9 0

CYP5035 1 16 3 13

CYP5036 0 0 0 5

CYP5037 6 6 13 5

CYP5053 1 0 0 0

CYP5065 0 1 0 0

CYP5082 1 0 0 0

CYP5136 3 7 0 5

CYP5137 0 1 6 2

CYP5138 1 1 1 1

CYP5139 3 7 8 1

CYP5140 3 1 1 1

CYP5141 1 2 4 7

CYP5142 0 0 0 7

CYP5143 0 0 0 2

CYP5144 6 3 3 34

CYP5145 1 0 0 3

CYP5146 2 0 0 6

CYP5147 0 0 0 6

CYP5148 0 2 1 2

CYP5149 1 0 1 1

CYP5150 12 36 25 6

CYP5151 2 1 1 1

CYP5152 1 1 2 2

CYP5153 1 0 0 1

CYP5154 2 0 0 1

CYP5155 0 0 0 1

CYP5156 1 1 1 1

CYP5157 1 0 0 1

CYP5158 2 1 2 1

CYP5205 2 0 0 0

CYP5293 1 0 0 0

CYP5339 0 0 2 0

CYP5340 0 2 1 0

CYP5341 0 2 3 0

CYP5342 0 0 1 0

CYP5343 0 0 1 0

CYP5344 0 0 3 0

CYP5345 0 0 1 0

CYP5346 0 0 1 0

CYP5347 0 1 2 0

CYP5348 0 3 34 0

CYP5349 0 1 2 0

CYP5350 0 0 11 0

CYP5351 0 1 1 0

CYP5352 0 0 1 0

CYP5353 0 0 1 0

CYP5354 0 0 2 0

CYP5355 0 0 1 0

CYP5356 0 0 1 0

CYP5357 0 2 0 0

CYP5358 0 1 0 0

CYP5359 0 47 0 0

CYP5360 0 1 0 0

CYP5361 0 1 0 0

CYP5362 0 1 0 0

CYP5363 0 1 0 0

CYP5364 0 3 0 0

CYP5365 0 1 0 0

CYP5366 0 1 0 0

CYP6001 1 0 0 0

CYP6004 1 0 0 0

CYP6005 0 2 0 0
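The figures in the Table S9 caption (96 putative CYP proteins in 39 families) can be checked against the A. cinnamomea column. The sketch below simply transcribes the non-zero entries of that column into a Python dictionary and tallies them; the variable and function names are illustrative and not part of the original analysis.

```python
# Non-zero A. cinnamomea entries transcribed from Table S9 (family -> count).
A_CINNAMOMEA_CYP = {
    "CYP51": 1, "CYP53": 2, "CYP63": 6, "CYP66": 1, "CYP504": 10,
    "CYP505": 5, "CYP512": 6, "CYP530": 1, "CYP537": 1, "CYP539": 1,
    "CYP609": 1, "CYP619": 1, "CYP620": 1, "CYP645": 2, "CYP5035": 1,
    "CYP5037": 6, "CYP5053": 1, "CYP5082": 1, "CYP5136": 3, "CYP5138": 1,
    "CYP5139": 3, "CYP5140": 3, "CYP5141": 1, "CYP5144": 6, "CYP5145": 1,
    "CYP5146": 2, "CYP5149": 1, "CYP5150": 12, "CYP5151": 2, "CYP5152": 1,
    "CYP5153": 1, "CYP5154": 2, "CYP5156": 1, "CYP5157": 1, "CYP5158": 2,
    "CYP5205": 2, "CYP5293": 1, "CYP6001": 1, "CYP6004": 1,
}


def summarize(family_counts):
    """Return (number of families, total proteins, largest family)."""
    n_families = len(family_counts)
    n_proteins = sum(family_counts.values())
    largest = max(family_counts, key=family_counts.get)
    return n_families, n_proteins, largest


print(summarize(A_CINNAMOMEA_CYP))  # (39, 96, 'CYP5150')
```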

Table S10. Sequences of the putative CYP protein coding genes.

CYP family E-value Hit coverage Query coverage Gene ID

CYP512G2 2.00E-116 85.74% 93.99% ACg000015

CYP5139A1 3.00E-164 85.34% 100.21% ACg000090

CYP504E5 1.00E-07 33.43% 34.12% ACg000167

CYP5136A2 6.00E-53 97.86% 67.72% ACg000189

CYP504E5 5.00E-08 38.12% 38.58% ACg000299

CYP505K1 4.00E-26 25.45% 24.59% ACg000657

CYP53A15 1.00E-177 97.29% 105.70% ACg000810

CYP5037B2 7.00E-176 46.26% 85.29% ACg000830

CYP5158A1 6.00E-153 86.57% 77.31% ACg001138

CYP5037B2 3.00E-111 89.08% 86.59% ACg001139

CYP5150A1 1.00E-179 96.02% 90.46% ACg001140

CYP5150A1 3.00E-177 96.00% 89.95% ACg001143

CYP5152A1 3.00E-134 90.79% 55.10% ACg001165

CYP5153A1 5.00E-68 95.89% 47.21% ACg001266

CYP5144C5 6.00E-140 95.05% 95.93% ACg001456

CYP5037 7.00E-12 85.98% 26.06% ACg001608

CYP645A1 1.00E-09 53.98% 22.73% ACg001635

CYP5150B1 5.00E-169 99.09% 88.15% ACg001953

CYP5150A1 2.00E-177 95.35% 87.39% ACg001957

CYP5150A1 4.00E-169 97.59% 96.76% ACg001959

CYP51F1 9.00E-162 91.32% 99.09% ACg002141

CYP512A1 1.00E-143 99.05% 104.41% ACg002478

CYP512A1 9.00E-180 99.40% 99.80% ACg002480

CYP5205A1 1.00E-57 90.30% 53.95% ACg002607

CYP5144C3 9.00E-136 98.04% 92.75% ACg002666

CYP63A3 9.00E-172 96.97% 94.44% ACg002691

CYP504E5 2.00E-07 51.51% 45.70% ACg002752

CYP5035C1 3.00E-175 92.01% 95.41% ACg002804

CYP512A1 1.00E-172 99.01% 100.00% ACg002875

CYP512A1 2.00E-178 98.39% 98.00% ACg003106

CYP5157A1 6.00E-85 85.26% 72.91% ACg003117

CYP5205A3 1.00E-42 97.45% 48.11% ACg003260

CYP5145A3 2.00E-117 97.29% 95.99% ACg003266

CYP504E5 1.00E-07 40.58% 41.54% ACg003550

CYP63A1 8.00E-99 95.07% 62.31% ACg003854

CYP63A3 2.00E-159 95.01% 106.06% ACg003857

CYP512A1 2.00E-103 92.28% 98.20% ACg004532

CYP6001C16 8.00E-180 97.75% 91.32% ACg004767

CYP5146A3 1.00E-11 80.90% 13.00% ACg004856

CYP5144C3 3.00E-116 98.92% 102.60% ACg005005

CYP5136A1 2.00E-150 98.53% 98.89% ACg005042

CYP5150A5 5.00E-179 97.04% 93.30% ACg005224

CYP5151B1 1.00E-27 95.95% 47.70% ACg005237

CYP5037B3 5.00E-142 98.46% 95.18% ACg005326

CYP63C1 3.00E-157 98.98% 97.33% ACg005354

CYP5037B1 6.00E-75 95.97% 85.77% ACg005358

CYP504E5 5.00E-08 45.45% 43.03% ACg005478

CYP5154A1 6.00E-91 99.12% 88.58% ACg005537

CYP504E5 9.00E-16 50.00% 43.62% ACg005609

CYP5150A2 3.00E-179 57.54% 88.63% ACg005625

CYP5150A5 2.00E-169 97.87% 92.46% ACg005628

CYP537B1 9.00E-94 99.05% 88.03% ACg005634

CYP5151A2 0 95.69% 89.28% ACg005641

CYP5138A1 4.00E-127 94.26% 90.73% ACg005664

CYP5150C1 2.00E-11 90.70% 34.21% ACg005670

CYP5150A2 2.00E-58 98.98% 31.96% ACg005671

CYP504E5 2.00E-15 52.07% 44.81% ACg005697

CYP5150A4 8.00E-138 95.44% 93.91% ACg006042

CYP5141A2 1.00E-178 88.97% 95.91% ACg006201

CYP5144C1 9.00E-140 98.38% 87.91% ACg006355

CYP5053C1 3.00E-07 63.48% 11.57% ACg006373

CYP5150A4 3.00E-100 89.97% 84.78% ACg006374

CYP619D1 2.00E-95 92.09% 76.74% ACg006376

CYP5144C3 2.00E-151 95.43% 93.12% ACg006474

CYP5139A1 2.00E-159 86.59% 103.11% ACg006484

CYP5149A1 8.00E-11 10.32% 10.98% ACg006496

CYP5139B2 5.00E-77 90.30% 80.51% ACg006818

CYP5156A1 1.00E-147 89.06% 65.73% ACg006835

CYP505D5 4.00E-45 98.73% 22.00% ACg006872

CYP505D4 7.00E-17 92.24% 9.64% ACg006873

CYP505D6 1.00E-36 92.23% 16.84% ACg006874

CYP5037B4 2.00E-146 94.56% 90.19% ACg006982

CYP5154A1 3.00E-67 84.52% 81.69% ACg007044

CYP63C1 2.00E-170 98.98% 97.50% ACg007189

CYP6004A2 1.00E-149 80.88% 72.92% ACg007568

CYP539E1 7.00E-32 18.08% 25.95% ACg007616

CYP620H5 4.00E-92 91.30% 80.98% ACg007671

CYP5136A2 3.00E-58 96.78% 94.81% ACg007676

CYP5158A1 1.00E-122 93.08% 89.30% ACg007677

CYP5146B1 3.00E-125 25.83% 82.59% ACg007686

CYP66A1 3.00E-38 86.50% 56.40% ACg007885

CYP530A3 5.00E-76 87.66% 86.37% ACg007894

CYP504E5 2.00E-06 8.26% 33.83% ACg007905

CYP5140A1 4.00E-162 95.03% 98.92% ACg007979

CYP504E5 5.00E-07 12.14% 32.64% ACg007984

CYP5140A1 4.00E-154 88.25% 105.17% ACg007985

CYP5140A1 2.00E-153 88.76% 95.26% ACg008012

CYP504E5 3.00E-07 46.69% 45.99% ACg008209

CYP5144C4 1.00E-31 99.26% 26.37% ACg008523

CYP53C2 2.00E-160 99.60% 90.81% ACg008539

CYP645A1 9.00E-08 60.90% 24.40% ACg008558

CYP505A16 3.00E-13 17.66% 22.53% ACg008683

CYP609A1 7.00E-60 83.08% 84.20% ACg008700

CYP63A1 1.00E-180 99.12% 86.62% ACg008739

CYP5082B1 2.00E-91 88.17% 76.57% ACg008853

CYP5293A1 8.00E-85 93.46% 85.11% ACg008858

Table S11. Enzymes for ubiquinone biosynthesis.

Enzyme | Query | Function | Gene ID | Query% | E-val | Identity | Top nr Blastp hits with functional description

COQ1 | AtSPS1 | polyprenyl diphosphate synthase | ACg003004 | 0.72 | 9E-50 | 0.38 | terpenoid synthase [Stereum hirsutum FP-91666 SS1]; terpenoid synthase [Trametes versicolor FP-101664 SS1]

COQ2 | AtPPT1 | PHB:polyprenyl diphosphate transferase | ACg000665 | 0.87 | 2E-85 | 0.4 | 4-hydroxybenzoate polyprenyl transferase [Dichomitus squalens LYAD-421 SS1]; 4-hydroxybenzoate polyprenyl transferase [Trametes versicolor FP-101664 SS1]

 | | | ACg007085 | 0.69 | 4E-64 | 0.41 | 4-hydroxybenzoate polyprenyl transferase [Gloeophyllum trabeum ATCC 11539]; UbiA prenyltransferase [Trametes versicolor FP-101664 SS1]

 | | | ACg007592 | 0.72 | 6E-46 | 0.34 | 4-hydroxybenzoate polyprenyl transferase [Gloeophyllum trabeum ATCC 11539]; UbiA prenyltransferase [Trametes

 | | | ACg007895 | 0.78 | 1E-44 | 0.31 | UbiA prenyltransferase [Dichomitus squalens LYAD-421 SS1]; UbiA prenyltransferase [Trametes versicolor FP-101664 SS1]

COQ3 | AtCOQ3 | Dihydroxyhexaprenylbenzoate methyltransferase; Hexaprenyldihydroxybenzoate methyltransferase | ACg004603 | 0.74 | 4E-47 | 0.33 | ubiquinone biosynthesis O-methyltransferase [Dichomitus squalens LYAD-421 SS1]; ubiquinone biosynthesis O-methyltransferase [Trametes versicolor FP-101664 SS1]

COQ4 | AT2G03690 | Coenzyme Q (ubiquinone) biosynthesis protein Coq4 | ACg001939 | 0.95 | 7E-54 | 0.37 | ubiquinone biosynthesis protein COQ4 mitochondrial [Trametes versicolor FP-101664 SS1]; ubiquinone biosynthesis protein COQ4 [Coprinopsis cinerea okayama7#130]; mitochondrial

COQ5 | AT5G57300 | 2-methoxy-6-polyprenyl-1,4-benzoquinol methylase | ACg002725 | 0.91 | 1E-83 | 0.49 | UbiE/COQ5 methyltransferase [Stereum hirsutum FP-91666 SS1]; hypothetical protein TRAVEDRAFT_122824 [Trametes versicolor FP-101664 SS1]

COQ6 | AT3G24200 | ubiquinone biosynthesis monooxygenase COQ6; Ubi-OHases | ACg006610 | 0.84 | 5E-53 | 0.3 | ubiquinone biosynthesis hydrox [Dichomitus squalens LYAD-421 SS1]; ubiquinone biosynthesis hydrox [Trametes versicolor FP-101664 SS1]

COQ7 | Ecoli_ubiF | Mono-oxygenase; 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol oxygenase | ACg006610 | 0.86 | 2E-25 | 0.31 | ubiquinone biosynthesis hydrox [Dichomitus squalens LYAD-421 SS1]; ubiquinone biosynthesis hydrox [Trametes versicolor FP-101664 SS1]

COQ8 | COQ8, C. elegans | AarF; Predicted unusual protein kinase | ACg005287 | 0.45 | 2E-16 | 0.27 | atypical/ABC1/ABC1-B protein kinase [Trametes versicolor FP-101664 SS1]; ABC1-domain-containing protein [Dichomitus squalens LYAD-421 SS1]; TetR_N family, mitochondria

COQ9 | AT1G19140 | Ubiquinone biosynthesis protein COQ9 | ACg001889 | 0.66 | 3E-13 | 0.24 | ubiquinone biosynthesis protein coq9 [Moniliophthora roreri MCA 2997]; hypothetical protein TRAVEDRAFT_169640 [Trametes


Table S12. Number of genes in different functional categories related to terpenoid biosynthesis. Only genes with RPKM ≥ 0.5 in at least one of the four transcriptomes are considered expressed. DEGs were identified with the Bioconductor package NOISeq (q = 0.8).

Description Expression Gene counts

A. cinnamomea str. S27 all genes 9,254

RPKM≥0.5 8,079

DEGs 4,898

Terpenoid related genes all genes 486

RPKM≥0.5 481

DEGs 314

Triterpenoid pathway genes all genes 242

RPKM≥0.5 242

DEGs 185

P450 all genes 119

RPKM≥0.5 119

DEGs 92

Terpenoid backbone biosynthesis (KEGG map: 00900) all genes 20

RPKM≥0.5 20

DEGs 17
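The expression criterion used in Table S12 (RPKM ≥ 0.5 in at least one of the four transcriptomes) can be sketched as follows. The RPKM function uses the standard definition (reads per kilobase of transcript per million mapped reads); the function names and the example numbers are hypothetical and are not taken from the original pipeline.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_expressed(rpkm_values, threshold=0.5):
    """True if the gene reaches the threshold in at least one transcriptome."""
    return any(value >= threshold for value in rpkm_values)


# Hypothetical gene: 1,500 bp long, 30 reads, 40 million mapped reads.
print(rpkm(30, 1500, 40_000_000))  # 0.5

# Hypothetical RPKM values across four transcriptomes.
print(is_expressed([0.1, 0.4, 0.0, 0.6]))   # True
print(is_expressed([0.1, 0.4, 0.0, 0.49]))  # False
```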

Table S13. Functional description of the genes uniquely expressed in AT and AM. (A) Genes uniquely expressed in AT. (B) Genes uniquely expressed in both AT and AM. (C) Genes uniquely expressed in AM. The function of each uniquely expressed gene was determined from BLASTp hits with E-value < 1e-10 using three criteria, applied in order:

(1) The best hit among hits with a known protein function.

(2) The best hit among hits with a hypothetical function definition, if no hit has a known protein function.

(3) The best hit overall, if no hit has either a known protein function or a hypothetical function definition.
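The three criteria above amount to a tiered selection over BLASTp hits. The sketch below is one possible reading, not the authors' implementation: it assumes that "best hit" means the lowest E-value, that hits are dictionaries with hypothetical `description`, `species` and `evalue` fields, and that simple keyword lists are enough to separate known, hypothetical and uninformative descriptions.

```python
E_VALUE_CUTOFF = 1e-10

# Keyword heuristics (assumptions for illustration only).
HYPOTHETICAL_KEYWORDS = ("hypothetical",)
UNINFORMATIVE_KEYWORDS = ("predicted protein", "expressed protein",
                          "unnamed protein product")


def tier(description):
    """1 = known function, 2 = hypothetical definition, 3 = neither."""
    text = description.strip().lower()
    if any(key in text for key in HYPOTHETICAL_KEYWORDS):
        return 2
    if not text or any(key in text for key in UNINFORMATIVE_KEYWORDS):
        return 3
    return 1


def pick_annotation(hits):
    """Return the hit chosen by criteria (1)-(3), or None if no hit passes."""
    passing = [hit for hit in hits if hit["evalue"] < E_VALUE_CUTOFF]
    if not passing:
        return None  # reported as "No blast hit" in the table
    # Prefer the lowest tier; within a tier, prefer the lowest E-value.
    return min(passing, key=lambda hit: (tier(hit["description"]), hit["evalue"]))


example_hits = [
    {"description": "hypothetical protein X", "species": "sp. A", "evalue": 1e-50},
    {"description": "terpenoid synthase", "species": "sp. B", "evalue": 1e-20},
    {"description": "predicted protein", "species": "sp. C", "evalue": 1e-60},
]
print(pick_annotation(example_hits)["description"])  # terpenoid synthase
```

A known-function hit is preferred here even when a hypothetical or uninformative hit has a lower E-value, which is how criteria (1) to (3) are ordered.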

(A) The genes uniquely expressed in AT.

Gene ID Function Hit species

ACg004031 NAD-binding protein Fomitiporia mediterranea

ACg004725 BTB domain-containing protein Rhizoctonia solani

ACg004753 putative mitochondrial carrier C8C9,12c Rhizoctonia solani

ACg005279 ankyrin repeat protein Aspergillus fumigatus

ACg006161 GPI-anchored domain-containing protein Rhizoctonia solani

ACg008058 other/FunK1 protein kinase Coprinopsis cinerea okayama

ACg000902 hypothetical protein TRAVEDRAFT_84145, partial Trametes versicolor

ACg002837 hypothetical protein PHACADRAFT_206652 Phanerochaete carnosa

ACg000033 - No blast hit

(B) The genes uniquely expressed in both AT and AM.

Gene ID Function Hit species


ACg000962 expressed protein Schizophyllum commune

ACg002403 alpha/beta-hydrolase Trametes versicolor

ACg002720 MRP-L20 domain-containing protein Rhizoctonia solani

ACg005652 PREDICTED: von Willebrand factor-like Ciona intestinalis

ACg006297 expressed protein Agaricus bisporus var. bisporus

ACg006365 MFS general substrate transporter Trametes versicolor

ACg006501 sure-like protein Dichomitus squalens

ACg007569 DNA/RNA polymerase Auricularia delicata

ACg007886 terpenoid synthase Stereum hirsutum


ACg007584 predicted protein Postia placenta

ACg005863 - No blast hit

(C) The genes uniquely expressed in AM.

Gene ID Function Hit species


ACg002543 TPA_exp: reverse transcriptase/ribonuclease H Coprinopsis cinerea

ACg005055 NAD(P)-binding protein Coniophora puteana

ACg005761 DUF1769-domain-containing protein Trametes versicolor


Table S14. Gene counts per cluster for the differentially expressed genes of the triterpenoid pathway.

C1 C2 C3 C4 Total

Triterpenoid 34 36 67 48 185

P450 14 22 28 28 92

Table S15. IDs and definitions of the GO terms that are enriched in only one DEG cluster of the triterpenoid pathway. Clusters C1, C2, C3, and C4 contain 34, 36, 67, and 48 genes, respectively.

GO ID | GO term | Definition | Enriched cluster

GO:0006696 | Ergosterol biosynthetic process | The chemical reactions and pathways resulting in the formation of ergosterol, (22E)-ergosta-5,7,22-trien-3-beta-ol, a sterol found in ergot, yeast and moulds. | C1

GO:0051762 | Sesquiterpene biosynthetic process | The chemical reactions and pathways resulting in the formation of sesquiterpenes, any of a class of terpenes of the formula C15H24 or a derivative of such a terpene. | C1

GO:0071770 | DIM/DIP cell wall layer assembly | The aggregation, arrangement and bonding together of a set of components, including (phenyl)phthiocerol, phthiodiolone, phthiotriol dimycocerosate and diphthioceranate, to form the DIM/DIP layer of the Actinobacterium-type cell wall. | C2

GO:0004364 | Glutathione transferase activity | Catalysis of the reaction: R-X + glutathione = H-X + R-S-glutathione. R may be an aliphatic, aromatic or heterocyclic group; X may be a sulfate, nitrile or halide group. | C3

GO:0006537 | Glutamate biosynthetic process | The chemical reactions and pathways resulting in the formation of glutamate, the anion of 2-aminopentanedioic acid. | C4

GO:0016712 | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced flavin or flavoprotein as one donor, and incorporation of one atom of oxygen | Catalysis of an oxidation-reduction (redox) reaction in which hydrogen or electrons are transferred from reduced flavin or flavoprotein and one other donor, and one atom of oxygen is incorporated into one donor. | C4

Table S16. Number of genes annotated with GO terms that are enriched within the differentially expressed gene clusters of triterpenoid genes.

All gene set C1 C2 C3 C4

32 46 62 40

GO:0004497 Monooxygenase activity 19 5 6

GO:0008202 Steroid metabolic process 20 6 4

GO:0043231 Intracellular membrane-bounded organelle 9 2 4

GO:0005506 Iron ion binding 83 21 10 15

GO:0009055 Electron carrier activity 70 18 10 9

GO:0045482 Trichodiene synthase activity 7 2 3

GO:0055114 Oxidation-reduction process 125 6 18 11

GO:0020037 Heme binding 92 5 21 9 15

GO:0016705 Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen 51 4 19 8 9

GO:0071770 DIM/DIP cell wall layer assembly 7 2

GO:0019748 Secondary metabolic process 7 4

GO:0004364 Glutathione transferase activity 9 4

GO:0006790 Sulfur compound metabolic process 14 3

GO:0016712 Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced flavin or flavoprotein as one donor, and incorporation of one atom of oxygen 6 2

GO:0006537 Glutamate biosynthetic process 7 2

Table S17. Classification of putative carbohydrate metabolism proteins of A. cinnamomea based on the CAZy database.

Species A. cinnamomea G. lucidum P. chrysosporium P. placenta L. bicolor

GH 78 288 190 252 168

GT 54 70 65 100 87

PL 1 10 4 8 7

CE 19 30 16 21 17

CBM 59 53 47 28 24

AA1 15 13 0 3 8

AA2 8 16 2 1

AA3 3 5 8 12 5

AA4 0 0 0 0 0

AA5 4 9 7 2 9

AA6 0 1 4 2 2

AA7 0 0 0 0 0

AA8 0 2 2 0 0

AA9 4 0 0 0 0

AA10 0 0 0 0 0

AA11 0 0 0 0 0

GH: Glycoside Hydrolases. GT: Glycosyl Transferases. PL: Polysaccharide Lyases. CE: Carbohydrate Esterases. CBM: Carbohydrate-Binding Modules. AA: Auxiliary Activities. AA1: laccases, ferroxidases and laccase-like multicopper oxidases. AA2: class II lignin-modifying peroxidases. AA3: flavin-adenine dinucleotide (FAD)-binding domain. AA4: vanillyl-alcohol oxidase. AA5: copper radical oxidases family. AA6: benzoquinone reductases. AA7: glucooligosaccharide oxidase. AA8: iron reductase domain. AA9-AA11: copper-dependent lytic polysaccharide monooxygenases (LPMOs).

Table S18. rRNA and tRNA prediction of A. cinnamomea and comparison with G.

lucidum.

A. cinnamomea

Input S27.v2.5.fasta

Genome Bases 32177404

Length in tRNA 14040

Length in rRNA 7895

# tRNA 148 (14 pseudo)

# rRNA 3

rRNA 8S, 18S, 28S

Table S19. A. cinnamomea S27 ribosomal RNA loci and conserved regions. RNAmmer v.1.2 predicted one ribosomal RNA cluster containing three rRNAs: 8S, 18S and 28S. From 5’ to 3’ the order is 8S, 18S, and 28S, with a total length of about 10 kb. RNAmmer predicts rRNA sequences by structural alignment against an HMM model and may over-estimate their extent, because the structural folding of a sequence depends on its length and content. To avoid over-estimation, we used Blastn to identify a sequence-conserved 8S-18S-28S region of about 7 kb, which is similar in size to the rRNA repeat cluster of Aspergillus fumigatus (1). Conserved regions (*) are those with significant Blastn alignments.

rRNA Scaffold Start End Length Strand Conserved start* Conserved end* Conserved length*

8S scaf10 1,156,450 1,156,563 114 - 1,156,450 1,156,563 114

18S scaf10 1,158,908 1,160,712 1,805 + 1,158,908 1,160,712 1,805

28S scaf10 1,160,933 1,166,911 5,979 + 1,160,933 1,163,715 2,783
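The lengths in Table S19 and the approximate 10 kb and 7 kb spans quoted in the caption follow directly from the coordinates. A minimal Python sketch, assuming closed 1-based intervals (an assumption that reproduces the tabulated lengths) and using illustrative names:

```python
# Coordinates on scaf10, transcribed from Table S19:
# (predicted start, predicted end, conserved start, conserved end).
RRNA_LOCI = {
    "8S": (1_156_450, 1_156_563, 1_156_450, 1_156_563),
    "18S": (1_158_908, 1_160_712, 1_158_908, 1_160_712),
    "28S": (1_160_933, 1_166_911, 1_160_933, 1_163_715),
}


def span(start, end):
    """Length in bases of a closed, 1-based interval (assumed convention)."""
    return end - start + 1


def cluster_span(loci, start_idx, end_idx):
    """Span from the leftmost start to the rightmost end across all loci."""
    starts = [coords[start_idx] for coords in loci.values()]
    ends = [coords[end_idx] for coords in loci.values()]
    return span(min(starts), max(ends))


predicted = {name: span(c[0], c[1]) for name, c in RRNA_LOCI.items()}
conserved = {name: span(c[2], c[3]) for name, c in RRNA_LOCI.items()}
print(predicted)  # {'8S': 114, '18S': 1805, '28S': 5979}
print(conserved)  # {'8S': 114, '18S': 1805, '28S': 2783}
print(cluster_span(RRNA_LOCI, 0, 1))  # 10462 (about 10 kb)
print(cluster_span(RRNA_LOCI, 2, 3))  # 7266 (about 7 kb)
```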

References

1. Nierman WC, et al. (2005) Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature 438(7071):1151-1156.

Table S20. Statistics of repeat sequence from RepeatModeler. Total repetitive

sequence accounts for 17.7% of the S27 scaffolds.

Repeat class Number of elements Length occupied (bp) Percentage of sequence (%)

SINEs 0 0 0

ALUs 0 0 0

MIRs 0 0 0

LINEs 502 747,611 2.32

LINE1 0 0 0

LINE2 0 0 0

L3/CR1 0 0 0

LTR elements 1,384 2,245,758 6.98

ERVL 0 0 0

ERVL-MaLRs 0 0 0

ERV_classI 0 0 0

ERV_classII 0 0 0

DNA elements 274 199,816 0.62

hAT-Charlie 0 0 0

TcMar-Tigger 0 0 0

Unclassified 4,261 2,198,660 6.83

Total interspersed repeats 5,391,845 16.76

Small RNA 0 0 0

Satellites 0 0 0

Simple repeats 2,942 283,656 0.88

Low complexity 398 21,307 0.07

Total repetitive 5,695,889 17.7

Table S21. Statistics of the genomic Illumina reads after pre-processing steps.

S27 S32

# Raw Reads

Read length PE 2*100 PE 2*100

Read count 113,309,162 127,738,466

Total base (bp) 11,043,231,184 12,449,502,738

# Read Trimming (by Trimmomatic-3.0)

Total bases remained (%) 58.63 58.31

Sequence coverage 201 347

# Mapping onto s27v2.5 scaffold genome

Input reads 83,652,588 94,654,661

Mapped (%) 98.40 93.96

Genome coverage (%) 98.20 97.01

Table S22. SNP analysis and occurrence of SNPs in genes uniquely expressed in S27

and/or S32.

Regions Size (bp) SNP counts SNP density (per kb)

Whole genome 32,177,404 98,930 3.07

Exonic regions 13,925,609 33,551 2.41

Non-exonic regions 18,251,795 65,379 3.58

Uniquely expressed genes (98) 90,729 235 2.59

S27 uniquely expressed genes (38) 44,766 60 1.34

S32 uniquely expressed genes (28) 24,336 30 1.23

Genes uniquely expressed in S27 AND S32 (32) 21,627 145 6.70
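The SNP densities in Table S22 are SNP counts divided by region size and scaled to 1 kb. The sketch below recomputes them from the tabulated sizes and counts; the short region labels and function name are illustrative.

```python
# Region -> (size in bp, SNP count), transcribed from Table S22.
REGIONS = {
    "Whole genome": (32_177_404, 98_930),
    "Exonic regions": (13_925_609, 33_551),
    "Non-exonic regions": (18_251_795, 65_379),
    "Uniquely expressed genes (98)": (90_729, 235),
    "S27 uniquely expressed genes (38)": (44_766, 60),
    "S32 uniquely expressed genes (28)": (24_336, 30),
    "Uniquely expressed in S27 AND S32 (32)": (21_627, 145),
}


def snp_density_per_kb(size_bp, snp_count):
    """SNPs per kilobase of sequence."""
    return snp_count / size_bp * 1000


densities = {
    region: round(snp_density_per_kb(size, snps), 2)
    for region, (size, snps) in REGIONS.items()
}
for region, density in densities.items():
    print(f"{region}: {density:.2f}")
# First line printed: "Whole genome: 3.07"
```

The exonic and non-exonic rows also sum to the whole-genome row, both in size (32,177,404 bp) and in SNP count (98,930).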

Table S23. Expression of the triterpenoid pathway and P450 genes in the six clusters

identified on different scaffolds shown in Figure S7.

P450

Cluster Gene ID S27 S32 AM AT

Scaf2

ACg000810 109.6 221 108.9 136.3

ACg000817 16.6 7.9 3.9 2.5

ACg000822 203 63.4 169.4 202.7

ACg000830 4.9 8 3.2 25.3

ACg001138 22.1 23.7 3.5 3.6

ACg001139 67 41.3 10.8 27.9

ACg001140 26 10.2 12.3 22.2

ACg001143 25.1 7.5 4.2 13.8

Scaf4

ACg001953 69.8 23 24.4 19.5

ACg001954 310.1 126.6 29.4 207.6

ACg001957 4.9 5.9 6.6 27

ACg001959 261.2 17.2 3.7 28.6

ACg001961 13.3 14.7 20.9 6.3

Scaf5

ACg002403 0.5 0.4 1.3 10.6

ACg002404 200.5 56.3 28.4 166.1

ACg002478 2 4.4 1.2 5.9

ACg002480 13.2 2.5 2 117.6

ACg002691 60.3 32.1 50.3 8.6

ACg002693 119.5 8.2 13.6 13.6

Scaf14

ACg005615 37.7 19.9 7.2 64.2

ACg005625 4.2 14.8 4.9 46.2

ACg005628 118.6 14.8 25 43.2

ACg005634 4.4 12.5 5.8 12.5

ACg005636 4.9 3.8 1.6 19.2

ACg005641 23.7 46.7 39.5 10.4

Scaf17

ACg006374 29 16.1 12.4 18.4

ACg006376 12.5 18.3 4 9.1

ACg006378 3.1 13 37.6 12.4

Scaf24

ACg007671 15 24.3 15 2.5

ACg007676 62.5 55.4 61.6 11.8

ACg007677 23.6 36.4 36.5 11.4

ACg007686 30.2 24 37.6 42.6
