Chiang Mai J. Sci. 2014; 41(4) : 922-944
http://epg.science.cmu.ac.th/ejournal/
Contributed Paper

Assessing the Impact of Dominant Sequencing-Based Gene Expression Profiling Techniques (SGEPTs) on Phytopathogenic Fungi

Dawei Wang and Youliang Peng*
State Key Laboratory of Agrobiotechnology and Ministry of Agriculture Key Laboratory of Plant Pathology, China Agricultural University, Beijing 100193, P.R. China.
*Author for correspondence; e-mail: [email protected]

Received: 2 January 2014
Accepted: 20 January 2014

ABSTRACT
The application of transcriptomics for the selection of candidate genes responsible for fungal pathogenicity or fungal-plant interactions is one of the main methods for studying the aetiology of fungal pathogenesis. The major challenge for researchers is to choose an effective technique for the analysis of transcriptomes. Sequencing-based gene expression profiling techniques (SGEPTs) are powerful tools that are used for this purpose. The analytical SGEPTs, namely extensive sequencing of expressed sequence tags (ESTs), serial analysis of gene expression (SAGE), massively parallel signature sequencing (MPSS), and a revolutionary tool known as RNA sequencing (RNA-seq), have accelerated our understanding of the molecular mechanism of fungal pathogenesis. In this review, we systematically present the methodology and technique considerations of each dominant SGEPT, highlight the application of these techniques in phytopathogenic fungal studies, and briefly consider the future sequencing platform for RNA-seq and the potential application of RNA-seq techniques in phytopathogenic fungi.

Keywords: phytopathogenic fungi, EST, SAGE, MPSS, RNA-seq

1. INTRODUCTION
Phytopathogenic fungi have an enormous impact on human welfare by causing diseases, such as rice blast, in important crops [1]. Rice blast, caused by the filamentous ascomycete Magnaporthe oryzae, is able to reduce the world's annual rice harvest by 18% [2]. Therefore, it is of primary importance to effectively control these diseases. The study of the pathogenicity factors of phytopathogenic fungi and of fungus-host interactions is instrumental for the development of novel disease control strategies.

With the availability of numerous fungal genomic resources (http://www.broad.mit.edu; http://genome.jgi-psf.org), transcriptomics is increasingly attractive for dissecting the molecular basis of fungal-plant interactions and pathogenesis [1, 3-7]. Recent advances in nucleic acid sequencing technologies favour the application of transcriptomics to select, from among the thousands of genes encoded in a genome, candidate genes responsible for fungal pathogenicity or fungal-plant interactions for detailed study [8, 9]. SGEPTs, which have kept pace with sequencing technologies, are powerful tools for transcriptome analysis. These techniques comprise extensive sequencing of ESTs, SAGE, MPSS, and RNA-seq [10-13]. Among them, RNA-seq has recently been used more frequently than the other platforms. The goal of this article is to systematically present the methodology and technique considerations of each of these dominant SGEPTs, highlight the application of these techniques in phytopathogenic fungal studies, and briefly consider the future sequencing platform for RNA-seq and potential applications of RNA-seq related techniques in phytopathogenic fungi.

2. SGEPTS
SGEPTs (ESTs, SAGE, MPSS, and RNA-seq) were developed to directly determine cDNA sequences from tissues or cells of interest, based on the sequencing platforms available. They are powerful tools and play important roles in identifying genes that are responsible for fungal pathogenesis and fungal-host interaction.

2.1 Extensive Sequencing ESTs and Applications

ESTs are single-read sequences generated by partial sequencing of a bulk mRNA pool. Distinct ESTs are partial sequences that may correspond to different mRNAs or to the same mRNA [14]. Extensive sequencing of ESTs provides a representation of the transcriptome at a particular time as well as information about gene expression patterns. This technique was first used for quantitative gene expression profiling by Adams et al. [10] to assess human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. During the past decades, it has been commonly employed in the field of fungal plant pathology to profile the transcriptional response during fungal pathogenesis and fungus-host interaction.

Extensive sequencing of ESTs is a convenient approach for gene expression profiling, and the common workflow is shown in Figure 1. It is a simple, well-established and accessible technique that most researchers can readily apply. However, large EST collections were built by Sanger sequencing of numerous random cDNAs and, because of the cost, lack the sequencing depth needed for transcriptome analysis. This drawback has been overcome by the introduction of next-generation sequencing technology, which is discussed below in the 'RNA-seq' section.
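
As a rough illustration of how EST collections yield expression information, the sketch below counts how many ESTs fall into each unique sequence (unisequence) and reports the fraction of the library each one represents. The function name, the EST identifiers and the cluster labels are hypothetical, and the clustering or assembly step that assigns ESTs to unisequences is assumed to have been done elsewhere.

```python
from collections import Counter

def est_profile(est_to_cluster):
    """Return {unisequence: fraction of all ESTs} from an EST -> unisequence map."""
    counts = Counter(est_to_cluster.values())
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}

# Hypothetical assignments; in practice these come from a clustering step.
assignments = {
    "est001": "uniseq_A",
    "est002": "uniseq_A",
    "est003": "uniseq_B",
    "est004": "uniseq_A",
}
profile = est_profile(assignments)
```

With so few reads per library, such fractions are only coarse estimates, which is the depth limitation noted above.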

The extensive sequencing of ESTs technique has been used for comparative genomic analysis of phytopathogenic fungi by Soanes and Talbot [5]. A total of 57,727 unique ESTs from 15 phytopathogenic species and three saprobic fungal species, held within the COGEME phytopathogen EST database (http://cogeme.ex.ac.uk/), were used for comparative analysis. Differences were found between pathogenic and free-living fungi based on a substantial collection of expressed gene sequences and available completed fungal genome sequences. The expressed gene inventories of pathogenic fungi were not significantly more similar to each other than to those of free-living filamentous fungi, and filamentous fungi as a group shared more sequences with one another than with the free-living yeast species Saccharomyces cerevisiae. An interesting discovery was that the ESTs of the obligate biotrophic fungus Blumeria graminis f. sp. hordei were notably different from those of all other fungal species assessed, having fewer sequences in common with the filamentous ascomycetes studied to date and a larger proportion of uni-sequences of unknown function. EST analysis in the COGEME database supported the identification of a set of functional groups of genes that are more highly represented in the genomes of pathogenic fungi than in those of non-pathogenic species. Similar approaches have been used by Braun et al. [15] to compare ESTs from Neurospora crassa with the S. cerevisiae genome, and Austin et al. [16] performed a comparative analysis of U. maydis ESTs with species-based genomic and EST sequence databases. Comparison of sequence data from phytopathogenic fungi with those of closely related non-pathogenic species can provide information about genetic factors that are vital for pathogenesis, or at least about gene segments that are consistently conserved in species with the capacity to cause fungal diseases [5].

Figure 1. The workflow of the extensive sequencing of ESTs technique for investigating gene expression profiling. (a) mRNA extraction. The mRNA is extracted and purified from the RNA samples. (b) cDNA library construction. Once fungal mRNA is obtained, oligo(dT) is used to bind the poly(A) tail and prime synthesis of the first strand of cDNA (sscDNA) by reverse transcriptase, and the sscDNA is subsequently converted into double-stranded DNA with the help of DNA polymerase. Restriction endonucleases and DNA ligase are used to clone the sequences into an appropriate plasmid. The obtained clones are selected to make the final cDNA library. (c) Clone sequencing and data analysis. Individual clones are sequenced by the Sanger method to generate ESTs for quantitative gene expression profiling.

The extensive sequencing of ESTs approach has also been used to identify soluble secreted proteins from appressoria of Colletotrichum higginsianum [17]. A total of 1,442 EST sequences (518 unique sequences) were generated by bidirectional sequencing of 980 individual clones from a cDNA library prepared from mature appressoria. Among these unique sequences, 353 showed significant similarity to entries in the NCBI non-redundant protein database, of which 49 were homologous to experimentally verified fungal pathogenicity genes. The predicted ORFs of the unique sequences were screened for potential signal peptides with SignalP. The data highlighted 53% of the unique sequences as predicted to encode proteins entering the secretory pathway, of which 26 were likely to encode soluble secreted proteins. RT-PCR confirmed that seven genes encoding secreted proteins of unknown function were up-regulated in appressoria and expressed early during plant infection; two of these were Colletotrichum-specific genes and were therefore deemed to be candidate effectors.

Extensive sequencing of ESTs has proven to be a rapid approach for characterising genes that regulate infection-related morphogenesis in phytopathogenic fungi, as shown by Zhang et al. [18]. A total of 4,798 individual ESTs (1,118 uni-sequences) were collected from a cDNA library prepared from germinated urediniospores. BLASTX analysis of these unique sequences showed that 267 exhibited significant similarity to functionally characterized proteins, and 149 were homologous to hypothetical proteins. Several ESTs were homologs of verified fungal pathogenicity or virulence factors, such as HESP767 of the flax rust fungus and PMK1, GAS1, and GAS2 from the rice blast fungus. Quantitative real-time PCR assays of six ESTs (Ps28, Ps85, Ps87, Ps259, Ps261, and Ps159) showed that all of them had the highest transcript level in germinated urediniospores and a much lower transcript level in ungerminated urediniospores and infected wheat tissues. The data showed that genes expressed in germinated urediniospores of P. striiformis f. sp. tritici can be identified by extensive sequencing of ESTs.

Recent studies have demonstrated that the extensive sequencing of ESTs approach can be used successfully to identify and analyse large numbers of genes that are involved in fungal-plant interactions [19]. Six cDNA libraries were prepared from infected leaf tissues harvested under six conditions: resistant, partially resistant, and susceptible reactions, each at 6 and 24 h after inoculation. In addition, two control libraries were constructed from un-inoculated leaves and from leaves of the lesion mimic mutant spl11. A total of 68,920 ESTs (13,570 unisequences) were generated from these eight libraries. Among the unisequences, 5,699 were predicted to have a putative gene function. Comparison of the pathogen-challenged libraries with the un-inoculated control library revealed that genes in the functional categories of defense and signal transduction mechanisms and of cell cycle control, cell division, and chromosome partitioning were distinctly enriched. Hierarchical clustering analysis grouped the eight libraries according to their disease reactions, and up to 7,748 new and unique ESTs were identified by comparison with the KOME full-length cDNA collection. The data showed that extensive sequencing of mixed ESTs from phytopathogenic fungi and their hosts provides a solid foundation for further elucidating the genetic mechanisms of disease resistance and the plant-fungus interaction.

2.2 Serial Analysis of Gene Expression (SAGE) and Applications

SAGE is an efficient, global sequence-based technique for quantitative gene expression profiling that allows multiple transcripts to be identified simultaneously [11]. The workflow of SAGE is detailed in Figure 2. The technique relies on sequencing and counting oligonucleotide tags of 15 bp or longer and matching them against available EST databases or genome sequences to identify the corresponding expressed genes [20]. The tag counts and tag annotations are then combined into a gene expression profile. By comparing the gene expression profiles of differently treated samples, the transcriptional response to a particular treatment can be determined.
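
The counting and comparison steps described above can be sketched as follows. This is a minimal illustration only: the function names, the pseudocount, and the toy tag sequences are assumptions made for the example and do not come from any of the cited studies. Tags are normalised to tags per million so that libraries of different sizes can be compared, and a log2 ratio summarises the change between a treated and a control library.

```python
import math
from collections import Counter

def tags_per_million(tags):
    """Convert a list of observed tag sequences into tags-per-million values."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: 1e6 * n / total for tag, n in counts.items()}

def log2_fold_change(treated, control, pseudocount=1.0):
    """Compare two normalised profiles; the pseudocount avoids division by zero."""
    tags = set(treated) | set(control)
    return {
        tag: math.log2((treated.get(tag, 0.0) + pseudocount)
                       / (control.get(tag, 0.0) + pseudocount))
        for tag in tags
    }

# Toy tag lists (hypothetical sequences, not data from any cited study).
control = tags_per_million(["CATGAAACCC"] * 2 + ["CATGTTTGGG"] * 2)
treated = tags_per_million(["CATGAAACCC"] * 3 + ["CATGTTTGGG"] * 1)
changes = log2_fold_change(treated, control)
```

In a real analysis each tag would additionally be annotated by matching it to EST or genome sequences, and a statistical test would be needed before calling a change significant.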

SAGE has three major drawbacks. First, it requires large quantities of input mRNA (usually 2.5 to 3 μg). Several variants were later developed to overcome this limitation. The MicroSAGE technique utilises 500- to 5,000-fold less RNA (1-5 ng poly(A) mRNA) as starting material [21]. The procedure is completed in a single tube, which reduces the loss of material during subsequent steps. MicroSAGE employs a two-step ditag amplification approach, in which ditags are amplified for 28 cycles, gel-purified, and re-amplified for another 18 cycles. Peters and colleagues introduced a similar protocol called SAGE-Lite [22]. SAGE-Lite relies on the inherent poly(C) terminal transferase activity of reverse transcriptase and its ability to switch templates during DNA polymerisation. In this approach, first-strand cDNA synthesis is primed by an oligo(dT) primer in the presence of a second, 'template switching' oligonucleotide (TS oligo), which is included in the first-strand synthesis reaction. The TS oligo contains a short poly(G) sequence that hybridises to the poly(C) sequence. The reverse transcriptase then switches templates and incorporates the complementary TS sequence. This SAGE variant allows the global analysis of transcription from less than 100 ng of total RNA. In another approach, the SADE technique, mRNAs are isolated directly from the tissue lysate by binding to oligo(dT) covalently bound to magnetic beads [23]. Synthesis and cleavage of the cDNAs are then performed on the beads. SADE enabled a 1,000-fold reduction in the amount of starting material.

The second major drawback is that, although SAGE tags can be directly counted and quantified, it is not trivial to identify the gene corresponding to a 15-bp tag when no genome sequence is available. Moreover, a 15-bp tag is sometimes insufficient to unambiguously identify the gene of origin in more complex genomes [24]. To overcome this limitation, several technical modifications have been introduced. In the LongSAGE protocol, 21-bp tags are generated by replacing the tagging enzyme BsmFI with another Type II enzyme, MmeI [25]. These longer tags are more efficient in identifying novel genes in complex genomes than conventional SAGE tags. In another approach, called SuperSAGE, the Type III restriction endonuclease EcoP15I is employed as the tagging enzyme; it has the longest distance so far reported between the recognition and cleavage sites. The use of EcoP15I yields 26-bp tags from transcripts [26].

Figure 2. Schematic representation of the SAGE methodology for investigating gene expression profiling. (a) Total RNA is extracted from target cells or tissues, and mRNA is enriched from the other, more abundant molecules by poly(A) selection. (b) The mRNA is converted into double-stranded cDNA using a biotinylated oligo(dT) primer. The cDNA is digested with the anchoring enzyme (AE), an enzyme that recognizes the sequence CATG and therefore cuts on average every 256 bp, and is then recovered by binding to streptavidin-coated beads. The reaction mixture is divided into two aliquots, and two independent linkers (A and B) are ligated, one to each portion. (c) Linkers A and B are designed to contain a tagging enzyme (TE) site. The TE is a type IIS restriction enzyme, which cleaves at a defined distance from the site it recognizes. The TE recognizes its site in the linker and cuts off a short "tag" of cDNA. After end repair, the two groups of cDNAs are ligated to each other to create a "ditag" with linkers on either end. PCR is used to amplify the ditags with primers complementary to the linkers. (d) The cDNA is again digested with the AE, which removes the linkers at the point where they were originally added. This leaves a "sticky" end with the sequence GTAC (or CATG on the other strand) at each end of the ditag. Ditags are ligated together to form long concatemers. An AE site lies between successive ditags, marking where one ends and the next begins. The concatemers are cloned into a vector and sequenced, and the tags are matched to the genes that they uniquely represent. Counting the number of times each tag appears gives the relative expression levels.
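
To make the concatemer structure in Figure 2 concrete, the following sketch recovers individual tags from a sequenced concatemer. It is a simplified illustration under stated assumptions: the anchoring-enzyme site is taken to be CATG, every tag has a fixed length (10 bases here), the second tag of each ditag is read from the opposite strand, and a ditag never contains an internal CATG. Real ditags vary in length and require more careful parsing. All names and sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def tags_from_concatemer(concatemer, tag_len=10, site="CATG"):
    """Split a concatemer at anchoring-enzyme sites and extract two tags per ditag."""
    tags = []
    for ditag in concatemer.split(site):
        if len(ditag) < 2 * tag_len:
            continue  # skip empty pieces and ditags too short to hold two tags
        tags.append(ditag[:tag_len])                        # tag on the forward strand
        tags.append(reverse_complement(ditag[-tag_len:]))   # tag on the reverse strand
    return tags

# Build a toy concatemer from four hypothetical tags, two per ditag.
tag1, tag2, tag3, tag4 = "AAAAAAAAAA", "CCCCCCCCCC", "ACACACACAC", "TTTTTGGGGG"
ditag_a = tag1 + reverse_complement(tag2)
ditag_b = tag3 + reverse_complement(tag4)
concatemer = "CATG" + ditag_a + "CATG" + ditag_b + "CATG"
recovered = tags_from_concatemer(concatemer)
```

The recovered tags can then be counted to give the relative expression levels described in the caption.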

The third major drawback is the cost of Sanger sequencing, which prohibits the analysis of enough tags to enable quantitative comparisons across multiple cDNA populations. This limitation has now been overcome by a protocol that combines SuperSAGE with next-generation sequencing, coined 'High-Throughput (HT-) SuperSAGE'. In this protocol, index (barcode) sequences are employed to discriminate tags from different samples. Such barcodes allow researchers to analyse digital tags from the transcriptomes of many samples in a single sequencing run by simply pooling the libraries. HT-SuperSAGE provides highly sensitive, reproducible and accurate digital gene expression data [27].
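
The idea of pooling barcoded libraries can be illustrated with a small demultiplexing sketch. It assumes, purely for the example, that each read starts with its sample barcode, that all barcodes have the same length, and that only exact barcode matches are accepted; the barcode sequences, sample names and reads are hypothetical and are not taken from the HT-SuperSAGE protocol itself.

```python
from collections import Counter, defaultdict

def demultiplex(reads, barcodes):
    """Assign reads to samples by their leading barcode and count the tags per sample.

    barcodes maps barcode sequence -> sample name; all barcodes must share one length.
    """
    lengths = {len(barcode) for barcode in barcodes}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes barcodes of equal length")
    n = lengths.pop()
    per_sample = defaultdict(Counter)
    for read in reads:
        sample = barcodes.get(read[:n])
        if sample is None:
            continue  # unknown barcode: discard the read
        per_sample[sample][read[n:]] += 1  # the remainder of the read is the tag
    return per_sample

# Hypothetical barcodes and pooled reads.
barcodes = {"ACGT": "infected", "TGCA": "control"}
reads = ["ACGTGGGGAAAA", "ACGTGGGGAAAA", "TGCAGGGGAAAA", "NNNNGGGGAAAA"]
counts = demultiplex(reads, barcodes)
```

Each per-sample counter is then a digital expression profile for one library in the pooled run.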

SAGE was first used for quantitative gene expression profiling in phytopathogenic fungi by Thomas et al. [28]. In that study, mRNA contents were measured in un-germinated conidia, during conidial germination, and during appressorium formation of B. graminis on barley leaves. A total of 60,000 tags were isolated and sequenced, representing 6,336 unique transcripts. Comparison of the SAGE tags with the available EST sequences allowed the identification of 1,274 tags. The majority of tags displayed a significant change in expression level across the three stages. The most abundant transcripts encoded a putative metallothionein protein, which is associated with oxidative and heavy metal stress [29]. Other genes with significant transcript levels were involved in protein synthesis, modification and degradation, which may indicate that early germling development involves rapid protein synthesis and degradation [30].

SAGE has been employed by Irie et al. [31] to compare the gene expression profiles of cAMP-treated and untreated conidia of M. oryzae. A total of 5,087 tags (2,889 unitags) and 3,983 tags (2,342 unitags) were generated from cAMP-treated and untreated conidia, respectively. Comparative analysis showed that cAMP treatment resulted in up- and down-regulation of genes corresponding to 57 and 53 unique tags, respectively. Tag annotation against various genomic databases annotated 50 tags. A group of genes involved in pathogenicity, e.g., MPG1, MAS1, and MAC1, were induced on exposure of conidia to cAMP, indicating that cAMP is required for appressorium formation and fungal pathogenicity. SuperSAGE has been applied to investigate the M. oryzae-rice interaction by Matsumura et al. [26]. The Type III restriction endonuclease EcoP15I was utilised as the tagging enzyme to produce 26-bp tags. In that study, a total of 12,119 'SuperSAGE tags', which matched 7,546 different genes, were obtained from M. grisea-infected rice leaves. The complete GenBank database, comprising all available cDNA, EST, and genome sequences, was searched by BLAST with the 26-bp tag sequences as queries. M. oryzae-derived tags accounted for 0.6% (74/12,119) of the total tags, 35 of which encoded putative genes. The most abundant transcripts (38 tags) were from a hydrophobin gene, revealing that the hydrophobin gene is the most actively transcribed M. grisea gene in blast-infected rice leaves.

SAGE has also been used to study the role of protein kinase A (PKA) in the conserved cAMP signaling cascades that regulate fungal development and virulence [32, 33]. Gene expression profiles of a wild-type strain and of the Δubc1 (a mutant defective in the regulatory subunit of PKA, encoded by UBC1) and Δadr1 (a mutant defective in the catalytic subunit of PKA) mutants of U. maydis were compared. Approximately 50,000 tags were sequenced for each strain, providing a reasonable depth for quantification. The data showed that these mutants exhibited changes in the transcript levels of genes encoding ribosomal proteins, genes regulated by the b mating-type proteins, and genes for metabolic functions. The most interesting result was the observation of elevated transcript levels for genes involved in phosphate acquisition and storage in the ubc1 mutant, indicating a connection between cAMP and phosphate metabolism.

2.3 Massively Parallel Signature Sequencing (MPSS) and the Applications

MPSS, developed by Brenner and colleagues, is used to analyse gene expression profiles by counting the number of individual mRNA molecules produced by each gene [12]. Typically, MPSS enables the analysis of at least one million molecules in a given sample. All genes are analysed simultaneously, and bioinformatic tools are applied to determine the number of mRNAs from each gene relative to the total number of molecules in the sample. Counting mRNAs with MPSS is performed by generating a 17-bp 'unique signature' for each mRNA. To measure the level of expression of any given gene, the total number of its signatures is counted [34]. MPSS signatures for the mRNAs in a sample are generated by sequencing dscDNA fragments cloned onto 5-μm diameter microbeads by the Lynx Megaclone technology (Figure 3a). The sequencing reaction involves an automated series of adaptor ligations and enzymatic steps, as described in Figure 3b.

Compared with extensive sequencing of ESTs and SAGE, MPSS has two major advantages. First, it provides in-depth quantification of virtually all of the genes that are expressed in a sample at lower cost and in less time. Second, it is more sensitive and can accurately quantify genes that are expressed at low levels within a cell.

Typically, an MPSS experiment with one million microbeads yields 250,000 to 400,000 high-quality 17-base signature sequences, which is deep enough to allow the quantitation and analysis of all genes within a sample, especially those that are biologically important but expressed at very low levels in the cell. Because prior knowledge of genomic information is not required, this technique is able to generate quantitative gene expression data sets from any organism [12]. MPSS combined with the Robust-LongSAGE (RL-SAGE) technique was first used for the comparative analysis of the mycelium and appressorium transcriptomes of Magnaporthe grisea [35]. The MPSS analyses yielded 12,531 and 12,927 distinct significant tags from mycelia and appressoria, respectively, while RL-SAGE analysis yielded 16,580 distinct significant tags from the mycelial library. When these tags were matched to the annotated CDS and to the 500 bp upstream and 500 bp downstream of each CDS, a total of 7,135 mycelium-specific and 7,531 appressorium-specific significant MPSS tags were isolated, representing 2,088 and 1,784 annotated genes, respectively. The data highlighted that the differentially expressed transcripts, especially those specifically expressed in appressoria, represent a genomic resource useful for gaining a better understanding of the molecular basis of M. grisea pathogenicity. Interaction between host plant and fungus involves the recognition of cellular components and the exchange of complex molecular signals from both partners [36].

Figure 3. Summary of the MPSS technique for investigating gene expression profiling. (a) Summary of Megaclone technology. Poly(A) mRNAs from tissues or cells are converted into double-stranded cDNAs using a biotin-labeled oligo(dT) primer, harvested, and subsequently digested with a restriction enzyme (commonly DpnII). Each signature sequence is cloned into a specially designed plasmid vector containing a 32-base oligonucleotide tag. In total, 16.8 × 10^6 different 32-bp sequences are available in the tag library, and each separate cDNA clone contains a different sequence. The library of cDNA inserts, along with their adjacent 32-bp oligonucleotide tags, is amplified using the primer pair PCR-F and PCR-R, and the resulting products are partially treated with an exonuclease to expose the 32-bp address tags. The exposed tag on each cDNA is able to hybridize to a complementary 32-bp tag that is covalently linked to a 5-μm microbead. Correspondingly, there are 16.8 × 10^6 complementary 32-bp tags available for hybridization. Once the cDNAs are hybridized to the beads, the nicks are filled in. The end product is the 'special library' on the microbeads. Tagged PCR products produced from cDNA are amplified so that each corresponding mRNA molecule gives ~100,000 PCR products with a unique tag. Tags are used to attach the PCR products to microbeads. The microbeads are then arrayed in a flow cell for sequencing and quantification. (b) Summary of the MPSS sequencing reaction. The sequencing process employs encoded adaptors to identify bases in each ligation-cleavage cycle. Template sequences are determined by detecting successful adaptor ligation. After the microbeads are loaded with fluorescently labeled (F) cDNAs, they are isolated by a specially designed fluorescence-activated cell sorter, and the cDNAs are subsequently cleaved by DpnII to expose a four-base overhang, which is then converted to a three-base overhang by a fill-in reaction. Fluorescently labeled (F) adaptors with recognition sites for BbvI (a type IIS restriction enzyme that cuts DNA at a position 9 to 13 nucleotides away from its recognition sequence) are ligated to the cDNAs. The microbeads are then loaded into a specially designed flow cell in a way that allows them to pack tightly along the channels. The cDNAs are digested with BbvI, and encoded adaptors are hybridized and ligated. Sixteen different phycoerythrin (PE)-labeled decoder probes are separately hybridized to the decoder binding sites of the encoded adaptors. After each hybridization, an image of the microbead array is taken by a high-resolution CCD camera positioned directly over the flow cell. The images can be converted into base sequence for later analysis. The encoded adaptors are then treated with BbvI, which cleaves inside the cDNA to expose four new bases for the next cycle of ligation and cleavage. The process is repeated several times until a 17-20 base signature sequence has been generated for each bead in the flow cell. The third step is signature sequence analysis. Each signature sequence in an MPSS data set is compared with all other signatures, and all identical signatures are counted. The expression level of a gene is obtained by dividing the number of its signatures by the total number of signatures for all mRNAs present in the dataset.

To study how this interaction occursin host cells, Robust-LongSAGE, MPSSand sequencing by synthesis techniqueshave been used to analyse transcriptomeprofiles of Magnaporthe oryzae infected riceleaves. One RL-SAGE, eleven MPSS, andseven SBS libraries were constructed bydeep sequencing. A total of 18,154 significantsignatures were obtained from RL-SAGElibrary generated from rice leaves inoculatedwith the compatible isolate. Among them,3,105 (17.1 %) and 12,263 (67.5 %) significantsignatures matched with the M. oryzae andrice genomes, respectively. Although 3,091signatures specifically matching to the M.

oryzae genome, which correspond to 3,000previously annotated M. oryzae genes, only14 signatures matched to both the M. oryzaeand rice genomes, signifying that themajority of signatures were genomespecific. A total of 57,671 significantsignatures were obtained from fiveincompatible interaction MPSS libraries,of which, 724 (1.2%) and 38,024 (65.9%)significant signatures uniquely matched tothe M. oryzae and rice genomes, respectively.From the six compatible interactionlibraries, a total of 63,132 significantsignatures were isolated, of which, 2,545(4%) and 41,784 (66.1%) significantsignatures uniquely matched to the M.oryzae and rice genomes, respectively.Altogether, 3,216 annotated M. oryzaegenes were identified from both thecompatible and incompatible MPSSlibraries. The same leaf tissues used for thegeneration of the MPSS libraries were alsoused for the construction of the seven SBSlibraries. A total of 65,299 significantsignatures were obtained from the three


incompatible-interaction SBS libraries, among which 3,492 (5.3%) and 49,706 (76.1%) significant signatures specifically matched the M. oryzae genome and rice genome, respectively. A total of 68,825 significant signatures were obtained from the four compatible-interaction SBS libraries. As many as 5,283 (7.7%) signatures matched the M. oryzae genome and 50,756 (73.7%) matched the rice genome. The SBS signatures from both compatible and incompatible interactions together identified 4,781 annotated M. oryzae genes. Altogether, a total of 6,413 annotated M. oryzae genes expressed during the infection process were identified using the three expression profiling technologies, including 851 genes encoding predicted effector proteins. Protoplast transient expression system analysis showed that 42 of the predicted effector proteins have the ability to induce plant cell death, and ectopic expression assays identified five novel effectors that induced host cell death only when they contained the signal peptide for secretion to the extracellular space. This study highlighted the fact that the integrative genomic approach is effective for the identification of in planta-expressed cell death-inducing effectors from M. oryzae that may play an important role in facilitating colonisation and fungal growth during infection. Although MPSS has not been widely applied to study plant-pathogen interactions, the technique is well suited to monitor transcriptional regulation in fungi and their host. In the case of fungal infection, it may be possible to monitor in parallel the transcriptional events that take place in the pathogen.

Because of the depth of sequencing, fungal transcripts can be detected in mixed tissues. Fungal and plant genome sequences are increasingly available, and MPSS signatures must be matched to the corresponding sequenced genome or cDNA sequences. Therefore, MPSS is a practicable technique to study fungal-plant interactions by parallel measurements of gene expression profiles in both host and pathogen [37].
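The genome-matching step discussed in this section (assigning each signature from an infected-tissue library to the pathogen, the host, or both) can be illustrated with a minimal sketch. It assumes exact matching against pre-computed sets of genomic signatures; the function name `classify_signatures` and all sequences are hypothetical and stand in for real genome indexes.

```python
def classify_signatures(signatures, fungal_index, host_index):
    """Sort signatures into fungus-specific, host-specific, shared and unmatched bins.

    fungal_index and host_index are sets of signatures found in each genome
    (exact matches only, which is a simplifying assumption of this sketch).
    """
    bins = {"fungus": [], "host": [], "both": [], "unmatched": []}
    for sig in signatures:
        in_fungus = sig in fungal_index
        in_host = sig in host_index
        if in_fungus and in_host:
            bins["both"].append(sig)
        elif in_fungus:
            bins["fungus"].append(sig)
        elif in_host:
            bins["host"].append(sig)
        else:
            bins["unmatched"].append(sig)
    return bins


# Hypothetical genome-derived signature sets and a small mixed library.
fungal_index = {"GATCAAGTCCGATTGCA", "GATCGGGTTTAAACCCG"}
host_index = {"GATCTTGACCGTAAGCT", "GATCGGGTTTAAACCCG"}
library = [
    "GATCAAGTCCGATTGCA",  # fungus only
    "GATCTTGACCGTAAGCT",  # host only
    "GATCGGGTTTAAACCCG",  # present in both genomes
    "GATCCCCCCCCCCCCCC",  # no match in either genome
]

bins = classify_signatures(library, fungal_index, host_index)
for name, members in bins.items():
    print(name, len(members))
```

Signatures in the "both" bin are the ambiguous ones; in the study summarised above they were a small minority, which is why most signatures could be treated as genome specific.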

2.4 RNA-seq and the Applications

The recent development of Next Generation Sequencing (NGS) technologies by companies such as Roche (454 GS FLX), Illumina (Genome Analyzer II), and ABI (AB SOLiD) has completely revolutionised the field of molecular biology [38-40]. The theme of these NGS technologies is the high degree of parallelisation, in which millions to billions of sequencing reactions take place at the same time in small reaction volumes, thereby realising ultra-high-throughput sequencing [39,41,42]. The development of NGS technologies has given rise to a novel technique for both mapping and quantifying transcriptomes. This technique, dubbed RNA-Seq, has clear advantages over all existing approaches and is expected to revolutionise the manner in which eukaryotic transcriptomes are revealed [13].

To achieve its two major functions, transcript discovery and quantification of gene expression, RNA-seq relies on the generation of short reads of transcript sequence information, which are then assembled into full-length transcripts (contigs) and mapped to the genome [13]. The basic workflow of the approach is shown in Figure 4. The quality of RNA-seq results depends on the quality of the starting material, method-specific library preparation and data processing [43]. The major principles and technical considerations that contribute to quality are presented as follows.


Figure 4. Schematic diagram of the RNA-seq technique for gene expression profiling. (a) Sample isolation. Total RNA is extracted from target cells or tissues, and mRNA is enriched from other abundant molecules by poly(A) selection. (b) Library preparation. A double-stranded cDNA library is usually prepared from either (left) fragmented RNA or (right) fragmented double-stranded (ds) cDNA. mRNA or fragmented mRNA is converted into cDNA: random primers or oligo(dT) primers are used by reverse transcriptase to synthesize the first strand of cDNA. The 3’ to 5’ exonuclease activity of Klenow polymerase and T4 DNA polymerase is used to blunt the ends of all fragments, followed by ligation of adapters suitable for the chosen sequencing platform. The adapter-ligated cDNA fragments are then amplified for several cycles to form a library. (c) Sequencing. The available NGS platforms use different methodological procedures. The prepared library with platform-specific adaptors is sequenced in a high-throughput manner to obtain short sequence reads. (d) Bioinformatics analysis of gene expression profiling. Mapping programs align reads to the reference genome, and gene expression can be quantified as absolute read counts or normalized values such as RPKM.


Sample isolation. Information gathered from a sample’s transcriptome shares many limitations with other RNA expression analysis pipelines. As the information is tissue specific and time dependent, the quality of RNA-seq results depends on the quality of the starting material. The technique should begin with enriched, homogeneous samples and high-quality recovered RNA. The quality of total RNA isolated from target cells or tissues should be measured using the Agilent Bioanalyzer RNA integrity number (RIN) [44] to assess the extent of RNA degradation. As most eukaryotic mature mRNAs transcribed by RNA polymerase II have a poly(A) tail, mRNA can be enriched by hybridisation capture using oligo(dT) beads [45,46]. Other mRNA isolation strategies are available on the website http://www.protocol-online.org/.

Library preparation. How closely the RNA sequencing results reflect the original RNA transcripts is mainly determined in the library preparation step [47]. To generate an RNA-seq library, fragmentation of either RNA or cDNA is necessary to allow processing by next-generation sequencing. The enriched mRNA should be primed for the reverse transcription reaction using either random primers or oligo(dT) primers. The advantage of using oligo(dT) is that the majority of cDNA produced should be derived from polyadenylated mRNA, hence more of the sequences obtained should be informative. However, the disadvantage of oligo(dT) primers is that reverse transcriptase falls off the template at a characteristic rate, resulting in a bias towards the 3’-end of transcripts. For long mRNAs, this bias results in an under-representation of the extreme 5’-end of the transcript in the data. Generally, the use of random primers is therefore the preferred method to avoid this problem and to allow a better representation of the 5’-end of long ORFs [46,48]. Early library preparation methods for RNA-Seq, whether using fragmented double-stranded (ds) cDNA or fragmented RNA, involved some amplification steps. In some cases, many identical short reads are obtained from cDNA libraries that have been amplified. These could be a genuine reflection of abundant mRNA species, or they could be PCR artifacts. To distinguish between these possibilities, researchers can determine whether the same sequences are observed in different biological replicates. Another consideration concerning library construction is whether to maintain strand-specific information for the RNA, as strand information is particularly helpful for distinguishing overlapping transcripts on opposite strands, especially for de novo transcript discovery [49]. Several approaches can be used for strand-specific library construction, such as pre-treating the RNA with sodium bisulphite to convert cytidine into uridine [50]. Other protocols, such as direct ligation of RNA adaptors to the RNA sample before or during reverse transcription [51-53], or incorporation of dUTP during second strand synthesis followed by digestion with uracil-N-glycosylase, have been used to maintain strand-specific information for the original RNA [49]. Recent developments, including direct RNA sequencing and the use of sequencing methods that generate longer reads, have helped to overcome some of the above problems [54,55]. In particular, direct RNA sequencing employs single-molecule sequencing platforms such as the Helicos single-molecule sequencing platform, avoids cDNA conversion and amplification steps, and requires only femtomole quantities of RNA [56,57]. The reduced amount of input


RNA required could make direct RNA sequencing useful for identifying transcriptome characteristics of small samples or rare cell types. However, the current direct RNA sequencing method relies on single-molecule sequencing capabilities, as the amplification of RNA molecules directly without cDNA conversion has not been examined. Although RNA-dependent RNA polymerases do exist [58], the extent to which they can be adapted to the amplification-based next-generation sequencing technologies is still unknown [55].
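The replicate check mentioned earlier in this subsection, in which identical reads are compared across biological replicates to separate abundant transcripts from possible PCR artifacts, might be sketched as follows. The function name `split_duplicates`, the `min_copies` threshold and the toy reads are assumptions of this sketch rather than part of any cited protocol.

```python
from collections import Counter


def split_duplicates(replicates, min_copies=2):
    """Separate duplicated reads seen in every replicate from those seen in only some.

    replicates: list of read lists, one per biological replicate.
    Returns (consistent, suspect): sequences duplicated in all replicates, and
    sequences duplicated in at least one but not all replicates.
    """
    duplicated = [
        {seq for seq, n in Counter(reads).items() if n >= min_copies}
        for reads in replicates
    ]
    if not duplicated:
        return set(), set()
    consistent = set.intersection(*duplicated)
    suspect = set.union(*duplicated) - consistent
    return consistent, suspect


# Two hypothetical replicates with very short toy reads.
rep1 = ["AAAC", "AAAC", "GGGT", "GGGT", "TTTA"]
rep2 = ["AAAC", "AAAC", "AAAC", "GGGT", "CCCA"]

consistent, suspect = split_duplicates([rep1, rep2])
print("duplicated in all replicates:", sorted(consistent))
print("duplicated in only some replicates:", sorted(suspect))
```

Reads in the first group are more likely to reflect genuinely abundant mRNA species, whereas reads in the second group deserve closer inspection as potential amplification artifacts.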

Quantitation. RNA-seq data analysis is more computationally challenging than that for other sequencing-based methods. The simplest approach to derive a quantitative expression measure across the whole genome is to count the total number of reads mapping to the coordinates of each annotated element. Gene expression can be quantified as absolute read counts or as normalised values, such as RPKM (reads per kilobase per million mapped reads), which allow comparison of expression levels within the same sample and across biological conditions [46,59,60]. Differences in expression can be analysed by several statistical methods. A popular RNA-seq analysis pipeline is the combination of TopHat and Cufflinks [61]. Reads are mapped with TopHat and assembled into transcripts with FPKM (fragments per kilobase of exon per million fragments mapped) values using Cufflinks. In addition, there are several software packages designed specifically to analyse differential gene expression, including edgeR [62], DESeq [63], and DEGseq [64]. In some cases, reads match multiple independent genes (multi-reads), which makes it difficult to quantify transcript levels. In particular, reads from the conserved domains of certain protein families and from repetitive sequences can map to multiple locations in the reference sequence, preventing accurate quantification of transcript levels from these short reads [46]. To overcome this drawback, extending the read length or using paired-end reads, which provide short sequences from both ends of cDNA fragments, can help to map these sequences to single locations in the genome [43,55]. While most analyses of RNA-Seq data depend on alignment to reference genome sequences, novel software packages such as Rnnotator and Trinity assemble RNA-Seq data into transcriptomes without consulting a reference sequence, by assembling short reads from RNA sequencing datasets into contigs [65,66]. These methods enable identification of novel transcripts and unbiased detection of transcripts from multiple sources, allowing more efficient application of RNA-Seq for the discovery of transcripts and characterisation of transcriptomes in non-model organisms or mixed populations.
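The RPKM normalisation described above can be written out explicitly. The sketch below assumes that the total number of mapped reads is simply the sum of the per-gene counts supplied; the function name `rpkm_table`, the gene names and all numbers are illustrative only.

```python
def rpkm_table(counts, lengths_bp):
    """Compute RPKM for each gene.

    RPKM = reads mapped to the gene * 1e9 / (gene length in bp * total mapped reads)

    counts: {gene: mapped read count}; lengths_bp: {gene: transcript length in bp}.
    The total mapped reads is taken as the sum of counts (an assumption of this sketch).
    """
    total_mapped = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_mapped)
        for gene in counts
    }


# Hypothetical genes: geneB is three times longer and collects three times the reads.
counts = {"geneA": 500, "geneB": 1500}
lengths_bp = {"geneA": 1000, "geneB": 3000}

values = rpkm_table(counts, lengths_bp)
for gene, value in values.items():
    print(gene, value)
```

In this toy example both genes receive the same RPKM although their raw counts differ threefold, which shows why length normalisation is needed before expression levels of different genes are compared within a sample.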

Since the introduction of the RNA-Seq technique in 2008, it has become a dominant tool in functional genomics for exploring gene expression profiles in an organism. The RNA-seq technique was first applied to the analysis of yeast, mouse, and Arabidopsis transcriptomes [46,48,53,67]. Recently, an increasing number of studies have focused on gene expression profiling of plant pathogenic fungi by RNA-Seq. Indeed, the first demonstration of the applicability of RNA-Seq to monitoring gene expression in plant-microbe interactions originated from a dual transcriptomic (pathogenic fungus and host plant) analysis of Magnaporthe oryzae and its host rice [68]. A comprehensive understanding of host-pathogen interactions requires an understanding of gene expression profiles of both interacting organisms simultaneously


in the same infected plant tissue. Kawahara et al. [8] used RNA-Seq to analyse the mixed transcriptome of rice and blast fungus in infected leaves at 24 hours post-inoculation, the point at which the primary infection hyphae penetrate leaf epidermal cells. Up-regulation of 240 fungal transcripts encoding putative secreted proteins was found, suggesting that these candidate fungal effector genes may play an important role in initial infection processes. In addition, up-regulation of transcripts encoding glycosyl hydrolases, cutinases and LysM domain-containing proteins was also found in the blast fungus, whereas pathogenesis-related and phytoalexin biosynthetic genes were up-regulated in rice. More drastic changes in expression were also observed in incompatible interactions than in compatible ones, in both rice and blast fungus, at this stage. These data showed that dual transcriptome analysis is useful for screening candidate genes responsible for fungal pathogenicity or plant resistance to fungal disease.
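As a rough illustration of how up-regulated transcripts might be screened in a mixed host-pathogen data set, the sketch below computes a log2 fold change per gene and then separates the hits by organism. The gene identifiers, expression values, pseudocount and fold-change threshold are all hypothetical, and this is not the statistical procedure used by Kawahara et al. [8].

```python
import math


def upregulated(control, infected, min_log2_fc=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes at or above min_log2_fc.

    control and infected map gene IDs to normalised expression values.
    A pseudocount avoids division by zero (an assumption of this sketch).
    """
    hits = {}
    for gene, base in control.items():
        log2_fc = math.log2((infected[gene] + pseudocount) / (base + pseudocount))
        if log2_fc >= min_log2_fc:
            hits[gene] = log2_fc
    return hits


# Hypothetical expression values; the prefix marks the organism of origin.
control = {"fungus_g1": 3.0, "fungus_g2": 10.0, "rice_g1": 7.0}
infected = {"fungus_g1": 15.0, "fungus_g2": 9.0, "rice_g1": 31.0}

hits = upregulated(control, infected)
fungal_hits = {g: fc for g, fc in hits.items() if g.startswith("fungus_")}
rice_hits = {g: fc for g, fc in hits.items() if g.startswith("rice_")}
print("fungal:", fungal_hits)
print("rice:", rice_hits)
```

In practice, dedicated packages that model count variability across replicates would be preferred over a fixed fold-change cutoff; the sketch only conveys the idea of screening both organisms from one data set.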

Approaches for gene disruption have been established in at least some pathogens, which facilitates the functional analysis of candidate genes. Their role in pathogenesis can then be confirmed by subsequent virulence assays [8]. Within only two years (2012-2013), RNA-Seq studies have examined global gene expression in over 10 species of phytopathogenic fungi, encompassing a wide variety of research areas. The majority of the studies have addressed aspects of fungal pathogenicity or phytopathogenic fungal-host interaction. Applications of RNA-seq technology in fungal phytopathology are summarised in Table 1.

Table 1. Application of RNA-seq technology to fungal phytopathology.

Disease | Pathogen | Application
Rice blast | Magnaporthe oryzae | Investigation of genome-wide transcriptional profiling of appressorium development in M. oryzae [71].
Rice blast | Magnaporthe oryzae | Monitoring of gene expression profiles of interacting organisms in a mixed transcriptome of rice and the rice blast fungus [8].
Rice blast | Magnaporthe oryzae | Comparison of transcriptional profiling of a conidiophore stalk-less1 (COS1) gene mutant and the wild-type strain [72].
Black spot | Alternaria brassicicola | Comparison of transcriptional profiling of a null mutant (Δamr1) of a transcription factor that regulates melanin biosynthesis with that of the wild-type strain [73].
Black spot | Alternaria brassicicola | Comparison of gene expression profiling of a mutant of the putative transcription factor gene AbVf19 with the wild-type strain [74].
Vascular wilt | Verticillium dahliae | Discovery that the tomato immune receptor Ve1 recognizes an effector of multiple fungal pathogens, by a combination of genome and RNA-seq analysis [75].
Vascular wilt | Verticillium dahliae | Investigation of cotton gene expression changes after pathogen inoculation [76].
Fusarium wilt | Fusarium oxysporum | Comparison of transcriptome profiles in roots of resistant and susceptible Cavendish banana challenged with Foc TR4 to better understand the defense response of resistant banana plants to the Fusarium wilt pathogen [77].
Fusarium wilt | Fusarium oxysporum | Combination of RNA-seq and DGE analysis to investigate the transcriptional changes induced by Foc in banana roots [78].
Fusarium wilt | Fusarium oxysporum | Analysis of the dynamic defense transcriptome responsive to F. oxysporum infection in Arabidopsis [9].
Smut | Sporisorium reilianum f. sp. zeae | Identification of transcriptional changes in the roots of Huangzao4 (susceptible) and Mo17 (resistant) after root inoculation with S. reilianum [79].
Rust | Melampsora larici-populina | Characterization of early-expressed pathogen effectors and early-modulated host functions by 454-pyrosequencing transcriptome analysis of early stages of poplar leaf colonization by M. larici-populina [80].
Leaf spot | Marssonina brunnea | Comparison of the transcriptomes of fungus-infected and healthy plant tissues to understand the molecular mechanism of fungal plant diseases [81].
Dollar spot | Sclerotinia homoeocarpa | Investigation of gene expression profiles of the S. homoeocarpa-Creeping Bentgrass pathosystem [82].
Postharvest disease | Penicillium digitatum | Transcriptome analysis of the response of Metschnikowia fructicola, a biological agent against postharvest diseases of several fruits, to citrus fruit and to P. digitatum [83].


3. CONCLUSIONS AND PERSPECTIVES

SGEPTs are indispensable tools for research in fungal phytopathology. Improvements in sequencing technologies have made significant contributions to transcriptome analysis in fungal functional genomics. Since next-generation technologies entered the market, reports have begun to emerge on the use of these high-throughput techniques for transcript profiling in fungal phytopathology (Table 1). As fungi have comparatively small genomes, deep sequencing of SAGE tags or ESTs has become feasible for many fungi with next-generation sequencing technologies. The emergence of the RNA-seq technique has revolutionised the manner in which eukaryotic transcriptomes are analysed [13]. On a more practical level, it promises to replace other methods for transcriptome profiling and is playing an increasingly high-profile role in research in fungal phytopathology.

NGS is now moving forward at a phenomenal rate. We can expect that within the next ten years, genome projects of all phytopathogenic fungi will be completed and precisely annotated using transcriptome data, at which point the available genomic information will form a solid foundation for systems biology studies of phytopathogenic fungi. Most current RNA-seq methods rely on cDNA synthesis and a range of subsequent manipulation steps, which places limitations on the current approaches for some applications [43,69]. As we enter the age of ‘third generation sequencing’ (for example, single-molecule sequencing and/or direct sequencing of RNA), both read number and read length are expected to increase even further. In particular, nanopore sequencing is highly promising, as it is potentially compatible with direct RNA sequencing without cDNA intermediates [54]. The ultimate level of fungal gene expression profiling is arguably single-cell resolution. The coming “Third Generation Technologies” centred on single-molecule sequencing will fulfil the need to progress towards single-cell genome-wide transcriptome analysis. For example, the Helicos system, which can sequence millions of single molecules in parallel, is entering the market and seems to be suited to analysing RNA [70]. However, considerable technical improvement is needed before these methods can be applied to phytopathogenic fungal cells.

With the development of sequencing technology, particularly as its cost drops in terms of both money and the time required for sample preparation and analysis, RNA-Seq has the capability to become a useful tool for disease classification and diagnostic applications. The capability to detect and properly classify transcripts from multiple fungi will be necessary for such applications, as prior knowledge of transcripts that uniquely identify a certain phytopathogenic fungus is required. From the perspective of translational science, RNA-Seq could be used to identify global changes in fungal populations within plants or to identify novel pathogens. In addition, target-enrichment strategies have been used to capture the human exome from genomic DNA, given that a large fraction of disease-causing mutations are likely to be located in the protein-coding transcriptome. Likewise, mRNA-seq data are potentially suitable for the identification of nucleotide variations and could therefore reveal mutations in pathogenicity genes within the protein-coding transcriptome of phytopathogenic fungi. Technologies are bringing us closer to the ability to use RNA measurements for plant disease diagnostics. RNA-seq will without doubt


drive many more exciting applications within the next few years. In fact, progress will be limited only by our imagination, and exciting times are certainly ahead.

ACKNOWLEDGEMENTS

This work was supported by the 973 program (Grant No. 2012CB114002) from the Ministry of Science and Technology, China, and the Program for Changjiang Scholars and Innovative Research Team in University from the Ministry of Education, P.R. China to Youliang Peng.

REFERENCES

[1] Xu J.R., Peng Y.L., Dickman M.B. and Sharon A., The dawn of fungal pathogen genomics, Annu. Rev. Phytopathol., 2006; 44: 337-366.

[2] Wilson R.A. and Talbot N.J., Under pressure: investigating the biology of plant infection by Magnaporthe oryzae, Nat. Rev. Microbiol., 2009; 7: 185-195.

[3] Fitzpatrick D.A., Logue M.E., Stajich J.E. and Butler G., A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis, BMC Evol. Biol., 2006; 6: 99.

[4] Bhadauria V., Popescu L., Zhao W.S. and Peng Y.L., Fungal transcriptomics, Microbiol. Res., 2007; 162: 285-298.

[5] Soanes D.M. and Talbot N.J., Comparative genomic analysis of phytopathogenic fungi using expressed sequence tag (EST) collections, Mol. Plant Pathol., 2007; 7: 61-70.

[6] Yoder O. and Turgeon B.G., Fungal genomics and pathogenicity, Curr. Opin. Plant Biol., 2001; 4: 315-321.

[7] Hsiang T. and Goodwin P.H., Distinguishing plant and fungal sequences in ESTs from infected plant tissues, J. Microbiol. Meth., 2003; 54: 339-351.

[8] Kawahara Y., Oono Y., Kanamori H., Matsumoto T., Itoh T. and Minami E., Simultaneous RNA-Seq Analysis of a Mixed Transcriptome of Rice and Blast Fungus Interaction, PLoS One, 2012; 7: e49423.

[9] Zhu Q.H., Stephen S., Kazan K., Jin G., Fan L., Taylor J., Dennis E.S., Helliwell C.A. and Wang M.B., Characterization of the defense transcriptome responsive to Fusarium oxysporum infection in Arabidopsis using RNA-seq, Gene, 2012; 2: 259-266.

[10] Adams M.D., Kerlavage A.R., Fleischmann R.D., Fuldner R.A., Bult C.J., Lee N.H., Kirkness E.F., Weinstock K.G., Gocayne J.D. and White O., Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence, Nature, 1995; 377: 3-174.

[11] Velculescu V.E., Zhang L., Vogelstein B. and Kinzler K.W., Serial analysis of gene expression, Science, 1995; 270: 484-487.

[12] Brenner S., Johnson M., Bridgham J., Golda G., Lloyd D.H., Johnson D., Luo S., McCurdy S., Foy M. and Ewan M., Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays, Nat. Biotechnol., 2000; 18: 630-634.

[13] Wang Z., Gerstein M. and Snyder M., RNA-Seq: A revolutionary tool for transcriptomics, Nat. Rev. Genet., 2009; 10: 57-63.

[14] Bouck A. and Vision T., The molecular ecologist’s guide to expressed sequence tags, Mol. Ecol., 2007; 16: 907-924.


[15] Braun E.L., Halpern A.L., Nelson M.A. and Natvig D.O., Large-scale comparison of fungal sequence information: Mechanisms of innovation in Neurospora crassa and gene loss in Saccharomyces cerevisiae, Genome Res., 2000; 10: 416-430.

[16] Austin R., Provart N.J., Sacadura N.T., Nugent K.G., Babu M. and Saville B.J., A comparative genomic analysis of ESTs from Ustilago maydis, Funct. Integr. Genomic, 2004; 4: 207-218.

[17] Kleemann J., Takahara H., Stüber K. and O’Connell R., Identification of soluble secreted proteins from appressoria of Colletotrichum higginsianum by analysis of expressed sequence tags, Microbiol., 2008; 154: 1204-1217.

[18] Zhang Y., Qu Z., Zheng W., Liu B., Wang X., Xue X., Xu L., Huang L., Han Q. and Zhao J., Stage-specific gene expression during urediniospore germination in Puccinia striiformis f. sp. tritici, BMC Genomics, 2008; 9: 203.

[19] Jantasuriyarat C., Gowda M., Haller K., Hatfield J., Lu G., Stahlberg E., Zhou B., Li H., Kim H. and Yu Y., Large-scale identification of expressed sequence tags involved in rice and rice blast fungus interaction, Plant Physiol., 2005; 138: 105-115.

[20] Altschul S.F., Gish W., Miller W., Myers E.W. and Lipman D.J., Basic local alignment search tool, J. Mol. Biol., 1990; 215: 403-410.

[21] Datson N.A., van der Perk-de Jong J., van den Berg M.P., de Kloet E.R. and Vreugdenhil E., MicroSAGE: A modified procedure for serial analysis of gene expression in limited amounts of tissue, Nucleic Acids Res., 1999; 27: 1300-1307.

[22] Peters D.G., O’Hare E.H., Ferrell R.E., Kassam A.B., Yonas H. and Brufsky A.M., Comprehensive transcript analysis in small quantities of mRNA by SAGE-lite, Nucleic Acids Res., 1999; 27: e39-e44.

[23] Virlon B., Cheval L., Buhler J.M., Billon E., Doucet A. and Elalouf J.M., Serial microanalysis of renal transcriptomes, P. Natl. Acad. Sci., 1999; 96: 15286-15291.

[24] Matsumura H., Ito A., Saitoh H., Winter P., Kahl G., Reuter M., Krüger D.H. and Terauchi R., SuperSAGE, Cell Microbiol., 2005; 7: 11-18.

[25] Saha S., Sparks A.B., Rago C., Akmaev V., Wang C.J., Vogelstein B., Kinzler K.W. and Velculescu V.E., Using the transcriptome to annotate the genome, Nat. Biotechnol., 2002; 20: 508-512.

[26] Matsumura H., Reich S., Ito A., Saitoh H., Kamoun S., Winter P., Kahl G., Reuter M., Krüger D.H. and Terauchi R., Gene expression analysis of plant host–pathogen interactions by SuperSAGE, P. Natl. Acad. Sci., 2003; 100: 15718-15723.

[27] Matsumura H., Yoshida K., Luo S., Kimura E., Fujibe T., Albertyn Z., Barrero R.A., Krüger D.H., Kahl G. and Schroth G.P., High-throughput SuperSAGE for digital gene expression analysis of multiple samples using next generation sequencing, PLoS One, 2010; 5: e12010.

[28] Thomas S.W., Glaring M.A., Rasmussen S.W., Kinane J.T. and Oliver R.P., Transcript profiling in the barley mildew pathogen Blumeria graminis by serial analysis of gene expression (SAGE), Mol. Plant Microbe. In., 2002; 15: 847-856.


[29] Andrews G.K., Regulation of metallothionein gene expression by oxidative stress and metal ions, Biochem. Pharmacol., 2000; 59: 95-104.

[30] McCafferty H.R. and Talbot N.J., Identification of three ubiquitin genes of the rice blast fungus Magnaporthe grisea, one of which is highly expressed during initial stages of plant colonization, Curr. Genet., 1998; 33: 352-361.

[31] Irie T., Matsumura H., Terauchi R. and Saitoh H., Serial analysis of gene expression (SAGE) of Magnaporthe grisea: genes involved in appressorium formation, Mol. Genet. Genomics, 2003; 270: 181-189.

[32] Larraya L.M., Boyce K.J., So A., Steen B.R., Jones S., Marra M. and Kronstad J.W., Serial analysis of gene expression reveals conserved links between protein kinase A, ribosome biogenesis, and phosphate metabolism in Ustilago maydis, Eukaryot. Cell, 2005; 4: 2029-2043.

[33] D’Souza C.A. and Heitman J., Conserved cAMP signaling cascades regulate fungal development and virulence, FEMS Microbiol. Rev., 2001; 25: 349-364.

[34] Reinartz J., Bruyns E., Lin J.Z., Burcham T., Brenner S., Bowen B., Kramer M. and Woychik R., Massively parallel signature sequencing (MPSS) as a tool for in-depth quantitative gene expression profiling in all organisms, Briefings Funct. Genomics Proteomics, 2002; 1: 95-104.

[35] Gowda M., Venu R., Raghupathy M., Nobuta K., Li H., Wing R., Stahlberg E., Couglan S., Haudenschild C. and Dean R., Deep and comparative analysis of the mycelium and appressorium transcriptomes of Magnaporthe grisea using MPSS, RL-SAGE, and oligoarray methods, BMC Genomics, 2006; 7: 310.

[36] Chen S., Songkumarn P., Venu R., Gowda M., Bellizzi M., Hu J., Liu W., Ebbole D., Meyers B. and Mitchell T., Identification and Characterization of In planta-expressed secreted effector proteins from Magnaporthe oryzae that induce cell death in rice, Mol. Plant Microbe. In., 2013; 26: 191-202.

[37] Meyers B.C., Haudenschild C.D. and Vemaraju K., Use of massively parallel signature sequencing to study genes expressed during the plant defense response, 2007; In: Ronald R.C., (eds.) Plant-Pathogen Interactions: Methods and Protocols, Humana Press, Totowa, NJ; pp. 105-119.

[38] Mardis E.R., Next-generation DNA sequencing methods, Annu. Rev. Genomics Hum. Genet., 2008; 9: 387-402.

[39] Shendure J. and Ji H., Next-generation DNA sequencing, Nat. Biotechnol., 2008; 26: 1135-1145.

[40] Metzker M.L., Sequencing technologies - the next generation, Nat. Rev. Genet., 2009; 11: 31-46.

[41] Hall N., Advanced sequencing technologies and their wider impact in microbiology, J. Exp. Biol., 2007; 210: 1518-1525.

[42] MacLean D., Jones J.D. and Studholme D.J., Application of ‘next-generation’ sequencing technologies to microbial genetics, Nat. Rev. Microbiol., 2009; 7: 287-296.

[43] Costa V., Angelini C., DeFeis I. and Ciccodicola A., Uncovering the complexity of transcriptomes with RNA-Seq, J. Biomed. Biotechnol., 2010; doi:10.1155/2010/853916.


[44] Schroeder A., Mueller O., Stocker S.,Salowsky R., Leiber M., GassmannM., Lightfoot S., Menzel W.,Granzow M. and Ragg T., The RIN:an RNA integrity number forassigning integrity values to RNAmeasurements. BMC Mol. Biol., 2006;7: 3.

[45] Morin R.D., Bainbridge M., Fejes A.,Hirst M., Krzywinski M., Pugh T.J.,McDonald H., Varhol R., Jones S.J.and Marra M.A., Profiling the HeLaS3 transcriptome using randomlyprimed cDNA and massively parallelshort-read sequencing, Biotechniques,2008; 45: 81.

[46] Mortazavi A., Williams B.A., McCueK., Schaeffer L. and Wold B.,Mapping and quantifying mammaliantranscriptomes by RNA-Seq, Nat.Methods, 2008; 5: 621-628.

[47] Marguerat S. and B�hler J., RNA-seq:From technology to biology, Cell Mol.Life Sci.; 67: 569-579.

[48] Nagalakshmi U., Wang Z., Waern K.,Shou C., Raha D., Gerstein M. andSnyder M., The transcriptionallandscape of the yeast genome definedby RNA sequencing, Science, 2008;320: 1344-1349.

[49] Parkhomchuk D., Borodina T.,Amstislavskiy V., Banaru M., HallenL., Krobitsch S., Lehrach H. andSoldatov A., Transcriptome analysisby strand-specific sequencing ofcomplementary DNA, Nucleic AcidsRes., 2009; 37: e123-e123.

[50] He Y., Vogelstein B., Velculescu V.E.,Papadopoulos N. and Kinzler K.W.,The antisense transcriptomes of humancells, Science, 2008; 322: 1855-1857.

[51] Cloonan N., Forrest A.R., Kolle G.,Gardiner B.B., Faulkner G.J., BrownM.K., Taylor D.F., Steptoe A.L.,

Wani S. and Bethel G., Stem celltranscriptome profiling via massive-scale mRNA sequencing, Nat. Methods,2008; 5: 613-619.

[52] Core L.J., Waterfall J.J. and Lis J.T.,Nascent RNA sequencing revealswidespread pausing and divergentinitiation at human promoters,Science, 2008; 322: 1845-1848.

[53] Lister R., O’Malley R.C., Tonti-Filippini J., Gregory B.D., Berry C.C., Millar A.H. and Ecker J.R., Highly integrated single-base resolution maps of the epigenome in Arabidopsis, Cell, 2008; 133: 523-536.

[54] Ozsolak F., Platt A.R., Jones D.R., Reifenberger J.G., Sass L.E., McInerney P., Thompson J.F., Bowers J., Jarosz M. and Milos P.M., Direct RNA sequencing, Nature, 2009; 461: 814-818.

[55] Ozsolak F. and Milos P.M., RNA sequencing: Advances, challenges and opportunities, Nat. Rev. Genet., 2010; 12: 87-98.

[56] Gurumurthy S., Xie S.Z., Alagesan B., Kim J., Yusuf R.Z., Saez B., Tzatsos A., Ozsolak F., Milos P. and Ferrari F., The Lkb1 metabolic sensor maintains haematopoietic stem cell survival, Nature, 2010; 468: 659-663.

[57] Ozsolak F., Kapranov P., Foissac S., Kim S.W., Fishilevich E., Monaghan A.P., John B. and Milos P.M., Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation, Cell, 2010; 143: 1018-1029.

[58] Makeyev E.V. and Bamford D.H., Replicase activity of purified recombinant protein P2 of double-stranded RNA bacteriophage, EMBO J., 2000; 19: 124-133.

[59] Perkins T.T., Kingsley R.A., Fookes M.C., Gardner P.P., James K.D., Yu L., Assefa S.A., He M., Croucher N.J. and Pickard D.J., A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella Typhi, PLoS Genet., 2009; 5: e1000569.

[60] Li B., Ruotti V., Stewart R.M., Thomson J.A. and Dewey C.N., RNA-Seq gene expression estimation with read mapping uncertainty, Bioinformatics, 2010; 26: 493-500.

[61] Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L. and Pachter L., Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Nat. Protocols, 2012; 7: 562-578.

[62] Robinson M.D. and Oshlack A., A scaling normalization method for differential expression analysis of RNA-seq data, Genome Biol., 2010; 11: R25.

[63] Anders S. and Huber W., Differential expression analysis for sequence count data, Genome Biol., 2010; 11: R106.

[64] Clark A.G., Eisen M.B., Smith D.R., Bergman C.M., Oliver B., Markow T.A., Kaufman T.C., Kellis M., Gelbart W. and Iyer V.N., Evolution of genes and genomes on the Drosophila phylogeny, Nature, 2007; 450: 203-218.

[65] Martin J., Bruno V.M., Fang Z., Meng X., Blow M., Zhang T., Sherlock G., Snyder M. and Wang Z., Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads, BMC Genomics, 2010; 11: 663.

[66] Grabherr M.G., Haas B.J., Yassour M., Levin J.Z., Thompson D.A., Amit I., Adiconis X., Fan L., Raychowdhury R. and Zeng Q., Full-length transcriptome assembly from RNA-Seq data without a reference genome, Nat. Biotechnol., 2011; 29: 644-652.

[67] Wilhelm B.T., Marguerat S., Watt S., Schubert F., Wood V., Goodhead I., Penkett C.J., Rogers J. and Bähler J., Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution, Nature, 2008; 453: 1239-1243.

[68] Westermann A.J., Gorski S.A. and Vogel J., Dual RNA-seq of pathogen and host, Nat. Rev. Microbiol., 2012; 10: 618-630.

[69] Zeng W. and Mortazavi A., Technical considerations for functional sequencing assays, Nat. Immunol., 2012; 13: 802-807.

[70] Lipson D., Raz T., Kieu A., Jones D.R., Giladi E., Thayer E., Thompson J.F., Letovsky S., Milos P. and Causey M., Quantification of the yeast transcriptome by single-molecule sequencing, Nat. Biotechnol., 2009; 27: 652-658.

[71] Soanes D.M., Chakrabarti A., Paszkiewicz K.H., Dawe A.L. and Talbot N.J., Genome-wide transcriptional profiling of appressorium development by the rice blast fungus Magnaporthe oryzae, PLoS Pathog., 2012; 8: e1002514.

[72] Li X., Han X., Liu Z. and He C., The function and properties of the transcriptional regulator COS1 in Magnaporthe oryzae, Fungal Biol., 2013; 5: e1000757.

[73] Srivastava A., Ohm R.A., Oxiles L., Brooks F., Lawrence C.B., Grigoriev I.V. and Cho Y., A zinc-finger-family transcription factor, AbVf19, is required for the induction of a gene subset important for virulence in Alternaria brassicicola, Mol. Plant Microbe In., 2012; 25: 443-452.

[74] Cho Y., Srivastava A., Ohm R.A., Lawrence C.B., Wang K.H., Grigoriev I.V. and Marahatta S.P., Transcription factor Amr1 induces melanin biosynthesis and suppresses virulence in Alternaria brassicicola, PLoS Pathog., 2012; 8: e1002974.

[75] de Jonge R., van Esse H.P., Maruthachalam K., Bolton M.D., Santhanam P., Saber M.K., Zhang Z., Usami T., Lievens B. and Subbarao K.V., Tomato immune receptor Ve1 recognizes effector of multiple fungal pathogens uncovered by genome and RNA sequencing, P. Natl. Acad. Sci., 2012; 109: 5110-5115.

[76] Xu L., Zhu L., Tu L., Liu L., Yuan D., Jin L., Long L. and Zhang X., Lignin metabolism has a central role in the resistance of cotton to the wilt fungus Verticillium dahliae as revealed by RNA-Seq-dependent transcriptional analysis and histochemistry, J. Exp. Bot., 2011; 62: 5607-5621.

[77] Li C.Y., Deng G.M., Yang J., Viljoen A., Jin Y., Kuang R.B., Zuo C.W., Lv Z.C., Yang Q.S. and Sheng O., Transcriptome profiling of resistant and susceptible Cavendish banana roots following inoculation with Fusarium oxysporum f. sp. cubense tropical race 4, BMC Genomics, 2012; 13: 374.

[78] Wang Z., Zhang J., Jia C., Liu J., Li Y., Yin X., Xu B. and Jin Z., De novo characterization of the banana root transcriptome and analysis of gene expression under Fusarium oxysporum f. sp. cubense tropical race 4 infection, BMC Genomics, 2012; 13: 650.

[79] Zhang S., Xiao Y., Zhao J., Wang F. and Zheng Y., Digital gene expression analysis of early root infection resistance to Sporisorium reilianum f. sp. zeae in maize, Mol. Genet. Genomics, 2013; 288: 21-37.

[80] Petre B., Morin E., Tisserant E., Hacquard S., Da Silva C., Poulain J., Delaruelle C., Martin F., Rouhier N. and Kohler A., RNA-seq of early-infected poplar leaves by the rust pathogen Melampsora larici-populina uncovers PtSultr3;5, a fungal-induced host sulfate transporter, PLoS One, 2012; 7: e44408.

[81] Zhu S., Dai Y.M., Zhang X.Y., Ye J.R., Wang M.X. and Huang M.R., Untangling the transcriptome from fungus-infected plant tissues, Gene, 2013; 519: 238-244.

[82] O’Connell R.J., Thon M.R., Hacquard S., Amyotte S.G., Kleemann J., Torres M.F., Damm U., Buiate E.A., Epstein L. and Alkan N., Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses, Nat. Genet., 2012; 44: 1060-1065.

[83] Hershkovitz V., Sela N., Taha-Salaime L., Liu J., Rafael G., Kessler C., Aly R., Levy M., Wisniewski M. and Droby S., De-novo assembly and characterization of the transcriptome of Metschnikowia fructicola reveals differences in gene expression following interaction with Penicillium digitatum and grapefruit peel, BMC Genomics, 2013; 14: 168.