Databases of free expression
John R. Walker, Tim Wiltshire
Genomics Institute of the Novartis Research Foundation, 10675 John Jay Hopkins Drive, San Diego, California 92121, USA
Received: 6 April 2006 / Accepted: 29 August 2006
Abstract
The rapid development of microarray technologies has led to a similar progression in gene expression analysis methods, gene expression applications, and gene expression databases. Public gene expression databases enable any researcher to examine expression of their favorite genes across a wide variety of samples, download sample data for development of new analysis methods, or answer broad questions about gene expression regulation, among other applications. A wide variety of public gene expression databases exist, and they vary in their content, analysis capabilities, and ease of use. This review highlights the current features and describes examples of two broad categories of mammalian microarray databases: tissue gene expression databases and data warehouses.
Introduction
With the development of microarray technology over the past decade, global gene expression surveys have become a popular method to study biological processes. Along with the technology have come tools to more easily extract useful information from experiments. Many reviews have described methods to distinguish signal from noise from microarray images as well as software to visualize and statistically analyze differences between experimental groupings. More recent software tools now assemble biological pathways that are significantly altered in a gene expression experiment. All of these developments have allowed for easier extraction of relevant data points and interpretation of results. But even after expression changes are validated by other technologies such as quantitative polymerase chain reaction (qPCR), results still need to be interpreted in a broader context. For example, how do we know if a particular biological pathway that appears to be important in a list of differentially expressed genes actually participates in the biology under examination? Are those genes that make up the pathway expressed or enriched in the tissue or cells used in the current experiment? Has anybody else performed a comparable experiment and seen similar individual gene and pathway changes? Do members of a gene list appear as differentially expressed in expression data sets from other areas of biology? Because gene expression data sets are complex, published articles most likely do not describe all of the gene expression changes in their experiments. Thankfully, journals have required submission of entire data sets to public gene expression databases so that others have the opportunity to extract additional information from the data.
This review focuses on types of publicly available gene expression data sources that can be queried for particular genes and biological processes of interest. We focus primarily on gene expression resources of microarray data in mammals. However, other gene expression resources are cited where they add qualitative information. We discuss two categories of databases. The first comprises databases in which researchers are seeking tissue expression location information about genes of interest. These databases typically contain data obtained by a single group using a single microarray type, so cross-sample comparisons are possible. The second category of expression database we discuss is that in which researchers seek entire gene expression data sets obtained in a biological area of interest. Cross-sample or cross-experiment comparisons in these databases need to be interpreted with caution because there is a considerable amount of data variability across experiments, array types, and protocols (see below). We comment on the different types of information that can be extracted from these databases and highlight their characterizing features.
Correspondence to: John R. Walker; E-mail: [email protected]
DOI: 10.1007/s00335-006-0043-5 · Volume 17, 1141-1146 (2006) · © Springer Science+Business Media, Inc. 2006
Tissue Gene Expression Databases
One major stumbling block for researchers is what to do with genes of unknown function that appear as differentially expressed in their gene expression experiments. Because many microarray platforms are based on Unigene cluster members (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene), there may be little information on what type of protein a target sequence may encode, which means that its function is very likely unknown. More often than not, transcripts with no known function are the first to be ignored when analyzing gene expression data. Such transcripts may represent a high percentage of sequences on some arrays. For example, for the September 2005 annotation for the Affymetrix 430 2.0 array, only 45% of all sequences on the array are associated with a Gene Ontology (GO) biological process category (http://www.geneontology.org/; http://www.affymetrix.com/support/technical/byproduct.affx?product=moe430-20). Obviously, any additional information about these transcripts might be useful to determine whether they should be considered for further analysis.
Tissue location of expression can be a valuable tool to determine gene function. Mootha et al. (2003) used tissue gene expression to find a previously unknown gene responsible for a mitochondrial disorder. The location of this gene in a genomic region responsible for the disorder, as well as its strong coexpression with other known mitochondrial genes across tissues, hinted that it may be involved in the disease. Additional experiments proved that the gene did indeed cause the disorder and was most likely a mitochondrial gene. Tissue gene expression databases, along with other databases, have also been used to categorize, at a whole-genome level, genes potentially involved in a particular type of disease category (Calvo et al. 2006). Tissue gene expression data sets can also help prioritize potentially causative genes in human association (as described above) or rodent quantitative trait locus (QTL) or ethyl-N-nitrosourea (ENU) mutagenesis studies (Brown et al. 2005; Wen et al. 2004). Finally, tissue expression can help determine whether a gene product is a realistic target for pharmacotherapy. For example, if one is interested in targets for prostate cancer therapy, an ideal candidate would be one whose expression is activated in cancer yet whose expression is low in other normal tissues besides prostate (Welsh et al. 2003).
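The coexpression reasoning described above can be illustrated with a small calculation. The following is a minimal sketch, not the procedure used by Mootha et al. (2003): it assumes that each gene is represented by a list of expression values measured in the same order of tissues, and it ranks candidate genes by their mean Pearson correlation with a reference set of known genes. All gene names and expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no coexpression information
    return cov / sqrt(var_x * var_y)

def rank_by_coexpression(candidates, reference_profiles):
    """Rank candidate genes by mean correlation with a reference gene set."""
    scores = {}
    for gene, profile in candidates.items():
        rs = [pearson(profile, ref) for ref in reference_profiles.values()]
        scores[gene] = sum(rs) / len(rs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical expression values across five tissues (made-up numbers).
reference = {
    "mito_gene_1": [10.0, 80.0, 95.0, 15.0, 60.0],
    "mito_gene_2": [12.0, 70.0, 90.0, 20.0, 55.0],
}
candidates = {
    "candidate_A": [11.0, 75.0, 92.0, 18.0, 58.0],  # tracks the reference set
    "candidate_B": [90.0, 15.0, 10.0, 85.0, 30.0],  # roughly opposite pattern
}
ranking = rank_by_coexpression(candidates, reference)
```

In this toy example the candidate whose profile follows the reference genes is ranked first. With real data, a high rank would only flag a gene for follow-up experiments; it would not by itself establish function.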
Several websites provide useful information about the tissue localization of gene expression. Some are geared toward particular tissues and disease areas; they are not described in detail here. The websites presented here provide expression data for a wider range of tissues. Various highlights of the different data sets are discussed. Links and some key features of these databases are provided in Table 1.
Symatlas (http://symatlas.gnf.org/SymAtlas/) is a continually updated gene expression source for mouse, rat, and human gene expression data across a wide variety of tissues (Su et al. 2004). Currently, expression data are available across 79 human tissues, 61 mouse tissues, 29 rat tissues, and 83 commonly used human cell lines. The mouse and human data sets cover nearly the entire "transcriptome," and data from older-version Affymetrix arrays are also provided. Queries are gene centric, so multiple probe sets for every gene across each data set are displayed when searching by gene, accession number, or sequence. A particularly useful feature for candidate gene analyses is the chromosome interval search. Queries are flexible because there are options to combine searches via intersections and unions, and results can be filtered. Displays consist of bar graphs of expression across all tissues in any particular data set chosen. There are links to public information about protein function, genomic location using the UCSC genome browser (http://genome.ucsc.edu/), and probe and target sequence. Mining for transcripts that are enriched in expression in a particular tissue or are coexpressed across tissues with a chosen transcript can be performed. Various results of queries such as bar graph images, complete updated gene annotation, and processed data can be downloaded. Finally, all raw and processed data for the mouse, human, and rat data sets are available upon request.
The characterizing feature of Symatlas is its gene-centric architecture. Queries return all information about a gene and expression data for each probe for mouse, rat, and human across all array types in which a particular gene is represented. Because custom microarrays were used for the mouse and half of the human data sets, a disadvantage to using Symatlas is that direct comparisons to commercial microarrays are cumbersome.
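To make the idea of combining a tissue-enrichment search with a chromosome interval search concrete, the sketch below works on downloaded, processed data held in plain Python dictionaries. It does not reproduce the Symatlas implementation. The threshold of threefold over the median across tissues is an arbitrary assumption, and the gene names, coordinates, and expression values are hypothetical.

```python
from statistics import median

def enriched_tissues(profile, fold=3.0):
    """Return tissues whose expression is at least `fold` times the median."""
    baseline = median(profile.values())
    if baseline <= 0:
        return []
    return sorted(t for t, v in profile.items() if v >= fold * baseline)

def genes_in_interval(positions, chrom, start, end):
    """Return genes whose coordinate lies in [start, end] on `chrom`."""
    return {g for g, (c, pos) in positions.items()
            if c == chrom and start <= pos <= end}

# Hypothetical expression values per tissue and genomic positions.
expression = {
    "gene_1": {"liver": 120.0, "heart": 100.0, "brain": 110.0, "testis": 900.0},
    "gene_2": {"liver": 50.0, "heart": 55.0, "brain": 45.0, "testis": 60.0},
    "gene_3": {"liver": 20.0, "heart": 25.0, "brain": 30.0, "testis": 400.0},
}
positions = {
    "gene_1": ("chr4", 1_200_000),
    "gene_2": ("chr4", 1_500_000),
    "gene_3": ("chr7", 3_000_000),
}

# Intersection of two searches: testis-enriched AND inside the interval.
testis_enriched = {g for g, p in expression.items()
                   if "testis" in enriched_tissues(p)}
in_interval = genes_in_interval(positions, "chr4", 1_000_000, 2_000_000)
candidates = testis_enriched & in_interval
```

Replacing the `&` operator with `|` gives the union of the two searches, which mirrors the intersection and union options described for the query interface.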
Stanford University's SOURCE (http://genome-www5.stanford.edu/cgi-bin/source/sourceSearch) also provides useful alias searches (Diehn et al. 2003). An advantage and disadvantage of SOURCE is that it links to external sources for gene expression information. Though this results in a diverse view of expression across tissues, sometimes evidence is based only on representation in expressed sequence tag (EST) libraries and not more quantitative microarray data. Also, because of this dependence, there is not one source of tissue gene expression data that can be downloaded or from which processed data for particular genes can be retrieved.
Visualization of gene expression in SOURCE, when applicable, is in the form of a heat map. Finding nearest expression neighbors is as simple as clicking on the heat map. Useful additional features of SOURCE are availability of clone information for every gene and ability to retrieve upstream genomic sequence. The one unique feature of SOURCE is its ability to retrieve diverse gene expression data sets when a single gene is queried.
The Mouse Gene Prediction Database from the Hughes lab at the University of Toronto (http://mgpd.med.utoronto.ca/) provides a simple yet very useful query interface and a rich mouse gene expression data source (Zhang et al. 2004). Output consists of a heat map of gene expression across tissues along with nearest expression neighbors. Various links to gene annotation are available as well as immediate access to the oligo sequence that was on the array. The raw and processed microarray data for this data set are available via links to the cited publication. A distinguishing feature is the ability to search for coexpression of genes that belong to a GO category, which yields immediate function-expression correlations.
Though the Oncogenomics Normal Tissue Database (http://ntddb.abcc.ncifcrf.gov/cgi-bin/nltissue.pl) represents only 19 human tissues, there are many individual replicates of each tissue, allowing for examination of expression variability across human donors (Son et al. 2005). As with the Mouse Gene Prediction Database, chromosome interval and GO searches are possible. This database is password-protected, though passwords are easily obtained. Distinguishing features of this database include many options to search for differential expression across tissues, simple and rapid downloading of data behind heat maps, correlation searches in which the hits are sorted by correlation coefficient, and a bar graph display of expression across the individual replicates.
RIKEN (http://read.gsc.riken.go.jp/) offered one of the first tissue gene expression databases to the public (Bono et al. 2002). This mouse database was designed around RIKEN's rich clone collection and therefore contains comprehensive clone information. Characteristic features include a somewhat rich developmental data set, expression correlation searches, and tissue-enrichment search capability.
The TeraGenomics database (http://publicweb.teragenomics.com/public/login.asp) does not offer a query tool to examine expression of input gene(s). However, because it contains expression data for several tissues and mouse strains and because it is possible to download processed data for offline queries, it deserves mention here. Distinguishing characteristics include expression data for several mouse strains, detailed experimental procedures, high-quality samples and hybridizations, and detailed hybridization reports allowing for comparison of sample and array quality.

Table 1. Key features of tissue expression databases and data warehouses

URL | Source name | Key features
Tissue Expression Sources
http://symatlas.gnf.org/SymAtlas/ | Symatlas | Synonym searching, gene centric, many different tissues
http://genome-www5.stanford.edu/cgi-bin/source/sourceSearch | SOURCE | Links between genes and clones, multiple data sources
http://mgpd.med.utoronto.ca/ | Mouse Gene Prediction Database | GO category searches
http://ntddb.abcc.ncifcrf.gov/cgi-bin/nltissue.pl | Oncogenomics Normal Tissue Database | Data normalization and display options, many replicates
http://read.gsc.riken.go.jp/ | RIKEN | Embryonic tissues, clone information
http://publicweb.teragenomics.com/public/login.asp | TeraGenomics | Different mouse strains, rich in experimental details
http://www.genenetwork.org/home.html | WebQtl | Phenotype data, multiple strains
http://www.tigr.org/index.shtml | TIGR | Software, EST library information
http://www.informatics.jax.org/menus/expression_menu.shtml | Jackson Laboratory | Developmental tissues, alternative expression technologies
http://www.brain-map.org | Allen Brain Atlas | In situ hybridization of mouse brain
Data Warehouses
http://www.ncbi.nlm.nih.gov/geo/ | GEO | Abundant data, analysis tools
http://www.ebi.ac.uk/arrayexpress/ | Arrayexpress | Analysis tools, extensive sample annotation
genome-www.stanford.edu/microarray | Stanford Microarray Database | Free software, analysis tools
http://www.cbil.upenn.edu/RAD/php/index.php | RAD | Rapid data downloads
http://proteogenomics.musc.edu/ma/musc_madb.php? | MUSC | Rapid data downloads
WebQTL (http://www.genenetwork.org/home.html) combines microarray expression data in several tissues with mouse phenotype and DNA sequence variation data across recombinant inbred mouse strains (Wang et al. 2003). This combination of data allows for QTL mapping and association of various phenotypic traits and facilitates the integration of networks of genes, transcripts, and traits.
There are also sites that provide queries for expression across tissues, yet the data sources are not from microarrays. However, these sites provide additional useful information such as how to obtain clones for particular genes or EST libraries, and supplementary information using other expression technologies including Northern blots and in situ hybridization across select tissues. Though these sources may not cover the entire genome or probe many tissues, they are worth querying for known genes because they may provide information that is not available from microarray technology.
TIGR (http://www.tigr.org/index.shtml), though a leader in microarray-based expression data analysis, offers expression data for mammals only from EST libraries. Though this does not allow for highly quantitative visualization of expression across tissues, it does provide rich information about individual clones in a wide array of EST libraries for pig, dog, and cattle in addition to mouse, rat, and human.
The Jackson Laboratory (http://www.informatics.jax.org/searches/expression_form.shtml) is known as a rich source of mouse phenotypic data. It also has a mouse gene expression collection using in situ hybridization, Northern blotting, reverse transcriptase PCR (RT-PCR), and RNase protection (Ringwald et al. 2001). In addition, it offers protein expression data using Western blots and immunohistochemistry. These sources, therefore, could provide more sensitive expression information than microarrays (RT-PCR), better anatomically defined mRNA localization (in situ hybridization), protein size and semiquantitation (Western), and protein localization (immunohistochemistry). A distinguishing feature of this database is its developmental expression data.
The Allen Brain Atlas (http://www.brain-map.org/welcome.do;jsessionid=70CE335B9D84FCBEC571AB2F1E0027BE) is a rapidly expanding source of gene expression in the mouse brain. Currently, over 12,000 genes can be queried for very detailed in situ hybridization data, with plans for 20,000 genes by the end of 2006. Tools to examine differential expression across brain regions are already available.
Gene Expression Data Warehouses
Often researchers have a particular gene or set of genes for which they would like information in addition to tissue gene expression. Besides where a gene is expressed, it would also be useful to know when it is expressed. For example, it could be useful to determine what happens to expression of a particular gene when another gene is knocked down or overexpressed in mice or in cell culture, when animals or cells are treated with a particular drug or given a specific stimulus, or in a particular disease condition. Genewise queries across all samples are not yet possible across large publicly available data sets, probably because it would take a huge effort to organize the data and set up the analyses. For example, deciding which experimental factors to compare and which appropriate statistical tests to use would have to be done in advance. In addition, as described below, queries across laboratories and data platforms may produce expression artifacts.
At times, researchers will already have a hypothesis in mind and will want to search through multicondition databases to find an experimental condition of interest to them. For this purpose, it would be useful to search for differential expression of genes across this data set and/or download the data for analysis with gene expression software of their choice. Currently, there are only a few databases that provide enough gene expression data in one place to make these searches useful; some are developing tools to analyze and download portions of these data.
An ideal data warehouse would make it easy for users to find differentially expressed genes in any data set, visualize expression of those genes across the given conditions, and download expression data with gene and sample annotation. But before this stage, finding experiments of interest should be made easy with keyword searches. This requires that submitters include adequate information about the experiment, the samples, and the microarray procedures. Because microarray data analysis methods vary widely and are continually developing, data warehouses would not be expected to provide tools to satisfy every user. For this reason, raw data should be easily available for download.
The data warehouses reviewed here offer some of the options and features described above. These warehouses are growing at a constant rate and are adding new features to meet the demands of the research community. A summary of these warehouses is provided in Table 1.
Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) is one database that many journals recommend as a location to submit microarray data upon submission of a manuscript (Edgar et al. 2002). GEO now contains data for over 85,000 samples that cover many species and microarray platforms. Data submission requires data in GEO's format (SOFT), but GEO also accepts MAGE-ML formatted data. Submitters are encouraged to submit raw data formats (CEL files for Affymetrix data, for example) so that site visitors can download and process the data on their own.
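For readers who download SOFT files for offline processing, the sketch below shows one way such a file might be read. It assumes the commonly documented SOFT conventions: entity lines begin with `^`, attribute lines with `!`, column descriptions with `#`, and the tab-delimited data table sits between `!sample_table_begin` and `!sample_table_end`. It handles a single sample record only, and the accession, sample description, and values in the example are made up.

```python
def parse_soft_sample(text):
    """Parse one SOFT-style sample record into (attributes, table rows)."""
    attributes = {}
    rows = []
    header = None
    in_table = False
    for line in text.splitlines():
        if not line.strip():
            continue
        lowered = line.lower()
        if lowered.startswith("!sample_table_begin"):
            in_table = True
            header = None
            continue
        if lowered.startswith("!sample_table_end"):
            in_table = False
            continue
        if in_table:
            fields = line.split("\t")
            if header is None:
                header = fields  # first table line names the columns
            else:
                rows.append(dict(zip(header, fields)))
        elif line.startswith(("^", "!", "#")):
            # "key = value" lines; a key may repeat, so values are collected.
            key, _, value = line[1:].partition("=")
            attributes.setdefault(key.strip(), []).append(value.strip())
    return attributes, rows

# Hypothetical record; the accession and the numbers are invented.
example = "\n".join([
    "^SAMPLE = GSM_EXAMPLE",
    "!Sample_title = liver, replicate 1",
    "!Sample_organism_ch1 = Mus musculus",
    "#ID_REF = probe set identifier",
    "#VALUE = normalized signal",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "probe_1\t152.3",
    "probe_2\t8.7",
    "!sample_table_end",
])
attributes, rows = parse_soft_sample(example)
```

Values are kept as strings so that the caller can decide how to treat missing or flagged measurements. A complete reader would also need to handle platform and series records, which this sketch ignores.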
GEO has recently added query tools (Barrett et al. 2005) that allow visitors to examine differential expression in any data set. GEO's SOFT submission format requires description of experiments and samples, so experiments of interest can be found with a simple query. Researchers who visit any of NCBI's databases (PubMed, Entrez Gene) will be familiar with search tools and input options available in GEO. Once experiments of interest are found, differential expression can be examined and expression of individual genes across designated samples can be displayed via heat maps and bar graphs.
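Differential expression within a single downloaded data set can also be examined offline. The sketch below is one simple approach and is not a description of the statistics used by the GEO tools: it assumes log-scale expression values keyed by sample name and ranks genes by the absolute value of Welch's t statistic between two groups of samples. Gene names, sample names, and values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of at least two values each."""
    va, vb = variance(group_a), variance(group_b)
    se = sqrt(va / len(group_a) + vb / len(group_b))
    if se == 0:
        return 0.0  # no variance estimate; treated as uninformative here
    return (mean(group_a) - mean(group_b)) / se

def rank_genes(data, samples_a, samples_b):
    """Rank genes by |t| between two named groups of samples."""
    scores = {
        gene: welch_t([values[s] for s in samples_a],
                      [values[s] for s in samples_b])
        for gene, values in data.items()
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical log2 expression values for treated and control samples.
data = {
    "gene_up": {"ctl_1": 5.0, "ctl_2": 5.2, "ctl_3": 4.8,
                "trt_1": 8.0, "trt_2": 8.3, "trt_3": 7.9},
    "gene_flat": {"ctl_1": 6.0, "ctl_2": 6.1, "ctl_3": 5.9,
                  "trt_1": 6.0, "trt_2": 6.2, "trt_3": 5.9},
}
ranked = rank_genes(data, ["trt_1", "trt_2", "trt_3"],
                    ["ctl_1", "ctl_2", "ctl_3"])
```

With only a few replicates per group, a statistic of this kind is noisy, and any real analysis would also need to account for testing thousands of genes at once.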
Array Express from EMBL (http://www.ebi.ac.uk/arrayexpress/) is also recommended by many journals as a location to deposit microarray data, though its collection is smaller than GEO's (Brazma et al. 2003). The most distinguishing characteristic of Array Express is the extensive sample annotation they require to pass MIAME (http://www.mged.org/Workgroups/MIAME/miame.html) standards. They provide software for this task, and MAGE-ML formats of submitted data sets are easily obtainable.
Data analysis in Array Express is performed after data are imported into their Expression Profiler software. A unique feature is the ability to normalize data in several ways. There is also a wide variety of options for clustering and differential expression analysis. It is relatively easy to download both raw and processed data, and many download parameters are available. It is also worth mentioning that Array Express is developing a tissue gene expression search tool using various data sources in their warehouse.
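To illustrate what normalizing data "in several ways" can mean in practice, the sketch below implements two widely used approaches on downloaded data: scaling each array to a common median, and quantile normalization. These are generic textbook methods chosen for illustration. They are not claimed to be the options offered by Expression Profiler, and the input values are hypothetical.

```python
from statistics import median

def median_scale(arrays, target=100.0):
    """Scale each array so that its median equals `target`."""
    return [[v * target / median(a) for v in a] for a in arrays]

def quantile_normalize(arrays):
    """Quantile-normalize equal-length arrays (ties are not averaged)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # The mean of the k-th smallest values defines the reference distribution.
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by its rank's mean
        normalized.append(out)
    return normalized

# Two hypothetical arrays measuring the same three probes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
scaled = median_scale(arrays, target=10.0)
quantiled = quantile_normalize(arrays)
```

Median scaling only shifts the overall intensity of each array, whereas quantile normalization forces all arrays to share the same distribution of values. The choice between them can change downstream results, which is one reason access to raw data matters.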
The Stanford Microarray Database (http://genome-www5.stanford.edu/) has been a leader in providing open-source data and creating microarray data tools (Ball et al. 2005). Most of its data submissions originate from Stanford University and collaborators (over 60,000 experiments of which around 11,000 are public). In addition to being a source for microarray data, there are links to microarray company sites, collections of microarray publications, and microarray-related learning materials. There are downloads available for free microarray-related software developed at Stanford and an extensive list of links to external software. Similar to Array Express, there are comprehensive experimental and sample annotation requirements. In addition, the Stanford Microarray Database provides many options to analyze, visualize, and download data.
Other much smaller gene expression data sources are worth mentioning because they might contain data of interest to particular researchers. RAD (http://www.cbil.upenn.edu/RAD/php/index.php) from the University of Pennsylvania and MUSC (http://proteogenomics.musc.edu/ma/musc_madb.php?page=home&act=manage) from the Medical University of South Carolina contain a limited number of mouse, rat, and human gene expression data sets (Argraves et al. 2003; Manduchi et al. 2004). At MUSC experiment descriptions are complete and data are downloaded rapidly. Because of the quick downloads, it is worthwhile to examine both of these sites for experiments of interest.
The above-mentioned data warehouses allow for some degree of within-experiment comparisons of samples. Because data from most experiments can be downloaded, it might be tempting to compare arrays from different experiments. However, caution should be used when comparing experiments between laboratories and across platforms (Tan et al. 2003). Encouragingly, one recent set of studies shows that data collected using standardized methods from experienced laboratories can be reproducible across platforms and laboratories, at least for a subset of differentially expressed genes (Bammler et al. 2005; Irizarry et al. 2005; Larkin et al. 2005).
Conclusions
There have been many improvements in gene expression technologies over the past decade. Along with those improvements have come better algorithms and software to extract and analyze data. In addition, journals and gene expression warehouses have been enforcing better descriptions of samples and experiments. All of these trends have resulted in better data submitted to gene expression warehouses. Some of these warehouses are now developing user-friendly and powerful tools to more easily extract meaningful information from these databases. All of these trends will allow us to get more out of each other's data.
Acknowledgments
The authors thank the reviewers for helpful comments and suggestions and the Novartis Research Foundation for financial support.
References
1. Argraves GL, Barth JL, Argraves WS (2003) The MUSC DNA Microarray Database. Bioinformatics 19, 2473-2474
2. Ball CA, Awad IA, Demeter J, Gollub J, Hebert JM, et al. (2005) The Stanford Microarray Database accommodates additional microarray platforms and data formats. Nucl Acids Res 33(1), D580-D582
3. Bammler T, Beyer RP, Bhattacharya S, Boorman GA, Members of the Toxicogenomics Research Consortium (2005) Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods 2, 351-356; Erratum (2005) 2, 477
4. Barrett T, Suzek TO, Troup DB, Wilhite SE, Ngau WC, et al. (2005) NCBI GEO: mining millions of expression profiles--database and tools. Nucl Acids Res 33(Database issue), D562-D566
5. Bono H, Kasukawa T, Hayashizaki Y, Okazaki Y (2002) READ: RIKEN Expression Array Database. Nucl Acids Res 30, 211-213
6. Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, et al. (2003) ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucl Acids Res 31, 68-71
7. Brown A, Olver WI, Donnelly CJ, May ME, Naggert JK, et al. (2005) Searching QTL by gene expression: analysis of diabesity. BMC Genet 6, 12
8. Calvo S, Jain M, Xie X, Sheth SA, Chang B, et al. (2006) Systematic identification of human mitochondrial disease genes through integrative genomics. Nat Genet 38, 576-582
9. Diehn M, Sherlock G, Binkley G, Jin H, Matese JC, et al. (2003) SOURCE: a unified genomic resource of functional annotations, ontologies, and gene expression data. Nucl Acids Res 31(1), 219-223
10. Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucl Acids Res 30, 207-210
11. Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, et al. (2005) Multiple-laboratory comparison of microarray platforms. Nat Methods 2, 345-350
12. Larkin JE, Frank BC, Gavras H, Sultana R, Quackenbush J (2005) Independence and reproducibility across microarray platforms. Nat Methods 2, 337-344
13. Manduchi E, Grant GR, He H, Liu J, Mailman MD, et al. (2004) RAD and the RAD Study-Annotator: an approach to collection, organization and exchange of all relevant information for high-throughput gene expression studies. Bioinformatics 20, 452-459
14. Mootha VK, Lepage P, Miller K, Bunkenborg J, Reich M, et al. (2003) Identification of a gene causing human cytochrome c oxidase deficiency by integrative genomics. Proc Natl Acad Sci USA 100, 605-610
15. Ringwald M, Eppig JT, Begley DA, Corradi JP, McCright IJ, et al. (2001) The Mouse Gene Expression Database (GXD). Nucl Acids Res 29(1), 98-101
16. Son CG, Bilke S, Davis S, Greer BT, Wei JS, et al. (2005) Database of mRNA gene expression profiles of multiple human organs. Genome Res 15, 443-450
17. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, et al. (2004) A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA 101, 6062-6067
18. Tan PK, Downey TJ, Spitznagel EL Jr, Xu P, Fu D, et al. (2003) Evaluation of gene expression measurements from commercial microarray platforms. Nucl Acids Res 31, 5676-5684
19. Wang J, Williams RW, Manly KF (2003) WebQTL: Web-based complex trait analysis. Neuroinformatics 1, 299-308
20. Welsh JB, Sapinoso LM, Kern SG, Brown DA, Liu T, et al. (2003) Large-scale delineation of secreted protein biomarkers overexpressed in cancer tissue and serum. Proc Natl Acad Sci USA 100, 3410-3415
21. Wen BG, Pletcher MT, Warashina M, Choe SH, Ziaee N, et al. (2004) Inositol (1,4,5) trisphosphate 3 kinase B controls positive selection of T cells and modulates Erk activity. Proc Natl Acad Sci USA 101, 5604-5609
22. Zhang W, Morris QD, Chang R, Shai O, Bakowski A, et al. (2004) The functional landscape of mouse gene expression. J Biol 3, 21