

Comparative and Functional Genomics
Comp Funct Genom 2004; 5: 633–641.
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.447

Conference Paper

Standardization initiatives in the (eco)toxicogenomics domain: a review

Susanna Assunta Sansone 1*, Norman Morrison 2, Philippe Rocca-Serra 1 and Jennifer Fostel 3
1 EMBL-EBI, The European Bioinformatics Institute, Cambridge CB10 1SD, UK
2 The University of Manchester, Kilburn Building, School of Computer Science, Oxford Road, Manchester M13 9PL, UK
3 National Institute of Environmental Health Sciences, National Center for Toxicogenomics, Research Triangle Park, NC 27709, USA

*Correspondence to: Susanna Assunta Sansone, EMBL-EBI, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK. E-mail: [email protected]

Revised: 24 November 2004
Accepted: 26 November 2004

Abstract

The purpose of this document is to provide readers with a resource of different ongoing standardization efforts within the ‘omics’ (genomics, proteomics, metabolomics) and related communities, with particular focus on toxicological and environmental applications. The review includes initiatives within the research community as well as in the regulatory arena. It addresses data management issues (format and reporting structures for the exchange of information) and database interoperability, highlighting key objectives, target audience and participants. A considerable amount of work still needs to be done and, ideally, collaboration should be optimized and duplication and incompatibility should be avoided where possible. The consequence of failing to deliver data standards is an escalation in the burden and cost of data management tasks. Copyright 2005 John Wiley & Sons, Ltd.

Keywords: toxicogenomics; ecotoxicogenomics; toxicology; environment; functional genomics; standards; database

Introduction

Molecular-based approaches, such as transcriptomics, proteomics, metabolomics and metabonomics, are being used to study the impact of chemicals on human and wildlife populations. These high-throughput (eco)toxicogenomics investigations are information-intensive and, by producing massive amounts of data, have placed the informatics challenge under the spotlight. The need to provide easy access to integrated data in a structured standard format is clearly significant. Several efforts are already under way to promote standardization, tackle data management issues and develop databases to facilitate data exchange. We have seen the value of these collaborative efforts already. The Microarray Gene Expression Data (MGED; http://www.mged.org) Society has been successful in developing the MIAME standard and related ontology and object models for microarray data (reviewed in Quackenbush 2004). The Reporting Structure for Biological Investigations (RSBI; http://www.mged.org/Workgroups/rsbi) is a new working group formed under the MGED Society umbrella, planning to act as a ‘single point of focus’ for Toxicogenomics, Environmental Genomics and Nutrigenomics communities working towards an international and compatible informatics platform for data exchange. Discipline-specific initiatives are regarded as important because they target ‘real world’ data capture requirements for the particular omics technologies being used. A consequence of this, however, is that, by remaining within each given discipline, the standardization effort fragments, resulting in duplication and the development of different terminology and data models, thereby limiting the potential for data exchange. One of the objectives of the RSBI working group is to ensure that these initiatives are coordinated, so that synergy and cross-discipline communication can be maximized, and duplicated effort can be minimized.


To capitalize on these efforts, representatives of the RSBI working group are also directly participating in certain initiatives and, by fostering interactions, are laying the ground for further collaborations. One forum for such interaction is the Standards and Ontologies for Functional Genomics (SOFG; http://www.sofg.org) Conference. We invite comments on the work of the RSBI at [email protected]

Standardization initiatives

Data standardization is now being considered beyond the research application of high-throughput technologies (reviewed in Quackenbush, 2004), and regulatory bodies, such as the US Food and Drug Administration (FDA) and Environmental Protection Agency (EPA), are developing their policy or guidance on genomics data submissions (http://www.fda.gov/cder/guidance/5900dft.doc; http://www.epa.gov/osa/genomics.htm). Several organizations and committees are tackling data standardization; however, there is a fundamental difference in both design and objectives between the efforts around regulatory submission of data and those serving the research community, which needs databases and tools for discovery. The former aim to accelerate the review process, facilitate proprietary data submission and optimize data visualization in a way that does not impact the vocabulary used by the individual submitter. The research community needs to ease deposition in public databases and facilitate data mining by the use of common annotation standards and ontologies. There is some overlap between the needs of these communities and some level of interaction. Thus, there is value in assessing the commonality between regulatory, research community and database designers’ objectives in the design of data standards. Specifically, a unified approach to describing and reporting the experimental biological metadata that is common to the different ‘omics’ technologies (transcriptomics, proteomics and metabonomics/metabolomics) or disciplines (e.g. pharmacogenomics, toxicogenomics, environmental genomics) is a goal of the RSBI. Undoubtedly, specialized information is needed by certain applications, but a high-level unified model for the description of metadata would be able to encompass these applications. Here, metadata refers to biological information relating to samples and information about the experimental design. Data refers to measured values relating to samples (e.g. toxicological endpoints and gene expression) under given experimental conditions.
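The distinction between metadata and data drawn above can be made concrete with a small sketch. The Python fragment below is purely illustrative: the class names, field names and example values are assumptions introduced here and are not part of MIAME, the RSBI proposals or any other standard discussed in this review. It shows one shared metadata description per sample, to which measured values from different technologies are attached.

```python
from dataclasses import dataclass, field


@dataclass
class SampleMetadata:
    """Biological information about a sample and its experimental design.

    All field names are hypothetical, chosen only for illustration.
    """
    sample_id: str
    organism: str
    compound: str
    dose_mg_per_kg: float
    time_point_h: float


@dataclass
class Measurement:
    """A measured value for a sample (toxicological endpoint, expression level, ...)."""
    sample_id: str
    technology: str  # e.g. "transcriptomics" or "clinical chemistry"
    variable: str
    value: float


@dataclass
class Investigation:
    """One metadata description shared by data from several technologies."""
    samples: dict = field(default_factory=dict)
    measurements: list = field(default_factory=list)

    def add_sample(self, meta: SampleMetadata) -> None:
        self.samples[meta.sample_id] = meta

    def add_measurement(self, m: Measurement) -> None:
        # Data are accepted only for samples whose metadata have been reported.
        if m.sample_id not in self.samples:
            raise KeyError(f"no metadata reported for sample {m.sample_id!r}")
        self.measurements.append(m)

    def technologies_for(self, sample_id: str) -> set:
        return {m.technology for m in self.measurements if m.sample_id == sample_id}


# Hypothetical example values.
inv = Investigation()
inv.add_sample(SampleMetadata("S1", "Rattus norvegicus", "compound A", 50.0, 24.0))
inv.add_measurement(Measurement("S1", "transcriptomics", "gene X log2 ratio", 1.8))
inv.add_measurement(Measurement("S1", "clinical chemistry", "serum ALT (U/L)", 120.0))
print(sorted(inv.technologies_for("S1")))
```

In this sketch the metadata record is written once and reused by every technology, which is the kind of commonality a unified reporting structure would aim to capture.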

This paper is not an exhaustive list of all activity but provides a summary of standardization efforts for toxicological and environmental applications, which address reporting standards (e.g. what should be reported), and management issues (e.g. how reported information should be stored and exchanged, and which ontologies should be used to annotate data and metadata). The various initiatives fall into six broad categories, summarized in Table 1 and explored in detail below.

‘Omics’ technology communities

These are academic grass roots communities that have joined forces with commercial vendors to address content standards and reporting needs for a single high-throughput technology.

MGED Society

The MGED Society has established standards for microarray data annotation (MIAME; Brazma et al., 2001; Ball et al., 2002) and exchange (MAGE-ML; Spellman et al., 2002) that have facilitated the creation of microarray databases and related supporting software (MAGE-OM; Spellman et al., 2002). The response from the scientific community to these community standards has been extremely positive (Editorial, 2002). Most of the major scientific journals and some funding agencies require publications describing microarray experiments to comply with MIAME, and for the data to be submitted to public repositories, such as ArrayExpress (Brazma et al., 2003), GEO (Edgar et al., 2002) and CIBEX (Ikeo et al., 2003). Consequently, the MIAME model has been adopted by other communities (Quackenbush, 2004). MGED is now working with other initiatives, such as HUPO-PSI in the proteomics field and SMRS (see below). There have been several extensions to MIAME: MIAME-Tox, an array-based toxicogenomics standard developed by the ILSI Health and Environmental Sciences Institute (HESI) (http://hesi.ilsi.org/index.cfm?pubentityid=120); the National Institute of Environmental Health Sciences (NIEHS); the National Center for Toxicogenomics (NCT; http://www.niehs.nih.gov/nct); the FDA National Center for Toxicological Research (NCTR; http://www.fda.gov/nctr); and the European Bioinformatics Institute (EBI; http://www.ebi.ac.uk). MIAME/Env has been developed by the NERC Environmental Genomics Thematic Programme Data Centre (EGTDC; http://envgen.nox.ac.uk) to fulfil the diverse needs of those working in the functional genomics of ecosystems, invertebrates and vertebrates which are not covered by the model organism community. However, extending MIAME to meet domain-specific requirements is only a partial solution. As multi-technology investigations become commonplace, these checklists will soon be insufficient. Currently, the above communities are working together with the RSBI group to develop a reporting structure for describing multi-platform technologies investigations. The proposed RSBI Tiered Checklist (RSBI TC; http://www.mged.org/Workgroups/rsbi) will be a modular context-dependent structure.

Table 1. The initiatives divided according to six broad categories

| Category | Description | Acronym | Domain | URL |
|---|---|---|---|---|
| Omics technology communities | Academic grass roots communities that have joined forces with commercial vendors to create technology-driven standards | MGED | Microarray | http://www.mged.org |
| | | PSI | Proteomics | http://psidev.sourceforge.net |
| | | SMRS | Metabolomics and metabonomics | http://www.smrsgroup.org |
| Measurement and methods validations | Efforts focusing on validation programs and production of standard materials and methods | ECVAM | Array-based toxicogenomics | http://ecvam.jrc.cec.eu.int |
| | | ERCC | Microarrays and quantitative RT-PCR | http://www.cstl.nist.gov/biotech/workshops/ERCC2004 |
| | | MARG ABRF | Microarray | http://www.abrf.org/index.cfm/group.show/Microarray.30.htm |
| | | MFB | | http://www.mfbprog.org.uk |
| Regulatory driven discussion fora | Efforts aiming for a broader understanding and use of omics data, defining data models for data submission to regulators that preserve the terms and observations used by the submitter | CDISC | Clinical data | http://www.cdisc.org |
| | | PGx | Pharmacogenomics data | To be announced |
| | | SEND | Animal toxicity data | http://www.cdisc.org/models/send/v1.5 |
| Domain-driven discussion fora | Efforts aiming to a broader exchange and integration of toxicity and ecological data | DSSTox | Chemical toxicity data | http://www.epa.gov/nheerl/dsstox/ |
| | | SEEK | Ecological data | http://seek.ecoinformatics.org |
| World-wide organizations | Efforts producing internationally agreed instruments, decisions and recommendations or acting as facilitator | IPCS | Toxicogenomics | http://www.who.int/ipcs/en/ |
| | | NAS | (Eco)toxicogenomics | http://dels.nas.edu/emergingissues |
| | | OECD | Ecotoxicogenomics | http://www.oecd.org |
| | | BSC IEEE | Bioscience | http://www.csbcon.org |
| Infrastructure | Standards-compliant infrastructure, assisting in development of useful and usable standards | ArrayExpress and Tox-MIAMExpress | Array-based data and toxicology endpoints values | http://www.ebi.ac.uk/arrayexpress; http://www.ebi.ac.uk/tox-miamexpress |
| | | CEBS | Toxicogenomics | http://cebs.niehs.nih.gov |
| | | CTD | Genes and proteins | http://ctd.mdibl.org |
| | | maxd | Array-based data and environmental metadata | http://bioinf.man.ac.uk/microarray/maxd |
| | | TIS (ArrayTrack) | Toxicogenomics | http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack |


Proteomics Standardization Initiative (PSI)

The HUPO (Human Proteome Organization; http://www.hupo.org) PSI (http://psidev.sourceforge.net) includes the major protein databases, government and industry and is defining standards for data representation in proteomics to facilitate data comparison, exchange and verification. Current focus is on mass spectrometry and protein–protein interaction data. A set of open source standards is being developed along MIAME lines, including a content standard, the Minimum Information About Proteomics Experiments (MIAPE), an XML standard data exchange format (Hermjakob et al., 2004) and an ontology of clearly defined general proteomics terms.

Standard Metabolic Reporting Structure (SMRS)

SMRS (http://www.smrsgroup.org) comprises industry, software developers, governmental representatives and academia, who are investigating the reporting and design of metabonomics and metabolomics studies in plants, microbial systems, environment, in vivo and in vitro applications, as well as human studies. A set of draft recommendations has been produced as a discussion document. It considers the factors in a metabolic study that could be recorded and standardized, including the origin of a biological sample, the technologies and methods for analysis and the chemometric and statistical approaches. The recommendations also touch on the granularity of information required for different reporting needs, including journal submissions, public databases and regulatory submissions.

Measurement and methods validations

As high-throughput technologies are used in industry and are considered by regulatory agencies, the methodology itself comes under scrutiny. Agreement on data formats will do little good if experimental protocols are inconsistent. Currently, standardization of microarray experiment procedures is key to the broad acceptance and use of these data. The variability of microarray data generation and analysis, the future validation of the technology and the production of standard materials are now the focus of many initiatives.

MfB (Measurements for Biotechnology) program

MfB (http://www.mfbprog.org.uk) is a UK programme that addresses bio-measurements of importance for industry. The ‘Comparability of Gene Expression Measurements on Microarrays’ is an industry-based consortium led by LGC (http://www.lgc.co.uk). The project is designed to determine the accuracy and comparability of gene expression measurements made on different array platforms and also evaluates data analysis methods. A second phase is now looking at the standardization of array-based toxicogenomics and will build on the analysis framework to develop a panel of quality metrics for validating and standardizing array-based toxicogenomics measurements.

The Microarray Research Group (MARG) of the Association of Biomolecular Resource Facilities (ABRF)

The MARG (http://www.abrf.org/index.cfm/group.show/Microarray.30.htm) is a research-focused consortium of academic laboratories promoting communication and cooperation among core academic and industrial microarray and data analysis services providers. The resulting data is used to help laboratories evaluate their performance and achieve the highest quality results possible from the use of microarray technologies.

The European Centre for the Validation of Alternative Methods (ECVAM)

The ECVAM (http://ecvam.jrc.cec.eu.int) coordinates and funds validation studies of alternative methods that could reduce, refine or replace the use of laboratory animals in regulatory toxicology. Both the new EU Chemical Policy (REACH) (Editorials, 2003a, 2003b) that proposes the re-evaluation of about 30 000 chemicals, and the 7th Amendment to the Cosmetics Directive, which foresees the complete replacement of animal experiments by 2013, call for the development and implementation of alternative methods. ECVAM is working with the US Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM; http://iccvam.niehs.nih.gov/home.htm) and National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM; http://iccvam.niehs.nih.gov/home.htm) to investigate the specific considerations necessary for adequate validation of array-based toxicogenomics-based test methods. At present, recommendations are being prepared which will cover topics such as description of the biological systems, methodological/technical issues, data analysis, and data format and storage.

External RNA Controls Consortium (ERCC)

ERCC (http://www.cstl.nist.gov/biotech/workshops/ERCC2004) originated at a US National Institute of Standards and Technology (NIST; http://www.nist.gov) meeting and is composed of representatives from the public, private and academic sectors, addressing experimental control and performance evaluation for gene expression analysis. ERCC is considering the utility of universal (platform-independent) spike-in controls, protocols, and informatics tools intended for use across one- and two-channel microarray and quantitative RT-PCR (QRT-PCR). Outcomes of this work will be published and resulting data submitted to a public database.

Regulatory-driven fora

To streamline regulatory electronic submissions, a number of technical issues need to be addressed. These efforts intend to identify the kind of data that should be included in submissions to regulatory bodies and automate the largely paper-based clinical trials and non-clinical research processes.

Clinical Data Interchange Standards Consortium (CDISC)

CDISC (http://www.cdisc.org) is an open, multidisciplinary, non-profit organization committed to the development of worldwide pharmaceutical industry standards: vendor-neutral, platform-independent data models to support the electronic acquisition, exchange, submission and archiving of clinical trials data and metadata.

Standard for Exchange of Non-clinical Data (SEND)

SEND (http://www.cdisc.org/models/send/v1.5) is a consortium formed among the pharmaceutical industry, contract laboratories, software developers and the FDA. The goal of SEND is to develop a common format for the electronic submission of animal toxicity data and study description to a regulatory agency. Once the SEND standard is finalized, it will be merged with CDISC’s model to form the Study Data Tabulation Model (SDTM).

Pharmacogenomics (PGx) Standards Group

The Pharmacogenomics (PGx) Standards Group was formed in November 2003 at a workshop organized by the Drug Information Association (DIA), FDA, Pharmacogenetics Working Group (PWG), Pharmaceutical Research and Manufacturers of America (PhRMA) and Biotechnology Industry Organization (BIO) to review the FDA draft, ‘Guidance for Industry: Pharmacogenomic Data Submissions’. The PGx Standards Group encompasses regulatory bodies, pharma, and industry organizations. The goal of this joint project is to help define the requirements for pharmacogenomics submission to the FDA and define data formats and standards. This project focuses on the use of pharmacogenomics and toxicogenomics data to support pharmacological and toxicological conclusions. There is a consensus within this group to use existing standards (e.g. MIAME, MAGE, SEND, CDISC) if available, and to extend them if needed.

Domain-driven fora

These toxicoinformatics- and ecoinformatics-specific initiatives are an example of international coordination for the development and adoption of controlled vocabularies and formats for exchanging chemical toxicity, and ecological and environmental data.

The Distributed Structure-Searchable Toxicity (DSSTox)

DSSTox (http://www.epa.gov/nheerl/dsstox) is a network project by the US EPA, providing a community forum for publishing standard format, structure-annotated chemical toxicity data files for open public access. Although the primary focus of this effort is the inclusion of chemical structures and standardized chemical fields, DSSTox will also promote the use of a controlled vocabulary, i.e. common data field names and entry formats for the same types of toxicity data across databases. It will link to such public toxicity data by incorporating DSSTox Standard Fields and Indices in the custom databases, making common queries possible using a standard DSSTox identifier. DSSTox is collaborating with, or using standards from, several other efforts, including the LeadScope In Silico Tox (LIST) Focus Group, the National Cancer Institute (NCI), NIEHS’s National Center for Toxicogenomics and the National Toxicology Program, the National Library of Medicine (NLM) TOXNET, the International Union of Pure and Applied Chemistry (IUPAC), the National Institute of Standards and Technology (NIST), the ILSI HESI SAR Toxicity Database Project and MGED’s MIAME/Tox, as well as numerous vendors and consortia (http://www.epa.gov/nheerl/dsstox/CoordinatingPublicEfforts.html).
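The idea of querying independently maintained databases through common field names and a shared identifier can be sketched as follows. This is a minimal illustration only: the field name `dsstox_id`, the identifier values and the records are hypothetical, and nothing here is taken from actual DSSTox files or field definitions.

```python
# Two independently maintained toxicity tables (hypothetical records).
# Both use the same field name, "dsstox_id", for the shared chemical identifier.
carcinogenicity_db = [
    {"dsstox_id": "ID-0001", "chemical_name": "chemical A", "tumour_finding": "positive"},
    {"dsstox_id": "ID-0002", "chemical_name": "chemical B", "tumour_finding": "negative"},
]
aquatic_toxicity_db = [
    {"dsstox_id": "ID-0001", "species": "fish", "lc50_mg_per_l": 3.2},
    {"dsstox_id": "ID-0003", "species": "fish", "lc50_mg_per_l": 41.0},
]


def query_by_identifier(identifier, **databases):
    """Return, for each named database, the records carrying the shared identifier."""
    return {
        name: [row for row in rows if row["dsstox_id"] == identifier]
        for name, rows in databases.items()
    }


hits = query_by_identifier(
    "ID-0001",
    carcinogenicity=carcinogenicity_db,
    aquatic_toxicity=aquatic_toxicity_db,
)
for name, rows in hits.items():
    print(name, rows)
```

Because both tables agree on the name and format of the identifier field, a single query function serves both; without that agreement, each database would need its own mapping step.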

The Science Environment for Ecological Knowledge (SEEK)

SEEK (http://seek.ecoinformatics.org) is a multidisciplinary initiative designed to create cyberinfrastructure for ecological, environmental and biodiversity research and to educate the ecological community about eco-informatics. SEEK participants are building an integrated data grid (EcoGrid) for accessing a wide variety of ecological and biodiversity data and analytical tools (Kepler; http://kepler-project.org). Ecological Metadata Language (EML) is a metadata specification developed in association with SEEK and the Knowledge Network for Biocomplexity (KNB; http://knb.ecoinformatics.org) that can be used in a modular and extensible manner to document ecological data.

World-wide organizations

Global organizations have initiated a dialogue between technological experts, regulators and the principal validation bodies to draw road maps for development, validation and regulatory use of omics-based technologies in chemical assessment. Others are liaising with different life sciences disciplines, offering support, mediation and consultancy to speed up the standards development process.

Organization for Economic Co-operation and Development (OECD) and the International Program on Chemical Safety (IPCS)

IPCS (http://www.who.int/ipcs/en/) is a joint program of three cooperating organizations (the International Labour Organization, the United Nations Environment Programme and the World Health Organization) implementing activities related to chemical safety. In collaboration with the Organization for Economic Co-operation and Development (OECD, http://www.oecd.org), the IPCS has organized a series of workshops to identify the possible application of methods based on (eco)toxicogenomics in regulatory hazard assessment, to determine the current limitations to the use of (eco)toxicogenomics in regulatory assessment and develop a plan to overcome such limitations, and to identify the need for future activities with regard to the use of these methods in test guidelines, new and existing chemicals, pesticides and biocides programs. At present, recommendations are being prepared and will be published. In view of these recommendations, the development of a coordinated international research program on (eco)toxicogenomics will be initiated, aiming to optimize the integration of genomic techniques into (eco)toxicology and their use in ecological and human health risk assessment.

The National Academy of Sciences (NAS)

The NAS Committee on Emerging Issues and Data on Environmental Contaminants (http://dels.nas.edu/emergingissues) is a public forum for communication among government, industry, environmental groups and the academic community about emerging evidence and issues in toxicogenomics, environmental toxicology, risk assessment and exposure assessment. The Committee will develop a framework for how the emerging field of genomics will be incorporated into risk assessment.

Institute of Electrical and Electronics Engineers (IEEE) Computer Society

The Bioinformatics Standards Committee (BSC; http://www.csbcon.org) has a mission to act as a liaison between the IEEE Standards Association and groups in the bioscience community that are developing standards for biological objects in the life sciences disciplines. BSC will provide a neutral forum for the global bioinformatics community to work towards common agreements on standards in new areas and integration between established standards.

Standard(s)-compliant infrastructure

This section provides a short review of public infrastructure currently available for toxicogenomics and environmental genomics data. These efforts are in different stages of development, serving specific needs of their user community and relying on diverse types of funding support. Nevertheless, these are examples of institutions working together, sharing expertise and moving towards an internationally compatible informatics platform for data exchange, interacting closely with standardization initiatives listed here.

ArrayExpress and Tox-MIAMExpress

ArrayExpress (http://www.ebi.ac.uk/arrayexpress) (Brazma et al., 2003) is an MGED standards-compliant, public infrastructure for microarray-based gene expression data at the EBI. The infrastructure has been extended to link biological endpoint values with gene expression data as a result of a collaborative undertaking with the ILSI HESI Committee on the Application of Toxicogenomics Data to Mechanism-based Risk Assessment (http://www.ebi.ac.uk/microarray/Projects/tox-nutri). Their toxicogenomics datasets (Pennie et al., 2004) have been submitted to ArrayExpress using Tox-MIAMExpress, the online MIAME/Tox-compliant data input tool (Mattes et al., 2004) (http://www.ebi.ac.uk/tox-miamexpress). The ILSI HESI Committee research programme has provided the first large array-based toxicogenomics dataset in the public domain annotated according to the MGED standards.

Chemical Effects in Biological Systems (CEBS) Knowledgebase

CEBS (http://cebs.niehs.nih.gov) (Waters et al., 2003) is a public toxicogenomics knowledgebase in year two of its 10-year development at the NIEHS’s NCT. CEBS aims to integrate omics datasets in the context of toxicology to advance knowledge discovery about toxicity (Waters et al., 2003; Waters and Fostel, 2004; Mattes et al., 2004). CEBS implements standards developed by the MGED Society and the HUPO PSI in the CEBS SysBio object model (Xirasagar et al., 2004). CEBS is designing an ontological representation of data and terms used by its collaborators, which includes descriptors for different study design types and metadata vocabularies.

maxd

maxd (http://bioinf.man.ac.uk/microarray/maxd) is an open-source data warehouse and visualization environment for genomic expression data employed by the NERC EGTDC. The maxd software suite includes two major components. The first, maxdLoad2, is a database schema and data loading and curation application designed to enable biologists to store expression data, annotate it to MIAME and MIAME/Env standards, and export it in MAGE-ML format to ArrayExpress. The second, maxdView, is a modular analysis and visualization environment for interactive exploration of transcriptomics data and associated metadata.

Toxicoinformatics Integrated System (TIS)

ArrayTrack (http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack; Tong et al., 2003) is an integrated software system for managing, mining and visualizing microarray gene expression data at NCTR-FDA. The system has three integrated components: a MIAME-compliant database storing array-based toxicogenomics data; a set of tools providing data visualization and analysis capability; and a library containing functional information about genes, proteins, pathways and toxicants. ArrayTrack is the first module of TIS, a system to integrate genomic, proteomic and metabonomic data with data from the public repositories, as well as conventional in vitro and in vivo toxicology data. TIS will serve as a general toxicogenomics repository for diverse data sources, supporting broad data mining and meta-analysis activities, as well as the development of robust and validated predictive toxicology systems.

The Comparative Toxicogenomics Database (CTD)

The CTD (http://ctd.mdibl.org) promotes understanding about the effects of environmental chemicals on human health by facilitating cross-species comparative studies of toxicologically important genes and proteins. CTD is now publicly available as a prototype. It provides annotated associations between genes, proteins, sequences, references and chemicals in vertebrates and invertebrates; integrates molecular and toxicology data; implements ontologies; and will describe gene–chemical interactions in diverse organisms. These data provide insight into the genetic basis of variable sensitivity to chemicals and complex interactions between the environment and human health.

Conclusions

Data produced by (eco)toxicogenomics investigations are growing in volume and complexity at a staggering rate. It is not trivial to define precise data content, presentation and exchange formats. However, there is a growing realization within the (eco)toxicogenomics community that, if we are to realize the opportunities offered by omics-based technologies, we will need to change our approach to data handling and work more collaboratively. The authors, also moderators of the RSBI working group, would like to emphasize the need for community participation in the integration of these standardization initiatives. It is hoped that highlighting these different initiatives will help to assess the commonality and optimize harmonization, thus minimizing duplication and incompatibility and achieving cost-effective results in a timely manner.

Acknowledgements

The authors would like to thank the following people for their assistance: Alvis Brazma, Ann Richard, Betty Cheng, Carole Foy, Carolyn Mattingly, Chris Taylor, Craig Zwickl, Dawn Field, Helen Parkinson, Henning Hermjakob, Jason Snape, Jessie Kennedy, John Lindon, Michael Waters, Nancy Doerrer, Peter Lord, Raffaella Corvi, Robert Stevens, Syril Petit, Thomas Papoian, Weida Tong and the SOFG organizers. Susanna-Assunta Sansone is supported by the ILSI-HESI Genomics Committee, Norman Morrison by the NERC Environmental Genomics programme, Philippe Rocca-Serra by the European Commission NuGO project and Jennifer Fostel by the NIEHS NCT.

References

ArrayExpress; http://www.ebi.ac.uk/arrayexpress

ArrayTrack; http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack

Ball CA, Sherlock G, Parkinson H, et al. 2002. An open letter to the scientific journals, published in: Science 298(5593): 539; Bioinformatics 18(11): 1409; Lancet 360: 1019.

Brazma A, Parkinson H, Sarkans U, et al. 2003. ArrayExpress - a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 31(1): 68–71.

Brazma A, Hingamp P, Quackenbush J, et al. 2001. Minimum information about a microarray experiment (MIAME) - toward standards for microarray data. Nature Genet 29(4): 365–371.

BSC; http://www.csbcon.org

CDISC; http://www.cdisc.org

CEBS; http://cebs.niehs.nih.gov

CTD; http://ctd.mdibl.org

DSSTox Coordinating Public Effort project; http://www.epa.gov/nheerl/dsstox/CoordinatingPublicEfforts.html

DSSTox; http://www.epa.gov/nheerl/dsstox

EBI toxicogenomics; http://www.ebi.ac.uk/microarray/Projects/tox-nutri

EBI; http://www.ebi.ac.uk

ECVAM; http://ecvam.jrc.cec.eu.int

Editorial. 2002. Microarray standards at last. Nature 419: 323.

Editorial. 2003a. EU starts a chemical reaction. Science 300(5618): 405.

Editorial. 2003b. Europe whittles down plans for massive chemical testing program. Science 302(5647): 969.

Edgar R, Domrachev M, Lash AE. 2002. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 30(1): 207–210.

EPA DRAFT - Potential implications of genomics for regulatory and risk assessment applications at EPA; http://www.epa.gov/osa/genomics.htm

ERCC; http://www.cstl.nist.gov/biotech/workshops/ERCC2004

FDA draft guidance for industry pharmacogenomic data submissions; http://www.fda.gov/cder/guidance/5900dft.doc

Hermjakob H, Montecchi-Palazzi L, Bader G, et al. 2004. The HUPO PSI molecular interaction format - a community standard for the representation of protein interaction data. Nature Biotechnol 22: 177–183.

ICCVAM; http://iccvam.niehs.nih.gov/home.htm

Ikeo K, Ishi-i J, Tamura T, et al. 2003. CIBEX: center for information biology gene expression database. C R Biol 326(10–11): 1079–1082.

ILSI HESI; http://hesi.ilsi.org/index.cfm?pubentityid=120

IPCS; http://www.who.int/ipcs/en

Kepler; http://kepler-project.org

KNB; http://knb.ecoinformatics.org

LGC; http://www.lgc.co.uk

MARG; http://www.abrf.org/index.cfm/group.show/Microarray.30.htm

Mattes WB, Pettit SD, Sansone A, et al. 2004. Database development in toxicogenomics: issues and efforts. Environ Health Perspect 112: 495–505.

maxd; http://bioinf.man.ac.uk/microarray/maxd

MfB; http://www.mfbprog.org.uk

MGED RSBI Working Groups; http://www.mged.org/Workgroups/rsbi

MGED; http://www.mged.org

NAS Committee on Emerging Issues and Data on Environmental Contaminants; http://dels.nas.edu/emergingissues/index.asp

NCTR-FDA; http://www.fda.gov/nctr

NERC EGTDC; http://envgen.nox.ac.uk

NICEATM; http://iccvam.niehs.nih.gov/home.htm

NIEHS-NCT; http://www.niehs.nih.gov/nct

OECD; http://www.oecd.org

Pennie W, Pettit SD, Lord PG. 2004. Toxicogenomics in risk assessment: an overview of an HESI collaborative research program. Environ Health Perspect 112: 417–419.

PSI; http://psidev.sourceforge.net

Quackenbush J. 2004. Data standards for ‘omic’ science. Nature Biotechnol 22: 613–614.

SEEK; http://seek.ecoinformatics.org

SEND; http://www.cdisc.org/models/send/v1.5

SMRS; http://www.smrsgroup.org

SOFG; http://www.sofg.org

Spellman PT, Miller M, Stewart J, et al. 2002. Design and implementation of microarray gene expression mark-up language (MAGE-ML). Genome Biol 3(9): research0046.

Stoeckert CJ, Parkinson H. 2003. The MGED ontology: a framework for describing functional genomics experiments. Comp Funct Genom 4: 127–132.

Stoeckert CJ, Causton HC, Ball CA. 2002. Microarray databases: standards and ontologies. Nature Genet 32: 469–473.

Tong W, Cao X, Harris S, et al. 2003. ArrayTrack - supporting toxicogenomic research at the U.S. Food and Drug Administration National Center for Toxicological Research. Environ Health Perspect 111: 1819–1826.

Tox-MIAMExpress; http://www.ebi.ac.uk/tox-miamexpress

Waters M, Boorman G, Bushel P, et al. 2003. Systems toxicology and the chemical effects in biological systems knowledge base. Environ Health Perspect 111: 811–824.

Waters MD, Fostel JM. 2004. Toxicogenomics and systems toxicology: aims and prospects. Nature Rev Genet 5: 938–948.

Xirasagar S, Gustafson S, Merrick AB, et al. 2004. CEBS object model for systems biology data, CEBS SysBio-OM. Bioinformatics 20(13): 2004–2015.
