Bioinformatics and data knowledge: the new frontiers for nutrition and foods

Frank Desiere,*,† Bruce German,*,§ Heribert Watzke,* Andrea Pfeifer* and Sam Saguy¶

*Nestlé Research Center, PO Box 44, 1000 Lausanne 26, Switzerland (tel: +41-21-785-8054; fax: +41-21-785-8925; e-mail: [email protected])
¶The Institute of Biochemistry, Food Science and Nutrition, Faculty of Agriculture, Food and Environmental Quality Sciences, The Hebrew University of Jerusalem, PO Box 12, Rehovot 76100, Israel
§Department of Food Science and Technology, University of California, Davis, California, 95616, USA
†Corresponding author.

Viewpoint. Trends in Food Science & Technology 12 (2002) 215–229. 0924-2244/02/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0924-2244(01)00089-9

The recent publication of the Human Genome poses the question: how will genome technologies influence food development? Food products will be very different within the decade, with considerable new values added as a result of the biological and chemical data that bioinformatics is rapidly converting to usable knowledge. Bioinformatics will provide details of the molecular basis of human health. The immediate benefits of this information will be to extend our understanding of the role of food in the health and well-being of consumers. In the future, bioinformatics will impact foods at a more profound level, defining the physical, structural and biological properties of food commodities, leading to new crops, processes and foods with greater quality in all aspects. Bioinformatics will improve the toxicological assessment of foods, making them even safer. Eventually, bioinformatics will extend the already existing trend of personalized choice in the food marketplace to enable consumers to match their food product choices with their own personal health. To build this new knowledge and to take full advantage of these tools there is a need for a paradigm shift in assessing, collecting and sharing databases, in developing new integrative models of biological structure and function, in standardized experimental methods, in data integration and storage, and in analytical and visualization tools. © 2001 Elsevier Science Ltd. All rights reserved.

Introduction

Bioinformatics and genomics are rapidly expanding fields and in a matter of months have become a crucial technology in Life Science Research. Bioinformatics and knowledge integration have played and will continue to play an enabling role in Food Research, integrating the massive amounts of data that are generated through new genome-wide experimental procedures with other, more traditional techniques.

Bioinformatics is defined as: "Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral, health and nutrition data, including those to acquire, store, organize, archive, analyze, visualize or build biological knowledge from very large and traditionally unrelated sources". It is about to revolutionize biological research and, more importantly, to apply this research to the human condition. With the availability of the human genome, the completion of the rice genome, the mapping and sequencing of other major crop plants and the publicly available complete genome sequences of an ever-growing number of micro-organisms (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/org.html), Bioinformatics has, out of necessity, become a key aspect of Life Science Research and Food Research. Bioinformatics is essentially a cross-disciplinary activity which includes aspects of computer science, software engineering and molecular and physiological biology.

Although database management seems to be the major task, bioinformatics goes much deeper; it provides possible gene functions and cellular roles of molecular entities, new theoretical frameworks for complex biological systems and new biological hypotheses for wet-lab research. The combination of genomic data, information technology and other advanced research tools will give biologists the opportunity to think more broadly: to investigate not only the workings of a single gene, but to study all of the elements of a complex biological system at the same time. In the future, the starting point for a biological investigation will still be the generation of a hypothesis, but that hypothesis will first be tested theoretically, by modeling and polling existing databases. A scientist will begin with a theoretical conjecture, test it on existing data and only then turn to experiment as a last, not first, resort.

The same knowledge doctrine is applicable to food science. Food science is a coherent and systematic body of knowledge and understanding of the nature and composition of food biomaterials, and their behavior under the various conditions to which they may be subject. Food technology is the application of food science to the practical treatment of food materials so as to convert them into food products of the kind, quality and stability, and packaged and distributed, so as to meet the needs of consumers for safe, wholesome, nutritious and attractive foods (http://www.ifst.org/fst.htm).

In this respect, food science integrates the knowledge of several sciences. It includes the knowledge of the chemical composition of food materials, their physical, biological and biochemical properties and behaviors as well as human nutritional requirements and the nutritional and trophic factors in food materials; the nature and behavior of enzymes; the microbiology of foods; the interaction of food components with each other, with additives and contaminants, and with packaging materials; the pharmacology and toxicology of food materials; and the effects of various manufacturing operations, processes and storage conditions. Thus, food science is an information-based science which integrates knowledge from widely disparate sources.

The research focus in the food industry is directed by the consumer's need for high quality, convenient, tasty, safe and affordable food. The scientific advances in genome research and their biotechnological exploitation alike represent unique opportunities to enhance food performance and to build sound scientific knowledge about its multiple functionalities. In the era before bioinformatics and genomics, biological effects were measurable only according to markers for specific conditions (e.g. nutrient deficiencies and impairment of health). Research was therefore targeted solely to consumer health problems such as high blood pressure, high cholesterol, lactose intolerance, osteoporosis and diabetes. As our biological knowledge develops in this new era, metabolic conditions consistent with improvements in health will be the new markers (Watkins, Hammock, Newman, & German, 2001). This knowledge will allow intervention through foods to prevent health problems long before deleterious effects are apparent, and the consumer will finally take advantage of the technological breakthroughs in these areas, which will yield healthy, high quality foods with positive nutritive properties. This is just a part of the promise of how new scientific knowledge of food, gained and made available through bioinformatics, will influence the everyday lives of consumers.

Information and computer technology

Bioinformatics is absolutely dependent on integrated and mature software solutions, which are available through electronic telecommunications to the individual scientist (Table 1). With the massive computing power of modern computer systems we are facing fewer and fewer limitations in storage space and calculation time; the only limiting factor is becoming the lack of information on specific topics.

Applications and examples in the food industry

Food-grade organisms like bacteria, molds and yeasts are the basis for a variety of biologically based industrial food processes (Kuipers, 1999). The fast growing number of complete genomic sequences of organisms relevant to food research (Table 2) promotes the rapid increase in valuable knowledge that can be used in many different areas such as metabolic engineering, improvement of cells as microprocess factories and the development of novel preservation methods. Bioinformatics will hasten the development of novel risk assessment procedures (Fig. 1). Furthermore, genomic knowledge of bacteria and other microorganisms will revolutionize pre- and probiotic research, making it possible to characterize the broad range of bacterial properties, from growth to stress responses to multi-species microbial ecology within the human host.

Table 1. Several bioinformatics resources (a)

Bioinformatics companies

| Company | URL | Products | Areas |
|---|---|---|---|
| Affymetrix | www.affymetrix.com | Gene Chip Data Mining Tool | Micro-array analysis |
| Applied Biosynthesis | www.appliedbiosynthesis.com | BioMerge Server, BioLIMS | Genetic analysis system, LIMS |
| Axon Instruments Inc | www.axon.com | GenePix Pro 3.0 | Micro-array analysis |
| Biodiscovery GeneSight | www.biodiscovery.com | GeneSight | Micro-array analysis |
| Biomax Informatics GMBH | www.biomax.de | BioRS; Pedant-Pro; HarvESTer | Databases; Bioinformatics analysis; EST-clustering |
| Compugen Inc. | www.cgen.com | Z3; LEADS; Gencarta | 2D-GE analysis; Expression analysis; database |
| Doubletwist.com | www.doubletwist.com | Prophecy; GeneForest; Clustering Alignment Tools (CAT) | Human genome DB; DB of expressed genes; EST-clustering |
| Genomica | www.genomica.com | LinkMapper; Discovery Manager | Information management |
| Hitachi Genetic Systems | www.miraibio.com | DNASIS; ChipSpace; DNASpace | Mol-bio analysis application; Expression analysis; Bioinformatics analysis |
| IBM | www-4.ibm.com/software/data | DB2 | DB management |
| Incyte Genomics | www.incyte.com | LifeExpress, GEMTools; LifeArray; LifeSeq Gold | Bioinformatics tools; Human genome database; Gene-expression microarrays |
| Informax | www.informax.com | GenoMax; Vector NTI Suite | Bioinformatics tools; Mol-bio tools |
| Integrated Genomics Inc. | www.integratedgenomics | WITpro, MPW, MicroAce(TM) | Sequencing, genome analysis, metabolic design |
| Lion Bioscience | www.lionbioscience.com | bioSCOUT; arraySCOUT; genomeSCOUT; SRS; ArrayTAG; arrayBase | Bioinformatics tools; Expression analysis; Genome comparisons; DB management; cDNA; DB of annotated cDNA |
| Molecular Mining Corp. | www.molecularmining.com | GeneLinker | Expression analysis |
| Packard Biochip Technologies | www.packardbiochip.com | QuantArray Windows | Expression analysis |
| Celera Paracel Inc | www.paracel.com | GeneMatcher; CAP4; GeneWise | Hardware accelerator; EST-clustering; Bioinformatics tools |
| Rosetta Inpharmatics | www.rii.com | Rosetta Resolver | Expression analysis |
| Silicon Genetics | www.sigenetics.com | Gene Spring; Allele Sorter | Expression analysis, DB; SNP analysis |
| Silicon Graphics Inc. | www.sgi.com | MineSet | Data-mining |
| Spotfire Inc. | www.spotfire.com | Spotfire.net; Spotfire Array | Data-mining; Expression analysis |

Commercial bioinformatics web-portals

| Company | Tool | URL | Description |
|---|---|---|---|
| Ebioinformatics Inc. | Bionavigator | www.bionavigator.com | Over 200 bioinformatics tools, more than 20 databases, access to GCG |
| Doubletwist.com | Doubletwist.com | www.doubletwist.com | Integrated Genomics portal, access to an annotated Human Genome sequence, research agents with many bioinformatics tools |
| Incyte | IncyteGenomics | www.incyte.com | LifeSeq-ZooSeq-sequence DBs and bioinformatics |

(Continued below)

Metabolic pathway reconstruction

Microbial metabolism has been the basis of a major segment of food processing for centuries. Fermentation of food takes advantage of the ability of desirable microbes to convert substrates (usually carbohydrates) to organic tailor-made compounds contributing to the flavor, structure, texture, stability and safety of the food product. Due to its fundamental importance to such a wide variety of foods from breads to cheeses, wines to sausage, literally over a century of research has focused on understanding microbial metabolism. The potential to build this knowledge into even greater value in foods has been dramatically expanded by the availability of tools to understand and control microbial metabolism using modern genomic and bioinformatic approaches. The production of diacetyl, alanine and ethanol from this sugar metabolism has already been engineered in lactic acid bacteria. With the metabolic reaction network established, it becomes possible to determine its underlying pathway structure by pathway models (Schilling & Palsson, 2000). An important approach to a holistic look at such biological processes uses genomic information to reconstruct entire metabolic pathways. The integration of the extensive information on metabolic pathways available in the literature and databases (as in KEGG (http://www.genome.ad.jp/kegg/), EcoCyc (http://ecocyc.doubletwist.com/ecocyc/), WIT (http://wit.integratedgenomics.com/IGwit)) with the genomic sequences of bacteria and eventually with stoichiometric models will deliver tools to describe cellular processes in detail and to link genotype and phenotype. The matching of well annotated genes and their expression level from a new organism with a collection of known metabolic pathways from databases is already feasible today. However, the inclusion of kinetic information, which is indispensable to describing the dynamic evolution of these models, remains extremely complex. Beyond that, many of the transcription, regulation and enzymatic control pathways are not well understood. As the knowledge increases in these areas, metabolic reconstruction models will become more important in studying the dynamic response of cells to external stimuli.
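The two ideas in this section, matching annotated genes against known pathways and checking a stoichiometric model, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of KEGG, EcoCyc or WIT: the pathway names, the small EC-number sets and the three-reaction network are all hypothetical stand-ins chosen for the example.

```python
# Toy pathway reconstruction. All pathway definitions below are illustrative
# assumptions, not records taken from KEGG, EcoCyc or WIT.

# Hypothetical reference pathways: name -> set of enzyme (EC) numbers required.
REFERENCE_PATHWAYS = {
    "glycolysis_fragment": {"2.7.1.11", "4.1.2.13", "1.2.1.12", "2.7.1.40"},
    "lactate_fermentation": {"1.1.1.27"},
    "diacetyl_branch": {"2.2.1.6", "4.1.1.5"},
}


def pathway_coverage(annotated_ecs, pathways=REFERENCE_PATHWAYS):
    """Score each reference pathway by the fraction of its enzymes annotated."""
    annotated = set(annotated_ecs)
    report = {}
    for name, required in pathways.items():
        found = required & annotated
        report[name] = {
            "coverage": len(found) / len(required),
            "missing": sorted(required - annotated),
        }
    return report


def is_steady_state(S, v, tol=1e-9):
    """Return True if S * v = 0 for every internal metabolite (row of S)."""
    return all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol for row in S
    )


# Hypothetical network: uptake (-> A), conversion (A -> B), secretion (B ->).
# Rows are metabolites A and B; columns are the three reactions.
S = [
    [1, -1, 0],
    [0, 1, -1],
]

# Enzymes annotated in an imaginary newly sequenced genome.
report = pathway_coverage({"1.1.1.27", "2.7.1.11", "4.1.2.13"})
balanced = is_steady_state(S, [2, 2, 2])      # fluxes balance at A and B
unbalanced = is_steady_state(S, [2, 1, 1])    # A accumulates, so not steady
```

The coverage report shows which pathways a genome annotation can already account for and which enzymes are still missing. The steady-state check captures only the mass-balance constraint of a stoichiometric model; as the text notes, kinetic information would be needed to describe dynamics.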

Plants

Plant genome research will provide the knowledge to increase the success of genetics and breeding to produce plants of interest for the food industry. Major objectives of plant research are to improve the raw materials of the food supply for higher-quality, better processability, lower cost and safer food. The nutritional health and well-being that plant based foods provide is traditionally (DellaPenna, 1999) dominated by their provision of essential vitamins and minerals, and only recently has the potential of a number of other health-promoting phytochemicals been recognized to be valuable in the daily diet. Genome sequencing projects are providing novel approaches for identifying plant biosynthetic genes of more specific health importance. Genome research can therefore directly be used to increase the efficiency and effectiveness of breeding for improvement of plants. Biotechnology, accelerated by genomics and bioinformatics, will increase the quality of food, reducing all aspects of the cost including the impact of food crop production on the environment.

Cocoa (Theobroma cacao; Fig. 2), as an example, is the raw material for all chocolate containing foods and drinks. The breeding and selection of higher quality beans with superior flavor characteristics has been difficult in the past, since the trees must be maintained at least 3–5 years before the cacao bean can be harvested and analyzed. With the establishment of DNA fingerprinting technologies for screening plant collections, RFLP markers for the detection of genotypic relationships between breeds or species and the determination of more than 300 molecular markers, breeding programs have been greatly enhanced. The future availability of EST sequences and genome comparisons to other sequenced plants, which rely heavily on bioinformatic tools, will result in a further acceleration, with the possibility to select for desired traits in an early stage of plant development based on the genotype and the phenotype (Pridmore et al., 2000).

Table 1 (continued)

Commercial bioinformatics web-portals

| Company | Tool | URL | Description |
|---|---|---|---|
| (Incyte, continued) | | | OnLine Research Tools, LifeExpress expression DB |
| Compugen | LabOnWeb.com; Z3OnWeb.com | www.labonweb.com; www.2dgels.com | Bioinformatics tools and genome, transcriptome and proteome DBs, access to PathoGenome |
| Celera | Celera Discovery System | www.celera.com | Access the Celera Human genome sequence, many bioinformatics tools |

Free bioinformatics resources

| Resource | URL |
|---|---|
| EMBL | www.embl-heidelberg.de/ |
| CMS Molecular Biology Resource | www.sdsc.edu/restools |
| National Centre for Biotechnology Information (NCBI) | www.ncbi.nlm.nih.gov |
| European Bioinformatics Institute (EBI) | www.ebi.ac.uk |
| ExPASy | www.expasy.ch/ |
| The Institute of Genomic Research (TIGR) | www.tigr.org |
| UK Human Genome Mapping Project Resource Centre | www.hgmp.mrc.ac.uk/ |
| Weizmann Institute of Science | http://bioinformatics.weizmann.ac.il/ |
| Whitehead Institute | http://www-genome.wi.mit.edu/ |
| MIPS | www.mips.biochem.mpg.uk |
| The Sanger Centre | www.sanger.ac.uk |
| GOLD: Genomes OnLine Database | http://wit.integratedgenomics.com/GOLD/ |

Food Research related public bioinformatics sites

| Resource | URL |
|---|---|
| USDA Biotechnology Information Centre | www.nal.usda.gov/bic/ |
| UK Crop Plant Bioinformatics Network (UK CropNet) | http://ukcrop.net/ |
| The USDA-ARS Centre for Bioinformatics and Comparative Genomics | http://ars-genome.cornell.edu/ |

(a) The selection of companies and web-links is not exhaustive and is not an endorsement of the entities mentioned. These resources represent the current status. Due to the dynamic nature of bioinformatics, they may change rapidly.

Implication of genomics/bioinformatics for health and nutrition

Genomics, enabled by bioinformatics, will contribute to an improved understanding of the molecular mechanisms underlying the relationships between food and health, from basic nutrient actions to the interactions between food microorganisms and the human intestinal system, including the gut and immunocompetent cells, and the mechanisms underlying the interactions of the microbial community in the intestinal tract (German, Schiffrin, Reniero, Mollet, Pfeifer, & Neeser, 1999). With the recent explosion of genome data, including genomics, transcriptomics, proteomics, metabolomics and structural genomics, bioinformatics is addressing the task of developing computational methods to deal with the massive flows of data emerging from modern experimental approaches in relating genotype to phenotype (Lee & Lee, 2000). The approaches include functional and comparative genomics and high-throughput technologies such as genome sequencing and DNA microarrays. The knowledge developed from this new science will expand nutrition in three dimensions, mechanism, human variation and time: the genetic mechanisms underlying health, the basis of individual variations in metabolism and the time scales during which diet influences metabolism.

The scientific knowledge of both the genetic variation amongst humans and the response of individual genes to ingested molecules (drugs, foods and toxins) is growing exponentially as a result of the arrival of the human genome and the tools of functional genomics (DNA arrays, etc.). This explosion of information is only being converted into usable knowledge because of the arrival of massive computing power and the bioinformatic tools needed to apply it to the large data sets being generated by nutrition-related research. This knowledge will not only drive a new generation of foods with additional values but also change dramatically the ability of foods to influence individual quality of life. This knowledge also promises to drive a new value system for agriculture itself.

Table 2. Genome projects of organisms interesting for the food industry (a)

| Group | Organism | Genome size (Mbp) (b) | Status |
|---|---|---|---|
| Spoilage/pathogens | Bacillus anthracis | 4.5 | progr. |
| Spoilage/pathogens | Bacillus stearothermophilus | 10 | progr. |
| Spoilage/pathogens | Candida albicans | 15 | progr. |
| Spoilage/pathogens | Campylobacter jejuni | 1.641 | published |
| Spoilage/pathogens | Clostridium acetobutylicum | 4.1 | progr. |
| Spoilage/pathogens | Enterococcus faecalis | 3 | progr. |
| Spoilage/pathogens | Escherichia coli O157:H7 | 4.1 | published |
| Spoilage/pathogens | Helicobacter pylori | 1.667 | published |
| Spoilage/pathogens | Listeria innocua | 3.2 | progr. |
| Spoilage/pathogens | Listeria monocytogenes | 2.9 | completed |
| Spoilage/pathogens | Mycobacterium bovis | 4.4 | progr. |
| Spoilage/pathogens | Mycobacterium leprae | 3.2 | published |
| Spoilage/pathogens | Mycobacterium tuberculosis | 4.411 | published |
| Spoilage/pathogens | Pseudomonas aeruginosa | 6.264 | published |
| Spoilage/pathogens | Pseudomonas putida | 6.1 | progr. |
| Spoilage/pathogens | Salmonella enteritidis | 4.5 | progr. |
| Spoilage/pathogens | Salmonella paratyphi A | 4.6 | progr. |
| Spoilage/pathogens | Salmonella typhi | 4.5 | progr. |
| Spoilage/pathogens | Salmonella typhimurium | 4.5 | progr. |
| Spoilage/pathogens | Shewanella putrefaciens | 4.5 | progr. |
| Spoilage/pathogens | Shigella flexneri | 4.7 | progr. |
| Spoilage/pathogens | Staphylococcus aureus | 2.8 | published |
| Spoilage/pathogens | Staphylococcus epidermidis | 2.4 | progr. |
| Spoilage/pathogens | Streptococcus mutans | 2.2 | progr. |
| Spoilage/pathogens | Streptococcus pneumoniae | 2 | completed |
| Spoilage/pathogens | Streptococcus pyogenes | 1.8 | published |
| Spoilage/pathogens | Thermus thermophilus | 1.8 | progr. |
| Spoilage/pathogens | Vibrio cholerae | 4 | published |
| Food-grade | Aspergillus nidulans | 29 | progr. |
| Food-grade | Bacillus subtilis | 4.20 | published |
| Food-grade | Lactobacillus acidophilus | 1.9 | progr. |
| Food-grade | Lactobacillus sp. | ~2 | progr. |
| Food-grade | Lactococcus lactis | 2.365 | published |
| Food-grade | Saccharomyces cerevisiae | 12.069 | published |
| Food-grade | Streptococcus thermophilus | ~2 | progr. |
| Others | Arabidopsis thaliana (thale cress) | 115.428 | published |
| Others | Bos taurus (Cattle) | | Mapping |
| Others | Canis familiaris (Dog) | | Mapping |
| Others | Felis catus (Cat) | | Mapping |
| Others | Glycine max (Soybean) | | Mapping |
| Others | Homo sapiens (Human) | 3200 | published |
| Others | Mus musculus (Mouse) | | Progr. |
| Others | Oryza sativa (Rice) | 450 | finished |
| Others | Phaseolus vulgaris (Bean) | | Progr. |
| Others | Rattus norvegicus (Rat) | | Progr. |
| Others | Solanum tuberosum (Potato) | | EST-sequencing |
| Others | Triticum aestivum (Wheat) | | Mapping |
| Others | Zea mays (Maize) | | Mapping |

(a) This table represents the current status. Due to the dynamic nature of bioinformatics it may change rapidly.
(b) Mbp, number of mega base pairs; progr., project in progress.

Fig. 1. Electron micrograph of Streptococcus thermophilus (oval chains) and Lactobacillus johnsonii (rod-like chains) cells used for starter cultures in food fermentations.

Fig. 2. Example of a Cacao plant (Theobroma cacao L.) in natural form as fruits, as beans and finally as ground powder. Cacao trees must be maintained approx. 3–5 years before harvesting the cacao. Selection of specific traits based on genotype in the early development of the plant is therefore highly desirable.

Fig. 3. The perceived food qualities are driven by flavors and texture. Both are composite events whose disparate elements show specific interactions. While the elements can be controlled separately, only understanding the underlying neuro-physiological processes will lead to optimizing the flavor and texture impact of foods.

Fig. 4. Food production is based on biological raw materials which are refined into food ingredients. A unifying approach is proposed on the basis of common basic and material properties of the comprising molecules in both domains. Moreover, the vast store of knowledge currently being produced by the biomical sciences (genomics, proteomics, metabolomics) will improve the knowledge on ingredient characteristics and behaviours.

Genetic responsiveness or gene expression

The ability of nutrients to directly control the expression of particular genes is at the heart of a new generation of nutritional science, allowing researchers to apply genomic information to technologies that can quantify the amount of actively transcribing genes in any cell at any time (e.g. gene expression arrays). With this technology in place, scientists of every biological discipline are discovering the interaction between organisms and their environment with an intimacy never thought possible. Nutrition is, at its heart, a multidisciplinary field focusing on integrative metabolism of animals and humans. Nutritionists have strived for the last century to deduce the mechanistic basis of the apparent strong relationship between diet and health through understanding the interaction of nutrients with metabolic pathways. Needless to say, this was a daunting task with the traditional tools of reductionist biochemistry. Most nutrients affect a wide range of biochemical pathways. The net result is that nutrients exert multiple effects: pleiotropic dysfunctions in their relative absence, i.e. deficiencies, and pleiotropic benefits in their return to appropriate, optimal intakes. Reductionist biochemical approaches describe very well the effects of a single nutrient's interaction with a single target; however, they fail to adequately explore the multiplicity of metabolic effects on the entire organism. The perspective of modern genomics is ostensibly the reverse (expansionist) approach: to measure everything. Genomic-based investigations do not avoid pleiotropic behavior of exogenous nutrients; quite the contrary, they reveal it. The goal of differential gene expression array experiments is to describe the full spectrum of transcriptional responses to any variable, including nutrients. Such global experimental designs are only possible due to the advent of bioinformatic tools to adequately manage and analyze the sheer volume of data that are produced.

With the arrival of broadly parallel assessment tools including gene expression arrays and metabolomics, single biomarkers of disease risk will no longer be considered useful (Watkins et al., 2001). Since it will be as straightforward to measure the expression of 30,000 genes as the expression of one gene, knowledge from expression profiling will impact health assessment. It is equally certain that the days of building dossiers of efficacy and safety based on a single metabolic endpoint, e.g. cholesterol, are limited. Such comprehensive knowledge of the effects of discrete food and nutritional variables on overall metabolism will add new understanding to their health value.
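A differential expression comparison of the kind described above can be illustrated with a short Python sketch. It assumes one expression value per gene in a control and a nutrient-treated sample and ranks genes by log2 fold change; the gene names, the values and the pseudocount are hypothetical, and a real array analysis would add normalization, replicates and statistical testing that are left out here.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {gene: log2((treated + p) / (control + p))} for shared genes.

    The pseudocount p (an assumption of this sketch) avoids division by zero
    for genes with no measured signal.
    """
    shared = control.keys() & treated.keys()
    return {
        gene: math.log2(
            (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        )
        for gene in shared
    }


def responsive_genes(fold_changes, threshold=1.0):
    """Genes with |log2 fold change| >= threshold, largest change first."""
    hits = [
        (gene, lfc) for gene, lfc in fold_changes.items()
        if abs(lfc) >= threshold
    ]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical expression intensities (arbitrary units) for four genes.
control = {"geneA": 99, "geneB": 49, "geneC": 199, "geneD": 7}
treated = {"geneA": 399, "geneB": 49, "geneC": 24, "geneD": 7}

changes = log2_fold_changes(control, treated)
hits = responsive_genes(changes)
```

In this toy data set two of the four genes respond to the treatment, one induced and one repressed, which is the pleiotropic pattern the text argues a single-endpoint assay would miss. The same two functions apply unchanged to tens of thousands of genes, which is where data management becomes the limiting step.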

Genetic variability

With the genome of one 'individual' human completed, the effective technologies to establish variations from that single genome are being implemented. The Single Nucleotide Polymorphisms (SNP) Consortium (http://snp.cshl.org/) is mapping the polymorphic regions of the genome that control individual phenotypic differences among the population (Sachidanandam et al., 2001). While these variations are being viewed initially as the key to the discovery of genetic diseases, they are also the keys to individual variation in diet and health. Sequence variation in particular genes, even as slight as single nucleotides, can influence the quantitative need for and physiological response to various nutrients. Knowing that genes influence nutrition is, of course, not new. An understanding of this variation is inherent in population recommendations for essential nutrients (Young & Scrimshaw, 1979). However, allowing for the variation in human genetics by incorporating a large margin for error in quantitative recommendations is not the same as designing diets for specific individuals according to their genetic profiles (Eckhardt, 2001; Nichols, 2000). An example of polymorphisms that influence nutrition and disease is phenylketonuria, in which the inability to metabolize phenylalanine renders this nutrient toxic (Lindee, 2000). The occurrence of lactose intolerance is due to polymorphisms both in the structure of the lactase gene that produce a dysfunctional enzyme and in regulatory regions of the genome that prevent perfectly functional lactase enzyme from being produced in adults (Harvey et al., 1998). With genomics will come the knowledge of the integrative nature of multiple genes in predicting health. The potential opportunity of bioinformatics to deliver that knowledge to the individual consumer will eventually lead to individualized dietary choices in the hands of the consumer. This bold future is arriving because of bioinformatic tools capable of managing the volume of data implied by quantitatively assessing individual metabolism and intervening in that individual's metabolism using foods to improve their health.

Genomic and bioinformatic tools will improve human clinical research. Historically, many nutrition trials failed to find statistically significant effects of various nutrients and food choices not because there was no benefit, but because the magnitude of the benefit was small relative to the overall variability in a sample of humans chosen at random from the population. Humans do not respond homogeneously to even the most straightforward nutritional variables. A great value of genotyping individuals in clinical trials is to begin to assign the variation of the population to specific genetic differences. Clinical and epidemiological trials are now being analyzed using SNP data as independent input variables (Takeoka et al., 2001). Most clinical trials are already cataloguing the SNPs of genes whose variation in function has been shown to be important to the endpoint measures of these trials, for example cancer, autoimmunity and heart disease (Marth et al., 2001). Such 'data-mining' approaches have been successful not only in identifying the causes of statistical variation among trial participants but also in identifying the potential biochemical mechanisms responsible for the variation in response. This approach is already proving so powerful that scientific agencies are recognizing that traditional avenues of scientific publishing are not adequate, and the processes of scientific discovery of genetic polymorphism and health are accelerated by the availability of SNP data sets and bioinformatic packages on the internet (Clifford, Edmonson, Hu, Nguyen, Scherpbier, & Buetow, 2000).
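The argument that a pooled trial can hide a genotype-specific benefit can be made concrete with a small Python sketch. It assumes each participant record carries a genotype at one SNP and a measured response to a dietary intervention; the genotype labels and all numbers are invented for illustration and do not come from any of the cited trials.

```python
from collections import defaultdict
from statistics import mean


def mean_response_by_genotype(participants):
    """Group participants by genotype and return the mean response per group."""
    groups = defaultdict(list)
    for person in participants:
        groups[person["genotype"]].append(person["response"])
    return {genotype: mean(values) for genotype, values in groups.items()}


# Hypothetical trial: change in a blood marker after a dietary intervention.
# Only carriers of the (invented) AA genotype respond strongly.
participants = [
    {"genotype": "AA", "response": -12},
    {"genotype": "AA", "response": -10},
    {"genotype": "AA", "response": -14},
    {"genotype": "AG", "response": -1},
    {"genotype": "AG", "response": 1},
    {"genotype": "AG", "response": 0},
    {"genotype": "GG", "response": 0},
    {"genotype": "GG", "response": 2},
    {"genotype": "GG", "response": -2},
]

pooled = mean(person["response"] for person in participants)
by_genotype = mean_response_by_genotype(participants)
```

Here the pooled mean change is modest, while stratifying by genotype shows that the whole effect sits in one subgroup. Using the SNP as an independent input variable, as the text describes, is a more formal version of this grouping step; a real analysis would also test whether the between-group difference is statistically significant.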

Genetic polymorphism and nutrient requirements

Polymorphisms in the various genes encoding enzymes, transporters and regulatory proteins affect the absolute quantities of essential nutrients that are necessary to achieve sufficiency, including vitamins, minerals, etc. (Bailey & Gregory, 1999). Thus, the variation in the population’s nutrient status is not simply the result of variations in food intakes but also the result of inherent variation amongst individuals within the population in their genetically defined abilities to absorb, metabolize and utilize these nutrients. Recommended daily allowances of each nutrient are determined to meet the needs

of a statistically representative fraction of the population; however, the range of responses to both micro- and macronutrients in the general population is large. Very recent research using genomic tools is highlighting just how specifically individual food choices, genetics and nutrition are linked. Polymorphism in a recently identified sweet receptor protein has been proposed to be the basis for the varying intakes of caloric-rich foods, i.e. the famous sweet tooth (Davenport, 2001).

As genomics begins to reveal the basis for food preference and the respective roles of genetics and environment, nutritionally superior foods could be made more organoleptically attractive to precisely the subset of the population for whom they are most appropriate. However, an important step is still missing. At this point, while the technologies to describe the effects of diet on various individuals experimentally are widely used, for example in clinical trials, the technologies are not yet part of routine consumer assessment. Therefore, consumers cannot take advantage of nutritional knowledge about themselves, because they do not have it. This lack of knowledge transfer is clearly the largest single factor constraining a more widespread improvement in nutritional health in the consumer population.

Genetic variation and the response to variations in overall diet

Genetic differences affect the basic metabolism of macronutrients and in particular fat and carbohydrate in humans. For example, polymorphisms in the apoprotein genes (apoE, apoAIV) or lipoprotein catalysts (lipoprotein lipase) have been shown to directly affect the clearance of dietary lipids. Hence polymorphisms in lipid metabolic genes dictate the response of these individuals to dietary fat (Hockey et al., 2001; Pimstone et al., 1996). Polymorphism in the genes encoding the apoE protein influences the functionality of this protein in clearing liver-derived lipoproteins (VLDL and LDL) from blood (Weintraub, Eisenberg, & Breslow, 1986). Health outcomes beyond heart disease, including Alzheimer’s disease, have been shown to be correlated to apoE phenotypes. Once again, diet plays a differential role in the development of these diseases according to genotype through the role of diet in influencing the quantitative flux of hepatic lipoprotein metabolism (Corella et al., 2001).

Many consumers are concerned about the widespread application of genomic testing in the population because they see little value to themselves. However, there is great value in acquiring knowledge about individual variation in diet-responsive genes if it can lead to successful intervention. For example, genotype predicts a difference in post-prandial lipid metabolism of dietary fat (Hockey et al., 2001). The most exciting aspect of this discovery is the realization that this knowledge is not just academic, but leads to an immediate individual recommendation on how to alter the intakes of dietary fat for those affected. Thus, the information of how an individual responds to foods provides that individual with the means to change their diet to improve their health. With each new discovery of genetic polymorphisms linked to health, the complexity of the science increases. Fortunately, modern bioinformatics tools are inherently integrative, adding each new discovery into a rapidly expanding coherent picture of diet and health of individual consumers.

Food quality

Food is one of life’s great delights. Modern science and technology have provided unparalleled value to consumers in the breadth of individual choices in delicious, safe and nutritious foods. This great value has been driven by scientific knowledge at all levels of the agricultural food chain from genetic improvements in production agriculture to food process engineering to precision in the analysis of consumer sensation. With their power to build detailed molecular knowledge of biological organisms, modern bioinformatic technologies are assembling the means to re-invent the food supply. In no other aspect of life do humans interface with other biological organisms to the same extent as in the consumption of food. Thus, the most tangible, daily value that genomics will eventually produce for humans is a dramatic increase in the quality of their lives through the quality of their foods. Bioinformatics will help understand the basis of different food flavors and textures, and even further why we find them delicious, and hence how to enhance that experience. Bioinformatics will not only define in molecular detail which foods are safe, but develop foods that make consumers themselves safer. Bioinformatics will not only improve the processes of forming foods, but design foods that form themselves.

The understanding of the biomolecular basis of flavor perception has been a major success of the last 5 years of scientific investigation in the molecular biology of sensation (Fig. 3).

Success in identifying, in molecular and genetic details, the taste and flavor receptors has been remarkable in the past months. These include:

• Bitter: A family of ~50 G protein-coupled receptors (GPCRs) identified in human taste cells (Chandrashekar et al., 2000);

• Salt: The epithelial ion channel, ENaC, is responsible for over 80% of salt taste transduction (Nagel, Szellas, Riordan, Friedrich, & Hartung, 2001);

• Sour: An ion channel, identical to degenerin-1, is proposed to be the receptor (Ugawa et al., 1998);

• Umami: A ‘splice variant’ of brain glutamate receptor, mGluR4, identified in rat taste cells (Matsunami, Montmayeur, & Buck, 2000); and

• Sweet: The putative identity of the sweetness receptor identified as a G protein-coupled receptor, Tas1r3 (Max et al., 2001).

The discovery of these taste receptors is being translated rapidly into a variety of research programs designed to discover the next generation of taste modifiers for foods. The sugar substitutes demonstrated the potential for replacing the traditional sweet molecules (simple sugars) with non-caloric, non-cariogenic and non-glycemic alternatives in a variety of food products. Now, with the balance of taste receptors known, it will be possible to develop flavor systems that either produce or enhance positive tastes or mask negative ones. Much of this work will be possible using combinatorial chemistry approaches that use bioinformatic tools to screen thousands of molecules and combinations at a time. Such molecular simulations once took weeks and very large super-computer installations. New developments in computing power, computational algorithms and software, together with the available databases of known structures and successful simulations, have brought molecular modeling into mainstream food chemistry. Such simulations will make it possible not only to develop more intense tasting compounds as food additives, but also to understand the basis of taste persistence, antagonism and complementation. Flavor systems will become more complex, more attractive and more individualized to consumers.

Olfaction: a family of 1000 GPCRs, about 300 identified

Not far behind the taste receptors, the much more abundant odor receptors are being identified as well. The full olfactory complement of genes has been published (Glusman, Yanai, Rubin, & Lancet, 2001). The number of odor receptors exceeds the number of taste receptors by a factor of 100. In spite of this expansion in size and complexity, bioinformatics will have little difficulty in translating the principles of ligand–receptor interaction developed with taste into similar applications to odor sensations. With such capabilities, sophisticated flavor systems will be designed from the perspective not simply of what is available in natural commodities and foods, but with final flavor perception as the goal. Ultimately, it will be possible to design flavor systems that optimize flavor perception in highly nutritious foods that are currently organoleptically undesirable in spite of their superior health value. Making the next connection, i.e. understanding the basis for healthy and unhealthy food choices, is already proceeding.

Recently, the connection between gratification and the brain was verified in rats (Cardinal et al., 2001). Similar developments in our understanding of the brain could lead the way to furnishing tailor-made, specific organoleptic attributes as well as nutritional needs.

Bioinformatics and food processing

The most immediate application of bioinformatics to food processing will be in optimizing the quantitative compositional parameters of traditional unit operations. Food commodities are processed largely to achieve storage stability and safety with considerable excess of energy applied to ensure a large margin for error. This margin of error is necessary due to our inexact knowledge of the composition and structural complexity of biological materials, the natural variability of living organisms as food process input streams and the response of these materials to processing parameters. With the considerable knowledge of biological organisms from bacteria and viruses to plants and animals that is emerging from bioinformatics, food process design will become optimized with narrower margins of all cost-important inputs, especially energy.

The great future for food processing, however, is not in simply processing for greater safety, but in merging biological knowledge of living organisms with the biomaterial knowledge necessary to convert them to foods. Traditional food processing relies on the aggressive input of energy to restructure the biomaterials of living organisms into simpler macrostructure forms of stable, relatively uniform foods. In most cases the inherent biological properties of the living systems are lost to the final food product in the need to eliminate potentially hazardous properties of some of the constituent molecules (protease inhibitors, etc.). The arrival of the knowledge base of modern bioinformatics, however, is providing a detailed description of the inherent complexity of biological macromolecules within living cells together with the structural properties of these molecules that provide much of their functions. Such knowledge is the cornerstone of functional genomics and proteomics. The arrival of such knowledge, however, provides an unprecedented opportunity to translate this knowledge into an equally accurate assessment of the biomaterial properties of each of the molecules in a complex mixture. It will soon be possible to use the inherent structural properties of natural food commodities to self-assemble new foods with a minimum of external energy, retaining a maximum of biological and nutritional value. The biological structure–function relationships discovered through bioinformatics of living systems will be able to be mapped into the structure–function relationships of the next generation of foods with delightful results (Fig. 4).

All foodstuffs are ostensibly modified tissues. Thus, the natural biomaterial properties of the molecules that make up living organisms underlie the basic biomaterial properties of foods. In most traditional food processing, however, little advantage is taken of the unique

properties of specific molecules and instead, all biomolecules of a particular class, e.g. proteins, are exposed to substantial physical, thermal and mechanical energy to make these properties uniform in order to restructure the material into more stable, and/or more bioavailable food systems. Such processing eliminates the subtle differences within most of the classes of the major biomolecules that are inherent to and the basis of complex structure–function relationships of living organisms. Processing replaces biological complexity with the statistical average properties of the broad classes of biomaterials, i.e. proteins, carbohydrates, lipids.

The processing of commodities to eliminate the complexity of their biological structures is not necessary to the quality of foods; in fact, the opposite is true. There are vivid examples in which highly specific biological properties of the original living organism are a key to the processing strategy and ultimately the organoleptic attractiveness of final food products. The renneting of bovine milk to induce the natural aggregation of milk caseins leading to the gelation events of cheese manufacture is such a process. The final product takes advantage of the unique self-assembly properties of milk casein micelles that are colloidally stabilized in milk by kappa caseins but destabilized when enzymatically cleaved of their solubilizing glycomacropeptide. Another example is leavened bread, in which a combination of both composite processing and biological restructuring is the basis of breads’ structures, textures and nutrition. In this case, wheat seeds are ground to disassemble the majority of their biological structures through mechanical energy, but then the biological processes of yeast fermentation achieve simultaneously the enzymatic elimination of phytic acid during dough incubation and the biochemical production of carbon dioxide gas as leavening within a mechanically reworked protein gel structure. In each of these cases, bread and cheese, taking advantage of the biological properties of the living organisms led to substantial value both organoleptically and in greater safety and nutritional value. Furthermore, the inherent variation in biological organisms that plagues the standardization of simpler food processing objectives is not a disadvantage to these two food staples, but rather a wonderful benefit leading to literally hundreds of distinctly flavored and textured cheeses and varieties of breads. Thus, cheeses and breads provide proof of what is possible when the biological processes of catalysis, self-assembly and restructuring are retained as the basis of food processing. Heretofore, empirical trial and error was the major route to discovery of biodriven food processing. However, the biological knowledge that is emerging with functional genomics, proteomics and metabolomics is providing precisely the knowledge necessary to readdress food processing using biomolecular activities rather than simply composite biomaterial properties. The entire protein–protein interaction map of yeast, i.e. all possible interactions between the 6000 proteins of yeast, has been completed (Ito, Chiba, Ozawa, Yoshida, Hattori, & Sazaki, 2001). In the future, the structure–function properties of living organisms that are emerging so rapidly with bioinformatics will increasingly dictate the design of new foods and new food processes. Once such tools are in hand, process design engineers can then work in a coordinated fashion with plant bioengineers to produce crops that are not simply enriched in a single valuable component, but instead redesigned with a renewed purpose to increase the myriad values of foods in providing quality of life.

Flavor analysis

The complex flavor profiles of many delightful commodities (e.g. fruits, baked goods) are not due to single compounds but rather are the result of the presence and interactions of literally dozens of different molecules. This knowledge will provide the link and the compiler integrating processing, quality and nutrition, paving the road for new product development based on insight into actual consumers’ preferences and needs.

The impact of genomics on the quality assurance of foods

Food safety is becoming more and more a major area of concern for consumers, and the food industry has developed a coherent research programme to ensure food safety with well-established classical methodologies but also new state-of-the-art research tools. The goals here are to ensure that the inactivation or inhibition of undesired microbes is possible using the minimum treatment of foods necessary, to increase the understanding of the ecology of food-borne microbial populations, to find out how these populations respond to environmental factors like stress and, last but not least, to evaluate foods and food compounds toxicologically.

The genomics era delivers many new tools like proteomics and DNA-array technology to tackle the abovementioned problems. These new technologies are now a vital part of the scientific strategic plan to serve the diet and health theme and to provide safe food to the consumer.

Toxicogenomics, for example, is an emerging field which utilizes DNA arrays (tox-chips) to test the toxicological effects of a specific compound. These DNA arrays probe human or animal genetic material printed on miniature devices to profile gene expression in cells exposed to test compounds rather than using animal pathology to define illness (Lovett, 2000). The advantages of this test go beyond the speed and the ease of use which are typical for DNA expression analysis; it also massively reduces animal testing. Another challenge here is the massive amounts of data which are produced

via these high-density DNA arrays, and the analysis and interpretation of the results is a real challenge. Once this task has been tackled, tox-chip data must be integrated into the knowledge base of the research institution to draw maximum benefit for the acceleration of the development pipeline.

Data integration

The explosion of data, ever increasing developments in information technology, abundant availability of powerful computers and the ability to connect them worldwide are effecting enormous changes in knowledge management. However, in order to gain full access to these emerging powerful tools, it is paramount to resolve the enormous challenge of unifying complex and dissimilar data, each describing a large spectrum of applications, each of which could be extremely far apart. The need to combine observations from numerous sources and domains into a unified, seamlessly searchable database and turn it into knowledge is only the beginning of this uphill battle that will impact every facet of food and nutrition science.

Advances in data collection, storage and distribution technologies have far outpaced techniques to assist the analysis and digestion of this information. In the past, most databases were quite small and utilized as typeset tables or simple online documents. Today, far larger and more complex databases are emerging in many fields at a level well beyond the reach of the traditional model of solitary workers or small groups (Maurer, Firestone, & Scriver, 2000). This has led to an all-too-common data glut situation, creating a strong need and a valuable opportunity for extracting knowledge from databases collected throughout R&D and elsewhere. One of the greatest challenges we are facing is how to turn this rapidly expanding or even exploding data into accessible and actionable knowledge. Moreover, food and nutrition R&D is engaged in an assortment of complex studies producing endpoint measures comprised of numeric, sensory and perception, structure, biological, chemical and vision data. The need to manage such disparate inputs is critical as the amount of data doubles almost every 20 months (Colbourn & Rowe, 2000).

Underlying the need to convert data into actionable knowledge, organizations have started an aggressive effort to deploy Knowledge Discovery in Databases (KDD), Knowledge Management (KM), Data Mining (DM) and Intellectual Asset Management (IAM). The areas of common interest to researchers are: pattern recognition, statistics and statistical inference, intelligent databases, knowledge acquisition, data visualization, high performance computing and expert systems, to mention just a few. Although these high technology information management systems are starting to play a fundamental role for the experts who are working on their development, they are almost invisible to most users.
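The doubling time quoted earlier (about 20 months; Colbourn & Rowe, 2000) can be turned into a growth factor with one line of arithmetic: a volume that doubles every T months grows by a factor of 2^(t/T) over t months. A minimal Python sketch follows. The function name and the five-year horizon are illustrative assumptions, and the 20-month figure is simply the one cited in the text.

```python
def growth_factor(months, doubling_time_months=20.0):
    """Factor by which a quantity grows over `months` if it doubles every `doubling_time_months`."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (months / doubling_time_months)


# Five years is 60 months, i.e. 60 / 20 = 3 doublings.
five_year_factor = growth_factor(60)
```

Under this assumption a data collection would be eight times larger after five years, which is why storage and analysis methods that scale linearly with effort fall behind so quickly.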

Data mining refers to a new genre of bioinformatics tools used to sift through the mass of raw data, finding and extracting relevant information and developing relationships among them. As advances in instrumentation and experimental techniques have led to the accumulation of massive amounts of information, data mining applications are providing the tools to harvest the fruits of these labors. Maximally useful data mining applications should:

• Process information from disparate experimental techniques and technologies, including data that have both temporal (time studies) and spatial (organism, organ, cell type, sub-cellular location) dimensions;

• Identify and interpret outlying, spurious and rare data;

• Analyze data in an iterative process, re-applying gained knowledge to constantly examine and re-examine data;

• Utilize novel text-mining and pattern recognition algorithms.
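As an illustration of the second requirement in the list above, outlying values can be flagged with a robust rule based on the median and the median absolute deviation (MAD). This is a minimal Python sketch of one common convention, not a method taken from the works cited here: the function name, the readings, the scaling constant 0.6745 and the cut-off of 3.5 are assumptions chosen for the example.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be ranked as outlying
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Invented replicate measurements; the last one is deliberately far from the rest.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 25.0]
suspects = flag_outliers(readings)
```

Flagging is only the first half of the requirement. Whether a flagged value is an instrument fault or a rare but real observation still has to be interpreted by someone who knows the experiment.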

In the early years of modern scientific discovery, research findings would appear in a journal and then get buried in the depths of poorly accessible library space. Information existed in various formats (e.g. graphic, hard copy, tape), and was not easily retrievable. Data analyses were generally limited to slide rule and manual manipulation. However, technological advances in computational science and scientific instrumentation have facilitated exponential growth, not only in data, but also in the tools to record and analyze these data. What was the Computer Age as we entered the 1990s has been supplanted by the Information Age. This change was made possible by the advent of the Internet, in particular the World Wide Web. This innovative, truly universal mechanism of information dissemination, in concert with new computation-based analytical tools, has provided practically endless opportunities for scientific discovery.

The exponential rate of discovery in the era of modern molecular biology is phenomenal, culminating with the June 2000 announcement that preliminary sequencing of the human genome had been completed. This landmark is just a taste of the scientific successes that are to come. As impressive as it is, the determination of the sequence of the approximately 3.2 billion nucleotides of the human genome, encoding an estimated 100,000 proteins, represents only the first step down a long road of knowledge discovery and its application to add value for consumers.

Another application of bioinformatics that is growing extremely fast is Chemometrics, the chemical discipline that applies mathematics and statistical methods, and

uses designs of experiment to understand the effects and interactions of several process parameters, and also to optimize specific outcomes (Otto, 1999). Chemometrics, originally rooted in analytical chemistry, is currently more focused on addressing issues related to molecular conformations and behavior. With the increasing availability of databases (e.g. through the WWW), the need for improved techniques that help extract information and turn it into knowledge has therefore been ever growing (Brazma, Robinson, Cameron, & Ashburner, 2000).

It should be highlighted that food and nutrition are related topics and are prone to another more crucial problem. Generally, advanced data mining and other sophisticated search tools are no better than the information provided. As the scientific literature may contain both editorial and/or more fundamental errors (e.g. false methodology, unjustified conclusions, faulty apparatus), the impartial scrutiny of human editorial judgement is indispensable. One might make a compelling case that the value of the databases is compromised most by their inherent bias: in concept and design towards only benefit and in publication towards only a positive outcome. Databases are most valuable to data mining and bioinformatics searchers when they are balanced. It should be emphasized that if data mining techniques are polling databases that are inherently unbalanced, then no matter what the truth is, the data mining will invariably reflect the inherent bias in the databases that has been the result of conscious or unconscious editorial influence. Hence, like most other computer applications, the outcome in the short term will be only as good as the quality of the data. Moreover, the more complex the calculation is, the more paramount is the need for adequate checks and balances. The solution is more balanced data collection. At present, this is not the norm for nutritional research.

Typical examples, far from being representative, yet demonstrating how knowledge management is utilized, are provided:

1. Food industry: A software package (NetStat) was developed for analyzing reams of data, and is reported to have changed every aspect of the Pillsbury company (i.e. from the way it develops new products to how it capitalizes on consumers’ tastes). NetStat’s uniqueness is its ability to share information across all the company’s nine brands including manufacturing lines. The program is implemented as a Web site shared by researchers across a 70-country conglomerate, and allows engineers and scientists to perform rigorous tests and compare them with data and specifications and consumer information (Crockett, 2000).

2. Pharmaceutical industry: Building of huge combinatorial libraries by automatically synthesizing all possible combinations of components is underway. The number of compounds in such a database can now be confidently stated to be in the hundreds of thousands or even millions. The new automated screening technologies can test each of these compounds, giving an indication of whether a compound is going to be effective against a specific biochemical target and a specific disease.

3. Chemical industry: Chemical reaction databases are available and could be used to derive knowledge for predicting the course and products of chemical reactions as well as to design organic syntheses. To reach this goal, the essential features of the chemical reaction have only to be recognized and generalized. This was achieved by classifying a set of reactions by unsupervised learning techniques such as self-organizing maps (Kohonen). In this approach, reactions are characterized by physicochemical features directly derived by computations from the constitution of the starting materials or products of a reaction (Gasteiger & Sacher, 1999).

4. Information industry: Chemical Abstracts Service (CAS) has launched its SciFinder 2000, empowering the user with greater visualization tools and the ability to cross-tabulate and display searches graphically. This ‘wizard’ allows a researcher to simultaneously locate information within a multitude of databases and subsequently explore the relationship between them. The retrieved data may be displayed in a 3D representation that can be further manipulated to zero in on the requisite research. The use of such data mining could revolutionize the way scientists approach their research projects (Massie, 2000).

5. Environmental safety: To reduce the need for animal testing, Unilever has applied data mining techniques (Clementine) to model skin corrosivity of organic acids, bases and phenols. This facilitated uncovering new information from the existing database, and eventually will furnish toxicologists with neural network based packages to help assess and predict corrosivity and other toxicological properties. This approach is much more accessible than current techniques (e.g. principal component analysis). It is hoped that it will lead to a movement away from in vivo and in vitro experimentation towards ‘in silico’ analyses, reducing costs and time scales for product development, and minimizing the need for animal testing (http://www.spss.com/clementine/).

6. Consumers: Data mining techniques are now being used to extract a surprising amount of information on individual customers and their buying patterns. These data are then used to develop customer loyalty programs, for carefully focused marketing or additional services that fit the customer’s individual preferences, and for identifying possible synergies with other companies who might share the same or similar base of customers. Applications range from direct marketers and book retailers to credit card companies, which identify trends, potential users, and target marketing strategies.
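The unsupervised classification mentioned in the chemical industry example can be pictured with a very small self-organizing map. The sketch below is a generic, one-dimensional Kohonen-style map written in plain Python. It is not the implementation of Gasteiger & Sacher (1999): the two-feature descriptors are invented, the function names are hypothetical, and the decay schedules and the initialization from evenly spaced samples are arbitrary choices made to keep the example short.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean distance)."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(nodes[j], x)),
    )


def train_som(samples, n_nodes=4, epochs=200, seed=0):
    """Train a one-dimensional self-organizing map on a list of feature vectors."""
    rng = random.Random(seed)
    # Start the node weights from evenly spaced samples (one simple choice among several).
    nodes = [list(samples[(i * len(samples)) // n_nodes]) for i in range(n_nodes)]
    for epoch in range(epochs):
        progress = epoch / epochs
        rate = 0.5 * (1.0 - progress) + 0.01                 # learning rate decays towards 0.01
        radius = max(0.5, (n_nodes / 2) * (1.0 - progress))  # neighbourhood shrinks to 0.5
        for x in rng.sample(samples, len(samples)):          # present samples in random order
            winner = best_matching_unit(nodes, x)
            for j, node in enumerate(nodes):
                # Nodes near the winner on the map are pulled more strongly towards x.
                influence = math.exp(-((j - winner) ** 2) / (2.0 * radius ** 2))
                for k in range(len(node)):
                    node[k] += rate * influence * (x[k] - node[k])
    return nodes


# Hypothetical two-feature descriptors for six reactions that form two loose groups.
reactions = [
    (0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
    (0.9, 1.0), (1.0, 0.9), (0.95, 0.95),
]
som = train_som(reactions)
classes = [best_matching_unit(som, r) for r in reactions]
```

Each reaction is assigned to the map node it lies closest to, so reactions with similar descriptors end up on the same or neighbouring nodes without any class labels being supplied. Real applications use far richer descriptors and two-dimensional maps.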

Development needs for data integration

Computational biology and electronic technologies will be crucial for the future of Life Science Research and offer, in addition, promising opportunities to many industries. A central future issue for shortening research-driven product development and gaining competitive advantage will be data integration. Companies which started initiatives in this area are now struggling to integrate legacy enterprise resource planning and data warehouse technologies with bioinformatics. Compared to this challenge all other issues including electronic commerce fade into insignificance. To be successful, companies are now focusing on specific enabling technologies like Java, message-oriented middleware and XML to encourage web-based collaboration between research teams and operating units. Clear integration paths and benchmarks are, however, still lacking.

The ability to make better, faster and more innovative research decisions is paramount to progress. Emerging technologies and the exploding amount of data highlight the need for new approaches. The availability of a large number of fast PCs connected together allows parallel processing, overcoming barriers due to speed and computer resources. However, the ability to integrate the data and utilize KM is a real challenge, which is compounded by increased economic pressures and a demanding marketplace, global competition, regulation, and consumer demands. Implementing these new methodologies could open new avenues, improving our ability to quickly and efficiently gain new knowledge and insights from cell structure to consumer-perceived sensory attributes. Ultimately, one should envision ‘an engine’ able to ‘plug and play’ into various data domains, integrating all the facets of a business and increasing the likelihood of identifying the next target or new food product for development and quality improvement addressing the consumers’ real and perceived needs. Planning for the future is no longer a luxury; it is a standard operating procedure for the existence and well-being of the enterprise.

Future areas required development are:

� Models—Models that describe a class of reac-tions in an actual food system or food concept‘in silico’ (Hultzman, 2000). These models shouldbe designed so that they could also be applied fortesting the validity of previous data reported.This goal also mandates that terminology beharmonized, to improve accessibility. It couldlead to a movement away from in vivo and invitro experimentation towards ‘in silico’ ana-lyses, reducing costs, time scales for productdevelopment, and minimizing the need for ani-mal testing.

• Standardized protocols—Standard experimental design and replication must be set if data accumulated by different groups and various techniques are to be integrated. This would improve reproducibility, reduce variability, furnish truly quantitative data, increase sensitivity and provide a means of comparing data obtained from different sets (e.g. Lee, Kuo, Whitmore, & Sklar, 2000).

• Data integration and storage—Linking and integrating large databases with heterogeneous structures and data types is far from straightforward, and the vast differences between domains make the task immense. Similarly, the ever-growing amount of information needs adequate storage and maintenance. Cataloging and automated extraction (e.g. Andrade & Bork, 2000) are paramount. As the complexity and quantity of information grow, food practitioners need to define and develop a unified and acceptable approach. This task requires significant planning in which all facets of food, nutrition, biology and other domains are involved.

• Predictive tools—Techniques allowing automated discovery from large and different data sets need to be further developed before they can be fully utilized in the food and nutrition domains. Once implemented, they would open new avenues towards a broad interdisciplinary science that involves both conceptual and practical tools for the generation, processing, analysis and propagation of information, leading ultimately to fundamental understanding.

• Data visualization—A large volume of the human brain is devoted to visual data processing (Going & Gusterson, 1999). Data visualization methods will therefore play a significant role in pattern characterization and recognition.

• Paradigm shift—Food and nutrition science should develop a holistic approach, moving away from ‘vertically’ studying the role(s) of a few variables to ‘horizontally’ studying many variables simultaneously and applying advanced modeling and analysis techniques (e.g., Fiehn, Kloska, & Altmann, 2001).

Conclusions

Biomics, comprising genomics, proteomics and metabolomics, is taking up its position as a lead science for the 21st century. Its influence is already felt throughout the biological sciences. Moreover, its influence on nutrition and food science will generate a unified area of research where both nutritional benefit and traditional food values become parts of an extended life science driving towards enhanced quality of life. The knowledge obtained through this research on raw materials, ingredients, safety, quality and nutrition can be expected to have a far greater impact on product improvements than today’s functional food research imagines. Future developments in biomics, bioinformatics and information technology-based approaches to foods will truly change and revolutionize the way the food industry satisfies consumer needs and wants.

Uncited references

Bender (1999), Firestein (2000), Gasch et al. (2000) and Weggemans et al. (2001).

References

Andrade, M. A., & Bork, P. (2000). Minireview: Automated extraction of information in molecular biology. FEBS Letters, 476, 12–17.

Bailey, L. B., & Gregory, J. F., 3rd (1999). Polymorphisms of methylenetetrahydrofolate reductase and other enzymes: metabolic significance, risks and impact on folate requirement. Journal of Nutrition, 129, 919–922.

Bender, D. A. (1999). Optimum nutrition: thiamin, biotin and pantothenate. Proc Nutr Soc., 58, 427–433.

Brazma, A., Robinson, A., Cameron, G., & Ashburner, M. (2000). One-stop shop for microarray data. Nature, 403, 699–700.

Cardinal, R. et al. (2001). Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science, 292.

Clifford, R., Edmonson, M., Hu, Y., Nguyen, C., Scherpbier, T., & Buetow, K. H. (2000). Expression-based genetic/physical maps of single-nucleotide polymorphisms identified by the cancer genome anatomy project. Genome Research, 10, 1259–1265.

Chandrashekar, J., Mueller, K. L., Hoon, M. A., Adler, E., Feng, L., Guo, W., Zuker, C. S., & Ryba, N. J. (2000). T2Rs function as bitter taste receptors. Cell, 100, 703–711.

Colbourn, E., & Rowe, R. (2000, April 3). A logical step forward. Chem & Ind., 252–254.

Corella, D., Tucker, K., Lahoz, C., Coltell, O., Cupples, L. A., Wilson, P. W., Schaefer, E. J., & Ordovas, J. M. (2001). Alcohol drinking determines the effect of the APOE locus on LDL-cholesterol concentrations in men: the Framingham Offspring Study. American Journal of Clinical Nutrition, 73, 736–745.

Crockett, R. O. (2000, April 3). Pillsbury: a digital doughboy. Business Week.

Davenport, R. F. (2001). Taste research. New gene may be key to sweet tooth. Science, 292(5517), 620.

DellaPenna, D. (1999). Nutritional genomics: manipulating plant micronutrients to improve human health. Science, 285, 375–379.

Eckhardt, R. B. (2001). Genetic research and nutritional individuality. Journal of Nutrition, 131, 336S–339S.

Fiehn, O., Kloska, S., & Altmann, T. (2001). Integrated studies using multiparallel techniques. Current Opinion in Biotechnology, 12, 82–86.

Firestein, S. (2000). The good taste of genomics [news; comment]. Nature, 404, 552–553.

Gasch, A. P., Spellman, P. T., Kao, C. M., Carmel-Harel, O., Eisen, M. B., Storz, G., Botstein, D., & Brown, P. O. (2000). Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell, 11, 4241–4257.

Gasteiger, J., & Sacher, O. (1999). Unsupervised learning in reaction databases. ACS Meeting, March, Anaheim, CA, USA.

German, B., Schiffrin, E. J., Reniero, R., Mollet, B., Pfeifer, A., & Neeser, J. R. (1999). The development of functional foods: lessons from the gut. Trends in Biotechnology, 17, 492–499.

Glusman, G., Yanai, I., Rubin, I., & Lancet, D. (2001). The complete human olfactory subgenome. Genome Research, 11, 685–702.

Going, I. J., & Gusterson, B. A. (1999). Molecular pathology and future developments. European Journal of Cancer, 35, 1895–1904.

Harvey, C. B., Hollox, E. J., Poulter, M., Wang, Y., Rossi, M., Auricchio, S., Iqbal, T. H., Cooper, B. T., Barton, R., Sarner, M., Korpela, R., & Swallow, D. M. (1998). Lactase haplotype frequencies in Caucasians: association with the lactase persistence/non-persistence polymorphism. Annals of Human Genetics, 62(Pt 3), 215–223.

Hockey, K., Anderson, R., Cook, V., Hantgan, R., & Weinberg, R. (2001). Effect of the apolipoprotein A-IV Q360H polymorphism on postprandial plasma triglyceride clearance. Journal of Lipid Research, 42, 211–217.

Hultzman, S. (2000). In silico toxicology. Annals of the New York Academy of Sciences, 919, 68–74.

Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., & Sakaki, Y. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proceedings of the National Academy of Sciences of the United States of America, 98(8), 4569–4574.

Kuipers, O. P. (1999). Genomics for food biotechnology: prospects of the use of high-throughput technologies for the improvement of food microorganisms. Current Opinion in Biotechnology, 10, 511–516.

Lee, M. L. T., Kuo, F. C., Whitmore, G. A., & Sklar, J. (2000). Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations. Proceedings of the National Academy of Sciences of the United States of America, 97, 9834–9839.

Lee, P. S., & Lee, K. H. (2000). Genomic analysis. Current Opinion in Biotechnology, 11, 171–175.

Lindee, M. S. (2000). Genetic disease since 1945. Nat Rev Genet, 1, 236–241.

Lovett, R. A. (2000). Toxicogenomics. Toxicologists brace for genomics revolution. Science, 289, 536–537.

Marth, G., Yeh, R., Minton, M., Donaldson, R., Li, Q., Duan, S., Davenport, R., Miller, R. D., & Kwok, P. Y. (2001). Single-nucleotide polymorphisms in the public domain: how useful are they? Nature Genetics, 27, 371–372.

Massie, B. (2000). Moving towards a new digital environment. Am. Chem. Soc. 219th National Meeting, Part XI, 26–30 March, San Francisco, CA.

Matsunami, H., Montmayeur, J. P., & Buck, L. B. (2000). A family of candidate taste receptors in human and mouse. Nature, 404(6778), 601–604.


Maurer, S. M., Firestone, R. B., & Scriver, C. R. (2000). Science’s neglected legacy. Nature, 405, 117–120.

Max, M., Shaker, Y., Huang, L., Rong, M., Liu, Z., Campagne, F., Weinstein, H., Damak, S., & Margolskee, R. F. (2001). Nature Genetics, 28, 58–63.

Nagel, G., Szellas, T., Riordan, J. R., Friedrich, T., & Hartung, K. (2001). Non-specific activation of the epithelial sodium channel by the CFTR chloride channel. EMBO Reports, 2, 249–254.

Nichols, B. L. (2000). Nutrigenetics and child development in the 21st century. Nutrition, 16, 493–495.

Otto, M. (1999). Chemometrics statistical and computer application in analytical chemistry. Weinheim, Germany: Wiley VCH.

Pimstone, S. N., Clee, S. M., Gagne, S. E., Miao, L., Zhang, H., Stein, E. A., & Hayden, M. R. (1996). A frequently occurring mutation in lipoprotein lipase gene (Asn291Ser) results in altered postprandial chylomicron triglyceride and retinyl palmitate response in normolipidemic carriers. Journal of Lipid Research, 37, 1675–1684.

Pridmore, R. D., Crouzillat, D., Walker, C., Foley, S., Zink, R., Zwahlen, M. C., Brussow, H., Petiard, V., & Mollet, B. (2000). Genomics, molecular genetics and the food industry. Journal of Biotechnology, 78, 251–258.

Sachidanandam, R., Weissman, D., Schmidt, S. C., Kakol, J. M., Stein, L. D., Marth, G., Sherry, S., Mullikin, J. C., Mortimore, B. J., Willey, D. L., Hunt, S. E., Cole, C. G., Coggill, P. C., Rice, C. M., Ning, Z., Rogers, J., Bentley, D. R., Kwok, P. Y., Mardis, E. R., Yeh, R. T., Schultz, B., Cook, L., Davenport, R., Dante, M., Fulton, L., Hillier, L., Waterston, R. H., McPherson, J. D., Gilman, B., Schaffner, S., Van Etten, W. J., Reich, D., Higgins, J., Daly, M. J., Blumenstiel, B., Baldwin, J., Stange-Thomann, N., Zody, M. C., Linton, L., Lander, E. S., & Altshuler, D. (2001). The International SNP Map Working Group. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature, 409(6822), 928–933.

Schilling, C. H., & Palsson, B. O. (2000). Assessment of the metabolic capabilities of Haemophilus influenzae Rd through a genome-scale pathway analysis. Journal of Theoretical Biology, 203, 249–283.

Takeoka, S., Unoki, M., Onouchi, Y., Doi, S., Fujiwara, H., Miyatake, A., Fujita, K., Inoue, I., Nakamura, Y., & Tamari, M. (2001). Amino-acid substitutions in the IKAP gene product significantly increase risk for bronchial asthma in children. Journal of Human Genetics, 46, 57–63.

Ugawa, S., Minami, Y., Guo, W., Saishin, Y., Takatsuji, K., Yamamoto, T., Tohyama, M., & Shimada, S. (1998). Receptor that leaves a sour taste in the mouth. Nature, 395(6702), 555–556.

Watkins, S. M., Hammock, B. D., Newman, J. W., & German, J. B. (2001). Individual metabolism should guide agriculture toward foods for improved health and nutrition. American Journal of Clinical Nutrition, 74, 283–286.

Weggemans, R. M., Zock, P. L., Ordovas, J. M., Pedro-Botet, J., & Katan, M. B. (2001). Apoprotein E genotype and the response of serum cholesterol to dietary fat, cholesterol and cafestol. Atherosclerosis, 154, 547–555.

Weintraub, M. S., Eisenberg, S., & Breslow, J. L. (1987). Dietary fat clearance in normal subjects is regulated by genetic variation in apolipoprotein E. Journal of Clinical Investigation, 80, 1571–1577.

Young, V. R., & Scrimshaw, N. S. (1979). Genetic and biological variability in human nutrient requirements. American Journal of Clinical Nutrition, 32, 486–500.

F. Desiere et al. / Trends in Food Science & Technology 12 (2002) 215–229 229