Analysis Environments For Scientific Communities From Bases to Spaces


Analysis Environments for Scientific Communities: From Bases to Spaces. Bruce R. Schatz, Institute for Genomic Biology, University of Illinois at Urbana-Champaign (schatz@uiuc.edu, www.beespace.uiuc.edu). Baker Center for Bioinformatics, Iowa State University, October 6, 2006.


Analysis Environments For Scientific Communities

From Bases to Spaces

Bruce R. Schatz
Institute for Genomic Biology
University of Illinois at Urbana-Champaign
schatz@uiuc.edu, www.beespace.uiuc.edu

Baker Center for Bioinformatics
Iowa State University

October 6, 2006

What are Analysis Environments

Functional Analysis
Find the underlying mechanisms of genes, behaviors, diseases

Comparative Analysis
Top-down data mining (vs. bottom-up)
Multiple sources, especially literature

Building Analysis Environments

Manual, by Humans
Interaction: user navigation
Classification: collection indexing

Automatic, by Computers
Federation: search bridges
Integration: results links

Trends in Analysis Environments

Central versus Distributed Viewpoints

The 90s, Pre-Genome: Entrez (NIH NCBI) versus WCS (NSF Arizona)

The 00s, Post-Genome: GO (NIH curators) versus BeeSpace (NSF Illinois)

Pre-Genome Environments

Focused on Syntax pre-Web

WCS (Worm Community System)
Search words across sources
Follow links across sources
Words automatic, Links manual

Towards Integrated Searching

Post-Genome Environments

Focused on Semantics post-Web

BeeSpace (Honey Bee Inter Space)
Navigate concepts across sources
Integrate data across sources
Concepts automatic, Links automatic

Towards Conceptual Navigation

Worm Community System

WCS Information
Literature: BIOSIS, MEDLINE, newsletters, meetings
Data: genes, maps, sequences, strains, cells

WCS Functionality
Browsing: search, navigation
Filtering: selection, analysis
Sharing: linking, publishing

WCS: 250 users at 50 labs across Internet (1991)

WCS Molecular

WCS Cellular

WCS invokes gm

WCS vis-à-vis acedb

From Objects to Concepts
From Syntax to Semantics
Infrastructure is Interaction with Abstraction

Internet is packet transmission across computers
Interspace is concept navigation across repositories

Towards the Interspace

[Figure: The Third Wave of Net Evolution. Waves: Packets, Objects, Concepts. Levels of indexes: Formal (manual) and Informal (automatic). Other labels: Technology, Engineering, Electrical, IEEE, communities, groups, individuals.]

Post-Genome Informatics I
Comparative Analysis within the Dry Lab of Biological Knowledge

Classical Organisms have Genetic Descriptions.
There will be NO more classical organisms beyond Mice and Men, Worms and Flies, Yeasts and Weeds.

Must use comparative genomics on classical organisms, via sequence homologies and literature analysis.

Post-Genome Informatics II
Functional Analysis within the Dry Lab of Biological Knowledge

Automatic annotation of genes to standard classifications, e.g. Gene Ontology via homology on computed protein sequences.

Automatic analysis of functions to scientific literature, e.g. concept spaces via text extractions. Thus must use functions in literature descriptions.
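The first point above, annotating genes to Gene Ontology via homology, can be illustrated with a minimal sketch. This is not the BeeSpace implementation: the function name `transfer_annotations`, the score threshold, the gene names, and the toy similarity hits are all assumptions made for illustration. It only shows the general idea of copying terms from a well-annotated model-organism gene to a similar, unannotated gene.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input shape: for each new gene, a list of
# (model_organism_gene, similarity_score) hits from a homology search.
Hits = Dict[str, List[Tuple[str, float]]]


def transfer_annotations(hits: Hits,
                         known_go: Dict[str, Set[str]],
                         min_score: float = 100.0) -> Dict[str, Set[str]]:
    """Copy GO terms from every model-organism gene whose hit score
    meets the (assumed) threshold; weaker hits are ignored."""
    inferred: Dict[str, Set[str]] = {}
    for gene, gene_hits in hits.items():
        terms: Set[str] = set()
        for model_gene, score in gene_hits:
            if score >= min_score:
                terms |= known_go.get(model_gene, set())
        inferred[gene] = terms
    return inferred


# Toy data: gene names and scores are made up for this example.
hits = {
    "bee_gene_1": [("fly_gene_A", 250.0), ("worm_gene_B", 40.0)],
    "bee_gene_2": [("worm_gene_B", 30.0)],
}
known_go = {
    "fly_gene_A": {"GO:0007610"},   # behavior
    "worm_gene_B": {"GO:0003677"},  # DNA binding
}
result = transfer_annotations(hits, known_go)
```

Here `bee_gene_1` inherits the term of its strong fly hit, while `bee_gene_2` has no hit above the threshold and stays unannotated, which is the case where literature analysis would have to fill the gap.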

Informatics: From Bases to Spaces

data Bases support genome data
e.g. FlyBase has sequences and maps
Genes annotated by Gene Ontology and linked to biological literature

information Spaces support biological literature
e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions

BeeSpace FIBR Project

The BeeSpace project is an NSF FIBR flagship (Frontiers in Integrative Biological Research), $5M for 5 years at the University of Illinois.

Analyzing Nature and Nurture in Societal Roles, using the honey bee as model
(Functional Analysis of Social Behavior)

Genomic technologies in wet lab and dry lab
Bee [Biology]: gene expressions
Space [Informatics]: concept navigations

System Architecture

Concept Navigation in BeeSpace

[Figure: Concept navigation in BeeSpace. Sources: Neuroscience Literature, Molecular Biology Literature, Bee Literature, FlyBase, WormBase, Bee Genome, Brain Region Localization, Brain Gene Expression Profiles. Users: Behavioral Biologist, Molecular Biologist, Neuroscientist.]

V1 BeeSpace Community Collections

Organism: Honey Bee / Fruit Fly, Song Bird / Soy Bean

Behavior: Social / Territorial, Foraging / Nesting

Development: Behavioral Maturation, Insect Development, Insect Communication

Structure: Fly Genetics / Fly Biochemistry, Fly Physiology / Insect Neurophysiology

CONCEPT SWITCHING

“Concept” versus “Term”: a concept is a set of “semantically” equivalent terms

Concept switching: region to region (set to set) match

[Figure labels: term, Semantic region, Concept Space]
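The "region to region (set to set) match" can be sketched as follows. This is only an illustration under stated assumptions: concepts are modeled as plain sets of terms, Jaccard overlap is used as the matching score, and the name `switch_concept` and the toy regions are hypothetical. The slides do not say which similarity measure BeeSpace uses.

```python
from typing import Dict, Optional, Set


def switch_concept(terms: Set[str],
                   target_space: Dict[str, Set[str]]) -> Optional[str]:
    """Return the name of the region in target_space whose term set has
    the highest Jaccard overlap with `terms`, or None if nothing overlaps."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, region in target_space.items():
        union = terms | region
        score = len(terms & region) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy regions: the terms are invented for the example.
source_region = {"foraging", "food collection", "nectar gathering"}
target_space = {
    "feeding behavior": {"foraging", "feeding", "food collection"},
    "locomotion": {"walking", "flight"},
}
match = switch_concept(source_region, target_space)
```

Matching whole sets rather than single terms is what lets a query phrased in one community's vocabulary land on the corresponding region of another community's collection even when the exact wording differs.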

BeeSpace Analysis Environment

Build Concept Space of Biomedical Literature for Functional Analysis of Bee Genes

- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases

Locate Candidate Genes in Related Literatures, then follow links into Genome Databases
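The "extract and index" and "navigate" steps listed above can be illustrated with a small co-occurrence index. This is a sketch, not the BeeSpace code: it assumes term extraction has already happened (each document is given as a set of terms), it uses raw document co-occurrence counts as the relationship strength, and the names `build_concept_space` and `related` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_concept_space(docs: List[Set[str]]) -> Dict[str, Counter]:
    """For every term, count how many documents it shares with each other term."""
    space: Dict[str, Counter] = defaultdict(Counter)
    for terms in docs:
        for a, b in combinations(sorted(terms), 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def related(space: Dict[str, Counter], term: str, k: int = 3) -> List[Tuple[str, int]]:
    """Return the k terms most often co-occurring with `term`
    (ties broken alphabetically)."""
    neighbors = space.get(term, Counter())
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Toy collection: three "documents", each already reduced to its terms.
docs = [
    {"foraging", "juvenile hormone", "honey bee"},
    {"foraging", "honey bee", "gene expression"},
    {"gene expression", "fruit fly"},
]
space = build_concept_space(docs)
top = related(space, "foraging", k=2)
# top == [("honey bee", 2), ("gene expression", 1)]
```

Navigation then amounts to repeatedly calling `related` on a chosen neighbor, moving from a behavior term toward gene-related terms that can be followed into a genome database.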

Well Characterized Gene

Poorly Characterized Gene

Gene Summarization, BeeSpace V2

Collaboration across Users

Category Browse (Collection)

Category Browse (Search)

PlantSpace Examples

Interactive Functional Analysis

BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system beyond a searchable database, using literature analyses to discover functional relationships between genes and behavior.

Genes to Behaviors
Behaviors to Genes
Concepts to Concepts
Clusters to Clusters
Navigation across Sources

BeeSpace Information Sources

General for All Spaces:
Scientific Literature: Medline, Biosis, CAB Abstracts
Genome Databases: GenBank, Protein Data Bank, ArrayExpress

Special for BeeSpace:
Model Organisms (heredity): Gene Descriptions (FlyBase, WormBase)
Natural Histories (environment): BeeKeeping Books (Cornell, Harvard)

XSpace Information Sources

Organize Genome Databases (XBase)
Compute Gene Descriptions from Model Organisms
Partition Scientific Literature for Organism X
Compute XSpace using Semantic Indexing

Boost the Functional Analysis from Special Sources
Collecting Useful Data about Natural Histories
e.g. CowSpace: leverage in AIPL Databases

Towards SoySpace

Organize Genome Databases (SoyBase)
Partition Scientific Literature for SoyBean
Gene Descriptions from Models (TAIR)
Natural Histories from Population Databases

Key to Functional Analysis is Special Sources
Collecting Appropriate Text about Genes
Extracting Adequate Data about Histories
Leverage is National Archives of germplasm and Historical Records for soybean crops

Towards the Interspace

The Analysis Environment technology is GENERAL!

BirdSpace? BeeSpace?
PigSpace? CowSpace?
BehaviorSpace? BrainSpace?
SoySpace? PlantSpace?

BioSpace… Interspace
