The Big Picture: The Industrial Revolution
Robert Stevens, [email protected], The University of Manchester, UK

The Big Picture: The Industrial Revolution. A talk in Berlin, 2008, about industrialising bioinformatics data analysis.

Presentation at AstraZeneca, Alderley Park, 2006

Page 1

The Big Picture: The Industrial Revolution

Robert Stevens, [email protected], The University of Manchester, UK

Page 2

Industrialisation

• Biology has industrialised data production

• Beginning to industrialise data analysis

• Need to automate experimentation

• Need to join them all together

Page 3

Data Integration

• Data integration is possible

• We know how to do it (technically)

• We know how to do plumbing

• What is left is a social issue

Page 4

Classic and Modern Biology

[Figure: diagram relating Genotype and Phenotype, with the labels "Modern biology" and "Classic biology"]

Page 5

Semantic Systems Biology Cycle

[Figure: cycle diagram with the following labels]

• Semantic Knowledge Base

• Experimentation, data generation

• Consistency checking, querying

• Automated reasoning

• Hypothesis formulation, experimental design

• Information extraction, knowledge formalization

Page 6

What’s in a Lab?

• People

• Equipment, reagents, etc.

• Protocols

• Policy, governance

• All there to facilitate and manage investigation

Page 7

What’s in an e-Lab?

[Figure labels: People, Data, Process, Investigation]

Page 8

Data: BioGateway

• Uses Virtuoso Open Server

– Open Source software that can host a triple store

– Can build this from RDF files

– Has a DB backend

• Supports the SPARQL* language, which allows querying RDF data (graphs)

• Its syntax is similar to that of SQL.

*http://www.w3.org/TR/rdf-sparql-query/

http://www.openlinksw.com/virtuoso/
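
As a rough illustration of the SQL-like syntax mentioned on this slide, the Python sketch below builds a small SPARQL SELECT query and the HTTP GET URL through which the SPARQL protocol submits it. The endpoint address, the graph pattern and the helper name build_request_url are assumptions made for illustration; they are not BioGateway's actual address or schema, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the real BioGateway address.
ENDPOINT = "http://example.org/sparql"

# SELECT ... WHERE ... LIMIT reads much like SQL, but the WHERE clause
# matches triple patterns (subject, predicate, object) in an RDF graph.
QUERY = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?term ?label
WHERE {
  ?term rdfs:label ?label .
}
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(ENDPOINT, QUERY))
```

Fetching that URL from a live endpoint would return the bindings for ?term and ?label; the request itself is left out here so the sketch stays self-contained.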

Page 9

BioGateway Resources

[Figure: the resources UniProt/Swiss-Prot, OBO, GOA, CCO, NCBI Taxonomy and RO, with the label RDF]

Page 10

Data as Input: Asking Questions

• Cancer: what candidate genes are involved in cell cycle control, S-phase to G2 transition, DNA damage response and skin cancer?

• Gastrin: what genes correlate with cancer and the use of antacids, and are involved in the gastrin response, and are associated with cell cycle control?

• Inflammation: give me genes that are mentioned in the context of high carbohydrate intake, play a role in (process #1 to be named), and are within x steps of a GO term related to inflammation
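
The "within x steps" condition in the inflammation question amounts to a bounded graph traversal. A minimal Python sketch follows, assuming the ontology is available as a directed adjacency map. All term names, gene names and annotations are made up for illustration; they are not real GO identifiers or BioGateway data.

```python
from collections import deque


def terms_within(graph, start, max_steps):
    """Breadth-first search: terms reachable from start in at most max_steps edges."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        if distance[term] == max_steps:
            continue  # do not expand beyond the step limit
        for neighbour in graph.get(term, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[term] + 1
                queue.append(neighbour)
    return set(distance)


# Toy, made-up data (hypothetical terms, genes and literature hits).
ONTOLOGY = {
    "inflammation": ["term_a", "term_b"],
    "term_a": ["term_c"],
    "term_c": ["term_d"],
}
ANNOTATIONS = {"gene1": {"term_b"}, "gene2": {"term_c"}, "gene3": {"term_d"}}
LITERATURE_HITS = {"gene1", "gene3"}  # genes mentioned with high carbohydrate intake


def candidate_genes(x):
    """Genes with a literature hit and an annotation within x steps of 'inflammation'."""
    nearby = terms_within(ONTOLOGY, "inflammation", x)
    return {
        gene
        for gene, terms in ANNOTATIONS.items()
        if terms & nearby and gene in LITERATURE_HITS
    }
```

In a triple store the same question would more likely be expressed as a query over the RDF graph; the in-memory version only shows the shape of the computation.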

Page 11

Processes: Genotype to Pathway

Created by Paul Fisher

Page 12

Pathway to Phenotype

Created by Paul Fisher

Page 13

Processes: Finding and Curating Services

http://www.biocatalogue.org

Page 14

Finding, curating and reusing workflows

Page 15

Data & Processes: Hypotheses

• Run workflow

• Make new data to put in repository

• Also generate hypotheses

• Generate plan from hypothesis

• Execute plan and make more data

• Automated?
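
The loop described by these bullets can be sketched in Python as below. This is only an outline under stated assumptions: the function discovery_loop and the four callables it takes are hypothetical stand-ins, not part of any tool named in the talk, and the toy lambdas exist only to make the sketch runnable.

```python
def discovery_loop(repository, run_workflow, generate_hypotheses,
                   make_plan, execute_plan, rounds=1):
    """Run workflow -> store data -> form hypotheses -> plan -> execute -> store data."""
    repo = list(repository)  # work on a copy so the caller's data is untouched
    for _ in range(rounds):
        repo.extend(run_workflow(repo))              # run workflow, store new data
        for hypothesis in generate_hypotheses(repo): # also generate hypotheses
            plan = make_plan(hypothesis)             # generate plan from hypothesis
            repo.extend(execute_plan(plan))          # execute plan, make more data
    return repo


# Toy stand-ins (hypothetical): each step just produces labelled strings.
result = discovery_loop(
    ["seed"],
    run_workflow=lambda repo: [f"result{len(repo)}"],
    generate_hypotheses=lambda repo: [f"H({repo[-1]})"],
    make_plan=lambda hypothesis: f"test {hypothesis}",
    execute_plan=lambda plan: [f"data from {plan}"],
)
```

Whether each callable is a person, a workflow engine or a robot is exactly the "Automated?" question the slide raises.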

Page 16

The Robot Scientist

Page 17

Big Picture

[Figure labels: People, Workflows, Knowledge, myExperiment, BioGateway, Robot Scientist, BioCatalogue]