
Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist

King et al, Nature 2004 427:247-252

Presented by Monica C. Sleumer

February 5, 2004


Scientific Discovery

“Branch of AI devoted to developing algorithms for acquiring scientific knowledge”

Current applications:
– Analysis of mass-spec data
– Discovering structure-activity relationships for compounds
– Making semantic connections in published literature
– Predicting mechanisms for chemical reactions
– Revising taxonomies to accommodate new data

Connect to laboratory instrumentation


Accomplishment

Automated entire scientific process

Robotic system that uses AI to “carry out cycles of scientific experimentation”:
– Originates hypotheses
– Designs experiments
– Performs the experiments
– Interprets the results
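The four-step cycle listed above can be illustrated with a minimal sketch. Everything in it is a toy assumption made for illustration: the hypothesis labels, the medium names, the prediction table, and the `perform` function standing in for the laboratory robot are all hypothetical and are not taken from the paper.

```python
def run_cycle(hypotheses, experiments, predict, perform):
    """Repeat design -> perform -> interpret until one hypothesis is left.

    hypotheses: iterable of candidate labels (the originated hypotheses)
    experiments: list of experiment labels, considered in order
    predict(h, e): outcome that hypothesis h expects for experiment e
    perform(e): observed outcome (here supplied by a simulated lab)
    """
    remaining = set(hypotheses)
    log = []
    for experiment in experiments:
        if len(remaining) <= 1:
            break
        # Design: skip experiments on which all surviving hypotheses agree.
        if len({predict(h, experiment) for h in remaining}) < 2:
            continue
        # Perform the experiment.
        outcome = perform(experiment)
        # Interpret: keep only hypotheses that predicted the observed outcome.
        remaining = {h for h in remaining if predict(h, experiment) == outcome}
        log.append((experiment, outcome))
    return remaining, log


# Hypothetical prediction table: does the mutant grow on each medium?
PREDICTIONS = {
    ("E1", "m1"): False, ("E2", "m1"): False, ("E3", "m1"): False,
    ("E1", "m2"): True,  ("E2", "m2"): True,  ("E3", "m2"): False,
    ("E1", "m3"): True,  ("E2", "m3"): False, ("E3", "m3"): False,
}
TRUE_HYPOTHESIS = "E2"  # assumption: the simulated lab behaves as if E2 is missing


def predict(hypothesis, experiment):
    return PREDICTIONS[(hypothesis, experiment)]


def perform(experiment):
    return PREDICTIONS[(TRUE_HYPOTHESIS, experiment)]


remaining, log = run_cycle(["E1", "E2", "E3"], ["m1", "m2", "m3"], predict, perform)
print(remaining, log)
```

In this toy run the first medium is never tested because every hypothesis predicts the same outcome for it, so it cannot separate them.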


Application: Functional genomics

Function unknown for 30% of yeast genes

Complete laboratory automation possible

Goal: connect genes to their function

Using:
– Logical model of aromatic amino acid synthesis pathway
– 8 deletion mutants
– 9 metabolites
– Auxotrophic growth experiments


Aromatic Amino Acid Pathway


Classical vs Robot Science

Classical method:
– Scientific expertise and imagination used to form hypotheses
– Consequences of hypotheses tested by experiment

Robot Scientist:
– Hypotheses formed by abduction
– Tested by deduction


Deduction and Abduction

Deduction
– Rule: P → Q, Fact: ~Q, Infer: ~P
– E.g. If a cell grows on minimal medium, then it can synthesise tryptophan
– Fact: Cell cannot synthesise tryptophan
– ∴ Cell cannot grow on minimal medium

Abduction
– Rule: P → Q, Fact: ~P, Hypothesize: ~Q
– E.g. If a cell grows on minimal medium, then it can synthesise tryptophan
– Fact: Cell cannot grow on minimal medium
– ∴ Cell cannot synthesise tryptophan
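The two inference patterns above can be written out as a small sketch. The `Rule` class and the proposition names are hypothetical and chosen only to mirror the tryptophan example. The point is that deduction returns a certain conclusion, while abduction returns only a candidate explanation that still has to be tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Implication P -> Q over named propositions."""
    antecedent: str  # P
    consequent: str  # Q


def deduce(rule, known_false):
    """Deduction (modus tollens): from P -> Q and ~Q, conclude ~P."""
    if rule.consequent in known_false:
        return rule.antecedent
    return None


def abduce(rule, known_false):
    """Abduction as stated on the slide: from P -> Q and ~P, propose ~Q.

    The returned proposition is a hypothesis, not a certain conclusion.
    """
    if rule.antecedent in known_false:
        return rule.consequent
    return None


rule = Rule("grows_on_minimal_medium", "can_synthesise_tryptophan")

# Deduction: the cell cannot synthesise tryptophan, so it cannot grow.
certainly_false = deduce(rule, {"can_synthesise_tryptophan"})

# Abduction: the cell does not grow, so perhaps it cannot synthesise tryptophan.
possibly_false = abduce(rule, {"grows_on_minimal_medium"})

print("deduced false:", certainly_false)
print("hypothesised false:", possibly_false)
```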


Implementation

Software:
– Background knowledge
– Logical inference engine
– Hypothesis generation code
– Experiment selection code
– LIMS code

Hardware:
– Liquid-handling robot
– Plate reader
– CPU to do the scientific reasoning

No human intellectual input into:
– Experimental design
– Data interpretation


Robot Scientist


Logical Process

Prolog used to model data

Metabolic pathway represented as a directed graph

Deduction: a knockout mutant will grow IFF a path can be found from the given metabolites to the 3 needed amino acids.

Abduction: if a knockout mutant doesn’t grow using the given metabolites: hypothesize which enzyme is missing
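The graph-based deduction and abduction described above can be sketched as follows. The original system used Prolog; this sketch uses Python and a made-up four-enzyme pathway (`E1` to `E4`, metabolites `A`, `B`, `C`, targets `X` and `Y`) that is an assumption for illustration, not the real aromatic amino acid pathway.

```python
# Toy pathway (hypothetical): each reaction is (enzyme, substrates, products).
REACTIONS = [
    ("E1", {"A"}, {"B"}),
    ("E2", {"B"}, {"C"}),
    ("E3", {"C"}, {"X"}),
    ("E4", {"C"}, {"Y"}),
]
TARGETS = {"X", "Y"}   # stand-ins for the end products the cell needs
BASE_MEDIUM = {"A"}    # metabolites supplied in every experiment


def reachable(available, missing_enzyme=None):
    """Forward-chain over reactions until no new metabolite appears."""
    found = set(available)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, products in REACTIONS:
            if enzyme == missing_enzyme:
                continue  # the knocked-out enzyme catalyses nothing
            if substrates <= found and not products <= found:
                found |= products
                changed = True
    return found


def predicts_growth(missing_enzyme, supplements=frozenset()):
    """Deduction: growth iff every target is reachable from the medium."""
    medium = BASE_MEDIUM | set(supplements)
    return TARGETS <= reachable(medium, missing_enzyme)


def abduce_missing_enzyme(observations):
    """Abduction: enzymes whose loss is consistent with all observations.

    observations: list of (supplements, grew) pairs for a single mutant.
    """
    enzymes = [enzyme for enzyme, _, _ in REACTIONS]
    return [
        enzyme for enzyme in enzymes
        if all(predicts_growth(enzyme, s) == grew for s, grew in observations)
    ]


one = abduce_missing_enzyme([(set(), False)])
two = abduce_missing_enzyme([(set(), False), ({"C"}, True)])
three = abduce_missing_enzyme([(set(), False), ({"C"}, True), ({"B"}, True)])
print(one, two, three)
```

In this toy model a single failed growth experiment leaves every enzyme as a candidate. Each supplemented-medium result then narrows the set, which is the role auxotrophic growth experiments play in the talk.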


Machine Learning

Improves performance based on prior experience

Each hypothesis has
– Cost of testing
– Probability of being correct

Goals
– Find out which gene goes with which enzyme
– Use the fewest possible resources


Experiment Choosing

3 ways:
– Intelligent: “ASE”
– Cheapest Experiment: Naïve
– Random Experiment

Performance:
– Accuracy: # of correct predictions made
– Cost and number of experiments required

Both real experiments and simulations

Comparison to human
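To make the contrast between the three strategies concrete, here is a rough sketch. The random and cheapest choosers follow the slide directly. The third, `choose_cost_aware`, is only a stand-in heuristic (expected probability mass eliminated per unit cost) and should not be read as the ASE procedure, which the slides do not spell out. The `Experiment` class, the priors, and the costs are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    cost: float                 # must be positive
    predicts_growth: frozenset  # hypotheses that predict growth here


def choose_random(experiments, priors, rng):
    """Random strategy: ignore both cost and hypothesis probabilities."""
    return rng.choice(experiments)


def choose_cheapest(experiments, priors, rng=None):
    """Naive strategy: always run the cheapest experiment."""
    return min(experiments, key=lambda e: e.cost)


def expected_mass_eliminated(experiment, priors):
    """Expected probability mass ruled out by one outcome.

    If the hypotheses predicting growth have total mass p, observing growth
    (probability p) eliminates 1 - p, and observing no growth (probability
    1 - p) eliminates p, so the expectation is 2 * p * (1 - p).
    """
    p = sum(prob for h, prob in priors.items() if h in experiment.predicts_growth)
    return 2 * p * (1 - p)


def choose_cost_aware(experiments, priors, rng=None):
    """Stand-in heuristic: most expected elimination per unit cost."""
    return max(experiments,
               key=lambda e: expected_mass_eliminated(e, priors) / e.cost)


priors = {"H1": 0.5, "H2": 0.25, "H3": 0.25}
experiments = [
    Experiment("cheap", 1.0, frozenset({"H1", "H2", "H3"})),  # uninformative
    Experiment("split", 2.0, frozenset({"H1"})),
    Experiment("pricey", 10.0, frozenset({"H2"})),
]

print(choose_cheapest(experiments, priors).name)
print(choose_cost_aware(experiments, priors).name)
print(choose_random(experiments, priors, random.Random(0)).name)
```

In this toy setup the cheapest experiment cannot distinguish any hypotheses, which is one way a cost-only strategy can waste resources.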


Accuracy of the Experiment Choosers

[Figure: plots comparing the ASE, Naive, and Random experiment choosers]


Results of Computer Simulations

[Figure: simulation results for ASE, Naive, and Random, with panels labelled "No noise" and "Noise"]


Conclusions

Scientific process can be automated

Experiment selection strategies have significant impact on cost

ASE outperforms, in terms of cost:
– Naïve by 3 fold
– Random by 100 fold

Performance is competitive with human

Cost-effectiveness of science can be improved


Future Work

Extend system to uncover function of other metabolic genes

Would need to:
– Extend model to entire biochemical pathway in KEGG
– Become more robust in terms of possible errors in KEGG
– Include prediction of previously unknown enzymes


Criticisms

De-emphasis on how little of the pathway was actually tested

Not clear how deletion mutants were chosen

No example of experiment cycle

Too large of a jump from theory to results

Results graphs too crowded


Discussion Questions

Would computer-generated experiments and results be accepted?

How much would we have to understand about a computer-generated discovery process?

Compare this system to currently common method of:
– Large-scale generation of data
– Extraction of knowledge by data-mining systems

What other aspects of genome analysis could scientific discovery be applied to?