Page 1: A Process Ontology for Cell Biology

Stuart Aitken

Artificial Intelligence Applications Institute, Centre for Intelligent Systems and their Applications

Page 2: Outline

• Rapid Knowledge Formation (RKF) Project
  – RKF Project goals and domain
  – The Cyc knowledge-based system
  – RKF Tools

• Process Ontology
  – General approach
  – Formalisation
  – Example

Page 3: Rapid Knowledge Formation

• The RKF project aims to develop tools which will allow domain experts to enter knowledge directly into the KBS.

• DARPA-funded, with two teams:
  – CYCORP
  – SRI

• Organised around ‘Challenge Problems’
  – Cell Biology

Page 4: RKF

Aim: To enable biologists to construct an ontology/KB from a textbook source

[Figure: a textbook source (Alberts et al, Essential Cell Biology, 1998) is formalised into an Ontology.]

Page 5: Rapid Knowledge Formation

Key techniques:

• The KBS has knowledge of the knowledge acquisition (KA) process
  – Knowledge of salience
  – Knowledge of the requirements of an adequate formalisation

• There is a dialogue between expert and system, which clarifies the concept being defined.

Page 6: Rapid Knowledge Formation

Evaluation:

After a period of tool development:
• trials are organised,
• both expert performance and knowledge engineer (KE) performance are measured,
• and the results are assessed independently.

The evaluation is extensive, running over a period of 2 weeks.

Page 7: The Cyc KBS

• Cyc (Doug Lenat) is a knowledge-based system, under development since ~1984, aiming to represent common sense knowledge.

• Cyc uses a large upper-level ontology

• Uses a logical language based on first-order logic

Page 8: The Cyc KBS

Concepts in the Upper Ontology:
– Thing, Agent, Event
– TangibleThing, InformationBearingObject
– … Dog, Book
– subclass (genls), instance-of (isa)
– parts, subevent, role predicates
– 1600 concepts in total in the public release (1998), a small percentage of Cyc

Classification:
– Stuff-like vs Object-like
– Individual vs Set
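As a minimal sketch of how the subclass (genls) and instance-of (isa) relations interact, the Python below treats genls as transitive and lets isa inherit along it. The particular links, the individual `Rover`, and the helper names `all_genls` and `is_instance` are illustrative assumptions, not the actual Cyc hierarchy or API.

```python
# Hypothetical toy hierarchy: these links are assumptions for illustration,
# not the real Cyc upper ontology.
GENLS = {  # class -> direct superclasses
    "Event": {"Thing"},
    "TangibleThing": {"Thing"},
    "InformationBearingObject": {"Thing"},
    "Dog": {"TangibleThing"},
    "Book": {"TangibleThing", "InformationBearingObject"},
}
ISA = {"Rover": {"Dog"}}  # individual -> direct classes (hypothetical)


def all_genls(cls):
    """Return cls together with every superclass reachable through genls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def is_instance(individual, cls):
    """isa holds if cls is one of the individual's classes or a superclass of one."""
    return any(cls in all_genls(direct) for direct in ISA.get(individual, ()))


print(is_instance("Rover", "Thing"))
print(is_instance("Rover", "Event"))
```

Keeping only the direct links and computing the closure on demand mirrors the idea that a new concept needs to be placed once in the hierarchy and then inherits from everything above it.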

Page 9: The Cyc KBS

• The upper-ontology supports application development:

[Figure: ontology layers labelled Upper-level, Intermediate-level and Application-level, with Thing as a node label.]

Page 10: The Cyc KBS

Cyc includes:
• An inference engine
• A GUI
• Tools for ontology development
• Until the RKF project, ontology development was by trained knowledge engineers, working with domain experts.

Page 11: RKF

New tools in Cyc:
• Define a new concept, and place it correctly in the ontology
• Refine a concept definition
• Define a new predicate
• Assert a new fact
• Define a new rule
• State an analogy
• Construct a new process

Page 12: RKF

User interaction:
• Selection of items in the interface
  – The choice is determined ‘intelligently’: the KBS has knowledge of salience and of the KA process, and this knowledge must be authored
• Browsing of the ontology
• Search
• Natural language dialogue

Page 13: Process Models

[Figure: the RNA Transcription process, shown with the events BindsTogether and Move.]

Page 14: Process Descriptor

Q: Name the process
A: [ RNA Transcription ]

Q: Select the type of Process that describes the category best
• event localised
• creation or destruction event
• …
• ‘say this:’ [ _ _ _ _ _ _ ]

Q: Define:
• affected object: [ _ _ _ _ _ ]
• location: [ _ _ _ _ _ ]
• actor: [ _ _ _ _ _ ]

Page 15: Process Models

Describing Processes:
• Complex expressions at the instance level
• Simpler to describe in terms of types

Upper-level

Intermediate-level

subevent(Event,Event)
doneBy(Event,Agent)

ForAll ?E ?F ?G
  implies(
    (subevent(?E,?G) and isa(?E,BindsTogether) and
     subevent(?F,?G) and isa(?F,Move)),
    before(startOf(?E),startOf(?F)))

Application-level?
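To make the instance-level rule concrete, here is a minimal Python sketch that checks it against a handful of event instances: whenever a BindsTogether event and a Move event are subevents of the same event, the BindsTogether event must start first. The `Event` record, its numeric `start` field and the function `ordering_violations` are assumptions made for illustration; they are not part of Cyc.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class Event:
    name: str
    type: str
    start: float                  # hypothetical start time
    parent: Optional[str] = None  # name of the event this one is a subevent of


def ordering_violations(events):
    """Return (e, f) name pairs that break the rule: a BindsTogether event e
    and a Move event f with the same parent must satisfy start(e) < start(f)."""
    binds = [e for e in events if e.type == "BindsTogether" and e.parent]
    moves = [f for f in events if f.type == "Move" and f.parent]
    return [
        (e.name, f.name)
        for e, f in product(binds, moves)
        if e.parent == f.parent and not e.start < f.start
    ]


events = [
    Event("t1", "RNATranscription", 0.0),
    Event("e1", "BindsTogether", 1.0, parent="t1"),
    Event("f1", "Move", 2.0, parent="t1"),
    Event("f2", "Move", 0.5, parent="t1"),  # starts too early
]
print(ordering_violations(events))  # [('e1', 'f2')]
```

The quantification over every pair of instances is what makes the instance-level expression heavy, which is the motivation for stating the same constraint once at the type level.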

Page 16: Script Vocabulary

The Script theory defines the semantics of Type-Level assertions

(typePlaysRoleInScene RNATranscription DNAMolecule BindsTogether objectActedOn)

• Requires rules for identity
  – Can require complex reasoning
• Good for user input
• Can be extended to cover pre- and postconditions of actions

Page 17: Scripts

[Figure: an RNA Transcription event t with subevents e (of type BindsTogether) and f (of type Move), related by startsAfterStartingOfInScript.]

For all subevents f of t of type Move, and all subevents e of t of type BindsTogether, (startsAfterStartingOf f e), where t is of type RNATranscription.

Page 18: Scripts

Type playing role

Types: BindsTogether, Nucleotide (role: objectActedOn)

Instance: e is an instance of BindsTogether and N is the set of Nucleotide instances.

For some n in N, (objectActedOn e n)
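One way to read the type-level role assertion is as an existential check over instances: each BindsTogether subevent of an RNATranscription event should have some Nucleotide as its objectActedOn. The Python below is a sketch of that reading only. The tuple encoding of facts, the sample instance names, the pairing of RNATranscription with Nucleotide, and the function `unsatisfied_scenes` are all assumptions, not Cyc's own semantics or API.

```python
def unsatisfied_scenes(assertion, isa, subevents, role_facts):
    """Return the scene instances that lack a filler of the required type.

    assertion  -- (script_type, filler_type, scene_type, role)
    isa        -- dict mapping instance name to its type
    subevents  -- list of (parent_event, child_event) pairs
    role_facts -- list of (role, event, object) triples
    """
    script_type, filler_type, scene_type, role = assertion
    missing = []
    for parent, child in subevents:
        if isa.get(parent) == script_type and isa.get(child) == scene_type:
            has_filler = any(
                r == role and ev == child and isa.get(obj) == filler_type
                for r, ev, obj in role_facts
            )
            if not has_filler:
                missing.append(child)
    return missing


# Hypothetical instances for illustration.
isa = {"t": "RNATranscription", "e": "BindsTogether",
       "e2": "BindsTogether", "n1": "Nucleotide"}
subevents = [("t", "e"), ("t", "e2")]
role_facts = [("objectActedOn", "e", "n1")]
assertion = ("RNATranscription", "Nucleotide", "BindsTogether", "objectActedOn")

print(unsatisfied_scenes(assertion, isa, subevents, role_facts))  # ['e2']
```

Here `e` is satisfied by `n1`, while `e2` has no Nucleotide acted on, so it is reported.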

Page 19: New Script Vocabulary

• Pre- and postconditions

[Figure: BindsTogether shown with N and R before the event (not touchingDirectly) and after it (connectedTo).]

(preconditionOfScene-negated BindsTogether touchingDirectly <Ribonucleotide Nucleotide>)

(postconditionOfScene BindsTogether connectedTo <Ribonucleotide Nucleotide>)

Page 20: New Script Vocabulary

[Figure: the types BindsTogether, Nucleotide and Ribonucleotide joined by role links; the sets of instances N and R; the event instance e; and an ‘identity’ label.]

Precondition: Some ?n in N, some ?r in R (not (touchingDirectly ?n ?r))

Postcondition: Some ?n in N, some ?r in R (connectedTo ?n ?r)
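The precondition and postcondition can be sketched as checks over a state before and a state after a BindsTogether event. The Python below is a minimal illustration under stated assumptions: states are sets of ground facts, negation is read as absence from the state (a closed-world assumption), and the ‘identity’ requirement is read as meaning that the same pair (n, r) must satisfy both conditions. The function name and sample instances are hypothetical.

```python
def bound_pairs(before, after, N, R):
    """Return pairs (n, r) that were not touchingDirectly before the event
    and are connectedTo after it. Requiring the same pair in both states is
    an assumed reading of the 'identity' requirement."""
    return [
        (n, r)
        for n in N
        for r in R
        if ("touchingDirectly", n, r) not in before
        and ("connectedTo", n, r) in after
    ]


# Hypothetical instances: two nucleotides and one ribonucleotide.
N = ["n1", "n2"]
R = ["r1"]
before = {("touchingDirectly", "n2", "r1")}
after = before | {("connectedTo", "n1", "r1")}

print(bound_pairs(before, after, N, R))  # [('n1', 'r1')]
```

Without the identity constraint, the two existential statements could be satisfied by unrelated pairs, which is why rules for identity are needed.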

Page 21: Script Vocabulary

• The Script vocabulary forms an ‘intermediate level’, which lies behind the Process descriptor GUI (i.e. the textboxes)

• Not, in itself, a taxonomy of processes, but allows processes to be described in detail.

• Defining the subclass relation is just one task.

Page 22: Vaccinia Virus Life Cycle

• The vaccinia virus life cycle was selected as an example of a complex model to formalise as a set of Scripts.

• The model includes actions, decomposition, ordering, objects-playing-roles and pre/postconditions

• It is a good test for the Script vocabulary

Page 23: Vaccinia Virus Life Cycle

[Figure: three views of the steps mRNATranscription-Early, ViralGeneTranslation-Early and MovementOfProtein.]

• Temporal: mRNATranscription-Early, ViralGeneTranslation-Early, MovementOfProtein
• Participants:
  – Outputs: messengerRNA
  – Inputs: messengerRNA
• Conditions:
  – Pre: spatiallySubsumes Cell VirusCore
  – Post: spatiallySubsumes CellCytoplasm Vitf2
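A script that records temporal order and participants can be checked for simple data-flow consistency: each step's inputs should have been produced by an earlier step. The Python below is a sketch only. Attaching the messengerRNA output to mRNATranscription-Early and the messengerRNA input to ViralGeneTranslation-Early is an assumption about how the slide's labels pair with the steps, and `missing_inputs` is a hypothetical helper.

```python
# Steps in temporal order. The input/output assignment is an assumption.
STEPS = [
    {"name": "mRNATranscription-Early", "inputs": set(), "outputs": {"messengerRNA"}},
    {"name": "ViralGeneTranslation-Early", "inputs": {"messengerRNA"}, "outputs": set()},
    {"name": "MovementOfProtein", "inputs": set(), "outputs": set()},
]


def missing_inputs(steps, initially_available=frozenset()):
    """Walk the steps in order and report (step name, missing inputs) for any
    step whose inputs have not been produced by an earlier step."""
    available = set(initially_available)
    problems = []
    for step in steps:
        lacking = step["inputs"] - available
        if lacking:
            problems.append((step["name"], sorted(lacking)))
        available |= step["outputs"]
    return problems


print(missing_inputs(STEPS))                  # []
print(missing_inputs(list(reversed(STEPS))))  # translation now lacks messengerRNA
```

Reversing the order makes translation run before transcription, so the check reports that messengerRNA is not yet available.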

Page 24: Evaluation

• 8 biologists were selected, and trained in the tools, 4 per team

• The knowledge to be formalised was selected (chapter 7 in Alberts)

• The knowledge base was allowed to contain ‘pump-priming’ knowledge

• The biologists entered knowledge using the tools, then tested it against a set of questions

• Ontology/KB was revised

Page 25: Evaluation

Results (outline):
• A huge amount of data was collected, but analysis is complex (IET Inc)
• Domain experts were able to develop ontologies after ‘light’ training
• Knowledge engineers out-perform domain experts in ontology construction

Page 26: Summary

‘Power Tools’ for ontology development are being implemented and tested in the RKF project.

• A Script/Process vocabulary has been developed and applied to processes in cell biology, covering:
  – Temporal order
  – Participants
  – Pre/postconditions
  – Repetition