Ontology-Based Computing

Kenneth Baclawski

Northeastern University and Jarg

The Onslaught

Increasingly large amounts of information are becoming accessible electronically.

The information sources are increasingly complicated.

The diversity of types of information source is also increasing.

Technologies are emerging to cope with this onslaught: ontology-based computing.

Ontologies

Shared understanding within a community of people

Declarative specification of entities and their relationships with each other

Constraints and rules that permit reasoning within the ontology

Behavior associated with stated or inferred facts

Relational Database Schemas

Well-established technique for specifying the structure of shared data, not for communication between people or agents

Declarative specification, but of tables, not of entities and relationships

Some constraints are expressible, but no significant rules (such as inheritance)

No explicit behavior

Standard language is SQL.

Object-Oriented Schemas

Emerging technology for communication between software components

Declarative specifications

Constraints and some rules

Several ways to specify behavior

The Unified Modeling Language (UML) is the standard OO modeling language.

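[Figure: UML class diagram. Classes: Enzyme (sequence : string), Chemical (name : string, formula : string, weight : number), Reaction (description : string), Pathway (name : string). Association labels: catalyzed by, input, output, consists of. The multiplicities (0..1, 1..1, 0..*, 1..*, 2..*) cannot be assigned to specific association ends from the extracted text.]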
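One way to read the class diagram on this slide is as a set of plain data classes. The following is a minimal sketch in Python, not part of the original slides. The class and attribute names come from the diagram. The association ends are assumptions (a Reaction is catalyzed by an Enzyme, has input and output Chemicals, and a Pathway consists of Reactions), and the multiplicities are not enforced because the extracted text does not show which multiplicity belongs to which end. The sample instances are hypothetical and simplified.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chemical:
    name: str
    formula: str
    weight: float


@dataclass
class Enzyme:
    sequence: str


@dataclass
class Reaction:
    description: str
    # Assumed association ends; the diagram's multiplicities are not enforced.
    inputs: List[Chemical] = field(default_factory=list)
    outputs: List[Chemical] = field(default_factory=list)
    catalyzed_by: Optional[Enzyme] = None


@dataclass
class Pathway:
    name: str
    consists_of: List[Reaction] = field(default_factory=list)


# Hypothetical content for illustration only (cofactors such as ATP omitted).
glucose = Chemical("glucose", "C6H12O6", 180.16)
g6p = Chemical("glucose 6-phosphate", "C6H13O9P", 260.14)
hexokinase = Enzyme(sequence="...")  # sequence deliberately elided
step = Reaction("phosphorylation of glucose", [glucose], [g6p], hexokinase)
glycolysis = Pathway("glycolysis", [step])
```

Plain classes like these capture the declarative structure only. Constraints such as multiplicities and any inference rules would need extra code, which is the gap the later slides on reasoning address.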

Logic

Very expressive but very difficult to use. Not designed for communication.

Most logical languages are not based on entities and relationships.

Very powerful inferencing capabilities.

Do not usually have any associated behavior.

Many examples: Prolog, KIF, Slang, ...

XML DTDs and XML Schema

Defines a hierarchical document type. XML Schema defines data types. Designed for communication over the Web.

Good support for entities and hierarchical relationships; awkward for others.

Constraints can be imposed on the hierarchical structure and on data types.

Behavior can be specified procedurally.

Knowledge Representations

Very well developed branch of AI. Many tools, but mostly academic. Not yet used for communication over the Web.

Powerful language for specifying entities and their relationships.

Most are linked with inference engines.

Behavior is typically handled in an ad hoc manner.

RDF and DAML

The Resource Description Framework (RDF) is a knowledge representation language serialized in XML. It is a World Wide Web Consortium (W3C) Recommendation.

The DARPA Agent Markup Language (DAML) is an extension of RDF to serve as the basis for ontology-based computing over the Web: the Semantic Web.

Ontological Reasoning in RDF

[Figure: RDF graph. Nodes: Class, Property, Person, Fish, owns, Wanda, Wendy. Edge labels: type, domain, range, owns.]

Type constraint violation: The range of owns is Fish.

OR

There is no inconsistency: Wanda is a fish!

Mermaid?
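The two readings on this slide can be sketched in a few lines of Python. This is a minimal illustration, not an RDF library and not the author's implementation. The instance triples are an assumption, since the extracted figure does not show which node each edge connects: here Wendy and Wanda are both typed as Person, Wendy owns Wanda, and owns has domain Person and range Fish. The function names `check_ranges` and `infer_ranges` are hypothetical.

```python
# Triples are (subject, predicate, object). The instance data is an assumption.
triples = {
    ("owns", "type", "Property"),
    ("owns", "domain", "Person"),
    ("owns", "range", "Fish"),
    ("Wendy", "type", "Person"),
    ("Wanda", "type", "Person"),
    ("Wendy", "owns", "Wanda"),
}


def ranges(graph):
    """Map each property to the set of classes declared as its range."""
    result = {}
    for s, p, o in graph:
        if p == "range":
            result.setdefault(s, set()).add(o)
    return result


def check_ranges(graph):
    """Constraint reading: report objects not already typed with the range."""
    prop_ranges = ranges(graph)
    violations = []
    for s, p, o in sorted(graph):
        for cls in sorted(prop_ranges.get(p, ())):
            if (o, "type", cls) not in graph:
                violations.append((s, p, o, cls))
    return violations


def infer_ranges(graph):
    """Inference reading: add the type facts that range declarations imply."""
    prop_ranges = ranges(graph)
    inferred = set(graph)
    for s, p, o in graph:
        for cls in prop_ranges.get(p, ()):
            inferred.add((o, "type", cls))
    return inferred


violations = check_ranges(triples)
closure = infer_ranges(triples)
wanda_types = sorted(o for s, p, o in closure if s == "Wanda" and p == "type")
```

Under the constraint reading, `violations` flags the owns statement because Wanda is not known to be a Fish. Under the inference reading, the same statement adds Fish to Wanda's types, so she is both a Person and a Fish, which is the "Mermaid?" on the slide.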

[Figure: graph labeled DAML. Nodes: Class, Property, Restriction, College, Student, majors, Arts & Sciences, Engineering, George. Edge labels: type, domain, range, subClassOf, onProperty, maxCardinality (value 1), majors, equivalentTo.]

Cardinality constraint violation: George can’t have two majors

OR

There is no inconsistency: Engineering = Arts & Sciences
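The cardinality example admits the same two readings, sketched below in Python. This is a hedged illustration, not DAML tooling. The instance data is an assumption read off the slide text: George has two stated values for majors, and a restriction limits majors to at most one value. The helper names `check_cardinality` and `infer_equalities` are hypothetical.

```python
from itertools import combinations

# Assumed instance data: George has two stated values for `majors`.
majors = {"George": ["Engineering", "Arts & Sciences"]}
MAX_CARDINALITY = 1


def check_cardinality(values, limit):
    """Constraint reading: distinct names are distinct things, so this fails."""
    return {s: v for s, v in values.items() if len(set(v)) > limit}


def infer_equalities(values, limit):
    """Inference reading for limit 1: all values must denote the same thing."""
    if limit != 1:
        raise ValueError("this sketch only handles a maximum cardinality of 1")
    equalities = set()
    for vals in values.values():
        for a, b in combinations(sorted(set(vals)), 2):
            equalities.add((a, b))
    return equalities


violations = check_cardinality(majors, MAX_CARDINALITY)
equalities = infer_equalities(majors, MAX_CARDINALITY)
```

The first function treats differently named colleges as different individuals and reports George as a violation. The second drops that assumption and concludes that the two names must be equivalent, which matches the equivalentTo edge in the figure.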

Representing information

Relational database: records

OO database: objects and links

Logic: facts

XML: documents

Knowledge Representations: annotations

All of these are graph structures: entities related to other entities by relationships.

Where is the meaning?

Databases: select-project-join queries

Logic: rules determined by unification

XML: XSLT patterns

Knowledge Representations: templates

All of these are forms of graph matching. The units of meaning are small connected subgraphs that I call motifs.
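Graph matching against a motif can be illustrated with a small backtracking matcher. The sketch below is an assumption-laden toy, not the author's algorithm: a motif is written as a list of triples in which terms starting with "?" are variables, the graph is a set of ground triples, and the node and edge names are hypothetical.

```python
def match(pattern, graph, binding=None):
    """Yield variable bindings under which every pattern triple is in graph."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in sorted(graph):
        new_binding = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check consistency with an earlier binding.
                if new_binding.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, graph, new_binding)


# Hypothetical data: two reactions with their enzymes and inputs.
graph = {
    ("r1", "catalyzedBy", "e1"),
    ("r1", "input", "glucose"),
    ("r1", "output", "g6p"),
    ("r2", "catalyzedBy", "e2"),
    ("r2", "input", "pyruvate"),
}

# Motif: a reaction, its enzyme, and glucose as one of its inputs.
motif = [("?r", "catalyzedBy", "?e"), ("?r", "input", "glucose")]
results = list(match(motif, graph))
```

The shared variable `?r` plays the role of a join condition in a select-project-join query and of unification in a logic rule, which is the sense in which these mechanisms are all forms of graph matching.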

Ontology Infrastructure

Simply introducing a language is not enough. There must be an infrastructure to support ontology-based computing, including:

Ontology development tools

Content creation systems

Storage and retrieval systems

Ontology reasoning, mediation, ...

Integration with applications

Ontology Development

Ontologies can be developed using graphical tools specifically for ontologies or by adapting existing tools such as CASE tools.

Testing ontologies is not easy because they include constraints and inference rules.

Ontology testing is analogous to type checking in programming languages.

Content Creation

Databases: Data warehousing technology

Text: Natural Language Processing (NLP)

Image processing

Direct creation of content

No matter how the content is created, it must be tested using consistency checking.

Storage and Retrieval

Scaling up will require high-performance, distributed storage and indexing technology.

The natural units for indexing are the motifs (precomputed joins), but the number of motifs is large.

Jarg Corporation has developed a scalable, high-performance indexing technology for ontology-based knowledge representations.
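To make the idea of indexing by motif concrete, here is a toy inverted index in Python. It is only a sketch under stated assumptions and says nothing about how the Jarg technology works: a motif is taken to be either a single edge or a pair of edges sharing a node (a precomputed join), documents are hypothetical sets of ground triples, and ranking is a simple count of shared motifs.

```python
from collections import defaultdict
from itertools import combinations


def motifs(triples):
    """Single edges plus pairs of edges that share a node (precomputed joins)."""
    edges = sorted(set(triples))
    result = {(e,) for e in edges}
    for a, b in combinations(edges, 2):
        if {a[0], a[2]} & {b[0], b[2]}:
            result.add((a, b))
    return result


def build_index(documents):
    """Map every motif to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, triples in documents.items():
        for m in motifs(triples):
            index[m].add(doc_id)
    return index


def search(index, query_triples):
    """Rank documents by the number of motifs they share with the query."""
    scores = defaultdict(int)
    for m in motifs(query_triples):
        for doc_id in index.get(m, ()):
            scores[doc_id] += 1
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical documents, already converted to triples.
documents = {
    "doc1": [("hexokinase", "catalyzes", "r1"), ("r1", "input", "glucose")],
    "doc2": [("r1", "input", "glucose")],
    "doc3": [("hexokinase", "catalyzes", "r9")],
}
index = build_index(documents)
query = [("hexokinase", "catalyzes", "r1"), ("r1", "input", "glucose")]
ranking = search(index, query)
```

Even in this toy the number of two-edge motifs grows roughly quadratically with the number of edges per document, which hints at why the slide notes that the number of motifs is large and why a distributed index is needed at scale.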

Jarg Architecture

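[Figure: architecture diagram with two flows through a Distributed Index Engine.]

Document -> NLP -> Knowledge Representation -> fragmentation -> Knowledge Fragments -> Distributed Index Engine

Query -> NLP -> Knowledge Representation -> fragmentation -> Knowledge Motifs -> Distributed Index Engine -> Matching Documents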

Conclusion

Ontology-based computing is emerging as a natural evolution of existing technologies to cope with the information onslaught.

Ontology-based technology must be scalable if it is to contribute to the solution rather than add to the problem.

Consistency checking is important for the development of ontologies and content.

Bibliography

Semantic Web: www.w3.org/2001/sw

Ontologies: www.ontology.org

Unified Modeling Language: www.omg.org/uml

Knowledge Interchange Format: logic.stanford.edu/kif

Specware and Slang: www.kestrel.edu

XML and XML Schema: www.w3.org/xml

RDF and RDFS: www.w3.org/rdf

DAML: www.daml.org

Notation 3: www.w3.org/DesignIssues/Notation3.html

Consistency checking: vis.home.mindspring.com

Jarg Knowledge Engine: www.jarg.com