Scalable Ontological Sense Matching

Dr. Geoffrey P Malafsky

TECHi2 LLC, Fairfax, VA

Dr. Geoffrey P Malafsky, TECHi2, CapSci 2006, ASTI

Need for Smarter Systems

Enormous and ever-increasing amounts of data and information are available

Potential exists for significant increases in efficiency, effectiveness, and success in all fields IF the data and information can be harnessed

The most common use case is information overload: too much to sift through with too little time and too few resources, support, or authority

Major Challenges

Value is subjective and context-based, i.e. not deterministic

Metrics and decision criteria are heavily dependent on conditions, information uncertainty, decision/activity timelines, vulnerability to error, risk capacity

Rules and interaction mechanisms are usually nebulous, poorly defined, non-existent, or incorrect

Information Technology approaches to this situation are immature: they scale poorly (e.g. too computationally intensive, excessive storage requirements, security constraints) and/or produce untrustworthy results

Advanced Techniques

Improve machine understanding of information by annotating meaning (e.g. Ontologies)

Compute the best scenario using domain models, built from core knowledge defined by Subject Matter Experts, coupled to probabilistic fit calculations

Extract patterns from very large scale data sets

Hire lots of people

Example: Knowledge Discovery & Dissemination (KDD)

Seeks to find knowledge for practical purposes within large scale data/information stores

Cross organizational and functional boundaries

High relevance to searcher

Uncertainty is assessed and used

Secure

Rules and domain model based

Automated

Knowledge is Not Just Information or Data

Knowledge has:

  Context: what is it about?

  Confidence: is it right?

  Relationships: what does it have to do with that?

  Priorities: what is most important?

Types

  Explicit knowledge is codified and can be manipulated

  Tacit knowledge is unspoken “know-how”

Looks just like data when in an electronic system

  It is data

  Annotations on “about” and “how” tied to intelligent application logic make it knowledge for user

KDD Knowledge Map

Analysis of scientific and technological areas of emphasis in Knowledge Discovery and Dissemination (KDD)

Gaps reveal technical vulnerabilities

[Figure: "Total and TRL Rated Citations". Bar chart of number of citations (0 to 350) by TRL band (Total, 1-3, 4-7, 8-9) for three series: Dissemination, Discovery, Knowledge.]

Most research and development is concentrated in the discovery portion of KDD

Information & Data Mining Dominates KDD Focus

[Figure: bar chart of citations (0 to 250) by KDD topic: Representation; Rules formation + analysis; Collection + Tacit capture; Lifecycle Maintenance; Fusion; Info/data mining; Models; Uncertainty mgmt + mitigation; Feature extraction + analysis; Visualization; Real-time operations; Distributed architecture; Agents; Personalization; On-demand (TPPU); Storage; Security.]

What is Missing to Make KDD Work

Knowledge-based metadata architecture

Predictive personalization algorithms

Models of knowledge lifecycle

Computational knowledge techniques

Real-time analysis with large data and high uncertainty

Conduit to feedback from end-users to capture and evolve domain knowledge

Bringing Structure to Unbounded Knowledge

Knowledge Mgmt systems have failed to meet operational requirements

Knowledge is inherently expansive and evolving

IT tends to collect and organize assets without relevance, context, ...

Level of Effort to manually collect, map, cleanse source data/information is too high

AI is not around the corner

Structured Knowledge

Applying a structured framework creates repeatable, interoperable, consistent solutions

Knowledge fidelity is maintained with combination of human and machine processable representations

Unknowns are discovered using known analogies via triangulation

Reduction of universe of possible combinations of knowledge and user needs to engineering scale solutions

Applying Quantum Mechanics to Knowledge Processing

Even the small area of this room has an infinite number of possible combinations of things (macro, micro, atomic, subatomic)

Representing this “knowledge” and these “states” with a domain model reduces the infinite possibilities to a tractable few: Schrödinger Equation ($H\Psi = E\Psi$)

Wavefunctions describe specific states (4 quantum numbers)

Connection between objects is defined by the overlap integral of their wavefunctions: $\int \Psi_0^{*} \Psi_1 \, d\tau$ = Degree of match
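As a rough illustration of the overlap idea, the sketch below treats each state as a finite vector of amplitudes and computes the normalized inner product, which is the discrete analogue of an overlap integral. The vector encoding and the function name `overlap` are assumptions made for illustration; the slides do not say how states are represented.

```python
import math


def overlap(psi0, psi1):
    """Degree of match in [0, 1] between two state vectors (assumed encoding)."""
    if len(psi0) != len(psi1):
        raise ValueError("state vectors must have the same length")
    # Inner product with the first vector conjugated, as in the overlap integral.
    dot = sum(a.conjugate() * b for a, b in zip(psi0, psi1))
    norm0 = math.sqrt(sum(abs(a) ** 2 for a in psi0))
    norm1 = math.sqrt(sum(abs(b) ** 2 for b in psi1))
    if norm0 == 0 or norm1 == 0:
        return 0.0  # an empty state matches nothing
    return abs(dot) / (norm0 * norm1)
```

Identical states give a match of 1, orthogonal states give 0, and partially shared states fall in between.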

The KORS™ Framework

Knowledge: collected using templates from SMEs

Ontologies: Conceptual models of domain knowledge

Rules: Business and technical rules are extracted and defined from domain knowledge and ontologies

Semantic metadata: knowledge, ontology relationships, and rules are connected to and represented with data and information

KORS Structured Knowledge

Knowledge is inherently expansive and evolving

KM systems have failed to meet operational requirements

IT collects & organizes assets without relevance, context, ...

Level of Effort is too high to manually collect, map, cleanse source data/information

KORS™-pending framework creates repeatable, interoperable, consistent solutions

Knowledge fidelity is maintained with combination of human and machine processable representations

Ontologies (concepts) are expressed with domain specific and standard terms

Reduction of universe of possible combinations of knowledge and concepts to engineering scale solutions

KORS Ontologies

Conventional wisdom

  Multi-tiered, fully explicit, broad coverage

  = Too large; Too difficult to maintain; Too hard to implement

KORS™ Ontologies

  Cross-domain framework, domain specific instances of classes, leverage existing ontologies, concepts defined with domain-based uncontrolled vocabulary AND common controlled vocabulary

  = Smaller; easier to maintain; supports engineering processes

Answers the broad question: “Do the concepts in this other ontology have semantic similarity to those in my ontology?”

Semantic metadata used to characterize domain ontologies

Conventional Wisdom

Upper, middle and lower ontologies.

Every concept made fully explicit

The KORS Engineering Framework

Structured knowledge capture

Identify rules and metadata structures

Incorporate within the engineered solution

Cross-Concept Overlap Calculation

Overlap integrals calculated from semantic metadata – updated when ontologies change

Overlap (S) is computed at:

  Ontology-ontology level using primary task-description pairs

  Term level using allowed and disallowed senses (not just synonyms)

Real-time determination using coarse-medium-fine concept match

  Coarse = ontology-ontology

  Medium = ontology-term

  Fine = term-term

With metadata architecture implementation, calculation uses what is available

  Inherently scalable, distributed with evolving improvements

$S_{AB} = [On(domain_A, term_A)]\,[On(domain_B, term_B)]$
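The coarse-medium-fine match can be sketched as follows. This is only an illustration under stated assumptions: the slides do not define how S is computed, so a Jaccard set overlap stands in for it, ontologies are plain dictionaries with a hypothetical `concepts` set and an optional `terms` map, and the penalty for conflicting senses is invented for the example. The function falls back to whatever level the available metadata supports.

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for the overlap S (assumption)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_overlap(term_a, term_b):
    """Term-term overlap from allowed and disallowed senses.

    Shared allowed senses raise the score; a sense allowed by one term
    but disallowed by the other counts against it (hypothetical scoring).
    """
    allowed_a, allowed_b = set(term_a["allowed"]), set(term_b["allowed"])
    disallowed_a = set(term_a.get("disallowed", ()))
    disallowed_b = set(term_b.get("disallowed", ()))
    union = allowed_a | allowed_b
    if not union:
        return 0.0
    shared = allowed_a & allowed_b
    conflicts = (allowed_a & disallowed_b) | (allowed_b & disallowed_a)
    return max(0.0, (len(shared) - len(conflicts)) / len(union))


def sense_match(onto_a, onto_b, term_a=None, term_b=None):
    """Return (level, score) at the finest level the metadata supports."""
    desc_a = onto_a.get("terms", {}).get(term_a)
    desc_b = onto_b.get("terms", {}).get(term_b)
    if desc_a and desc_b:
        return "fine", term_overlap(desc_a, desc_b)
    if desc_a or desc_b:
        # Only one side has term metadata: compare it with the other ontology.
        desc, other = (desc_a, onto_b) if desc_a else (desc_b, onto_a)
        return "medium", jaccard(desc["allowed"], other["concepts"])
    return "coarse", jaccard(onto_a["concepts"], onto_b["concepts"])
```

Because missing term metadata only lowers the resolution of the answer rather than blocking it, new ontologies can take part in matching before their term descriptions are complete.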

Semantic Variance Across Domains

For example, the term “Insurgent” means:

  Mission Planner: person who takes part in an armed rebellion against the constituted authority

  Geospatial Analyst: someone who participates in a peaceful public display of group feeling

  Diplomatic Corps: someone who participates in a public display against an established government

Semantic Challenges

Semantic Consistency

Semantic Variance Across Domains

Search and Discovery Requirements

Controlled Uncontrolled Vocabularies

  Have “local” variants: Navy fliers and Air Force pilots

  Change to match changing reality: yesterday’s friend could be tomorrow’s foe

  Change to match changes in policy (spin): today, “freedom fighter”; tomorrow, “insurgent”

KORS is Extensible, Scalable, Adaptive

Ontology level metadata describes the basic functional concepts and processes of the domain

  Can be linked to Enterprise Architecture products

  Direct conceptual match between two functional domains

Ontology term descriptions use:

  Domain specific (uncontrolled) expressions

  Allowed senses from controlled vocabularies

  Disallowed senses from controlled vocabularies

  True meaning is found from combination of domain, allowed, disallowed as is done in real language

Metadata architecture: values are used if present but not required, giving scalability and extensibility
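One way to picture a term description whose metadata is "used if present but not required" is a record with optional fields. The class and field names below (`TermDescription`, `domain_expression`, `allowed_senses`, `disallowed_senses`) are hypothetical and chosen for this sketch; they are not taken from KORS itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermDescription:
    """A term in one domain ontology; every descriptive field is optional."""
    term: str
    domain: str
    domain_expression: Optional[str] = None   # uncontrolled, domain-specific wording
    allowed_senses: frozenset = frozenset()   # senses from a controlled vocabulary
    disallowed_senses: frozenset = frozenset()


def resolve_meaning(desc):
    """Collect only the parts of the description that are actually present."""
    meaning = {}
    if desc.domain_expression:
        meaning["domain_expression"] = desc.domain_expression
    if desc.allowed_senses:
        meaning["allowed"] = sorted(desc.allowed_senses)
    if desc.disallowed_senses:
        meaning["disallowed"] = sorted(desc.disallowed_senses)
    return meaning
```

A term with no metadata yet resolves to an empty description instead of an error, so an ontology can be populated incrementally.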

Example Domain: GEOINT

Exploitation and analysis of imagery and geospatial information to describe, assess, and visually depict physical features and geographically referenced activities on the Earth.

GEOINT encompasses all the activities involved in the collection, analysis, and exploitation of spatial information in order to gain knowledge about the national security environment, and the visual depiction of that knowledge.

MyGEOINT Ontological Architecture

Functional Application

Semantic Expansion

  Cross domain commonality

  Qualified synonym identification

  >> discovery of potentially relevant knowledge

Semantic Resolution

  Allowed and disallowed alternate semantics

  Binding of dynamic domain-specific semantics to controlled vocabularies.

  >> computable semantic comparisons and knowledge relevance ranking

Cross domain commonality

Answers the broad question: “Do the concepts in this other ontology have semantic similarity to those in my ontology?”

Semantic metadata used to characterize domain ontologies

Overlap integrals calculated from semantic metadata – updated when ontologies change

Run-time ontology-to-ontology comparison is greatly simplified.
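The idea of computing overlaps ahead of time and refreshing them only when an ontology changes can be sketched as a small versioned cache. Everything here is an assumption for illustration: the class name `OverlapCache`, the version counter, and the requirement that the supplied overlap function be symmetric are not described in the slides.

```python
class OverlapCache:
    """Stores ontology-to-ontology overlaps; recomputes only after a change."""

    def __init__(self, overlap_fn):
        self._overlap_fn = overlap_fn  # assumed symmetric: f(a, b) == f(b, a)
        self._ontologies = {}          # name -> (version, semantic metadata)
        self._cache = {}               # (name, name) -> (version, version, score)

    def update(self, name, metadata):
        """Register new or changed metadata and bump the ontology's version."""
        version = self._ontologies.get(name, (0, None))[0] + 1
        self._ontologies[name] = (version, metadata)

    def overlap(self, name_a, name_b):
        """Return the stored overlap, recomputing if either side has changed."""
        key = tuple(sorted((name_a, name_b)))
        version_a, meta_a = self._ontologies[key[0]]
        version_b, meta_b = self._ontologies[key[1]]
        cached = self._cache.get(key)
        if cached is None or cached[:2] != (version_a, version_b):
            cached = (version_a, version_b, self._overlap_fn(meta_a, meta_b))
            self._cache[key] = cached
        return cached[2]
```

At run time a lookup is a dictionary access, and the cost of the overlap calculation is paid only once per ontology change.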

Qualified synonym identification

Knowledge discovery via synonym lists is well supported by standardized lexical tools (WordNet, etc)

Broad-domain perspective limits ability to isolate domain-specific usage

KORS domain-specific ontologies and semantic metadata allow the use of broad-spectrum vocabularies to make fine-grain distinctions.
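A minimal sketch of qualified synonym identification, assuming a broad lexical resource has already produced candidate synonyms tagged with a sense label. The candidate list and sense labels in the usage note are made up for illustration, and no real lexical library is called.

```python
def qualified_synonyms(candidates, allowed, disallowed=()):
    """Keep synonyms whose sense the domain allows and does not disallow.

    candidates: iterable of (synonym, sense_label) pairs from a broad
    vocabulary; allowed and disallowed are the domain's sense labels.
    """
    allowed, disallowed = set(allowed), set(disallowed)
    return sorted(
        {word for word, sense in candidates
         if sense in allowed and sense not in disallowed}
    )
```

For example, with candidates for "insurgent" tagged `armed_rebellion` ("rebel", "guerrilla") and `public_protest` ("demonstrator", "protester"), a domain that allows only `armed_rebellion` keeps the first pair, while a domain that allows only `public_protest` keeps the second.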

Normalizing Uncontrolled Vocabularies

Semantic Homing

  The Binding of dynamic domain-specific semantics to controlled vocabularies

  Allows domain-specific ontologies to evolve

  Provides stable semantic anchors for knowledge computability.
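Semantic homing can be pictured as a binding table from domain terms to entries of a controlled vocabulary. The sketch below is an assumption-laden illustration: the class `SemanticHome`, its methods, and the `cv:` identifiers in the usage note are hypothetical. Domain terms can be added or rebound freely, while the controlled entries they point to stay fixed and make terms comparable.

```python
class SemanticHome:
    """Binds evolving domain terms to stable controlled-vocabulary anchors."""

    def __init__(self, controlled_vocabulary):
        self._controlled = set(controlled_vocabulary)
        self._bindings = {}  # (domain, term) -> frozenset of anchor ids

    def bind(self, domain, term, anchors):
        """Bind (or rebind) a domain term; anchors must already be controlled."""
        unknown = set(anchors) - self._controlled
        if unknown:
            raise ValueError(f"not in controlled vocabulary: {sorted(unknown)}")
        self._bindings[(domain, term.lower())] = frozenset(anchors)

    def anchors(self, domain, term):
        """Return the anchors for a term, or an empty set if it is unbound."""
        return self._bindings.get((domain, term.lower()), frozenset())

    def comparable(self, domain_a, term_a, domain_b, term_b):
        """Two terms are comparable when they share at least one anchor."""
        return bool(self.anchors(domain_a, term_a) & self.anchors(domain_b, term_b))
```

With "freedom fighter" and "insurgent" both bound to a hypothetical `cv:armed_rebel` anchor in one domain, a change of wording over time leaves the anchor, and therefore the computed comparison, unchanged.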

MyGEOINT: Ontology Knowledge Discovery

Ontology applies concept matching to make discoveries more relevant

Functional Impact

Discovery of knowledge sources of potential relevance.

  Simpler solutions, lower real-time computational requirements, practical multi-domain solutions.

Computable semantic comparisons and knowledge relevance ranking

  Directly computed rather than inferred

  Higher domain-level precision without the overhead of extensive upper and mid-level ontologies