Ways to Incorporate Ontology and Bayesian Network
Presented By: Asma Sanam Larik
Three Approaches
The following approaches have been applied to combine ontologies and Bayesian networks (BN):
1) Ontology mapping enhancement using BN
2) Extending ontology queries by BN reasoning
3) Semi-automated construction of BN from a domain ontology
Purpose of the Mapping Approach
Mapping Approach: Ontology designers apply different views of the same domain during ontology development. This yields heterogeneity at the ontology level, which is the main obstacle to semantic interoperability. Ontology mapping is the approach that tries to solve this problem.
Research in Ontology Mapping
1. OMEN (Ontology Mapping ENhancer, 2004): OMEN: A Probabilistic Ontology Mapping Tool, by Prasenjit Mitra, Anuj Jaiswal, Pennsylvania State University and Stanford University
2. BayesOWL (2005): A Bayesian Network Approach to Ontology Mapping, by Zhongli Ding, Yun Peng, UMBC
3. OntoBayes (2006): OntoBayes: An Ontology-Driven Uncertainty Model, by Yi Yang and Jacques Calmet, University of Karlsruhe, Germany
Purpose of Extending Ontology reasoning Approach
Extending Ontology Queries: Existing ontology query languages cannot answer queries involving probabilities, such as the following:
What is the likelihood of default of a company given that it is limited and has branches outside Europe?
What is the likelihood of a particular project type given that it is led by male managers working for a ltd company?
Thus BNs are applied for this sort of probabilistic reasoning.
Proposed by Andrea Bellandi and Franco Turini, April 2009, University of Pisa, Italy
Purpose of automatic BN construction approach
Automatic BN Construction: Creating a BN requires identifying the variables of interest and their influence on each other, and constructing the CPTs. These methods propose a methodology for generating Bayesian networks from existing domain ontologies.
Research on Automated BN Construction
Stefan Fenz (University of Vienna, Austria) 2008
Ann Devitt and K. Mutosikova 2006 (Network Management Research Centre, Ireland)
Hai-tao Zhang, B-Yoeng Kang (Seoul National University, Korea) 2007
Ontology Mapping with BN
BayesOWL Approach
A probabilistic ontology is an annotated ontology that contains a set of prior and conditional distributions.
This approach takes a simple ontology file and a probability file and maps both of them to generate a Bayesian network.
The purpose of doing so is to use Bayesian inference for OWL reasoning.
Purpose / Direction of Approach
In domain modeling, if A is known to be a subclass of B, one may wish to express the probability that an instance of B belongs to A.
If A and B are not logically related, one may still wish to express how much A overlaps with B.
In ontology reasoning, one may wish to know the degree of similarity of A to B even if neither subsumes the other.
In concept mapping between two ontologies, it is often the case that a concept defined in one ontology only partially matches a concept in the other ontology.
How to Incorporate PO?
Probabilistic Information Markups
Structural translation
Constructing CPT for L-Nodes
Constructing CPT for Concept Nodes
Structural Translation
CPT for L-Nodes
CPT for Concept Nodes
Example taken from Zhongli Ding’s Thesis
Next, a constraint on B is applied: R1(B) = (0.61, 0.39)
A B C D Q(X)
T T T T 0.0048
T T T F 0.0432
T T F T 0.0272
T T F F 0.0048
T F T T 0.0864
T F T F 0.1056
T F F T 0.0896
T F F F 0.0384
F T T T 0.0126
F T T F 0.1134
F T F T 0.1989
F T F F 0.0351
F F T T 0.0378
F F T F 0.0462
F F F T 0.1092
F F F F 0.0468
Initial Knowledge Base Qi(X)
B Q(B)
T 0.44
F 0.56
Marginal on B
B R(B)
T 0.61
F 0.39
Constraint on B
B R(B)/Q(B)
T 1.386
F 0.6964
New JPD
A B C D Q(X)
T T T T 0.006653
T T T F 0.059875
T T F T 0.037699
T T F F 0.006653
T F T T 0.060169
T F T F 0.07354
T F F T 0.062397
T F F F 0.026742
F T T T 0.017464
F T T F 0.157172
F T F T 0.275675
F T F F 0.048649
F F T T 0.026324
F F T F 0.032174
F F F T 0.076047
F F F F 0.032592
In the new JPD, the constraint on B is satisfied. Next, another constraint on C is applied: R2(C) = (0.83, 0.17)
Initial Knowledge Base Qi(X) (the JPD obtained after applying the constraint on B)
C Q(C)
T 0.433
F 0.567
Marginal on C
C R(C)
T 0.83
F 0.17
Constraint on C
C R(C)/Q(C)
T 1.9168
F 0.2998
New JPD
A B C D Q(X)
T T T T 0.0127525
T T T F 0.1147684
T T F T 0.0113022
T T F F 0.0019946
T F T T 0.1153319
T F T F 0.1409615
T F F T 0.0187066
T F F F 0.0080173
F T T T 0.033475
F T T F 0.3012673
F T F T 0.0826474
F T F F 0.014585
F F T T 0.0504578
F F T F 0.0616711
F F F T 0.0227989
F F F F 0.0097711
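The two fitting steps of this example can be reproduced with a short script. This is a minimal sketch, not the BayesOWL implementation: the function names `marginal` and `apply_constraint` are hypothetical, and the only inputs are the initial Q(X) table and the two constraints R1(B) and R2(C) given above. Each step multiplies every row by R(x)/Q(x) for the constrained variable, which is one step of what is usually called iterative proportional fitting.

```python
from itertools import product

# Initial joint distribution Q(A, B, C, D), in the row order of the table above
# (T T T T, T T T F, ..., F F F F).
values = [0.0048, 0.0432, 0.0272, 0.0048, 0.0864, 0.1056, 0.0896, 0.0384,
          0.0126, 0.1134, 0.1989, 0.0351, 0.0378, 0.0462, 0.1092, 0.0468]
joint = dict(zip(product([True, False], repeat=4), values))

def marginal(q, axis):
    """Marginal distribution of the variable at position `axis` of the row key."""
    m = {True: 0.0, False: 0.0}
    for row, p in q.items():
        m[row[axis]] += p
    return m

def apply_constraint(q, axis, target):
    """One fitting step: scale each row by R(x) / Q(x) for its value of the variable."""
    m = marginal(q, axis)
    return {row: p * target[row[axis]] / m[row[axis]] for row, p in q.items()}

B, C = 1, 2  # positions of B and C in the (A, B, C, D) key
step1 = apply_constraint(joint, B, {True: 0.61, False: 0.39})
step2 = apply_constraint(step1, C, {True: 0.83, False: 0.17})

print(marginal(step1, B))  # constraint on B holds after the first step
print(marginal(step2, C))  # constraint on C holds after the second step
print(marginal(step2, B))  # the marginal on B has moved away from 0.61
```

Small differences from the slide values come from the slides rounding the ratios R/Q before multiplying. The last line shows why the procedure has to be repeated over all constraints until the distribution stops changing: satisfying the constraint on C disturbs the marginal on B.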
Reasoning
The BayesOWL framework can support common ontology reasoning tasks as probabilistic reasoning in the translated BN, for example:
Concept Overlapping
Concept Subsumption
Automated BN Construction
Ontology-based generation of Bayesian Networks
By Stefan Fenz and Min Tjoa, University of Vienna
S. Fenz, Ontology and Bayesian-based information security risk management, PhD thesis, Vienna University of Technology, Oct 2008
Motivation
Creation of a BN requires at least three challenging tasks:
Determination of relevant influence factors
Determination of relationships between the identified influence factors
Calculation of CPTs
Ontologies are a potential solution to address the stated challenges.
Example Security Ontology
1) Security Attribute: Confidentiality, Integrity, Availability
2) Threat Source: Accidental, Deliberate
3) Threat Origin: Human, Natural
4) Vulnerability: Physical, Technical, Administrative
5) Control Type: Preventive, Corrective, Recovery
6) Severity Level: High, Medium, Low
Methodology Proposed
Concepts → Nodes in BN
Relations → Links
Axioms → Node states
Instances → Findings
Concept Nodes
The following factors have been identified:
1) Predecessor threats (PT1Ti, ..., PTnTi) influence the considered threat Ti, which in turn influences its successor threats (ST1Ti, ..., STnTi).
2) Each threat Ti requires one or more vulnerabilities (V1, ..., Vn) to become effective.
Axioms: node scales and weights
Three-point Likert scale (High, Medium, Low)
For CPT construction, a severity rating SVi is defined for each vulnerability. A numerical weight WPPVi is then obtained for each vulnerability by dividing its severity by the sum of the severities of all vulnerabilities relevant to the threat: WPPVi = SVi / (SV1 + ... + SVn).
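The weight calculation described here amounts to normalising the severity ratings of the vulnerabilities attached to one threat. The sketch below assumes a numeric encoding of the three-point scale (Low = 1, Medium = 2, High = 3); that encoding, the vulnerability names, and the function name `vulnerability_weights` are illustrative assumptions, not values taken from the thesis.

```python
# Assumed numeric encoding of the three-point severity scale (not from the source).
SEVERITY_SCORE = {"Low": 1, "Medium": 2, "High": 3}

def vulnerability_weights(severities):
    """Weight of each vulnerability: its severity divided by the sum of all severities."""
    scores = {name: SEVERITY_SCORE[level] for name, level in severities.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Hypothetical threat with three relevant vulnerabilities.
weights = vulnerability_weights({"V1": "High", "V2": "Medium", "V3": "Low"})
print(weights)
```

By construction the weights of one threat sum to 1, so they can be used directly as mixing coefficients when the CPT of the threat node is filled in.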
3) Controls can be used to mitigate identified vulnerabilities. Mitigation depends on the effectiveness of a potential control combination (CCEVi), which in turn depends on the actual effectiveness of the control implementations (CE1, ..., CEn).
4) a) In case of deliberate threat sources, the vulnerability exploitation probability PPVi is determined by the effectiveness of a potential attacker (AEVi), which is in turn determined by the motivation (AMVi) and the capabilities (ACVi) of the attacker.
b) In case of accidental threat sources, PPVi is determined by the a priori probability (APTi) of the corresponding threat Ti.
Relations Links
Limitations
Functions for calculating CPT are not provided by Ontology and have to be modeled externally
Human Intervention is necessary if the ontology provides a knowledge model that does not exactly fit the domain of interest
Clinical CPG’s
Extending Ontology Query with BN Inference
Strategy
Extracting the BN directly from the ontology:
o Definition of the ontology compiling process for extracting the Bayesian network structure directly from the schema of the knowledge base
o Learning the initial probability distributions
Providing a Bayesian query language for answering queries involving probabilities
Using inference over the BN for answering queries involving probabilities:
o Definition of the language operational semantics, based on the well-known Bayesian network reasoning schemas
An example of Ontology Compiling Process
Extracting a Bayesian network from an Ontology: An Example (1)
An ontology O is a pair <T, R> where
- T = {T1, ..., Ti, ..., Tn} is a set of hierarchies called domain concepts
- R ⊆ Ti x Tj is a set of arcs binding elements of T such that the resulting graph is acyclic.
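[Figure: example ontology with four hierarchies, T1 = COMPANY (Competitor, Vendor, Supplier, Jointventure), T2 = PERSON (Man, Woman), T3 = PROJECT (Research, Patent), T4 = SECTOR (Services, Financial, CreditCard, LifeInsurance), connected by the object properties R1 = hasCeo, R2 = leads, R3 = hasSector]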
T1 = COMPANY = {Company, Vendor, Jointv, Compet, Supplier}
T2 = PERSON = {Person, Man, Woman}
T3 = PROJECT = {Project, Research, Patent}
T4 = SECTOR = {Sector, Services, Financial, CreditCard, LifeIns}
R1 : COMPANY → PERSON
R2 : PERSON → PROJECT
R3 : COMPANY → SECTOR
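The pair <T, R> of this example can be written down as two small dictionaries, together with a check of the acyclicity condition on R. This is only an illustrative encoding: the dictionary layout and the function name `is_acyclic` are assumptions, and the property names hasCeo, leads, and hasSector are read off the example figure.

```python
# Hierarchies T1..T4: domain concept name -> the classes it contains.
T = {
    "COMPANY": ["Company", "Vendor", "Jointv", "Compet", "Supplier"],
    "PERSON": ["Person", "Man", "Woman"],
    "PROJECT": ["Project", "Research", "Patent"],
    "SECTOR": ["Sector", "Services", "Financial", "CreditCard", "LifeIns"],
}

# Object properties R1..R3: property name -> (source hierarchy, target hierarchy).
R = {
    "hasCeo": ("COMPANY", "PERSON"),
    "leads": ("PERSON", "PROJECT"),
    "hasSector": ("COMPANY", "SECTOR"),
}

def is_acyclic(hierarchies, arcs):
    """Kahn-style check: repeatedly remove hierarchies that have no incoming arc."""
    indegree = {name: 0 for name in hierarchies}
    for _, dst in arcs.values():
        indegree[dst] += 1
    ready = [name for name, deg in indegree.items() if deg == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for src, dst in arcs.values():
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return removed == len(hierarchies)

print(is_acyclic(T, R))
```

Adding a hypothetical arc from PROJECT back to COMPANY would close a directed cycle, and the check would then fail.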
Legend: is-a relation (within a hierarchy); object property (between hierarchies)
Extracting a Bayesian network from an Ontology: An Example (2)
[Figure: two-level Bayesian network extracted from the example ontology]
Low-level nodes and low-level relations:
T1: P(COMPANY), P(JOINTVENTURE|COMPANY), P(VENDOR|COMPANY), P(COMPETITOR|VENDOR), P(SUPPLIER|VENDOR)
T2: P(PERSON), P(MAN|PERSON), P(WOMAN|PERSON)
T3: P(PROJECT), P(RESEARCH|PROJECT), P(PATENT|PROJECT)
T4: P(SECTOR), P(SERVICES|SECTOR), P(FINANCIAL|SECTOR), P(CREDITCARD|FINANCIAL), P(LIFEINSURANCE|FINANCIAL)
High-level nodes and high-level relations:
P(Company=c)
R1: P(Person=p |hasCeo Company=c)
R2: P(Project=pr |leads Person=p)
R3: P(Sector=s |hasSector Company=c)
Ontology Compiling Process
It is composed of two phases:
- Phase one: compiling TBox ontology in a 2lBN structural part
- Phase two: compiling ABox ontology in a 2lBN probabilistic part
Company Customer Partnership Jointventure
0.56 0.86 0.36 0.29
0.44 0.14 0.64 0.71
1 1 1 1
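[Figure: 2lBN fragment showing the high-level nodes HLN_COMPANY and HLN_PERSON linked by the high-level relation HLR_Ceo (hasCeo) with its table HLRT, and the low-level nodes (LLN) Person, Man, Woman with their table LLRT]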
2lBN structural part
The compiling process of a TBox component maps:
Each ontology class to a boolean random variable (LLN)
Each domain concept to a multi-valued random variable (HLN)
Each object property to a Bayesian arc (HLR)
2lBN probabilistic part
The initial probability distribution is computed on the basis of the distribution of the instances, that is, the ABox component. Two kinds of probability tables exist:
Low Level Relation Probability Table (LLRT)
• A prior probability P(A) represents the probability that an arbitrary ontology instance belongs to the class A.
• A conditional probability P(A|B) represents the probability that an arbitrary ontology instance belonging to the class B also belongs to the class A.
High Level Relation Probability Table (HLRT)
• A conditional probability P(A|RB) represents the probability that there exists a relation R (i.e., an object property or a path of object properties) between arbitrary ontology instances of A and arbitrary ontology instances of B.
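The two low-level definitions can be read as simple counting over the ABox. The sketch below estimates P(A) and P(A|B) as relative frequencies; the instance names, their class memberships, and the function names `prior` and `conditional` are made up for illustration and do not come from the slides.

```python
# Hypothetical ABox: instance name -> set of classes the instance belongs to.
abox = {
    "acme": {"Company", "Vendor"},
    "globex": {"Company", "Vendor", "Supplier"},
    "initech": {"Company", "Jointventure"},
    "umbrella": {"Company"},
}

def prior(cls, instances):
    """P(A): fraction of all instances that belong to class A."""
    return sum(cls in classes for classes in instances.values()) / len(instances)

def conditional(cls, given, instances):
    """P(A|B): fraction of the instances of B that also belong to A."""
    in_given = [classes for classes in instances.values() if given in classes]
    if not in_given:
        raise ValueError(f"no instances of {given!r}, P(.|{given}) is undefined")
    return sum(cls in classes for classes in in_given) / len(in_given)

print(prior("Vendor", abox))                    # share of instances that are vendors
print(conditional("Supplier", "Vendor", abox))  # share of vendors that are suppliers
```

A high-level table could be estimated in the same spirit by counting instance pairs linked by the object property, but the exact counting rule depends on the definition used in the original work and is not sketched here.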
Low Level Relations probability distribution - example
Starting from this table, we can compute the probability distribution by applying the Bayes formula in the following form:
High Level Relations probability distribution - example
Number of triples satisfying the TBox schema <Company, hasCeo, Person>
Number of instances belonging to the sub-space of Company corresponding to “Company with a CEO”
Inference over Bayesian network (1)
[Figure: Bayesian network with nodes A, B, C, D, arcs A → D, B → D, B → C, and tables P(A), P(B), P(C|B), P(D|A,B)]
• Top-Down. Causal Reasoning
• P(D | A) = P(D, B | A) + P(D, ¬B | A)
• Bottom-Up. Diagnostic Reasoning
• P(A | D) = P(D | A) * P(A) / P(D)
• Top-Down/Bottom-Up. Explaining Away Reasoning
• P(A | B, D) = P(D, B | A) * P(A) / P(B, D) = P(D | B, A) * P(B | A) * P(A) / P(B, D)
Inference over Bayesian networks is, in general, NP-hard.
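The three reasoning patterns can be checked numerically on the four-node network (A and B are parents of D, B is the parent of C) by brute-force enumeration of the joint distribution. This is a sketch under stated assumptions: all CPT numbers are invented for illustration, and `prob` is a hypothetical helper that sums the joint over all 16 assignments, which only works for very small networks.

```python
from itertools import product

# Illustrative CPTs (made-up numbers) for P(A), P(B), P(C|B), P(D|A,B).
P_A = 0.3
P_B = 0.6
P_C_GIVEN_B = {True: 0.8, False: 0.1}
P_D_GIVEN_AB = {(True, True): 0.9, (True, False): 0.7,
                (False, True): 0.4, (False, False): 0.05}

def bern(p, value):
    """Probability of a boolean value when P(True) = p."""
    return p if value else 1.0 - p

def joint(a, b, c, d):
    """Joint probability from the chain rule of the network."""
    return (bern(P_A, a) * bern(P_B, b)
            * bern(P_C_GIVEN_B[b], c) * bern(P_D_GIVEN_AB[(a, b)], d))

def prob(query, evidence=None):
    """P(query | evidence) by summing the joint over all consistent assignments."""
    evidence = evidence or {}

    def total(fixed):
        s = 0.0
        for a, b, c, d in product([True, False], repeat=4):
            world = {"A": a, "B": b, "C": c, "D": d}
            if all(world[name] == value for name, value in fixed.items()):
                s += joint(a, b, c, d)
        return s

    return total({**query, **evidence}) / total(evidence)

causal = prob({"D": True}, {"A": True})                       # top-down
diagnostic = prob({"A": True}, {"D": True})                   # bottom-up
explaining_away = prob({"A": True}, {"B": True, "D": True})   # both directions
print(causal, diagnostic, explaining_away)
```

With these numbers, observing B in addition to D lowers the belief in A, which is the explaining-away effect. Enumeration is exponential in the number of variables, which is why the restriction to polytrees matters in practice.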
Inference over Bayesian network (2)
A polytree is a class of Bayesian network on which inference can be solved efficiently, in time linear in the number of nodes.
Polytree property: there exists a unique path between each pair of nodes.
Given a fixed node D, it is always possible to partition all the other nodes into two disjoint sets:
set over D, the set of nodes that are connected to D only through the parents of D.
set under D, the set of nodes that are connected to D only through the children of D.
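The polytree property can be tested by ignoring arc directions and checking that the graph is connected and contains no undirected cycle. The sketch below uses a union-find structure for that purpose; the function name `is_polytree` is hypothetical, and the input is assumed to be a directed acyclic graph without duplicate arcs.

```python
def is_polytree(nodes, arcs):
    """True if, ignoring directions, exactly one path joins every pair of nodes."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for src, dst in arcs:
        root_src, root_dst = find(src), find(dst)
        if root_src == root_dst:
            return False  # src and dst were already connected: a second path exists
        parent[root_src] = root_dst
    return len({find(n) for n in nodes}) == 1  # a single connected component

# High-level network of the example: COMPANY -> PERSON -> PROJECT, COMPANY -> SECTOR.
high_level = [("COMPANY", "PERSON"), ("PERSON", "PROJECT"), ("COMPANY", "SECTOR")]
print(is_polytree(["COMPANY", "PERSON", "PROJECT", "SECTOR"], high_level))
```

A hypothetical extra arc from PROJECT to SECTOR would create a second undirected path between COMPANY and SECTOR, so the network would no longer be a polytree.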
Bayesian query structure
The general structure of a probabilistic query is P(QUERY |path EVIDENCE) where:
• QUERY is a node of the polytree
• EVIDENCE can be one node over the query, one node under the query, or both one node over and one node under the query
• EVIDENCE can refer to:
is-a ontology relations (classical Bayesian conditioning, that is, path is empty)
object properties (Bayesian conditioning is annotated with the path binding query to evidence)
QUERY node (D)
EVIDENCE over QUERY(A, B, C, E)
EVIDENCE under QUERY(F, G, H, I)
What is the probability that a Patent project is led by a person who is CEO of a company operating in the financial sector?
P(PROJECT=patent |(leads.hasCeo.hasSector) SECTOR=financial)
[Figure: the Bayesian network extracted from the example ontology (COMPANY, PERSON, PROJECT, and SECTOR hierarchies with their low-level tables, and the high-level relations hasCeo, leads, hasSector), repeated for the query]
Top-Down Inference
EVIDENCE FINANCIAL is over QUERY PATENT
P(PATENT |(leads.hasCeo.hasSector) FINANCIAL) = P(PATENT |(leads) Person) * P(Person |(hasCeo.hasSector) FINANCIAL)
Top-Down Inference
EVIDENCE FINANCIAL is over QUERY Person
P(Person |(hasCeo.hasSector) FINANCIAL) = P(Person |(hasCeo) Company) * P(Company |(hasSector) FINANCIAL)
[Figure: query path FINANCIAL, COMPANY, PERSON, PATENT along the object properties hasSector, hasCeo, leads]
Bottom-Up Inference (Bayes Formula)
EVIDENCE FINANCIAL is under QUERY Company
P(Company |(hasSector) FINANCIAL) = P(FINANCIAL |(hasSector) Company) * P(Company) * 1 / P(FINANCIAL)
where 1 / P(FINANCIAL) is the normalisation factor.
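The decomposition of the example query can be followed step by step in code: one bottom-up step with the Bayes formula, then two top-down steps along the path hasSector, hasCeo, leads. This is a sketch with invented numbers; none of the probabilities below come from the slides, and the function name `answer_patent_query` is hypothetical.

```python
def answer_patent_query(p_patent_leads_person, p_person_ceo_company,
                        p_financial_sector_company, p_company, p_financial):
    """Evaluate P(PATENT |(leads.hasCeo.hasSector) FINANCIAL) along the query path."""
    # Bottom-up step (Bayes formula): evidence FINANCIAL is under query Company.
    p_company_given_financial = (p_financial_sector_company * p_company) / p_financial
    # Top-down step: evidence FINANCIAL is over query Person.
    p_person_given_financial = p_person_ceo_company * p_company_given_financial
    # Top-down step: evidence FINANCIAL is over query PATENT.
    return p_patent_leads_person * p_person_given_financial

# Made-up inputs, chosen only so that every intermediate value is a valid probability.
result = answer_patent_query(
    p_patent_leads_person=0.2,       # P(PATENT |(leads) Person)
    p_person_ceo_company=0.9,        # P(Person |(hasCeo) Company)
    p_financial_sector_company=0.3,  # P(FINANCIAL |(hasSector) Company)
    p_company=0.5,                   # P(Company)
    p_financial=0.25,                # P(FINANCIAL)
)
print(result)
```

Each line mirrors one of the three equations above, so the cost of answering the query grows with the length of the path rather than with the size of the whole network.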