

Learning Co-reference Relations for FOAF Instances

Jennifer Sleeman and Tim Finin, University of Maryland, Baltimore County

Motivation

Establishing co-reference relations for entities is a common problem. Our goal is to establish co-reference relations among FOAF agents.

FOAF co-reference issues: no global unique identifiers; inverse functional properties are not always reliable; multiple versions of FOAF files may exist for a single entity.

When two instances are thought to be co-referent, their information can be combined to provide a more complete representation of the entity. In the Semantic Web community this is termed 'smushing'.

Smushing issues: outdated information; conflicting information; other alignment-based issues; owl:sameAs dangers.
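As a rough illustration of smushing and of the conflicting-information issue, the sketch below merges the property values of two instances already judged co-referent. The dictionary representation, the function names `smush` and `find_conflicts`, the sample data, and the choice to treat `foaf:birthday` as single-valued are all assumptions made for this example; the poster does not specify how merging is performed.

```python
from collections import defaultdict


def smush(*instances):
    """Merge property dictionaries, keeping every distinct value per property."""
    merged = defaultdict(set)
    for instance in instances:
        for prop, values in instance.items():
            merged[prop] |= set(values)
    return dict(merged)


def find_conflicts(merged, single_valued):
    """Return properties assumed single-valued that ended up with several values."""
    return {
        prop: values
        for prop, values in merged.items()
        if prop in single_valued and len(values) > 1
    }


# Hypothetical data: two FOAF descriptions believed to describe one person.
jane_a = {
    "foaf:name": {"Jane Doe"},
    "foaf:mbox": {"mailto:jane@example.org"},
    "foaf:birthday": {"01-15"},
}
jane_b = {
    "foaf:name": {"J. Doe"},
    "foaf:mbox": {"mailto:jane@example.org"},
    "foaf:birthday": {"01-16"},
}

merged = smush(jane_a, jane_b)
# Assumption for illustration only: a person has exactly one birthday value.
conflicting = find_conflicts(merged, single_valued={"foaf:birthday"})
```

Keeping every value rather than overwriting makes conflicts visible after the merge, so a later step can decide which value, if any, is outdated.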

Co-Referent Predicate

:coref a owl:TransitiveProperty.
:coref a owl:SymmetricProperty.
owl:sameAs rdfs:subPropertyOf :coref.
:notCoref a owl:SymmetricProperty.
owl:differentFrom rdfs:subPropertyOf :notCoref.
{?a :notCoref ?b. ?b :coref ?c.} => {?a :notCoref ?c}.
{?a foaf:knows ?b.} => {?a :notCoref ?b}.
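The effect of these axioms can be sketched in Python. This is a minimal illustration, not the authors' reasoner: coref is kept as equivalence classes in a union-find structure (symmetric and transitive), and notCoref is stored between class representatives, which yields the propagation rule `{?a :notCoref ?b. ?b :coref ?c.} => {?a :notCoref ?c}`. The class name `CorefStore`, its methods, and the instance identifiers are hypothetical.

```python
class CorefStore:
    """Tracks coref (symmetric, transitive) and notCoref (symmetric) facts."""

    def __init__(self):
        self.parent = {}        # union-find forest: one tree per coref class
        self.not_coref = set()  # frozenset pairs of class representatives

    def find(self, x):
        """Return the representative of x's coref class."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add_coref(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if frozenset((ra, rb)) in self.not_coref:
            raise ValueError(f"{a} and {b} are already known to be notCoref")
        self.parent[rb] = ra
        # Re-key notCoref facts that pointed at the absorbed representative.
        self.not_coref = {
            frozenset(ra if r == rb else r for r in pair)
            for pair in self.not_coref
        }

    def add_not_coref(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            raise ValueError(f"{a} and {b} are already known to be coref")
        self.not_coref.add(frozenset((ra, rb)))

    def add_knows(self, a, b):
        """foaf:knows rule: an agent who knows another is not that agent."""
        self.add_not_coref(a, b)

    def is_coref(self, a, b):
        return self.find(a) == self.find(b)

    def is_not_coref(self, a, b):
        return frozenset((self.find(a), self.find(b))) in self.not_coref


store = CorefStore()
store.add_coref("alice1", "alice2")
store.add_knows("alice1", "bob1")
store.add_coref("bob1", "bob2")
```

Here `store.is_not_coref("alice2", "bob2")` holds even though only `alice1 foaf:knows bob1` was asserted, because notCoref is propagated across both coref classes.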

Methodology

After co-reference is established among pairs, we cluster our pairs and use these clusters for future co-reference evaluations.

We use an ensemble approach with both the rules and a classifier to evaluate pairs.

Future Work

Predicting accurately co-referent/non-co-referent pairs; enhanced clustering algorithm; application to RDF documents that are not FOAF-specific.

Results

For experiment one: 900 pairs were designated non-match; in the majority of cases the other rules returned an undetermined state.

For experiment two, we show in Table 1: only the inverse functional property rule produced positive cases; the majority of pairs resulted in an undetermined state; the knows rule resulted in the non-coreferent state.

During E2 clustering, the first phase resulted in 90% accuracy. Errors occurred in pairs that should have been clustered but were not. A second round of clustering yielded no new relationship pairs among instances, but cluster-to-cluster pairing did occur.

Table 1: Rules-based Results

| E2 # of Pairs | Rule | Rule Conclusion |
|---|---|---|
| 91383326 | differentFrom | Undetermined |
| 47184 | Inverse functional | Undetermined |
| 2402 | Inverse functional | Co-referent |
| 8687410 | Knows graph | Undetermined |
| 9138326 | sameAs | Undetermined |
| 1047874 | Knows graph | Not Co-referent |

1st experiment resulting in 50,000 triples/500 entity mentions/600 training

2nd experiment with 250,000 triples/3500 entity mentions/1800 training classes

10-fold validation with results shown in Table 2

Table 2: 10-Fold Cross Validation Test

| Experiment | TP Rate | FP Rate | Precision | Recall | F-Measure |
|---|---|---|---|---|---|
| E1 | .933 | .267 | .93 | .933 | .93 |
| E2 | .959 | .128 | .958 | .959 | .958 |

Figure 2 & Figure 3: Clustering Approaches

Two FOAF instances are determined to be co-referent. Instance 1 and 2 add an explicit coref property for each other and form cluster 1. It is determined that cluster 1 and FOAF instance 3 are co-referent. Instance 3 joins cluster 1, and instance 1 and 2 have an explicit coref property that joins each with instance 3.

FOAF instance 1 and 2 are determined to be co-referent. FOAF instance 3 and 4 are determined to be co-referent. Instance 1 and 2 add an explicit coref property for each other and form cluster 1. Instance 3 and 4 add an explicit coref property for each other and form cluster 2. It is determined that cluster 1 and cluster 2 are co-referent. Each instance adds an explicit coref property for each other.
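The two clustering walk-throughs (an instance joining an existing cluster, and two clusters being merged) can be sketched as a single merge operation that adds an explicit coref link between every pair of instances drawn from the two sides. This is a minimal sketch under that assumption; the class name `Clusters`, the method `mark_coreferent`, and the integer instance labels are hypothetical.

```python
from itertools import product


class Clusters:
    """Keeps clusters of co-referent instances and their explicit coref links."""

    def __init__(self):
        self.members = {}   # instance -> shared set holding its cluster members
        self.coref = set()  # explicit coref links stored as frozenset pairs

    def cluster_of(self, instance):
        # An instance not seen before starts in a cluster of its own.
        return self.members.setdefault(instance, {instance})

    def mark_coreferent(self, a, b):
        """Merge the clusters of a and b, linking every cross pair explicitly."""
        ca, cb = self.cluster_of(a), self.cluster_of(b)
        if ca is cb:
            return ca
        for x, y in product(ca, cb):
            self.coref.add(frozenset((x, y)))
        ca |= cb  # in-place union, so existing members of ca see the new ones
        for instance in cb:
            self.members[instance] = ca
        return ca


# First walk-through: instances 1 and 2 form a cluster, then 3 joins it.
first = Clusters()
first.mark_coreferent(1, 2)
first.mark_coreferent(1, 3)

# Second walk-through: clusters {1, 2} and {3, 4} form, then are merged.
second = Clusters()
second.mark_coreferent(1, 2)
second.mark_coreferent(3, 4)
second.mark_coreferent(1, 3)
```

Linking every cross pair keeps the explicit coref properties in line with the transitive and symmetric definition of coref, at the cost of a number of links that grows quadratically with cluster size.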

The axioms in N3 listed under Co-Referent Predicate define the coref and notCoref properties.

coref: transitive and symmetric, with owl:sameAs as a sub-property.

notCoref: symmetric but not transitive, with owl:differentFrom as a sub-property.

Figure 1: System Architecture

[Figure components: Ingestion; Candidate Pair Generation; Rule-based Reasoning; Machine Learning Model Generation; Co-referent designation and clustering; Abstract entity generation. Annotations: potential pairs reduce workload for classifier; deductive decisions; predictions; clusters form new abstract entities.]

1. Generate candidate pairs
2. Generate a rules-based model
3. Perform classification using SVMs
4. Designate pairs as co-referent
5. Cluster pairs
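Steps 2 to 4 combine rules with a classifier. One way such an ensemble could be wired together is sketched below: the rules are consulted first, and only pairs they leave undetermined are passed to the classifier. The dictionary layout of an instance, the rule ordering, and the `same_name` function (a trivial stand-in for the SVM, which is not reproduced here) are assumptions for illustration and are not taken from the poster.

```python
from enum import Enum


class Verdict(Enum):
    COREF = "co-referent"
    NOT_COREF = "not co-referent"
    UNDETERMINED = "undetermined"


def rule_verdict(a, b):
    """Apply the knows rule, then an inverse functional property rule."""
    # Knows rule: an agent that knows another agent is not that agent.
    if b["id"] in a["knows"] or a["id"] in b["knows"]:
        return Verdict.NOT_COREF
    # Inverse functional property rule, using a shared mailbox as the example.
    if a["mbox"] & b["mbox"]:
        return Verdict.COREF
    return Verdict.UNDETERMINED


def evaluate_pair(a, b, classifier):
    """Use the rule decision when there is one; otherwise ask the classifier."""
    verdict = rule_verdict(a, b)
    if verdict is not Verdict.UNDETERMINED:
        return verdict
    return Verdict.COREF if classifier(a, b) else Verdict.NOT_COREF


def same_name(a, b):
    """Hypothetical stand-in for a trained classifier such as an SVM."""
    return a["name"].lower() == b["name"].lower()


# Hypothetical instances for illustration.
p1 = {"id": "p1", "name": "Jane Doe", "mbox": {"mailto:jane@example.org"}, "knows": {"p3"}}
p2 = {"id": "p2", "name": "J. Doe", "mbox": {"mailto:jane@example.org"}, "knows": set()}
p3 = {"id": "p3", "name": "Jane Doe", "mbox": set(), "knows": set()}
p4 = {"id": "p4", "name": "jane doe", "mbox": set(), "knows": set()}
```

With this data, `evaluate_pair(p1, p2, same_name)` is decided by the mailbox rule and `evaluate_pair(p1, p3, same_name)` by the knows rule, while `evaluate_pair(p3, p4, same_name)` falls through to the classifier.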