Upload
pembroke
View
19
Download
0
Embed Size (px)
DESCRIPTION
Reference Reconciliation in Complex Information Spaces. Xin (Luna) Dong , Alon Halevy, Jayant Madhavan @ Sigmod 2005 University of Washington. Semex : Personal Information Management System. Homepage(1). SenderOfEmails(7595). RecipientOfEmails(8547). AuthorOfArticles(52). - PowerPoint PPT Presentation
Citation preview
Reference Reconciliation in Complex Information Spaces
Xin (Luna) Dong, Alon Halevy, Jayant Madhavan
@ Sigmod 2005University of Washington
Semex: Personal Information Management System
MentionedIn(315)
AuthorOfArticles(52)
RecipientOfEmails(8547)
SenderOfEmails(7595)
Homepage(1)
Semex: Personal Information Management System
Email Contacts(1145)
Co-authors(24)
Semex: Personal Information Management System
Authors
FromFile
CitedBy
Cites(33)
PublishedIn
Article: Reference Reconciliation in Complex Information Spaces
Semex: Personal Information Management System
Xin (Luna) Dong
xin dong
•¶ ðà xinluna dong
luna
dongxin
x. dong
Lab-#dong xin
dong xin luna
Names
Emails
Semex Without DeduplicationSearch results for luna
luna dongSenderOfEmails(3043)RecipientOfEmails(2445)MentionedIn(94)
23 persons
Semex Without DeduplicationSearch results for luna
Xin (Luna) DongAuthorOfArticles(49)MentionedIn(20)
23 persons
Semex Without Deduplication
A Platform for Personal Information Management and Integration
Semex Without Deduplication
9 Persons: dong xin xin dong
Semex NEEDS Deduplication (Reference Reconciliation)
Reference Reconciliation in Complex Information Spaces
Xin (Luna) Dong, Alon Halevy, Jayant Madhavan
@ Sigmod 2005University of Washington
Complex Information Space Example – An Abstract View of Personal Information Article: a1=(“Distributed Query Processing”,“169-180”,
{p1,p2,p3}, c1)a2=(“Distributed query processing”,“169-180”,
{p4,p5,p6}, c2)
Venue: c1=(“ACM Conference on Management of Data”, “1978”,
“Austin, Texas”) c2=(“ACM SIGMOD”, “1978”, null)
Person: p1=(“Robert S. Epstein”, null)p2=(“Michael Stonebraker”, null)p3=(“Eugene Wong”, null) p4=(“Epstein, R.S.”, null)p5=(“Stonebraker, M.”, null)p6=(“Wong, E.”, null)
Complex Information Space Example – An Abstract View of Personal Information Article: a1=(“Distributed Query Processing”,“169-180”, {p1,p2,p3},
c1)a2=(“Distributed query processing”,“169-180”, {p4,p5,p6},
c2)
Venue: c1=(“ACM Conference on Management of Data”, “1978”,
“Austin, Texas”) c2=(“ACM SIGMOD”, “1978”, null)
Person: p1=(“Robert S. Epstein”, null)p2=(“Michael Stonebraker”, null)p3=(“Eugene Wong”, null) p4=(“Epstein, R.S.”, null)p5=(“Stonebraker, M.”, null)p6=(“Wong, E.”, null) p7=(“Eugene Wong”, “[email protected]”)p8=(null, “[email protected]”)p9=(“mike”, “[email protected]”)
Class
Reference
AtomicAttribute
AssociationAttribute
Other Complex Information Spaces Citation portals, e.g., Citeseer,
Cora Online product catalogs in E-
commerce
Real-World Objects Article: a1=(“Distributed Query Processing”,“169-180”, {p1,p2,p3},
c1)a2=(“Distributed query processing”,“169-180”, {p4,p5,p6},
c2)
Venue: c1=(“ACM Conference on Management of Data”, “1978”,
“Austin, Texas”) c2=(“ACM SIGMOD”, “1978”, null)
Person: p1=(“Robert S. Epstein”, null)p2=(“Michael Stonebraker”, null)p3=(“Eugene Wong”, null) p4=(“Epstein, R.S.”, null)p5=(“Stonebraker, M.”, null)p6=(“Wong, E.”, null) p7=(“Eugene Wong”, “[email protected]”)p8=(null, “[email protected]”)p9=(“mike”, “[email protected]”)
Reference Reconciliation
Input: A set of references R Output: A partitioning over R, such
thatEach partition refers to a single real-
world object – high precision
Different partitions refer to different objects – high recall
Related Work
A very active area of research in Databases, Data Mining and AI
Most current approaches assume matching tuples from a single database table Traditional approaches (Surveyed in [Cohen, et
al. 2003]) Step I. Compare attributes Step II. Combine attribute similarities to decide tuple
match/non-match Step III. Compute transitive closures to get partitions
New approaches explore relationship between reconciliation decisions using probability models[Russell et al, 2002] [Domingos et al, 2004]
Harder for complex information spaces
Challenges in Complex Information Spaces
Article: a1=(“Distributed Query Processing”,“169-180”, {p1,p2,p3}, c1)
a2=(“Distributed query processing”,“169-180”, {p4,p5,p6}, c2)
Venue: c1=(“ACM Conference on Management of Data”, “1978”,
“Austin, Texas”) c2=(“ACM SIGMOD”, “1978”, null)
Person: p1=(“Robert S. Epstein”, null)p2=(“Michael Stonebraker”, null)p3=(“Eugene Wong”, null) p4=(“Epstein, R.S.”, null)p5=(“Stonebraker, M.”, null)p6=(“Wong, E.”, null) p7=(“Eugene Wong”, “[email protected]”)p8=(null, “[email protected]”)p9=(“mike”, “[email protected]”)
1. MultipleClasses 3. Multi-value
Attributes
2. LimitedInformation
?
?
Intuition
Complex information spaces can be considered as networks of instances and associations between the instances
Key: exploit the network, specifically, the clues hidden in the associations
Outline
Introduction and problem definition
Reconciliation algorithmExperimental resultsConclusions
Framework: Dependency Graph p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p2, p8)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reference Similarity Attribute Similarity
Compare contacts
Cross-attr similarity
(p1,p7)
(“Michael Stonebraker”, p7)
(p1, “[email protected]”)
(p3, “[email protected]”)
Framework: Dependency Graph p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p2, p8)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reference Similarity Attribute Similarity
Compare contacts
Cross-attr similarity
Framework: Dependency Graph p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(p2, p8)
(“Michael Stonebraker”, “mike”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reference Similarity Attribute Similarity
(“[email protected]”, “[email protected]”)
(“Eugene Wong”, “Eugene Wong”)
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
Reference similarity Attribute similarity
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Dependency Graph Example II
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
Reference similarity Attribute similarity
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Compare authored papers
Strategy I. Consider Richer Evidence Cross-attribute similarity –
Name&email p5=(“Stonebraker, M.”, null) p8=(null, “[email protected]”)
Context Information I – Contact list p5=(“Stonebraker, M.”, null, {p4, p6}) p8=(null, “[email protected]”, {p7}) p6=p7
Context Information II – Authored articles p2=(“Michael Stonebraker”, null) p5=(“Stonebraker, M.”, null) p2 and p5 authored the same article
Considering Only Attribute-wise Similarities Cannot Merge Persons Well
1750
1950
2150
2350
2550
2750
2950
3150
3350
1 2 3 4
Evidence
#(P
erso
n P
arti
tio
ns)
1409
Person references: 24076 Real-world persons (gold-standard):1750
3159
Considering Richer Evidence Improves the Recall
3159
2169 21692096
1750
1950
2150
2350
2550
2750
2950
3150
3350
Attr-wise Name&Email Article Contact
Evidence
#(P
erso
n P
arti
tio
ns)
1409
346
Person references: 24076 Real-world persons:1750
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
Reference similarity Attribute similarity
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Reconciled Similar
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Reconciled Similar
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Reconciled Similar
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Reconciled Similar
Exploit the Dependency Graph
(“Distributed…”, “Distributed …”)
(“169-180”, “169-180”)
(a1, a2)
(“Michael Stonebraker”, “Stonebraker, M.”)
(p2, p5)
(“Eugene Wong”, “Wong, E.”)
(p3, p6)(c1, c2)
(“ACM …”, “ACM SIGMOD”) (“1978”, “1978”)
(“Robert S. Epstein”, “Epstein, R.S.”)
(p1, p4)
Reconciled Similar
Strategy II. Propagate Information between Reconciliation Decisions After changing the similarity
score of one node, re-compute similarity scores of its neighbors
This process converges ifSimilarity score is monotone in the
similarity values of neighborsCompute neighbor similarities only
if similarity increase is not too small
3159
2169 21692096
3159
2146 2135
2022
1750
1950
2150
2350
2550
2750
2950
3150
3350
Attr-w ise Name&Email Article Contact
Evidence
#(Pe
rson
Par
titio
ns)
Traditional Propagation
Propagating Information between Reconciliation Decisions Further Improves Recall
Person references: 24076 Real-world persons:1750
Strategy III. Enrich References in Reconciliation Enrich knowledge of a real-world
object for later reconciliation Naïve:
Construct graph Compute similarity Transitive Closure
Problems Dependency-graph construction is expensive Reference enrichment takes effect until the
next pass Solution
Instant enrichment by adding neighbors in the dependency graph
Enrich References by Adding Neighbors p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(“[email protected]”, “[email protected]”)
(p2, p8)
(“Michael Stonebraker”, “mike”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar
Enrich References by Adding Neighbors p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(“[email protected]”, “[email protected]”)
(p2, p8)
(“Michael Stonebraker”, “mike”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar
Enrich References by Adding Neighbors p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(“[email protected]”, “[email protected]”)
(p2, p8)
(“Michael Stonebraker”, “mike”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar
Enrich References by Adding Neighbors p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(“[email protected]”, “[email protected]”)
(p2, p8)
(“Michael Stonebraker”, “mike”)(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar
Enrich References by Adding Neighbors p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“mike”, “[email protected]”, null)
(p8, p9)
(“[email protected]”, “[email protected]”)
(p2, p8)
(“Michael Stonebraker”, “mike”)(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar
References Enrichment Improves Recall More than Information Propagation
3159
2169 21692096
3169
2036 2036
19101750
1950
2150
2350
2550
2750
2950
3150
3350
Attr-wise Name&Email Article Contact
Evidence
#(P
erso
n P
arti
tio
ns)
Traditional Enrichment Propagation
Person references: 24076 Real-world persons:1750
3159
2169 21692096
3169
2002 1990
18731750
1950
2150
2350
2550
2750
2950
3150
3350
Attr-wise Name&Email Article Contact
Evidence
#(P
erso
n P
artit
ions
)
Traditional Enrichment Propagation Full
Applying Both Information Propagation and Reference Enrichment Get the Highest Recall
Person references: 24076 Real-world persons:1750
1409
125346
Outline
Introduction and problem definition
Reconciliation algorithm Experimental results Conclusions
Experiment Settings Datasets
Four personal datasets Cora dataset for citations
Use the same parameters and thresholds for all datasets Measure
Precision and recall, F-measure Precision: The percentage of correctly reconciled reference pairs over
all reconciled reference pairs Recall: The percentage of correctly reconciled reference pairs over
pairs of references that refer to the same real-world object
Diversity and Dispersion Diversity: For every result partition, how many real-world objects are
included; ideally should be 1 (related to precision) Dispersion: For every real-world object, how many result partitions
include them; ideally should be 1 (related to recall)
3159
2169 21692096
3169
2002 1990
18731750
1950
2150
2350
2550
2750
2950
3150
3350
Attr-wise Name&Email Article Contact
Evidence
#(P
erso
n P
artit
ions
)
Traditional Enrichment Propagation Full
Recall Results on One Personal Dataset
Person references: 24076 Real-world persons:1750
1409
125346
Results Considering All Occurrences of Person Instances
Dataset#per/#ref
Attr-wise Matching Dependency Graph
Prec/Recall
F#Pa
rPrec/Recall
F#Pa
r
A (1750/2407
6)B
(1989/36359)C
(1570/15160)D
(1518/17199)
Avg
0.999/0.741
0.974/0.998
0.999/0.967
0.894/0.998
0.967/0.926
0.8510.9860.9830.943
0.946
3159
2154
1660
1579
0.999/0.999
0.999/0.999
0.982/0.987
0.999/0.920
0.995/0.976
0.9990.9990.9850.958
0.986
1873
2068
1596
1546
Both precision and recall increase compared with attr-wise matching.
Results Considering Only Distinct Person References
Dataset#per/#dist-
ref
Attr-wise Matching Dependency Graph
Prec/Recall
F#Pa
rPrec/Recall
F#Pa
r
A (1750/3114
)B
(1989/3211)C
(1570/2430)D
(1518/2188)
Avg
0.995/0.509
0.81/0.803
0.987/0.782
0.694/0.837
0.872/0.733
0.6730.8060.8730.759
0.778
3159
2154
1660
1579
0.982/0.947
0.958/0.891
0.814/0.925
0.942/0.737
0.924/0.875
0.9640.9230.8670.827
0.895
1873
2068
1596
1546
Precision and recall increase largely compared with attr-wise matching.
Diversity and Dispersion Are Very Close to 1
Dataset#per/#ref
Attr-wise Matching Dependency Graph
Diversity/Dispersion Diversity/Dispersion
A (1750/2407
6)B
(1989/36359)C
(1570/15160)D
(1518/17199)
Avg
1.18/1.0031.067/1.01
1.053/1.0031.041/1.004
1.085/1.005
1.047/1.0031.039/1.0081.03/1.017
1.023/1.005
1.035/1.008
Our Algorithm Equals or OutperformsAttr-wise Matching in All Classes
Class
Attr-wise Matching
Dependency Graph
Precision
RecallPrecisi
onRecall
Person
Article
Venue
0.9670.9970.935
0.9260.9770.790
0.9950.9990.987
0.9760.9760.937
Results on Cora Dataset is Competitive with Other Reported Results
Results reported in other record linkage papers: Precision/Recall = 0.990/0.925 [Cohen et al., 2002] Precision/Recall = 0.842/0.909 [Parag and Domingo, 2004] F-measure = 0.867 [Bilenko and Mooney, 2003]
Class
Attr-wise Matching
Dependency Graph
Prec/Recall
F-msre
Prec/Recall F-msre
Article
PersonVenue
0.985/0.913
0.994/0.985
0.982/0.362
0.948 0.9890.529
0.985/0.924
1/0.9870.837/0.71
4
0.954 0.9930.771
Conclusions
Contributions: Dependency-graph-based reconciliation algorithm Exploit rich evidence Propagate information between
reconciliation decisions Enrich references during reconciliation
Extended Work Propagate negative information through
dependency Graph
Reference Reconciliation in Complex Information Spaces
Xin (Luna) Dong, Alon Halevy, Jayant Madhavan
@ Sigmod 2005http://data.cs.washington.edu/semex
Strategy IV. Enforce Constraints Problem:
Solution: Propagate negative information—ConstraintsNon-merge node: the two elements are
guaranteed to be different and should never be merged
P1
P2
P3
Enforce Constraints by Propagating Negative Information p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“matt”, “[email protected]”, null)
(p2, p8)
(“Michael Stonebraker”, “matt”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
(p8, p9)
Reference Similarity Attribute Similarity
Enforce Constraints by Propagating Negative Information p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“matt”, “[email protected]”, null)
(p2, p8)
(“Michael Stonebraker”, “matt”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar Non-merge
(p8, p9)
(“[email protected]”, “[email protected]”)
Constraint
Enforce Constraints by Propagating Negative Information p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“matt”, “[email protected]”, null)
(p2, p8)
(“Michael Stonebraker”, “matt”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar Non-merge
(p8, p9)
(“[email protected]”, “[email protected]”)
Constraint
Enforce Constraints by Propagating Negative Information p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“matt”, “[email protected]”, null)
(p2, p8)
(“Michael Stonebraker”, “matt”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar Non-merge
(p8, p9)
(“[email protected]”, “[email protected]”)
Constraint
Enforce Constraints by Propagating Negative Information p2=(“Michael Stonebraker”, null, {p1, p3})
p3=(“Eugene Wong”, null, {p1, p2}) p7=(“Eugene Wong”, “[email protected]”, {p8}) p8=(null, “[email protected]”, {p7}) p9=(“matt”, “[email protected]”, null)
(p2, p8)
(“Michael Stonebraker”, “matt”)
(p2, p9)
(p3,p7) (“Michael Stonebraker”, “stonebraker@”)
Reconciled Similar Non-merge
(p8, p9)
(“[email protected]”, “[email protected]”)
Constraint
Enforcing Constraints Improves Precision
Method Precision
#(Entities reconciled with others incorrectly)
Constraint 0.999 13No
Constraint0.947 61
Similarity Computation
Similarity function for node N – s(N) Input: sim scores of N’s neighborsOutput: sim score of N, ranged from 0 to 1
Similarity function can be defined by applying domain knowledge, learning from training data, resorting to global knowledge, etc.
S = Srv + Ssb + Swb
Srv: from real-valued neighbors. Decision-tree shape.
Ssb: from strong-boolean-valued neighborsSwb: from weak-boolean-valued neighbors
Framework: Dependency Graph Definition
For every pair of references A and B: A node representing their similarity
For every attribute of A and attribute of B A node representing attribute similarity An edge between attr-sim node and ref-sim
node, representing the dependency between the similarities
Each node is associated with a similarity score between 0 and 1
Construction: include only nodes whose two elements have potential to be similar