Efficient Equal Interval Neighborhood Ring
(P-trees technology is patented by NDSU)
OUTLINE
• Review: HOBBit Metric
• Equal Interval Neighborhood Ring (EINring)
  – Prototype Problem
  – Definition of Range Mask
  – Propositions
  – Definition of EINring
  – Calculation of EINring Using P-trees
• Summary
Review: HOBBit Similarity Metric
• Let X and Y be two values. The HOBBit similarity between X and Y is defined by
$m(X, Y) = \max\{\, i \mid x_i \oplus y_i = 1 \,\}$
• where $x_i$ and $y_i$ are the bits of X and Y respectively, and $\oplus$ denotes XOR. In other words, it is the leftmost position at which X and Y differ.
Review: HOBBit Distance & Ring
• The HOBBit distance between two tuples X and Y is defined by
$d(X, Y) = 2^{m(X, Y)}$
• HOBBit Ring: The HOBBit ring of radii r1 and r2 centered at c is defined as $R(c, r_1, r_2) = \{\, x \in X \mid r_1 \le d(c, x) < r_2 \,\}$, where d(c, x) is the HOBBit distance.
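The two definitions can be tried out with a minimal sketch. It assumes bit positions are numbered from 0 at the least significant bit, and it reads the distance as d = 2^m. The function names and the handling of identical values are choices made for this sketch, not part of the slides.

```python
def hobbit_m(x: int, y: int):
    """Highest bit position at which x and y differ (bit 0 = least significant).

    Returns None when x == y, because the max over an empty set is undefined.
    """
    diff = x ^ y                      # bits set exactly where x and y differ
    return diff.bit_length() - 1 if diff else None


def hobbit_distance(x: int, y: int) -> int:
    """d(X, Y) = 2 ** m(X, Y); returning 0 for identical values is this sketch's convention."""
    m = hobbit_m(x, y)
    return 0 if m is None else 2 ** m


print(hobbit_m(22, 23), hobbit_distance(22, 23))   # differ only in bit 0
print(hobbit_m(22, 16), hobbit_distance(22, 16))   # 10110 vs 10000: highest difference at bit 2
```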
Diagram of HOBBit Ring
[Figure: Diagram of HOBBit Ring]
Example of HOBBit Ring
HOBBit Ring   Binary Range        Decimal
R(22,7,8)     00010110-00010111   22-23
R(22,6,8)     00010100-00010111   20-23
R(22,5,8)     00010000-00010111   16-23
R(22,4,8)     00010000-00011111   16-31
R(22,3,8)     00000000-00011111   0-31
R(22,2,8)     00000000-00111111   0-63
R(22,1,8)     00000000-01111111   0-127
R(22,0,8)     00000000-11111111   0-255
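The decimal ranges in the table can be reproduced with a short sketch. It reads R(22, r, 8) as the set of 8-bit values that agree with 22 on the r most significant bits. That reading is inferred from the rows of the table, and `hobbit_range` is a hypothetical helper name.

```python
def hobbit_range(c: int, r: int, nbits: int = 8):
    """Inclusive range of nbits-wide values that agree with c on the r highest bits."""
    free = (1 << (nbits - r)) - 1     # low bits that are allowed to vary
    low = c & ~free                   # all varying bits cleared
    high = c | free                   # all varying bits set
    return low, high


for r in range(7, -1, -1):
    low, high = hobbit_range(22, r)
    print(f"R(22,{r},8)  {low:08b}-{high:08b}  {low}-{high}")
```

Each step down in r frees one more low bit, so the range doubles in size while 22 stays wherever its own low bits place it.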
Summary of HOBBit Metric
• The HOBBit metric is based on the most significant matching bit positions, starting from the left.
• The HOBBit ring is a geometric ring whose diameter increases exponentially.
• The HOBBit ring is an eccentric ring.
Equal Interval Neighborhood Ring (EINring) Outline
• Prototype Problem
• Definition of Range Mask
• Propositions and Theorem
• Definition of EINring
• Calculation of EINring using P-trees
Prototype Problem
• Problem: find all x with $x > (4)_{10}$, i.e., $x > (100)_2$
• Conjecture: $P_{x > (100)_2} = P_3 \land (P_2 \lor P_1)$
8x8 data set:
7 7 7 7 5 5 1 1
7 7 7 7 1 1 1 1
5 5 7 7 4 4 1 1
5 7 7 7 4 5 5 1
6 6 6 6 3 3 0 0
6 6 6 6 0 0 0 0
2 2 6 6 3 3 0 0
2 6 6 6 3 3 3 0
Walk Through: Peano Trees
Result of Crude Method
Result of Conjecture
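The comparison between the crude method and the conjecture can be checked on the 8x8 data set with a small sketch. It uses flat lists of 0/1 values as stand-ins for the three basic P-trees, which is a simplification since real P-trees are compressed quadrant trees, and it reads the conjecture as P3 AND (P2 OR P1) with P3 the most significant bit. The helper names are hypothetical.

```python
DATA = [
    [7, 7, 7, 7, 5, 5, 1, 1],
    [7, 7, 7, 7, 1, 1, 1, 1],
    [5, 5, 7, 7, 4, 4, 1, 1],
    [5, 7, 7, 7, 4, 5, 5, 1],
    [6, 6, 6, 6, 3, 3, 0, 0],
    [6, 6, 6, 6, 0, 0, 0, 0],
    [2, 2, 6, 6, 3, 3, 0, 0],
    [2, 6, 6, 6, 3, 3, 3, 0],
]
values = [v for row in DATA for v in row]


def bit_mask(values, i):
    """One 0/1 entry per data point: bit i of the value (bit 0 = least significant)."""
    return [(v >> i) & 1 for v in values]


# The slide numbers the three bits from P3 (most significant) down to P1.
P1, P2, P3 = (bit_mask(values, i) for i in range(3))

crude = [int(v > 4) for v in values]                              # compare every value
conjecture = [p3 & (p2 | p1) for p1, p2, p3 in zip(P1, P2, P3)]   # bitwise mask only

print(conjecture == crude, sum(conjecture))
```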
Definition of Range Mask
• Range Mask: a P-tree mask that identifies every data point x satisfying a range inequality, e.g., $x \ge c_1$, $x > c_1$, $x \le c_2$, where $c_1$, $c_2$ are integers.
• Example: $P_{x > 100}$ is a P-tree mask that identifies every data point greater than 100.
Proposition 1
• Let m be the highest bit position of the jth attribute of data point x, let $P_{j,m}, P_{j,m-1}, \dots, P_{j,0}$ be the basic P-trees of the bits of the jth attribute, and let the integer $c = b_m b_{m-1} \dots b_0$, where $b_i$ is the ith binary bit value of c. Let $P_{x_j \ge c}$ be the Range Mask that satisfies the inequality $x_j \ge c$. Then
$P_{x_j \ge c} = P_{j,m}\ op_m \dots P_{j,i}\ op_i \dots P_{j,k}$,
s.t. 1) $op_i$ is $\land$ if $b_i = 1$, 2) $op_i$ is $\lor$ if $b_i = 0$, 3) right binding within each attribute, 4) $b_k = 1$ and $b_i = 0\ \forall i < k$.
• Example: $P_{x_j \ge (70)_{10}} = P_{x_j \ge (01000110)_2} = P_7 \lor (P_6 \land (P_5 \lor (P_4 \lor (P_3 \lor (P_2 \land P_1)))))$
Proof Sketch
Without loss of generality, assume data point x has one attribute. Let $c = b_m \dots b_i \dots b_0$, where $b_i$ is the ith bit value of c, and let $P_{x \ge c}$ be the range mask that satisfies $x \ge c$.
If $b_i = 1$, the ith bit of x must be 1 whenever x and c have the same bit values from position m down to position i+1, i.e., $P_{x \ge c} = P_m \dots (P_i \land (P_{i-1} \dots P_0))$. (Partially done!)
If $b_i = 0$, two cases satisfy $x \ge c$: one sets the ith bit of x to $x_i = 1$, the other sets $x_i = 0$. Thus
$P_{x \ge c} = (P_m \dots P_i) \lor (P_m \dots P_i' \land (P_{i-1} \dots P_0))$
and, by the complement rule $X \lor (X' \land Y) = X \lor Y$,
$P_{x \ge c} = P_m \dots (P_i \lor (P_{i-1} \dots P_0))$. Done!
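The right-binding rule for x ≥ c (AND where the bit of c is 1, OR where it is 0) can be sketched by evaluating the expression from the least significant bit upward. Here a Python integer with one bit per data point stands in for each basic P-tree. Starting from the all-ones mask below bit 0 is an assumption of this sketch: it encodes that once every bit has matched, x equals c and still qualifies. `bit_planes` and `mask_ge` are hypothetical names.

```python
def bit_planes(values, nbits):
    """planes[i] has bit n set when bit i of values[n] is 1 (stand-in for basic P-tree P_i)."""
    planes = [0] * nbits
    for n, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                planes[i] |= 1 << n
    return planes


def mask_ge(planes, c, n_points):
    """Mask of the points with x >= c, built from the innermost (lowest) term outward."""
    acc = (1 << n_points) - 1             # nothing left to compare: every point qualifies
    for i, p in enumerate(planes):        # i = 0 (least significant) upward
        acc = (p & acc) if (c >> i) & 1 else (p | acc)
    return acc


values = list(range(256))                 # every 8-bit value once
P = bit_planes(values, 8)
mask = mask_ge(P, 70, len(values))
brute = sum(1 << n for n, v in enumerate(values) if v >= 70)
print(mask == brute)
```

With the all-ones start, trailing 0 bits of c contribute `P_i OR all-ones`, which is all-ones again, so they drop out of the expression.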
Proposition 2
• Let m be the highest bit position of the jth attribute of data point x, let $P'_{j,m}, P'_{j,m-1}, \dots, P'_{j,0}$ be the complement P-trees of the bits of the jth attribute, and let the integer $c = b_m b_{m-1} \dots b_0$, where $b_i$ is the ith binary bit value of c. Let $P_{x_j \le c}$ be the Range Mask that satisfies $x_j \le c$. Then
$P_{x_j \le c} = P'_{j,m}\ op_m \dots P'_{j,i}\ op_i \dots P'_{j,k}$,
s.t. 1) $op_i$ is $\land$ if $b_i = 0$, 2) $op_i$ is $\lor$ if $b_i = 1$, 3) right binding within each attribute, 4) $b_k = 0$ and $b_i = 1\ \forall i < k$.
• Example: $P_{x_j \le (198)_{10}} = P_{x_j \le (11000110)_2} = P'_7 \lor (P'_6 \lor (P'_5 \land (P'_4 \land (P'_3 \land (P'_2 \lor (P'_1 \lor P'_0))))))$
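The x ≤ c mask can be sketched the same way, this time over complement bit planes (AND where the bit of c is 0, OR where it is 1). Integers with one bit per data point stand in for the complement P-trees, and the all-ones starting mask is an assumption of this sketch that lets x = c qualify. The function names are hypothetical.

```python
def complement_planes(values, nbits):
    """comp[i] has bit n set when bit i of values[n] is 0 (stand-in for complement P-tree P'_i)."""
    comp = [0] * nbits
    for n, v in enumerate(values):
        for i in range(nbits):
            if not (v >> i) & 1:
                comp[i] |= 1 << n
    return comp


def mask_le(comp, c, n_points):
    """Mask of the points with x <= c, built from the lowest bit upward."""
    acc = (1 << n_points) - 1             # nothing left to compare: every point qualifies
    for i, pc in enumerate(comp):
        acc = (pc | acc) if (c >> i) & 1 else (pc & acc)
    return acc


values = list(range(256))
C = complement_planes(values, 8)
mask = mask_le(C, 198, len(values))       # 198 = 11000110 in binary
brute = sum(1 << n for n, v in enumerate(values) if v <= 198)
print(mask == brute)
```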
Proposition 3
• Let m be the highest bit position of the jth attribute of data point x, let $P_{j,m}, P_{j,m-1}, \dots, P_{j,0}$ be the basic P-trees of the bits of the jth attribute, and let the integer $c = b_m b_{m-1} \dots b_0$, where $b_i$ is c's ith binary bit value. Let $P_{x_j > c}$ be the Range Mask that satisfies the inequality $x_j > c$. Then
$P_{x_j > c} = P_{j,m}\ op_m \dots P_{j,i}\ op_i \dots P_{j,k}$,
s.t. 1) $op_i$ is $\land$ if $b_i = 1$, 2) $op_i$ is $\lor$ if $b_i = 0$, 3) right binding within each attribute, 4) $b_k = 0$ and $b_i = 1\ \forall i < k$.
• Example: $P_{x_j > (72)_{10}} = P_{x_j > (01001000)_2} = P_7 \lor (P_6 \land (P_5 \lor (P_4 \lor (P_3 \land (P_2 \lor (P_1 \lor P_0))))))$
Proposition 4
• Let m be the highest bit position of the jth attribute of data point x, let $P'_{j,m}, P'_{j,m-1}, \dots, P'_{j,0}$ be the complement P-trees of the bits of the jth attribute, and let the integer $c = b_m b_{m-1} \dots b_0$, where $b_i$ is c's ith binary bit value. Let $P_{x_j < c}$ be the Range Mask that satisfies $x_j < c$. Then
$P_{x_j < c} = P'_{j,m}\ op_m \dots P'_{j,i}\ op_i \dots P'_{j,k}$,
s.t. 1) $op_i$ is $\land$ if $b_i = 0$, 2) $op_i$ is $\lor$ if $b_i = 1$, 3) right binding within each attribute, 4) $b_k = 1$ and $b_i = 0\ \forall i < k$.
• Example: $P_{x_j < (72)_{10}} = P_{x_j < (01001000)_2} = P'_7 \land (P'_6 \lor (P'_5 \land (P'_4 \land P'_3)))$
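The strict masks for x > c and x < c can be sketched with the same bitwise loop, starting from the empty mask instead of the all-ones mask, because a value equal to c on every bit must not qualify. Integers with one bit per data point stand in for the basic and complement P-trees. The function names and the empty starting mask are choices of this sketch.

```python
def planes_and_complements(values, nbits):
    """Return (basic, complement) bit planes, one integer mask per bit position."""
    full = (1 << len(values)) - 1
    basic = [sum(1 << n for n, v in enumerate(values) if (v >> i) & 1)
             for i in range(nbits)]
    return basic, [full ^ p for p in basic]


def mask_gt(basic, c):
    """Points with x > c: AND where the bit of c is 1, OR where it is 0."""
    acc = 0                               # equal on every bit is not "greater"
    for i, p in enumerate(basic):
        acc = (p & acc) if (c >> i) & 1 else (p | acc)
    return acc


def mask_lt(comp, c):
    """Points with x < c: OR where the bit of c is 1, AND where it is 0."""
    acc = 0                               # equal on every bit is not "less"
    for i, pc in enumerate(comp):
        acc = (pc | acc) if (c >> i) & 1 else (pc & acc)
    return acc


values = list(range(256))
P, C = planes_and_complements(values, 8)
gt72 = mask_gt(P, 72)                     # 72 = 01001000 in binary
lt72 = mask_lt(C, 72)
print(bin(gt72).count("1"), bin(lt72).count("1"))   # how many values are > 72 and < 72
```

For x < 72 the three trailing 0 bits of c are ANDed with the empty mask and vanish, which is why the expression can stop at the bit-3 complement plane.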
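More Examples
• $c = (72)_{10} = (01001000)_2$
$P_{x<c} = P'_7 \land (P'_6 \lor (P'_5 \land (P'_4 \land P'_3)))$
• $c = (72)_{10} = (01001000)_2$
$P_{x>c} = P_7 \lor (P_6 \land (P_5 \lor (P_4 \lor (P_3 \land (P_2 \lor (P_1 \lor P_0))))))$
• $c = (198)_{10} = (11000110)_2$
$P_{x \le c} = P'_7 \lor (P'_6 \lor (P'_5 \land (P'_4 \land (P'_3 \land (P'_2 \lor (P'_1 \lor P'_0))))))$
• $c = (198)_{10} = (11000110)_2$
$P_{x \ge c} = P_7 \land (P_6 \land (P_5 \lor (P_4 \lor (P_3 \lor (P_2 \land P_1)))))$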
Theorem – Range Mask Complement Rule
• Theorem (Range Mask Complement Rule): Let $P_{x_j \ge c}$, $P_{x_j < c}$, $P_{x_j \le c}$ and $P_{x_j > c}$ be the Range Masks of the jth attribute of any data point x, where c is an integer. Then
$P_{x_j \ge c} = P'_{x_j < c}$ and $P_{x_j \le c} = P'_{x_j > c}$
hold.
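The complement rule suggests that only the two strict masks need to be built bitwise, with the non-strict ones obtained by complementing. The sketch below checks this exhaustively for 8-bit values. Integers with one bit per data point stand in for P-trees, and all names are hypothetical.

```python
NBITS = 8
values = list(range(1 << NBITS))            # every 8-bit value once
FULL = (1 << len(values)) - 1               # mask with one set bit per data point

# P[i]: bit n is set when bit i of values[n] is 1; C[i] is its complement.
P = [sum(1 << n for n, v in enumerate(values) if (v >> i) & 1) for i in range(NBITS)]
C = [FULL ^ p for p in P]


def mask_gt(c):
    acc = 0
    for i in range(NBITS):
        acc = (P[i] & acc) if (c >> i) & 1 else (P[i] | acc)
    return acc


def mask_lt(c):
    acc = 0
    for i in range(NBITS):
        acc = (C[i] | acc) if (c >> i) & 1 else (C[i] & acc)
    return acc


def mask_ge(c):
    return FULL ^ mask_lt(c)                # x >= c is the complement of x < c


def mask_le(c):
    return FULL ^ mask_gt(c)                # x <= c is the complement of x > c


def brute(pred):
    """Reference mask computed by testing every value directly."""
    return sum(1 << n for n, v in enumerate(values) if pred(v))


print(all(mask_ge(c) == brute(lambda v: v >= c) for c in range(256)))
print(all(mask_le(c) == brute(lambda v: v <= c) for c in range(256)))
```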
Definition of Neighborhood Ring
• Neighborhood Ring: The Neighborhood ring of radii r1 and r2 centered at c is defined as $R(c, r_1, r_2) = \{\, x \in X \mid r_1 < \mathrm{abs}(x - c) \le r_2 \,\}$, where abs(x - c) is the absolute distance between x and c.
Definition of Equal Interval Neighborhood Ring (EINring)
• The Equal Interval Neighborhood ring of radii r1 and r2 centered at c is defined as $R(c, r_1, r_2) = \{\, x \in X \mid r_1 < \mathrm{abs}(x - c) \le r_2 \,\}$ with abs(r2 - r1) = 2k, where k = 1, 2, …; abs(x - c) is the absolute distance between x and c, and abs(r2 - r1) is the interval.
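A direct membership test makes the ring definition concrete. The sketch reads the condition as r1 < abs(x - c) ≤ r2 and uses one-dimensional integer data. The center 86 and the interval 16 are picked only for illustration, and `neighborhood_ring` is a hypothetical helper.

```python
def neighborhood_ring(values, c, r1, r2):
    """Values x with r1 < abs(x - c) <= r2."""
    return [x for x in values if r1 < abs(x - c) <= r2]


inner = neighborhood_ring(range(256), 86, 0, 16)     # first ring around 86
outer = neighborhood_ring(range(256), 86, 16, 32)    # next ring, same interval of 16

print(min(inner), max(inner), len(inner))
print(min(outer), max(outer), len(outer))
```

Because the lower bound is strict and the upper bound is not, consecutive rings with the same interval tile the space without overlapping.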
Diagram of EINring
[Figure: Diagram of EINring]
Example of EINring
EINring       Binary Range        Decimal
R16(86,1)     01000110-01100110   70-102
R16(86,2)     00110110-01110110   54-118
R16(86,3)     00100110-10000110   38-134
R16(86,4)     00010110-10010110   22-150
…             …                   …
R16(86,10)    00000000-11100110   0-230
R16(86,11)    00000000-11110110   0-246
R16(86,12)    00000000-11111111   0-255
Neighbor Count within EINring
• For any data point x, let x = (x1, x2, …, xm), where xj is x's jth attribute column. Let r be a vector with m elements. We define the range mask $P_{x > c-r}$ as
$P_{x > c-r} = P_{x_1 > c-r_1} \land P_{x_2 > c-r_2} \land \dots \land P_{x_m > c-r_m}$
and define the range mask $P_{x \le c+r}$ as
$P_{x \le c+r} = P_{x_1 \le c+r_1} \land P_{x_2 \le c+r_2} \land \dots \land P_{x_m \le c+r_m}$
where c is a constant.
Neighbor Count within EINring:
The range mask for the data points x within the neighborhood ring R(c, 0, r) is calculated by
$P_{c,r} = P_{x > c-r} \land P_{x \le c+r}$
The neighbor count for x within the neighborhood ring R(c, 0, r) is calculated by
$NC(c, 0, r) = RC(P_{c,r})$, where RC is the root count of the P-tree.
Neighbor Count within EINring
The Neighbor Count NC(c, r1, r2) of c within EINring R(c, r1, r2) is calculated as
$NC(c, r_1, r_2) = RC(P_{c,r_2}) - RC(P_{c,r_1})$
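The neighbor-count formulas can be sketched end to end on a tiny two-attribute data set. Several things here are assumptions of the sketch rather than statements from the slides: integers with one bit per data point stand in for P-trees, the root count is taken as the number of set bits, the center is given per attribute, a single scalar radius is used for all attributes, and bounds that fall outside the 8-bit range are clipped explicitly. All function names are hypothetical.

```python
NBITS = 8


def planes(column):
    """Bit planes of one attribute column: planes(column)[i] marks points whose bit i is 1."""
    return [sum(1 << n for n, v in enumerate(column) if (v >> i) & 1)
            for i in range(NBITS)]


def mask_gt(P, c, full):
    """Points with x > c, with out-of-range bounds handled up front."""
    if c < 0:
        return full                        # every non-negative value exceeds c
    if c >= (1 << NBITS) - 1:
        return 0                           # no NBITS-wide value exceeds c
    acc = 0
    for i, p in enumerate(P):
        acc = (p & acc) if (c >> i) & 1 else (p | acc)
    return acc


def mask_le(P, c, full):
    return full ^ mask_gt(P, c, full)      # complement rule


def neighborhood_mask(columns, center, r):
    """P_{c,r}: AND over attributes of (x_j > c_j - r) AND (x_j <= c_j + r)."""
    full = (1 << len(columns[0])) - 1
    acc = full
    for column, c_j in zip(columns, center):
        P = planes(column)
        acc &= mask_gt(P, c_j - r, full) & mask_le(P, c_j + r, full)
    return acc


def root_count(mask):
    return bin(mask).count("1")            # number of data points selected by the mask


def neighbor_count(columns, center, r1, r2):
    """NC(c, r1, r2) = RC(P_{c,r2}) - RC(P_{c,r1})."""
    return (root_count(neighborhood_mask(columns, center, r2))
            - root_count(neighborhood_mask(columns, center, r1)))


columns = [[86, 90, 70, 100, 120, 86],     # attribute 1 of six data points
           [50, 60, 50, 40, 50, 30]]       # attribute 2 of the same points
center = (86, 50)
print(neighbor_count(columns, center, 0, 16))   # points inside radius 16
print(neighbor_count(columns, center, 8, 16))   # points between radius 8 and 16
```

The subtraction works because the mask for the smaller radius selects a subset of the points selected for the larger one.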
Summary
• Equal Interval Neighborhood Ring (EINring) is much finer than HOBBit ring.
• Calculation of EINring using P-trees is efficient, comparable to that of HOBBit ring.