Efficient Equal Interval Neighborhood Ring (P-trees technology is patented by NDSU)


Outline

• Review: HOBBit Metric

• Equal Interval Neighborhood Ring (EINring)
  – Prototype Problem
  – Definition of Range Mask
  – Propositions
  – Definition of EINring
  – Calculation of EINring Using P-trees

• Summary

Review: HOBBit Similarity Metric

• Let X and Y be two values. The HOBBit similarity between X and Y is defined by

$m(X, Y) = \max\{\, i \mid x_i \oplus y_i = 1 \,\}$

• where $x_i$ and $y_i$ are the ith bits of X and Y respectively, and $\oplus$ denotes XOR. In other words, it is the leftmost position at which X and Y differ.
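The definition can be tried out with a minimal Python sketch. It assumes bit positions are numbered from 0 at the least significant bit, which the slide does not state; the function name `hobbit_m` is illustrative only.

```python
def hobbit_m(x: int, y: int) -> int:
    """Highest bit position at which x and y differ.

    Assumption: positions are numbered from 0 at the least significant bit.
    Returns -1 when x == y, because no position differs.
    """
    # XOR leaves a 1 exactly where the two values differ;
    # bit_length() - 1 is the index of the highest such 1.
    return (x ^ y).bit_length() - 1


print(hobbit_m(22, 23))  # 00010110 vs 00010111 differ only in bit 0
print(hobbit_m(22, 16))  # 00010110 vs 00010000 differ first in bit 2
```

Two values that share a long prefix of leading bits get a small m, so a small m means the values are close in the HOBBit sense.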

Review: HOBBit Distance & Ring

• The HOBBit distance between two tuples X and Y is defined by

$d(X, Y) = 2^{m(X, Y)}$

• HOBBit Ring: The HOBBit ring of radii $r_1$ and $r_2$, centered at c, is defined as $R(c, r_1, r_2) = \{x \in X \mid r_1 \le d(c, x) < r_2\}$, where $d(c, x)$ is the HOBBit distance.

Diagram of HOBBit Ring

[Figure: diagram of the HOBBit ring]

Example of HOBBit Ring

HOBBit Ring   Binary Range          Decimal
R(22,7,8)     00010110-00010111     22-23
R(22,6,8)     00010100-00010111     20-23
R(22,5,8)     00010000-00010111     16-23
R(22,4,8)     00010000-00011111     16-31
R(22,3,8)     00000000-00011111     0-31
R(22,2,8)     00000000-00111111     0-63
R(22,1,8)     00000000-01111111     0-127
R(22,0,8)     00000000-11111111     0-255
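The ranges in this table can be reproduced with a short sketch. It assumes the first radius argument is the number of leading bits a value shares with the center; that reading is inferred from the table rows, not stated on the slide. The helper name `hobbit_range` is hypothetical.

```python
def hobbit_range(center: int, shared_bits: int, width: int = 8):
    """Range of width-bit values that share `shared_bits` leading bits with center.

    Assumption: this is how the rows R(22, k, 8) are read; the remaining
    low-order bits are free to take any value.
    """
    free = width - shared_bits
    low = (center >> free) << free      # clear the free low-order bits
    high = low | ((1 << free) - 1)      # set the free low-order bits
    return low, high


for k in range(7, -1, -1):
    low, high = hobbit_range(22, k)
    print(f"R(22,{k},8)  {low:08b}-{high:08b}  {low}-{high}")
```

Each step down in k frees one more bit, so the range can double in size while the center stays wherever its low-order bits put it, which is why the ring is not symmetric around 22.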

Summary of HOBBit Metric

• The HOBBit metric is based on the most significant matching bit positions, starting from the left.

• The HOBBit ring is a geometric ring whose diameter increases exponentially.

• The HOBBit ring is an eccentric ring.

Equal Interval Neighborhood Ring (EINring)

Outline

• Prototype Problem

• Definition of Range Mask

• Propositions and Theorem

• Definition of EINring

• Calculation of EINring using P-trees

Prototype Problem

• Problem: $x > (4)_{10} = (100)_2$

• Conjecture: $P_{x > (100)_2} = P_3 \wedge (P_2 \vee P_1)$

8x8 data set:

7 7 7 7 5 5 1 1
7 7 7 7 1 1 1 1
5 5 7 7 4 4 1 1
5 7 7 7 4 5 5 1
6 6 6 6 3 3 0 0
6 6 6 6 0 0 0 0
2 2 6 6 3 3 0 0
2 6 6 6 3 3 3 0
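The conjecture can be checked against a direct comparison on the 8x8 data set. The sketch below uses flat lists of booleans as a stand-in for P-trees; real P-trees are compressed quadrant trees, so this only illustrates the logic. The helper `bit_slice` and the 1-based bit numbering (P3 is the most significant bit) are assumptions made to match the slide's notation.

```python
DATA = [
    [7, 7, 7, 7, 5, 5, 1, 1],
    [7, 7, 7, 7, 1, 1, 1, 1],
    [5, 5, 7, 7, 4, 4, 1, 1],
    [5, 7, 7, 7, 4, 5, 5, 1],
    [6, 6, 6, 6, 3, 3, 0, 0],
    [6, 6, 6, 6, 0, 0, 0, 0],
    [2, 2, 6, 6, 3, 3, 0, 0],
    [2, 6, 6, 6, 3, 3, 3, 0],
]
values = [v for row in DATA for v in row]


def bit_slice(vals, i):
    """Boolean column for bit i (1-based, so i=3 is the high bit of a 3-bit value)."""
    return [bool((v >> (i - 1)) & 1) for v in vals]


P3, P2, P1 = (bit_slice(values, i) for i in (3, 2, 1))

# Conjecture: x > (100)_2  is  P3 AND (P2 OR P1)
conjecture = [p3 and (p2 or p1) for p3, p2, p1 in zip(P3, P2, P1)]

# "Crude" method: compare every value directly
direct = [v > 4 for v in values]

print(conjecture == direct)  # the two masks agree on this data set
print(sum(conjecture))       # number of points with x > 4
```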

Walk Through: Peano Trees

[Figure: walk-through of Peano trees]

Result of Crude Method

[Figure: result of the crude method]

Result of Conjecture

[Figure: result of the conjecture]

Definition of Range Mask

• Range Mask: The Range Mask is the P-tree mask that selects every data point x that satisfies a range inequality, e.g., $x \ge c_1$, $x > c_1$, $x \le c_2$, etc., where $c_1$ and $c_2$ are integers.

• Example: $P_{x > 100}$ is the P-tree mask that selects every data point greater than 100.

Proposition 1

• Let the jth attribute of data point x have bits m, m-1, …, 0, let $P_{j,m}, P_{j,m-1}, \dots, P_{j,0}$ be the basic P-trees of the bits of the jth attribute, and let $c = b_m b_{m-1} \dots b_0$ be an integer, where $b_i$ is the ith bit of c. Let $P_{x_j \ge c}$ be the Range Mask that satisfies the inequality $x_j \ge c$. Then

$P_{x_j \ge c} = P_{j,m}\ op_{j,m} \dots P_{j,i}\ op_{j,i} \dots P_{j,k}$,

s.t. 1) $op_{j,i}$ is $\wedge$ if $b_i = 1$, 2) $op_{j,i}$ is $\vee$ if $b_i = 0$, 3) right binding within each attribute, 4) $b_k = 1$ and $b_l = 0$ for all $l < k$, because trailing zero bits of c place no constraint on x.

• Example: $P_{x_j \ge (70)_{10}} = P_{x_j \ge (01000110)_2}$

$= P_7 \vee (P_6 \wedge (P_5 \vee (P_4 \vee (P_3 \vee (P_2 \wedge P_1)))))$
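The right-bound chain for $x \ge c$ can be sketched with Python integers used as bitmaps, one bit per value in 0..255. This is only a stand-in for P-trees, and the names `basic_bitmap`, `mask_ge` and `members` are illustrative. The chain is built from the lowest bit outward, starting from an all-ones mask so that trailing zero bits of c drop out on their own.

```python
WIDTH = 8
DOMAIN = range(1 << WIDTH)
ALL_ONES = (1 << len(DOMAIN)) - 1


def basic_bitmap(i):
    """Bitmap with bit x set when bit i of the value x is 1 (stand-in for P_i)."""
    return sum(1 << x for x in DOMAIN if (x >> i) & 1)


P = [basic_bitmap(i) for i in range(WIDTH)]


def mask_ge(c):
    """Right-bound chain for x >= c: AND where b_i = 1, OR where b_i = 0."""
    acc = ALL_ONES                  # empty chain: no constraint yet
    for i in range(WIDTH):          # innermost (lowest) bit first
        if (c >> i) & 1:
            acc = P[i] & acc
        else:
            acc = P[i] | acc
    return acc


def members(mask):
    """Values whose bit is set in the bitmap."""
    return [x for x in DOMAIN if (mask >> x) & 1]


selected = members(mask_ge(70))
print(selected[0], selected[-1], len(selected))  # smallest, largest, count
```

Building the chain inside-out mirrors the right binding: each step wraps the mask for the lower bits in one more AND or OR.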

Proof Sketch

Without loss of generality, assume data point x has one attribute. Let $c = b_m \dots b_i \dots b_0$, where $b_i$ is the ith bit of c, and let $P_{x \ge c}$ be the range mask that satisfies $x \ge c$.

If $b_i = 1$, the ith bit of x must be 1 whenever x and c agree on all bit positions above i, i.e., $P_{x \ge c} = P_m \dots (P_i \wedge (P_{i-1} \dots P_0))$. (Partially done!)

If $b_i = 0$, there are two cases that satisfy $x \ge c$: one sets the ith bit of x to $x_i = 1$, the other sets $x_i = 0$ and relies on the lower bits. Thus

$P_{x \ge c} = (P_m \dots P_i) \vee (P_m \dots P'_i \wedge (P_{i-1} \dots P_0))$

= < complement rule, $X \vee (X' \wedge Y) = X \vee Y$ >

$P_{x \ge c} = P_m \dots (P_i \vee (P_{i-1} \dots P_0))$. Done!
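The complement rule used in the last step follows from distributivity. As a short worked derivation, with X standing for $P_i$ and Y for the chain over the lower bits:

```latex
\begin{aligned}
X \vee (X' \wedge Y) &= (X \vee X') \wedge (X \vee Y) && \text{distributivity} \\
                     &= 1 \wedge (X \vee Y)           && \text{since } X \vee X' = 1 \\
                     &= X \vee Y
\end{aligned}
```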

Proposition 2

• Let the jth attribute of data point x have bits m, m-1, …, 0, let $P'_{j,m}, P'_{j,m-1}, \dots, P'_{j,0}$ be the complement P-trees of the bits of the jth attribute, and let $c = b_m b_{m-1} \dots b_0$ be an integer, where $b_i$ is the ith bit of c. Let $P_{x_j \le c}$ be the Range Mask that satisfies $x_j \le c$. Then

$P_{x_j \le c} = P'_{j,m}\ op_{j,m} \dots P'_{j,i}\ op_{j,i} \dots P'_{j,k}$,

s.t. 1) $op_{j,i}$ is $\wedge$ if $b_i = 0$, 2) $op_{j,i}$ is $\vee$ if $b_i = 1$, 3) right binding within each attribute, 4) $b_k = 0$ and $b_l = 1$ for all $l < k$, because trailing one bits of c place no constraint on x.

• Example: $P_{x_j \le (197)_{10}} = P_{x_j \le (11000101)_2}$

$= P'_7 \vee (P'_6 \vee (P'_5 \wedge (P'_4 \wedge (P'_3 \wedge (P'_2 \vee P'_1)))))$

Proposition 3

• Let the jth attribute of data point x have bits m, m-1, …, 0, let $P_{j,m}, P_{j,m-1}, \dots, P_{j,0}$ be the basic P-trees of the bits of the jth attribute, and let $c = b_m b_{m-1} \dots b_0$ be an integer, where $b_i$ is the ith bit of c. Let $P_{x_j > c}$ be the Range Mask that satisfies the inequality $x_j > c$. Then

$P_{x_j > c} = P_{j,m}\ op_{j,m} \dots P_{j,i}\ op_{j,i} \dots P_{j,k}$,

s.t. 1) $op_{j,i}$ is $\wedge$ if $b_i = 1$, 2) $op_{j,i}$ is $\vee$ if $b_i = 0$, 3) right binding within each attribute, 4) $b_k = 0$ and $b_l = 1$ for all $l < k$.

• Example: $P_{x_j > (72)_{10}} = P_{x_j > (01001000)_2}$

$= P_7 \vee (P_6 \wedge (P_5 \vee (P_4 \vee (P_3 \wedge (P_2 \vee (P_1 \vee P_0))))))$

Proposition 4

• Let the jth attribute of data point x have bits m, m-1, …, 0, let $P'_{j,m}, P'_{j,m-1}, \dots, P'_{j,0}$ be the complement P-trees of the bits of the jth attribute, and let $c = b_m b_{m-1} \dots b_0$ be an integer, where $b_i$ is the ith bit of c. Let $P_{x_j < c}$ be the Range Mask that satisfies $x_j < c$. Then

$P_{x_j < c} = P'_{j,m}\ op_{j,m} \dots P'_{j,i}\ op_{j,i} \dots P'_{j,k}$,

s.t. 1) $op_{j,i}$ is $\wedge$ if $b_i = 0$, 2) $op_{j,i}$ is $\vee$ if $b_i = 1$, 3) right binding within each attribute, 4) $b_k = 1$ and $b_l = 0$ for all $l < k$.

• Example: $P_{x_j < (72)_{10}} = P_{x_j < (01001000)_2}$

$= P'_7 \wedge (P'_6 \vee (P'_5 \wedge (P'_4 \wedge P'_3)))$

More Examples

• $c = (72)_{10} = (01001000)_2$

$P_{x < c} = P'_7 \wedge (P'_6 \vee (P'_5 \wedge (P'_4 \wedge P'_3)))$

• $c = (72)_{10} = (01001000)_2$

$P_{x > c} = P_7 \vee (P_6 \wedge (P_5 \vee (P_4 \vee (P_3 \wedge (P_2 \vee (P_1 \vee P_0))))))$

• $c = (197)_{10} = (11000101)_2$

$P_{x \le c} = P'_7 \vee (P'_6 \vee (P'_5 \wedge (P'_4 \wedge (P'_3 \wedge (P'_2 \vee P'_1)))))$

• $c = (197)_{10} = (11000101)_2$

$P_{x \ge c} = P_7 \wedge (P_6 \wedge (P_5 \vee (P_4 \vee (P_3 \vee (P_2 \wedge (P_1 \vee P_0))))))$

Theorem – Range Mask Complement Rule

• Theorem (Range Mask Complement Rule): Let $P_{x_j \le c}$, $P_{x_j < c}$, $P_{x_j \ge c}$ and $P_{x_j > c}$ be the Range Masks of the jth attribute of any data point x, where c is an integer. Then

$P_{x_j \ge c} = P'_{x_j < c}$ and $P_{x_j \le c} = P'_{x_j > c}$

hold.
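The four propositions and the complement rule can be checked exhaustively for 8-bit values. The sketch below evaluates the right-bound chain for a single value x, bit by bit, from the lowest bit outward. The starting value of the chain (true for the non-strict inequalities, false for the strict ones) is an assumption that stands in for the truncation conditions; the function name `in_mask` is illustrative.

```python
def in_mask(x: int, c: int, relation: str, width: int = 8) -> bool:
    """Evaluate the right-bound chain for one value x against the constant c.

    relation is one of ">=", ">", "<=", "<".
    """
    complemented = relation in ("<", "<=")   # "<" and "<=" use complement bits P'
    acc = relation in (">=", "<=")           # empty chain: True if non-strict
    for i in range(width):                   # innermost (lowest) bit first
        xi = bool((x >> i) & 1)
        bi = (c >> i) & 1
        p = (not xi) if complemented else xi
        # ">=" and ">": AND where b_i = 1; "<=" and "<": AND where b_i = 0
        use_and = (bi == 0) if complemented else (bi == 1)
        acc = (p and acc) if use_and else (p or acc)
    return acc


pairs = [(x, c) for x in range(256) for c in range(256)]
print(all(in_mask(x, c, ">=") == (x >= c) for x, c in pairs))
print(all(in_mask(x, c, ">") == (x > c) for x, c in pairs))
print(all(in_mask(x, c, "<=") == (x <= c) for x, c in pairs))
print(all(in_mask(x, c, "<") == (x < c) for x, c in pairs))
# Complement rule: the ">=" mask is the complement of the "<" mask
print(all(in_mask(x, c, ">=") == (not in_mask(x, c, "<")) for x, c in pairs))
```

Starting the strict chains from false and the non-strict chains from true is what makes the low-order bits drop out in the way the truncation conditions describe.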

Definition of Neighborhood Ring

• Neighborhood Ring: The Neighborhood ring of radii $r_1$ and $r_2$, centered at c, is defined as $R(c, r_1, r_2) = \{x \in X \mid r_1 < |x - c| \le r_2\}$, where $|x - c|$ is the absolute difference between x and c.

Definition of Equal Interval Neighborhood Ring (EINring)

• The Equal Interval Neighborhood ring of radii $r_1$ and $r_2$, centered at c, is defined as $R(c, r_1, r_2) = \{x \in X \mid r_1 < |x - c| \le r_2\}$ with $|r_2 - r_1| = 2^k$, where $k = 1, 2, \dots$, $|x - c|$ is the absolute difference between x and c, and $|r_2 - r_1|$ is the interval.

Diagram of EINring

[Figure: diagram of the EINring]

Example of EINring

EINring       Binary Range          Decimal
R16(86,1)     01000110-01100110     70-102
R16(86,2)     00110110-01110110     54-118
R16(86,3)     00100110-10000110     38-134
R16(86,4)     00010110-10010110     22-150
…             …                     …
R16(86,9)     00000000-11100110     0-230
R16(86,10)    00000000-11110110     0-246
R16(86,11)    00000000-11111111     0-255
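The first rows of the table follow from a simple rule, sketched below. It assumes R16(c, k) denotes the values within k intervals of width 16 on either side of the center, clipped to the 8-bit range; this reading is inferred from the first four rows, and the names `einring_range` and `as_row` are illustrative.

```python
def einring_range(center: int, k: int, interval: int = 16, width: int = 8):
    """Range covered by k equal intervals on each side of center, clipped to width bits."""
    top = (1 << width) - 1
    low = max(0, center - k * interval)
    high = min(top, center + k * interval)
    return low, high


def as_row(center: int, k: int) -> str:
    """Format one row as 'binary range  decimal range'."""
    low, high = einring_range(center, k)
    return f"{low:08b}-{high:08b} {low}-{high}"


for k in range(1, 5):
    print(f"R16(86,{k})", as_row(86, k))
```

Unlike the HOBBit ring, each step adds the same width on both sides, so the range stays centered on 86 until it is clipped at 0 or 255.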

Neighbor Count within EINring

• For any data point x, let $x = (x_1, x_2, \dots, x_m)$, where $x_j$ is the jth attribute of x. Let $r = (r_1, r_2, \dots, r_m)$ be a vector with m elements. We define the range mask $P_{x > c - r}$ as

$P_{x > c - r} = P_{x_1 > c - r_1} \wedge P_{x_2 > c - r_2} \wedge \dots \wedge P_{x_m > c - r_m}$

and define the range mask $P_{x \le c + r}$ as

$P_{x \le c + r} = P_{x_1 \le c + r_1} \wedge P_{x_2 \le c + r_2} \wedge \dots \wedge P_{x_m \le c + r_m}$

where c is a constant.

Neighbor Count within EINring

The range mask for the data points x within the neighborhood ring R(c, 0, r) is calculated by

$P_{c,r} = P_{x > c - r} \wedge P_{x \le c + r}$

The neighbor count within the neighborhood ring R(c, 0, r) is calculated by

$NC(c, 0, r) = RC(P_{c,r})$, where RC is the root count of the P-tree.

Neighbor Count within EINring

The Neighbor Count $NC(c, r_1, r_2)$ of c within the EINring $R(c, r_1, r_2)$ is calculated as

$NC(c, r_1, r_2) = RC(P_{c,r_2}) - RC(P_{c,r_1})$
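The subtraction of root counts can be illustrated on a single attribute. In this sketch a plain list of booleans stands in for a P-tree mask and the root count is just the number of true entries; the sample values and the names `ring_mask`, `root_count` and `neighbor_count` are made up for the illustration.

```python
VALUES = [86, 70, 71, 90, 102, 103, 54, 118, 10, 200]  # hypothetical 1-D data


def ring_mask(values, c, r):
    """Mask for c - r < x <= c + r, i.e. the AND of the two range masks."""
    return [(c - r < v) and (v <= c + r) for v in values]


def root_count(mask):
    """Number of data points selected by the mask (stand-in for RC)."""
    return sum(mask)


def neighbor_count(values, c, r1, r2):
    """NC(c, r1, r2) = RC(P_{c,r2}) - RC(P_{c,r1})."""
    return root_count(ring_mask(values, c, r2)) - root_count(ring_mask(values, c, r1))


print(root_count(ring_mask(VALUES, 86, 16)))  # points with 70 < x <= 102
print(neighbor_count(VALUES, 86, 16, 32))     # points added by widening r from 16 to 32
```

Because the mask for the smaller radius is contained in the mask for the larger one, the difference of the two counts is exactly the number of points that fall between the two radii.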

Summary

• The Equal Interval Neighborhood Ring (EINring) is much finer than the HOBBit ring.

• Calculation of the EINring using P-trees is efficient, comparable to that of the HOBBit ring.