Euler diagrams with circles
Gennaro Cordasco and Rosario De Chiara
INTRODUCTION
Problem and ideas
Euler‐Venn Diagrams
• Euler diagrams are a diagrammatic method for representing information about the relationships between sets
– They have been used in various forms since Euler first introduced them, and they generalize Venn diagrams, which represent all set intersections
Scenarios
• Currently Euler diagrams (or their variants) are being applied in a multitude of areas including: file‐information systems, library systems, statistical data representation, and as the basis for logical specification and reasoning systems
Scenarios
Visualize the number of documents matching a query. This may help in slightly modifying a query in order to refine or expand the result.
Scenarios
Area‐proportional zones: in this Venn diagram the ellipses are drawn so as to convey more information about each single zone.
Scenarios
This diagram shows an Euler diagram with a large number of curves, being used to visualize complex genetic set relations. Evolutionary optimization techniques can be used to obtain area proportionality.
Scenarios
VennFS allows users to draw Euler diagrams to organize files within categories. The diagram is then “exported” to the file system.
The problem
• To compute the set of zones associated with a given collection of contours
The real problem
• Adding curves to a diagram creates zones
• Zones hardly have an analytic description:
– They cannot be described using parameters in a function
• Zones are “blob”‐shaped figures:
– Non‐convex
– Holed
TRIVIAL ALGORITHM
Not so trivial after all
Euler Diagram: Formal definition
An Euler Diagram (ED) is a pair: d = <C(d), Z(d)>
where:
– C(d) is a finite collection of closed curves in the plane whose members are called contours.
– Z(d) is the collection of zones, where each zone Z ∈ Z(d) is given by
$$Z = \bigcap_{c_i \in X} \mathrm{disc}(c_i) \;\cap \bigcap_{c_j \in C(d) - X} \overline{\mathrm{disc}(c_j)}$$
where X ⊆ C(d).
– $\bigcup_{Z \in Z(d)} Z = U$
Euler Diagram: Notation
Let d be an Euler diagram.
‐ Each contour A ∈ C(d) has a unique |C(d)|‐bit identifier A.id having Hamming weight 1.
‐ Let Z ∈ Z(d) be a zone of d with containing set of contours X. Then Z has a unique |C(d)|‐bit identifier Z.id, which is the bitwise OR of all of the identifiers of the contours in X.
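A minimal Python sketch of this notation, assuming identifiers are stored as plain integers and contours are numbered from 0. The helper names `contour_id`, `zone_id` and `fmt` are illustrative and do not come from the slides.

```python
def contour_id(index: int) -> int:
    """Identifier with Hamming weight 1 for the contour at the given index."""
    return 1 << index


def zone_id(containing_contours: list[int]) -> int:
    """Bitwise OR of the identifiers of the contours that contain the zone."""
    ident = 0
    for c in containing_contours:
        ident |= c
    return ident


def fmt(ident: int, n_contours: int) -> str:
    """Render an identifier as a bit string with one digit per contour."""
    return format(ident, f"0{n_contours}b")


a, b, c, d = (contour_id(i) for i in range(4))
print(fmt(zone_id([a, b]), 4))  # zone inside the first two contours only
print(fmt(zone_id([]), 4))      # zone outside every contour
```

With this encoding, testing whether a zone lies inside a contour is a single bitwise AND between the two identifiers.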
[Figure: panels (a) and (b), two example diagrams labelled with contour and zone identifiers such as 0001, 0010, 0100, 1000, 0011, 0101 and 0000]
Trivial algorithm
• We can use the trivial algorithm to compute the zones
Trivial algorithm 1
• Adding a new set at most doubles the number of zones already present in the diagram
– Each zone yields 2 candidate zones: one with 1 and one with 0 as the digit of the new set
Zones present in the diagram: 01, 10, 11
Candidate zones: 001, 101, 010, 110, 011, 111
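The candidate generation can be sketched as follows. This is only an illustration: it assumes identifiers are integers and that the new set takes the next unused high‐order bit, which reproduces the candidates 001, 101, 010, 110, 011, 111 from the zones 01, 10, 11. The function name `candidate_zones` is hypothetical.

```python
def candidate_zones(zones: list[int], n_contours: int) -> list[int]:
    """Return the candidate identifiers after adding one more contour.

    Each existing zone yields two candidates: itself (outside the new
    contour) and itself with the new contour's bit set (inside it).
    """
    new_bit = 1 << n_contours  # assumption: the new contour takes the next free bit
    candidates = []
    for z in zones:
        candidates.append(z)            # digit of the new contour is 0
        candidates.append(z | new_bit)  # digit of the new contour is 1
    return candidates


existing = [0b01, 0b10, 0b11]
for c in candidate_zones(existing, 2):
    print(format(c, "03b"))
```

The list only says which zones might exist; deciding which candidates are legal still depends on the geometry of the curves.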
Trivial algorithm 2
• Check whether each candidate zone belongs to the diagram or not
– This depends on the shape of the curves
[Figure: the candidate zones 001, 101, 010, 110, 011, 111 filtered down to the legal zones]
Trivial algorithm for circles
• A legal zone is the solution of “simple” simultaneous inequalities, one per circle, selected by the digits of the zone identifier:

Digit   Circle identifier   Center coordinates   Radius   Inequality
0       0…0…01              (a_1, b_1)           r_1      $(x - a_1)^2 + (y - b_1)^2 > r_1^2$
1       0…0…10              (a_2, b_2)           r_2      $(x - a_2)^2 + (y - b_2)^2 \le r_2^2$
⋮
0       0…1…00              (a_i, b_i)           r_i      $(x - a_i)^2 + (y - b_i)^2 > r_i^2$
⋮
1       1…0…00              (a_n, b_n)           r_n      $(x - a_n)^2 + (y - b_n)^2 \le r_n^2$

– If the i‐th digit is 0 then we want the part of $\mathbb{R}^2$ outside circle i
– If the i‐th digit is 1 then we want the part of $\mathbb{R}^2$ inside circle i
Standard methods can be used to solve the system.
Example
Check the zone 100:
$(x - 2)^2 + (y - 2)^2 \le 4$
$(x - 2)^2 + (y - 2)^2 > 1$
$x^2 + y^2 > 4$
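The following Python sketch evaluates the inequalities of a candidate zone at a single point. It assumes the circles are listed in the same order as the digits of the identifier, and the function name `in_zone` is hypothetical. Finding one point that satisfies every inequality shows that the zone is legal; a point that fails shows nothing, so this is not a full solver.

```python
def in_zone(point, circles, zone_bits):
    """Check whether a point satisfies every inequality of a candidate zone.

    circles   -- list of (a, b, r) tuples, in the same order as the digits
    zone_bits -- string of '0'/'1' digits, one per circle
    """
    x, y = point
    for (a, b, r), bit in zip(circles, zone_bits):
        inside = (x - a) ** 2 + (y - b) ** 2 <= r ** 2
        if inside != (bit == "1"):
            return False
    return True


# Circles of the example, ordered to match the digits of 100.
circles = [(2, 2, 2), (2, 2, 1), (0, 0, 2)]
print(in_zone((2, 3.5), circles, "100"))
print(in_zone((2, 2), circles, "100"))
```

The point (2, 3.5) lies inside the first circle and outside the other two, so it witnesses that zone 100 is not empty.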
EULER DIAGRAMS WITH RECTANGLES
Rectangles (!?)
Euler diagrams with rectangles
• A solution for the case of rectangular contours, based upon the R‐tree, is presented in [De Chiara et al.]
– The R‐tree [Guttman] is a data structure designed to answer range queries
Range queries
• Example:
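To make the meaning of a range query concrete, here is a brute‐force Python sketch. It is a linear scan, not an R‐tree: an R‐tree is expected to return the same boxes while visiting far fewer of them. The `Box` type, the function names, and the choice of "intersects the query box" as the query predicate are assumptions made for this illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def range_query(boxes: list[Box], query: Box) -> list[Box]:
    """Linear scan: return every stored box that intersects the query box."""
    return [b for b in boxes if intersects(b, query)]


boxes = [Box(0, 0, 2, 2), Box(3, 3, 5, 5), Box(1, 1, 4, 4)]
print(range_query(boxes, Box(2.5, 2.5, 3.5, 3.5)))
```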
Complexity
Given a collection S of n boxes in d‐dimensional space, a perfectly balanced R‐tree can be built [Agarwal et al.] in
$$O(n \log n)$$
Complexity
An R‐tree can answer a query returning k boxes in
$$O(n^{1 - 1/d} + k \log n)$$
Since d = 2, the query time actually is
$$O(\sqrt{n} + k \log n)$$
Zones vs Subspaces
• The R‐Tree will be used to track down all the subspaces created by intersecting sets
– Contours are rectangles, so it is easy to “rectangulate” the zones
EulerTree
• EulerTree extends R‐Tree
– Another level is added underneath the last level of the R‐Tree
– This level contains a node for every set in the Euler Diagram
Example
[Figure: an EulerTree with four levels]
Level 1: R‐Tree
Level 2
Level 3: R‐Tree leaves (s1, s2, s3, s4, s5): these are the subspaces created by the intersection of the sets
The interface: connects subspaces to the sets
Level 4 (Set1, Set2): one node for each set in the diagram
23/09/2004 ‐ Euler Diagrams 2004 Workshop
A data structure for Euler Diagrams 26
What about the elegant solutions?
EULER DIAGRAMS WITH CIRCLES
An elegant solution (?)
Wellformedness constraints
1. Curves can only intersect transversely: no single‐point intersection
2. No triple‐point intersection
3. Zones must be connected
[Figures: for each constraint, a diagram labelled “Not well formed”]
Curves shape constraint
• Curves must be circles
• We will relax the shape constraint to:
– Axis‐aligned ellipses
– Ellipses
– Convex curves
Assumptions: given two curves A and B we are still able to:
‐ Find the relationship between A and B
‐ Find their intersection points
‐ Check if x ∈ A, for any given point x.
Marker points
Each zone Z has a marker point Z.point which
‐ witnesses the existence of Z;
‐ can be used to check whether Z belongs to a given contour.
By construction, each marker point belongs to its zone’s boundary
Marker point association
Let V be a generic zone.
Case a) the contour C that has generated V has no intersection: a random point on the boundary of V is assigned to V.point.
Obs: in this case V has the same shape as C.
Case b) the contour C that has generated V1 intersects another contour, and the point associated with the split zone V belongs to V1:
V1.point = V.point
V2.point = x
where x is an intersection point between C and V.
Case c) the contour C that has generated V1 intersects another contour, and the point associated with the split zone V does not belong to V1:
V1.point = x
V2.point = V.point
where x is an intersection point between C and V.
[Figure: panels (a), (b) and (c) showing V, V1, V2, C, x and the marker points in each case]
The Key Idea
On the number of zones
Conjecture: The number of zones is equal to the number of intersection points plus the number of connected components
A connected component is a maximal set of intersecting contours.
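As an illustration, the quantity in the conjecture can be computed for a set of circles with the Python sketch below. It assumes a well‐formed diagram (no tangencies, no triple points), so every pair of crossing circles contributes exactly two distinct intersection points; the function names are hypothetical. The value does not count the outside zone, since an empty diagram has zero intersection points and zero components.

```python
import math
from itertools import combinations


def properly_overlap(c1, c2):
    """True if the two circles (x, y, r) cross in two points."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    return abs(r1 - r2) < d < r1 + r2


def predicted_zone_count(circles):
    """Intersection points plus connected components, as in the conjecture."""
    parent = list(range(len(circles)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    points = 0
    for i, j in combinations(range(len(circles)), 2):
        if properly_overlap(circles[i], circles[j]):
            points += 2                # two circles cross in exactly two points
            parent[find(i)] = find(j)  # i and j belong to the same component
    components = len({find(i) for i in range(len(circles))})
    return points + components


venn3 = [(0, 0, 2), (2, 0, 2), (1, 1.5, 2)]
print(predicted_zone_count(venn3))
```

For the three mutually crossing circles above there are 6 intersection points and 1 component, matching the 7 zones of a three‐set Venn diagram without the outside zone.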
Our Strategy
Inductive approach:
– Base: an empty diagram $d_0 = \langle C_0 = \emptyset, Z_0 = \{U\} \rangle$
– Inductive step: let $d_{i-1} = \langle C_{i-1}, Z_{i-1} \rangle$ and a contour $A \notin C_{i-1}$; we compute $d_i = \langle C_i = C_{i-1} \cup \{A\}, Z_i \rangle$
– The set $Z_i$ is obtained by identifying the set $Y \subseteq Z_{i-1}$ of zones that are split by A and the set of zones that are properly contained in A.
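At the level of identifiers, the inductive step might be sketched as below. This is one reading of the strategy, not the authors' code: the set Y of split zones and the set of properly contained zones are taken as inputs, the outside zone U is encoded as identifier 0, and a contour that crosses nothing is modelled by listing the single zone that hosts it among the split zones. All three choices, and the name `add_contour_ids`, are assumptions.

```python
def add_contour_ids(zones, a_id, split, contained):
    """Return the zone identifiers after adding a contour with identifier a_id.

    zones     -- set of identifiers before the insertion
    split     -- identifiers of the zones crossed by the new contour (Y)
    contained -- identifiers of the zones lying entirely inside the new contour
    """
    new_zones = set()
    for z in zones:
        if z in contained:
            new_zones.add(z | a_id)      # the whole zone is now inside A
        else:
            new_zones.add(z)             # the zone, or its part outside A, survives
            if z in split:
                new_zones.add(z | a_id)  # the part of the zone inside A
    return new_zones


U = 0
z = {U}
z = add_contour_ids(z, 0b001, split={U}, contained=set())
z = add_contour_ids(z, 0b010, split={0b000, 0b001}, contained=set())
z = add_contour_ids(z, 0b100, split={U}, contained={0b001, 0b010, 0b011})
print(sorted(format(v, "03b") for v in z))
```

The usage adds two crossing circles and then a third circle that encloses both of them without crossing either.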
SkimDiagram
Computes the relationship between each circle $B \in C_{i-1}$ and the new circle A to be added to d:
(a) A and B are interior disjoint: $d(A,B) > r_A + r_B$
(b) A properly contains B: $r_A > d(A,B) + r_B$
(c) B properly contains A: $r_B > d(A,B) + r_A$
(d) A and B properly overlap (the remaining case)
[Figure: the four configurations, with radii $r_A$, $r_B$ and center distance $d(A,B)$]
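The four cases translate directly into a small classifier. The sketch below assumes circles are given as (x, y, r) tuples and that tangent circles do not occur, as the wellformedness constraints require; the function name and the returned labels are illustrative.

```python
import math


def relationship(a, b):
    """Classify circles a and b, each given as (x, y, r).

    Returns 'disjoint', 'a_contains_b', 'b_contains_a' or 'overlap',
    matching cases (a) to (d).
    """
    (xa, ya, ra), (xb, yb, rb) = a, b
    d = math.hypot(xb - xa, yb - ya)  # distance between the centers
    if d > ra + rb:
        return "disjoint"
    if ra > d + rb:
        return "a_contains_b"
    if rb > d + ra:
        return "b_contains_a"
    return "overlap"


print(relationship((0, 0, 2), (2, 0, 2)))
```

Each test costs constant time, which is consistent with scanning a whole collection of circles in a linear number of steps.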
SkimDiagram
Lemma
Given a collection of circles C and a circle A, the procedure SkimDiagram computes:
1. D, the set of circles which properly overlap with A;
2. X, the set of intersection points between A and any circle in D;
3. the identifier of the zone which A belongs to (i.e. the bitwise OR of the identifiers associated with the circles which properly contain A).
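A possible shape for such a procedure is sketched below in Python. It is an illustration under assumptions, not the procedure from the slides: circles are (x, y, r) tuples, identifiers are integers, each intersection point is stored together with the identifier of the circle it lies on, and coincident or tangent circles are assumed not to occur.

```python
import math


def skim_diagram(circles, ids, a):
    """Scan every circle once and return (D, X, containing_id) for circle a.

    circles -- list of (x, y, r) tuples already in the diagram
    ids     -- identifier of each circle (one bit each), same order as circles
    a       -- the new circle (x, y, r)
    """
    xa, ya, ra = a
    overlapping, points, containing_id = [], [], 0
    for (xb, yb, rb), b_id in zip(circles, ids):
        dx, dy = xb - xa, yb - ya
        d = math.hypot(dx, dy)
        if d > ra + rb or ra > d + rb:
            continue               # cases (a) and (b): nothing to record
        if rb > d + ra:
            containing_id |= b_id  # case (c): b properly contains a
            continue
        # Case (d): the two circles cross in two points.
        overlapping.append(b_id)
        along = (ra * ra - rb * rb + d * d) / (2 * d)  # center of a to the chord
        half = math.sqrt(max(ra * ra - along * along, 0.0))
        mx, my = xa + along * dx / d, ya + along * dy / d
        points.append((mx - half * dy / d, my + half * dx / d, b_id))
        points.append((mx + half * dy / d, my - half * dx / d, b_id))
    return overlapping, points, containing_id


D, X, host = skim_diagram([(2, 0, 2), (0, 0, 10), (20, 0, 1)], [1, 2, 4], (0, 0, 2))
print(D, X, host)
```

The loop touches each circle once and does constant work per circle.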
SkimDiagram
O(|C|) steps
SkimDiagram
Property
After the execution of the procedure SkimDiagram the number of points in X is exactly m = 2|D|
Lemma
The number of zones split by A is at most m = 2|D|
Computing split zones
1. Order the split points clockwise around A. Let $(x_0, x_1, \ldots, x_{m-1})$ be the resulting ordering.
2. Compute the identifier of the zone to which $x_0$ belongs
3. Compute the identifiers of the zones associated with $x_1, \ldots, x_{m-1}$
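Step 1 can be sketched with a sort by angle around the center of A. The snippet assumes points are (x, y) pairs and the y axis points up; which point ends up as the first one depends on where the angle wraps around and is otherwise arbitrary. The name `order_clockwise` is hypothetical.

```python
import math


def order_clockwise(points, center):
    """Sort points lying on a circle clockwise around its center."""
    cx, cy = center
    # atan2 grows counterclockwise, so sorting by the negated angle
    # walks the circle clockwise (with the y axis pointing up).
    return sorted(points, key=lambda p: -math.atan2(p[1] - cy, p[0] - cx))


pts = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(order_clockwise(pts, (0, 0)))
```

Since there are at most 2|C| points, this sort takes O(|C| log |C|) comparisons.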
[Figure: Step 2, showing circle A with the points x0 and x1]
Computing split zones: Step 3
1. Order the split points clockwise around A. Let $(x_0, x_1, \ldots, x_{m-1})$ be the resulting ordering. O(|C| log |C|) steps
2. Compute the identifier of the zone to which $x_0$ belongs
3. Compute the identifiers of the zones associated with $x_1, \ldots, x_{m-1}$
Step 3
‐ Similar to step 2 (but m‐1 times): $O(|C|^2)$ steps
‐ Actually, we can do something better ☺
Computing split zones
Lemma
The identifier associated with each point in X corresponds to the identifier of a zone split by A
Proof [Sketch]
‐ Each arc $(x_i, x_{(i+1) \bmod m})$ splits exactly one zone
‐ Consecutive arcs, $(x_i, x_{(i+1) \bmod m})$ and $(x_{(i+1) \bmod m}, x_{(i+2) \bmod m})$, are separated by exactly one contour, so the identifiers of the two associated zones differ in exactly one bit, which corresponds to the crossed contour's id
‐ Let G be the crossed contour: if an arc/zone is properly contained (resp. not contained) in G, then the successive arc/zone is not contained (resp. contained) in G
‐ Hence, $x_i.id = x_{i-1}.id \oplus G.id$
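The recurrence means that, once the first identifier is known, each further identifier costs a single XOR. A short Python sketch, with illustrative values and a hypothetical function name:

```python
def propagate_ids(x0_id, crossed_ids):
    """Derive x1.id, x2.id, ... from x0.id by XOR-ing the crossed contour ids.

    crossed_ids[i] is the identifier of the contour G crossed when moving
    from x_i to x_(i+1) along the new circle.
    """
    ids = [x0_id]
    for g_id in crossed_ids:
        ids.append(ids[-1] ^ g_id)  # flip the single bit of the crossed contour
    return ids


ids = propagate_ids(0b1011, [0b0100, 0b1000, 0b0100])
print([format(i, "04b") for i in ids])
```

Every contour is crossed an even number of times along the closed walk around the new circle, so applying the remaining crossing brings the identifier back to the starting one.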
Computing split zones
Lemma
The identifier associated with each point in X corresponds to the identifier of a zone split by A
Proof [Sketch]
$x_i.id = x_{i-1}.id \oplus G.id$
[Figure: circle A and the identifiers 10100, 00001, 01000, 00010]

Intersection point   Identifier
1                    10110
2                    11110
3                    11111
4                    11101
5                    11100
6                    10100
AddCircle
[Figure: AddCircle example with contour identifiers 0001, 0010, 0100, 1000; id = 0011; x0.id = 1011, x1.id = 1111, x2.id = 0111, x3.id = 0011]
UpdateMarkedPoint
[Figure: the same example with the new circle A, A.id = 10000; contour identifiers 0001, 0010, 0100, 1000; id = 0011; x0.id = 1011, x1.id = 1111, x2.id = 0111, x3.id = 0011]
UpdateMarkedPoint
Lemma
The UpdateMarkedPoint procedure correctly maintains the invariant that the marked points associated with the zones in Y, Y belong to the boundary of the associated zone.
AddCircle
UpdateZones
O(|Z|)
[Figure: circle A and the zones which need to update their ids]
AddCircle
Theorem
Given a well‐formed diagram d = <C, Z> and a circle A ∉ C, the procedure AddCircle(d, A):
1. computes the new set of zones Z’ associated with C ∪ {A};
2. for each zone V ∈ Z’ computes the marked point V.point such that V.point belongs to V’s boundary.
AddCircle: Complexity
Step costs, in order:
O(|C|)
O(|C| log |C|)
O(|C|)
O(|Z|)
To Do List
• Prove the conjecture
• Delete
• Relax some constraints
– Axis‐aligned ellipses
  • Beware of disconnected zones
– Freely rotated ellipses
  • Computing intersections is tough
– Convex figures
  • Intersections can’t have a parametric form
– Triple point
– Single point
• Contour’s shape
– Disconnected zones …
• Pixel‐oriented solutions
Bibliography
• [Guttman] A. Guttman, “R‐trees: a dynamic index structure for spatial searching,” in Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, 1984, pp. 47–57.
• [Agarwal et al.] P. K. Agarwal, M. de Berg, J. Gudmundsson, M. Hammar, and H. J. Haverkort, “Box‐trees and R‐trees with near‐optimal query time,” in Symposium on Computational Geometry, 2001, pp. 124–133.
• [De Chiara et al.] R. De Chiara, M. Hammar, and V. Scarano, “A system for Virtual Directories using Euler Diagrams,” in Euler Diagrams Workshop 2004.