Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Knowledge Management Institute
707.009 Foundations of Knowledge Managementg g
„Categorization & Formal Concept Analysis“
Markus Strohmaier
Univ. Ass. / Assistant ProfessorKnowledge Management Institute
Graz University of Technology, Austria
e-mail: [email protected]: http://www.kmi.tugraz.at/staff/markus
1
Markus Strohmaier 2009
Knowledge Management Institute
Slides in part based on
• Gerd Stumme– Course at Otto-von-Guericke Universität Magdeburg / Summer Term 2003– ECML PKDD TutorialECML PKDD Tutorial
• Rudolf Wille– „Formal Concept Analysis as Mathematical Theory of Concepts and
C t Hi hi “ I F l C t A l i Ed B G t t lConcept Hierarchies“, In Formal Concept Analysis, Eds B. Ganter et al., LNAI 3626, pp1-33, (2005)
Further Literature:
2
Markus Strohmaier 2009
http://www.aifb.uni-karlsruhe.de/WBS/gst/FBA03/chapter1_2.pdf (Ganter / Stumme)
Knowledge Management Institute
Overview
T d ‘ A dToday‘s Agenda:
Categorization & Formal Concept Analysis
• Formal Context• Formal Concepts• Formal Concept Lattices• FCA Implications• Constructing Concept Lattices
3
Markus Strohmaier 2009
Knowledge Management Institute
CategorizationCategorization[Mervis Rosch 1981]
Intension (Meaning)• The specification of those qualities that a thing must have to be a
member of the classExtension (the objects in the class)Extension (the objects in the class)• Things that have those qualities
4
Markus Strohmaier 2009
Knowledge Management Institute
CategorizationCategorization[Mervis Rosch 1981]
Six salient problems:Six salient problems:• Arbitrariness of categories. Are there any a priori reasons for dividing
objects into categories, or is this division initially arbitrary? • Equivalence of category members. Are all category members equally
representative of the category as has often been assumed?• Determinacy of category membership and representation. Are categories
specified by necessary and sufficient conditions for membership? Are boundaries of categories well defined?
• The nature of abstraction. How much abstraction is required--that is, do we need only memory for individual exemplars to account for categorization? Or, at the other extreme, are higher-order abstractions of general knowledge, beyond the individual categories, necessary?
• Decomposability of categories into elements. Does a reasonable explanation of objects consist in their decomposition into elementary qualities?
• The nature of attributes. What are the characteristics of these "attributes“
5
Markus Strohmaier 2009
into which categories are to be decomposed?
Knowledge Management Institute
TerminologyISO 704 T i l W k P i i l d th dISO 704: Terminology Work: Principles and methodsDIN 2330: Begriffe und Ihre Benennungen
C t
Name DefinitionRepresentation level
Conceptattribute aattribute battribute c
Concept levelattribute c
Object 1 Object 2 Object 3Object 1property Aproperty Bproperty C
Object 2property Aproperty Bproperty C
Object 3property Aproperty Bproperty C
Object level
6
Markus Strohmaier 2009
property C property C property C
Knowledge Management Institute
Formal Concept AnalysisFormal Concept Analysis[Wille 2005]
M d l t it f th ht A t iModels concepts as units of thought. A concept is constituted by its:Extension: consists of all objects belonging to a•Extension: consists of all objects belonging to a
concept•Intension: consists of all attributes common to all those•Intension: consists of all attributes common to all those objects
Concepts „live“ in relationships with many other concepts where the sub-concept-superconcept-relation concepts where the sub concept superconcept relation plays a prominent role.
7
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept Analysis[Wille 2005]
Formal context:Formal context:
A Formal Context is a tripel (G, M, I) for which G and M are sets while I isa binary relation between G and Ma binary relation between G and M.
Formal Concept:
A formal concept of a formal context K := (G,M, I) is defined as a pair (A,B) with
and A = B´, and B = A´; A and B are called the extent and theintent of the formal concept (A,B), respectively.
8
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
Running Example:
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/.., Texture: Smooth/Bumpy,
I h ö ht d f hi i d ß i Z it b i d di Äh li hk it dIch möchte nur darauf hinweisen, daß es eine Zeit gab, in der man die Ähnlichkeit der Empfindungen zur Basis der Kategorisierung von Pflanze und Tier gemacht hat. Man denke
[...] an die frühen Taxonomien des Ulisse Aldrovandi aus dem 16. Jahrhundert, der die scheußlichen Tiere (die Spinnen, Molche und Schlagen) und die Schönheiten (diescheußlichen Tiere (die Spinnen, Molche und Schlagen) und die Schönheiten (die
Leoparden, die Adler usw.) zu eigenen Gruppen [von Lebewesen] zusammenfasste.
Heinz von Foerster, Wahrheit ist die Erfindung eines Lügners, Page 22/23
9
Markus Strohmaier 2009
[Mervis Rosch 1981]
Knowledge Management Institute
Formal Concept Analysis
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
Def.: A formal contextis a tripel (G,M,I), where
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
• G is a set of objects,• M is a set of attributes• and I is a relationbetween G and M.
(g,m) I is read as„object g has attribute m“.
10
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
Derivation Operators
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
• A G, B M (A…Extent, B…Intent)• all attributes shared by all objects of A• all objects having all attributes of B
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
all objects having all attributes of BA1´
A1 X X XA formal concept is
X
X
X
X
X
X
defined as a pair (A,B)
A = B´, and B = A´
The two derivation operators satisfy the follwing 3 conditions:
11
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
Def.: A formal conceptis a pair (A,B), with
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
• A G, B Mall attributes shared by all objects of Aall objects having all attributes of B
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
j g
• A´=B and B´= A
Set A is called the extent
IntenttSet A is called the extent
(a set of objects)Set B is called the intent(a set of attributes)
Ext
ent
Of the formal concept (A,B)
12
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
Sub/Superconcept Relation
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
• A G, B M• all attributes shared by all objects of A• all objects having all attributes of B
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
j gB2 ↔A2´
A2 X X X
B1↔A1´A
X
X
X
X
X
X
A1
The orange concept is a subconcept of the blue concept, since its extent is contained in the blue one (equivalent to the blue intent is contained in the orange one)
13
Markus Strohmaier 2009
blue one. (equivalent to the blue intent is contained in the orange one)
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisConcept Lattices
Concept Lattices(cf. Galois Lattices)
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
1 X X X
A1´A2´
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
A1 X
X
X
XX
X
XX
X
A2
C1
X X X
Extent Intent
Formal Concept C1 (A1, A1´)
The set of objects that are „yellow“, „sweet“ and „smooth“
C2
15
Markus Strohmaier 2009
„sweet and „smooth
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisConcept Lattices
Concept Lattices(cf. Galois Lattices)
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/Bumpy
1 X X X
A1´
Color: Red/Yellow/.., Texture: Smooth/Bumpy,
)
A1 X
X
X
XX
X
XX
X(Attr
ibut
es)
A1
X X X
Inte
nt
cts)
xten
t (O
bjec
16
Markus Strohmaier 2009
Ex
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisConcept Lattices
D f Th t l tti f f l t t (G M I) iDef.: The concept lattice of a formal context (G,M,I) isthe set of all formal concepts of (G,M,I), together with the partial order
The concept lattice is denoted by B(G,M,I) .
Theorem: The concept lattice is a lattice, i.e. for two concepts (A1,B1) and (A B ) there is alwaysand (A2,B2), there is always
• a greatest common subconcept• and a least common superconceptp p
17
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisGreatest Common Subconcept
Which objects share the attributes „smooth“ and „red“ and „sour“?
greatest common subconcept
A: Grapes, Apples
(infimum)
• a greatest common subconcept• and a least common superconcept
18
Markus Strohmaier 2009
p p
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisLeast Common Superconcept
Which attributes share the objects „strawberries“
and „lemon“?
least common superconcept
A: Bumpy, round
(supremum)
• a greatest common subconcept• and a least common superconcept
19
Markus Strohmaier 2009
p p
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisImplications
Def.: An implication
A > B holds in a context (G M I) if every object intent respects A >B i e
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/.., Texture: Smooth/Bumpy,
A -> B holds in a context (G, M, I) if every object intent respects A ->B, i.e.
if each object that has all the attributes in A also has all the attributes in B.
We also say A->B is an implication of (G,M,I).ABBA
20
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisImplicationsObject count f
Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/ Texture: Smooth/BumpyColor: Red/Yellow/.., Texture: Smooth/Bumpy,
21
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisApplications / AOL Search Query Logs
Implications:
22
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisApplications / Goal Tagging
Implications:
23
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisApplications / Bugs - Bugzilla
Implications:
24
Markus Strohmaier 2009
Knowledge Management Institute
FCA / S liFCA / ScalingTransforming many-valued into single valued contexts
Many-valued contextsColor Shape Taste Texture
Apple Red/green/yellow
Round Sweet/sour
Smooth
Lemon Yellow Round sour Bumpy
What hasn‘t been mentioned yetLemon Yellow Round sour Bumpy
Banana Yellow Long Sweet Smooth
Strawberries Red Round Sweet Bumpy
Grapes Red/ green Round Sweet/ Smooth
y
Grapes Red/ green Round Sweet/sour
Smooth
Pear Yellow Long Sweet Smooth
Color Shape Taste TextureS1 red green yellow
red X
Via Scales
red X
green X
yellow X
25
Markus Strohmaier 2009
Knowledge Management Institute
K l d A i itiKnowledge AcquisitionCIMIANO, HOTHO, & STAAB 2005
26
Markus Strohmaier 2009
Knowledge Management Institute
K l d A i itiKnowledge AcquisitionCIMIANO, HOTHO, & STAAB 2005
27
Markus Strohmaier 2009
Knowledge Management Institute
K l d A i itiKnowledge AcquisitionCIMIANO, HOTHO, & STAAB 2005
28
Markus Strohmaier 2009
Knowledge Management Institute
tion
cqui
sit
STA
AB 2
005
dge
Ac
HO
THO
, & S
owle
dC
IMIA
NO
, K
no
29
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisLattice Construction Algorithms
A N i A hA Naive Approach:
Ganter / Stumme 2003
31
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisLattice Construction Algorithms
E lExample:(Formal Concepts)
Ganter / Stumme 2003
32
Markus Strohmaier 2009
Ganter / Stumme 2003
Knowledge Management Institute
Formal Concept AnalysisFormal Concept AnalysisLattice Construction Algorithms
A N i A hA Naive Approach:
Ganter / Stumme 2003
33
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisyLattice Construction Algorithms
E lExample:
({T2,T7},{e}) <= ({T1,T2,T3,T4,T5,T6,T7},{0})
umm
e 20
03 „5“ is a subconcept of „9“
({0},{a,b,c,d,e}) <= ({T4},{a,b,c})({0},{a,b,c,d,e}) <= ({T4},{a,b,c})
Gan
ter /
Stu
Ganter / Stumme 2003a circle for a concept is always positioned higher
than all circles for its proper subconcepts.
34
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept AnalysisLattice Construction Algorithms
E l td´Example ctd´:
10Reduced Labeling
5
10
2 4 3
Reduced Labeling(Top to bottom)
5 2 4 3
9 8 79 8 7
1
Ganter / Stumme 2003
6
35
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
T lTools:E.g. • ConExp http://conexp.sourceforge.net/index.html• Networks.tb http://networks-tb.sourceforge.net/• JaLaBa http://maarten.janssenweb.net/jalaba/JaLaBA.pl
Further Information• FCA Homepage http://www.upriss.org.uk/fca/fca.html
36
Markus Strohmaier 2009
Knowledge Management Institute
Formal Concept Analysis
C E htt // f t/i d ht l• ConExp http://conexp.sourceforge.net/index.html
37
Markus Strohmaier 2009
Knowledge Management Institute
Bonus Task
Sk t h t i ti l• Sketch a categorization example• Define a Formal Context, for which |G| >=10, |M| >=10 and
|I|~|G|+|M|| | | | | |• Use Conexp to
– Name all elements of G and M (choose plausible G and M)Represent your formal context (choose plausible I)– Represent your formal context (choose plausible I)
– Draw the Concept Lattice– Calculate Implications (there should be at least one implication with f>3)
S b it• Submit– A .pdf file that contains your 1) context, 2) the layouted(!) lattice 3) the top10
implications and 4) a brief explanationN th df Fil i th f ll i S t• Name the pdf File using the following Syntax: „GWM09-BT2-YOURMATR-YOURLASTNAME.cex“
– To me via e-mail using subject „[GWM09-BT2-YOURMATR]“– before the beginning of next week‘s class
38
Markus Strohmaier 2009
– before the beginning of next week s class
Knowledge Management Institute
Any questions?y q
See you next week!y
39
Markus Strohmaier 2009