Basic Level Categories for Knowledge Representation Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com


Basic Level Categories for Knowledge Representation

Tom Reamy
Chief Knowledge Architect

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

2

Agenda

Introduction
– Context
– Category Theory – Cognitive Science
– Enterprise Text Analytics

Basic Level Categories – Features and Issues

Basic Level Categories and Expertise
– Experts prefer lower levels
– Categorization of Expertise

Applications
– Integration with Search and ECM
– Platform for Information Applications

3

KAPS Group: General

Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 8-10
Partners – SAS, SAP, Microsoft-FAST, Concept Searching, etc.
Consulting, Strategy, Knowledge architecture audit
Services:
– Taxonomy/Text Analytics development, consulting, customization
– Technology Consulting – Search, CMS, Portals, etc.
– Evaluation of Enterprise Search, Text Analytics
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories

4

Basic Level Categories
Context
Unstructured Content – Enterprise & External
Preprocessing of documents and sets
– Includes categorization, information extraction
Representation of Domain knowledge – taxonomy, ontology
Presentation of results of search, text mining – and refinement
Categorization
– Most basic to human cognition
– Most difficult to do with software
No single correct categorization
– Women, Fire, and Dangerous Things

5

Basic Level Categories
Context
Borges – Celestial Emporium of Benevolent Knowledge
– Those that belong to the Emperor
– Embalmed ones
– Those that are trained
– Suckling pigs
– Mermaids
– Fabulous ones
– Stray dogs
– Those that are included in this classification
– Those that tremble as if they were mad
– Innumerable ones
– Other

6

Basic Level Categories – software context
Enterprise Text Analytics (ETA)
Enterprise Search – Faceted Navigation
– Categorization – Document Topics – Aboutness
– Entity Extraction – noun phrases, feed facets, ontologies
– Summarization – beyond snippets
Enterprise Content Management
– Hybrid model of metadata
– Categorization – suggestions
– Entity, Noun phrase – facets need a lot of metadata

7

Basic Level Categories – software context
Enterprise Text Analytics (ETA)
Advanced Text Analytics
– Fact extraction – ontologies
– Sentiment Analysis – good, bad, and ugly
– Expertise Analysis
Enterprise Applications
– Information Applications
– Text mining – alone or in conjunction with data mining
– Business & Customer intelligence

8

Basic Level Categories
Introduction: What are Basic Level Categories?
Mid-level in a taxonomy / hierarchy
Short and easy words
Maximum distinctness and expressiveness
Similarly perceived shapes
Most commonly used labels
Easiest and fastest to identify members
First level named and understood by children
Terms usually used in neutral contexts
Level at which most of our knowledge is organized

9

Basic Level Categories
Introduction: What are Basic Level Categories?
Objects – most studied, most pronounced effects
Levels: Superordinate – Basic – Subordinate
– Mammal – Dog – Golden Retriever
– Furniture – chair – kitchen chair
Basic in 4 dimensions
– Perception – overall perceived shape, single mental image, fast identification
– Function – general motor program
– Communication – shortest, most commonly used, neutral, first learned by children
– Knowledge Organization – most attributes are stored at this level

10

Basic Level Categories
Introduction: Basic Level Categories: Non-Object
Basic level effects, but no widespread acceptance of categories and category names
Thus a basic level in a category hierarchy, but not the category hierarchy that people actually use in everyday life
Not just IS-A relationship – messier – more like ontologies
Examples:
– Scenes – indoors – school – elementary school
– Events – travel – highway travel – truck travel
– Emotions – positive emotion – joy – contentment
– Programming – Algorithm – sort – binary

11

Basic Level Categories
Introduction: Other levels
Subordinate – more informative but less distinctive
– Basic shape and function with additional details
  • Ex – Chair – office chair, armchair
– Convention – people name objects by their basic category label, unless extra information in subordinate is useful
Superordinate – less informative but more distinctive
– All refer to varied collections – furniture
– Often mass nouns, not count nouns
– List abstract / functional properties
– Very hard for children to learn

12

Basic Level Categories
Introduction: How to recognize Basic level
Short words – noun phrase
– Selected list (extended stop words)
Kinds of attributes
– Superordinate – functional (keeps you warm, sit on it)
– Basic – Nouns and adjectives – legs, belt loops, cloth
– Subordinate – adjectives – blue, tall
Basic Level – similar movements, similar shapes
More complex for non-object domains
Issue – what is basic level is context dependent

13

Basic Level Categories
Introduction: How to recognize Basic level
Cue Validity – probability that a particular object belongs to some category given that it has a particular feature (cue)
– X has wings – bird
– Superordinates have lower cue validity – fewer common attributes
– Subordinates have lower cue validity – share more attributes with other members at same level
Category utility – frequency of a category + category validity + base rates of each of these features
Issue – how to decide which features?
– Cat – “can be picked up”, is bigger than a beetle
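The two measures on this slide can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: the toy observations are hypothetical, and the category utility function uses one common formulation, P(c) · Σ [P(f|c)² − P(f)²], which is assumed here to be what the slide's informal description refers to.

```python
# Hypothetical toy observations: (category, set of observed features).
OBSERVATIONS = [
    ("bird", {"wings", "feathers", "flies"}),
    ("bird", {"wings", "feathers"}),
    ("bat", {"wings", "fur", "flies"}),
    ("cat", {"fur", "can be picked up"}),
]


def cue_validity(feature, category, observations):
    """P(category | feature): share of objects with the feature that are in the category."""
    with_feature = [cat for cat, feats in observations if feature in feats]
    if not with_feature:
        return 0.0
    return with_feature.count(category) / len(with_feature)


def category_utility(category, observations):
    """P(c) * sum over features of [P(f|c)^2 - P(f)^2] (assumed formulation)."""
    total = len(observations)
    members = [feats for cat, feats in observations if cat == category]
    if not members:
        return 0.0
    p_c = len(members) / total
    all_features = set().union(*(feats for _, feats in observations))
    gain = 0.0
    for feature in all_features:
        p_f = sum(feature in feats for _, feats in observations) / total
        p_f_given_c = sum(feature in feats for feats in members) / len(members)
        gain += p_f_given_c ** 2 - p_f ** 2
    return p_c * gain


if __name__ == "__main__":
    print(cue_validity("wings", "bird", OBSERVATIONS))
    print(category_utility("bird", OBSERVATIONS))
```

In this toy data "wings" is a weaker cue for "bird" than "feathers" because bats also have wings, which is the kind of feature-choice problem the slide's last point raises.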

14

Basic Level Categories and Expertise

Experts prefer lower, subordinate levels
– In their domain, (almost) never use superordinate
Novices prefer higher, superordinate levels
General Populace prefers basic level
Not just individuals but whole societies / communities differ in their preferred levels
Issue – artificial languages – ex. Science discipline
Issue – difference of child and adult learning – adults start with high level

15

Basic Level Categories and Expertise

Experts chunk series of actions, ideas, etc.
– Novice – high level only
– Intermediate – steps in the series
– Expert – special language – based on deep connections
Expertise is a combination of knowledge and skill
– Everything from riding a bike to merging two companies
– No such thing as tacit knowledge – spectrum
Types of expert:
– Technical – lower level terms only
– Strategic – high level and lower level terms, special language

16

Basic Level Categories
Analytical Techniques
What is basic level is context(s) dependent
Documents / Tags – analyze in terms of levels of words
– Taxonomy for high level
– Length for basic – short
– Length for subordinate – long, special vocabulary
Category Utility
Hybrid – simple high level taxonomy (superordinate), short words – basic, longer words – expert
Plus: develop expertise rules – similar to categorization rules
– Use basic level for subject
– Superordinate for general, subordinate for expert
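The hybrid heuristic above (taxonomy lookup for the high level, word length for basic versus subordinate) might be sketched as follows. The taxonomy entries and the length cutoff are assumptions chosen for illustration; the slides do not give actual values.

```python
# Hypothetical high-level taxonomy labels (superordinate terms).
SUPERORDINATE_TERMS = {"furniture", "mammal", "vehicle"}

# Assumed cutoff: single words at or below this many characters count as "short".
MAX_BASIC_LENGTH = 6


def term_level(term, taxonomy=SUPERORDINATE_TERMS, max_basic_length=MAX_BASIC_LENGTH):
    """Guess a term's level: taxonomy hit, short single word, or long / multi-word."""
    normalized = term.strip().lower()
    if normalized in taxonomy:
        return "superordinate"
    if " " not in normalized and len(normalized) <= max_basic_length:
        return "basic"
    return "subordinate"


def level_profile(terms):
    """Count how many terms of a document or tag set fall at each level."""
    counts = {"superordinate": 0, "basic": 0, "subordinate": 0}
    for term in terms:
        counts[term_level(term)] += 1
    return counts


if __name__ == "__main__":
    print(level_profile(["mammal", "dog", "golden retriever"]))
```

A length cutoff alone will misclassify many terms, which is why the slide pairs it with a taxonomy and with expertise rules.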

17

Basic Level Categories
Analytical Techniques
Corpus context dependent
– Author748 – is general in scientific health care context, advanced in news health care context
Need to generate overall expertise level for a corpus
Also contextual rules
– “Tests” is general, high level
– “Predictive value of tests” is lower, more expert
Categorization rule – SENT, DIST
– If same sentence, expert
Demo – Sample Documents, Rules
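A contextual rule of this kind could look roughly like the sketch below. SENT and DIST are the operator names on the slide; their semantics here (all terms in one sentence, two terms within a token distance) are assumptions, and the sentence splitter and tokenizer are deliberately naive stand-ins for what a text analytics engine would provide.

```python
import re


def sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(sentence):
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def sent_match(sentence, terms):
    """SENT-style check: every term occurs somewhere in the same sentence."""
    words = set(tokens(sentence))
    return all(term in words for term in terms)


def dist_match(sentence, first, second, max_gap):
    """DIST-style check: the two terms occur within max_gap tokens of each other."""
    words = tokens(sentence)
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(a - b) <= max_gap for a in first_positions for b in second_positions)


def expertise_of(text):
    """'expert' if 'predictive' and 'tests' share a sentence, otherwise 'general'."""
    for sentence in sentences(text):
        if sent_match(sentence, ["predictive", "tests"]):
            return "expert"
    return "general"


if __name__ == "__main__":
    print(expertise_of("The predictive value of tests was low. More work is needed."))
```

The DIST-style check is the tighter of the two and could replace the SENT-style check where a sentence-level match fires too often.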

18

Education Terms

Expert                                       General
Research (context dependent)                 Kid
Statistical                                  Pay
Program performance                          Classroom
Protocol                                     Fail
Adolescent Attitudes                         Attendance
Key academic outcomes                        School year
Job training program                         Closing
American Educational Research Association    Counselor
Graduate management education                Discipline

19

Healthcare Terms

Expert                    General
Mouse                     Cancer
Dose                      Scientific
Toxicity                  Physical
Diagnostic                Consumer
Mammography               Cigarette
Sampling                  Smoking
Inhibitor                 Weight gain
Edema                     Correct
Neoplasms                 Empirical
Isotretinoin              Drinking
Ethylene                  Testing
Significantly             Lesson
Population-base           Knowledge
Pharmacokinetic           Medicine
Metabolite                Sociology
Polymorphism              Theory
Subsyndromic              Experience
Radionuclide              Services
Etiology                  Hospital
Oxidase                   Social
Captopril                 Domestic
Pharmacological agents
Dermatotoxicity
Mammary cancer model
Biosynthesis
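Term lists like these can drive a simple document-level expertise score. The sketch below is one possible scoring scheme, not the one demonstrated in the talk: it uses a handful of single-word terms from the healthcare lists and returns a value between -1 (only general terms) and +1 (only expert terms). The scoring formula is an assumption.

```python
import re

# A small subset of the healthcare term lists (single-word terms only).
EXPERT_TERMS = {"toxicity", "mammography", "edema", "pharmacokinetic", "metabolite", "etiology"}
GENERAL_TERMS = {"cancer", "smoking", "cigarette", "hospital", "medicine", "drinking"}


def expertise_score(text, expert_terms=EXPERT_TERMS, general_terms=GENERAL_TERMS):
    """Return (expert hits - general hits) / total hits, or 0.0 when nothing matches."""
    words = re.findall(r"[a-z]+", text.lower())
    expert_hits = sum(word in expert_terms for word in words)
    general_hits = sum(word in general_terms for word in words)
    total = expert_hits + general_hits
    if total == 0:
        return 0.0
    return (expert_hits - general_hits) / total


if __name__ == "__main__":
    print(expertise_score("Edema and toxicity followed the pharmacokinetic study"))
    print(expertise_score("Smoking and drinking raise cancer risk"))
```

Multi-word terms such as "Mammary cancer model" would need phrase matching, and the lists themselves are corpus dependent, as the earlier slides note.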

20

Basic Level Categories
Expertise – application areas
Taxonomy development / design – use basic level
User contribution
– Card sorting – non-experts use superficial similarities
– Survey for attributes instead of card sorting, general structure
Develop expert and general versions/sections/synonyms
– ID communities by their documents, tags
Info presentation – combine superordinate and basic
– Similar to scientific – Genus – Species is official name
Info presentation – document maps – expose basic level

21

Basic Level Categories
Expertise – application areas
Ontology development / design
– Need more focus on who is intended audience
  • Structure, nomenclature
– Defining classes & hierarchy – same as taxonomy
– Defining properties – Expert dependent
  • Wine for snobs (experts) very different than Joe Sixpack
– Two approaches
  • One ontology, classes and/or properties as expert
  • Two ontologies – expert and novice
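The first approach (a single ontology whose properties are marked by audience) can be illustrated with a small data structure. Everything in this sketch is hypothetical: the `Property` type, the `audience` field, and the wine properties are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    audience: str  # "novice" or "expert"; novice properties are shown to everyone


# Hypothetical properties of a Wine class in a single shared ontology.
WINE_PROPERTIES = [
    Property("color", "novice"),
    Property("price", "novice"),
    Property("terroir", "expert"),
    Property("malolactic fermentation", "expert"),
]


def visible_properties(properties, audience):
    """Experts see every property; novices see only those marked 'novice'."""
    if audience == "expert":
        return [p.name for p in properties]
    return [p.name for p in properties if p.audience == "novice"]


if __name__ == "__main__":
    print(visible_properties(WINE_PROPERTIES, "novice"))
    print(visible_properties(WINE_PROPERTIES, "expert"))
```

The second approach on the slide would instead keep two separate property lists, which avoids the audience field but duplicates the shared classes.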

22

Basic Level Categories
Expertise – application areas
Text Mining
– Preprocessing of documents
– Expertise characterization of writer
– Best results with existing taxonomy
  • Can use a very general, high level taxonomy – superordinate and basic
  • Can use existing large taxonomies – MeSH, etc.
eCommerce
– Organization and Presentation of information – expert, novice
– How to determine?
  • Search queries, profiles, buying patterns, specific products

23

Basic Level Categories
Expertise – application areas
Search – enterprise and/or internet
– Query level
Relevance ranking
– Adjust documents for novice and expert queries
Information presentation
– Tag clouds – match novice and expert
Clustering
– Incorporate into clustering algorithms
– Presentation – expose basic level & provide up and down browse
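One way to adjust relevance ranking for novice and expert queries is to blend the engine's relevance score with how closely a document's expertise level matches the query's. The sketch below assumes both levels are already available as numbers in [0, 1]; the `Hit` type, the blending weight, and the linear formula are all assumptions rather than anything specified in the slides.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    relevance: float  # score from the search engine, assumed to lie in [0, 1]
    expertise: float  # document expertise level, assumed to lie in [0, 1]


def rerank(hits, query_expertise, weight=0.3):
    """Sort hits by a blend of relevance and closeness of expertise levels."""

    def adjusted(hit):
        closeness = 1.0 - abs(hit.expertise - query_expertise)
        return (1.0 - weight) * hit.relevance + weight * closeness

    return sorted(hits, key=adjusted, reverse=True)


if __name__ == "__main__":
    hits = [Hit("intro", 0.80, 0.1), Hit("clinical", 0.78, 0.9)]
    print([h.doc_id for h in rerank(hits, query_expertise=0.9)])
    print([h.doc_id for h in rerank(hits, query_expertise=0.1)])
```

With these two hypothetical hits, an expert-level query moves the clinical document ahead of the introductory one, and a novice-level query does the reverse.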

24

Basic Level Categories
Expertise – application areas
Social Media – Community of Practice
– Characterize the level of expertise in the community
– Evaluate other communities’ expertise level
– Personalize information presentation by expertise
Expertise location
– Generate automatic expertise characterization based on authored documents
Expertise of people in a social network
– Terrorists and bomb-making
Issue of Levels of expertise – how granular?
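Characterizing authors and a community from authored documents could be sketched as below, assuming each document already carries an expertise score from some earlier step. The function name and the choice to report each author relative to the community mean are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def characterize(documents):
    """documents: list of (author, expertise_score) pairs.

    Returns the community's mean level and each author's mean level
    expressed relative to that community level.
    """
    documents = list(documents)
    by_author = defaultdict(list)
    for author, score in documents:
        by_author[author].append(score)
    community_level = mean(score for _, score in documents)
    relative = {author: mean(scores) - community_level for author, scores in by_author.items()}
    return community_level, relative


if __name__ == "__main__":
    docs = [("ann", 1.0), ("ann", 0.5), ("bob", 0.0), ("bob", 0.5)]
    print(characterize(docs))
```

Reporting authors relative to the community reflects the earlier point that the same author can look general in one corpus and advanced in another.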

Basic Level Categories
Expertise – application areas – CoP
Basic Level: Blog, Software (Design), Web (Design), Linux, Javascript, Web2.0, Google, Css, Flash
Superordinate: Music, Photography, News, Education, Business, Technology, Politics, Science, Culture

25

Basic Level Categories
Expertise – Related Tags – Delicious
CSS: Web Design, Design, Css3, Tutorial, Webdev, Javascript, Web Development, Html, Jquery, html5
Education: Technology, Resources, Teaching, Learning, Science, Web20, Games, Interactive, Research, Tools, reference

26

27

Basic Level Categories
Expertise – application areas
Business & Customer intelligence
– General – characterize people’s expertise to add to evaluation of their comments
– Combine with sentiment analysis – finer evaluation – what are experts saying, what are novices saying
– Deeper research into communities, customers
Enterprise Content Management
– At publish time, software automatically gives an expertise level – present to author for validation
– Combine with categorization – offer tags that are suitable for the level of expertise
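Combining expertise with sentiment analysis, as suggested under Business & Customer intelligence, amounts to splitting sentiment scores by the commenter's expertise before aggregating. The sketch below assumes each comment already has an expertise score and a sentiment score; the threshold and the two-group split are assumptions.

```python
from collections import defaultdict


def sentiment_by_expertise(comments, expert_threshold=0.5):
    """comments: iterable of (expertise_score, sentiment_score) pairs.

    Returns the average sentiment for the 'expert' and 'novice' groups
    that actually occur in the input.
    """
    buckets = defaultdict(list)
    for expertise, sentiment in comments:
        group = "expert" if expertise >= expert_threshold else "novice"
        buckets[group].append(sentiment)
    return {group: sum(values) / len(values) for group, values in buckets.items()}


if __name__ == "__main__":
    # Hypothetical scores: experts are negative, novices are positive.
    comments = [(0.9, -0.5), (0.8, -0.5), (0.1, 1.0), (0.2, 0.5)]
    print(sentiment_by_expertise(comments))
```

An overall average of these hypothetical comments would look mildly positive, while the split shows experts and novices disagreeing, which is the finer evaluation the slide describes.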

28

Basic Level Categories
Conclusions
Basic Level Categories are fundamental to thought
What is basic level is context dependent
Basic level effect is most obvious with objects, more work for concepts
Most domains need some taxonomy – need not be big
– Categorization-like rules
This is exciting, but not a revolution
Beware Egalitarian stance – People are different
Text Analytics needs Cognitive Science
– Not just library science or data modeling or ontology

29

Resources

Books
– Women, Fire, and Dangerous Things
  • George Lakoff
– Knowledge, Concepts, and Categories
  • Koen Lamberts and David Shanks
– The Stuff of Thought – Steven Pinker
Web Sites
– Text Analytics News – http://social.textanalyticsnews.com/index.php
– Text Analytics Wiki – http://textanalytics.wikidot.com/

30

Resources

Blogs
– SAS – Manya Mayes – Chief Strategist – http://blogs.sas.com/text-mining/
Web Sites
– Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
– Whitepaper – CM and Text Analytics – http://www.textanalyticsnews.com/usa/contentmanagementmeetstextanalytics.pdf
– Whitepaper – Enterprise Content Categorization – coming soon

31

Resources

Articles
– Malt, B. C. 1995. Category coherence in cross-cultural perspective. Cognitive Psychology 29, 85-148
– Rifkin, A. 1985. Evidence for a basic level in event taxonomies. Memory & Cognition 13, 538-556
– Shaver, P., J. Schwartz, D. Kirson, C. O’Connor 1987. Emotion knowledge: further exploration of a prototype approach. Journal of Personality and Social Psychology 52, 1061-1086
– Tanaka, J. W. & M. E. Taylor 1991. Object categories and expertise: is the basic level in the eye of the beholder? Cognitive Psychology 23, 457-482

Questions?

Tom Reamy
[email protected]

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com