
Page 1

Knowledge Representation & Acquisition for Large-Scale Semantic Memory

Julian Szymański, Dept. of Electronic, Telecommunication & Informatics, Gdańsk University of Technology, Poland

Włodzisław Duch, Department of Informatics, Nicolaus Copernicus University, Toruń, Poland

WCCI 2008

Page 2

Plan & main points

Goal: reaching human-level competence in all aspects of NLP. Well ... step by step.

• Representation of semantic concepts is necessary for the understanding of natural languages by cognitive systems.

• Word games - opportunity for semantic knowledge acquisition that may be used to construct semantic memory.

• A task-dependent architecture of the knowledge base, inspired by psycholinguistic theories of cognitive processes, is introduced.

• Semantic search algorithm for simplified concept vector reps.

• 20 questions game based on semantic memory implemented.

• Good test for the linguistic competence of the system.

• A web portal with a Haptek-based talking-head interface facilitates acquisition of new knowledge while playing the game and engaging in dialogs with users.

Page 3

[Architecture diagram: a humanized interface passes queries to the semantic memory store; applications include search and the 20 questions game; a parser with a part-of-speech tagger & phrase extractor feeds the store from online dictionaries and from active search and dialogues with users, with manual verification.]

Page 4

Ambitious approaches…

CYC, Douglas Lenat, started in 1984. Developed by Cycorp; by 2004 it had 2.5 million assertions linking over 150,000 concepts and used thousands of micro-theories. Cyc-NL is still a “potential application”; knowledge representation in frames is quite complicated and thus difficult to use.

Open Mind Common Sense Project (MIT): a WWW collaboration with over 14,000 authors who contributed 710,000 sentences; used to generate ConceptNet, a very large semantic network. Other such projects: HowNet (Chinese Academy of Sciences), FrameNet (Berkeley), various large-scale ontologies.

The focus of these projects is to understand all relations in text/dialogue. NLP is hard and messy! Many people have lost hope that we shall create good NLP systems without deep embodiment.

Go the brain way! How does the brain do it?

Page 5

Semantic Memory Models

Endel Tulving, „Episodic and Semantic Memory”, 1972.

Semantic memory refers to the memory of meanings and understandings. It stores concept-based, generic, context-free knowledge.

Permanent container for general knowledge (facts, ideas, words etc.).

Semantic network: Collins & Loftus, 1975.

Hierarchical model: Collins & Quillian, 1969.

Page 6

Words in the brain

Psycholinguistic experiments show that most likely categorical, phonological representations are used, not the acoustic input. Acoustic signal => phoneme => words => semantic concepts. Phonological processing precedes semantic processing by 90 ms (from N200 ERPs). F. Pulvermüller (2003), The Neuroscience of Language. On Brain Circuits of Words and Serial Order. Cambridge University Press.

Phonological neighborhood density = the number of words that are similar in sound to a target word. Similar = similar pattern of brain activations.

Semantic neighborhood density = the number of words that are similar in meaning to a target word.

Action-perception networks inferred from ERP and fMRI.

Page 7

Semantic => vector reps

Word w in a context: (w, Cont), a distribution of brain activations.

States (w, Cont) correspond to lexicographical meanings: clusterize (w, Cont) for all contexts, define prototypes (w_k, Cont) for the different meanings w_k.

Simplification: use spreading activation in semantic networks to define these prototypes. How does the activation flow? Try this algorithm on a collection of texts:

• Perform text pre-processing steps: stemming, stop-list, spell-checking ...

• Use MetaMap with very restrictive settings to discover concepts, avoiding highly ambiguous results when mapping text to the UMLS ontology.

• Use UMLS relations to create first-order cosets (terms + all new terms from included relations); add only those types of relations that lead to improvement of classification results.

• Reduce the dimensionality of the first-order coset space, leaving all original features; use a feature-ranking method for this reduction.

• Repeat the last two steps iteratively to create second- and higher-order enhanced spaces, first expanding, then shrinking the space (see the sketch below).

Create X vectors representing concepts.
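
A minimal sketch of this expand-and-shrink loop, assuming two hypothetical helpers that do not come from the original system: umls_related(term) returns the terms reachable through the selected UMLS relation types, and rank_features(docs, keep) returns a set of the `keep` best features under some ranking (e.g. chi-squared):

def build_enhanced_space(docs, umls_related, rank_features,
                         orders=2, keep=5000):
    """docs: list of term sets produced by pre-processing + MetaMap."""
    space = docs
    for _ in range(orders):
        # Expand: add every term reachable through the chosen UMLS relations.
        expanded = [terms | {r for t in terms for r in umls_related(t)}
                    for terms in space]
        # Shrink: keep only the best-ranked features ...
        selected = rank_features(expanded, keep)
        # ... but always retain the original features of each document.
        space = [(terms & selected) | orig
                 for terms, orig in zip(expanded, docs)]
    return space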

Page 8

Semantic knowledge representation

vwCRK: certainty – truth – Concept Relation Keyword. Similar to RDF in the semantic web.

Cobra:
is_a animal, is_a beast, is_a being, is_a brute, is_a creature, is_a fauna, is_a organism, is_a reptile, is_a serpent, is_a snake, is_a vertebrate; has belly, has body part, has cell, has chest, has costa.

Simplest representation for massive evaluation/association: CDV – Concept Description Vectors, forming a Semantic Matrix.
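
An illustrative encoding of vwCRK atoms and the CDV rows they induce; the field names and data shapes are assumptions for this sketch, not the system's actual API:

from dataclasses import dataclass

@dataclass
class Atom:
    v: float        # certainty of the assertion
    w: float        # truth: strength and direction of the relation
    concept: str
    relation: str
    keyword: str

atoms = [Atom(1.0, 1.0, "cobra", "is_a", "reptile"),
         Atom(1.0, 1.0, "cobra", "has", "belly"),
         Atom(0.9, 1.0, "cobra", "is_a", "vertebrate")]

# CDV: one row per concept, one column per keyword, entries are w weights;
# stacking the rows of all concepts yields the Semantic Matrix.
keywords = sorted({a.keyword for a in atoms})
cdv = {c: [next((a.w for a in atoms
                 if a.concept == c and a.keyword == k), 0.0)
           for k in keywords]
       for c in {a.concept for a in atoms}}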

Page 9

Relations

• IS_A: inherits specific features from more general objects. Features are inherited with weight w from superior relations; v is decreased by 10% and corrected during interaction with users (see the sketch after this list).

• Similar: defines objects which share features with each other; new knowledge is acquired from similar objects through swapping of unknown features with given certainty factors.

• Excludes: exchanges some unknown features, but reverses the sign of the w weights.

• Entails: analogous to logical implication; one feature automatically entails a few more features (connected via the entail relation).

An atom of knowledge contains the strength and the direction of a relation between concepts and keywords, coming from 3 sources:

• directly entered into the knowledge base;
• deduced using predefined relation types from stored information;
• obtained during the system's interaction with human users.
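
A minimal sketch of the IS_A inheritance step, assuming CDVs stored as dicts keyword -> (v, w); the 10% decrease of v follows the slide, everything else is illustrative:

def inherit_is_a(child, parent):
    """Copy the parent's features into the child, keeping w and
    decreasing the certainty v by 10% per inheritance step."""
    for keyword, (v, w) in parent.items():
        if keyword not in child:      # never overwrite direct knowledge
            child[keyword] = (0.9 * v, w)
    return child

cobra = {"venomous": (1.0, 1.0)}
snake = {"belly": (1.0, 1.0), "scales": (0.8, 1.0)}
inherit_is_a(cobra, snake)   # cobra inherits belly and scales with v = 0.9, 0.72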

Page 10

20q for semantic data acquisition

Play 20 questions with the Avatar! http://diodor.eti.pg.gda.pl

Think about an animal – the system tries to guess it, asking no more than 20 questions that should be answered only with Yes or No. The given answers narrow the subspace of the most probable objects.

The system learns from the games – it obtains new knowledge from interaction with the human users.

Is it a vertebrate? Y
Is it a mammal? Y
Does it have hooves? Y
Is it equine? N
Is it bovine? N
Does it have horns? N
Does it have a long neck? Y

I guess it is a giraffe.

Page 11

Algorithm for the 20 questions game

I(keyword) = - Σ_{i=0..K} p(keyword = v_i) log p(keyword = v_i)

p(keyword = v_i) is the fraction of concepts for which the keyword has the value v_i. A subspace of candidate concepts O(A) is selected:

O(A) = { i ; d = ||CDV_i - ANS|| is minimal }

d(CDV, ANS) = Σ_{n=0..N} dist(CDV_n, ANS_n) / |ANS|

dist(x, y) = 0 if y = NULL; 1/2 if x = NULL; |x - y| otherwise

where CDV_i is the vector for the i-th concept and ANS is the partial vector of retrieved answers. We can deal with user mistakes by choosing d > minimal.
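
A runnable sketch of both steps under the formulas above, with data shapes assumed for illustration: CDVs as dicts keyword -> weight, None for unknown values, answers coded Yes = 1 / No = 0; asking the keyword with the highest information content is the selection strategy assumed here:

import math

def information(concepts, keyword):
    """I(keyword) = -sum_i p(keyword = v_i) log p(keyword = v_i)."""
    values = [cdv.get(keyword) for cdv in concepts.values()]
    return -sum((values.count(v) / len(values))
                * math.log(values.count(v) / len(values))
                for v in set(values))

def dist(x, y):
    if y is None:
        return 0.0          # question not asked yet
    if x is None:
        return 0.5          # concept has no stored value for this keyword
    return abs(x - y)

def d(cdv, answers):
    """d(CDV, ANS): average per-keyword distance to the partial answers."""
    return (sum(dist(cdv.get(k), v) for k, v in answers.items())
            / max(len(answers), 1))

def best_question(concepts, asked):
    keywords = {k for cdv in concepts.values() for k in cdv} - asked
    return max(keywords, key=lambda k: information(concepts, k))

def candidates(concepts, answers, tolerance=0.0):
    """O(A): concepts within `tolerance` of the minimal distance;
    tolerance > 0 lets the search survive occasional wrong answers."""
    dmin = min(d(cdv, answers) for cdv in concepts.values())
    return [name for name, cdv in concepts.items()
            if d(cdv, answers) <= dmin + tolerance]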

Page 12

Automatic data acquisition

Basic semantic data obtained from aggregation of machine-readable dictionaries: WordNet, ConceptNet, SUMO ontology.

– Used relations for the semantic category: animal.
– Semantic space truncated using a word popularity rank (sketched below):

Rank(word) = IC · GR · BNC / max(Rank)

• IC – information content: the number of appearances of the particular word in WordNet descriptions.

• GR – GoogleRank: the number of web pages returned by the Google search engine for a given word.

• BNC – word statistics taken from the British National Corpus.

• Initial semantic space reduced to 94 objects and 72 features.
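
A small sketch of this truncation, treating IC, GR and BNC as precomputed per-word statistics held in dicts (the exact normalization is reconstructed from the slide, so treat it as an assumption):

def truncate_space(words, ic, gr, bnc, keep):
    """Keep the `keep` words with the highest Rank(word)."""
    raw = {w: ic[w] * gr[w] * bnc[w] for w in words}
    top = max(raw.values())
    rank = {w: r / top for w, r in raw.items()}   # normalize by max(Rank)
    return sorted(words, key=lambda w: rank[w], reverse=True)[:keep]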

Page 13

Human interaction knowledge acquisition

• Data obtained from machine-readable dictionaries:
– not complete;
– not common sense;
– sometimes specialized concepts;
– some errors.

• Knowledge correction in the semantic space:

w = (β · w_0 + Σ_{n=1..N} ANS_n) / N

w_0 – initial weight, initial knowledge (from dictionaries)
ANS – answer given by the user
N – number of answers
β – parameter indicating the importance of the initial knowledge
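
A one-function sketch of this correction rule as reconstructed above (the normalization is an assumption); answers are coded Yes = 1, No = 0:

def corrected_weight(w0, answers, beta=1.0):
    """Blend the initial dictionary weight w0 with accumulated user answers."""
    n = len(answers)
    return (beta * w0 + sum(answers)) / n if n else w0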

Page 14

Active Dialogues

Dialogues with the user for obtaining new knowledge/features:

When the system fails to guess the object: "I give up. Tell me what you were thinking of?" The concept used in the game corrects the semantic space.

When two concepts have the same CDV: "Tell me what is characteristic for <concept1/2>?" The new keywords for the specified concepts are stored in the semantic memory.

When the system needs more knowledge about a concept: "I don't have any particular knowledge about <concept>. Tell me more about <concept>." The system obtains new keywords for the given concept.
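
These three triggers amount to a small dialogue protocol; an illustrative mapping (the trigger names and handler shape are assumptions, not the system's API):

PROMPTS = {
    "guess_failed":  "I give up. Tell me what you were thinking of?",
    "identical_cdv": "Tell me what is characteristic for {concept}?",
    "sparse_cdv":    "I don't have any particular knowledge about "
                     "{concept}. Tell me more about {concept}.",
}

def prompt_for(trigger, concept):
    return PROMPTS[trigger].format(concept=concept)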

Page 15

Experiments in the animal domain

WordNet, ConceptNet, SUMO/MILO ontologies + the MindNet project as knowledge sources; a fact is added to the SM only if it appears in at least 2 sources. Basic space: 172 objects, 475 features, 5031 relations.

# features/concept = CDV density. Initial CDV density = 29; adding IS_A relations = 41; adding similar, entails, excludes = 46.

Quality Q = N_S/N = # searches with success / # all searches.

Error E = 1 - Q = 1 - N_S/N.

For 10 concepts selected with # features close to the average: Q ~ 0.8; after 5 repetitions E ~ 18%, so some learning is needed.

Page 16

Quality measures

Initial semantic space: the average # of games for correct recognition ~ 2.8. This depends on the number of semantic neighbors close to the concept.

Completeness of concept representation:
• is the CDV description sufficient to win the game?
• how far is it from the golden standard (manually created)?

4 measures of the concept description quality:

S_d = N_f(GS) - N_f(O) = # golden standard features - # features in O: how many features are still missing compared to the golden standard.

S_GS = Σ_i [1 - δ(CDV_i(GS), CDV_i(O))]: similarity based on co-occurrence (δ = 1 when feature i occurs in both descriptions, 0 otherwise).

S_NO = # features in O but not found in GS (the reverse of S_GS).

Dif_w = Σ_i |CDV_i(O) - CDV_i(GS)| / m: the average difference between O and GS.
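
A sketch of these four measures for CDVs held as dicts keyword -> weight; a minimal reading of the definitions above, not the original implementation:

def quality_measures(gs, o):
    """Compare a learned description O with its golden standard GS."""
    s_d  = len(gs) - len(o)                    # features still missing
    s_gs = sum(1 for k in gs if k not in o)    # GS features absent from O
    s_no = sum(1 for k in o if k not in gs)    # O features absent from GS
    shared = set(gs) & set(o)
    dif_w = (sum(abs(o[k] - gs[k]) for k in shared) / len(shared)
             if shared else 0.0)               # average weight difference
    return s_d, s_gs, s_no, dif_w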

Page 17

Learning from games

Select O randomly, with a preference for larger # features: p ~ exp(-N(O)/N), where N(O) = # features in O and N = the total number of features (selection step sketched below).
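
A sketch of this selection step, sampling a concept O with probability proportional to exp(-N(O)/N):

import math, random

def pick_concept(feature_counts):
    """feature_counts: dict concept -> N(O), the number of features of O."""
    n_total = sum(feature_counts.values())
    weights = [math.exp(-n / n_total) for n in feature_counts.values()]
    return random.choices(list(feature_counts), weights)[0]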

Learning procedure: the CDV(O) representation of the chosen concept O is inspected and, if necessary, corrected. CDV(O) is then removed from the memory, and the system tries to learn the concept O by playing the 20 questions game.

Average results for 5 test objects as a function of # games are shown.

N_O = S_NO + S_GS: a graph shows the average growth of the number of features as a function of the number of games played. Randomization of questions helps to find different features in each game.

Average number of games needed to learn the selected concepts: N_f = 2.7. After the first successful game, in which a particular concept has been correctly recognized, it was always found properly. After 4 games only a few new features are added.

Page 18

A few conclusions

• Complex knowledge in frames is not too useful for large-scale search.
• Semantic search requires extensive knowledge.
• We do not have even the simplest common-sense knowledge descriptions, although in many applications such representations are sufficient.
• It should be easier to generate this knowledge than to wait for embodied systems.

• Semantic memory is built from parsing dictionaries, encyclopedias, ontologies, and the results of collaborative projects.

• Active search is used to assign features found for concepts that are not far apart in the ontology (for example, concepts that have the same parents).

• A large-scale effort to create a numerical version of WordNet for general applications is necessary; specialized knowledge is also important.

• Word games may help to create and correct some knowledge.
• 20Q is easier than the Turing test – a good intermediate step.
• Time for a word games Olympics!

Page 19

Thank you for lending your ear ...

Google: W. Duch => Papers, talks