The Large-Scale Structure of Semantic Networks
A. Tuba Baykara, Cognitive Science
2002700187
Overview
1) Introduction
2) Analysis of 3 semantic networks and their statistical properties
- Associative Network
- WordNet
- Roget’s Thesaurus
3) The Growing Network Model proposed by the authors
- Undirected Growing Network Model
- Directed Growing Network Model
4) Psychological Implications of the findings
5) General Discussion and Conclusions
1) Introduction
Semantic Network: a network where concepts are represented as hierarchies of interconnected nodes, which are linked to characteristic attributes.
It is important to understand their structure because they reflect the organization of meaning and language.
Statistical similarities are important because of their implications for language evolution and/or acquisition.
Would a similarly grown model have the same statistical properties? → the Growing Network Model
1) Introduction
Predictions related to the model
1- It would have the same characteristics:
* The degree distribution would follow a power law → some concepts would have far more connections than others
* The addition of new concepts would not change this structure → scale-free (vs. merely small-world!)
2- Previously added (early-acquired) concepts would have higher connectivity than later-added (later-acquired) concepts.
1) Introduction
Terminology
Graph, network
– Node, edge (undirected link), arc (directed link), degree
– Avg. shortest path (L), diameter (D), clustering coefficient (C), degree distribution P(k)
Small-world network, random graph
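These quantities can be computed directly; below is a minimal pure-Python sketch on a toy five-word graph (the adjacency is a hypothetical illustration, not the paper's data):

```python
from collections import deque
from itertools import combinations

# Toy undirected graph as an adjacency dict (hypothetical example data).
G = {
    "dinner": {"supper", "lunch", "meal", "food"},
    "supper": {"dinner", "food"},
    "lunch":  {"dinner", "food"},
    "meal":   {"dinner", "food"},
    "food":   {"dinner", "supper", "lunch", "meal"},
}

def bfs_dists(G, src):
    """Hop distances from src to every reachable node (BFS)."""
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        for v in G[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# L: mean shortest-path length over all node pairs; D: the maximum.
pair_d = [d for s in G for n, d in bfs_dists(G, s).items() if n != s]
L = sum(pair_d) / len(pair_d)
D = max(pair_d)

def clustering(G, u):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nb = G[u]
    if len(nb) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nb, 2) if b in G[a])
    return links / (len(nb) * (len(nb) - 1) / 2)

C = sum(clustering(G, u) for u in G) / len(G)
degree_dist = sorted(len(G[u]) for u in G)
print(L, D, C, degree_dist)
```

On this toy graph L = 1.3, D = 2 and C = 0.8; the paper computes the same statistics on networks of thousands of nodes.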
2) Analysis of 3 Semantic Networks
a. Associative Network
“The University of South Florida Word Association, Rhyme and Word Fragment Norms”
More than 6,000 participants; 750,000 responses to 5,019 cues (stimulus words)
The great majority of these words are nouns (76%), but adjectives (13%), verbs (7%), and other parts of speech are also represented. In addition, 16% are identified as homographs.
2) Analysis of 3 Semantic Networks
a. Associative Network
Examples:
BOOK → _______   BOOK → READ
SUPPER → _______   SUPPER → LUNCH
2) Analysis of 3 Semantic Networks
a. Associative Network
        DINNER  SUPPER  EAT   LUNCH  FOOD  MEAL
DINNER    -      0.54   0.11   0.10  0.09  0.09
SUPPER   0.55     -     0.02   0.03  0.17  0.01
EAT                      -           0.41  0.02
LUNCH    0.27    0.02   0.08    -    0.20  0.06
FOOD     0.41    0.01           -          0.02
MEAL     0.21    0.06   0.06   0.06  0.49   -
Note: for simplicity, the networks were constructed with all arcs and edges unlabeled and equally weighted.
Forward & backward strengths imply directions (e.g., when SUPPER was normed, it produced LUNCH as a target with a forward strength of .03).
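The construction described in the note can be sketched as follows, using the four fully listed cue rows of the strength table as toy input (the paper builds its graphs the same way, but from the full norms):

```python
# Cue -> {response: forward strength}; values taken from the slide's table.
norms = {
    "DINNER": {"SUPPER": 0.54, "EAT": 0.11, "LUNCH": 0.10, "FOOD": 0.09, "MEAL": 0.09},
    "SUPPER": {"DINNER": 0.55, "EAT": 0.02, "LUNCH": 0.03, "FOOD": 0.17, "MEAL": 0.01},
    "LUNCH":  {"DINNER": 0.27, "SUPPER": 0.02, "EAT": 0.08, "FOOD": 0.20, "MEAL": 0.06},
    "MEAL":   {"DINNER": 0.21, "SUPPER": 0.06, "EAT": 0.06, "LUNCH": 0.06, "FOOD": 0.49},
}

# Directed network: arc x -> y whenever cue x evoked response y at all
# (unlabeled and equally weighted, as the note says).
arcs = {(x, y) for x, resp in norms.items() for y in resp}

# Undirected network: edge {x, y} regardless of associative direction.
edges = {frozenset(a) for a in arcs}

print(len(arcs), len(edges))
```

Each of the 4 cues contributes 5 arcs (20 in total), which collapse to 14 undirected edges once direction is ignored.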
2) Analysis of 3 Semantic Networks
a. Associative Network
I) Undirected network: word nodes were joined by an edge if associatively related, regardless of associative direction.
[Figure: the shortest path from VOLCANO to ACHE is highlighted.]
2) Analysis of 3 Semantic Networks
a. Associative Network
II) Directed network: words x & y were joined by an arc from x to y if cue x evoked y as an associative response.
[Figure: all shortest directed paths from VOLCANO to ACHE are shown.]
2) Analysis of 3 Semantic Networks
b. Roget’s Thesaurus
1911 edition with 29,000 words from 1,000 categories.
A connection is made only between a word and a semantic category, if that word is within that category → bipartite graph
2) Analysis of 3 Semantic Networks
b. Roget’s Thesaurus
[Figure: a bipartite graph linking words (calculator, numbering, accounting, computer, imitation, map, design, perspective, chalk, monochrome) to categories (numeration, representation, painting), and its unipartite projection onto the words.]
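The bipartite-to-unipartite conversion can be sketched like this; the word-category memberships below are illustrative assumptions echoing the figure, not the actual 1911 thesaurus entries:

```python
from itertools import combinations

# Word -> set of Roget categories containing it (hypothetical toy data).
word_cats = {
    "calculator":  {"numeration"},
    "numbering":   {"numeration"},
    "accounting":  {"numeration"},
    "computer":    {"numeration"},
    "imitation":   {"representation"},
    "map":         {"representation"},
    "design":      {"representation", "painting"},
    "perspective": {"representation", "painting"},
    "chalk":       {"painting"},
    "monochrome":  {"painting"},
}

# Unipartite projection: join two words if they share at least one category.
edges = {frozenset((a, b))
         for a, b in combinations(word_cats, 2)
         if word_cats[a] & word_cats[b]}

deg = {w: sum(1 for e in edges if w in e) for w in word_cats}
print(len(edges), deg["design"])
```

Words belonging to two categories (here "design" and "perspective") bridge the category cliques, which is exactly what raises clustering in the projected graph.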
2) Analysis of 3 Semantic Networks
c. WordNet
Developed by George Miller at the CogSci Lab in Princeton University: http://wordnet.princeton.edu
Based on the relations between synsets; contained more than 120k word forms and 99k meanings.
Ex.: the noun "computer" has 2 senses in WordNet:
1. computer, computing machine, computing device, data processor, electronic computer, information processing system -- (a machine for performing calculations automatically)
2. calculator, reckoner, figurer, estimator, computer -- (an expert at calculation (or at operating calculating machines))
2) Analysis of 3 Semantic Networks
c. WordNet
Links are between word forms and their meanings, according to relationships between word forms such as:
– SYNONYMY
– POLYSEMY
– ANTONYMY
– HYPERNYMY (Computer is a kind of machine/device/object.)
– HYPONYMY (Digital computer/Turing machine… is a kind of computer.)
– HOLONYMY (Computer is a part of a platform.)
– MERONYMY (CPU/chip/keyboard… is a part of a computer.)
Links can be established in any desired way, so WordNet is treated as an undirected graph.
2) Analysis of 3 Semantic Networks
Statistical Properties
I) How sparse are the 3 networks? <k>: avg. # of connections. In all 3, a node is connected to only a small % of the other nodes.
II) How connected are the networks? Undirected A/N: completely connected. Directed A/N: the largest connected component has 96% of all words. WordNet & Thesaurus: 99%.
All further analyses use these largest connected components.
2) Analysis of 3 Semantic Networks
Statistical Properties
III) Short path length (L) and diameter (D): in WordNet & the Thesaurus, L & D are based on a sample of 10,000 words; in the A/N, all words were considered. L & D are close to those expected for random graphs of equivalent size.
IV) Local clustering (C): to measure its C, the directed A/N was regarded as undirected; to calculate C of the Thesaurus, the bipartite graph was converted into a unipartite graph. C of all 4 networks is much higher than in random graphs.
2) Analysis of 3 Semantic Networks
Statistical Properties
V) Power-law degree distribution P(k)
• All distributions are plotted in log-log coordinates, with the line showing the best-fitting power-law distribution.
• The exponent γ of the in-degree distribution of the Directed A/N is lower than the rest.
These semantic networks are scale-free!
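The log-log fitting idea can be sketched on synthetic data: draw degrees from a discrete power law with γ = 3 (roughly the value reported for the undirected networks) and recover the exponent by least squares over the well-sampled degrees. This is a simulation, not the paper's networks:

```python
import math
import random
from collections import Counter

random.seed(0)

# Synthetic degrees drawn from a discrete power law P(k) ~ k^(-3).
ks = list(range(1, 200))
weights = [k ** -3.0 for k in ks]
sample = random.choices(ks, weights=weights, k=50_000)

# Empirical degree distribution; fit a line in log-log coordinates over the
# well-sampled part of the curve (at least 10 observations per degree value).
counts = Counter(sample)
fit_ks = [k for k in sorted(counts) if counts[k] >= 10]
xs = [math.log(k) for k in fit_ks]
ys = [math.log(counts[k] / len(sample)) for k in fit_ks]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
gamma = -slope
print(f"estimated exponent gamma ~ {gamma:.2f}")
```

A straight line in log-log coordinates is the signature of a power law; the negated slope estimates γ, which here comes out close to the true value of 3.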
2) Analysis of 3 Semantic Networks
Statistical Properties / Summary
Sparsity & high connectivity: on avg., words are related to only a few other words.
Local clustering: connections between words are coherent and transitive: if x-y and y-z, then often x-z.
Short path length and diameter: language is expressive and flexible (through polysemy & homonymy…).
Power-law degree distribution: language hosts hubs as well as many words connected to few others.
3) The Growing Network Model
Inspired by Barabási & Albert (1999); incorporates both growth and preferential attachment.
Aim: to see whether the same mechanisms are at work in real-life semantic networks and artificial ones.
Might be applied to lexical development in children, growth of semantic structures across languages, or even language evolution.
3) The Growing Network Model
Assumptions:
– Children learn concepts through semantic differentiation: a new concept differentiates an already existing one, acquiring a similar but different meaning, with a different pattern of connectivity.
– More complex concepts get more differentiated.
– More frequent concepts get more involved in differentiation.
3) The Growing Network Model
Structure
Nodes are words, and connections are semantic associations/relations.
Nodes differ in their utility (frequency of use).
Over time, new nodes are added and attached to existing nodes probabilistically according to:
– Locality principle: new links are added only into a local neighborhood, a set of nodes with a common neighbor.
– Size principle: new connections go to neighborhoods with an already large # of connections.
– Utility principle: new connections within a neighborhood go to nodes with high utility (rich-get-richer phenomenon).
3) The Growing Network Model
a. Undirected GN Model
Aim: to grow a network with n nodes; the # of nodes at time t is n(t).
Start with a fully connected network of M nodes (M << n).
At each t, add a node with M links (M chosen for a desired avg. density of connections) into a local neighborhood H_i, the set of neighbors of node i including i itself.
Choose a neighborhood according to the size principle:
P(H_i) = k_i(t) / Σ_j k_j(t)
where k_i(t) is the degree of node i at time t and the sum ranges over all current n(t) nodes in the network.
3) The Growing Network Model
a. Undirected GN Model
Connect to a node j in the neighborhood of node i according to the utility principle:
P(j | H_i) = U_j / Σ_l U_l
where the sum ranges over all nodes l in H_i.
If all utilities are equal, make the connection randomly: P(j | H_i) = 1 / |H_i|.
U_j = log(f_j + 1); f_j taken from the Kučera & Francis (1967) frequency counts.
Stop when n nodes are reached.
3) The Growing Network Model
a. Undirected GN Model
The growth process and a small resulting network with n=150, M=2:
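The growth process can be sketched as follows. This is a simplified illustration: utilities default to uniform (the equal-utility case the slides allow), and the parameters are small toy values rather than the paper's n = 5,018, M = 11:

```python
import random

random.seed(1)

def grow_undirected(n, M, utility=None):
    """Sketch of the undirected growing-network model: start from M fully
    connected nodes; each new node picks a host neighborhood with probability
    proportional to the host's degree (size principle) and wires M links
    inside that neighborhood, weighted by node utility (utility principle)."""
    adj = {i: set(range(M)) - {i} for i in range(M)}  # seed: complete graph
    U = utility or (lambda j: 1.0)                    # uniform utility default
    for new in range(M, n):
        nodes = list(adj)
        host = random.choices(nodes, weights=[len(adj[i]) for i in nodes])[0]
        H = list(adj[host] | {host})                  # neighborhood incl. host
        w = [U(j) for j in H]
        targets = set()
        while len(targets) < min(M, len(H)):          # M distinct targets
            targets.add(random.choices(H, weights=w)[0])
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj

G = grow_undirected(n=300, M=3)
degs = sorted((len(v) for v in G.values()), reverse=True)
print("top degrees:", degs[:5])  # a few hubs dominate the tail
```

Because well-connected nodes sit in more and larger neighborhoods, they keep attracting links: the rich-get-richer dynamic that yields the power-law tail.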
3) The Growing Network Model
b. Directed GN Model
Very similar to the Undirected GN Model: insert nodes with M arcs instead of links.
The same equations apply the locality, size and utility principles, since k_i = k_i^in + k_i^out.
Difference: Direction principle: the majority of arcs point from new nodes to existing nodes. The probability that an arc points away from the new node is α, where α > 0.5 is assumed, so most arcs will point towards existing nodes.
3) The Growing Network Model
Model Results
Due to computational constraints, the GN model was compared only with the A/N.
n = 5,018; M = 11 and M = 12 in the undirected and directed GN models, respectively.
The only free parameter in the Directed GN model, α, was set to 0.95.
The networks produced by the model are similar to the A/N in terms of their L, D, C, with the same low γ for the in-degree distribution as in the Directed A/N.
3) The Growing Network Model
Model Results
Also checked whether the same results would be produced when the Directed GN Model was converted into an undirected one.
Convert all arcs into links, with M = 11 and α = 0.95. Results are similar to the Undirected GN model: the degree distribution follows a power law.
3) The Growing Network Model
Argument
L, C and γ from the artificial networks were expected to be comparable to the real-life networks because of:
– the incorporation of growth
– the incorporation of preferential attachment (locality, size & utility principles)
Do models without growth fail to produce such power laws? Analyze the co-occurrence of words within a large corpus.
Latent Semantic Analysis (LSA): the meaning of words can be represented by vectors in a high-dimensional space.
Landauer & Dumais (1997) have already shown that local neighborhoods in semantic space capture semantic relations between words.
3) The Growing Network Model
LSA Results
Higher L, D and C than in the real-life semantic networks.
Very different degree distributions: they do not follow a power law, and it is difficult to interpret the slope of the best-fitting line.
3) The Growing Network Model
LSA Results
Analysis of the TASA corpus (>10 million words) using LSA vector representations:
[Figure: degree distributions for all words from the A/N in TASA, for the most frequent words in TASA, and for all words from LSA (>92k) represented as vectors.]
3) The Growing Network Model
LSA Results
The absence of a power-law degree distribution implies that LSA does not produce hubs.
In contrast, a growing model provides a principled explanation for the origin of the power law: words with high connectivity acquire even more connections over time.
4) Psychological Implications
The number of connections a node has is related to the time at which the node was introduced into the network.
Predictions:
– Concepts that are learned early in life will have more connections than concepts learned later.
– Concepts with high utility (frequency) will receive more links than concepts with lower utility.
4) Psychological Implications
Analysis of AoA-related data
To test the predictions, two data sets were analyzed:
I) Age-of-Acquisition ratings (Gilhooly & Logie, 1980). AoA effect: early-acquired words are retrieved from memory more rapidly than late-acquired words. An experiment with 1,944 words: adults were required to estimate the age at which they thought they first learned a word, on a rating scale from 100 to 700 (700 = a very late-learned concept).
II) Picture-naming norms (Morrison, Chappell & Ellis, 1997): estimation of the age at which 75% of children could successfully name the object depicted by a picture.
4) Psychological Implications
Analysis of AoA-related data
[Figure: the predictions are confirmed; standard error bars around the means.]
4) Psychological Implications
Discussion
Important consequences for psychological research on AoA and word frequency:
– Weakens the claims that:
AoA affects mainly the speech output system
AoA & word frequency display their effects on behavioral tasks independently
– Confirms the claims that:
early-acquired words show short naming latencies and lexical-decision latencies
AoA affects semantic tasks
AoA is mere cumulative frequency
4) Psychological Implications
Correlational Analysis of Findings
Early-acquired words have more semantic connections (they are more central in an underlying semantic network): early-acquired words have higher degree centrality.
Centrality can also be measured by computing the eigenvector of the adjacency matrix with the largest eigenvalue.
Analysis of how degree centrality, word frequency and AoA from the previous rating & naming studies correlate with 2 databases:
– Naming-latency database of 796 words
– Lexical-decision-latency database of 2,905 words
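The eigenvector-centrality measure mentioned above can be sketched by power iteration on a toy adjacency structure (a hypothetical four-node graph, not the association network):

```python
# Toy undirected graph: node 0 is the best-connected "word".
adj = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1],
    3: [0],
}

def eigencentrality(adj, iters=200):
    """Power iteration: repeatedly multiply the score vector by the
    adjacency matrix and renormalize; this converges to the eigenvector
    belonging to the largest eigenvalue."""
    x = {u: 1.0 for u in adj}
    for _ in range(iters):
        y = {u: sum(x[v] for v in adj[u]) for u in adj}
        norm = max(y.values())
        x = {u: y[u] / norm for u in y}
    return x

cent = eigencentrality(adj)
ranked = sorted(cent, key=cent.get, reverse=True)
print(ranked)
```

Unlike plain degree centrality, a node's score also reflects how central its neighbors are: node 3 touches only the hub and ends up with the lowest score.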
4) Psychological Implications
Correlational Analysis of Findings
• Centrality correlates negatively with latencies.
• AoA correlates positively with latencies.
• Word frequency correlates negatively with latencies.
• When the effects of word frequency and AoA are partialled out, the centrality-latency correlation remains significant → there must be other variables at work.
5) General Discussion and Conclusions
Weakness of the correlational analysis: the direction of causation is unknown:
– Because it was acquired early, a word has more connections
vs.
– Because it has more connections, a word is acquired early
A connectionist model can produce similar results: early-acquired words are learned better.
5) General Discussion and Conclusions
Power-law degree distributions in semantic networks can be understood via semantic growth processes → hubs.
Non-growing semantic representations such as LSA do not produce such a distribution per se.
Early-acquired concepts have richer connections, as confirmed by the AoA norms.
References
Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509-512.
Gilhooly, K. J., & Logie, R. H. (1980). Age of acquisition, imagery, concreteness, familiarity and ambiguity measures for 1,944 words. Behavior Research Methods and Instrumentation, 12, 395-427.
Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology, 50A, 528-559.
Thanks for your attention!
Questions / comments
are appreciated.
2) Analysis of 3 Semantic Networks
c. WordNet
Number of words, synsets, and senses
POS        Unique Strings   Synsets   Total Word-Sense Pairs
Noun       114,648          79,689    141,690
Verb       11,306           13,508    24,632
Adjective  21,436           18,563    31,015
Adverb     4,669            3,664     5,808
Totals     152,059          115,424   203,145
2) Analysis of 3 Semantic Networks
Statistical Properties
With N nodes and average degree <k>:
– If <k> = pN < 1, the graph is composed of isolated trees.
– If <k> > 1, a giant cluster appears.
– If <k> ≥ ln(N), the graph is totally connected.
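These random-graph regimes are easy to check by simulation; below is a sketch with a small Erdős-Rényi graph (N = 400 and the two <k> values are arbitrary illustrative choices):

```python
import random

random.seed(2)

def er_graph(N, p):
    """Erdős-Rényi random graph: each of the N*(N-1)/2 possible edges is
    present independently with probability p, so <k> = p*(N-1) ~ p*N."""
    adj = {i: set() for i in range(N)}
    for i in range(N):
        for j in range(i + 1, N):
            if random.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def largest_component(adj):
    """Size of the largest connected component (iterative DFS)."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        best = max(best, len(comp))
    return best

N = 400
sub = largest_component(er_graph(N, 0.5 / N))  # <k> = 0.5: small fragments
sup = largest_component(er_graph(N, 4.0 / N))  # <k> = 4 > 1: giant cluster
print(sub, sup)
```

Below the <k> = 1 threshold the largest piece stays tiny; above it, one giant component swallows most of the graph, matching the regimes listed above.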
Roget’s Thesaurus
WORDS EXPRESSING ABSTRACT RELATIONS
WORDS RELATING TO SPACE
WORDS RELATING TO MATTER
WORDS RELATING TO THE INTELLECTUAL FACULTIES
WORDS RELATING TO THE VOLUNTARY POWERS
WORDS RELATING TO THE SENTIMENT AND MORAL POWERS