Cognitive Modeling: Memory Modeling & Knowledge Repr. 1/55
Memory Modeling & Knowledge Representation
Felix Putze
5.5.2011
Lecture „Cognitive Modeling“
SS 2011
Structure of Lecture
• Introduction and Motivation
• Memory Modeling
• Knowledge Representation
  • First Order Logic
  • Frames
  • Semantic Networks
  • Neural Networks
  • Bayesian Networks
Why do memory modeling?
• Any process that spans a period of time requires the handling of limited human memory capacity
• Memory access is neither guaranteed to succeed nor instantaneous
• For Human-Machine Interaction: the user has a limited capability of remembering and recalling → not all presented information is stored or available at all times
• Interaction systems should know what is on the user‘s mind and what is not
Requirements for a Memory Model
• There are a number of questions a memory model should be able to answer:
  • What items are currently active on the human's mind?
  • Is a certain bit of information retrievable?
  • What is associated with a certain input?
  • How is memory organized?
  • How is new information integrated?
Types of Memory
• Squire (1992) distinguishes several distinct types of memory and associates them with different parts of the brain:
  • Declarative Memory: Explicit and conscious recollection of…
    • facts (semantic memory, e.g. "France is a country in Europe.")
    • events (episodic memory, e.g. "Last summer, I spent my holidays in France.")
  • Procedural Memory: Implicitly learned skills (e.g. riding a bicycle)
  • Priming: Automated associations caused by frequent repetition
  • Conditioning: Automatic stimulus-reflex pairs (e.g. Pavlov's dogs)
• In this lecture, we will focus on semantic memory
Short-term and long-term Memory
• Short-term memory: Storage for a limited number of items
  • Small capacity
  • Limited duration of storage (seconds), decay
  • Longer storage duration requires rehearsal, i.e. periodic repetition
  • Acoustically and visually coded (e.g. multiple phonetically similar items are hard to keep in memory)
• Long-term memory:
  • Nearly unlimited capacity
  • Items can last for years without rehearsal
  • Items are mostly retrieved and coded semantically; however, there is a phonetic component (tip-of-the-tongue effect)
• Other types of memory: sensory memory, working memory
• The existence of distinct memory systems in the brain is controversial; experiments support both theories
The magic number 7 (+/- 2)
• Miller (1956): Determined the capacity of short-term memory to be about 7 items
  • Estimated by having people recall sequences of digits or words
  • Performance is very good for around five to six items
  • Performance degrades rapidly for more items
• Miller’s conclusion: Memory span is not a function of encoding length in bit, but a function of the number of elements
• Later, Miller acknowledged that the “magic number” was a coincidence and heavily context-dependent
Chunking and Mnemonics
• How can people remember longer phone numbers if their short-term memory is limited to 7 (or fewer) elements?
  • Most people do not remember the number 0123456789 as 0-1-2-3-4-5-6-7-8-9 but as 01-23-45-67-89 (or similar)
  • This division of information into smaller pieces is called chunking
  • This is also a question of skill: A trained person can chunk a stream of binary digits into larger blocks, convert them to decimal numbers and remember those
• There are many other mnemonic techniques:
  • Make use of linguistic or phonetic similarities
  • Construct images or stories to connect multiple items into one (e.g. „man", „horse", „fish" → a man riding on a horse hunting a fish)
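The gain from chunking can be made concrete with a short sketch (the chunk size is chosen for illustration):

```python
def chunk(digits, size):
    """Split a digit string into fixed-size chunks; each chunk then
    occupies only one slot of the limited short-term memory."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]

print(len(chunk("0123456789", 1)))  # 10 single digits: beyond the memory span
print(chunk("0123456789", 2))       # ['01', '23', '45', '67', '89']: only 5 items
```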
Controversy regarding memory limitations
• There are a lot of conflicting viewpoints on memory limitation:
• A general limit exists but is lower than seven (≈ 4 without possibility for chunking or mnemonic techniques)
• The acoustic encoding of items in short-term memory influences this capacity:
  • Of long words (which take longer to speak), only shorter sequences can be remembered
  • Memory span decreases when remembering phonetically similar words
• There are specialized parts of short term memory with separate capacity limits
• There is no limitation of short term memory at all (observed limitations are an effect of general scheduling conflicts)
• There is no special faculty for short term memory at all, only an attention limitation on generic memory
Influence of Emotion on Memory
• Emotion-congruent information is encoded better
  • In a happy mood, we encode more „happy" facts than „sad" ones
• With high arousal, central information is encoded better…
  • …while peripheral information is encoded worse
• Yerkes-Dodson law: Relation between arousal and performance is described as an „inverted u-curve“
• Consequence: Do not study memory as an isolated concept!
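The inverted U can be sketched as a simple quadratic; the shape and the optimum at 0.5 are illustrative assumptions, not empirical values:

```python
def performance(arousal, optimum=0.5):
    """Toy Yerkes-Dodson relation: performance peaks at an intermediate
    arousal level and falls off toward both extremes. The quadratic form
    and the optimum are invented for illustration."""
    return max(0.0, 1.0 - 4.0 * (arousal - optimum) ** 2)

print(performance(0.5))                    # peak at moderate arousal
print(performance(0.1), performance(0.9))  # lower at both extremes
```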
Structure of Lecture
• Introduction and Motivation
• Memory Modeling
• Knowledge Representation
  • First Order Logic
  • Frames
  • Semantic Networks
  • Neural Networks
  • Bayesian Networks
Atkinson's & Shiffrin's Memory Model
• Incoming information is extracted from parts of the sensory input, initially stored in STM and later transferred to LTM or displaced → a linear process
• Monolithic modeling (one model for all types of information)
Components of Atkinson's and Shiffrin's Model
• Sensory memory:
  • Specialized for different sensory inputs (e.g. visual, auditive, …)
• Lasts for a very short time (milliseconds for visual, few seconds for aural information)
• Contains raw data, used to select relevant information (partial report)
• Decoupled from other components (localized, unconscious)
• Short term memory:
  • Keeps currently relevant information
• Duration of 15-30 seconds (unless rehearsed)
• Bottleneck between raw data from sensors and unlimited long term memory
• Long term memory:
  • Information which is rehearsed often enough is stored here
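The flow through the three components can be sketched as follows; the capacity value and the rehearsed set are illustrative:

```python
from collections import deque

STM_CAPACITY = 7  # illustrative, after Miller's "magic number"

def present(items, rehearsed):
    """Toy sketch of the Atkinson-Shiffrin flow: every presented item
    enters the capacity-limited STM (displacing the oldest item when
    full); only items that are rehearsed are transferred to LTM."""
    stm = deque(maxlen=STM_CAPACITY)
    ltm = set()
    for item in items:
        stm.append(item)   # oldest item is displaced when STM is full
        if item in rehearsed:
            ltm.add(item)  # rehearsal transfers the item to LTM
    return list(stm), ltm

stm, ltm = present(list("ABCDEFGHIJ"), rehearsed={"B", "H"})
print(stm)  # ['D', 'E', 'F', 'G', 'H', 'I', 'J']
print(ltm)  # contains 'B' and 'H'
```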
Baddeley‘s Memory Model
• Model of short-term (or working) memory
• Three slave systems for different types of information
• Controlled by central executive
Baddeley‘s Memory Model
• The phonological loop consists of two main parts:
  • Phonological store: contains ca. 2 seconds of audio information
  • Phonological rehearsal: performs periodic rehearsal to keep information available (the „inner voice")
• Evidence: Suppression of rehearsal impairs memory
• The visuo-spatial sketchpad is divided into two components:
  • Inner cache: forms, color
  • Inner scribe: spatial information, movement (planning)
• Visually presented information can also be transferred to the phonological loop by verbalization
• Separation between phonologic and visual system explains differences in dual-tasking: Combining one acoustic and one visual task is easier than combining two tasks of the same kind
Baddeley‘s Memory Model
• Central executive:
  • attention
  • retrieval strategies
  • episode forming
• Episodic buffer:
  • Added in 2000 as third slave system
  • Contains concrete, multimodal "episodes"
  • Introduced to explain memory which is not limited to one channel
  • Also explains the ability to memorize a longer sequence of words which form a "story"
  • Still less defined than the other two subsystems
Cowan‘s memory model
Cowan‘s memory model
• No distinction between long-term and short-term memory
• No division in modality-specific components
• Short-term memory is implicitly represented as activated items in memory
• Activation decays over time unless it is refreshed
• A subset of the activated items forms the focus of attention
• Theoretical foundation of the ACT-R memory model
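A toy sketch of activation decay and refreshing in this spirit; the decay rate, threshold, and exponential form are invented for illustration:

```python
import math

DECAY = 0.5      # illustrative decay rate
THRESHOLD = 0.2  # activation above this counts as "in short-term memory"

def activation(initial, elapsed, refreshed_at=()):
    """Toy version of Cowan's idea: an item's activation decays
    exponentially over time unless attention refreshes it."""
    a, t0 = initial, 0.0
    for t in sorted(refreshed_at):
        a = max(a * math.exp(-DECAY * (t - t0)), initial)  # refresh restores activation
        t0 = t
    return a * math.exp(-DECAY * (elapsed - t0))

print(activation(1.0, 6.0) > THRESHOLD)                      # False: decayed away
print(activation(1.0, 6.0, refreshed_at=[5.0]) > THRESHOLD)  # True: kept active
```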
Structure of Lecture
• Introduction and Motivation
• Memory Modeling
• Knowledge Representation
  • First Order Logic
  • Frames
  • Semantic Networks
  • Neural Networks
  • Bayesian Networks
First Order Logic
• First order logic is a traditional and still widely used knowledge representation scheme
• Expresses knowledge in the form of logical clauses
• Example:
  • „All humans are mortal." = ∀x: human(x) → mortal(x)
• Immediately benefits from established algorithms
• Models logical, conscious deduction and inference
• Everything is modeled in one unconstrained language: no meta-ontology, etc.
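Deduction with such a clause can be sketched as a one-step forward chain; the ground facts (Socrates, Plato) are the customary stand-in individuals, not from the slide:

```python
# Applying the rule  ∀x: human(x) → mortal(x)  over a set of ground facts.
facts = {("human", "socrates"), ("human", "plato")}
rules = [("human", "mortal")]  # premise predicate → conclusion predicate

def forward_chain(facts, rules):
    """Derive every conclusion licensed by one application of each rule."""
    derived = set(facts)
    for premise, conclusion in rules:
        for pred, arg in facts:
            if pred == premise:
                derived.add((conclusion, arg))
    return derived

print(forward_chain(facts, rules))  # now also contains ('mortal', 'socrates'), ...
```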
Cyc: A Database of Human Knowledge
• Under development since 1984 by the company Cycorp; a large collection of everyday knowledge („water is wet")
• Currently contains ~500,000 items and ~5,000,000 facts
• A free version OpenCyc exists (subset of Cyc)
• An inference engine is able to deduce facts from the knowledge base
• Developed for language generation and language understanding
• Cyc uses higher order logic to increase its expressiveness:
  • A micro-theory describes the context in which a statement is valid
  • For example, the statement „vampires fear garlic" is (only) true in the context „mythology"
• Introduces modal operator: isTrue(context, assertion)
Co
gnit
ive
Mo
del
ing:
Mem
ory
Mo
del
ing
& K
no
wle
dge
Rep
r.
22/55
Cyc Example
• (isa BurningOfPapalBull SocialGathering)
  • The burning of the papal bull is an instance of „SocialGathering"
• (relationInstanceExistsMin BurningOfPapalBull attendees UniversityStudents 40)
• At least 40 students attended the event
• (isa BurningOfPapalBull-Document CombustionProcess)
• (properSubEvent BurningOfPapalBull-Document BurningOfPapalBull)
• The actual burning event (as part of the social event)
• (relationInstanceExists inputsDestroyed BurningOfPapalBull-Document (CopyOfConceptualWorkFn PapalBull-ExcommunicationCW))
• The thing destroyed is a member of the functionally defined collection „copies of the conceptual work PapalBull-ExcommunicationCW“
FOL for Memory Modeling
• First order logic is typically used for modeling reasoning, deduction and inference
• No easy representation of “fuzzy”, associative processes
• As we see with Cyc, FOL may not be expressive enough to represent complex knowledge
Frames
• Developed by Marvin Minsky in 1975
• A frame consists of a name and several named attributes („slots") which can contain
  • atomic values,
  • references to other frames, or
  • nothing, to represent partial knowledge
• A unification algorithm combines two frames by merging atomic attributes and recursively unifying non-atomic attributes
• Related to the schema theory of cognitive psychology
[Class
Title: „Cognitive Modeling“
NumStudents: <empty>
Teacher: [Person
FirstName: „Tanja“
FamilyName: „Schultz“
]
]
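The unification idea can be sketched over frames represented as nested dicts; the NumStudents value in the update is invented for illustration:

```python
def unify(a, b):
    """Sketch of frame unification: merge two frames slot by slot.
    Atomic values must agree (None = empty slot); frame-valued slots
    are unified recursively. Returns None on a clash."""
    result = dict(a)
    for slot, value in b.items():
        if slot not in result or result[slot] is None:
            result[slot] = value
        elif isinstance(result[slot], dict) and isinstance(value, dict):
            sub = unify(result[slot], value)
            if sub is None:
                return None
            result[slot] = sub
        elif value is not None and result[slot] != value:
            return None  # clash between two different atomic values
    return result

lecture = {"Title": "Cognitive Modeling", "NumStudents": None,
           "Teacher": {"FirstName": "Tanja"}}
update = {"NumStudents": 25,  # hypothetical value
          "Teacher": {"FamilyName": "Schultz"}}
print(unify(lecture, update))  # fills the empty slot, merges the sub-frame
```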
Parallels to Object Oriented Design
• The graphical representation of a set of frames can be regarded as a UML class/object diagram
• We identify…
  • abstract concepts with (abstract and non-abstract) classes
  • entities with objects
  • relations with associations (compounds) and class attributes (atomic values)
• This analogy may…
  • help in the implementation of a knowledge representation
  • allow the use of powerful tools which are readily available
Memory Modeling in ACT-R
• The main building block of knowledge representation in ACT-R (chunk) is essentially a frame
• Semantic memory is handled by the declarative module
• The declarative module makes no distinction between long-term and short-term memory
• Each item is associated with an activation value
• For retrieval from the declarative module, a request is stored in the input buffer of the declarative module
• This request is a partial description of a chunk; the module returns the chunk which matches this description and (if there is ambiguity) which has the highest activation
• There is no partial matching of chunks which „almost" fit the description
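A sketch of how such a retrieval request can be resolved; chunks are plain dicts here, and the slot names and activation values are invented:

```python
def retrieve(chunks, request):
    """Sketch of a declarative-module retrieval: return the chunk whose
    slots exactly match the partial description, breaking ties by the
    highest activation. Near misses are never returned."""
    matches = [c for c in chunks
               if all(c.get(slot) == value for slot, value in request.items())]
    return max(matches, key=lambda c: c["activation"]) if matches else None

chunks = [
    {"type": "country", "name": "France", "activation": 0.9},
    {"type": "country", "name": "Spain",  "activation": 0.4},
    {"type": "city",    "name": "Paris",  "activation": 0.7},
]
print(retrieve(chunks, {"type": "country"})["name"])  # France (highest activation)
print(retrieve(chunks, {"name": "Rome"}))             # None: no exact match
```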
Activation
• Activation of a chunk is the sum of two components (plus noise):
• Base activation: depends on the frequency and recency of stimulations of a chunk:

  $B_i = \log\Bigl(\sum_{j=1}^{n} t_j^{-d}\Bigr)$

  ($t_j$ = age of the $j$-th activation of chunk $i$; $d$ = decay parameter)
• Spreading activation (associative* activation):

  $S_i = \sum_{j=1}^{n} \frac{1}{n}\,S_{ji}, \qquad S_{ji} = \begin{cases} 0 & \text{if } i \text{ and } j \text{ are not associated} \\ S - \log(\mathrm{fan}_j) & \text{else} \end{cases}$

  (sum over all chunks $j$ associated with the content of the goal buffer; $\mathrm{fan}_j$ = number of chunks of which $j$ is a slot value)
• * chunk j is associated with chunk i if j is an attribute of a slot in i
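Both activation components can be computed directly; the decay parameter d = 0.5 is the conventional ACT-R default, and the scaling constant S = 2.0 is an assumed value:

```python
import math

def base_activation(ages, d=0.5):
    """Base activation B_i = log(sum_j t_j^(-d)); t_j is the age of the
    j-th presentation of the chunk, d the decay parameter."""
    return math.log(sum(t ** -d for t in ages))

def spreading_activation(fans, S=2.0):
    """Associative activation from n source chunks j: each contributes
    (S - log(fan_j)) / n; fan_j is the fan of source j."""
    n = len(fans)
    return sum((S - math.log(fan)) / n for fan in fans)

# A frequently and recently used chunk beats an old, rarely used one:
print(base_activation([1.0, 2.0, 5.0]) > base_activation([50.0]))  # True
# Fan effect: the more chunks a source is associated with, the less it spreads:
print(spreading_activation([1]) > spreading_activation([8]))       # True
```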
The Fan Effect
• An experiment lets participants learn a set of facts. When given probes, they have to identify those which occurred in the training set (target probes)
• It is easier to identify those sentences for which at least one component (person or location) was rare in the training corpus
Semantic Networks
• The goal of knowledge representation is the modeling of facts and their relationships
• A natural formalism is a graph with nodes representing facts and edges representing relationships
• Different forms of networks exist:
  • Are edges themselves semantically annotated?
  • Are edges directed?
  • Are edges weighted?
[Figure: two small example networks: London „north-of" Paris, shown once unweighted and once with weighted edges]
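One simple way to answer all three design questions at once is to store edges as annotated, directed, weighted tuples:

```python
# Edges as (source, relation, target, weight) tuples: semantically
# annotated, directed, and weighted all at once.
edges = [
    ("London", "north-of", "Paris", 1.0),
]

def neighbors(node, edges):
    """All facts directly related to a node, following edge direction."""
    return [(rel, dst, w) for src, rel, dst, w in edges if src == node]

print(neighbors("London", edges))  # [('north-of', 'Paris', 1.0)]
print(neighbors("Paris", edges))   # []: edges are directed
```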
Examples of semantic networks: Hierarchies
• Focus on „is-a“ relations
• Example from Porphyry, 300 AD:
Examples of semantic networks: KL-ONE
• KL-ONE: Developed in 1979 by Brachman
• Knowledge representation framework for AI
Examples of semantic networks: MultiNet
• Multi-Layer architecture, focus on language understanding
Characteristics of Semantic Networks
• No predefined ontology and attributes
• Can introduce meta-knowledge directly into the network
• Natural tool for the representation of associations
• Allow the application of well-studied graph algorithms for analysis of network (connected components, distances, topology, …)
The (extended) LTMc Model
• The LTMc memory model is designed as a replacement for the ACT-R declarative module
• LTM = Long Term Memory; largely follows the model of Cowan → no explicit distinction between LTM and STM
• Models memory as a semantic network
  • Each node has an activation value, similar to the activation in the original ACT-R model
  • Base activation and noise activation are similar; spreading activation is adapted to the network structure
• Items which are activated above a threshold are considered to be active
• Attention focus: Identify connected component with highest overall activation
• JAM: Stand-alone extension of LTMc to better model the dynamics of memory (e.g. topic drifts), developed at the CSL
Spreading in the extended LTMc Model
• The graph structure of the model is well suited to model the process of spreading activation (i.e. association)
• When an item is stimulated, its activation is distributed to its neighbors in the network
• If the total spreading activation received exceeds a threshold, it is further propagated in a breadth-first-search style
• The total activation spread by one node is constant and is divided equally among all outgoing edges → fan effect
• JAM: Extension to handle topic drifts
  • Decay function to decrease spreading activation over time
  • Capping of spreading activation to control the total activation in the network
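A minimal sketch of this spreading scheme; the threshold, initial amount, and toy graph are invented, and the JAM decay and capping extensions are omitted:

```python
from collections import deque, defaultdict

def spread(graph, stimulus, initial=1.0, threshold=0.1):
    """LTMc-style spreading sketch: a stimulated node divides a constant
    amount of activation equally among its outgoing edges (fan effect);
    nodes whose received activation exceeds the threshold propagate
    further, breadth-first."""
    received = defaultdict(float)
    queue = deque([(stimulus, initial)])
    visited = {stimulus}
    while queue:
        node, amount = queue.popleft()
        out = graph.get(node, [])
        if not out:
            continue
        share = amount / len(out)  # equal division over outgoing edges
        for nb in out:
            received[nb] += share
            if received[nb] > threshold and nb not in visited:
                visited.add(nb)
                queue.append((nb, received[nb]))
    return dict(received)

graph = {"JaneAusten": ["Author", "FemaleHuman"],
         "Author": ["Writer"],
         "Writer": []}
print(spread(graph, "JaneAusten"))  # activation reaches Author, FemaleHuman, Writer
```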
Spreading Example
Stimulation of Author, Writer, FemaleHuman
Stimulation of ThomasHobbes, JohnLocke, DavidHume, BertrandRussel
(Excerpt from Cyc database converted to network structure; using the JAM algorithm)
Spreading Example
Stimulation of JaneAusten
Evaluation: „Moses Illusion“
• „How many animals of each type did Moses bring onto the ark?"
• Typical answer: „two“. Correct answer: „none“ (it was Noah!)
• Can a memory model reproduce this behavior?
• Note that this is not a case of false knowledge but of partial matching
• Vanilla ACT-R does not reflect this phenomenon, but (extended) LTMc does:
[Figure: network with nodes Moses, Noah, Arc, bring, two, BiblicalPerson; the Moses probe partially matches the stored Noah fact]
ConceptNet
• Created at MIT Media Lab
• Huge common sense database represented as semantic net
• Not developed by experts but using a crowd-sourcing approach:
  • Data is entered by users of a webpage
  • People play a "Game with a Purpose" (e.g. association games)
  • Data can later be validated and weighted by other judges
• Contains subjective associations
• Easily accessible using Python interfaces
Verbosity: A Game with a Purpose
Describer’s view
Guesser’s view
ConceptNet: Example
Knowledge Database WordNet
• Lexical database of the English language in the form of a semantic network
• Developed since 1985 by George A. Miller in Princeton
• Main unit forming the nodes of the network: synsets (groups of synonyms with a short description)
• Models semantic relations (mostly language-oriented) between synsets
• Contains more than 110,000 synsets
WordNet: Example graph of hypernyms
Information in WordNet
• Some relations are allowed at the word level (e.g. antonym = of opposite meaning), but the majority are defined on the synset level
• Examples for relations in WordNet:
  • Holonyms (part-of), e.g. „family" is a holonym of „mother"
  • Hypernyms (kind-of), e.g. „animal" is a hypernym of „dog"
• WordNet also contains short definitions in plain text for each term
• Also contains additional linguistic information, e.g. syntactic constraints on the use of certain words
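A hypothetical miniature of the kind-of hierarchy illustrates how such relations are traversed; real WordNet links connect synsets rather than bare words, and the actual chain for „dog" is longer:

```python
# Toy hypernym ("kind-of") links, words standing in for synsets.
hypernym = {"dog": "canine", "canine": "carnivore", "carnivore": "animal"}

def hypernym_chain(word):
    """Walk the kind-of links up to the most abstract concept."""
    chain = [word]
    while chain[-1] in hypernym:
        chain.append(hypernym[chain[-1]])
    return chain

print(hypernym_chain("dog"))  # ['dog', 'canine', 'carnivore', 'animal']
```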
Soundness of the WordNet Graph
• The WordNet ontology is represented such that more abstract generalizations of a word are further up in the ontology
• This is in accordance with the spreading model of LTMc applied to the WordNet graph
• Introduction of the concept of evocation, which measures how much one concept brings to mind another
• Evocation creates a much denser network with weighted edges
Neural Knowledge Representation
• Encode information in the structure of a neural network
  • Train the network by presenting input patterns, i.e. by stimulating neurons
  • When stimulating learned (or similar) input patterns, the network should recognize them
• Example: Self-organizing maps (SOM)
  • Maps a multi-dimensional input space to a two-dimensional representation
  • The map consists of interconnected nodes (neurons) in a plane (think of the cortex of the brain)
  • The weights of each neuron form a prototype which describes which input patterns the neuron is similar to
• Idea: Different input patterns activate different parts of the neural network
  • Analogy: The human brain also shows different activation patterns for different stimuli (e.g. regions for processing of visual vs. aural stimuli)
Self-organizing map: Learning
• Initialize weights of all neurons randomly
• Iteratively adjust weights:
  • Present input pattern P as a vector in a high-dimensional space
  • Compare P with all neuron weights, find the most similar one: S
  • Determine the neighborhood N(S) using the network structure
  • Shift the weights of N(S) to be more similar to P
  • Optional: Shift all other weights to be less similar to P
• Over time, decrease the learning rate (strength of adaptation) and the neighborhood size
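The steps above can be sketched as a small SOM on a 4×4 grid; the grid size, schedules, and training data are all illustrative, and the optional push-away step is omitted:

```python
import random

def train_som(patterns, width=4, height=4, dim=2, epochs=50, rate=0.5, radius=1):
    """Minimal SOM training loop: find the best-matching neuron S for each
    pattern P and pull the weights of the grid neighborhood N(S) toward P,
    with a learning rate that decreases over time."""
    random.seed(0)
    weights = {(x, y): [random.random() for _ in range(dim)]
               for x in range(width) for y in range(height)}

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    for epoch in range(epochs):
        lr = rate * (1 - epoch / epochs)  # decreasing learning rate
        for p in patterns:
            s = min(weights, key=lambda n: dist2(weights[n], p))  # best match S
            for (x, y), w in weights.items():  # neighborhood N(S) on the grid
                if abs(x - s[0]) <= radius and abs(y - s[1]) <= radius:
                    weights[(x, y)] = [wi + lr * (pi - wi) for wi, pi in zip(w, p)]
    return weights

# Two well-separated input clusters:
weights = train_som([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])

def best_match(pattern):
    return min(weights, key=lambda n: sum((u - v) ** 2
                                          for u, v in zip(weights[n], pattern)))

print(best_match([0.0, 0.0]), best_match([1.0, 1.0]))  # map positions per cluster
```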
Self-organizing map: Application
• Example: We train a SOM with pictures of different objects
  • Input patterns are the raw pixels or low-level descriptors
• For each category of objects, a typical activation pattern will emerge in the network
  • For each category, we expect a region of matching nodes
• Network learns a generalizing mapping from input patterns to categories
• SOM now can map unseen input patterns to a category
• Note that there are many other connectionist approaches (e.g. Hopfield Networks)!
Self-organizing map: Example
Belief
• Up to now, knowledge was either (subjectively) true or false, i.e. part of the individual knowledge base or not
• Fuzziness was part of the model only via the activation value, not via the truth of a piece of information
• Not a realistic assumption
• Introduce belief: Degree to which some information is considered to be valid
• Example: I estimate the probability that P!=NP to be 5% (I am “pretty sure”, but there is room for doubt)
• Belief is subjective, depends on prior assumptions and experience or observations
• Need to find a formalism to model and manipulate belief
Probability according to Bayes
• Representation of belief as probability
• Probability according to Bayes: „Confidence in the personal assessment of an issue.“
• Can be different for different individuals with different background and experience
• Model probability of non-stochastic and unique events
• Example: P(student A passes the exam on cognitive modeling)
• This is not possible in classical frequentist statistics, which is defined based on the frequency of events
Bayes‘ Theorem
• Important instrument: Bayes' theorem

  P(A|B) = P(B|A) · P(A) / P(B)

• Bayes' theorem allows the combination of a-priori knowledge P(A) with information from a cue B to calculate the a-posteriori probability P(A|B)
Application of Bayes‘ Theorem
• Important instrument: Bayes' theorem: P(A|B) = P(B|A) · P(A) / P(B)
• Example: A = Person X is rich (true/false), B = Person X wears expensive jewelry (true/false)
• P(A=true|B=true)?
• P(A=true) = 0.1
• P(B=true|A=true) = 0.8
• P(B=true) = 0.2
• P(A=true|B=true) = 0.4
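The numbers above plug directly into Bayes' theorem:

```python
def posterior(p_a, p_b_given_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# The jewelry example: prior 0.1, likelihood 0.8, evidence 0.2
print(round(posterior(p_a=0.1, p_b_given_a=0.8, p_b=0.2), 3))  # 0.4
```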
Bayesian Networks
• Want to model joint distribution of multiple variables
• A Bayesian network is a directed acyclic graph:
  • Nodes = random variables
  • Arcs = direct causality
• Each node contains a conditional probability distribution dependent on its parents in the graph
• Using Bayes‘ theorem, we can infer probabilities of some nodes given information on some of the others
[Figure: example network with nodes family-out, bowel-problem, dog-out, lights-on, hear-bark]
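Inference by brute-force enumeration over a small network with this node set can be sketched as follows; the arc directions and every conditional probability are invented for illustration:

```python
from itertools import product

VARS = ["family_out", "bowel_problem", "lights_on", "dog_out", "hear_bark"]

def p(var, val, a):
    """Conditional probability tables; all numbers are made up."""
    if var == "family_out":
        pt = 0.15
    elif var == "bowel_problem":
        pt = 0.01
    elif var == "lights_on":                  # assumed parent: family_out
        pt = 0.6 if a["family_out"] else 0.05
    elif var == "dog_out":                    # assumed parents: both causes
        fo, bp = a["family_out"], a["bowel_problem"]
        pt = 0.99 if fo and bp else 0.9 if fo else 0.97 if bp else 0.3
    else:                                     # hear_bark; assumed parent: dog_out
        pt = 0.7 if a["dog_out"] else 0.01
    return pt if val else 1.0 - pt

def joint(a):
    """Product of all node probabilities for one full assignment."""
    result = 1.0
    for v in VARS:
        result *= p(v, a[v], a)
    return result

def query(var, evidence):
    """P(var=True | evidence) by summing the joint over all assignments."""
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        a = dict(zip(VARS, values))
        if all(a[k] == v for k, v in evidence.items()):
            den += joint(a)
            if a[var]:
                num += joint(a)
    return num / den

# Hearing the dog bark raises the belief that the family is out:
print(query("family_out", {}), query("family_out", {"hear_bark": True}))
```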
The Bayesian Brain
• Bayesian coding hypothesis: The brain represents information probabilistically
  • Coding and computing with probability density functions
  • Not limited to memory; targets noisy perception, planning and action execution
• Instead of deterministically modeling a concept X, model its probability density function p(X)
• Natural and expressive representation of uncertainty
• May present a generic framework for modeling cognition
• Allows seamless integration of models with statistical machine learning techniques