CS 182 Sections 103 - 104 slides created by Eva Mok ( [email protected] ) modified by jgm April 13, 2005


Page 1: CS 182 Sections 103 - 104

CS 182 Sections 103 - 104

slides created by Eva Mok ([email protected])

modified by jgm

April 13, 2005

Page 2: CS 182 Sections 103 - 104

Announcements

• a8 out, due Monday April 19th, 11:59pm

• BBS articles are assigned for the final paper:

– Arbib, Michael A. (2002). The mirror system, imitation, and the evolution of language.

– Hurford, James R. (2003). The neural basis of predicate-argument structure.

– Grush, Rick (2004). The emulation theory of representation: motor control, imagery, and perception.

• Skim through them and let us know, as part of a8, which article you plan to use.

Page 3: CS 182 Sections 103 - 104

Schedule

• Last Week

– Inference in Bayes Nets

– Metaphor understanding using KARMA

• This Week

– Formal Grammar and Parsing

– Construction Grammar, ECG

• Next Week

– Psychological model of sentence processing

– Grammar Learning

Page 4: CS 182 Sections 103 - 104

Quiz

1. How are the source and target domains represented in KARMA?

2. How does the source domain information enter KARMA? How should it?

3. What does SHRUTI buy us?

4. How are bindings propagated in a structured connectionist framework?

Page 5: CS 182 Sections 103 - 104

Quiz

1. How are the source and target domains represented in KARMA?

2. How does the source domain information enter KARMA? How should it?

3. What does SHRUTI buy us?

4. How are bindings propagated in a structured connectionist framework?

Page 6: CS 182 Sections 103 - 104

KARMA

• DBN to represent target domain knowledge

• Metaphor maps link target and source domain

• X-schema to represent source domain knowledge

Page 7: CS 182 Sections 103 - 104

DBNs

• Explicit causal relations + full joint table → Bayes Nets

• Sequence of full joint states over time → HMM

• HMM + BN → DBNs

• DBNs are a generalization of HMMs that captures the sparse causal relationships within the full joint

Page 8: CS 182 Sections 103 - 104

Dynamic Bayes Nets

Page 9: CS 182 Sections 103 - 104

Metaphor Maps

1. map entities and objects between embodied and abstract domains

2. invariantly map the aspect of the embodied domain event onto the target domain by setting the evidence for the status variable based on controller state (event structure metaphor)

3. project x-schema parameters onto the target domain

Page 10: CS 182 Sections 103 - 104

Where does the domain knowledge come from?

• Both domains are structured by frames

• Frames have:

– List of roles (participants, frame elements)

– Relations between roles

– Scenario structure

Page 11: CS 182 Sections 103 - 104

DBN for the target domain

Time slices: T0, T1

Variables and their values:

• Economic State: [recession, nogrowth, lowgrowth, higrowth]

• Goal: [free trade, protection]

• Policy: [Liberalization, Protectionism]

• Outcome: [success, failure]

• Difficulty: [present, absent]

Page 12: CS 182 Sections 103 - 104

Let’s try a different domain

• I didn’t quite catch what he was saying

• His slides are packed with information

• He sent the audience a clear message

9/11 Commission Public Hearing, Monday, March 31, 2003

When we can get a good flow of information from the streets of our cities across to, whether it is an investigating magistrate in France or an intelligence operative in the Middle East, and begin to assemble that kind of information and analyze it and repackage it and send it back out to users, whether it's a policeman on the beat or a judge in Italy or a Special Forces Team in Afghanistan, then we will be getting close to the kind of capability we need to deal with this kind of problem. That's going to take a couple, a few years.

Page 13: CS 182 Sections 103 - 104

Target domain belief net (T-1)

Target domain belief net (T) (communication frame)

• speaker, addressee, action, outcome, degree of understanding

Metaphor Map (conduit metaphor)

• Ideas are objects

• Words are containers

• Senders are speakers

• Receivers are addressees

• send is talk

• receive is hear

Source domain f-structs (transfer)

• sender, receiver, means, force, rate

X-Schema representation

• transfer, send, receive, pack

Page 14: CS 182 Sections 103 - 104

Quiz

1. How are the source and target domains represented in KARMA?

2. How does the source domain information enter KARMA? How should it?

3. What does SHRUTI buy us?

4. How are bindings propagated in a structured connectionist framework?

Page 15: CS 182 Sections 103 - 104

How do the source domain f-structs get parameterized?

• In the KARMA system, they are hand-coded.

• In general, you need analysis of sentences:

– syntax

– semantics

Syntax captures:

• constraints on word order

• constituency (units of words)

• grammatical relations (e.g. subject, object)

• subcategorization & dependency (e.g. transitive, intransitive, subject-verb agreement)

Page 16: CS 182 Sections 103 - 104

Quiz

1. How are the source and target domains represented in KARMA?

2. How does the source domain information enter KARMA? How should it?

3. What does SHRUTI buy us?

4. How are bindings propagated in a structured connectionist framework?

Page 17: CS 182 Sections 103 - 104

SHRUTI

• A connectionist model of reflexive processing

Reflexive reasoning

automatic, extremely fast (~300ms), ubiquitous

• computation of coherent explanations and predictions

• gradual learning of causal structure

• episodic memory

• understanding language

Reflective reasoning

conscious deliberation, slow, overt consideration of alternatives, external props (pencil + paper)

• solving logic puzzles

• doing cryptarithmetic

• planning a vacation

Page 18: CS 182 Sections 103 - 104

SHRUTI

• synchronous activity without using a global clock

• An episode of reflexive processing is a transient propagation of rhythmic activity

• An “entity” is a phase in the above rhythmic activity.

• Bindings are synchronous firings of role and entity cells

• Rules are interconnection patterns mediated by coincidence detector circuits that allow selective propagation of activity

• Long-term memories are coincidence and coincidence-failure detector circuits

• An affirmative answer / explanation corresponds to reverberatory activity around closed loops

Page 19: CS 182 Sections 103 - 104

focal cluster

• provides locus of coordination, control and decision making

• enforce sequencing and concurrency

• initiate information seeking actions

• initiate evaluation of conditions

• initiate conditional actions

• link to other schemas, knowledge structures

Page 20: CS 182 Sections 103 - 104

Quiz

1. How are the source and target domains represented in KARMA?

2. How does the source domain information enter KARMA? How should it?

3. What does SHRUTI buy us?

4. How are bindings propagated in a structured connectionist framework?

Page 21: CS 182 Sections 103 - 104

dynamic binding example

• asserting that get(father, cup)

• father fires in phase with agent role

• cup fires in phase with patient role

[Figure: focal clusters for the predicate get (nodes +, -, ?, agt, pat), for a type (nodes +e, +v, ?e, ?v), and for the entities cup and my-father (nodes +, ?)]
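The binding example above can be mimicked with a toy sketch in which a "phase" is just an integer slot in the rhythmic cycle. This is only an illustration of the idea that role and entity nodes firing in the same phase are bound; it is not SHRUTI's actual circuitry, and the helper names `bind` and `read_bindings` and the role labels `get.agt` / `get.pat` are hypothetical.

```python
from collections import defaultdict

def bind(assertion_roles):
    """Give each entity its own phase and make each role fire in the
    phase of the entity that fills it (hypothetical helper)."""
    phase_of = {}
    firing = {}
    for role, entity in assertion_roles.items():
        if entity not in phase_of:
            phase_of[entity] = len(phase_of)  # next free phase in the cycle
        firing[entity] = phase_of[entity]
        firing[role] = phase_of[entity]       # role fires in sync with its filler
    return firing

def read_bindings(firing, roles):
    """Recover role -> entity pairs purely from phase coincidence."""
    by_phase = defaultdict(list)
    for node, phase in firing.items():
        by_phase[phase].append(node)
    bindings = {}
    for nodes in by_phase.values():
        fillers = [n for n in nodes if n not in roles]
        for n in nodes:
            if n in roles and fillers:
                bindings[n] = fillers[0]
    return bindings

# get(father, cup): father fills the agent role, cup fills the patient role
firing = bind({"get.agt": "my-father", "get.pat": "cup"})
bindings = read_bindings(firing, roles={"get.agt", "get.pat"})
print(bindings)
```

Nothing in `read_bindings` looks at names or pointers; it only groups nodes by when they fire, which is the point of encoding bindings by temporal synchrony.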

Page 22: CS 182 Sections 103 - 104

Active Schemas in SHRUTI

• active schemas require control and coordination, dynamic role binding and parameter setting

• schemas are interconnected networks of focal clusters

• bindings are encoded and propagated using temporal synchrony

• scalar parameters are encoded using rate-encoding

Page 23: CS 182 Sections 103 - 104
Page 24: CS 182 Sections 103 - 104

Review: Probability

• Random Variables

– Boolean/Discrete

• True/false

• Cloudy/rainy/sunny

– Continuous

• [0,1] (i.e. 0.0 <= x <= 1.0)

Page 25: CS 182 Sections 103 - 104

Priors/Unconditional Probability

• Probability Distribution

– In absence of any other info

– Sums to 1

– E.g. P(Sunny=T) = .8 (thus, P(Sunny=F) = .2)

• This is a simple probability distribution

• Joint Probability

– P(Sunny, Umbrella, Bike)

• Table of size 2^3 = 8

– Full Joint is a joint of all variables in model

• Probability Density Function

– Continuous variables

• E.g. Uniform, Gaussian… (Poisson, by contrast, is a discrete distribution)

Page 26: CS 182 Sections 103 - 104

Conditional Probability

• P(Y | X) is the probability of Y given that all we know is the value of X

– E.g. P(cavity=T | toothache=T) = .8

• thus P(cavity=F | toothache=T) = .2

• Product Rule

– P(Y | X) = P(X ∧ Y) / P(X) (dividing by P(X) is the normalizer that makes the values add up to 1)

[Figure: diagram of Y and X]

Page 27: CS 182 Sections 103 - 104

Inference

Toothache   Cavity   Catch   Prob
False       False    False   .576
False       False    True    .144
False       True     False   .008
False       True     True    .072
True        False    False   .064
True        False    True    .016
True        True     False   .012
True        True     True    .108

P(Toothache=T)?

P(Toothache=T, Cavity=T)?

P(Toothache=T | Cavity=T)?
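The three queries above can all be answered by summing rows of the full joint table. The following is a minimal sketch of that bookkeeping; the table values are the ones on this slide, while the helper names `prob` and `cond` are made up for illustration.

```python
# Full joint table from the slide: (toothache, cavity, catch) -> probability
JOINT = {
    (False, False, False): 0.576,
    (False, False, True): 0.144,
    (False, True, False): 0.008,
    (False, True, True): 0.072,
    (True, False, False): 0.064,
    (True, False, True): 0.016,
    (True, True, False): 0.012,
    (True, True, True): 0.108,
}
VARS = ("toothache", "cavity", "catch")

def prob(**fixed):
    """Marginal probability: sum the rows consistent with the fixed values."""
    return sum(p for world, p in JOINT.items()
               if all(world[VARS.index(k)] == v for k, v in fixed.items()))

def cond(query, given):
    """Conditional probability P(query | given) = P(query, given) / P(given)."""
    return prob(**query, **given) / prob(**given)

p_t = prob(toothache=True)
p_tc = prob(toothache=True, cavity=True)
p_t_given_c = cond({"toothache": True}, {"cavity": True})
print(round(p_t, 3), round(p_tc, 3), round(p_t_given_c, 3))
```

With this table the three answers come out to 0.2, 0.12, and 0.6.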

Page 28: CS 182 Sections 103 - 104

Independence

• Rainy, Cloudy: not independent (knowing one changes the probability of the other)

• Sunny, Windy: independent

Page 29: CS 182 Sections 103 - 104

Bayes Nets

Network: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls

P(B) = 0.001

P(E) = 0.002

B   E   P(A|B,E)
T   T   0.95
T   F   0.94
F   T   0.29
F   F   0.001

A   P(J|A)
T   0.90
F   0.05

A   P(M|A)
T   0.70
F   0.01

Page 30: CS 182 Sections 103 - 104

Independence

Structure                    X independent of Z?   X conditionally independent of Z given Y?
X → Y → Z (chain)            No                    Yes
X ← Y → Z (common cause)     No                    Yes
X → Y ← Z (common effect)    Yes                   No (given Y or anything below it)
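The common-effect case is the least intuitive of the independence patterns, so here is a small numeric check by brute-force enumeration. All probabilities in it are invented for illustration; only the structure X → Y ← Z matters.

```python
from itertools import product

# Hypothetical numbers for a common-effect network X -> Y <- Z
P_X = {True: 0.3, False: 0.7}
P_Z = {True: 0.6, False: 0.4}
P_Y_TRUE = {(True, True): 0.95, (True, False): 0.8,
            (False, True): 0.5, (False, False): 0.05}  # P(Y=T | X, Z)

def joint(x, y, z):
    p_y = P_Y_TRUE[(x, z)]
    return P_X[x] * P_Z[z] * (p_y if y else 1.0 - p_y)

def prob(pred):
    """Sum the joint over every world that satisfies the predicate."""
    return sum(joint(x, y, z)
               for x, y, z in product([True, False], repeat=3)
               if pred(x, y, z))

# Without evidence on Y, learning Z tells us nothing about X
p_x = prob(lambda x, y, z: x)
p_x_given_z = prob(lambda x, y, z: x and z) / prob(lambda x, y, z: z)

# Once Y is observed, learning Z does change our belief in X
p_x_given_y = prob(lambda x, y, z: x and y) / prob(lambda x, y, z: y)
p_x_given_yz = prob(lambda x, y, z: x and y and z) / prob(lambda x, y, z: y and z)

print(p_x, p_x_given_z)
print(p_x_given_y, p_x_given_yz)
```

The first pair of numbers agrees and the second pair does not: observing the common effect Y makes its two causes compete to explain it.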

Page 31: CS 182 Sections 103 - 104

Markov Blanket

X is independent of everything else given its Markov blanket:

Parents, Children, Parents of Children
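Reading the blanket off a graph is mechanical once the parent sets are known. A minimal sketch, assuming the DAG is stored as a dict from each node to the set of its parents; the function name `markov_blanket` is illustrative, and the example graph is the burglary network from the Bayes Nets slide.

```python
def markov_blanket(parents, node):
    """Parents, children, and the children's other parents of `node`.

    `parents` maps every node in the DAG to the set of its parents.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}

# Burglary -> Alarm <- Earthquake; Alarm -> JohnCalls; Alarm -> MaryCalls
PARENTS = {
    "Burglary": set(),
    "Earthquake": set(),
    "Alarm": {"Burglary", "Earthquake"},
    "JohnCalls": {"Alarm"},
    "MaryCalls": {"Alarm"},
}

print(markov_blanket(PARENTS, "Burglary"))
print(markov_blanket(PARENTS, "Alarm"))
```

Earthquake ends up in Burglary's blanket even though there is no edge between them, because they share the child Alarm.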

Page 32: CS 182 Sections 103 - 104

Reference: Joints

• Representation of entire network

• P(X1=x1 ∧ X2=x2 ∧ ... ∧ Xn=xn) = P(x1, ..., xn) = ∏i=1..n P(xi | parents(Xi))

• How? Chain Rule

– P(x1, ..., xn) = P(x1 | x2, ..., xn) P(x2, ..., xn) = ... = ∏i=1..n P(xi | xi-1, ..., x1)

– Now use conditional independences to simplify
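As a concrete instance of the factored joint, the sketch below multiplies one conditional per node for the burglary network. The CPT numbers are the ones on the Bayes Nets slide; the helper names `bern` and `joint` are made up for illustration.

```python
from itertools import product

# CPTs from the Bayes Nets slide (probability that each variable is True)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=T | A)

def bern(p_true, value):
    """Probability of `value` for a Boolean variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as the product of P(xi | parents(Xi))."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

# Alarm rings and both neighbours call, but no burglary and no earthquake
p_example = joint(False, False, True, True, True)
total = sum(joint(*world) for world in product([True, False], repeat=5))
print(p_example, total)
```

Five small tables (10 numbers in all) stand in for a full joint table of 2^5 = 32 entries, and the entries still sum to 1.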

Page 33: CS 182 Sections 103 - 104

Reference: Joint, cont.

P(x1, ..., x6) = P(x1)
  * P(x2 | x1)
  * P(x3 | x2, x1)
  * P(x4 | x3, x2, x1)
  * P(x5 | x4, x3, x2, x1)
  * P(x6 | x5, x4, x3, x2, x1)

[Figure: network over X1 ... X6 with edges X1 → X2, X1 → X3, X2 → X4, X3 → X5, X2 → X6, X5 → X6]

Page 34: CS 182 Sections 103 - 104

Reference: Joint, cont.

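P(x1, ..., x6) = P(x1)
  * P(x2 | x1)
  * P(x3 | x2, x1)
  * P(x4 | x3, x2, x1)
  * P(x5 | x4, x3, x2, x1)
  * P(x6 | x5, x4, x3, x2, x1)

Dropping every conditioning variable that is not a parent in the network:

P(x1, ..., x6) = P(x1) * P(x2 | x1) * P(x3 | x1) * P(x4 | x2) * P(x5 | x3) * P(x6 | x5, x2)

[Figure: network over X1 ... X6 with edges X1 → X2, X1 → X3, X2 → X4, X3 → X5, X2 → X6, X5 → X6]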

Page 35: CS 182 Sections 103 - 104

Reference: Inference

• General case

– Variable Elimination

– P(Q | E) when you have P(R, Q, E)

– P(Q | E) = ∑R P(R, Q, E) / ∑R,Q P(R, Q, E)

• ∑R P(R, Q, E) = P(Q, E)

• ∑Q P(Q, E) = P(E)

• P(Q, E) / P(E) = P(Q | E)

Page 36: CS 182 Sections 103 - 104

Inference

Toothache   Cavity   Catch   Prob
False       False    False   .576
False       False    True    .144
False       True     False   .008
False       True     True    .072
True        False    False   .064
True        False    True    .016
True        True     False   .012
True        True     True    .108

P(Toothache=T, Cavity=T)?

Page 37: CS 182 Sections 103 - 104

Inference

Toothache   Cavity   Prob
False       False    0.72
False       True     0.08
True        False    0.08
True        True     0.12
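As a worked instance of P(Q | E) = P(Q, E) / P(E) on the table above, take Q = Cavity and E = Toothache (a choice made here for illustration; the slide itself does not pose this particular query):

```latex
\begin{align*}
P(\text{Toothache}=T) &= 0.08 + 0.12 = 0.20 \\
P(\text{Cavity}=T \mid \text{Toothache}=T)
  &= \frac{P(\text{Toothache}=T,\ \text{Cavity}=T)}{P(\text{Toothache}=T)}
   = \frac{0.12}{0.20} = 0.6
\end{align*}
```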

Page 38: CS 182 Sections 103 - 104

Reference: Inference, cont.

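Q = {X1}, E = {X6}

R = X \ (Q ∪ E) = {X2, X3, X4, X5}

P(x1, ..., x6) = P(x1) * P(x2|x1) * P(x3|x1) * P(x4|x2) * P(x5|x3) * P(x6|x5, x2)

[Figure: network over X1 ... X6 with edges X1 → X2, X1 → X3, X2 → X4, X3 → X5, X2 → X6, X5 → X6]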

P(x1, x6) = ∑x2 ∑x3 ∑x4 ∑x5 P(x1) P(x2|x1) P(x3|x1) P(x4|x2) P(x5|x3) P(x6|x5, x2)

= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) ∑x4 P(x4|x2) ∑x5 P(x5|x3) P(x6|x5, x2)

= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) ∑x4 P(x4|x2) m5(x2, x3), where m5(x2, x3) = ∑x5 P(x5|x3) P(x6|x5, x2)

= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) m5(x2, x3) ∑x4 P(x4|x2) = ...
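The derivation above can be checked numerically. The sketch below uses binary variables and randomly generated CPTs (pure assumptions, since the slides give no numbers for this network), computes P(x1, x6) once by summing the full joint and once by pushing the sums inward, and the two results should agree. The names `random_cpt`, `p`, `brute_force`, and `eliminate` are illustrative.

```python
import itertools
import random

random.seed(0)
VALS = (0, 1)

def random_cpt(n_parents):
    """Random table of P(child=1 | parents); numbers are illustrative only."""
    return {ps: random.random() for ps in itertools.product(VALS, repeat=n_parents)}

def p(cpt, child, *parents):
    """Look up P(child | parents) in a table that stores P(child=1 | parents)."""
    p1 = cpt[parents]
    return p1 if child == 1 else 1.0 - p1

# One CPT per factor: P(x1) P(x2|x1) P(x3|x1) P(x4|x2) P(x5|x3) P(x6|x5,x2)
C1, C2, C3 = random_cpt(0), random_cpt(1), random_cpt(1)
C4, C5, C6 = random_cpt(1), random_cpt(1), random_cpt(2)

def joint(x1, x2, x3, x4, x5, x6):
    return (p(C1, x1) * p(C2, x2, x1) * p(C3, x3, x1) * p(C4, x4, x2)
            * p(C5, x5, x3) * p(C6, x6, x5, x2))

def brute_force(x1, x6):
    """Sum the full joint over the remaining variables x2..x5."""
    return sum(joint(x1, x2, x3, x4, x5, x6)
               for x2, x3, x4, x5 in itertools.product(VALS, repeat=4))

def eliminate(x1, x6):
    """Same quantity with the sums pushed inward, as in the derivation."""
    # m5(x2, x3) = sum over x5 of P(x5|x3) P(x6|x5, x2)
    m5 = {(x2, x3): sum(p(C5, x5, x3) * p(C6, x6, x5, x2) for x5 in VALS)
          for x2 in VALS for x3 in VALS}
    # The sum over x4 of P(x4|x2) equals 1, so x4 drops out entirely
    total = 0.0
    for x2 in VALS:
        inner = sum(p(C3, x3, x1) * m5[(x2, x3)] for x3 in VALS)
        total += p(C2, x2, x1) * inner
    return p(C1, x1) * total

for x1, x6 in itertools.product(VALS, repeat=2):
    print(x1, x6, brute_force(x1, x6), eliminate(x1, x6))
```

Dividing `eliminate(x1, x6)` by its sum over x1 then gives P(x1 | x6), as on the Reference: Inference slide.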

Page 39: CS 182 Sections 103 - 104

Approximation Methods

• Simple

– no evidence

• Rejection

– just forget about the invalids

• Likelihood Weighting

– only valid, but not necessarily useful

• MCMC

– Best: only valid, useful, in proportion
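To make the contrast between the middle two methods concrete, here is a rough sketch on a Cloudy / Sprinkler / Rain / WetGrass network, estimating P(Rain=T | WetGrass=T). The CPT numbers are textbook-style assumptions, not values from these slides, and the function names are illustrative. Rejection sampling throws away every run that contradicts the evidence; likelihood weighting keeps every run but weights it by how well it explains the evidence.

```python
import random

# Assumed CPTs (not from the slides)
P_C = 0.5
P_S = {True: 0.1, False: 0.5}                      # P(Sprinkler=T | Cloudy)
P_R = {True: 0.8, False: 0.2}                      # P(Rain=T | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(WetGrass=T | Sprinkler, Rain)

def rejection(n, rng):
    """Sample every variable, then discard runs where WetGrass is False."""
    kept = rain = 0
    for _ in range(n):
        c = rng.random() < P_C
        s = rng.random() < P_S[c]
        r = rng.random() < P_R[c]
        w = rng.random() < P_W[(s, r)]
        if w:
            kept += 1
            rain += r
    return rain / kept

def likelihood_weighting(n, rng):
    """Clamp WetGrass=True; weight each run by P(WetGrass=T | Sprinkler, Rain)."""
    total = rain = 0.0
    for _ in range(n):
        c = rng.random() < P_C
        s = rng.random() < P_S[c]
        r = rng.random() < P_R[c]
        weight = P_W[(s, r)]
        total += weight
        if r:
            rain += weight
    return rain / total

rng = random.Random(0)
est_rejection = rejection(50_000, rng)
est_weighted = likelihood_weighting(50_000, rng)
print(est_rejection, est_weighted)
```

Both estimates should land near the exact answer for these assumed numbers, about 0.708. Runs with weight 0 in `likelihood_weighting` are the "valid but not necessarily useful" samples: they are consistent with the clamped evidence by construction but contribute nothing to the estimate.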

Page 40: CS 182 Sections 103 - 104

Stochastic Simulation

Network: Cloudy → Sprinkler, Cloudy → Rain; Sprinkler → WetGrass, Rain → WetGrass

1. Repeat N times:

1.1. Guess Cloudy at random

1.2. For each guess of Cloudy, guess Sprinkler and Rain, then WetGrass

2. Compute the ratio of the # runs where WetGrass and Cloudy are True over the # runs where Cloudy is True

P(WetGrass|Cloudy)?

P(WetGrass|Cloudy) = P(WetGrass ∧ Cloudy) / P(Cloudy)
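The simulation procedure on this slide can be sketched in a few lines. The CPT numbers below are textbook-style assumptions (the slide gives only the network structure), and `simulate` is a made-up name.

```python
import random

# Assumed CPTs (not from the slides)
P_C = 0.5
P_S = {True: 0.1, False: 0.5}                      # P(Sprinkler=T | Cloudy)
P_R = {True: 0.8, False: 0.2}                      # P(Rain=T | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(WetGrass=T | Sprinkler, Rain)

def simulate(n, rng):
    """Estimate P(WetGrass | Cloudy) as a ratio of run counts."""
    cloudy_runs = wet_and_cloudy_runs = 0
    for _ in range(n):
        c = rng.random() < P_C            # 1.1 guess Cloudy at random
        s = rng.random() < P_S[c]         # 1.2 guess Sprinkler and Rain ...
        r = rng.random() < P_R[c]
        w = rng.random() < P_W[(s, r)]    # ... then WetGrass
        if c:
            cloudy_runs += 1
            if w:
                wet_and_cloudy_runs += 1
    # 2. runs with WetGrass and Cloudy over runs with Cloudy
    return wet_and_cloudy_runs / cloudy_runs

estimate = simulate(40_000, random.Random(1))
print(estimate)
```

For these assumed numbers the exact value is 0.1·0.8·0.99 + 0.1·0.2·0.90 + 0.9·0.8·0.90 = 0.7452, so the estimate should settle close to that as N grows.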