Hidden Markov Models - GitHub Pagesaritter.github.io/courses/5522_slides/hmm.pdfProbability Recap...

CS 5522: Artificial Intelligence II Hidden Markov Models

Instructor: Alan Ritter

Ohio State University [These slides were adapted from CS188 Intro to AI at UC Berkeley. All materials available at http://ai.berkeley.edu.]

Pacman – Sonar (P4)

[Demo: Pacman – Sonar – No Beliefs(L14D1)]

Video of Demo Pacman – Sonar (no beliefs)

Probability Recap

▪ Conditional probability

Probability Recap

▪ Product rule

Probability Recap

▪ Product rule

▪ Chain rule

Probability Recap

▪ Product rule

▪ Chain rule

Probability Recap

▪ Product rule

▪ Chain rule

Probability Recap

▪ Product rule

▪ Chain rule

▪ X, Y independent if and only if:

Probability Recap

▪ Product rule

▪ Chain rule

▪ X and Y are conditionally independent given Z if and only if:

Probability Recap

▪ Product rule

▪ Chain rule

Probability Recap

▪ Product rule

▪ Chain rule

Probability Recap

▪ Product rule

▪ Chain rule

Hidden Markov Models

▪ Markov chains not so useful for most agents ▪ Need observations to update your beliefs

▪ Hidden Markov models (HMMs) ▪ Underlying Markov chain over states X ▪ You observe outputs (effects) at each time step

X1 X3 X4

E2 E3 E4 E5

Example: Weather HMM

Umbrellat-1 Umbrellat Umbrellat+1

Raint-1 Raint Raint+1

▪ An HMM is defined by: ▪ Initial distribution: ▪ Transitions: ▪ Emissions:

Rt Rt+1 P(Rt+1|Rt)

+r +r 0.7

+r -r 0.3

-r +r 0.3

-r -r 0.7

Umbrellat-1 Umbrellat Umbrellat+1

Rt Rt+1 P(Rt+1|Rt)

+r +r 0.7

+r -r 0.3

-r +r 0.3

-r -r 0.7

Umbrellat-1

Rt Ut P(Ut|Rt)

+r +u 0.9

+r -u 0.1

-r +u 0.2

-r -u 0.8

Umbrellat Umbrellat+1

Example: Ghostbusters HMM

▪ P(X1) = uniform 1/9 1/9

1/9 1/9

1/9 1/9 1/9

[Demo: Ghostbusters – Circular Dynamics – HMM (L14D2)]

▪ P(X1) = uniform

▪ P(X|X’) = usually move clockwise, but sometimes move in a random direction or stay in place

1/9 1/9

1/9 1/9 1/9

▪ P(X1) = uniform

1/9 1/9

1/9 1/9 1/9

P(X|X’=<1,2>)

1/6 1/6

▪ P(X1) = uniform

▪ P(Rij|X) = same sensor model as before: red means close, green means far away.

1/9 1/9

1/9 1/9 1/9

P(X|X’=<1,2>)

1/6 1/6

X1 X3 X4

Ri,j Ri,j Ri,j

Video of Demo Ghostbusters – Circular Dynamics -- HMM

Joint Distribution of an HMM

▪ Joint distribution:

▪ More generally:

▪ Questions to be resolved: ▪ Does this indeed define a joint distribution? ▪ Can every joint distribution be factored this way, or are we making some assumptions

about the joint distribution by using this factorization?

E2 E3 E5

▪ From the chain rule, every joint distribution over can be written as:

▪ Assuming that

gives us the expression posited on the previous slide:

Chain Rule and HMMs

▪ From the chain rule, every joint distribution over can be written as:

▪ Assuming that for all t: ▪ State independent of all past states and all past evidence given the previous state, i.e.:

▪ Evidence is independent of all past states and all past evidence given the current state, i.e.:

gives us the expression posited on the earlier slide:

Implied Conditional Independencies

▪ Many implied conditional independencies, e.g.,

▪ To prove them ▪ Approach 1: follow similar (algebraic) approach to what we did in the Markov

models lecture ▪ Approach 2: directly from the graph structure (3 lectures from now)

▪ Intuition: If path between U and V goes through W, then

[Some fineprint later]

Real HMM Examples

▪ Speech recognition HMMs: ▪ Observations are acoustic signals (continuous valued) ▪ States are specific positions in specific words (so, tens of thousands)

▪ Machine translation HMMs: ▪ Observations are words (tens of thousands) ▪ States are translation options

▪ Robot tracking: ▪ Observations are range readings (continuous) ▪ States are positions on a map (continuous)

Filtering / Monitoring

▪ Filtering, or monitoring, is the task of tracking the distribution Bt(X) = Pt(Xt | e1, …, et) (the belief state) over time

▪ We start with B1(X) in an initial setting, usually uniform

▪ As time passes, or we get observations, we update B(X)

▪ The Kalman filter was invented in the 60’s and first implemented as a method of trajectory estimation for the Apollo program

Example: Robot Localization

t=0 Sensor model: can read in which directions there is a wall, never

more than 1 mistake Motion model: may not execute action with small prob.

10Prob

Example from Michael Pfeiffer

t=1 Lighter grey: was possible to get the reading, but less likely

b/c required 1 mistake

10Prob

t=1 Lighter grey: was possible to get the reading, but less likely

b/c required 1 mistake

10Prob

Inference: Base Cases

Passage of Time

▪ Assume we have current belief P(X | evidence to date)X2X1

Passage of Time

▪ Assume we have current belief P(X | evidence to date)

▪ Then, after one time step passes:

Passage of Time

▪ Or compactly:

Passage of Time

▪ Basic idea: beliefs get “pushed” through the transitions▪ With the “B” notation, we have to be careful about what time step t the belief is about, and

what evidence it includes

▪ Or compactly:

Example: Passage of Time

▪ As time passes, uncertainty “accumulates”

(Transition model: ghosts usually go clockwise)

T = 1 T = 2

T = 1 T = 2 T = 5

Observation▪ Assume we have current belief P(X | previous evidence):

▪ Then, after evidence comes in: E1

▪ Then, after evidence comes in:

▪ Or, compactly:

▪ Basic idea: beliefs “reweighted” by likelihood of evidence

▪ Unlike passage of time, we have to renormalize

Example: Observation

▪ As we get observations, beliefs get reweighted, uncertainty “decreases”

Before observation After observation

Umbrella1 Umbrella2

Rain0 Rain1 Rain2

B(+r) = 0.5 B(-r) = 0.5

B’(+r) = 0.5 B’(-r) = 0.5

B(+r) = 0.818 B(-r) = 0.182

B’(+r) = 0.627 B’(-r) = 0.373

B(+r) = 0.883 B(-r) = 0.117

Rt Rt+1 P(Rt+1|Rt)

+r +r 0.7

+r -r 0.3

-r +r 0.3

-r -r 0.7

Umbrella1 Umbrella2

Rain0 Rain1 Rain2

B(+r) = 0.5 B(-r) = 0.5

B’(+r) = 0.5 B’(-r) = 0.5

B(+r) = 0.818 B(-r) = 0.182

B’(+r) = 0.627 B’(-r) = 0.373

B(+r) = 0.883 B(-r) = 0.117

Rt Rt+1 P(Rt+1|Rt)

+r +r 0.7

+r -r 0.3

-r +r 0.3

-r -r 0.7

Rt Ut P(Ut|Rt)

+r +u 0.9

+r -u 0.1

-r +u 0.2

-r -u 0.8

Umbrella1 Umbrella2

Rain0 Rain1 Rain2

B(+r) = 0.5 B(-r) = 0.5

B’(+r) = 0.5 B’(-r) = 0.5

B(+r) = 0.818 B(-r) = 0.182

B’(+r) = 0.627 B’(-r) = 0.373

B(+r) = 0.883 B(-r) = 0.117

The Forward Algorithm▪ We are given evidence at each time and want to know

▪ We can derive the following updatesWe can normalize as we go if we want to have P(x|e) at each time

step, or just once at the end…

Online Belief Updates

▪ Every time step, we start with current P(X | evidence) ▪ We update for time:

▪ We update for evidence:

▪ The forward algorithm does both at once (and doesn’t normalize)

Pacman – Sonar (P4)

[Demo: Pacman – Sonar – No Beliefs(L14D1)]

Video of Demo Pacman – Sonar (with beliefs)

Next Time: Particle Filtering and Applications of HMMs

Hidden Markov Models - GitHub Pagesaritter.github.io/courses/5522_slides/hmm.pdfProbability Recap...

Documents

Rule Mate - Xylem Inc. · Manual de instrucciones de Rule Mate Bedienungsanleitung für Rule Mate Handleiding Rule Mate Rule Mate INSTRUCTION MANUAL Manuel d’utilisation de Rule

(Slides from Mausam) - GitHub Pagesaritter.github.io/courses/slides/mdp.pdf · Markov Decision Processes (Slides from Mausam) Markov Decision Process Operations ... Communication

informed search - GitHub Pagesaritter.github.io/courses/5522_slides/informed_search.pdf · Informed Search Heuristics Greedy Search A* Search Graph Search. Recap: Search ... strategies

Rules of Procedure · Rules of Procedure Rule 1.04 Rule 2.02 Rule 2.05 Rule 4.01 Rule 4.04 Rule 4.05 Rule 5.01 Rule 5.02 Rule 5.03 . 2 Article 6 - Incidental Measures 6.01 Referral

3.4 Chain Rule. Prerequisite Knowledge Product Rule Quotient Rule Power Rule Special Derivatives:

Lecture 15: Summariza0on - GitHub Pagesaritter.github.io/courses/5525_slides_v2/lec15-summ.pdf · 2020. 9. 25. · Lecture 15: Summariza0on Alan Ri6er (many slides from Greg Durrett)

Reinforcement Learningaritter.github.io/courses/5522_slides/rl1.pdf · 2021. 8. 9. · Reinforcement Learning Basic idea: Receive feedback in the form of rewards Agent’s utility

PROCUREMENT RULE (RULE N0.002)

· rule 8 8.10 rule 9 rule 10 10.10 10.11. 10.12. 10.13. rule 11 rule 12 rule 13 13.10 13.12 13.14 judicial vacation judicial vacation omitted attorneys of record

Kernels and Kernelized Perceptron - GitHub Pagesaritter.github.io/courses/5523_slides/kernels.pdf · Kernels and Kernelized Perceptron Instructor: Alan Ritter Many Slides from Carlos

Recurrent Neural Networks - GitHub Pagesaritter.github.io/courses/5523_slides/rnn.pdf · 2019-12-06 · [On the difficulty of training Recurrent Neural Networks, Pascanu et al., 2013]

1/25 Warm Up- Monday Product Rule Power Rule Quotient Rule Negative Rule Power of 1 Rule Zero Power Rule

Bayes’ Netsaritter.github.io/courses/5522_slides/bn.pdfGhostbusters Chain Rule Each sensor depends only on where the ghost is That means, the two sensors are conditionally independent,

Frequently Asked Questions about Rule 144 and Rule … ASKED QUESTIONS ABOUT RULE 144 AND RULE 145 Understanding Rule 144 under the Securities Act of 1933 What is Rule 144? Rule 144

Language Modeling - GitHub Pagesaritter.github.io/courses/5525_slides/language models.pdf · Probabilistic Language Models •Today’s goal: assign a probability to a sentence •Machine

Lecture 8: RNNs - GitHub Pagesaritter.github.io/courses/5525_slides_v2/lec8-nn3.pdf‣ Feedforward NNs can’t handle variable length input: each posiFon in the feature vector has

relation extraction - GitHub Pagesaritter.github.io/courses/5525_slides/relation_extraction.pdf · Why Relation Extraction? • Create new structured knowledge bases, useful for any

apps-dso.sws.iastate.edu€¦ · · 2016-10-19MATH 151 Sl Name Rule Constant Power Rule Sum Rule Difference Rule Quotient Rule Rule Exponential Rules Logarithmic Rules Exam 2 Overview

Getting to Accountability · 2018-02-15 · Rule 3 Rule 5 Macau –Personal Data Protection Act 8/2005 –Personal Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 South Korea –Personal Data

images-se-ed.com11 ( Rule 71 ) 12 ( Rule 72) 13 ( Rule 73 ) ( Rule 74) 15 ( Rule 75) 16 ( Rule 76) ( Rule 77 ) 18 ( Rule 78) 19 ( Rule 79) 20 ( Rule 80) The Art and Power of Ignoring