EE462 MLCV Lecture 11-12: Introduction of Graphical Models, Markov Random Fields, Segmentation. Tae-Kyun Kim.





Intro: Graphical Models

• A probabilistic graphical model is a diagrammatic representation of probability distributions.

• Advantages:

1. It provides a simple way to visualise the structure of a probabilistic model and can be used to design and motivate new models.

2. Insights into the properties of the model, including conditional independence properties, can be obtained.

3. Complex computations of inference and learning can be expressed in terms of graphical manipulations.


• A graph comprises nodes (or vertices) and links (or edges), where each node represents a random variable and each link represents a probabilistic relationship between these variables.

• The graph captures the joint distribution of all random variables.

• Bayesian networks are directed graphical models.

• Markov random fields are undirected graphical models.

• Inference (exact or approximate) is the process of computing posterior probabilities of one or more nodes.


Bayesian Networks

• Consider three variables a, b, c. The joint distribution can be written, by the product rule, as

p(a, b, c) = p(c | a, b) p(b | a) p(a)

This can be represented by a directed graphical model with links from a to b, from a to c, and from b to c.

The node a is called the parent of node b, and b is the child of node a.
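For concreteness, here is a small numeric sketch of this factorization in Python, assuming three binary variables and arbitrary (hypothetical) probability tables:

```python
import numpy as np

# Hypothetical probability tables for binary a, b, c.
p_a = np.array([0.7, 0.3])                          # p(a)
p_b_given_a = np.array([[0.8, 0.2],                 # p(b | a): rows index a
                        [0.4, 0.6]])
p_c_given_ab = np.array([[[0.5, 0.5], [0.9, 0.1]],  # p(c | a, b): indexed [a, b, c]
                         [[0.3, 0.7], [0.6, 0.4]]])

# Joint distribution p(a, b, c) = p(c | a, b) p(b | a) p(a), indexed [a, b, c].
joint = p_a[:, None, None] * p_b_given_a[:, :, None] * p_c_given_ab

print(joint.sum())  # sums to 1, as any probability distribution must
```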



• Consider K variables x1, ..., xK whose joint distribution is given, by repeated application of the product rule, as

p(x1, ..., xK) = p(xK | x1, ..., xK−1) · · · p(x2 | x1) p(x1)

The graphical model of this formulation has a link between every pair of nodes, and is thus fully connected.


• Now consider the case in which some of the links are absent. The joint distribution is then given as a product of conditional distributions, one for each node conditioned only on its parents in the graph.


The joint distribution for a graph with K nodes can be written in the form

p(x) = ∏_k p(xk | pak)

where pak denotes the set of parents of xk, and x = {x1, ..., xK}.

The equation expresses the factorization properties of the joint distribution for a directed graphical model.
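A minimal sketch of evaluating this factorization for discrete variables follows; the node names, parent sets, and table values are hypothetical, chosen only to illustrate the product over parents:

```python
def joint_probability(assignment, nodes):
    """Evaluate p(x) = prod_k p(x_k | pa_k) for one full assignment.

    assignment: {name: value}; nodes: list of (name, parents, cpt), where cpt
    maps a tuple of parent values to a dict {value: probability}.
    """
    p = 1.0
    for name, parents, cpt in nodes:
        parent_values = tuple(assignment[pa] for pa in parents)
        p *= cpt[parent_values][assignment[name]]
    return p

# Hypothetical DAG: x1 has no parents, x2 and x3 depend on x1, x4 on x2 and x3.
nodes = [
    ("x1", [], {(): {0: 0.6, 1: 0.4}}),
    ("x2", ["x1"], {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}}),
    ("x3", ["x1"], {(0,): {0: 0.5, 1: 0.5}, (1,): {0: 0.9, 1: 0.1}}),
    ("x4", ["x2", "x3"], {(0, 0): {0: 0.8, 1: 0.2}, (0, 1): {0: 0.4, 1: 0.6},
                          (1, 0): {0: 0.3, 1: 0.7}, (1, 1): {0: 0.1, 1: 0.9}}),
]
print(joint_probability({"x1": 1, "x2": 0, "x3": 1, "x4": 1}, nodes))  # 0.0048
```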


Examples: Bayesian Polynomial Regression (discriminative), Gaussian Mixture Models (generative)


Polynomial curve fitting (recap)


Bayesian polynomial regression (see Lecture 15-16)

• The Bayesian polynomial regression model is represented by a directed graphical model in which the coefficient vector w is a parent of each target tn.

• A more compact representation of the same model uses a plate (the box labelled N) that represents N nodes of tn.


The nodes {tn} (and also xn) are shaded to indicate that the corresponding random variables are set to their observed (training set) values.



Gaussian Mixture Model (GMM)

• Generative (unsupervised) vs discriminative (supervised) models:

In the graphical model of the GMM, the observed variable xn receives incoming arrows, i.e. it is generated from its latent component assignment, so the model is generative (see the sampling sketch below).

The regression model maps the input xn to the target tn; the arrow therefore points from xn to tn, making the model discriminative.
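To make the generative reading concrete, the following sketch draws samples from a GMM by ancestral sampling (first the latent component, then the observation given that component); the mixture parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.3, 0.2])                         # mixing coefficients
mu = np.array([[0.0, 0.0], [3.0, 3.0], [-3.0, 2.0]])   # component means
sigma = np.array([0.5, 1.0, 0.7])                      # isotropic std devs

def sample_gmm(n):
    """Ancestral sampling: draw the latent component z_n, then x_n given z_n."""
    z = rng.choice(len(pi), size=n, p=pi)              # latent assignment (parent)
    x = mu[z] + sigma[z, None] * rng.standard_normal((n, 2))  # x_n receives the arrow
    return x, z

x, z = sample_gmm(1000)
```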


Conditional Independence

• The variable a is conditionally independent of b given c if

p(a | b, c) = p(a | c)

or, equivalently,

p(a, b | c) = p(a | c) p(b | c)

The variables a, b are then statistically independent given c. This is denoted a ⊥ b | c.


• See below for an example in which c is a common parent of a and b. The joint distribution in the graph is given as

p(a, b, c) = p(a | c) p(b | c) p(c)

If none of the variables are observed, marginalising over c gives

p(a, b) = Σ_c p(a | c) p(b | c) p(c)

which, in general, does not factorize into p(a)p(b), and so a and b are not independent.


Suppose instead we condition on the variable c. We can write down

p(a, b | c) = p(a, b, c) / p(c) = p(a | c) p(b | c)

so a and b are conditionally independent given c: a ⊥ b | c.


• The example of a Markov chain a → c → b is given below. The joint distribution is given as

p(a, b, c) = p(a) p(c | a) p(b | c)

Marginalising over c gives

p(a, b) = p(a) Σ_c p(c | a) p(b | c)

which, in general, is not factorized into p(a)p(b), and so a and b are not independent.


Suppose we condition on the variable c. We can write down

p(a, b | c) = p(a, b, c) / p(c) = p(a) p(c | a) p(b | c) / p(c) = p(a | c) p(b | c)

using Bayes' theorem p(c | a) p(a) = p(a | c) p(c). So we get the conditional independence property a ⊥ b | c.
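A small numerical check of these two properties, with arbitrary (hypothetical) conditional probability tables for a binary chain a → c → b:

```python
import numpy as np

p_a = np.array([0.6, 0.4])                       # p(a)
p_c_given_a = np.array([[0.9, 0.1],              # p(c | a): rows index a
                        [0.2, 0.8]])
p_b_given_c = np.array([[0.7, 0.3],              # p(b | c): rows index c
                        [0.1, 0.9]])

# Joint p(a, c, b) = p(a) p(c|a) p(b|c), indexed [a, c, b].
joint = p_a[:, None, None] * p_c_given_a[:, :, None] * p_b_given_c[None, :, :]

# Marginal p(a, b): a and b are NOT independent in general.
p_ab = joint.sum(axis=1)
p_a_marg, p_b_marg = p_ab.sum(axis=1), p_ab.sum(axis=0)
print(np.allclose(p_ab, np.outer(p_a_marg, p_b_marg)))   # False

# Conditioning on c: p(a, b | c) factorizes into p(a|c) p(b|c).
p_c = joint.sum(axis=(0, 2))
p_ab_given_c = joint / p_c[None, :, None]        # indexed [a, c, b]
p_a_given_c = p_ab_given_c.sum(axis=2)           # p(a | c), indexed [a, c]
p_b_given_c_check = p_ab_given_c.sum(axis=0)     # p(b | c), indexed [c, b]
factorized = p_a_given_c[:, :, None] * p_b_given_c_check[None, :, :]
print(np.allclose(p_ab_given_c, factorized))     # True
```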


Such conditional independence properties lead to graph separation and factorization, which in turn simplify inference.


Markov Random Fields

• An example of an undirected graph in which every path from any node in set A to any node in set B passes through at least one node in set C. Consequently the conditional independence property A ⊥ B | C holds.


• A clique is a subset of the nodes in a graph such that there exists a link between all pairs of nodes in the subset.

A maximal clique is a clique such that it is not possible to include any other nodes without it ceasing to be a clique.

The example graph in the lecture contains two maximal cliques.
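For illustration, maximal cliques can also be enumerated programmatically; the sketch below assumes the networkx library and an arbitrary four-node graph that may or may not match the lecture's figure:

```python
import networkx as nx

# Hypothetical undirected graph over four nodes.
G = nx.Graph()
G.add_edges_from([("x1", "x2"), ("x1", "x3"), ("x2", "x3"),
                  ("x2", "x4"), ("x3", "x4")])

# nx.find_cliques enumerates the maximal cliques of an undirected graph.
for clique in nx.find_cliques(G):
    print(sorted(clique))
# e.g. ['x1', 'x2', 'x3'] and ['x2', 'x3', 'x4']
```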


Let us denote a maximal clique by C and the set of variables in the clique by xC.

The joint distribution is given as a product of potential functions ΨC(xC) over the maximal cliques of the graph:

p(x) = (1/Z) ∏_C ΨC(xC)

where Z, the normalisation constant (partition function), is given by Z = Σ_x ∏_C ΨC(xC). In contrast to directed graphs, we do not restrict the choice of potential functions to marginal or conditional probability distributions.
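A minimal sketch of this construction for a tiny binary MRF with two pairwise cliques; the potential values are illustrative, not from the lecture:

```python
import itertools
import numpy as np

def psi(xi, xj, coupling=2.0):
    """Pairwise potential favouring equal neighbouring states (xi, xj in {-1, +1})."""
    return np.exp(coupling * xi * xj)

states = [-1, +1]

# Unnormalised joint over (x1, x2, x3): product of the clique potentials
# for the two maximal cliques {x1, x2} and {x2, x3}.
unnorm = {x: psi(x[0], x[1]) * psi(x[1], x[2])
          for x in itertools.product(states, repeat=3)}

# Partition function Z sums the unnormalised joint over all configurations.
Z = sum(unnorm.values())

# Normalised joint distribution p(x) = (1/Z) * prod_C Psi_C(x_C).
p = {x: v / Z for x, v in unnorm.items()}
print(max(p, key=p.get), max(p.values()))  # most probable configuration
```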


We define strictly positive potential functions ΨC(xC), e.g. of the exponential form

ΨC(xC) = exp{−E(xC)}

where E(xC) is called an energy function.


Markov Random Fields for Image De-noising



• Let the observed noisy image be described by binary pixel values yi ∈ {−1, +1}, where i runs over the pixels.

The noisy image is obtained by taking an unknown noise-free image, described by pixel values xi ∈ {−1, +1}, and corrupting it by flipping the sign of each pixel with probability 10%. Our goal is to recover the original noise-free image.


• The undirected graphical model of the MRF for image de-noising:

xi is a binary variable denoting the state of pixel i in the unknown noise-free image.

yi denotes the corresponding value of pixel i in the observed noisy image.


The energy function to minimise for the model is

E(x, y) = h Σ_i xi − β Σ_{i,j} xi xj − η Σ_i xi yi

where the sum over {i, j} runs over pairs of neighbouring pixels xi, xj. The first term is a bias (or prior) term, the second favours neighbouring pixels having the same sign, and the third favours xi agreeing with its observation yi.

The joint distribution over x and y is given by

p(x, y) = (1/Z) exp{−E(x, y)}

*The de-noising example shown in the lecture was obtained with β = 1.0, η = 2.1 and h = 0, which corresponds to equal prior probabilities for the two states of xi.
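A minimal sketch of this energy function, assuming numpy and images stored as 2-D arrays with entries in {−1, +1}; the parameter names follow the text:

```python
import numpy as np

def energy(x, y, h=0.0, beta=1.0, eta=2.1):
    """E(x, y) = h*sum_i x_i - beta*sum_{i,j} x_i x_j - eta*sum_i x_i y_i."""
    # Pairwise term: each horizontal and vertical neighbour pair counted once.
    pair = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])
    return h * np.sum(x) - beta * pair - eta * np.sum(x * y)
```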


The solution: Iterated Conditional Modes (ICM)

• Initialise the variables by xi = yi for all i.

• Iterate the following until convergence: for each i, take one node xi at a time and evaluate the total energy for the two possible states xi = +1 and xi = −1, keeping all other node variables fixed; set xi to whichever state gives the lower energy.
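A minimal ICM sketch for this model, assuming numpy and the parameter values quoted above (β = 1.0, η = 2.1, h = 0); only the local contribution of a single pixel needs to be evaluated when deciding its state, since all other terms stay fixed:

```python
import numpy as np

def icm_denoise(y, h=0.0, beta=1.0, eta=2.1, max_sweeps=10):
    """Greedy ICM for the binary de-noising MRF; y has entries in {-1, +1}."""
    x = y.copy()                                    # initialise x_i = y_i
    H, W = x.shape
    for _ in range(max_sweeps):
        changed = False
        for i in range(H):
            for j in range(W):
                # Sum over the 4-connected neighbours of pixel (i, j).
                nb = 0.0
                if i > 0:     nb += x[i - 1, j]
                if i < H - 1: nb += x[i + 1, j]
                if j > 0:     nb += x[i, j - 1]
                if j < W - 1: nb += x[i, j + 1]
                # Local energy of setting x_ij = s: h*s - beta*s*nb - eta*s*y_ij.
                e_plus = h - beta * nb - eta * y[i, j]
                e_minus = -h + beta * nb + eta * y[i, j]
                best = 1 if e_plus < e_minus else -1
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:                             # converged: no pixel changed
            break
    return x
```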