CONTEXT DEPENDENT CLASSIFICATION


1

CONTEXT DEPENDENT CLASSIFICATION

Remember: Bayes rule

Here: The class to which a feature vector belongs depends on:

– Its own value
– The values of the other features
– An existing relation among the various classes

$x \rightarrow \omega_i: \; P(\omega_i|x) > P(\omega_j|x), \quad \forall j \neq i$

2

This interrelation demands that the classification be performed simultaneously for all available feature vectors

Thus, we will assume that the training vectors $x_1, x_2, \dots, x_N$ occur in sequence, one after the other, and we will refer to them as observations

3

The Context Dependent Bayesian Classifier

Let $X: \{x_1, x_2, \dots, x_N\}$ be the sequence of observations

Let $\omega_i, \; i = 1, 2, \dots, M$ be the classes

Let $\Omega_i$ be a sequence of classes, that is $\Omega_i: \; \omega_{i_1}, \omega_{i_2}, \dots, \omega_{i_N}$

There are $M^N$ of those. Thus, the Bayesian rule can equivalently be stated as

$X \rightarrow \Omega_i: \; P(\Omega_i|X) > P(\Omega_j|X) \quad \forall j \neq i, \;\; i, j = 1, 2, \dots, M^N$

Markov Chain Models (for class dependence)

$P(\omega_{i_k}|\omega_{i_{k-1}}, \omega_{i_{k-2}}, \dots, \omega_{i_1}) = P(\omega_{i_k}|\omega_{i_{k-1}})$

4

NOW remember:

$P(\Omega_i) = P(\omega_{i_N}|\omega_{i_{N-1}}, \dots, \omega_{i_1})\, P(\omega_{i_{N-1}}|\omega_{i_{N-2}}, \dots, \omega_{i_1}) \cdots P(\omega_{i_1})$

or, using the Markov model above,

$P(\Omega_i) = P(\omega_{i_1}) \prod_{k=2}^{N} P(\omega_{i_k}|\omega_{i_{k-1}})$

Assume the observations $x_k$ to be statistically mutually independent, and the pdf in one class independent of the others; then

$p(X|\Omega_i) = \prod_{k=1}^{N} p(x_k|\omega_{i_k})$

5

From the above, the Bayes rule is readily seen to be equivalent to

$P(\Omega_i|X) > P(\Omega_j|X) \iff P(\Omega_i)\,p(X|\Omega_i) > P(\Omega_j)\,p(X|\Omega_j)$

that is, it rests on

$p(X|\Omega_i)\,P(\Omega_i) = P(\omega_{i_1})\,p(x_1|\omega_{i_1}) \prod_{k=2}^{N} P(\omega_{i_k}|\omega_{i_{k-1}})\,p(x_k|\omega_{i_k})$

To find the above maximum by brute force we need $O(NM^N)$ operations!!
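To make the cost of the brute-force search concrete, here is a minimal Python sketch that scores one candidate class sequence exactly as in the product above and then enumerates all $M^N$ sequences. The Gaussian class-conditional pdfs and all names are illustrative assumptions, not part of the slides.

```python
from itertools import product
from scipy.stats import norm  # Gaussian class-conditional pdfs: an illustrative assumption

def sequence_score(X, Omega, prior, trans, means, stds):
    """p(X|Omega)P(Omega) = P(w_{i1}) p(x_1|w_{i1}) * prod_k P(w_{ik}|w_{ik-1}) p(x_k|w_{ik}).
    X: observations, Omega: class indices, prior: P(w_i),
    trans[j, i] = P(w_i|w_j), means/stds: per-class Gaussian parameters."""
    score = prior[Omega[0]] * norm.pdf(X[0], means[Omega[0]], stds[Omega[0]])
    for k in range(1, len(X)):
        score *= trans[Omega[k - 1]][Omega[k]] * norm.pdf(X[k], means[Omega[k]], stds[Omega[k]])
    return score

def brute_force_classify(X, M, prior, trans, means, stds):
    # Exhaustive search over all M**N class sequences: O(N M^N) work,
    # which is exactly what the Viterbi algorithm (next slides) avoids.
    return max(product(range(M), repeat=len(X)),
               key=lambda Om: sequence_score(X, Om, prior, trans, means, stds))
```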

6

The Viterbi Algorithm

7

Thus, each Ωi corresponds to one path through the trellis diagram. One of them is the optimum (e.g., the black one). The classes along the optimal path determine the classes to which the observations $x_k$ are assigned.

To each transition corresponds a cost. For our case

$\hat{d}(\omega_{i_k}, \omega_{i_{k-1}}) = P(\omega_{i_k}|\omega_{i_{k-1}})\,p(x_k|\omega_{i_k})$

$\hat{d}(\omega_{i_1}, \omega_{i_0}) = P(\omega_{i_1})\,p(x_1|\omega_{i_1})$

$\hat{D} = \prod_{k=1}^{N} \hat{d}(\omega_{i_k}, \omega_{i_{k-1}}) = p(X|\Omega_i)\,P(\Omega_i)$

8

• Equivalently

$D = \ln\hat{D} = \sum_{k=1}^{N} \ln\hat{d}(\cdot,\cdot) = \sum_{k=1}^{N} d(\cdot,\cdot)$

where

$d(\omega_{i_k}, \omega_{i_{k-1}}) = \ln\hat{d}(\omega_{i_k}, \omega_{i_{k-1}})$

• Define the cost up to a node, k,

$D(\omega_{i_k}) = \sum_{r=1}^{k} d(\omega_{i_r}, \omega_{i_{r-1}})$

9

Bellman’s principle now states

$D_{\max}(\omega_{i_k}) = \max_{i_{k-1}}\big[D_{\max}(\omega_{i_{k-1}}) + d(\omega_{i_k}, \omega_{i_{k-1}})\big], \quad i_k = 1, 2, \dots, M$

with $D_{\max}(\omega_{i_0}) = 0$

The optimal path terminates at $\omega_{i_N}^{*}$:

$\omega_{i_N}^{*} = \arg\max_{\omega_{i_N}} D_{\max}(\omega_{i_N})$

• Complexity $O(NM^2)$
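A minimal sketch of this recursion in Python, working with the log-costs $d(\cdot,\cdot)$ defined above. The array layout (log_trans[j, i] = ln P(ω_i|ω_j)) and the function name are assumptions of the sketch, not notation fixed by the slides.

```python
import numpy as np

def viterbi(log_prior, log_trans, log_emis):
    """Maximise D = sum_k d(w_{ik}, w_{ik-1}) with d = ln P(w_{ik}|w_{ik-1}) + ln p(x_k|w_{ik}).
    log_prior: (M,) ln P(w_i); log_trans: (M, M) with [j, i] = ln P(w_i|w_j);
    log_emis: (N, M) ln p(x_k|w_i). Returns the optimal class sequence (list of indices)."""
    N, M = log_emis.shape
    D = log_prior + log_emis[0]                 # D_max(w_{i1}) = ln P(w_{i1}) + ln p(x_1|w_{i1})
    backptr = np.zeros((N, M), dtype=int)
    for k in range(1, N):
        cand = D[:, None] + log_trans           # D_max(w_{ik-1}) + ln P(w_{ik}|w_{ik-1})
        backptr[k] = np.argmax(cand, axis=0)    # best predecessor for every current class
        D = cand[backptr[k], np.arange(M)] + log_emis[k]
    path = [int(np.argmax(D))]                  # optimal path terminates at argmax D_max(w_{iN})
    for k in range(N - 1, 0, -1):
        path.append(int(backptr[k, path[-1]]))  # backtrack along the stored predecessors
    return path[::-1]                           # O(M^2) work per step, O(N M^2) overall
```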

10

Channel Equalization

The problem

$x_k = f(I_k, I_{k-1}, \dots, I_{k-n+1}) + \eta_k$

$\boldsymbol{x}_k = [x_k, x_{k-1}, \dots, x_{k-l+1}]^T$

$\boldsymbol{x}_k \xrightarrow{\text{equalizer}} \hat{I}_k \text{ or } \hat{I}_{k-r}$

11

Example

$x_k = 0.5\,I_k + I_{k-1} + \eta_k$

$\boldsymbol{x}_k = [x_k, x_{k-1}]^T, \quad l = 2$

• In $\boldsymbol{x}_k$ three input symbols are involved: $I_k$, $I_{k-1}$, $I_{k-2}$

12

Ik   Ik-1   Ik-2   xk    xk-1   class
0    0      0      0     0      ω1
0    0      1      0     1      ω2
0    1      0      1     0.5    ω3
0    1      1      1     1.5    ω4
1    0      0      0.5   0      ω5
1    0      1      0.5   1      ω6
1    1      0      1.5   0.5    ω7
1    1      1      1.5   1.5    ω8
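The centers in the table are simply the noise-free channel outputs for each input triple; the short sketch below (function name illustrative) reproduces them from $x_k = 0.5\,I_k + I_{k-1}$.

```python
from itertools import product

def cluster_centers():
    """Noise-free centers of x_k = [x_k, x_k-1]^T for x_k = 0.5*I_k + I_{k-1} (+ noise)."""
    centers = {}
    for idx, (Ik, Ikm1, Ikm2) in enumerate(product([0, 1], repeat=3), start=1):
        xk   = 0.5 * Ik   + Ikm1      # noiseless channel output at time k
        xkm1 = 0.5 * Ikm1 + Ikm2      # noiseless channel output at time k-1
        centers[f"omega_{idx}"] = ((Ik, Ikm1, Ikm2), (xk, xkm1))
    return centers                    # reproduces the eight rows of the table above
```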

13

Not all transitions are allowed

• Then, for example, the current triple

$(I_k, I_{k-1}, I_{k-2}) = (0, 0, 1)$

can only be followed by

$(I_{k+1}, I_k, I_{k-1}) = (1, 0, 0)$ or $(0, 0, 0)$

$P(\omega_i|\omega_2) = \begin{cases} 0.5, & i = 1, 5 \\ 0, & \text{otherwise} \end{cases}$

14

• In this context, ωi are related to states. Given the current state and the transmitted bit, Ik, we determine the next state. The probabilities P(ωi|ωj) define the state dependence model.

The transition cost for all allowable transitions is

$d(\omega_{i_k}, \omega_{i_{k-1}}) = d_{i_k}(\boldsymbol{x}_k)$

$d_{i_k}(\boldsymbol{x}_k) = (\boldsymbol{x}_k - \boldsymbol{\mu}_{i_k})^T (\boldsymbol{x}_k - \boldsymbol{\mu}_{i_k})$

with $\boldsymbol{\mu}_{i_k}$ the center of the cluster (state) $\omega_{i_k}$ from the table above.

15

Assume:
• Noise white and Gaussian
• A channel impulse response estimate $\hat{f}$ to be available

• The states are determined by the values of the binary variables $I_{k-1}, \dots, I_{k-n+1}$

For n = 3, there will be 4 states

$x_k = \hat{f}(I_k, \dots, I_{k-n+1}) + \eta_k$

$d(\omega_{i_k}, \omega_{i_{k-1}}) = \ln p(x_k|\omega_{i_k}) = \ln p(\eta_k) \propto -\big(x_k - \hat{f}(I_k, \dots, I_{k-n+1})\big)^2$
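A hedged sketch of such a transition cost: with white Gaussian noise of (assumed) unit variance, the log-emission term reduces, up to an additive constant, to the negative squared distance from the noiseless state center, and forbidden transitions can be given cost −∞ so the Viterbi recursion never picks them.

```python
import numpy as np

def transition_cost(x_k, center, allowed=True):
    """Log-domain transition cost for the equalizer trellis (sketch).
    center: the noiseless output of the destination state (e.g., a row of the table);
    unit noise variance is assumed, so constants are dropped."""
    if not allowed:                        # P(w_i|w_j) = 0: the path can never use this edge
        return -np.inf
    diff = np.asarray(x_k, dtype=float) - np.asarray(center, dtype=float)
    return -0.5 * float(diff @ diff)       # ln p(x_k|w_{ik}) up to an additive constant
```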

16

Hidden Markov Models

In the channel equalization problem, the states are observable and can be “learned” during the training period

Now we shall assume that states are not observable and can only be inferred from the training data

Applications:
• Speech and Music Recognition
• OCR
• Blind Equalization
• Bioinformatics

17

An HMM is a stochastic finite state automaton that generates the observation sequence x1, x2, …, xN

We assume that: The observation sequence is produced as a result of successive transitions between states; upon arrival at a state, an observation is emitted according to the probability distribution associated with that state.

18

This type of modeling is used for nonstationary stochastic processes that undergo distinct transitions among a set of different stationary processes.

19

Examples of HMM:
• The single coin case: Assume a coin that is tossed behind a curtain. All that is available to us is the outcome, i.e., H or T. Assume the two states to be:

S = 1 : H
S = 2 : T

This is also an example of a random experiment with observable states. The model is characterized by a single parameter, e.g., P(H). Note that

P(1|1) = P(H)
P(2|1) = P(T) = 1 – P(H)

20

• The two-coins case: For this case, we observe a sequence of H or T. However, we have no way of knowing which coin was tossed. Identify one state for each coin. This is an example where states are not observable. H or T can be emitted from either state. The model depends on four parameters.

P1(H), P2(H),

P(1|1), P(2|2)

21

• The three-coins case example is built in the same way, with one state identified for each coin.

• Note that in all previous examples, specifying the model is equivalent to knowing:

– The probability of each observation (H,T) to be emitted from each state.

– The transition probabilities among states: P(i|j).

22

A general HMM model is characterized by the following set of parameters

• K, the number of states
• $p(x|i), \; i = 1, 2, \dots, K$, the emission pdfs (one per state)
• $P(i|j), \; i, j = 1, 2, \dots, K$, the state transition probabilities
• $P(i), \; i = 1, 2, \dots, K$, the initial state probabilities $P(\cdot)$

23

That is:

$S \equiv \{P(i|j),\; p(x|i),\; P(i),\; K\}$

What is the problem in Pattern Recognition?
• Given M reference patterns, each described by an HMM, find the parameters, S, for each of them (training)
• Given an unknown pattern, find which one of the M known patterns it matches best (recognition)
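One possible way to hold the parameter set S in code; the container below is a sketch whose field names and row-stochastic layout (trans[j, i] = P(i|j)) are choices of this example, not notation fixed by the slides.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HMM:
    """S = {P(i|j), p(x|i), P(i), K}: one such model per reference pattern."""
    trans: np.ndarray   # (K, K), trans[j, i] = P(i|j); each row sums to 1
    emis:  np.ndarray   # (K, r), emis[i, x] = P(x|i) for discrete symbols x in {0, ..., r-1}
    init:  np.ndarray   # (K,),  init[i] = P(i), the initial state probabilities

    @property
    def K(self) -> int:
        return self.init.shape[0]
```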

24

Recognition: Any path method
• Assume the M models to be known (M classes).
• A sequence of observations, X, is given.
• Assume observations to be emissions upon arrival at successive states.
• Decide in favor of the model S* (from the M available) according to the Bayes rule

$S^{*} = \arg\max_{S} P(S|X)$

or, for equiprobable patterns,

$S^{*} = \arg\max_{S} p(X|S)$

25

• For each model S there is more than one possible set of successive state transitions $\Omega_i$, each with probability $P(\Omega_i|S)$

Thus:

$p(X|S) = \sum_{i} p(X, \Omega_i|S) = \sum_{i} p(X|\Omega_i, S)\,P(\Omega_i|S)$

• For the efficient computation of the above, DEFINE

$\alpha(i_{k+1}) \equiv p(x_1, \dots, x_{k+1}, i_{k+1}\,|\,S)$

$\alpha(i_{k+1}) = \underbrace{\sum_{i_k}\alpha(i_k)\,P(i_{k+1}|i_k)}_{\text{history}}\;\underbrace{p(x_{k+1}|i_{k+1})}_{\text{local activity}}$

26

• Observe that

$P(X|S) = \sum_{i_N = 1}^{K} \alpha(i_N)$

Compute this for each S
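A sketch of this any-path scoring via the forward (α) recursion, reusing the illustrative HMM container above; X is assumed to be a sequence of discrete symbol indices.

```python
import numpy as np

def forward(hmm, X):
    """alpha(i_1) = P(i_1) p(x_1|i_1);
    alpha(i_{k+1}) = [sum_{i_k} alpha(i_k) P(i_{k+1}|i_k)] p(x_{k+1}|i_{k+1}).
    Returns the alpha table and P(X|S) = sum_{i_N} alpha(i_N)."""
    N = len(X)
    alpha = np.zeros((N, hmm.K))
    alpha[0] = hmm.init * hmm.emis[:, X[0]]
    for k in range(1, N):
        alpha[k] = (alpha[k - 1] @ hmm.trans) * hmm.emis[:, X[k]]
    return alpha, alpha[-1].sum()

# Recognition, any-path method: pick S* = argmax_S p(X|S) over the M trained models, e.g.
#   best_model = max(models, key=lambda S: forward(S, X)[1])
```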

27

• Some more quantities

$\beta(i_k) \equiv p(x_{k+1}, x_{k+2}, \dots, x_N\,|\,i_k, S)$

$\beta(i_k) = \sum_{i_{k+1}} \beta(i_{k+1})\, P(i_{k+1}|i_k)\, p(x_{k+1}|i_{k+1})$

$\alpha(i_k)\,\beta(i_k) = p(x_1, \dots, x_N, i_k\,|\,S) = p(X, i_k\,|\,S)$
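The backward (β) recursion can be sketched in the same style, initialised with β(i_N) = 1 for every state (again an illustrative implementation, not code from the slides).

```python
import numpy as np

def backward(hmm, X):
    """beta(i_N) = 1;  beta(i_k) = sum_{i_{k+1}} beta(i_{k+1}) P(i_{k+1}|i_k) p(x_{k+1}|i_{k+1})."""
    N = len(X)
    beta = np.ones((N, hmm.K))
    for k in range(N - 2, -1, -1):
        beta[k] = hmm.trans @ (hmm.emis[:, X[k + 1]] * beta[k + 1])
    return beta
```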

28

Training
• The philosophy: Given a training set X, known to belong to the specific model, estimate the unknown parameters of S, so that the output of the model, e.g.,

$p(X|S) = \sum_{i_N = 1}^{K} \alpha(i_N)$

is maximized.

This is a ML estimation problem with missing data.

29

Assumption: Data x discrete

$x \in \{1, 2, \dots, r\}, \qquad p(x|i) \equiv P(x|i)$

Definitions:

$\xi_k(i, j) \equiv \dfrac{\alpha(i_k = i)\, P(j|i)\, P(x_{k+1}|j)\, \beta(i_{k+1} = j)}{P(X|S)}$

$\gamma_k(i) \equiv \dfrac{\alpha(i_k = i)\, \beta(i_k = i)}{P(X|S)}$
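Given the α and β tables from the earlier forward/backward sketches, ξ_k(i, j) and γ_k(i) follow directly; the sketch below assumes those arrays and the same state-index conventions.

```python
import numpy as np

def xi_gamma(hmm, X, alpha, beta):
    """xi[k, i, j] = alpha(i_k=i) P(j|i) P(x_{k+1}|j) beta(i_{k+1}=j) / P(X|S)
       gamma[k, i] = alpha(i_k=i) beta(i_k=i) / P(X|S)"""
    PX = alpha[-1].sum()                       # P(X|S)
    N = len(X)
    xi = np.zeros((N - 1, hmm.K, hmm.K))
    for k in range(N - 1):
        xi[k] = (alpha[k][:, None] * hmm.trans
                 * (hmm.emis[:, X[k + 1]] * beta[k + 1])[None, :]) / PX
    gamma = alpha * beta / PX
    return xi, gamma
```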

30

The Algorithm:
• Initial conditions for all the unknown parameters.
• Step 1: Compute $P(X|S)$.
• Step 2: From the current estimates of the model parameters reestimate the new model $\bar{S}$ from

$\bar{P}(i|j) = \dfrac{\sum_{k=1}^{N-1} \xi_k(j, i)}{\sum_{k=1}^{N-1} \gamma_k(j)} \qquad \left(\dfrac{\#\text{ of transitions from } j \text{ to } i}{\#\text{ of transitions from } j}\right)$

$\bar{P}(r|i) = \dfrac{\sum_{k=1,\; x_k = r}^{N} \gamma_k(i)}{\sum_{k=1}^{N} \gamma_k(i)} \qquad \left(\dfrac{\#\text{ of times at state } i \text{ with } x \text{ being } r}{\#\text{ of times at state } i}\right)$

$\bar{P}(i) = \gamma_1(i)$
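A sketch of the reestimation formulas above for a single training sequence, using the ξ and γ arrays from the previous sketch and the illustrative HMM container; in practice the counts are accumulated over many training sequences.

```python
import numpy as np

def reestimate(hmm, X, xi, gamma):
    """One Baum-Welch update of the illustrative HMM container defined earlier."""
    trans = xi.sum(axis=0)                                  # expected # of transitions i -> j
    trans /= gamma[:-1].sum(axis=0)[:, None]                # / expected # of transitions out of i
    emis = np.zeros_like(hmm.emis, dtype=float)
    X = np.asarray(X)
    for r in range(hmm.emis.shape[1]):
        emis[:, r] = gamma[X == r].sum(axis=0)              # time spent at state i while observing r
    emis /= gamma.sum(axis=0)[:, None]                      # / total time spent at state i
    init = gamma[0]                                         # P(i) <- gamma_1(i)
    return HMM(trans=trans, emis=emis, init=init)
```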

31

• Step 3: Compute $P(X|\bar{S})$. If $P(X|\bar{S}) - P(X|S) > \varepsilon$, set $S = \bar{S}$ and go to step 2. Otherwise stop.

• Remarks:
– Each iteration improves the model: $P(X|\bar{S}) \geq P(X|S)$
– The algorithm converges to a maximum (local or global)
– The algorithm is an implementation of the EM algorithm
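Putting the sketches together, the overall iteration (steps 2 and 3 above, with an illustrative tolerance ε and iteration cap) might look like this:

```python
def baum_welch(hmm, X, eps=1e-6, max_iter=100):
    """Iterate until the likelihood gain drops below eps.
    Each pass cannot decrease the likelihood: P(X|S_bar) >= P(X|S) (EM property)."""
    alpha, PX = forward(hmm, X)
    for _ in range(max_iter):
        beta = backward(hmm, X)
        xi, gamma = xi_gamma(hmm, X, alpha, beta)
        hmm_bar = reestimate(hmm, X, xi, gamma)     # step 2: reestimate the new model S_bar
        alpha, PX_bar = forward(hmm_bar, X)         # step 3: compute P(X|S_bar)
        if PX_bar - PX <= eps:                      # converged: stop
            return hmm_bar
        hmm, PX = hmm_bar, PX_bar                   # set S = S_bar and go back to step 2
    return hmm
```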
