Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines Yu Nishiyama and Sumio Watanabe Tokyo Institute of Technology, Japan


Page 1

Asymptotic Behavior of Stochastic Complexity of Complete Bipartite Graph-Type Boltzmann Machines

Yu Nishiyama and Sumio Watanabe

Tokyo Institute of Technology, Japan

Page 2

Background

Learning machines: mixture models, hidden Markov models, Bayesian networks

Information systems: pattern recognition, natural language processing, gene analysis

Mathematically, these learning machines are singular statistical models.

Bayes learning is effective.

Page 3

Problem: Calculations that involve a Bayes posterior require huge computational cost.

Mean field approximation: a Bayes posterior is approximated by a trial distribution.

Stochastic complexity is relevant to:

Accuracy of approximation
Model selection
Difference from regular statistical models

Page 4

The asymptotic behavior of mean field stochastic complexity has been studied for the following models.

Mixture models [K. Watanabe et al., 2004]
Reduced rank regressions [Nakajima et al., 2005]
Hidden Markov models [Hosino et al., 2005]
Stochastic context-free grammars [Hosino et al., 2005]
Neural networks [Nakano et al., 2005]

Page 5

Purpose

We derive an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.

Boltzmann machines: graphical models, spin systems

Page 6

Table of Contents

Review
  Bayes Learning
  Mean Field Approximation
  Boltzmann Machines (Complete Bipartite Graph-type)
Main Theorem
Outline of the Proof
Discussion and Conclusion

Page 7

Bayes Learning

Training data $X^n = (X_1, X_2, \ldots, X_n)$ are drawn from the true distribution $q(x)$. The learning machine consists of a model $p(x \mid \omega)$ and a prior $\varphi(\omega)$.

Bayes posterior:

$$p(\omega \mid X^n) = \frac{1}{Z(X^n)} \varphi(\omega) \prod_{i=1}^{n} p(X_i \mid \omega)$$

Bayes predictive distribution:

$$p(x \mid X^n) = \int p(x \mid \omega)\, p(\omega \mid X^n)\, d\omega$$
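As a minimal numerical sketch of the Bayes posterior and the Bayes predictive distribution, the following Python code evaluates both on a parameter grid. The one-parameter model exp(w x) / (2 cosh w) on x in {-1, +1}, the standard normal prior, the grid, and the sample values are all assumptions chosen for illustration; none of them comes from the slides.

```python
import math

def model(x, w):
    """Illustrative model p(x | w) for x in {-1, +1} (an assumption)."""
    return math.exp(w * x) / (2.0 * math.cosh(w))

def prior(w):
    """Standard normal prior density (an assumption)."""
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)

def bayes_posterior(samples, grid):
    """Posterior density on a uniform grid: prior * likelihood / Z(X^n)."""
    step = grid[1] - grid[0]
    unnormalized = [prior(w) * math.prod(model(x, w) for x in samples)
                    for w in grid]
    z = sum(unnormalized) * step  # Riemann-sum approximation of Z(X^n)
    return [u / z for u in unnormalized], z

def bayes_predictive(x, posterior, grid):
    """Predictive p(x | X^n): the model averaged over the posterior."""
    step = grid[1] - grid[0]
    return sum(model(x, w) * p for w, p in zip(grid, posterior)) * step

grid = [-6.0 + 0.01 * k for k in range(1201)]
samples = [1, 1, -1, 1, 1, 1, -1, 1]  # hypothetical training data
posterior, z = bayes_posterior(samples, grid)
pred_plus = bayes_predictive(+1, posterior, grid)
pred_minus = bayes_predictive(-1, posterior, grid)
```

The integral over the parameter is the expensive step. A one-dimensional grid is cheap, but the number of grid points grows exponentially with the number of parameters, which is the computational cost that motivates the mean field approximation.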

Page 8

Mean Field Approximation (1)

The Bayes posterior can be rewritten as

$$p(\omega \mid X^n) = \frac{1}{Z(X^n)} \varphi(\omega) \prod_{i=1}^{n} p(X_i \mid \omega) = \frac{1}{Z(X^n)} \exp\{-n \tilde{H}_n(\omega)\}.$$

We consider the Kullback distance from a trial distribution $f(\omega)$ to the Bayes posterior $p(\omega \mid X^n)$:

$$D[f(\omega) \,\|\, p(\omega \mid X^n)] = \int f(\omega) \log \frac{f(\omega)}{\exp\{-n \tilde{H}_n(\omega)\} / Z(X^n)}\, d\omega \ge 0.$$

Hence

$$-\log Z(X^n) \le \int f(\omega) \log f(\omega)\, d\omega + n \int f(\omega) \tilde{H}_n(\omega)\, d\omega.$$
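The inequality obtained from the non-negativity of the Kullback distance can be checked numerically. The sketch below assumes that exp(-n H~_n(w)) stands for the unnormalized posterior (prior times likelihood), and it reuses an illustrative one-parameter model, a standard normal prior, and hypothetical samples; all function names are assumptions. It evaluates the right-hand side for several normal trial distributions and for the exact posterior, where equality should hold.

```python
import math

def log_unnormalized_posterior(w, samples):
    """log(prior(w) * prod_i p(X_i | w)), read here as -n * H~_n(w)."""
    log_prior = -0.5 * w * w - 0.5 * math.log(2.0 * math.pi)
    log_lik = sum(w * x - math.log(2.0 * math.cosh(w)) for x in samples)
    return log_prior + log_lik

def neg_log_z(grid, samples):
    """-log Z(X^n), with Z approximated by a Riemann sum on the grid."""
    step = grid[1] - grid[0]
    z = sum(math.exp(log_unnormalized_posterior(w, samples)) for w in grid) * step
    return -math.log(z)

def variational_objective(f, grid, samples):
    """Riemann sum of f log f + n f H~_n for a density f given on the grid."""
    step = grid[1] - grid[0]
    return sum(fw * (math.log(fw) - log_unnormalized_posterior(w, samples)) * step
               for w, fw in zip(grid, f) if fw > 0.0)

def normal_trial(grid, mean, std):
    """Normal-shaped trial density, normalized on the grid."""
    step = grid[1] - grid[0]
    dens = [math.exp(-0.5 * ((w - mean) / std) ** 2) for w in grid]
    norm = sum(dens) * step
    return [d / norm for d in dens]

def exact_posterior(grid, samples):
    """Posterior density on the grid (the minimizer of the objective)."""
    z = math.exp(-neg_log_z(grid, samples))
    return [math.exp(log_unnormalized_posterior(w, samples)) / z for w in grid]

grid = [-6.0 + 0.01 * k for k in range(1201)]
samples = [1, 1, -1, 1, 1, 1, -1, 1]  # hypothetical training data
lower = neg_log_z(grid, samples)
trial_values = [variational_objective(normal_trial(grid, m, s), grid, samples)
                for m in (-1.0, 0.0, 0.5) for s in (0.3, 1.0)]
posterior_value = variational_objective(exact_posterior(grid, samples), grid, samples)
```

Every entry of `trial_values` should be at least `lower`, and `posterior_value` should match `lower` up to rounding, because the Kullback distance vanishes only when the trial distribution equals the posterior.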

Page 9

Mean Field Approximation (2)

$$-E_{X^n}[\log Z(X^n)] \le E_{X^n}\left[ \int f(\omega) \log f(\omega)\, d\omega + n \int f(\omega) \tilde{H}_n(\omega)\, d\omega \right]$$

When we restrict the trial distribution $f(\omega)$ to

$$f(\omega) = \prod_{i=1}^{d} f_i(\omega_i),$$

the $f(\omega)$ which minimizes the right-hand side is called the mean field approximation. The minimum value

$$F(n) = E_{X^n}\left[ \min_{f} \left\{ \int f(\omega) \log f(\omega)\, d\omega + n \int f(\omega) \tilde{H}_n(\omega)\, d\omega \right\} \right]$$

is called the mean field stochastic complexity.

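Page 10

Complete Bipartite Graph-type Boltzmann Machines

Hidden units: $\{y_i\}_{i=1}^{K}$ ($K$ units)
Input and output units: $\{x_j\}_{j=1}^{M}$ ($M$ units)
Weights: $w = \{w_{ij}\} \in \mathbb{R}^{KM}$, where $w_{ij}$ connects $y_i$ and $x_j$
Each unit takes values in $\{-1, 1\}$.

The parametric model is

$$p(x \mid w) = \frac{\sum_{y} \exp\left(\sum_{i=1}^{K}\sum_{j=1}^{M} w_{ij} x_j y_i\right)}{\sum_{x}\sum_{y} \exp\left(\sum_{i=1}^{K}\sum_{j=1}^{M} w_{ij} x_j y_i\right)} = \frac{\prod_{i=1}^{K} \sum_{y_i} \exp\left(\sum_{j=1}^{M} w_{ij} x_j y_i\right)}{Z(w)} = \frac{\prod_{i=1}^{K} \cosh\left(\sum_{j=1}^{M} w_{ij} x_j\right)}{Z(w)},$$

where constant factors are absorbed into the normalizing constant $Z(w)$.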
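The step from the sum over hidden states to the product of cosh terms can be verified by brute force for small machines. The sketch below is only an illustration: the function names and the example weights are assumptions. It uses the identity exp(a) + exp(-a) = 2 cosh(a) for each hidden unit, so the two unnormalized forms agree exactly once the factor 2 per hidden unit is kept.

```python
import itertools
import math

def unnormalized_by_sum(x, w):
    """Sum over all hidden states y in {-1, +1}^K of exp(sum_ij w[i][j] x[j] y[i])."""
    K = len(w)
    total = 0.0
    for y in itertools.product((-1, 1), repeat=K):
        energy = sum(w[i][j] * x[j] * y[i]
                     for i in range(K) for j in range(len(x)))
        total += math.exp(energy)
    return total

def unnormalized_by_cosh(x, w):
    """Closed form: prod_i 2 cosh(sum_j w[i][j] x[j])."""
    return math.prod(2.0 * math.cosh(sum(wij * xj for wij, xj in zip(row, x)))
                     for row in w)

def marginal(w, M):
    """p(x | w) for every x in {-1, +1}^M, normalized over x."""
    states = list(itertools.product((-1, 1), repeat=M))
    weights = [unnormalized_by_cosh(x, w) for x in states]
    z = sum(weights)
    return {x: wt / z for x, wt in zip(states, weights)}

# Hypothetical weights for K = 2 hidden units and M = 3 input and output units.
w_example = [[0.3, -0.7, 0.2],
             [1.1, 0.4, -0.5]]
p_example = marginal(w_example, 3)
```

The brute-force sum costs 2^K terms per input pattern, while the cosh form costs K terms, which is why the product form is the convenient one for analysis.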

Page 11

True Distribution

We assume that the true distribution is included in the parametric model $p(x \mid w)$ and that its number of hidden units is $K^{*}$ ($K^{*} \le K$). The hidden units $y_1, \ldots, y_{K^{*}}$ are connected to the input and output units, and the remaining units $y_{K^{*}+1}, \ldots, y_K$ have $w_{ij} = 0$.

The true distribution is

$$p(x \mid w^{*}) = \frac{\prod_{i=1}^{K^{*}} \cosh\left(\sum_{j=1}^{M} w^{*}_{ij} x_j\right)}{Z(w^{*})}.$$

Page 12

Main Theorem

The mean field stochastic complexity of complete bipartite graph-type Boltzmann machines has the following upper bound.

$$F(n) \le \frac{K M + K^{*} M}{4} \log n + C$$

$M$: the number of input and output units
$K$: the number of hidden units (learning machine)
$K^{*}$: the number of hidden units (true distribution)
$C$: constant
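As a small worked example, the sketch below compares log n coefficients, reading the bound in the theorem as ((K + K*) M / 4) log n + C, where K* is a notation assumed here for the number of hidden units of the true distribution. The regular-model coefficient K M / 2 is the one quoted in the Discussion. The function names and the example sizes are hypothetical.

```python
from fractions import Fraction

def mean_field_coefficient(K, K_true, M):
    """Coefficient of log n in the upper bound, read as (K + K*) M / 4."""
    return Fraction((K + K_true) * M, 4)

def regular_coefficient(K, M):
    """Coefficient of log n for a regular statistical model: K M / 2."""
    return Fraction(K * M, 2)

# Hypothetical sizes: 5 hidden units in the learner, 2 in the true distribution.
K, K_true, M = 5, 2, 10
bound = mean_field_coefficient(K, K_true, M)
regular = regular_coefficient(K, M)
```

Under this reading, the two coefficients coincide when K* equals K, and the mean field bound becomes smaller as the learner has more redundant hidden units.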

Page 13

Outline of the Proof (Methods)

$$F(n) \le \int \tilde{f}(w) \log \tilde{f}(w)\, dw + n \int \tilde{f}(w) \tilde{H}(w)\, dw$$

Prior:

$$\varphi(w) = \frac{1}{(2\pi)^{KM/2}} \exp\left\{ -\frac{1}{2} \sum_{i=1}^{K}\sum_{j=1}^{M} (w_{ij} - \hat{w}_{ij})^2 \right\}$$

Trial distribution (normal distribution family):

$$\tilde{f}(w) = \prod_{i=1}^{K}\prod_{j=1}^{M} \frac{1}{Z(N_{ij})} \exp\left\{ -N_{ij} (w_{ij} - \hat{w}_{ij})^2 \right\}$$

$N_{ij}$ depends on the Boltzmann machine.

Page 14

Outline of the Proof [lemma]

For the Kullback information $H(\omega)$ of a parameter $\omega \in \mathbb{R}^d$, if there exists a value $\hat{\omega}$ such that $H(\hat{\omega}) = 0$ and the number of elements of the set

$$\left\{ i \,;\, \frac{\partial^2 H(\hat{\omega})}{\partial \omega_i^2} \ne 0 \right\}$$

is less than or equal to $r$, the mean field stochastic complexity has the following upper bound.

$$F(n) \le \frac{d + r}{4} \log n + O(1)$$

Hessian matrix at $\hat{\omega}$: of its $d$ diagonal elements, at most $r$ are non-zero and the rest are 0.

Page 15

We apply this lemma to the Boltzmann machines.

Kullback information is given by

$$H(w) = \sum_{x} \frac{\prod_{i=1}^{K^{*}} \cosh\left(\sum_{j=1}^{M} w^{*}_{ij} x_j\right)}{Z(w^{*})} \log \frac{\prod_{i=1}^{K^{*}} \cosh\left(\sum_{j=1}^{M} w^{*}_{ij} x_j\right) / Z(w^{*})}{\prod_{i=1}^{K} \cosh\left(\sum_{j=1}^{M} w_{ij} x_j\right) / Z(w)}.$$

The second order differential is

$$\left. \frac{\partial^2 H(w)}{\partial w_{ij}^2} \right|_{w = \hat{w}} = \langle t_{ij}^2 \rangle_{\hat{w}} - \langle t_{ij} \rangle_{\hat{w}}^2.$$

Here

$$t_{ij} = x_j \tanh\left( \sum_{j'=1}^{M} w_{ij'} x_{j'} \right), \qquad \langle f(x \mid w) \rangle_{\hat{w}} = \sum_{x} p(x \mid \hat{w})\, f(x \mid w).$$

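Page 16

The parameter $\hat{w}$ defined by

$$\hat{w}_{ij} = \begin{cases} w^{*}_{ij} & \text{for } i \in \{1, \ldots, K^{*}\} \\ 0 & \text{for } i \in \{K^{*}+1, \ldots, K\} \end{cases}$$

is a true parameter.

Then, for $i \in \{K^{*}+1, \ldots, K\}$, $t_{ij}$ becomes

$$t_{ij} = x_j \tanh\left( \sum_{j'=1}^{M} \hat{w}_{ij'} x_{j'} \right) = x_j \tanh(0) = 0,$$

and

$$\left. \frac{\partial^2 H(w)}{\partial w_{ij}^2} \right|_{w = \hat{w}} = \langle t_{ij}^2 \rangle_{\hat{w}} - \langle t_{ij} \rangle_{\hat{w}}^2 = 0 \quad \text{for } i \in \{K^{*}+1, \ldots, K\}.$$

Then $r = K^{*} M$ and $d = K M$ hold: of the $KM$ diagonal elements of the Hessian matrix, at most $K^{*} M$ are non-zero and the rest are 0.

By using the lemma, we have

$$F(n) \le \frac{K M + K^{*} M}{4} \log n + C.$$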
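The vanishing second derivatives for the redundant hidden units can be checked numerically on a tiny machine. The sketch below is a hedged illustration under stated assumptions: one true hidden unit, two hidden units in the learner, M = 2, and hypothetical weights. It writes t_ij = x_j tanh(sum_k w_ik x_k), computes the variance of t_ij under p(x | w_hat), and compares it with a central finite difference of the Kullback information H(w) at w_hat. All function names are assumptions.

```python
import itertools
import math

def marginal(w, M):
    """States x in {-1, +1}^M and p(x | w), proportional to prod_i cosh(sum_j w_ij x_j)."""
    states = list(itertools.product((-1, 1), repeat=M))
    weights = [math.prod(math.cosh(sum(a * b for a, b in zip(row, x))) for row in w)
               for x in states]
    z = sum(weights)
    return states, [wt / z for wt in weights]

def kullback(w_true, w, M):
    """H(w) = sum_x q(x) log(q(x) / p(x | w)) with q(x) = p(x | w_true)."""
    _, q = marginal(w_true, M)
    _, p = marginal(w, M)
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

def variance_of_t(w_hat, i, j, M):
    """<t^2> - <t>^2 under p(x | w_hat), with t = x_j tanh(sum_k w_hat[i][k] x_k)."""
    states, p = marginal(w_hat, M)
    t = [x[j] * math.tanh(sum(a * b for a, b in zip(w_hat[i], x))) for x in states]
    mean = sum(pi * ti for pi, ti in zip(p, t))
    mean_sq = sum(pi * ti * ti for pi, ti in zip(p, t))
    return mean_sq - mean * mean

def second_derivative(w_true, w_hat, i, j, M, eps=1e-3):
    """Central finite difference of H along the coordinate w_ij at w_hat."""
    def shifted(delta):
        w = [row[:] for row in w_hat]
        w[i][j] += delta
        return kullback(w_true, w, M)
    return (shifted(eps) - 2.0 * kullback(w_true, w_hat, M) + shifted(-eps)) / eps ** 2

M = 2
w_true = [[0.8, -0.5]]             # hypothetical true weights, one hidden unit
w_hat = [[0.8, -0.5], [0.0, 0.0]]  # learner with one redundant hidden unit
active = [variance_of_t(w_hat, 0, j, M) for j in range(M)]
redundant = [variance_of_t(w_hat, 1, j, M) for j in range(M)]
```

In this example the entries of `redundant` are zero because tanh(0) = 0, so only the coordinates of the first hidden unit can contribute non-zero diagonal elements, matching the count of at most K* M non-zero elements out of K M.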

Page 17

Discussion

Comparison with other studies

[Figure: stochastic complexity versus the number of training data $n$, in the asymptotic area.]

Regular statistical model: $\frac{K M}{2} \log n + C$
Bayes learning: upper bound derived by algebraic geometry [Yamazaki]
Mean field approximation (derived result): upper bound $\frac{K M + K^{*} M}{4} \log n + C$

Page 18

Conclusion

We derived an upper bound of the mean field stochastic complexity of complete bipartite graph-type Boltzmann machines.

Future works

Lower bound
Comparison with experimental results