

A simple hierarchical infinite HMM with efficient inference

Ardavan Saeedi¹, Asma Ghandeharioun², Matt Hoffman³

¹ Computer Science and Artificial Intelligence Laboratory, MIT
² Media Lab, MIT
³ Adobe Research

ABSTRACT

We propose a simple hierarchical infinite HMM (iHMM) model, an extension of the iHMM with an efficient inference scheme. The model captures the dynamics of a sequence at two timescales and avoids the implementation and time-complexity problems of related models. We use the model to analyze the two-timescale dynamics of synthetic and real physiological data, and we show that it performs reasonably well compared to a baseline on two physiological datasets.

MOTIVATION

o Time series with multiple timescales appear in domains with hierarchical structure; for instance, natural language, handwriting, and motion recognition.

o The iHMM and its variants have been successfully applied to:
   - Speech recognition
   - Document modeling
   - Biology
   - Corporate bond rating

o Their application to time series with multiple timescales has been limited, mainly because of:
   - Inefficient inference
   - Complex implementation

CONTRIBUTION

o Defining a Bayesian nonparametric model for time series with dynamics at two timescales

o Proposing an efficient stochastic variational inference scheme

o Applying the model to physiological data and performing reasonably well compared to a baseline

REFERENCES

Katherine A. Heller, Yee W. Teh, and Dilan Görür. Infinite hierarchical hidden Markov models. In International Conference on Artificial Intelligence and Statistics, pages 224–231, 2009.

Matthew Johnson and Alan Willsky. Stochastic variational inference for Bayesian time series models. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 1854–1862, 2014.

Thomas S. Stepleton, Zoubin Ghahramani, Geoffrey J. Gordon, and Tai S. Lee. The block diagonal infinite hidden Markov model. In International Conference on Artificial Intelligence and Statistics, pages 552–559, 2009.

MODEL

A generalization of the iHMM in which the transition probability is a mixture of:

1. a state-dependent transition probability distribution, which resembles the transition probability in an iHMM

2. a state-independent probability distribution

The mixture component is sampled from a Bernoulli distribution with a parameter that depends on both the hidden state and the observation.

The generative description

1. Generate the transition probability matrix according to the generative process of the iHMM.

2. At time step t, given a hidden state, generate an observation from a conditional observation distribution.

3. Sample a segmentation variable from a Bernoulli distribution with a parameter that depends on both the hidden state and the observation.

4. Conditioned on the segmentation variable, either sample the next state from a state-dependent distribution, or ignore the current state and sample from a distribution $\pi_0$.

Notation:

$z_t$ : hidden state at time t
$y_t$ : observation at time t
$s_t$ : segmentation variable at time t
$H$ : prior distribution over $\phi$
$F(\phi_{z_t})$ : the observation distribution
$\phi_{z_t}$ : the parameter corresponding to $z_t$
$\mathrm{GEM}(\gamma)$ : the stick-breaking distribution with parameter $\gamma$
$\alpha$ : parameter of the DP
$\omega_y$ : observation feature weight
$\omega_z$ : hidden state feature weight
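The four steps above can be sketched as a sampler. This is only an illustration under stated assumptions, not the authors' code: the infinite model is truncated to K states, the observation distribution F is taken to be a unit-variance Gaussian with means drawn from an assumed prior H, and the function name and default hyperparameters are hypothetical.

```python
import numpy as np

def sample_sequence(T, K=5, alpha=5.0, gamma=3.0, w_y=1.0, w_z=0.0, seed=0):
    """Draw (z, s, y) from a K-truncated version of the generative description."""
    rng = np.random.default_rng(seed)
    # Truncated stick-breaking draw of beta ~ GEM(gamma); the last stick takes the rest.
    v = rng.beta(1.0, gamma, size=K)
    v[-1] = 1.0
    beta = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    # Row 0 is the state-independent pi_0; rows 1..K are the state-dependent pi_i.
    # The small jitter keeps the Dirichlet parameters strictly positive.
    pi = rng.dirichlet(alpha * beta + 1e-6, size=K + 1)
    mu = rng.normal(0.0, 3.0, size=K)  # phi_i ~ H (assumed: Gaussian means)
    z = np.empty(T, dtype=int)
    s = np.empty(T, dtype=int)
    y = np.empty(T)
    z[0] = rng.choice(K, p=pi[0])  # z_1 ~ pi_0
    for t in range(T):
        y[t] = rng.normal(mu[z[t]], 1.0)  # y_t | z_t ~ F(phi_{z_t})
        p_seg = 1.0 / (1.0 + np.exp(-(w_y * y[t] + w_z * z[t])))
        s[t] = rng.random() < p_seg  # s_t ~ Bern(sigma(w_y y_t + w_z z_t))
        if t + 1 < T:
            # s_t = 1 uses the state-dependent row, s_t = 0 falls back to pi_0.
            row = pi[z[t] + 1] if s[t] == 1 else pi[0]
            z[t + 1] = rng.choice(K, p=row)
    return z, s, y
```

Calling `sample_sequence(1000)` returns integer arrays `z` and `s` and a float array `y`, each of length 1000.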

A SIMPLE ILLUSTRATION ON TOY DATASET

o A toy dataset with 15000 data points generated from three different transition matrices, each with two hidden states.

o The goal is to find the points where the sequence switches from one regime to another, and also the dynamics within each segment.

[Figure: true segments and inferred segments on the toy dataset.]
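A dataset of this kind could be generated as follows. The poster gives only the total length, the number of regimes, and the number of hidden states; the equal regime lengths, the self-transition probabilities, and the Gaussian emissions below are assumptions made for the sketch.

```python
import numpy as np

def make_toy_data(n_total=15000, stay_probs=(0.99, 0.9, 0.5), seed=0):
    """Concatenate regimes, each a 2-state Markov chain with its own transition matrix."""
    rng = np.random.default_rng(seed)
    seg_len = n_total // len(stay_probs)  # assumed: equal-length regimes
    means = np.array([-1.0, 1.0])         # assumed: emission mean of each hidden state
    z, regime = [], []
    state = 0
    for r, p in enumerate(stay_probs):
        A = np.array([[p, 1.0 - p], [1.0 - p, p]])  # transition matrix of regime r
        for _ in range(seg_len):
            state = int(rng.random() < A[state, 1])  # move to state 1 with prob A[state, 1]
            z.append(state)
            regime.append(r)
    z = np.array(z)
    regime = np.array(regime)
    y = rng.normal(means[z], 0.5)
    change_points = np.flatnonzero(np.diff(regime)) + 1  # first index of each new regime
    return y, z, regime, change_points
```

With the defaults, the true change points are at indices 5000 and 10000, which is what a segmentation method would be asked to recover.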

STOCHASTIC VARIATIONAL INFERENCE (SVI)

o Truncated SVI is used for inference; the posterior is approximated with a mean-field family of distributions.

o We maximize the marginal likelihood lower bound

$$\mathcal{L} \triangleq \mathbb{E}_q\!\left[\ln \frac{p(z, s, \beta, \omega, \pi, \phi, y)}{q(z, s)\, q(\beta)\, q(\omega)\, q(\pi)\, q(\phi)}\right]$$

by using stochastic natural gradient ascent over the global factors and standard mean-field updates for the local factors (z and s).

o A minibatch of M sequences is used for updating the local factors.

o Global factors are updated by taking a step of size $\rho$ in the approximate natural gradient direction.
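For a conjugate exponential-family global factor, a natural-gradient step of size rho amounts to a convex combination of the current parameter and a minibatch-based estimate of the optimal one. The sketch below shows that step in isolation. The argument `scale` stands in for the factor m in the poster's update equations; treating it as a rescaling of minibatch statistics to the full dataset is an assumption, and the decaying step-size schedule is a common choice that the poster does not specify.

```python
import numpy as np

def svi_global_step(param, prior, minibatch_stats, scale, rho):
    """One stochastic natural-gradient step for a conjugate global factor."""
    target = prior + scale * minibatch_stats  # noisy estimate of the optimal parameter
    return (1.0 - rho) * param + rho * target

def step_size(iteration, tau=1.0, kappa=0.7):
    """Assumed schedule rho_t = (t + tau)^(-kappa), with 0.5 < kappa <= 1."""
    return (iteration + tau) ** (-kappa)
```

With `rho = 1` the parameter jumps straight to the minibatch estimate, and with `rho = 0` it does not move; intermediate values average over minibatches.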

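Variational factors

o "Direct assignment" truncation with truncation level K (Johnson & Willsky 2014) for z and s: $q(z_{1:T}, s_{1:T}) = 0$ if for any $z_t$ in $z_1$ to $z_T$ we have $z_t = k$ and $k > K$.

o Point estimates for $\beta$, $\omega_y$ and $\omega_z$:

$$q(\beta) = \delta_{\beta^*}(\beta), \qquad q(\omega_y) = \delta_{\omega_y^*}(\omega_y), \qquad q(\omega_z) = \delta_{\omega_z^*}(\omega_z)$$

o For $\pi_i$, we assume the prior is

$$p\big((\pi_{i1}, \ldots, \pi_{iK}, \pi_{i,\mathrm{rest}})\big) = \mathrm{Dir}(\alpha\beta_1, \ldots, \alpha\beta_K, \alpha\beta_{\mathrm{rest}}),$$

where $\pi_{i,\mathrm{rest}} = 1 - \sum_{k=1}^{K} \pi_{ik}$ and $\beta_{\mathrm{rest}} = 1 - \sum_{k=1}^{K} \beta_k$. Due to conjugacy, the optimal variational factor has the form $\mathrm{Dir}(\tilde{\alpha}_i)$ with parameter $\tilde{\alpha}_i$.

o For $\phi$, we assume the prior is in the exponential family and conjugate to the likelihood function $f(y_t \mid \phi)$. Hence,

$$q(\phi_i) \propto \exp\{\langle \tilde{\eta}_i, t_\phi(\phi_i) \rangle\}$$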
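As a small illustration of the truncation, the Dirichlet prior parameter for a row of the transition matrix can be assembled from the first K stick weights plus the leftover mass. The function names are hypothetical, and the state labels are taken to be 1-based so that the condition k > K reads as in the text.

```python
import numpy as np

def truncated_dirichlet_prior(beta_K, alpha):
    """Return alpha * (beta_1, ..., beta_K, beta_rest), with beta_rest = 1 - sum_k beta_k."""
    beta_K = np.asarray(beta_K, dtype=float)
    beta_rest = 1.0 - beta_K.sum()
    if beta_rest < 0.0:
        raise ValueError("the first K stick weights must sum to at most 1")
    return alpha * np.append(beta_K, beta_rest)

def respects_truncation(z, K):
    """True if no state label in z exceeds K, i.e. q(z, s) is allowed to be nonzero."""
    return bool(np.all(np.asarray(z) <= K))
```

For example, `truncated_dirichlet_prior([0.5, 0.3, 0.1], alpha=2.0)` yields a length-4 vector whose entries sum to alpha.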

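SVI update equations

o For the expectations with respect to $q(z_{1:T}, s_{1:T})$, we use the forward and backward messages:

$$F(z_t, s_t) \triangleq f(y_t \mid \phi_{z_t})\, p(s_t \mid z_t, y_t) \sum_{z_{t-1}, s_{t-1}} F(z_{t-1}, s_{t-1})\, p(z_t \mid s_{t-1}, z_{t-1});$$

$$B(z_t, s_t) \triangleq \sum_{z_{t+1}, s_{t+1}} B(z_{t+1}, s_{t+1})\, f(y_{t+1} \mid \phi_{z_{t+1}})\, p(s_{t+1} \mid z_{t+1}, y_{t+1})\, p(z_{t+1} \mid s_t, z_t)$$

o The update for the parameters of the global variational factors:

$$\tilde{\eta}_i \leftarrow (1 - \rho)\, \tilde{\eta}_i + \rho\, (\eta_i + m\, \tilde{t}^{\,i}_{y})$$

$$\tilde{\alpha}_i \leftarrow (1 - \rho)\, \tilde{\alpha}_i + \rho\, (\alpha_i + m\, \tilde{t}^{\,i}_{\mathrm{trans}})$$

$$\tilde{\alpha}_0 \leftarrow (1 - \rho)\, \tilde{\alpha}_0 + \rho\, (\alpha_0 + m\, \tilde{t}^{\,0}_{\mathrm{trans}}).$$

where $\tilde{t}^{\,i}_{\mathrm{trans}}$ and $\tilde{t}^{\,i}_{y}$ are expected sufficient statistics with respect to $q(z_{1:T}, s_{1:T})$.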
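The forward and backward messages F and B over the joint state (z_t, s_t) can be computed as in the sketch below. It is not the authors' implementation: the likelihoods, segmentation probabilities, and transition rows are passed in as plain arrays (in the variational setting they would be built from expectations under the global factors), the base case uses z_1 ~ pi_0, and the per-step rescaling is added only for numerical stability.

```python
import numpy as np

def forward_backward(lik, p_seg, pi0, pi):
    """Posterior marginals over (z_t, s_t) for one sequence.

    lik[t, k]   : f(y_t | phi_k)
    p_seg[t, k] : p(s_t = 1 | z_t = k, y_t)
    pi0[k]      : state-independent distribution pi_0
    pi[i, k]    : state-dependent transition rows pi_i
    Returns an array of shape (T, K, 2) indexed by (t, z_t, s_t).
    """
    T, K = lik.shape
    # Local factor f(y_t | phi_z) p(s_t | z_t, y_t) for s_t = 0 and s_t = 1.
    local = np.stack([lik * (1.0 - p_seg), lik * p_seg], axis=2)
    F = np.empty((T, K, 2))
    B = np.ones((T, K, 2))
    F[0] = pi0[:, None] * local[0]
    F[0] /= F[0].sum()
    for t in range(1, T):
        # s_{t-1} = 1 follows pi_{z_{t-1}}; s_{t-1} = 0 ignores the state and uses pi_0.
        pred = F[t - 1, :, 1] @ pi + F[t - 1, :, 0].sum() * pi0
        F[t] = local[t] * pred[:, None]
        F[t] /= F[t].sum()  # rescale for numerical stability
    for t in range(T - 2, -1, -1):
        msg = (B[t + 1] * local[t + 1]).sum(axis=1)  # sum over s_{t+1}
        B[t, :, 1] = pi @ msg
        B[t, :, 0] = pi0 @ msg
        B[t] /= B[t].sum()
    post = F * B
    return post / post.sum(axis=(1, 2), keepdims=True)
```

Because the s = 0 branch does not depend on the previous state, the sum over the joint previous state costs O(K^2) per time step, the same order as in an ordinary HMM.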

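The generative model of the MODEL section in distributional form:

$$\phi_i \sim H;$$

$$\beta \sim \mathrm{GEM}(\gamma); \qquad \pi_i \sim \mathrm{DP}(\alpha\beta);$$

$$z_1 \sim \pi_0; \qquad y_t \mid z_t \sim F(\phi_{z_t});$$

$$s_t \mid z_t, y_t \sim \mathrm{Bern}\big(\sigma(\omega_y y_t + \omega_z z_t)\big);$$

$$z_{t+1} \mid z_t, s_t \sim \pi_0^{\,1 - s_t}\, \pi_{z_t}^{\,s_t}.$$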
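Summing the segmentation variable out of the last line recovers the mixture of state-dependent and state-independent transitions described in the MODEL section. The shorthand $\sigma_t$ is introduced here only for readability:

$$p(z_{t+1} = j \mid z_t = i, y_t) = \sum_{s_t \in \{0, 1\}} p(s_t \mid z_t = i, y_t)\, p(z_{t+1} = j \mid z_t = i, s_t) = \sigma_t\, \pi_{ij} + (1 - \sigma_t)\, \pi_{0j}, \qquad \sigma_t = \sigma(\omega_y y_t + \omega_z i).$$

When $\sigma_t$ is close to 1 the chain behaves like an ordinary iHMM; when it is close to 0 the next state is drawn from $\pi_0$ regardless of the current state.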

RELATED MODELS

o Infinite hierarchical HMM (Heller et al. 2009)

o The block-diagonal iHMM (Stepleton et al. 2009)

o In contrast, our model is much simpler, and inference for it is easier to implement. It can also discover transition matrices with approximately block-diagonal structure; the segmentation events provide a mechanism for transitioning from one group of connected states to another.

RESULTS

o Electrodermal activity (EDA) refers to changes in the electrical properties of the skin caused by sudomotor innervation. It is an indication of physiological or psychological arousal and has been used to objectively assess sleep quality.

o We use two datasets of sizes 12000 and 32000 and split them into sequences of length 1000. In both datasets, we normalize the EDA values and use a batch size and a held-out size of two.
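The data preparation described under RESULTS (normalize, split into sequences of length 1000, hold out two sequences, draw minibatches of two) could look like the sketch below. The type of normalization is not stated on the poster, so z-scoring is an assumption, as is reading "held-out size of two" as two held-out sequences; all function names are hypothetical.

```python
import numpy as np

def make_sequences(eda, seq_len=1000):
    """Normalize a 1-D EDA recording and cut it into non-overlapping sequences."""
    eda = np.asarray(eda, dtype=float)
    eda = (eda - eda.mean()) / eda.std()  # assumed: z-score normalization
    n_seq = len(eda) // seq_len
    return eda[: n_seq * seq_len].reshape(n_seq, seq_len)

def split_heldout(seqs, n_heldout=2, seed=0):
    """Randomly set aside n_heldout sequences; return (train, heldout)."""
    order = np.random.default_rng(seed).permutation(len(seqs))
    return seqs[order[n_heldout:]], seqs[order[:n_heldout]]

def minibatches(train, batch_size=2, seed=0):
    """Yield random minibatches of sequences indefinitely."""
    rng = np.random.default_rng(seed)
    while True:
        idx = rng.choice(len(train), size=batch_size, replace=False)
        yield train[idx]
```

For the dataset of size 12000 this gives 12 sequences, of which 10 are used for training and 2 are held out.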

 

[Figure: two panels over t = 0 to 12000 (vertical axis from -2.5 to 2.0), with labels True segments, Segmentation, HDP-HMM, and Sticky HDP-HMM.]