Change Point Detection with Bayesian Inference, by Frank Kelly, PyData, 6th January 2015

Page 1: Changepoint Detection with Bayesian Inference

Change Point Detection with Bayesian Inference

By Frank Kelly, PyData

6th January 2015

Page 2: Changepoint Detection with Bayesian Inference

Overview  

• Nigeria, oil wells & drilling
• Noisy data
• Some maths
• Python implementation
• Examples in different domains

Page 3: Changepoint Detection with Bayesian Inference

FPSO (oil platform picture)

Page 4: Changepoint Detection with Bayesian Inference
Page 5: Changepoint Detection with Bayesian Inference
Page 6: Changepoint Detection with Bayesian Inference

Mud  pulse  telemetry  

• Information is encoded digitally and transmitted via pressure pulses through the mud fluid.

• Used to alert drillers that they have reached oil, to detect rock types, and for general monitoring.

Page 7: Changepoint Detection with Bayesian Inference

The  problem  

• Poor bit rate and resolution

•  Time  consuming  analysis  

Page 8: Changepoint Detection with Bayesian Inference

Approaches to statistics

• Frequentist
  – Data gathered are a repeatable random sample (hence “frequency”)
  – Underlying parameters are constant
  – Fisher’s 0.05 significance level

• Bayesian
  – Data are fixed, observed from the realised sample
  – Parameters are unknown and described probabilistically
  – Introduces “subjectivity”

Page 9: Changepoint Detection with Bayesian Inference

Frequentist vs. Bayesian

Page 10: Changepoint Detection with Bayesian Inference

The Theory: Bayesian inference

• Methodology of mathematical inference:
  – Choosing between several possible models
  – Extracting parameters for these models

• Bayes’ Theorem (Rev Thomas Bayes, 1702 - 1761):

$$p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)}$$

Here p(w | D) is the posterior probability, p(D | w) the likelihood, p(w) the prior probability and p(D) the evidence.

- Remove nuisance parameters by marginalisation

- Interesting ones remain
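To make the four terms concrete, here is a minimal sketch of Bayes' theorem evaluated on a grid. The setup is an assumption made for illustration (it is not from the slides): the unknown w is the mean of Gaussian data with known standard deviation, and the prior over w is flat.

```python
import numpy as np

# Hypothetical example: infer an unknown mean w from noisy observations D,
# assuming Gaussian noise with a known standard deviation sigma.
rng = np.random.default_rng(0)
sigma = 0.5
D = rng.normal(loc=1.5, scale=sigma, size=20)

w_grid = np.linspace(0.0, 3.0, 301)               # candidate values of w
prior = np.full(w_grid.size, 1.0 / w_grid.size)   # flat prior p(w)

# Log-likelihood log p(D | w) for every candidate w (constants dropped).
log_like = -0.5 * np.sum((D[None, :] - w_grid[:, None]) ** 2, axis=1) / sigma**2
like = np.exp(log_like - log_like.max())          # rescaled to avoid underflow

evidence = np.sum(like * prior)                   # p(D), up to the dropped constant
posterior = like * prior / evidence               # p(w | D), sums to 1 over the grid

w_map = w_grid[np.argmax(posterior)]              # most probable value of w
```

Because the evidence is only a normalising constant, any constant factor dropped from the likelihood cancels out of the posterior.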

Page 11: Changepoint Detection with Bayesian Inference

Modelling  the  problem  

[Figure: step signal of N samples with a changepoint at sample m and second-segment mean µ2]

Page 12: Changepoint Detection with Bayesian Inference

[Figure: noisy piecewise-constant data; x-axis: sample index from 0 to 200, y-axis: value from 0.5 to 2.5]

data = model + noise

• A sequence of N samples of data from a piecewise constant source with added Gaussian noise.

• The noise is independent of the mean and identically distributed, with standard deviation σ

• Heterogeneous: divide into two homogeneous segments

$$d_i = \begin{cases} \mu_1 + e_i, & i \le m \\ \mu_2 + e_i, & m < i \le N \end{cases}$$
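The model above can be simulated in a few lines. This is a sketch only: the function name `make_step_data` and the default values for N, m, µ1, µ2 and σ are illustrative assumptions, not values taken from the talk.

```python
import numpy as np

def make_step_data(n=200, m=120, mu1=1.0, mu2=2.0, sigma=0.2, seed=0):
    """Return n samples with mean mu1 for i <= m and mu2 for i > m, plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, n + 1)                  # 1-based sample index, as in the model
    model = np.where(i <= m, mu1, mu2)       # piecewise constant source
    noise = rng.normal(0.0, sigma, size=n)   # i.i.d. Gaussian noise e_i
    return model + noise

d = make_step_data()
```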

Page 13: Changepoint Detection with Bayesian Inference

Single  changepoint  detector:  How  does  it  work?  

• Substitute the likelihood into Bayes’ Law
  – Simple model: consider Ockham’s Razor

• We are interested in the changepoint location m, so integrate with respect to the nuisance parameters (µ1, µ2 and σ) and rearrange

• This gives a BIG expression for p({m}|dI), which is coded in Python

• Running it yields the most likely changepoint location

Ockham’s razor: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
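The steps above can be sketched in NumPy as follows. This is not the code from the talk's repository. It is a minimal version that assumes flat priors on µ1 and µ2, a Jeffreys prior (1/σ) on σ and a uniform prior on m, which gives a posterior proportional to [m(N-m)]^(-1/2) S(m)^(-(N-2)/2), where S(m) is the residual sum of squares about the two segment means. The function name and the test signal are hypothetical.

```python
import numpy as np

def changepoint_posterior(d):
    """Posterior over the changepoint m (last index of the first segment, 1..N-1)."""
    d = np.asarray(d, dtype=float)
    d = d - d.mean()                  # shifting the data does not change the result; helps numerically
    n = d.size
    m = np.arange(1, n)               # candidate changepoint locations
    csum = np.cumsum(d)
    s1 = csum[:-1]                    # sum of the first m samples
    s2 = csum[-1] - s1                # sum of the remaining N - m samples
    # Residual sum of squares about the two segment means.
    resid = np.sum(d**2) - s1**2 / m - s2**2 / (n - m)
    log_post = -0.5 * np.log(m * (n - m)) - 0.5 * (n - 2) * np.log(resid)
    post = np.exp(log_post - log_post.max())   # work in logs to avoid underflow
    return m, post / post.sum()

# Illustrative test signal: step from 1.0 to 2.0 after sample 120, noise s.d. 0.2.
rng = np.random.default_rng(0)
signal = np.where(np.arange(1, 201) <= 120, 1.0, 2.0) + rng.normal(0.0, 0.2, 200)
m, post = changepoint_posterior(signal)
m_hat = int(m[np.argmax(post)])       # most likely changepoint location
```

Working with the log-posterior matters here: the exponent -(N-2)/2 makes the raw values underflow quickly for even a few hundred samples.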

Page 14: Changepoint Detection with Bayesian Inference

The  maths  

Page 15: Changepoint Detection with Bayesian Inference

More  maths  

•  Integrate  w.r.t.  (and  thereby  remove)  nuisance  parameters  
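As a sketch of what this marginalisation can look like, the block below writes out the likelihood for the two-segment model and the result of integrating out µ1, µ2 and σ. The prior choices are assumptions made here, not stated on the slides: flat priors on µ1 and µ2, a Jeffreys prior p(σ) ∝ 1/σ, and a uniform prior on m.

```latex
\begin{align}
p(d \mid m, \mu_1, \mu_2, \sigma, I) &= (2\pi\sigma^2)^{-N/2}
  \exp\!\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{m}(d_i-\mu_1)^2
  + \sum_{i=m+1}^{N}(d_i-\mu_2)^2\right)\right] \\
p(\{m\} \mid d, I) &\propto \frac{1}{\sqrt{m(N-m)}}
  \left[\sum_{i=1}^{N} d_i^2
  - \frac{1}{m}\Big(\sum_{i=1}^{m} d_i\Big)^2
  - \frac{1}{N-m}\Big(\sum_{i=m+1}^{N} d_i\Big)^2\right]^{-\frac{N-2}{2}}
\end{align}
```

Each Gaussian integral over a mean contributes a factor proportional to σ/√(segment length), and the final integral over σ turns the exponential into the power of the residual sum of squares.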

Page 16: Changepoint Detection with Bayesian Inference
Page 17: Changepoint Detection with Bayesian Inference
Page 18: Changepoint Detection with Bayesian Inference

Other applications…

Page 19: Changepoint Detection with Bayesian Inference

http://moz.com/google-algorithm-change

Page 20: Changepoint Detection with Bayesian Inference

“Google’s algorithm is the “secret sauce recipe” that has enabled it to dominate search.” - FT.com, 16th Sept 2014

http://www.ft.com/cms/s/0/9615661c-3ce1-11e4-9733-00144feabdc0.html?siteedition=uk#axzz3DSwXYAW8

Any business with an online presence today often struggles to accurately evaluate:

● The quality of their website and associated linking pages, as perceived by Google
● The robustness of their website to a sudden change in Google’s search algorithm

Page 21: Changepoint Detection with Bayesian Inference

Web  traffic  

[Figure: raw daily Google search-sourced pageviews; y-axis from 30000 to 60000]

Page 22: Changepoint Detection with Bayesian Inference

Web  traffic  (2)  

[Figure: smoothed data using a moving average; y-axis from 30000 to 60000]
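A moving average of this kind can be sketched with `numpy.convolve`. The 7-day window and the function name are assumptions for illustration; the slides do not state the window length used.

```python
import numpy as np

def moving_average(x, window=7):
    """Simple moving average; returns len(x) - window + 1 values (no edge padding)."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window          # equal weights summing to 1
    return np.convolve(x, kernel, mode="valid")

smoothed = moving_average([1, 2, 3, 4, 5], window=3)   # -> [2., 3., 4.]
```

Using `mode="valid"` avoids the edge artefacts that zero-padding would introduce, at the cost of a slightly shorter series.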

Page 23: Changepoint Detection with Bayesian Inference

Web  traffic  (3)  

[Figure: smoothed data with cyclicality removed; y-axis from 30000 to 60000]
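One simple way to remove a repeating cycle is to subtract the average deviation of each position in the cycle. The sketch below assumes a weekly (7-sample) cycle in daily data. The slides do not say which method was actually used, so treat the function and its `period` parameter as hypothetical.

```python
import numpy as np

def remove_cycle(x, period=7):
    """Subtract the mean offset of each phase of the cycle (assumes len(x) >= period)."""
    x = np.asarray(x, dtype=float)
    phase = np.arange(x.size) % period         # e.g. day of week for daily data
    out = x.copy()
    for p in range(period):
        mask = phase == p
        # Offset of this phase relative to the overall mean.
        out[mask] -= x[mask].mean() - x.mean()
    return out

# Illustrative data: constant level of 100 plus a repeating weekly pattern.
pattern = np.array([5.0, 3.0, 0.0, -1.0, -2.0, -6.0, -8.0])
daily = 100.0 + np.tile(pattern, 4)
flat = remove_cycle(daily)
```

Note that a genuine step change also shifts the per-phase means slightly, so this kind of adjustment is best treated as a rough pre-processing step.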

Page 24: Changepoint Detection with Bayesian Inference

Web  traffic  (4)  

[Figure: likelihood of change in data plotted over time; one axis from -838 to -833 (likelihood), one axis from 30000 to 60000 (pageviews). Legend: “day removed”, “likelihood CP”]

Page 25: Changepoint Detection with Bayesian Inference
Page 26: Changepoint Detection with Bayesian Inference

Number of tropical storms per year in the North Atlantic

Data obtained from the IBTrACS database: https://www.ncdc.noaa.gov/ibtracs/

Page 27: Changepoint Detection with Bayesian Inference

"Amo timeseries 1856-present" by Rosentod, Marsupilami - http://www.cdc.noaa.gov/Correlation/amon.us.long.data. Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Amo_timeseries_1856-present.svg#mediaviewer/File:Amo_timeseries_1856-present.svg

Page 28: Changepoint Detection with Bayesian Inference
Page 29: Changepoint Detection with Bayesian Inference

Other applications / possibilities

• Financial markets and political events

• Combine with frequentist statistical methods:
  – Use of GLR (generalised likelihood ratio) in an online (moving window) detection application

• Your own data / ideas!
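As a rough sketch of the GLR idea mentioned above, the code below computes a generalised likelihood ratio statistic for a single mean shift inside a sliding window and raises an alarm when it crosses a threshold. The function names, the assumption of a known noise standard deviation, the window length and the threshold are all illustrative choices, not details from the talk.

```python
import numpy as np

def glr_mean_shift(window, sigma):
    """Return (max of 2*log GLR, best split k) for one mean shift, with sigma assumed known."""
    w = np.asarray(window, dtype=float)
    n = w.size
    k = np.arange(1, n)                        # candidate split points
    csum = np.cumsum(w)
    mean1 = csum[:-1] / k                      # mean of the first k samples
    mean2 = (csum[-1] - csum[:-1]) / (n - k)   # mean of the remaining n - k samples
    stat = k * (n - k) / n * (mean1 - mean2) ** 2 / sigma**2
    best = int(np.argmax(stat))
    return float(stat[best]), int(k[best])

def online_glr(x, window=50, sigma=1.0, threshold=40.0):
    """Index of the first sample at which the windowed statistic exceeds threshold, else None."""
    x = np.asarray(x, dtype=float)
    for t in range(window, x.size + 1):
        stat, _ = glr_mean_shift(x[t - window:t], sigma)
        if stat > threshold:
            return t - 1
    return None

# Illustrative stream: mean 0 for 100 samples, then mean 4, unit-variance noise.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(4.0, 1.0, 100)])
alarm = online_glr(stream)
```

The threshold trades detection delay against false alarms, and in practice it would be tuned on data known to contain no change.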

Page 30: Changepoint Detection with Bayesian Inference

Thank you

• Link to Python code on GitHub: https://github.com/swhustla/pydata-bayes-changepoint
  – Single changepoint detector (as seen tonight)
  – Dual changepoint detector
  – Ramp detector

• Further reading:
  – Numerical Bayesian Methods Applied to Signal Processing (Statistics and Computing) by Fitzgerald, O’Ruanaidh, 1996: http://www.amazon.co.uk/Numerical-Bayesian-Processing-Statistics-Computing/dp/0387946292
  – Bayesian Inference on Change Point Problems (2007): http://www.cs.ubc.ca/~murphyk/Students/Xuan_MSc07.pdf

Twitter: @norhustla  Email: [email protected]

Page 31: Changepoint Detection with Bayesian Inference

Thank you

• Additional links:
  – Google algorithm updates: http://moz.com/google-algorithm-change
  – Mathsight (insights into algorithm changes): http://mathsight.org
  – Atlantic multi-decadal oscillation spatial pattern: http://commons.wikimedia.org/wiki/File:AMO_Pattern.png
  – National Climatic Data Center: https://www.ncdc.noaa.gov/ibtracs/
  – Ockham’s Razor and Bayesian Inference: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
  – Converting from Matlab to Python: http://mathesaurus.sourceforge.net/matlab-numpy.html

Twitter: @norhustla  Email: [email protected]