or Revd. Bayes meets Countess Lovelaceconferences.inf.ed.ac.uk/bayes250/slides/winn.pdf ·...

Probabilistic Programming or

Revd. Bayes meets Countess Lovelace

John Winn, Microsoft Research Cambridge

Bayes 250 Workshop, Edinburgh, September 2011

Statistician Programmer

“Reverend Bayes, meet Countess Lovelace”

1702 – 1761 1815 – 1852

Roadmap

Bayesian inference is hard

Two key problems

Probabilistic programming

Examples

Infer.NET

An application

Future of Bayesian inference

Bayesian inference is hard

! Complex mathematics

! Approximate algorithms

! Error toleration

! Hard to schedule

! Hard to detect convergence ! Numerical stability

! Computational cost

The average developer…

! ! ! !

The expert statistician

! ! ! !

The expert statistician

! ! ! ! !

Probabilistic programming

Bayesian inference at the language level

BUGS & WinBUGS showed the way

Three keywords added to (any) language

random – makes a random variable

constrain – constrains a variable e.g. to data

infer – returns the distribution

of a variable

Random variables

Normal variables have a fixed single value: int length=6,

bool visible=true.

Random variables have uncertain value

specified by a probability distribution: int length = random Uniform(0,10)

bool visible = random Discrete(0.8)

random operator means ‘is distributed as’.

Constraints

We can define constraints on random

variables: constrain(visible==true)

constrain(length==4)

constrain(length>0)

constrain(i==j)

constrain(b)means ‘we constrain b

to be true’.

Inference

The infer operator gives the posterior

distribution of one or more random

variables.

Example: int i = random Uniform(1,10);

bool b = (i*i>50);

Dist bdist = infer(b);//Bernoulli(0.3)

Output of infer is always deterministic

even when input is random.

Hello Uncertain World

string A = random new Uniform<string>();

string B = random new Uniform<string>();

string C = A+" "+B;

constrain(C == "Hello Uncertain World");

infer(A)

infer(B)

// 50%: "Hello", 50%: "Hello Uncertain"

// 50%: “Uncertain World", 50%: “World"

Semantics: sampling interpretation

Imagine running the program many times:

random(d) samples from the distribution d

constrain(b) discards the run if b is false

infer(x) collects the value of x into a

persistent memory

If enough x’s have been stored, returns their

distribution

Otherwise starts a new run

bool drugWorks = random new Bernoulli(0.5);

if (drugWorks) {

pControl = random new Beta(1,1);

control[:] = random new Bernoulli(pControl);

pTreated = random new Beta(1,1);

treated[:] = random new Bernoulli(pTreated);

} else {

pAll = random new Beta(1,1);

control[:] = random new Bernoulli(pAll);

treated[:] = random new Bernoulli(pAll); }

Bayesian Model Comparison (if, else)

// constrain to data

constrain(control == controlData);

constrain(treated == treatedData);

// does the drug work?

infer(drugWorks)

Probabilistic programs and graphical models

Probabilistic

Program

Graphical

Variables Variable nodes

Functions/operators Factor nodes/edges

Fixed size loops/arrays Plates

If statements Gates (Minka & Winn)

Variable sized loops,

Complex indexing,

jagged arrays, mutation,

recursion, objects/

properties…

No common equivalent

Causality bool AcausesB = random new Bernoulli(0.5);

if (AcausesB) {

A = random Aprior;

B = NoisyFunctionOf(A);

} else {

B = random Bprior;

A = NoisyFunctionOf(B);

// intervention replaces above definition of B

if (interventionOnB) B = interventionValue;

// constrain to data

constrain(A == AData);

constrain(B == BData);

constrain(interventionOnB==interventionData);

// does A causes B, or vice versa?

infer(AcausesB)

Infer.NET

Compiles probabilistic programs into

inference code (EP/VMP/Gibbs).

Supports many (but not all)

probabilistic program elements

Extensible – distribution channel for new

machine learning research

infer.net

Consists of a chain of code transformations:

T1 T2 T3 Probabilistic

program

Inference

program

Infer.NET inference engine

A Raining

T1 T2 T3 Probabilistic

program

Inference

program

Infer.NET compiler

Channel

transform T2 T3

Inference

program

Probabilistic

program

Infer.NET compiler

Channel

transform

Message

transform T3

Inference

program

Probabilistic

program

Infer.NET compiler

Channel

transform

Message

transform Scheduler

Inference

program

Schedule

Probabilistic

program

Infer.NET architecture

Infer.NET

compiler

compiler C# Algo-

Infer.NET Inference Engine

Probabilistic

program Observed values

(data, priors)

Algorithm

execution

Probability

distributions

----------------

Application: Reviewer Calibration

Submissions

Strong

Reject

Accept

Reject

Accept Weak

Accept

Reviewers

[SIGKDD Explorations ‘09]

Reviewer calibration code

// Calibrated score – one per submission Quality[s] = random Gaussian(qualMean,qualPrec).ForEach(s);

// Precision associated with each expertise level Expertise[e] = random Gamma(expMean,expVar).ForEach(e);

// Review score – one per review Score[r]= random Gaussian(Quality[sOf[r]],Expertise[eOf[r]]);

// Accuracy of judge Accuracy[j] = random Gamma(judgeMean,judgeVar).ForEach(j);

// Score thresholds per judge Threshold[t][j] = random Gaussian(NomThresh[t], Accuracy[j]);

// Constrain to match observed rating constrain(Score[r] > Threshold[rating][jOf[r]]); constrain(Score[r] < Threshold[rating+1][jOf[r]]);

Results for KDD 2009

Paper scores

Highest score: 1 ‘strong accept’ and 2 ‘accept’

Beat paper with 3 ‘strong accept’ from more generous reviewers

Score certainties

Most certain: 5 ‘weak accept’ reviews

Least certain: ‘weak reject’, ‘weak accept’, and ‘strong accept’.

Reviewer generosity

Most generous reviewer: 5 strong accepts

More expert reviews are higher precision:

Informed Outsider: 1.22, Knowledgeable: 1.35 Expert: 1.59

Experts are more likely to agree with each other (!)

Future of Bayesian inference

How to make Bayesian inference accessible to the

average developer + break the complexity barrier?

Probabilistic programming in familiar languages

Probabilistic debugging tools

Scalable execution

Online community with shared programs and

shared data + continual evaluation of each

program against all relevant data and vice versa.

We hope Infer.NET will be part of this future!

research.microsoft.com/infernet

Questions?

Infer.NET now and next

Domains

Execution

platform

Models

Data size MB GB TB

2008 Future

DryadLINQ

Multicore

CamGraph

Azure GPU

Classification

Regression

Factor analysis

Bayes nets

Ranking Hierarchical

models

Sparse

models HMMs

Grid models

Undirected

models

Object models

Collaborative

filtering

Information retrieval

Biological

User modelling

Software development

Healthcare

Social networks

Natural language

Vision

Semantic web

2011 2010 2009

or Revd. Bayes meets Countess Lovelaceconferences.inf.ed.ac.uk/bayes250/slides/winn.pdf ·...

Documents

New Bayesian inference for partially observed Markov processes, …conferences.inf.ed.ac.uk/bayes250/slides/wilkinson.pdf · 2011. 9. 17. · Partially observed Markov process (POMP)

Bayeasian inference

Cross-Sectional Mutual Fund Performance...Cross-Sectional Mutual Fund Performance Mark Fisher Mark J. Jensen ... Economics, Finance and Business Section of Bayes250, the 2014 Bayesian

STATS 200: Introduction to Statistical Inference 200: Introduction to Statistical Inference ... Statistical inference Statistical inference = Probability 1 ... STATS 200: Introduction

Visualizing inference

Happy ABC: Expectation-Propagation for Summary-Less ...conferences.inf.ed.ac.uk/bayes250/slides/chopin.pdf · Results from alpha-stable example a Density 1.50 1.60 1.70 0 2 4 6 8

Bayesian and frequentist inference for ecological ... · Key Words and Phrases: ecological inference, Bayesian inference, frequentist inference, voting patterns. 1 Introduction to

Variational Inference and Mean Field · Variational Inference “Variational inference”: Formulate inference problem as constrained optimization. Approximate the function or constraintsto

BDL.NET: BAYESIAN DICTIONARY LEARNING IN INFER.NET Tom

Semiparametric Regression Analysis via Infer...later sections. Each of Sections 3–10 illustrates a different type of semiparametric regression analysis via Infer.NET. All code is

Parametric Inference Maximum Likelihood Inference …bioucas/IP/files/statistical_inference.pdf · Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential

INDUCTION (probable inference) : inference moving from specific facts to general conclusions. DEDUCTION (necessary inference): inference moving from general

1 Probabilistic Inference and Learningepxing/Class/10708-17/... · 1 Probabilistic Inference and Learning In practice, exact inference is not used widely, and most probabilistic inference

통계적 추론 Statistical Inference 개념wolfpack.hnu.ac.kr/2015_Fall/D4BE/통계적추론.pdfMathematical Statistics Statistical Inference 통계적 추론 Statistical Inference

Infer.NET 101 - Microsoft · Microsoft Infer.NET programming Version 1.2 – October 2018 Abstract This document provides a detailed, sample-based introduction to the basics of Microsoft®

Exact Inference: Complexitysrihari/CSE574/Chap8/Ch8-PGM-Inference… · Machine Learning Srihari 2 Topics 1. What is Inference? 2. Complexity Classes 3. Exact Inference 1. Variable

Desenvolvimento de protótipo para prova de conceito com a framework Infer.NET Elaborado por Carlos Mareco Aluno nº 20101417 Orientador: Professor Joaquim

Graphical Inference

Inference. Overview The MC-SAT algorithm Knowledge-based model construction Lazy inference Lifted inference

Low precision Inference on GPU - Nvidia · 3 INFERENCE • Inference: using a trained model to make predictions • Much of inference is fwd pass in training • Inference engines