A. Darwiche Learning in Bayesian Networks. A. Darwiche Known Structure Complete Data Known Structure...

A. Darwiche

Learning in Bayesian NetworksLearning in Bayesian Networks

A. Darwiche

Known StructureComplete Data

Known StructureIncomplete Data

Unknown StructureComplete Data

Unknown StructureIncomplete Data

Learning

The Learning ProblemThe Learning Problem

A. Darwiche

Known Structure Complete DataKnown Structure Complete Data

A. Darwiche

Known Structure Incomplete DataKnown Structure Incomplete Data

A. Darwiche

Unknown Structure Complete DataUnknown Structure Complete Data

A. Darwiche

Unknown Structure Incomplete DataUnknown Structure Incomplete Data

A. Darwiche

Known StructureKnown Structure

Method A

CPTs A

Method B

CPTs B

A. Darwiche

A= PrB

+CPTs B

Which probability distribution should we choose?

Common criterion: Choose distribution that maximizes

likelihood of data

A. Darwiche

A= PrB

+CPTs B

Data D

PrA (D) = PrA (d1) … PrA (dm)

Likelihood of data given PrA

PrB (D) = PrB (d1) … PrB (dm)

Likelihood of data given PrB

A. Darwiche

Maximizing Likelihood of DataMaximizing Likelihood of Data

• Complete Data: Unique set of CPTs which maximize likelihood of data

• Incomplete Data: No Unique set of CPTs which maximize likelihood of data

A. Darwiche

Maximizing Likelihood of DataMaximizing Likelihood of Data

• Complete Data: Unique set of CPTs which maximize likelihood of data

• Incomplete Data: No Unique set of CPTs which maximize likelihood of data

A. Darwiche

Known Structure, Complete DataKnown Structure, Complete DataData D

òêdjbc= Count(bc;D)Count(dbc;D)

Estimated parameter:

Number of data points di with d b c

Number of data points di with b c=

A. Darwiche

Known Structure, Complete DataKnown Structure, Complete DataData D

òêdjbc= Count(bc;D)Count(dbc;D)

Estimated parameter:

= Pj=1m I (bc;dj)

Pj=1m I (dbc;dj )

A. Darwiche

ComplexityComplexity

• Network with:– Nodes: n– Parameters: k– Data points: m

• Time complexity: O(m k n)(straightforward implementation)

• Space complexity: O(k)parameter count

A. Darwiche

Known Structure, Incomplete DataKnown Structure, Incomplete Data

EM Algorithm (Expectation-Maximization):-Initial CPTs to random values-Repeat until convergence:

-Estimate parameters using current CPTs (E-step)-Update CPTs using estimates (M-step)

A. Darwiche

Known Structure, Incomplete DataKnown Structure, Incomplete Data

òêdjbc= Pj=1m Pri(bcjdj)

Pj=1m Pri(dbcjdj )

Estimated parameters at iteration i+1 (using the CPTs at iteration i):

Pr0 corresponds to the initial Bayesian network (random CPTs)

A. Darwiche

EM AlgorithmEM Algorithm

• Likelihood of data cannot get smaller after an iteration

• Algorithm is not guaranteed to return the network which absolutely maximizes likelihood of data

• It is guaranteed to return a local maxima: Random re-starts

• Algorithm is stopped when – change in likelihood gets very small

– Change in parameters gets very small

A. Darwiche

ComplexityComplexity• Network with:

– Nodes: n– Parameters: k– Data points: m– Treewidth: w

• Time complexity (per iteration): O(m k n 2w)(straightforward implementation)

• Space complexity: O(k + n 2w)parameter count + space for inference

A. Darwiche

Collaborative FilteringCollaborative Filtering

• Collaborative Filtering (CF) finds items of interest to a user based on the preferences of other similar users.– Assumes that human behavior is predictable

A. Darwiche

Where is it used?Where is it used?• E-commerce

– Recommend products based on previous purchases or click-stream behavior

– Ex: Amazon.com

• Information sites– Rate items based on

previous user ratings

– Ex: MovieLens, Jester

A. Darwiche

John 5 - 3 2

Sam - 4 1 5

Cindy 3 - 5 -

Bob 5 1 - -

Bob 5 1 3.5 1.7

A. Darwiche

Memory-based AlgorithmsMemory-based Algorithms

• Use the entire database of user ratings to make predictions.– Find users with similar voting histories to the

active user.– Use these users’ votes to predict ratings for

products not voted on by the active user.

A. Darwiche

Model-based AlgorithmsModel-based Algorithms

• Construct a model from the vote database.

• Use the model to predict the active user’s ratings.

A. Darwiche

Bayesian ClusteringBayesian Clustering

• Use a Naïve Bayes network to model the vote database.

• m vote variables: one for each title.– Represent discrete vote values.

• 1 “cluster” variable– Represents user personalities

A. Darwiche

cvCv kk

Naïve BayesNaïve Bayes

V1 V2 V3 Vm…

A. Darwiche

V1 V2 V3 Vm…

cvCv kk

A. Darwiche

• Inference– Evidence: known votes vk for titles k I

– Query: title j for which we need to predict vote

• Expected value of vote:

hkjj Ikvhvhp

):|Pr(

V1 V2 V3 Vm…

A. Darwiche

LearningLearning• Simplified Expectation Maximization (EM)

Algorithm with partial data

• Initialize CPTs with random values subject to the following constraints:

)Pr(cc )|Pr(| cvkcvk

A. Darwiche

DatasetsDatasets• MovieLens

– 943 users; 1682 titles; 100,000 votes (1..5); explicit voting

• MS Web – website visits– 610 users; 294 titles; 8,275 votes (0,1) :

null votes => 0 : 179,340 votes; implicit voting

A. Darwiche

0 5 10 15

Iteration

• Learning curve for MovieLens Dataset

A. Darwiche

ProtocolsProtocols

• User database is divided into: 80% training set and 20% test set.– One-by-one select a user from the test set to be the

active user.– Predict some of their votes based on remaining

A. Darwiche

• All-But-One

• Given-{Two, Five, Ten}

Qe eIa e e e e e e e ee e e

Q eeIa Q Q Q Q Q Q QQ Q Q Q

e e e e eQIa Q Q Q Q Q QQ Q

e eee QeIa QQ Q e e ee e

A. Darwiche

Evaluation MetricEvaluation Metric

• Average Absolute Deviation

• Ranked Scoring

jaja vpP ,

A. Darwiche

ResultsResults• Experiments were run 5 times and averaged

• Movielens

Algorithm Given-Two Given-Five Given-Ten All-But-One

Correlation 1.019 .916 .865 .806

VecSim .948 .878 .843 .799

BC(9) .771 .765 .763 .753

A. Darwiche

• MS Web

Algorithm Given-Two Given-Five Given-Ten All-But-One

Correlation 0.105 0.0911 0.0844 0.0673

VecSim 0.101 0.0885 0.0818 0.0675

BC(9) 0.0652 0.0652 0.0649 0.0507

A. Darwiche

Computational IssuesComputational Issues

• Prediction time: (Memory-based) 10 minutes per experiment; (Model-based) 2 minutes

• Learning time: 20 minutes per iteration

• n: number of data point; m: number of titles; w: number of votes per title;|C| number of personality types

Algorithm Prediction Time Learning Time Space

Memory-based O(n*m) N/A O(n*m)

Model-based O(|C|*m) O(n*m*|C|*w) O(|C|*m*w)

A. Darwiche

Demo of SamIamDemo of SamIam

• Building networks:– Nodes, Edges– CPTs

• Inference:– Posterior marginals– MPE– MAP

• Learning: EM• Sensitivity Engine

A. Darwiche Learning in Bayesian Networks. A. Darwiche Known Structure Complete Data Known Structure...

Documents

Paskaitos Unknown Unknown

Making an unknown unknown a known unknown Missing data in

Gold Binding Protein: 1. Structure Prediction€¦ · Proteins have been genetically engineered which bind to Au and alter crystal morphology (GBPs). Binding mechanism unknown; structure

Semantic Structure From Motion with Points, Regions, … · Semantic Structure From Motion with Points, Regions, ... objects and regions, ... timation of the unknown parameters

An Unknown Force Awakened by A Pyramidal Structure · run lasted for 30 min (Hemi-Sync ® [9

An unknown stone structure in Sarmizegetusa Regia's sacred zone

Booz Company Bahjat El-Darwiche

A. Darwiche Knowledge Compilation Jinbo Huang NICTA and ANU Slides made by Adnan Darwiche and Jinbo Huang

Bayesian Networks Darwiche

TheComputaonalConnecon - Max Planck Societyresources.mpi-inf.mpg.de/departments/rg1/conferences/... · 2011. 3. 23. · LogicandProbability & TheComputaonalConnecon & Adnan&Darwiche&

User Centric Community Clouds...User Centric Community Clouds 33 interfaces, which are supported by a pool of hosts, at an unknown location, and organized with unknown structure (from

Crystal structure of an unknown solvate of dodecakis (μ2-alaninato

Compiling Graphical Models Adnan Darwiche University of California, Los Angeles UAI06 Tutorial

A. Darwiche Bayesian Networks. A. Darwiche Bayesian Network Battery Age Alternator Fan Belt Battery Charge Delivered Battery Power Starter Radio LightsEngine

author unknown address unknown accessed unknown Plant Structure & Growth Plant Structure & Growth Transport in Angiosperms Transport in Angiosperms Reproduction

1e Ok usufdcimages.uflib.ufl.edu/UF/00/04/87/34/00593/00239.pdf · Clayton w3Sofs Unknown Unknown Unknown Unknown Unknown Unknown Unknown Unknown RJmc Cynthia easterly Unknown Unknown

Learning in Bayesian Networks. Known Structure Complete Data Known Structure Incomplete Data Unknown…

Value Project Webinar Positioning for the Unknown by Transforming Organizational Culture and Management Structure

Department of Urban Developmentud.delhigovt.nic.in/DoIT/DOIT/DOIT_UDD/Pdf/616-1639.pdf · 2012. 9. 12. · Devta Gu ta Jeevacll Ram Unknown Unknown Unknown Unknown Unknown Unknown

Adnan Darwiche July 8, 2017 - arXiv · Human-Level Intelligence or Animal-Like Abilities? Adnan Darwiche Computer Science Department University of California, Los Angeles darwiche@cs.ucla.edu