Bayesian Networks for Data Mining David Heckerman Microsoft Research (Data Mining and Knowledge Discovery 1, 79-119 (1997))



Page 1: Bayesian Networks for Data Mining

Bayesian Networks for Data Mining

David Heckerman

Microsoft Research

(Data Mining and Knowledge Discovery 1, 79-119 (1997))

Page 2: Bayesian Networks for Data Mining

The Bayesian approach

Question #1: What is Bayesian probability?

• A person’s degree of belief in a certain event.

• Personal (subjective)

• Your degree of belief that the coin will land heads.

Page 3: Bayesian Networks for Data Mining

The Classical approach

• Physical property of the world.

• Repeated trials (frequency)

• The probability that a coin will land heads.

Page 4: Bayesian Networks for Data Mining

Question #2: What are the advantages and disadvantages of the Bayesian and classical interpretations of probability?

Bayesian probability:

+ Reflects an expert’s knowledge.

+ Complies with the rules of probability.

- Arbitrary.

Classical probability:

+ Objective, unbiased.

- Not available in most situations.

Page 5: Bayesian Networks for Data Mining

Bayes Theorem

Posterior = (likelihood × prior) / evidence

$$p(h \mid D) = \frac{p(D \mid h)\, p(h)}{p(D)}$$
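As a small worked illustration of the theorem, the sketch below computes the posterior over two competing hypotheses about a coin. The hypotheses ("fair", "biased"), the prior, and the observed data (two heads in a row) are invented for this example and do not come from the slides.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Return p(h | D) for each hypothesis h via Bayes' theorem."""
    # Numerator for each hypothesis: p(D | h) * p(h)
    joint = {h: likelihood[h] * prior[h] for h in prior}
    # Evidence: p(D) = sum over h of p(D | h) * p(h)
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}

# Hypothetical prior: both hypotheses are equally believable beforehand.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# Likelihood of D = "two heads in a row" under each hypothesis,
# assuming the biased coin lands heads with probability 3/4.
likelihood = {"fair": Fraction(1, 2) ** 2, "biased": Fraction(3, 4) ** 2}

post = posterior(prior, likelihood)
print(post)
```

Exact fractions are used so that the normalization by the evidence can be checked by hand.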

Page 6: Bayesian Networks for Data Mining

Bayesian Networks

• Graphical model that encodes the joint probability distribution (JPD) for a set of variables X.

• It is a directed acyclic graph (DAG): it contains no directed cycles.

• Each node represents one variable and contains a set of local probability distributions (LPDs) for that variable.
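To make the link between the graph and the joint distribution concrete, here is a minimal sketch in which the joint probability of a full assignment is the product of each node's local probability given its parents. The three-variable network (Rain, Sprinkler, WetGrass) and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical DAG: Rain -> WetGrass <- Sprinkler.
parents = {"Rain": (), "Sprinkler": (), "WetGrass": ("Rain", "Sprinkler")}

# Local probability distributions: P(variable = True | parent values).
# All numbers are made up for illustration.
p_true = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}

def joint(assignment):
    """P(assignment) as the product of each node's local probability."""
    prob = 1.0
    for var, pa in parents.items():
        key = tuple(assignment[p] for p in pa)
        p = p_true[var][key]
        prob *= p if assignment[var] else 1.0 - p
    return prob

# Sanity check: the factorized joint sums to 1 over all assignments.
names = list(parents)
total = sum(
    joint(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)
print(round(total, 10))
```

Storing one small table per node rather than the full joint table is what keeps the representation compact: here the network needs 6 numbers instead of the 7 free parameters of an unrestricted joint over three binary variables, and the saving grows quickly with more variables.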

Page 7: Bayesian Networks for Data Mining

Bayesian Networks

• Nodes

  – Parents

  – Children

• Conditional probability tables

• Construction

Page 8: Bayesian Networks for Data Mining

Inference

The computation of a probability of interest given a model is known as probabilistic inference.

P(X|e) = P(X,e)/P(e) = c P(X,e), where c = 1/P(e) is a normalizing constant.

Example on board.
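Since the board example is not recorded in the slides, the following is a stand-in sketch of inference by enumeration: sum the joint over the unobserved variable to get P(X, e), then normalize by c = 1/P(e). The network (Rain -> WetGrass <- Sprinkler) and its numbers are hypothetical and are not the example used in the talk.

```python
# Hypothetical network: Rain -> WetGrass <- Sprinkler (numbers invented).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) from the network's local distributions."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)

def query_rain_given_wet(wet):
    """Return P(Rain | WetGrass = wet) by enumeration."""
    # P(X, e): sum the joint over the unobserved variable (Sprinkler).
    unnormalized = {
        rain: sum(joint(rain, s, wet) for s in (True, False))
        for rain in (True, False)
    }
    # c = 1 / P(e), where P(e) is the sum of P(X, e) over all values of X.
    c = 1.0 / sum(unnormalized.values())
    return {rain: c * p for rain, p in unnormalized.items()}

posterior = query_rain_given_wet(True)
print(posterior)
```

Enumeration is exponential in the number of unobserved variables, so it is only practical for small networks like this one.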

Page 9: Bayesian Networks for Data Mining

Learning

• Learning from data

  – Refine the structure and LPD of a BN

  – Combine prior knowledge with data

• Result: IMPROVED KNOWLEDGE
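One simple way to picture "combining prior knowledge with data" is parameter learning for a single binary variable with a Beta prior, sketched below. The prior pseudo-counts and the flip sequence are assumptions made up for this illustration; learning the structure of a network is a separate problem that this sketch does not cover.

```python
def update_beta(alpha_heads, alpha_tails, flips):
    """Add observed counts to the Beta prior's pseudo-counts."""
    heads = sum(1 for f in flips if f == "H")
    tails = len(flips) - heads
    return alpha_heads + heads, alpha_tails + tails

def prob_heads(alpha_heads, alpha_tails):
    """Mean of Beta(alpha_heads, alpha_tails): predicted P(heads)."""
    return alpha_heads / (alpha_heads + alpha_tails)

# Hypothetical prior knowledge: the coin is probably close to fair,
# encoded as 2 imagined heads and 2 imagined tails.
prior = (2, 2)

# Hypothetical data: 6 heads and 2 tails.
data = "HHTHHHTH"

post = update_beta(*prior, data)
print(prob_heads(*prior), prob_heads(*post))
```

The prior acts like a small set of imagined observations, so the estimate moves from 0.5 toward the observed frequency of 0.75 without jumping all the way there on only eight flips.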

Page 10: Bayesian Networks for Data Mining

Question #3: Mention at least 3 advantages of Bayesian Networks for data analysis. Explain each one.

• Handle incomplete data sets

• Learning about causal relationships

• Combine domain knowledge + data

• Avoid overfitting.