Naive Bayes Classifier

Lecturer: Ji Liu

This is not me :-). He is Bayes.

Most slides are from Eamonn Keogh's.




Key of Bayes Classifiers


Single attribute = “name”


More Attributes (Features)

● In the “policewoman” case, we only consider a single attribute “name” to predict the gender;

● What if the number of attributes is more than one (a more general case)?

● The way to estimate p(cj) is the same

● Then how to estimate p(d|cj)?


This is the key assumption for NAIVE Bayes!!!
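
The formula on this slide did not survive here. Judging from the product written out under "Issues for Naive Bayes", the assumption is presumably that the attributes d1, ..., dn are conditionally independent given the class cj. A sketch in LaTeX, using the same symbols:

```latex
\[
p(d \mid c_j) \;=\; p(d_1 \mid c_j)\, p(d_2 \mid c_j) \cdots p(d_n \mid c_j)
\;=\; \prod_{i=1}^{n} p(d_i \mid c_j)
\]
```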


How to estimate?


From Jiawei Han's slides

Play-tennis example: estimating P(di|c)

Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N

P(p) = 9/14    P(n) = 5/14

outlook
P(sunny|p) = 2/9       P(sunny|n) = 3/5
P(overcast|p) = 4/9    P(overcast|n) = 0
P(rain|p) = 3/9        P(rain|n) = 2/5

temperature
P(hot|p) = 2/9     P(hot|n) = 2/5
P(mild|p) = 4/9    P(mild|n) = 2/5
P(cool|p) = 3/9    P(cool|n) = 1/5

humidity
P(high|p) = 3/9      P(high|n) = 4/5
P(normal|p) = 6/9    P(normal|n) = 1/5

windy
P(true|p) = 3/9     P(true|n) = 3/5
P(false|p) = 6/9    P(false|n) = 2/5
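
The estimates on this slide are relative frequencies counted from the play-tennis table. A minimal Python sketch of that counting follows. The names `DATA`, `estimate`, `priors` and `cond` are illustrative and not from the slides, and attributes are indexed by column position (0 = outlook, 1 = temperature, 2 = humidity, 3 = windy).

```python
from collections import Counter
from fractions import Fraction

# Play-tennis training data copied from the table:
# (outlook, temperature, humidity, windy, class)
DATA = [
    ("sunny", "hot", "high", "false", "N"),
    ("sunny", "hot", "high", "true", "N"),
    ("overcast", "hot", "high", "false", "P"),
    ("rain", "mild", "high", "false", "P"),
    ("rain", "cool", "normal", "false", "P"),
    ("rain", "cool", "normal", "true", "N"),
    ("overcast", "cool", "normal", "true", "P"),
    ("sunny", "mild", "high", "false", "N"),
    ("sunny", "cool", "normal", "false", "P"),
    ("rain", "mild", "normal", "false", "P"),
    ("sunny", "mild", "normal", "true", "P"),
    ("overcast", "mild", "high", "true", "P"),
    ("overcast", "hot", "normal", "false", "P"),
    ("rain", "mild", "high", "true", "N"),
]


def estimate(data):
    """Return (priors, conditionals) estimated by relative frequency."""
    class_counts = Counter(row[-1] for row in data)
    priors = {c: Fraction(k, len(data)) for c, k in class_counts.items()}

    # Count how often attribute i takes value v among rows of class c.
    pair_counts = Counter()
    for *attrs, c in data:
        for i, v in enumerate(attrs):
            pair_counts[(i, v, c)] += 1

    cond = {
        (i, v, c): Fraction(k, class_counts[c])
        for (i, v, c), k in pair_counts.items()
    }
    return priors, cond


priors, cond = estimate(DATA)
print(priors["N"], priors["P"])   # P(n), P(p)
print(cond[(0, "sunny", "N")])    # P(sunny|n)
# Value/class pairs that never occur together have no entry, i.e. probability 0.
print(cond.get((0, "overcast", "N"), Fraction(0)))  # P(overcast|n)
```

Exact fractions are used so the results can be compared with the slide by eye; `Fraction` reduces automatically, so 3/9 shows up as 1/3.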


Issues for Naive Bayes

● p(d|cj) = p(d1|cj) * p(d2|cj) * ... * p(dn|cj) can be a tiny number when n is large. How to deal with the numerical issue in practice?

● Compute log p(d|cj) instead of p(d|cj)

– log p(d|cj) = log p(d1|cj) + log p(d2|cj) + … + log p(dn|cj)
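
A small Python sketch of why the log form helps. The 1000 attributes with p(di|cj) = 0.01 each are a made-up illustration, not data from the slides, and `log_likelihood` is a hypothetical helper name. The sketch assumes every p(di|cj) is strictly positive, since log 0 is undefined.

```python
import math


def log_likelihood(probs):
    """log p(d|cj) as the sum of log p(di|cj); assumes every p(di|cj) > 0."""
    return sum(math.log(p) for p in probs)


# Hypothetical case: 1000 attributes, each with p(di|cj) = 0.01.
probs = [0.01] * 1000

product = math.prod(probs)        # the true value 1e-2000 underflows in double precision
log_sum = log_likelihood(probs)   # 1000 * log(0.01), easily representable

print(product)
print(log_sum)
```

Because log is monotonically increasing, comparing log p(d|cj) + log p(cj) across classes picks the same class as comparing p(d|cj) * p(cj).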


Play-tennis example: classifying d

An unseen sample d = <rain, hot, high, false>

P(<rain, hot, high, false> | p) * P(p) = P(rain|p) * P(hot|p) * P(high|p) * P(false|p) * P(p)

P(d|p)·P(p) = P(rain|p)·P(hot|p)·P(high|p)·P(false|p)·P(p) = 3/9·2/9·3/9·6/9·9/14 = 0.010582

P(d|n)·P(n) = P(rain|n)·P(hot|n)·P(high|n)·P(false|n)·P(n) = 2/5·2/5·4/5·2/5·5/14 = 0.018286

Sample d is classified in class n (don’t play)
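
The arithmetic above can be checked with a short Python sketch. The dictionaries `COND` and `PRIOR` and the function `score` are illustrative names. The probabilities are the ones the slide uses for d = <rain, hot, high, false>.

```python
from fractions import Fraction as F

# Estimates for the attribute values in d = <rain, hot, high, false>,
# in the order rain, hot, high, false, as used on the slide.
COND = {
    "p": [F(3, 9), F(2, 9), F(3, 9), F(6, 9)],
    "n": [F(2, 5), F(2, 5), F(4, 5), F(2, 5)],
}
PRIOR = {"p": F(9, 14), "n": F(5, 14)}


def score(c):
    """Return P(d|c) * P(c) under the naive independence assumption."""
    result = PRIOR[c]
    for p in COND[c]:
        result *= p
    return result


scores = {c: score(c) for c in PRIOR}
best = max(scores, key=scores.get)   # class with the larger score

print({c: round(float(s), 6) for c, s in scores.items()})
print(best)
```

The two scores do not sum to 1 because the common denominator P(d) is left out; it is the same for both classes, so it does not change which class wins.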