Supervised Learning [1] Artificial Intelligence 2015-2016
Artificial Intelligence
Types of learning problems
$\{D_1, D_2, \dots, D_n\}$
$\{D_1, D_2, \dots, D_n\}$, $\{Y_1, Y_2, \dots, Y_n\}$, $P$
$\{D_1, D_2, \dots, D_n\}$, $P$
$\{D_1, D_2, \dots, D_n\}$, $X_i$, $a_i$, $r_i$
$a_i = (D_i)$
$v(\langle r_1, r_2, \dots, r_n \rangle)$
Events and observations
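[Figure: the space $W$ with probability measure $P$, partitioned by the events $X = x_1, X = x_2, \dots, X = x_n$ and by the events $Y = y_1, Y = y_2, \dots, Y = y_m$]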
Observations and Independence
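Random variables: $\{X, Y\}$
$n$ observations: $D_1 = \{X_1, Y_1\},\ D_2 = \{X_2, Y_2\},\ \dots,\ D_n = \{X_n, Y_n\}$
The observations $\{X_1, X_2, \dots, X_n\}$ are assumed to be:
independent: $\langle X_i \perp X_j \rangle$, $i \neq j$
identically distributed: $P(X_i) = P(X_j)$, $i \neq j$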
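Maximum Likelihood Estimation (MLE)
Goal: estimate the distribution $P(X)$, with parameters $\theta$, from the observations $D = \{D_1, D_2, \dots, D_n\}$.
Likelihood of $\theta$ given $D$:
$$L(\theta \mid D) := P(D \mid \theta) = P(D_1, D_2, \dots, D_n \mid \theta)$$
For independent observations:
$$P(D \mid \theta) = P(D_1 \mid \theta)\, P(D_2 \mid \theta) \cdots P(D_n \mid \theta) = \prod_i P(D_i \mid \theta)$$
Maximum likelihood estimate:
$$\theta^*_{ML} := \arg\max_\theta L(\theta \mid D)$$
Log-likelihood:
$$\ell(\theta \mid D) := \log L(\theta \mid D) = \log P(D \mid \theta) = \sum_i \log P(D_i \mid \theta)$$
$$\theta^*_{ML} = \arg\max_\theta \ell(\theta \mid D)$$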
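One practical reason for maximizing the log-likelihood rather than the likelihood can be sketched numerically. The snippet below is only an illustration: the function names and the made-up data (2000 observations, each with probability 0.5) are assumptions, not part of the slides.

```python
import math

def likelihood(probs):
    """Product of the per-observation probabilities P(D_i | theta)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def log_likelihood(probs):
    """Sum of the per-observation log-probabilities log P(D_i | theta)."""
    return sum(math.log(p) for p in probs)

# Hypothetical data: 2000 observations, each with probability 0.5 under theta.
probs = [0.5] * 2000

print(likelihood(probs))      # the product underflows to 0.0 in double precision
print(log_likelihood(probs))  # the sum stays finite: 2000 * log(0.5)
```

Since the logarithm is monotonic, both quantities are maximized by the same value of the parameters.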
Example: coin tossing (Bernoulli Trials)
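Random variable $X$ with outcomes $X = 1$ and $X = 0$:
$$P(X = 1) = \theta, \qquad P(X = 0) = 1 - \theta$$
Sequence of observations $\langle D_1, D_2, \dots, D_n \rangle$:
$$D = \{D_1 = \{X_1 = 1\},\ D_2 = \{X_2 = 1\},\ D_3 = \{X_3 = 0\},\ \dots\}$$
Probability of a single observation:
$$P(X_i \mid \theta) = \theta^{[X_i = 1]} (1 - \theta)^{[X_i = 0]}$$
where:
$$[X = v] := \begin{cases} 1 & \text{if } X = v \\ 0 & \text{if } X \neq v \end{cases}$$
Log-likelihood:
$$\ell(\theta \mid D) = \log P(D \mid \theta) = \log P(\{X_i\} \mid \theta) = \log \prod_i P(X_i \mid \theta) = \sum_i \log P(X_i \mid \theta)$$
$$\ell(\theta \mid D) = \sum_i \log \theta^{[X_i = 1]} (1 - \theta)^{[X_i = 0]} = \log \theta \sum_i [X_i = 1] + \log(1 - \theta) \sum_i [X_i = 0]$$
$$\ell(\theta \mid D) = N_{X=1} \log \theta + N_{X=0} \log(1 - \theta)$$
(where $N_{X=1}$ is the number of $X_i = 1$ in the sequence $D$)
Setting the derivative to zero:
$$\frac{d\,\ell(\theta \mid D)}{d\theta} = \frac{N_{X=1}}{\theta} - \frac{N_{X=0}}{1 - \theta} = 0 \implies \theta^*_{ML} = \frac{N_{X=1}}{N_{X=1} + N_{X=0}}$$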
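As a check on the coin-tossing result, the closed-form estimate (the fraction of observations with X = 1) can be compared with a brute-force maximization of the Bernoulli log-likelihood. This is a minimal sketch; the function names, the toss sequence and the grid resolution are illustrative assumptions.

```python
import math

def bernoulli_mle(xs):
    """Closed-form ML estimate: N_{X=1} / (N_{X=1} + N_{X=0})."""
    n1 = sum(1 for x in xs if x == 1)
    n0 = sum(1 for x in xs if x == 0)
    return n1 / (n1 + n0)

def bernoulli_log_likelihood(theta, xs):
    """N_{X=1} * log(theta) + N_{X=0} * log(1 - theta)."""
    n1 = sum(1 for x in xs if x == 1)
    n0 = sum(1 for x in xs if x == 0)
    return n1 * math.log(theta) + n0 * math.log(1 - theta)

# Made-up sequence of tosses: five outcomes with X = 1, three with X = 0.
tosses = [1, 1, 0, 1, 0, 1, 1, 0]

theta_ml = bernoulli_mle(tosses)

# Brute-force check on a grid of candidate values in the open interval (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda t: bernoulli_log_likelihood(t, tosses))

print(theta_ml, theta_grid)  # both equal 5/8 = 0.625
```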
Anti-spam filter
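$$D = \{D_1 = \{Y_1 = 1, X_{11} = 1, X_{12} = 1, \dots, X_{1n} = 0\},\ D_2 = \{Y_2 = 0, X_{21} = 0, X_{22} = 1, \dots, X_{2n} = 1\},\ \dots\}$$
Parameters:
$$\pi_k := P(Y = k), \qquad \theta_{ijk} := P(X_i = j \mid Y = k)$$
Likelihood:
$$L(\{\pi_k, \theta_{ijk}\} \mid D) = P(D \mid \{\pi_k, \theta_{ijk}\}) = P(\{D_m\} \mid \{\pi_k, \theta_{ijk}\}) = \prod_m P(D_m \mid \{\pi_k, \theta_{ijk}\})$$
$$= \prod_m P(Y = y_m, \{X_i = x_{mi}\} \mid \{\pi_k, \theta_{ijk}\})$$
$$= \prod_m P(Y = y_m \mid \{\pi_k, \theta_{ijk}\})\, P(\{X_i = x_{mi}\} \mid Y = y_m, \{\pi_k, \theta_{ijk}\})$$
$$= \prod_m P(Y = y_m \mid \{\pi_k\})\, P(\{X_i = x_{mi}\} \mid Y = y_m, \{\theta_{ijk}\})$$
$$= \prod_m P(Y = y_m \mid \{\pi_k\}) \prod_i P(X_i = x_{mi} \mid Y = y_m, \{\theta_{ijk}\})$$
The last step uses conditional independence: $\langle X_i \perp X_j \mid Y \rangle$
[Figure: graphical model with $Y$ as the parent of $X_1, X_2, \dots, X_n$]
$$P(Y, \{X_i\}) = P(Y) \prod_{i=1}^{n} P(X_i \mid Y)$$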
Anti-spam filter
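With $\pi_k = P(Y = k)$ and $\theta_{ijk} = P(X_i = j \mid Y = k)$:
$$P(Y = y_m \mid \{\pi_k\}) = \prod_k \pi_k^{[Y_m = k]}$$
$$P(X_i = x_{mi} \mid Y = y_m, \{\theta_{ijk}\}) = \prod_j \prod_k \theta_{ijk}^{[X_{mi} = j][Y_m = k]}$$
Log-likelihood:
$$\ell(\{\pi_k, \theta_{ijk}\} \mid D) = \sum_m \log P(Y = y_m \mid \{\pi_k\}) + \sum_m \sum_i \log P(X_i = x_{mi} \mid Y = y_m, \{\theta_{ijk}\})$$
$$\ell(\{\pi_k, \theta_{ijk}\} \mid D) = \sum_m \sum_k [Y_m = k] \log \pi_k + \sum_m \sum_i \sum_j \sum_k [X_{mi} = j][Y_m = k] \log \theta_{ijk}$$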
Anti-spam filter
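The terms of the log-likelihood in $\{\pi_k\}$ and in $\{\theta_{ijk}\}$ can be maximized separately. For $\{\pi_k\}$, with the constraint $\sum_k \pi_k = 1$ and Lagrange multiplier $\lambda$:
$$\ell^*(\{\pi_k\} \mid D) := \sum_m \sum_k [Y_m = k] \log \pi_k + \lambda \Big(1 - \sum_k \pi_k\Big)$$
$$\frac{\partial \ell^*}{\partial \pi_k} = \frac{\sum_m [Y_m = k]}{\pi_k} - \lambda = \frac{N_{Y=k}}{\pi_k} - \lambda = 0 \implies \pi_k = \frac{N_{Y=k}}{\lambda}$$
$$\sum_k \pi_k = 1 \implies \sum_k \frac{N_{Y=k}}{\lambda} = 1 \implies \lambda = \sum_k N_{Y=k} = N_D$$
$$\pi^*_k = \frac{N_{Y=k}}{N_D}$$
(where $N_{Y=k}$ is the number of observations with $Y = k$ and $N_D$ is the total number of observations in $D$)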
Anti-spam filter
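For $\{\theta_{ijk}\}$, with the constraints $\sum_j \theta_{ijk} = 1$ for every $i, k$ and Lagrange multipliers $\lambda_{ik}$:
$$\ell^*(\{\theta_{ijk}\} \mid D) := \sum_m \sum_i \sum_j \sum_k [X_{mi} = j][Y_m = k] \log \theta_{ijk} + \sum_i \sum_k \lambda_{ik} \Big(1 - \sum_j \theta_{ijk}\Big)$$
$$\frac{\partial \ell^*}{\partial \theta_{ijk}} = \frac{\sum_m [X_{mi} = j][Y_m = k]}{\theta_{ijk}} - \lambda_{ik} = \frac{N_{X_i=j,\,Y=k}}{\theta_{ijk}} - \lambda_{ik} = 0 \implies \theta_{ijk} = \frac{N_{X_i=j,\,Y=k}}{\lambda_{ik}}$$
$$\sum_j \theta_{ijk} = 1 \implies \sum_j \frac{N_{X_i=j,\,Y=k}}{\lambda_{ik}} = 1 \implies \lambda_{ik} = \sum_j N_{X_i=j,\,Y=k} = N_{Y=k}$$
$$\theta^*_{ijk} = \frac{N_{X_i=j,\,Y=k}}{N_{Y=k}} \qquad \forall\, i, j, k$$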
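The maximum likelihood estimates for this model are ratios of counts: the class probabilities are N_{Y=k} / N_D and the conditional probabilities are N_{X_i=j, Y=k} / N_{Y=k}. A minimal sketch of the counting step follows; the function name, the data layout (a list of `(y, xs)` pairs) and the toy messages are assumptions made for illustration only.

```python
from collections import Counter

def fit_naive_bayes_ml(data):
    """ML estimates for a naive Bayes model with discrete features.

    data: list of (y, xs) pairs, where xs is a tuple of feature values.
    Returns (pi, theta) with
      pi[k]            = N_{Y=k} / N_D
      theta[(i, j, k)] = N_{X_i=j, Y=k} / N_{Y=k}
    Combinations (i, j, k) never seen in the data are absent from theta,
    which corresponds to an estimated probability of zero.
    """
    n_d = len(data)
    class_counts = Counter(y for y, _ in data)
    joint_counts = Counter()
    for y, xs in data:
        for i, x in enumerate(xs):
            joint_counts[(i, x, y)] += 1
    pi = {k: n / n_d for k, n in class_counts.items()}
    theta = {(i, j, k): n / class_counts[k]
             for (i, j, k), n in joint_counts.items()}
    return pi, theta

# Toy data (made up): y = 1 marks spam, each feature flags the presence of a word.
data = [
    (1, (1, 1)),
    (1, (1, 0)),
    (0, (0, 1)),
    (0, (0, 0)),
    (0, (1, 0)),
]
pi, theta = fit_naive_bayes_ml(data)
print(pi[1])             # N_{Y=1} / N_D = 2/5
print(theta[(0, 1, 1)])  # N_{X_0=1, Y=1} / N_{Y=1} = 2/2
```

Note that unseen combinations receive probability zero under maximum likelihood, so a single unseen feature value rules out a class entirely.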
Learning CPTs for a graphical model
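[Figure: graphical model over the binary variables T, F, A, S, L, R]
Conditional probability tables (CPTs); where shown, the entries are estimated from counts, with $N_D$ the number of observations in $D$.

T    P(T)
0    $N_{T=0} / N_D$
1    $N_{T=1} / N_D$

F    P(F)
0    $N_{F=0} / N_D$
1    $N_{F=1} / N_D$

T F A    P(A | T,F)
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

F S    P(S | F)
0 0    $N_{S=0,F=0} / N_{F=0}$
0 1    $N_{S=1,F=0} / N_{F=0}$
1 0    $N_{S=0,F=1} / N_{F=1}$
1 1    $N_{S=1,F=1} / N_{F=1}$

A L    P(L | A)
0 0    $N_{L=0,A=0} / N_{A=0}$
0 1    $N_{L=1,A=0} / N_{A=0}$
1 0    $N_{L=0,A=1} / N_{A=1}$
1 1    $N_{L=1,A=1} / N_{A=1}$

L R    P(R | L)
0 0
0 1
1 0
1 1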
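Estimating a CPT entry by counting, as in the ratios for P(F), P(S | F) and P(L | A), can be sketched with a small helper that works for any child variable and any list of parents. The function name, the record format (one dict per observation) and the four records are hypothetical.

```python
from collections import Counter

def estimate_cpt(records, child, parents):
    """ML estimate of P(child | parents) from fully observed records.

    records: list of dicts mapping variable names to observed values.
    Returns a dict keyed by (parent_values, child_value), where
    parent_values is a tuple ordered like `parents`.
    Each entry is N_{child, parents} / N_{parents}; with no parents
    the denominator is the total number of records N_D.
    """
    joint = Counter()
    parent_counts = Counter()
    for r in records:
        pa = tuple(r[p] for p in parents)
        joint[(pa, r[child])] += 1
        parent_counts[pa] += 1
    return {(pa, c): n / parent_counts[pa] for (pa, c), n in joint.items()}

# Made-up observations of F and S only.
records = [
    {"F": 0, "S": 0},
    {"F": 0, "S": 0},
    {"F": 0, "S": 1},
    {"F": 1, "S": 1},
]
p_f = estimate_cpt(records, "F", [])
p_s_given_f = estimate_cpt(records, "S", ["F"])
print(p_f[((), 0)])             # N_{F=0} / N_D = 3/4
print(p_s_given_f[((0,), 1)])   # N_{S=1,F=0} / N_{F=0} = 1/3
```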
Bayesian learning
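Posterior distribution of the parameters $\theta$ given the observations $D$:
$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)} = \frac{P(D \mid \theta)\, P(\theta)}{\int P(D \mid \theta)\, P(\theta)\, d\theta}$$
$$P(\theta \mid D) \propto P(D \mid \theta)\, P(\theta)$$
where $P(\theta)$ is the prior distribution of the parameters.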
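Beta distribution
Gamma function (*), for integer $n > 0$:
$$\Gamma(n) := (n - 1)!$$
Beta function, for $\alpha > 0$, $\beta > 0$:
$$B(\alpha, \beta) := \frac{\Gamma(\alpha)\, \Gamma(\beta)}{\Gamma(\alpha + \beta)} = \frac{(\alpha - 1)!\, (\beta - 1)!}{(\alpha + \beta - 1)!}$$
Beta distribution:
$$\mathrm{Beta}(x; \alpha, \beta) := \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}$$
[Figure: plots of Beta(x;1,1), Beta(x;2,2), Beta(x;10,10), Beta(x;2,3)]
Mode (for $\alpha, \beta > 1$):
$$\arg\max_x \mathrm{Beta}(x; \alpha, \beta) = \frac{\alpha - 1}{\alpha + \beta - 2}$$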
(*) In a finitary setting
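The Beta density can be evaluated directly from its definition through the Gamma function. The sketch below uses `math.gamma` from the Python standard library; the function names are illustrative, and the mode formula (α − 1)/(α + β − 2) is the standard one, valid for α, β > 1.

```python
import math

def beta_function(a, b):
    """B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_pdf(x, a, b):
    """Beta(x; a, b) = x^(a-1) * (1-x)^(b-1) / B(a, b), for 0 < x < 1."""
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

def beta_mode(a, b):
    """Location of the maximum of the density, valid for a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)

print(beta_pdf(0.3, 1, 1))  # Beta(x;1,1) is uniform on (0, 1): density 1
print(beta_pdf(0.5, 2, 2))  # 0.5 * 0.5 / (1/6) = 1.5
print(beta_mode(2, 3))      # 1/3
```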
Conjugate prior distributions
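Likelihood of a single observation (Bernoulli trial):
$$P(D_i \mid \theta) = \theta^{[X_i = 1]} (1 - \theta)^{[X_i = 0]}$$
Likelihood of the observations:
$$P(D \mid \theta) = P(\{D_i\} \mid \theta) = \prod_i P(D_i \mid \theta) = \theta^{\alpha_D} (1 - \theta)^{\beta_D}$$
where $\alpha_D$ is the number of observations with $X_i = 1$ and $\beta_D$ the number with $X_i = 0$.
With the prior $P(\theta) = \mathrm{Beta}(\theta; \alpha_P, \beta_P)$:
$$P(D \mid \theta)\, P(\theta) = \theta^{\alpha_D} (1 - \theta)^{\beta_D}\, \mathrm{Beta}(\theta; \alpha_P, \beta_P) = \frac{\theta^{\alpha_D + \alpha_P - 1} (1 - \theta)^{\beta_D + \beta_P - 1}}{B(\alpha_P, \beta_P)}$$
$$= \frac{B(\alpha_D + \alpha_P, \beta_D + \beta_P)}{B(\alpha_P, \beta_P)}\, \mathrm{Beta}(\theta; \alpha_D + \alpha_P, \beta_D + \beta_P)$$
$$P(\theta \mid D) \propto P(D \mid \theta)\, P(\theta) \propto \mathrm{Beta}(\theta; \alpha_D + \alpha_P, \beta_D + \beta_P)$$
The prior $P(\theta)$ and the posterior $P(\theta \mid D)$ belong to the same family: the Beta distribution is a conjugate prior for this likelihood.
Maximum a posteriori (MAP) estimate:
$$\theta^*_{MAP} := \arg\max_\theta \mathrm{Beta}(\theta; \alpha_D + \alpha_P, \beta_D + \beta_P) = \frac{\alpha_D + \alpha_P - 1}{\alpha_D + \alpha_P + \beta_D + \beta_P - 2}$$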
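With a Beta prior, the Bayesian update for Bernoulli observations amounts to adding the observed counts to the prior hyperparameters, and the MAP estimate is the mode of the resulting Beta posterior. A minimal sketch under those assumptions (the function names and the three tosses are made up; the mode formula requires both posterior parameters to exceed 1):

```python
def beta_posterior(alpha_p, beta_p, xs):
    """Posterior hyperparameters (alpha_D + alpha_P, beta_D + beta_P)."""
    alpha_d = sum(1 for x in xs if x == 1)  # number of observations with X = 1
    beta_d = sum(1 for x in xs if x == 0)   # number of observations with X = 0
    return alpha_d + alpha_p, beta_d + beta_p

def map_estimate(alpha_p, beta_p, xs):
    """Mode of the Beta posterior; assumes both posterior parameters exceed 1."""
    a, b = beta_posterior(alpha_p, beta_p, xs)
    return (a - 1) / (a + b - 2)

tosses = [1, 1, 0]  # made-up data: the ML estimate would be 2/3

print(beta_posterior(2, 2, tosses))  # Beta(2,2) prior gives a Beta(4,3) posterior
print(map_estimate(2, 2, tosses))    # (4 - 1) / (4 + 3 - 2) = 0.6
print(map_estimate(1, 1, tosses))    # uniform prior Beta(1,1): same as ML, 2/3
```

The prior hyperparameters act like counts of earlier, imaginary observations, which is why the estimate moves from 2/3 towards 1/2 under the Beta(2,2) prior.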
Anti-spam filter
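Maximum a posteriori (MAP) estimation:
$$\theta^*_{MAP} := \arg\max_\theta P(D \mid \theta)\, P(\theta)$$
With Beta priors and hyperparameters $\alpha_k, \beta_k, \alpha_{ijk}, \beta_{ijk}$:
$$\pi^*_k = \frac{N_{Y=k} + \alpha_k - 1}{N_D + \alpha_k + \beta_k - 2} \qquad \forall\, k$$
$$\theta^*_{ijk} = \frac{N_{X_i=j,\,Y=k} + \alpha_{ijk} - 1}{N_{Y=k} + \alpha_{ijk} + \beta_{ijk} - 2} \qquad \forall\, i, j, k$$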
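The MAP estimates for the anti-spam model can be sketched for binary labels and binary features as follows. This is not the author's implementation: the function names, the toy data, and the simplification of using one shared pair of hyperparameters (`alpha` as the prior pseudo-count for the value 1, `beta` for the value 0) instead of a separate pair per parameter are all assumptions.

```python
import math
from collections import Counter

def map_ratio(count, total, own, other):
    """(count + own - 1) / (total + own + other - 2): mode of a Beta posterior."""
    return (count + own - 1) / (total + own + other - 2)

def fit_naive_bayes_map(data, alpha=2.0, beta=2.0):
    """MAP estimates for binary Y and binary features X_i.

    data: list of (y, xs) pairs with y in {0, 1} and xs a tuple of 0/1 values.
    Assumption: alpha is the prior hyperparameter for the value 1, beta for 0.
    """
    n_d = len(data)
    n_features = len(data[0][1])
    class_counts = Counter(y for y, _ in data)
    joint = Counter()
    for y, xs in data:
        for i, x in enumerate(xs):
            joint[(i, x, y)] += 1

    def prior(v):
        # (own, other) hyperparameters for the value v
        return (alpha, beta) if v == 1 else (beta, alpha)

    pi = {k: map_ratio(class_counts[k], n_d, *prior(k)) for k in (0, 1)}
    theta = {(i, j, k): map_ratio(joint[(i, j, k)], class_counts[k], *prior(j))
             for i in range(n_features) for j in (0, 1) for k in (0, 1)}
    return pi, theta

def predict(pi, theta, xs):
    """Class k maximizing log pi_k + sum_i log theta_{i, x_i, k}."""
    def score(k):
        return math.log(pi[k]) + sum(math.log(theta[(i, x, k)])
                                     for i, x in enumerate(xs))
    return max((0, 1), key=score)

# Toy data (made up): y = 1 marks spam.
data = [(1, (1, 1)), (1, (1, 0)), (0, (0, 1)), (0, (0, 0)), (0, (1, 0))]
pi, theta = fit_naive_bayes_map(data)
print(theta[(0, 0, 1)])          # (0 + 2 - 1) / (2 + 2 + 2 - 2) = 0.25, not zero
print(predict(pi, theta, (1, 1)), predict(pi, theta, (0, 0)))
```

Unlike the maximum likelihood estimates, no parameter is estimated as zero here, so the logarithms in `predict` are always defined.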