Review of HW2
Liyu Chen
When , to compute , we consider the following three cases:
1.
2.
3.
=
When , the derivative is simply 0.
Combining both cases, we get
By , we can store instead of
Make a prediction:
Update weights:
Computation: Loss:
Parameters:
Chain rule:
Chain rule:
Chain rule:
Note: should be
Can also write it as a -dimensional vector