41
deeplearning.ai Sequence to sequence models Basic models

Sequence to sequence models - Deep LearningAndrew Ng Sequence to sequence model Jane visite l’Afrique en septembre Jane is visiting Africa in September.!"#$ %"&$ %"'($ ⋯ [Cho et

  • Upload
    others

  • View
    30

  • Download
    0

Embed Size (px)

Citation preview

deeplearning.ai

Sequencetosequencemodels

Basicmodels

AndrewNg

Sequence to sequence modelJane visite l’Afrique en septembre

Jane is visiting Africa in September.

!"#$

%"&$ %"'($

[Cho et al., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation]

%"&$ %"*$ %"+$ %",$ %"-$

."&$ ."*$ ."+$ .",$ ."-$ ."/$

[Sutskever et al., 2014. Sequence to sequence learning with neural networks]

AndrewNg

Image captioningA cat sitting on a chair."&$ ."*$ ."+$ .",$ ."-$

55×55 ×96 27×27 ×96 27×27 ×256 13×13 ×256

11× 11s = 4

3× 3s = 2

MAX-POOL

5× 5same

3× 3s = 2

MAX-POOL

13×13 ×384

3× 3same

3× 3

=

13×13 ×384 13×13 ×256 6×6 ×256

3× 3 3× 3s = 2

MAX-POOL

9216

Softmax1000

4096

4096

[Mao et. al., 2014. Deep captioning with multimodal recurrent neural networks][Vinyals et. al., 2014. Show and tell: Neural image caption generator][Karpathy and Li, 2015. Deep visual-semantic alignments for generating image descriptions]

."/$

.9"':$

%

.9"&$ .9"*$

deeplearning.ai

Sequencetosequencemodels

Pickingthemostlikelysentence

AndrewNg

Machine translation as building a conditional language model

Language model:

Machine translation: !"#$

%"&$

'("&$

%"*+$

'("*,$

⋯⋯

!"#$

%"&$

'("&$ '(".$ '("*,$

⋯%".$

AndrewNg

Finding the most likely translation

Jane visite l’Afrique en septembre. /('"&$, … , '"*,$|%)

Jane is visiting Africa in September.

Jane is going to be visiting Africa in September.

In September, Jane will visit Africa.

Her African friend welcomed Jane in September.

argmax:;<=,…,:;>,=

/('"&$,… , '"*,$|%)

AndrewNg

Why not a greedy search?

Jane is visiting Africa in September.

Jane is going to be visiting Africa in September.

!"#$

%"&$

'("&$

%"*+$

'("*,$

⋯⋯

deeplearning.ai

Sequencetosequencemodels

Beamsearch

AndrewNg

Beam search algorithm

!"#$

%"&$

'("&$

%"*+$

a

in

jane

september

zulu

10000

0('"&$|%)Step 1

AndrewNg

Beam search algorithm

a

in

jane

september

zulu

Step 1

10000

!"#$

%"&$ %"*+$

!"#$

%"&$ %"*+$

!"#$

%"&$ %"*+$

Step 2

AndrewNg

Beam search (4 = 3)in september

jane is

jane visits

!"#$

%"&$ %"*+$

'("7$

⋯septemberin

0('"&$, '"9$|%) jane visits africa in september. <EOS>

!"#$

%"&$ %"*+$

'("7$

⋯isjane

!"#$

%"&$ %"*+$

'("7$

⋯visitsjane

deeplearning.ai

Sequencetosequencemodels

Refinementstobeamsearch

AndrewNg

Length normalization

argmax'

() *+,- ., *+0-, … , *+,20-)45

,60

argmax'

7 log) *+,- ., *+0-, … , *+,20-)45

,60

7 log) *+,- ., *+0-, … , *+,20-)45

,60

AndrewNg

Beam search discussion

Beam width B?

Unlike exact search algorithms like BFS (Breadth First Search) or DFS (Depth First Search), Beam Search runs faster but is not guaranteed to find exact maximum forargmax

')(*|.).

deeplearning.ai

Sequencetosequencemodels

Erroranalysisonbeamsearch

AndrewNg

ExampleJane visite l’Afrique en septembre.

Human: Jane visits Africa in September.

Algorithm: Jane visited Africa last September.

!"#$

%"&$ %"'($⋯

AndrewNg

Error analysis on beam searchHuman: Jane visits Africa in September. (+∗)Algorithm: Jane visited Africa last September. (+.)Case 1:

Beam search chose +.. But +∗ attains higher / + % .Conclusion: Beam search is at fault.

Case 2:

+∗is a better translation than +.. But RNN predicted / +∗ % < / +. % .Conclusion: RNN model is at fault.

AndrewNg

Error analysis process

Jane visits Africa in September.

Jane visited Africa last September.

Human Algorithm / +∗ % / +. % At fault?

Figures out what faction of errors are “due to” beam search vs. RNN model

deeplearning.ai

Sequencetosequencemodels

Bleuscore(optional)

AndrewNg

Evaluating machine translation

French: Le chat est sur le tapis.

Reference 1: The cat is on the mat.

Reference 2: There is a cat on the mat.

MT output: the the the the the the the.

Precision: Modified precision:

[Papineni et. al., 2002. Bleu: A method for automatic evaluation of machine translation]

AndrewNg

Bleu score on bigramsExample: Reference 1: The cat is on the mat.

Reference 2: There is a cat on the mat.

MT output: The cat the cat on the mat.

the cat

cat the

cat on

on the

the mat

[Papineni et. al., 2002. Bleu: A method for automatic evaluation of machine translation]

AndrewNg

Bleu score on unigramsExample: Reference 1: The cat is on the mat.

Reference 2: There is a cat on the mat.

MT output: The cat the cat on the mat.

[Papineni et. al., 2002. Bleu: A method for automatic evaluation of machine translation]

!" =

%

&'()*+,∈./

0123456(7

(239:;<=)

%

&'()*+,∈./

01234

(239:;<=)

!' =

%

')*+,∈./

0123456(7

(3:;<=)

%

')*+,∈./

01234

(3:;<=)

AndrewNg

Bleu details

!' = Bleu score on n-grams only

Combined Bleu score:

BP =1 if MT_output_length > reference_output_length

exp(1 − MT_output_length/reference_output_length) otherwise

[Papineni et. al., 2002. Bleu: A method for automatic evaluation of machine translation]

deeplearning.ai

Sequencetosequencemodels

Attentionmodelintuition

AndrewNg

The problem of long sequences

Jane s'est rendue en Afrique en septembre dernier, a apprécié la culture et a rencontré beaucoup de gens merveilleux; elle est revenue en parlant comment son voyage était merveilleux, et elle me tente d'y aller aussi.

Jane went to Africa last September, and enjoyed the culture and met many wonderful people; she came back raving about how wonderful her trip was, and is tempting me to go too.

10 20 30 40 50 Sentence length

Bleu score

!"#$

%"&$

'("&$

%"*+$

'("*,$

⋯⋯

AndrewNg

Attention model intuition

%"&$ %".$ %"/$ %"0$%"1$

jane visite l’Afrique en septembre

[Bahdanau et. al., 2014. Neural machine translation by jointly learning to align and translate]

'("&$ '(".$ '("0$'("/$ '("1$

!"#$

deeplearning.ai

Sequencetosequencemodels

Attentionmodel

AndrewNg

Attention model

!"#$ !"%$ !"&$ !"'$!"($jane visite l’Afrique en septembre

[Bahdanau et. al., 2014. Neural machine translation by jointly learning to align and translate]

)"*$

AndrewNg[Bahdanau et. al., 2014. Neural machine translation by jointly learning to align and translate]

Computing attention +",,,.$

[Xu et. al., 2015. Show, attend and tell: Neural image caption generation with visual attention]

+",,,.$ = amount of attention /",$ should pay to )",.$

+",,,.$ = 123(567,7.8)

∑7.;<=> 123(567,7.8)

?",@#$

)",.$A",,,.$

+

/C",@#$ /C",$

?",@#$ ?",$

)"*$

!"#$

!"%$ !"E>$!"E>@#$

AndrewNg

Attention examples

July 20th 1969 1969 − 07 − 20

23April, 1564 1564 − 04 − 23

Visualization of +",,,.$:

deeplearning.ai

Audiodata

Speechrecognition

AndrewNg

Speech recognition problem

!audio clip

#transcript

“the quick brown fox”

AndrewNg

Attention model for speech recognition

+

“h”

%&'( %&)(

“T”

+&,(

!&'(⋯

!&)( !&',,,(!&---(⋯

AndrewNg

CTC cost for speech recognition

[Graves et al., 2006. Connectionist Temporal Classification: Labeling unsegmented sequence data with recurrent neural networks]

(Connectionist temporal classification)

“the quick brown fox”

Basic rule: collapse repeated characters not separated by “blank”

+&,(

!&'(

#.&',,,(

⋯!&)( !&',,,(

#.&'( #.&)(

deeplearning.ai

Audiodata

Triggerworddetection

AndrewNg

What is trigger word detection?

Amazon Echo(Alexa)

Baidu DuerOS(xiaodunihao)

Apple Siri(Hey Siri)

Google Home(Okay Google)

AndrewNg

Trigger word detection algorithm

!"#$

%"&$ %"'$ %"($

deeplearning.ai

Conclusion

Summaryandthankyou

AndrewNg

Specialization outline

1. Neural Networks and Deep Learning

2. Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

3. Structuring Machine Learning Projects

4. Convolutional Neural Networks

5. Sequence Models

AndrewNg

Deep learning is a super power

Please buy this from shutterstockand replace in final video.

AndrewNg

-Andrew NgThank you.