Deep learning
Deep dual learning

Hamid Beigy
Sharif University of Technology
December 25, 2018

Table of contents

1 Introduction

2 Dual learning

3 Course overview

Introduction

Machine translation

1 How do we translate from a source language to a destination language?

2 Main problems

  • How do we translate words from the source language to the destination language?
  • How do we order words in the destination language?
  • How do we measure the goodness of a translation?
  • What type of corpus is needed (monolingual or bilingual)?
  • How do we build a sequence of translators (Persian → English → French)?

Neural machine translation (NMT)

1 In NMT [1], recurrent neural networks built from LSTM or GRU units are used.

[1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015.

Neural machine translation (NMT)

1 A critical disadvantage of the fixed-length context vector design is its inability to remember long sentences.

2 The attention mechanism was proposed to help the model memorize long source sentences in NMT.

3 Another critical disadvantage of this model is its training set: it needs a large bilingual corpus.

4 Dual learning was introduced to overcome the need for a large bilingual corpus.
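
The attention idea in item 2 can be illustrated with a minimal sketch of additive attention for a single decoder step. Instead of one fixed-length vector, the decoder scores every encoder state and takes a weighted average. The parameter names (`W_s`, `W_h`, `v`) and all numbers are illustrative assumptions, not trained values or values from the slides.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Return (context, weights) for one decoder step.

    score_j = v . tanh(W_s s + W_h h_j), weights = softmax(scores),
    context = sum_j weights_j * h_j.
    """
    proj_s = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        proj_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(proj_s, proj_h)]
        scores.append(dot(v, hidden))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


# Toy, hand-picked numbers (hypothetical, not trained parameters).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one vector per source word
decoder_state = [0.5, -0.5]
W_s = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

context, weights = additive_attention(decoder_state, encoder_states, W_s, W_h, v)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

Because the context vector is recomputed at every decoder step, its size no longer limits how much of a long source sentence the decoder can consult.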

Dual learning

Dual learning (Introduction)

1 Dual learning is an autoencoder-like mechanism for utilizing monolingual datasets [2].

[2] Y. Xia, D. He, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. Dual learning for machine translation. NIPS, 2016.

Dual learning (Definition)

1 Let us define

  • $D_A$: corpus of language A.
  • $D_B$: corpus of language B.
  • $P(\cdot \mid s; \theta_{AB})$: translation model from A to B.
  • $P(\cdot \mid s; \theta_{BA})$: translation model from B to A.
  • $LM_A(\cdot)$: learned language model of A.
  • $LM_B(\cdot)$: learned language model of B.

Dual learning (Algorithm)

1 We have

2 Generate $K$ translated sentences $s_{mid,1}, s_{mid,2}, \ldots, s_{mid,K}$ from $P(\cdot \mid s; \theta_{AB})$

3 Compute intermediate rewards $r_{1,1}, r_{1,2}, \ldots, r_{1,K}$, one for each sentence, as $r_{1,k} = LM_B(s_{mid,k})$
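
Steps 2 and 3 can be sketched with toy stand-ins. Everything here is an illustrative assumption: the translation model is a fixed candidate table (`CANDIDATES`) rather than an NMT system, and `LM_B` is taken to be the average log-probability per token under an add-one-smoothed bigram model, which is only one possible choice of language-model score.

```python
import math
import random
from collections import Counter

# Hypothetical stand-in for sampling from P(. | s; theta_AB): a fixed table of
# candidate translations with probabilities. A real system would use an NMT model.
CANDIDATES = {
    "le chat dort": [
        ("the cat sleeps", 0.6),
        ("cat the sleeps", 0.3),
        ("sleeps cat the", 0.1),
    ],
}


def sample_translations(s, k, rng):
    """Draw k intermediate translations s_mid_1..s_mid_k for source sentence s."""
    sentences, probs = zip(*CANDIDATES[s])
    return rng.choices(sentences, weights=probs, k=k)


def train_bigram_lm(corpus):
    """Fit an add-one-smoothed bigram model on a monolingual corpus of language B."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks[:-1], toks[1:]))
    # Word types plus the end marker plus one slot for unseen tokens.
    vocab_size = len({tok for sent in corpus for tok in sent.split()}) + 2

    def lm(sentence):
        toks = ["<s>"] + sentence.split() + ["</s>"]
        logp = 0.0
        for prev, cur in zip(toks[:-1], toks[1:]):
            logp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        return logp / (len(toks) - 1)  # average log-probability (an assumption)

    return lm


rng = random.Random(0)  # fixed seed so the sketch is repeatable
D_B = ["the cat sleeps", "the dog sleeps", "the cat eats"]  # toy monolingual corpus
lm_b = train_bigram_lm(D_B)

s = "le chat dort"
K = 4
s_mid = sample_translations(s, K, rng)   # step 2
r1 = [lm_b(m) for m in s_mid]            # step 3: r_{1,k} = LM_B(s_mid_k)

for m, r in zip(s_mid, r1):
    print(f"{m!r}: r1 = {r:.3f}")
```

The language-model reward needs only monolingual data in language B: a fluent candidate scores higher than a scrambled one even though no reference translation is available.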

Dual learning (Algorithm)

1 We have

2 Compute communication rewards $r_{2,1}, r_{2,2}, \ldots, r_{2,K}$, one for each sentence, as $r_{2,k} = \ln P(s \mid s_{mid,k}; \theta_{BA})$

3 Set the total reward of the $k$th sentence as $r_k = \alpha r_{1,k} + (1 - \alpha) r_{2,k}$
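
The reward combination in step 3 is simple arithmetic. The sketch below uses made-up reward values for $K = 3$ samples, and treating $\alpha$ as a weight in $[0, 1]$ is an assumption of this example.

```python
import math


def total_rewards(r1, r2, alpha):
    """Combine per-sentence rewards: r_k = alpha * r1_k + (1 - alpha) * r2_k."""
    if not 0.0 <= alpha <= 1.0:  # assumption: alpha is a convex-combination weight
        raise ValueError("alpha must lie in [0, 1]")
    if len(r1) != len(r2):
        raise ValueError("r1 and r2 must have one entry per sampled sentence")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(r1, r2)]


# Hypothetical values for K = 3 sampled translations.
r1 = [-1.2, -3.5, -2.0]              # language-model rewards LM_B(s_mid_k)
p_reconstruct = [0.30, 0.05, 0.10]   # made-up P(s | s_mid_k; theta_BA)
r2 = [math.log(p) for p in p_reconstruct]  # communication rewards

alpha = 0.5
r = total_rewards(r1, r2, alpha)
for k, value in enumerate(r, start=1):
    print(f"r_{k} = {value:.3f}")
```

A larger `alpha` favours fluent intermediate sentences, while a smaller one favours intermediate sentences from which the original sentence can be reconstructed.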

Dual learning (Algorithm)

1 We have

2 Compute the stochastic gradients with respect to $\theta_{AB}$ and $\theta_{BA}$

3 Update the model parameters $\theta_{AB}$ and $\theta_{BA}$
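
The two steps above can be written out explicitly. Since the intermediate sentences are sampled from the A-to-B model, the gradient of the expected reward with respect to $\theta_{AB}$ takes the likelihood-ratio (policy gradient) form, while $\theta_{BA}$ enters the reward only through the reconstruction term. The following is a sketch under those assumptions. The learning rates $\gamma_1$ and $\gamma_2$ and the estimator notation $\hat{E}[r]$ are symbols introduced here, not taken from the slides.

```latex
\begin{align}
\nabla_{\theta_{AB}} \hat{E}[r]
  &= \frac{1}{K} \sum_{k=1}^{K} r_k \,
     \nabla_{\theta_{AB}} \ln P(s_{mid,k} \mid s; \theta_{AB}), \\
\nabla_{\theta_{BA}} \hat{E}[r]
  &= \frac{1}{K} \sum_{k=1}^{K} (1 - \alpha) \,
     \nabla_{\theta_{BA}} \ln P(s \mid s_{mid,k}; \theta_{BA}), \\
\theta_{AB} &\leftarrow \theta_{AB} + \gamma_1 \nabla_{\theta_{AB}} \hat{E}[r], \\
\theta_{BA} &\leftarrow \theta_{BA} + \gamma_2 \nabla_{\theta_{BA}} \hat{E}[r].
\end{align}
```

Both updates are gradient ascent steps, because the reward is being maximized.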

Experimental results

Some extensions

1 Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, Tie-Yan Liu, Dual Supervised Learning, ICML 2017.

2 Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Guiquan Liu, and Tie-Yan Liu, Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization, AAAI 2018.

Course overview

Covered Topics

1 Introduction to neural networks

2 Feed-forward networks

3 Convolutional neural networks

4 Recurrent neural networks & attention models

5 Autoencoders & word2vec

6 Generative models

7 Variational autoencoders

8 Sum-product networks

9 Deep reinforcement learning

10 Dual learning

11 Other topics

Other topics

1 Other models such as RBM

2 Neural Turing machines

3 Probabilistic modeling of deep networks

4 Generalization bounds of deep networks
