Deep learning - Deep dual learningce.sharif.edu/courses/97-98/1/ce959-1/resources/root/slides/Lect-30.… · Deep learning j Dual learning Dual learning (Introduction) 1 Dual learning

Deep learning

Deep learningDeep dual learning

Hamid Beigy

Sharif university of technology

December 25, 2018

Hamid Beigy | Sharif university of technology | December 25, 2018 1 / 14

Deep learning

Table of contents

1 Introduction

2 Dual learning

3 Course overview


Deep learning | Introduction

Introduction



Machine translation

1 How translate from a source language to a destination language?

2 Main problems

How translate words from the source language to the destinationlanguage?How order words in the destination language?How measure goodness of translation?What type of corpus is needed? (monolingual or bilingual)How build a sequence of translators? (Persian → English → French)



Neural machine translation (NMT)

1 In NMT1, recurrent neural networks such as LSTM or GRU units areused.

1Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. ”Neural machinetranslation by jointly learning to align and translate.” ICLR 2015.



Neural machine translation (NMT)

1 A critical disadvantage of this fixed-length context vector design isincapability of remembering long sentences.

2 The attention mechanism was proposed to help memorize long sourcesentences in NMT

3 Another critical disadvantage of this model is training set. We need alarge bilingual corpus.

4 Dual learning was introduced to overcome the need for a largebilingual corpus.


Deep learning | Dual learning

Dual learning



Dual learning (Introduction)

1 Dual learning is a auto-encoder like mechanism to utilize themonolingual datasets 2.

2Y. Xia, D. He, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. Dual learning formachine translation. NIPS, 2016.



Dual learning (Definition)

1 Let us to define

DA Corpus of language A.DB Corpus of language B.P(.|s, θAB) translation model from A to B.P(.|s, θBA) translation model from B to A.LMA(.) learned language model of A.LMB(.) learned language model of B.



Dual learning (Algorithm)

1 We have

2 Generate K translated sentencessmid ,1, smid ,2, . . . , smid ,K

from P(.|s, θAB)3 Compute intermediate rewards

r1,1, r1,2, . . . , s1,Kfrom LMB(smid ,K ) for each sentence asr1,k = LMB(smid ,k)




1 We have

2 Compute communication rewards r2,1, r2,2, . . . , s2,Kfor each sentence as r2,k = lnP(s|smid , ; θBA)

3 Set the total reward of kth sentence asrk = αr1,k + (1− α)r2,k




1 We have

2 Compute the stochastic gradient of θAB and θBA

3 Update the mode parameters θAB and θBA



Experimental results



Some extensions

1 Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, Tie-YanLiu, Dual Supervised Learning, ICML 2017.

2 Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Guiquan Liuand Tie-Yan Liu, Dual Transfer Learning for Neural MachineTranslation with Marginal Distribution Regularization, AAAI 2018.


Deep learning | Course overview

Course overview



Covered Topics

1 Introduction to neural networks.

2 Feed-forward networks.

3 Convolutional neural networks

4 Recurrent neural networks & attention models

5 Autoencoders & word2vec

6 Generative models

7 Variational autoencoders

8 Sum-product networks

9 Deep reinforcement learnings

10 Dual learning

11 Other topics



Other topics

1 Other models such as RBM

2 Neural Turing machines

3 Probabilistic modeling of Deep networks

4 Generalization bounds of Deep networks


Documents

Deep learning - Deep dual learningce.sharif.edu/courses/97-98/1/ce959-1/resources/root/slides/Lect-30.… · Deep learning j Dual learning Dual learning (Introduction) 1 Dual learning