Upload
others
View
54
Download
0
Embed Size (px)
Citation preview
Deep learning
Deep learningDeep dual learning
Hamid Beigy
Sharif university of technology
December 25, 2018
Hamid Beigy | Sharif university of technology | December 25, 2018 1 / 14
Deep learning
Table of contents
1 Introduction
2 Dual learning
3 Course overview
Hamid Beigy | Sharif university of technology | December 25, 2018 2 / 14
Deep learning | Introduction
Introduction
Hamid Beigy | Sharif university of technology | December 25, 2018 2 / 14
Deep learning | Introduction
Machine translation
1 How translate from a source language to a destination language?
2 Main problems
How translate words from the source language to the destinationlanguage?How order words in the destination language?How measure goodness of translation?What type of corpus is needed? (monolingual or bilingual)How build a sequence of translators? (Persian → English → French)
Hamid Beigy | Sharif university of technology | December 25, 2018 3 / 14
Deep learning | Introduction
Neural machine translation (NMT)
1 In NMT1, recurrent neural networks such as LSTM or GRU units areused.
1Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. ”Neural machinetranslation by jointly learning to align and translate.” ICLR 2015.
Hamid Beigy | Sharif university of technology | December 25, 2018 4 / 14
Deep learning | Introduction
Neural machine translation (NMT)
1 A critical disadvantage of this fixed-length context vector design isincapability of remembering long sentences.
2 The attention mechanism was proposed to help memorize long sourcesentences in NMT
3 Another critical disadvantage of this model is training set. We need alarge bilingual corpus.
4 Dual learning was introduced to overcome the need for a largebilingual corpus.
Hamid Beigy | Sharif university of technology | December 25, 2018 5 / 14
Deep learning | Dual learning
Dual learning
Hamid Beigy | Sharif university of technology | December 25, 2018 5 / 14
Deep learning | Dual learning
Dual learning (Introduction)
1 Dual learning is a auto-encoder like mechanism to utilize themonolingual datasets 2.
2Y. Xia, D. He, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. Dual learning formachine translation. NIPS, 2016.
Hamid Beigy | Sharif university of technology | December 25, 2018 6 / 14
Deep learning | Dual learning
Dual learning (Definition)
1 Let us to define
DA Corpus of language A.DB Corpus of language B.P(.|s, θAB) translation model from A to B.P(.|s, θBA) translation model from B to A.LMA(.) learned language model of A.LMB(.) learned language model of B.
Hamid Beigy | Sharif university of technology | December 25, 2018 7 / 14
Deep learning | Dual learning
Dual learning (Algorithm)
1 We have
2 Generate K translated sentencessmid ,1, smid ,2, . . . , smid ,K
from P(.|s, θAB)3 Compute intermediate rewards
r1,1, r1,2, . . . , s1,Kfrom LMB(smid ,K ) for each sentence asr1,k = LMB(smid ,k)
Hamid Beigy | Sharif university of technology | December 25, 2018 8 / 14
Deep learning | Dual learning
Dual learning (Algorithm)
1 We have
2 Compute communication rewards r2,1, r2,2, . . . , s2,Kfor each sentence as r2,k = lnP(s|smid , ; θBA)
3 Set the total reward of kth sentence asrk = αr1,k + (1− α)r2,k
Hamid Beigy | Sharif university of technology | December 25, 2018 9 / 14
Deep learning | Dual learning
Dual learning (Algorithm)
1 We have
2 Compute the stochastic gradient of θAB and θBA
3 Update the mode parameters θAB and θBA
Hamid Beigy | Sharif university of technology | December 25, 2018 10 / 14
Deep learning | Dual learning
Experimental results
Hamid Beigy | Sharif university of technology | December 25, 2018 11 / 14
Deep learning | Dual learning
Some extensions
1 Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, Tie-YanLiu, Dual Supervised Learning, ICML 2017.
2 Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Guiquan Liuand Tie-Yan Liu, Dual Transfer Learning for Neural MachineTranslation with Marginal Distribution Regularization, AAAI 2018.
Hamid Beigy | Sharif university of technology | December 25, 2018 12 / 14
Deep learning | Course overview
Course overview
Hamid Beigy | Sharif university of technology | December 25, 2018 12 / 14
Deep learning | Course overview
Covered Topics
1 Introduction to neural networks.
2 Feed-forward networks.
3 Convolutional neural networks
4 Recurrent neural networks & attention models
5 Autoencoders & word2vec
6 Generative models
7 Variational autoencoders
8 Sum-product networks
9 Deep reinforcement learnings
10 Dual learning
11 Other topics
Hamid Beigy | Sharif university of technology | December 25, 2018 13 / 14
Deep learning | Course overview
Other topics
1 Other models such as RBM
2 Neural Turing machines
3 Probabilistic modeling of Deep networks
4 Generalization bounds of Deep networks
Hamid Beigy | Sharif university of technology | December 25, 2018 14 / 14