NLP using Transformer Architectures
TF World 2019
Aurélien Géron, ML Consultant (@aureliengeron)
❖ NLP Tasks and Datasets
❖ The Transformer Architecture
❖ Recent Language Models
Natural Language Processing
Sentiment Analysis
POSITIVE ?
LABEL = POSITIVE
● Recommender Systems
● Market Sentiment
Text Classification
!pip install tensorflow-datasets # or tfds-nightly
import tensorflow_datasets as tfds
datasets = tfds.load("imdb_reviews")
TensorFlow Datasets
datasets = tfds.load("imdb_reviews")
train_set = datasets["train"]  # 25,000 reviews
test_set = datasets["test"]    # 25,000 reviews
TensorFlow Datasets
train_set, test_set, valid_set = tfds.load(
    "imdb_reviews:1.0.0",  # pin the dataset version
    split=["train", "test[:60%]", "test[60%:]"],  # slice splits to carve out a validation set
    as_supervised=True)  # yield (input, label) pairs
for review, label in train_set.take(2):
    print(review.numpy().decode("utf-8"))
    print(label.numpy())

This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. [...]
0
This is the kind of film for a snowy Sunday afternoon when the rest of the world can go ahead with its own [...] A family film in every sense and one that deserves the praise it received.
1
tensorflow.org/datasets/catalog
"Das Leben ist wunderbar."
"Life is wonderful."
Translation
"Das Leben ist wunderbar."
"Life is wonderful."
Translation
Workshop on Machine Translation (WMT)
datasets = tfds.load("wmt19_translate/de-en") # 10 GB!
Summarization
"Popular YouTubers raise 10 million dollars in 5 days to plant 10 million trees, in an effort to fight global warming."
datasets = tfds.load("multi_news")
Predict the summary from the news articles.
Question Answering
Context: Southern California, often abbreviated SoCal, is a geographic and cultural region that generally comprises California's southernmost 10 counties. The region is traditionally described as "eight counties", based on demographics and [...]
Question: What is Southern California often abbreviated as?
Answer: SoCal

Stanford Question Answering Dataset (SQuAD)
datasets = tfds.load("squad")
Semantic Equivalence
Do these two sentences mean the same thing?
"Alice could not find her keys."
"Alice lost her keys."

Microsoft Research Paraphrase Corpus (MRPC)
datasets = tfds.load("glue/mrpc")
Entailment
Premise: "Alice lost her keys, but Betty found them and returned them to her."
● "Alice got her keys back." → Entails
● "Alice lost her keys forever." → Contradicts
● "Alice and Betty are sisters." → Neutral

Stanford Natural Language Inference (SNLI) dataset
datasets = tfds.load("snli/plain_text")
Coreference Resolution
"Alice lost her keys, but Betty found them and returned them to her."
Which words ("her", "them") refer to Alice, to Betty, or to the keys?

Gendered Ambiguous Pronouns (GAP) dataset
datasets = tfds.load("gap")
Information Extraction
[Knowledge-graph figure: entities such as Alice, Bob, Charlie, Eve and London, linked by relations such as "born in", "married to" and "friends"]
Speech to Text
Text to Speech
Optical Character Recognition
Metrics
Source: “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer” by Raffel, Shazeer, Roberts, Lee et al., https://arxiv.org/abs/1910.10683
Transformer Architecture
Attention Is All You Need
Source: Figure 1 from the paper "Attention Is All You Need" by Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser and Polosukhin, https://arxiv.org/abs/1706.03762
BERT (Oct 2018)
Transformer-XL (Jan 2019)
GPT-2 (Feb 2019)
XLNet (Jun 2019)
RoBERTa (Jul 2019)
T5 (Oct 2019)
LSTM Encoder/Decoder
“I drink milk”
BiLSTM Encoder/Decoder
“I drink milk”
Attention Mechanisms (Sep 2014)

Encoder/Decoder with Attention
"I drink milk" → at each decoding step (e.g., when outputting "bois" or "lait"), the decoder attends to the most relevant encoder outputs.
Attention Mechanism
[Diagram: the encoder outputs for "I", "drink" and "milk" carry features such as Verb/Non-verb and Food-related/Not food-related. To translate "Je bois le <?>", the decoder's query is matched against these outputs, yielding attention weights such as 0.9 for "milk" and 0.05 each for "I" and "drink"; the weighted sum becomes the input to the next step.]
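The weighted-sum mechanism in the diagram can be written out in a few lines of plain Python. This is a minimal illustration, not the talk's code; the two-feature vectors (is_verb, is_food_related) and their values are made up for the example:

```python
import math

def softmax(scores):
    """Normalize scores into attention weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query.

    The output is a weighted sum of the values, where each weight
    reflects how well the query matches the corresponding key.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return weights, context

# Toy encoder outputs for "I", "drink", "milk":
# features are [is_verb, is_food_related] (made-up values).
keys = values = [[0.0, 0.1],   # I
                 [1.0, 0.2],   # drink
                 [0.0, 1.0]]   # milk
# The decoder's query is looking for a food-related word.
query = [0.0, 3.0]
weights, context = attention(query, keys, values)
print([round(w, 2) for w in weights])  # "milk" gets the largest weight
```

With this query, "milk" dominates the weights, so the context vector fed to the next step is mostly "milk"'s value vector, just as in the 0.9/0.05/0.05 example above.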
“I drink milk”
Encoder/Decoder with Attentionlait
“I drink milk”
Attention Is All You Needlait
Attention Is All You Need
"The bat was sleeping": is "bat" an animal or a club? A self-attention layer lets each word attend to every other word in the sentence, so context can resolve the ambiguity.
Figure 1 from the paper (simplified)
[Diagram: a stack of six encoders (Encoder 1 through Encoder 6) feeding a stack of six decoders (Decoder 1 through Decoder 6)]
Attention Is All You Need
Training: the encoder reads "The bat was sleeping" while the decoder receives the shifted target "<sos> La chauve souris dort" and learns to predict "La chauve souris dort <eos>", all positions at once.
Inference: decoding is autoregressive. Start from "<sos>" and the model outputs "La"; feed "<sos> La" back in to get "chauve"; and so on, one token per step, until it emits "La chauve souris dort <eos>".
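The step-by-step decoding loop above can be sketched in Python. Here `next_token` is a hypothetical stand-in for the trained Transformer (just a lookup table), used only to show the autoregressive feedback of generated tokens:

```python
def next_token(source, decoded):
    """Stand-in for a trained Transformer: maps the source sentence and
    the tokens decoded so far to the next output token (hypothetical)."""
    table = {
        "<sos>": "La",
        "<sos> La": "chauve",
        "<sos> La chauve": "souris",
        "<sos> La chauve souris": "dort",
        "<sos> La chauve souris dort": "<eos>",
    }
    return table[" ".join(decoded)]

def greedy_decode(source, max_len=10):
    """Autoregressive decoding: feed the tokens generated so far back
    into the decoder until it emits <eos>."""
    decoded = ["<sos>"]
    for _ in range(max_len):
        token = next_token(source, decoded)
        decoded.append(token)
        if token == "<eos>":
            break
    return " ".join(decoded[1:])

print(greedy_decode("The bat was sleeping"))  # La chauve souris dort <eos>
```

Note the train/inference asymmetry: training sees the whole shifted target at once, while inference must loop one token at a time.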
Multi-Head Attention
[Diagram: the attention mechanism above replicated across several heads; each head projects the queries, keys and values differently (e.g., one head attending to Verb/Non-verb features, another to Food-related/Not food-related ones), and the heads' outputs are combined.]
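A minimal sketch of the multi-head idea, with made-up feature vectors and the per-head "projections" reduced to element-wise masks; a real Transformer uses learned linear projection matrices for each head:

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def head(query, keys, values):
    """One attention head: weighted sum of values by query/key match."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

def multi_head(query, keys, values, projections):
    """Each head sees the query through its own projection, so different
    heads can focus on different features; the heads' outputs are
    concatenated."""
    output = []
    for proj in projections:
        projected = [q * p for q, p in zip(query, proj)]
        output.extend(head(projected, keys, values))
    return output

# Toy vectors for "I", "drink", "milk": [is_verb, is_food_related].
keys = values = [[0.0, 0.1], [1.0, 0.2], [0.0, 1.0]]
query = [2.0, 2.0]
# Head 1 keeps only the verb feature, head 2 only the food feature,
# so head 1 attends mostly to "drink" and head 2 mostly to "milk".
out = multi_head(query, keys, values, projections=[[1, 0], [0, 1]])
print(len(out))  # 4: two heads, each emitting a 2-dim context vector
```

One head ends up focusing on the verb-like word and the other on the food-related word, which is the point of having multiple heads: each can specialize in a different relationship.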
Language Models
Pretraining + Fine-tuning
ULMFiT (Jan 2018)
ELMo (Feb 2018)

BERT
● Large number of parameters
● Pretraining on a huge corpus
● Subword tokenization
● Single architecture for multiple tasks
Subword Tokenization
"Going on a computerless vacation"
→ "Going on a computer ##less vacation"
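Greedy longest-match subword tokenization of this kind can be sketched in a few lines. The vocabulary below is made up for the example (BERT's real WordPiece vocabulary has roughly 30,000 entries):

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match subword tokenization, WordPiece-style:
    repeatedly take the longest vocabulary entry matching the start of
    what remains; continuation pieces are prefixed with ##."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark word-internal pieces
            if piece in vocab:
                tokens.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]  # no vocabulary piece matched
        start = end
    return tokens

# Tiny made-up vocabulary.
vocab = {"computer", "##less", "going", "on", "a", "vacation"}
print(wordpiece_tokenize("computerless", vocab))  # ['computer', '##less']
```

This is why rare words like "computerless" need not be out-of-vocabulary: they split into known pieces that the model can still reason about.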
BERT — Figure 4 From Paper
GPT-2
Transformer-XL
XLNet
RoBERTa
T5
T5 — Figure 1 From Paper
Get Coding!
https://github.com/huggingface/transformers
https://medium.com/@lysandrejik
Thank you!