Modern NLP for Pre-Modern Practitioners Joel Grus @joelgrus #QConAI #2019




"True self-control is waiting until the movie starts to eat your popcorn."

Natural Language Understanding is Hard

But We're Getting Better at It*

* as measured by performance on tasks we're getting better at: tasks that would be easy if we were good at natural language understanding, and that we therefore use to measure our progress toward natural language understanding

About Me

Obligatory Plug for AllenNLP

A Handful of Tasks That Would Be Easy if We Were Good at Natural Language Understanding

Parsing

Named-Entity Recognition

Coreference Resolution

Machine Translation

Summarization

Attend QCon.ai.

Text Classification

Machine Comprehension

Machine Comprehension?

Textual Entailment

Winograd Schemas

The conference organizer disinvited the speaker because he feared a boring talk. ("he" = the conference organizer)

The conference organizer disinvited the speaker because he proposed a boring talk. ("he" = the speaker)

Language Modeling

And many others!

If you were good at natural language understanding, you'd also be pretty good at these tasks

So if computers get good at each of these tasks, then...

(I Am Being Unfair)

Each of these tasks is valuable on its own merits

Likely they are getting us closer to actual natural language understanding

Pre-Modern NLP

Lots of Linguistics

Grammars

S                                       S -> NP VP
NP VP                                   VP -> VBZ ADJP
NP VBZ ADJP                             NP -> JJ NN
JJ NN VBZ ADJP                          ADJP -> JJ
JJ NN VBZ JJ                            JJ -> "Artificial", NN -> "intelligence", VBZ -> "is", JJ -> "dangerous"
Artificial intelligence is dangerous
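The derivation above can be sketched in a few lines of Python: expand the leftmost nonterminal at each step until only words remain. The rule tables and helper name below are illustrative, not from the talk; lexical choices for JJ are consumed left to right so the two adjectives land in the right slots.

```python
# Nonterminal rewrite rules and a lexicon, mirroring the grammar above.
RULES = {
    "S": ["NP", "VP"],
    "VP": ["VBZ", "ADJP"],
    "NP": ["JJ", "NN"],
    "ADJP": ["JJ"],
}
LEXICON = {
    "JJ": ["Artificial", "dangerous"],  # consumed left to right
    "NN": ["intelligence"],
    "VBZ": ["is"],
}

def derive(start="S"):
    """Leftmost derivation: return every sentential form from `start` down to words."""
    words = {tag: list(ws) for tag, ws in LEXICON.items()}
    symbols, steps = [start], [[start]]
    while any(s in RULES or s in words for s in symbols):
        i = next(i for i, s in enumerate(symbols) if s in RULES or s in words)
        sym = symbols[i]
        expansion = RULES[sym] if sym in RULES else [words[sym].pop(0)]
        symbols = symbols[:i] + expansion + symbols[i + 1:]
        steps.append(list(symbols))
    return steps

steps = derive()
```

The last element of `steps` is the terminal string "Artificial intelligence is dangerous".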

Hand-Crafted Features

Rule-Based Systems

Modern NLP

Theme 1: Neural Nets and Low-Dimensional Representations

Theme 2: Putting Things in …

Theme 3:

Theme 4:

Theme 5: Transfer Learning

Word Vectors

a one-hot vector (sparse, vocabulary-sized):
0 0 0 0 0 0 0 0 0 1 0 0 0 0 ... 0

a sparse weighted vector:
.01 0 0 .9 0 0 0 0 0 .05 0 0 0 0 ... 0

a dense, low-dimensional word vector:
.3 .6 .1 .2 2.3
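One payoff of dense vectors is that similarity between words becomes a simple geometric computation. A minimal sketch, with made-up vector values (only the "artificial" row echoes the numbers above; the others are invented for illustration):

```python
import numpy as np

# Toy dense embeddings; the values are made up for illustration.
vectors = {
    "artificial":   np.array([0.3, 0.6, 0.1, 0.2, 2.3]),
    "intelligence": np.array([0.2, 0.7, 0.0, 0.1, 2.1]),
    "popcorn":      np.array([-1.2, 0.0, 0.9, -0.4, 0.1]),
}

def cosine(u, v):
    """Cosine similarity: near 1.0 for similar directions, near 0 or negative otherwise."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

With one-hot vectors every pair of distinct words has similarity exactly 0; with dense vectors, related words can score high.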

Joel is attending an artificial intelligence conference.

artificial

intelligence

embedding

prediction
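The embedding-then-prediction setup sketched here is word2vec's skip-gram objective: use each word to predict the words around it. A hedged sketch of just the data side, generating (center, context) training pairs with a window of 2 (the function name is mine, not from the talk):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs: the skip-gram training data for word2vec."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "Joel is attending an artificial intelligence conference .".split()
pairs = skipgram_pairs(sentence)
```

Training then nudges embeddings so that words appearing in similar contexts, like "artificial" and "intelligence", end up with similar vectors.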

Using Word Vectors

The official department heads all quit .

dog bites man

Using Context for Sequence Labeling


Using Context for Sequence Classification

Recurrent Neural Networks

LSTMs and GRUs

Bidirectionality

Generative Character-Level Modeling

Convolutional Networks
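For text, a convolutional network slides each filter over every window of a few consecutive token embeddings, then max-pools each filter's responses into a fixed-size feature vector. A sketch with made-up sizes and random data:

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, d_emb, n_filters, width = 6, 4, 3, 3
X = rng.normal(size=(seq_len, d_emb))            # one embedding per token
F = rng.normal(size=(n_filters, width, d_emb))   # filters spanning `width` tokens

def conv_max_pool(X, F):
    """Score every width-token window with every filter, then max-pool over positions."""
    width = F.shape[1]
    windows = np.stack([X[i:i + width] for i in range(len(X) - width + 1)])
    feats = np.einsum("nwd,fwd->nf", windows, F)  # (n_windows, n_filters)
    return feats.max(axis=0)                      # one feature per filter

features = conv_max_pool(X, F)
```

Unlike an RNN, every window is scored independently, so the whole computation parallelizes across positions.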

Sequence-to-Sequence Models

Attention

Large "Unsupervised" Language Models

Contextual Embeddings


The Seahawks football today

word2vec

ELMo


Self-Attention
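In self-attention, every position builds its representation as a weighted average of all positions, with weights computed from learned query/key/value projections. A single-head sketch of scaled dot-product attention (shapes and random data are illustrative):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])             # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over positions
    return weights @ V, weights

rng = np.random.default_rng(2)
seq_len, d = 5, 4
X = rng.normal(size=(seq_len, d))
out, w = self_attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
```

Every output position attends to every input position in one matrix multiply, which is why the Transformer trades the RNN's sequential loop for parallel computation.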

RNN vs CNN vs Self-Attention

The Transformer ("Attention Is All You Need")

OpenAI GPT, or Transformer Decoder Language Model

One Model to Rule Them All?

The GLUE Benchmark

BERT

Task 1: Masked Language Modeling

Joel is giving a [MASK] talk at a [MASK] in San Francisco

candidates for the first [MASK]: interesting, exciting, derivative, pedestrian, snooze-worthy, ...

candidates for the second [MASK]: conference, meetup, rave, coffeehouse, WeWork, ...
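Constructing masked-LM training data is simple: hide a fraction of the tokens and ask the model to recover them. A simplified sketch (real BERT masks 15% of tokens but also sometimes substitutes a random word or keeps the original; the helper name and seed are mine):

```python
import random

MASK, MASK_RATE = "[MASK]", 0.15

def mask_tokens(tokens, rng):
    """Replace ~15% of tokens with [MASK]; the model must predict the originals."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < MASK_RATE:
            masked.append(MASK)
            targets[i] = tok        # what the model should predict at position i
        else:
            masked.append(tok)
    return masked, targets

rng = random.Random(1)
tokens = "Joel is giving a talk at a conference in San Francisco".split()
masked, targets = mask_tokens(tokens, rng)
```

Because the masked positions are chosen at random, any text corpus becomes labeled training data for free.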

Task 2: Next Sentence Prediction

[CLS] Joel is giving a talk. [SEP] The audience is enthralled. [SEP]

[CLS] Joel is giving a talk. [SEP] The audience is falling asleep. [SEP]

99% is_next_sentence, 1% is_not_next_sentence

1% is_next_sentence, 99% is_not_next_sentence

BERT for downstream tasks

GPT-2

1.5 billion parameters

Is GPT-2 Dangerous?

PRETRAINED LANGUAGE MODEL

How Can You Use These In Your Work?

Use Pretrained Word Vectors
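Pretrained vector files such as GloVe are just plain text: one word per line followed by its numbers, so loading them takes a few lines. A sketch that parses that format from an in-memory string (the two rows of values below are made up; a real file would be downloaded from the GloVe or fastText sites):

```python
# GloVe-style text format: "word v1 v2 ..." on each line.
# These two rows are invented values standing in for a real downloaded file.
raw = """\
artificial 0.3 0.6 0.1 0.2 2.3
intelligence 0.2 0.7 0.0 0.1 2.1
"""

def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a {word: [floats]} dictionary."""
    vectors = {}
    for line in lines:
        word, *values = line.split()
        vectors[word] = [float(v) for v in values]
    return vectors

vectors = load_vectors(raw.splitlines())
```

Swap these vectors in for a randomly initialized embedding layer and your model starts with a notion of word similarity instead of learning it from scratch.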

Better Still, Use Pretrained Contextual Embeddings

Use Pretrained BERT to Build Great Classifiers

Use GPT-2 (small) (if you dare)

PRETRAINED LANGUAGE MODEL

In Conclusion
● NLP is cool
● Modern NLP is solving really hard problems
● (And is changing really really quickly)
● Lots of really smart people with lots of data and lots of compute power have trained models that you can just download and use
● So take advantage of their work!

I'm fine-tuning a transformer model!

Thanks!
● I'll tweet out the slides: @joelgrus
  ○ read the speaker notes
  ○ they have lots of links
● I sometimes blog: joelgrus.com
● AI2: allenai.org
● AllenNLP: allennlp.org
● GPT-2 Explorer: gpt2.apps.allenai.org
● podcast: adversariallearning.com

Appendix

References
http://ruder.io/a-review-of-the-recent-history-of-nlp/
https://ankit-ai.blogspot.com/2019/03/future-of-natural-language-processing.html
https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html#openai-gpt