
Sentence Modeling - Computer Science - Wayne State …mdong/CNN for mod… · PPT file · Web view · 2016-03-24Properties of Max TDNN. Sensitive to order of the words. Does not



Sentence Modeling
• Representation of sentences is at the heart of Natural Language Processing
• A sentence model is a representation and analysis of the semantic content of a sentence for classification or generation
• The sentence modeling task is at the core of many tasks such as sentiment analysis, paraphrase detection, entailment recognition, summarization, discourse analysis, machine translation, grounded language learning and image retrieval
• The aim of sentence modeling is a feature function that guides the process by which features of a sentence are extracted.


One Dimensional Convolution


• Filter: a vector of weights of size m
• Input sequence: the sentence
• The filter is combined with each m-gram of the sentence by a dot product


Produces a sequence c


Narrow Convolution
• Size of c: s – m + 1
• It requires that s ≥ m
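The narrow convolution can be illustrated with a minimal sketch in plain Python. It assumes a one-dimensional toy sentence (a list of numbers) and applies the filter weights in the order given; the names narrow_convolution, sentence and filt are hypothetical and do not come from the slides.

```python
def narrow_convolution(s, m):
    """Dot product of the filter m with every m-gram of the sequence s."""
    if len(s) < len(m):
        # Narrow convolution is only defined when s >= m.
        raise ValueError("narrow convolution requires len(s) >= len(m)")
    width = len(m)
    return [
        sum(w * x for w, x in zip(m, s[j:j + width]))
        for j in range(len(s) - width + 1)
    ]


sentence = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy sentence, s = 5 (assumed values)
filt = [1.0, 0.0, -1.0]               # toy filter, m = 3 (assumed values)

c = narrow_convolution(sentence, filt)
print(c)       # [-2.0, -2.0, -2.0]
print(len(c))  # 3, i.e. s - m + 1
```

A sentence shorter than the filter raises an error here, which is the restriction s ≥ m stated above.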


Wide Convolution

Size: s + m - 1


Wide Convolution
• Size of c: s + m - 1
• No requirement on s or m
• Out-of-range values are taken to be 0
• The result of the narrow convolution is a subsequence of the result of the wide convolution
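A minimal sketch of the wide convolution in plain Python, assuming out-of-range values are supplied by zero padding on both sides of a one-dimensional toy sentence. The names wide_convolution, sentence, filt, wide and narrow_part are hypothetical.

```python
def wide_convolution(s, m):
    """Dot product of the filter m with every window of the zero-padded s."""
    width = len(m)
    # Out-of-range values are taken to be 0: pad m - 1 zeros on each side.
    padded = [0.0] * (width - 1) + list(s) + [0.0] * (width - 1)
    return [
        sum(w * x for w, x in zip(m, padded[j:j + width]))
        for j in range(len(s) + width - 1)
    ]


sentence = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy sentence, s = 5 (assumed values)
filt = [1.0, 0.0, -1.0]               # toy filter, m = 3 (assumed values)

wide = wide_convolution(sentence, filt)
print(wide)  # [-1.0, -2.0, -2.0, -2.0, -2.0, 4.0, 5.0], length s + m - 1 = 7

# The windows that lie fully inside the sentence form the narrow result.
narrow_part = wide[len(filt) - 1:len(sentence)]
print(narrow_part)  # [-2.0, -2.0, -2.0]

# A sentence shorter than the filter still yields a non-empty result.
print(wide_convolution([2.0], [1.0, 1.0, 1.0]))  # [2.0, 2.0, 2.0]
```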


Advantages of Wide Convolution
• Guarantees that a valid, non-empty c is always produced
• All weights in the filter reach the entire sentence, including the words at the margins
• Places no limit on the size of m or s


Time Delay Neural Network
• A key feature of TDNNs is the ability to express a relation between inputs in time.
• The sequence s is viewed as having a time dimension, and the convolution is applied over the time dimension.


Max TDNN


Properties of Max TDNN
• Sensitive to the order of the words
• Does not depend on external language-specific features
• Gives largely uniform importance to the signal from each of the words

• The range of the feature detectors is limited
• Higher-order and long-range feature detectors cannot be incorporated
• Multiple occurrences of a feature and the order in which features occur are ignored
• Pooling factor: s – m + 1
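The pooling step behind these properties can be sketched in plain Python. The sketch assumes the sentence is a matrix with one row per embedding dimension and one column per word, that each row is convolved (narrow) with its own row of filter weights, and that the maximum over time is then taken per row. The function name max_tdnn_features and all values are hypothetical.

```python
def max_tdnn_features(sentence_matrix, filter_matrix):
    """Row-wise narrow convolution followed by a max over the time axis."""
    features = []
    for row, weights in zip(sentence_matrix, filter_matrix):
        m = len(weights)
        c = [
            sum(w * x for w, x in zip(weights, row[j:j + m]))
            for j in range(len(row) - m + 1)
        ]
        # All s - m + 1 values collapse into one: the pooling factor.
        features.append(max(c))
    return features


filters = [[1.0, 1.0], [1.0, -1.0]]  # d = 2 rows, m = 2 (assumed values)

short = [[1.0, 0.0, 2.0],
         [0.0, 1.0, 0.0]]                   # s = 3
long_ = [[1.0, 0.0, 2.0, 0.0, 2.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]]    # s = 6

print(max_tdnn_features(short, filters))  # [2.0, 1.0]
print(max_tdnn_features(long_, filters))  # [2.0, 1.0]
```

Both sentences give the same fixed-size vector even though the second one contains the strongest feature several times, which illustrates why multiple occurrences and their order are lost.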


k-Max Pooling


k-Max Pooling
• Given a value k and a sequence p of length at least k, k-max pooling selects the subsequence p-max of the k highest values of p.
• The order of the values in p-max corresponds to their original order in p.
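The definition above can be sketched in a few lines of plain Python. The function name k_max_pooling and the sample values are hypothetical; ties are broken in favour of the earlier position, which is an assumption the definition does not settle.

```python
def k_max_pooling(p, k):
    """Return the k highest values of p, kept in their original order."""
    if k > len(p):
        raise ValueError("k must not exceed the length of p")
    # Indices of the k largest values (stable sort: earlier index wins ties).
    top = sorted(range(len(p)), key=lambda i: p[i], reverse=True)[:k]
    # Restore the original left-to-right order of the selected positions.
    return [p[i] for i in sorted(top)]


p = [3, 9, 1, 7, 9, 2]
print(k_max_pooling(p, 3))  # [9, 7, 9]
print(k_max_pooling(p, 1))  # [9], ordinary max pooling as a special case
```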


k-Max Pooling
• Selects the k most active features
• Features may be a number of positions apart
• Preserves the order of the features
• But is insensitive to their specific positions
• Can detect multiple occurrences of a feature


What should k be?
• Why not let it decide for itself?


Dynamic k-Max Pooling

Suppose length of sentence = 18, L = 3, Ktop = 3

K1 = 12, K2 = 6, K3 = 3
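The slide gives only the resulting values. The sketch below assumes the rule k_l = max(k_top, ceil((L - l) / L * s)), where L is taken to be the total number of convolutional layers, l the index of the current layer and s the sentence length; under that assumption it reproduces the numbers above. The function name dynamic_k is hypothetical.

```python
def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Pooling parameter for a given layer under the assumed rule
    k_l = max(k_top, ceil((L - l) / L * s))."""
    remaining = (total_layers - layer) * sentence_length
    # Integer ceiling division avoids floating-point rounding issues.
    scaled = -((-remaining) // total_layers)
    return max(k_top, scaled)


ks = [dynamic_k(l, total_layers=3, sentence_length=18, k_top=3)
      for l in (1, 2, 3)]
print(ks)  # [12, 6, 3]
```

Under this rule k shrinks with depth and grows with sentence length, and the topmost layer always pools down to the fixed value Ktop.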


Multiple Feature Maps

[Figure: convolution, k-max pooling layer and non-linear function; labels: feature map, second-order feature map]


To increase the number of learnt feature detectors of a certain order, multiple feature maps may be computed in parallel at the same layer.


Folding
• Feature detectors in different rows are independent of each other.
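The slide states the motivation but not the operation itself. The sketch below assumes folding means summing every two adjacent rows of a feature map component-wise, so that a map with d rows becomes one with d/2 rows and values from different rows are mixed. The function name fold and the sample matrix are hypothetical.

```python
def fold(feature_map):
    """Sum each pair of adjacent rows component-wise (assumed folding rule)."""
    if len(feature_map) % 2 != 0:
        raise ValueError("folding needs an even number of rows")
    return [
        [a + b for a, b in zip(feature_map[i], feature_map[i + 1])]
        for i in range(0, len(feature_map), 2)
    ]


feature_map = [
    [1, 2, 3],
    [10, 20, 30],
    [4, 5, 6],
    [40, 50, 60],
]
print(fold(feature_map))  # [[11, 22, 33], [44, 55, 66]]
```

The operation adds no parameters; it only halves the number of rows while leaving the number of columns unchanged.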


Properties of Sentence Model
• The subsequence of n-grams extracted by the pooling operation induces invariance to absolute positions, but maintains their order and relative positions.
• DCNN feature graphs have a global range because of the pooling operations.
• DCNN has an internal, input-dependent structure and does not rely on externally provided parse trees.


Experiments


Sentiment Prediction in Movie Reviews
• Concerns prediction of the sentiment of movie reviews in the Stanford Sentiment Treebank
• Output is binary in experiment 1 and “negative, somewhat negative, neutral, somewhat positive, positive” in experiment 2
• [Results shown for: Binary, MultiClass]


Question Type Classification
• TREC question dataset
• Six different question types


Twitter Sentiment Prediction with Distant Supervision
• Large dataset of tweets
• Each tweet is automatically labelled positive or negative based on the emoticon it contains
• Tweets are preprocessed


Conclusion
• A Dynamic CNN is defined, which uses dynamic k-max pooling
• The feature graph captures word relations of varying size
• High performance on sentiment prediction and question classification without requiring external features