LSTM Networks for Sentiment Analysis with Keras

LSTM Networks for Sentiment Analysis

YAN TING LIN

Summary

• This tutorial aims to provide an example of how a Recurrent Neural Network (RNN) using the Long Short-Term Memory (LSTM) architecture can be implemented using Theano (the slides that follow use Keras instead). In this tutorial, this model is used to perform sentiment analysis on movie reviews from the Large Movie Review Dataset, sometimes known as the IMDB dataset.
• In this task, given a movie review, the model attempts to predict whether it is positive or negative. This is a binary classification task.
• Ref: http://deeplearning.net/tutorial/lstm.html

Data

• Ref: https://keras.io/datasets/
• Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".
• As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
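The frequency-based filtering described above can be sketched in plain Python. This is only an illustration, not the Keras loader itself: `filter_by_frequency`, `num_words`, `skip_top`, and `unknown` are hypothetical names, and the sketch assumes the convention stated on this slide (index k is the k-th most frequent word, and 0 marks an unknown word).

```python
def filter_by_frequency(review, num_words=10000, skip_top=20, unknown=0):
    """Keep word indexes ranked skip_top+1 .. num_words; map all others to `unknown`.

    Assumes index k means "k-th most frequent word", as described on the slide.
    """
    return [w if skip_top < w <= num_words else unknown for w in review]


# A toy encoded review (made-up indexes, for illustration only).
review = [1, 20, 21, 450, 10000, 10001]
filtered = filter_by_frequency(review)
print(filtered)  # [0, 0, 21, 450, 10000, 0]
```

The 20 most frequent words and everything rarer than rank 10,000 collapse to the unknown index, so the sequence length is unchanged.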

Data Label

• Train Data: X_train
• Train Data Answer: y_train
• Test Data: X_test
• Test Data Answer: y_test

Understanding LSTM Networks

• Ref: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
• Recurrent Neural Networks
• The Problem of Long-Term Dependencies
• LSTM Networks
• The Core Idea Behind LSTMs
• Step-by-Step LSTM Walk Through
• Variants on Long Short Term Memory
• Conclusion

Install TensorFlow

If Python reports "ImportError: No module named tensorflow", install TensorFlow as follows:

# Create a virtual environment using Python 2.7
• conda create -n tensorflow python=2.7
# Enter the conda virtual environment
• source activate tensorflow
# Install with pip (Mac OS X, GPU enabled, Python 2.7)
• export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/gpu/tensorflow-0.11.0-py2-none-any.whl
• sudo pip install --upgrade $TF_BINARY_URL

Install Keras (conda)

• conda install -c conda-forge keras
# You may also use conda-forge to install TensorFlow
# Ref: https://conda-forge.github.io
• conda install -c conda-forge tensorflow

Data Preprocessing

Pad or truncate each review in the IMDB data to a fixed length of 80 word indexes.
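Keras provides a pad_sequences utility for this step. The plain-Python sketch below only illustrates the idea. `pad_to_length` is a hypothetical helper, and padding and truncating at the front of the sequence is an assumption, since the slides do not say which side is used.

```python
def pad_to_length(review, maxlen=80, value=0):
    """Return `review` as a list of exactly `maxlen` word indexes (maxlen >= 1).

    Assumption: short reviews are padded at the front with `value`,
    and long reviews keep only their last `maxlen` indexes.
    """
    review = list(review)
    if len(review) >= maxlen:
        return review[-maxlen:]
    return [value] * (maxlen - len(review)) + review


short_review = [5, 8, 13]
long_review = list(range(1, 101))  # 100 made-up word indexes

print(pad_to_length(short_review, maxlen=5))  # [0, 0, 5, 8, 13]
print(len(pad_to_length(short_review)))       # 80
print(len(pad_to_length(long_review)))        # 80
```

After this step every review has the same length, so a batch of reviews can be stacked into one rectangular array for the model.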

Model

Train Model

• In neural network terminology:
• one epoch = one forward pass and one backward pass of all the training examples
• batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory you'll need.
• number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).
• Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch.
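The relationship between examples, batch size, and iterations can be checked with a few lines of Python. `iterations_per_epoch` is a hypothetical helper, and the batch size of 32 and the 15 epochs in the second example are assumed values, not ones taken from the slides.

```python
def iterations_per_epoch(num_examples, batch_size):
    """Number of forward/backward passes needed to see every example once.

    Uses ceiling division so that a final, smaller batch still counts.
    """
    return (num_examples + batch_size - 1) // batch_size


# The example from the slide: 1000 examples, batch size 500.
print(iterations_per_epoch(1000, 500))  # 2

# The 25,000 IMDB training reviews with an assumed batch size of 32.
per_epoch = iterations_per_epoch(25000, 32)
print(per_epoch)       # 782
print(per_epoch * 15)  # 11730 iterations for an assumed 15 epochs
```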

Result - 1: Downloading the data and training the model take a long time

Result - 2: After 1 hour

Time Reduction

• Make the training data smaller: using 5x less data makes training roughly 5x faster.
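One way to shrink the training set is to keep a random fifth of the reviews together with their labels. The slides do not show how the data was reduced, so the random sampling, the `subsample` helper, and the stand-in data below are all assumptions made for illustration.

```python
import random


def subsample(X, y, factor=5, seed=0):
    """Keep a random 1/factor of the (example, label) pairs, preserving order."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    keep = len(X) // factor
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    indices = sorted(rng.sample(range(len(X)), keep))
    return [X[i] for i in indices], [y[i] for i in indices]


# Stand-in data with the same size as the IMDB training set (not real reviews).
X_train = [[i, i + 1] for i in range(25000)]
y_train = [i % 2 for i in range(25000)]

X_small, y_small = subsample(X_train, y_train)
print(len(X_small))  # 5000
print(len(y_small))  # 5000
```

Each epoch then processes one fifth as many examples, which is where the slide's speed-up comes from; the smaller set may cost some accuracy.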

Visualizing your model

# Install pydot and graphviz
• conda install -c anaconda pydot=1.0.28
• conda install -c anaconda graphviz=2.38.0
# In Python code

Dropout Comparison - 1

Dropout Comparison - 2

Why Keras?
