Analyzing Arguments during a Debate using Natural Language Processing in Python - I
ABHINAV GUPTA

Argumentation Framework


Page 1: Argumentation Framework

Analyzing Arguments during a Debate using Natural Language Processing in Python - I
ABHINAV GUPTA

Page 2: Argumentation Framework

How a debate may proceed

This new movie ‘Superman vs. Batman’ is so cool! The winner has got to be Superman, with his mighty Kryptonian abilities and people’s support. What do you think?!

I agree! Superman is definitely more capable than Batman.

What are you saying? Batman is so much more technologically advanced!

Both Batman and Superman are powerful in their own ways. It will be a draw.

Ben Affleck is so HOT!

Page 3: Argumentation Framework

What will we discuss?

• Basic Natural Language Processing (NLP) techniques

• Implementation of NLP in Python NLTK

• Stepwise workflow for processing arguments in a debate to:
  - Determine the polarity of an argument
  - Determine the quality of an argument and score it
  - Determine the winner of the debate

• A complete debating framework built from various Python modules

Page 4: Argumentation Framework

Why Natural Language Toolkit (NLTK)?

Platform for implementing Natural Language Processing through Python programs

Huge database of corpora and lexical resources with an easy interface

Built-in libraries of several text processing algorithms

Open Source!

Page 5: Argumentation Framework

Starting with the Basics

“I do not feel very good about Monday mornings.”

Tokenization
[‘I’, ‘do’, ‘not’, ‘feel’, ‘very’, ‘good’, ‘about’, ‘Monday’, ‘mornings’]

Parts of Speech Tagging
‘I’ – Personal Pronoun
‘do’ – Verb
‘not’ – Adverb
‘feel’ – Verb
‘very’ – Adverb
‘good’ – Adjective
‘about’ – Preposition
‘Monday’ – Proper Noun
‘mornings’ – Plural Noun

Page 6: Argumentation Framework

Basics with NLTK

Page 7: Argumentation Framework

Tokens [‘I’, ‘do’, ‘not’, ‘feel’, ‘very’, ‘good’, ‘about’, ‘Monday’, ‘mornings’]

Removal of Stop Words [‘I’, ‘feel’, ‘good’, ‘Monday’, ‘mornings’]

Stemmed Words [‘I’, ‘feel’, ‘good’, ‘Monday’, ‘morn’]
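The three steps above can be sketched with NLTK roughly as follows. This is a minimal sketch, not the code from the talk: it uses `TreebankWordTokenizer` and `PorterStemmer`, which need no extra data downloads, and the `STOP_WORDS` set is a hand-picked stand-in for NLTK's stop word corpus, chosen only to reproduce the lists on this slide.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import TreebankWordTokenizer

sentence = "I do not feel very good about Monday mornings."

# Assumption: a tiny hand-picked stop word set for this one sentence.
# NLTK ships a much larger list in its stopwords corpus.
STOP_WORDS = {"do", "not", "very", "about"}

# Step 1: tokenize, then drop punctuation tokens such as the final "."
tokens = [t for t in TreebankWordTokenizer().tokenize(sentence) if t.isalpha()]

# Step 2: remove stop words
filtered = [t for t in tokens if t not in STOP_WORDS]

# Step 3: reduce each remaining word to its stem
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in filtered]

print(tokens)
print(filtered)
print(stems)
```

Depending on the NLTK version, the stemmer may also lowercase its output, so the printed stems can differ in case from the slide.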

Page 8: Argumentation Framework

What do we look for in an argument?

What is the stance taken by the debater in this argument?

Has the debater changed stance from the previous arguments?

Is the argument related to the debate or irrelevant?

Is the argument good enough?

Page 9: Argumentation Framework

Analysis of an Argument

• Is the argument related to the debate?

SEMANTIC SIMILARITY

• What is the polarity of the argument?

SENTIMENT ANALYSIS

• Is the argument good enough?

SCORING

• Has the debater changed stance?

BACKTRACK

Page 10: Argumentation Framework

Semantic Similarity

Semantic Distance between words in context is the distance between their underlying senses or lexical concepts.
d(festival, celebration) < d(school, circus)

Semantic Similarity is how close the lexical concepts of two units (word, sentence, paragraph) of language are.
d(Mangoes and bananas are fruits, Mangoes are sweeter than bananas) < d(Raj has a job at the hospital, Hospitals have a huge staff of doctors and nurses)

Lexical databases like WordNet group English words into sets of synonyms, each expressing a distinct concept, and are used for calculating semantic similarity.

Page 11: Argumentation Framework

WordNet-based Similarity

Such a network forms the basis of several distance formulae to calculate semantic similarity
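To illustrate how a distance formula can be read off such a network, here is a small sketch over a toy hypernym graph. The graph in `HYPERNYM` is entirely hypothetical and is not taken from WordNet; the score `1 / (1 + shortest path length)` is one simple path-based measure, used here only as an example of the idea.

```python
from collections import deque

# Hypothetical child -> parent (hypernym) links. NOT real WordNet data.
HYPERNYM = {
    "festival": "celebration",
    "celebration": "event",
    "school": "institution",
    "institution": "organization",
    "circus": "troupe",
    "troupe": "organization",
    "event": "entity",
    "organization": "entity",
}


def neighbours(word):
    """Words directly linked to `word`, going up or down the hierarchy."""
    linked = {HYPERNYM[word]} if word in HYPERNYM else set()
    linked.update(child for child, parent in HYPERNYM.items() if parent == word)
    return linked


def path_length(a, b):
    """Breadth-first search for the number of edges between two words."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        word, dist = queue.popleft()
        if word == b:
            return dist
        for nxt in neighbours(word):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two words are not connected


def path_similarity(a, b):
    """Shorter paths give scores closer to 1."""
    dist = path_length(a, b)
    return None if dist is None else 1.0 / (1.0 + dist)


print(path_similarity("festival", "celebration"))
print(path_similarity("school", "circus"))
```

In this toy graph, festival and celebration are one edge apart while school and circus are four edges apart, so the first pair scores higher, in line with d(festival, celebration) < d(school, circus).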

Page 12: Argumentation Framework

Similarity between Sentences

A new NASA initiative will help lead the search for signs of life beyond our solar system

The Nexus for Exoplanet System Science, or NExSS, will take a multidisciplinary approach to the hunt for alien life

Joint Word Set: new, NASA, Initiative, Help, Lead, Search, Signs, Life, Beyond, Solar, System, Nexus, Exoplanet, Alien, Science, NExSS, Take, multidisciplinary, Approach, Hunt

Sentence 1: 11111111111000000000

Sentence 2: 00000001001111111111
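The joint word set and the two binary vectors can be built along these lines. This is a sketch under assumptions: the `STOP_WORDS` set is hand-picked for these two sentences, and words are lowercased and ordered by first appearance, so the order of the last few words may differ from the slide without changing the vectors.

```python
import re

# Assumption: hand-picked stop words covering just these two sentences.
STOP_WORDS = {"a", "the", "will", "for", "of", "our", "or", "to"}

S1 = "A new NASA initiative will help lead the search for signs of life beyond our solar system"
S2 = ("The Nexus for Exoplanet System Science, or NExSS, will take a "
      "multidisciplinary approach to the hunt for alien life")


def content_words(sentence):
    """Lowercase the sentence, keep alphabetic words, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def joint_word_set(words1, words2):
    """All distinct words from both sentences, in order of first appearance."""
    joint = []
    for w in words1 + words2:
        if w not in joint:
            joint.append(w)
    return joint


def binary_vector(words, joint):
    """1 where the joint word occurs in the sentence, 0 otherwise."""
    present = set(words)
    return [1 if w in present else 0 for w in joint]


words1, words2 = content_words(S1), content_words(S2)
joint = joint_word_set(words1, words2)
v1, v2 = binary_vector(words1, joint), binary_vector(words2, joint)

print(joint)
print("".join(map(str, v1)))
print("".join(map(str, v2)))
```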

Page 13: Argumentation Framework

Similarity between Sentences

The simplest similarity score is the cosine similarity between the two vectors:
sim(v1, v2) = (v1 · v2) / (|v1| |v2|)
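As a quick worked example, the cosine score for the two binary vectors from the previous slide can be computed like this. The helper name `cosine` is ours, and the vectors are copied from the slide.

```python
import math

# Binary vectors for Sentence 1 and Sentence 2, copied from the previous slide.
v1 = [int(c) for c in "11111111111000000000"]
v2 = [int(c) for c in "00000001001111111111"]


def cosine(u, v):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


score = cosine(v1, v2)
print(round(score, 4))
```

The two sentences share only two words of the joint set (Life and System), and each vector has eleven ones, so the score is 2/11, roughly 0.18, even though the sentences are clearly about the same topic.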

More sophisticated formulae identify similar pairs of words and assign decimal values depending on the semantic distance. For example, in our word set:

d(Search, Hunt) = 0.8

d(Solar, Exoplanet) = 0.4
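One way such decimal values can enter the score is sketched below. This is an assumption about the scheme, not necessarily the formula used in the talk: each entry of a sentence's vector becomes the highest similarity between that joint-set word and any word in the sentence. Only the two pair values from the slide are used; every other pair of different words is treated as 0.

```python
import math

JOINT = ["new", "nasa", "initiative", "help", "lead", "search", "signs", "life",
         "beyond", "solar", "system", "nexus", "exoplanet", "alien", "science",
         "nexss", "take", "multidisciplinary", "approach", "hunt"]
S1_WORDS = set(JOINT[:11])                      # words of Sentence 1
S2_WORDS = set(JOINT[11:]) | {"life", "system"}  # words of Sentence 2

# Pair values from the slide; all other pairs are assumed to score 0.
PAIR_SIM = {
    frozenset(("search", "hunt")): 0.8,
    frozenset(("solar", "exoplanet")): 0.4,
}


def word_sim(a, b):
    """1.0 for identical words, the table value for known pairs, else 0.0."""
    return 1.0 if a == b else PAIR_SIM.get(frozenset((a, b)), 0.0)


def semantic_vector(sentence_words, joint):
    """Best match between each joint-set word and the sentence's words."""
    return [max(word_sim(w, s) for s in sentence_words) for w in joint]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


soft_score = cosine(semantic_vector(S1_WORDS, JOINT), semantic_vector(S2_WORDS, JOINT))
print(round(soft_score, 4))
```

With these two pairs alone, the score rises from 2/11 (about 0.18) for the binary vectors to 4.4/11.8 (about 0.37).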

Sometimes, the order in which the words appear in the sentence also makes a difference, so Order Similarity is also considered:

India defeated Pakistan

Pakistan defeated India
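A word-order score can be sketched as follows. The formula here, 1 - |r1 - r2| / |r1 + r2| over vectors of word positions, is one formulation from the sentence-similarity literature and is an assumption on our part; the slides do not say which formula is used.

```python
import math


def order_vector(words, joint):
    """1-based position of each joint-set word in the sentence, 0 if absent."""
    return [words.index(w) + 1 if w in words else 0 for w in joint]


def order_similarity(words1, words2):
    """1.0 for identical word order, lower as the positions diverge."""
    joint = list(dict.fromkeys(words1 + words2))  # distinct words, first-seen order
    r1, r2 = order_vector(words1, joint), order_vector(words2, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0


a = ["india", "defeated", "pakistan"]
b = ["pakistan", "defeated", "india"]

print(order_similarity(a, a))
print(round(order_similarity(a, b), 4))
```

Both sentences contain exactly the same words, so a bag-of-words cosine score would call them identical; the order score separates them (about 0.59 here instead of 1.0).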

Page 14: Argumentation Framework

Sentiment Analysis

Sentiment Analysis (or opinion mining) is the process of detecting the contextual polarity of text

NLP techniques, statistics and machine learning are used to identify the sentiment content in a text

It finds application with Movie Reviews, Blogs, Customer Feedback, Twitter and other microblogging sources

The most popular classifier used for Sentiment Analysis is the Naïve Bayes Classifier, available as a module in NLTK and in TextBlob, a Python library for processing textual data

Page 15: Argumentation Framework

Sentiment Analysis using Naïve Bayes

[Flowchart with the following boxes: Training Corpus, Polarity Lexicon, Test Data, Naïve Bayes Classifier, Neutral? (Yes), Positive/Negative]
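To show what a Naïve Bayes classifier computes, here is a small from-scratch sketch with add-one smoothing. It is not the NLTK or TextBlob implementation, it leaves out the polarity lexicon and the neutral check from the flowchart, and the six training sentences are made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Assumption: a tiny made-up training corpus, far too small for real use.
TRAINING = [
    ("this movie is so cool", "pos"),
    ("superman is definitely more capable", "pos"),
    ("i agree with you", "pos"),
    ("i do not feel very good", "neg"),
    ("this movie is boring", "neg"),
    ("i disagree with you", "neg"),
]


def train(examples):
    """Count words per label and documents per label."""
    word_counts, label_counts = defaultdict(Counter), Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log prior plus log word likelihoods."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING)
print(classify("the movie is cool", word_counts, label_counts))
print(classify("this movie is boring", word_counts, label_counts))
```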

Page 16: Argumentation Framework

Thank You!

Please keep watching this space for Part II