Incorporating Game Theory in Feature Selection for Text Categorization
Nouman Azam and JingTao Yao
Department of Computer Science
University of Regina
CANADA S4S 0A2
[email protected] [email protected]
http://www.cs.uregina.ca/~azam200n http://www.cs.uregina.ca/~jtyao
Acknowledgement
• Thanks to Dr. Dominik Slezak for presenting this work on our behalf.
J T Yao Incorporating Game Theory in Feature Selection for TC 2
Introduction
• Feature selection.
  – Selecting a subset of important features.
• Text categorization.
  – Assigning textual documents to predefined categories.
• Text categorization and high imbalance.
  – The number of instances in categories varies significantly.
  – The importance of features varies accordingly.
  – Hard to apply feature selection techniques directly.
Feature Selection in Text Categorization
• Assigning positive or negative values to features.
  – The values indicate the importance of features.
  – Positive values indicate importance for the positive category.
  – Negative values indicate importance for the negative category.
Existing Feature Selection Approaches
• One sided approaches.
  – Selecting features with high positive values.
• Two sided approaches.
  – Selecting features with high absolute values.
• Explicit combinational approach.
  – Selecting features with high positive or negative values generated by a one sided method.
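The three selection strategies above can be sketched as follows. This is only an illustrative sketch: the scores, the function names and the `k_pos` / `k_neg` split are assumptions made for the example and do not come from the slides.

```python
# Hypothetical signed scores: positive values favour the positive category,
# negative values favour the negative category. The numbers are made up.
scores = {"w1": 0.9, "w2": 0.4, "w3": 0.1, "w4": -0.8, "w5": -0.3}


def one_sided(scores, k):
    """Keep the k features with the highest (most positive) scores."""
    return sorted(scores, key=lambda w: scores[w], reverse=True)[:k]


def two_sided(scores, k):
    """Keep the k features with the highest absolute scores."""
    return sorted(scores, key=lambda w: abs(scores[w]), reverse=True)[:k]


def explicit_combination(scores, k_pos, k_neg):
    """Keep the k_pos most positive and the k_neg most negative features."""
    ranked = sorted(scores, key=lambda w: scores[w], reverse=True)
    return ranked[:k_pos] + ranked[::-1][:k_neg]


print(one_sided(scores, 2))
print(two_sided(scores, 2))
print(explicit_combination(scores, 2, 1))
```

In this sketch the explicit combination fixes in advance how many features each category contributes, which is the kind of manual choice the later slides try to replace with a game.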
Limitations of Existing Approaches
• They favour features indicative of either the positive or the negative category.
  – There may be features that indicate both categories.
  – It is plausible to include such features in some applications.
• Dilemma: positive features vs. negative features.
• However, we need to find a way to select these features.
  – Incorporating Game Theory in Feature Selection to deal with this issue.
Incompetence of Existing Approaches
• An example.
  – Considering an imbalanced data set with 10 documents in the positive category and 100 in the negative category.
  – There are eight words in these documents.
• Considering four methods.
  – One sided approaches: correlation coefficient and GSS coefficient.
  – Two sided approaches: chi square and Gini index.
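As a rough illustration of how such scores are computed, the sketch below uses commonly cited textbook definitions of chi square, the correlation coefficient and the GSS coefficient. These may differ from the exact variants used by the authors, and the Gini index is left out because its text categorization variants differ. The counts are hypothetical: `a` and `b` are the numbers of positive and negative documents containing the word, `c` and `d` the numbers of positive and negative documents not containing it.

```python
import math


def contingency_scores(a, b, c, d):
    """Score one word from its document counts (assumes no marginal is zero)."""
    n = a + b + c + d
    diff = a * d - c * b
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi_square = n * diff ** 2 / denom                    # two sided: always >= 0
    correlation = math.sqrt(n) * diff / math.sqrt(denom)  # one sided: keeps the sign
    gss = diff / n ** 2                                   # one sided: keeps the sign
    return {"chi_square": chi_square, "correlation": correlation, "gss": gss}


# Hypothetical counts for a data set with 10 positive and 100 negative documents.
pos_word = contingency_scores(a=5, b=10, c=5, d=90)   # more frequent in positives
neg_word = contingency_scores(a=1, b=60, c=9, d=40)   # more frequent in negatives

print(pos_word)
print(neg_word)
```

With these definitions the one sided scores change sign for a word that indicates the negative category, while chi square stays non-negative, which is why the two families rank words differently.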
Probabilities of Words in Categories
• Meaning of probabilities.
  – Each probability is the fraction of documents from a category that contain the word.
Scores of Words
Rankings of Words
• Observations.
  – w7 and w8 are not considered important by any method.
  – They will be ignored if we select three features.
A Simple Solution
• Using an explicit combinational approach.
  – Probabilities in respective categories are used for rankings.
  – The new rankings.
  – Considering the positive category twice as important as the negative category.
• We may select w1, w8 and w4.
• We note that w8, which indicates both categories, is selected.
Conclusion from the Simple Solution
• A feature may be considered good for
  – the positive category,
  – the negative category,
  – both of them, or
  – neither of them.
• We are trying to find a systematic method that finds the best decision choice.
• Game theory may be useful for formulating such a method.
Game Theory
• Game theory is a core subject in decision sciences.
  – Prisoner's Dilemma.
    • A classical example in Game Theory.
Feature Selection with Game Theory
• Formulating a problem with Game Theory requires us to
  – identify the player set,
  – identify the strategy set,
  – determine the payoff functions, and
  – implement a competition.
The Player Set
• Two players were selected.
• The players represent the positive and negative categories.
  – The player C+ represents the positive category.
  – The player C- represents the negative category.
• Each player determines the features' utility for its respective category.
The Strategy Set
• Two actions were formulated for each player.
  – Action a1 for keeping a feature.
  – Action a2 for discarding a feature.
• For differentiating the actions of the two players,
  – [symbols missing in the extracted text] denote the actions of C+.
  – [symbols missing in the extracted text] denote the actions of C-.
The Payoff Functions
• Notation for a payoff function.
  – The payoff of player i performing action j, given action k of the opponent, is denoted as [symbol missing in the extracted text].
• The payoff sets.
Defining the Payoff Functions
• Let cat+ and cat- represent the positive and negative categories.
  – A and B represent the number of documents from cat+ and cat- containing word w.
  – C and D represent the number of documents from cat+ and cat- that do not contain w.
• The conditional probabilities of w in cat+ and cat- are P(w | cat+) = A / (A + C) and P(w | cat-) = B / (B + D).
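A small worked example may help here. Taking the probability of a word in a category to be the fraction of that category's documents containing the word, the two conditional probabilities follow directly from the four counts. The function name and the counts below are hypothetical; only the category sizes (10 positive, 100 negative) echo the earlier example.

```python
def conditional_probabilities(a, b, c, d):
    """Return (P(w | cat+), P(w | cat-)) from document counts.

    a, b: documents from cat+ and cat- that contain w
    c, d: documents from cat+ and cat- that do not contain w
    """
    p_pos = a / (a + c)  # fraction of positive documents containing w
    p_neg = b / (b + d)  # fraction of negative documents containing w
    return p_pos, p_neg


# Hypothetical word: present in 5 of 10 positive and 10 of 100 negative documents.
p_pos, p_neg = conditional_probabilities(a=5, b=10, c=5, d=90)
print(p_pos, p_neg)  # 0.5 0.1
```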
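Payoff Functions for Players
• Both players deciding to keep a feature.
  – The payoffs of the players are calculated as an average: [expression missing in the extracted text].
• Both players deciding to discard a feature.
  – The payoffs are calculated as [expression missing in the extracted text].
• C+ deciding to keep while C- discards.
  – The payoffs are [expression missing] and [expression missing] respectively.
• C+ deciding to discard while C- keeps.
  – The payoffs are [expression missing] and [expression missing] respectively.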
Action Scenarios for Players
Implementing Competition
• Representing the game in a payoff table.
  – Determining the Nash equilibrium to find the actions of the players.
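To make the equilibrium step concrete, the sketch below finds the pure-strategy Nash equilibria of a two-player, two-action payoff table by checking that neither player gains from deviating alone. The payoff numbers are hypothetical and are not produced by the payoff functions from the slides; the action labels and function name are likewise assumptions for the example.

```python
from itertools import product

ACTIONS = ("keep", "discard")


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy equilibria of a 2x2 game.

    payoffs[(a_pos, a_neg)] = (payoff of C+, payoff of C-)
    """
    equilibria = []
    for a_pos, a_neg in product(ACTIONS, repeat=2):
        u_pos, u_neg = payoffs[(a_pos, a_neg)]
        # C+ must not do better by switching its own action alone.
        pos_ok = all(u_pos >= payoffs[(alt, a_neg)][0] for alt in ACTIONS)
        # C- must not do better by switching its own action alone.
        neg_ok = all(u_neg >= payoffs[(a_pos, alt)][1] for alt in ACTIONS)
        if pos_ok and neg_ok:
            equilibria.append((a_pos, a_neg))
    return equilibria


# Hypothetical payoff table for a single word.
payoffs = {
    ("keep", "keep"): (0.5, 0.5),
    ("keep", "discard"): (0.9, 0.1),
    ("discard", "keep"): (0.1, 0.3),
    ("discard", "discard"): (0.2, 0.2),
}

print(pure_nash_equilibria(payoffs))
```

A payoff table can have zero, one or several pure-strategy equilibria, so any real implementation would need a rule for those cases; the slides do not say how they are handled.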
Selected Features Set
• Defining two feature sets.
  – FS+ as the set of features representing the positive category.
  – FS- as the set of features representing the negative category.
• The game will determine the inclusion or exclusion of features in these sets.
  – The final selected feature set is the union of FS+ and FS-.
A Demonstrative Example
• Considering the earlier example.
Payoff Tables for Words
• The bold cells represent the Nash equilibrium.
  – Considering w1.
    • The actions of the players in equilibrium are [symbol missing in the extracted text] for C+ and [symbol missing in the extracted text] for C-.
    • These actions decide that w1 is included in FS+.
Payoff Tables for Words
Selected Features
• Result of implementing the game for features.
  – FS+ = {w1, w7, w8} and FS- = {w4, w7, w8}.
  – FS = {w1, w4, w7, w8}.
• Observation.
  – The words w7 and w8 are selected.
  – The suggested approach selects features that indicate both categories.
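The final step can be sketched as building the two sets from each word's equilibrium actions and taking their union. The per-word actions below are hypothetical: the payoff tables are not available in this text, so the actions were chosen only to be consistent with the FS+ and FS- reported above. The rule that a word joins FS+ when C+ keeps it, and FS- when C- keeps it, is likewise an assumption of this sketch.

```python
# Equilibrium actions per word, written as (action of C+, action of C-).
# Hypothetical values, chosen to agree with the sets reported on the slide.
equilibrium = {
    "w1": ("keep", "discard"),
    "w2": ("discard", "discard"),
    "w3": ("discard", "discard"),
    "w4": ("discard", "keep"),
    "w5": ("discard", "discard"),
    "w6": ("discard", "discard"),
    "w7": ("keep", "keep"),
    "w8": ("keep", "keep"),
}

# Assumed rule: a word enters a player's set when that player keeps it.
fs_pos = {w for w, (a_pos, _) in equilibrium.items() if a_pos == "keep"}
fs_neg = {w for w, (_, a_neg) in equilibrium.items() if a_neg == "keep"}
fs = fs_pos | fs_neg  # final selection is the union of both sets

print(sorted(fs_pos), sorted(fs_neg), sorted(fs))
```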
Conclusion
• Limitations of existing approaches.
  – Preference is given to features indicating the positive or the negative category.
    • They may not be suitable for selecting features indicating both categories.
• Game theory based method.
  – Implements a game between categories.
• Importance of the method.
  – Useful in selecting features indicating the positive category, the negative category or both of them.