Evaluating Classifiers
Reading:
T. Fawcett, An introduction to ROC analysis, Sections 1-4, 7
(linked from class website)
Evaluating Classifiers
• What we want: classifier that best predicts unseen data
• Assumption: Data is “iid” (independent and identically distributed)
Accuracy and Error Rate
• Accuracy = fraction of correct classifications on unseen data (test set)
• Error rate = 1 − Accuracy
How to use available data to best measure accuracy?
Split data into training and test sets.
But how to split?
Too little training data: we don’t get the best possible classifier
Too little test data: the measured accuracy is unreliable
One solution: “k-fold cross validation”
• Each example is used both as a training instance and as a test instance.
• Split data into k disjoint parts: S1, S2, ..., Sk.
• For i = 1 to k: select Si to be the test set; train on the remaining data and test on Si to obtain accuracy Ai.
• Report the average, (1/k)(A1 + A2 + ... + Ak), as the final accuracy.
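A minimal sketch of this procedure in Python; the train and evaluate functions are hypothetical stand-ins supplied by the caller, not something from the slides:

```python
import random

def k_fold_cross_validation(data, k, train, evaluate):
    """Estimate accuracy with k-fold cross-validation.

    data     : list of examples (e.g., (features, label) pairs)
    train    : function(list of examples) -> classifier        (hypothetical)
    evaluate : function(classifier, list of examples) -> accuracy in [0, 1]
    """
    data = list(data)
    random.shuffle(data)                       # split the data at random
    folds = [data[i::k] for i in range(k)]     # k disjoint parts S1, ..., Sk

    accuracies = []
    for i in range(k):
        test_set = folds[i]                    # Si is the test set
        train_set = [x for j, fold in enumerate(folds) if j != i for x in fold]
        classifier = train(train_set)          # train on the remaining data
        accuracies.append(evaluate(classifier, test_set))   # accuracy Ai

    return sum(accuracies) / k                 # average of the Ai = final accuracy
```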
“Confusion matrix” for a given class c
                                     Predicted (or “classified”)
                                     Positive (in class c)     Negative (not in class c)
Actual Positive (in class c)         True Positive             False Negative
Actual Negative (not in class c)     False Positive            True Negative
Evaluating classification algorithms
Example: “A” vs. “B”
Results from Perceptron:
Instance   Class   Perceptron Output
   1        “A”        −1
   2        “A”        +1
   3        “A”        +1
   4        “A”        −1
   5        “B”        +1
   6        “B”        −1
   7        “B”        −1
   8        “B”        −1
Assume “A” is the positive class.

Confusion Matrix:
                     Predicted Positive    Predicted Negative
Actual Positive              2                     2
Actual Negative              1                     3

Accuracy: (2 + 3) / 8 = 0.625
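A small sketch that reproduces these numbers from the table above, assuming an output of +1 means “classified as A” and “A” is the positive class:

```python
# (true class, perceptron output) pairs from the example table
results = [("A", -1), ("A", +1), ("A", +1), ("A", -1),
           ("B", +1), ("B", -1), ("B", -1), ("B", -1)]

tp = sum(1 for c, out in results if c == "A" and out == +1)   # true positives
fn = sum(1 for c, out in results if c == "A" and out == -1)   # false negatives
fp = sum(1 for c, out in results if c == "B" and out == +1)   # false positives
tn = sum(1 for c, out in results if c == "B" and out == -1)   # true negatives

accuracy = (tp + tn) / len(results)
print(tp, fn, fp, tn, accuracy)   # 2 2 1 3 0.625
```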
In the confusion matrix:
• a False Negative (a positive instance classified as negative) is a Type 2 error
• a False Positive (a negative instance classified as positive) is a Type 1 error
Precision and Recall

[Diagram: the test set split into positive and negative instances, overlaid with the sets of instances the classifier labels positive and negative.]

• Precision = TP / (TP + FP): of the instances classified as positive, the fraction that are actually positive
• Recall = TP / (TP + FN): of the actually positive instances, the fraction classified as positive
• Recall = Sensitivity = True Positive Rate
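A sketch of the two measures written in terms of confusion-matrix counts (the guards against division by zero are an assumption, not something from the slides):

```python
def precision(tp, fp):
    """Of everything classified as positive, the fraction that is truly positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    """Of all truly positive instances, the fraction classified as positive.
    Also called sensitivity or the true positive rate (TPR)."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# For the "A" vs. "B" example above: tp = 2, fp = 1, fn = 2
print(precision(2, 1))   # 0.666...
print(recall(2, 2))      # 0.5
```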
Example: “A” vs. “B”
Results from the perceptron (same table as above); assume “A” is the positive class.
Results of classifier (compute accuracy, precision, and recall at each threshold):

Threshold   Accuracy   Precision   Recall
   .9
   .8
   .7
   .6
   .5
   .4
   .3
   .2
   .1
   −∞
Creating a Precision vs. Recall Curve
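A sketch of how the curve could be built, assuming the classifier produces a real-valued score per instance (higher means “more positive”) rather than just a ±1 decision:

```python
def precision_recall_curve(scores, labels, thresholds):
    """labels: True for the positive class; score >= threshold => classified positive."""
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        prec = tp / (tp + fp) if (tp + fp) > 0 else 1.0   # convention when nothing is classified positive
        rec = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        points.append((rec, prec))             # plot precision (y-axis) vs. recall (x-axis)
    return points
```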
A ROC curve plots the True Positive Rate (“sensitivity”) on the y-axis against the False Positive Rate (1 − “specificity”) on the x-axis.
Results of classifier (compute TPR and FPR at each threshold):

Threshold   TPR   FPR
   .9
   .8
   .7
   .6
   .5
   .4
   .3
   .2
   .1
   −∞
Creating a ROC Curve
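Each threshold contributes one point to the ROC curve; a sketch of how a single (FPR, TPR) point could be computed:

```python
def roc_point(scores, labels, threshold):
    """Return (FPR, TPR) at one threshold.
    labels: True for the positive class; score >= threshold => classified positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and not y)
    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0   # true positive rate (recall)
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0   # false positive rate
    return fpr, tpr
```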
Precision/Recall versus ROC curves
http://blog.crowdflower.com/2008/06/aggregate-turker-judgments-threshold-calibration/
• There are various algorithms for computing AUC (the area under the ROC curve), often built into statistical software; we won’t cover them in this class.
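One common way to compute the area (not necessarily what any particular package does) is the trapezoidal rule over the ROC points; a sketch:

```python
def auc_trapezoid(roc_points):
    """roc_points: (fpr, tpr) pairs obtained from thresholds ranging over all scores."""
    pts = sorted(roc_points)                     # order points by increasing FPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0      # trapezoid between adjacent points
    return area
```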
How to create a ROC curve for a perceptron
• Run your perceptron on each instance in the test data, without doing the sgn step: record the raw score w · x for each instance
• Get the range [min, max] of your scores
• Divide the range into about 200 thresholds, including −∞ and max
• For each threshold, calculate TPR and FPR
• Plot TPR (y-axis) vs. FPR (x-axis)
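Putting the steps above together: a sketch assuming a learned weight vector weights (with any bias folded into the features), reusing the roc_point helper sketched earlier:

```python
def perceptron_roc(weights, test_instances, test_labels, num_thresholds=200):
    """test_labels: True for the positive class."""
    # Raw perceptron score for each test instance, without the sgn step
    scores = [sum(w * x for w, x in zip(weights, xs)) for xs in test_instances]

    # About num_thresholds thresholds spanning [min, max] of the scores, plus -infinity
    lo, hi = min(scores), max(scores)
    step = (hi - lo) / (num_thresholds - 1)
    thresholds = [lo + i * step for i in range(num_thresholds)] + [float("-inf")]

    # One (FPR, TPR) point per threshold; sort so the curve runs left to right
    curve = sorted(roc_point(scores, test_labels, t) for t in thresholds)
    return curve   # plot TPR (y-axis) vs. FPR (x-axis)
```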