Performance Measures in Data Mining
Common Performance Measures Used in Data Mining and Machine Learning Approaches

L. Richter, J.M. Cejuela
Department of Computer Science, Technische Universität München
Master Lab Course Data Mining, SS 2015, Jul 1st
Outline

Item Set and Association Rule Weights
  Simple Measures
  Complex Measures
Classification
  Basic Performance Measures
  Complex Measures – Performance Curves
Regression
Assessment Strategies
Item Set and Association Rule Weights
Simple Measures
Measures for Item Sets
Various algorithms can yield frequent item sets. From frequent item sets c and c ∪ {i} you can derive the rule: if c then {i}. Typically there is only one item in the RHS (right-hand side) of the rule.

- Support (of an item set): sup(X → Y) = sup(Y → X) = P(X ∧ Y), i.e. how often the item set is found in the database
- Confidence (of a rule): conf(X → Y) = P(Y | X) = P(X ∧ Y)/P(X) = sup(X → Y)/sup(X) (both are computed in the sketch below)
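As a minimal sketch in plain Python, both measures can be computed directly over transaction data; the `support`/`confidence` helpers and the market-basket transactions below are illustrative, not from the slides:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """conf(X -> Y) = sup(X and Y) / sup(X)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

# Hypothetical market-basket data.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter"},
]
print(support(transactions, {"bread", "butter"}))       # 0.5
print(confidence(transactions, {"bread"}, {"butter"}))  # 0.666...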
Item Set and Association Rule Weights
Complex Measures
Measures for Item Sets (cont'd)
These measures express how frequent an item set is, or how interesting a rule is, compared to its expected occurrence under independence:

- Leverage (of an item set): lev(X → Y) = P(X ∧ Y) − P(X)P(Y)
- Lift (of a rule): lift(X → Y) = lift(Y → X) = P(X ∧ Y)/(P(X)P(Y)) = conf(X → Y)/sup(Y) = conf(Y → X)/sup(X)
- Conviction (of a rule): similar to lift, but directed. It compares the probability that X appears without Y, assuming they were independent, with the observed frequency of X without Y: conviction(X → Y) = P(X)P(¬Y)/P(X ∧ ¬Y) = (1 − sup(Y))/(1 − conf(X → Y)) (all three appear in the sketch below)
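Continuing the sketch above and reusing its hypothetical `support` and `confidence` helpers, the three measures could look like this:

```python
def leverage(transactions, lhs, rhs):
    """lev(X -> Y) = P(X and Y) - P(X) * P(Y)."""
    return (support(transactions, set(lhs) | set(rhs))
            - support(transactions, lhs) * support(transactions, rhs))

def lift(transactions, lhs, rhs):
    """lift(X -> Y) = P(X and Y) / (P(X) * P(Y)); symmetric in X and Y."""
    return (support(transactions, set(lhs) | set(rhs))
            / (support(transactions, lhs) * support(transactions, rhs)))

def conviction(transactions, lhs, rhs):
    """conviction(X -> Y) = (1 - sup(Y)) / (1 - conf(X -> Y)).
    Directed; undefined (division by zero) when the rule holds with confidence 1."""
    return ((1 - support(transactions, rhs))
            / (1 - confidence(transactions, lhs, rhs)))
```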
Item Set and Association Rule Weights
Complex Measures
Supplementary Material
J-Measure

- empirically observed accuracy of a rule
- cross-entropy (measuring how well one distribution approximates another) between the binary variables φ and θ, with vs. without conditioning on the event θ:

J(θ ⇒ φ) = p(θ) · [ p(φ|θ) log( p(φ|θ) / p(φ) ) + (1 − p(φ|θ)) log( (1 − p(φ|θ)) / (1 − p(φ)) ) ]
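A direct transcription of the formula into Python (a sketch; it assumes all probabilities lie strictly between 0 and 1 so both logarithms are defined):

```python
from math import log

def j_measure(p_theta, p_phi, p_phi_given_theta):
    """J(theta => phi): cross-entropy between phi with and without
    conditioning on theta, weighted by p(theta)."""
    q = p_phi_given_theta
    return p_theta * (q * log(q / p_phi)
                      + (1 - q) * log((1 - q) / (1 - p_phi)))
```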
Classification
Basic Performance Measures
Basic Building Blocks for Performance Measures
- True Positive (TP): positive instances predicted as positive
- True Negative (TN): negative instances predicted as negative
- False Positive (FP): negative instances predicted as positive
- False Negative (FN): positive instances predicted as negative
- Confusion Matrix (computed in the sketch below):

             Predicted a   Predicted b
  Real a         TP            FN
  Real b         FP            TN
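As a sketch, scikit-learn's `confusion_matrix` computes these counts from label vectors; the labels below are hypothetical, and `labels=[1, 0]` reproduces the row/column order of the table above:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical gold labels (1 = positive)
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical predictions

# rows = real class, columns = predicted class -> [[TP, FN], [FP, TN]]
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
(tp, fn), (fp, tn) = cm
print(tp, fn, fp, tn)  # 3 1 1 3
```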
Classification
Basic Performance Measures
Performance Measures
Accuracy, acc = (TP + TN) / (TP + FN + FP + TN)
             = number of correct predictions / total number of predictions

Error rate, err = (FN + FP) / (TP + FN + FP + TN) = 1 − acc
                = number of wrong predictions / total number of predictions
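Continuing with the hypothetical counts from the confusion-matrix sketch above:

```python
total = tp + fn + fp + tn
acc = (tp + tn) / total   # 0.75
err = (fn + fp) / total   # 0.25, equals 1 - acc
```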
Classification
Basic Performance Measures
Performance Measures cont’d
True Positive Rate, TPR, Sensitivity = TP / (TP + FN)

True Negative Rate, TNR, Specificity = TN / (TN + FP)

False Positive Rate, FPR = FP / (TN + FP)

False Negative Rate, FNR = FN / (TP + FN)
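The four rates, again from the same hypothetical counts:

```python
tpr = tp / (tp + fn)  # sensitivity, 0.75
tnr = tn / (tn + fp)  # specificity, 0.75
fpr = fp / (tn + fp)  # = 1 - tnr
fnr = fn / (tp + fn)  # = 1 - tpr
```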
Classification
Basic Performance Measures
Performance Measures cont’d
Precision, p = TP / (TP + FP)

Recall, r = TP / (TP + FN)

F1 measure = 2rp / (r + p) = 2·TP / (2·TP + FP + FN)
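scikit-learn provides these directly; a sketch with the hypothetical label vectors from the confusion-matrix example:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

p  = precision_score(y_true, y_pred)  # TP / (TP + FP) = 0.75
r  = recall_score(y_true, y_pred)     # TP / (TP + FN) = 0.75
f1 = f1_score(y_true, y_pred)         # 2rp / (r + p)  = 0.75
```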
Classification
Complex Measures – Performance Curves
Performance Curves
I "costs" of different error types are differentI prediction behaviour changes over the test setI performance display in 2DI different domains prefer different chart types
Classification
Complex Measures – Performance Curves
Lift Charts
Figure taken from http://www.dmg.org/rfc/pmml-3.2/ModelExplanation.html

- from the marketing area, to evaluate mailing success
- y-axis: number or percentage of responders
- x-axis: sample
- red diagonal: random lift
- green line: optimum lift
Classification
Complex Measures – Performance Curves
ROC Curves
Figure taken from http://en.wikipedia.org/wiki/File:Roccurves.png

- Receiver Operating Characteristic
- y-axis: TPR
- x-axis: FPR
- diagonal: random guessing (TPR = FPR)
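A minimal sketch of computing a ROC curve with scikit-learn, assuming a classifier that outputs continuous scores (the labels and scores below are hypothetical):

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 1, 0]                   # hypothetical labels
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5]  # hypothetical scores

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(roc_auc_score(y_true, y_scores))  # area under the ROC curve
```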
Classification
Complex Measures – Performance Curves
Sensitivity vs. Specificity
Figure taken from http://i.stack.imgur.com/fnUd2.png

- preferred in medicine
- y-axis: TPR (sensitivity)
- x-axis: TNR (specificity)
- also frequently shown as a ROC curve with 1 − specificity
Classification
Complex Measures – Performance Curves
Recall Precision Curves
Figure taken from http://scikit-learn.github.io/scikit-learn.org/

- preferred in information retrieval
- positives are the documents retrieved in response to a query
- true positives are documents really relevant to the query
- y-axis: precision
- x-axis: recall
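Analogously to the ROC sketch, the curve points come from scikit-learn's `precision_recall_curve` (reusing the hypothetical labels and scores from above):

```python
from sklearn.metrics import precision_recall_curve

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
```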
Regression
Error Measures for Regression
Mean squared error, MSE = ((p1 − a1)² + … + (pn − an)²) / n

Root mean squared error, RMSE = √( ((p1 − a1)² + … + (pn − an)²) / n )

Mean absolute error, MAE = (|p1 − a1| + … + |pn − an|) / n

where p1, …, pn are the predicted and a1, …, an the actual values.
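A numpy sketch of the three measures (the prediction and target vectors are hypothetical):

```python
import numpy as np

p = np.array([2.5, 0.0, 2.1, 7.8])   # hypothetical predictions
a = np.array([3.0, -0.5, 2.0, 8.0])  # hypothetical actual values

mse  = np.mean((p - a) ** 2)
rmse = np.sqrt(mse)
mae  = np.mean(np.abs(p - a))
```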
Regression
Relative Error Measures
Relative squared error = ((p1 − a1)² + … + (pn − an)²) / ((a1 − ā)² + … + (an − ā)²)

Root relative squared error = √( ((p1 − a1)² + … + (pn − an)²) / ((a1 − ā)² + … + (an − ā)²) )

Relative absolute error = (|p1 − a1| + … + |pn − an|) / (|a1 − ā| + … + |an − ā|)

where ā is the mean of the actual values.
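Continuing the numpy sketch, with ā computed as the mean of the actual values:

```python
a_mean = a.mean()
rse  = np.sum((p - a) ** 2) / np.sum((a - a_mean) ** 2)
rrse = np.sqrt(rse)
rae  = np.sum(np.abs(p - a)) / np.sum(np.abs(a - a_mean))
```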
Assessment Strategies
General Problem
- each algorithm abstracts from observations (instances)
- the aspects kept and the aspects discarded differ between learning schemes (inductive bias)
- this also means: information about individual instances is contained in the model, too
- individual instance information leads to overfitting
Assessment Strategies
Solution Strategies
- use fresh data, i.e. instances not used for training
- for very large numbers of instances: a simple split into training and test set
- most common: 10-fold cross-validation (see the sketch below)
- LOOCV: leave-one-out cross-validation
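A minimal sketch of 10-fold cross-validation with scikit-learn; the dataset and the learner below are stand-ins:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)   # stand-in dataset
clf = DecisionTreeClassifier()      # stand-in learner

scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
print(scores.mean(), scores.std())
```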