On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection

Vivian Lai and Chenhao Tan
@vivwylai | @chenhaotan
vivlai.github.io | chenhaot.com
University of Colorado Boulder
deception.machineintheloop.com

On Human Predictions with Explanations and Predictions of … · 2020. 5. 13. · Explanationshelp increase humans trust on predictions Trust (%) p

https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

Risk assessment: COMPAS

Most previous studies are concerned with the impact of such tools used in full automation

In Wisconsin, judges are required to take account of the algorithm’s limitations

In the end, though, Justice Bradley allowed sentencing judges to use Compas. They must take account of the algorithm's limitations and the secrecy surrounding it, she wrote, but she said the software could be helpful “in providing the sentencing court with as much information as possible in order to arrive at an individualized sentence.”

https://www.nytimes.com/2017/05/01/us/politics/sent-to-prison-by-a-software-programs-secret-algorithms.html

Full automation is not desired

How do judges make decisions with COMPAS?

How do humans make decisions with machine assistance in challenging tasks?

Full human agency

Full automation

A spectrum between full human agency and full automation

Full human agency
Showing only explanations (by highlighting salient information)
Showing machine predicted labels
Showing machine predicted labels and explanations
Showing machine predicted labels and suggesting high accuracy
Full automation

Deception Detection as a Case Study

Machine accuracy: 87%
Human accuracy: ~50%

I would not stay at this hotel again. The rooms had a fowl odor. It seemed as though the carpets have never been cleaned. The neighborhood was also less than desirable. The housekeepers seemed to be snooping around while they were cleaning the rooms. I will say that the front desk staff was friendly albeit slightly dimwitted.

The machine predicts that the below review is deceptive

I would not stay at this hotel again. The rooms had a fowl odor. It seemed as though the carpets have never been cleaned. The neighborhood was also less than desirable. The housekeepers seemed to be snooping around while they were cleaning the rooms. I will say that the front desk staff was friendly albeit slightly dimwitted.

Can explanations alone improve human performance?

Explanations alone slightly improve human performance

Accuracy (%):
Machine: 87%
Heatmap: 57.6%
Highlight: 55.9%
Examples: 54.4%
Control: 51.1%

p-values shown on the chart: p=0.006, p<0.001, p=0.056

Predicted labels > explanations

Explicit accuracy improves human performance drastically

Accuracy (%):
Machine: 87%
Predicted label with accuracy: 74.6%
Predicted label without accuracy: 61.9%
Heatmap: 57.6%
Control: 51.1%

p-values shown on the chart: p<0.001, p<0.001, p<0.001

Tradeoff between human performance and human agency

Higher agency, lower performance

Lower agency, higher performance

Can explanations moderate this tradeoff?

Predicted labels + explanations ≈ explicit accuracy

Accuracy (%):
Machine: 87%
Predicted label with accuracy: 74.6%
Predicted label & heatmap: 72.5%
Predicted label without accuracy: 61.9%

p-values shown on the chart: p<0.001, p<0.001
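
To put the accuracy numbers from the charts side by side, the sketch below computes each condition's gain over the control group in percentage points, plus the share of the control-to-machine gap that the condition closes. The "gap closed" ratio is a hypothetical summary introduced here for illustration only; it does not appear in the slides. The function names are likewise assumptions.

```python
# Accuracy (%) per condition, copied from the charts in these slides.
ACCURACY = {
    "Control": 51.1,
    "Examples": 54.4,
    "Highlight": 55.9,
    "Heatmap": 57.6,
    "Predicted label without accuracy": 61.9,
    "Predicted label & heatmap": 72.5,
    "Predicted label with accuracy": 74.6,
}
MACHINE = 87.0  # machine accuracy (%) reported in the slides


def gain_over_control(condition):
    """Percentage-point gain of a condition over the control group."""
    return round(ACCURACY[condition] - ACCURACY["Control"], 1)


def gap_closed(condition):
    """Share of the control-to-machine gap closed by a condition.

    Hypothetical summary metric, not taken from the slides.
    """
    gap = MACHINE - ACCURACY["Control"]
    return (ACCURACY[condition] - ACCURACY["Control"]) / gap


if __name__ == "__main__":
    for name in ACCURACY:
        print(f"{name}: +{gain_over_control(name):.1f} points, "
              f"{gap_closed(name):.0%} of the gap to the machine")
```

Read this way, explanations alone move humans only a few points above control, while any condition that shows the predicted label closes a much larger part of the gap.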

How much do humans trust the predictions?

Explanations help increase human trust in predictions

Trust (%):
Predicted label with accuracy: 79.6%
Predicted label & heatmap: 78.7%
Predicted label without accuracy: 64.4%

p-values shown on the chart: p<0.001, p<0.001

Humans are more likely to trust predictions when they are correct

Trust (%), split by whether the machine prediction was correct or incorrect:
Predicted label with accuracy: 81.1% (correct), 69.8% (incorrect)
Predicted label & heatmap: 79.4% (correct), 74.1% (incorrect)
Predicted label without accuracy: 65.1% (correct), 60% (incorrect)
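
As a rough consistency check on the two trust charts, the sketch below recombines the trust values for correct and incorrect predictions into an overall figure and compares it with the overall trust reported on the previous chart. It rests on two assumptions that the slides do not state: that the machine was correct on 87% of the instances participants saw, and that the higher value in each pair belongs to correct predictions (as the slide title suggests). The function name is hypothetical.

```python
# Assumption: share of instances where the machine prediction was correct.
MACHINE_ACCURACY = 0.87

# Trust (%) per condition: (when correct, when incorrect, overall).
# The correct/incorrect pairing is an assumption based on the slide title.
TRUST = {
    "Predicted label with accuracy": (81.1, 69.8, 79.6),
    "Predicted label & heatmap": (79.4, 74.1, 78.7),
    "Predicted label without accuracy": (65.1, 60.0, 64.4),
}


def weighted_trust(correct, incorrect, accuracy=MACHINE_ACCURACY):
    """Overall trust implied by mixing the two groups at the given accuracy."""
    return accuracy * correct + (1 - accuracy) * incorrect


if __name__ == "__main__":
    for name, (correct, incorrect, overall) in TRUST.items():
        implied = weighted_trust(correct, incorrect)
        print(f"{name}: implied {implied:.1f}%, reported {overall}%")
```

Under these assumptions the implied values land within a tenth of a point of the reported overall trust, which suggests the two charts describe the same responses.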

Other analysis

Showing varying accuracies: 50, 60, 70

Heterogeneity between participants

Takeaway

Explanations alone only slightly improve human performance

Explanations can moderate the tradeoff