Classifier Inspired Scaling for Training Set Selection Walter Bennette DISTRIBUTION A: Approved for public release: distribution unlimited: 16 May 2016. Case #88ABW-2016- 2511

Classifier Inspired Scaling for Training Set Selectioncredit.pvamu.edu/MCBDA2016/Slides/Day1_WalterAFRL.pdfClassifier Inspired Scaling for Training Set Selection Walter Bennette DISTRIBUTION


Classifier Inspired Scaling for Training Set Selection

Walter Bennette

DISTRIBUTION A: Approved for public release: distribution unlimited: 16 May 2016. Case #88ABW-2016-2511


Outline

· Instance-based classification
· Training set selection
  - ENN
  - DROP3
  - CHC
· Scaling approaches
  - Stratified
  - Classifier inspired
· Experimental results


Instance-based classification


What if there is a large amount of data?


What if there is a huge amount of data?


What if there is a serious amount of data?


Training set selection (TSS)


· Instead of maintaining all of the training data
· Keep only certain necessary data points


Edited Nearest Neighbors (ENN)

Formulation:

· An instance is removed from the training data if it does not agree with the majority of its k nearest neighbors

Effect:

· Makes decision boundaries smoother
· Doesn't remove much data
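The ENN rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the talk: the function names `knn_indices` and `enn`, the Euclidean distance, and the toy data are all assumptions made for the example.

```python
from collections import Counter
import math


def knn_indices(X, i, k, candidates):
    """Indices of the k points in `candidates` nearest to X[i], excluding i itself."""
    others = [j for j in candidates if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def enn(X, y, k=3):
    """Return the indices kept by Edited Nearest Neighbors (assumed Euclidean distance)."""
    all_idx = range(len(X))
    keep = []
    for i in all_idx:
        votes = Counter(y[j] for j in knn_indices(X, i, k, all_idx))
        majority = votes.most_common(1)[0][0]
        # Keep the instance only if it agrees with the majority of its k neighbors.
        if y[i] == majority:
            keep.append(i)
    return keep


# Toy data (hypothetical): index 4 is a "b" point sitting inside the "a" cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
kept = enn(X, y, k=3)
```

On this toy set only the mislabeled-looking point at index 4 is dropped, which matches the slide's remark that ENN smooths boundaries without removing much data.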


DROP3

Formulation:

DROP3(Training set TR): Selection set S.
  Let S = TR after applying ENN.
  For each instance Xi in S:
    Find the k+1 nearest neighbors of Xi in S.
    Add Xi to each of its neighbors' lists of associates.
  For each instance Xi in S:
    Let with = # of associates of Xi classified correctly with Xi as a neighbor.
    Let without = # of associates of Xi classified correctly without Xi.
    If without ≥ with
      Remove Xi from S.
      For each associate a of Xi
        Remove Xi from a's list of neighbors.
        Find a new nearest neighbor for a.
        Add a to its new neighbor's list of associates.
    Endif
  Return S.

· Iterative procedure that compares the accuracy of an instance's associates with and without that instance

Effect:

· Removes much more data than ENN
· Maintains acceptable accuracy
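The DROP3 pseudocode can be turned into a short Python sketch. It follows only the steps listed on the slide, so anything else is an assumption: Euclidean distance, visiting instances in index order, scoring "with" by an associate's first k neighbors and "without" by its remaining k after Xi is taken out, and the helper names `nearest`, `vote` and `drop3`.

```python
from collections import Counter
import math


def nearest(X, i, pool, n):
    """The n members of `pool` closest to X[i], excluding i itself."""
    return sorted((j for j in pool if j != i),
                  key=lambda j: math.dist(X[i], X[j]))[:n]


def vote(y, neighbors):
    """Majority label among `neighbors` (None if the list is empty)."""
    if not neighbors:
        return None
    return Counter(y[j] for j in neighbors).most_common(1)[0][0]


def drop3(X, y, k=3):
    everyone = list(range(len(X)))
    # S = TR after applying ENN.
    S = [i for i in everyone if vote(y, nearest(X, i, everyone, k)) == y[i]]
    # k+1 neighbors per instance, plus the reverse "associates" lists.
    neighbors = {i: nearest(X, i, S, k + 1) for i in S}
    associates = {i: set() for i in S}
    for i in S:
        for n in neighbors[i]:
            associates[n].add(i)
    for i in list(S):
        with_i = sum(vote(y, neighbors[a][:k]) == y[a] for a in associates[i])
        without_i = sum(
            vote(y, [n for n in neighbors[a] if n != i][:k]) == y[a]
            for a in associates[i])
        if without_i >= with_i:
            S.remove(i)
            for a in associates[i]:
                neighbors[a].remove(i)
                # Find a new nearest neighbor for a among the remaining members of S.
                pool = [j for j in S if j not in neighbors[a]]
                new = nearest(X, a, pool, 1)
                if new:
                    neighbors[a].append(new[0])
                    associates[new[0]].add(a)
    return S


# Toy data (hypothetical): index 4 is removed by the initial ENN pass.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
kept = drop3(X, y, k=3)
```

Removed instances stay on the associate lists of their former neighbors in this sketch, so later removal decisions still account for them.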


Genetic algorithm (CHC)

Formulation:

· A chromosome is a subset of the training data
· A binary gene represents each instance
· $\text{Fitness} = \alpha \cdot \text{Accuracy} + (1 - \alpha) \cdot \text{Reduction}$

Effectiveness:

· Removes a large amount of data
· Achieves acceptable accuracy
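The chromosome encoding and fitness function can be illustrated as follows. This sketch covers only the fitness evaluation, not the CHC search itself. It assumes the weighting Fitness = α · Accuracy + (1 − α) · Reduction, measures accuracy by classifying every training instance against the selected subset with k-NN (a simplification), and uses hypothetical names `knn_predict` and `fitness`.

```python
from collections import Counter
import math


def knn_predict(X_sub, y_sub, x, k):
    """Majority label of the k points in X_sub nearest to x."""
    order = sorted(range(len(X_sub)), key=lambda j: math.dist(x, X_sub[j]))[:k]
    return Counter(y_sub[j] for j in order).most_common(1)[0][0]


def fitness(chromosome, X, y, alpha=0.5, k=3):
    """Score a binary chromosome: gene i == 1 means instance i is kept."""
    selected = [i for i, gene in enumerate(chromosome) if gene == 1]
    reduction = 1 - len(selected) / len(X)
    if selected:
        X_sub = [X[i] for i in selected]
        y_sub = [y[i] for i in selected]
        correct = sum(knn_predict(X_sub, y_sub, X[i], k) == y[i]
                      for i in range(len(X)))
        accuracy = correct / len(X)
    else:
        accuracy = 0.0  # an empty subset cannot classify anything
    return alpha * accuracy + (1 - alpha) * reduction


# Toy data (hypothetical): two well-separated clusters of three points each.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
one_per_cluster = fitness([1, 0, 0, 1, 0, 0], X, y, alpha=0.5, k=1)
keep_everything = fitness([1, 1, 1, 1, 1, 1], X, y, alpha=0.5, k=1)
```

Keeping one point per cluster scores higher than keeping everything here, because accuracy is unchanged while reduction rises from 0 to 2/3.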


Scaling


· As datasets grow, TSS becomes more and more expensive
· May be prohibitive
· The vast majority of scaling approaches rely on a stratified approach


No scaling


Stratified scaling
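The slide for stratified scaling is a figure, so the following is only one common reading of the idea: split the training data into disjoint strata with similar class proportions, run a TSS method on each stratum independently, and take the union of what each run keeps. The names `stratify` and `stratified_tss`, the `select` callback and the number of strata are all assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def stratify(y, n_strata, seed=0):
    """Split indices into n_strata disjoint groups with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    strata = [[] for _ in range(n_strata)]
    slot = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin across the strata.
        for i in members:
            strata[slot % n_strata].append(i)
            slot += 1
    return strata


def stratified_tss(X, y, select, n_strata, seed=0):
    """Run `select(X_part, y_part) -> local indices` per stratum; return kept original indices."""
    kept = []
    for stratum in stratify(y, n_strata, seed):
        Xs = [X[i] for i in stratum]
        ys = [y[i] for i in stratum]
        kept.extend(stratum[j] for j in select(Xs, ys))
    return sorted(kept)


# Toy usage with a hypothetical selector that keeps the first instance of each stratum.
X = [(float(i),) for i in range(12)]
y = ["a"] * 6 + ["b"] * 6
strata = stratify(y, n_strata=3)
kept = stratified_tss(X, y, select=lambda Xs, ys: [0], n_strata=3)
```

Each TSS run sees only a fraction of the data, which is where the time saving comes from; the trade-off is that no single run sees neighbors that fall in a different stratum.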


Representative Data Detection (ReDD)

· Lin et al. 2015
· Used for support vector machines and did not consider data reduction


Our approach


Classifier inspired approach

· Based heavily on ReDD
· Used for kNN, with data reduction monitored
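The slides do not spell out the filter pipeline, so the sketch below is one plausible, ReDD-style reading and should be treated as an assumption throughout: run the expensive TSS method on one third of the data, train a filter to predict "kept" versus "dropped" from those results, then let the filter decide which instances in the remaining two thirds to keep. The talk uses a Random Forest as the filter; to stay dependency-free this example swaps in a hypothetical 1-NN stand-in (`NearestNeighborFilter`) with the same `fit`/`predict` shape, and the every-third-instance split is likewise illustrative.

```python
import math


class NearestNeighborFilter:
    """Stand-in filter with fit/predict; the talk uses a Random Forest instead."""

    def fit(self, X, keep_flags):
        self.X, self.flags = list(X), list(keep_flags)
        return self

    def predict(self, X):
        # Copy the keep/drop flag of the closest training instance.
        return [self.flags[min(range(len(self.X)),
                               key=lambda j: math.dist(x, self.X[j]))]
                for x in X]


def classifier_inspired_tss(X, y, select, make_filter):
    """Return kept original indices; `select(X_part, y_part)` yields local indices."""
    idx = list(range(len(X)))
    small = idx[::3]                             # roughly 1/3 (assumed split rule)
    large = [i for i in idx if i % 3 != 0]       # remaining 2/3
    # Run the expensive TSS method on the small part only.
    kept_local = set(select([X[i] for i in small], [y[i] for i in small]))
    flags = [1 if j in kept_local else 0 for j in range(len(small))]
    # Learn to imitate the TSS decisions, then apply that to the large part.
    filt = make_filter().fit([X[i] for i in small], flags)
    predicted = filt.predict([X[i] for i in large])
    kept = [small[j] for j in sorted(kept_local)]
    kept += [i for i, flag in zip(large, predicted) if flag == 1]
    return sorted(kept)


# Toy usage: a hypothetical selector that keeps only points with x >= 1.0.
X = [(i / 10,) for i in range(6)] + [(10 + i / 10,) for i in range(6)]
y = ["a"] * 6 + ["b"] * 6
kept = classifier_inspired_tss(
    X, y,
    select=lambda Xs, ys: [j for j, x in enumerate(Xs) if x[0] >= 1.0],
    make_filter=NearestNeighborFilter)
```

Any object exposing `fit(X, flags)` and `predict(X)` could be passed as `make_filter`, which is how a Random Forest would slot in.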


The filter

The "Balance" dataset

· Determine scale positions
  - Balanced
  - Leaning right
  - Leaning left
· Attributes
  - Left weight
  - Left distance
  - Right weight
  - Right distance


Experimentation

Parameters:

· Learn a Random Forest for the filter
· Split data into 1/3rd, 2/3rd

Design:

· Perform for ENN, CHC, and DROP3 with 3-NN
· Compare no scaling, stratified, and classifier inspired
· Calculate reduction, accuracy, and computation time with 10-fold CV
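The three reported measures can be computed with a small cross-validation harness like the one below. It is a sketch under stated assumptions, not the experimental code: folds are assigned by index modulo the fold count, reduction is the fraction of each training fold that is discarded, time covers only the selection step, and the names `knn_predict`, `mean` and `evaluate` are hypothetical.

```python
from collections import Counter
import math
import time


def knn_predict(X_sub, y_sub, x, k):
    """Majority label of the k points in X_sub nearest to x (None if X_sub is empty)."""
    if not X_sub:
        return None
    order = sorted(range(len(X_sub)), key=lambda j: math.dist(x, X_sub[j]))[:k]
    return Counter(y_sub[j] for j in order).most_common(1)[0][0]


def mean(values):
    return sum(values) / len(values)


def evaluate(X, y, select, n_folds=10, k=3):
    """Return mean (reduction, accuracy, seconds) over n_folds; needs len(X) >= n_folds."""
    n = len(X)
    reductions, accuracies, seconds = [], [], []
    for fold in range(n_folds):
        test = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        start = time.perf_counter()
        kept = select(X_tr, y_tr)                 # local indices into the fold
        seconds.append(time.perf_counter() - start)
        X_sel = [X_tr[j] for j in kept]
        y_sel = [y_tr[j] for j in kept]
        reductions.append(1 - len(kept) / len(train))
        correct = sum(knn_predict(X_sel, y_sel, X[i], k) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return mean(reductions), mean(accuracies), mean(seconds)


# Toy usage: "no selection" baseline on two well-separated, interleaved classes.
X = [(float(i % 2) * 10 + i * 0.01,) for i in range(20)]
y = ["a" if i % 2 == 0 else "b" for i in range(20)]
reduction, accuracy, elapsed = evaluate(
    X, y, select=lambda Xs, ys: list(range(len(Xs))))
```

Passing a different `select` (for example an ENN, DROP3 or scaled variant) in place of the keep-everything baseline gives comparable numbers for each configuration.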


Datasets

· 10 experimental datasets from KEEL

Reduction


Accuracy


Time


Results

· Maintains accuracy (mostly)
· Maintains data reduction
· Slower than stratified approach, but may improve for larger datasets


Future work

· Perform for many more datasets
· Apply to very large datasets
· Investigate if damage can be spotted a priori


Conclusion

Promising candidate for scaling Training Set Selection to large datasets


Questions

Walter Bennette [email protected] 315-330-4957
