Feature extraction for change detection Can you detect an abrupt change in this picture? Ludmila I Kuncheva School of Computer Science Bangor University

  • Slide 1
  • Slide 2
  • Feature extraction for change detection. Can you detect an abrupt change in this picture? (Answer at the end.) Ludmila I Kuncheva, School of Computer Science, Bangor University
  • Slide 3
  • Plan: 1. Zeno says there is no such thing as change... 2. If change exists, is it a good thing? 3. Context or nothing! 4. Feature extraction for change detection: PCA backwards?
  • Slide 4
  • Zeno of Elea (ca. 490-430 BC). Zeno's Paradox of the Arrow: "If everything, when it occupies an equal space, is at rest, and if that which is in locomotion is always occupying such a space at any moment, the flying arrow is therefore motionless." (as recounted by Aristotle, Physics VI:9, 239b5). No motion, no movement, NO CHANGE.
  • Slide 5
  • Does change exist? Zeno says no...
  • Slide 6
  • Nonetheless... Change types. Possible applications: fraud detection, market analysis, medical condition monitoring, network traffic control. Univariate detectors (control charts): Shewhart's method, CUSUM (CUmulative SUM), SPRT (Wald's Sequential Probability Ratio Test).
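As an illustration of the univariate detectors named on this slide, here is a minimal sketch of a two-sided tabular CUSUM. The function name `cusum_alarm` and the parameters `target`, `slack` and `threshold` are hypothetical choices made for this example, not taken from the slides.

```python
def cusum_alarm(x, target, slack, threshold):
    """Return the index of the first CUSUM alarm, or None if there is none.

    target    -- assumed in-control mean of the stream
    slack     -- allowance subtracted at every step (tolerated drift)
    threshold -- alarm level for the cumulative sums
    """
    s_hi = 0.0  # accumulates evidence of an upward shift
    s_lo = 0.0  # accumulates evidence of a downward shift
    for t, value in enumerate(x):
        s_hi = max(0.0, s_hi + (value - target) - slack)
        s_lo = max(0.0, s_lo - (value - target) - slack)
        if s_hi > threshold or s_lo > threshold:
            return t
    return None


# Toy stream: the mean jumps from 0 to 1 at index 50.
stream = [0.0] * 50 + [1.0] * 50
alarm = cusum_alarm(stream, target=0.0, slack=0.5, threshold=4.0)
```

Unlike a per-point threshold, the cumulative sums let small but persistent shifts build up until they cross the alarm level.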
  • Slide 7
  • 2 approaches: (1) Use an adaptive algorithm (no need to identify the type of change or detect change explicitly). (2) Detect change (update/re-train the algorithm if necessary), with labelled data or unlabelled data.
  • Slide 8
  • Classification. Labels are available: Data (all features) → Classifier (distribution modelling) → Error rate (change statistic) → threshold → Change / NO change
  • Slide 9
  • Labels are available: Data (all features) → Classifier (distribution modelling) → Error rate (change statistic) → threshold → Change / NO change. Labels are NOT available: Data (all features) → Feature EXTRACTOR → Features (multidimensional) → Distribution modelling → Change statistic → threshold → Change / NO change
  • Slide 10
  • Labels are available: Data (all features) → Classifier → Error rate → threshold → Change / NO change. Labels are NOT available: Data (all features) → Feature EXTRACTOR → Features → clustering, kernel methods, GMM, kd-trees → Hotelling → threshold → Change / NO change. Also listed: GMM, HMM, Parzen windows, kernel methods, martingales.
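The slide lists Hotelling as a change statistic for the unlabelled branch. A minimal sketch of the two-sample Hotelling T² statistic between two windows follows. The function name `hotelling_t2` and the synthetic windows are assumptions for illustration, and the statistic still has to be compared against a threshold chosen by the user.

```python
import numpy as np


def hotelling_t2(W1, W2):
    """Two-sample Hotelling T^2 between windows W1 and W2 (rows = objects)."""
    W1 = np.asarray(W1, dtype=float)
    W2 = np.asarray(W2, dtype=float)
    n1, n2 = len(W1), len(W2)
    diff = W1.mean(axis=0) - W2.mean(axis=0)
    # Pooled covariance of the two windows (assumes a common covariance).
    pooled = ((n1 - 1) * np.atleast_2d(np.cov(W1, rowvar=False))
              + (n2 - 1) * np.atleast_2d(np.cov(W2, rowvar=False))) / (n1 + n2 - 2)
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled, diff))


# Hypothetical data: two windows from the same distribution, one shifted copy.
rng = np.random.default_rng(0)
window_a = rng.standard_normal((200, 3))
window_b = rng.standard_normal((200, 3))
t2_same = hotelling_t2(window_a, window_b)
t2_shifted = hotelling_t2(window_a, window_b + 1.0)
```

Note that T² reacts to a shift in the mean only; a change confined to the variance or the correlation structure would need a different statistic.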
  • Slide 11
  • Classification. A change in the (unconditional) data distribution will: 1. render the classifier useless 2. make no difference to the classification performance 3. improve the classification performance
  • Slide 12
  • A change in the (unconditional) data distribution will: 1. render the classifier useless 2. make no difference to the classification performance 3. improve the classification performance. Vote, please!
  • Slide 14
  • Classification. No change in the (unconditional) data distribution will: 1. render the classifier useless 2. make no difference to the classification performance 3. improve the classification performance
  • Slide 15
  • No change in the (unconditional) data distribution will: 1. render the classifier useless 2. make no difference to the classification performance 3. improve the classification performance. Vote, please!
  • Slide 16
  • [Figure labels: Literature; Classifier ensembles; Brain-computer interface; MathWorks products; My scope of interest]
  • Slide 17
  • Change may or may not cause trouble...
  • Slide 18
  • Is there a change?
  • Slide 19
  • Slide 20
  • [Plot legend: mean (moving average); mean ± 2 std; changes] Shewhart with threshold 2 sigma. Yes!
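A minimal sketch of a Shewhart-style check with a 2-sigma threshold, assuming that the mean and standard deviation are estimated from an initial reference segment. The function name `shewhart_flags` and the parameter `n_ref` are hypothetical; the slide's plot may have been produced differently, for example with a moving average.

```python
import numpy as np


def shewhart_flags(x, n_ref, n_sigma=2.0):
    """Indices of points further than n_sigma standard deviations from the
    mean, with both estimated from the first n_ref points (assumed in control)."""
    x = np.asarray(x, dtype=float)
    mu = x[:n_ref].mean()
    sigma = x[:n_ref].std(ddof=1)
    return np.flatnonzero(np.abs(x - mu) > n_sigma * sigma)


# Toy stream: 20 in-control points alternating 0/1, then two outlying points.
series = [0.0, 1.0] * 10 + [5.0, 5.0]
flags = shewhart_flags(series, n_ref=20)
```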
  • Slide 21
  • Is there a change? No!
  • Slide 22
  • Is there a change?
  • Slide 23
  • Yes, for the purposes of "Spot the difference". No, as this is a bee with a flower in the sun.
  • Slide 24
  • Is there a change? No!
  • Slide 25
  • Is there a change? sin(10x) * randn versus sin(20x) * randn. Yes!
  • Slide 26
  • change detection
  • Slide 27
  • Slide 28
  • Change does not exist out of context!
  • Slide 29
  • ENTER Feature Extraction
  • Slide 30
  • Context: Amplitude variability. Feature: AMPLITUDE
  • Slide 31
  • Context: Time series patterns in a fixed window. Feature: A PATTERN IN A FIXED WINDOW
  • Slide 32
  • Context: Children's puzzle. Feature: PIXEL B/W VALUE. Context: Frequency variability (sin(10x) * randn versus sin(20x) * randn). Feature: FREQUENCY
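To make the frequency example concrete, the sketch below generates the two signals and extracts a frequency feature from each. The sampling grid (4096 points on [0, 2π)), the seed and the helper `envelope_peak` are assumptions made for illustration. Because the noise is white, the feature is read off the spectrum of the squared signal, whose expected value sin²(ωx) = (1 - cos(2ωx))/2 oscillates at twice the modulating frequency.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed sampling grid: one window of length 2*pi, so FFT bin k = k cycles.
x = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
before = np.sin(10 * x) * rng.standard_normal(x.size)
after = np.sin(20 * x) * rng.standard_normal(x.size)


def envelope_peak(signal):
    """Dominant non-DC frequency bin of the squared signal."""
    spectrum = np.abs(np.fft.rfft(signal ** 2))
    return int(np.argmax(spectrum[1:]) + 1)  # skip the DC bin at index 0


peak_before = envelope_peak(before)
peak_after = envelope_peak(after)
# An amplitude-type feature barely moves between the two signals.
variance_gap = abs(before.var() - after.var())
```

On this grid the two signals have nearly the same variance, so an amplitude feature would miss the change that the frequency feature exposes.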
  • Slide 33
  • Suppose that CONTEXT is not available. Principal Component Analysis (PCA) captures data variability. Then why not use PCA here? Labels are NOT available: Data (all features) → Feature EXTRACTOR → Features → Distribution modelling → Change statistic → threshold → Change / NO change
  • Slide 34
  • PCA intuition: The components corresponding to the largest eigenvalues are more important.
  • Slide 35
  • But is this the case for change detection? Along PC1 the distributions are similar (small sensitivity to change); along PC2 the distributions are different (large sensitivity to change). Holds for blind: translation, rotation, variance change... Reference: Kuncheva L.I. and W.J. Faithfull, PCA feature extraction for change detection in multidimensional unlabelled data, IEEE Transactions on Neural Networks and Learning Systems, 25(1), 2014, 69-80.
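A small numerical sketch of this point, under the assumption of 2-D Gaussian data with one high-variance and one low-variance direction and a translation of equal size on both features. All names and numbers are hypothetical. The shift is projected onto each principal component and divided by that component's standard deviation, which gives a rough measure of how visible the change is along it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: large spread on the first axis, small on the second.
X = rng.standard_normal((2000, 2)) * np.array([3.0, 0.3])
mean = X.mean(axis=0)

# PCA through the eigen-decomposition of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(X - mean, rowvar=False))
order = np.argsort(eigvals)[::-1]  # PC1 (largest variance) first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# A "blind" translation: the same offset on both original features.
shift = np.array([0.5, 0.5])
shift_along_pcs = eigvecs.T @ shift

# Size of the shift relative to the spread of each component.
effect = np.abs(shift_along_pcs) / np.sqrt(eigvals)
```

Here `effect[1]` (the low-variance component) comes out larger than `effect[0]`, because the same offset is large compared with a small spread and small compared with a large one.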
  • Slide 36
  • Some experiments (negative instances):
      1. Take a data set with n features.
      2. Sample randomly windows W1 and W2 with K objects in each window.
      3. Calculate PCA from W1. Choose a proportion of explained variance and use the remaining (low-variance) components.
      4. Generate a random integer k between 1 and n.
         4(a) Shuffle VALUES: Choose randomly k features. For each chosen feature, shuffle randomly the values for this feature in window W2.
         4(b) Shuffle FEATURES: Choose randomly k features. Randomly permute the respective columns in window W2.
      5. Transform W2 using the calculated PCs and keep the low-variance components.
      6. Calculate the CHANGE DETECTION CRITERION between W1 and W2. Store as NEGATIVE INSTANCE (no change).
      (Step 4 is what introduces the change, so a negative instance presumably leaves W2 unshuffled.)
  • Slide 37
  • Some experiments (positive instances):
      1. Take a data set with n features.
      2. Sample randomly windows W1 and W2 with K objects in each window.
      3. Calculate PCA from W1. Choose a proportion of explained variance and use the remaining (low-variance) components.
      4. Generate a random integer k between 1 and n.
         4(a) Shuffle VALUES: Choose randomly k features. For each chosen feature, shuffle randomly the values for this feature in window W2.
         4(b) Shuffle FEATURES: Choose randomly k features. Randomly permute the respective columns in window W2.
      5. Transform W2 using the calculated PCs and keep the low-variance components.
      6. Calculate the CHANGE DETECTION CRITERION between W1 and W2. Store as POSITIVE INSTANCE (change).
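The protocol on the two slides above can be sketched as follows. Several pieces are assumptions, because the slides do not specify them: the synthetic data set, the 95% explained-variance cut-off, the window size, leaving W2 untouched for negative instances, and above all the change criterion, which here is a symmetric Kullback-Leibler divergence between per-component Gaussians used as a stand-in.

```python
import numpy as np


def fit_low_variance_pcs(W1, explained=0.95):
    """Step 3: PCA on W1; return its mean and the low-variance components."""
    mean = W1.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(W1 - mean, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    n_high = int(np.searchsorted(cumulative, explained)) + 1
    n_high = min(n_high, eigvals.size - 1)  # always leave one component
    return mean, eigvecs[:, n_high:]


def shuffle_values(W2, k, rng):
    """Step 4(a): shuffle the values within each of k randomly chosen features."""
    W = W2.copy()
    for j in rng.choice(W.shape[1], size=k, replace=False):
        W[:, j] = rng.permutation(W[:, j])
    return W


def shuffle_features(W2, k, rng):
    """Step 4(b): permute k randomly chosen columns among themselves."""
    W = W2.copy()
    cols = rng.choice(W.shape[1], size=k, replace=False)
    W[:, cols] = W[:, rng.permutation(cols)]
    return W


def criterion(A, B):
    """ASSUMPTION: symmetric KL divergence between per-component Gaussians.
    The slides do not say which change detection criterion was used."""
    m1, m2 = A.mean(axis=0), B.mean(axis=0)
    v1, v2 = A.var(axis=0) + 1e-12, B.var(axis=0) + 1e-12
    d2 = (m1 - m2) ** 2
    return float(np.sum((v1 + d2) / (2 * v2) + (v2 + d2) / (2 * v1) - 1.0))


def one_instance(data, K, change, rng, explained=0.95):
    """Steps 2 to 6; `change` is None for a negative instance (assumption)."""
    idx = rng.choice(len(data), size=2 * K, replace=False)
    W1, W2 = data[idx[:K]], data[idx[K:]]
    mean, pcs = fit_low_variance_pcs(W1, explained)
    if change is not None:
        k = int(rng.integers(1, data.shape[1] + 1))  # k in 1..n inclusive
        W2 = change(W2, k, rng)
    return criterion((W1 - mean) @ pcs, (W2 - mean) @ pcs)


# Hypothetical correlated data set: 6 features driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 2))
data = latent @ rng.standard_normal((2, 6)) + 0.1 * rng.standard_normal((1000, 6))
pos = [one_instance(data, 50, shuffle_values, rng) for _ in range(100)]
neg = [one_instance(data, 50, None, rng) for _ in range(100)]
```

Passing `shuffle_features` instead of `shuffle_values` gives the FEATURE shuffle variant. With k = 1 that variant leaves W2 unchanged, a property of the protocol as stated rather than of this sketch.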
  • Slide 38
  • Run 100 times for POS and 100 for NEG to get the ROC curve for a given data set. Run 100 times for POS and 100 for NEG without applying PCA to get the ROC curve for the same data set without PCA. Use the Area Under the Curve (AUC), however disputed this might have become recently... Larger AUC corresponds to better change detection.
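Given the criterion values stored for the positive and negative instances, the AUC can be computed without tracing the ROC curve, as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half. A minimal sketch, with the function name `auc` chosen for this example:

```python
import numpy as np


def auc(pos_scores, neg_scores):
    """Area under the ROC curve from raw criterion values.

    Equals the probability that a randomly chosen positive instance scores
    higher than a randomly chosen negative one (ties count as 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))


perfect = auc([3.0, 4.0], [1.0, 2.0])   # every positive outranks every negative
chance = auc([1.0, 2.0], [1.0, 2.0])    # identical score sets
```

Computing this once for the scores obtained with PCA and once for those obtained without it gives the two numbers that the slide proposes to compare.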
  • Slide 39
  • VALUE shuffle
  • Slide 40
  • Slide 41
  • FEATURE shuffle
  • Slide 42
  • Slide 43
  • PCA - use the least relevant components!?
  • Slide 44
  • Conclusion: 1. Change detection may be harmful, beneficial or indifferent to classification performance. 2. Change does not exist out of context, therefore GENERIC algorithms for change detection are somewhat pointless... 3. Feature extraction for change detection may not follow conventional intuition.
  • Slide 45
  • Remember my little puzzle? Can you detect an abrupt change in this picture? [Figure labels: 1-3, 4-6]