Novelty Detection in Data Streams Profa. Elaine Faria UFU - 2018


• Slides based on the papers
  – FARIA, ELAINE R.; GONÇALVES, ISABEL J. C. R.; DE CARVALHO, ANDRÉ C. P. L. F.; GAMA, JOÃO. Novelty detection in data streams. Artificial Intelligence Review, v. 45, p. 235-269, 2016.
  – FARIA, ELAINE RIBEIRO; PONCE DE LEON FERREIRA CARVALHO, ANDRÉ CARLOS; GAMA, JOÃO. MINAS: multiclass learning algorithm for novelty detection in data streams. Data Mining and Knowledge Discovery, v. 30, p. 640-680, 2016.
  – FARIA, ELAINE; GONCALVES, ISABEL; GAMA, JOAO; PONCE DE LEON FERREIRA CARVALHO, ANDRE. Evaluation of Multiclass Novelty Detection Algorithms for Data Streams. IEEE Transactions on Knowledge and Data Engineering (Print), v. 27, p. 2961-2973, 2015.


Introduction

• Novelty Detection (ND) - Definitions

Novelty detection is concerned with identifying abnormal system behaviours and abrupt changes from one regime to another (Lee and Roberts 2008)

The recognition that an input differs in some respect from previous inputs (Perner 2008)

Novelty detection makes it possible to recognize novel concepts, which may indicate the appearance of a new concept, a change occurred in known concepts or the presence of noise (Gama 2010).


Introduction

• Novelty detection
  – is useful in cases where an important class is under-represented in the training set
  – is an important task since, for many problems, we never know whether the currently available training data include all possible object classes
  – allows the recognition of novel profiles (concepts) in unlabeled data


Introduction

• Novelty Detection - Challenges
  – Concept drift
  – Noise and outliers
  – Recurring concepts
  – Concept evolution
    • Number of problem classes increases over time


Introduction

• Data stream applications for ND
  – Intrusion detection
  – Fraud detection
  – Medical diagnosis
  – Detection of interest regions in images
  – Fault detection
  – Spam filter
  – Text classification
  – ....


Introduction

• It is important to distinguish
  – Anomaly detection
  – Outlier detection
  – Novelty detection


Introduction

• Novelty, anomaly and outlier detection all concern finding patterns that differ from the normal (usual) ones
  – Anomaly and outlier detection suggest an undesired pattern
  – Novelty indicates an emergent or new concept that needs to be incorporated into the normal pattern


Novelty detection - Formalization of the Problem

Training set (Offline Phase)
  $D_{tr} = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$
  $X_i$: vector of input attributes for the $i$-th example
  $y_i$: target attribute, with $y_i \in Y_{tr}$ and $Y_{tr} = \{c_1, c_2, \ldots, c_L\}$

When new data arrive (Online Phase)
  $Y_{all} = \{c_1, c_2, \ldots, c_L, \ldots, c_K\}$, $K > L$
  Goal: classify $X_{new}$ in $Y_{all}$


Novelty detection - Phases

• Offline Phase
  – Induces a classifier from a set of labeled examples → known concept about the problem
• Online Phase
  – Classifies new unlabeled examples
  – Identifies novelty patterns
  – Updates the decision model


Offline Phase - Taxonomy


Offline Phase

• Learning task
  – Unsupervised approaches
    • Assume that all the examples in the training set belong to the normal concept
  – Supervised approaches
    • Use the labels of the examples to build the decision model
    • The normal concept is composed of a set of different classes


Online Phase

• Tasks
  – Classification of new examples
  – Detection of novelty patterns
  – Adaptation of the decision model
    • Some algorithms update the decision model in an offline fashion


Online Phase

Classification

Online Phase

• Classification
  – Verify whether a new example can be explained by the current decision model
  – Approach 1
    • Classify new examples only as normal or novelty
  – Approach 2
    • Consider the problem as a multiclass classification task


Online Phase

Classification - Taxonomy

Online Phase

• Classification with unknown label option
  – Examples not explained by the current decision model are not immediately classified
    • They are assigned an unknown profile
  – They are put in a short-term memory for future analysis
    • Used to update the decision model: extensions and novelty patterns
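The unknown-label option can be sketched in a few lines. This is only an illustration under stated assumptions: the decision model is taken to be a list of hyperspheres given as hypothetical `(centre, radius, label)` triples, and an example is "explained" when it falls inside at least one of them. None of these names come from the slides.

```python
import math

def classify_or_unknown(x, hyperspheres):
    """Return the label of the closest hypersphere containing x, or None."""
    best_label, best_dist = None, math.inf
    for centre, radius, label in hyperspheres:
        d = math.dist(x, centre)
        if d <= radius and d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical decision model: (centre, radius, label) triples.
model = [((0.0, 0.0), 1.0, "c1"), ((5.0, 5.0), 1.5, "c2")]
short_term_memory = []  # unknown examples, kept for later analysis

stream = [(0.2, -0.1), (5.5, 4.8), (10.0, 10.0)]
predictions = []
for x in stream:
    label = classify_or_unknown(x, model)
    if label is None:
        short_term_memory.append(x)      # not explained: postpone the decision
        predictions.append("unknown")
    else:
        predictions.append(label)
```

The third example lies outside both hyperspheres, so it is not forced into a known class and waits in the short-term memory instead.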


Online Phase

Detection of novelty patterns

Online Phase

• Detection of novelty patterns
  – Uses unlabeled examples not explained by the current decision model to identify novelty patterns
  – Anomaly detection
    • The presence of a single example not explained by the model indicates anomalous behavior
  – Novelty
    • Composed of a set of cohesive and representative examples not explained by the decision model


Online Phase

Detection of novelty patterns: Taxonomy

Online Phase

Update of the decision model

Online Phase

• Update of the decision model
  – Necessary to address concept drift and concept evolution
  – Can be carried out with or without feedback
  – Forgetting mechanisms
    • Important strategy used to remove outdated concepts


Online Phase

Update of the decision model

Online Phase

• Update of the decision model: External Feedback
  – Approach 1: external feedback
    • Assume that the true labels of all the examples will be available after a delay
    • Unrealistic assumption for data streams
  – Approach 2: active learning
    • Ask the user for the labels of a subset of the examples in the stream
  – Approach 3: without feedback
    • The decision model is updated without information about the true labels of the examples


Online Phase

• Update of the decision model: Forgetting mechanism → important for forgetting previous, outdated concepts
  – Approach 1: Based on an ensemble of classifiers
    • Train a new classifier and replace an old one
  – Approach 2: Based on clusters
    • Clusters that have not received new examples for a long time are removed
  – Approach 3: Based on weight
    • Reduce the weight of the old examples
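As a rough illustration of the cluster-based approach, the sketch below drops clusters whose last update is too old. The dictionary layout, the `last_update` field and the `max_idle` window are assumptions made for this example, not part of any specific algorithm in the slides.

```python
def forget_stale_clusters(clusters, now, max_idle):
    """Keep only clusters updated within the last max_idle timestamps."""
    return [c for c in clusters if now - c["last_update"] <= max_idle]

# Hypothetical clusters with the timestamp of their last absorbed example.
clusters = [
    {"label": "c1", "last_update": 950},
    {"label": "c2", "last_update": 100},
]
active = forget_stale_clusters(clusters, now=1000, max_idle=500)
```

Here `c2` has been idle for 900 timestamps and is forgotten, while `c1` stays in the model.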


Detection of recurring concepts

• Recurring concepts: definition
  – The class definitions may change when previous situations recur, in a periodic or random way, after some period of time (Elwell and Polikar 2011)
  – Special type of concept drift where concepts that appeared in the past may recur in the future (Katakis et al. 2010)


Detection of recurring concepts

• Recurring contexts: Examples
  – Climate change
  – Electricity demand
  – Buyer habits
  – ....


Detection of recurring concepts

It would be a waste of effort to relearn an old concept from scratch for each recurrence (Widmer and Kubat 1996)

– In recurring contexts
  • Instead of forgetting outdated concepts, these concepts should be saved and re-examined at some later time, when they can improve the prediction performance in a cost-effective way


Detection of recurring concepts

• Systems that do not address recurring concepts treat them as novelty
  – Undesirable effects
    • Increase in the false alarm rate
    • Increase in the human effort spent analyzing the false alarms
    • Computational effort spent executing a novelty detection task and learning a class that was already learned


Detection of recurring concepts

• Approaches
  – Approach 1: use an auxiliary ensemble of classifiers that detects recurring classes
  – Approach 2: use c ensembles, one per class
    • Each ensemble is never deleted, only updated
    • c is the number of classes seen so far in the stream
  – Approach 3: use a sleep memory to store clusters that have not been used to classify new examples for a long time
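A minimal sketch of the sleep-memory idea (Approach 3) follows. It assumes clusters are dictionaries with hypothetical `centre`, `radius`, `label` and `last_used` fields, and that a candidate pattern "recurs" when its centre falls inside a sleeping cluster. Both choices are illustrative assumptions.

```python
import math

def move_idle_to_sleep(active, sleep, now, max_idle):
    """Move clusters that have been idle for too long into the sleep memory."""
    still_active = []
    for c in active:
        (sleep if now - c["last_used"] > max_idle else still_active).append(c)
    return still_active, sleep

def wake_if_recurring(candidate_centre, active, sleep):
    """Wake the first sleeping cluster that contains the candidate centre.

    Returns the recurring label, or None if no sleeping cluster matches.
    """
    for c in sleep:
        if math.dist(candidate_centre, c["centre"]) <= c["radius"]:
            sleep.remove(c)
            active.append(c)
            return c["label"]
    return None

active = [
    {"label": "c1", "centre": (0.0, 0.0), "radius": 1.0, "last_used": 990},
    {"label": "c2", "centre": (5.0, 5.0), "radius": 1.0, "last_used": 200},
]
active, sleep = move_idle_to_sleep(active, [], now=1000, max_idle=500)
recurring = wake_if_recurring((5.2, 4.9), active, sleep)
```

Because `c2` is kept rather than deleted, a later group of examples near `(5, 5)` is recognised as a recurrence of `c2` instead of being reported as a new class.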


Treatment of Outliers

• Outliers
  – Data that are isolated, sparse and not present in a representative number
• Novelty detection algorithms
  – Look for a cohesive and representative set of examples
  – Must address the treatment of noise and outliers, which can be confused with the appearance of a new concept or a change in the known concepts


Treatment of Outliers

• Approach for outlier treatment (used by the MCM, ECSMiner, MINAS and OLINDDA algorithms)
  – Store the examples not explained by the current model in a temporary memory
  – Cluster these examples
  – Apply validation criteria to the clusters
    • Examples of validation criteria: cohesiveness, representativeness, separability
    • Clusters that are not valid are potential outliers
• MINAS also proposes removing old examples that have stayed in the temporary memory for a long time
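The validation step can be illustrated with two simple checks. This is a sketch under assumptions: representativeness is taken to mean "at least `min_examples` members" and cohesiveness "mean distance to the centroid at most `max_mean_dist`". The algorithms named above each define their own criteria and thresholds, which are not reproduced here.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def is_valid_cluster(points, min_examples, max_mean_dist):
    """Representativeness: enough examples. Cohesiveness: small mean distance to the centroid."""
    if len(points) < min_examples:
        return False
    c = centroid(points)
    mean_dist = sum(math.dist(p, c) for p in points) / len(points)
    return mean_dist <= max_mean_dist

dense = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2)]
sparse = [(0.0, 0.0), (9.0, 9.0), (0.0, 9.0), (9.0, 0.0)]
tiny = [(3.0, 3.0)]
```

With `min_examples=3` and `max_mean_dist=0.5`, only `dense` passes. `sparse` fails on cohesiveness and `tiny` on representativeness, so both would be treated as potential outliers.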


Examples of Novelty Detection Algorithms for Data Streams

• ECSMiner (Masud et al. 2011)
• OLINDDA (Spinosa et al. 2009)
• MINAS (Faria et al. 2016)
• MCM (Masud et al. 2010)
• CLAM (Al-Khateeb et al. 2012)


ECSMiner

• Supervised algorithm for concept drift and concept evolution
• The decision model is composed of an ensemble of classifiers
  – It assumes that all examples will be labeled after a delay
  – Each classification model is trained on a chunk of data
  – The ensemble is composed of M models
  – The ensemble is continuously updated
    • The model with the highest prediction error is replaced by a new model
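The replacement rule in the last bullet can be sketched as follows. Models are represented here as plain callables and the error is measured on the most recent labeled chunk. Both are assumptions made for illustration, and `threshold_model` is a toy stand-in for a real classifier.

```python
def error_rate(model, labeled_chunk):
    """Fraction of (x, y) pairs in the chunk that the model gets wrong."""
    wrong = sum(1 for x, y in labeled_chunk if model(x) != y)
    return wrong / len(labeled_chunk)

def replace_worst(ensemble, new_model, labeled_chunk):
    """Return a copy of the ensemble with its highest-error member replaced."""
    errors = [error_rate(m, labeled_chunk) for m in ensemble]
    worst_index = errors.index(max(errors))
    updated = list(ensemble)
    updated[worst_index] = new_model
    return updated

def threshold_model(t):
    """Toy one-dimensional classifier: 'a' below the threshold, 'b' otherwise."""
    return lambda x: "a" if x < t else "b"

chunk = [(0.1, "a"), (0.4, "a"), (0.6, "b"), (0.9, "b")]
ensemble = [threshold_model(0.5), threshold_model(0.05), threshold_model(0.45)]
new_model = threshold_model(0.55)
updated = replace_worst(ensemble, new_model, chunk)
```

The second model misclassifies half of the chunk, so it is the one replaced. The ensemble size M stays constant.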


ECSMiner

• Assumptions
  – After Tl timestamps, the true label of the example will be available
  – It is possible to wait up to Tc timestamps before making a decision about the classification of an example
  – Tc < Tl


ECSMiner

• Offline Phase
  – Supervised
  – Ensemble of classifiers
    • Decision tree or KNN
• Online Phase
  – Use the ensemble to classify new examples
  – Store the examples not explained by the ensemble (f-outliers)
  – Build clusters from the f-outliers using K-Means
  – Calculate the q-NSC measure (q-neighbourhood silhouette coefficient)
  – If most of the classifiers yield a positive q-NSC → a novelty is detected


OLINDDA

• Offline Phase
  – Unsupervised
  – Learn a decision model about the normal class
  – The decision model is a set of clusters (k hyperspheres)
    • Clustering algorithm: K-Means


OLINDDA

• Online Phase
  – Unsupervised
  – Use the decision model created in the offline phase to classify new examples as normal
  – Examples not explained by the decision model are put in a short-term memory (unknown)
  – Valid clusters of unknown examples are used to create the extension and novelty models


OLINDDA

Figure labels: Normal, Extension, Novelty

OLINDDA

Figure labels: Normal, Extension, Novelty, Example

• If a new example is inside the radius of one of the hyperspheres, classify it with the label of that hypersphere

OLINDDA

• If the example is not explained by any of the hyperspheres, it is labeled as unknown and stored in a short-term memory


OLINDDA

• If the number of examples in the short-term memory exceeds a threshold, cluster the examples using K-Means
• Only valid clusters (cohesive and representative) are considered


OLINDDA

• A new cluster is
  – Extension
    • In the neighbourhood of the normal model
  – Novelty
    • Distant from the normal model
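The slide does not say how "neighbourhood" is measured, so the sketch below makes one explicit assumption: the neighbourhood of the normal model is the hypersphere centred on the mean of the normal centroids and reaching out to the farthest normal centroid. A validated cluster inside it is labeled an extension, otherwise a novelty. Treat this criterion as illustrative, not as the definitive OLINDDA rule.

```python
import math

def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def label_new_cluster(new_centre, normal_centres):
    """Label a validated cluster as 'extension' or 'novelty'.

    Assumption: the normal neighbourhood is a hypersphere around the mean of
    the normal centroids whose radius reaches the farthest normal centroid.
    """
    global_centre = mean_point(normal_centres)
    reach = max(math.dist(global_centre, c) for c in normal_centres)
    if math.dist(new_centre, global_centre) <= reach:
        return "extension"
    return "novelty"

normal_centres = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
near = label_new_cluster((1.5, 1.5), normal_centres)
far = label_new_cluster((8.0, 8.0), normal_centres)
```

With these four normal centroids the neighbourhood is centred at `(1, 1)`, so a cluster at `(1.5, 1.5)` extends the normal concept and one at `(8, 8)` is reported as a novelty.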


MINAS: MultIclass learNing Algorithm for data Streams

• Offline Phase
  – Learns a decision model based on the known concept about the problem
  – Executed once
  – Each class is represented by a set of clusters (hyperspheres)
• Online Phase
  – Receives new examples and classifies them either as one of the known classes or as unknown
  – Cohesive groups of unknown examples are used to detect new classes or extensions


MINAS - Offline Phase


MINAS - Offline Phase

• Micro-clusters: statistical summary (incremental)
  – N: number of examples
  – LS: linear sum of the examples
  – SS: squared sum of the examples
  – t: timestamp of the arrival of the last example classified by the micro-cluster
• Examples of clustering algorithms used in the training phase
  – K-Means
  – CluStream
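The micro-cluster summary (N, LS, SS, t) can be maintained incrementally without storing the examples. The class below is a minimal sketch: the `centroid` and `radius` helpers are derived quantities added for illustration, and the radius uses the root-mean-square deviation from the centroid, which is one common choice and an assumption here.

```python
import math

class MicroCluster:
    """Incremental summary of a group of examples: N, LS, SS and t."""

    def __init__(self, dim, label=None):
        self.n = 0                  # N: number of examples
        self.ls = [0.0] * dim       # LS: per-attribute linear sum
        self.ss = [0.0] * dim       # SS: per-attribute squared sum
        self.t = None               # t: timestamp of the last absorbed example
        self.label = label

    def add(self, x, timestamp):
        """Absorb one example in O(dim) time."""
        self.n += 1
        for i, v in enumerate(x):
            self.ls[i] += v
            self.ss[i] += v * v
        self.t = timestamp

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        """Root-mean-square deviation from the centroid (assumed definition)."""
        var = sum(self.ss[i] / self.n - (self.ls[i] / self.n) ** 2
                  for i in range(len(self.ls)))
        return math.sqrt(max(var, 0.0))

mc = MicroCluster(dim=2, label="c1")
mc.add((1.0, 2.0), timestamp=10)
mc.add((3.0, 4.0), timestamp=11)
```

After the two updates the summary holds N = 2, LS = [4, 6], SS = [10, 20] and t = 11, which is enough to recover the centroid `[2.0, 3.0]` on demand.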


MINAS - Offline Phase


MINAS

• Online Phase
  – Classify new examples
  – Detect novelty patterns
  – Update the decision model


MINAS - Classification


MINAS - Classification

• Classifying an example as unknown means one of the following
  – The example is noise or an outlier and cannot be explained by any of the micro-clusters
    • The example must be discarded
  – The example represents a concept drift
    • The example must be used to update the decision model
  – The example represents a novelty pattern
    • The example must be used to update the decision model


MINAS – Novelty detection and update


MINAS - Online Phase


MINAS - Online Phase


MINAS-Active Learning

• Used when the labels of a reduced set of examples are available
• Uses active learning techniques to select a representative set of examples to be labeled and used to update the decision model
• Main idea
  – From time to time, select the centroids of the newly created micro-clusters as the examples to be labeled by the specialist
  – Update the decision model with the new labels


Evaluation in Novelty Detection

Multiclass novelty detection algorithms for data streams use binary evaluation measures

• % of examples misclassified in the normal class
• % of normal class examples wrongly classified as novelty
• % of incorrect classifications

• FP: # of examples from the known classes wrongly classified as novelty
• FN: # of examples from the novel classes wrongly classified as known classes
• FE: # of examples from known classes misclassified (other than FP)
• N: # of examples in the stream
• Nc: # of examples from the novel classes
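The formulas for the three percentages did not survive extraction. The sketch below assumes the definitions commonly paired with these counts in the stream-mining literature (for instance in Masud et al. 2011, where the measures are named Mnew, Fnew and ERR). The function and variable names are chosen for this example.

```python
def binary_nd_measures(fp, fn, fe, n, nc):
    """Return (m_new, f_new, err) as percentages.

    Assumed definitions:
      m_new = 100 * FN / Nc              (novel examples missed)
      f_new = 100 * FP / (N - Nc)        (known examples flagged as novelty)
      err   = 100 * (FP + FN + FE) / N   (all misclassifications)
    """
    m_new = 100.0 * fn / nc
    f_new = 100.0 * fp / (n - nc)
    err = 100.0 * (fp + fn + fe) / n
    return m_new, f_new, err

# Hypothetical counts for a stream of 1000 examples, 100 of them novel.
m_new, f_new, err = binary_nd_measures(fp=10, fn=5, fe=15, n=1000, nc=100)
```

With these counts, 5% of the novel examples are missed and 3% of the whole stream is misclassified.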


Evaluation in Novelty Detection

• Binary classification evaluation measures: Problems
  – They treat novelty detection as a binary classification task
    • It is a multiclass classification task
  – They do not consider the unknown examples separately
  – They do not consider that different novelty patterns can appear
  – They evaluate only the final confusion matrix


Evaluation in Novelty Detection (Faria et al. 2013)

• Confusion matrix
  – Not square (rectangular)
  – Number of columns increases over time
  – Novelty patterns have no direct match with the problem classes
  – Presence of unknown examples


Evaluation in Novelty Detection (Faria et al. 2013)

• Rectangular Confusion Matrix
  – Problem
    • Difficult to define hits and errors
    • Matrix is not square
    • Each novelty pattern needs to be assigned to only one class
      – One class may be associated with one or more novelty patterns
  – Solution
    • Representation using a bipartite graph
    • Based on the Hungarian method
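To make the constraint concrete (each novelty pattern maps to exactly one class, while a class may receive several patterns), here is a deliberately simplified sketch. It assigns every pattern to the class that contributes most of its examples. This greedy rule is an assumption for illustration and is not the Hungarian-method-based procedure referred to on the slide.

```python
def assign_patterns_to_classes(confusion):
    """Map each novelty pattern to the true class with the largest count.

    confusion[true_class][pattern] holds the number of examples of
    true_class that the algorithm labeled with that novelty pattern.
    """
    patterns = {p for row in confusion.values() for p in row}
    assignment = {}
    for p in sorted(patterns):
        assignment[p] = max(confusion, key=lambda c: confusion[c].get(p, 0))
    return assignment

# Hypothetical rectangular confusion matrix: 2 true classes, 3 novelty patterns.
confusion = {
    "c3": {"NP1": 40, "NP2": 35, "NP3": 2},
    "c4": {"NP1": 3, "NP2": 1, "NP3": 50},
}
assignment = assign_patterns_to_classes(confusion)
```

Both `NP1` and `NP2` end up associated with `c3`, which shows why the mapping is many-to-one rather than a one-to-one matching.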


Evaluation in Novelty Detection (Faria et al. 2013)

Figure panels: Confusion Matrix; Corresponding Bipartite Graph; Resulting Bipartite Subgraph


Evaluation in Novelty Detection (Faria et al. 2013)

• Unknown examples
  – Problem
    • How should the unknown examples be counted: as hits or as errors?
  – Solution
    • Neither hits nor errors
    • Unknown examples should be computed separately


Evaluation in Novelty Detection (Faria et al. 2013)

• Unknown examples
  – ACCExp + ErrExp = 1
  – ACCExp/ErrExp: accuracy/error considering only the examples explained by the model
  – Unki: # of examples from class Ci classified as unknown
  – ExCi: # of examples from class Ci
  – M: # of classes
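The formula that combines Unki, ExCi and M is missing from the extracted slide. A plausible reading, assumed here, is a per-class average of the unknown fraction: UnkR = (1/M) Σ Unki / ExCi. The sketch computes that quantity, and the dictionary-based interface is hypothetical.

```python
def unknown_rate(unk, ex):
    """Average, over the M classes, of the fraction of examples marked unknown.

    unk[c]: # of examples of class c classified as unknown (Unki)
    ex[c]:  # of examples of class c (ExCi)
    """
    return sum(unk[c] / ex[c] for c in ex) / len(ex)

unk = {"c1": 10, "c2": 30}
ex = {"c1": 100, "c2": 100}
unk_r = unknown_rate(unk, ex)
```

Averaging per class keeps a small class with many unknown examples from being hidden by a large, well-explained one.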


Evaluation in Novelty Detection (Faria et al. 2013)

• Use the evaluation measure CER (Combined Error Rate) to calculate the classification error rate
• Consider only the examples not classified as unknown
  – #Ex′Ci: number of examples from class Ci
  – #Ex′: number of examples
  – FPRi: false positive rate of class Ci
  – FNRi: false negative rate of class Ci
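The CER formula itself is not preserved in the extracted text. The sketch below assumes the form CER = (1/2) Σ (#Ex′Ci / #Ex′) · (FPRi + FNRi), that is, per-class false positive and false negative rates weighted by class frequency among the explained examples. Names and the example figures are illustrative.

```python
def combined_error_rate(ex, fpr, fnr):
    """Assumed CER: half the frequency-weighted sum of per-class FPR and FNR.

    ex[c]:  # of explained (not unknown) examples of class c
    fpr[c]: false positive rate of class c
    fnr[c]: false negative rate of class c
    """
    total = sum(ex.values())
    return 0.5 * sum(ex[c] / total * (fpr[c] + fnr[c]) for c in ex)

ex = {"c1": 60, "c2": 40}
fpr = {"c1": 0.10, "c2": 0.20}
fnr = {"c1": 0.05, "c2": 0.10}
cer = combined_error_rate(ex, fpr, fnr)
```

For these values the weighted terms are 0.6 × 0.15 and 0.4 × 0.30, giving a CER of 0.105.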


Evaluation in Novelty Detection (Faria et al. 2013)

• Evaluation over time: Problem
  – In evolving data streams, it is not sufficient to extract information only from the final confusion matrix
• Solution
  – Plot a 2D graph
    • X represents the data timestamps
    • Y represents the evaluation measure values
  – Plot the information about errors and unknown examples
  – Identify the timestamps at which a new concept was detected
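The series behind such a plot can be computed by evaluating fixed-size windows of the stream. The sketch below is one possible way to do it: the record format `(timestamp, outcome)` and the window size are assumptions, and the error rate is taken over the explained examples only, so that unknown examples are tracked separately.

```python
def rates_over_time(records, window):
    """Return a list of (window_end_timestamp, error_rate, unknown_rate).

    records: list of (timestamp, outcome) with outcome in
             {"hit", "error", "unknown"}, ordered by timestamp.
    """
    series = []
    for start in range(0, len(records), window):
        chunk = records[start:start + window]
        outcomes = [o for _, o in chunk]
        unknown = outcomes.count("unknown")
        explained = len(outcomes) - unknown
        err = outcomes.count("error") / explained if explained else 0.0
        series.append((chunk[-1][0], err, unknown / len(outcomes)))
    return series

records = [
    (1, "hit"), (2, "hit"), (3, "error"), (4, "unknown"),
    (5, "unknown"), (6, "unknown"), (7, "hit"), (8, "error"),
]
series = rates_over_time(records, window=4)
```

Each tuple supplies one X value (the timestamp) and two Y values (error and unknown rates). A jump in the unknown rate, as in the second window here, is the kind of signal that would be marked on the plot next to a novelty detection event.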


References

• Masud M, Gao J, Khan L, Han J, Thuraisingham BM (2011) Classification and novel class detection in concept-drifting data streams under time constraints. IEEE Transactions on Knowledge and Data Engineering 23(6):859–874
• Spinosa EJ, Carvalho ACPLF, Gama J (2009) Novelty detection with application to data streams. Intelligent Data Analysis 13(3):405–422
• Faria ER, Carvalho ACPLF, Gama J (2016) MINAS: multiclass learning algorithm for novelty detection in data streams. Data Mining and Knowledge Discovery 30:640–680
• Masud MM, Chen Q, Khan L, Aggarwal CC, Gao J, Han J, Thuraisingham BM (2010) Addressing concept evolution in concept-drifting data streams. In: Proceedings of the 10th IEEE international conference on data mining (ICDM’10), pp 929–934
• Al-Khateeb TM, Masud MM, Khan L, Thuraisingham B (2012) Cloud guided stream classification using class-based ensemble. In: Proceedings of the 2012 IEEE 5th international conference on computing (CLOUD’12). IEEE Computer Society, Washington, DC, USA, pp 694–701
• Elwell R, Polikar R (2011) Incremental learning of concept drift in nonstationary environments. IEEE Transactions on Neural Networks 22(10):1517–1531
• Katakis I, Tsoumakas G, Vlahavas I (2010) Tracking recurring contexts using ensemble classifiers: an application to email filtering. Knowl Inf Syst 2(3):371–391
• Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Machine Learning 23(1):69–101