Machine Learning Techniques in Categorical Time-Series Analysis of a Manufacturing Process

Haris Michailidis, Isidora Tourni

National Technical University of Athens, School of Electrical and Computer Engineering

Professor: Nectarios Koziris
J&J Responsible: Michalis Avgoulis
Presentation Date: 26/07/2016
Contents
● Problem Motivation
● Visualization
● Machine Learning
● Results
● Future Work
Introduction
Problem Motivation
In cooperation with Johnson & Johnson Hellas
Goals:
● Visualization of Mixing Process
● Quantification of Procedures
● Classification & Clustering of Processes
Further Goals:
● Optimization of the Mixing Process
● Comparison with Golden Standard (Evaluation)
● Comparison between different batches of the same Product
Process Description
Example Vessel Actions:
● Heating
● Agitation
● Addition of Materials
● Pressure adjustment
PLC logging
● Output to CSV
[Flow diagram: Raw Materials → Mixing → Bottling]
Product Categories

[Diagram: Product Cleaning Group, e.g. Emulsion]

Product Categories

[Diagram: Product Group, e.g. Picsou C, Apple Cream]
Data-Set Description

● ~130,000 rows/year
● 45 message code sets (values, set-points)
● Categorical Data
Visualization
Visualization Tool
Goals:
● Visualization of Mixing Process
● Selective representation of variables
● Overview with flexible Timeframe
● Accessible from multiple terminals (web interface)
The human brain processes visuals 60,000 times faster than text. *
* Forrester CSO Insights 2012
Visualization Tool (interface)
Initial Page of the Visualization Tool
Visualization Tool (interface)
4-day overview
Visualization Tool (interface)
1-day overview
Visualization Tool (interface)
Detailed box in a complex visualization
Machine Learning
Machine Learning Introduction

Goal: Explore the possibilities of Machine Learning in the manufacturing space, in order to produce useful insights into the process.

● Classification
● Clustering

Challenges: Representing an object in an N-dimensional space

● Representation of each batch | Object Creation
● Data Cleansing / Creation of training set | Labelling
● “Distance” between batches | Distance Calculation

Inspiration:

● DNA sequence analysis → Markov Models
Solving the Challenges

Unit of analysis: the batch

1. Data cleansing

Value/set-point flattening, typo correction.

2. Labelling of batches

Through 2 files:

● Log file: containing manual entries from operators
● Mapping table: containing information for each product

3. Splitting the time series into batches

Based on business rules derived from experience and observation; keep only production chunks (a minimal splitting sketch follows below).
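The splitting rules themselves are business-specific, but a rough gap-based illustration in pandas might look like this (the column names and the idle-gap threshold are hypothetical, not the project's actual rules):

```python
# A minimal sketch of step 3, assuming a DataFrame `log` with hypothetical
# columns 'timestamp' and 'message_code'; the real rules came from process
# experience, not a single idle-gap threshold.
import pandas as pd

def split_into_batches(log: pd.DataFrame, max_gap: str = "4h") -> list:
    """Start a new batch whenever the log is silent for longer than max_gap."""
    log = log.sort_values("timestamp").reset_index(drop=True)
    new_batch = log["timestamp"].diff() > pd.Timedelta(max_gap)
    # Keep only chunks that plausibly correspond to production runs.
    return [chunk for _, chunk in log.groupby(new_batch.cumsum())
            if len(chunk) >= 10]
```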
Solving the Challenges

4. Feature selection

Message Number (via the Message Mapping Table)

5. Representation of each batch

6. Unequal-length time-series comparison
Transition Matrix Concept

Sequence 1: B-B-C-A-B-C-A-C-A-B-C

Sequence 2: A-A-B-B-A-B-B-C-C-A-B-B-C-A-B-C-A-C

Transition Matrix 1:

        A     B     C
  A   0.00  0.67  0.33
  B   0.00  0.25  0.75
  C   1.00  0.00  0.00

Transition Matrix 2:

        A     B     C
  A   0.17  0.67  0.17
  B   0.14  0.43  0.43
  C   0.75  0.00  0.25
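As a worked sketch, the matrices above can be reproduced in a few lines of Python (the function is illustrative; the actual batches use the 45 message codes as states):

```python
# Build a first-order Markov transition matrix from a categorical sequence,
# with each row normalized to sum to 1 (rows with no outgoing transitions
# stay at zero).
import numpy as np

def transition_matrix(sequence, states):
    index = {s: i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for current, nxt in zip(sequence, sequence[1:]):
        counts[index[current], index[nxt]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

print(transition_matrix(list("BBCABCACABC"), ["A", "B", "C"]).round(2))
# [[0.   0.67 0.33]
#  [0.   0.25 0.75]
#  [1.   0.   0.  ]]
```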
Solving the Challenges

4. Feature selection

Message Number (via the Message Mapping Table)

5. Representation of each batch
6. Unequal-length time-series comparison

Chunk Object, containing:
● Transition Matrix (fixed size 45×45)
● Labels

7. Distance calculation method

A broad research area in its own right.
Distance Evaluation

Goal:

● Distance {batch - batch} → Distance between 2D Transition Matrices

Problems:

1. Choosing the proper Vector Distance Metric
2. Converting a 2D Transition Matrix → Vector

Solutions:

1. Distance between Vectors:
● Euclidean Distance
● Cosine Distance
● Kullback-Leibler Divergence
● Kolmogorov-Smirnov Test
● Infinity Norm
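A sketch of these metrics between two probability vectors (e.g., corresponding transition-matrix rows), using only numpy and scipy; applying the Kolmogorov-Smirnov test to the vectors as samples follows the slide's framing and is an assumption:

```python
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import entropy, ks_2samp

p = np.array([0.00, 0.67, 0.33]) + 1e-9   # small epsilon keeps KL finite
q = np.array([0.17, 0.67, 0.17]) + 1e-9

print(np.linalg.norm(p - q))              # Euclidean distance
print(cosine(p, q))                       # Cosine distance
print(entropy(p, q))                      # Kullback-Leibler divergence KL(p||q)
print(ks_2samp(p, q).statistic)           # Kolmogorov-Smirnov statistic
print(np.linalg.norm(p - q, ord=np.inf))  # Infinity (max) norm
```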
Distance Evaluation

2. 2D Matrix → Vector: *

A. Append each row to the first
B. Append each row from the diagonal matrix to the first
C. Average of distances between corresponding rows

[Diagrams illustrating options A and B]

* Not using Space-Filling Curves due to unrelated spatial characteristics.
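A minimal sketch of the three options, with Euclidean distance standing in for any of the metrics above; the interpretation of option B (each row rotated to start at its diagonal entry) is an assumption based on the slide's wording:

```python
import numpy as np

def option_a(m):
    """A: append each row to the first, i.e. flatten the matrix row by row."""
    return m.ravel()

def option_b(m):
    """B (one reading): rotate each row to start at its diagonal entry."""
    return np.concatenate([np.roll(m[i], -i) for i in range(m.shape[0])])

def option_c(m1, m2):
    """C: average the Euclidean distances between corresponding rows."""
    return float(np.mean(np.linalg.norm(m1 - m2, axis=1)))

t1 = np.array([[0.00, 0.67, 0.33], [0.00, 0.25, 0.75], [1.00, 0.00, 0.00]])
t2 = np.array([[0.17, 0.67, 0.17], [0.14, 0.43, 0.43], [0.75, 0.00, 0.25]])
print(np.linalg.norm(option_a(t1) - option_a(t2)))  # distance via option A
print(option_c(t1, t2))                             # distance via option C
```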
Classification (supervised)

The process of classifying objects according to shared attributes.

Algorithms used:

● Nearest Centroid
● k-Nearest Neighbors

Evaluation Methods:

● Accuracy
● Cohen’s Kappa (Kappa coefficient)

[Diagram: train/test data split]
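A sketch of how such a classifier can run on precomputed batch distances with scikit-learn; the toy vectors, labels, and choice of k are illustrative stand-ins, not the project's actual setup:

```python
# k-NN over a precomputed distance matrix, evaluated with accuracy and
# Cohen's kappa on a held-out test split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)
vectors = rng.random((40, 45 * 45))      # stand-in flattened 45x45 matrices
labels = rng.integers(0, 3, size=40)     # stand-in product-group labels
dist = np.linalg.norm(vectors[:, None] - vectors[None, :], axis=2)

train, test = train_test_split(np.arange(40), test_size=0.2, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5, metric="precomputed")
knn.fit(dist[np.ix_(train, train)], labels[train])
pred = knn.predict(dist[np.ix_(test, train)])
print(accuracy_score(labels[test], pred), cohen_kappa_score(labels[test], pred))
```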
Clustering (unsupervised)

The task of grouping objects in such a way that objects in the same group (cluster) are more similar to each other than to those in other groups.

Algorithms used:

● k-Means

Evaluation Methods:

● V-Measure
● Rand Index
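A comparable sketch for k-Means over flattened transition matrices, scored with V-Measure and the adjusted Rand index from scikit-learn (the adjusted variant stands in for the slide's Rand Index; the data is again an illustrative stand-in):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score, adjusted_rand_score

rng = np.random.default_rng(0)
X = rng.random((40, 45 * 45))              # stand-in flattened matrices
true_groups = rng.integers(0, 3, size=40)  # stand-in product-group labels

pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(v_measure_score(true_groups, pred))
print(adjusted_rand_score(true_groups, pred))
```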
Classification Results
Distance Comparison | Classification
Nearest Centroid Classifier

Train - Test Split Evaluation [1/2]

Classification Baseline (ZeroR):
● Product Cleaning Group Accuracy: 0.520
● Product Group Accuracy: 0.377

[Chart: Nearest Centroid accuracies, 83% and 65%]
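For reference, the ZeroR baseline simply predicts the majority class; a sketch using scikit-learn's DummyClassifier (the 52/48 class split is illustrative, chosen only to echo the 0.520 figure):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

y = np.array(["GroupA"] * 52 + ["GroupB"] * 48)   # 52% majority class
X = np.zeros((100, 1))                            # features are ignored by ZeroR
zeror = DummyClassifier(strategy="most_frequent").fit(X, y)
print(zeror.score(X, y))                          # 0.52
```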
k-Nearest Neighbors Classifier

Train - Test Split Evaluation [2/2]

Classification Baseline (ZeroR):
● Product Cleaning Group Accuracy: 0.520
● Product Group Accuracy: 0.377

[Chart: k-Nearest Neighbors accuracies, 73% and 55%]
Clustering Results

Distance Comparison | Clustering

[Chart: clustering comparison, 33% highlighted]
Conclusions

1) Visualization
   a) Visual Production Overview
   b) Enabling Comparison between Batches

2) Machine Learning
   a) Valid Representation of Categorical Time-Series
   b) Quantification of Production Processes
   c) Application of Machine Learning Techniques
Future Work | Academic

● Research on 2D-specific Distance Metrics
● Clustering Algorithms based on Markov Models
● Classification using Transition Matrices of different Dimensions (Markov-0, 2, ..., N)
● Different Feature Selection (temperature, pressure, etc.)

Future Work | Business

● Data Gathering Automation
● Creation of a Golden Standard for each Product
● Scoring of the Production Process
● Distribution of Batches compared to the Average Batch
● Clustering into more efficient Clusters based on the Process
Thank you!
Questions?
Appendix
Distance Comparison | Classification

Setup:
● Algorithm:
   ○ Nearest Centroid Classifier
● Attributes:
   ○ Product Cleaning Group
   ○ Product Group
● Split:
   ○ 80% training set, 20% test set
● Distances:
   ○ All
Determining k in k-Nearest Neighbors

Setup:
● Algorithm:
   ○ k-Nearest Neighbors
● Attributes:
   ○ Product Cleaning Group
   ○ Product Group
● Split:
   ○ 80% training set, 20% test set
● Distances (Average of):
   ○ Euclidean total
   ○ Cosine vector
   ○ KL-Divergence diagonal
Train - Test Split Evaluation

Setup:
● Algorithm:
   ○ Nearest Centroid Classifier
   ○ k-Nearest Neighbors
● Attributes:
   ○ Product Cleaning Group
   ○ Product Group
● Split (train - test):
   ○ 80% - 20%
   ○ 65% - 35%
   ○ 50% - 50%
● Distances (Average of):
   ○ Euclidean total
   ○ Cosine vector
   ○ KL-Divergence diagonal
Distance Comparison | Clustering

Setup:
● Algorithm:
   ○ Baseline
   ○ k-Means
● Attributes:
   ○ Product Cleaning Group
   ○ Product Group
● Initial Centroid Set Types:
   ○ All centroids of each set belong to different clusters (Alldiff); average of 20 sets
   ○ All centroids of each set belong to the same cluster (Allsame); average of 20 sets
● Distances:
   ○ All
Impact of Initial Centroids

Setup:
● Algorithm:
   ○ Baseline
   ○ k-Means
● Attributes:
   ○ Product Cleaning Group
   ○ Product Group
● Initial Centroid Set Types:
   ○ All centroids of each set belong to different clusters (Alldiff); average of 100 sets
   ○ All centroids of each set belong to the same cluster (Allsame); average of 100 sets
   ○ All centroids of each set belong to a random cluster (Allrandom); average of 100 sets
● Distances (Average of):
   ○ Euclidean Total
   ○ Euclidean Row
   ○ Euclidean Column
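These experiments require pinning k-Means to a chosen set of initial centroids; in scikit-learn that can be sketched by passing an explicit init array (the data and seed rows are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((40, 45 * 45))      # stand-in flattened transition matrices

# Hand-picked rows of X serve as the initial centroid set; in the Alldiff
# setting each seed would come from a different true cluster.
init_centroids = X[[0, 15, 30]]
km = KMeans(n_clusters=3, init=init_centroids, n_init=1).fit(X)
print(km.labels_[:10])
```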
Determining k in k-Nearest Neighbors [1/2]

Accuracy: Average 0.727, Deviation <1%
Kappa: Average 0.531, Deviation ~2%

Determining k in k-Nearest Neighbors [2/2]

Accuracy: Average 0.560, Deviation <1%
Kappa: Average 0.391, Deviation ~1%
Distance Comparison | Classification [2/2]
Distance Comparison | Clustering [2/2]
Impact of Initial Centroids [2/2]
Challenges in ML

Labelling
1. Data cleansing
2. Labelling of batches

Object Creation
3. Time-series splitting to batches
4. Representation of each batch (chunk)
5. Feature selection
6. Unequal-length time-series comparison

Distance Calculation
7. Distance calculation method
Impact of Initial Centroids