Process Monitoring with Supervised Learning and Artificial Contrasts
Wookyeon Hwang (Univ. of South Carolina), George Runger (Arizona State University), Eugene Tuv (Intel)
QPRC, June 2009


Page 1:

Wookyeon Hwang, Univ. of South Carolina

George Runger, Industrial Engineering; Industrial, Systems, and Operations Engineering; School of Computing, Informatics, and Decision Systems Engineering; Arizona State University

Eugene Tuv, Intel

Process Monitoring with Supervised Learning and Artificial Contrasts

Page 2:

Statistical Process Control/Anomaly Detection

• Objective is to detect change in a system
  – Transportation, environmental, security, health, processes, etc.
• In the modern approach, leverage massive data
  – Continuous, categorical, missing values, outliers, nonlinear relationships
• Goal is a widely applicable, flexible method
  – Normal conditions and fault type unknown
• Capture relationships between multiple variables
  – Learn patterns, exploit patterns
  – Traditional Hotelling's T2 captures structure, provides a control region (boundary), and quantifies false alarms

Page 3:

Traditional Monitoring
• Traditional approach is Hotelling's (1948) T-squared chart
• Numerical measurements, based on multivariate normality
• Simple elliptical pattern (Mahalanobis distance)
• Time-weighted extensions, the exponentially weighted moving average (EWMA) and cumulative sum (CUSUM)
  – More efficient, but the same elliptical pattern
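A minimal sketch of the traditional T-squared chart described above, for reference (plain NumPy/SciPy; the function and variable names are illustrative, not from the talk). The statistic is the Mahalanobis distance of each observation from the in-control mean, compared against a chi-square limit.

    import numpy as np
    from scipy.stats import chi2

    def hotelling_t2(reference, new_obs, alpha=0.005):
        """T-squared monitoring: Mahalanobis distance of each new observation
        from the in-control mean, compared to a chi-square control limit."""
        mu = reference.mean(axis=0)
        S_inv = np.linalg.inv(np.cov(reference, rowvar=False))
        centered = new_obs - mu
        t2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)
        ucl = chi2.ppf(1 - alpha, df=reference.shape[1])  # elliptical control boundary
        return t2, t2 > ucl

    # Example: 10-dimensional in-control data, then a 0.5-sigma mean shift
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(1000, 10))
    shifted = rng.normal(loc=0.5, size=(200, 10))
    t2, alarms = hotelling_t2(reference, shifted)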

Page 4:

Transform to Supervised Learning
• Process monitoring can be transformed to a supervised learning problem
  – One approach: supplement with artificial, contrasting data
  – Any one of multiple learners can be used, without pre-specified faults
  – Results can generalize monitoring in several directions, such as arbitrary (nonlinear) in-control conditions, fault knowledge, and categorical variables
  – High-dimensional problems can be handled with an appropriate learner

Page 5:

Learn Process Patterns
• Learn the pattern compared to a "structureless" alternative
• Generate noise: artificial data without structure to differentiate (see the sketch after this slide)
  – For example, f(x) = f1(x1) f2(x2) ... fp(xp), the joint distribution as a product of marginals (enforce independence)
  – Or f(x) = a product of uniforms
• Define and assign y = +/-1 to the "actual" and "artificial" data: the artificial contrast
• Use a supervised (classification) learner to distinguish the data sets
  – Only simple examples used here
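A minimal sketch of this contrast-generation step, assuming NumPy (the helper name artificial_contrast and its options are illustrative): resample each column independently for the product-of-marginals noise, or sample uniforms over each variable's observed range, and label actual data +1 and artificial data -1.

    import numpy as np

    def artificial_contrast(actual, mode="marginals", n_artificial=None, rng=None):
        """Two-class training set: actual data labeled +1, structureless
        artificial data labeled -1."""
        rng = np.random.default_rng() if rng is None else rng
        n, p = actual.shape
        m = n if n_artificial is None else n_artificial
        if mode == "marginals":
            # product of marginals: resample each column independently
            art = np.column_stack([rng.choice(actual[:, j], size=m) for j in range(p)])
        else:
            # product of uniforms over each variable's observed range
            art = rng.uniform(actual.min(axis=0), actual.max(axis=0), size=(m, p))
        X = np.vstack([actual, art])
        y = np.concatenate([np.ones(n), -np.ones(m)])
        return X, y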

Page 6:

Learn Pattern from Artificial Contrast

Page 7:

Regularized Least Squares (Kernel Ridge) Classifier with Radial Basis Functions

• Model is a linear combination of basis functions
• Smoothness penalty controls complexity
  – Tightly related to Support Vector Machines (SVM)
  – Regularized least squares allows a closed-form solution but trades away sparsity; may not want to trade!
• Previous example: a challenge for a generalized learner, multivariate normal data!

[Figure: fitted decision surface f(x) over x1 and x2]

Page 8:

RLS Classifier

Minimize a regularized loss over functions f in the RKHS H induced by the kernel K:

  \min_{f \in H} \sum_{i=1}^{n} L[y_i, f(x_i)] + \lambda \|f\|_K^2

The solution is a kernel expansion over the training points,

  f(x) = \sum_{i=1}^{n} c_i K(x_i, x), \qquad K(x, x') = \exp\!\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right),

with parameters \lambda and \sigma. For squared-error loss the coefficients solve

  \min_{c} \; (y - Kc)'(y - Kc) + \lambda\, c' K c

with closed-form solution

  (\lambda I + K)\, c = y
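A compact sketch of this solution in plain NumPy (the class name and parameter defaults are illustrative): build the Gaussian kernel matrix, solve (K + λI)c = y in closed form, and classify new points by the sign of f(x), with +1 marking the learned normal-operating region under the artificial-contrast labels.

    import numpy as np

    def rbf_kernel(A, B, sigma2=5.0):
        """Gaussian kernel matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 * sigma2))."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma2))

    class RLSC:
        """Regularized least-squares classifier: solve (K + lam * I) c = y."""
        def __init__(self, lam=1e-3, sigma2=5.0):
            self.lam, self.sigma2 = lam, sigma2

        def fit(self, X, y):
            self.X = X
            K = rbf_kernel(X, X, self.sigma2)
            self.c = np.linalg.solve(K + self.lam * np.eye(len(X)), y)
            return self

        def decision(self, Xnew):
            return rbf_kernel(Xnew, self.X, self.sigma2) @ self.c

        def predict(self, Xnew):
            return np.sign(self.decision(Xnew))  # +1 actual, -1 artificial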

Page 9:

Patterns Learned from Artificial Contrast RLSC

• True Hotelling’s 95% probability bound

• Red: learned contour function to assign +/-1

• Actual: n = 1000; Artificial: n = 2000

• Complexity: 4/3000

• Sigma^2 = 5

Page 10:

More Challenging Example with Hotelling's Contour

Page 11:

Patterns Learned from Artificial Contrast RLSC

• Actual: n = 1000; Artificial: n = 2000

• Complexity: 4/3000

• Sigma^2 = 5

Page 12:

Patterns Learned from Artificial Contrast RLSC

[Figure: scatter of the given (actual) data and the random (artificial) data with the learned decision boundary; both axes span -20 to 20]

Actual: n = 1000; Artificial: n = 1000

Complexity: 4/2000

Sigma^2 = 5

Page 13:

RLSC for p = 10 dimensions

Shift = 1

             Training error     Testing error      Chi-squared (99.5%)
             (Type II error)    (Type II error)    (Type II error)
    Mean     0.00666            0.980              0.982
    StDev    0.00057            0.00305

Shift = 3

    Mean     0.005              0.487              0.489
    StDev    0.00264            0.0483

Page 14:

Tree-Based Ensembles, p = 10
• Alternative learner
  – works with mixed data
  – elegantly handles missing data
  – scale invariant
  – outlier resistant
  – insensitive to extraneous predictors
• Provides an implicit ability to select key variables (a minimal sketch follows the table below)

Shift = 1

             Training error    OOB for          Testing error      OOB for      Chi-squared (99.5%)
             (Type I error)    training data    (Type II error)    test data    (Type II error)
    Mean     0                 0.00233          0.989              0.0026       0.982
    StDev    0                 0.00152          0.0075             0.0011

Shift = 3

    Mean     0                 0.00266          0.532              0.0033       0.489
    StDev    0                 0.00115          0.2270             0.0023
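A minimal sketch of the tree-based ensemble alternative, using scikit-learn's RandomForestClassifier (the specific learner and settings are illustrative, not necessarily those behind the table): fit actual-vs-artificial data, read the out-of-bag error, and use impurity importances for implicit variable selection. It reuses the artificial_contrast helper sketched earlier.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    actual = rng.normal(size=(1000, 10))                  # in-control training data
    X, y = artificial_contrast(actual, mode="marginals", rng=rng)

    forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
    forest.fit(X, y)

    oob_error = 1 - forest.oob_score_                     # out-of-bag error estimate
    importances = forest.feature_importances_             # implicit variable selection

    # Monitor a new observation: estimated probability of the "actual" (+1) class
    new_obs = rng.normal(loc=1.0, size=(1, 10))           # all 10 means shifted by 1 sigma
    p_in = forest.predict_proba(new_obs)[0, list(forest.classes_).index(1)]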

Page 15:

Nonlinear Patterns
• Hotelling's boundary is not a good solution when patterns are not linear
• Control boundaries from supervised learning capture the normal operating condition

Page 16:

[Figure: boundaries by RLSC vs. Hotelling's 95% boundary, showing in-control data and reference data on axes x1 and x2]

Tuned Control
• Extend to incorporate specific process knowledge of faults
• Artificial contrasts generated from the specified fault distribution
  – or from a mixture of samples from different fault distributions
• Numerical optimization to design a control statistic that maximizes the likelihood function under a specified fault (alternative) can be very complicated

Page 17:

Tuned Control

• Fault: means of both variables x1 and x2 are known to increase

• Artificial data (black) are sampled from 12 independent normal distributions
  – Mean vectors are selected from a grid over the area [0, 3] x [0, 3]
• Learned control region is shown in the right panel; it approximately matches the theoretical result in Testik et al., 2004
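A sketch of this fault-tuned contrast generation (the grid layout and sample sizes are assumptions for illustration; the slide uses 12 mean vectors on a grid over [0, 3] x [0, 3], whose exact placement is not specified here).

    import numpy as np

    def fault_tuned_contrast(n_per_mean=100, sigma=1.0, rng=None):
        """Artificial data from a mixture of normals whose means lie on a grid
        over [0, 3] x [0, 3], reflecting the known upward-shift fault."""
        rng = np.random.default_rng() if rng is None else rng
        grid = np.linspace(0.0, 3.0, 4)        # illustrative 4 x 4 grid of means
        means = np.array([(a, b) for a in grid for b in grid])
        samples = [rng.normal(loc=m, scale=sigma, size=(n_per_mean, 2)) for m in means]
        return np.vstack(samples)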

Page 18:

Incorporate Time-Weighted Rules

• What form of statistic should be filtered and monitored?
  – Log likelihood ratio
• Some learners provide class probability estimates
• Bayes' theorem (for equal sample sizes) gives

  \frac{f_{out}(x)}{f_{in}(x)} = \frac{p_{out}(x)}{p_{in}(x)}

• Log likelihood ratio for an observation x_t estimated as

  l_t = \ln p_{out}(x_t) - \ln p_{in}(x_t)

• Apply EWMA (or CUSUM, etc.) to l_t

  Z_t = \lambda\, l_t + (1 - \lambda)\, Z_{t-1}
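A minimal sketch of this monitoring statistic (the lambda value and function name are illustrative): convert a classifier's class-probability estimates into the log likelihood ratio l_t and smooth it with an EWMA.

    import numpy as np

    def ewma_llr(p_out, lam=0.1, z0=0.0, eps=1e-6):
        """EWMA of l_t = ln p_out - ln p_in, where p_out is the classifier's
        estimated probability that x_t is out of control and p_in = 1 - p_out."""
        p_out = np.clip(np.asarray(p_out, dtype=float), eps, 1 - eps)  # guard log(0)
        l = np.log(p_out) - np.log(1 - p_out)
        z = np.empty_like(l)
        prev = z0
        for t, lt in enumerate(l):
            prev = lam * lt + (1 - lam) * prev
            z[t] = prev
        return z

    # e.g. p_out = forest.predict_proba(stream)[:, list(forest.classes_).index(-1)]
    # chart = ewma_llr(p_out); signal when the chart crosses its control limit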

Page 19:

Time-Weighted ARLs
• ARLs for selected schemes applied to the l_t statistic
  – 10-dimensional, independent normal

                      No shift    5 vars. shift 1 sigma    10 vars. shift 1 sigma
    EWMA   Average    202.8       10.1                     4.68
           Stdev      3.65        1.21                     0.27
    Ind    Average    200.5       39.4                     11.8
           Stdev      5.79        18.68                    5.62

Page 20:

Example: 50 Dimensions


Page 21:

Example: 50 Dimensions
• Hotelling's: left
• Artificial contrast: right

Page 22:

Example: Credit Data (UCI)
• 20 attributes: 7 numerical and 13 categorical
• Associated class label of "good" or "bad" credit risk
• Artificial data generated from continuous and discrete uniform distributions, respectively, independently for each attribute (a minimal sketch follows this slide)
• Ordered as 300 "good" instances followed by 300 "bad"
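A sketch of this mixed-attribute contrast generation, assuming pandas for the categorical columns (the helper name is illustrative): continuous uniforms over each numeric column's observed range, discrete uniforms over each categorical column's observed levels.

    import numpy as np
    import pandas as pd

    def artificial_contrast_mixed(df, n_artificial=None, rng=None):
        """Actual rows labeled +1; artificial rows, drawn independently per
        attribute from uniform distributions, labeled -1."""
        rng = np.random.default_rng() if rng is None else rng
        m = len(df) if n_artificial is None else n_artificial
        art = {}
        for col in df.columns:
            if pd.api.types.is_numeric_dtype(df[col]):
                art[col] = rng.uniform(df[col].min(), df[col].max(), size=m)
            else:
                art[col] = rng.choice(df[col].unique(), size=m)
        X = pd.concat([df, pd.DataFrame(art)], ignore_index=True)
        y = np.concatenate([np.ones(len(df)), -np.ones(m)])
        return X, y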

Page 23:

Artificial Contrasts for Credit Data

• Plot of l_t over time

Page 24:

Diagnostics: Contribution Plots

• 50 dimensions: 2 contributors, 48 noise variables (scatter plot projections to contributor variables)

Page 25:

Contributor Plots from PCA T2

Page 26:

Contributor Plots from PCA SPE

Page 27:

Contributor Plots from Artificial Contrast Ensemble (ACE)

• Impurity importance weighted by means of split variable
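A small illustration of turning the ensemble's impurity importances into a contribution plot (a simplified stand-in: the slide's ACE plot additionally weights the importance by the means of the split variable, which is not reproduced here).

    import numpy as np
    import matplotlib.pyplot as plt

    # forest as fit in the earlier tree-ensemble sketch (actual vs. artificial data)
    importances = forest.feature_importances_      # impurity importance per variable
    order = np.argsort(importances)[::-1]

    plt.bar(range(len(importances)), importances[order])
    plt.xticks(range(len(importances)), order)
    plt.xlabel("variable index")
    plt.ylabel("impurity importance")
    plt.title("Contribution plot from the artificial-contrast ensemble (sketch)")
    plt.show()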

Page 28:

Contributor Plots for Nonlinear System

• Contributor plots from SPE, T2, and ACE in the left, center, and right panels, respectively

Page 29:

Conclusions
• Can/must leverage the automated, ubiquitous data and computational environment
  – Professional obsolescence
• Employ a flexible, powerful control solution for broad applications: environment, health, security, etc., as well as manufacturing
  – "Normal" sensors not obvious, patterns not known
• Include automated diagnosis
  – Tools to filter and identify contributors
• Computational feasibility in embedded software

This material is based upon work supported by the National Science Foundation under Grant No. 0355575.