www.qualicent.net
Advanced Analytics for Zero Defects
Anil Gandhi / Data Scientist
Joy Gandhi / Quality Consultant
GSA – Quality Working Group Meeting
March 25, 2015
Agenda
• The case for Zero Defects
• Problem Statement
• Solution
• Case studies
• Analytics in the Product Lifecycle
• Summary
• Q & A
Qualicent Introduction
• Services
  – Advanced Analytics
  – Quality Engineering/Management, Quality System Standards
  – Big Data Implementation
• Software
  – ZeroDefectMiner® software for Automotive, Medical Electronics, Aerospace
http://www.pwc.com/gx/en/technology/publications/semiconductor-report-spotlight-on-automotive.jhtml
Semiconductors are a growing share of automobile cost
Semiconductors in Automotive
Source: McKinsey
http://www.mckinsey.com/client_service/semiconductors/latest_thinking
Semiconductors in Automotive…
…are pervasive
Automotive High Cost of Field Failure
Source: CNN
Source: USA Today
Source: The Economist
Automotive Recalls are Expensive
…VERY expensive
Semiconductor makers have to deliver products with zero defects
Reducing Risk
[Figure: RISK and COST across Design, Manufacture and Field Failure. Actions: Contain, Resolve, Prevent. Points of action: @Product and Process Design, @Manufacturing, @Shipped.]
Advanced Analytics Algorithms
• Advanced Anomaly Detection: detect out-of-pattern units / decision support (unsupervised learning)
• Design Rules: operating zones / exclusion zones (IFTTT / ML / supervised learning)
• Root Cause: non-linear explanatory approaches (ML / supervised learning)
Goals: Contain, Resolve, Prevent
Dashboards, Visualizations
Enterprise Class Infrastructure: Hadoop, Big Data, Scalable
Advanced Analytics Algorithms … must accurately predict field failures
The Problem
Predictors for large excursions / large effects are not difficult to source… BUT
× The biggest field failure losses come from marginal effects and/or intermittent deviations over extended periods
× Marginal effects are difficult to detect with standard methods because of high dimensionality, noise, a small number of failures, …
TECHNICAL GOAL: find multivariate marginality
[Figure: 2σ and 6σ limits for individual parameters, with one unit marked OUTLIER]
CHALLENGE: Detect parts that are not similar to the rest of the population
The Problem
• 1000s of components
• 1000s of solder points
• 100,000s of vias
• ~100s of part SKUs
• ~10s of suppliers
o Many available multivariate combinations = many opportunities for marginal units
o Each parameter could be within tolerance, but the combination of parameters may be an outlier
o Standard methods cannot detect multivariate problem process corners
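The point that each parameter can be within tolerance while the combination is an outlier can be illustrated with a small sketch. The slides do not define how Composite Distance is computed, so this example swaps in the Mahalanobis distance as a stand-in; the two parameters, the synthetic population and the two test units are all hypothetical.

```python
import math
import random
import statistics

def mahalanobis_2d(point, xs, ys):
    """Distance of `point` from the (xs, ys) population, accounting for correlation."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = statistics.variance(xs)
    syy = statistics.variance(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for the 2x2 case
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def z_score(value, data):
    """Single-parameter distance from the mean in units of standard deviation."""
    return abs(value - statistics.fmean(data)) / statistics.stdev(data)

# Hypothetical population: parameter y tracks parameter x closely.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [x + random.gauss(0.0, 0.2) for x in xs]

typical = (2.0, 2.0)    # follows the x-y relationship
suspect = (2.0, -2.0)   # same single-parameter magnitudes, breaks the relationship

z_x = z_score(suspect[0], xs)
z_y = z_score(suspect[1], ys)
d_typical = mahalanobis_2d(typical, xs, ys)
d_suspect = mahalanobis_2d(suspect, xs, ys)

print(f"suspect z-scores: x={z_x:.2f}, y={z_y:.2f}")
print(f"multivariate distance: typical={d_typical:.1f}, suspect={d_suspect:.1f}")
```

Both units sit at roughly 2σ on each parameter taken alone, so single-parameter limits treat them identically; only the multivariate distance separates the unit that breaks the correlation.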
The Solution
• Traditional: ANOVA, t-test (screen / coarse reduce)
• Composite distance, cluster analysis
• Visualization / client
• Machine learning model
  1. Operating and exclusion zones for design
  2. Anomaly detection
Pattern Discovery
Deductive Reasoning (Traditional Statistics)
1. Make a hypothesis based on prior knowledge
2. Test the hypothesis
Inductive Reasoning (Machine Learning)
1. Discover patterns, discover hypothesis
2. Check if patterns have material meaning
DISCOVER PATTERNS IMPOSSIBLE TO HYPOTHESIZE
Case Study 1
Who: Large Electronics Manufacturer / Auto
KPI: Field Failure
How: Composite Distance
Result: Detect field failures with high class purity
Anomaly Detection
[Figure: scatter plot over VarX, VarY and VarZ; points marked Outlier: yes / no]
Anomaly Detection
[Figure: Composite Distance plotted against Topmost parameter; limits shown: median + 6*robust σ, and USL]
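A limit of the form median + 6*robust σ can be sketched as follows. The slides do not say how robust σ is estimated; this example assumes the common choice of 1.4826 times the median absolute deviation (MAD), and the distance values are made up for illustration.

```python
import statistics

def robust_sigma(values):
    """Assumed definition: 1.4826 * median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return 1.4826 * mad

def upper_limit(values, x=6.0):
    """Limit = median + x * robust sigma."""
    return statistics.median(values) + x * robust_sigma(values)

# Hypothetical composite distances for ten units; the last one is anomalous.
distances = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.05, 0.95, 9.0]

ucl = upper_limit(distances)
flagged = [d for d in distances if d > ucl]

# For comparison: mean + 6 * standard deviation is inflated by the anomaly itself.
classical_limit = statistics.fmean(distances) + 6.0 * statistics.stdev(distances)

print(f"robust limit = {ucl:.3f}, flagged = {flagged}")
print(f"classical limit = {classical_limit:.3f}")
```

Because the median and MAD are barely moved by the anomalous unit, the robust limit stays close to the bulk of the population, whereas the mean and standard deviation are pulled far enough that the same unit would pass.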
Five out of seven field failures are detected by Composite Distance…at low cost
Anomaly Detection
Composite Distance (rows: actual; columns: predicted)
          pass     fail    total
pass    18,399        5   18,404
fail         2        5        7
total   18,401       10   18,411
Topmost parameter (rows: actual; columns: predicted)
          pass     fail    total
pass    18,288      116   18,404
fail         3        4        7
total   18,291      120   18,411
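The two pass/fail tables on this slide can be reduced to a few numbers. The sketch below reads the first table as the Composite Distance result (it is the one with five of seven failures detected) and the second as the topmost single parameter. It takes "purity" to mean the fraction of predicted fails that are actual fails, which is an assumption, since the slides do not define the term.

```python
def detection_metrics(tn, fp, fn, tp):
    """Summarize a 2x2 table of actual vs. predicted pass/fail counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,       # fraction of units classified correctly
        "purity": tp / (tp + fp),            # assumed: actual fails among predicted fails
        "detection_rate": tp / (tp + fn),    # actual fails that were caught
    }

# Counts taken from the slide (tn = actual pass predicted pass, and so on).
composite = detection_metrics(tn=18399, fp=5, fn=2, tp=5)
top_parameter = detection_metrics(tn=18288, fp=116, fn=3, tp=4)

for name, m in (("composite distance", composite), ("top parameter", top_parameter)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Under these definitions, half of the units flagged by the composite distance are true field failures, against 4 of 120 for the single parameter, while overall accuracy differs by less than one percentage point.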
Composite Distance offers significant improvement over single parameter controls
UCL = Median + x * robust sigma
[Figure: Detection Metrics (Accuracy, Purity) for Composite distance vs. Top Parameter]
Case Study 2
Who: Large Semiconductor Company
KPI: Yield
How: Machine learning algorithms
Result: Revenue increase by > $ MM/quarter
Rule Discovery
Variables M, Q and T individually have no influence on Metric of Interest (MOI)
Data is normalized, scaled and transformed
[Figure: Yield (0 / 1) plotted against Variable M, Variable Q and Variable T]
M < 191 + Q < 812 + T > 10,006
Variables M, Q and T interactively strongly influence the output
Variable M Variable Q Variable T
Rule Discovery / Machine learning
RESULT: EXCLUSION ZONE
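The exclusion zone found here can be written as a simple predicate. This sketch assumes the three conditions combine with a logical AND (the slide joins them with "+") and that the thresholds apply to the normalized, scaled and transformed values mentioned earlier; the example units are hypothetical.

```python
def in_exclusion_zone(m, q, t):
    """True when a unit falls inside the discovered zone (assumed AND of all three)."""
    return m < 191 and q < 812 and t > 10006

# Hypothetical units, in the same transformed units as the thresholds.
units = [
    {"id": "u1", "M": 150, "Q": 780, "T": 10050},  # all three conditions hold
    {"id": "u2", "M": 150, "Q": 780, "T": 9980},   # T condition fails
    {"id": "u3", "M": 250, "Q": 780, "T": 10050},  # M condition fails
]

flagged = [u["id"] for u in units if in_exclusion_zone(u["M"], u["Q"], u["T"])]
print(flagged)
```

Units u2 and u3 each satisfy two of the three conditions and are not flagged, which mirrors the point that no variable matters on its own.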
Discover: Analytical Outputs
Response = 0.89653 - 0.916669 * BF1 - 0.012894523 * BF3 + 7.26853E-0059 * BF4 + 2.847878 * BF6 - 1.023234 * BF7 + 3.0275966 * BF8;
BF1 = max(0, X1 - 82.398);
BF2 = max(0, 82.398 - X1);
BF3 = max(0, X2 - 161.82) * BF2;
BF4 = max(0, 161.82 - X2) * BF2;
BF6 = max(0, 88.92 - TOP_X4);
BF7 = max(0, X5 - 92.692) * BF6;
BF8 = max(0, X6 - 38.109) * BF1;
• Ranked Variables of Importance
• Non linear predictive model
• Graphical Representations
100. Sub thresh leakage
96. Leff
95. CD
75. IDDQ….
73. …
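The Response expression above is built from hinge terms of the form max(0, x - knot), some of them multiplied together. A minimal sketch of evaluating it is shown below. The BF4 coefficient is passed in as a parameter (`c_bf4`, a hypothetical name) because its exponent is not legible in the source; the remaining constants are copied from the slide.

```python
def hinge(x):
    """max(0, x): the building block of every basis function in the model."""
    return max(0.0, x)

def response(x1, x2, top_x4, x5, x6, c_bf4):
    """Evaluate the slide's model; c_bf4 is the BF4 coefficient, supplied by the caller."""
    bf1 = hinge(x1 - 82.398)
    bf2 = hinge(82.398 - x1)
    bf3 = hinge(x2 - 161.82) * bf2      # interaction: X2 above its knot AND X1 below its knot
    bf4 = hinge(161.82 - x2) * bf2
    bf6 = hinge(88.92 - top_x4)
    bf7 = hinge(x5 - 92.692) * bf6
    bf8 = hinge(x6 - 38.109) * bf1
    return (0.89653 - 0.916669 * bf1 - 0.012894523 * bf3 + c_bf4 * bf4
            + 2.847878 * bf6 - 1.023234 * bf7 + 3.0275966 * bf8)

# With every input at or beyond its knot so that all hinges are zero,
# only the intercept remains.
at_knots = response(82.398, 161.82, 88.92, 0.0, 0.0, c_bf4=0.0)

# One unit above the X1 knot activates BF1 only.
one_above = response(83.398, 161.82, 88.92, 0.0, 0.0, c_bf4=0.0)

print(at_knots, one_above)
```

Each product of two hinges is active only in one corner of the parameter space, which is how a model of this form expresses operating and exclusion zones.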
Case Study 3
Who: PV Solar Company
KPI: Cell Efficiency
How: Machine learning algorithms
Result: Prevent cell efficiency loss by 30%
Solar Panel Line Flow
[Figure: line flow with measurement sites A, B, C, D]
Measurements at all four sites pass inspection, but cell efficiency is low
Algorithms discovered that it’s the ratio that matters = PATTERN DISCOVERY
Measures A, B, C, D fully in control and within normal distribution
Case Study 3
[Figure: measures A and C, before Date X and after Date X]
Machine learning algorithms discover ratio of A/C as critical parameter (not predicted by domain experts, but later successfully explained by experts)
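One simple way to surface a ratio such as A/C is to generate all pairwise ratios as candidate features and rank them against the KPI. The slides do not say which algorithm was used, so this sketch substitutes a plain Pearson correlation ranking; the data is synthetic and constructed so that efficiency depends on A/C.

```python
import itertools
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic line data: A and C share a common drift, B and D are independent.
random.seed(1)
n = 500
base = [random.gauss(100.0, 2.0) for _ in range(n)]
sites = {
    "A": [b + random.gauss(0.0, 0.5) for b in base],
    "B": [random.gauss(100.0, 2.0) for _ in range(n)],
    "C": [b + random.gauss(0.0, 0.5) for b in base],
    "D": [random.gauss(100.0, 2.0) for _ in range(n)],
}
# Hypothetical KPI: efficiency driven by the ratio A/C plus measurement noise.
efficiency = [18.0 + 5.0 * (a / c - 1.0) + random.gauss(0.0, 0.005)
              for a, c in zip(sites["A"], sites["C"])]

# Candidate features: the four raw measures plus every pairwise ratio.
candidates = dict(sites)
for p, q in itertools.combinations("ABCD", 2):
    candidates[f"{p}/{q}"] = [x / y for x, y in zip(sites[p], sites[q])]

ranked = sorted(((name, abs(pearson(vals, efficiency)))
                 for name, vals in candidates.items()),
                key=lambda item: item[1], reverse=True)
best_feature = ranked[0][0]
corr_a_alone = abs(pearson(sites["A"], efficiency))

print(ranked[:3])
```

In this synthetic setup A and C individually correlate only weakly with efficiency while their ratio ranks first, which is the kind of pattern a per-measure control chart would not reveal.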
EXCLUSION ZONE: Y - low process metric readings (< 24.5); X - low in-line measure (< 81); Z (date) > something
Case Study 3: Solar
Machine learning model predicts ~31% reduction in EFF in exclusion zone
Zero Defect and the Product Lifecycle
Design (Automotive Sample Phase: Pre-proto-type)
• DPPM Forecast
• Model Historical Data
• Extract operating and exclusion zones
• Improve product and process design
Verification (Automotive Sample Phase: A, B samples)
• Model with A, B data
• Extract operating and exclusion zones
• Calculate DPPM
• Improve product and process design
Validation (Automotive Sample Phase: C, D Samples)
• Model with C, D data
• Extract operating and exclusion zones
• Outlier Detection for Safe Launch
• Calculate DPPM
• Improve process for Safe Launch
HV Production (Automotive Sample Phase: Production)
• Ongoing Outlier Detection
• DPPM Monitoring
• Continuous improvement of Process/product
Goal: Prevent (three phases); Contain and Resolve (two phases)
Advanced Analytics: Predictive Models; Anomaly Detection (Supplier Data); Anomaly Detection; Explanatory Models; Rule Discovery
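The "Calculate DPPM" steps in the lifecycle use defective parts per million, that is, defective units divided by total units, times one million. As a worked example, the sketch below applies that formula to the counts from Case Study 1; treating the two missed failures among 18,401 passed units as the residual escape rate is an illustrative reading, not a figure from the slides.

```python
def dppm(defective_units, total_units):
    """Defective parts per million."""
    return defective_units / total_units * 1_000_000

# Case Study 1: 7 field failures among 18,411 units shipped without screening.
unscreened = dppm(7, 18411)

# After composite-distance screening: 2 failures remain among the 18,401 units passed.
screened = dppm(2, 18401)

print(f"unscreened: {unscreened:.1f} DPPM, screened: {screened:.1f} DPPM")
```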
Summary
• Zero defect can be achieved using Advanced Analytics
  – Anomaly Detection: unsupervised learning
  – Machine Learning: supervised learning
• Contain high probability field failures using composite distance analysis
• Defect reduction and yield improvement can be achieved with predictive models
• Root cause identification with explanatory models
Advanced Analytics can be employed in the entire product life-cycle
THANK YOU!