Characterization and Clustering Analysis
Michael Boruta
Industrial Solutions Manager
Optical Spectroscopy Product Manager
Outline
• Background
• Algorithm
• Analysis & Review
• Examples
Background
• The most common use of cluster analysis is classification.
• Several assumptions
– No prior judgments used to organize the data (unsupervised clustering)
– Each member belongs to one and only one group
• Several questions
– What will be used to measure similarity?
– How are classes formed and defined?
– What inferences can be drawn regarding their significance?
???
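The two assumptions above (no prior labels, each member in exactly one group) can be illustrated with a generic agglomerative clustering sketch. The slides do not specify the algorithm used, so the single-linkage rule, the distance threshold, and all function names here are assumptions chosen for illustration only.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(cluster_a, cluster_b, data):
    """Smallest member-to-member distance between two clusters (assumed linkage rule)."""
    return min(distance(data[i], data[j]) for i in cluster_a for j in cluster_b)

def agglomerate(data, threshold):
    """Merge the closest pair of clusters until the closest pair is farther than threshold.

    Returns a list of clusters, each a list of indices into data.
    Every index ends up in exactly one cluster.
    """
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = single_linkage(clusters[p], clusters[q], data)
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Hypothetical two-point "spectra": two tight pairs far apart.
example_data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
example_clusters = agglomerate(example_data, threshold=3.0)
```

No labels are supplied; the grouping comes only from the pairwise distances, and the threshold decides how many groups remain.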
Euclidean Distance
Total area = 369
Total area = 927
HQI = 54.4
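The slide reports an HQI for the Euclidean distance comparison but does not give the formula. The Python sketch below assumes unit-length normalization and a hypothetical 0 to 100 scaling; the function names and the scaling are illustrative assumptions, not the calculation behind the values shown.

```python
import math

def unit_normalize(spectrum):
    """Scale a spectrum so its Euclidean length is 1."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in spectrum]

def euclidean_distance(a, b):
    """Euclidean distance between two spectra sampled on the same axis."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of points")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euclidean_hqi(a, b):
    """Hypothetical 0-100 score (assumed scaling).

    For non-negative spectra the normalized distance lies between 0 and
    sqrt(2), so identical shapes score 100 and non-overlapping ones score 0.
    """
    d = euclidean_distance(unit_normalize(a), unit_normalize(b))
    return 100.0 * (1.0 - d / math.sqrt(2.0))

# Made-up intensity values for illustration.
same_shape = euclidean_hqi([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
no_overlap = euclidean_hqi([1.0, 0.0], [0.0, 1.0])
```

Normalization makes the score independent of overall intensity, but a baseline offset still changes the normalized shape and lowers the score.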
1st Derivative Euclidean Distance
Total area = 7.46
Total area = 6.83
HQI = 97.96
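One reason a first-derivative comparison can score much higher is that differentiation removes a constant baseline offset. The Python sketch below shows this with a simple first difference; the differencing scheme, the normalization, and the sample values are assumptions, since the slide does not describe how the derivative is computed.

```python
import math

def first_difference(spectrum):
    """Approximate the first derivative by differences of adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def unit_normalize(spectrum):
    """Scale a vector so its Euclidean length is 1."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [v / norm for v in spectrum]

def normalized_distance(a, b):
    """Euclidean distance between the unit-normalized vectors."""
    ua, ub = unit_normalize(a), unit_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ua, ub)))

# Made-up spectrum and the same spectrum on a raised baseline.
spectrum = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]
offset_spectrum = [v + 3.0 for v in spectrum]

raw_distance = normalized_distance(spectrum, offset_spectrum)
derivative_distance = normalized_distance(
    first_difference(spectrum), first_difference(offset_spectrum)
)
```

Here the raw distance is clearly nonzero, while the derivative distance is zero because the two curves differ only by a constant.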
Gap Analysis
Gap = 4.3; Gap % = 11.5
Gap = 11.8; Gap % = 31.1
Gap = 3.4; Gap % = 9.0
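The slide does not define Gap %. The three pairs above are roughly consistent with each gap divided by a common total near 38 (for example, 11.8 / 0.311 ≈ 37.9), so the Python sketch below assumes Gap % = 100 × gap / largest merge distance. That definition, the function names, and the sample distances are all assumptions for illustration.

```python
def gap_table(merge_distances):
    """List (lower, upper, gap, gap_percent) for successive merge distances.

    gap_percent is the gap relative to the largest merge distance
    (an assumed definition, not taken from the slides).
    """
    heights = sorted(merge_distances)
    if len(heights) < 2:
        raise ValueError("need at least two merge distances")
    total = heights[-1]
    if total <= 0:
        raise ValueError("largest merge distance must be positive")
    rows = []
    for lower, upper in zip(heights, heights[1:]):
        gap = upper - lower
        rows.append((lower, upper, gap, 100.0 * gap / total))
    return rows

def largest_gap(merge_distances):
    """Return the row with the widest gap, a candidate place to cut into clusters."""
    return max(gap_table(merge_distances), key=lambda row: row[2])

# Hypothetical merge distances from a hierarchical clustering run.
example_heights = [2.0, 3.5, 5.0, 9.0, 20.0]
widest = largest_gap(example_heights)
```

A wide gap means the clusters merged below it are well separated from whatever they merge with next, which is one way to judge how significant a grouping is.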
[Figure: structure diagram with five CH3 labels; PVA sample]
Example 1: XRPD Polymorphs
Data Review
Spectral Overlays
Graph
Overlay Legend
Nearest Neighbors Table
DSC curve
TGA curve
Image
Analysis/Review
Example 2: C-13 NMR Polymers
Example 3: IR Polymers
Example 4: IR Oils
All spectra
Groups 1 and 2
Summary
• Clustering is available for several spectroscopy types: IR, Raman, C-13 NMR, H-1 NMR, and XRPD
• Clustering assumes each member can be in only one cluster
• Data analysis/review can merge or split clusters, or move members from one cluster to another
• Clustering can be used for many types of classification problems:
– Comparing competitive products
– Salts and polymorphs
– Classifying polymer types
– Chemical imaging analysis
• Once classifications exist, new samples can be compared to existing clusters
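Comparing a new sample to existing clusters can be sketched as a nearest-centroid lookup. The slides do not say how the comparison is made, so the centroid rule, the function names, and the cluster data below are assumptions for illustration.

```python
import math

def centroid(members):
    """Point-by-point mean of the member vectors in a cluster."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_cluster(sample, clusters):
    """Return (name, distance) of the cluster whose centroid is closest to sample.

    clusters maps a cluster name to a list of member vectors.
    """
    best_name, best_distance = None, math.inf
    for name, members in clusters.items():
        d = distance(sample, centroid(members))
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name, best_distance

# Hypothetical existing clusters and a new sample.
existing = {
    "group 1": [[1.0, 0.0], [0.9, 0.1]],
    "group 2": [[0.0, 1.0], [0.1, 0.9]],
}
assigned_name, assigned_distance = nearest_cluster([0.8, 0.2], existing)
```

In practice a distance cutoff would also be needed so that a sample unlike every existing cluster is flagged as new rather than forced into the nearest group.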