Some Analytical Chemistry of Potato Chips

Lessons on Sampling and ANOVA in SAS and JMP

Eric Cai


Objectives

• Estimate the weight percentage of sodium in a bag of potato chips

• Obtain a confidence interval for the estimated weight percentage

• Minimize the cumulative uncertainty in the final result

  – Minimize the width of the confidence interval


Bag of Potato Chips

[Figure: four chips from the bag, labelled 1 to 4]


How to minimize uncertainty?

• Use precise instruments

• Measure many aliquots

• Minimize the variation between the samples
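
The slides do not say which interval is used, so the following is only a sketch, assuming the usual t-based confidence interval for the mean of n independent measurements with sample standard deviation s. The symbols are illustrative and do not come from the slides.

```latex
\bar{x} \;\pm\; t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}},
\qquad
\text{width} \;=\; 2\,t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}
```

Under that assumption, each bullet above acts on one term: precise instruments and less variation between samples reduce s, while measuring more aliquots increases n (and also lowers the t quantile).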


Bag of Potato Chips

[Figure: four chips from the bag, labelled 1 to 4. Three labels read "Variation in Weight Percentage Between Chips"; two labels read "Variation in Weight Percentage Within a Chip".]


Raw Data – Wide Format

            Chip 1    Chip 2    Chip 3    Chip 4
Aliquot 1   0.324%    0.455%    0.420%    0.447%
Aliquot 2   0.311%    0.467%    0.463%    0.377%
Aliquot 3   0.352%    0.448%    0.424%    0.398%


Desired Data – Long Format

Needed for analysis in both SAS and JMP

Two columns, Chip and Weight Percentage, with one row per measurement: three rows for each of Chip 1, Chip 2, Chip 3 and Chip 4.


* enter the raw data;
data sodium1;
    input chip1 chip2 chip3 chip4;
    datalines;
0.324 0.455 0.420 0.447
0.311 0.467 0.463 0.377
0.352 0.448 0.424 0.398
;
run;

* transpose the data;
* convert the weight percentages from a vertical display to a horizontal display;
proc transpose data = sodium1
               out = sodium2
               name = sample
               prefix = aliquot;
    var chip:;
run;

* show the transposed data;
proc print data = sodium2;
run;


Long, but still wide

sample   aliquot1   aliquot2   aliquot3
chip1    0.324      0.311      0.352
chip2    0.455      0.467      0.448
chip3    0.420      0.463      0.424
chip4    0.447      0.377      0.398


* sodium2 needs to be transposed once more for all weight percentages to be in one column;
proc transpose data = sodium2
               out = sodium3 ( rename = ( col1 = weight_percentage ) )
               name = subsample;
    var aliquot:;
    by sample;
run;

* show sodium3 - it is now ready for analysis;
proc print data = sodium3;
run;


Transformed Data – Long Format

sample   subsample   weight_percentage
chip1    aliquot1    0.324
chip1    aliquot2    0.311
chip1    aliquot3    0.352
chip2    aliquot1    0.455
chip2    aliquot2    0.467
chip2    aliquot3    0.448
chip3    aliquot1    0.420
chip3    aliquot2    0.463
chip3    aliquot3    0.424
chip4    aliquot1    0.447
chip4    aliquot2    0.377
chip4    aliquot3    0.398
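
For readers who want to cross-check the reshaping outside SAS, here is a minimal Python sketch of the same wide-to-long step. It is not part of the original talk; the function name `wide_to_long` and the dictionary layout are assumptions made for illustration, and only the data values and column names come from the slides.

```python
# Wide-format data from the "Raw Data" slide: one list of aliquots per chip.
wide = {
    "chip1": [0.324, 0.311, 0.352],
    "chip2": [0.455, 0.467, 0.448],
    "chip3": [0.420, 0.463, 0.424],
    "chip4": [0.447, 0.377, 0.398],
}


def wide_to_long(table):
    """Return one (sample, subsample, weight_percentage) row per measurement.

    Hypothetical helper: it mirrors the layout of the long-format table,
    not the internals of PROC TRANSPOSE.
    """
    rows = []
    for sample, values in table.items():
        for i, value in enumerate(values, start=1):
            rows.append((sample, f"aliquot{i}", value))
    return rows


long_rows = wide_to_long(wide)

# Print the long table: 12 rows, one per measurement.
print("sample", "subsample", "weight_percentage")
for row in long_rows:
    print(*row)
```

The result has 12 rows in the same order as the long-format table, which makes it easy to compare against the PROC PRINT output.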


PROC TRANSPOSE × 2: Wide to Long

Wide format (one row per aliquot, one column per chip) → first PROC TRANSPOSE → one row per chip, one column per aliquot → second PROC TRANSPOSE → long format (one row per measurement)

See the November, 2015, issue of the VanSUG newsletter about PROC TRANSPOSE by Dilinuer Kuerban


Visualize the Data

[Figure: plot of the data, annotated with the grand mean and the group-specific means]

• Grand mean: sample mean of all data

• Group-specific means: sample means within each group (chip)


Visualize the Data

Between-group variation

Within-group variation


Compare the 2 sources of variation

• Analysis of Variance (ANOVA)

  – Linear regression with categorical predictors

  – Partition the variation in a continuous variable by a categorical factor

  – Use sums of squares to quantify the variation

  – Each is a sum of squared deviations of the data from an average

• Scale (divide) each sum by its number of degrees of freedom
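
In symbols, the partition can be sketched as follows for k groups with n_i observations each and N observations in total. The notation is an assumption introduced here for illustration; the slides do not define symbols.

```latex
\begin{aligned}
SS_{\text{between}} &= \sum_{i=1}^{k} n_i\,\bigl(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\bigr)^2,
  &\qquad df_{\text{between}} &= k - 1,\\
SS_{\text{within}} &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \bigl(y_{ij} - \bar{y}_{i\cdot}\bigr)^2,
  &\qquad df_{\text{within}} &= N - k,\\
SS_{\text{total}} &= SS_{\text{between}} + SS_{\text{within}},\\
MS_{\text{between}} &= \frac{SS_{\text{between}}}{df_{\text{between}}},
  &\qquad MS_{\text{within}} &= \frac{SS_{\text{within}}}{df_{\text{within}}},\\
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}}.
\end{aligned}
```

Here y_ij is the j-th measurement in group i, ȳ_i· is the mean of group i, and ȳ_·· is the grand mean. For the potato chip data, k = 4 chips, n_i = 3 aliquots and N = 12, giving 3 and 8 degrees of freedom.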


Analysis of Variance (ANOVA)

• Use sums of squares to quantify the variations

• Each is a sum of squared deviations of the data from an average

Between-group variation

vs.

Within-group variation


* use ANOVA to partition and compare the 2 sources of variation;
* sodium3 is the long-format data set created by the second PROC TRANSPOSE;
proc anova data = sodium3;
    class sample;
    model weight_percentage = sample;
run;

You can also use PROC GLM to implement ANOVA. ANOVA is one special case of general linear models.

PROC ANOVA should only be used when there are equal numbers of observations for every combination of the classification factors.

• There are many exceptions to this!


Image courtesy of Cdang via Wikimedia


There is much more variation in the weight percentage of sodium between the chips than within the chips!
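
The output slides that supported this conclusion did not survive extraction, so the following Python sketch recomputes a one-way ANOVA table directly from the 12 measurements. It is an independent cross-check, not the author's SAS or JMP output; the function name `one_way_anova` and the dictionary keys are assumptions made for illustration.

```python
from statistics import mean

# Weight percentages of sodium (in %), three aliquots per chip, from the slides.
chips = {
    "chip1": [0.324, 0.311, 0.352],
    "chip2": [0.455, 0.467, 0.448],
    "chip3": [0.420, 0.463, 0.424],
    "chip4": [0.447, 0.377, 0.398],
}


def one_way_anova(groups):
    """Partition the total variation into between-group and within-group parts."""
    values = [x for g in groups.values() for x in g]
    grand_mean = mean(values)

    # Between-group: squared deviations of group means from the grand mean,
    # weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group: squared deviations of each value from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


table = one_way_anova(chips)
for name, value in table.items():
    print(f"{name:>10}: {value:.6g}")
```

With these data the between-chip mean square comes out roughly 15.7 times the within-chip mean square (3 and 8 degrees of freedom), which is consistent with the statement above.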


JMP

• Software from SAS Institute

• Point-and-click

• Has underlying scripting language

• Statistics

• Machine learning

• Industrial statistics

• Go to JMP demonstration!


Bag of Potato Chips

[Figure: four chips from the bag, labelled 1 to 4]

More measurements are needed!

There is a trade-off!
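
The slides that followed are not preserved, so the exact trade-off the talk had in mind is not known. One common way to reason about it, sketched here in Python purely as an assumption, is to treat the chips as a random sample from the bag and estimate variance components from a balanced one-way layout. The function names and the candidate sampling plans below are hypothetical.

```python
from statistics import mean

# Weight percentages of sodium (in %), three aliquots per chip, from the slides.
chips = {
    "chip1": [0.324, 0.311, 0.352],
    "chip2": [0.455, 0.467, 0.448],
    "chip3": [0.420, 0.463, 0.424],
    "chip4": [0.447, 0.377, 0.398],
}


def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects model.

    Assumes every group has the same number of observations.
    """
    k = len(groups)
    n = len(next(iter(groups.values())))  # aliquots per chip
    grand = mean(x for g in groups.values() for x in g)
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    ms_within = sum(
        (x - mean(g)) ** 2 for g in groups.values() for x in g
    ) / (k * (n - 1))
    var_within = ms_within
    # Clamp at zero, since the moment estimate can be negative by chance.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, var_within


def variance_of_mean(var_between, var_within, n_chips, n_aliquots):
    """Variance of the overall mean for n_chips chips with n_aliquots each."""
    return var_between / n_chips + var_within / (n_chips * n_aliquots)


var_b, var_w = variance_components(chips)

# Compare the current plan with two hypothetical ways to double the work.
for n_chips, n_aliquots in [(4, 3), (4, 6), (8, 3)]:
    v = variance_of_mean(var_b, var_w, n_chips, n_aliquots)
    print(f"{n_chips} chips x {n_aliquots} aliquots: variance of mean = {v:.6f}")
```

Under these assumptions, doubling the number of chips lowers the variance of the estimated mean far more than doubling the aliquots per chip, because most of the variation sits between chips. The cost of each option is not addressed by this sketch.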


Louis Valente, Manager of Global Field Enablement for JMP

Mark Bailey, Principal Analytical Training Consultant for JMP

Arati Mejdal, Global Social Media Manager for JMP Software

Thank you, JMP staff!