STEM data collection & analysis using statistics

When to collect data
1) before building: customer survey to help design the prototype and determine specifications
2) during building: test individual components and choose the best materials and configurations
3) after building: verify that the product works as designed (reliability & validity), or better than other products
4) before redesign: collect customer feedback on the prototype and redesign as needed; could test in a "real world" setting

Common data sets
Distance sensor data sheet htm

Statistics
Statistics is the use of mathematical formulas to determine whether the data collected in an experiment show significant results

What is a significant result?
It means the results shown by the data are unlikely to be due to chance alone
When results are NOT SIGNIFICANT, you may have gotten them by accident (often because of small sample size or collection error)

Significance
A result is conventionally called significant when the P value is below 0.05: if there were really no effect, results at least this extreme would turn up by chance less than 5% of the time
This is not the same as a 95% chance that future experiments will produce the same results

Statistical Tests
Here are some common statistical tests:
Confidence interval (standard deviation)
t-test (2 means)
Correlation (positive/negative relationship)
ANOVA (3 or more means)

When to collect data
1) before building: customer survey to help design the prototype and determine specifications

Design process from MIT
http://ocw.mit.edu/courses/engineering-systems-division/esd-051j-engineering-innovation-and-design-fall-2012/

Design thinking to fight childhood obesity
https://www.youtube.com/watch?v=Ee4CKIPkIik

Surveys
Create a variety of questions:
Personal info (to show that the person fits the criteria for a potential customer)
Open ended
Multiple choice
Ranking

Survey: personal info
Surveys should be anonymous, but should confirm that the respondent fits the criteria of a customer
Ex. What is your age? (maybe the product is for children, or the elderly)
Ex. Are you male or female?
(maybe the product is specifically for women)
Ex. What kind of color blindness do you have? (maybe the product targets someone with a specific condition)

Survey: open ended
Open-ended questions can provide new ideas to the design team
Ex. What color would you prefer for the concussion helmet?
Possible responses: red, blue, yellow, multicolored, glow-in-the-dark, flashing lights

Survey: multiple choice
Multiple choice does not provide new ideas, but can help identify patterns
Ex. What color would you prefer for the concussion helmet?
Red 10%
Yellow 30%
Multicolored 60%

Survey: multiple choice
This is useful if you know that your design has limited options
Ex. For a product assisting the blind, which location would be most convenient for the ultrasonic sensor?
Hand 20%
Head 10%
Foot 0%
Attached to regular White Cane 70%

Survey: ranking
Can help you determine which criteria are most important to the customer
Ex. What is your preferred color? How much would color factor into your decision to buy and use this product? (1 = not much, 10 = very much)
Ex. What is your preferred size (small, medium, large)? How much would size factor into your decision to buy and use this product?
(1 = not much, 10 = very much)

Confidence interval
Used to test one data set
Data will often follow a normal distribution (bell curve, or Gaussian distribution)
68% of values are within one standard deviation of the mean
95% of values lie within two standard deviations
99.7% are within three standard deviations
Ex. You can be 95% confident that when the distance sensor is 12 inches away from the object, it will display a value within two standard deviations of its mean reading

Gaussian distribution
Variance in the data
Less variance means more reliable

Reliability vs Validity
Goal: alarm that goes off at 8:00 am
Reliable/Not valid = goes off at 8:15 am ± 1 min
Valid/Not reliable = goes off at 8:00 am ± 15 min
Reliable & valid = goes off at 8:00 am ± 1 min

T-test
Used to compare the means (averages) of two populations
Ex. there is no statistically significant difference between the mean sensor values of two different sensors
Ex.
there is a significant difference between the number of collisions you make on an obstacle course using a regular White Cane vs an ultrasonic cane
Compare data with expected values
Standard deviation & t-test

Correlation
Used to see if there is a relationship between two groups of data
Does not establish causation, just that the two are related
Graph: scatter plot
The best-fit line on the graph will have an R value (the correlation coefficient, which measures how closely the points follow the line; it is not the slope)
R = 0, no linear relationship
R = 1, perfect positive relationship
R = -1, perfect negative relationship
Ex. calibration test: an R = 0.98 shows that the sensor reading closely tracks the temperature in the room
R = 0.937
Correlation: compare actual value to sensor reading (calibration test)
8220/14/11/22065/htm

ANOVA
Use instead of a t-test when comparing more than two means
Can be more challenging to determine exactly where the significant difference lies
Ex. there is a significant difference between the deceleration of the braking system at different voltages
ANOVA & t-test: Electromagnetic Braking System
and-Testing-an-Electromagnetic-Braking- S/?ALLSTEPS

Other
Confidence interval for proportions
Ex. there is a significant difference between the % of times each sensor was triggered (e.g. 2 different locations in baby car seat)
Chi-square compares counts
Test for curved data (power law sets)
http://www.graphpad.com/guides/prism/6/curve-fitting/index.htm?reg_doseshift.htm

GP2Y0A60SZLF Analog Distance Sensor
Run standard deviation at each point and see if the data overlap
Compare different sensors

Chi-square
Used to compare quantities in different categories
IV (independent variable) must be categories
DV (dependent variable) must be counts (# of individuals in each category, can be converted to %)
Graph: bar graph for #s, pie chart for %
Ex. number of bacteria colonies in rebreather with or without the germicidal rod
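
Worked example (illustrative sketch)
The standard deviation, correlation and chi-square calculations described above can be tried out in a few lines of Python using only the standard library. This is a sketch under stated assumptions, not the method used in the slides: every number below is made-up example data, and the function names (two_sd_range, pearson_r, best_fit_slope, chi_square_statistic) are hypothetical. It computes a mean ± 2 standard deviation range, shows that R and the slope of the best-fit line are different quantities, and compares a chi-square statistic with 3.841, the standard critical value for P = 0.05 with one degree of freedom.

```python
import math
import statistics


def two_sd_range(readings):
    """Return (low, high) = mean minus/plus two sample standard deviations."""
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)
    return mean - 2 * sd, mean + 2 * sd


def _sums(xs, ys):
    """Sums of squares and cross-products shared by R and the slope."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Correlation coefficient R: how closely the points follow a line."""
    sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def best_fit_slope(xs, ys):
    """Slope of the least-squares best-fit line (not the same as R)."""
    sxy, sxx, _ = _sums(xs, ys)
    return sxy / sxx


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: repeated sensor readings (inches) at a fixed distance.
readings = [11.8, 12.1, 12.0, 11.9, 12.2]
low, high = two_sd_range(readings)
print(f"about 95% of readings expected between {low:.2f} and {high:.2f} in")

# Hypothetical calibration data: actual temperature vs raw sensor output.
actual = [20, 22, 24, 26, 28]
sensor = [41, 44, 49, 52, 55]
print(f"R = {pearson_r(actual, sensor):.3f}, "
      f"slope = {best_fit_slope(actual, sensor):.2f}")

# Hypothetical colony counts: without rod vs with rod.
# Expected counts assume the rod makes no difference (equal split).
observed = [30, 10]
expected = [20, 20]
chi2 = chi_square_statistic(observed, expected)
CRITICAL_DF1_P05 = 3.841  # chi-square critical value, df = 1, P = 0.05
print(f"chi-square = {chi2:.2f}, significant: {chi2 > CRITICAL_DF1_P05}")
```

With real data, replace the lists with your own measurements. The critical value 3.841 only applies when there are two categories; more categories need the value for the matching degrees of freedom.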