Page 1: Spreadsheet simulation and trial and error methods in statistics Michael Wood University of Portsmouth, UK michael.wood@port.ac.uk

Spreadsheet simulation and trial and error methods in statistics

Michael Wood
University of Portsmouth, UK

michael.wood@port.ac.uk

Computer-intensive, “crunchy” methods

• Crunch out answers without dependence on sophisticated maths

• Rationale transparent

• Sometimes more robust in the sense of no dependence on unrealistic assumptions

• Often more general: can tackle problems with no convenient formula-based method

• Detailed step through for beginners

• Very brief overview of three possibilities

An aside …

• I am fairly critical of many applications of statistics: not the subject of this talk except that transparency makes problems more obvious

Regression and least squares models

• Spreadsheet to calculate MSE (mean square error), then use Solver to find parameters for least squares model

• Identical answers to standard formulae

• Obvious what’s going on and can be modified if required

• Single variable: pred1var.xls

• Multiple regression: predmvar.xls

• Can easily adjust method, e.g. ExerciseCurve.xls
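The same idea can be sketched outside a spreadsheet. The Python below is only an illustration, not the contents of pred1var.xls: the data are made up, and a crude "nudge and keep if better" search stands in for Solver. It computes the MSE for a trial slope and intercept, adjusts them by trial and error, and compares the result with the standard formulae.

```python
def mse(slope, intercept, xs, ys):
    """Mean square error of the line y = intercept + slope * x."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def fit_by_trial_and_error(xs, ys, step=1.0, tolerance=1e-9):
    """Nudge each parameter up and down, keep any nudge that lowers the MSE,
    and halve the step size whenever no nudge helps."""
    slope, intercept = 0.0, 0.0
    best = mse(slope, intercept, xs, ys)
    while step > tolerance:
        improved = False
        for d_slope, d_int in ((step, 0), (-step, 0), (0, step), (0, -step)):
            trial = mse(slope + d_slope, intercept + d_int, xs, ys)
            if trial < best:
                slope, intercept, best = slope + d_slope, intercept + d_int, trial
                improved = True
        if not improved:
            step /= 2
    return slope, intercept, best


def fit_by_formula(xs, ys):
    """Standard least squares formulae for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

search_slope, search_intercept, search_mse = fit_by_trial_and_error(xs, ys)
formula_slope, formula_intercept = fit_by_formula(xs, ys)
print("trial and error:", search_slope, search_intercept, search_mse)
print("formula:        ", formula_slope, formula_intercept)
```

Because the quantity being minimised is written out explicitly, adjusting the method (for example, fitting a curve or minimising mean absolute error instead) only means changing the `mse` function.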

Test of null hypothesis that two variables are unrelated

• Randomization test: Spreadsheet simulates no relationship hypothesis

• Obvious what’s going on with no technical statistical concepts

• Assumptions less restrictive and more obvious than t-test, etc

• Flexible – test difference between two means or two proportions, or correlation, etc

• Difference of two means: diffofmeanstest.xls

• General spreadsheet: resamplenrh.xls
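A minimal Python sketch of a randomization test for the difference of two means follows. It is not a transcription of diffofmeanstest.xls: the data, the function names, and the choice of 2000 shuffles are assumptions made for illustration. The "no relationship" hypothesis is simulated by shuffling the pooled values, so that group labels are assigned at random.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_test(group_a, group_b, n_shuffles=2000, seed=0):
    """Estimate how often random relabelling gives a difference of means
    at least as large (in absolute value) as the one observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_big = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # simulate "group makes no difference"
        simulated = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if simulated >= observed:
            at_least_as_big += 1
    return at_least_as_big / n_shuffles


# Hypothetical scores for two groups, for illustration only.
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [14, 18, 13, 17, 19, 15, 16, 20]

p_value = randomization_test(group_a, group_b)
print("estimated p-value:", p_value)
```

To test a difference of proportions or a correlation instead, only the statistic computed on the observed and shuffled data needs to change.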

Bootstrap confidence intervals

• One method for many different statistics

– Use sample to set up a “guessed” population

– Experiment drawing samples from guessed population to assess sampling error (resampling with replacement)

• Obvious what’s going on and when it’s not sensible!

• Confidence intervals are a subtle concept: simple bootstrapping avoids the mathematical problems but not the conceptual ones

• Many more complex methods - not simple!
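The two steps above can be sketched in Python as a simple percentile bootstrap. This is an illustrative assumption about how the idea might be coded, not the author's spreadsheet: the sample values, the function name `bootstrap_interval`, and the 2000 resamples are all made up for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_interval(sample, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap: treat the sample as the guessed population,
    resample from it with replacement, and read off the middle `level`
    share of the resampled statistics."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    results = sorted(
        statistic(rng.choices(sample, k=n))  # one resample, with replacement
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = results[int(tail * n_resamples)]
    upper = results[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical sample, for illustration only.
sample = [23, 31, 18, 27, 35, 22, 29, 40, 19, 26]

lower, upper = bootstrap_interval(sample, mean)
print("sample mean:", mean(sample))
print("approximate 95% interval:", lower, "to", upper)
```

Passing a different function as `statistic` (a median or a standard deviation, say) reuses the same procedure, which is the sense in which one method serves many statistics. The sketch does nothing to settle how the resulting interval should be interpreted.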

Conclusion

• Active learning in that learners don’t have to take formulae on trust, but can act out methods and see how they work.

• And can often adapt to new problems.

References and website

• All spreadsheet files mentioned are at http://userweb.port.ac.uk/~woodm/nms (these all have a “Read this” sheet with a brief explanation)

• Approach explained in more detail in Wood, M. (2003), Making sense of statistics: a non-mathematical approach, Basingstoke, UK: Palgrave.