M140: Sampling, Relationships and Plotting Data · 2020. 11. 23.

Dr Jason Verrall [email protected]

07311 188800

This tutorial will begin at 10am and will last for approximately an hour.

This tutorial will be recorded. Please let me know if you have any questions or concerns about this.

Things you might need for this tutorial:
• M140 Computer Book & Book 2
• Pen, paper & calculator
• Drink of your choice

Don’t forget to set up your audio using the Audio Wizard (in the ‘Meeting Menu’). Some headsets have independent volume controls so you may need to adjust these too.

You will also need to set up your mic if you plan on using it. Clicking the Mic symbol at the top of your Adobe Connect window will toggle it on/off.

Mic symbol states: Not connected · Connected and live · Connected and muted

iCMA41 due on 2 December!


Good morning!

Mics will be muted until towards the end of the tutorial, when I will also stop recording. Do use the Chat Box if you have a question during the tutorial! I will email slides out after the tutorial.

Tutorials are enhanced by your interaction. Please vote in the polls, ask questions and work through the exercises.

Feel free to ask any questions or provide feedback by email afterwards, or use the Private Chat function if you prefer during the tutorial.


Sampling, Relationships and Plotting Data

• Minitab – generating lists of random numbers
  • Uniform, Normal distributions
• Sampling Methods
  • Simple random sampling – Minitab
  • Systematic random sampling
  • Stratified sampling – Minitab
  • Cluster sampling
• Exploring Relationships
  • Visual – scatterplots in Minitab
  • Least Squares Regression in Minitab

Computer Book has detailed instructions!


Generating Random Numbers

True or false? Computers cannot generate true random numbers


Generating Random Numbers

• Computers generate pseudo-random numbers
• Given sufficient time, patterns will emerge and the distribution will differ from that of truly random numbers
• This is bad for strong encryption
• You can force Minitab to generate the same ‘random’ numbers each time by specifying a base or seed value
• This is only useful if you want someone else to get the same random values as you
• True random number sources
  • Random number tables
  • Dice or well-shuffled cards
  • Physical phenomena such as radioactive decay
  • Lava lamps!
• For our purposes and most scientific uses, computer-generated numbers are fine
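The effect of a seed can be sketched outside Minitab. The example below uses Python's standard `random` module as a stand-in, so it illustrates the idea only and says nothing about Minitab's own generator; the function name `seeded_sample` and the seed values are made up for this illustration.

```python
import random

def seeded_sample(seed, n=5):
    """Return n pseudo-random integers in 1..100 from a generator started at `seed`."""
    rng = random.Random(seed)  # independent generator with a fixed starting state
    return [rng.randint(1, 100) for _ in range(n)]

first = seeded_sample(seed=140)
second = seeded_sample(seed=140)  # same seed, so the same 'random' sequence

print(first)
print(first == second)
```

Anyone who runs this with the same seed and the same Python version should see the same list, which is the point of sharing a seed value.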


Random Numbers With Minitab 1

Which is the Uniform distribution and which is the Normal distribution?


• Uniform: every number in the range has an equal chance of occurring, which makes it perfect for selecting samples
• Normal or Gaussian: values are drawn from a Normal distribution, so they are biased towards the mean (0)
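A quick way to see the difference is to count how many generated values fall close to zero. This is a rough sketch in Python rather than Minitab; the range -3 to 3, the seed and the helper `share_near_zero` are assumptions chosen for illustration.

```python
import random

rng = random.Random(1)  # arbitrary seed so the run is repeatable
n = 10_000

uniform_values = [rng.uniform(-3, 3) for _ in range(n)]  # flat between -3 and 3
normal_values = [rng.gauss(0, 1) for _ in range(n)]      # mean 0, standard deviation 1

def share_near_zero(values, width=1.0):
    """Fraction of values lying within `width` of zero."""
    return sum(abs(v) <= width for v in values) / len(values)

print(f"Uniform: {share_near_zero(uniform_values):.2f} of values within 1 of zero")
print(f"Normal:  {share_near_zero(normal_values):.2f} of values within 1 of zero")
```

The Uniform share should be close to one third (a band of width 2 out of a range of width 6), while the Normal share should be close to two thirds, because Normal values cluster around the mean.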


Random Numbers With Minitab 5

1. Create a column(s) to receive the random numbers
2. Select Calc -> Random Data and select your distribution
  • Uniform for regular random numbers
3. Specify the receiving column, number of rows and parameters
4. Format the receiving column if necessary, e.g. specify dp
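For comparison, the same four steps can be sketched in Python. The function `random_column` and its parameters are hypothetical names for this illustration; they only mirror the dialogue box choices (rows, lower and upper limits, decimal places) and are not Minitab commands.

```python
import random

def random_column(rows, low, high, dp, seed=None):
    """Return `rows` Uniform random numbers between low and high, rounded to dp places."""
    rng = random.Random(seed)
    return [round(rng.uniform(low, high), dp) for _ in range(rows)]

# 10 rows, Uniform between 0 and 1, shown to 3 decimal places
column = random_column(rows=10, low=0, high=1, dp=3, seed=5)
print(column)
```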


Sampling Theory & Minitab Practice



How Much Is This Forest Worth?

• A farmer wants to know how much the trees are worth in his forest
• We can’t measure every tree to determine its value… so how can we answer?

Sampling!


Exploratory Data Analysis (EDA): Tally

• Minitab function to count different values in a column
• Numeric or nominal data
• It’s tempting here to make a back-of-envelope estimation
  • This will be very rough and not suitable for our purposes
  • But it does provide an indication
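What a tally produces can be sketched in a few lines of Python. The `species` list below is a short made-up column, not the tutorial worksheet, and `collections.Counter` is simply Python's counting tool standing in for Minitab's Tally.

```python
from collections import Counter

# Hypothetical species column; the real worksheet is not reproduced here.
species = ["Oak", "Birch", "Beech", "Birch", "Elm", "Oak", "Birch", "Yew"]

tally = Counter(species)          # count how often each distinct value occurs
total = sum(tally.values())

for name, count in sorted(tally.items()):
    print(f"{name:<6} {count:>3} {100 * count / total:5.1f}%")
```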

Exploratory Data Analysis (EDA): Graphical Summary

• Look at age initially to understand the spread
• Older trees should be larger and more valuable

What might account for the skew in ages?

• Look at age by tree species
  • Are trees planted in rotation or in groups?

The list of available columns changes because nominal data can be used to categorise a numeric variable.

Species   Median (yr)   Range (yr)
Beech     55            22-80
Birch     20            15-23
Elm       80            0.9-98
Oak       102           80-150
Yew       124           110-150
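A summary like this table can be built by grouping ages by species and then taking the median and range of each group. The sketch below uses a handful of made-up rows, chosen only so that the Birch and Oak results match the table; the tutorial data file itself is not reproduced.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (species, age in years) rows, not the tutorial's data.
trees = [("Birch", 15), ("Birch", 20), ("Birch", 23),
         ("Oak", 80), ("Oak", 102), ("Oak", 150)]

ages_by_species = defaultdict(list)
for name, age in trees:
    ages_by_species[name].append(age)   # nominal column categorises the numeric one

summary = {name: (median(ages), min(ages), max(ages))
           for name, ages in ages_by_species.items()}

for name, (mid, low, high) in summary.items():
    print(f"{name:<6} median {mid} yr, range {low}-{high} yr")
```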


Sampling Methods 1

• Simple Random Sampling
  • Select n at random from a list
  • With or without replacement


Sampling 1 - Minitab: Simple Random Sampling

• Each member is equally likely to be sampled
• Sampling does not affect the chance of selecting any other sample
• Replacement
  • Without: each member can be selected at most once
  • With: may select the same datum multiple times, so draws are fully independent; may be better for small datasets

1. Create a column to accept the sample list
2. Open the Sample From Columns dialogue box
3. Complete required fields
  • From Column will be the serial number or index of the tree to be measured
4. Click OK
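The same selection can be sketched in Python. The population of 200 serial numbers and the sample size of 30 are assumptions inferred from the species tally used later in the tutorial; `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random

rng = random.Random(2020)        # arbitrary seed for a repeatable example
tree_ids = list(range(1, 201))   # serial numbers 1..200 (assumed population size)

without_replacement = rng.sample(tree_ids, k=30)  # no tree can appear twice
with_replacement = rng.choices(tree_ids, k=30)    # repeats are possible

print(sorted(without_replacement))
print(sorted(with_replacement))
```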


Sampling Methods 2

• Systematic Random Sampling
  • Select a random start
  • Then select every nth
• Often used in industrial processes
• Can be more representative than simple random sampling
• Can be less representative if the sampling list is structured or ordered

Sampling 2 - Minitab: Systematic Sampling

• Sadly we can’t do this in Minitab!
• Paper, Excel or another spreadsheet is easy

1. Calculate the sampling interval:
  • Interval = Population size / Sample size
2. Select a random number as the first sample datum
  • Use a table or generate a Uniform Distribution random number list
3. Iteratively add the interval to the prior sample index until n is reached
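The three steps above can be sketched in Python as an alternative to paper or a spreadsheet. Two assumptions are made here that the slides do not state: the interval is rounded down to a whole number, and the random start is taken from within the first interval so that every index stays inside the population. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return 1-based indices: a random start, then every interval-th item."""
    interval = population_size // sample_size  # whole-number sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, interval)           # assumed: start within the first interval
    return [start + i * interval for i in range(sample_size)]

indices = systematic_sample(population_size=200, sample_size=20, seed=7)
print(indices)
```

With 200 trees and a sample of 20 the interval is 10, so the sample is one randomly chosen tree from the first ten and then every tenth tree after it.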

Sampling Methods 3: Stratified Sampling

• There are different methods for selecting stratum size
  • Distribution-matched – reflects the composition of the population (A)
  • Equal size – approximately the same number of members in each stratum (B)
• Select stratum members randomly
• Can be more representative than simple random sampling
• Useful method if differences between strata are important

Species   Tally   Percent   Stratum A    Stratum B
Beech     52      26%       7.8 → 8      6
Birch     66      33%       9.9 → 10     6
Elm       41      20.5%     6.15 → 6     6
Oak       36      18%       5.4 → 5      6
Yew       5       2.5%      0.75 → 1     6

Minitab will create a stratified sample but it is fiddly. See the end of the slide pack for a Minitab Blog article and some screenshots.
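The Stratum A column can be reproduced with a short calculation. The sketch below assumes a total sample size of 30, which is implied by the table (26% of 30 is 7.8) but not stated on the slide; the numbering of trees within each species is hypothetical.

```python
import random

tallies = {"Beech": 52, "Birch": 66, "Elm": 41, "Oak": 36, "Yew": 5}
sample_size = 30                    # assumed: implied by the table, 26% of 30 = 7.8
population = sum(tallies.values())  # 200 trees in total

# Distribution-matched (Stratum A): each stratum's share of the sample, rounded.
allocation = {name: round(count * sample_size / population)
              for name, count in tallies.items()}

# Draw each stratum's members at random; 1..count stand in for tree IDs within a species.
rng = random.Random(3)
strata_sample = {name: rng.sample(range(1, tallies[name] + 1), k)
                 for name, k in allocation.items()}

print(allocation)
```

Rounding happens to give exactly 30 here. In general the rounded stratum sizes may not add up to the intended sample size, so it is worth checking the total.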

Sampling Methods 4: Cluster Sampling

• Geographic method, best suited to sampling from multiple locations
• Use a random method to select a small number of locations
• Divide locations into clusters if needed
• Choose a subsample from each of these sample locations
  • Randomly!
• Combine
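The two random stages can be sketched in Python. Everything about the layout below is hypothetical: eight forest plots of 25 trees each, three plots chosen, and five trees taken from each chosen plot.

```python
import random

rng = random.Random(11)  # arbitrary seed for repeatability

# Hypothetical clusters: 8 forest plots, each holding 25 tree IDs.
plots = {f"plot_{p}": [f"p{p}_t{t}" for t in range(1, 26)] for p in range(1, 9)}

chosen_plots = rng.sample(sorted(plots), k=3)            # stage 1: pick locations at random
cluster_sample = []
for plot in chosen_plots:
    cluster_sample.extend(rng.sample(plots[plot], k=5))  # stage 2: random subsample per location

print(chosen_plots)
print(cluster_sample)  # the combined sample
```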

Sampling Methods: Golden Rules

1. Avoid the use of judgement or convenience to select samples
2. Use a good source of random values
  1. Tables
  2. Computer
  3. Calculator
  4. Dice, well-shuffled deck of cards
3. Trade off between accuracy and sample size
  1. Sample size may be constrained e.g. by cost, practicality, access etc.

More to come on sample sizes

Relationships Between Variables 3

• Sometimes we have multiple variables in a system
  • Lab experiment, data analysis, machine learning, traffic survey... Endless!
• Scientists are often interested in whether there are relationships between variables
  • Why?

Why do we look for relationships between variables?

• Here are a couple of tools to help explore multiple variables
  • Is there a relationship between variable A and variable B?
  • What kind of relationship?
  • How strong?
  • Can I use this to predict variable B behaviour?


Relationships Between Variables 4

• What relationship would you expect the following to have?
  • Positive or negative?

• Petrol price and miles driven
• Salt intake and blood pressure
• Number of completed Unit exercises and TMA scores
• Price of an item and number of that item sold
• Temperature and ice cream sales
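One rough way to check the direction of a relationship numerically is the sign of the covariance: positive when the two variables tend to rise together, negative when one falls as the other rises. The sketch below is an illustration only; the function `direction` is a hypothetical helper and all four data lists are made-up numbers, not survey results.

```python
def direction(xs, ys):
    """Classify a relationship as positive, negative or none from the sign of the covariance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    if covariance > 0:
        return "positive"
    if covariance < 0:
        return "negative"
    return "none"

# Made-up numbers purely for illustration.
temperature = [12, 16, 20, 24, 28]
ice_cream_sales = [30, 42, 55, 71, 90]
price = [1, 2, 3, 4, 5]
units_sold = [95, 80, 62, 50, 31]

print(direction(temperature, ice_cream_sales))  # positive
print(direction(price, units_sold))             # negative
```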


Relationships Between Variables - Minitab

• Tool 1: Visual exploration – scatter plot
• Tool 2: Describing & predicting – least squares regression


Scatterplots with Minitab 1


How confident are you with using scatter plots?

[Figure: four Minitab scatterplots: C2 vs C1, C3 vs C1, C5 vs C4 and C6 vs C4. In each plot the horizontal axis shows the explanatory (predictor) variable and the vertical axis shows the response variable.]

Response and Explanatory Variables 4

• Are TMA01 scores related to the total amount of time spent studying the course in weeks 1-6?

Which is the response variable?

• Explanatory variable: Time spent studying
• Response variable: TMA01 scores

Scatterplots with Minitab 3

1. Select Graph -> Scatterplot…
2. Select Simple
3. Select your X and Y variables
  • Explanatory or Predictor on X
  • Response on Y


Line of Best Fit 1

• Sometimes a line can be fitted to a scatterplot, to help explain data more easily
• This line can also be used as a prediction tool
  • Machine learning!
• But which line has the best fit?

[Figure: scatter of points with three candidate lines labelled a, b and c]


Line of Best Fit 2

• Graph of achievement in maths against reading; the units are the average scores for 15-year-olds, by country (pisa.mtw)
• Where would you draw the regression line?


Regression 1

• A regression model is systematically fitted to every data point
  • Different methods are used to calculate the distance from many theoretical lines to each point
  • Residuals
• The line with the smallest total residuals is selected (for least squares, the smallest sum of squared residuals)
• Here we use a linear regression model and the least squares fitting method

Regression 2

• Any straight line can be expressed as:
  • $y = mx + C$
• $m$ is the gradient or slope of the line
• $C$ is the intercept on the vertical axis
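For a single explanatory variable, the least squares values of the gradient and intercept have a standard closed form: the gradient is the sum of (x - mean of x)(y - mean of y) divided by the sum of (x - mean of x) squared, and the intercept is the mean of y minus the gradient times the mean of x. The sketch below applies this in Python to made-up points; the function name `least_squares` is hypothetical and this is not a description of Minitab's internals.

```python
def least_squares(xs, ys):
    """Return (m, C) for the least squares line y = m*x + C."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx              # gradient
    C = mean_y - m * mean_x    # intercept on the vertical axis
    return m, C

# Made-up points that lie exactly on y = 2x + 1, so the fit is easy to check.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, C = least_squares(xs, ys)

# Residual = observed value minus the value predicted by the fitted line.
residuals = [y - (m * x + C) for x, y in zip(xs, ys)]
print(m, C)
print(residuals)
```

Because these points sit exactly on a line, every residual is zero; with real data the residuals show how far each point lies above or below the fitted line.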

Regression With Minitab 4

1. Select Fit Regression Model
2. Select the variables
  • Predictor = X axis
  • Response = Y axis
3. Click OK
4. Here is our $y = mx + C$

Regression With Minitab 5

Why might this be a poor prediction tool in some cases?

Negative intercept suggests anything shorter than 3 m has a negative value.

Regression With Minitab 6

Adding a regression line
1. Select Graph -> Scatterplot
2. Select With Regression

Regression With Minitab 7

Adding a regression line
1. Select Graph -> Scatterplot
2. Select With Regression
3. Choose the X and Y variables

Regression With Minitab 8

Residuals
1. Select Scatterplot
2. Select X and Y variables
3. Select Graphs…

Regression With Minitab 9

Residuals
1. Select Scatterplot
2. Select X and Y variables
3. Select Graphs…
4. Select parameters as shown
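A residual is the observed response minus the value the fitted line predicts for it. The sketch below shows that calculation by hand; the data are hypothetical and were chosen so that 𝑦 = 2𝑥 + 1 is their least-squares line, and the function name `residuals` is illustrative only.

```python
def residuals(xs, ys, m, C):
    """Observed y minus fitted y for each data point."""
    return [y - (m * x + C) for x, y in zip(xs, ys)]


# Hypothetical data whose least-squares line is y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3.5, 4.5, 6.5, 9.5]

res = residuals(xs, ys, m=2.0, C=1.0)
print(res)       # [0.5, -0.5, -0.5, 0.5]
print(sum(res))  # 0.0
```

Plotting these values against 𝑥 gives the kind of residual plot the steps above produce: points scattered around zero with no obvious pattern suggest a straight line is a reasonable model.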

Regression With Minitab 10

OU Resources
• M140 materials online
  • Course Books & Screencasts
  • https://learn2.open.ac.uk/course/view.php?id=208584&area=resources
• M140 student forums
• OU Library e-books
  • https://pmt-eu.hosted.exlibrisgroup.com/permalink/f/gvehrt/TN_cdi_askewsholts_vlebooks_9781846281686
  • https://pmt-eu.hosted.exlibrisgroup.com/permalink/f/h21g24/44OPN_ALMA_DS51131243990002316
• Contact me:
  • [email protected]
  • 07311 188 800

Online Resources
• Wikipedia
• CrossValidated
  • https://stats.stackexchange.com/
• Minitab channel on YouTube:
  • https://www.youtube.com/user/MinitabInc
• Minitab help
  • https://support.minitab.com/en-us/minitab/19/

Thank you! Any questions?

Recording will be available from the M140-20J Online Tutorial Room:
https://learn2.open.ac.uk/mod/connecthosted/view.php?id=1644077&group=274133

Sampling With Minitab
Stratified Sampling

1. Split Worksheet by tree species

https://blog.minitab.com/blog/statistics-and-quality-improvement/taking-a-stratified-sample-in-minitab-statistical-software

Sampling With Minitab
Stratified Sampling

1. Split Worksheet by tree species
2. Create a random sample on each new worksheet for the stratum size, using same destination column

Sampling With Minitab
Stratified Sampling

1. Split Worksheet by tree species
2. Create a random sample on each new worksheet for the stratum size, using same destination column
3. Stack all the sub-sheets

Sampling With Minitab
Stratified Sampling

1. Split Worksheet by tree species
2. Create a random sample on each new worksheet for the stratum size, using same destination column
3. Stack all the sub-sheets
4. Copy the stratified sample column to a new worksheet using Subset the Data

Sampling With Minitab
Stratified Sampling

5. Set this condition

Sampling With Minitab
Stratified Sampling

5. Set this condition
6. Sample will appear in new sheet

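The same idea (split by stratum, sample within each stratum, then combine) can be sketched in a few lines of Python. This is an illustration of stratified sampling in general, not a reproduction of the Minitab procedure; the function name `stratified_sample`, the tree records and the stratum sizes are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(rows, stratum_of, sizes, seed=None):
    """Draw sizes[name] rows at random, without replacement, from each stratum."""
    rng = random.Random(seed)

    # Split the data by stratum (like splitting the worksheet by species)
    strata = defaultdict(list)
    for row in rows:
        strata[stratum_of(row)].append(row)

    # Sample within each stratum, then combine (like stacking the sub-sheets)
    sample = []
    for name, members in strata.items():
        k = sizes.get(name, 0)
        if k > len(members):
            raise ValueError(f"stratum {name!r} has only {len(members)} rows")
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical records: (species, height in metres)
trees = [
    ("oak", 12.1), ("oak", 9.8), ("oak", 15.3), ("oak", 11.0),
    ("birch", 7.2), ("birch", 8.4),
]

picked = stratified_sample(
    trees,
    stratum_of=lambda row: row[0],
    sizes={"oak": 2, "birch": 1},
    seed=1,
)
print(picked)
```

Fixing `seed` makes the draw repeatable; leaving it as `None` gives a fresh random sample each time.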