
GE 5 Tutorial 5

Rules of engagement
• no computer or no power → no lesson
• no SPSS → no lesson
• no homework done → no lesson


Topics

• samples and populations
• significance for the correlation coefficient
• standard error
• SPSS


Content tutorial 5

1. Quiz
2. Discussion homework
3. Sampling and samples
4. Significance
5. Standard error
6. SPSS workshop
7. Discussion homework next week


1. Quiz


2. Discussion homework


Quick refresher
- correlation
- measures of association


Scatter plot

• When we look at this scatter plot, what can we say about the correlation?


Levels of measurement and measures of association

                   nominal            ordinal        interval or ratio   dichotomy
nominal            Cramér's V         Cramér's V     Cramér's V or eta   Cramér's V
ordinal            Cramér's V         Spearman rho   Spearman rho        Spearman rho
interval or ratio  Cramér's V or eta  Spearman rho   Pearson r
dichotomy          Cramér's V         Spearman rho                       phi


3. Chapters 9, 10, 11 of Howitt & Cramer


Chapter 9: Samples & Populations

We talked about large and small correlations:
0.3 is small; 0.8 is large; -0.8 is just as large

But how large is an “interesting” correlation?

Statistical answer: When it is a “significant” correlation

Significant = Unlikely to have occurred by chance


Example

“This is not an organized demonstration, we just all happened to be wearing the same thing today”

Probably not true, because this is unlikely to have happened by chance.


Chapter 9: Samples & Populations

The same question can be asked about samples and populations:

– we measure the enthusiasm about a TV show in a sample of 100 people
– we find much higher ratings than we expected
– how unlikely is it that we found this by chance?
– if unlikely enough, we conclude that the TV show is more popular than we thought.

Inferential Statistics
How can we use a sample to draw conclusions that apply to the population?


DEFINITIONS

We are not necessarily talking about people, but about scores

- age of husband/wife: each couple is a score
- “phone brands owned over lifetime”: multiple scores per person

Population: entire set of scores that we are interested in
- so only the target group

Sample: set of scores that we select for research
- because we simply cannot measure every score
- a good sample is a random sample


Samples

- Statistical inference generally assumes that samples are drawn randomly from the population
- Random does not mean haphazard: just picking whoever comes along is not random sampling
- Often you have to take special measures to give every score the same chance:
  if you are asking people in the street, don't only do this in the mornings
- Random sample: “each score in the population has an equal chance of being selected”
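
A minimal Python sketch of a simple random sample, assuming the population is just a list of scores. The numbers and the names `population` and `draw_random_sample` are made up for illustration; `random.sample` draws without replacement and gives every score the same chance of being selected.

```python
import random

def draw_random_sample(scores, size, seed=None):
    """Return `size` scores drawn without replacement; every score has the same chance."""
    rng = random.Random(seed)  # own generator, so a seed makes the draw repeatable
    return rng.sample(scores, size)

# Hypothetical population of 100 scores (not the data from the book)
population = list(range(1, 101))
sample = draw_random_sample(population, 10, seed=1)
print(sample)
```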


Samples can approximate population

Table 9.1 of the book has a population of 100 scores
Mean = 5.52

Table 9.2 has the means of 40 small random samples (size 5)

The mean of sample means is: 5.20, min: 2.00, max: 8.80
SD: 1.60
Sample means from small samples are not such good estimates of the population mean.


larger samples

Table 9.3 has the means of 40 large samples (size 20) drawn from the same population
What can we observe?

The mean of sample means is: 5.33, min: 5.25, max: 6.85
SD: 0.60
A larger sample size gives a much better estimate of the population mean
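
The comparison between small and large samples can be tried out with a short simulation. This is only a sketch: the population below is randomly generated (it is not Table 9.1 of the book), and the function name `sample_means` and the seed are arbitrary choices.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, rng):
    """Means of `n_samples` random samples of `sample_size` scores each."""
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(n_samples)]

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
# Hypothetical population of 100 scores between 0 and 10
population = [rng.randint(0, 10) for _ in range(100)]

small = sample_means(population, 5, 40, rng)   # 40 samples of size 5
large = sample_means(population, 20, 40, rng)  # 40 samples of size 20

print("population mean:", statistics.mean(population))
for label, means in (("size 5", small), ("size 20", large)):
    print(label, "mean of means:", round(statistics.mean(means), 2),
          "SD of means:", round(statistics.stdev(means), 2))
```

The SD of the means from the size 20 samples should come out clearly smaller than that of the size 5 samples, which is the pattern seen in Tables 9.2 and 9.3.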


Conclusions

If we draw a random sample, the outcome will generally not be exactly the same as the outcome in the population.

The mean of the sample will usually differ somewhat from the mean of the population

But the larger the sample, the larger the chance that the sample outcome will be close to the population outcome

As we saw:
- very small samples are not reliable
- larger samples can get close to the population values


SPSS random samples 10%

OPEN THE BIG SMARTIE SURVEY!

Clean up the results first: weights of 0 or less and weights of 100 or more need to be deleted
- use Data; Select Cases; If; weight > 0 & weight < 100
- choose “remove unselected cases” and press OK

Take a random sample (10% of the cases)
- Data; Select Cases; Random sample; 10 %
- choose “filter unselected cases”

Calculate descriptives
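
For comparison outside SPSS, the same three steps (clean, sample 10%, describe) can be sketched in Python. The data here are randomly generated stand-ins, not the Big Smartie survey, and the field name `weight` simply mirrors the variable used in the filter above.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical survey rows; "weight" mimics the variable filtered on the slide
cases = [{"id": i, "weight": rng.uniform(-20, 130)} for i in range(500)]

# Step 1: keep only cases with 0 < weight < 100 (same condition as the filter)
clean = [c for c in cases if 0 < c["weight"] < 100]

# Step 2: draw 10% of the remaining cases at random
sample = rng.sample(clean, round(0.10 * len(clean)))

# Step 3: descriptives for the sample
weights = [c["weight"] for c in sample]
print("n =", len(weights))
print("mean =", round(statistics.mean(weights), 2))
print("sd =", round(statistics.stdev(weights), 2))
```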


Aside about computers and randomness

Computers cannot be really random (or creative)
We all got the same “random” sample
True random numbers can be found at random.org
You can “seed” the SPSS random numbers with your own number (I use 123456, use your birthday)

1) Go to Transform; Random Number Generators, choose the Mersenne Twister and set a fixed starting point (for example 123456)

2) OR open a syntax window (New; Syntax) and type and run this command:

   SET SEED 123456.

Then reselect all cases (Data; Select Cases; use all; OK)
And select another random 10% (Data; Select Cases; ...)
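
The effect of a seed is the same in any software. As an illustration in Python (not SPSS), the sketch below draws a 10% sample twice with the same seed and once with a different one; the case numbers, the second seed and the function name `ten_percent_sample` are invented for the example.

```python
import random

def ten_percent_sample(case_ids, seed):
    """Draw 10% of the case ids using a generator started from `seed`."""
    rng = random.Random(seed)
    return rng.sample(case_ids, len(case_ids) // 10)

case_ids = list(range(200))  # hypothetical case numbers

first = ten_percent_sample(case_ids, seed=123456)
again = ten_percent_sample(case_ids, seed=123456)
other = ten_percent_sample(case_ids, seed=19990405)  # e.g. a birthday

print(first == again)  # same seed, so the same "random" sample
print(first == other)  # a different seed gives a different sample
```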


Distribution of means

• In this (unrealistic) case, we can take 40 samples from the same population

• We can plot the means we obtained and we find:
  • a (roughly) normal distribution of sample means
  • around the mean of means
  • with a standard deviation
  • which we call the standard error


standard error

• standard deviation:
  • for cases and one sample
  • the average distance from the cases to the mean of the sample

• standard error:
  • for a number of samples and a mean of means
  • the average distance from a sample's mean to the mean of means

• But: we can hardly ever obtain more than one sample
• We also have a formula to estimate the standard error from ONE sample only

• standard error:
  • for one sample and a population mean (via formula, population mean unknown)
  • the estimated 'average' distance from this sample mean to the population mean
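
Both routes to the standard error can be compared in a small simulation. This is a sketch under assumptions: the population is generated (roughly normal, mean 50, SD 10), the sample size of 25 is arbitrary, and the one-sample estimate uses the formula sd / √n that appears in the exam mark example later in this tutorial.

```python
import math
import random
import statistics

def estimated_standard_error(sample):
    """Estimate the standard error of the mean from one sample: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

rng = random.Random(7)
# Hypothetical large population of scores
population = [rng.gauss(50, 10) for _ in range(100_000)]

# "Unrealistic" route: many samples, then the SD of their means
means = [statistics.mean(rng.sample(population, 25)) for _ in range(400)]
se_from_many_samples = statistics.stdev(means)

# Realistic route: one sample and the formula
one_sample = rng.sample(population, 25)
se_from_one_sample = estimated_standard_error(one_sample)

print(round(se_from_many_samples, 2), round(se_from_one_sample, 2))
```

Both numbers should land near 10 / √25 = 2, although the one-sample estimate varies more from run to run.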


The error in the measurement is called standard error

Every estimate from a sample has a standard error
The smaller the standard error, the better the estimate.
A smaller standard error means a better estimate of the mean of the population!
So we could report: The sample mean of age in IMEM year 1 is 20 with a standard error of 2.


Confidence interval

What is a confidence interval?
- The confidence interval is a better way to report a mean and standard error
- The confidence interval is a good way to understand significance.


Confidence Interval (CI) – report a sample mean

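The sample mean of age in IMEM year 1 is 20 with a standard error of 2.

The 68% CI of age was 18 – 22 years (mean ± 1 SE)
The 95% CI of age was 16 – 24 years (mean ± 2 SE) ← most common
The 99% CI of age was 14 – 26 years (mean ± 3 SE)

These intervals use the rule of thumb of 1, 2 and 3 standard errors; the exact multipliers for a normal distribution are about 1.00, 1.96 and 2.58.

Meaning of the 95% confidence interval (95% CI):

If we drew many samples and computed the interval each time, about 95% of those intervals would contain the population mean. Informally: we are 95% confident that the population mean is between 16 and 24.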
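
The three intervals follow directly from the mean and the standard error. A minimal sketch, using the rule-of-thumb multipliers 1, 2 and 3 that reproduce the numbers above; the function name `confidence_interval` is just an illustrative choice.

```python
def confidence_interval(mean, standard_error, multiplier):
    """Interval mean ± multiplier × standard error."""
    margin = multiplier * standard_error
    return (mean - margin, mean + margin)

mean_age, se_age = 20, 2  # values from the slide

# Rule-of-thumb multipliers used on the slide
for level, k in (("68%", 1), ("95%", 2), ("99%", 3)):
    low, high = confidence_interval(mean_age, se_age, k)
    print(level, "CI:", low, "to", high)

# The same 95% interval with the more exact normal multiplier 1.96
exact_95 = confidence_interval(mean_age, se_age, 1.96)
print("95% CI with 1.96:", exact_95)
```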


Chapter 10: Statistical significance for correlation coefficient

Remember what the null hypothesis (H0) and the alternative hypothesis (H1) are?

E.g.:
H0: there is no relationship between brain size and intelligence
H1: there is a relationship between brain size and intelligence

- What are the two variables in the previous H0?
- What does this imply for the correlation between those?
- What would a positive correlation imply?


Under H0

• We measured brain size and IQ in 120 volunteers.
• We compute the Pearson r and the standard error
• Assume that the SE is 0.2

• Under H0, there is no association, so we expect to find r close to 0
• Under H0, 95% CI for r = [-0.4 ; +0.4]

• If we find a value outside the 95% CI, we reject H0
  • sample r = 0.1 → H0 ok
  • sample r = -0.3 → H0 ok
  • sample r = 0.5 → reject H0, accept H1, there is a relationship
  • sample r = -0.5 → reject H0, accept H1, there is an inverse relationship
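
The decision rule can be written out as a few lines of Python. This is a sketch of the rule of thumb on this slide only (interval of ± 2 SE around r = 0, with the SE of 0.2 assumed above); the function name `check_correlation` is hypothetical and this is not how SPSS computes its significance values.

```python
def check_correlation(sample_r, standard_error, multiplier=2):
    """Compare a sample r with the interval expected under H0 (r = 0)."""
    limit = multiplier * standard_error
    if abs(sample_r) <= limit:
        return "H0 ok"
    direction = "positive" if sample_r > 0 else "inverse"
    return f"reject H0 ({direction} relationship)"

SE = 0.2  # assumed on the slide
for r in (0.1, -0.3, 0.5, -0.5):
    print(r, "→", check_correlation(r, SE))
```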


ASIDE: How to compute estimated standard error

Left: Estimated SE for a mean
- computed from the standard deviation
- here they used 's' for the standard deviation of the sample, we have used sigma (σ) before.

Right: Estimated SE for a correlation coefficient

Neither formula is on the exam
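
The two formulas themselves are not reproduced in this text. For orientation only, and as an assumption about what was shown, these are the forms commonly used, with s the sample standard deviation, n the sample size and r the sample correlation. The first matches the "sd / √n" used in the exam mark example.

```latex
\[
SE_{\text{mean}} = \frac{s}{\sqrt{n}}
\qquad\qquad
SE_{r} = \sqrt{\frac{1 - r^{2}}{n - 2}}
\]
```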


Another example

H1: students in this year are smarter than students from last year
H0: this year is just as smart as last year

Last year’s exam mark was 7.2
- H0: the average exam mark will again be a 7.2

We take a sample of 64 students and find mean 7.7, standard deviation 1.6
- the average student is within 1.6 marks of 7.7
- estimated standard error = sd / √n = 1.6 / 8 = 0.2
- under H0, mean = 7.2, 95% CI = [6.8 ; 7.6]
- observed sample mean is outside of the 95% CI
- we reject H0, this year is smarter!

What if we had found mean 7.5, standard deviation 1.6?
What if we had found mean 7.7, standard deviation 2.4?
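
The arithmetic of this example, including the two "what if" variations, can be checked with a short script. It is a sketch of the slide's rule of thumb (H0 mean ± 2 standard errors); the function name `outside_95_ci` is made up for the illustration.

```python
import math

def outside_95_ci(sample_mean, sd, n, h0_mean, multiplier=2):
    """True if the sample mean falls outside h0_mean ± multiplier × SE."""
    se = sd / math.sqrt(n)  # estimated standard error
    low = h0_mean - multiplier * se
    high = h0_mean + multiplier * se
    return not (low <= sample_mean <= high)

H0_MEAN, N = 7.2, 64
for mean, sd in ((7.7, 1.6), (7.5, 1.6), (7.7, 2.4)):
    verdict = "reject H0" if outside_95_ci(mean, sd, N, H0_MEAN) else "keep H0"
    print(f"mean {mean}, sd {sd}: {verdict}")
```

With a standard deviation of 2.4 the standard error becomes 0.3, so the interval widens to [6.6 ; 7.8] and a mean of 7.7 no longer falls outside it.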


Type I & II Errors

Finding a significant result is usually a good thing

But keep this in mind: we could make a Type I error
Type I: decide that H0 is false when it is true
Type II: decide that H0 is true when it is false

Powerful statistical tests are those with a smaller chance of a Type II error
You will never know whether you have made a Type I or II error
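
In real research we never know which error we made, but in a simulation we do, because we choose the true population mean ourselves. The sketch below reuses the exam mark numbers (7.2, SD 1.6, n = 64) as a hypothetical setup and counts how often the ± 2 SE rule goes wrong; the seed and the number of trials are arbitrary.

```python
import math
import random
import statistics

def rejects_h0(sample, h0_mean, multiplier=2):
    """Reject H0 when the sample mean is more than `multiplier` SEs from h0_mean."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return abs(statistics.mean(sample) - h0_mean) > multiplier * se

rng = random.Random(5)
H0_MEAN, SD, N, TRIALS = 7.2, 1.6, 64, 2000  # hypothetical setup

# H0 is true here: every sample comes from a population with mean 7.2,
# so each rejection is a Type I error.
type_1 = sum(rejects_h0([rng.gauss(H0_MEAN, SD) for _ in range(N)], H0_MEAN)
             for _ in range(TRIALS))

# H0 is false here (true mean 7.7), so each non-rejection is a Type II error.
type_2 = sum(not rejects_h0([rng.gauss(7.7, SD) for _ in range(N)], H0_MEAN)
             for _ in range(TRIALS))

print("Type I rate:", type_1 / TRIALS)
print("Type II rate:", type_2 / TRIALS)
```

The Type I rate should come out near 5%, which is what a 95% interval implies. The Type II rate depends on how far the true mean is from 7.2 and on the sample size.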


4. SPSS workshop


SPSS Topics

• Open Big Smartie survey
• Use two scale type variables
• Use the correct correlation measure
• Define how strong the relationship is
• Check if the relationship is significant.
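
For two scale variables the table of measures points to Pearson r. To see what SPSS computes, here is a minimal Python sketch of the coefficient; the `height` and `weight` values are invented and are not taken from the Big Smartie survey.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scale variables
height = [160, 165, 170, 175, 180, 185]
weight = [55, 62, 60, 70, 74, 80]

r = pearson_r(height, weight)
print("Pearson r =", round(r, 2))
```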


SPSS Topics

• Take a random sample of 20% of the population
• Do the same measurement again
• Calculate your measurements again

Repeat this for a 50% sample


SPSS Topics

• Open Big Smartie survey
• Use two ordinal type variables
• Use the correct correlation measure
• Define how strong the relationship is
• Check if the relationship is significant.
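
For two ordinal variables the table of measures points to Spearman rho, which is the Pearson correlation computed on ranks. A minimal Python sketch, assuming tied values share the average of their ranks; the two rating variables are invented and are not from the Big Smartie survey.

```python
import math

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman rho: Pearson r applied to the ranks of the scores."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical ordinal ratings on a 1 to 5 scale
taste = [1, 2, 2, 3, 4, 5, 5]
colour = [2, 1, 3, 3, 5, 4, 5]

rho = spearman_rho(taste, colour)
print("Spearman rho =", round(rho, 2))
```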


6. Homework
