Last Time: 2/3 of all Type A respondents had measurements between 55 and 69

Page 1

Last Time:

Page 2

[Figure: histogram for Type A; horizontal axis in bins 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, 85-89, 90-94]

2/3 of all Type A respondents had measurements between 55 and 69

Page 3

[Figure: two panels, labeled “DataWorld” and “TheoryWorld”, each with horizontal axis in bins 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, 85-89, 90-94]

Page 4

Comment: A density function is like a “smoothed-out” histogram with very fine bins
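The comment can be checked numerically. The following is only a rough sketch, not something from the lecture: it draws a large sample from a standard normal distribution, builds histograms with coarse and fine bins, and measures how far the bars are from the density curve at the bar edges. The choice of distribution, the sample size, the seed, and the helper name `worst_gap` are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

def worst_gap(samples, density, lo, hi, bin_width):
    """Largest distance between a histogram bar and the density at the bar's edges."""
    n_bins = round((hi - lo) / bin_width)
    counts = [0] * n_bins
    for x in samples:
        k = int((x - lo) // bin_width)
        if 0 <= k < n_bins:            # ignore the few samples outside [lo, hi)
            counts[k] += 1
    gap = 0.0
    for k, count in enumerate(counts):
        # Scale the bar so that its area equals the relative frequency of the bin.
        height = count / (len(samples) * bin_width)
        left = lo + k * bin_width
        right = left + bin_width
        gap = max(gap,
                  abs(height - density.pdf(left)),
                  abs(height - density.pdf(right)))
    return gap

random.seed(0)                                   # fixed seed, for reproducibility
theory = NormalDist(mu=0.0, sigma=1.0)           # assumed density (hypothetical choice)
data = [random.gauss(0.0, 1.0) for _ in range(100_000)]

gap_coarse = worst_gap(data, theory, -4.0, 4.0, 1.0)
gap_fine = worst_gap(data, theory, -4.0, 4.0, 0.1)
print(f"bin width 1.0: worst gap {gap_coarse:.3f}")
print(f"bin width 0.1: worst gap {gap_fine:.3f}")
```

With a large sample, the finely binned histogram should hug the density much more closely than the coarse one, which is the sense in which the density is a smoothed-out histogram.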

Page 5

Examples of Density Functions

[Figure: symmetric density with the 25th percentile, the median, and the 75th percentile marked; the IQR is the distance between the 25th and 75th percentiles]

Symmetric: Median = Mean

The area below the pth percentile is p percent of the total area under the curve

Page 6

Examples of Density Functions

[Figure: density skewed to the right, with the median marked to the left of the mean]

Positively Skewed (Skewed to the right)

Page 7

THE NORMAL DISTRIBUTION

Properties of $X \sim N(\mu, \sigma)$

The proportion of a normally distributed X within:

• one standard deviation from its mean is .6826: $P(\mu - \sigma < X < \mu + \sigma) = .6826$

• two standard deviations from its mean is .9544: $P(\mu - 2\sigma < X < \mu + 2\sigma) = .9544$

• three standard deviations from its mean is .9974: $P(\mu - 3\sigma < X < \mu + 3\sigma) = .9974$

True for any value of $\mu$ and $\sigma$
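These three proportions can be recomputed directly. The sketch below uses the identity P(|X - μ| < kσ) = erf(k/√2), which holds for every normal distribution; the function name `prob_within` is made up for this example. The results differ from the slide values in the fourth decimal place because the slide values come from a four-decimal table (for instance 2 × .8413 - 1 = .6826).

```python
from math import erf, sqrt

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X; does not depend on mu or sigma."""
    return erf(k / sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {prob_within(k):.4f}")
```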

Page 8

STANDARD NORMAL DISTRIBUTION

Z ~ N(0, 1)

[Figure: standard normal density; horizontal axis z from -4 to 4]

Know everything about Z ~ N(0,1):

Table in your book (inside cover) tabulates values P(Z < z)

(note the table goes over two pages)

Note: you can think of values z of Z ~ N(0,1) as

“z many standard deviations from the mean”

Page 9

AMAZING PROPERTY OF NORMAL DISTRIBUTIONS

If X is normally distributed, then a + bX (b > 0) is also normally distributed.

More precisely: $X \sim N(\mu, \sigma) \;\Rightarrow\; (a + bX) \sim N(a + b\mu,\ b\sigma)$

Note: This type of relationship is not necessarily true for other distributions
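One special case is worth writing out, as a worked illustration that is not on the slide itself: choosing $a = -\mu/\sigma$ and $b = 1/\sigma$ (so that $b > 0$) turns any normal X into a standard normal Z. This is what makes a single table of P(Z < z) sufficient for every normal distribution.

```latex
Z \;=\; \frac{X - \mu}{\sigma}
  \;=\; \underbrace{-\frac{\mu}{\sigma}}_{a} \;+\; \underbrace{\frac{1}{\sigma}}_{b}\, X
  \;\sim\; N\!\left(a + b\mu,\ b\sigma\right)
  \;=\; N\!\left(-\frac{\mu}{\sigma} + \frac{\mu}{\sigma},\ \frac{\sigma}{\sigma}\right)
  \;=\; N(0, 1)
```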

Page 10

Example: The population distribution of scores on psychometric test X is a normal distribution with mean 1.1 and standard deviation .08. Thus, X ~ N(1.1, .08).

a) P(1.1 < X) = ?

b) P(1.02 < X < 1.18) = ?

c) How to calculate P(1.1 < X < 1.25) ?

d) How to calculate P(X > 1) ?

e) How to find x such that P(X <x) = .75 ?

$\mu = 1.1,\quad \sigma = .08,\quad \sigma^2 = .0064$
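Questions a) through e) can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names are made up for this example, and the lecture presumably expects the table-based route via z = (x - 1.1) / .08.

```python
from statistics import NormalDist

# X ~ N(1.1, .08), where .08 is the standard deviation (not the variance).
X = NormalDist(mu=1.1, sigma=0.08)

p_a = 1.0 - X.cdf(1.1)             # a) P(1.1 < X): area above the mean
p_b = X.cdf(1.18) - X.cdf(1.02)    # b) within one standard deviation of the mean
p_c = X.cdf(1.25) - X.cdf(1.1)     # c) z runs from 0 to (1.25 - 1.1) / .08 = 1.875
p_d = 1.0 - X.cdf(1.0)             # d) z = (1 - 1.1) / .08 = -1.25
x_e = X.inv_cdf(0.75)              # e) the 75th percentile of X

for label, value in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d), ("e", x_e)]:
    print(f"{label}) {value:.4f}")
```

For e), the same answer follows from the table by finding z with P(Z < z) = .75 and computing x = 1.1 + .08 z.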

Page 11

Today:

Rehearse the Normal Distribution

Start Chapter 2: Relationships among Variables

Page 12

Relationships among Variables

So far:

Mostly interested in a single variable at a time. Exception: Type A, Type B data, where we recorded the type and the blood pressure

Mode, Median, Mean, IQR, Variance, Standard Deviation, etc. all applied to a single variable

Single variable statistics are common in daily life:

Government / Mass Media provide tons of Socio-Economic Statistics, Sports Statistics

Page 13

Relationships among Variables: The crucial feature of almost all scientific research

How does the perception of a stimulus vary with the physical intensity of that stimulus?

How does the attitude towards the President vary with the socio-economic properties of the survey respondent?

How does the performance on a mental task vary with age?

Page 14

Relationships among Variables: The crucial feature of almost all scientific research

How does depression vary with number of traumatic experiences?

How does undergraduate student alcohol abuse vary with performance in quantitative courses?

How does memory performance vary with attention span?

Page 15

Relationships among Variables: The crucial feature of almost all scientific research

How does the behavior of respondents in an experiment vary with the experimental group that the respondents belong to?

… and on … and on … and on …

Page 17

Relationships among Variables: Interpretations

One variable is used to “explain” another variable

Stimulus → Response
Experimental Group → Observed Behavior

Both variables depend on a third (“lurking”) variable

SAT Verbal ? ? SAT Quantitative

Page 18

Relationships among Variables: Interpretations

Page 19

Relationships among Variables: Interpretations

Page 20

Relationships among Variables: Interpretations

One variable is used to “explain” another variable

X Variable: Independent Variable, Explaining Variable, Exogenous Variable, Predictor Variable

Y Variable: Dependent Variable, Response Variable, Endogenous Variable, Criterion Variable

Page 21

Scatter Plots

[Figure: scatter plot of Y against X]

Page 22

Questions to ask about Scatter Plots

• Is there a systematic trend?

• Can the relationship be described by a linear function Y = a + bX?

• If so, is there a lot of scatter around the line?

• Is there a strong linear relationship?

• Are there lurking variables?

Page 23

Scatter Plots

[Figure: scatter plot of Y against X]

Weak Positive Association? A lot of Scatter! Lurking Variables?

Page 24

Scatter Plots

[Figure: scatter plot of Y against X, with points labeled “Venus” and “Mars”]

Page 25

Scatter Plots

[Figure: scatter plot of Y against X, “Venus” points only]

Venus: Negative Association, Not a lot of Scatter

Page 26

Scatter Plots

[Figure: scatter plot of Y against X, “Mars” points only]

Mars: Positive Association, Not a lot of Scatter

Page 27

Example: Performance in Experiment

           PRACTICE    TRIAL
CASE 1       86         82.6
CASE 2      109.3      112.6
CASE 3       73.3       70
CASE 4       80.6       76.6
CASE 5       86.6       84
CASE 6       85.3       86
CASE 7       83.3       82.6
CASE 8       78.6       81.3
CASE 9       92         86.6
CASE 10      76         75.3

PRACTICE: Performance Score in a Practice Session
TRIAL: Performance Score in a Trial Session

Suppose these scores are Interval Scale

Case i = Respondent i

Sample Size: 10 Respondents

Page 28

Example: Performance in Experiment

(Same PRACTICE and TRIAL data as on Page 27.)

Page 29

Stem and Leaf Plots

Stem and Leaf Plot of variable: PRACTICE, N = 10
Minimum:      73.300
Lower hinge:  78.600
Median:       84.300
Upper hinge:  86.600
Maximum:     109.300

   7     3
   7  H  68
   8  M  03
   8  H  566
   9     2
   * * * Outside Values * * *
  10     9

Page 30

Stem and Leaf Plots

Stem and Leaf Plot of variable: TRIAL, N = 10
Minimum:      70.000
Lower hinge:  76.600
Median:       82.600
Upper hinge:  86.000
Maximum:     112.600

   7     0
   7  H  56
   8  M  1224
   8  H  66
   * * * Outside Values * * *
  11     2
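The minimum, hinges, median, and maximum printed with the stem and leaf plots can be reproduced from the raw scores. The sketch below rests on an assumption about how the statistics package defined a hinge: the median of the lower (or upper) half of the sorted data, with the middle value counted in both halves when N is odd. The function name `five_number_summary` is made up for this example.

```python
from statistics import median

practice = [86, 109.3, 73.3, 80.6, 86.6, 85.3, 83.3, 78.6, 92, 76]
trial = [82.6, 112.6, 70, 76.6, 84, 86, 82.6, 81.3, 86.6, 75.3]

def five_number_summary(values):
    """Return (minimum, lower hinge, median, upper hinge, maximum).

    Assumed hinge rule: median of the lower / upper half of the sorted data,
    sharing the middle value between the halves when the count is odd.
    """
    ordered = sorted(values)
    half = (len(ordered) + 1) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

for name, scores in (("PRACTICE", practice), ("TRIAL", trial)):
    print(name, five_number_summary(scores))
```

Under that assumption, the results agree with the values listed on the two slides.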

Page 31

Some Descriptive Statistics:

                 PRACTICE      TRIAL
N of cases             10         10
Minimum            73.300     70.000
Maximum           109.300    112.600
Mean               85.100     83.760
Standard Dev       10.133     11.381
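As a quick check, the means and standard deviations can be recomputed from the ten cases. This sketch assumes the reported standard deviation is the sample version (dividing by N - 1), which is what `statistics.stdev` computes; the variable names are made up for this example.

```python
from statistics import mean, stdev

practice = [86, 109.3, 73.3, 80.6, 86.6, 85.3, 83.3, 78.6, 92, 76]
trial = [82.6, 112.6, 70, 76.6, 84, 86, 82.6, 81.3, 86.6, 75.3]

for name, scores in (("PRACTICE", practice), ("TRIAL", trial)):
    # stdev divides the sum of squared deviations by N - 1 (sample standard deviation).
    print(f"{name}: N = {len(scores)}, min = {min(scores)}, max = {max(scores)}, "
          f"mean = {mean(scores):.3f}, standard dev = {stdev(scores):.3f}")
```

Dividing by N instead (`statistics.pstdev`) would give smaller values than those in the table, which supports the N - 1 reading.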

Page 32

Histograms

Page 33

Box and Whisker Plots

Page 36

The 1970 Vietnam War Draft Lottery

http://www.sss.gov/lotter1.htm

Page 39

Reminder: (Simple) Linear Function Y = a + bX

[Figure: the line Y = a + bX plotted against X; the intercept a is marked at the point (0, a) and the slope is b]

We are now interested in this, not for data transformation purposes, but rather to model the relationship between an independent variable X and a dependent variable Y

Page 40

Simple Least-Squares Regression

[Figure: a line in the X-Y plane with intercept a and slope b, shown as a rise of b for a run of 1]

If we had errorless predictions: $Y = a + bX$

Page 41

[Figure: scatter plot of Y against X] A guess at the location of the regression line

Page 42

[Figure: scatter plot of Y against X] Another guess at the location of the regression line (same slope, different intercept)

Page 43

[Figure: scatter plot of Y against X] Initial guess at the location of the regression line

Page 44

[Figure: scatter plot of Y against X] Another guess at the location of the regression line (same intercept, different slope)

Page 45

[Figure: scatter plot of Y against X] Initial guess at the location of the regression line

Page 46

[Figure: scatter plot of Y against X] Another guess at the location of the regression line (different intercept and slope, same “center”)

Page 47

[Figure: scatter plot of Y against X with a region marked around the candidate lines]

We will end up being reasonably confident that the true regression line is somewhere in the indicated region.

Page 48

[Figure: scatter plot of Y against X with the Estimated Regression Line and the errors/residuals marked]

Page 49

[Figure: scatter plot of Y against X with the Estimated Regression Line]

Page 50

[Figure: scatter plot of Y against X with the Estimated Regression Line and error terms that are not drawn vertically]

Wrong Picture!

Error Terms have to be drawn vertically

Page 51

[Figure: scatter plot of Y against X with the Estimated Regression Line; at $x_i$, the observed value $y_i$ and the fitted value $\hat{y}_i$ are marked, with the residual between them]

Equation of the Regression Line: $\hat{Y} = a + bX$

$e_i = y_i - \hat{y}_i$

Page 52

How do we find a and b?

In Least-Squares Regression:

Find a, b to minimize the sum of squared errors/residuals

$$\sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (y_i - b x_i - a)^2$$
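A sketch of how the minimization can be carried out, added here as a standard calculus argument rather than something shown on the slides: write the sum of squares as a function $S(a, b)$, set both partial derivatives to zero, and solve. The name $S$ and the lowercase means $\bar{x}$, $\bar{y}$ are notation chosen for this sketch.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{N} (y_i - b x_i - a)^2 \\[4pt]
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{N} (y_i - b x_i - a) = 0
  \quad\Longrightarrow\quad a = \bar{y} - b\,\bar{x} \\[4pt]
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{N} x_i\,(y_i - b x_i - a) = 0
  \quad\Longrightarrow\quad
  b = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}
\end{aligned}
```

The second implication uses the first: substituting $a = \bar{y} - b\,\bar{x}$ into the second equation and collecting the terms in $b$ gives the ratio shown.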

Page 53

In Least-Squares Regression:

$$b = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}, \qquad a = \bar{Y} - b\,\bar{X}$$

Computational Formula:

$$b = \frac{N \sum_{i=1}^{N} X_i Y_i - \sum_{i=1}^{N} X_i \sum_{i=1}^{N} Y_i}{N \sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2}$$
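Both versions of the slope formula can be tried on the experiment data from earlier in the lecture. The sketch below treats PRACTICE as X and TRIAL as Y, which is an assumption (the slides do not say which score plays which role), and the function names are made up for this example.

```python
practice = [86, 109.3, 73.3, 80.6, 86.6, 85.3, 83.3, 78.6, 92, 76]       # assumed X
trial = [82.6, 112.6, 70, 76.6, 84, 86, 82.6, 81.3, 86.6, 75.3]          # assumed Y

def least_squares(xs, ys):
    """Intercept a and slope b from the deviation-from-the-mean formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def slope_computational(xs, ys):
    """Slope b from the computational formula, which uses raw sums only."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

a, b = least_squares(practice, trial)
b_comp = slope_computational(practice, trial)

# Residuals e_i = y_i - (a + b * x_i): vertical distances from the fitted line.
residuals = [y - (a + b * x) for x, y in zip(practice, trial)]

print(f"a = {a:.3f}, b = {b:.4f}, b (computational) = {b_comp:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
print(f"sum of squared residuals = {sum(e * e for e in residuals):.3f}")
```

The two slope calculations should agree up to rounding, and the residuals of a least-squares line with an intercept always sum to zero (again up to rounding), which gives a simple sanity check on a hand calculation.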

Page 54

Outliers? Influential Data Points?