Author: Phillip E. Pfeifer
© 2012 Phillip E. Pfeifer and Management by the Numbers, Inc.
Descriptive Statistics II
This module covers statistics commonly used to describe the relationship between two numerically-scaled variables (correlation and regression).
Two Kinds of Descriptive Statistics
MBTN | Management by the Numbers
• Measures of Central Tendency
  • Mean
  • Median
  • Mode
• Measures of Variability
  • Range (Maximum – Minimum)
  • Standard Deviation
  • Variance
Descriptive Statistics I covered these six statistical measures used to describe a single numerically-scaled variable. If we have two (or more) variables, we often begin by calculating and examining these statistics for each of the variables of interest.
Example
Heights and Weights of 30 Students (in inches and pounds)*
• Using what we learned in Descriptive Statistics I, we can calculate (and interpret) summary statistics for height and weight.
• These calculations and interpretations are accomplished separately for the two variables.
• Our summary of height ignores weight and vice versa.
*http://www.sci.usq.edu.au/staff/dunn/Datasets/Books/Hand/Hand-R/height-R.html
Student   Height   Weight
   1        53       57
   2        57       72
   3        60      121
   4        61      110
   5        55       70
   6        52       55
   7        59       97
   8        54       68
   9        56       79
  10        57       77
  11        56       61
  12        54       61
  13        61       79
  14        59      105
  15        61       79
  16        52       68
  17        59       74
  18        56       70
  19        65      103
  20        57       81
  21        59      101
  22        58       79
  23        60      103
  24        55       72
  25        56       92
  26        58       70
  27        59       70
  28        56       63
  29        54       74
  30        53       66
Example
Separate Summary Statistics for Height and Weight:
These descriptive statistics were discussed in Descriptive Statistics I.
Notice that none of them measures anything about the relationship between height and weight.
Statistic            Height   Weight
Sample Mean           57.07    79.23
Median                57       74
Mode                  59       70
Standard Deviation     3.07    17.00
Sample Variance        9.44   289.08
Count                 30       30
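The separate summaries above can be reproduced outside Excel. The following is a minimal sketch using Python's standard `statistics` module; the names `HEIGHTS`, `WEIGHTS`, and `summarize` are illustrative and not part of the module.

```python
import statistics

# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def summarize(values):
    """Return the single-variable summary statistics for one list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),    # every most-frequent value
        "stdev": statistics.stdev(values),        # sample standard deviation (n - 1)
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "count": len(values),
    }

height_summary = summarize(HEIGHTS)
weight_summary = summarize(WEIGHTS)
print(height_summary)
print(weight_summary)
```

Using `multimode` rather than a single mode shows a detail the summary table cannot: heights of 56 and 59 inches each occur five times, and weights of 70 and 79 pounds each occur four times, so the reported modes are tied with another value.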
The Scatter Plot
• A great way to begin examining the relationship between two variables is to construct a scatter plot.
• The scatter plots of weight (on the Y-axis) versus height (on the X-axis) and height (on the Y-axis) versus weight (on the X-axis) both show that there is a positive relationship between these two variables.
• Students with greater heights tend to have greater weights (and vice versa).
[Figure: two scatter plots of the same data. Left: weight (pounds) versus height (inches). Right: height (inches) versus weight (pounds).]
The Scatter Plot
• Many of us might say that the relationship between the two variables looks “stronger” in the left plot compared to the right plot.
• But that is nonsense given that both charts plot the same 30 pairs of data.
• The “problem” is one of scaling. Changing the scales on the axes changes the “look” of the plot.
[Figure: the same two scatter plots. Left: weight (pounds) versus height (inches). Right: height (inches) versus weight (pounds).]
The Scatter Plot
The first scatter plots
New scatter plots created by changing the scale of the axes. By changing the scales, notice how the height vs. weight looks “stronger” than weight vs. height, the opposite of the “look” above.
[Figure: four scatter plots of the same 30 data pairs. The first pair shows weight (pounds) versus height (inches) and height versus weight on the original axes; the second pair shows the same two plots with rescaled axes.]
The Scatter Plot
• The “look” of a scatter plot depends on the scales used for the axes.
• Be aware of this as you interpret scatter plots (and charts in general).
• As a consequence, we need a statistic that measures both the direction (positive, negative, or zero) and the strength of the relationship between two variables.
• If high values of X tend to be paired with high values of Y (and low with low), the statistic should be positive; if high values of X tend to be paired with low values of Y, it should be negative.
• If the value of X is of no help in predicting Y (and vice versa), the statistic should equal zero.
• If the value of X is a perfect predictor of Y (and vice versa), the statistic should be either +1 or -1, in which case all the points in the scatter plot will fall on a straight line.
The Correlation Coefficient
Insights
The correlation coefficient measures both the direction and strength of the relationship between two numerically-scaled variables.
The correlation of X and Y equals the correlation of Y and X.
The correlation doesn’t depend on the scales used in the scatter plot, and doesn’t even depend on the scales used to measure the variables.
• Convert the heights to centimeters and/or the weights to kilos, and the correlation coefficient won’t change.
Definition
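Correlation Coefficient = r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
Excel Function = CORREL(array1, array2)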
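As a cross-check on the Excel function, the correlation coefficient can be computed by hand. This is a minimal sketch assuming the standard Pearson definition (the quantity Excel's CORREL returns); the function name `pearson_r` is illustrative.

```python
import math

# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def pearson_r(xs, ys):
    """Pearson correlation: joint variation of x and y over the product of their spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(HEIGHTS, WEIGHTS)
print(round(r, 2))  # 0.72
```

Swapping the arguments, `pearson_r(WEIGHTS, HEIGHTS)`, gives the same value, which illustrates that the correlation of X and Y equals the correlation of Y and X.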
The Correlation Coefficient
For this data set, the correlation of Height and Weight is 0.72.
• It is positive, as expected.
• And it appears to be high (close to one).
[Figure: the two scatter plots of the height and weight data, weight (pounds) versus height (inches) and height (inches) versus weight (pounds)]
Example Correlation Coefficients
[Figure: five example scatter plots of Y versus X, with correlation coefficients of -1, -0.6, 0, +0.6, and +1]
Points Lined Up on a Flat Line?
[Figure: scatter plot of Y versus X in which every point has Y = 5, so the points form a flat line]
Points all on a line; the correlation should be +1 or -1?
But the line is flat; the correlation coefficient should be 0?
Math to the rescue; the correlation is 0/0, which is UNDEFINED.
Insight
In order to measure the relationship between two variables, both have to exhibit variability. Since Y was always 5, we can’t tell whether Y goes up or down with X. Y never changed!
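The 0/0 case can be made concrete with a small sketch. It assumes the standard Pearson formula; the X values below are hypothetical, and only "Y is always 5" comes from the slide. The function returns `None` when either variable has no variability instead of dividing by zero.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation, or None when it is undefined (0/0)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a variable that never varies cannot be related to anything
    return sxy / math.sqrt(sxx * syy)

xs = list(range(4, 13))  # hypothetical X values
ys = [5] * len(xs)       # Y is always 5, as on the slide
flat_r = pearson_r(xs, ys)
print(flat_r)  # None
```

Both the numerator (joint variation) and the denominator (the spread of Y) are zero here, which is why neither 0 nor ±1 is the right answer.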
Points Lined Up on a Curved Line?
X appears to be a PERFECT predictor of Y.
However, the relationship is NOT linear.
The correlation coefficient for these data is ZERO!
Insight
The correlation coefficient measures the direction and strength of a LINEAR relationship between X and Y.
Because the best straight line through these data is a flat one, X and Y are uncorrelated.
[Figure: scatter plot of Y versus X in which the points line up on a curved line]
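A perfect but nonlinear relationship with zero correlation is easy to construct. The sketch below uses hypothetical data (a parabola centered at X = 5), which is an assumption and not the data behind the slide's chart.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: Y is determined exactly by X, but along a symmetric curve.
xs = list(range(11))                 # X = 0, 1, ..., 10
ys = [8 * (x - 5) ** 2 for x in xs]  # Y falls to 0 at X = 5, then rises again

curve_r = pearson_r(xs, ys)
print(curve_r)
```

The falling left half and the rising right half of the curve cancel each other, so the best straight line through the points is flat even though X determines Y exactly.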
Correlation Versus Causation
The correlation coefficient measures the direction and strength of a possible LINEAR relationship between X and Y in the observed data.
• Just because X and Y moved together in the past does not mean X caused Y or that Y caused X.
• Both could have been caused by something else (Z?).
• They could have moved together just by chance.
• Just because X and Y are uncorrelated does not mean that X did not cause Y.
• Refer to the previous slide. If X causes Y in a nonlinear manner, the correlation can come out to be zero.
Properties of the Correlation Coefficient
• The scales used for the chart and the units used for the variables (pounds or kilograms, centimeters or inches) do not change the correlation coefficient, as discussed in previous slides.
• In Descriptive Statistics I we learned how adding and multiplying by constants changes the descriptive statistics (mean, standard deviation, range, etc.). How do these operations affect the correlation coefficient?
Properties of the Correlation Coefficient
What happens to the correlation coefficient if we add and/or multiply X and/or Y by some non-zero constants?
If X and Y are positively correlated, X and –Y will be negatively correlated.
• Adding a constant to X and/or Y will not change the correlation coefficient.
• Multiplying X and/or Y by a non-zero constant never changes the magnitude of the correlation coefficient. The sign flips only when exactly one of the two variables is multiplied by a negative constant.
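These properties can be checked numerically on the height and weight data. The sketch below assumes the standard Pearson formula; the conversion factors (2.54 cm per inch, 0.45359237 kg per pound) and the variable names are illustrative additions.

```python
import math

# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(HEIGHTS, WEIGHTS)

# Adding a constant (for example, correcting every weight by +5 pounds).
r_shifted = pearson_r(HEIGHTS, [w + 5 for w in WEIGHTS])

# Multiplying by positive constants (unit changes: inches to cm, pounds to kg).
r_metric = pearson_r([h * 2.54 for h in HEIGHTS], [w * 0.45359237 for w in WEIGHTS])

# Multiplying exactly one variable by a negative constant flips the sign.
r_negated = pearson_r(HEIGHTS, [-w for w in WEIGHTS])

print(round(r, 3), round(r_shifted, 3), round(r_metric, 3), round(r_negated, 3))
```

Up to floating-point rounding, the first three values agree and the fourth is the same number with the opposite sign.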
The Correlation Coefficient
Question 1: The correlation coefficient for the height and weight data was 0.72. If the device used to measure these weights under-stated each weight by 5 pounds, what will be the correlation between height and the corrected weights?
Answer:
0.72!
Adding 5 to each weight will not change the correlation coefficient.
The Correlation Coefficient
Question 2: Over the course of a year, each student’s height increased 5% and each weight increased 2%. What is the new correlation coefficient?
Answer:
0.72!
Multiplying all heights by 1.05 and all weights by 1.02 will not change the correlation coefficient.
The Correlation Coefficient
Question 3: If the tallest student loses 5 pounds and the shortest student gains 5 pounds, what will happen to the correlation coefficient?
Answer:
It will be less than 0.72.
Since the data started out being positively correlated, moving the Y value for a high X down and the Y value for a low X up will make the scatter plot flatter. The correlation coefficient will be less than 0.72.
In contrast, if the tallest student gained weight and/or the shortest student lost weight, the correlation coefficient would increase.
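The answer to Question 3 can be verified directly. In this sketch the tallest student is the one at 65 inches; two students tie for shortest at 52 inches, so picking the first of them is an assumption made here for illustration.

```python
import math

# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

tallest = HEIGHTS.index(max(HEIGHTS))   # position of the 65-inch student
shortest = HEIGHTS.index(min(HEIGHTS))  # assumption: first of the two 52-inch students

new_weights = list(WEIGHTS)
new_weights[tallest] -= 5   # tallest student loses 5 pounds
new_weights[shortest] += 5  # shortest student gains 5 pounds

r_before = pearson_r(HEIGHTS, WEIGHTS)
r_after = pearson_r(HEIGHTS, new_weights)
print(round(r_before, 3), round(r_after, 3))
```

For this data set the second value is smaller than the first, in line with the answer given above.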
The Regression Line
• The correlation between height and weight was 0.72.
• So we know the relationship is positive (taller students tend to weigh more), and 0.72 measures the “strength” of the relationship.
• But other than as a relative measure of “strength,” is there any other direct use for the correlation coefficient?
Not Really!
[Figure: scatter plot of weight (pounds) versus height (inches)]
The Regression Line
• So if the correlation coefficient left you longing for something a little more useful, you are going to like the regression line.
• Since height and weight are correlated, we should be able to use one to help predict the other.
• The regression line is the way to accomplish that prediction task.
• If a new student is 61 inches tall, how can we predict what that student will weigh?
[Figure: scatter plot of weight (pounds) versus height (inches)]
The Regression Line
If a new student is 61 inches tall, how can we predict what that student will weigh?
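• One approach would be to predict (110+79)/2 = 94.5 pounds, the midpoint of the two weights observed at 61 inches. (The table actually lists three 61-inch students, one at 110 pounds and two at 79 pounds, so the plot shows only two distinct points; averaging all three would give 89.3 pounds.)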
[Figure: scatter plot of weight (pounds) versus height (inches), with the two points at 61 inches labeled. One 61-inch tall student weighed 110 pounds. The other 61-inch tall student weighed 79 pounds.]
The Regression Line
If a new student is 61 inches tall, how can we predict what that student will weigh?
• But rather than base the prediction on only two data values, a regression line lets us use ALL the data.
• If you have charted the data in Excel, it is very easy to find the regression line.
[Figure: scatter plot of weight (pounds) versus height (inches)]
The Regression Line
Finding the Regression Line
• Right-click on the charted data.
• Select “Add Trendline”.
• Select “Display Equation on chart” and “Display R-squared value on chart”.
[Figure: scatter plot of weight (pounds) versus height (inches)]
The Regression Line
Finding the Regression Line
• We ran a regression of weight on height using the “Add Trendline” option on the chart in Excel.*
• Weight was the Y or dependent variable
• Height was the X or independent variable
• We regressed weight on height to find an equation (the regression line) that can be used to predict weight based on height.
[Figure: scatter plot of weight (pounds) versus height (inches) with the regression line (Excel output): f(x) = 3.97103213242454 x − 147.38023369036, R² = 0.515142660874044]
*Note that the Excel Analysis ToolPak add-in offers an alternative way to run a regression that provides a more complete set of regression output, but that is beyond the scope of this module.
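For readers working outside Excel, the same line can be found with the usual least-squares formulas. This is a sketch under the assumption that Excel's linear trendline is an ordinary least-squares fit; `fit_line` and `predict` are illustrative names.

```python
# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def fit_line(xs, ys):
    """Least-squares regression of y on x; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(height, slope, intercept):
    """Predicted weight (pounds) for a given height (inches)."""
    return slope * height + intercept

slope, intercept = fit_line(HEIGHTS, WEIGHTS)
print(f"Predicted weight = {slope:.3f} * height + ({intercept:.2f})")
print(round(predict(61, slope, intercept), 1))  # 94.9
```

Weight is passed as the second argument because it is the Y (dependent) variable; swapping the arguments would fit a different line.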
The Regression Line
Finding the Regression Line
• Predicted weight = 3.971 * height – 147.38
• For the new student…
• Predicted weight = 3.971 * 61 – 147.38
• Therefore, predicted weight = 94.9 pounds.
[Figure: scatter plot of weight (pounds) versus height (inches) with the regression line, f(x) = 3.971 x − 147.380, R² = 0.515]
The Regression Line
Predicted Weight = 3.971 * Height – 147.38
• The number 3.971 is called the regression coefficient (the slope of the line).
• The number –147.38 is called the regression intercept.
• If X and Y are positively correlated, the regression coefficient will be positive.
• If X and Y are negatively correlated, the regression coefficient will be negative.
• If X and Y are uncorrelated, the regression coefficient will be zero.
[Figure: scatter plot of weight (pounds) versus height (inches) with the regression line, f(x) = 3.971 x − 147.380, R² = 0.515]
The Regression Line
Predicted Weight = 3.971 * Height – 147.38
• We also asked Excel to calculate and display the R-squared for the regression.
• For this regression, R-squared was 0.5151.
• R-squared is also a measure of the strength of the linear relationship.
• So both the correlation coefficient and R-squared measure the strength of the linear relationship? Why do we need two?
• We don’t. One is simply the square of the other.
• The square of the correlation coefficient is the R-squared.
• 0.718^2 = 0.515
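The identity between R-squared and the squared correlation can be checked by computing each one separately. The sketch below assumes the common definition of R-squared as one minus the ratio of unexplained to total variation in Y; the variable names are illustrative.

```python
import math

# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

n = len(HEIGHTS)
mean_h, mean_w = sum(HEIGHTS) / n, sum(WEIGHTS) / n
sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(HEIGHTS, WEIGHTS))
sxx = sum((h - mean_h) ** 2 for h in HEIGHTS)
syy = sum((w - mean_w) ** 2 for w in WEIGHTS)

# Correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Regression of weight on height, then R-squared from the prediction errors.
slope = sxy / sxx
intercept = mean_w - slope * mean_h
sse = sum((w - (slope * h + intercept)) ** 2 for h, w in zip(HEIGHTS, WEIGHTS))
r_squared = 1 - sse / syy  # share of the variation in weight explained by the line

print(round(r, 3), round(r ** 2, 3), round(r_squared, 3))  # 0.718 0.515 0.515
```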
[Figure: scatter plot of weight (pounds) versus height (inches) with the regression line, f(x) = 3.971 x − 147.380, R² = 0.515; the R-squared value is highlighted]
The Regression Line
Predicted Weight = 3.971 * Height – 147.38
• One way to think about the correlation coefficient is as a summary of the regression of Y on X.
• The sign of the correlation coefficient tells you the sign of the regression coefficient.
• And the square of the correlation coefficient tells you the R-squared of the regression, a measure of the ability of the regression line to predict Y.
[Figure: scatter plot of weight (pounds) versus height (inches) with the regression line, f(x) = 3.971 x − 147.380, R² = 0.515. The R-squared is the square of the correlation coefficient.]
The Regression Line: Example
Question 4: A new student is surprisingly short, just 50 inches tall. What is the predicted weight of this new student based on the regression line?
Answer:
Just substitute X=50 into the regression equation.
Predicted Weight = 3.971 * 50 – 147.38 = 51.2 pounds.
Regression Line: Predicted Weight = 3.971 * Height – 147.38
The Regression Line
Question 5: A new student is of average height (57.07 inches from the summary statistics given earlier). Will this new student weigh more or less than the average?
Answer:
Substitute X=57.07 into the regression equation.
Predicted Weight = 3.971 * 57.07 – 147.38 = 79.23 pounds (using the unrounded coefficients and mean; the rounded figures shown give 79.24).
The sample mean weight of the 30 students was also 79.23 pounds.
THIS IS NOT A COINCIDENCE!
The regression prediction for the sample mean X is ALWAYS the sample mean of the Y’s.
Regression Line: Predicted Weight = 3.971 * Height – 147.38
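The reason this is not a coincidence shows up in how the intercept is computed: it is chosen so that the line passes through the point of means. A minimal sketch, assuming an ordinary least-squares fit (names are illustrative):

```python
# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

n = len(HEIGHTS)
mean_h, mean_w = sum(HEIGHTS) / n, sum(WEIGHTS) / n
sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(HEIGHTS, WEIGHTS))
sxx = sum((h - mean_h) ** 2 for h in HEIGHTS)

slope = sxy / sxx
intercept = mean_w - slope * mean_h  # forces the line through (mean_h, mean_w)

prediction_at_mean = slope * mean_h + intercept
print(round(mean_h, 2), round(prediction_at_mean, 2), round(mean_w, 2))  # 57.07 79.23 79.23
```

Because the intercept is defined as the mean of Y minus the slope times the mean of X, substituting the mean of X always returns the mean of Y, whatever the data.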
The Regression Line
Question 6: Using the same 30 data points, suppose we regress height on weight (rather than weight on height). Will the resulting regression coefficient be positive, negative, or zero? What will be the resulting R-squared?
Answer:
Because the correlation between X and Y is the same as the correlation between Y and X, the regression of Y on X has the same R-squared as the regression of X on Y.
R-squared for the new regression will be 0.515
The coefficient will be positive (because the variables are positively correlated), but it will not equal 1/3.971. To find the new coefficient, one has to run the regression.
Regression Line: Predicted Weight = 3.971 * Height – 147.38
R-squared = 0.515
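Running the reverse regression confirms the answer. The sketch below assumes ordinary least-squares fits in both directions; `fit_slope` is an illustrative helper, not something from the module.

```python
# Heights (inches) and weights (pounds) of the 30 students, in student order.
HEIGHTS = [53, 57, 60, 61, 55, 52, 59, 54, 56, 57, 56, 54, 61, 59, 61,
           52, 59, 56, 65, 57, 59, 58, 60, 55, 56, 58, 59, 56, 54, 53]
WEIGHTS = [57, 72, 121, 110, 70, 55, 97, 68, 79, 77, 61, 61, 79, 105, 79,
           68, 74, 70, 103, 81, 101, 79, 103, 72, 92, 70, 70, 63, 74, 66]

def fit_slope(xs, ys):
    """Least-squares regression coefficient (slope) for the regression of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

slope_weight_on_height = fit_slope(HEIGHTS, WEIGHTS)  # pounds per inch
slope_height_on_weight = fit_slope(WEIGHTS, HEIGHTS)  # inches per pound

print(round(slope_weight_on_height, 3))        # 3.971
print(round(slope_height_on_weight, 3))        # 0.13
print(round(1 / slope_weight_on_height, 3))    # 0.252, not the reverse slope
print(round(slope_weight_on_height * slope_height_on_weight, 3))  # 0.515
```

The product of the two regression coefficients equals the R-squared, so the reverse coefficient would equal 1/3.971 only if the correlation were exactly +1 or -1.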
Descriptive Statistics - Further Reference
Any introductory statistics book, such as Introductory Statistics (9th Edition), Neil A. Weiss, Pearson Publishing, 2010.