Upload
truongcong
View
267
Download
0
Embed Size (px)
Citation preview
مار و احتمالمار و احتمالا ا
کارشناسی ارشدکارشناسی ارشد
MBAMBA دانشکده مدیریت دانشگاه تهراندانشکده مدیریت دانشگاه تهران
دکـتر عموزاددکـتر عموزاد: : مدرسمدرس
چارچوب و طبقه بندی کلی
مار و احتمال ا
مار )بررسی نمیشود(احتما.ت مار )بررسی نمیشود(احتما.ت ا ا
مار استنباطی تخمین / نمونه گیری (ا
زمون )و ا
مار ناپارامتریکمار پارامتریک ا
ا
مار توصیفی بررسی / سرشماری (ا
بررسی نمیشود - )جوامع
سرفصل های پیشنهادی درس
ردي
ف
توضيحاتتعداد جلساتموضوع
*1گيری-نمونه ھای- روش و آمار معرفی و مبانی1
*1 مرکزی حد قضيه – مطلوب آماره و پارامتر و آماره2
*3)موفقيت نسبت – واريانس – زوج – ميانگين( ای-فاصله تخمين3
*1نمونه اندازه محاسبه4
*3آن انواع و آماری فرض آزمون5
*1دو خی توزيع کاربردھای6
*1ھمبستگی و رگرسيون7
*SPSS1 افزار-نرم8
منابع پیشنهادی درس
انتشارات –مومنی منصور آذر، عادل –مديريت در آن کاربرد و آمار -1
.سمت
مرکز – اصل وحيدی قاسم محمد عميدی، علی –فروند جان –رياضی آمار -2
.دانشگاھی نشر
–قيومی فعال علی مومنی، منصور – افزار نرم از استفاده با آماری تحليل -3
.نو کتاب نشر
منابع پیشنهادی درس
فصل اولفصل اول
مفاهیم اولیهمفاهیم اولیه
Key Definitions
� A population is the collection of all items or
things under consideration –people or objects
� A sample is a portion of the population
selected for analysisselected for analysis
� A parameter is a summary measure that
describes a characteristic of the population
� A statistic is a summary measure computed
from a sample
Population vs. Sample
a b c d
Population Sample
b c
ef gh i jk l m n
o p q rs t u v w
x y z
g i n
o r u
y
Measures used to describe the population are called parameters
Measures computed from sample data are called statistics
Key Definitions
� A survey is the gathering of data about a
particular group of people or items
� A census is a survey of the entire population
� A sample is a survey of a portion of the
population
Two Branches of Statistics
� Descriptive statistics
� Collecting, summarizing, and describing data
� Inferential statistics� Inferential statistics
� Drawing conclusions and/or making decisions
concerning a population based only on sample
data
Descriptive Statistics
� Collect data
� e.g. Survey
Present data� Present data
� e.g. Tables and graphs
� Characterize data
� e.g. Sample mean = iX
n
∑
Inferential Statistics
� Estimation
� e.g.: Estimate the population
mean weight using the sample
mean weight
� Hypothesis testing
� e.g.: Test the claim that the
population mean weight is over
120 pounds
Drawing conclusions and/or making decisions concerning a population based on sample results.
Data Sources
SecondaryData Compilation
Print or Electronic
PrimaryData Collection
Observation
Experimentation
Print or Electronic
Survey
Types of Data
Data
Categorical(Qualitative)
Numerical(Quantitative) (Qualitative) (Quantitative)
Discrete Continuous
Examples:
� Marital Status
� Political Party� Eye Color
(Defined categories)Examples:
� Number of Children
� Defects per hour
(Counted items)
Examples:
� Weight
� Voltage
(Measured characteristics)
Levels of Measurementand Measurement Scales
Interval Data
Highest Level
Strongest forms of measurementDifferences between
measurements but no
Ratio DataDifferences between measurements, true zero exists
Interval Data
Ordinal Data
Nominal Data
Higher Level
Lowest Level
Weakest form of measurement
Categories (no ordering or direction)
Ordered Categories (rankings, order, or scaling)
measurements but no true zero
Tables and Charts forNumerical Data
Numerical Data
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-16
Ordered Array
Stem and Leaf
DisplayHistogram Polygon Ogive
Frequency Distributions and
Cumulative Distributions
� Data in raw form (as collected):
24, 26, 24, 21, 27, 27, 30, 41, 32, 38
(continued)
The Ordered Array
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-17
� Data in ordered array from smallest to largest:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Example
� Completed Stem-and-leaf diagram:
(continued)
Data in ordered array:21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-18
� Completed Stem-and-leaf diagram:
Stem Leaves
2 1 4 4 6 7 7
3 0 2 8
4 1
Class Intervals and Class Boundaries
� If each class grouping has the same width
� Determine the width of each interval by
range≅intervalofWidth
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-19
� Use at least 5 but no more than 15 groupings
� Class boundaries never overlap
� Round up the interval width to get desirable endpoints
groupingsclassdesiredofnumber≅intervalofWidth
Approximate Number of Classes
Observations Classes
Less than 50 3-6
50-200 6-9
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-20
� Excel uses the square root of n.
50-200 6-9
200-1000 8-12
More than 1000 10-15
Graphing Numerical Data: The Histogram
� A graph of the data in a frequency distribution
is called a histogram
� The class boundaries (or class midpoints)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-21
are shown on the horizontal axis
� frequency is measured on the vertical axis
� Bars of the appropriate heights can be used to
represent the number of observations within
each class
Histogram : Daily High Tem perature
67
Histogram Example
Class
10 but less than 20 15 3
20 but less than 30 25 6
30 but less than 40 35 5
40 but less than 50 45 4
FrequencyClass
Midpoint
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-22
0
3
6
5
4
2
00
1
2
3
4
5
6
7
5 15 25 35 45 55 More
Fre
qu
en
cy
Class Midpoints
(No gaps between
bars)
50 but less than 60 55 2
Frequency Polygon: Daily High Temperature
7
Graphing Numerical Data: The Frequency Polygon
Class
1 but less than 10 5 0
10 but less than 20 15 3
20 but less than 30 25 6
30 but less than 40 35 5
FrequencyClass
Midpoint
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-23
0
1
2
3
4
5
6
7
5 15 25 35 45 55 More
Fre
qu
en
cy
Class Midpoints
40 but less than 50 45 4
50 but less than 60 55 2
More than 60 65 0
(In a percentage polygon the vertical axis would be defined to show the percentage of observations per class)
Graphing Cumulative Frequencies: The Ogive (Cumulative % Polygon)
Ogive: Daily High Temperature
100
Class
less than 10 10 0
less than 20 20 15
less than 30 30 45
less than 40 40 70
Cumulative Percentage
Lower class
boundary
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-24
0
20
40
60
80
100
10 20 30 40 50 60
Cu
mu
lati
ve
Pe
rce
nta
ge
Class Boundaries (Not Midpoints)
less than 50 50 90
less than 60 60 100
Scatter Diagram Example
Cost per Day vs. Production Volume
200
250C
os
t p
er
Da
y
Volume per day
Cost per day
23 125
26 140
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-25
0
50
100
150
200
0 10 20 30 40 50 60 70
Volume per Day
Co
st
pe
r D
ay
29 146
33 160
38 167
42 170
50 188
55 195
60 200
Tables and Charts for Categorical Data: Univariate Data
Categorical Data
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-26
Graphing Data
Pie Charts
Pareto Diagram
Bar Charts
Tabulating Data
Summary Table
Bar Chart Example
Investor's Portfolio
Investment Amount PercentageType (in thousands $) (%)
Stocks 46.5 42.27
Bonds 32.0 29.09
Current Investment Portfolio
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-27
Investor's Portfolio
0 10 20 30 40 50
Stocks
Bonds
CD
Savings
Amount in $1000's
CD 15.5 14.09
Savings 16.0 14.55
Total 110.0 100.0
Pie Chart Example
Current Investment Portfolio
Savings
Investment Amount PercentageType (in thousands $) (%)
Stocks 46.5 42.27
Bonds 32.0 29.09
CD 15.5 14.09
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-28
Percentages are rounded to the nearest percent
15%
CD 14%
Bonds 29%
Stocks
42%
CD 15.5 14.09
Savings 16.0 14.55
Total 110.0 100.0
Pareto Diagram Examplecu
mu
lativ
e %
investe
d in
veste
d i
n e
ach
cate
go
ry
35%
40%
45%
70%
80%
90%
100%
Current Investment Portfolio
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 2-29
cu
mu
lativ
e %
investe
d
(line g
rap
h)
% i
nveste
d i
n e
ach
cate
go
ry
(bar
gra
ph
)
0%
5%
10%
15%
20%
25%
30%
Stocks Bonds Savings CD
0%
10%
20%
30%
40%
50%
60%
70%
فصل دوم
شاخص های اولیه
Summary Measures
Arithmetic Mean
Describing Data Numerically
Range Skewness
Central Tendency Variation ShapeQuartiles
Arithmetic Mean
Median
Mode Variance
Standard Deviation
Coefficient of Variation
Range
Interquartile Range
Geometric Mean
Skewness
Measures of Central Tendency
Central Tendency
Arithmetic Mean Median Mode Geometric Mean
Overview
Arithmetic Mean Median Mode Geometric Mean
n
X
X
n
ii∑
== 1
n/1n21G )XXX(X ×××= L
Midpoint of ranked values
Most frequently observed value
(Arithmetic Mean)
n/1n21G )XXX(X ×××= L
Sample size
n
XXX
n
X
X n21
n
1ii +++
==∑
= L
Observed values
Arithmetic Mean
(continued)
0 1 2 3 4 5 6 7 8 9 10
Mean = 3
0 1 2 3 4 5 6 7 8 9 10
Mean = 4
35
15
5
54321==
++++4
5
20
5
104321==
++++
(Median)
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10
Median = 3 Median = 3
(Mode)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Mode = 9
0 1 2 3 4 5 6
No ModeMode = 9
No Mode
Review Example
$ 2 ,0 0 0 K
House Prices:
$2,000,000$ 5 0 0 K
$ 3 0 0 K
$ 1 0 0 K
$ 1 0 0 K
$2,000,000500,000300,000100,000100,000
(Quartiles)
25% 25% 25% 25%
� The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
� Q2 is the same as the median (50% are smaller, 50% are larger)
� Only 25% of the observations are greater than the third quartile
Q1 Q2 Q3
Quartile Formulas
First quartile position: Q1 = (n+1)/4
Second quartile position: Q2 = (n+1)/2 (the median position)Second quartile position: Q2 = (n+1)/2 (the median position)
Third quartile position: Q3 = 3(n+1)/4
where n is the number of observed values
(Measures of Variation)
ความผนแปร
Variance Standard
Deviation
Coefficient of
Variation
Range Interquartile
Range
Same center,
different variation
Deviation VariationRange
(Range)
Range = Xlargest – Xsmallest
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Range = 14 - 1 = 13
7 8 9 10 11 12
Range = 12 - 7 = 5
7 8 9 10 11 12
Range = 12 - 7 = 5
Disadvantages of the Range
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 5 - 1 = 4
Range = 120 - 1 = 119
Interquartile Range
� Interquartile range = 3rd quartile – 1st quartile
= Q3 – Q1
Interquartile Range
Median(Q2)
XmaximumX
minimum Q1 Q3
Example:
25% 25% 25% 25%25% 25% 25% 25%
12 30 45 57 70
Interquartile range = 57 – 30 = 27
� Sample variance:
(Variance)
)X(Xn
2∑ −
1-n
)X(X
S 1i
2i
2∑
=
−
=
Where = arithmetic mean
n = sample size
Xi = ith value of the variable X
X
(Standard Deviation)
� Sample standard deviation:
1-n
)X(X
S
n
1i
2i∑
=
−
=
Calculation Example:
Sample Standard Deviation
Sample Data (Xi) : 10 12 14 15 17 18 18 24
n = 8 Mean = X = 16
)X(2 4)X(1 4)X(1 2)X(1 0 2222 −++−+−+− L
4 .2 4 2 67
1 2 6
18
1 6 )(2 41 6 )(1 41 6 )(1 21 6 )(1 0
1n
)X(2 4)X(1 4)X(1 2)X(1 0S
2222
2222
==
−
−++−+−+−=
−
−++−+−+−=
L
L
Measuring variation
Small standard deviation
Large standard deviation
Comparing Standard Deviations
Mean = 15.5
S = 3.33811 12 13 14 15 16 17 18 19 20 21
Data B
Data A
11 12 13 14 15 16 17 18 19 20 21
Data BMean = 15.5
S = .9258
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5
S = 4.57
Data C
(Coefficient of Variation, CoV)
1 0 0 %X
SC V ⋅
=
Comparing Coefficients
of Variation
� Stock A:
� Average price last year = $50
� Standard deviation = $5
1 0 %1 0 0 %$ 5
1 0 0 %S
C V =⋅=⋅
=
� Stock B:
� Average price last year = $100
� Standard deviation = $5
Both stocks have the same standard deviation, but stock B is less variable relative to its price
1 0 %1 0 0 %$ 5 0
1 0 0 %X
C V A =⋅=⋅
=
5 %1 0 0 %$ 1 0 0
$ 51 0 0 %
X
SC VB =⋅=⋅
=
(Shape of a Distribution)
� (skewed)
Mean = MedianMean < Median Median < Mean
Right-SkewedLeft-Skewed Symmetric
Population Summary Measures
XN
∑N
XXX
N
XN211i
i +++==µ
∑= L
µ = population mean
N = population size
Xi = ith value of the variable X
Where
� squared deviations of values from the mean
� Population variance:
Population Variance
µ)(XN
2i∑ −
Nσ 1i
i2∑
==
Where µ = population mean
N = population size
Xi = ith value of the variable X
Population Standard Deviation
� Population standard deviation:
N
µ)(X
σ
N
1i
2i∑
=
−
=
The Empirical Rule
1σµ ±
µ
68%
1σµ ±
� contains about 95% of the values in
the population or the sample
� contains about 99.7% of the values in the population or the sample
The Empirical Rule
2σµ ±
3σµ ±in the population or the sample
3σµ±
99.7%95%
2σµ±
Distribution Shape and Box and Whisker Plot
Right-SkewedLeft-Skewed Symmetric
Q1 Q2 Q3 Q1 Q2 Q3 Q1 Q2 Q3
Coefficient of Correlation
� Measures the relative strength of the linear relationship between two variables
� Sample coefficient of correlation:� Sample coefficient of correlation:
∑∑
∑
==
=
−−
−−
=n
i
i
n
i
i
n
i
ii
YYXX
YYXX
r
1
2
1
2
1
)()(
))((
Features of Correlation Coefficient, r
� Unit free
� Ranges between –1 and 1
� The closer to –1, the stronger the negative linear
relationshiprelationship
� The closer to 1, the stronger the positive linear
relationship
� The closer to 0, the weaker any linear relationship
Scatter Plots of Data with Various Correlation Coefficients
Y
X
Y
X
Y
XX X X
Y
X
Y
X
r = -1 r = -.6 r = 0
r = +.3r = +1
Y
Xr = 0
فصل سوم
ماری معرفی برخی توزیع های ا
Probability Distributions
ContinuousProbability
Probability Distributions
DiscreteProbability Probability
Distributions
Binomial
Hypergeometric
Poisson
Probability Distributions
Normal
Uniform
Exponential
The Normal Distribution
Probability Distributions
ContinuousProbability
Normal
Uniform
Exponential
Probability Distributions
The Normal Distribution
� ‘Bell Shaped’
� Symmetrical
� Mean, Median and Modeare Equal
Location is determined by the
f(X)
σLocation is determined by the mean, µ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: + ∞∞∞∞ to −−−− ∞∞∞∞
Mean = Median = Mode
Xµ
σ
Many Normal Distributions
By varying the parameters µ and σ, we obtain different normal distributions
The Normal Distribution Shape
f(X) Changing µ shifts the distribution left or right.
Xµ
σ
Changing σ increases or decreases the spread.
The Normal Probability Density Function
� The formula for the normal probability density function is
2µ)/σ](1/2)[(Xe1
f(X) −−=
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
µ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
µ)/σ](1/2)[(Xe2π
1f(X) −−
σ=
The Standardized Normal
� Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardized normaldistribution (Z)distribution (Z)
� Need to transform X units into Z units
Translation to the Standardized Normal Distribution
� Translate from X to the standardized normal (the “Z” distribution) by subtracting the meanof X and dividing by its standard deviation:
σ
µXZ
−=
Z always has mean = 0 and standard deviation = 1
The Standardized Normal Probability Density Function
� The formula for the standardized normal probability density function is
2(1/2)Ze1
f(Z) −=
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal distribution
(1/2)Ze2π
1f(Z) −=
The Mathematical Model
( )( )
( )
21
2
2
1
2
: density of random variable
X
f X e
f X X
µσ
πσ
2− −
=
( )
( )
: density of random variable
3.14159; 2.71828
: population mean
: population standard deviation
: value of random variable
f X X
e
X X
π
µ
σ
= =
−∞ < < ∞
Expectation
( )µµ σµ
σµ
σπ
+−−=
=
∫
∫∞
−−
∞
∞−
−−
)(
)(
22
22
2/)(
2/)(
2
1
xdex
dxxeXE
x
x
( )
µ
µ
µµ
σµ
σπ
σµ
σπ
+=
+−−=
∫
∫∞
∞−
−−
∞
∞−
−−
0
)(
22
22
2/)(
2
1
2/)(
2
1
dxe
xdex
x
x
Variance
( ) ( )µ σ
µ
σ
µ
σπσ
µ
)(2/)(2
2
122
=− ∫∞
∞−
−−−−
deXExx
x
( ) ( )
π
µ
π
σ
π
σ
σσσπ
2
)(
2
2/2
2
2
2
22
=
=
=−
∫
∫∞
∞−
−
∞−
dyey
deXE
y
The StandardizedNormal Distribution
� Also known as the “Z” distribution
� Mean is 0
� Standard Deviation is 1
f(Z)
Z
f(Z)
0
1
Values above the mean have positive Z-values,values below the mean have negative Z-values
Example
� If X is distributed normally with mean of 100and standard deviation of 50, the Z value for X = 200 is
1 0 02 0 0µX −−
� This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
2 .05 0
1 0 02 0 0
σ
µXZ =
−=
−=
Comparing X and Z units
Z
100
2.00
200 X
Note that the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z)
(µ = 100, σ = 50)
(µ = 0, σ = 1)
Finding Normal Probabilities
Probability is the area under thecurve!
f(X) P a X b( )≤
Probability is measured by the area under the curve
≤
a b X
f(X) P a X b( )≤≤
P a X b( )<<=
(Note that the probability of any individual value is zero)
f(X)
Probability as Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below
0 .5)XP (µ =∞<<0 .5µ )XP ( =<<− ∞
Xµ
0.50.5
1 .0)XP ( =∞<<− ∞
0 .5)XP (µ =∞<<
Empirical Rules
f(X)
What can we say about the distribution of values around the mean? There are some general rules:
µ ± 1σ encloses about
68% of X’s
Xµ µ+1σµ-1σ
σσ
68.26%
The Empirical Rule
� µ ± 2σ covers about 95% of X’s
� µ ± 3σ covers about 99.7% of X’s
(continued)
xµ
2σ 2σ
xµ
3σ 3σ
95.44% 99.72%
The Standardized Normal Table
� The Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value for Z (i.e., from negative infinity to Z)(i.e., from negative infinity to Z)
Z0 2.00
.9772Example:
P(Z < 2.00) = .9772
The Standardized Normal Table
The column gives the value of Z to the second decimal point
(continued)
Z 0.00 0.01 0.02 …
The value within the table gives the probability from Z = −−−− ∞∞∞∞up to the desired Z value
.9772
2.0P(Z < 2.00) = .9772
The row shows the value of Z to the first decimal point
2.0
.
.
.
0.0
0.1
General Procedure for Finding Probabilities
� Draw the normal curve for the problem in
To find P(a < X < b) when X is distributed normally:
� Draw the normal curve for the problem in
terms of X
� Translate X-values to Z-values
� Use the Standardized Normal Table
Finding Normal Probabilities
� Suppose X is normal with mean 8.0 and standard deviation 5.0
� Find P(X < 8.6)
X
8.6
8.0
� Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6)
(continued)
Finding Normal Probabilities
0 .1 25 .0
8 .08 .6
σ
µXZ =
−=
−=
Z0.120X8.68
µ = 8
σ = 10
µ = 0
σ = 1
P(X < 8.6) P(Z < 0.12)
Z .00 .01
Solution: Finding P(Z < 0.12)
.5478.02
Standardized Normal Probability Table (Portion) = P(Z < 0.12)
P(X < 8.6)
Z
0.12
0.0 .5000 .5040 .5080
.5398 .5438
0.2 .5793 .5832 .5871
0.3 .6179 .6217 .6255
0.1 .5478
0.00
Upper Tail Probabilities
� Suppose X is normal with mean 8.0 and standard deviation 5.0.
� Now Find P(X > 8.6)
X
8.6
8.0
� Now Find P(X > 8.6)…
(continued)
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - .5478 = .4522
Upper Tail Probabilities
Z
0.12
0Z
0.12
.5478
0
1.000 1.0 - .5478 = .4522
Probability Between Two Values
� Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6)
Calculate Z-values:
P(8 < X < 8.6)
= P(0 < Z < 0.12)
Z0.120
X8.68
05
88
σ
µXZ =
−=
−=
0 .1 25
88 .6
σ
µXZ =
−=
−=
Solution: Finding P(0 < Z < 0.12)
= P(0 < Z < 0.12)
P(8 < X < 8.6)
= P(Z < 0.12) – P(Z ≤ 0)
= .5478 - .5000 = .0478Z .00 .01 .02
Standardized Normal Probability Table (Portion)
Z
0.12
.0478
0.00
.50000.0 .5000 .5040 .5080
.5398 .5438
0.2 .5793 .5832 .5871
0.3 .6179 .6217 .6255
0.1 .5478
� Suppose X is normal with mean 8.0 and standard deviation 5.0.
� Now Find P(7.4 < X < 8)
Probabilities in the Lower Tail
X
7.48.0
Probabilities in the Lower Tail
Now Find P(7.4 < X < 8)…
P(7.4 < X < 8)
= P(-0.12 < Z < 0)
(continued)
.0478
X7.4 8.0
= P(Z < 0) – P(Z ≤ -0.12)
= .5000 - .4522 = .0478
.0478
.4522
Z-0.12 0
The Normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12)
� Steps to find the X value for a known probability:
1. Find the Z value for the known probability
Finding the X value for a Known Probability
2. Convert to X units using the formula:
ZσµX +=
Finding the X value for a Known Probability
Example:
� Suppose X is normal with mean 8.0 and standard deviation 5.0.
� Now find the X value so that only 20% of all
(continued)
� Now find the X value so that only 20% of all values are below this X
X? 8.0
.2000
Z? 0
Find the Z value for 20% in the Lower Tail
� 20% area in the lower tail is consistent with a Z value of -0.84Z .03 .04
Standardized Normal Probability Table (Portion)
.05…
1. Find the Z value for the known probability
Z value of -0.84Z .03
-0.9 .1762 .1736
.2033
-0.7 .2327 .2296
.04
-0.8 .2005
.05
.1711
.1977
.2266
…
…
…
…
X? 8.0
.2000
Z-0.84 0
2. Convert to X units using the formula:
Finding the X value
0.5)84.0(0.8
ZσµX
−+=
+=
80.3
0.5)84.0(0.8
=
−+=
So 20% of the values from a distribution
with mean 8.0 and standard deviation 5.0 are less than 3.80
The Uniform Distribution
ContinuousProbability
Probability Distributions
Probability Distributions
Normal
Uniform
Exponential
The Uniform Distribution
� The uniform distribution is a
probability distribution that has equal
probabilities for all possible probabilities for all possible
outcomes of the random variable
� Also called a rectangular distribution
The Continuous Uniform Distribution:
bXaifab
1≤≤
−
The Uniform Distribution(continued)
f(X) =
otherwise 0
where
f(X) = value of the density function at any X value
a = minimum value of X
b = maximum value of X
f(X) =
Properties of the Uniform Distribution
� The mean of a uniform distribution is
2
baµ
+=
� The standard deviation is
1 2
a )-(bσ
2
=
Uniform Distribution Example
Example: Uniform probability distributionover the range 2 ≤ X ≤ 6:
f(X) = = .25 for 2 ≤ X ≤ 66 - 21
2 6
.25
f(X) = = .25 for 2 ≤ X ≤ 66 - 2
X
f(X)
42
62
2
baµ =
+=
+=
1 5 4 7.11 2
2 )-(6
1 2
a )-(bσ
22
===
The Exponential Distribution
ContinuousProbability
Probability Distributions
Probability Distributions
Normal
Uniform
Exponential
The Exponential Distribution
� Used to model the length of time between two occurrences of an event (the time between arrivals)
� Examples: � Time between trucks arriving at an unloading dock
� Time between transactions at an ATM Machine
� Time between phone calls to the main operator
The Exponential Distribution
� Defined by a single parameter, its mean λ(lambda)
� The probability that an arrival time is less than some specified time X is
Xλe1X)time P(arrival −−=<
some specified time X is
where e = mathematical constant approximated by 2.71828
λ = the population mean number of arrivals per unit
X = any value of the continuous variable where 0 < X < ∞
Exponential Distribution Example
Example: Customers arrive at the service counter at the rate of 15 per hour. What is the probability that the arrival time between consecutive customers is less than three minutes?
� The mean number of arrivals per hour is 15, so λ = 15
� Three minutes is .05 hours
� P(arrival time < .05) = 1 – e-λX = 1 – e-(15)(.05) = .5276
� So there is a 52.76% probability that the arrival time between successive customers is less than three minutes
Exponential Distributions
( )arrival time 1
: any value of continuous random variable
: the population average number of
XP X e
X
λ
λ
−< = −
: the population average number of
arrivals per unit of time
1/ : average time between arrivals
2.71828e
λ
λ
=
e.g.: Drivers Arriving at a Toll Bridge;
Customers Arriving at an ATM Machine
Exponential Distributions
� Describes time or distance between events
� Used for queues
� Density function
(continued)
f(X)λλλλ = 0.5
1x
�
� Parameters
�
X
λλλλ = 2.0
( )1
x
f x e λ
λ
−
=
µ λ σ λ= =
فصل چهارم
توزیع نمونه گیری
د.یل استفاده از نمونه گیری / مزایا
صرفه جویــی در زمان
/ د.یل مزایا
کاهش هزینه ها
مزیت در زمون های
ا
مخرب
امکان تعمیم نتایج به جامعه
مباحث اصلی در نمونه گیری
نمونه گیری
روشهای نمونه گیری
در فصول –محاسبه حجم نمونه بعدی بررسی میشود
تصادفی ساده
قرعه کشی
جدول اعداد تصادفی
گروهی نظام مند
منتظم
غیرمنتظم
خوشه ای مرحله ای
Types of Samples
Samples
Non-Probability Probability SamplesNon-Probability Samples
Judgment
Probability Samples
Simple
Random
Systematic
Stratified
Cluster
Convenience
Types of Samples:Nonprobability Sample
� In a nonprobability sample, items included are chosen without regard to their probability of occurrence.� In convenience sampling, items are selected based � In convenience sampling, items are selected based
only on the fact that they are easy, inexpensive, or convenient to sample.
� In a judgment sample, you get the opinions of pre-
selected experts in the subject matter.
Types of Samples:Probability Sample
� In a probability sample, items in the sample are chosen on the basis of known probabilities.
Probability SamplesProbability Samples
Simple
RandomSystematic Stratified Cluster
Probability Sample:Simple Random Sample
� Every individual or item from the frame has an equal chance of being selected
� Selection may be with replacement (selected individual is returned to frame for possible individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).
� Samples obtained from table of random numbers or computer random number generators.
Selecting a Simple Random Sample Using A Random Number Table
Sampling Frame For Population With 850
Items
Item Name Item #
Portion Of A Random Number Table
49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401
The First 5 Items in a simple Item Name Item #Bev R. 001
Ulan X. 002
. .
. .
. .
. .
Joann P. 849
Paul F. 850
The First 5 Items in a simple random sample
Item # 492
Item # 808
Item # 892 -- does not exist so ignore
Item # 435
Item # 779
Item # 002
� Decide on sample size: n
� Divide frame of N individuals into groups of k
individuals: k=N/n
� Randomly select one individual from the 1st
Probability Sample:Systematic Sample
� Randomly select one individual from the 1st
group
� Select every kth individual thereafter
N = 40
n = 4
k = 10
First Group
Probability Sample:Stratified Sample
� Divide population into two or more subgroups (called strata) according
to some common characteristic
� A simple random sample is selected from each subgroup, with sample
sizes proportional to strata sizes
� Samples from subgroups are combined into one� Samples from subgroups are combined into one
� This is a common technique when sampling population of voters, stratifying across racial or socio-economic lines.
Population
Divided
into 4
strata
Probability SampleCluster Sample
� Population is divided into several “clusters,” each representative of
the population
� A simple random sample of clusters is selected
� All items in the selected clusters can be used, or items can be
chosen from a cluster using another probability sampling technique
� A common application of cluster sampling involves election exit polls,
where certain election districts are selected and sampled.
Population divided into 16 clusters.
Randomly selected clusters for sample
ماره مطلوبماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای اماره مطلوبویژگیهای ا ویژگیهای ا
Unbiasedness
BiasedUnbiased
P(X)
Xµµ
X
Less Variability
Sampling
Distribution Sampling
P(X)
of MedianSampling
Distribution of
Mean
µ X
Effect of Large Sample
Larger
sample size
Smaller
P(X)
Smaller
sample size
Xµ
توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری توزیع نمونه گیری
Sampling Distributions
Sampling Distributions
Sampling Distributions
of the Mean
Sampling Distributions
of the Proportion
.3
P(x)
(continued)
Summary Measures for the Population Distribution:
Developing a Sampling Distribution
N
Xµ i=∑
.3
.2
.1
018 20 22 24
A B C D
Uniform Distribution
x
214
24222018=
+++=
2.236N
µ)(Xσ
2i
=−
=∑
1st 2nd Observation
Obs 18 20 22 24
18 18,18 18,20 18,22 18,24
Now consider all possible samples of size n=2
1st 2nd Observation
(continued)
Developing a Sampling Distribution
16 Sample Means
20 20,18 20,20 20,22 20,24
22 22,18 22,20 22,22 22,24
24 24,18 24,20 24,22 24,24
16 possible samples (sampling with replacement)
1st 2nd Observation
Obs 18 20 22 24
18 18 19 20 21
20 19 20 21 22
22 20 21 22 23
24 21 22 23 24
1st 2nd Observation
Sampling Distribution of All Sample Means
Sample Means Distribution
16 Sample Means
Developing a Sampling Distribution
(continued)
_1st 2nd Observation
Obs 18 20 22 24
18 18 19 20 21
20 19 20 21 22
22 20 21 22 23
24 21 22 23 24
18 19 20 21 22 23 240
.1
.2
.3 P(X)
X_
(no longer uniform)
_
Summary Measures of this Sampling Distribution:
Developing aSampling Distribution
(continued)
2116
24211918
N
Xµ i
X=
++++==
∑ L
16N
1.5816
21)-(2421)-(1921)-(18
N
)µX(σ
222
2
Xi
X
=+++
=
−=∑
L
Comparing the Population with its Sampling Distribution
P(X)
PopulationN = 4
P(X)
1.58σ 21µXX
==2.236σ 21µ ==
Sample Means Distributionn = 2
_
18 19 20 21 22 23 240
.1
.2
.3 P(X)
X18 20 22 24
A B C D
0
.1
.2
.3 P(X)
X _
_
Sampling Distributions of the Mean
Sampling Distributions
Sampling Distributions
of the Mean
Sampling Distributions
of the Proportion
Standard Error of the Mean
� Different samples of the same size from the same population will yield different sample means
� A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:sample is given by the Standard Error of the Mean:
� Note that the standard error of the mean decreases as the sample size increases
n
σσ
X=
If the Population is Normal
� If a population is normal with mean µ and
standard deviation σ, the sampling distribution
of is also normally distributed withX
and
(This assumes that sampling is with replacement or
sampling is without replacement from an infinite population)
µµX
=n
σσ
X=
Z-value for Sampling Distributionof the Mean
� Z-value for the sampling distribution of :
σµ)X(
σ
)µX(Z X −
=−
=
X
where: = sample mean
= population mean
= population standard deviation
n = sample size
Xµ
σ
n
σσX
Finite Population Correction
� Apply the Finite Population Correction if:
� the sample is large relative to the population
(n is greater than 5% of N)
and…and…
� Sampling is without replacement
Then
1N
nN
n
σ
µ)X(Z
−
−
−=
Normal Population Distribution
Sampling Distribution Properties
�µµx =
Normal Sampling Distribution (has the same mean)
(i.e. is unbiased )xx
x
µ
xµ
Sampling Distribution Properties
� For sampling with replacement:
As n increases,
decreases
Larger sample size
(continued)
xσ
Smaller sample size
x
x
µ
If the Population is not Normal
� We can apply the Central Limit Theorem:
� Even if the population is not normal,
� …sample means from the population will beapproximately normal as long as the sample size is approximately normal as long as the sample size is large enough.
Properties of the sampling distribution:
andµµx =n
σσ x =
n↑
Central Limit Theorem
As the sample size gets
the sampling distribution becomes almost normal
size gets large enough…
almost normal regardless of shape of population
x
Population Distribution
Central Tendency
If the Population is not Normal(continued)
Sampling distribution properties:
µµ =
Sampling Distribution (becomes normal as n increases)
Variation
(Sampling with replacement)
x
x
Larger sample size
Smaller sample size
µµx =
n
σσ x =
xµ
µ
How Large is Large Enough?
� For most distributions, n > 30 will give a sampling distribution that is nearly normal
� For fairly symmetric distributions, n > 15� For fairly symmetric distributions, n > 15
� For normal population distributions, the sampling distribution of the mean is always normally distributed
Example
� Suppose a population has mean µ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.
� What is the probability that the sample mean is between 7.8 and 8.2?
Example
Solution:
� Even if the population is not normally distributed, the central limit theorem can be used (n > 30)
(continued)
used (n > 30)
� … so the sampling distribution of is approximately normal
� … with mean = 8
� …and standard deviation
x
xµ
0.536
3
n
σσx ===
Example
Solution (continued):(continued)
363
8-8.2
nσ
µ- µ
363
8-7.8P 8.2) µ P(7.8
X
X
<<=<<
0.38300.5)ZP(-0.5
36n36
=<<=
Z7.8 8.2 -0.5 0.5
Sampling Distribution
Standard Normal Distribution .1915
+.1915
Population Distribution
??
??
?????
???Sample Standardize
8µ = 8µX
= 0µz =xX
Sampling Distributions of the Proportion
Sampling Distributions
Sampling Distributions
of the Mean
Sampling Distributions
of the Proportion
Population Proportions, p
p = the proportion of the population having some characteristic
� Sample proportion ( ps ) provides an estimateof p:
� 0 ≤ ps ≤ 1
� ps has a binomial distribution
(assuming sampling with replacement from a finite population or without replacement from an infinite population)
size sample
interest ofstic characteri the having sample the in itemsofnumber
n
Xps ==
Sampling Distribution of p
� Approximated by a
normal distribution if:
�
Sampling DistributionP(ps)
.3
.2
.1
5np ≥
where
and
(where p = population proportion)
.1
00 . 2 .4 .6 8 1 ps
pµsp =
n
p)p(1σ
sp
−=
5p)n(1
and
≥−
Z-Value for Proportions
p)p(1
pp
σ
ppZ s
p
s
s−
−=
−=
Standardize ps to a Z value with the formula:
� If sampling is without replacement
and n is greater than 5% of the
population size, then must use
the finite population correction
factor:
1N
nN
n
p )p (1σ
sp−
−−=
n
p)p(1ps−
pσ
Example
� If the true proportion of voters who support
Proposition A is p = .4, what is the probability
that a sample of size 200 yields a sample that a sample of size 200 yields a sample
proportion between .40 and .45?
� i.e.: if p = .4 and n = 200, what is
P(.40 ≤ ps ≤ .45) ?
Example
� if p = .4 and n = 200, what is
P(.40 ≤ ps ≤ .45) ?
(continued)
.03464.4).4(1p)p(1
σ =−
=−
=Find : σ .03464200n
σsp ===
1.44)ZP(0
.03464
.40.45Z
.03464
.40.40P.45)pP(.40 s
≤≤=
−≤≤
−=≤≤
Find :
Convert to standard normal:
spσ
Example
� if p = .4 and n = 200, what is
P(.40 ≤ ps ≤ .45) ?
(continued)
Use standard normal table: P(0 ≤ Z ≤ 1.44) = .4251
Z.45 1.44
.4251
Standardize
Sampling DistributionStandardized
Normal Distribution
Use standard normal table: P(0 ≤ Z ≤ 1.44) = .4251
.40 0ps
Example:
( )8 =2 25
7.8 8.2 ?
n
P X
µ σ= =
< < =
Sampling Distribution Standardized
( )
( )
7.8 8 8.2 87.8 8.2
2 / 25 2 / 25
.5 .5 .3830
X
X
XP X P
P Z
µ
σ
−− −< < = < <
= − < < =
Sampling Distribution Standardized Normal Distribution2
.425
Xσ = = 1Zσ =
8X
µ =8.2 Z
0Zµ =0.57.8 0.5−
.1915
X
Example:( )200 .4 .43 ?Sn p P p= = < =
( )( )
( ).43 .4
.43 .87 .8078.4 1 .4
200
S
S
S p
S
p
pP p P P Z
µ
σ
− − < = < = < = − 200
Sampling DistributionStandardized
Normal Distribution
Spσ 1Zσ =
SpµSp Z0.43 .87
فصل پنجم
تخمین فاصله ای
Point and Interval Estimates
� A point estimate is a single number,
� a confidence interval provides additional information about the variability of the estimateinformation about the variability of the estimate
Point Estimate
Lower
Confidence
Limit
Upper
Confidence
Limit
Width of confidence interval
We can estimate a Population Parameter …
Point Estimates
with a SampleStatistic
(a Point Estimate)(a Point Estimate)
Mean
Proportion pπ
Xµ
Confidence Intervals
� How much uncertainty is associated with a point estimate of a population parameter?
� An interval estimate provides more An interval estimate provides more information about a population characteristic than does a point estimate
� Such interval estimates are called confidence intervals
Confidence Interval Estimate
� An interval gives a range of values:
� Takes into consideration variation in sample statistics from sample to sample
� Based on observations from 1 sample� Based on observations from 1 sample
� Gives information about closeness to unknown population parameters
� Stated in terms of level of confidence
� e.g. 95% confident, 99% confident
� Can never be 100% confident
Confidence Interval Example
Cereal fill example
� Population has µ = 368 and σ = 15.
� If you take a sample of size n = 25 you know
� 368 ± 1.96 * 15 / = (362.12, 373.88) contains 95% of 25� 368 ± 1.96 * 15 / = (362.12, 373.88) contains 95% of the sample means
� When you don’t know µ, you use X to estimate µ� If X = 362.3 the interval is 362.3 ± 1.96 * 15 / = (356.42, 368.18)
� Since 356.42 ≤ µ ≤ 368.18 the interval based on this sample makes a correct statement about µ.
But what about the intervals from other possible samples of size 25?
25
25
Confidence Interval Example
(continued)
Sample # XLower
Limit
Upper
Limit
Contain
µ?
1 362.30 356.42 368.18 Yes
2 369.50 363.62 375.38 Yes
3 360.00 354.12 365.88 No
4 362.12 356.24 368.00 Yes
5 373.88 368.00 379.76 Yes
Confidence Interval Example
� In practice you only take one sample of size n
� In practice you do not know µ so you do not know if the interval actually contains µ
However you do know that 95% of the intervals
(continued)
� However you do know that 95% of the intervals formed in this manner will contain µ
� Thus, based on the one sample, you actually selected you can be 95% confident your interval will contain µ (this is a 95% confidence interval)
Note: 95% confidence is based on the fact that we used Z = 1.96.
Estimation Process
Population
Random Sample
Mean
I am 95% confident that µ is between 40 & 60.
(mean, µ, is unknown)
Population Mean
X = 50
Sample
40 & 60.
General Formula
� The general formula for all confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
Where:•Point Estimate is the sample statistic estimating the population parameter of interest
•Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level
•Standard Error is the standard deviation of the point estimate
Confidence Level
� Confidence Level
� Confidence the interval will contain the unknown population parameter
� A percentage (less than 100%)
Confidence Level, (1-α)
� Suppose confidence level = 95%
� Also written (1 - α) = 0.95, (so α = 0.05)
� A relative frequency interpretation:
� 95% of all the confidence intervals that can be
(continued)
� 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
� A specific interval either will contain or will not contain the true parameter
� No probability involved in a specific interval
Confidence Intervals
Population
ConfidenceIntervals
PopulationPopulation Mean
σ Unknown
PopulationProportion
σ Known
Confidence Interval for µ(σ Known)
� Assumptions� Population standard deviation σ is known� Population is normally distributed� If population is not normal, use large sample
� Confidence interval estimate:
where is the point estimate
Zα/2 is the normal distribution critical value for a probability of α/2 in each tail
is the standard error
n
σ/2ZX α±
X
nσ/
Finding the Critical Value, Zα/2
� Consider a 95% confidence interval:
0.05 so 0.951 ==− αα
1.96/2Z ±=α
Zα/2 = -1.96 Zα/2 = 1.96
0.0252
=α
0.0252
=α
Point EstimateLower Confidence Limit
UpperConfidence Limit
Z units:
X units: Point Estimate
0
Common Levels of Confidence
� Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level
Confidence Coefficient, Zα/2 value
α−1Level
1.28
1.645
1.96
2.33
2.58
3.08
3.27
0.80
0.90
0.95
0.98
0.99
0.998
0.999
80%
90%
95%
98%
99%
99.8%
99.9%
α−1
Intervals and Level of Confidence
Intervals
Sampling Distribution of the Mean
x
/2α /2αα−1
µµx
=
Confidence Intervals
Intervals extend from
to
(1-α)x100%of intervals constructed contain µ;
(α)x100% do not.
n
σ2/αZX −
n
σ2/αZX +
x
x1
x2
Example
� A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
� Determine a 95% confidence interval for the true mean resistance of the population.
Example
� A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the
population standard deviation is 0.35 ohms.
(continued)
2.4068 1.9932
0.2068 2.20
)11(0.35/ 1.96 2.20
n
σ/2 ZX
≤≤
±=
±=
±
µ
α
population standard deviation is 0.35 ohms.
� Solution:
Confidence Intervals
Population
ConfidenceIntervals
PopulationPopulation Mean
σ Unknown
PopulationProportion
σ Known
Do You Ever Truly Know σ?
� Probably not!
� In virtually all real world business situations, σ is not known.known.
� If there is a situation where σ is known then µ is also known (since to calculate σ you need to know µ.)
� If you truly know µ there would be no need to gather a sample to estimate it.
� If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
Confidence Interval for µ(σ Unknown)
� This introduces extra uncertainty, since S is variable from sample to sample
� So we use the t distribution instead of the normal distribution
� Assumptions
� Population standard deviation is unknown
� Population is normally distributed
� If population is not normal, use large sample
Confidence Interval for µ(σ Unknown)
(continued)
If population is not normal, use large sample
� Use Student’s t Distribution
� Confidence Interval Estimate:
(where tα/2 is the critical value of the t distribution with n -1 degrees of freedom and an area of α/2 in each tail)
n
StX 2/α±
Student’s t Distribution
� The t is a family of distributions
� The tα/2 value depends on degrees of freedom (d.f.)
� Number of observations that are free to vary after sample mean has been calculated
d.f. = n - 1
Degrees of Freedom (df)
Idea: Number of observations that are free to varyafter sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
If the mean of these three values is 8.0, then X3 must be 9(i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a given mean)
Let X1 = 7
Let X2 = 8
What is X3?
Student’s t Distribution
Standard Normal
(t with df = ∞)
Note: t Z as n increases
t0
t (df = 5)
t (df = 13)t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Student’s t Table
Upper Tail Area
df .10 .05 .025
1 3.078 6.314 12.706
Let: n = 3 df = n - 1 = 2
α = 0.10α/2 = 0.051 3.078 6.314 12.706
2 1.886
3 1.638 2.353 3.182
t0 2.920
The body of the table contains t values, not probabilities
α/2 = 0.05
α/2 = 0.05
4.3032.920
Selected t distribution values
With comparison to the Z value
Confidence t t t ZLevel (10 d.f.) (20 d.f.) (30 d.f.) (∞ d.f.)
0.80 1.372 1.325 1.310 1.280.80 1.372 1.325 1.310 1.28
0.90 1.812 1.725 1.697 1.645
0.95 2.228 2.086 2.042 1.96
0.99 3.169 2.845 2.750 2.58
Note: t Z as n increases
Example of t distribution confidence interval
A random sample of n = 25 has X = 50 and S = 8. Form a 95% confidence interval for µ
� d.f. = n – 1 = 24, so 2.06390.025t/2 ==αt
The confidence interval is
25
8(2.0639)50
n
S/2 ±=± αtX
46.698 ≤ µ ≤ 53.302
Example of t distribution confidence interval
� Interpreting this interval requires the assumption that the population you are sampling from is approximately a normal distribution (especially since n is only 25).
(continued)
distribution (especially since n is only 25).
� This condition can be checked by creating a:
� Normal probability plot or
� Boxplot
Confidence Intervals
Population
ConfidenceIntervals
PopulationPopulation Mean
σ Unknown
PopulationProportion
σ Known
Confidence Intervals for the Population Proportion, π
� An interval estimate for the population
proportion ( π ) can be calculated by
adding an allowance for uncertainty to adding an allowance for uncertainty to
the sample proportion ( p )
Confidence Intervals for the Population Proportion, π
� Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
(continued)
)(1 ππ −
� We will estimate this with sample data:
n
p)p(1−
n
)(1σp
ππ −=
Confidence Interval Endpoints
� Upper and lower confidence limits for the population proportion are calculated with the formula
p)p(1 −
� where � Zα/2 is the standard normal value for the level of confidence desired
� p is the sample proportion
� n is the sample size
� Note: must have np > 5 and n(1-p) > 5
n
p)p(1/2Zp
−± α
Example
� A random sample of 100 people
shows that 25 are left-handed.
Form a 95% confidence interval for � Form a 95% confidence interval for
the true proportion of left-handers
Example
� A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
(continued)
/1000.25(0.75)1.9625/100
p)/np(1/2Zp
±=
−± α
0.3349 0.1651
(0.0433) 1.96 0.25
≤≤
±=
π
Determining Sample Size
DeterminingSample Size
For the Mean
For theProportion
Sampling Error
� The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - αααα)
� The margin of error is also called sampling error
� the amount of imprecision in the estimate of the
population parameter
� the amount added and subtracted to the point
estimate to form the confidence interval
Determining Sample Size
For the
DeterminingSample Size
For the Mean
n
σ2/αZX ±
n
σ2/αZe =
Sampling error (margin of error)
Determining Sample Size
For the
DeterminingSample Size
(continued)
For the Mean
n
σ2/αZe =
2
2σ22/
e
Zn α=Now solve
for n to get
Determining Sample Size
� To determine the required sample size for the mean, you must know:
The desired level of confidence (1 - αααα), which
(continued)
� The desired level of confidence (1 - αααα), which
determines the critical value, Zα/2
� The acceptable sampling error, e
� The standard deviation, σ
Required Sample Size Example
If σ = 45, what sample size is needed to
estimate the mean within ± 5 with 90%
confidence?
(Always round up)
219.195
(45)(1.645)
e
σZn
2
22
2
22
===
So the required sample size is n = 220
If σ is unknown
� If unknown, σ can be estimated when
using the required sample size formula
� Use a value for σ that is expected to be � Use a value for σ that is expected to be
at least as large as the true σ
� Select a pilot sample and estimate σ with
the sample standard deviation, S
Determining Sample Size
DeterminingSample Size
For the
(continued)
For theProportion
2
2
e
)(1Zn
ππ −=
Now solve for n to getn
)(1Ze
ππ −=
Determining Sample Size
� To determine the required sample size for the proportion, you must know:
� The desired level of confidence (1 - αααα), which determines the
(continued)
critical value, Zα/2
� The acceptable sampling error, e
� The true proportion of events of interest, π
� π can be estimated with a pilot sample if necessary (or
conservatively use 0.5 as an estimate of π)
Required Sample Size Example
How large a sample would be necessary
to estimate the true proportion defective in
a large population within ±3%, with 95%
confidence?confidence?
(Assume a pilot sample yields p = 0.12)
Required Sample Size Example
Solution:
For 95% confidence, use Zα/2 = 1.96
e = 0.03
(continued)
p = 0.12, so use this to estimate π
So use n = 451
450.742(0.03)
0.12)(0.12)(12(1.96)
2e
)(12/2Z
n =−
=−
=ππα
Confidence Interval for Population Total Amount
� Point estimate for a population of size N:
� Confidence interval estimate:
XN total Population =
� Confidence interval estimate:
1n
S)2/(
−
−±
N
nNtNXN α
(This is sampling without replacement, so use the finite population correction in the confidence interval formula)
Confidence Interval for Population Total: Example
A firm has a population of 1000 accounts and wishes to estimate the total population value.
A sample of 80 accounts is selected with average balance of $87.6 and standard deviation of $22.3.
Find the 95% confidence interval estimate of the total balance.
Example Solution
1n
S)2/(
−
−±
N
nNtNXN α
22.3S ,6.87X 80, n ,1000N ====
The 95% confidence interval for the population total balance is $82,837.52 to $92,362.48
48.762,4600,87
11000
801000
80
22.3)9905.1)(1000()6.87)(1000(
n
±=
−
−±=
� Point estimate for a population of size N:
� Where the average difference, D, is:
DN Difference Total =
Confidence Interval for Total Difference
� Where the average difference, D, is:
value original - value auditedD where
n
D
D
i
n
1ii
=
=∑
=
� Confidence interval estimate:
1n
DS)2/(
−
−±
N
nNtNDN α
Confidence Interval for Total Difference
(continued)
where
1n)2/(
−±
NtNDN α
1
1
2)D(
−
=
−
=
∑
n
n
i
iD
DS
One-Sided Confidence Intervals
� Application: find the upper bound for the proportion of items that do not conform with internal controls
nNp)p(1 −−
� where � Zα is the standard normal value for the level of confidence desired
� p is the sample proportion of items that do not conform
� n is the sample size
� N is the population size
1N
nN
n
p)p(1Zp boundUpper
−
−−+= α
Use A fpc When Sampling More Than 5% Of The Population (n/N > 0.05)
Confidence Interval For µ with a fpc
/ 2 1
S N nX t
Nnα
−±
− A fpc simplyreduces the
Confidence Interval For π with a fpc
1
)1(2/
−
−−±
N
nN
n
ppzp α
reduces thestandard errorof either the
sample mean orthe sample proportion
Confidence Interval for µ with a fpc
10s 50,x 100,n 1000,N Suppose ====
:for CI 95% 2/ =−
−±
nNstx αµ
)88.51 ,12.48( 88.150
11000
1001000
100
10984.150
1 :for CI 95% 2/
=±
=−
−±
=−
±Nn
tx αµ
Determining Sample Size with a fpc
� Calculate the sample size (n0) without a fpc
� For µ: 2
22
2/0
e
zn
σα=
� For π:
� Apply the fpc utilizing the following formula to arrive at the final sample size (n).
� n = n0N / (n0 + (N-1))
e
2
2
2/0
)1(
e
zn
ππα −=
فصل ششم
ماری زمون فرض ا
ا
What is a Hypothesis?
� A hypothesis is a claim (assertion) about a population parameter:
population mean� population mean
� population proportion
Example: The mean monthly cell phone bill in this city is µ = $42
Example: The proportion of adults in this city with cell phones is π = 0.68
The Null Hypothesis, H0
� States the claim or assertion to be tested
Example: The average diameter of a
manufactured bolt is 30mm ( )30µ:H0 =
� Is always about a population parameter, not about a sample statistic
30µ:H0 = 30X:H0 =
The Null Hypothesis, H0
� Begin with the assumption that the null hypothesis is true
� Similar to the notion of innocent untilproven guilty
(continued)
proven guilty
� Refers to the status quo or historical value
� Always contains “=“, or “≤”, or “≥” sign
� May or may not be rejected
The Alternative Hypothesis, H1
� Is the opposite of the null hypothesis
� e.g., The average diameter of a manufactured bolt is not equal to 30mm ( H1: µ ≠ 30 )
� Challenges the status quoChallenges the status quo
� Never contains the “=“, or “≤”, or “≥” sign
� May or may not be proven
� Is generally the hypothesis that the researcher is trying to prove
The Hypothesis Testing Process
� Claim: The population mean age is 50.� H0: µ = 50, H1: µ ≠ 50
� Sample the population and find sample mean.
Population
Sample
The Hypothesis Testing Process
� Suppose the sample mean age was X = 20.
� This is significantly lower than the claimed mean population age of 50.
(continued)
� If the null hypothesis were true, the probability of getting such a different sample mean would be very
small, so you reject the null hypothesis .
� In other words, getting a sample mean of 20 is so unlikely if the population mean was 50, you conclude that the population mean must not be 50.
The Hypothesis Testing Process
Sampling
Distribution of X
(continued)
µ = 50
If H0 is trueIf it is unlikely that you
would get a sample
mean of this value ...
... then you reject
the null hypothesis
that µ = 50.
20
... When in fact this were
the population mean…
X
The Test Statistic and Critical Values
� If the sample mean is close to the stated population mean, the null hypothesis is not rejected.
� If the sample mean is far from the stated � If the sample mean is far from the stated population mean, the null hypothesis is rejected.
� How far is “far enough” to reject H0?
� The critical value of a test statistic creates a “line in the sand” for decision making -- it answers the question of how far is far enough.
The Test Statistic and Critical Values
Sampling Distribution of the test statistic
Region of
Rejection
Region of
RejectionRegion of
Critical Values
“Too Far Away” From Mean of Sampling Distribution
RejectionRegion of
Non-Rejection
Possible Errors in Hypothesis Test Decision Making
� Type I Error
� Reject a true null hypothesis
� Considered a serious type of error
� The probability of a Type I Error is αααα
� Called level of significance of the test
� Set by researcher in advance
� Type II Error
� Failure to reject false null hypothesis
� The probability of a Type II Error is β
Possible Errors in Hypothesis Test Decision Making
Possible Hypothesis Test Outcomes
Actual Situation
(continued)
Decision H0 True H0 False
Do Not Reject H0
No Error
Probability 1 - α
Type II Error
Probability β
Reject H0 Type I Error
Probability α
No Error
Probability 1 - β
Possible Errors in Hypothesis Test Decision Making
� The confidence coefficient (1-α) is the probability of not rejecting H0 when it is true.
� The confidence level of a hypothesis test is
(continued)
� The confidence level of a hypothesis test is (1-α)*100%.
� The power of a statistical test (1-β) is the probability of rejecting H0 when it is false.
Type I & II Error Relationship
� Type I and Type II errors cannot happen atthe same time
� A Type I error can only occur if H0 is true� A Type I error can only occur if H0 is true
� A Type II error can only occur if H0 is false
If Type I error probability (αααα) , then
Type II error probability (β)
Factors Affecting Type II Error
� All else equal,
� β when the difference between
hypothesized parameter and its true value
� β when αααα
� β when σ
� β when n
Level of Significance and the Rejection Region
Level of significance = ααααH0: µ = 30
H1: µ ≠ 30
/2αααα/2αααα
This is a two-tail test because there is a rejection region in both tails
Critical values
Rejection Region
30
Hypothesis Tests for the Mean
Hypothesis Tests for µµµµ
σσσσ Known σσσσ Unknown
(Z test) (t test)
Z Test of Hypothesis for the Mean (σ Known)
� Convert sample statistic ( ) to a ZSTAT test statisticX
Hypothesis Tests for µµµµ
The test statistic is:
n
σ
µXZ
S T A T
−=
σ Known σ Unknownσσσσ Known σσσσ Unknown(Z test) (t test)
Critical Value Approach to Testing
� For a two-tail test for the mean, σ known:
� Convert sample statistic ( ) to test statistic (ZSTAT)
X
� Determine the critical Z values for a specifiedlevel of significance αααα from a table or computer
� Decision Rule: If the test statistic falls in the rejection region, reject H0 ; otherwise do not
reject H0
� There are two cutoff values (critical values), defining the regions of
Two-Tail Tests
αααα/2
H0: µ = 30
H1: µ ≠≠≠≠ 30
αααα/2
Do not reject H0 Reject H0Reject H0
regions of rejection
αααα/2
-Zα/2 0 +Zα/2
αααα/2
Lower critical value
Upper critical value
30
Z
X
6 Steps in Hypothesis Testing
1. State the null hypothesis, H0 and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, nsample size, n
3. Determine the appropriate test statistic and sampling distribution
4. Determine the critical values that divide the rejection and nonrejection regions
6 Steps in Hypothesis Testing
5. Collect data and compute the value of the test statistic
6. Make the statistical decision and state the managerial conclusion. If the test statistic falls
(continued)
managerial conclusion. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis. Express the managerial conclusion in the context of the problem
Hypothesis Testing Example
Test the claim that the true mean diameter of a manufactured bolt is 30mm.
(Assume σ = 0.8)
1. State the appropriate null and alternative1. State the appropriate null and alternativehypotheses
� H0: µ = 30 H1: µ ≠ 30 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
� Suppose that α = 0.05 and n = 100 are chosen for this test
Hypothesis Testing Example
3. Determine the appropriate technique� σ is assumed known so this is a Z test.
4. Determine the critical values
� For α = 0.05 the critical Z values are ±1.96
(continued)
2 .00 .0 8
.1 6
1 0 0
0 .8
3 02 9 .8 4
n
σ
µXZ S T A T −−−−====
−−−−====
−−−−====
−−−−====
5. Collect the data and compute the test statistic
� Suppose the sample results are
n = 100, X = 29.84 (σ = 0.8 is assumed known)
So the test statistic is:
� 6. Is the test statistic in the rejection region?
α/2 = 0.025
Hypothesis Testing Example
(continued)
α/2 = 0.025
Reject H0 Do not reject H0
-Zα/2 = -1.96 0
Reject H0 if ZSTAT < -1.96 or ZSTAT > 1.96; otherwise do not reject H0
Reject H0
+Zα/2 = +1.96
Here, ZSTAT = -2.0 < -1.96, so the test statistic is in the rejection region
6 (continued). Reach a decision and interpret the result
Hypothesis Testing Example
(continued)
α = 0.05/2 α = 0.05/2
-2.0
Since ZSTAT = -2.0 < -1.96, reject the null hypothesisand conclude there is sufficient evidence that the mean diameter of a manufactured bolt is not equal to 30
Reject H0 Do not reject H0
-Zα/2 = -1.96 0
Reject H0
+Zα/2= +1.96
p-Value Approach to Testing
� p-value: Probability of obtaining a test
statistic equal to or more extreme than the
observed sample value given H0 is true
� The p-value is also called the observed level of
significance
� It is the smallest value of αααα for which H0 can be
rejected
p-Value Approach to Testing:Interpreting the p-value
� Compare the p-value with αααα
� If p-value < αααα , reject H0
� If p-value ≥≥≥≥ αααα , do not reject H0� If p-value ≥≥≥≥ αααα , do not reject H0
� Remember
� If the p-value is low then H0 must go
The 5 Step p-value approach toHypothesis Testing
1. State the null hypothesis, H0 and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n
Determine the appropriate test statistic and sampling 3. Determine the appropriate test statistic and sampling distribution
4. Collect data and compute the value of the test statistic and the p-value
5. Make the statistical decision and state the managerial conclusion. If the p-value is < α then reject H0, otherwise do not reject H0. State the managerial conclusion in the context of the problem
p-value Hypothesis Testing Example
Test the claim that the true mean diameter of a manufactured bolt is 30mm.
(Assume σ = 0.8)
1. State the appropriate null and alternative1. State the appropriate null and alternativehypotheses
� H0: µ = 30 H1: µ ≠ 30 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
� Suppose that α = 0.05 and n = 100 are chosen for this test
p-value Hypothesis Testing Example
3. Determine the appropriate technique� σ is assumed known so this is a Z test.
4. Collect the data, compute the test statistic and the p-value
(continued)
2 .00 .0 8
.1 6
1 0 0
0 .8
3 02 9 .8 4
n
σ
µXZ S T A T −−−−====
−−−−====
−−−−====
−−−−====
� Suppose the sample results are
n = 100, X = 29.84 (σ = 0.8 is assumed known)
So the test statistic is:
p-Value Hypothesis Testing Example:Calculating the p-value
4. (continued) Calculate the p-value.� How likely is it to get a ZSTAT of -2 (or something further from the
mean (0), in either direction) if H0 is true?
p-value = 0.0228 + 0.0228 = 0.0456
P(Z < -2.0) = 0.0228
0
-2.0
Z
2.0
P(Z > 2.0) = 0.0228
� 5. Is the p-value < α?
� Since p-value = 0.0456 < α = 0.05 Reject H0
� 5. (continued) State the managerial conclusion in the context of the situation.
p-value Hypothesis Testing Example
(continued)
in the context of the situation.
� There is sufficient evidence to conclude the average diameter of a manufactured bolt is not equal to 30mm.
Connection Between Two Tail Tests and Confidence Intervals
� For X = 29.84, σ = 0.8 and n = 100, the 95%confidence interval is:
0.8 (1.96) 29.84 to
0.8 (1.96) - 29.84 ++++
29.6832 ≤ µ ≤ 29.9968
� Since this interval does not contain the hypothesized mean (30), we reject the null hypothesis at αααα = 0.05
100 (1.96) 29.84 to
100 (1.96) - 29.84 ++++
Do You Ever Truly Know σ?
� Probably not!
� In virtually all real world business situations, σ is not known.known.
� If there is a situation where σ is known then µ is also known (since to calculate σ you need to know µ.)
� If you truly know µ there would be no need to gather a sample to estimate it.
Hypothesis Testing: σ Unknown
� If the population standard deviation is unknown, you instead use the sample standard deviation S.
� Because of this change, you use the t distribution instead of the Z distribution to test the null hypothesis about the of the Z distribution to test the null hypothesis about the mean.
� When using the t distribution you must assume the population you are sampling from follows a normal distribution.
� All other steps, concepts, and conclusions are the same.
t Test of Hypothesis for the Mean (σ Unknown)
Hypothesis Tests for µµµµ
t Test of Hypothesis for the Mean (σ Unknown)
� Convert sample statistic ( ) to a tSTAT test statistic
Hypothesis Tests for µµµµ
X
Hypothesis Tests for µµµµ
The test statistic is:
σ Known σ Unknownσσσσ Known σσσσ Unknown(Z test) (t test)
The test statistic is:
σ Known σ Unknownσσσσ Known σσσσ Unknown(Z test) (t test)
The test statistic is:
n
S
µXt
S T A T
−=
σ Known σ Unknownσσσσ Known σσσσ Unknown(Z test) (t test)
Example: Two-Tail Test(σ Unknown)
The average cost of a hotel room in New York is said to be $168 per night. To determine if this is true, a random sample of 25 hotels is taken and resulted in an X of $172.50 and an S of $15.40. Test the appropriate hypotheses at α = 0.05.
(Assume the population distribution is normal)
H0: µ = 168
H1: µ ≠≠≠≠ 168
� α α α α = 0.05
Example Solution: Two-Tail t Test
Reject HReject H
α/2=.025
Do not reject H
α/2=.025H0: µ = 168
H1: µ ≠≠≠≠ 168
t� α α α α = 0.05
� n = 25, df = 25-1=24
� σσσσ is unknown, so
use a t statistic
� Critical Value:
±t24,0.025 = ± 2.0639 Do not reject H0: insufficient evidence that true mean cost is different than $168
Reject H0Reject H0
-t 24,0.025
Do not reject H0
0
-2.0639 2.0639
1 .4 6
2 5
1 5 .4 0
1 6 81 7 2 .5 0
n
S
µXS T A Tt =
−=
−=
1.46
t 24,0.025
Example Two-Tail t Test Using A p-value from Excel
� Since this is a t-test we cannot calculate the p-value without some calculation aid.
� The Excel output below does this:t Test for the Hypothesis of the Mean
Data
Null Hypothesis µ= 168.00$
Level of Significance 0.05
Sample Size 25
Sample Mean 172.50$
Sample Standard Deviation 15.40$
Standard Error of the Mean 3.08$ =B8/SQRT(B6)Degrees of Freedom 24 =B6-1
t test statistic 1.46 =(B7-B4)/B11
Lower Critical Value -2.0639 =-TINV(B5,B12)
Upper Critical Value 2.0639 =TINV(B5,B12)p-value 0.157 =TDIST(ABS(B13),B12,2)
=IF(B18<B5, "Reject null hypothesis","Do not reject null hypothesis")
Data
Intermediate Calculations
Two-Tail Test
Do Not Reject Null Hypothesis
p-value > αSo do not reject H0
Connection of Two Tail Tests to Confidence Intervals
� For X = 172.5, S = 15.40 and n = 25, the 95%confidence interval for µ is:
172.5 - (2.0639) 15.4/ 25 to 172.5 + (2.0639) 15.4/ 25172.5 - (2.0639) 15.4/ 25 to 172.5 + (2.0639) 15.4/ 25
166.14 ≤ µ ≤ 178.86
� Since this interval contains the Hypothesized mean (168), we do not reject the null hypothesis at αααα = 0.05
One-Tail Tests
� In many cases, the alternative hypothesis focuses on a particular direction
H : µ ≥ 3 This is a lower-tail test since the
H0: µ ≥ 3
H1: µ < 3
H0: µ ≤ 3
H1: µ > 3
This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3
� There is only one
critical value, since
the rejection area is
in only one tail
Lower-Tail Tests
αααα
H0: µ ≥ 3
H1: µ < 3
Reject H0 Do not reject H0
in only one tail αααα
-Zα or -tα 0
µ
Z or t
X
Critical value
Upper-Tail Tests
αααα
H0: µ ≤ 3
H1: µ > 3� There is only one
critical value, since
the rejection area is
in only one tail
Reject H0Do not reject H0
αααα
Zα or tα0
µ
in only one tail
Critical value
Z or t
X_
Example: Upper-Tail t Test for Mean (σ unknown)
A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume a normal population)claim. (Assume a normal population)
H0: µ ≤ 52 the average is not over $52 per month
H1: µ > 52 the average is greater than $52 per month(i.e., sufficient evidence exists to support the manager’s claim)
Form hypothesis test:
� Suppose that α = 0.10 is chosen for this test and n = 25.
Find the rejection region: Reject H0
Example: Find Rejection Region
(continued)
Reject H0Do not reject H0
αααα = 0.10
1.3180
Reject H0 if tSTAT > 1.318
Obtain sample and compute the test statistic
Suppose a sample is taken with the following
results: n = 25, X = 53.1, and S = 10
Example: Test Statistic
(continued)
results: n = 25, X = 53.1, and S = 10
� Then the test statistic is:
0 .5 5
2 5
1 0
5 25 3 .1
n
S
µXt
S T A T=
−=
−=
Example: Decision
αααα = 0.10
Reject H0
Reach a decision and interpret the result:
(continued)
Reject H0Do not reject H0
αααα = 0.10
1.3180
Do not reject H0 since tSTAT = 0.55 ≤ 1.318
there is not sufficient evidence that themean bill is over $52
tSTAT = 0.55
Example: Utilizing The p-value for The Test
�Calculate the p-value and compare to αααα (p-value below calculated using excel spreadsheet on next page)
ααααReject H0
p-value = .2937
Reject
H0
αααα = .10
Do not reject
H01.318
0
Reject H0
tSTAT = .55
Do not reject H0 since p-value = .2937 > αααα = .10
Hypothesis Tests for Proportions
� Involves categorical variables
� Two possible outcomes
� Possesses characteristic of interest� Possesses characteristic of interest
� Does not possess characteristic of interest
� Fraction or proportion of the population in the category of interest is denoted by π
Proportions
� Sample proportion in the category of interest is denoted by p
� sizesample
sampleininterest ofcategory in number
n
Xp ==
(continued)
�
� When both nπ and n(1-π) are at least 5, p can be approximated by a normal distribution with mean and standard deviation
�
sizesamplen
π=pµn
)(1σ
ππ −=p
� The sampling distribution of p is approximately normal, so the test
Hypothesis Tests for Proportions
Hypothesis Tests for p
normal, so the test statistic is a ZSTAT
value:
n
)(1
pZ
STAT
ππ
π
−
−=
nπ ≥≥≥≥ 5and
n(1-π) ≥≥≥≥ 5
nπ < 5or
n(1-π) < 5
Not discussed in this chapter
� An equivalent form to the last slide, but in terms of the number in the
Z Test for Proportion in Terms of Number in Category of Interest
Hypothesis Tests for X
number in the category of interest, X:
)(1n
nXZ
STAT
ππ
π
−
−=
X ≥≥≥≥ 5and
n-X ≥≥≥≥ 5
X < 5or
n-X < 5
Not discussed in this chapter
Example: Z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing. To test this mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = 0.05 significance level.
Check:
nπ = (500)(.08) = 40
n(1-π) = (500)(.92) = 460�
Z Test for Proportion: Solution
αααα = 0.05
n = 500, p = 0.05
H0: π = 0.08
H1: π ≠≠≠≠ 0.08
Test Statistic:
2.47
500
.08).08(1
.08.05
n
)(1
pSTATZ −=
−
−=
−
−=
ππ
π
n = 500, p = 0.05
Reject H0 at α = 0.05
Critical Values: ± 1.96 Decision:
Conclusion:
z0
Reject Reject
.025.025
1.96
-2.47
There is sufficient evidence to reject the company’s claim of 8% response rate.
-1.96
Do not reject H0
Reject H0Reject H0
α/2 = .025
Calculate the p-value and compare to αααα(For a two-tail test the p-value is always two-tail)
(continued)
p-value = 0.0136:
p-Value Solution
α/2 = .025 α/2 = .025
1.960
Z = -2.47
0.01362(0.0068)
2.47)P(Z2.47)P(Z
==
≥+−≤
Reject H0 since p-value = 0.0136 < αααα = 0.05
Z = 2.47
-1.96
α/2 = .025
0.00680.0068
The Power of a Test
� The power of the test is the probability of
correctly rejecting a false H0
Suppose we correctly reject H0: µ ≥≥≥≥ 52
when in fact the true mean is µ = 50
Reject H0: µ ≥ 52
Do not reject H0 : µ ≥ 52
5250
when in fact the true mean is µ = 50
α
Power = 1-β
Type II Error
� Suppose we do not reject H0: µ ≥ 52 when in fact the true mean is µ = 50
This is the true This is the range of X where H is not rejected
Reject H0: µ ≥ 52
Do not reject H0 : µ ≥ 52
5250
This is the true distribution of X if µµµµ = 50
H0 is not rejected
Prob. of type II error = β
Type II Error
� Suppose we do not reject H0: µ ≥ 52 when in fact the true mean is µ = 50
Here, β = P( X ≥ cutoff ) if µ = 50
(continued)
Reject H0: µ ≥ 52
Do not reject H0 : µ ≥ 52
α
5250
β
Here, β = P( X ≥ cutoff ) if µ = 50
� Suppose n = 64 , σ = 6 , and α = .05
Calculating β
50.76664
61.64552
n
σZµXcutoff =−=−== αα
(for H0 : µ ≥ 52)
Reject H0: µ ≥ 52
Do not reject H0 : µ ≥ 52
α
5250
So β = P( x ≥ 50.766 ) if µ = 50
50.766
0.15390.84611.01.02)P(Z
646
5050.766ZP50)µ|50.766XP( =−=≥=
−
≥==≥
� Suppose n = 64 , σ = 6 , and α = 0.05
Calculating β and Power of the test
(continued)
Reject H0: µ ≥ 52
Do not reject H0 : µ ≥ 52
64
5250
Probability of type II error:
β = 0.1539
Power
= 1 - β= 0.8461
50.766The probability of correctly rejecting a false null hypothesis is 0.8641
Power of the Test
� Conclusions regarding the power of the test:
1. A one-tail test is more powerful than a two-tail test
2. An increase in the level of significance (α) results in 2. An increase in the level of significance (α) results in an increase in power
3. An increase in the sample size results in an increase in power
Two-Sample Tests
Two-Sample Tests
Population Population Population Population
Population Means,
Independent Samples
Population Means, Related Samples
Population Variances
Group 1 vs. Group 2
Same group before vs. after treatment
Variance 1 vs.Variance 2
Examples:
Population Proportions
Proportion 1 vs. Proportion 2
Difference Between Two Means
Population means, independent
samples
Goal: Test hypothesis or form a confidence interval for the difference between two population means, µ – µ
*population means, µ1 – µ2
The point estimate for the difference is
X1 – X2
σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
Difference Between Two Means: Independent Samples
Population means, independent
samples*
� Different data sources� Unrelated
� Independent� Sample selected from one
population has no effect on the sample selected from the other
Use Sp to estimate unknown σ. Use a Pooled-Variance t test.
σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
Use S1 and S2 to estimate unknown σ1 and σ2. Use a Separate-variance t test
sample selected from the other population
Hypothesis Tests forTwo Population Means
Lower-tail test:
H : µ ≥≥≥≥ µ
Upper-tail test:
H : µ ≤ µ
Two-tail test:
H : µ = µ
Two Population Means, Independent Samples
H0: µ1 ≥≥≥≥ µ2
H1: µ1 < µ2
i.e.,
H0: µ1 – µ2 ≥≥≥≥ 0H1: µ1 – µ2 < 0
H0: µ1 ≤ µ2
H1: µ1 > µ2
i.e.,
H0: µ1 – µ2 ≤ 0H1: µ1 – µ2 > 0
H0: µ1 = µ2
H1: µ1 ≠ µ2
i.e.,
H0: µ1 – µ2 = 0H1: µ1 – µ2 ≠ 0
Two Population Means, Independent Samples
Lower-tail test:
H0: µ1 – µ2 ≥≥≥≥ 0H1: µ1 – µ2 < 0
Upper-tail test:
H0: µ1 – µ2 ≤ 0H1: µ1 – µ2 > 0
Two-tail test:
H0: µ1 – µ2 = 0H1: µ1 – µ2 ≠ 0
Hypothesis tests for µ1 – µ2
1 1 2 1 1 2 1 1 2
αααα αααα/2 αααα/2αααα
-tα -tα/2tα tα/2
Reject H0 if tSTAT < -tα Reject H0 if tSTAT > tα Reject H0 if tSTAT < -tα/2or tSTAT > tα/2
Population means, independent
samples
Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown and assumed equal
Assumptions:
� Samples are randomly andindependently drawn
� Populations are normallydistributed or both samplesizes are at least 30
� Population variances areunknown but assumed equal
*σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
Population means, independent
samples
• The pooled variance is:
• The test statistic is:
(continued)
( ) ( )1)n(n
S1nS1nS
21
2
22
2
112p
−+−
−+−=
()1
*
Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown and assumed equal
• The test statistic is:
• Where tSTAT has d.f. = (n1 + n2 – 2)
*σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
( ) ( )
+
−−−=
21
2p
2121STAT
n
1
n
1S
µµXXt
Population means, independent
samplesThe confidence interval for
µ – µ is:
Confidence interval for µ1 - µ2 with σ1and σ2 unknown and assumed equal
( )
+±−
21
2p/221
n
1
n
1SXX αt
µ1 – µ2 is:
Where tα/2 has d.f. = n1 + n2 – 2
*σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
Pooled-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:
NYSE NASDAQNumber 21 25Number 21 25Sample mean 3.27 2.53Sample std dev 1.30 1.16
Assuming both populations are approximately normal with equal variances, isthere a difference in meanyield (α = 0.05)?
Pooled-Variance t Test Example: Calculating the Test Statistic
( ) ( ) ( )µµXX −−−−−
The test statistic is:
(continued)
H0: µ1 - µ2 = 0 i.e. (µ1 = µ2)H1: µ1 - µ2 ≠ 0 i.e. (µ1 ≠ µ2)
( ) ( ) ( ) ( )1.5021
1)25(1)-(21
1.161251.30121
1)n()1(n
S1nS1nS
22
21
2
22
2
112p =
−+
−+−=
−+−
−+−=
( ) ( ) ( )2.040
25
1
21
15021.1
02.533.27
n
1
n
1S
µµXXt
21
2p
2121=
+
−−=
+
−−−=
Pooled-Variance t Test Example: Hypothesis Test Solution
H0: µ1 - µ2 = 0 i.e. (µ1 = µ2)
H1: µ1 - µ2 ≠ 0 i.e. (µ1 ≠ µ2)
αααα = 0.05
df = 21 + 25 - 2 = 44 t0 2.0154-2.0154
.025
Reject H0 Reject H0
.025
Critical Values: t = ± 2.0154
Test Statistic: Decision:
Conclusion:
Reject H0 at αααα = 0.05
There is evidence of a difference in means.
t
2.040
2.040
25
1
21
15021.1
2.533.27t =
+
−=
Pooled-Variance t Test Example: Confidence Interval for µ1 - µ2
Since we rejected H0 can we be 95% confident that µNYSE
> µNASDAQ?
95% Confidence Interval for µNYSE - µNASDAQ95% Confidence Interval for µNYSE - µNASDAQ
Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ
(((( )))) )471.1,009.0(3628.00154.274.0 n
1
n
1SXX
21
2
p/221 ====××××±±±±====
++++±±±±−−−− ααααt
Population means, independent
samples
Hypothesis tests for µ1 - µ2 with σ1
and σ2 unknown, not assumed equal
Assumptions:
� Samples are randomly andindependently drawn
� Populations are normallydistributed or both samplesizes are at least 30
� Population variances areunknown and cannot beassumed to be equal*
σ1 and σ2 unknown, assumed equal
σ1 and σ2 unknown, not assumed equal
Related PopulationsThe Paired Difference Test
Tests Means of 2 Related Populations
� Paired or matched samples
� Repeated measures (before/after)
� Use difference between paired values:
Related samples
� Eliminates Variation Among Subjects
� Assumptions:
� Both Populations Are Normally Distributed
� Or, if not Normal, use large samples
Di = X1i - X2i
Related PopulationsThe Paired Difference Test
The ith paired difference is Di , where
Related samples
Di = X1i - X2i
The point estimate for the Dn
∑
(continued)
The point estimate for the paired difference population mean µD is D : n
D
D 1ii∑
==
n is the number of pairs in the paired sample
1n
)D(D
S
n
1i
2i
D−
−
=∑
=
The sample standard deviation is SD
� The test statistic for µD is:Paired
samples
µDt D−
=
The Paired Difference Test:Finding tSTAT
n
S
µDt
DSTAT
D−=
� Where tSTAT has n - 1 d.f.
Lower-tail test:
H0: µD ≥≥≥≥ 0H : µ < 0
Upper-tail test:
H0: µD ≤ 0H : µ > 0
Two-tail test:
H0: µD = 0H : µ ≠ 0
Paired Samples
The Paired Difference Test: Possible Hypotheses
H1: µD < 0 H1: µD > 0 H1: µD ≠ 0
αααα αααα/2 αααα/2αααα
-tα -tα/2tα tα/2
Reject H0 if tSTAT < -tα Reject H0 if tSTAT > tα Reject H0 if tSTAT < -tα/2or tSTAT > tα/2
Where tSTAT has n - 1 d.f.
The confidence interval for µD isPaired samples
SD2/αtD ±
The Paired Difference Confidence Interval
1n
)D(D
S
n
1i
2i
D−
−
=∑
=
n2/αtD ±
where
� Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:
Paired Difference Test: Example
Number of Complaints: (2) - (1) Σ DNumber of Complaints: (2) - (1)Salesperson Before (1) After (2) Difference, Di
C.B. 6 4 - 2
T.F. 20 6 -14
M.H. 3 2 - 1
R.K. 0 0 0
M.O. 4 0 - 4
-21
D =Σ Di
n
5 .6 7
1n
)D(DS
2i
D
=
−
−=
∑
= -4.2
� Has the training made a difference in the number of complaints (at the 0.01 level)?
H0: µD = 0H1: µD ≠ 0
Reject
αααα/2
Paired Difference Test: Solution
Reject
αααα/2
- 4.2D =
1 .6 655 .6 7 /
04 .2
n/S
µt
D
S T A TD −=
−−=
−=
D
Test Statistic:
t0.005 = ± 4.604d.f. = n - 1 = 4
αααα/2
- 4.604 4.604
Decision: Do not reject H0
(tstat is not in the reject region)
Conclusion: There is not a significant change in the number of complaints.
αααα/2
- 1.66αααα = .01
Two Population Proportions
Goal: test a hypothesis or form a confidence interval for the difference between two population proportions,
π1 – π2
Population proportions
1 2
The point estimate for the difference is
Assumptions:
n1 π1 ≥ 5 , n1(1- π1) ≥ 5
n2 π2 ≥ 5 , n2(1- π2) ≥ 5
21 pp −
Two Population Proportions
Population proportions
The pooled estimate for the overall proportion is:
In the null hypothesis we assume the
null hypothesis is true, so we assume π1
= π2 and pool the two sample estimates
21
21
nn
XXp
+
+=
overall proportion is:
where X1 and X2 are the number of items of interest in samples 1 and 2
Two Population Proportions
Population proportions
( ) ( )−−− pp ππ
The test statistic for
π1 – π2 is a Z statistic:
(continued)
( ) ( )
+−
−−−=
21
2121STAT
n
1
n
1)p(1p
ppZ
ππ
2
22
1
11
21
21
n
Xp ,
n
Xp ,
nn
XXp ==
+
+=where
Hypothesis Tests forTwo Population Proportions
Population proportions
Lower-tail test:
H : π ≥≥≥≥ π
Upper-tail test:
H : π ≤ π
Two-tail test:
H : π = πH0: π1 ≥≥≥≥ π2
H1: π1 < π2
i.e.,
H0: π1 – π2 ≥≥≥≥ 0H1: π1 – π2 < 0
H0: π1 ≤ π2
H1: π1 > π2
i.e.,
H0: π1 – π2 ≤ 0H1: π1 – π2 > 0
H0: π1 = π2
H1: π1 ≠ π2
i.e.,
H0: π1 – π2 = 0H1: π1 – π2 ≠ 0
Hypothesis Tests forTwo Population Proportions
Population proportions
Lower-tail test:
H0: π1 – π2 ≥≥≥≥ 0H1: π1 – π2 < 0
Upper-tail test:
H0: π1 – π2 ≤ 0H1: π1 – π2 > 0
Two-tail test:
H0: π1 – π2 = 0H1: π1 – π2 ≠ 0
(continued)
1 1 2 1 1 2 1 1 2
αααα αααα/2 αααα/2αααα
-zα -zα/2zα zα/2
Reject H0 if ZSTAT < -Zα Reject H0 if ZSTAT > Zα Reject H0 if ZSTAT < -Zα/2or ZSTAT > Zα/2
Hypothesis Test Example: Two population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
� In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote Yes
� Test at the .05 level of significance
� The hypothesis test is:H0: π1 – π2 = 0 (the two proportions are equal)
H1: π1 – π2 ≠ 0 (there is a significant difference between proportions)
The sample proportions are:
Hypothesis Test Example: Two population Proportions
(continued)
� The sample proportions are:
� Men: p1 = 36/72 = 0.50
� Women: p2 = 35/50 = 0.70
.582122
71
5072
3536
nn
XXp
21
21 ========++++
++++====
++++
++++====
� The pooled estimate for the overall proportion is:
The test statistic for π1 – π2 is:
Hypothesis Test Example: Two population Proportions
(continued)
.025
-1.96 1.96
.025
(((( )))) (((( ))))
11p(1p
ppz 2121
STAT
++++−−−−
−−−−−−−−−−−−====
)
ππππππππ
Reject H0 Reject H0
-1.96 1.96-2.20
Decision: Do not reject H0
Conclusion: There is not significant evidence of a difference in proportions who will vote yes between men and women.
(((( )))) (((( ))))2.20
50
1
72
1.582)(1.582
0.70.50
nnp(1p
21
−−−−====
++++−−−−
−−−−−−−−====
++++−−−− )
Critical Values = ±1.96For αααα = .05
Confidence Interval forTwo Population Proportions
Population proportions
The confidence interval for
π1 – π2 is:
( )2
22
1
11/221
n
)p(1p
n
)p(1pZpp
−+
−±− α
Testing for the Ratio Of Two Population Variances
Tests for TwoPopulation Variances
H0: σ12 = σ2
2
H1: σ12 ≠ σ2
2
H0: σ12 ≤ σ2
2
*Hypotheses FSTAT
S12 / S2
2
F test statistic
H0: σ1 ≤ σ2
H1: σ12 > σ2
2
S12 = Variance of sample 1 (the larger sample variance)
n1 = sample size of sample 1
S22 = Variance of sample 2 (the smaller sample variance)
n2 = sample size of sample 2
n1 –1 = numerator degrees of freedom
n2 – 1 = denominator degrees of freedom
Where:
� The F critical value is found from the F table
� There are two degrees of freedom required: numerator
and denominator
� The larger sample variance is always the numerator
The F Distribution
The larger sample variance is always the numerator
� When
� In the F table,
� numerator degrees of freedom determine the column
� denominator degrees of freedom determine the row
df1 = n1 – 1 ; df2 = n2 – 12
2
2
1
S
SFSTAT ====
Finding the Rejection Region
H0: σ12 = σ2
2
H1: σ12 ≠ σ2
2
H0: σ12 ≤ σ2
2
H1: σ12 > σ2
2
α/2
F0
α
FαReject H0Do not
reject H0
Reject H0 if FSTAT > Fα
F
0
α/2
Reject H0Do not reject H0 Fα/2
Reject H0 if FSTAT > Fα/2
F Test: An Example
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed
on the NYSE & NASDAQ. You collect the following data:
NYSE NASDAQNYSE NASDAQNumber 21 25Mean 3.27 2.53Std dev 1.30 1.16
Is there a difference in the
variances between the NYSE
& NASDAQ at the αααα = 0.05 level?
F Test: Example Solution
� Form the hypothesis test:
H0: σ21 = σ2
2 (there is no difference between variances)
H1: σ21 ≠ σ2
2 (there is a difference between variances)
� Find the F critical value for αααα = 0.05:Find the F critical value for αααα = 0.05:
� Numerator d.f. = n1 – 1 = 21 –1 =20
� Denominator d.f. = n2 – 1 = 25 –1 = 24
� Fα/2 = F.025, 20, 24 = 2.33
� The test statistic is:
256.116.1
30.12
2
2
2
2
1 ===S
SFSTAT
α/2 = .025
H0: σ12 = σ2
2
H1: σ12 ≠ σ2
2
F Test: Example Solution
(continued)
0
α/2 = .025
F0.025=2.33Reject H0Do not
reject H0� FSTAT = 1.256 is not in the rejection
region, so we do not reject H0
� Conclusion: There is not sufficient evidence of a difference in variances at α = .05
F
فصل هفتمفصل هفتم
توزیع خی دوتوزیع خی دو
Chi-Square Test for a Variance or Standard Deviation
� A χ2 test statistic is used to test whether or not the population variance or standard deviation is equal to a specified value:
22 1)S-(n
====χH0: σ
2 = σ02
Copyright ©2011 Pearson Education
Where n = sample size
S2 = sample variance
σ2 = hypothesized population variance
follows a chi-square distribution with d.f. = n - 1
2σ
====STATχ
2
STATχ
Ha: σ2 ≠ σ0
2
Reject H0 if > or if < 2
STATχχχχ 2
2/ααααχχχχ 2
STATχχχχ 2
2/1 ααααχχχχ −−−−
Chi-Square Test For A Variance: An Example
Suppose you have gathered a random sample of size 25 and
obtained a sample standard deviation of s = 7 and want to
do the following hypothesis test:
H0: σ2 = 81
Copyright ©2011 Pearson Education
H0: σ = 81
Ha: σ2 ≠ 81
185.1481
49*24
σ
1)S-(n2
22 ============STATχ
Since < 14.185 < you fail to reject H0401.122
975.0 ====χχχχ 364.392
025.0 ====χχχχ
Contingency Tables
Contingency Tables
� Useful in situations comparing multiple population proportions
Copyright ©2011 Pearson Education
� Used to classify sample observations according to two or more characteristics
� Also called a cross-classification table.
Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female
Copyright ©2011 Pearson Education
Gender: Male vs. Female
� 2 categories for each variable, so this
is called a 2 x 2 table
� Suppose we examine a sample of300 children
Contingency Table Example
Sample results organized in a contingency table:
(continued)
Gender
Hand Preference
Left Rightsample size = n = 300:
Copyright ©2011 Pearson Education
Gender Left Right
Female 12 108 120
Male 24 156 180
36 264 300
120 Females, 12 were left handed
180 Males, 24 were left handed
χ2 Test for the Difference Between Two Proportions
H0: π1 = π2 (Proportion of females who are left
handed is equal to the proportion of
males who are left handed)
H1: π1 ≠ π2 (The two proportions are not the same –
hand preference is not independent
Copyright ©2011 Pearson Education
� If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
� The two proportions above should be the same as the proportion of left-handed people overall
hand preference is not independent
of gender)
The Chi-Square Test Statistic
∑−
=
cells
2
2)(
all e
eo
STAT f
ffχ
The Chi-square test statistic is:
Copyright ©2011 Pearson Education
� where:
fo = observed frequency in a particular cell
fe = expected frequency in a particular cell if H0 is true
(Assumed: each cell in the contingency table has expected
frequency of at least 5)
cells all e
freedom of degree 1 has case 2x 2 thefor 2
STATχ
Decision Rule
The test statistic approximately follows a chi-squared distribution with one degree of freedom
2
STATχ
Copyright ©2011 Pearson Education
χχχχ2222
χ2α
Decision Rule:If , reject H0, otherwise, do not reject H0
0
α
Reject H0Do not reject H0
22
αSTAT χ χ >>>>
Computing the Average Proportion
Here:120 Females, 12
n
X
nn
XXp
21
21 =+
+=
The average proportion is:
Copyright ©2011 Pearson Education
Here:120 Females, 12 were left handed
180 Males, 24 were left handed
i.e., based on all 180 children the proportion of left handers is 0.12, that is, 12%
12.0300
36
180120
2412p ==
+
+=
Finding Expected Frequencies
� To obtain the expected frequency for left handed females, multiply the average proportion left handed (p) by the total number of females
� To obtain the expected frequency for left handed males,
Copyright ©2011 Pearson Education
� To obtain the expected frequency for left handed males, multiply the average proportion left handed (p) by the total number of males
If the two proportions are equal, then
P(Left Handed | Female) = P(Left Handed | Male) = .12
i.e., we would expect (.12)(120) = 14.4 females to be left handed(.12)(180) = 21.6 males to be left handed
Observed vs. Expected Frequencies
Gender
Hand Preference
Left Right
Copyright ©2011 Pearson Education
FemaleObserved = 12
Expected = 14.4
Observed = 108
Expected = 105.6120
MaleObserved = 24
Expected = 21.6
Observed = 156
Expected = 158.4180
36 264 300
Gender
Hand Preference
Left Right
FemaleObserved = 12
Expected = 14.4
Observed = 108
Expected = 105.6120
Observed = 24 Observed = 156
The Chi-Square Test Statistic
Copyright ©2011 Pearson Education
MaleObserved = 24
Expected = 21.6
Observed = 156
Expected = 158.4180
36 264 300
0.7576158.4
158.4)(156
21.6
21.6)(24
105.6
105.6)(108
14.4
14.4)(12
f
)f(fχ
2222
cells all e
2
eo2
=−
+−
+−
+−
=
−= ∑STAT
The test statistic is:
Decision Rule
Decision Rule:If > 3.841, reject H0, otherwise, do not reject H
3.841 d.f. 1 with ; 0.7576 is statistic test The 2
05.0
2 == χχSTAT
2
STATχ
Copyright ©2011 Pearson Education
otherwise, do not reject H0
Here, = 0.7576< = 3.841,
so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = 0.05
χχχχ2222
χ20.05 = 3.841
0
0.05
Reject H0Do not reject H0
2
STATχ
2
05.0χ
� Extend the χ2 test to the case with more than two independent populations:
χ2 Test for Differences Among More Than Two Proportions
H : π = π = … = π
Copyright ©2011 Pearson Education
H0: π1 = π2 = … = πc
H1: Not all of the πj are equal (j = 1, 2, …, c)
The Chi-Square Test Statistic
∑−
=
cells
2
2)(
all e
eo
STAT f
ffχ
The Chi-square test statistic is:
Copyright ©2011 Pearson Education
� Where:
fo = observed frequency in a particular cell of the 2 x c table
fe = expected frequency in a particular cell if H0 is true
(Assumed: each cell in the contingency table has expected
frequency of at least 1)
cells all e
freedom of degrees 1-c 1)-1)(c-(2 has case cx 2 thefor χ2 ====STAT
Computing the Overall Proportion
n
X
nnn
XXXp
c21
c21 =+++
+++=
L
LThe overall proportion is:
� Expected cell frequencies for the c categories
Copyright ©2011 Pearson Education
� Expected cell frequencies for the c categories are calculated as in the 2 x 2 case, and the decision rule is the same:
Where is from the chi-squared distribution with c – 1 degrees of freedom
Decision Rule:If , reject H0, otherwise, do not reject H0
22
αSTAT χ χ >>>>
2
αχ
Chi-Square Test Results
Chi-Square Test: Administrators, Students, Faculty
Admin Students Faculty Total
H0: π1 = π2 = π3
H1: Not all of the πj are equal (j = 1, 2, 3)
Copyright ©2011 Pearson Education
Admin Students Faculty Total
Favor 63 20 37 120
60 30 30
Oppose 37 30 13 80
40 20 20
Total 100 50 50 200
ObservedExpected
0
2
010
2 Hreject so 9.2103 79212 =>=.STAT
χ.χ
χ2 Test of Independence
� Similar to the χ2 test for equality of more than two proportions, but extends the concept to contingency tables with r rows and c columns
Copyright ©2011 Pearson Education
H0: The two categorical variables are independent
(i.e., there is no relationship between them)
H1: The two categorical variables are dependent
(i.e., there is a relationship between them)
χ2 Test of Independence
∑−
=
cells
2
2)(
all e
eo
STAT f
ffχ
The Chi-square test statistic is:
(continued)
Copyright ©2011 Pearson Education
� where:
fo = observed frequency in a particular cell of the r x c table
fe = expected frequency in a particular cell if H0 is true
(Assumed: each cell in the contingency table has expected
frequency of at least 1)
cells all e
freedom of degrees 1)-1)(c-(r has case cr x for the χ 2
STAT
Expected Cell Frequencies
� Expected cell frequencies:
n
total columntotalrow fe
×=
Copyright ©2011 Pearson Education
n
Where:
row total = sum of all frequencies in the row
column total = sum of all frequencies in the column
n = overall sample size
Decision Rule
� The decision rule is
If , reject H0,
otherwise, do not reject H
22
αSTAT χ χ >>>>
Copyright ©2011 Pearson Education
Where is from the chi-squared distribution with (r – 1)(c – 1) degrees of freedom
otherwise, do not reject H0
2
αχ
Example
� The meal plan selected by 200 students is shown below:
ClassStanding
Number of meals per week
Total20/week 10/week none
Copyright ©2011 Pearson Education
Standing
Fresh. 24 32 14 70
Soph. 22 26 12 60
Junior 10 14 6 30
Senior 14 16 10 40
Total 70 88 42 200
Example
� The hypothesis to be tested is:
(continued)
H0: Meal plan and class standing are independent
(i.e., there is no relationship between them)
Copyright ©2011 Pearson Education
(i.e., there is no relationship between them)
H1: Meal plan and class standing are dependent
(i.e., there is a relationship between them)
ClassStanding
Number of meals per week
Total20/wk 10/wk none
Fresh. 24 32 14 70
Soph. 22 26 12 60 Number of meals per week
Observed:
Expected cell frequencies if H0 is true:
Example: Expected Cell Frequencies
(continued)
Copyright ©2011 Pearson Education
Junior 10 14 6 30
Senior 14 16 10 40
Total 70 88 42 200
ClassStanding
per week
Total20/wk 10/wk none
Fresh. 24.5 30.8 14.7 70
Soph. 21.0 26.4 12.6 60
Junior 10.5 13.2 6.3 30
Senior 14.0 17.6 8.4 40
Total 70 88 42 2005.10
200
7030
n
total columntotalrow fe
=×
=
×=
Example for one cell:
Example: The Test Statistic
� The test statistic value is:
cells
2
2
f
)ff(χ
all e
eo
STAT
−= ∑
(continued)
Copyright ©2011 Pearson Education
709048
4810
830
83032
524
52424 222
..
).(
.
).(
.
).(=
−++
−+
−= L
= 12.592 from the chi-squared distribution with (4 – 1)(3 – 1) = 6 degrees of freedom
2
050.χ
Example: Decision and Interpretation
(continued)
Decision Rule:If > 12.592, reject H0, otherwise, do not reject H
12.592 d.f. 6 with ; 7090 is statistictest The 2
050
2 ========.STAT
χ.χ
2
STATχ
Copyright ©2011 Pearson Education
otherwise, do not reject H0
Here, = 0.709 < = 12.592,
so do not reject H0
Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = 0.05
χχχχ2222
χ20.05=12.592
0
0.05
Reject H0Do not reject H0
2
STATχ 2
050.χ
فصل هشتمفصل هشتم
رگرسیون خطیرگرسیون خطی
Correlation vs. Regression
� A scatter plot can be used to show the relationship between two variables
� Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
Copyright ©2011 Pearson Education
between two variables
� Correlation is only concerned with strength of the relationship
� No causal effect is implied with correlation
� Scatter plots were first presented in Ch. 2
� Correlation was first presented in Ch. 3
Introduction to Regression Analysis
� Regression analysis is used to:
� Predict the value of a dependent variable based on the value of at least one independent variable
� Explain the impact of changes in an independent
Copyright ©2011 Pearson Education
Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish topredict or explain
Independent variable: the variable used to predict or explain the dependent variable
Simple Linear Regression Model
� Only one independent variable, X
� Relationship between X and Y is described by a linear function
Copyright ©2011 Pearson Education
� Changes in Y are assumed to be related to changes in X
Types of Relationships
Y Y
Linear relationships Curvilinear relationships
Copyright ©2011 Pearson Education
Y
X
X
Y
X
X
Types of Relationships
Y Y
Strong relationships Weak relationships
(continued)
Copyright ©2011 Pearson Education
Y
X
X
Y
X
X
Types of Relationships
Y
No relationship
(continued)
Copyright ©2011 Pearson Education
Y
X
X
Simple Linear Regression Model
Population Y intercept
Population SlopeCoefficient
Random Error term
Dependent
Independent Variable
Copyright ©2011 Pearson Education
ii10i εXββY ++=Linear component
Dependent Variable
Random Errorcomponent
(continued)
Y
Observed Value of Y for Xi
ii10i εXββY ++=
ε
Simple Linear Regression Model
Copyright ©2011 Pearson Education
Random Error for this Xi value
X
Predicted Value of Y for Xi
Xi
Slope = β1
Intercept = β0
εi
The simple linear regression equation provides an estimate of the population regression line
Simple Linear Regression Equation (Prediction Line)
Estimate of the regression
Estimate of the regression slope
Estimated (or predicted) Y value for
Copyright ©2011 Pearson Education
i10i XbbY +=
the regression intercept
regression slopeY value for observation i
Value of X for observation i
The Least Squares Method
b0 and b1 are obtained by finding the values of
that minimize the sum of the squared
differences between Y and :Y
Copyright ©2011 Pearson Education
2i10i
2ii ))Xb(b(Ymin)Y(Ymin +−=− ∑∑
Finding the Least Squares Equation
� The coefficients b0 and b1 , and other regression results in this chapter, will be found using Excel
Copyright ©2011 Pearson Education
found using Excel
Formulas are shown in the text for those
who are interested
Simple Linear Regression Example
� A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
Copyright ©2011 Pearson Education
� A random sample of 10 houses is selected
� Dependent variable (Y) = house price in $1000s
� Independent variable (X) = square feet
Simple Linear Regression Example: Data
House Price in $1000s(Y)
Square Feet (X)
245 1400
312 1600
279 1700
Copyright ©2011 Pearson Education
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
300
350
400
450
Ho
us
e P
ric
e (
$1
00
0s
)
Simple Linear Regression Example: Scatter Plot
House price model: Scatter Plot
Copyright ©2011 Pearson Education
0
50
100
150
200
250
300
0 500 1000 1500 2000 2500 3000
Ho
us
e P
ric
e (
$1
00
0s
)
Square Feet
350
400
450H
ou
se
Pri
ce
($
10
00
s)
Simple Linear Regression Example: Graphical Representation
House price model: Scatter Plot and Prediction Line
Slope
Copyright ©2011 Pearson Education
0
50
100
150
200
250
300
0 500 1000 1500 2000 2500 3000
Square Feet
Ho
us
e P
ric
e (
$1
00
0s
)
feet) (square 0.10977 98.24833 price house +=
Slope = 0.10977
Intercept = 98.248
Simple Linear Regression Example: Interpretation of bo
� b0 is the estimated average value of Y when the
value of X is zero (if X = 0 is in the range of
feet) (square 0.10977 98.24833 price house +=
Copyright ©2011 Pearson Education
value of X is zero (if X = 0 is in the range of
observed X values)
� Because a house cannot have a square footage
of 0, b0 has no practical application
Simple Linear Regression Example: Interpreting b1
� b1 estimates the change in the average
value of Y as a result of a one-unit
feet) (square 0.10977 98.24833 price house +=
Copyright ©2011 Pearson Education
value of Y as a result of a one-unit
increase in X
� Here, b1 = 0.10977 tells us that the mean value of a
house increases by .10977($1000) = $109.77, on
average, for each additional one square foot of size
(sq.ft.) 0.1098 98.25 price house +=
Predict the price for a house with 2000 square feet:
Simple Linear Regression Example: Making Predictions
Copyright ©2011 Pearson Education
317.85
0)0.1098(200 98.25
=
+=
The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850
450
Simple Linear Regression Example: Making Predictions
� When using a regression model for prediction, only predict within the relevant range of data
Relevant range for interpolation
Copyright ©2011 Pearson Education
0
50
100
150
200
250
300
350
400
0 500 1000 1500 2000 2500 3000
Square Feet
Ho
us
e P
ric
e (
$1
00
0s
)
Do not try to extrapolate
beyond the range of observed X’s
Measures of Variation
� Total variation is made up of two parts:
SSE SSR SST +=Total Sum of Regression Sum Error Sum of
Copyright ©2011 Pearson Education
Total Sum of Squares
Regression Sum of Squares
Error Sum of Squares
∑ −= 2i )YY(SST ∑ −= 2
ii )YY(SSE∑ −= 2i )YY(SSR
where:
= Mean value of the dependent variable
Yi = Observed value of the dependent variable
= Predicted value of Y for the given Xi valueiY
Y
� SST = total sum of squares (Total Variation)
� Measures the variation of the Yi values around their mean Y
� SSR = regression sum of squares (Explained Variation)
(continued)
Measures of Variation
Copyright ©2011 Pearson Education
� SSR = regression sum of squares (Explained Variation)
� Variation attributable to the relationship between X and Y
� SSE = error sum of squares (Unexplained Variation)
� Variation in Y attributable to factors other than X
(continued)
Yi
SST = ∑∑∑∑(Yi - Y)2
SSE = ∑∑∑∑(Yi - Yi )2
∧∧∧∧
_
Y∧∧∧∧
Y
Measures of Variation
Copyright ©2011 Pearson Education
Xi
Y
X
SST = ∑∑∑∑(Yi - Y)
SSR = ∑∑∑∑(Yi - Y)2∧∧∧∧
_
_
Y_Y∧∧∧∧
� The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
Coefficient of Determination, r2
Copyright ©2011 Pearson Education
� The coefficient of determination is also called r-squared and is denoted as r2
1r0 2 ≤≤note:
squares of sum total
squares of sum regression2 ==SST
SSRr
Examples of Approximate r2 Values
Y
r2 = 1
Perfect linear relationship
Copyright ©2011 Pearson Education
r2 = 1
X
Y
X
r2 = 1
Perfect linear relationship between X and Y:
100% of the variation in Y is explained by variation in X
Examples of Approximate r2 Values
Y
0 < r2 < 1
Weaker linear relationships
Copyright ©2011 Pearson Education
X
Y
X
Weaker linear relationships between X and Y:
Some but not all of the variation in Y is explained by variation in X
Examples of Approximate r2 Values
r2 = 0
No linear relationship
Y
Copyright ©2011 Pearson Education
No linear relationship between X and Y:
The value of Y does not depend on X. (None of the variation in Y is explained by variation in X)
Xr2 = 0
Standard Error of Estimate
� The standard deviation of the variation of observations around the regression line is estimated by
∑n
Copyright ©2011 Pearson Education
2
)ˆ(
2
1
2
−
−
=−
=
∑=
n
YY
n
SSES
n
i
ii
YX
WhereSSE = error sum of squares
n = sample size
Comparing Standard Errors
YY
SYX is a measure of the variation of observed Y values from the regression line
Copyright ©2011 Pearson Education
X XYX
S smallYX
S large
The magnitude of SYX should always be judged relative to the size of the Y values in the sample data
i.e., SYX = $41.33K is moderately small relative to house prices in the $200K - $400K range
Assumptions of RegressionL.I.N.E
� Linearity
� The relationship between X and Y is linear
� Independence of Errors
Error values are statistically independent
Copyright ©2011 Pearson Education
� Error values are statistically independent
� Normality of Error
� Error values are normally distributed for any given value of X
� Equal Variance (also called homoscedasticity)
� The probability distribution of the errors has constant variance
Residual Analysis
� The residual for observation i, ei, is the difference between its observed and predicted value
� Check the assumptions of regression by examining the residuals
iii YYe −=
Copyright ©2011 Pearson Education
residuals
� Examine for linearity assumption
� Evaluate independence assumption
� Evaluate normal distribution assumption
� Examine for constant variance for all levels of X (homoscedasticity)
� Graphical Analysis of Residuals
� Can plot residuals vs. X
Residual Analysis for Linearity
Y Y
Copyright ©2011 Pearson Education
Not Linear Linear�
x
resid
ua
ls
x
x x
resid
ua
ls
Residual Analysis for Independence
Not Independent
Independent
X
resid
ua
ls
�
Copyright ©2011 Pearson Education
X
Xresid
ua
ls
resid
ua
lsX
resid
ua
ls
Residual Analysis for Normality
Percent
When using a normal probability plot, normal errors will approximately display in a straight line
100
Copyright ©2011 Pearson Education
Residual-3 -2 -1 0 1 2 3
0
100
Residual Analysis for Equal Variance
Y Y
Copyright ©2011 Pearson Education
Non-constant variance � Constant variance
x x
x x
resid
ua
ls
resid
ua
ls
House Price Model Residual Plot
40
60
80
Simple Linear Regression Example: Excel Residual Output
RESIDUAL OUTPUT
Predicted House Price Residuals
1 251.92316 -6.923162
2 273.87671 38.12329
Copyright ©2011 Pearson Education
-60
-40
-20
0
20
40
0 1000 2000 3000
Square Feet
Re
sid
ua
ls
2 273.87671 38.12329
3 284.85348 -5.853484
4 304.06284 3.937162
5 218.99284 -19.99284
6 268.38832 -49.38832
7 356.20251 48.79749
8 367.17929 -43.17929
9 254.6674 64.33264
10 284.85348 -29.85348
Does not appear to violate
any regression assumptions
� Used when data are collected over time to detect if autocorrelation is present
� Autocorrelation exists if residuals in one
Measuring Autocorrelation:The Durbin-Watson Statistic
Copyright ©2011 Pearson Education
� Autocorrelation exists if residuals in one time period are related to residuals in another period
Autocorrelation
� Autocorrelation is correlation of the errors (residuals) over time
Time (t) Residual Plot
10
15
� Here, residuals show a
Copyright ©2011 Pearson Education
� Violates the regression assumption that residuals are random and independent
-15
-10
-5
0
5
10
0 2 4 6 8
Time (t)
Resid
uals
� Here, residuals show a cyclic pattern, not random. Cyclical patterns are a sign of positive autocorrelation
The Durbin-Watson Statistic
� The Durbin-Watson statistic is used to test for autocorrelation
H0: residuals are not correlated
H1: positive autocorrelation is present
Copyright ©2011 Pearson Education
∑
∑
=
=
−−
=n
1i
2i
n
2i
21ii
e
)ee(
D
� The possible range is 0 ≤ D ≤ 4
� D should be close to 2 if H0 is true
� D less than 2 may signal positive autocorrelation, D greater than 2 may signal negative autocorrelation
H1: positive autocorrelation is present
Testing for Positive Autocorrelation
� Calculate the Durbin-Watson test statistic = D (The Durbin-Watson Statistic can be found using Excel or Minitab)
H0: positive autocorrelation does not exist
H1: positive autocorrelation is present
Copyright ©2011 Pearson Education
(The Durbin-Watson Statistic can be found using Excel or Minitab)
Decision rule: reject H0 if D < dL
0 dU 2dL
Reject H0 Do not reject H0
� Find the values dL and dU from the Durbin-Watson table(for sample size n and number of independent variables k)
Inconclusive
� Suppose we have the following time series data:
140
160
Testing for Positive Autocorrelation (continued)
Copyright ©2011 Pearson Education
� Is there autocorrelation?
y = 30.65 + 4.7038x
R2 = 0.8976
0
20
40
60
80
100
120
0 5 10 15 20 25 30
Tim e
Sa
les
� Example with n = 25:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals 3296.18
y = 30.65 + 4.7038x
R2 = 0.8976
40
60
80
100
120
140
160
Sa
les
Testing for Positive Autocorrelation
(continued)
Excel/PHStat output:
Copyright ©2011 Pearson Education
Difference of Residuals 3296.18
Sum of Squared Residuals 3279.98
Durbin-Watson Statistic 1.00494
0
20
40
0 5 10 15 20 25 30
Tim e
1.004943279.98
3296.18
e
)e(e
Dn
1i
2
i
n
2i
21ii
==
−
=
∑
∑
=
=
−
� Here, n = 25 and there is k = 1 one independent variable
� Using the Durbin-Watson table, dL = 1.29 and dU = 1.45
� D = 1.00494 < dL = 1.29, so reject H0 and conclude that significant positive autocorrelation exists
Testing for Positive Autocorrelation (continued)
Copyright ©2011 Pearson Education
significant positive autocorrelation exists
Decision: reject H0 since
D = 1.00494 < dL
0 dU=1.45 2dL=1.29
Reject H0 Do not reject H0Inconclusive
Inferences About the Slope
� The standard error of the regression slope coefficient (b1) is estimated by
== YXYX SSS
Copyright ©2011 Pearson Education
∑ −==
2i
YXYXb
)X(X
S
SSX
SS
1
where:
= Estimate of the standard error of the slope
= Standard error of the estimate
1bS
2n
SSESYX
−=
Inferences About the Slope: t Test
� t test for a population slope� Is there a linear relationship between X and Y?
� Null and alternative hypotheses� H : β = 0 (no linear relationship)
Copyright ©2011 Pearson Education
� H0: β1 = 0 (no linear relationship)
� H1: β1 ≠ 0 (linear relationship does exist)
� Test statistic
1b
11STAT
S
βbt
−=
2nd.f. −=
where:
b1 = regression slopecoefficient
β1 = hypothesized slope
Sb1 = standarderror of the slope
Inferences About the Slope: t Test Example
House Price in $1000s
(y)
Square Feet (x)
245 1400
312 1600
(sq.ft.) 0.1098 98.25 price house +=
Estimated Regression Equation:
Copyright ©2011 Pearson Education
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
The slope of this model is 0.1098
Is there a relationship between the
square footage of the house and its
sales price?
Inferences About the Slope: t Test Example
H0: β1 = 0
H1: β1 ≠ 0From Excel output:
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039
Copyright ©2011 Pearson Education
Square Feet 0.10977 0.03297 3.32938 0.01039
1bSb1
329383032970
0109770
S
βbt
1b
11
STAT.
.
.=
−=
−=
Inferences About the Slope: t Test Example
Test Statistic: tSTAT = 3.329
d.f. = 10- 2 = 8
H0: β1 = 0
H1: β1 ≠ 0
Copyright ©2011 Pearson Education
There is sufficient evidence
that square footage affects
house price
Decision: Reject H0
Reject H0Reject H0
α/2=.025
-tα/2
Do not reject H0
0tα/2
α/2=.025
-2.3060 2.3060 3.329
d.f. = 10- 2 = 8
Inferences About the Slope: t Test Example
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Copyright ©2011 Pearson Education
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039
p-value
There is sufficient evidence that
square footage affects house price.
Decision: Reject H0, since p-value < α
F Test for Significance
� F Test statistic:
where
MSE
MSRFSTAT =
SSRMSR =
Copyright ©2011 Pearson Education
where
1kn
SSEMSE
k
SSRMSR
−−=
=
where FSTAT follows an F distribution with k numerator and (n – k - 1)denominator degrees of freedom
(k = the number of independent variables in the regression model)
H0: β1 = 0
H1: β1 ≠ 0
α = .05
df1= 1 df2 = 8
Test Statistic:
Decision:
11.08FSTAT ==MSE
MSR
F Test for Significance(continued)
Copyright ©2011 Pearson Education
df1= 1 df2 = 8 Decision:
Conclusion:
Reject H0 at αααα = 0.05
There is sufficient evidence that house size affects selling price0
α = .05
F.05 = 5.32Reject H0Do not
reject H0
Critical Value:
Fαααα = 5.32
F
Confidence Interval Estimate for the Slope
Confidence Interval Estimate of the Slope:
Excel Printout for House Prices:
1b2/1 Sb αt± d.f. = n - 2
Copyright ©2011 Pearson Education
Excel Printout for House Prices:
At 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858)
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Since the units of the house price variable is
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
Confidence Interval Estimate for the Slope (continued)
Copyright ©2011 Pearson Education
Since the units of the house price variable is $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per square foot of house size
This 95% confidence interval does not include 0.
Conclusion: There is a significant relationship between house price and square feet at the .05 level of significance
t Test for a Correlation Coefficient
� Hypotheses
H0: ρ = 0 (no correlation between X and Y)
H1: ρ ≠ 0 (correlation exists)
Copyright ©2011 Pearson Education
� Test statistic
(with n – 2 degrees of freedom)
2n
r1
ρ-rt
2STAT
−
−
=
0 b if rr
0 b if rr
where
12
12
<−=
>+=
t-test For A Correlation Coefficient
Is there evidence of a linear relationship between square feet and house price at the .05 level of significance?
H0: ρ = 0 (No correlation)
(continued)
Copyright ©2011 Pearson Education
H1: ρ ≠ 0 (correlation exists)
α =.05 , df = 10 - 2 = 8
3.329
210
.7621
0.762
2n
r1
ρrt
22STAT =
−
−
−=
−
−
−=
t-test For A Correlation Coefficient
Conclusion:
Decision:Reject H0
3.329
210
.7621
0.762
2n
r1
ρrt
22STAT =
−
−
−=
−
−
−=
(continued)
Copyright ©2011 Pearson Education
Conclusion:There is evidence of a linear association at the 5% level of significance
Reject H0Reject H0
α/2=.025
-tα/2
Do not reject H0
0tα/2
α/2=.025
-2.3060 2.3060
3.329
d.f. = 10-2 = 8
2102n −−
Estimating Mean Values and Predicting Individual Values
YConfidence Interval for the mean of Y, given X
Goal: Form intervals around Y to express uncertainty about the value of Y for a given Xi
Y∧∧∧∧
Copyright ©2011 Pearson EducationXXi
Y = b0+b1Xi
∧∧∧∧
Y, given Xi
Prediction Interval
for an individual Y,given Xi
Confidence Interval for the Average Y, Given X
Confidence interval estimate for the mean value of Y given a particular Xi
XX|Y :µfor interval Confidencei
=
Copyright ©2011 Pearson Education
Size of interval varies according to distance away from mean, X
ihtY YX2/
XX|Y
Sˆ
i
α±
=
∑ −
−+=
−+=
2i
2i
2i
i)X(X
)X(X
n
1
SSX
)X(X
n
1h
Prediction Interval for an Individual Y, Given X
Confidence interval estimate for an Individual value of Y given a particular Xi
= :Yfor interval Confidence XXi
Copyright ©2011 Pearson Education
This extra term adds to the interval width to reflect the added uncertainty for an individual case
ihtY +±
=
1Sˆ YX2/
XXi
α
Estimation of Mean Values: Example
Find the 95% confidence interval for the mean price of 2,000 square-foot houses
∧∧∧∧
Confidence Interval Estimate for µY|X=X i
Copyright ©2011 Pearson Education
Predicted Price Yi = 317.85 ($1,000s)∧∧∧∧
37.12317.85)X(X
)X(X
n
1StY
2i
2i
YX0.025 ±=−
−+±
∑The confidence interval endpoints are 280.66 and 354.90, or from $280,660 to $354,900
Estimation of Individual Values: Example
Find the 95% prediction interval for an individual house with 2,000 square feet
∧∧∧∧
Prediction Interval Estimate for YX=X i
Copyright ©2011 Pearson Education
Predicted Price Yi = 317.85 ($1,000s)∧∧∧∧
102.28317.85)X(X
)X(X
n
11StY
2i
2i
YX0.025 ±=−
−++±
∑The prediction interval endpoints are 215.50 and 420.07, or from $215,500 to $420,070
Finding Confidence and Prediction Intervals in Excel
� From Excel, use
PHStat | regression | simple linear regression …
Copyright ©2011 Pearson Education
� Check the
“confidence and prediction interval for X=”
box and enter the X-value and confidence level desired
فصل نهمفصل نهم
پیش بینیپیش بینی
The Importance of Forecasting
� Governments forecast unemployment rates, interest rates, and expected revenues from income taxes for policy purposes
� Marketing executives forecast demand, sales, and
Copyright ©2011 Pearson Education
� Marketing executives forecast demand, sales, and consumer preferences for strategic planning
� College administrators forecast enrollments to plan for facilities and for faculty recruitment
� Retail stores forecast demand to control inventory levels, hire employees and provide training
Common Approaches to Forecasting
Common Approaches to Forecasting
Quantitative forecasting Qualitative forecasting
Copyright ©2011 Pearson Education
� Used when historical data are unavailable
� Considered highly subjective and judgmental
Causal
Quantitative forecasting methods
Qualitative forecasting methods
Time Series
� Use past data to predict future values
Time-Series Data
� Numerical data obtained at regular time intervals
� The time intervals can be annually, quarterly, monthly, weekly, daily, hourly, etc.
Copyright ©2011 Pearson Education
monthly, weekly, daily, hourly, etc.
� Example:
Year: 2005 2006 2007 2008 2009
Sales: 75.3 74.2 78.5 79.7 80.2
Time-Series Plot
� the vertical axis U.S. Inflation Rate
A time-series plot is a two-dimensional plot of time series data
Copyright ©2011 Pearson Education
the vertical axis measures the variable of interest
� the horizontal axis corresponds to the time periods
U.S. Inflation Rate
0.00
2.00
4.00
6.00
8.00
10.00
12.00
14.00
16.00
19
75
19
77
19
79
19
81
19
83
19
85
19
87
19
89
19
91
19
93
19
95
19
97
19
99
20
01
Year
Infl
ati
on
Ra
te (
%)
Time-Series Components
Time Series
Cyclical Component
Irregular Component
Trend Component
Seasonal Component
Copyright ©2011 Pearson Education
Component ComponentComponent Component
Overall, persistent, long-term movement
Regular periodic fluctuations,
usually within a 12-month period
Repeating swings or
movements over more than one
year
Erratic or residual
fluctuations
Trend Component
� Long-run increase or decrease over time (overall upward or downward movement)
� Data taken over a long period of time
Copyright ©2011 Pearson Education
Sales
Time
Trend Component
� Trend can be upward or downward
� Trend can be linear or non-linear
Sales Sales
(continued)
Copyright ©2011 Pearson Education
Downward linear trend
Sales
Time Upward nonlinear trend
Sales
Time
Seasonal Component
� Short-term regular wave-like patterns
� Observed within 1 year
� Often monthly or quarterly
Copyright ©2011 Pearson Education
Sales
Time (Quarterly)
Winter
Spring
Summer
Fall
Winter
Spring
Summer
Fall
Cyclical Component
� Long-term wave-like patterns
� Regularly occur but may vary in length
� Often measured peak to peak or trough to trough
Copyright ©2011 Pearson Education
trough
Sales
1 Cycle
Year
Irregular Component
� Unpredictable, random, “residual” fluctuations
� Due to random variations of
� Nature
� Accidents or unusual events
Copyright ©2011 Pearson Education
� Accidents or unusual events
� “Noise” in the time series
Does Your Time Series Have A Trend Component?
� A time series plot should help you to answer this question.
Often it helps if you “smooth” the time series
Copyright ©2011 Pearson Education
� Often it helps if you “smooth” the time series data to help answer this question.
� Two popular smoothing methods are moving averages and exponential smoothing.
Smoothing Methods
� Moving Averages
� Calculate moving averages to get an overall impression of the pattern of movement over time
� Averages of consecutive time series values for a
Copyright ©2011 Pearson Education
� Averages of consecutive time series values for a chosen period of length L
� Exponential Smoothing
� A weighted moving average
Moving Averages
� Used for smoothing
� A series of arithmetic means over time
� Result dependent upon choice of L (length of period for computing means)
Copyright ©2011 Pearson Education
period for computing means)
� Last moving average of length L can be extrapolated one period into future for a short term forecast
� Examples: � For a 5 year moving average, L = 5
� For a 7 year moving average, L = 7
� Etc.
Moving Averages
� Example: Five-year moving average
� First average:
(continued)
5
YYYYYMA(5) 54321 ++++
=
Copyright ©2011 Pearson Education
� Second average:
� etc.
5
5
YYYYYMA(5) 65432 ++++
=
Example: Annual Data
Year Sales
1
2
3
4
23
40
25
27
Annual Sales
50
60
Copyright ©2011 Pearson Education
4
5
6
7
8
9
10
11
etc…
27
32
48
33
37
37
50
40
etc…
0
10
20
30
40
1 2 3 4 5 6 7 8 9 10 11
Year
Sa
les
Calculating Moving Averages
Year Sales
1 23
2 40
3 25
Average Year
5-Year Moving Average
3 29.4
4 34.4
5 33.0
5
543213
++++=
322725402329.4
++++=
Copyright ©2011 Pearson Education
� Each moving average is for a consecutive block of 5 years
3 25
4 27
5 32
6 48
7 33
8 37
9 37
10 50
11 40
5 33.0
6 35.4
7 37.4
8 41.0
9 39.4
… …
5
322725402329.4
++++=
etc…
Annual vs. 5-Year Moving Average
50
60
Annual vs. Moving Average
The 5-year moving average smoothes the data and makes it easier to
Copyright ©2011 Pearson Education
0
10
20
30
40
1 2 3 4 5 6 7 8 9 10 11
Year
Sa
les
Annual 5-Year Moving Average
makes it easier to see the underlying trend
Exponential Smoothing
� Used for smoothing and short term forecasting (one period into the future)
Copyright ©2011 Pearson Education
� A weighted moving average
� Weights decline exponentially
� Most recent observation is given the highest weight
Exponential Smoothing
� The weight (smoothing coefficient) is W
� Subjectively chosen
� Ranges from 0 to 1
� Smaller W gives more smoothing, larger W gives
(continued)
Copyright ©2011 Pearson Education
Smaller W gives more smoothing, larger W gives less smoothing
� The weight is:
� Close to 0 for smoothing out unwanted cyclical and irregular components
� Close to 1 for forecasting
Exponential Smoothing Model
� Exponential smoothing model
11 YE =
E)W1(WYE −−+=
Copyright ©2011 Pearson Education
1iii E)W1(WYE −−+=
where:Ei = exponentially smoothed value for period i
Ei-1 = exponentially smoothed value already
computed for period i - 1Yi = observed value in period iW = weight (smoothing coefficient), 0 < W < 1
For i = 2, 3, 4, …
Exponential Smoothing Example
� Suppose we use weight W = 0.2
Time Period
(i)
Sales(Yi)
Forecast from prior
period (Ei-1)
Exponentially Smoothed Value for this period (Ei)
1
2
23
40
--
23
23
(.2)(40)+(.8)(23)=26.4
E1 = Y1
since no prior
Copyright ©2011 Pearson Education
2
3
4
5
6
7
8
9
10
etc.
40
25
27
32
48
33
37
37
50
etc.
23
26.4
26.12
26.296
27.437
31.549
31.840
32.872
33.697
etc.
(.2)(40)+(.8)(23)=26.4
(.2)(25)+(.8)(26.4)=26.12
(.2)(27)+(.8)(26.12)=26.296
(.2)(32)+(.8)(26.296)=27.437
(.2)(48)+(.8)(27.437)=31.549
(.2)(48)+(.8)(31.549)=31.840
(.2)(33)+(.8)(31.840)=32.872
(.2)(37)+(.8)(32.872)=33.697
(.2)(50)+(.8)(33.697)=36.958
etc.
1ii
i
E)W1(WY
E
−−+
=
prior information exists
Sales vs. Smoothed Sales
� Fluctuations have been smoothed
40
50
60
Copyright ©2011 Pearson Education
� NOTE: the smoothed value in this case is generally a little low, since the trend is upward sloping and the weighting factor is only .2
0
10
20
30
40
1 2 3 4 5 6 7 8 9 10Time Period
Sa
les
Sales Smoothed
Forecasting Time Period i + 1
� The smoothed value in the current period (i) is used as the forecast value for next period (i + 1) :
Copyright ©2011 Pearson Education
i1i EY =+
There Are Three Popular Methods For Trend-Based Forecasting
� Linear Trend Forecasting
Copyright ©2011 Pearson Education
� Nonlinear Trend Forecasting
� Exponential Trend Forecasting
Linear Trend Forecasting
Estimate a trend line using regression analysis
Year
Time Period
(X)Sales
(Y)
� Use time (X) as the independent variable:
Copyright ©2011 Pearson Education
(X) (Y)
2004
2005
2006
2007
2008
2009
0
1
2
3
4
5
20
40
30
50
70
65
XbbY 10 +=
In least squares linear, non-linear, andexponential modeling, time periods arenumbered starting with 0 and increasingby 1 for each time period.
Linear Trend Forecasting
The linear trend forecasting equation is:
Sales trend
Year
Time Period
(X)Sales
(Y)
2004 0 20
ii X 9.571421.905Y +=
(continued)
Copyright ©2011 Pearson Education
0
10
20
30
40
50
60
70
80
0 1 2 3 4 5 6
Year
sa
les
2004
2005
2006
2007
2008
2009
0
1
2
3
4
5
20
40
30
50
70
65
Linear Trend Forecasting
� Forecast for time period 6 (2010):
Year
Time Period
(X)Sales
(y)
2004 0 20
(continued)
Sales trend
79.33
(6) 9.571421.905Y
=
+=
Copyright ©2011 Pearson Education
2004
2005
2006
2007
2008
2009
2010
0
1
2
3
4
5
6
20
40
30
50
70
65
??
Sales trend
0
10
20
30
40
50
60
70
80
0 1 2 3 4 5 6
Year
sa
les
Nonlinear Trend Forecasting
� A nonlinear regression model can be used when
the time series exhibits a nonlinear trend
� Quadratic form is one type of a nonlinear model:
Copyright ©2011 Pearson Education
� Compare adj. r2 and standard error to that of
linear model to see if this is an improvement
� Can try other functional forms to get best fit
i2i2i10i XXY ε+β+β+β=
Exponential Trend Model
� Another nonlinear trend model:
i
X
10i εββY i=
Copyright ©2011 Pearson Education
� Transform to linear form:
)εlog()log(βX)βlog()log(Y i1i0i ++=
Exponential Trend Model
� Exponential trend forecasting equation:
i10i XbbYlog( +=)ˆ
where b = estimate of log(β )
(continued)
Copyright ©2011 Pearson Education
where b0 = estimate of log(β0)
b1 = estimate of log(β1)
Interpretation:
%100)1β( 1 ×− is the estimated annual compoundgrowth rate (in %)
Trend Model Selection Using Differences
� Use a linear trend model if the first differences are approximately constant
)YY()YY()Y(Y 1-nn2312 −==−=− L
Copyright ©2011 Pearson Education
� Use a quadratic trend model if the second differences are approximately constant
1-nn2312
)]YY()Y[(Y
)]YY()Y[(Y)]YY()Y[(Y
2-n1-n1-nn
23341223
−−−==
−−−=−−−
L
� Use an exponential trend model if the percentage differences are approximately constant
(continued)
Trend Model Selection Using Differences
Copyright ©2011 Pearson Education
%100Y
)Y(Y%100
Y
)Y(Y%100
Y
)Y(Y
1-n
1-nn
2
23
1
12 ×−
==×−
=×−
L
Autoregressive Modeling
� Used for forecasting
� Takes advantage of autocorrelation
� 1st order - correlation between consecutive values
2nd order - correlation between values 2 periods
Copyright ©2011 Pearson Education
ip-ip2-i21-i10i YAYAYAAY δ+++++= L
� 2nd order - correlation between values 2 periods apart
� pth order Autoregressive model:
Random Error
Autoregressive Model: Example
Year Units
The Office Concept Corp. has acquired a number of office units (in thousands of square feet) over the last eight years. Develop the second order Autoregressive model.
Copyright ©2011 Pearson Education
Year Units
02 4
03 304 205 3 06 2 07 2 08 409 6
Autoregressive Model: Example Solution
Year Yi Yi-1 Yi-2
02 4 -- --
03 3 4 --04 2 3 405 3 2 3
� Develop the 2nd order
table
� Use Excel to estimate a
regression model
Copyright ©2011 Pearson Education
05 3 2 306 2 3 207 2 2 308 4 2 209 6 4 2
Coefficients
Intercept 3.5
X Variable 1 0.8125
X Variable 2 -0.9375
Excel Output
regression model
2i1ii 0.9375Y0.8125Y3.5Y −− −+=
Autoregressive Model Example: Forecasting
Use the second-order equation to forecast number of units for 2010:
Copyright ©2011 Pearson Education
625.4
)0.9375(4)0.8125(63.5
)0.9375(Y)0.8125(Y3.5Y
0.9375Y0.8125Y3.5Y
200820092010
2i1ii
=
−+=
−+=
−+= −−
Autoregressive Modeling Steps
1.Choose p (note that df = n – 2p – 1)
2.Form a series of “lagged predictor” variables
Yi-1 , Yi-2 , … ,Yi-p
Copyright ©2011 Pearson Education
3.Use Excel or Minitab to run regression model using all p variables
4.Test significance of Ap
� If null hypothesis rejected, this model is selected
� If null hypothesis not rejected, decrease p by 1 and repeat
Choosing A Forecasting Model
� Perform a residual analysis
� Eliminate a model that shows a pattern or trend
� Measure magnitude of residual error using squared differences and select the model
Copyright ©2011 Pearson Education
squared differences and select the model with the smallest value
� Measure magnitude of residual error using absolute differences and select the model with the smallest value
� Use simplest model
� Principle of parsimony
Residual Analysis
T T
e e
0 0
Copyright ©2011 Pearson Education
Random errors
Trend not accounted for
Cyclical effects not accounted for
Seasonal effects not accounted for
T T
T T
e e
0 0
Measuring Errors
� Choose the model that gives the smallest measuring errors
� Mean Absolute Deviation (MAD)
� Sum of squared errors (SSE)
Copyright ©2011 Pearson Education
(MAD)
� Less sensitive to extreme observations
(SSE)
� Sensitive to outliers
∑=
−=n
1i
2ii )Y(YSSE
n
YY
MAD
n
1iii∑
=
−
=
Principal of Parsimony
� Suppose two or more models provide a good fit for the data
� Select the simplest model
� Simplest model types:
Copyright ©2011 Pearson Education
� Simplest model types:� Least-squares linear
� Least-squares quadratic
� 1st order autoregressive
� More complex types:� 2nd and 3rd order autoregressive
� Least-squares exponential
� Time series are often collected monthly or quarterly
� These time series often contain a trend component, a seasonal component, and the irregular component
Forecasting With Seasonal Data
Copyright ©2011 Pearson Education
irregular component
� Suppose the seasonality is quarterly
� Define three new dummy variables for quarters:
Q1 = 1 if first quarter, 0 otherwise
Q2 = 1 if second quarter, 0 otherwise
Q3 = 1 if third quarter, 0 otherwise
(Quarter 4 is the default if Q1 = Q2 = Q3 = 0)
فصل دهمفصل دهم
Copyright ©2011 Pearson Education
موزش نرم افزارموزش نرم افزارا ا
spssاجراي برنامه اجراي برنامه اجراي برنامه اجراي برنامه
Copyright ©2011 Pearson Education
منوي اصليمنوي اصليمنوي اصليمنوي اصلي - - - - 2222
� Spss با انتخاب يك . داراي يك منوي اصلي كه مشتمل بر يك سري دستور قابل اجرا است
ن دستور قابل اجراست.دستور با استفاده از موشواره، ا
Copyright ©2011 Pearson Education
File امكان File گزينه طريق از فايلها با كار�
كردن باز جديد، فايل ايجاد .است پذير
ذخيره ها، داده نمايش موجود، فايلهاي
و …و عمليات چاپ فايلها، خروج نهايتا
جمله از ،Exit دستور با spss از
.باشد مي منو اين قابليتهاي
Copyright ©2011 Pearson Education
.باشد مي منو اين قابليتهاي
Edit
جهت جستجو داده، جايگزيني، �
كپـي كردن داده ها، جا به جايــي
در فايلها از دستورات منوي
Edit استفاده مي شود .
Copyright ©2011 Pearson Education
Edit استفاده مي شود .
View
جهت حذف يا نمايش ميله ابزار،
خطوط زمينه در ويرايشگر
داده ها، تغيير قلم و عنوان
مقادير از اين منو استفاده
Copyright ©2011 Pearson Education
مقادير از اين منو استفاده
.مي شود
Data
تعريف جهت شكل مطابق�
نها، مقادير و متغيرها به رفتن ا
تنظيم ،)Case( خاص مورد
وزن و فايلها، تركيب ها، داده
Copyright ©2011 Pearson Education
وزن و فايلها، تركيب ها، داده
مورد اين از موردها، به دادن
. شود مي استفاده
Transform
متغيرهاي محاسبه و ايجاد براي�
مجدد، گذاري كد جديد،
مقاديرمفقود جايگزيني
)Missing( منو اين از …و
Copyright ©2011 Pearson Education
)Missing( منو اين از …و
.ميشود استفاده
Analyze
ماري 16گزينه از �15گزينه اين منو تمامي گزارشهاي ا
ماري توصيفي شامل جداول در مورد داده ها از ا
مار استنباطي توصيفي، ميانگين، انحراف معيار تا ا
شامل ضريب همبستگي، رگرسيون چند متغيره، و
…
Copyright ©2011 Pearson Education
شامل ضريب همبستگي، رگرسيون چند متغيره، و
. از طريق اين منو قابل اجرا است…
Graphs
منو اين از نمودارها انواع رسم جهت�
اي، ميله نمودارهاي .شود مي استفاده
....و پراكنش اي، دايره خطي، ستوني،
Copyright ©2011 Pearson Education
Utilities
جستجوي اطqعات درباره �
متغيرها و فايل ها در اين منو
تعريف سري . امكان پذير است
متغيرها نيز در اين منو انجام مي
Copyright ©2011 Pearson Education
متغيرها نيز در اين منو انجام مي
.شود
Window
براي Windowدستوراتز ا
كوچك كردن پنجره ويرايشگر
.داده ها استفاده مي شود
Copyright ©2011 Pearson Education
Help
انواع راهنمائيها براي كار در �
قسمتهاي مختلف نرم افزار
Spss از اين منو بدست مي ،
Copyright ©2011 Pearson Education
Spss از اين منو بدست مي ،
يد .ا
نوار ابزارنوار ابزارنوار ابزارنوار ابزار ----3333
مطابق شكل زير، ميله ابزار شامل دكمه هايــي است كه براي اجراي برخي �نها استفاده مي spssدستورات
جهت سرعت بخشيدن به عمليات از ا
.شود
Copyright ©2011 Pearson Education
ميله فرمولميله فرمولميله فرمولميله فرمول - - - - 4444
، ميله فرمول spssيكي ديگر از اجزاي پنجره ويرايشگر داده ها در �
است كه در قسمت زيرين ميله ابزار قرار دارد، سطري كه محتويات
در صورت پر بودن سلول، قسمت سمت . سلول فعال را نشان مي دهد
.می دهد راست محتويات سلول را نشان
Copyright ©2011 Pearson Education
نوار پيمايش پنجره نوار پيمايش پنجره نوار پيمايش پنجره نوار پيمايش پنجره - - - - 5555
درنوارهاي افقي وعمودي كه در جهت هاي راست وپائين صفحه ويرايشگر
داده ها قرار دارند ، مثلثهاي كوچكي ديده ميشود كه با حركت دادن
نها مي توان داده ها را حركت داد وقسمتهاي مورد نظر را موشواره روي ا
. روي صفحه نمايش مشاهده نمود
Copyright ©2011 Pearson Education
. روي صفحه نمايش مشاهده نمود
نوار نمايش داده ها ومتغيرهانوار نمايش داده ها ومتغيرهانوار نمايش داده ها ومتغيرهانوار نمايش داده ها ومتغيرها - - - - 6666
Copyright ©2011 Pearson Education
پنجره ويرايشگر داده ها
Copyright ©2011 Pearson Education
:استداراي دو كاربرگ SPSSويرايشگر داده ها در . �
نمايشگر اطqعات1.
نمايشگر متغيرها2.
Copyright ©2011 Pearson Education
نمايشگر متغيرها
ورود داده هاورود داده هاورود داده هاورود داده ها
Copyright ©2011 Pearson Education
تعريف متغيرها
Copyright ©2011 Pearson Education
تعريف متغيرها
Copyright ©2011 Pearson Education
تعريف متغيرها
Copyright ©2011 Pearson Education
تعريف متغيرها
در سمت راست سلول دكمه اي ديده کليکبا
با كليك روي اين دكمه كادر . مي شود
. جديدي ديده مي شود
Copyright ©2011 Pearson Education
تعريف متغيرها
Copyright ©2011 Pearson Education
Align
Copyright ©2011 Pearson Education
تعريف متغيرها
Copyright ©2011 Pearson Education
با . مي شودداده ها نمايانبا كليك روي دكمه كوچك گوشه پائين سمت چپ، صفحه نمايشگر�
: انتخاب اولين سلول، ورود داده ها شروع مي شود
Copyright ©2011 Pearson Education
وارد كردن داده هاوارد كردن داده هاوارد كردن داده هاوارد كردن داده ها
Copyright ©2011 Pearson Education
. اطqعات فرضي مربوط به متغير جنسيت مطابق شكل وارد شده است , درشكل زير�
Copyright ©2011 Pearson Education
:ويرايش داده ها ابعاد مختلف دارد
تغيير و اصqح داده ها 1.
اضافه كردن يك سطر 2.
Copyright ©2011 Pearson Education
اضافه كردن متغير جديد 3.
) مورد ( حذف سطر 4.
) متغير ( حذف ستون 5.
)find(جستجوی داده 6.
)sort(مرتب سازی داده ها 7.
Copyright ©2011 Pearson Education
جهت ايجاد متغير جديد مراحل زير را اجرا
: كنيد
دستور Transformـ از منوي 1
Compute راانتخاب و اجرا كنيد .
Compute) ) ) ) محاسبه داده ها محاسبه داده ها محاسبه داده ها محاسبه داده ها ( ( ( (
Copyright ©2011 Pearson Education
. راانتخاب و اجرا كنيد
Copyright ©2011 Pearson Education
Copyright ©2011 Pearson Education
شمارش داده ها شمارش داده ها شمارش داده ها شمارش داده ها
Copyright ©2011 Pearson Education
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
در سمت چپ كادر از . نام متغير جديد را تايپ كنيد Target Variableدر كادر - 3
منتقل Numeric Variablesليست متغيرها ، متغير مورد نظر را انتخاب و به كادر
.كنيد
Copyright ©2011 Pearson Education
Define گزينه روي - 4 Values شود مي ديده زير شكل . كنيد كليك :
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
به عنوان . مقداري را كه مي خواهيد شمارش شود را تايپ كنيد Valueدر كادر مقابل گزينه - 5
7مثال عدد
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
به كادر Addبا كليك روي دكمه 7عدد
Values to Count منتقل شده
مقابل 5است و مقدار بعدي يعني عدد
تايپ شده است Valueگزينه
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
Add دكمه فشار با و كنيد تايپ ، شود شمارش است قرار كه مقداري هر
.كنيد وارد شمارش كادر در
.كليك كنيد Continueجهت ادامه مراحل روي گزينه -6
Copyright ©2011 Pearson Education
.كليك كنيد Continueجهت ادامه مراحل روي گزينه -6
. كليك كنيد OKروي گزينه - 7
ادامه بحثادامه بحثادامه بحثادامه بحث
با انجام مراحل فوق متغير جديدي به متغيرها اضافه - 8
.شده است
Copyright ©2011 Pearson Education
متغير جديدي است كه نشان دهنده mزير ستون ,در شكل
. ميباشد 7و 5تعداد پاسخهاي هر پاسخگو به كد
ادامه بحثادامه بحثادامه بحثادامه بحث
متغير جديدي است mمتغير ◄كه جهت شمارش تعريف شده
.است
براي 0مطابق شكل ، عدد ◄
Copyright ©2011 Pearson Education
براي 0مطابق شكل ، عدد ◄نشان دهنده 1پاسخگوي شماره
مي 7و 5عدم انتخاب كدهاي باشد
جداول توافقيجداول توافقيجداول توافقيجداول توافقي
براي تهيه جداول فراواني دوبعدي از دستور
Crosstabs اين دستور براي . استفاده كنيد
جداول دو بعدي اسمي و رتبه اياسمي و رتبه اياسمي و رتبه اياسمي و رتبه ايداده هاي
Copyright ©2011 Pearson Education
جداول دو بعدي اسمي و رتبه اياسمي و رتبه اياسمي و رتبه اياسمي و رتبه ايداده هاي
.را ايجادمي كند
جهت استفاده از گزينه
Crosstabs :
دستور Analyzeاز منوي 1.
Copyright ©2011 Pearson Education
Descriptive Statistics را انتخاب و
.كنيدكليك
ادامه بحثادامه بحثادامه بحثادامه بحث
را Crosstabsزير دستور 1.
.انتخاب و اجرا كنيد
Copyright ©2011 Pearson Education
جداول توافقيجداول توافقيجداول توافقيجداول توافقي
Crosstabsپس از اجراي گزينه 1.
ديده Crosstabsكادر مكالمه
.مي شود
با انتخاب و كليك كردن روي نام 2.
نها را به كادرهاي Rowsمتغيرها، ا
Copyright ©2011 Pearson Education
با انتخاب و كليك كردن روي نام
نها را به كادرهاي و Rowsمتغيرها، ا
Columns منتقل نمائيد
ادامه بحثادامه بحثادامه بحثادامه بحث
.كليك كنيد OKروي گزينه . 3
در صورتي كه نياز به جداول سه �
بعدي داشته باشيد، متغير
Copyright ©2011 Pearson Education
مورد نظررا به كادر
Previous منتقل كنيد.
زمونهايزمونهايازمونهايازمونهاياt ا
زمون :وجود دارد tسه نوع ا
1 .t اي يك نمونه
2 .t دو گروه مستقل
Copyright ©2011 Pearson Education
2 .t دو گروه مستقل
3 .t زوجي يا دو گروه وابسته
زمون زمون ازمون ازمون ايك نمونه اي يك نمونه اي يك نمونه اي يك نمونه اي tا
زمون به اين سؤال پاسخ مي دهد كه ميانگين مشاهده شده �اين ا
اين . در مقايسه با مقدار واقعي تفاوت معناداري دارد يا خير
Copyright ©2011 Pearson Education
اين . در مقايسه با مقدار واقعي تفاوت معناداري دارد يا خير
زمون زمون ساده ترين ا
. مي باشد tا
زمون :اي تك نمونه tجهت انجام ا
Compareگزينه Analyzeاز منوي Means را
.كنيد انتخاب وكليك
Copyright ©2011 Pearson Education
.كنيد انتخاب وكليك
Tدستور دستور دستور دستور Test... Sampl–One راكليك
واجراكنيد
زمون زمون ازمون ازمون ايك نمونه اي يك نمونه اي يك نمونه اي يك نمونه اي tا
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
Test T Sampl–Oneپنجره �
. ديده مي شود
نظر را از ليست متغيرهاي كادر متغير مورد�
Copyright ©2011 Pearson Education
نظر را از ليست متغيرهاي كادر متغير مورد
Test(سمت چپ به كادر سمت راست
Variables (منتقل كنيد .
ادامه بحثادامه بحثادامه بحثادامه بحث
.خروجي ديده مي شود. كليك كنيد OKروي گزينه �
زمون قضاوت کنيد sigبا توجه به سطح معني داري �.درمورد ا
Copyright ©2011 Pearson Education
زمون زمون ازمون ازمون ادو گروه مستقلدو گروه مستقلدو گروه مستقلدو گروه مستقل tا
زمون غير وابسته نيز مي خوانندزمون را ا
در اين نوع . اين نوع ا
ماري مستقل، زمون تفاوت بين ميانگينهاي دو جامعه ا
ا
زمون قرار مي گيرد. مورد ا
Copyright ©2011 Pearson Education
زمون قرار مي گيرد. مورد ا
ادامه بحثادامه بحثادامه بحثادامه بحث
دو نمونه تصادفي از دو جامعه را با هم مقايسه مي كنيم تا
نها را معين كنيم. تفاوت يا عدم تفاوت ميانگينهاي ا
Copyright ©2011 Pearson Education
زمون :مستقل tجهت اجراي ا
.كليك كنيد Analyzeروي منوي . 1
.كليك كنيد Compare Meansروي گزينه . 2
Copyright ©2011 Pearson Education
.كليك كنيد Compare Meansروي گزينه . 2
Independent Samples Tمطابق شكل روي دستور . 3
tests كليك كنيد
زمون زمون ازمون ازمون ادو گروه مستقلدو گروه مستقلدو گروه مستقلدو گروه مستقل tا
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
– Independentكادر گـفتگوي. 4
Sample T Testمي شودمشاهده .
Copyright ©2011 Pearson Education
روي . فهرستي از متغير ها در كادر ديده مي شود
ن را درون كادر متغير وابسته كليك كنيد و ا
منتقل Test Variableمقابل با نام
كنيد
ادامه بحث
ادامه بحث
Copyright ©2011 Pearson Education
ادامه بحث
ادامه بحثادامه بحثادامه بحثادامه بحث
روي متغير مستقل كليك كنيد . 5
ن را به كادر و ا
Copyright ©2011 Pearson Education
Grouping Variable منتقل نمائيد.
ادامه بحثادامه بحثادامه بحثادامه بحث
عنوان مثال جنسيت به عنوان به
. متغير مستقل در نظر گرفته شد
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
Define Groupsروي دكمه
مطابق شكل كادر . كليك كنيد
ديده مي شود، اين كادر مشخص
مي كند كه كدام دو گروه در حال
Copyright ©2011 Pearson Education
مي كند كه كدام دو گروه در حال
. مقايسه شدن هستند
ادامه بحثادامه بحثادامه بحثادامه بحث
در مثال موجود متغير مستقل
جنسيت مي باشد كه .زم است
را به درون كادرهاي 2و1كدهاي
.مشخص وارد كنيد
Copyright ©2011 Pearson Education
.مشخص وارد كنيد
زمون . كليك كنيد OKروي گزينه . 5 . مشاهده مي شود tخروجي ا
مي بينيد شكل در كه همانگونه .كنيد كليك Continue دكمه روي .6است شده وارد پرانتز درون مقادير
Copyright ©2011 Pearson Education
Sig مقدارمقدارمقدارمقدار بهبهبهبه ،،،، گروهگروهگروهگروه دودودودو واريانسهايواريانسهايواريانسهايواريانسهاي برابري برابري برابري برابري بررسيبررسيبررسيبررسي برايبرايبرايبراي� ا ا ا زمونزمونزمونزمونا
زمون Sig مقدار اگرمي شودمي شودمي شودمي شود توجهتوجهتوجهتوجه لونلونلونلون 05/0 از كمتر لون ا
از بايد حالت دراين .نيستند برابر جامعه دو واريانسهاي باشد،
Copyright ©2011 Pearson Education
از بايد حالت دراين .نيستند برابر جامعه دو واريانسهاي باشد،
ماره هايكرد استفاده نابرابر واريانسهاي به مربوط ا
زمونزمونازمونازمونازوجيزوجيزوجيزوجي tا
زمون، �براي تشخيص . همبسته يا وابسته نيز مي گويند tبه اين ا
زمون استفاده و انجام تفاوت ميانگين دو گروه وابسته، ازاين ا
.مي شود
Copyright ©2011 Pearson Education
.مي شود
ادامه بحثادامه بحثادامه بحثادامه بحث
Compareگزينه Analyzeاز منوي- �1 Means را انتخاب وكليك كنيد
Copyright ©2011 Pearson Education
زمونزمونازمونازموناا t زوجيزوجيزوجيزوجي
:ديده مي شودزير كليك كنيد، پنجره Samples T test Pairedروي . 2
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
Pairedدو متغير موردنظر را به طور هم زمان انتخاب و به كادر . 3
Variables منتقل كنيد.
خروجي ديده مي شود. كليك كنيد OKروي دكمه . 4
Copyright ©2011 Pearson Education
خروجي ديده مي شود. كليك كنيد OKروي دكمه . 4
تحليل واريانستحليل واريانستحليل واريانستحليل واريانس
مقدار . مجذور انحراف اعداد از ميانگين را واريانس مي گويند�
.واريانس نشان دهنده پراكندگي داده ها از ميانگين است
Copyright ©2011 Pearson Education
Way - - - - Oneتحليل واريانس يك طرفه تحليل واريانس يك طرفه تحليل واريانس يك طرفه تحليل واريانس يك طرفه
ماري روشروشروشروش�ماري اماري اماري ان طيطيطيطي كهكهكهكه ا
نانانا متغيرمتغيرمتغيرمتغير رويرويرويروي مستقلمستقلمستقلمستقل متغيرمتغيرمتغيرمتغير يكيكيكيك تاثيرتاثيرتاثيرتاثير ا
گـفتهگـفتهگـفتهگـفته طرفهطرفهطرفهطرفه يكيكيكيك واريانسواريانسواريانسواريانس تحليلتحليلتحليلتحليل مي شود،مي شود،مي شود،مي شود، بررسيبررسيبررسيبررسي وابستهوابستهوابستهوابسته
Copyright ©2011 Pearson Education
گـفتهگـفتهگـفتهگـفته طرفهطرفهطرفهطرفه يكيكيكيك واريانسواريانسواريانسواريانس تحليلتحليلتحليلتحليل مي شود،مي شود،مي شود،مي شود، بررسيبررسيبررسيبررسي وابستهوابستهوابستهوابسته
....مي شودمي شودمي شودمي شود
ناليز واريانس يكطرفه :براي محاسبه ا
را Compare means، گزينه Analyzeاز منوي . 1
.كليك كنيد
Copyright ©2011 Pearson Education
.كليك كنيد
- One، گزينه اسqيد بعدمطابق . 2 Way AN0VA
.را انتخاب كنيد
ناليز واريانس يكطرفها
Copyright ©2011 Pearson Education
Oneگزينه اجراي با - Way
ANOVA، ديده مقابل پنجره
:مي شود
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
متغير هاي مورد نظر را از ليست سمت . �4
چپ به كادر سمت راست
Dependent List منتقل كنيد .
OKو سپس Contintueروي دكمه
خروجي ديده مي شود. كليك كنيد
Copyright ©2011 Pearson Education
و سپس روي دكمه
خروجي ديده مي شود. كليك كنيد
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك Optionsروي گزينه . 5
- :ديده مي شود مقابلشكل . كنيدو سپس Contintueروي دكمه
OK خروجي ديده مي شود . كليك كنيد .
Copyright ©2011 Pearson Education
OK خروجي ديده مي شود . كليك كنيد .
ناليز واريانس دو طرفهناليز واريانس دو طرفهاناليز واريانس دو طرفهاناليز واريانس دو طرفها- Towا Way Analysis of
Variance((((
درتحليل واريانس دو طرفه، متغير مستقل تغييرات متغير وابسته را درتحليل واريانس دو طرفه، متغير مستقل تغييرات متغير وابسته را درتحليل واريانس دو طرفه، متغير مستقل تغييرات متغير وابسته را درتحليل واريانس دو طرفه، متغير مستقل تغييرات متغير وابسته را �
....تبيين مي كندتبيين مي كندتبيين مي كندتبيين مي كند
Copyright ©2011 Pearson Education
....تبيين مي كندتبيين مي كندتبيين مي كندتبيين مي كند
، متغير وابسته را ، متغير وابسته را ، متغير وابسته را ، متغير وابسته را اياياياي هنگامي كه دو متغير مستقل با مقياس اسمي يا رتبههنگامي كه دو متغير مستقل با مقياس اسمي يا رتبههنگامي كه دو متغير مستقل با مقياس اسمي يا رتبههنگامي كه دو متغير مستقل با مقياس اسمي يا رتبه�
ناليز واريانس دو طرفه تبيين نمايند، ازتبيين نمايند، ازتبيين نمايند، ازتبيين نمايند، ازناليز واريانس دو طرفهاناليز واريانس دو طرفهاناليز واريانس دو طرفهاجهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده ا
Copyright ©2011 Pearson Education
ناليز واريانس دو طرفه تبيين نمايند، ازتبيين نمايند، ازتبيين نمايند، ازتبيين نمايند، ازناليز واريانس دو طرفهاناليز واريانس دو طرفهاناليز واريانس دو طرفهاجهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده جهت محاسبه روابط متغيرها، استفاده ا
....مي شودمي شودمي شودمي شود
ادامه بحث، گزينه Analyzeاز منوي - 1
General Linear Model
. كليك كنيد
Univariateمطابق شكل گزينه . 2
Copyright ©2011 Pearson Education
مطابق شكل گزينه . 2
: انتخاب كنيد
ناليز واريانس دو طرفهناليز واريانس دو طرفهاناليز واريانس دو طرفهاناليز واريانس دو طرفهاا
Copyright ©2011 Pearson Education
مطابق شكل متغير وابسته را به كادر . �4
Dependent Variable منتقل
كنيد و متغيرهاي مستقل را به كادر بعدي
.منتقل نماييد
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك كنيد Modelروي گزينه مدل . 5
:ديده مي شود مقابلشكل
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك كنيد full factorialروي گزينه . 6
.را فشار دهد continueو دكمه
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك كنيد plotsروي گزينه . 7
. ديده مي شود مقابلشكل
متغيرهاي مورد نظر را به كادرهاي مقابل
.را كليك كنيد Addمنتقل كنيد و گزينه
Copyright ©2011 Pearson Education
.را كليك كنيد Addمنتقل كنيد و گزينه
.را فشاردهد continueدكمه
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك optionsروي گزينه . 8
ديده مي شود مقابلكنيد شكل
.
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك كنيد و دكمه Residual plotروي گزينه �
continue را فشار دهد، خروجي ديده مي شود.
Copyright ©2011 Pearson Education
ضريب همبستگي پيرسونضريب همبستگي پيرسونضريب همبستگي پيرسونضريب همبستگي پيرسون
:جهت محاسبه ضريب همبستگي پيرسون مراحل زير را دنبال كنيد
را انتخاب و كليك Correlate، گزينه Analyzeاز منوي . 1
.نمائيد
Copyright ©2011 Pearson Education
.نمائيد
:را كليك كنيد Bivariateمطابق شكل، دستور . 2
ضريب همبستگي پيرسونضريب همبستگي پيرسونضريب همبستگي پيرسونضريب همبستگي پيرسون
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
بعد از اجراي دستور . 3
Bivariate مقابل پنجره
:مشاهده مي شود
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
متغير هاي مورد نظر را به كادر .4
Variable منتقل كنيد.
Pearsonروي گزينه . 5
Copyright ©2011 Pearson Education
Pearsonروي گزينه . 5
.كليك كنيد
ادامه بحثادامه بحثادامه بحثادامه بحث
كليك Optionsروي دكمه . �6
:شکل مقابل ديده مي شود.كنيد
Copyright ©2011 Pearson Education
:شکل مقابل ديده مي شود.كنيد
ادامه بحثادامه بحثادامه بحثادامه بحث
گزينه هاي زير مجموعه . 7
Statistics مت دارqرا ع
)مطابق شكل(.كنيد
Copyright ©2011 Pearson Education
ادامه بحثادامه بحثادامه بحثادامه بحث
.را كليك كنيد Continueدكمه . 8
. را كليك كنيد OKدكمه . 9
.خروجي مشاهده مي شود
Copyright ©2011 Pearson Education
.خروجي مشاهده مي شود
زمون زمون ازمون ازمون اMann -Whitney U Testمن ويتني من ويتني من ويتني من ويتني Uا
زمون ناپارامتريك جهت متغيرهايــي با مقياس اسميزمون ناپارامتريك جهت متغيرهايــي با مقياس اسمييك ازمون ناپارامتريك جهت متغيرهايــي با مقياس اسمييك ازمون ناپارامتريك جهت متغيرهايــي با مقياس اسمييك ا ....رتبه اي مي باشدرتبه اي مي باشدرتبه اي مي باشدرتبه اي مي باشد - - - - يك ا
زمونزمونازمونازمونازمون معادلمعادلمعادلمعادل ويتنيويتنيويتنيويتني منمنمنمن ا
زمونازمونازمونا....مي باشدمي باشدمي باشدمي باشد مستقلمستقلمستقلمستقل گروهگروهگروهگروه دودودودو t پارامتريكپارامتريكپارامتريكپارامتريك ا
Copyright ©2011 Pearson Education
زمونزمونازمونازمونازمون معادلمعادلمعادلمعادل ويتنيويتنيويتنيويتني منمنمنمن ا
زمونازمونازمونا....مي باشدمي باشدمي باشدمي باشد مستقلمستقلمستقلمستقل گروهگروهگروهگروه دودودودو t پارامتريكپارامتريكپارامتريكپارامتريك ا
زمون اين كاربرد حسب بر را گروه دو است قرار كه هنگامي است ا
.كنند مقايسه هم با افراد رتبه
زمون :اجرا كنيد Uمراحل زيررا جهت ا
را NonParametric Testگزينه Analyzeاز منوي . 1
.كليك كنيد
Copyright ©2011 Pearson Education
.كليك كنيد
را انتخاب independent Sample 2مطابق شكل گزينه . 2
.كنيد
زمون زمون ازمون ازمون امن ويتنيمن ويتنيمن ويتنيمن ويتنيUا
Copyright ©2011 Pearson Education
ادامه بحث
2اجراي انتخاب وپس از . 3
independent Sample
. ديده مي شود مقابلپنجره را Mann Whitney Uگزينه.4
Copyright ©2011 Pearson Education
را Mann Whitney Uگزينه.4
.عqمت دار كنيد
ادامه بحث
Defineروي گزينه . 5
Groups كليك كنيد، كادر
و 1كدهاي گروه : ديده مي شود مقابل
. را مقابل كادرهاي هر كدام تايپ كنيد 2
Copyright ©2011 Pearson Education
. را مقابل كادرهاي هر كدام تايپ كنيد 2
زمون زمون ازمون ازمون امن ويتنيمن ويتنيمن ويتنيمن ويتنيUا
براي 2و 1در اينجا متغير جنسيت با توجه به كدگذاري اوليه داراي دو كد
قايان و خانمها مي باشد .ا
خروجي زير . را كليك كنيد OKو سپس Continueدكمه . 7
Copyright ©2011 Pearson Education
خروجي زير . را كليك كنيد OKو سپس Continueدكمه . 7
.ديده مي شود
زمون ويلكاكسون زمون ويلكاكسون ازمون ويلكاكسون ازمون ويلكاكسون اWilcoxon Testا
زمون ناپارامتريك جهت متغيرهايــي با مقياس �ويلكاكسون، ا
زمون، امكان مقايسه قبل و بعد . رتبه اي مي باشداز طريق اين ا
.يك وضعيت تحت تاثير يك متغير امكان پذير است
Copyright ©2011 Pearson Education
.يك وضعيت تحت تاثير يك متغير امكان پذير است
زمون پارامتريك �زمون ويلكاكسون، معادل ا
زمون پارامتريك ا
زمون ويلكاكسون، معادل ا
زمون پارامتريك ا
زمون ويلكاكسون، معادل ا
زمون پارامتريك ا
زمون ويلكاكسون، معادل ا
....زوجي مي باشدزوجي مي باشدزوجي مي باشدزوجي مي باشد tا
Copyright ©2011 Pearson Education
: براي اجراي ويلكاكسون
، گزينه Analyzeاز منوي . 1
Nonparametrice Test را
.كليك كنيد
Copyright ©2011 Pearson Education
.كليك كنيد
2 .- Related Sample 2 را انتخاب
:كنيد
ادامه بحث
:، كادر ديده مي شود2با اجراي گزينه . 3
متغيرهاي مورد نظر را به صورت جفتي به . 4
. منتقل كنيد Test Pairs Listكادر
انتقال به صورت تك متغيري امكان پذير
Copyright ©2011 Pearson Education
. منتقل كنيد كادر
انتقال به صورت تك متغيري امكان پذير
.نمي باشد
.را كليك كنيد Wilcoxonگزينه . 5
.خروجي ديده مي شود. را كليك كنيد OKدكمه . 6
Copyright ©2011 Pearson Education
.خروجي ديده مي شود. را كليك كنيد OKدكمه . 6
زمون كروسكال واليسزمون كروسكال واليسازمون كروسكال واليسازمون كروسكال واليساا
هنگامي كه داده ها در مقياس رتبه اي باشند، جهت مقايسه وضعيت يك �
زمون استفاده مي شودچند گروهچند گروهچند گروهچند گروهمتغير در .، از اين ا
زمون كروسكال واليس�زمون كروسكال واليسازمون كروسكال واليسازمون كروسكال واليسازمونهاي ,,,, ا
زمونهاي معادل تحليل واريانس يك طرفه در ازمونهاي معادل تحليل واريانس يك طرفه در ازمونهاي معادل تحليل واريانس يك طرفه در امعادل تحليل واريانس يك طرفه در ا
Copyright ©2011 Pearson Education
زمون كروسكال واليس�زمون كروسكال واليسازمون كروسكال واليسازمون كروسكال واليسازمونهاي ,,,, ا
زمونهاي معادل تحليل واريانس يك طرفه در ازمونهاي معادل تحليل واريانس يك طرفه در ازمونهاي معادل تحليل واريانس يك طرفه در امعادل تحليل واريانس يك طرفه در ا
....پارامتريك استپارامتريك استپارامتريك استپارامتريك است
ادامه بحث
زمون :براي اجراي ا
، گزينه Analyzeاز منوي . 1
NonParametrice Test را
.انتخاب و كليك كنيد
Copyright ©2011 Pearson Education
.انتخاب و كليك كنيد
K Independentگزينه . 2
Samples راانتخاب كنيد:
ادامه بحث
Kپس از اجراي گزينه . 3
Independent Samples مقابل پنجره
Copyright ©2011 Pearson Education
مقابل پنجره
:ديده مي شود
ادامه بحث
متغير هاي مورد نظر را از كادر سمت چپ . 4
و test Variableبه كادرهاي
Grouping Variable منتقل
.كنيد
Copyright ©2011 Pearson Education
-Kruskalگزينه . 5
Wallish را مارک دار كنيد.
ادامه بحث
Defineروي گزينه . 6
Range كليك كنيد .
مقابل . ديده مي شود مقابلكادر
گزينه حداقل و حداكـثر، كدهاي
Copyright ©2011 Pearson Education
گزينه حداقل و حداكـثر، كدهاي
.مورد استفاده را تايپ كنيد
زمون كاي دوزمون كاي دوازمون كاي دوازمون كاي دواا
هنگامي كه داده هايــي با مقياس اسمي وجود دارد، يكي از معمول ترين �
زمون مي باشدزمونها، ا
.ا
Copyright ©2011 Pearson Education
ادامه بحث
زمون � :براي اجراي اين ا
، Analyzeاز منوي . 1
Descriptive Statisticsگزينه
.را انتخاب و كليك كنيد
Copyright ©2011 Pearson Education
.را انتخاب و كليك كنيد
ادامه بحث
را انتخاب و كليك Crosstabدستور . �با اجراي اين دستور شكل ديده . كنيد
:مي شود
متغيرهاي مورد نظر را به كادرهاي . �3Row وColumn )سطر و ستون (
Copyright ©2011 Pearson Education
Row وColumn )سطر و ستون ( .منتقل كنيد
ادامه بحث
كليك Statisticsروي گزينه . 4
: ديده مي شودمقابل پنجره. كنيد
- Chiروي گزينه . 5 Square
Copyright ©2011 Pearson Education
.كليك كنيد
ادامه بحث
. را انتخاب و كليك كنيد OKو سپس Continueگزينه . 6
.خروجي ديده مي شود
Copyright ©2011 Pearson Education
.خروجي ديده مي شود
زمون �يد، يعني به علت عدم X2ا
زمونهاي ناپارامتري به شمار مي ا
از ا
جهت گيري نمي تواند مشخص كند وضعيت كدام جنسيت بهتر است،
.صرفا متفاوت بودن وضعيت با توجه به نوع متغير مشخص مي شود
Copyright ©2011 Pearson Education
.صرفا متفاوت بودن وضعيت با توجه به نوع متغير مشخص مي شود
زمون �زمون كاربرد ازمون كاربرد ازمون كاربرد ايا رابطه بين X2كاربرد ا
يا رابطه بين اين است كه مشخص مي كند ايا رابطه بين اين است كه مشخص مي كند ايا رابطه بين اين است كه مشخص مي كند ااين است كه مشخص مي كند ا
....دو متغير كيفي تصادفي است يا واقعيدو متغير كيفي تصادفي است يا واقعيدو متغير كيفي تصادفي است يا واقعيدو متغير كيفي تصادفي است يا واقعي
Copyright ©2011 Pearson Education
زمون فريدمنزمون فريدمنازمون فريدمنازمون فريدمناا
زمون kهنگامي كه قرار است متغيرهايــي با مقياس رتبه اي در گروه وابسته ا
زمون استفاده شوند، جهت بررسي تفاوت در گروه هاي وابسته از اين ا
.مي شود
Copyright ©2011 Pearson Education
.مي شود
ادامه بحث
زمون :جهت استفاده از اين ا
.را انتخاب و كليك كنيد NonParametric Testگزينه Analyzeاز منوي . 1
. را كليك كنيد K Ralated Sampleگزينه . 2
Copyright ©2011 Pearson Education
. را كليك كنيد گزينه . 2
زمون فريدمنزمون فريدمنازمون فريدمنازمون فريدمناا
Copyright ©2011 Pearson Education
ادامه بحث
K Ralatedپس از اجراي گزينه . 3
Samplesكادر زير ديده مي شود ،:
متغيرهاي مورد نظر را از كادر سمت چپ به . 4
Test Variableكادر سمت راست
Copyright ©2011 Pearson Education
كادر سمت راست
.منتقل كنيد
.را عqمت دار كنيد Friedmanگزينه . 5
خروجي مشاهده. را كليك كنيد OKدكمه . 6
.مي شود
Copyright ©2011 Pearson Education
زمون كوكرانزمون كوكرانازمون كوكرانازمون كوكراناا
زمون �اگر متغيرهاي مورد بررسي داراي مقياس اسمي باشند، جهت ا
زمون كوكران استفاده كرد گروه وابسته مي توان از kتفاوت بين ا
Copyright ©2011 Pearson Education
زمون كوكران استفاده كرد گروه وابسته مي توان از kتفاوت بين ا
ادامه بحث
گزينه Analyzeاز منوي . 1
Nonparametric Test را
.انتخاب و كليك كنيد
را K Related Samplesگزينه .2
Copyright ©2011 Pearson Education
را گزينه .2
:پنجره مشاهده مي شود. اجرا كنيد
ادامه بحث
متغيرهاي مورد نظر را از كادر سمت . �3
Test Variableچپ به كادر
.منتقل كنيد
را را را را ′′′′Q s Cochran گزينهگزينهگزينهگزينه. �4
Copyright ©2011 Pearson Education
را را را را گزينهگزينهگزينهگزينه. 4
....عqمت دار كنيدعqمت دار كنيدعqمت دار كنيدعqمت دار كنيد
ادامه بحث
.را كليك كنيد OKدكمه . �5
.خروجي مشاهده مي شود �
Copyright ©2011 Pearson Education
زمون كوكران متغير �زمون كوكران متغير توجه داشته باشيد براي استفاده از ازمون كوكران متغير توجه داشته باشيد براي استفاده از ازمون كوكران متغير توجه داشته باشيد براي استفاده از اتوجه داشته باشيد براي استفاده از ا
....مورد نظر بايستي دو بعدي باشدمورد نظر بايستي دو بعدي باشدمورد نظر بايستي دو بعدي باشدمورد نظر بايستي دو بعدي باشد
Copyright ©2011 Pearson Education
زمون مك نمار زمون مك نمار ازمون مك نمار ازمون مك نمار اMc.Nemar Testا
زمون مك نمار جهت مقايسه دو وضعيت كاربرد دارد� .ا
ن است كه متغير بايستي كيفي باشد و دو �زمون ا
ن است كه متغير بايستي كيفي باشد و دو شرط استفاده از اين ا
زمون ا
ن است كه متغير بايستي كيفي باشد و دو شرط استفاده از اين ا
زمون ا
ن است كه متغير بايستي كيفي باشد و دو شرط استفاده از اين ا
زمون ا
شرط استفاده از اين ا
اگر متغير دو مقوله اي نباشد، پيغام خطا ديده ....مقوله داشته باشدمقوله داشته باشدمقوله داشته باشدمقوله داشته باشد
Copyright ©2011 Pearson Education
اگر متغير دو مقوله اي نباشد، پيغام خطا ديده ....مقوله داشته باشدمقوله داشته باشدمقوله داشته باشدمقوله داشته باشد
.مي شود
ادامه بحث
Nonparametric، گزينه Analyzeاز منوي . �1 Test را كليك كنيد.
:ديده مي شود اسqید بعدپنجره . را اجرا كنيد ...Related Samples-2گزينه . �2
Copyright ©2011 Pearson Education
زمون مك نمار زمون مك نمار ازمون مك نمار ازمون مك نمار اMc.Nemar Testا
Copyright ©2011 Pearson Education
ادامه بحث
Testجفت متغير مورد نظر را به كادر . �3
Pair List منتقل كنيد.
.را كليك كنيد McNemarگزينه . �4
خروجي ديده . را كليك كنيد OKدكمه . �5
Copyright ©2011 Pearson Education
خروجي ديده . را كليك كنيد OKدكمه . �5
.مي شود
ضريب همبستگي اسپيرمنضريب همبستگي اسپيرمنضريب همبستگي اسپيرمنضريب همبستگي اسپيرمن
براي محاسبه همبستگي بين دو متغير بر حسب رتبه ها در اين دو �
متغير، به جاي استفاده از ضريب همبستگي پيرسون از ضريب
Copyright ©2011 Pearson Education
متغير، به جاي استفاده از ضريب همبستگي پيرسون از ضريب
.همبستگي اسپيرمن استفاده مي شود
ادامه بحث
گزينه Analyzeاز منوي . 1
Correlate را انتخاب وكليك كنيد.
کادر .را اجرا كنيد Bivariateدستور . 2
.مقابل ديده ميشود
Copyright ©2011 Pearson Education
.مقابل ديده ميشود
ضريب همبستگي اسپيرمنضريب همبستگي اسپيرمنضريب همبستگي اسپيرمنضريب همبستگي اسپيرمن
متغيرهاي مورد نظر را به كادر . �3
Variables منتقل كنيد .
را عqمت دار Spearmanگزينه . �4
. كنيد
Copyright ©2011 Pearson Education
. كنيد
خروجي . را كليك نمائيد OKدكمه .�5
مشاهده مي شود
موفق باشیدموفق باشید
Copyright ©2011 Pearson Education
موفق باشیدموفق باشید