Testing Hypothesis
Nutan S. Mishra
Department of Mathematics and Statistics
University of South Alabama
Description of the problem
The population parameter(s) is unknown. Someone (say person A) makes a claim about the value of this unknown parameter.
Another person (say person B) wants to test how valid this claim is.
Person B collects a sample, gathers the sample data, and proceeds to test the claim.
Example
The manufacturer of a certain cereal claims that his boxes contain 16 oz on average. He does not know the true average. He can make this claim because he has set up the filling machines to pour 16 oz of material.
A consumer protection agency has received multiple complaints over a period of time that boxes of this brand contain less than the amount claimed.
So the consumer protection agency wants to test the manufacturer's claim.
To start with, the consumer protection agency thinks that on average the boxes contain less than 16 oz.
Example 1
So there are two parties who make different claims about the population parameter.
The manufacturer says the average = 16 oz. This is the prevailing claim in the market.
The consumer protection agency says the average is less than 16 oz. This is the doubt raised by consumers, so it is the consumer protection agency's responsibility to prove its claim.
The consumer protection agency proceeds to test the validity of the claims.
Notations and definitions
A claim about the population parameter is called a statistical hypothesis.
Example 1: µ = 16 oz
Example 2: µ < 16 oz
The prevailing belief in the market is that the box contains 16 oz of cereal.
The claim about the population parameter which is the prevailing belief is called the null hypothesis.
Example: H0: µ = 16 oz
On the other hand, a claim made by another agency is called the alternative hypothesis.
Example: H1: µ < 16 oz
Sometimes the other claim is made by a researcher in a given problem. Thus the alternative hypothesis is also known as the researcher's hypothesis.
Example 2
Consider the claim of an ambulance company that on average their vehicle reaches the site within 10 minutes, whereas the consumer protection agency has received complaints that they take longer.
X = time taken by an ambulance to reach the site
H0: µ = 10 minutes vs H1: µ > 10 minutes
Example 3
A company manufactures and supplies hex nuts with an average inner diameter of 7.00 mm.
The customer wants to test whether the hex nuts supplied meet the specification.
H0: µ = 7.00 mm
H1: µ ≠ 7.00 mm
Basic philosophy
The burden of proof lies with the agency or person who raises doubts against a prevailing belief.
Example: The consumer protection agency needs to provide enough evidence against the prevailing belief, the null hypothesis H0: µ = 16 oz.
To collect this evidence, the consumer protection agency (the researcher) collects a sample and uses the information contained in the sample to try to disprove the null hypothesis.
Note that the null hypothesis is the target, so the researcher starts with the assumption that the null hypothesis is true.
The procedure for testing a hypothesis is developed under the assumption that the null hypothesis is true.
Definitions
A null hypothesis is a claim about the population parameter that is assumed to be true until it is declared to be false.
An alternative hypothesis is a claim about the population parameter that will be true if the null hypothesis is false.
Method for population mean
• Step 1: Define the null and alternative hypotheses.
• Step 2: Collect a sample of size n.
• Step 3: Compute the sample mean.
• Step 4: Compare the sample mean with the critical value.
• Recall that if we collect different samples, the value of the sample mean will be different.
• And there is always a difference between the sample mean and the population mean.
• This difference occurs due to:
  – Sampling errors (chance errors, which are inherent)
  – Non-sampling errors (due to some assignable cause)
• If the difference is too big, it is easy to make a decision on H0.
• When the difference is small, we need to analyze it carefully.
We want to find out whether the difference between the sample mean and the claimed value of the population mean has occurred just due to chance, or whether there is a systematic cause behind it.
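The point that different samples give different sample means can be illustrated with a small simulation. The sketch below assumes a hypothetical population of box weights that is normal with mean 16 oz and standard deviation 0.1 oz, and samples of size 100. These numbers are illustrative assumptions, not values given in the example.

```python
import random
import statistics

# Hypothetical population (assumed for illustration only).
TRUE_MEAN = 16.0    # true average content in oz
TRUE_SD = 0.1       # population standard deviation in oz
SAMPLE_SIZE = 100   # boxes per sample

def draw_sample_mean(rng, n=SAMPLE_SIZE):
    """Draw one sample of n box weights and return its mean."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    return statistics.fmean(sample)

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_means = [draw_sample_mean(rng) for _ in range(5)]

for i, m in enumerate(sample_means, start=1):
    # The difference here is pure sampling (chance) error,
    # because the population mean really is 16 oz.
    print(f"sample {i}: mean = {m:.4f}, difference from 16 = {m - TRUE_MEAN:+.4f}")
```

Every sample comes from the same population, yet no two sample means agree exactly. The test procedure has to decide whether an observed difference is larger than this kind of chance variation.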
Question: How do we find the critical value?
To find the critical value we need to take into account the two types of errors that may occur in our decision.
Two types of errors
In testing of hypothesis we make a decision on the basis of sample evidence (that is, sample data). Thus we may commit two types of errors in our decision.

                              Actual situation
Our decision        | H0 is true        | H0 is false
Reject H0           | Type I error      | Correct decision
Do not reject H0    | Correct decision  | Type II error
Two types of errors
Type I error = rejecting H0 when H0 is true
Type II error = not rejecting H0 when H0 is false
These two types of errors are measured in terms of probability.
α = P(committing a type I error) = P(rejecting H0 | H0 is true)
α is called the level of significance of the test.
β = P(committing a type II error) = P(not rejecting H0 | H0 is false)
1 − β is called the power of the test.
In our decision process we want to minimize both types of error. For a fixed sample size, these two errors cannot be minimized simultaneously: reducing α increases β.
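The trade-off between α and β can be checked numerically. The following is a minimal sketch for the left-tailed cereal test, assuming a known population standard deviation of 0.1 oz, a sample of 100 boxes, and one particular true mean of 15.98 oz under H1. All three numbers are hypothetical, chosen only to make the calculation concrete.

```python
from math import sqrt
from statistics import NormalDist

MU0 = 16.0       # value claimed under H0
SIGMA = 0.1      # assumed known population standard deviation (hypothetical)
N = 100          # sample size (hypothetical)
MU_TRUE = 15.98  # one specific value of the mean under H1 (hypothetical)

STD_ERROR = SIGMA / sqrt(N)

def critical_value(alpha):
    """Cutoff c for the left-tailed test: reject H0 when the sample mean < c."""
    return MU0 + NormalDist().inv_cdf(alpha) * STD_ERROR

def beta(alpha, mu_true=MU_TRUE):
    """P(not rejecting H0 | true mean is mu_true) = P(sample mean >= c)."""
    c = critical_value(alpha)
    return 1.0 - NormalDist(mu_true, STD_ERROR).cdf(c)

for alpha in (0.10, 0.05, 0.01):
    b = beta(alpha)
    print(f"alpha = {alpha:.2f}  beta = {b:.3f}  power = {1 - b:.3f}")
```

With the sample size held fixed, each smaller choice of α gives a larger β, which is the sense in which the two errors cannot be minimized together.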
Example: H0: µ = 16 vs H1: µ < 16

                              Actual situation
Our decision        | H0 is true        | H0 is false
Reject H0           | Type I error      | Correct decision
Do not reject H0    | Correct decision  | Type II error

Type I error = declaring the product faulty when in fact the manufacturer is producing boxes with an average content of 16 oz.
Type II error = the manufacturer is producing a faulty product, but for lack of sufficient evidence the product is declared OK.
Example
H0: Person is innocent
H1: Person is guilty

                                Actual situation
Our decision          | Person is innocent | Person is guilty
Person is guilty      | Type I error       | Correct decision
Person is not guilty  | Correct decision   | Type II error

Type I error: declaring an innocent person guilty (based on the evidence available).
Type II error: the person has committed the crime but, for lack of evidence, is declared not guilty.
Critical value
How much evidence is enough to declare a person guilty?
How small should the sample mean be in order to reject the manufacturer's claim µ = 16 oz?
Suppose there is a fixed value c such that every sample mean below c leads us to reject the null hypothesis.
This c separates the rejection region from the non-rejection region, and is called the critical value.
In our first example, what should the value of c be? 15.99 or 15.98 or 15.97?
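One way to answer this question is to fix the level of significance α and work out c from it. The sketch below does this for the cereal example using the normal approximation for the sample mean. The standard deviation (0.1 oz), sample size (100) and α (0.05) are assumptions made for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def left_tail_critical_value(mu0, sigma, n, alpha):
    """Return c such that P(sample mean < c | mu = mu0) = alpha."""
    z_alpha = NormalDist().inv_cdf(alpha)  # negative when alpha < 0.5
    return mu0 + z_alpha * sigma / sqrt(n)

# Hypothetical inputs for the cereal example.
c = left_tail_critical_value(mu0=16.0, sigma=0.1, n=100, alpha=0.05)
print(f"critical value c = {c:.4f}")

# Apply the decision rule to the three candidate sample means.
for xbar in (15.99, 15.98, 15.97):
    decision = "reject H0" if xbar < c else "do not reject H0"
    print(f"sample mean {xbar}: {decision}")
```

Under these assumed inputs c falls between 15.98 and 15.99, so a sample mean of 15.99 does not lead to rejection while 15.98 and 15.97 do. Different values of σ, n or α would move c.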
Three types of rejection regions
In Example 1, H0: µ = 16 oz vs H1: µ < 16 oz.
If the value of the sample mean lies to the left of the critical value, we reject H0. This is called a left-sided rejection region; the alternative hypothesis is called left-sided and the test is called a left-tailed test.
In Example 2, H0: µ = 10 minutes vs H1: µ > 10 minutes.
If the value of the sample mean lies to the right of the critical value, we reject H0. This is called a right-sided rejection region; the alternative hypothesis is called right-sided and the test is called a right-tailed test.
In Example 3, H0: µ = 7.00 mm vs H1: µ ≠ 7.00 mm. In this case there are two critical values, c1 and c2.
[Figures: axes for x̄ marking the hypothesized values 16, 10 and 7 and the critical values, with c1 and c2 in Example 3]
Tails of a test
When the alternative hypothesis is of the type µ ≠ µ0, the test is called a two-tailed test because the rejection region lies in both the left tail and the right tail of the distribution of the sample mean.
When the alternative hypothesis is of the type µ > µ0, the test is a right-tailed test because the rejection region lies to the right of the critical value, that is, in the right tail of the curve of the sample mean.
When the alternative hypothesis is of the type µ < µ0, the test is a left-tailed test because the rejection region lies to the left of the critical value, that is, in the left tail of the curve of the sample mean.
Large sample cases
Recall that for large samples, the distribution of x̄ is approximately normal with mean µ and standard deviation σ/√n:
x̄ ~ N(µ, σ/√n), and hence
Z = (x̄ − µ) / (σ/√n) ~ N(0, 1) (this z-value we compute using the sample information)
The idea is that on the normal curve of x̄ we can compare the value of x̄ with the critical value(s), or compare the corresponding z-values on the z-curve.
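The z-value is a one-line computation once x̄, σ and n are known, with µ set to the value claimed under H0. The sketch below uses hypothetical sample information for the cereal example (x̄ = 15.98 oz, σ = 0.1 oz, n = 100) and compares the result with the z-value that cuts off an assumed α = 0.05 in the left tail.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(xbar, mu0, sigma, n):
    """Standardize the sample mean: z = (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Hypothetical sample information.
z = z_statistic(xbar=15.98, mu0=16.0, sigma=0.1, n=100)

# z-value cutting off area alpha in the left tail of the z-curve.
alpha = 0.05
z_alpha = NormalDist().inv_cdf(alpha)

print(f"computed z = {z:.3f}, critical z = {z_alpha:.3f}")
print("reject H0" if z < z_alpha else "do not reject H0")
```

Comparing z with z_alpha on the z-curve gives the same decision as comparing x̄ with c on the curve of the sample mean, because the standardization preserves order.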
Rejection region of a left-tailed test
A test is left-tailed if the alternative is of the form H1: µ < µ0.
For example, H0: µ = 16 oz vs H1: µ < 16 oz.
Let α be the level of significance; the rejection region is then the area α in the left tail of the curve, to the left of the critical value c.
If the value of the sample mean lies to the left of c, reject the null hypothesis.
Equivalently, if p-value < α, reject the null hypothesis.
Rejection region of a right-tailed test
A test is right-tailed if the alternative is of the form H1: µ > µ0.
For example, H0: µ = 10 minutes vs H1: µ > 10 minutes.
Let α be the level of significance; the rejection region is then the area α in the right tail of the curve, to the right of the critical value c.
If the value of the sample mean lies to the right of c, reject the null hypothesis.
Equivalently, if p-value < α, reject the null hypothesis.
Rejection region of a two-tailed test
A test is two-tailed if the alternative is of the form H1: µ ≠ µ0.
For example, H0: µ = 7 mm vs H1: µ ≠ 7 mm.
Let α be the level of significance; the rejection region is then split between the two tails, with area α/2 to the left of the lower critical value c2 and area α/2 to the right of the upper critical value c1.
If the value of the sample mean lies to the right of c1 or to the left of c2, reject the null hypothesis.
Equivalently, if p-value < α, reject the null hypothesis.
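The p-value rule for the three kinds of test can be sketched in one small function based on the large-sample z-value. The function name, the `tail` argument and all sample figures below (sample means, standard deviations, sample sizes) are hypothetical and are used only to exercise the three cases from Examples 1 to 3.

```python
from math import sqrt
from statistics import NormalDist

def p_value(xbar, mu0, sigma, n, tail):
    """Large-sample p-value for H0: mu = mu0.

    tail is "left" (H1: mu < mu0), "right" (H1: mu > mu0)
    or "two" (H1: mu != mu0).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)                       # area to the left of z
    if tail == "right":
        return 1.0 - std_normal.cdf(z)                 # area to the right of z
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))    # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right' or 'two'")

ALPHA = 0.05  # assumed level of significance

# Hypothetical sample results for the three examples.
cases = [
    ("cereal (left-tailed)", p_value(15.98, 16.0, 0.1, 100, "left")),
    ("ambulance (right-tailed)", p_value(10.5, 10.0, 3.0, 36, "right")),
    ("hex nuts (two-tailed)", p_value(7.02, 7.00, 0.1, 100, "two")),
]
for name, p in cases:
    decision = "reject H0" if p < ALPHA else "do not reject H0"
    print(f"{name}: p-value = {p:.4f} -> {decision}")
```

The only difference between the three cases is which tail area is counted; the decision rule, reject when p-value < α, is the same throughout.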
About p-value
We have seen that α and the p-value are areas under the curve corresponding to the critical value c and the computed value of the sample mean x̄, respectively.
To make a decision we can compare either the areas or the values.
For a left-tailed test, α < p-value is equivalent to c < x̄ (in which case H0 is not rejected).
Note that c and x̄ can be transformed into the corresponding z-values.
Thus instead of comparing areas on the x̄-curve we may compare the corresponding areas on the z-curve.
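That comparing values and comparing areas lead to the same decision can be checked directly. The sketch below does so for a left-tailed test over a handful of sample means, assuming µ0 = 16 oz, σ = 0.1 oz, n = 100 and α = 0.05; these inputs and the candidate sample means are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup for a left-tailed test of H0: mu = 16.
MU0, SIGMA, N, ALPHA = 16.0, 0.1, 100, 0.05
STD_ERROR = SIGMA / sqrt(N)

# Critical value: the point with area ALPHA to its left under H0.
c = MU0 + NormalDist().inv_cdf(ALPHA) * STD_ERROR

def decisions(xbar):
    """Return (reject by value rule, reject by area rule) for one sample mean."""
    p = NormalDist(MU0, STD_ERROR).cdf(xbar)  # area to the left of xbar under H0
    return (xbar < c, p < ALPHA)

results = {xbar: decisions(xbar) for xbar in (15.95, 15.97, 15.98, 15.99, 16.00, 16.02)}

for xbar, (by_value, by_area) in results.items():
    print(f"xbar = {xbar:.2f}: xbar < c is {by_value}, p-value < alpha is {by_area}")
```

For every sample mean tried, the two rules agree, which is why either comparison may be used in practice.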
Exercise 9.9
a. X = hours spent working per week by students
   H0: µ = 20 hrs vs H1: µ ≠ 20 hrs
b. X = number of hours a bank's ATM was out of service per month
   H0: µ = 10 hrs vs H1: µ > 10 hrs
c. X = length of experience of a security guard
   H0: µ = 3 years vs H1: µ ≠ 3 years
d. X = credit card debt of a college senior
   H0: µ = $1000 vs H1: µ < $1000