Basic Elements of Testing Hypothesis Dr. M. H. Rahbar Professor of Biostatistics Department of Epidemiology Director, Data Coordinating Center College of Human Medicine Michigan State University


Basic Elements of Testing Hypothesis

Dr. M. H. Rahbar

Professor of Biostatistics

Department of Epidemiology

Director, Data Coordinating Center

College of Human Medicine

Michigan State University


Inferential Statistics

• Estimation: This includes point and interval estimation of certain characteristics in the population(s).

• Testing Hypothesis about population parameter(s) based on the information contained in the sample(s).


Important Statistical Terms

• Population: A set which includes all measurements of interest to the researcher

• Sample: Any subset of the population

• Parameter of interest: The characteristic of interest to the researcher in the population is called a parameter.


Estimation of Parameters

• Point Estimation

• Interval Estimation (Confidence Intervals)

• Bound on the error of estimation (the margin of error)

• The width of a confidence interval is directly related to the bound on the error.


Factors influencing the Bound on the error of estimation

• Narrow confidence intervals are preferred

• As the sample size increases the bound on the error of estimation decreases.

• As the confidence level increases the bound on the error of estimation increases.

• You need to plan a sample size to achieve the desired level of error and confidence.
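The relationships above can be checked numerically. The sketch below assumes the parameter is a proportion and uses the usual large-sample bound z·√(p(1−p)/n); the planning value p = 0.20 and the function names are illustrative assumptions, not part of the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def error_bound(p, n, confidence=0.95):
    """Large-sample bound on the error of estimation for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


def sample_size_for_bound(p, bound, confidence=0.95):
    """Smallest n whose error bound does not exceed the requested bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / bound ** 2)


# Larger samples shrink the bound (assumed planning value p = 0.20).
for n in (100, 400, 1600):
    print(f"n = {n:5d}  bound = {error_bound(0.20, n):.4f}")

# Higher confidence levels widen the bound for a fixed n.
for level in (0.90, 0.95, 0.99):
    print(f"confidence = {level:.2f}  bound = {error_bound(0.20, 100, level):.4f}")

# Planning: sample size needed for a bound of 0.05 at 95% confidence.
print("required n:", sample_size_for_bound(0.20, 0.05))
```

Quadrupling the sample size halves the bound, which is why tightening a confidence interval becomes expensive quickly.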


Testing hypothesis about population parameters

• OR or RR

• Mean = μ

• Standard deviation = σ

• Difference between two population means

• Proportion = p

• Difference between two population proportions

• Incidence


Testing Hypothesis about a Population Prevalence “p”

Suppose the Government reports that the prevalence of hypertension among adults in Pakistan is at most 0.20, but you as a researcher believe that the prevalence is greater than 0.20.

Now we want to formally test these hypotheses.

Null Hypothesis H0: P ≤ 0.20 vs Alternative Hypothesis Ha: P > 0.20


A sample of n = 100 adults is selected from Pakistan. In this sample, 28 adults are hypertensive.

Question: Do the data provide sufficient evidence that the Government’s figure is wrong, i.e., P > 0.20? Test at the 5% level of significance, that is, α = 0.05.

Estimated prevalence = p̂ = 0.28

Hypothesized prevalence = 0.20

Is the gap of 0.08 = 0.28 - 0.20 statistically significant at the 5% level?


Testing hypothesis about P

• We need to calculate a test statistic.

• By how many standard deviations would the sample estimate deviate from 0.20 if the null hypothesis p = 0.20 were true?

$$ Z = \frac{0.28 - 0.20}{\sqrt{(0.20)(1 - 0.20)/100}} = \frac{0.08}{0.04} = 2.0 $$


What is the likelihood of observing Z = 2.0 or a more extreme value if the Government’s figure were correct?

P-value = P[Z > 2.0] ≈ 0.023

How does this p-value compare with α = 0.05?

If p-value < α, then reject the null hypothesis H0 in favor of the alternative hypothesis Ha.

In this situation the p-value is below 0.05, so we reject the Government’s claim in favor of the alternative hypothesis.
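The whole worked example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name is an illustrative choice. The exact standard normal tail area beyond 2.0 is about 0.0228.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(successes, n, p0):
    """One-sided (upper-tail) large-sample z test for a single proportion."""
    p_hat = successes / n
    # The standard error is computed under the null hypothesis p = p0.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail area, because the alternative is P > p0.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


alpha = 0.05
z, p_value = one_sample_proportion_z(successes=28, n=100, p0=0.20)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

With 28 hypertensive adults out of 100, the p-value falls below α = 0.05, so the decision matches the conclusion on the slide.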


Elements of Testing hypothesis

• Null Hypothesis

• Alternative hypothesis

• Level of significance

• Test statistic

• P-value

• Conclusion

• Power of the test


Is there an association between Drinking and Lung Cancer?

What is the most appropriate and feasible study design to test the above research hypothesis?


Case-Control Study of Drinking and Lung Cancer

Null Hypothesis: There is no association between drinking and lung cancer, P1 = P2

Alternative Hypothesis: There is some kind of association between drinking and lung cancer, P1 ≠ P2.


In the following contingency table, estimate the proportion and odds of drinkers among those who develop lung cancer and among those without the disease.

                Lung Cancer
Drinker    Case        Control     Total
Yes        A = 33      B = 27      60
No         C = 1667    D = 2273    3940
Total      1700        2300        4000

P1 = 33/1700

P2 = 27/2300

Odds1 = 33/1667

Odds2 = 27/2273
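The four quantities, and the odds ratio they lead to, follow directly from the cell counts. The sketch below uses the counts from the contingency table; the odds ratio itself is not computed on the slide and is included here as the natural next step for a case-control design.

```python
# Cell counts from the contingency table.
a, b = 33, 27        # drinkers: cases, controls
c, d = 1667, 2273    # non-drinkers: cases, controls

# Proportion of drinkers within each disease group.
p1 = a / (a + c)     # among cases
p2 = b / (b + d)     # among controls

# Odds of drinking within each disease group.
odds1 = a / c
odds2 = b / d

# Odds ratio, equivalent to the cross-product ratio (a * d) / (b * c).
odds_ratio = odds1 / odds2

print(f"P1 = {p1:.4f}, P2 = {p2:.4f}")
print(f"Odds1 = {odds1:.4f}, Odds2 = {odds2:.4f}")
print(f"Odds ratio = {odds_ratio:.2f}")
```

Because the proportions are small, the odds are close to the proportions, and the odds ratio is close to the ratio P1/P2.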


QUESTION: Is there a difference between the proportion of drinkers among cases and controls?

Group 1: Disease

P1 = proportion of drinkers

Group 2: No Disease

P2 = proportion of drinkers
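One common way to answer this question is a two-sided large-sample z test for the difference of two proportions with a pooled standard error. The slides do not say which test is intended, so the choice of test and the function name below are assumptions; the counts are those of the drinking and lung cancer table (33 drinkers among 1700 cases, 27 among 2300 controls).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided pooled z test of H0: P1 = P2 (large-sample approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 both groups share one proportion, estimated by pooling.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value: both tails count as "extreme".
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z(x1=33, n1=1700, x2=27, n2=2300)
print(f"Z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

The alternative hypothesis P1 ≠ P2 is two-sided, which is why the tail area is doubled here.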


Test Statistic

• A statistical yardstick computed from the information contained in the sample, under the assumption that the null hypothesis is true.

• Knowledge of the sampling distribution of the test statistic is needed to determine the likelihood of observing extreme values of the test statistic in a given situation.


P-value

• An indicator which measures the likelihood of observing a value at least as extreme as the one observed in the sample, assuming the null hypothesis is true.

• P-value is also known as the observed level of significance.


The level of significance (α)

• α is known as the nominal level of significance.

• If p-value < α, then we reject the null hypothesis in favor of the alternative hypothesis.

• Most statistical packages report the P-value in their output.

• α needs to be pre-determined. (Usually 5%)


Type I and Type II errors

• Type I error is committed when a true null hypothesis is rejected.

• α is the probability of committing a type I error.

• Type II error is committed when a false null hypothesis is not rejected.

• β is the probability of committing a type II error.


Decision made about the validity of the null hypothesis

Null Hypothesis    Rejected        Not rejected
True               Type I Error    No Error
False              No Error        Type II Error
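Both error probabilities can be computed exactly for a concrete test. The sketch below takes the hypertension setting (n = 100, null prevalence 0.20, one-sided z test at the 5% level) and uses the binomial distribution to find how often the test rejects. The true prevalence of 0.28 used for the type II error is a hypothetical value chosen for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p0, alpha = 100, 0.20, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
se0 = sqrt(p0 * (1 - p0) / n)

# Counts of hypertensive adults for which the z test rejects H0.
rejecting = [k for k in range(n + 1) if (k / n - p0) / se0 > z_crit]


def rejection_probability(true_p):
    """Exact probability that the test rejects H0 when the prevalence is true_p."""
    return sum(binom_pmf(k, n, true_p) for k in rejecting)


type_1 = rejection_probability(0.20)       # H0 true, yet rejected
type_2 = 1 - rejection_probability(0.28)   # H0 false (assumed p = 0.28), not rejected

print("smallest rejecting count:", min(rejecting))
print(f"P(type I error)  = {type_1:.4f}")
print(f"P(type II error) = {type_2:.4f}")
```

The actual type I error rate differs slightly from the nominal 5% because the count of hypertensive adults is discrete and the z test relies on a normal approximation.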


Power of a test

• The power of a test is the probability that a false null hypothesis is rejected.

• Power = 1 - β, where β is the probability of committing a type II error.

• More powerful tests are preferred. At the design stage one should identify the desired level of power in the given situation.


Factors influencing the Power

• The power of a test is influenced by the magnitude of the difference between the null hypothesis and the true parameter.

• The power of a test could be improved by increasing the sample size.

• The power of a test could be improved by increasing α. (This is a very artificial way, since it also raises the probability of a type I error.)
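All three factors can be seen in one formula. The sketch below computes the approximate power of a one-sided z test for a proportion using the normal approximation; the function name and the specific values (null value 0.20, alternatives 0.24 and 0.28) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, p_true, n, alpha=0.05):
    """Approximate power of the upper-tail z test of H0: p <= p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Spread of the sample proportion when the true value is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - NormalDist().cdf((cutoff - p_true) / se_true)


# 1. Larger gap between the null value and the truth gives more power.
print(power_one_sided(0.20, 0.24, 100), power_one_sided(0.20, 0.28, 100))

# 2. Larger sample size gives more power.
print(power_one_sided(0.20, 0.28, 100), power_one_sided(0.20, 0.28, 300))

# 3. Larger alpha gives more power, at the cost of more type I errors.
print(power_one_sided(0.20, 0.28, 100, 0.05), power_one_sided(0.20, 0.28, 100, 0.10))
```

When the true value equals the null value, the function returns α itself, which ties power back to the type I error rate.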


Minimum Required Sample Size

• A sample size calculation formula is available for most well-known study designs. Software packages such as Epi-Info can also be used to calculate the sample size.

• It is extremely important to consult a biostatistician at the design phase to ensure that an adequate sample size is planned for the study.
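As an illustration of such a formula, the sketch below uses the standard normal-approximation sample size for a one-sided test of a single proportion. The formula choice, the function name, and the inputs (null value 0.20, alternative 0.28, 80% power) are assumptions made for this example; other designs need other formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_n(p0, p1, alpha=0.05, power=0.80):
    """Minimum n for a one-sided z test of H0: p <= p0 to detect p = p1."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    # Round up: a fractional participant cannot be recruited.
    return ceil((numerator / (p1 - p0)) ** 2)


print(required_n(0.20, 0.28))               # 80% power
print(required_n(0.20, 0.28, power=0.90))   # higher power needs more subjects
print(required_n(0.20, 0.24))               # smaller difference needs more subjects
```

Under these assumptions, detecting a prevalence of 0.28 against a null value of 0.20 with 80% power needs roughly 170 subjects, more than the 100 in the hypertension example.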


Testing hypothesis about one population mean

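• H0: μ = 16 vs Ha: μ > 16

• Test statistic:

$$ Z = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{SE of the mean}}, \qquad \text{SE of the mean} = \frac{s}{\sqrt{n}} $$

• Under the null hypothesis and when n is large (n > 30), the distribution of Z is approximately standard normal.

• P-value

• Conclusion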
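The same steps can be sketched in code. The slide gives no data for this test, so the sample mean, standard deviation, and sample size below are hypothetical values chosen only to show the arithmetic; the function name is also illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_mean_z(sample_mean, mu0, s, n):
    """Upper-tail large-sample z test of H0: mu = mu0 vs Ha: mu > mu0."""
    se = s / sqrt(n)                    # standard error of the mean
    z = (sample_mean - mu0) / se
    p_value = 1 - NormalDist().cdf(z)   # upper tail, because Ha is mu > mu0
    return z, p_value


# Hypothetical sample: mean 17, standard deviation 3, n = 36 (n > 30).
alpha = 0.05
z, p_value = one_sample_mean_z(sample_mean=17.0, mu0=16.0, s=3.0, n=36)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

For small samples (n ≤ 30) the normal approximation is less reliable, and a t test is the usual alternative.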