STATISTICAL HYPOTHESIS TESTING BY Dr. K.R. SUNDARAM Professor & Head Department of Biostatistics All India Institute of Medical Sciences New Delhi-110029 Workshop on “Essentials of Epidemiology and Research Methods” October 8-12 , 2003, Surajkund,Faridabad


STATISTICAL HYPOTHESIS TESTING

BY

Dr. K.R. SUNDARAM Professor & Head

Department of Biostatistics All India Institute of Medical Sciences

New Delhi-110029

Workshop on “Essentials of Epidemiology and Research Methods”

October 8-12 , 2003, Surajkund,Faridabad

STATISTICAL METHODS

(A) Descriptive methods

(B) Inference methods

(A) Descriptive Methods: Statistical methods used for describing (summarizing) the collected data:

Statistical tables,

Diagrams and graphs,

Computation of averages, location parameters, proportions and percentages, deviation measures, correlation measures and regression analysis.

(B) Inference Methods:

Statistical methods used for making inferences (generalizations) from the results obtained from the sample to the population from which the sample was selected.

Two important questions raised in scientific studies

(A) How reliable are the results obtained? ESTIMATION

(B) How probable is it that the differences between the observed results and those expected on the basis of the hypothesis have been produced by chance alone? TEST OF STATISTICAL SIGNIFICANCE (by computing the chance element)

Important terms / concepts concerned with Statistical Inference:

Standard Error

Confidence Interval

Null Hypothesis

Alternate Hypothesis

Type-I error (level of significance / 'p' value / 'α' value)

Type-II (β) error

Probability and Probability distributions or Statistical distributions (Normal, Binomial, Poisson etc.)

Test Statistic (Test Criterion)

Critical Ratio and Decision making

Notations used:

Statistical figure        Population   Sample
Number of subjects        N            n
Value of observation      -            X
Mean                      M (μ)        m (X̄)
Proportion                P            p
Standard deviation        σ            s
Variance                  σ²           s²
Correlation coefficient   ρ            r

Concept Of Standard Error (SE) 

Standard Deviation (SD):

Average amount of deviation of the different sample values from the mean value.

SD = SQRT( Σ(X - m)² / n )

X = sample value, n = sample size, m = mean value in the sample

Standard Error (SE):

Average amount of deviation of the different sample mean values from the population (true) mean value.

SE = SQRT( Σ(m - μ)² / r )

(μ = grand (combined) mean = estimate of the population mean, r = number of samples)

Computation of SE using the above formula is difficult and may not be feasible. Hence, SE is usually computed from one randomly selected sample of adequate size, as follows:-

SE = SD / SQRT(n)
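As a minimal sketch of the two formulas above, the following Python computes the SD (with divisor n, as written on the slide) and then SE = SD / SQRT(n). The sample values and the function names are hypothetical, chosen only for illustration.

```python
import math


def standard_deviation(values):
    """SD with divisor n, following the formula on the slide."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((x - m) ** 2 for x in values) / n)


def standard_error(sd, n):
    """SE of the mean estimated from a single sample: SD / SQRT(n)."""
    return sd / math.sqrt(n)


# Hypothetical systolic BP readings, for illustration only
sample = [118, 122, 125, 115, 120]
sd = standard_deviation(sample)
se = standard_error(sd, len(sample))
print(round(sd, 2), round(se, 2))
```

Many statistical packages divide by n - 1 rather than n when the SD is estimated from a sample, so results may differ slightly for small n.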


Probability

Relative frequency, or probable chance of occurrence, with which an event is expected to occur on average in the long run.

Relative frequency of the number of occurrences of a favorable event to the total number of occurrences of all possible events.

No conclusion can be drawn with 100 % certainty (confidence).

Probability is the measurement of the chance / uncertainty / subjectivity associated with a conclusion.

Two Types of Probability:-

( A ) Mathematical

( B ) Statistical  

(A) Mathematical probability:

An experiment or a trial where the probabilities of occurrence of the various events / possibilities are already established mathematically.

Examples:

(1) Prob. of getting a head when a coin is tossed

(2) Prob. of getting a five when a die is thrown

(3) Prob. of getting the ace of spades from a deck of cards

(B) Statistical / Empirical Probability:

An experiment or a trial is required to find out the probabilities of occurrence of the various events / possibilities.

Examples:

(1) Prob. of getting a boy in the first pregnancy

(2) Prob. of getting twins for a couple

(3) Prob. of improvement after treatment for a specified period

(4) Prob. of getting lung cancer in smokers

(5) Prob. of an association of sedentary type of work with diabetes

(6) Prob. that drug-A is better than drug-B in curing a disease

Probability Distributions

Several basic theorems underlie the computation of the different types of probabilities.

The series of probabilities associated with the various occurrences / outcomes / possibilities of events in an experiment / trial / study generates a probability distribution.

Basically, there are three types of probability distributions: Binomial, Poisson and Normal.

Binomial and Poisson distributions: for discrete variables

Normal distribution: for continuous variables

The most important probability distribution in statistical inference is the Normal (Gaussian) distribution.

The Normal distribution generates a Normal (Gaussian) curve.

Normal Curve


Properties of Normal Curve:

(1) It is bell shaped and symmetrical

(2) The three types of averages (the mean, the median and the mode) are equal; in sample data they will be almost equal

(3) The total area under the normal curve is equal to 1

(4) Fifty percent of the values lie to the left of the perpendicular drawn at the middle (the mean) and the remaining 50 % lie to the right of this line

(5) Mean - 1 SD to Mean + 1 SD will include about 68 % of the values

(6) Mean - 2 SD to Mean + 2 SD will include about 95 % of the values

(7) Mean - 3 SD to Mean + 3 SD will include about 99.7 % of the values

(8) Theoretically, the curve touches the horizontal line only at infinity

(9) (Sample value - Mean) / SD, which is called the Standard Normal Deviate or Z-score, is distributed with a mean of 0 and an SD of 1, whatever the variable may be.

This is a very important property. Inference theory is based on this property.
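The area properties and the Z-score can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `area_within` and `z_score` are illustrative, not part of the workshop material.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
std_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k):
    """Area under the standard normal curve between -k SD and +k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def z_score(value, mean, sd):
    """Standard normal deviate: (value - mean) / SD."""
    return (value - mean) / sd


for k in (1, 2, 3):
    # Roughly 68 %, 95 % and 99.7 % of the area
    print(k, round(area_within(k), 4))

# A reading of 130 in a population with mean 120 and SD 10 lies 1 SD above the mean
print(z_score(130, 120, 10))
```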

Estimation of Population Parameters

Two types of Estimation

(1) Point estimation (Estimation without Confidence)

Values of the mean, proportion, correlation coefficient etc. computed from the sample serve as estimates of the population parameters.

Such an estimate is a single value and is called a Point estimate.

(2) Interval estimation (Estimation with Confidence)

A lower limit (LL) and an upper limit (UL) are computed from the sample values.

It can be said with a certain amount of confidence that the population value (true value) of the parameter will lie within these limits.

These limits are called Confidence limits or Interval estimates.

The LL and UL estimates for the Population mean are given as:

mean - C * SE and mean + C * SE

C = Confidence coefficient, SE = SD / SQRT(n), n = sample size (* = multiplication sign)

If 95 % confidence is desired, C = 1.96; for 99 % confidence, C = 2.58; for 99.9 % confidence, C = 3.29

Example-1: In a study of a sample of 100 subjects, the mean systolic blood pressure was found to be 120 mm Hg with a standard deviation of 10 mm Hg. Find the 95 % confidence limits for the population mean of systolic blood pressure.

SE = SD / SQRT(n) = 10 / SQRT(100) = 10 / 10 = 1

LL: mean - 1.96 * 1 = 120 - 1.96 = 118.04

UL: mean + 1.96 * 1 = 120 + 1.96 = 121.96

i.e., the population mean value of systolic blood pressure will lie between 118.04 and 121.96 mm Hg, and we can make this statement with 95 % confidence.
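The arithmetic of Example-1 can be reproduced with a few lines of Python. This is a sketch only; the function name `confidence_limits_mean` is hypothetical.

```python
import math


def confidence_limits_mean(mean, sd, n, c=1.96):
    """Return (LL, UL) = mean -/+ C * SE, with SE = SD / SQRT(n)."""
    se = sd / math.sqrt(n)
    return mean - c * se, mean + c * se


# Example-1: n = 100, mean SBP = 120 mm Hg, SD = 10 mm Hg, 95 % confidence
ll, ul = confidence_limits_mean(mean=120, sd=10, n=100, c=1.96)
print(round(ll, 2), round(ul, 2))  # 118.04 121.96
```

Passing c=2.58 or c=3.29 gives the 99 % and 99.9 % limits respectively.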


Example-2: In a study of 10,000 persons in a town, 100 of them were found to be affected by tuberculosis. Find the 99 % confidence limits for the population prevalence rate.

SE = SQRT( pq / n ), where p = (100 / 10000) * 100 = 1 % and q = 100 - p = 100 - 1 = 99 %

SE = SQRT( (1 * 99) / 10000 ) = 0.0995

LL = p - 2.58 * 0.0995 = 1 - 0.2567 = 0.7433 = 0.74 %

UL = p + 2.58 * 0.0995 = 1 + 0.2567 = 1.2567 = 1.26 %

i.e., the population prevalence rate of tuberculosis will lie between 0.74 % and 1.26 %, and we can say this with 99 % confidence.
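The same calculation for a percentage, as in Example-2, might be sketched as follows. The function name is hypothetical; p and q are kept in percent, as on the slide.

```python
import math


def confidence_limits_percent(cases, n, c=2.58):
    """Return (LL, UL) in percent, using SE = SQRT(p * q / n) with p, q in percent."""
    p = 100.0 * cases / n
    q = 100.0 - p
    se = math.sqrt(p * q / n)
    return p - c * se, p + c * se


# Example-2: 100 tuberculosis cases among 10,000 persons, 99 % confidence
ll, ul = confidence_limits_percent(cases=100, n=10000, c=2.58)
print(round(ll, 2), round(ul, 2))  # 0.74 1.26
```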


Statistical Hypothesis

A declarative statement about the parameters (of population) or the distribution form of the variable in the population.

Examples

1. Mean systolic blood pressure (M) in normal subjects of 30 years of age in the population is equal to 120 mm Hg, i.e. M = 120.

2. Mean cholesterol value in hypertension patients (M1) is greater than the mean cholesterol value in normals (M2), i.e. M1 > M2.

3. Percent of babies born with low birth weight to anaemic women (P1) is greater than that in normal women (P2), i.e. P1 > P2.

4. Occurrence of lung cancer is associated with smoking.

5. Birth weights of children are normally distributed.

Null Hypothesis (Ho)

No difference in average values or percentages between two or several populations.

Examples:

(1) Mean cholesterol value in hypertension patients (M1) = Mean cholesterol value in normals (M2)

(2) Percentage of babies born with low birth weight in anaemic women (P1) = Percentage of babies born with low birth weight in normal women (P2)

(3) No association between lung cancer and smoking

Alternate Hypothesis (H1): two sided

There is a difference in average values or percentages between two or several populations: M1 ≠ M2, P1 ≠ P2

Alternate Hypothesis (H1): one sided

M1 > M2 or M2 > M1

P1 > P2 or P2 > P1

Examples:

(1) Mean cholesterol value in hypertension patients (M1) > Mean cholesterol value in normals (M2)

(2) Percentage of babies born with low birth weight in anaemic women (P1) > Percentage of babies born with low birth weight in normal women (P2)

(3) There is an association between lung cancer and smoking: the prevalence of lung cancer is higher in smokers than in non-smokers

TYPE - I & TYPE- II ERRORS

Consider the following 2 x 2 table:

Decision      Ho True              Ho False
Accept Ho     (no error)           β (Type-II error)
Reject Ho     α (Type-I error)     (no error)

Type-I error: α: p-value: level of significance

= probability of rejecting Ho when it is actually true
= probability of finding an effect when actually there is no effect.

The p-value measures the strength of evidence by indicating the probability that a result at least as extreme as that observed would occur by chance alone, i.e. if Ho were true.

1 - α = Confidence coefficient
= probability of accepting Ho when it is true
= probability of not finding an effect when actually there is no effect.

Type-II error: β

= Probability of accepting Ho when it is actually false
= Probability of not finding an effect when actually there is an effect.

1 - β = Power of the test
= Probability of rejecting Ho when it is false
= Probability of finding an effect when actually there is an effect.

• When the null hypothesis is rejected, the type-I error is to be stated.

Maximum error allowed: 5 %, i.e. minimum confidence required: 95 %

• When the null hypothesis is accepted, the type-II error is to be stated.

Maximum error allowed: 20 %, i.e. minimum power required: 80 %

• When the null hypothesis is rejected at the chosen level of significance, the sample size can be taken as adequate, whatever it may be; but

• when the null hypothesis is accepted, the adequacy of the sample size has to be checked by computing the Power of the test before accepting Ho.
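To illustrate what "computing the Power of the test" can look like, here is a minimal sketch for a two-sided Normal (Z) test of one mean. The scenario (a true difference of 5 units with SD 20) and the function name are assumptions made for illustration; other tests need their own power formulas.

```python
from statistics import NormalDist


def power_one_sample_z(true_diff, sd, n, alpha=0.05):
    """Power (1 - beta) of a two-sided Z test of a single mean.

    Under the alternative, the test statistic is Normal with mean
    true_diff / (sd / sqrt(n)) and SD 1.
    """
    nd = NormalDist()
    cr = nd.inv_cdf(1 - alpha / 2)           # critical ratio, about 1.96 for alpha = 0.05
    shift = abs(true_diff) / (sd / n ** 0.5)
    upper_tail = 1 - nd.cdf(cr - shift)      # P(statistic > +CR)
    lower_tail = nd.cdf(-cr - shift)         # P(statistic < -CR)
    return upper_tail + lower_tail


# Hypothetical scenario: true difference 5, SD 20
power_100 = power_one_sample_z(true_diff=5, sd=20, n=100)
power_126 = power_one_sample_z(true_diff=5, sd=20, n=126)
print(round(power_100, 2), round(power_126, 2))
```

In this scenario n = 100 gives a power of roughly 0.71, below the 80 % minimum quoted above, while n = 126 just reaches it.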


Testing The Statistical Significance Of Hypothesis

Testing the statistical significance of a hypothesis is the process of calculation, using sample results, to decide whether the null hypothesis is to be accepted or rejected.

Steps:

1. State the null hypothesis: H0

2. State the alternate hypothesis: H1 (one sided / tailed or two sided / tailed)

3. State the distribution of the sample statistic or the difference (normal or Student's 't' or chi-square).

4. State the level of significance (α, or p-value, or type-I error) desired.

5. Compute the Test Statistic (TS):

TS = (difference in parameter values) / (SE of difference)

6. Find out the Critical Ratio (CR) from the statistical table at the chosen level of significance.

7. Take the decision:

a. If TS < CR: accept Ho, i.e. the difference in parameter values is not statistically significant.

b. If TS > CR: reject Ho and accept H1, i.e. the difference in parameter values is statistically significant.

If p < 0.05, Confidence (C) > 95 %; if p < 0.01, C > 99 %; and if p < 0.001, C > 99.9 %

Guidelines, Steps and Examples in Tests of Significance

(A) Continuous variable:

(1) Ho: Null Hypothesis: μ1 = μ2

μ1 = Mean gain in weight of infants who received supplementary diet

μ2 = Mean gain in weight of infants who did not receive supplementary diet

(2) H1: Alternate Hypothesis: μ1 ≠ μ2

(3-a) If the population distribution of gain in weight in both the groups is NORMAL (either known from earlier studies or established from the random samples), or both the sample sizes are large (n1 and n2 > 30), the TEST STATISTIC is Z and the test is called the NORMAL TEST.

(3-b) If n1 or n2 or both n1 and n2 < 30, the TEST STATISTIC is Student's "t" and the test is called Student's "t" TEST.

(4) Level of Significance (α: Type-I Error: p-value)

If α = 0.05, Confidence (C) = 95 %

if α = 0.01, C = 99 %

if α = 0.001, C = 99.9 %

(5) Test Statistic or Test Criterion (Z)

If Normal or n1, n2 > 30,

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$

where $\bar{X}_1$ and $\bar{X}_2$ are the mean values of weight in Samples A and B respectively, $S_1$ and $S_2$ are the corresponding standard deviations (so $S_1^2$ and $S_2^2$ are the variances), and $n_1$ and $n_2$ are the sample sizes.

(6) Critical Ratio (C.R.)

If α = 0.05, C.R. = 1.96; if α = 0.01, C.R. = 2.58; and if α = 0.001, C.R. = 3.29

(7) Taking Decision

Z value          Difference in means between the two groups
Z < 1.96         Not Significant (p > 0.05) (Ho is acceptable)
(a) Z > 1.96     Significant (p < 0.05)
(b) Z > 2.58     Highly Significant (p < 0.01)
(c) Z > 3.29     Very Highly Significant (p < 0.001)

(Ho is rejected in 'a', 'b' and 'c')
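The decision rule can be written as a small lookup. This sketch simply encodes the critical ratios 1.96, 2.58 and 3.29 quoted above for a two-sided Normal test; the function name and the wording of the labels are illustrative.

```python
def significance_label(z):
    """Map a Z value to the decision categories of the Normal test."""
    z = abs(z)  # two-sided test: only the magnitude matters
    if z > 3.29:
        return "very highly significant (p < 0.001)"
    if z > 2.58:
        return "highly significant (p < 0.01)"
    if z > 1.96:
        return "significant (p < 0.05)"
    return "not significant (p > 0.05)"


for z in (1.2, 2.5, 2.9, 10.76):
    print(z, significance_label(z))
```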

Various Tests of Statistical Significance

(a) To test the statistical significance of the difference in sample and population Means

$H_0: \bar{X} = \mu$, $H_1: \bar{X} \neq \mu$

α = 0.05, CR = 1.96

$TC = Z = \dfrac{|\bar{X} - \mu|}{S / \sqrt{n}}$

Example: Mean SBP in population = 120, Mean SBP in sample = 115 (n = 100, SD = 20)

$Z = \dfrac{120 - 115}{20 / \sqrt{100}} = 2.5$

i.e., TC > CR; p < 0.05

The means in the population and the sample are significantly different, or the sample does not represent the population w.r.t. SBP.
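A short sketch of this one-sample calculation follows. The function name is hypothetical, and the sign of Z only reflects the direction of the difference; its magnitude is what gets compared with CR.

```python
import math


def one_sample_z(sample_mean, pop_mean, sd, n):
    """Z = (sample mean - population mean) / (SD / SQRT(n))."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))


z = one_sample_z(sample_mean=115, pop_mean=120, sd=20, n=100)
print(abs(z))          # 2.5
print(abs(z) > 1.96)   # True: significant at the 5 % level
```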

(b) To test the statistical significance of the difference in Mean values between two Populations

(1) Large Samples:

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$

If Z < 1.96, the difference in means between the two populations can be considered statistically not significant.

Test of Homogeneity of Variances (Fisher's 'F')

• One of the assumptions which has to be satisfied for applying Student's t test is homogeneity of variances in the two populations. This is tested by computing Fisher's F statistic.

$F = \dfrac{S_1^2}{S_2^2}$ for (n1 - 1), (n2 - 1) d.f. ($S_1^2 > S_2^2$)

• If the computed F value is less than the Critical Ratio of F at (n1 - 1), (n2 - 1) d.f., then the assumption of homogeneity of variances in the two populations can be accepted. Otherwise, the variances in the two populations are heterogeneous.

(2) Small Samples (n1 or n2 or both n1 & n2 < 30): (σ1 = σ2) Homogeneity of variances in the two populations is assumed and accepted.

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{S \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

where $S = \sqrt{\dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$

Critical ratio values depend upon the degrees of freedom (n1 + n2 - 2).

(3) Small Samples (n < 30) and (σ1 ≠ σ2): Homogeneity of variances in the two populations is not accepted. In such a case, the Modified 't' test has to be applied.

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$

$t' = \dfrac{t_1 \dfrac{S_1^2}{n_1} + t_2 \dfrac{S_2^2}{n_2}}{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$

where $t_1$ and $t_2$ are the critical values of t at the chosen level of significance for (n1 - 1) and (n2 - 1) d.f. respectively.

If t > t', p < 0.05 (significant); if t < t', p > 0.05 (not significant)

Example: Weight (kg) of school going (A) and non-school going (B) children of 5 years of age in slum areas:

(1) n1 & n2 > 30

Population   Sample size   Mean   S.D.
A            100           17.4   3.0
B            100           13.2   2.5

Z = 10.76 (p < 0.001), i.e., μA ≠ μB; μA > μB
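As a check on the large-sample case, the Normal test statistic can be computed directly from the tabulated means, SDs and sample sizes; doing so gives Z of about 10.76. The function name below is hypothetical.

```python
import math


def two_sample_z(m1, s1, n1, m2, s2, n2):
    """Z = (mean1 - mean2) / SQRT(S1^2 / n1 + S2^2 / n2)."""
    se_diff = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / se_diff


# A: n = 100, mean 17.4, SD 3.0; B: n = 100, mean 13.2, SD 2.5
z = two_sample_z(17.4, 3.0, 100, 13.2, 2.5, 100)
print(round(z, 2))  # 10.76
```

Since Z is far above 3.29, the conclusion of p < 0.001 holds.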


(2) n1 & n2 < 30 (σ1 = σ2)

Population   Sample size   Mean   S.D.
A            15            17.4   3.0
B            10            13.2   2.5

F = (3.0)² / (2.5)² = 1.44 < 3.00 (for 14 & 9 d.f. at α = 0.05). Hence, the assumption of homogeneity of variances in the two populations can be accepted.

t = 3.65 > 2.81 (for 23 d.f. at α = 0.01) but < 3.77 (for 23 d.f. at α = 0.001), i.e., p < 0.01

i.e., μA ≠ μB; μA > μB
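The F ratio and the pooled-variance t value of this small-sample example can be reproduced as follows. This is a sketch with hypothetical function names; the critical values still have to be read from F and t tables, as on the slide.

```python
import math


def variance_ratio(s1, s2):
    """F = larger variance / smaller variance."""
    big, small = max(s1, s2), min(s1, s2)
    return big ** 2 / small ** 2


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t with pooled SD; returns (t, degrees of freedom)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se_diff, n1 + n2 - 2


# A: n = 15, mean 17.4, SD 3.0; B: n = 10, mean 13.2, SD 2.5
f_value = variance_ratio(3.0, 2.5)
t_value, df = pooled_t(17.4, 3.0, 15, 13.2, 2.5, 10)
print(round(f_value, 2), round(t_value, 2), df)  # 1.44 3.65 23
```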


(3) n1 & n2 < 30 and σ1 ≠ σ2

Population   Sample size   Mean   S.D.
A            15            17.4   1.8
B            10            13.2   4.2

F = (4.2)² / (1.8)² = 5.44 > 2.65 (for 9 & 14 d.f. at α = 0.05), i.e., the assumption of homogeneous variances in the two populations cannot be accepted (σ1 ≠ σ2) and hence the modified 't' test has to be applied.

t = 2.98 > 2.25 (t' at α = 0.05) but < 3.22 (t' at α = 0.01), i.e., p < 0.05

i.e., μA ≠ μB; μA > μB
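The modified 't' calculation for unequal variances can be sketched as below. The tabulated t values 2.145 (14 d.f.) and 2.262 (9 d.f.) for two-sided α = 0.05 are taken from a standard t table and are an assumption of this sketch, as are the function names.

```python
import math


def modified_t(m1, s1, n1, m2, s2, n2):
    """t for unequal variances: (mean1 - mean2) / SQRT(S1^2/n1 + S2^2/n2)."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def weighted_critical_t(t1, s1, n1, t2, s2, n2):
    """t' = (t1 * S1^2/n1 + t2 * S2^2/n2) / (S1^2/n1 + S2^2/n2)."""
    w1, w2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (t1 * w1 + t2 * w2) / (w1 + w2)


# A: n = 15, mean 17.4, SD 1.8; B: n = 10, mean 13.2, SD 4.2
t_value = modified_t(17.4, 1.8, 15, 13.2, 4.2, 10)
# Assumed table values: t(14 d.f.) = 2.145, t(9 d.f.) = 2.262 at alpha = 0.05
t_prime = weighted_critical_t(2.145, 1.8, 15, 2.262, 4.2, 10)
print(round(t_value, 2), round(t_prime, 2))  # 2.98 2.25
```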

(4) Paired Samples:

$t = \dfrac{\bar{d} \sqrt{n}}{S_d}$

where $\bar{d}$ = mean of the differences, $S_d$ = SD of the differences

degrees of freedom = n - 1

Systolic B.P.

Patient Number   1     2     3     4     5     6     7     8     9     10
Before Drug      160   150   170   130   140   170   160   160   120   140
After Drug       140   110   165   140   145   120   130   110   120   130

                      Mean   S.D.
Before drug           150    17.00
After drug            131    17.13
Change (Decrease)     19     22.46

$t = \dfrac{19 \sqrt{10}}{22.46} = 2.67 > 2.26$ (t at α = 0.05 with 9 d.f.), i.e. p < 0.05

i.e., the decrease of 19 units, on average, in the systolic BP after giving the drug is statistically significant at the 5 % level of significance.
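The paired calculation can be reproduced from the ten before/after readings. This sketch uses `statistics.stdev`, which divides by n - 1; that choice matches the SD of 22.46 reported for the change. The unrounded t comes out at about 2.675, which the slide reports as 2.67.

```python
import math
from statistics import mean, stdev

before = [160, 150, 170, 130, 140, 170, 160, 160, 120, 140]
after = [140, 110, 165, 140, 145, 120, 130, 110, 120, 130]

# Per-patient decrease in systolic BP
diffs = [b - a for b, a in zip(before, after)]

d_bar = mean(diffs)    # mean of the differences
s_d = stdev(diffs)     # SD of the differences (divisor n - 1)
t_value = d_bar * math.sqrt(len(diffs)) / s_d

print(d_bar, round(s_d, 2), round(t_value, 3))
```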

(5) Analysis of Variance (ANOVA)

• To test the statistical significance of the differences in mean values of a variable among different groups (more than TWO groups).

• In the case of two groups, Student's 't' test is applied.

• The added advantage of ANOVA is that the total variance can be partitioned into different components (due to several factors), which will enhance the validity of the comparison of the means among the different groups.

• This is not possible in the case of the 't' test.

Designs

Basically, THREE important Experimental Designs are used in ANOVA. They are:

1. Completely Randomized Design (CRD) (One-way ANOVA)

2. Randomized Complete Block Design (RCBD) (Two or Multiple-way ANOVA)

3. Repeated Measures Design (Before & After Design) (Two-way, Between TIME Analysis)

1. CRD

If only ONE FACTOR affecting the study variable is studied, the Completely Randomized Design (CRD) / One-way ANOVA is used.

Example:

The study population consists only of children who are severely malnourished, and a Clinical Trial is conducted to study the efficacy of three methods (diet, drug and placebo) in increasing their weight.

2. RCBD

If TWO or more factors affecting the study variable are studied, OR if the study elements in the population are HETEROGENEOUS with respect to some Factor(s) in addition to the main Factor studied, the Randomized Complete Block Design (RCBD) / Two or Multiple-way ANOVA is used.

Example:

• The population consists of children who are mildly, moderately or severely malnourished, and a Clinical Trial is conducted to study the efficacy of three methods (diet, drug and placebo) in increasing their weight.

• Here, the children are classified according to their malnourishment status, and within each group they are randomly allocated to the three methods of treatment.

• This design will enhance the validity of the comparison of the mean weight increase among the three Groups, as compared to the Completely Randomized Design.

3. Repeated measures design

If the values of a variable are recorded for the subjects BEFORE and AFTER an INTERVENTION (more than once after the intervention), the Repeated Measures Design is adopted for a valid comparison of the mean values of the variable between the various Timings of recording, taking into consideration the variation between the Subjects.

Example:

Blood Pressure values of Hypertension patients were recorded before giving a drug, and ONE week and TWO weeks after giving it. To test the statistical significance of the differences in mean BP among the THREE Timings of recording, Repeated Measures Analysis will enable us to make a more valid comparison.

Homogeneity of variances

Before applying the ANOVA test, HOMOGENEITY (EQUALITY) of VARIANCES of the variable in the different Groups has to be tested.

The most commonly used test is BARTLETT's Test.

If this test shows non-significance, ANOVA can be applied to the original values of the variable. If it shows statistical significance, an appropriate transformation (log, square root, inverse etc.) has to be applied to the original values before applying ANOVA.
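As an illustration of Bartlett's test, the sketch below computes the usual Bartlett chi-square statistic for the gain-in-weight figures of the three treatment groups used in the one-way ANOVA problem of this workshop. The function name is hypothetical, and the critical value 5.99 (chi-square, 2 d.f., α = 0.05) is taken from a standard table rather than from the slides.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances; returns (statistic, d.f.)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]   # sample variances, divisor n - 1
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1


# Gain in weight (kg) in groups A, B and C
group_a = [0.20, 0.15, 0.10, 0.30, 0.25]
group_b = [0.10, 0.10, 0.05, 0.15, 0.20]
group_c = [0.05, 0.10, 0.05, 0.05, 0.15]

stat, df = bartlett_statistic([group_a, group_b, group_c])
CHI2_CRIT_2DF_05 = 5.99   # assumed table value
print(round(stat, 2), df, stat < CHI2_CRIT_2DF_05)
```

Here the statistic is well below the critical value, so on this reading ANOVA could be applied to the original values.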

MULTIPLE RANGE TESTS

If the Analysis of Variance provides a statistically significant F-value for the treatment variation (i.e., if the ANOVA shows statistically significant differences in the mean values among the Groups), an appropriate Multiple Range Test is to be applied to find out the significantly different pairs of groups.

The most commonly used Multiple Range Test is the Student-Newman-Keuls (SNK) Test.

 

PROBLEMS IN ANOVA :--- 

(1) ONE-WAY ANOVA (COMPLETELY RANDOMIZED DESIGN)

A study was conducted to investigate the effect of supplementary nutrition, a drug and a placebo in increasing the weight of severely malnourished children. Fifteen severely malnourished children were randomly divided into three groups A, B and C. Group A was given supplementary nutrition, Group B the drug and Group C the placebo. Gain in weight in these children was noted after one month of treatment. Test whether the differences in mean weight gain among the three groups are statistically significant at the 5% level of significance.

Also test whether the difference between any two groups is statistically significant at the 5% level of significance.


Gain in Weight (kg)

| A | B | C | Total |
|---|---|---|---|
| 0.20 | 0.10 | 0.05 | 0.35 |
| 0.15 | 0.10 | 0.10 | 0.35 |
| 0.10 | 0.05 | 0.05 | 0.20 |
| 0.30 | 0.15 | 0.05 | 0.50 |
| 0.25 | 0.20 | 0.15 | 0.60 |


 

ANOVA TABLE

| Source of Variation | d.f. | S.S. | M.S.S. | F | p |
|---|---|---|---|---|---|
| Total | 14 | 0.0833 | | | |
| Between Groups | 2 | 0.0373 | 0.0186 | 4.91 | < 0.05 |
| Error | 12 | 0.0460 | 0.0038 | | |

d.f.: Degrees of freedom; S.S.: Sum of squares; M.S.S.: Mean sum of squares; F: F statistic; p: level of significance
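The sums of squares in this table can be checked with a few lines of code. The following is a minimal sketch in plain Python using the usual correction-factor formulas; the function name `one_way_anova` is made up for this illustration. Working from unrounded mean squares it gives F of about 4.87, slightly below the 4.91 shown in the table, which appears to come from rounded mean squares; the conclusion at the 5% level is unchanged.

```python
def one_way_anova(groups):
    """One-way ANOVA for a completely randomized design."""
    values = [x for g in groups for x in g]
    n_total = len(values)
    correction = sum(values) ** 2 / n_total          # correction factor
    ss_total = sum(x * x for x in values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_between
    df_between = len(groups) - 1
    df_error = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_error / df_error)
    return ss_total, ss_between, ss_error, f_value

# Gain in weight (kg): supplementary nutrition (A), drug (B), placebo (C)
gain = [
    [0.20, 0.15, 0.10, 0.30, 0.25],
    [0.10, 0.10, 0.05, 0.15, 0.20],
    [0.05, 0.10, 0.05, 0.05, 0.15],
]
ss_total, ss_between, ss_error, f_value = one_way_anova(gain)
F_CRIT_5PCT = 3.89  # F critical value with 2, 12 d.f. at the 5% level (from the slides)
print(f"SS total = {ss_total:.4f}, between = {ss_between:.4f}, error = {ss_error:.4f}")
print(f"F = {f_value:.2f}, significant at 5%: {f_value > F_CRIT_5PCT}")
```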


F at $\alpha = 0.05$ with 2, 12 d.f. = 3.89,

F at $\alpha = 0.01$ with 2, 12 d.f. = 6.93

Computed F (4.91) > 3.89, but < 6.93.

i.e., differences in mean gain in weight among the three groups of children are statistically significant (p < 0.05); confidence = 95%.


Multiple Comparison Test:

Since the ANOVA gave a significant F value, a multiple comparison test has to be applied to find out which groups are significantly different.

The most commonly used multiple comparison test is the Student-Newman-Keuls (SNK) test.


 

| Treatment Group | Mean gain in weight (kg) |
|---|---|
| A | 0.20 |
| B | 0.12 |
| C | 0.08 |

On applying the SNK test using statistical software, it is found that the mean gain in weight in severely malnourished children who received the supplementary diet was significantly larger than in those who received the placebo (p < 0.05; confidence = 95%). However, the differences observed in gain in weight between those who received the supplementary diet and the drug, or between those who received the drug and the placebo, were not statistically significant (p > 0.05).


  

(2) Two-way ANOVA (Randomized Complete Block Design - RCBD)

In a clinical trial to test the efficacy of two drugs and a placebo in improving the sleeping hours of mental patients, it was thought that the age of the patient could also influence the sleeping hours. Hence, the patients were stratified according to their age group and then randomly allocated to the three treatment groups.


IMPROVEMENT IN SLEEPING HOURS

| Age group (Years) | A | B | Placebo | Total |
|---|---|---|---|---|
| 24-34 | 2.3 | 1.6 | 0.6 | 4.5 |
| 35-44 | 2.0 | 1.4 | 0.4 | 3.8 |
| 45-54 | 1.8 | 1.0 | 0.3 | 3.1 |
| 55 and more | 1.2 | 0.8 | 0.3 | 2.3 |


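ANOVA TABLE

| Source of Variation | d.f. | S.S. | M.S.S. | F | p |
|---|---|---|---|---|---|
| Total | (n-1) = 11 | 5.19 | | | |
| Due to age | (r-1) = 3 | 0.89 | 0.297 | 8.2 | < 0.05 |
| Due to drug | (p-1) = 2 | 4.0825 | 2.0412 | 56.4 | < 0.001 |
| Error | (n-1)-(r-1)-(p-1) = n-r-p+1 = 6 | 0.2175 | 0.0362 | | |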


Conclusions:

The influence of age on improvement in sleeping hours is significant (p < 0.05), i.e., accounting for the variation due to age has helped in reducing the error mean sum of squares (M.S.S.) and hence in improving the precision of the comparison.

Differences in mean improvement in sleeping hours among the three treatment groups are statistically significant (p < 0.001).


| Drug | Mean improvement in sleeping hours |
|---|---|
| Drug A | 1.825 |
| Drug B | 1.200 |
| Placebo (C) | 0.400 |

On applying the SNK test using statistical software, it was found that the mean improvement in sleeping hours with drug A was significantly higher than that with drug B and with placebo (p < 0.01), and that with drug B was significantly higher than that with placebo.


(3) Two-way ANOVA (RCB design where individuals themselves serve as blocks):

Systolic blood pressure values of 10 patients before treatment, and 1 week and 2 weeks after treatment, are given below. Test whether the change (reduction) in systolic blood pressure 1 week and 2 weeks after treatment is statistically significant or not.


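| Sl. No. | Before | After 1 week | After 2 weeks | Total |
|---|---|---|---|---|
| 1 | 170 | 160 | 140 | 470 |
| 2 | 165 | 160 | 135 | 460 |
| 3 | 180 | 170 | 140 | 490 |
| 4 | 175 | 165 | 135 | 475 |
| 5 | 165 | 160 | 135 | 460 |
| 6 | 180 | 160 | 140 | 480 |
| 7 | 175 | 170 | 145 | 490 |
| 8 | 160 | 150 | 125 | 435 |
| 9 | 155 | 140 | 120 | 415 |
| 10 | 165 | 145 | 120 | 430 |
| Total | 1690 | 1580 | 1335 | 4605 |
| Mean | 169 | 158 | 133.5 | |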


 

TWO-WAY ANOVA TABLE

| Source of Variation | d.f. | S.S. | M.S.S. | F | p |
|---|---|---|---|---|---|
| Total | 29 | 8857.5 | | | |
| Between Time (T) | 2 | 6605.0 | 3302.5 | 260.2 | < 0.001 |
| Between Patients (P) | 9 | 2024.17 | 224.9 | 17.7 | < 0.001 |
| Error (E) | 18 | 228.33 | 12.69 | | |
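The two-way decomposition above can be reproduced from the raw blood pressure readings. The sketch below is one possible hand computation in plain Python (the function name `two_way_anova` and the dictionary keys are invented for this example); it treats each patient as a block and each time point as a treatment, with one observation per cell. The same function would apply to the age-by-drug layout of problem (2).

```python
def two_way_anova(rows):
    """Two-way ANOVA without replication.

    rows: one list per block (here, per patient); columns are treatments
    (here, time points). Returns sums of squares and F ratios.
    """
    r, c = len(rows), len(rows[0])
    grand_total = sum(sum(row) for row in rows)
    correction = grand_total ** 2 / (r * c)
    ss_total = sum(x * x for row in rows for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in rows) / c - correction
    col_totals = [sum(col) for col in zip(*rows)]
    ss_cols = sum(t * t for t in col_totals) / r - correction
    ss_error = ss_total - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    return {
        "ss_total": ss_total, "ss_rows": ss_rows,
        "ss_cols": ss_cols, "ss_error": ss_error,
        "f_rows": (ss_rows / (r - 1)) / ms_error,
        "f_cols": (ss_cols / (c - 1)) / ms_error,
    }

# Systolic BP: before, after 1 week, after 2 weeks (one row per patient)
bp = [
    [170, 160, 140], [165, 160, 135], [180, 170, 140], [175, 165, 135],
    [165, 160, 135], [180, 160, 140], [175, 170, 145], [160, 150, 125],
    [155, 140, 120], [165, 145, 120],
]
result = two_way_anova(bp)
print(f"Between time:     SS = {result['ss_cols']:.1f}, F = {result['f_cols']:.1f}")
print(f"Between patients: SS = {result['ss_rows']:.2f}, F = {result['f_rows']:.1f}")
print(f"Error:            SS = {result['ss_error']:.2f}")
```

The small difference between the F for time computed here and the 260.2 in the table comes from rounding the error mean square to 12.69.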

   


 

Conclusions:

Variation due to patients was found to be statistically significant at $\alpha = 0.001$, i.e., variation in BP among patients is statistically significant. After accounting for this variation, the differences in mean BP among the three time periods are found to be statistically significant (p < 0.001).

On applying the SNK test, it was found that the reductions in BP from before treatment to 1 week and to 2 weeks after treatment were statistically significant (p < 0.001). The reduction from 1 week to 2 weeks after treatment is also statistically significant (p < 0.001).


INFERENCE METHODS for DISCRETE VARIABLES

Estimation:

1. Point Estimate: Proportion, Percentage, Ratio, Rate

2. Interval Estimate: 95%, 99% or 99.9% confidence intervals for a proportion or percentage.


Point Estimate:

1. Proportion of persons diagnosed as cases in a survey of diabetes (p = 0.14 or 14%)

2. Proportion of smokers with lung cancer (p = 0.24 or 24%)

3. Sex ratio: 970 females / 1000 males; Doctor / Population ratio: 1 : 10,000

4. Birth rate, death rate, etc.


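Interval Estimate: S.E. $= \sqrt{pq/n}$, where $q = 1 - p$

(1) If p = 0.14 and n = 900, S.E. $= \sqrt{0.14 \times 0.86 / 900} = 0.0116$

95% confidence limits: p - 1.96 S.E. and p + 1.96 S.E., i.e., 0.1173 and 0.1627

(2) If p = 24% and n = 10,000, S.E. $= \sqrt{24 \times 76 / 10000} = 0.43$ (in percentage points)

99% confidence limits: p - 2.58 S.E. and p + 2.58 S.E., i.e., 22.9% and 25.1%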
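The interval arithmetic is easy to verify in code. This is a small sketch using the normal-approximation formula from the slide; the helper name `proportion_ci` is hypothetical, and 1.96 and 2.58 are the usual normal multipliers for 95% and 99% confidence.

```python
import math

def proportion_ci(p, n, z):
    """Normal-approximation confidence limits for a proportion p from n subjects."""
    se = math.sqrt(p * (1 - p) / n)
    return se, p - z * se, p + z * se

# Example (1): diabetes survey, p = 0.14, n = 900, 95% limits
se1, low1, high1 = proportion_ci(0.14, 900, 1.96)
print(f"S.E. = {se1:.4f}, 95% limits: {low1:.4f} to {high1:.4f}")

# Example (2): p = 24%, n = 10,000, 99% limits (reported in percent)
se2, low2, high2 = proportion_ci(0.24, 10_000, 2.58)
print(f"S.E. = {100 * se2:.2f}%, 99% limits: {100 * low2:.1f}% to {100 * high2:.1f}%")
```

With the 2.58 multiplier, the second example works out to roughly 22.9% to 25.1%.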


 

Tests of Significance:

1. Z-test (Proportion)

2. $\chi^2$ test ($2 \times 2$, $2 \times n$, $r \times n$)

3. Matched $\chi^2$ test (McNemar's) ($2 \times 2$ or $p \times p$)


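Examples:

Distribution of children according to their sex and nutritional grading is given in the table below (expected numbers in brackets in the Male and Female rows; percentages in brackets in the Total row):

| Sex | Normal | Gr I | Gr II | Gr III/IV | Total |
|---|---|---|---|---|---|
| Male | 25 (18) | 45 (42) | 25 (30) | 5 (10) | 100 |
| Female | 11 (18) | 39 (42) | 35 (30) | 15 (10) | 100 |
| Total | 36 (18) | 84 (42) | 60 (30) | 20 (10) | 200 (100) |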


 

(1) $2 \times 2$ Contingency Table:

| Sex | Normal | Malnourished | Total |
|---|---|---|---|
| M | 25 (18) | 75 (82) | 100 |
| F | 11 (18) | 89 (82) | 100 |
| T | 36 | 164 | 200 |

Malnourished = Gr I, Gr II, Gr III & Gr IV


Ho: No association between sex and nutritional status

H1: There is an association between sex and nutritional status

Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$ with 1 d.f. (degree of freedom), where O is the observed number and E is the expected number.

Degrees of freedom is the number of independent cells (groups) in the data. If there are four cells, d.f. will be 1, since only one cell is independent: once it is fixed, the numbers in the other three cells can be determined by subtraction from the corresponding marginal totals.

$\chi^2 = 6.64$, which equals the critical value of 6.64 with 1 d.f. at the 1% level of significance, i.e., p = 0.01.


When the expected number in any cell is less than 5, which may happen in the case of small samples and rare events, a continuity correction has to be applied in the formula as given below:

$(O - E)$ should be replaced by $|O - E| - 0.5$

Since the sample sizes for males and females are large and the expected numbers in all four cells are more than 5, the continuity correction need not be applied to these data.


Conclusions:

The association between the sex of the child and nutritional status is statistically significant at the 1% level.

The proportion of male children with normal nutrition (25%) is significantly higher than that of female children (11%).

This statement can be made with 99% confidence.


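In the case of a $2 \times 2$ contingency table, the statistical significance of the association can also be tested by applying the proportion test:

(2) Proportion Test:

$$z = \frac{|p_1 - p_2| - \frac{1}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$ and $q = 1 - p$.

The correction term $\frac{1}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$ is to be included in the formula only in the case of small sample sizes and if the expected number in any cell is less than 5.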


 

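$$z = \frac{0.25 - 0.11}{0.0543} = 2.58$$

which equals the critical ratio (CR) of 2.58 at the 1% level of significance (p = 0.01). The correction term, which would be 0.01 here, is omitted because the samples are large.

i.e., the proportion of male children with normal nutrition (25%) is significantly higher than that of female children (11%).

This statement can be made with 99% confidence.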
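As a check on the proportion test, the calculation can be sketched as follows. The function name `two_proportion_z` and its `continuity` flag are illustrative only; the formula is the pooled two-proportion z statistic with the optional correction term described above.

```python
import math

def two_proportion_z(p1, n1, p2, n2, continuity=False):
    """Pooled z statistic for the difference between two proportions."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)   # pooled proportion
    q = 1 - p
    se = math.sqrt(p * q * (1 / n1 + 1 / n2))
    diff = abs(p1 - p2)
    if continuity:
        # correction term, intended for small samples only
        diff -= 0.5 * (1 / n1 + 1 / n2)
    return diff / se

# Normally nourished: 25 of 100 boys versus 11 of 100 girls
z = two_proportion_z(0.25, 100, 0.11, 100)
z_corrected = two_proportion_z(0.25, 100, 0.11, 100, continuity=True)
print(f"z = {z:.2f} (uncorrected), {z_corrected:.2f} (with correction)")
print(f"z squared = {z * z:.2f}")
```

Without the correction, z squared reproduces the chi-square value of 6.64 from the $2 \times 2$ table, which is why the two tests lead to the same conclusion.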


(3) $2 \times n$ Table:

In the example giving data on the nutritional grading of children, there are four nutritional groups (Normal, Gr I, Gr II, and Gr III & IV combined) and two sexes (males and females).

Degrees of freedom = (4-1) × (2-1) = 3

$\chi^2 = 12.54 > 11.35$ (p < 0.01)

i.e., the association between sex and nutritional grading of children is statistically significant at the 1% level of significance (confidence = 99%).
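Both chi-square values quoted so far can be recomputed from the observed counts. The sketch below is a generic hand-rolled function (the name `chi_square` and the `yates` flag are assumptions of this example) that derives expected numbers from the marginal totals and optionally applies the 0.5 continuity correction.

```python
def chi_square(table, yates=False):
    """Chi-square statistic and d.f. for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction
            stat += diff ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# 2 x 2 table: Normal / Malnourished by sex
stat_2x2, df_2x2 = chi_square([[25, 75], [11, 89]])
# 2 x 4 table: Normal, Gr I, Gr II, Gr III/IV by sex
stat_2x4, df_2x4 = chi_square([[25, 45, 25, 5], [11, 39, 35, 15]])
print(f"2x2: chi-square = {stat_2x2:.2f} with {df_2x2} d.f.")
print(f"2x4: chi-square = {stat_2x4:.2f} with {df_2x4} d.f.")
```

Passing `yates=True` would only be appropriate for a $2 \times 2$ table with small expected numbers, which is not the case here.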


(4) Matched $\chi^2$ test:

To test the significance of the association between two categorical variables in correlated samples, the matched $\chi^2$ test due to McNemar has to be applied.

McNemar's $\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$, where b and c are the numbers of discordant pairs.

The '- 1' needs to be included in the formula when the sample size is small.


The table given below shows the results (+ve and -ve) of two tests, TA and TB, done on 100 subjects to diagnose the presence of a certain disease. TA is the existing test, which is expensive, and TB is the new test, which is comparatively cheaper. It has to be investigated whether the results of the two tests are statistically comparable, so that, if found comparable, test A can be replaced by the less expensive test B.


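Example:

| T-B (cheap) | T-A + (expensive, but confirmative) | T-A - | Total |
|---|---|---|---|
| + | 8 (a) | 8 (b) | 16 (16%) |
| - | 12 (c) | 72 (d) | 84 (84%) |
| Total | 20 | 80 | 100 |

McNemar's $\chi^2 = (8 - 12)^2 / (8 + 12) = 0.8$ (without the '- 1' correction), which is below the critical value of 3.84 with 1 d.f. at the 5% level, i.e., the discrepancy in the results is statistically not significant.

The results of the two tests agree well. Test A can be replaced by test B.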
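The McNemar calculation uses only the two discordant cells, so it is short enough to sketch directly. The function name `mcnemar_chi_square` and the `continuity` flag are hypothetical names for this example; the 3.84 threshold is the standard chi-square critical value with 1 d.f. at the 5% level.

```python
def mcnemar_chi_square(b, c, continuity=False):
    """McNemar's matched chi-square from the discordant counts b and c."""
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)  # the '- 1' correction for small samples
    return diff ** 2 / (b + c)

# Discordant pairs: b = 8 (T-B +, T-A -), c = 12 (T-B -, T-A +)
plain = mcnemar_chi_square(8, 12)
corrected = mcnemar_chi_square(8, 12, continuity=True)
CHI2_CRIT_5PCT_1DF = 3.84
print(f"McNemar chi-square = {plain:.2f} (uncorrected), {corrected:.2f} (corrected)")
print("significant at 5%" if plain > CHI2_CRIT_5PCT_1DF else "not significant at 5%")
```

Either version of the statistic is far below 3.84, so the conclusion does not depend on whether the correction is used.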


NON-PARAMETRIC STATISTICAL METHODS

The meaning of the word "Science" as given in the dictionary is "the truth ascertained by observation, experiment and induction."

A vast amount of time, money and energy is being spent by society today in the pursuit of science. Yet, as every scientist knows, the processes of observation, experiment and induction do not always lay bare the "truth".


One experiment with one set of observations may lead two scientists to two different conclusions.

The purpose of the body of methods known as "STATISTICS" is to provide the means for measuring the amount of subjectivity that goes into the scientist's conclusions.


•This is accomplished by setting up a theoretical model for the experiment in terms of probability.

•Laws of probability are applied to this model in order to determine what the (chance) ‘ probabilities’ are for various possible outcomes of the experiment, under the assumption that chance alone determines the outcome of the experiment.

•Then the experimenter has an objective basis for deciding whether the observed effect was the result of the treatment that was applied or whether it could have occurred by chance alone.


Although it is sometimes difficult to describe an appropriate theoretical model for the experiment, the real difficulty often comes after the model has been defined, in finding the probabilities associated with the model.

Many reasonable models have been invented for which probability solutions have been found. This body of statistics, i.e., applying the probability model to make inferences from the sample of an experiment in order to arrive at valid conclusions, is known as 'PARAMETRIC STATISTICAL METHODS'.

Examples: Student's t test, F test


In parametric methods, exact solutions are found for an approximately suitable probability model.

However, in the late 1930s, a different approach to the problem of finding probabilities began to gather momentum. This approach involves making few changes in the model and using simple, unsophisticated methods to find the desired probabilities. Thus, approximate solutions to exact problems were found, as opposed to exact solutions to approximate problems. This new package of statistical methods came to be known as "NON-PARAMETRIC METHODS".


Advantages of non-parametric statistics over parametric statistics:

1. Simpler models

2. Easy computability

3. No assumption on the form of the population distribution of the variable.

4. No need for larger samples for making inferences.


In applying parametric inference models, the specific form of the distribution of the variable in the population is required. Also, the computation is sometimes not easy and hence not quick. However, randomness of the sample is required in applying non-parametric methods, as in the case of parametric methods.


There are no parameters such as the mean and standard deviation in non-parametric models, and hence they are called NON-PARAMETRIC METHODS.

Since the assumption of a specific form of distribution of the variable is not required, non-parametric methods are also known as 'DISTRIBUTION-FREE METHODS'.

Since non-parametric methods are based on RANKS, they are also called RANKING METHODS or ORDER STATISTICS.


 

Since the development of non-parametric methods has taken place only recently, comparable methods have not been developed for all the inference methods used in parametric statistics.

However, most of the commonly used parametric inference methods have corresponding non-parametric methods.


Non-parametric methods may be applied when:

1. The form of distribution of the values of the variable in the population(s) is not known.

2. Sample size is very small.

3. The researcher does not have the mathematical background to understand and apply the parametric methods. Of course, this is not a compromise.

4. The researcher would like to make inferences as quickly as possible.


It has been shown by some researchers that the power of many non-parametric methods is lower than that of the corresponding parametric methods.

Hence, it is suggested that one should try one's best to apply the parametric inference methods if the conditions for applying such methods are met. This can often be achieved by a suitable transformation of the values of the variables.

If all these approaches fail, then the only way of arriving at conclusions with some validity and robustness is by applying the non-parametric methods.


1. Wilcoxon's Rank Sum test:

For testing whether two independent samples with respect to a variable come from the same population or not,

i.e., "does one population tend to yield larger values than the other population?" or "are the two medians equal or not?"

Corresponds to the Normal test (Z) or the Student's 't' test for two independent samples.


2. Wilcoxon's Signed Rank test:

For testing whether the differences observed in the values of the variable between two correlated populations (before-and-after design) are statistically significant or not.

Corresponds to the paired 't' test in parametric methods.


3. Kruskal-Wallis One-way Analysis of Variance:

For testing whether several independent samples come from the same population or not.

Corresponds to the One-way Analysis of Variance in parametric methods.


 

4. Friedman's Two-way Analysis of Variance:

For testing whether the differences observed in the values of the variable between different time periods are statistically significant or not.

Corresponds to the Two-way Analysis of Variance in parametric methods.


All the non-parametric methods can be applied manually by ranking the observations appropriately and doing simple computations.

Computer packages: BMDP, SPSS, SAS and SYSTAT


Statistical Estimation:

| | Parametric | Non-Parametric |
|---|---|---|
| 1. Representative value | Mean, Median, Mode | Median, Mode |
| 2. Variation | Standard Deviation (SD) | Quartile Deviation, Range |
| 3. Correlation | Pearson's product moment correlation coefficient ($r$) | Spearman's rank correlation coefficient ($\rho$) |
| 4. Intervals for the estimate | Mean ± SD | Quartiles (Q10-Q90), Percentiles (P3-P97) |


Statistical Tests of Significance

1. Comparison between two independent populations:

| | Parametric | Non-Parametric |
|---|---|---|
| Continuous | Z-test, t-test | Wilcoxon's Rank Sum test |
| Discrete | Z-test | $\chi^2$-test |


2. Comparison between two correlated populations:

| | Parametric | Non-Parametric |
|---|---|---|
| Continuous | Paired 't' test | Wilcoxon's Signed Rank test |
| Discrete | --- | McNemar's $\chi^2$-test |


3. Comparison among several independent populations:

| | Parametric | Non-Parametric |
|---|---|---|
| Continuous | One-way ANOVA | Kruskal-Wallis One-way ANOVA |
| Discrete | --- | $\chi^2$-test |


4. Comparison among several correlated populations:

| | Parametric | Non-Parametric |
|---|---|---|
| Continuous | Two-way ANOVA | Friedman's Two-way ANOVA |
| Discrete | --- | McNemar's $\chi^2$-test |


EXAMPLES:

(A) Independent samples:

Intelligence quotient (IQ) values of 5 normally nourished children (NN) and 4 malnourished children (MN), aged 4 years, are given below:

NN: 60, 80, 120, 130, 100

MN: 50, 60, 100, 45

Null hypothesis: IQs in the two groups are statistically the same, on average.


 

On applying Wilcoxon's Rank Sum test using statistical software, p = 0.11. Since p is greater than 0.05, the difference in IQ values between the two groups is not statistically significant, and the hypothesis of identical IQ values, on average, in the two groups is not rejected.


(B) Paired (repeated) samples:

IQ values of 8 malnourished children before (b) and after (a3) giving a nutritious diet for three months:

Before (b): 40 60 55 65 43 70 80 60

After (a3): 50 80 50 70 40 60 90 85

On applying Wilcoxon's Signed Rank test using statistical software, p = 0.18. Since the p value is greater than 0.05, the difference in IQ values after giving the diet for three months is not statistically significant, and the null hypothesis (Ho) of no difference in IQ after giving the diet is not rejected.
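Since these rank tests can be applied manually, examples (A) and (B) can be worked through in a few lines. The sketch below ranks the observations with average ranks for ties and then uses a large-sample normal approximation for the p value (with a continuity correction for the rank sum test and none for the signed rank test). These approximation choices and the function names are assumptions of this example; the software used for the slides may have computed the p values differently, although the results agree to two decimals here.

```python
import math

def midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Wilcoxon rank sum test: rank sums and two-sided normal-approximation p."""
    ranks = midranks(x + y)
    n1, n2 = len(x), len(y)
    w_x, w_y = sum(ranks[:n1]), sum(ranks[n1:])
    u_x = w_x - n1 * (n1 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = max(abs(u_x - n1 * n2 / 2) - 0.5, 0.0) / sd  # continuity-corrected
    return w_x, w_y, math.erfc(z / math.sqrt(2))

def signed_rank_test(before, after):
    """Wilcoxon signed rank test: W+, W- and two-sided normal-approximation p."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    ranks = midranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    n = len(diffs)
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = abs(w_plus - n * (n + 1) / 4) / sd
    return w_plus, w_minus, math.erfc(z / math.sqrt(2))

# Example (A): independent samples
w_nn, w_mn, p_rank_sum = rank_sum_test([60, 80, 120, 130, 100], [50, 60, 100, 45])
# Example (B): paired samples, before and after three months of diet
w_plus, w_minus, p_signed = signed_rank_test(
    [40, 60, 55, 65, 43, 70, 80, 60], [50, 80, 50, 70, 40, 60, 90, 85]
)
print(f"(A) rank sums NN = {w_nn}, MN = {w_mn}, p = {p_rank_sum:.2f}")
print(f"(B) W+ = {w_plus}, W- = {w_minus}, p = {p_signed:.2f}")
```

With samples this small, exact tables would normally be preferred to the normal approximation.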


(C) Independent samples, more than two groups:

Intelligence quotient (IQ) values of 5 normally nourished children (NN), 4 moderately malnourished children (MN) and 5 severely malnourished children (SN), aged 4 years, are given below:

NN: 60, 80, 120, 130, 100

MN: 50, 60, 100, 45

SN: 50, 40, 60, 35, 65


On applying Kruskal-Wallis One-way Analysis of Variance, p = 0.0438,

i.e., the differences in IQ among the three groups, on average, are statistically significant. On applying a multiple range test, it can be inferred that the differences in IQ between NN & MN and between MN & SN are not statistically significant, and the difference between NN & SN is significant at the 5% level.


(D) Paired (repeated) samples, more than two occasions:

IQ values of 8 malnourished children of 4 years of age, before and after giving a nutritious diet for three months (a3) and for six months (a6), are given below:

Before (b): 40 60 55 65 43 70 80 60

After (a3): 50 80 50 70 40 60 90 85

After (a6): 70 90 100 90 75 65 70 120


On applying Friedman's Two-way Analysis of Variance, p = 0.093,

i.e., the differences in IQ after giving nutritious food for three and six months are not statistically significant.

In these data, giving nutritious food for three or six months was not shown to be effective in increasing the IQ.
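The p values for examples (C) and (D) can be reproduced by hand ranking. The sketch below computes the Kruskal-Wallis H statistic (with the usual tie correction) and the Friedman statistic, then refers each to the chi-square distribution with 2 d.f., whose upper tail probability is exactly exp(-x/2). That shortcut only holds because both examples compare three groups or occasions; the function names are invented for this illustration, and the Friedman function assumes no tied values within a child, which is true for these data.

```python
import math

def midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        total += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    return h / (1 - ties / (n ** 3 - n))

def friedman(rows):
    """Friedman statistic; rows are subjects, columns are occasions."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(midranks(list(row))):
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

def chi2_upper_tail_2df(x):
    """Upper tail probability of chi-square with exactly 2 d.f."""
    return math.exp(-x / 2)

# Example (C): NN, MN, SN
h = kruskal_wallis([[60, 80, 120, 130, 100], [50, 60, 100, 45], [50, 40, 60, 35, 65]])
p_kw = chi2_upper_tail_2df(h)

# Example (D): before, after 3 months, after 6 months (one row per child)
before = [40, 60, 55, 65, 43, 70, 80, 60]
after3 = [50, 80, 50, 70, 40, 60, 90, 85]
after6 = [70, 90, 100, 90, 75, 65, 70, 120]
fr = friedman(list(zip(before, after3, after6)))
p_fr = chi2_upper_tail_2df(fr)

print(f"(C) Kruskal-Wallis H = {h:.2f}, p = {p_kw:.4f}")
print(f"(D) Friedman statistic = {fr:.2f}, p = {p_fr:.3f}")
```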


WISH YOU ALL

A VERY FRUITFUL USEFUL AND MEANINGFUL RESEARCH .


THANK YOU