Chi Square T Test

  • 7/31/2019 Chi Square T Test

    TYPES OF STATISTICAL TESTS & THEIR ASSUMPTIONS

    1.) PARAMETRIC TESTS

    - Based on assumptions made concerning the parameters of the population from which the sample was drawn
    - The validity of these tests depends on whether the assumptions about the nature of the sampled population hold

    Usual assumptions include:
    1.) Random selection of the sample
    2.) Normal distribution of the population from which the samples were drawn
    3.) Equality of variances (homoscedasticity) when more than one population is sampled

    Other assumptions: the data are numerical and measured on either an interval or a ratio scale

    2.) NON-PARAMETRIC TESTS

    - Less stringent (strict) assumptions
    - No assumptions are made about the population parameters
    - Distribution-free tests

    THE CHI-SQUARE (χ²) TEST OF HOMOGENEITY

    - A commonly used statistical test
    - Compares the observed frequency of elements falling in different categories with the frequency expected if the null hypothesis were true

    Types of χ² test:
    1.) Chi-square Goodness of Fit Test
    2.) Chi-square Test of Association
    3.) Chi-square Test of Homogeneity

    Uses:

    Chi-square test of homogeneity
    - is used when we wish to find out whether two or more populations have the same proportions for the different categories of another variable

    Data Lay-out

    - use a 2 x 2 contingency table (a cross-tabulation of 2 variables)
    - the rows represent the categories of one variable and the columns represent the categories of another variable
    - rows are designated as r
    - columns are designated as c

    Table 1. Distribution of Subjects by Place of Residence and Blood Pressure Status

    Place of        Blood Pressure Status
    Residence       Normotensive    Hypertensive    Total
    Alfonso              348              62          410
    Magallanes           368              46          414
    Total                716             108          824

    The χ² test of homogeneity assumes that the categories are collectively exhaustive and mutually exclusive. Samples are presumed to be independent of one another.

    Hypothesis Testing Procedure:

    Step 1: Statement of the Hypotheses

    Ho: The proportion of elements falling in each category of the variable of interest is the same for all groups.

    H1: There are differences between groups in the proportion of elements falling in each category of the variable.

    Step 2: Setting the level of significance (α)
    - Arbitrarily set at .05 or .01

    Step 3: Determination of the test statistic
    - the χ² test is the test statistic, which follows a chi-square distribution
    - the shape of the chi-square distribution is dependent on the degrees of freedom (df), where df = (rows − 1)(columns − 1)

    Step 4: Determine the critical region (C.R.)

    The critical region of the χ² distribution is determined by the degrees of freedom and the level of significance.

    (See chi-square distribution table)

    Step 5: Computations of the Chi-square statistic

    Formula:

    $$\chi^2 = \sum \frac{(O - E)^2}{E}$$

    where

    $$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

    For Chi-square to be applicable, all the Es must be > 5; otherwise, the Fisher Exact Test will be used.
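The expected-frequency rule can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` is made up for this example, and the counts are those of Table 1.

```python
def expected_frequencies(table):
    """Return E = (row total x column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Table 1 counts: rows = Alfonso, Magallanes; columns = normotensive, hypertensive
observed = [[348, 62], [368, 46]]
expected = expected_frequencies(observed)

# Applicability check stated above: every expected frequency must exceed 5
all_above_5 = all(e > 5 for row in expected for e in row)

for row in expected:
    print([round(e, 1) for e in row])
print("Chi-square applicable:", all_above_5)
```

If any expected frequency fell at or below 5, the handout's rule would point to the Fisher Exact Test instead.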

    Step 6: Statistical Decision
    Reject Ho if the computed value of Chi-square falls in the critical region. Otherwise, do not reject Ho.

    Step 7: Drawing Conclusions
    This depends on the statistical decision.

    Sample Problem for Chi-square Test of Homogeneity:

    Step 1: State the Hypothesis

    Ho: P1 = P2
    The prevalence (proportion) of hypertension in Alfonso is equal to the prevalence (proportion) in Magallanes.

    H1: P1 ≠ P2

    Step 2: Level of significance: α = .05

    Step 3: Test statistic = Chi-square test of homogeneity

    $$\chi^2 = \sum \frac{(O - E)^2}{E}$$

    Step 4: Critical region (C.R.): α = .05

    df = (rows − 1)(columns − 1)
       = (2 − 1)(2 − 1)
       = 1

    Look up α = .05 with 1 degree of freedom in the chi-square distribution table; this corresponds to 3.84.

    Therefore the C.R. is $\chi^2_{\text{observed}} > \chi^2_{.05,\ 1\,df} = 3.84$
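The tabled value 3.84 can be cross-checked without a chi-square table. The sketch below relies on a shortcut that holds only for df = 1: a chi-square variable with 1 degree of freedom is the square of a standard normal variable. It is not a general lookup and is not part of the handout's procedure.

```python
from statistics import NormalDist

alpha = 0.05

# df = 1 only: the upper-alpha chi-square critical value equals the square
# of the two-sided standard normal critical value.
z = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z ** 2

print(round(critical_value, 2))  # 3.84
```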

    Place of        Blood Pressure Status
    Residence       Normotensive    Hypertensive    Total
    Alfonso              348              62          410
    Magallanes           368              46          414
    Total                716             108          824

    Step 5: Computations

    Oij      Eij      (Oij − Eij)    (Oij − Eij)²    (Oij − Eij)² / Eij
    348     356.3        −8.3            68.9              0.19
     62      53.7         8.3            68.9              1.28
    368     359.7         8.3            68.9              0.19
     46      54.3        −8.3            68.9              1.27

    Remember:

    $$\text{Expected frequency } (E) = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

    E = (410 x 716) / 824 = 356.3
    E = (410 x 108) / 824 = 53.7
    E = (414 x 716) / 824 = 359.7
    E = (414 x 108) / 824 = 54.3

    χ² = 0.19 + 1.28 + 0.19 + 1.27 = 2.93
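The cell-by-cell computation can be reproduced with a short Python sketch (variable names are illustrative). It carries full precision instead of rounding each intermediate value to one decimal place, so it gives about 2.91 rather than 2.93. The statistical decision is the same either way.

```python
# Table 1 counts: rows = Alfonso, Magallanes; columns = normotensive, hypertensive
observed = [[348, 62], [368, 46]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        contribution = (o - e) ** 2 / e                  # (O - E)^2 / E
        chi_square += contribution
        print(f"O={o:4d}  E={e:6.1f}  (O-E)^2/E={contribution:.2f}")

print(f"chi-square = {chi_square:.2f}")
```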

    Step 6: Statistical Decision:

    χ² = 2.93 < 3.84

    Since the calculated χ² = 2.93 is less than 3.84, DO NOT reject Ho.

    The value 2.93 falls in the area of non-rejection (see diagram).
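The same decision can be expressed as a p-value. The sketch below is an addition to the handout's critical-region approach and uses a closed form that holds only for df = 1, namely P(χ² > x) = erfc(√(x/2)).

```python
import math

chi_square = 2.93      # computed value from Step 5
critical_value = 3.84  # tabled value for alpha = .05, df = 1

# df = 1 only: upper-tail probability of the chi-square distribution
p_value = math.erfc(math.sqrt(chi_square / 2))

reject_h0 = chi_square > critical_value
print(f"p = {p_value:.3f}, reject Ho: {reject_h0}")
```

The p-value comes out between .05 and .10, which agrees with not rejecting Ho at α = .05.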

    Step 7 Conclusion:

    There is insufficient evidence to conclude that the prevalence of hypertension in Alfonso differs from the prevalence in Magallanes.

    [Diagram: chi-square distribution curve showing the area of non-rejection and the area of rejection (critical region)]

    REVIEW OF THE CRITICAL REGION

    Critical Region (C.R.) or Region of Rejection

    - Set of values of the test statistic which leads to the rejection of the null hypothesis
    - These values are those whose probability of occurrence is less than or equal to the level of significance (α)
    - They are found at the tail end of the distribution
    - The values whose probability of occurrence is greater than α comprise the region of non-rejection
    - The size of the CR is determined by α
    - The location of the CR is determined by the nature of the alternative hypothesis, whether it is one-tailed or two-tailed
    - See diagrams of the CR at differing alpha levels and direction of the alternative hypothesis.

    STATISTICALLY & NON-STATISTICALLY SIGNIFICANT RESULTS

    When the null hypothesis is rejected, the results are said to be statistically significant, and the difference between the observed and expected values is not attributed to sampling variation.

    When the null hypothesis is not rejected, the results are said to be non-statistically significant, and sampling variation is a likely explanation of the observed difference.

    TESTING THE DIFFERENCE IN MEAN VALUES BETWEEN TWO INDEPENDENT GROUPS

    The T-test for Independent Samples

    Independent: the two groups each stand alone and are mutually exclusive

    E.g. Group A is all male, Group B is all female
         Treatment arm = 14 students, Control arm = 15 students

    Statistical Assumptions of the T-test

    1.) The observations in each group follow a normal distribution.
    2.) Sample size of each group is at least 30
    3.) The standard deviation (variance) in the two samples is equal (homogeneity of variance)
    4.) The values observed in one group have nothing to do with the observations of the other group (independence)

    Sample Problem:

    In an experiment, 45 women were randomized to receive a paracervical block prior to cryosurgery, while another 39 received no paracervical block. The mean pain score in the treatment arm was 35.60 points, while that in the control arm was 51.41 points. (See table below)

    Is the mean pain score in the treatment arm significantly lower than that observed in the control arm?

    Variable                 Group       N     Mean Score    SD       SE of the mean
    Total cramping score     No block    39    51.41         28.11    4.50
                             Block       45    35.6          28.45    4.24
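The "SE of the mean" column can be reproduced from the N and SD columns, assuming the usual definition SE = SD / √N (the handout does not spell this formula out). A minimal Python check:

```python
import math

# (N, SD) for each group, taken from the table above
groups = {"No block": (39, 28.11), "Block": (45, 28.45)}

# Standard error of the mean, assuming SE = SD / sqrt(N)
standard_errors = {name: sd / math.sqrt(n) for name, (n, sd) in groups.items()}

for name, se in standard_errors.items():
    print(f"{name}: SE = {se:.2f}")
```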

    Step 1: State the hypotheses:

    Ho: Women who had a paracervical block prior to cryosurgery had a mean cramping score at least as high as women who had no block.

    Ha: Women who had a paracervical block prior to cryosurgery had a lower mean cramping score than women who had no block.

    In symbols:

    Ho: $\bar{X}_1 = \bar{X}_2$
    Ha: $\bar{X}_1 < \bar{X}_2$ (one-tailed test)

    Step 2: Let us use α = .01 (there will be only 1 chance in 100 that we will incorrectly conclude that cramping is less with the paracervical block if it really is not).

    Step 3: The test statistic will be the T-test for independent samples (assuming that the observations follow a normal distribution, the SDs are equal, and the observations are independent)

    $$t_{(n_1 + n_2 - 2)} = \frac{\bar{x}_1 - \bar{x}_2}{SD_p \sqrt{(1/n_1) + (1/n_2)}}$$

    where $SD_p$ is the pooled standard deviation, computed using the formula:

    $$SD_p = \sqrt{\frac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}$$
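A minimal Python sketch of the pooled standard deviation, applied to the sample problem's summary statistics (the function name `pooled_sd` is illustrative). It assumes the pooled SD is the square root of the weighted average of the two variances. With group 1 = block (n = 45, SD = 28.45) and group 2 = no block (n = 39, SD = 28.11) it gives about 28.29, close to the 28.27 used in the handout's computation.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent samples."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


sd_p = pooled_sd(45, 28.45, 39, 28.11)
print(f"pooled SD = {sd_p:.2f}")
```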

    Step 4: Determine the critical region

    The degrees of freedom are: df = (n1 + n2 − 2)
    df = (45 + 39 − 2)
    df = 82

    The critical value for df = 82 (use 60) and α = .01 for a 1-tailed test is −2.39.

    Reject the null if the observed value of t is < −2.39

    Step 5: Computations

    $$t_{(45 + 39 - 2)} = \frac{35.6 - 51.4}{28.27 \sqrt{(1/45) + (1/39)}} = -2.56$$
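The substitution can be checked with a short Python sketch (the function name `independent_t` is illustrative). It uses the unrounded means from the problem statement, 35.60 and 51.41, together with the handout's pooled SD of 28.27.

```python
import math


def independent_t(mean1, mean2, sd_p, n1, n2):
    """t statistic for two independent samples with a pooled SD."""
    return (mean1 - mean2) / (sd_p * math.sqrt(1 / n1 + 1 / n2))


# Group 1 = block (n = 45), group 2 = no block (n = 39)
t = independent_t(35.60, 51.41, 28.27, 45, 39)
df = 45 + 39 - 2

# One-tailed critical region from Step 4: reject Ho if t < -2.39
reject_h0 = t < -2.39
print(f"t({df}) = {t:.2f}, reject Ho: {reject_h0}")
```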

    Step 6: Statistical Decision

    The calculated or observed t is −2.56, which is less than the critical value of −2.39, so we reject the null hypothesis.

    Step 7 Conclusion

    In this study, on the average, women who had a paracervical block prior to cryosurgery experienced less total cramping than women who did not have the block.

    Note:

    The conclusion refers to women on the average and does not mean that every woman with a paracervical block would experience less cramping.

    TESTING THE DIFFERENCE IN MEAN VALUES BETWEEN TWO DEPENDENT GROUPS (OR PAIRED SAMPLES)

    Means when the Same Group is Measured Twice

    - paired designs or repeated-measures designs
    - before and after measurements
    - the researcher asks whether the intervention makes a difference or not (whether there is a change)
    - the test statistic used is the paired samples T-test, also called the matched groups T-test or the dependent groups T-test

    Sample Case Problem:

    Fifty-one (51) patients undergoing cholecystectomy were evaluated before, 1 month after, and 3 months after cholecystectomy. Patients were interviewed about the quality and frequency of their stools. In addition, to evaluate the role of bile acid malabsorption, serum concentrations of 7-alpha-hydroxy-4-cholesten-3-one (7α-HCO) were measured before and after surgery. The data given were as follows:

    Serum value        Before CHE        After CHE (1 month)
    7α-HCO (ng/mL)     25.33 ± 13.51     46.55 ± 29.58

    Question:
    Is there a true difference in the 7α-HCO before and after surgery, so that one can conclude that indeed surgery is beneficial?

    Step 1: State the Hypothesis

    Ho: The true difference of 7α-HCO is zero

    Ha: The true difference of 7α-HCO is not zero

    In symbols:

    Ho: δ = 0 (the symbol delta stands for the difference in the population)

    Ha: δ ≠ 0 (2-tailed test) (if 7α-HCO significantly increases or decreases)

    Step 2: Level of significance: α = .01

    Step 3: Test-statistic is paired samples T-test

    Assuming that the data are on the interval or ratio scale and the differences are normally distributed.

    Step 4: Determine the critical region

    The value of t that must be attained to be declared significant: the value of t that divides the distribution into the central 99%, with 0.5% of the area in each tail, is by interpolation 2.682 with n − 1 degrees of freedom.

    df = n − 1
    df = 51 − 1 or 50

    C.R. = Reject the null hypothesis that surgery does not make a difference if the value of the t-statistic is less than −2.682 or greater than +2.682

    Step 5 Computations:

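    $$t = \frac{\bar{d} - 0}{SD_d / \sqrt{n}}$$

    where
    $\bar{d}$ = the mean difference
    0 = the assumed difference in the population
    $SD_d / \sqrt{n}$ = the standard error of the mean difference

    $$SD_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} \quad \text{where} \quad \bar{d} = \frac{\sum d}{n}$$

    SD_d = 26.68

    Substituting:

    $$t = \frac{46.55 - 25.33 - 0}{26.68 / \sqrt{51}} = \frac{21.22}{3.74} = 5.68$$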
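The paired-samples computation can be sketched in Python from the summary values alone (variable names are illustrative). The mean difference is taken as the after-minus-before difference of the two reported means, and the SD of the differences, 26.68, is used as given in the handout because the individual paired values are not available.

```python
import math

mean_before, mean_after = 25.33, 46.55  # 7α-HCO (ng/mL)
sd_d = 26.68                            # SD of the paired differences (as given)
n = 51                                  # number of patients

d_bar = mean_after - mean_before        # mean difference
se_d = sd_d / math.sqrt(n)              # standard error of the mean difference
t = (d_bar - 0) / se_d                  # hypothesized population difference is 0

# Two-tailed critical region from Step 4: |t| > 2.682
reject_h0 = abs(t) > 2.682
print(f"d_bar = {d_bar:.2f}, SE = {se_d:.2f}, t = {t:.2f}, reject Ho: {reject_h0}")
```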

    Step 6: Statistical Decision
    Since the observed value of t, 5.68, is larger than the critical value 2.682, we reject the null hypothesis.

    Step 7: Conclusion:
    Mean serum values for 7α-HCO are NOT the same before cholecystectomy and 1 month later (p-value [...] 5 cms in size.

    4.) A trialist wants to know if a drug is effective in diminishing psychosis in the emergency room in a non-randomized study (quasi-experimental study design).

    5.) Dr. Lin wants to know if the proportions who scored above 60 points and below 60 points differ between the 15 nurses and the 15 midwives.
