LU2 -A - One-sample and Paired Sign Test

  • STF1103: Statistic for Biology II. Learning Unit 2: One-sample and Paired Sign Test. Semester II 2012

    Semester II 2008/2009

  • One Sample Sign Test

    The one-sample sign test is the non-parametric equivalent of the one-sample t-test of $H_0: \mu = \mu_0$. It is used when the population distribution cannot be assumed to be normal but is continuous (i.e. not categorical) and symmetrical about $\mu$.

    Note that under $H_0: \mu = \mu_0$, for such a probability distribution of $X$,

    $$\Pr(X > \mu_0) = \Pr(X < \mu_0) = \tfrac{1}{2}.$$

    Therefore, the probability distribution of $S$, the number of values $x_i$ of $X$ greater than $\mu_0$, will be binomial with $p = \tfrac{1}{2}$ (where $p$ = probability of success).

    Hence an equivalent null hypothesis to $H_0: \mu = \mu_0$ is $H_0: p = \tfrac{1}{2}$.

    The equivalent alternative hypothesis to $H_A: \mu < \mu_0$ is $H_A: p < \tfrac{1}{2}$, and to $H_A: \mu > \mu_0$ is $H_A: p > \tfrac{1}{2}$.

  • Procedure for the One Sample Sign Test

    Assign a + or − sign to each value $x_i$ of $X$: + if $x_i > \mu_0$, and − if $x_i < \mu_0$.

    Ignore those $x_i$ that are equal to $\mu_0$.

    The test statistic, $S$, is the number of + signs in the sample ($S$ is binomially distributed).

    The rejection region is obtained from the binomial tables, with $n$ = total number of + and − signs in the sample, and $p = \tfrac{1}{2}$.
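
    The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch for the right-tailed case: the function name `sign_test_upper` and the sample values are hypothetical, and the binomial probabilities are computed directly rather than read from tables.

```python
from math import comb

def sign_test_upper(values, mu0):
    """One-sample sign test of H0: mu = mu0 against HA: mu > mu0.

    Returns (S, n, p_value): S is the number of + signs, n the number of
    values not equal to mu0, and p_value = P(S' >= S) for
    S' ~ Binomial(n, 1/2).
    """
    plus = sum(1 for x in values if x > mu0)    # values above mu0 get a +
    minus = sum(1 for x in values if x < mu0)   # values below mu0 get a -
    n = plus + minus                            # values equal to mu0 are ignored
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, n, p_value

# Hypothetical data, for illustration only (not from the lecture).
sample = [12.1, 9.8, 11.4, 10.0, 13.2, 10.9, 8.7, 11.8]
s, n, p = sign_test_upper(sample, mu0=10.0)
print(s, n, round(p, 4))
```

    Computing the tail probability directly is equivalent to summing the relevant entries of a binomial table for the given n and p = 1/2.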

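  • Example of One Sample Sign Test

    The breaking strength of a certain kind of rope is tested, giving the following results (in appropriate units). Each value is marked + if it is above 160, − if it is below 160, and 0 if it equals 160:

    Value   169  163  165  160  189  150  139  172  160  148
    Sign     +    +    +    0    +    −    −    +    0    −

    Test whether these data indicate that the mean breaking strength is more than 160, using $\alpha$ = 5%.

    The hypotheses are $H_0: \mu = 160$ and $H_A: \mu > 160$ (equivalently, $H_0: p = \tfrac{1}{2}$ and $H_A: p > \tfrac{1}{2}$).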

  • Example of One Sample Sign Test (continued)

    Replacing each value $x_i > 160$ with a + and each $x_i < 160$ with a − (ignoring values of $x_i = 160$) yields:

    n = number of + and − signs = 8
    S = number of + signs = 5

    Under $H_0$, $S$ is binomially distributed with $n = 8$ and $p = \tfrac{1}{2}$, so from the binomial tables:

    $$\Pr(S \ge 5) = \Pr(S = 5) + \Pr(S = 6) + \Pr(S = 7) + \Pr(S = 8) = 0.3633.$$

    Since $\alpha = 0.05$, the rejection region is $S \ge 7$, with a true level of 0.035.

    Since $S = 5$ lies outside the rejection region, we do not reject $H_0$, and conclude that there is no evidence that the mean breaking strength is more than 160.
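
    The rejection region for the rope example can also be found by direct computation instead of from tables. The sketch below is illustrative only; the helper names `upper_tail` and `critical_value` are hypothetical, and exact arithmetic may differ in the last digit from sums of rounded table entries.

```python
from math import comb

def upper_tail(n, c):
    """P(S >= c) for S ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(c, n + 1)) / 2 ** n

def critical_value(n, alpha):
    """Smallest c with P(S >= c) <= alpha, or None if no such c exists."""
    for c in range(n + 1):
        if upper_tail(n, c) <= alpha:
            return c
    return None

strengths = [169, 163, 165, 160, 189, 150, 139, 172, 160, 148]
mu0 = 160

s = sum(x > mu0 for x in strengths)    # number of + signs
n = sum(x != mu0 for x in strengths)   # values equal to 160 are ignored
c = critical_value(n, 0.05)            # reject H0 when S >= c

print("S =", s, "n =", n)
print("P(S >= S_observed) =", round(upper_tail(n, s), 4))
print("reject when S >=", c, "true level =", round(upper_tail(n, c), 4))
```

    Because the binomial distribution is discrete, the true level of the rejection region is generally below the nominal 5%, which is why the example reports a true level alongside the region.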

  • One Sample Sign Test (continued)

    The sign test is the easiest of the non-parametric tests used to compare sample distributions from two populations that are not independent (e.g. a before-and-after study).

    The sign of the difference is obtained by subtracting the "before" score from the "after" score.

    The procedure for applying a One Sample Sign Test can be summed up in 7 steps:

    Step 1 : State the null and alternative hypotheses.
    Step 2 : Decide the level of significance.
    Step 3 : Determine and tally the sign of the difference between paired observations.
    Step 4 : Determine the test statistic and which test distribution to use.
    Step 5 : Compute the p-value.
    Step 6 : Apply the decision rule: reject or do not reject $H_0$.
    Step 7 : State the conclusion.

  • Example

    A restaurant is introducing a new recipe for fried chicken. The marketing department wants to know if the new recipe is tastier than the original one.

    The customers are randomly selected for a test. Each person is given a piece of original fried chicken to try and rates the taste on a scale of 1 to 10 (poor to very good).

    After drinking a glass of water, the same customer is given a piece of fried chicken cooked using the new recipe, and asked to rate the taste using the same scale.

  • Taste ratings and sign of difference (new recipe minus original)

    Customer   Original (x)   New recipe (y)   Sign of difference
        1           3               9                  +
        2           5               5                  0
        3           3               6                  +
        4           1               3                  +
        5           5              10                  +
        6           8               4                  −
        7           2               2                  0
        8           8               5                  −
        9           4               6                  +
       10           6               7                  +

    n = number of relevant observations = number of + and − signs (zeros are excluded) = 6 + 2 = 8
    r = number of the less frequent sign = 2
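
    As a small illustration, the tally of signs for the ratings in this example can be reproduced as follows. The variable names and the helper `sign` are hypothetical; the ratings are those given for the ten customers.

```python
# Taste ratings for the ten customers (original recipe x, new recipe y).
original = [3, 5, 3, 1, 5, 8, 2, 8, 4, 6]
new = [9, 5, 6, 3, 10, 4, 2, 5, 6, 7]

def sign(d):
    """Return '+', '-' or '0' according to the sign of the difference d."""
    return "+" if d > 0 else "-" if d < 0 else "0"

# Difference is taken as "after" minus "before", i.e. y - x.
signs = [sign(y - x) for x, y in zip(original, new)]

plus = signs.count("+")
minus = signs.count("-")
ties = signs.count("0")

n = plus + minus        # relevant observations: zeros are excluded
r = min(plus, minus)    # number of the less frequent sign

print(signs)
print("plus =", plus, "minus =", minus, "ties =", ties, "n =", n, "r =", r)
```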

  • Step 1 : State the null and alternative hypotheses

    The sign test can be two-tailed or one-tailed, which decides the form of the alternative hypothesis.

    In this example we want to know if customers prefer the new recipe. Therefore, we do a one-tailed test.

    $H_0: p = 0.5$
    $H_A: p > 0.5$

    The null hypothesis in this example is that the new recipe has no effect on the preference or choice of the customers. A positive sign indicates a taste improvement (improved customer liking). A negative sign indicates the opposite. Here, the probability of getting a taste improvement is $p = 0.5$ because, if the new recipe has no effect, the number of people liking and disliking the new recipe would be about the same.

    With a right-tailed test, if $H_0$ is true, we would expect about the same number of + and − signs. Therefore, many + signs or, equivalently, few − signs should lead to a rejection of $H_0$.

  • Step 2 & Step 3

    Step 2 is very straightforward. The level of significance is usually $\alpha = 0.05$.

    Step 3 is to determine and tally the sign of the difference between the paired data. For each data pair, subtract the first observation (original recipe) from the second observation (new recipe), and record the sign of the difference as + or −. Where there is no difference in the rating, a zero is given. Tally all signs: + = 6, − = 2, and no difference (0) = 2.

  • Step 4 : Determine Test Statistic

    After tallying all the signs, we determine the test statistic and which test distribution to use.

    Here we designate the number of − signs as the test statistic (because we expect that customers would prefer the new recipe of fried chicken, so few − signs would support $H_A$).

    Although non-parametric methods make no restrictive assumptions about the distribution of the population being sampled, we still need to choose a suitable probability distribution (e.g. binomial, normal, chi-square) to test the hypothesis. For a SMALL-sample sign test, we use the binomial probability distribution.

  • Step 5 : Calculate p-value

    Finally, we need to calculate the p-value for the test statistic. Only relevant data (paired observations with a non-zero difference) are used for analysis; all the zero values are excluded. In this example, only 8 out of 10 customers are relevant (2 customers gave no indication of a difference in their rating). Therefore, n = 8.

    Of the 8 customers who showed a preference, if $H_0$ is true we would expect 50%, or 4 persons, to give a positive response and the other 4 to dislike the new recipe. However, the sample data revealed that only 2 persons disliked it. The question to ask now is: what is the chance of having at most 2 out of 8 persons indicating dislike when in fact $H_0$ is true?

    To find the answer, we look into the binomial table for n = 8, r = 2 (i.e. the number of − signs, or people who dislike the new recipe), and p = 0.5. The probability of getting at most 2 such persons is 0.1445, i.e. P(r = 0) + P(r = 1) + P(r = 2) = 0.0039 + 0.0312 + 0.1094.
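
    For readers without a binomial table to hand, the same probability can be worked out from the binomial formula with n = 8 and p = 0.5. This is a worked check of the table values, not an additional step of the procedure:

```latex
P(r \le 2) = \sum_{k=0}^{2} \binom{8}{k} \left(\tfrac{1}{2}\right)^{8}
           = \frac{\binom{8}{0} + \binom{8}{1} + \binom{8}{2}}{2^{8}}
           = \frac{1 + 8 + 28}{256}
           = \frac{37}{256} \approx 0.1445
```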

    Semester II 2009/2010

  • Step 6 : Decision Rule (reject or do not reject $H_0$)

    The decision rule to follow in a small-sample sign test is:

    Do not reject $H_0$ if the p-value of the test statistic > $\alpha$.
    Reject $H_0$ if the p-value of the test statistic < $\alpha$.

    A p-value of 0.1445 means that, if there is TRULY NO DIFFERENCE in taste between the original and new recipe, the chance of getting at most 2 out of 8 persons (in this example) reporting dislike is about 14.5%.

    The p-value in this example is greater than $\alpha = 0.05$; therefore we fail to reject $H_0$.

    Step 7 : Conclusion. There is no significant improvement (in customer choice/preference) of the new recipe over the original one.

  • Left-tailed or Two-tailed Test

    If we are conducting a left-tailed test (i.e. H0: p = 0.5, HA: p

  • Extra Notes on One Sample Sign Test

    When the total number of + and − signs is $n \ge 12$, the sample statistic $x$ (i.e. the proportion of plus signs) has a distribution that is approximately normal with mean $p$ and standard deviation

    $$\sigma = \sqrt{\frac{p(1 - p)}{n}}.$$

    Under the null hypothesis $H_0: p = 0.5$, we assume that the population proportion $p$ of + signs is 0.5. Therefore, the z-value corresponding to the sample test statistic $x$ is

    $$z = \frac{x - 0.5}{\sqrt{0.25 / n}}, \qquad x = \frac{R}{n},$$

    where $n$ = total number of + and − signs and $R$ = total number of + signs.
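
    A minimal sketch of the large-sample calculation is given below, assuming the usual normal approximation to the proportion of + signs under $H_0: p = 0.5$. The function names and the counts n = 20, R = 15 are hypothetical and chosen only for illustration.

```python
from math import erf, sqrt

def sign_test_z(n, r):
    """z-value for the large-sample sign test under H0: p = 0.5.

    n = total number of + and - signs, r = number of + signs.
    """
    x = r / n                          # sample proportion of + signs
    return (x - 0.5) / sqrt(0.25 / n)  # (x - p) / sqrt(p(1 - p) / n) with p = 0.5

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical counts, for illustration only.
n, r = 20, 15
z = sign_test_z(n, r)
p_upper = 1.0 - normal_cdf(z)   # right-tailed p-value, for HA: p > 0.5

print(round(z, 3), round(p_upper, 4))
```

    For a left-tailed test the p-value would be `normal_cdf(z)` instead, and for a two-tailed test twice the smaller of the two tails.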

  • Paired Sample Sign Test

    Just as the t-test can be applied to paired data to test whether two normal populations have the same mean, the paired-sample sign test can be used to test the hypothesis that two symmetric continuous population distributions have the same mean.

    The procedure is:

    (i) Calculate $d_i = x_i - y_i$, where $x_i$ and $y_i$ are the paired data.

    (ii) Assign a + or − sign to each $d_i$.

    (iii) The test statistic is $S$ = number of + signs ($S$ is binomial).

    (iv) The rejection region is obtained from the binomial table, with $n$ = total number of + and − signs in the sample and $p = \tfrac{1}{2}$.

  • Example

    The number of defective items produced by two production lines, A and B, was recorded for 8 days.

    Assuming that A and B have the same distributions, determine whether these data indicate that one line produces more defectives than the other at about the 5% level of significance.

    Day    1    2    3    4    5    6    7    8
    A    172  165  206  184  174  142  156  201
    B    201  179  159  192  177  170  163  182

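  • Example (continued)

    The hypotheses are:

    $H_0: \mu_A = \mu_B$
    $H_A: \mu_A \ne \mu_B$ (two-sided, since the question is whether either line produces more defectives than the other)

    Assigning a + or − sign to each $d_i = A_i - B_i$:

    Day    1    2    3    4    5    6    7    8
    A    172  165  206  184  174  142  156  201
    B    201  179  159  192  177  170  163  182
    d_i    −    −    +    −    −    −    −    +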

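  • Example (continued)

    Hence $n = 8$, and the test statistic is $S = 2$.

    From the binomial tables, with $n = 8$ and $p = \tfrac{1}{2}$,

    $$P(S \ge 6) = P(S \le 2) = 0.109 + 0.031 + 0.004 = 0.144.$$

    The rejection region is $S \le 1$ or $S \ge 7$, which has a true level of about 0.07 (the attainable two-sided level closest to 5%). Since $S = 2$ lies outside this region (tail probability 0.144, i.e. a two-sided p-value of about 0.29), we fail to reject $H_0$ at $\alpha = 0.05$. Therefore, we conclude that there is no significant difference between the two production lines.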
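
    The production-line example can be checked numerically with the short sketch below. It is illustrative only: the helper `lower_tail` and the variable names are hypothetical, and the two-sided p-value is taken as twice the smaller tail, which is one common convention.

```python
from math import comb

line_a = [172, 165, 206, 184, 174, 142, 156, 201]
line_b = [201, 179, 159, 192, 177, 170, 163, 182]

# Differences d_i = A_i - B_i and their signs.
diffs = [a - b for a, b in zip(line_a, line_b)]
plus = sum(d > 0 for d in diffs)
minus = sum(d < 0 for d in diffs)
n = plus + minus   # zero differences (none here) would be excluded

def lower_tail(n, c):
    """P(S <= c) for S ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(c + 1)) / 2 ** n

tail = lower_tail(n, min(plus, minus))   # P(S <= 2), equal to P(S >= 6) by symmetry
two_sided = min(1.0, 2 * tail)           # two-sided p-value (twice the smaller tail)
level = 2 * lower_tail(n, 1)             # true level of the region S <= 1 or S >= 7

print("S =", plus, "n =", n)
print("tail =", round(tail, 4), "two-sided p =", round(two_sided, 4))
print("true level of rejection region =", round(level, 4))
```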
