MARE 250 Dr. Jason Turner Hypothesis Testing II To ASSUME is to make an… Four assumptions for t-test hypothesis testing: 1. Random Samples 2. Independent

MARE 250 Dr. Jason Turner

Hypothesis Testing II

To ASSUME is to make an…

Four assumptions for t-test hypothesis testing:

1. Random Samples

2. Independent Samples

3. Normal Populations (or large samples)

4. Variances (std. dev.) are equal

When do I do the what now?

“Well, whenever I'm confused, I just check my underwear. It holds the answer to all the important questions.” – Grandpa Simpson

If all 4 assumptions are met: Conduct a pooled t-test. You can “pool” the samples because the variances are assumed to be equal

If the samples are not independent: Conduct a paired t-test

If the variances (std. dev.) are not equal: Conduct a non-pooled t-test

If the data are not normal and the sample size is small: Conduct a non-parametric test (Mann-Whitney)
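The decision rules above can be sketched as a small function. This is only an illustration: the function name, its arguments, and the order in which the assumptions are checked are assumptions made here, since the slides do not say which rule takes precedence when several assumptions fail at once. Random sampling is taken as given.

```python
def choose_test(independent: bool, normal_or_large: bool, equal_variances: bool) -> str:
    """Suggest a test from the assumption checks (hypothetical helper).

    The checking order below is an assumption, not something the slides specify.
    """
    if not independent:
        return "paired t-test"
    if not normal_or_large:
        return "Mann-Whitney test"
    if not equal_variances:
        return "non-pooled t-test"
    return "pooled t-test"


# All four assumptions met: pool the samples.
suggestion = choose_test(independent=True, normal_or_large=True, equal_variances=True)
```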

When to pool, when to not-pool

“We have a pool and a pond… The pond would be good for you.” – Ty Webb

Both tests are run by Minitab as “2-sample t-test”

For pooled test check box – “Assume Equal Variances”

For non-pooled, do not check box
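To see what the “Assume Equal Variances” box changes, here is a minimal sketch of the two test statistics using only the Python standard library. The function names and the example data are made up for illustration; the non-pooled version uses the usual Welch degrees-of-freedom formula, which is assumed here to correspond to the non-pooled option. Turning the statistic into a p-value requires a t-distribution, which is left out.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Pooled two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def nonpooled_t(x, y):
    """Non-pooled (Welch) t statistic and approximate degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical samples with clearly different spreads.
site_a = [1, 2, 3, 4, 5]
site_b = [2, 4, 6, 8, 10]
t_pooled, df_pooled = pooled_t(site_a, site_b)
t_welch, df_welch = nonpooled_t(site_a, site_b)
```

With equal sample sizes the two t statistics coincide; the non-pooled test differs in using fewer degrees of freedom when the variances are unequal.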

Assessing Equal Variances…

Equality of variance can be checked by performing an F-test

Often not recommended:

Although the pooled t-test is moderately robust to unequal variances, the F-test is extremely non-robust to departures from normality

Pooled t-test will allow you to run an accurate test with some degree of unequal variance

The F-test is much more sensitive to violated assumptions than the pooled t-test

Who did the What Now…

Assessing Equal Variances…

The F-test and Levene’s test are used to judge the equality of variances.

In both tests, the null hypothesis (Ho) is that the population variances under consideration (or equivalently, the population standard deviations) are equal, and the alternative hypothesis (Ha) is that the two variances are not equal.

The choice of test depends on distribution properties

What the F…?

Use the F-test when the data come from a normal distribution. It is not robust to departures from normality

Use Levene's test when the data come from continuous, but not necessarily normal, distributions. It is less sensitive than the F-test, so use the F-test when your data are normal or nearly normal

When the F…?

MINITAB calculates and displays a test statistic and p-value for both the F-test and Levene's test

Ho: σ1 = σ2 (the 2 population variances are equal)

Ha: σ1 ≠ σ2 (the 2 variances are not equal)

High p-values (above α-level): Fail to Reject Null. They indicate no statistically significant difference between the variances (equality or homogeneity of variances)

Low p-values (below α-level): Reject Null. They indicate a difference between the variances (inequality of variances)
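As a rough illustration of what the two test statistics measure, the sketch below computes an F ratio (one sample variance divided by the other) and a Levene statistic from absolute deviations around each group's median. The median-centered form is an assumption here; the slides do not say which centering MINITAB uses. Function names and data are hypothetical, and p-values are not computed.

```python
from statistics import mean, median, variance


def f_statistic(x, y):
    """Ratio of the two sample variances."""
    return variance(x) / variance(y)


def levene_statistic(*groups):
    """Levene statistic using absolute deviations from each group median."""
    # Replace every observation by its distance from its own group's median.
    z = [[abs(v - median(g)) for v in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = sum(sum(g) for g in z) / n_total
    # One-way ANOVA on the deviations: between-group vs within-group spread.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical samples; the second has twice the spread of the first.
sample_1 = [1, 2, 3, 4, 5]
sample_2 = [2, 4, 6, 8, 10]
f_ratio = f_statistic(sample_2, sample_1)
levene_w = levene_statistic(sample_1, sample_2)
```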

How the F…?

STAT – Basic Statistics – 2-Variances

Enter columns of data as before

Under “Options” you can modify the α-level of the test (but why would you do that)

[Figure: MINITAB output, “Test for Equal Variances for Kapoho, Ka Lae”. Upper panel: 95% Bonferroni Confidence Intervals for StDevs. Lower panel: Data for Kapoho and Ka Lae. F-Test: Test Statistic 2.57, P-Value 0.008. Levene's Test: Test Statistic 1.94, P-Value 0.168.]

Note that by default, MINITAB gives you the results of both the F-test and Levene’s test

Must decide a priori which test you plan to utilize
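A short sketch shows why the choice has to be made in advance. It applies the decision rule to the two p-values in the MINITAB output above, assuming α = 0.05 (the slides do not state the α-level used) and reading 0.168 as the Levene's test p-value. The function name is made up for illustration.

```python
ALPHA = 0.05  # assumed significance level, chosen before looking at the data


def decide(p_value, alpha=ALPHA):
    """Apply the decision rule: reject Ho when the p-value is at or below alpha."""
    return "reject Ho" if p_value <= alpha else "fail to reject Ho"


f_test_decision = decide(0.008)   # F-test p-value from the output
levene_decision = decide(0.168)   # read here as the Levene's test p-value
```

The two tests disagree for these data, so picking the test after seeing both results would amount to picking the conclusion.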

Significance Level

The probability of making a TYPE I Error (rejection of a true null hypothesis) is called the significance level (α) of a hypothesis test

TYPE II Error Probability (β) – nonrejection of a false null hypothesis

For a fixed sample size, the smaller we specify the significance level (α), the larger will be the probability (β) of not rejecting a false null hypothesis

I have the POWER!!!

The power of a hypothesis test is the probability of not making a TYPE II error, that is, of rejecting a false null hypothesis and so finding evidence to support the alternative hypothesis

POWER = 1 - β

Produce a power curve

We need more POWER!!!

For a fixed significance level, increasing the sample size increases the power

Therefore, you can run a test to determine if your sample size HAS THE POWER!!!

By using a sufficiently large sample size, we can obtain a hypothesis test with as much power as we want

Power - the probability of being able to detect an effect of a given size

Sample size - the number of observations in each sample

Difference (effect) - the difference between μ for one population and μ for the other

Increasing the power of the test

There are four factors that can increase the power of a two-sample t-test:

1. Larger effect size (difference) - The greater the real difference between μ for the two populations, the more likely it is that the sample means will also be different.

2. Higher α-level (the level of significance) - If you choose a higher value for α, you increase the probability of rejecting the null hypothesis, and thus the power of the test. (However, you also increase your chance of type I error.)

3. Less variability - When the standard deviation is smaller, smaller differences can be detected.

4. Larger sample sizes - The more observations there are in your samples, the more confident you can be that the sample means represent μ for the two populations. Thus, the test will be more sensitive to smaller differences.
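The four factors can be checked numerically with a rough power calculation. The sketch below uses a normal approximation for a two-sided, two-sample test with equal group sizes and a known common standard deviation; this is a simplification, not the t-based calculation a statistics package would run. The function name and all numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff, sd, n, alpha=0.05):
    """Approximate power (1 - beta) with n observations per sample."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard errors of the difference in means.
    shift = abs(diff) / (sd * sqrt(2 / n))
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


baseline = approx_power(diff=5, sd=10, n=20)
larger_effect = approx_power(diff=10, sd=10, n=20)
higher_alpha = approx_power(diff=5, sd=10, n=20, alpha=0.10)
less_variability = approx_power(diff=5, sd=5, n=20)
larger_samples = approx_power(diff=5, sd=10, n=40)
```

Each of the four variations gives a higher power than the baseline, matching the list above.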

Increasing the power of the test

The most practical way to increase power is often to increase the sample size

However, you can also try to decrease the standard deviation by making improvements in your process or measurement

Sample size

Increasing the size of your samples increases the power of your test

You want enough observations in your samples to achieve adequate power, but not so many that you waste time and money on unnecessary sampling

If you provide the power that you want the test to have and the difference you want it to be able to detect, MINITAB will calculate how large your samples must be
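The same idea can be sketched by hand with a normal approximation: given a target power and the difference to detect, solve for the number of observations per sample. This is an approximation under the assumptions of a two-sided test, equal group sizes, and a known standard deviation; a t-based calculation will typically give a slightly larger n. The function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_sample_size(diff, sd, power=0.80, alpha=0.05):
    """Approximate observations needed per sample (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


# Detecting a difference of one standard deviation vs half a standard deviation.
n_large_effect = approx_sample_size(diff=10, sd=10)
n_small_effect = approx_sample_size(diff=5, sd=10)
```

Halving the difference to detect roughly quadruples the required sample size, which is why the target difference should be chosen with care.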

When to pair, when to not-pair

“All I got's two fives!” - Jean LaRose

Test is run by Minitab directly as “paired t-test”

Used when there is a natural pairing of the members of two populations

Each pair consists of a member from one population and that member's corresponding member in the other population

Uses the differences within each pair, not just the difference between the two sample means

When to pair, when to not-pair

“All I got's two fives!” - Jean LaRose

Paired t-test assumptions:

1. Random Sample

2. Paired differences normally distributed, or large n

3. Outliers can confound results

Tests whether the difference in the pairs is significantly different from zero

Paired Test - Example

For Example…

If you are testing the effects of some experimental treatment upon a population

e.g. – effect of new diet upon a single sample of fish

However…

Paired test must have equal sample sizes
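A minimal sketch of the paired calculation, using made-up weights for the same five fish before and after a new diet. The function name and data are hypothetical. The statistic is the mean of the paired differences divided by its standard error, and the function refuses unequal sample sizes because every observation needs a partner.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched observations."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal sizes")
    # Work with one difference per pair rather than two separate samples.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical weights of the same five fish.
weight_before = [10, 12, 9, 11, 13]
weight_after = [12, 15, 10, 14, 14]
t_paired, df_paired = paired_t(weight_before, weight_after)
```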

When to parametric…

Nonparametric procedures

Statistical procedures that require very few assumptions about the underlying population. They are often used when the data are not from a normal population.

Non-Parametric

Non-parametric test (Mann-Whitney):

1. Random Sample

2. Does not require normally distributed data

3. Outliers do not confound results

Tests whether the two populations differ in center (median), using the ranks of the observations rather than their raw values
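The Mann-Whitney test works on ranks rather than raw values, which is why outliers have limited influence. The sketch below computes the U statistics for two independent samples, with tied values sharing their average rank. The function name and data are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """Return (U for x, U for y) from the ranks of the pooled observations."""
    pooled = sorted(list(x) + list(y))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical data: the first sample contains one extreme value.
u_with_outlier = mann_whitney_u([1, 2, 3, 4, 50], [5, 6, 7, 8, 9])
u_more_extreme = mann_whitney_u([1, 2, 3, 4, 5000], [5, 6, 7, 8, 9])
```

Making the outlier a hundred times larger leaves the result unchanged, because it still holds the highest rank and nothing more.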

Non-parametric tests are used heavily in some disciplines, although not typically in the natural sciences. They are often the “last resort” when data are not collected correctly, and they have low “power”

Drawbacks of Nonparametric Tests

Nonparametric tests:

Less powerful than parametric tests. Thus, you are less likely to reject the null hypothesis when it is false.

Often require you to modify the hypotheses. For example, most nonparametric tests concerning the population center are tests about the median rather than the mean. The test does not answer the same question as the corresponding parametric procedure.

When a choice exists and you are reasonably certain that the assumptions for the parametric procedure are satisfied, then use the parametric procedure.
