
Definitions

Population: A collection, or set, of individuals, objects, or events whose properties are to be analyzed.

Sample: A subset of the population.

We desire knowledge about an entire population, but it is most often the case that studying every member is prohibitively expensive, so we select a representative sample from the population and study the individual items in the sample.

Descriptive Statistics: The collection, presentation, and description of the sample data.

Inferential Statistics: The technique of interpreting the values resulting from the descriptive techniques and making decisions and drawing conclusions about the population.

Section 1.1, Page 4

Definitions

Parameter: A numerical value summarizing all the data of a population. For example, the average high school grade point of all Shoreline Students is 3.20. We often use Greek letters to identify parameters, μ = 3.20.

Statistic: A numerical value summarizing the sample data. For example, the average grade point of a sample of Shoreline Students is 3.18. We would use the symbol x̄ = 3.18.

The statistic corresponds to the parameter. We usually don’t know the value of the parameter, so we take a sample and estimate it with the corresponding statistic.

Sampling Variation: While the parameter of a population is considered a fixed number, the corresponding statistic will vary from sample to sample. Also, different populations give rise to more or less sampling variability. Considering the variable age, samples of 60 students from a community college would have less variability than samples of 60 residents of a Seattle neighborhood.
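A short simulation can make sampling variation concrete. The sketch below assumes a hypothetical population of 5,000 grade points generated at random (invented for illustration, not real Shoreline data). It computes the parameter once and then the statistic for five different samples of 60.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 5,000 grade points between 2.0 and 4.0.
population = [round(rng.uniform(2.0, 4.0), 2) for _ in range(5000)]

# The parameter is a single fixed number for the whole population.
mu = mean(population)

# The statistic changes each time a new sample of 60 is drawn.
sample_means = [mean(rng.sample(population, 60)) for _ in range(5)]

print("parameter mu:", round(mu, 3))
print("sample means:", [round(m, 3) for m in sample_means])
```

Each sample mean lands near the parameter, but the five values are not identical to one another.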


Variables

Variable: A characteristic of interest about each element of a population.

Data: The set of values collected for the variable from each of the elements that belong to the sample.

Variability: The extent to which data values for a particular variable differ from each other.

Numerical or Quantitative Variable: A variable that quantifies an element of the population. The HS grade point of a student is a numerical variable. Numerical variables are numbers for which math operations make sense. The average grade point of a sample makes sense.

Continuous Numerical Variable: The variable can take on an uncountable number of values between two points on the number line. An example is the weight of people.

Discrete Numerical Variable: The variable can take on a countable number of values between two points on a number line. An example is the price of statistics textbooks.


Variables (2)

Section 1.1, Page 8

Categorical or Qualitative Variable: A variable that describes or categorizes an element of a population. The gender of a person would be a categorical variable. The categories are male and female.

Nominal Categorical Variable: A categorical variable that uses a number to describe or name an element of a population. An example is a telephone area code. It is a number, but not a numerical variable used in math operations. The average area code does not make sense.

Ordinal Categorical Variable: A categorical variable that incorporates an ordered position or ranking. An example would be a survey response that ranks “very satisfied” ahead of “satisfied” ahead of “somewhat satisfied.” Limited math operations may be done with ordinal variables.

Problems

Problems, Page 19

Problems


Observational Studies and Experiments


Observational Study: Researchers collect data without modifying the environment or controlling the process being observed. Surveys and polls are observational studies. Observational studies cannot establish causality.

Example: For a randomly selected high school, researchers collect data on each student (grade point and whether the student has music training) to see if there is a relationship between the two variables.

Experiments: Researchers collect data in a controlled environment. The investigator controls or modifies the environment and observes the effect of a variable under study. Experiments can establish causality.

Example: Randomly divide a sample of people with migraine headaches into control and treatment groups. Give the treatment group an experimental medication and the control group a placebo, and then measure and compare the reduction of frequency and severity of headaches for both groups.

Single-Stage Sampling Methods


Single-stage sampling: A sample design in which the elements of the sampling frame are treated equally and there is no subdividing or partitioning of the frame.

Simple Random Sample: Sample selected in such a way that every element of the population has an equal probability of being selected and all samples of size n have an equal probability of being selected.

Example: Select a simple random sample of 6 students from a class of 30.

1. Number the students from 1 to 30 on the roster.

2. Get 6 non-recurring random numbers between 1 and 30.

3. The six students who match the six random numbers are the sample.
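The three steps above can be sketched in Python. This is a minimal illustration using the standard library; the function name `simple_random_sample` and the seed are assumptions chosen for the example.

```python
import random

def simple_random_sample(roster, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no number recurs.
    return rng.sample(roster, n)

# Step 1: number the students from 1 to 30.
roster = list(range(1, 31))

# Steps 2 and 3: draw 6 non-recurring numbers; those students are the sample.
sample = simple_random_sample(roster, 6, seed=1)
print(sorted(sample))
```

Drawing without replacement is what guarantees the "non-recurring" requirement in step 2.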

Multistage Sampling Designs


Multistage Sampling: A sample design in which the elements of the sampling frame are subdivided and the sample is chosen in more than one stage.

Stratified Random Sampling: A sample is selected by stratifying the population, or sampling frame, and then selecting a number of items from each of the strata by means of a simple random sampling technique. The strata are usually subgroups of the sampling frame that are homogeneous but different from each other.

Example: Select a sample of six students from a class of 30 so that the sample contains an equal number of males and females.

1. List the males and females separately.

2. Take a simple random sample of 3 students from each group.

3. The six students selected are the sample.
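A possible sketch of the stratified example follows. The split of the class into 14 males and 16 females and the student labels are hypothetical, since the slides give only the class size of 30.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, per_stratum))
    return chosen

# Step 1: list the males and females separately (hypothetical 14/16 split).
strata = {
    "male": [f"M{i}" for i in range(1, 15)],
    "female": [f"F{i}" for i in range(1, 17)],
}

# Steps 2 and 3: 3 students from each group form the sample of six.
chosen = stratified_sample(strata, 3, seed=1)
print(chosen)
```

Every stratum contributes to the sample, which is the key contrast with cluster sampling.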

Multi-Stage Sampling Designs


Cluster Sample: A sample obtained by stratifying the population, or sampling frame, and then selecting some or all of the items from some, but not all, of the strata.

The strata are usually easily identified subgroups of the sampling frame that are similar to each other. This is often the most economical way to sample a large population.

Example: Take a sample of 300 Catholics in the Seattle area.

1. Get a list of the Catholic parishes in the Seattle area.

2. Take a random sample of 3 parishes.

3. In each parish, select a simple random sample of 100 parishioners.
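The parish example can be sketched as a two-stage draw. The number of parishes (20) and the membership of each (500) are hypothetical values chosen only so the code runs; the slides do not state them.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick some clusters at random, then sample within each chosen cluster."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster)
            for name in chosen_names}

# Step 1: a list of parishes (hypothetical: 20 parishes of 500 members each).
parishes = {f"Parish {p}": [f"P{p}-{i}" for i in range(1, 501)]
            for p in range(1, 21)}

# Steps 2 and 3: 3 parishes, then 100 parishioners from each.
picked = cluster_sample(parishes, 3, 100, seed=1)
total = sum(len(people) for people in picked.values())
print(list(picked), total)
```

Only the three selected parishes ever need a full membership list, which is why this design tends to be economical.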

Problem

a. Find the mean, variance, and standard deviation.

b. Find the 5-number summary.

c. Make a box and whisker display and label the numbers.

d. Calculate the interquartile range and the range.

e. Describe the shape of the distribution.
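The data set for this problem is not reproduced here, so the sketch below uses eight hypothetical values to show how parts a, b, and d could be computed. Quartiles are taken as the medians of the lower and upper halves, which is one common textbook convention; other conventions can give slightly different quartiles.

```python
from statistics import mean, median, stdev, variance

def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention."""
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = xs[half:] if len(xs) % 2 == 0 else xs[half + 1:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

data = [3, 5, 7, 8, 9, 11, 13, 15]  # hypothetical data, not from the problem

mean_value = mean(data)
var = variance(data)   # sample variance (divides by n - 1)
sd = stdev(data)       # sample standard deviation
five = five_number_summary(data)
iqr = five[3] - five[1]
data_range = five[4] - five[0]

print(mean_value, var, round(sd, 3))
print(five, iqr, data_range)
```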

Problems, Page 50

Summary of Probability Formulas

Equally Likely Outcomes: P(A) = n(A)/n

Complement: P(A) = 1 - P(not A); P(not A) = 1 - P(A)

General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)

If A and B are disjoint, P(A and B) = 0. Then the Special Addition Rule applies: P(A or B) = P(A) + P(B)

General Multiplication Rule: P(A and B) = P(A)×P(B|A)

If A and B are independent, P(B|A) = P(B). Then the Special Multiplication Rule applies: P(A and B) = P(A)×P(B)
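The rules can be checked by counting equally likely outcomes. The example below is an illustration chosen for this sketch, not taken from the slides: one card is drawn from a standard 52-card deck, with A = "the card is a heart" and B = "the card is a face card (J, Q, K)".

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely outcomes

def prob(event):
    """Equally likely outcomes: P(A) = n(A) / n."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_face(card):
    return card[0] in {"J", "Q", "K"}

p_a = prob(is_heart)
p_b = prob(is_face)
p_a_and_b = prob(lambda c: is_heart(c) and is_face(c))
p_a_or_b = prob(lambda c: is_heart(c) or is_face(c))
p_b_given_a = p_a_and_b / p_a  # conditional probability P(B|A)

# General addition rule and general multiplication rule.
print(p_a_or_b == p_a + p_b - p_a_and_b)
print(p_a_and_b == p_a * p_b_given_a)
# Suit and rank are independent here, so P(B|A) equals P(B).
print(p_b_given_a == p_b)
```

Using `Fraction` keeps every probability exact, so the equalities are tested without rounding error.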

Odds: If the odds for A are a:b, then the odds against A are b:a. The probability of A is a/(a+b). The probability of not A is b/(a+b).
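As a quick check of the odds formulas, here is a small sketch. The function names and the 3:2 odds are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def probability_from_odds(a, b):
    """Odds for A of a:b give P(A) = a / (a + b)."""
    return Fraction(a, a + b)

def odds_against(a, b):
    """Odds for A of a:b mean odds against A of b:a."""
    return (b, a)

# Hypothetical odds of 3:2 in favor of A.
p_a = probability_from_odds(3, 2)
p_not_a = probability_from_odds(*odds_against(3, 2))
print(p_a, p_not_a, p_a + p_not_a)
```

The two probabilities sum to 1, which agrees with the complement rule.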

Chapter 4

Problems

Problems, Page 95

Problem

Problems, Page 95

Problems

Problems, Page 97

Problems

Problems, Page 99

Z Score Problems

Problems, Page 52

Problems

Problems, Page 132

Problems


6.51 IQ scores are normally distributed with a mean of 100 and a standard deviation of 16. Find the following:
a. The 66th percentile.
b. The 80th percentile.
c. The minimum score required to be in the top 10%.
d. The minimum score to be in the top 25%.

6.52 Find the two z-scores that bound the middle 30% of the standard normal distribution.
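One way to check answers to problems like 6.51 and 6.52 is the inverse normal function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a calculator's inverse normal command. The variable names are assumptions of this example.

```python
from statistics import NormalDist

# 6.51: IQ scores with mean 100 and standard deviation 16.
iq = NormalDist(mu=100, sigma=16)
p66 = iq.inv_cdf(0.66)    # 66th percentile
p80 = iq.inv_cdf(0.80)    # 80th percentile
top10 = iq.inv_cdf(0.90)  # top 10% starts where 90% of scores lie below
top25 = iq.inv_cdf(0.75)  # top 25% starts where 75% of scores lie below

# 6.52: the middle 30% leaves 35% in each tail of the standard normal.
z = NormalDist()
z_low = z.inv_cdf(0.35)
z_high = z.inv_cdf(0.65)

for label, value in [("66th", p66), ("80th", p80),
                     ("top 10%", top10), ("top 25%", top25)]:
    print(label, round(value, 2))
print(round(z_low, 4), round(z_high, 4))
```

A "top p%" cutoff is the (100 - p)th percentile, and the two bounds of a middle region are symmetric about zero.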

Problems


Problems


Problems

Problems, Page 50

Problems


the standard deviation is 5 seconds.

Problems

Test the claim that the BMI of the cardiovascular technologists is different from the BMI of the general population. Use α = .05. Assume the population of the BMI of the cardiovascular technologists is normal.
a. State the necessary hypotheses.
b. Is the sampling distribution normal? Why?
c. Find the p-value.
d. State your conclusion.
e. If you made an error, what type of error did you make?

Problems, Page 181

Problems


Problems

a. Find the 98% confidence interval.
b. Find the critical value.
c. Find the margin of error.
d. Find the standard error.
e. What assumption must we make about the population to have a t-sampling distribution?
f. What are the proper words to describe the confidence interval?
g. If you wanted to have a margin of error of one minute and the 98% confidence interval for this data, how large must the sample be?


Problems

a. Find the p-value.
b. State your conclusion.
c. What is the name of the probability model used for the sampling distribution?
d. What is the mean of the sampling distribution?
e. What is the value of the standard error?
f. If your conclusion is in error, what type of error is it?


Problems


Problems

a. Check the conditions for a normal sampling distribution.
b. State the hypotheses.
c. Find the p-value.
d. State your conclusion.
e. If you make an error in your conclusion, what type is it?
f. Find the mean of the sampling distribution.
g. Find the standard error of the sampling distribution.


Dependent and Independent Samples


Problems

a. Test the hypothesis that the people increased their knowledge. Use α = .05 and assume normality. State the appropriate hypotheses.

b. Find the p-value and state your conclusion.

c. Find the 90% confidence interval for the mean estimate of the increase in test scores.


Problems

a. State the hypotheses (assume normality).
b. Find the p-value, and state your conclusion.
c. Find the 95% confidence interval for the difference of the means, Gouda - Brie.
d. Find the mean and standard error of the sampling distribution.


Problems

a. State the appropriate hypotheses.
b. Find the p-value and state your conclusion.
c. What model is used for the sampling distribution, and what is the mean of the sampling distribution and its standard error?
d. Find the 98% confidence interval for the difference in proportions, men - women.


Summary of Chi-Square Applications

Goodness of Fit Test
Given one categorical variable with a fixed set of proportions for the categories.
Ha: The observed data does not fit the proportions.
Calculate expected values (Ho true proportion * total observations).
Observed and expected data in List Editor
PRGM: GOODFIT

Test for Independence
Given two categorical variables measured on the same population.
Ha: The variables are not independent (they are related).
Observed data in Matrix Editor
Stat-Tests-χ² Test

Test for Homogeneity
Given one categorical variable and two or more populations.
Ha: The proportions for the categories are not the same for all populations.
Observed data in Matrix Editor
Stat-Tests-χ² Test

Chapter 12, Summary

Chi-Square Distribution: Fair Die Example

Now we need a sampling distribution for the χ² statistic = 2.2, so we can calculate the probability of getting a χ² ≥ 2.2 when the true proportions are all equal to 1/6.

χ² Distribution for 5 df

This is a distribution of all possible χ² statistics calculated from all possible samples of 60 observations when there are 6 proportions or cells. Note that the degrees of freedom equal the number of proportions - 1.
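The statistic itself is the sum of (observed - expected)² / expected over the cells. The observed counts for the die example are not reproduced here, so the sketch below uses hypothetical counts for 60 rolls, chosen so that they give the same statistic of 2.2.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for faces 1..6 in 60 rolls (not the original data).
observed = [13, 7, 11, 9, 11, 9]
n = sum(observed)

# Under Ho every face has proportion 1/6, so each expected count is n / 6.
expected = [n / 6] * 6

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1  # number of proportions minus 1

print(round(stat, 4), df)
```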

Finding the p-value on the TI-83, Given χ² Stat, df

PRGM – CHI2DIST
LOWER BOUND: 2.2
UPPER BOUND: 2ND E99
df: 5
Output: P-VALUE = 0.8208
The null hypothesis cannot be rejected.
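For readers without the calculator program, the same upper-tail area can be computed directly. The sketch below is an illustrative stand-in for CHI2DIST, not a copy of that program. It uses the closed-form series for the chi-square upper tail that holds for whole-number degrees of freedom, and `chi2_upper_tail` is a name invented for this example.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square >= x) for a positive whole number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j = 0 .. df/2 - 1.
        term = math.exp(-half)
        total = term
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total

p_value = chi2_upper_tail(2.2, 5)
print(round(p_value, 4))
```

A p-value this large is far above any usual α, which agrees with the conclusion that the null hypothesis cannot be rejected.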


Problems

a. Perform a hypothesis test to see if the preferences are not all the same. State the hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the sampling distribution?


Problems

a. Perform a hypothesis test to see if the preferences are not all the same. State the hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the sampling distribution?



Problems

a. Test the hypothesis that the size of community reared in is independent of the size of community residing in. State the appropriate hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the sampling distribution?
d. What are the necessary conditions, and are they satisfied? What is the value of the smallest expected cell?


The F-Distribution

Sec 10.5, Page 226

Each sample must be from a normal distribution.

Problem


Set up the problem so that the F-Stat > 1.

a. State the necessary hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the sampling distribution?


Problems

Sec 12.1, Page 268

a. State the necessary hypotheses.
b. Sketch the side-by-side box plots. Does it appear that the means are all the same?
c. Find the p-value and state your conclusion.
d. What is the name of the model used for the sampling distribution?


Problems

Problems, Page 268

City           Sample Size   Sample Mean   Sample St. Dev.
Atlanta        6             24.67         7.76
Boston         7             33.00         9.56
Dallas         7             30.86         7.58
Philadelphia   5             32.20         7.47
Seattle        5             27.40         9.40
St. Louis      6             25.83         10.03

a. Test the hypothesis that the mean commute times are not all the same. State the appropriate hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the sampling distribution?
d. What is the F-Statistic, the df numerator and df denominator?
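Part d can be approximated from the summary table alone, because the between-group and within-group sums of squares depend only on each sample's size, mean, and standard deviation. The sketch below is a minimal illustration. The function name is an assumption, and because the table values are rounded, the result may differ slightly from one computed from the raw commute times.

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (size, mean, st. dev.) per group."""
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n

    # Between-group variation: how far each sample mean is from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group variation: pooled from the sample standard deviations.
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)

    df_num = k - 1
    df_den = total_n - k
    f_stat = (ss_between / df_num) / (ss_within / df_den)
    return f_stat, df_num, df_den

# (sample size, sample mean, sample st. dev.) for each city in the table.
cities = [
    (6, 24.67, 7.76),   # Atlanta
    (7, 33.00, 9.56),   # Boston
    (7, 30.86, 7.58),   # Dallas
    (5, 32.20, 7.47),   # Philadelphia
    (5, 27.40, 9.40),   # Seattle
    (6, 25.83, 10.03),  # St. Louis
]

f_stat, df_num, df_den = anova_from_summary(cities)
print(round(f_stat, 3), df_num, df_den)
```

The p-value is then the area to the right of this F statistic under the F-distribution with these degrees of freedom, which the calculator's ANOVA test reports directly.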

