INTRODUCTION TO STATISTICAL CONCEPTS

Objectives

• Definition of “statistics”

• Descriptive vs. Inferential Statistics

• Types of Descriptive Statistics

• Elements of Inferential Statistics

• Qualitative vs. Quantitative Data

• Data Collection Methods

• Inference errors from nonrandom samples

SURVEY

• A random sample of students taking a statistics class is asked, “What is your age?”

Responses

23 27 25 26 22 23 21 24 30 45
21 20 25 19 35 23 25 20 31 26

DATA

• From this survey we get data:

22 35 21 26
23 27 25 20
20 19 25 24
25 26 23 21
23 30 45 31

INFORMATION

• In reality, data files are often very large
  – Much larger than this example

• Data is often stored in
  – Large computer databases
  – Printed records

• The key question is, “How do we extract useful information from this data?”

What is Statistics?

Statistics is a way to get INFORMATION from DATA

Types of Statistics

• Two Types of Statistics
  – Descriptive
  – Inferential

DESCRIPTIVE STATISTICS

• Graphical Depictions of Data
  – Histograms (Bar Charts)
  – Pie Charts
  – Other Types of Charts/Graphs

• Numerical Descriptions/Measures of Data
  – Frequencies
  – Measures of Central Tendency
  – Measures of Variability
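The numerical measures listed above can be tried on the 20 survey ages from the DATA slide. This is a minimal sketch using Python's standard library rather than the hand/Excel methods used in the course; the variable names are illustrative only.

```python
import statistics
from collections import Counter

# The 20 ages reported in the survey earlier in these notes.
ages = [22, 35, 21, 26, 23, 27, 25, 20, 20, 19,
        25, 24, 25, 26, 23, 21, 23, 30, 45, 31]

# Frequencies: how many students reported each age.
frequencies = Counter(ages)

# Measures of central tendency.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
modes = statistics.multimode(ages)  # every age tied for most frequent

# Measures of variability.
age_range = max(ages) - min(ages)
sample_sd = statistics.stdev(ages)  # sample standard deviation (divides by n - 1)

print("Frequencies:", sorted(frequencies.items()))
print("Mean:", mean_age, "Median:", median_age, "Mode(s):", modes)
print("Range:", age_range, "Sample standard deviation:", round(sample_sd, 2))
```

Each printed value condenses the raw list into one piece of information, which is the role descriptive statistics play in these notes.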

Inferential Statistics

• Testing hypotheses

• Making inferences from surveys

• Giving ranges for estimates

• Predicting the value of one variable (e.g. sales) for given values of other variables (e.g. advertising dollars)

• Forecasting future values over time

• Quality control

Basic Statistical Concepts

• Population
  – A set of items (experimental units) under study

• Parameter (Variable)
  – A descriptive measure of the population that is of interest, e.g. the mean (Unknown -- Use Greek letter)

• (Random) Sample
  – A (random) subset chosen from the population

• Statistic
  – A descriptive measure that is calculated from the sample, e.g. the sample mean (Use regular letter)

Purpose of Inferential Statistics

Making inferences about a parameter of a population based on information obtained from a statistic of the sample

(With a Certain Degree of Confidence)

Example

What is the average age of students taking the introductory statistics class at this university?

• POPULATION under study
  – All students taking the statistics course at this university
    • We may not have access to all records
    • Even if we did, this population is constantly changing with adds/drops

• PARAMETER of interest
  – Average age of all students taking the course
    • Symbol -- μ
    • We can never know for sure without looking at the entire database

EXAMPLE (Continued)

• Take a (random) SAMPLE
  – Obtain data from a random subset of the population -- i.e. randomly select 8 students taking the course and ask, “What is your age?”

  – Results might be: 23 26 19 30 21 25 25 26

• Calculate a STATISTIC from the sampled data
  – The average age of the sample of the 8 students (a statistic computed from the sample) can be calculated. This is not the average age of the population but is our best estimate of it.

The Sample Statistic

$$\bar{x} = \frac{23 + 26 + 19 + 30 + 21 + 25 + 25 + 26}{8} = \frac{195}{8} = 24.375$$
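The same calculation can be checked in a few lines of Python. This is a small sketch using the eight ages that appear in the sample-mean calculation on this slide; the names `sample`, `n` and `x_bar` are illustrative.

```python
# Ages of the 8 sampled students, as used in the sample-mean calculation.
sample = [23, 26, 19, 30, 21, 25, 25, 26]

n = len(sample)          # sample size
x_bar = sum(sample) / n  # sample mean: sum of the observations divided by n

print(f"n = {n}, sum = {sum(sample)}, sample mean = {x_bar}")
```

The result is a statistic: it describes only these 8 students, and a different random sample would generally give a different value.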

CONFIDENCE

• The average of the sample was 24.375

• What are the chances that the exact true average of all (1000(?)) students taking statistics is 24.375?
  – The chance is effectively 0
  – But it is our best single guess (point estimate)
  – Pretty sure the average is within the interval 23.375 to 25.375
  – Even more sure it is in the interval 21.375 to 27.375
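One way to see why a wider interval gives more confidence is a small simulation. The sketch below is an illustration only. The population of 1000 ages is made up (it is not course data), and the half-widths of 1 and 3 years mirror the two intervals above. It repeatedly draws random samples of 8 and counts how often the interval around the sample mean contains the true population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# ASSUMPTION: a hypothetical population of 1000 student ages (18 and up,
# skewed toward younger students). This is invented for illustration.
population = [min(60, 18 + int(random.expovariate(1 / 6.5))) for _ in range(1000)]
true_mean = statistics.mean(population)  # the parameter (normally unknown)


def coverage(half_widths, sample_size=8, trials=2000):
    """Fraction of random samples whose interval x_bar +/- h contains true_mean."""
    hits = {h: 0 for h in half_widths}
    for _ in range(trials):
        x_bar = statistics.mean(random.sample(population, sample_size))
        for h in half_widths:
            if abs(x_bar - true_mean) <= h:
                hits[h] += 1
    return {h: hits[h] / trials for h in half_widths}


rates = coverage([1, 3])
for h, rate in rates.items():
    print(f"sample mean +/- {h}: contains the true mean in {rate:.0%} of samples")
```

Because every sample that lands within 1 year of the true mean is also within 3 years, the wider interval can never do worse. How much better it does depends on the spread of the population and on the sample size.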

How Large An Interval?

• How wide does the interval have to be before we are “reasonably sure” the interval contains the true average age of all students taking statistics?

• The answer to this question is one of the basic concepts of inferential statistics

• We will discuss this later in the course

Computing Arithmetic Statistical Values in this Course

• By hand/tables
  – It is important to know the concepts behind statistical computations and to be able to calculate basic statistical values by hand or use statistical tables in the analyses

• Computer (EXCEL)
  – Computer packages are a valuable aid for making tedious and/or complex calculations and for generating usable output

TYPES OF DATA

• Qualitative Data
  – Observation is nonnumeric
    • What color is your car?
    • Who is your favorite candidate for President?
    • How would you rate your instructor?

• Quantitative Data
  – Observation is numeric
    • What is your GPA?
    • How far do you live from campus?
    • What is your salary?

Collecting Data

• Data can be extracted from a public source
  – Wall Street Journal, Orange County Business Journal

• A designed experiment can be performed
  – Test cavity prevention: divide subjects into groups

• A survey can be taken
  – Presidential poll (phone, mail), TV program (Nielsen)

• Observational studies can be made
  – Observe output of workers on morning/evening shifts

Goal of Data Collection

• To obtain a “representative sample” that exhibits the characteristics of the entire population

• Most common approach – taking random samples, where each experimental unit in the population theoretically has the same chance of being selected for the sample
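A simple random sample of this kind can be sketched with Python's standard `random` module. The roster of 1000 student IDs is hypothetical and stands in for whatever list of experimental units is actually available.

```python
import random

random.seed(42)  # fixed seed so the same sample is drawn on every run

# ASSUMPTION: a hypothetical roster of 1000 student IDs (not real records).
roster = [f"student_{i:04d}" for i in range(1, 1001)]

# random.sample draws without replacement, giving every student
# on the roster the same chance of being selected.
sample = random.sample(roster, k=8)

print(sample)
```

In practice the hard part is usually obtaining a complete roster. If some units are missing from the list, they have no chance of selection and the sample is no longer random in the sense described above.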

Nonrandom Sampling Errors

• Selection bias
  – One subset of experimental units in the population has either no chance, less of a chance, or more of a chance of being selected than another subset

• Nonresponse bias
  – When data is unavailable or unattainable for certain experimental units in the population

• Measurement errors
  – Inaccuracies in getting/recording data; ambiguous questions on questionnaires, etc.
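The effect of selection bias can be illustrated with a short simulation. Everything in this sketch is assumed for illustration: a made-up population with a younger day section and an older evening section, and a biased sampler that can only reach evening students.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# ASSUMPTION: a hypothetical population, not course data.
# 800 day-section students aged 18-24, 200 evening-section students aged 25-45.
day = [random.randint(18, 24) for _ in range(800)]
evening = [random.randint(25, 45) for _ in range(200)]
population = day + evening
true_mean = statistics.mean(population)  # the parameter of interest

# Random sample: every student has the same chance of selection.
random_sample = random.sample(population, 8)

# Biased sample: day students have no chance of selection (selection bias).
biased_sample = random.sample(evening, 8)

random_mean = statistics.mean(random_sample)
biased_mean = statistics.mean(biased_sample)

print(f"True mean age:       {true_mean:.2f}")
print(f"Random-sample mean:  {random_mean:.2f}")
print(f"Biased-sample mean:  {biased_mean:.2f}")
```

The random-sample mean can miss the true mean by chance, but the biased-sample mean is pushed upward systematically, because every student it can reach comes from the older subset.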

Using Nonrandom Samples

• Unintentionally
  – Leads to unjustified or false conclusions

• Intentionally
  – Designed to skew results on purpose
  – Unethical statistical practice

REVIEW

• What is statistics?

• What is the difference between descriptive and inferential statistics?

• What are the elements of inferential statistics?

• What are the two types of data?

• What are four ways data are collected?

• What is the importance of using random samples?
