
Chapter 1 Overview and Descriptive Statistics

1.1 - Populations, Samples and Processes

1.2 - Pictorial and Tabular Methods in Descriptive Statistics

1.3 - Measures of Location

1.4 - Measures of Variability

Note that these are textbook chapters, although Lecture Notes may be referenced.

STATISTICS IN A NUTSHELL

Examples: Toasting time, Temperature settings, etc. of a population of toasters…

What is “random variation” in the distribution of a population?

POPULATION 1: Little to no variation

O O O O O

In engineering situations such as this, we try to maintain “quality control”… i.e., “tight tolerance levels,” high precision, low variability.

But what about a population of, say, people?

(e.g., product manufacturing)

Density

POPULATION 1: Little to no variation

Most individual values ≈ population mean value

Example: Body Temperature (F)

Very little variation about the mean!

98.6 F

(e.g., clones)

POPULATION 2: Much variation (more common)

Examples: Gender, Race, Age, Height, Annual Income, …

Density

Much more variation about the mean!


Example

How is this accomplished? Hospital records, etc.

“sampling frame”

Women in U.S. who have given birth

POPULATION

“Random Variable” X = Age at first birth

mean μ = ???

Suppose we know that X follows a “normal distribution” (a.k.a. “bell curve”) in the population.

That is, the Population Distribution of X ~ N(μ, σ).

Study Question: How can we estimate the “mean age at first birth” of women in the U.S.?

{x1, x2, x3, x4, … , x400}

μ and σ are “population characteristics,” i.e., “parameters” (fixed, unknown).

standard deviation

POPULATION

SAMPLE {x1, x2, x3, x4, … , x400} → FORMULA → mean x̄ = 25.6

mean μ = ???

x̄ is an example of a “sample characteristic” = “statistic” (numerical info culled from a sample).

This is called a “point estimate” of μ from the one sample.

Can it be improved, and if so, how?

• Choose a bigger sample, which should reduce “variability.”

• Average the sample means of many samples, not just one. (introduces “sampling variability”)
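The effect of sample size on the variability of the sample mean can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with hypothetical values μ = 25.5 and σ = 1.5 (the true μ is unknown in the slides), and the function name `spread_of_sample_means` is made up for this illustration.

```python
import random
import statistics


def spread_of_sample_means(n, num_samples=500, mu=25.5, sigma=1.5, seed=0):
    """Draw num_samples random samples of size n from N(mu, sigma)
    and return the standard deviation of their sample means.

    mu and sigma are hypothetical population values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sample_means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]
    return statistics.stdev(sample_means)


spread_small = spread_of_sample_means(n=25)
spread_large = spread_of_sample_means(n=400)

# The sample means from bigger samples cluster more tightly around mu.
print(f"n = 25:  spread of sample means = {spread_small:.3f}")
print(f"n = 400: spread of sample means = {spread_large:.3f}")
```

Under these assumptions the spread should come out near σ/√n, i.e., roughly 0.3 for n = 25 and roughly 0.075 for n = 400.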

“Sampling Distribution” ~ ???

How big???


x = 25.6

Other possible parameters:

• standard deviation

• median

• minimum

• maximum

standard deviation

mean x = 25.6

Without knowing every value in the population, it is not possible to determine the exact value of μ with 100% “certainty.”

HOWEVER…


For concreteness, suppose the population standard deviation σ = 1.5.

95% CONFIDENCE INTERVAL FOR µ

(25.453, 25.747)

BASED ON OUR SAMPLE DATA, the true value of μ is between 25.453 and 25.747, with 95% “confidence” (…akin to “probability”).
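The endpoints of this interval can be reproduced with a short calculation. The sketch below assumes the standard normal-based interval x̄ ± z·σ/√n with σ treated as known, and it assumes that the value 1.5 given “for concreteness” is the population standard deviation σ. The function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    # Critical value z such that the central `confidence` area lies in (-z, z).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the example: sample mean 25.6, n = 400, sigma = 1.5 (assumed).
lower, upper = z_confidence_interval(sample_mean=25.6, sigma=1.5, n=400)
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")  # (25.453, 25.747)
```

Here z is approximately 1.96 and σ/√n = 1.5/20 = 0.075, so the margin of error is about 0.147.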


This is called an “interval estimate” of μ from the sample.

Used in “Statistical Inference” via “Hypothesis Testing”…

(Stat 312)

mean x = 25.6

POPULATION

SAMPLE {x1, x2, x3, x4, … , xn} → FORMULA → mean x̄

mean μ = ???

standard deviation

• Arithmetic Mean

• Geometric Mean

• Harmonic Mean

Each of these gives an estimate of μ for a particular sample.

Any general sample estimator for μ is denoted by the symbol μ̂.

Likewise for σ and σ̂.

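Arithmetic Mean: $\bar{x}_A = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$

Geometric Mean: $\bar{x}_G = \sqrt[n]{x_1 \, x_2 \cdots x_n}$

Harmonic Mean: $\bar{x}_H = \dfrac{n}{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}}$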
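The three means can be compared on a small sample. The following is a minimal sketch: the sample values are made up for illustration, the function names are hypothetical, and the geometric and harmonic means are assumed to be applied to positive values only.

```python
import math


def arithmetic_mean(xs):
    """Sum of the values divided by how many there are."""
    return sum(xs) / len(xs)


def geometric_mean(xs):
    """n-th root of the product, computed via logs (requires all x > 0)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """n divided by the sum of reciprocals (requires all x > 0)."""
    return len(xs) / sum(1 / x for x in xs)


sample = [2.0, 4.0, 8.0]  # hypothetical data, not from the slides

print(f"arithmetic: {arithmetic_mean(sample):.4f}")
print(f"geometric:  {geometric_mean(sample):.4f}")
print(f"harmonic:   {harmonic_mean(sample):.4f}")
```

For this sample the arithmetic mean is 14/3, the geometric mean is 4, and the harmonic mean is 24/7, which illustrates that the three formulas generally give different estimates from the same data. Python's standard `statistics` module also provides `mean`, `geometric_mean`, and `harmonic_mean`.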

Extending these ideas to other parameters of a population gives rise to the general theory of “PARAMETER ESTIMATION” (Stat 311).

POPULATION composed of “units” (people, rocks, toasters, ...)

What do we want to know about this population?

How is “Random Variable” X (age, income level, …) distributed?

Suppose we know that X follows a known “probability distribution” in the population… but with unknown parameter values θ1, θ2, …

That is, the Population Distribution of X ~ Dist(θ1, θ2, …).

Ideal properties…

• Unbiased estimator of θ

• Minimum Variance among all such unbiased estimators, i.e., “MVUE”

heavily skewed tail

SAMPLE

For a particular θ, want to define a corresponding “parameter estimator” θ̂.

To make certain calculations simpler, we assume that populations are “arbitrarily large” (or indeed, infinite).

POPULATION

How do we estimate these?


“Random Variable”

X = any numerical value that can be assigned to each unit of a population

“Random” refers to the notion that this value is unknown until actually observed (usually as part of an outcome of an experiment to test a specific hypothesis). Contrast this with the idea of a “nonrandom” variable with no empirical error, e.g., X = # cards in a deck = 52.

There are two general types: Quantitative and Qualitative.

Quantitative [measurement]

• length

• temperature

• pulse rate

• # puppies

• shoe size (e.g., 10, 10½, 11)

Quantitative variables are either:

• CONTINUOUS (can take their values at any point in a continuous interval)

• DISCRETE (only take their values in disconnected jumps)

Qualitative [categorical]

• ORDINAL, RANKED (ordered labels): video game levels (1, 2, 3, ...), income level (low, mid, high)

• NOMINAL (unordered labels): zip code, color (Red, Green, Blue)

IMPORTANT SPECIAL CASE: Binary (or Dichotomous)

• “Pregnant?” (Yes / No)

• Coin toss (Heads / Tails)

• Treatment (Drug / Placebo)

Discrete Random Variable, coded 1 = “Success”, 0 = “Failure”

Define a new parameter π = P(Success).

Point estimator π̂ = ?

Suppose we intend to select a random sample of size n from this population of Successes and Failures…

… in such a way that the “Success or Failure” outcome of any selected individual conveys no information about the “Success or Failure” outcome of any other selected individual. That is, the “Success or Failure” outcomes of any two individuals are independent. (Think of tossing a coin n times.)

POPULATION

Let X = “Number of Successes in the sample” (0, 1, 2, …, n).

Then a natural estimator for π could be π̂ = X / n, the sample proportion of Success.

Ex: n = 500 tosses, X = 285 Heads

π̂ = 285 / 500 = 0.57
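The coin-toss example can be checked directly by coding each outcome as 1 (“Success”) or 0 (“Failure”) and averaging. This is a minimal sketch: the function name `sample_proportion` and the way the 500 tosses are laid out in a list are assumptions made for illustration.

```python
def sample_proportion(outcomes):
    """Proportion of Successes in a sample coded 1 = Success, 0 = Failure."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("sample must contain at least one outcome")
    x = sum(outcomes)  # X = number of Successes in the sample
    return x / n


# n = 500 tosses with X = 285 Heads, as in the example above.
tosses = [1] * 285 + [0] * 215
pi_hat = sample_proportion(tosses)
print(pi_hat)  # 0.57
```

Because the outcomes are coded 0 and 1, the sample proportion is simply the arithmetic mean of the coded values.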
