Lecture 1. Making Sense of Data: Data Variation
David R. Merrell
90-786 Intermediate Empirical Methods for Public Policy and Management



Making Sense of Data: Data Variation

Introductions
  Instructor: David R. Merrell
  TAs: Max Hernandez-Toso and Hao Xu

Course Content: USEFUL STATISTICS

Statistics is the use of data to reduce uncertainty about potential observations

Course Information
  Web site: http://Duncan.heinz.cmu.edu/GeorgeWeb/Heinz 90-786 Front Page.htm
  Data files: r:/academic/90786

Making Sense of Data
  Motivation in management and policy
  What is data?
  What's the use of data?
  Data variation

Motivation for Statistical Input

Managerial Decision Making
  Changes in societal or organizational conditions
  Differences between observations and expectations
Policy Making
  Impact of changing the system

What is Data?

Unit of analysis
Number of variables
  one, two, more than two
Level of measurement / kind of data
  Nominal, Ordinal, Interval

Unit of analysis

Focus of attention: a case that can be separately and uniquely identified
  person (student, woman, tenant, ...)
  place (city, street intersection, river, ...)
  object (car, power plant, ...)
  organization (school, corporation, ...)
  incident (birth, election, ...)
  time period (day, season, year, ...)

Variables
  Characteristics, attributes, and occurrences observed about each unit of analysis
  Require a specific step-by-step procedure to obtain values for the variable

Examples

Driver's license application study
  Unit of analysis: people who apply for a driver's license
  Outcome variable: license issued or not
  Other variables: applicant's age, sex, and race
Snowfall in Pittsburgh
  Unit of analysis: snowstorms
  Outcome variable: depth of the snowfall from each storm
  Other variables: date of snowstorm, temperature
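The first example can be pictured as a table with one row per unit of analysis and one field per variable. The sketch below is only an illustration: the class name `LicenseApplication`, its field names, and the three sample records are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class LicenseApplication:
    """One unit of analysis: a person who applied for a driver's license."""
    issued: bool  # outcome variable: license issued or not
    age: int      # other variables observed about the applicant
    sex: str
    race: str


# Hypothetical records, one per applicant (placeholder category labels).
applications = [
    LicenseApplication(issued=True, age=17, sex="F", race="A"),
    LicenseApplication(issued=False, age=16, sex="M", race="B"),
    LicenseApplication(issued=True, age=34, sex="F", race="B"),
]

# Count the units of analysis and summarize the outcome variable.
n_units = len(applications)
share_issued = sum(app.issued for app in applications) / n_units
print(n_units, round(share_issued, 2))
```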

Nominal data
  Classifies outcomes by categories
  Categories must be mutually exclusive and exhaustive
  Examples: marital status, region of the country, religion, occupation, school district, place of birth, blood type
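Because nominal categories have no order, the natural summary is a count per category. A minimal sketch, assuming a made-up list of blood types (the values are hypothetical):

```python
from collections import Counter

# Hypothetical blood types for eight people. Each person falls in exactly
# one category (mutually exclusive) and every person is covered (exhaustive).
blood_types = ["O", "A", "A", "B", "O", "AB", "O", "O"]

# Frequency of each category.
counts = Counter(blood_types)

# The most frequent category is the only "typical value" that makes sense
# for nominal data; sorting or averaging the labels would be meaningless.
most_common_type, most_common_count = counts.most_common(1)[0]
print(counts)
print(most_common_type, most_common_count)
```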

Ordinal data
  Classifies outcomes by ranked categories
  Examples:
    Officers in the U.S. Army can be classified as:
      1 = general
      2 = colonel
      3 = lieutenant colonel
      4 = major
      5 = captain
      6 = first lieutenant
      7 = second lieutenant
    Education (highest diploma or degree attained)
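The rank codes above can be written down as an ordered enumeration. This is a sketch rather than an official coding scheme: the class name `ArmyRank` and the sample group of officers are hypothetical, while the codes 1 through 7 are the ones listed on the slide.

```python
from enum import IntEnum


class ArmyRank(IntEnum):
    """Rank codes from the slide; a lower code means a higher rank."""
    GENERAL = 1
    COLONEL = 2
    LIEUTENANT_COLONEL = 3
    MAJOR = 4
    CAPTAIN = 5
    FIRST_LIEUTENANT = 6
    SECOND_LIEUTENANT = 7


# Hypothetical group of officers.
officers = [ArmyRank.CAPTAIN, ArmyRank.GENERAL, ArmyRank.MAJOR]

# Ordering is meaningful: sorting by code lists the highest rank first.
by_rank = sorted(officers)
print([rank.name for rank in by_rank])

# Comparisons are meaningful, but differences between codes are not
# distances: the step from code 1 to 2 need not equal the step from 6 to 7.
colonel_outranks_major = ArmyRank.COLONEL < ArmyRank.MAJOR
print(colonel_outranks_major)
```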

Interval data
  Classifies outcomes on a continuous scale
  Examples: Scholastic Aptitude Test (SAT) score, Consumer Price Index (CPI), time of day
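On an interval scale, differences and averages carry meaning, which is not true of nominal or ordinal codes. The sketch below illustrates this with made-up SAT scores and times of day; all values are hypothetical.

```python
from datetime import datetime
from statistics import mean

# Hypothetical SAT scores: the gap between two scores and the average
# score are both meaningful quantities.
sat_scores = [1100, 1200, 1300]
score_gap = sat_scores[1] - sat_scores[0]
average_score = mean(sat_scores)

# Hypothetical times of day: the difference is an elapsed duration.
start = datetime.strptime("09:30", "%H:%M")
end = datetime.strptime("11:00", "%H:%M")
elapsed_minutes = (end - start).total_seconds() / 60

print(score_gap, average_score, elapsed_minutes)
```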

What's the Use of Data?
  Description
  Evaluation
  Estimation

Description
  Summary of observations
  In February 1997, the M1A money supply in Taiwan rose 6.46% over February 1996
  Housing starts in June 1996 rose to a seasonally adjusted rate of 1,480,000 units from a revised 1,461,000 in May
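A descriptive statement such as "rose 6.46%" is a percent change between two observations. As an illustration, the same calculation can be applied to the housing-starts figures quoted above; the function name `percent_change` is a hypothetical helper, and the two inputs are the numbers from the slide.

```python
def percent_change(old: float, new: float) -> float:
    """Percent change from old to new, relative to old."""
    return (new - old) / old * 100.0


# Seasonally adjusted housing starts quoted on the slide.
may_starts = 1_461_000   # revised May 1996 figure
june_starts = 1_480_000  # June 1996 figure

change = percent_change(may_starts, june_starts)
print(f"{change:.2f}%")  # about 1.30%
```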

Evaluation
  Comparison of observed state of affairs against expectations
  Expectations are based on: ethical norms, managerial plans and budgets
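When the expectation is a budget, evaluation amounts to subtracting the planned figure from the observed one. A minimal sketch, assuming a made-up budget (the line items and dollar amounts are hypothetical):

```python
# Hypothetical planned budget and observed spending, in dollars.
planned = {"salaries": 500_000, "supplies": 40_000, "travel": 10_000}
observed = {"salaries": 510_000, "supplies": 35_000, "travel": 12_000}

# Evaluation compares what was observed against what was expected.
# A positive difference means spending exceeded the plan.
difference = {item: observed[item] - planned[item] for item in planned}
over_budget = [item for item, diff in difference.items() if diff > 0]

print(difference)
print(over_budget)
```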

Estimation
  Uses observations to assess an attribute of a population or to predict future values
  A new charter school in Boston raised test scores an average of 7 percentile points.
    How would other charter schools do?
    How will this charter school do in the future?

Data Variation: Data Compression and Display
  Boxplots
  Five number summary
    minimum
    lower quartile point
    median
    upper quartile point
    maximum
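The five number summary can be computed directly from a sorted list. The sketch below is one possible implementation, not necessarily the convention used in the course: it assumes the quartile points are the medians of the lower and upper halves of the data, leaving out the middle value when the count is odd. The function name `five_number_summary` and the sample values are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving out the middle value when the count is odd. Other quartile
    conventions give slightly different answers.
    """
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    n = len(data)
    lower_half = data[: n // 2]
    upper_half = data[(n + 1) // 2:]
    # With a single value both halves are empty; fall back to that value.
    q1 = median(lower_half) if lower_half else data[0]
    q3 = median(upper_half) if upper_half else data[-1]
    return data[0], q1, median(data), q3, data[-1]


# Hypothetical batting averages, not the 263 players from the lecture.
sample = [0.196, 0.240, 0.250, 0.263, 0.270, 0.290, 0.353]
summary = five_number_summary(sample)
print(summary)
```

A boxplot draws these five numbers: the box spans the lower and upper quartile points, a line marks the median, and the whiskers reach toward the minimum and maximum.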

[Figure: Batting Average of 263 major league baseball players. Axis: Aver Career, scaled from 0 to 0.4.]

Compressed Data Values
  Median              0.263
  Minimum             0.196
  Maximum             0.353
  Range               0.155
  Mode                0.250
  Mean                0.263
  Standard Deviation  0.023
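Each of the compressed values above is a standard summary statistic. The sketch below shows how they could be computed with Python's `statistics` module on a small made-up sample; the data are hypothetical and are stored in thousandths (250 means 0.250) to keep the arithmetic exact. It assumes the sample standard deviation, which may or may not be the version used for the slide.

```python
from statistics import mean, median, mode, stdev

# Hypothetical batting averages in thousandths (250 means 0.250).
averages = [250, 250, 260, 270, 300]

summary = {
    "median": median(averages),
    "minimum": min(averages),
    "maximum": max(averages),
    "range": max(averages) - min(averages),  # maximum minus minimum
    "mode": mode(averages),                  # most frequent value
    "mean": mean(averages),
    "standard deviation": stdev(averages),   # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```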

[Figure: Batting Average of 263 major league baseball players. Axis: Aver Career, scaled from 0 to 0.4. Annotations: Median 0.263, Maximum 0.352, Minimum 0.196.]

Next Time ... Data Compression for One Variable