CHAPTER 4: MORE ON TWO VARIABLE DATA
Section 4.3 – Relations in Categorical Data


ANALYZING CATEGORICAL DATA

To analyze categorical data, we use counts or percents of individuals that fall into various categories.

Two-way tables describe two categorical variables: the row variable is represented in the rows and the column variable in the columns.


DEFINITIONS

Marginal distributions – The totals of each row and each column, which appear in the margins of a two-way table, are referred to as the marginal distributions. Each one describes a single variable on its own.

Roundoff error – The difference between a listed total and the sum of the entries it summarizes, caused by rounding. In the table, the entries for the “35-54” age group don’t add up to the total listed. This is because the table is in thousands of persons and each entry is rounded to the nearest thousand.
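The two definitions above can be sketched in a few lines of Python. The counts here are made up for illustration (they are not the values from the slide's table), and the names `counts`, `row_totals`, `column_totals`, and `to_thousands` are hypothetical.

```python
# Hypothetical exact counts of persons (NOT the slide's data).
# Rows: education level; columns: age group.
counts = {
    "Did not finish HS": {"25-34": 1100, "35-54": 2400, "55+": 3300},
    "Completed HS":      {"25-34": 2100, "35-54": 3400, "55+": 4100},
}

# Marginal distributions: the total of each row and of each column.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
column_totals = {}
for cells in counts.values():
    for col, n in cells.items():
        column_totals[col] = column_totals.get(col, 0) + n

def to_thousands(n):
    """Express a count in thousands, rounded to the nearest thousand."""
    return round(n / 1000)

# Roundoff error: the rounded total need not equal the sum of the rounded entries.
listed_total = to_thousands(column_totals["35-54"])
sum_of_entries = sum(to_thousands(cells["35-54"]) for cells in counts.values())

print(row_totals)
print(column_totals)
print(listed_total, sum_of_entries)
```

In this made-up table the "35-54" column shows the mismatch, while the other two columns happen to agree after rounding.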


MARGINAL DISTRIBUTION

Percents are often more informative than counts.

You can represent marginal distributions using percents and a bar graph.

Example: Suppose you want to display the distribution of years of schooling completed among people aged 25 years or older.

Education               Percent
Did not finish HS       15.9
Completed HS            33.1
1-3 years of college    25.4
≥4 years of college     25.6

Each percent is a row total divided by the table total. For those who did not finish HS: (27,853 / 175,230) × 100 ≈ 15.9%.
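As a rough check of the arithmetic above, the following Python sketch recomputes the first percent from the two counts on the slide and draws the marginal distribution as a simple text bar graph. The helper names `percent` and `text_bar_graph`, and the choice of one `#` per two percentage points, are assumptions made for this illustration.

```python
def percent(part, total):
    """Return part as a percent of total, rounded to one decimal place."""
    return round(part / total * 100, 1)

# From the slide: 27,853 thousand out of 175,230 thousand did not finish HS.
did_not_finish_hs = percent(27_853, 175_230)

# Marginal distribution of education, in percents, as listed on the slide.
marginal_percents = {
    "Did not finish HS": 15.9,
    "Completed HS": 33.1,
    "1-3 years of college": 25.4,
    "≥4 years of college": 25.6,
}

def text_bar_graph(percents, scale=2):
    """Return one line per category, with one '#' per `scale` percentage points."""
    width = max(len(label) for label in percents)
    lines = []
    for label, pct in percents.items():
        bar = "#" * round(pct / scale)
        lines.append(f"{label:<{width}}  {bar} {pct}%")
    return "\n".join(lines)

print(did_not_finish_hs)
print(text_bar_graph(marginal_percents))
```

Because every person aged 25 or older falls into exactly one education category, the four percents add up to 100%.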


EXAMPLE 1 – HOW COMMON IS COLLEGE?

Create a bar graph that compares the percents of the three age groups who have completed 4 or more years of college.


CONDITIONAL DISTRIBUTIONS

In the last example we compared only the percents of people who finished college (≥4 years) in each age group. If you look back, you can see that those percents are not a distribution: each is computed from a different age group, so they don’t add up to 100%.

If we instead take a single age group, such as 25- to 34-year-olds, the percents add up to 100% because every 25- to 34-year-old falls into exactly one of the educational categories. These four percents together form a conditional distribution.

Conditional distribution – The distribution of one categorical variable, in counts or percents, among only those individuals who have a given value of the other variable.

Education for 25- to 34-year-olds

Education               Percent
Did not finish HS       11.8
Completed HS            30.6
1-3 years of college    28.3
≥4 years of college     29.3
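The idea of conditioning on one age group can be sketched in Python as follows. The counts are hypothetical and chosen only to keep the arithmetic easy (they are not the slide's data), and the function name `conditional_distribution` is an assumption for this example.

```python
def conditional_distribution(table, column):
    """Percents of each row category among the individuals in one column only."""
    column_counts = {row: cells[column] for row, cells in table.items()}
    total = sum(column_counts.values())
    return {row: round(n / total * 100, 1) for row, n in column_counts.items()}

# Hypothetical two-way table of counts (NOT the slide's data).
table = {
    "Did not finish HS":    {"25-34": 10, "55+": 30},
    "Completed HS":         {"25-34": 30, "55+": 40},
    "1-3 years of college": {"25-34": 30, "55+": 15},
    "≥4 years of college":  {"25-34": 30, "55+": 15},
}

young = conditional_distribution(table, "25-34")
older = conditional_distribution(table, "55+")

# Within one age group the percents form a distribution and sum to 100.
print(young, sum(young.values()))

# One percent taken from each age group does not form a distribution.
college = [dist["≥4 years of college"] for dist in (young, older)]
print(college, sum(college))
```

The last line shows why the percents in a comparison across age groups need not add up to 100%: each one has a different denominator.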


EXAMPLE 2 – CONDITIONAL DISTRIBUTION

Find the conditional distribution of years of school completed among people age 55 and over.

Education for 55 years and over

Education               Percent
Did not finish HS       25.4
Completed HS            35.8
1-3 years of college    19.9
≥4 years of college     18.9
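One way to compare the two conditional distributions shown on the slides is to subtract them category by category. The Python sketch below uses only the percents listed for the 25- to 34-year-olds and for those 55 and over; the variable names are assumptions for this illustration.

```python
categories = [
    "Did not finish HS",
    "Completed HS",
    "1-3 years of college",
    "≥4 years of college",
]

# Conditional distributions of education (percents) as listed on the slides.
age_25_34 = [11.8, 30.6, 28.3, 29.3]
age_55_plus = [25.4, 35.8, 19.9, 18.9]

# Each conditional distribution should total 100%, up to rounding.
for dist in (age_25_34, age_55_plus):
    assert abs(sum(dist) - 100.0) < 0.15

# Difference in percentage points: 55 and over minus 25-34.
differences = {
    cat: round(older - younger, 1)
    for cat, younger, older in zip(categories, age_25_34, age_55_plus)
}

for cat, diff in differences.items():
    print(f"{cat}: {diff:+.1f} points")
```

Positive values mark categories that are more common among people 55 and over, and negative values mark categories that are more common among 25- to 34-year-olds.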


READING A TWO-WAY TABLE GENERATED FROM SOFTWARE


Homework: p. 245, #53, 54, 59, 62-64, and 68