Further Mathematics Displaying Bivariate Data K McMullen 2012

Displaying Bivariate Data

Bivariate Data: data with two variables (two quantities or qualities that change)

Generally one variable depends on the other

The dependent variable depends on the independent variable

Eg. Height and Weight

Eg. Hours studied and test result

The distinction between dependent and independent variables matters most when plotting scatterplots

Back-to-back stem plots are used to display the relationship between a numerical variable and a two-valued categorical variable

They are used to compare data sets using summary statistics such as measures of centre and measures of spread

Eg. Comparing Further Maths study scores (numerical variable) by gender (male or female, a two-valued categorical variable)
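
To make the layout concrete, here is a minimal Python sketch that prints a back-to-back stem plot as text. The function name `back_to_back_stem` and the study scores are made up for illustration, and the sketch assumes non-negative whole numbers with the tens digit as the stem.

```python
from collections import defaultdict

def back_to_back_stem(left, right):
    """Return the rows of a back-to-back stem plot for two lists of whole numbers."""
    left_leaves, right_leaves = defaultdict(list), defaultdict(list)
    for value in left:
        left_leaves[value // 10].append(value % 10)
    for value in right:
        right_leaves[value // 10].append(value % 10)
    lowest = min(min(left), min(right)) // 10
    highest = max(max(left), max(right)) // 10
    rows = []
    for stem in range(lowest, highest + 1):
        # Left-hand leaves are written largest-first so they read outwards from the stem.
        left_part = " ".join(str(d) for d in sorted(left_leaves[stem], reverse=True))
        right_part = " ".join(str(d) for d in sorted(right_leaves[stem]))
        rows.append(f"{left_part:>12} | {stem} | {right_part}")
    return rows

# Hypothetical study scores, invented for this example only.
male = [25, 28, 31, 33, 36, 40, 42]
female = [27, 30, 32, 35, 38, 41, 44]

rows = back_to_back_stem(male, female)
for row in rows:
    print(row)
```

The shared stem column in the middle is what lets you compare the centre and spread of the two groups at a glance.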

Parallel box plots are used to display the relationship between a numerical variable and a categorical variable with two or more categories

They are used to compare sets of data using summary statistics such as measures of centre and measures of spread, which can be read from each plot's five-number summary

Remember that parallel box plots must be drawn against the same axis and scale (you can also do this on CAS)

Eg. The results achieved by 4 different further maths classes
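
Each box in a parallel box plot is drawn from a five-number summary, so the comparison can be sketched numerically as below. The class results are invented, the function name `five_number_summary` is hypothetical, and the quartiles use the median-of-halves rule (the median is left out of both halves when the number of values is odd), which is an assumption; CAS or other software may use a slightly different quartile rule.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves quartile rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical results for two classes, invented for this example only.
classes = {
    "Class A": [22, 25, 28, 30, 33, 36, 40],
    "Class B": [20, 24, 27, 31, 35, 38, 41, 45],
}

summaries = {name: five_number_summary(scores) for name, scores in classes.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Comparing the medians gives a comparison of centre, while comparing the ranges and interquartile ranges (Q3 minus Q1) gives a comparison of spread.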

Two-way frequency tables are used to display the relationship between two categorical variables and can be represented graphically as a segmented bar chart

Remember that it is easier to compare data sets if you are working with percentages instead of totals

In a two-way frequency table, place the independent variable along the top row and the dependent variable down the left column. When percentages are calculated within each column, every column must add to 100%
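
The column-percentage idea can be sketched in a few lines of Python. The counts, the category names and the function name `column_percentages` are all invented for illustration; the independent variable (gender) forms the columns and the dependent variable (a yes/no response) forms the rows.

```python
def column_percentages(counts):
    """Convert {column: {row: count}} into {column: {row: percentage of column total}}."""
    table = {}
    for column, rows in counts.items():
        total = sum(rows.values())
        table[column] = {row: 100 * n / total for row, n in rows.items()}
    return table

# Hypothetical counts, invented for this example only.
counts = {
    "Male": {"Yes": 30, "No": 20},
    "Female": {"Yes": 45, "No": 15},
}

percentages = column_percentages(counts)
for column, rows in percentages.items():
    print(column, {row: f"{p:.0f}%" for row, p in rows.items()})
```

Because each column is converted separately, groups of different sizes (50 males and 60 females here) can be compared fairly.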

Scatterplots are used to display the relationship (correlation) between two numerical variables

The dependent variable is displayed on the vertical axis

The independent variable is displayed on the horizontal axis

The relationship between variables on a scatterplot can be described in terms of:

Strength (strong, moderate, weak)

Direction (positive, negative)

Form (linear, non-linear)

Pearson’s product-moment correlation coefficient (r) measures the strength and direction of the linear relationship shown in a scatterplot

The values of r range from -1 (perfect negative) to 1 (perfect positive)

You can approximate the value of r by hand (see the formula on p. 101), but calculating it on CAS is more reliable

To interpret r, refer to the table on page 100 of your textbook and copy it into your notes
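
For reference, the calculation that CAS performs can be sketched in Python using the standard definition of Pearson's r. This is not necessarily the approximation formula given in the textbook, and the data and the function name `pearson_r` are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the standard definition (sum of products of deviations)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for this example only.
hours_studied = [1, 2, 3, 4, 5]       # independent variable
test_result = [52, 58, 61, 70, 74]    # dependent variable

r = pearson_r(hours_studied, test_result)
print(f"r = {r:.2f}")  # prints "r = 0.99"
```

A value this close to 1 would be described as a strong, positive, linear relationship.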

• The coefficient of determination (r²): this provides information about the degree to which one variable can be predicted from another variable, provided that the variables have a linear correlation

• The coefficient of determination is calculated by squaring the correlation coefficient (r)

• When commenting using r², always convert your value into a percentage

• Comments

“The coefficient of determination tells us that (r² × 100)% of the variation in the dependent variable is explained by the variation in the independent variable”
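
As a quick worked example of turning r into this comment, the sketch below squares a value of r, converts it to a percentage and fills in the sentence. The value r = 0.9, the variable names and the function name `describe_r_squared` are assumptions made up for illustration.

```python
def describe_r_squared(r, dependent, independent):
    """Square r, convert to a percentage and build the standard comment."""
    percent = round(r ** 2 * 100, 1)
    return (
        f"The coefficient of determination tells us that {percent}% of the variation "
        f"in {dependent} is explained by the variation in {independent}."
    )

# Hypothetical correlation coefficient, invented for this example only.
sentence = describe_r_squared(0.9, "test result", "hours studied")
print(sentence)
```

Note that squaring removes the sign, so r = 0.9 and r = -0.9 both give 81%; the direction has to be taken from r or from the scatterplot itself.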

• You must remember the difference between correlation and causation

• To interpret your scatterplot, stick to the variables given and do not make any unnecessary assumptions

• If the relationship in your scatterplot is negative: “As the IV increases, the DV decreases”

• If the relationship in your scatterplot is positive: “As the IV increases, the DV increases”

Example: Age and arm span of teenage boys

Comment: As the age of teenage boys increases the length of their arm span also increases

Assumption: As teenage boys get taller their arm span increases

Obviously they get taller but height is not a variable and therefore you should not comment on it

Eg. The number of cigarettes smoked and fitness level

Comment: As the number of cigarettes smoked increases, the fitness level of participants decreases

Assumption: Smoking cigarettes causes fitness levels to decrease

You must remember that there can be other factors that account for low levels of fitness, such as lack of exercise or weight

Eg. People catching public transport and the sales of designer handbags

Comment: As the number of people catching public transport increases, the number of people buying designer handbags decreases

Assumption: A high proportion of people catching public transport has caused a decline in the sales of designer handbags

These two variables have no plausible causal link even though they may be correlated. Always question the validity of statistics: what else could have caused public transport use to increase and designer handbag sales to decrease?

Work through Ch 4 questions and chapter review