12
AP Statistics Name: ____________________________ Chapter 12 – Inference for Regression Date: ____________________ Per: _____ Section 12.2 Part 1: Transforming to Achieve Linearity In Chapter 3, we learned how to analyze relationships between two quantitative variables that showed a linear pattern. When two-variable data show a curved relationship, we must develop new techniques for finding an appropriate model. This section describes several simple transformations of data that can straighten a nonlinear pattern. Once the data have been transformed to achieve linearity, we can use least-squares regression to generate a useful model for making predictions. And if the conditions for regression inference are met, we can estimate or test a claim about the slope of the population (true) regression line using transformed data. Example 1: Health and Wealth (pg. 765 – 766) We expect people in richer countries to live longer because they have better access to healthcare and typically lead healthier lives. The overall pattern of this scatterplot does show this, but the relationship is nonlinear. Life expectancy rises very quickly as income per person increases, and then levels off. The scatterplot on the right shows a curved pattern. We can straighten things out using logarithms. Figure 12.10 below plots the logarithm of income per person against life expectancy for these same countries. The effect is almost magical. This graph has a clear, linear pattern! Applying a function such as the logarithm or square root to a quantitative variable is called transforming the data. We will see in this section that understanding how simple functions work helps us choose and use transformations to straighten nonlinear patterns.

Section 12.2 Notes - Mrs. Alexandra Voinea

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Section 12.2 Notes - Mrs. Alexandra Voinea

APStatistics Name:____________________________Chapter12–InferenceforRegression Date:____________________Per:_____

Section12.2Part1:TransformingtoAchieveLinearity In Chapter 3, we learned how to analyze relationships between two quantitative variables that showed a linear pattern. When two-variable data show a curved relationship, we must develop new techniques for finding an appropriate model. This section describes several simple transformations of data that can straighten a nonlinear pattern. Once the data have been transformed to achieve linearity, we can use least-squares regression to generate a useful model for making predictions. And if the conditions for regression inference are met, we can estimate or test a claim about the slope of the population (true) regression line using transformed data. Example 1: Health and Wealth (pg. 765 – 766) We expect people in richer countries to live longer because they have better access to healthcare and typically lead healthier lives. The overall pattern of this scatterplot does show this, but the relationship is nonlinear. Life expectancy rises very quickly as income per person increases, and then levels off. The scatterplot on the right shows a curved pattern. We can straighten things out using logarithms. Figure 12.10 below plots the logarithm of income per person against life expectancy for these same countries. The effect is almost magical. This graph has a clear, linear pattern!

Applying a function such as the logarithm or square root to a quantitative variable is called transforming the data. We will see in this section that understanding how simple functions work helps us choose and use transformations to straighten nonlinear patterns.

Page 2: Section 12.2 Notes - Mrs. Alexandra Voinea

Transforming with Powers and Roots Here are some examples of relationships in science that involve powers and roots. You do not need to memorize these!

Although a power model of the form 𝑦 = 𝑎𝑥! describes the relationship between x and y in each of these settings, there is a linear relationship between 𝑥! and y. If we transform the values of the explanatory variable x by raising them to the power of p, and graph the points (𝑥!,𝑦), the scatterplot should have a linear pattern!

Example 2: Go Fish! (pg. 768 – 769) Imagine that you have been put in charge of organizing a fishing tournament in which prizes will be given for the heaviest Atlantic Ocean rockfish caught. You know that many of the fish caught during the tournament will be measured and released. You are also aware that using delicate scales to try to weigh a fish that is flopping around in a moving boat will probably not yield very accurate results.It would be much easier to measure the length of the fish while on the boat. What you need is a way to convert length of the fish to its weight. You contact the nearby marine research laboratory, and they provide reference data on the length (in centimeters) and weight (in grams) for Atlantic Ocean rockfish of several sizes:

We know length is one-dimensional and weight (like volume) is three-dimensional, therefore a Power Model of the form 𝑤𝑒𝑖𝑔ℎ𝑡 = 𝑎(𝑙𝑒𝑛𝑔𝑡ℎ)! should describe the relationship. So what happens if we actually apply this model by cubing the lengths in the data table and then graph weight versus length? This particular transformation of the explanatory variable length helps us produce a graph that is quite linear. Pay special attention to how the axes are labeled before and after the transformation!

There’s another way we can transform the data in this example to achieve linearity. We can take the cube root of the weight values and graph 𝑤𝑒𝑖𝑔ℎ𝑡! versus length: Original equation: 𝑤𝑒𝑖𝑔ℎ𝑡 = 𝑎 ∙ 𝑙𝑒𝑛𝑔𝑡ℎ! Cube root both sides: 𝑤𝑒𝑖𝑔ℎ𝑡! = 𝑎 ∙ 𝑙𝑒𝑛𝑔𝑡ℎ!! Simplify: 𝑤𝑒𝑖𝑔ℎ𝑡! = 𝑎! ∙ 𝑙𝑒𝑛𝑔𝑡ℎ

The resulting scatterplot (at right) has a linear form. Again, pay special attention to the way the axes are labeled!

𝑙! weight

5.2! 8.5! 19.2! 25! 29.6! 34! 37.1!

Page 3: Section 12.2 Notes - Mrs. Alexandra Voinea

Example 3: Go Fish! Continued (pg. 770) Here is Minitab Output from separate regression analyses of the two sets of transformed Atlantic Ocean rockfish data.

Do each of the following for both transformations. (a) Give the equations of the least squares regression lines for both transformations. Define any variables you use. (b) Suppose a contestant in the fishing tournament catches an Atlantic Ocean rockfish that’s 36 centimeters long.

Use the models from part (a) to predict the fish’s weight. Show your work.

Page 4: Section 12.2 Notes - Mrs. Alexandra Voinea

Big Ideas about Transforming with Powers and Roots When experience or theory suggests that the relationship between two variables is described by a power model of the form 𝑦 = 𝑎 ∙ 𝑥!, you now have two strategies for transforming data to achieve linearity: 1. 2.

What if you have no idea what power to choose? You could guess and test until you find a transformation that works. But don’t worry, it turns out there is a much more efficient method of linearizing curved data. Instead of linearizing with powers and roots, we use logarithms. Transforming with Logarithms Not all curved relationships are described by power models. For instance, in the “Health and Wealth” example, a graph of life expectancy versus the logarithm (base 10) of income per person showed a linear pattern. We used Fathom software to fit a least-squares regression line to the transformed data and to make a residual plot. The figure to the right shows the results. The regression line is: Predicted life expectancy = 19.5 + 13.2 log(income) Taking the logarithm of the income per person values straightened out the curved pattern in the original scatterplot. The logarithm transformation can help achieve linearity when the relationship between two variables is described by a power model or an exponential model. Logarithms and Power Models

Big Ideas about Transforming with Logarithms To achieve linearity from a power model, we apply the logarithms transformation to both variables: 1. A power model has the form 𝑦 = 𝑎 ∙ 𝑥!, where a and p are constants. 2. Take the logarithm of both sides of the equation: Taking the logarithm of both sides results in a linear relationship between log x and log y. 3. Look very carefully… the power p in the power model becomes the _______________ of the straight line that links log y to log x! In conclusion, if a power model describes the relationship between two variables then the scatterplot of the logarithms of both variables should produce a linear pattern. Then, we can fit a least-squares regression line to the transformed data and use a linear model to make predictions.

Page 5: Section 12.2 Notes - Mrs. Alexandra Voinea

Example 4: Go Fish! Continued (pg. 772 - 774) Let’s return to the fishing tournament from the previous examples. Our goal remains the same: to find a model for predicting the weight of an Atlantic Ocean rockfish from its length. We still expect a power model of the form:

𝑤𝑒𝑖𝑔ℎ𝑡 = 𝑎 ∙ 𝑙𝑒𝑛𝑔𝑡ℎ! Here once again is a scatterplot of the data from the local marine research lab. Earlier, we transformed the data in two ways to achieve linearity: (1) cubing the length values and (2) taking the cube root of the weight values. This time we’ll use logarithms. We took the logarithm (base 10) of the values for both variables. Some computer from a linear regression analysis on the transformed data is shown below. (a) Based on the output, explain why it would be reasonable to use a power model to describe the relationship

between weight and length for Atlantic Ocean rockfish. (b) Give the equation of the least squares regression line. Be sure to define any variables you use. (c) Suppose a contestant in the fishing tournament catches an Atlantic Ocean rockfish that’s 36 centimeters long.

Use the model part (b) of the previous example to predict the fish’s weight. Show your work.

Page 6: Section 12.2 Notes - Mrs. Alexandra Voinea

APStatistics Name:____________________________Chapter12–InferenceforRegression Date:____________________Per:_____Section12.2Part2:ChoosingtheBestRegressionModel In Part 1 of Section 12.2, we learned how to use powers, roots, and logarithms to linearize curved data. Recall that a power model is in the form 𝑦 = 𝑎𝑥! where a and p are constants. To linearize a power model, you take the logarithm of both sides:

𝑦 = 𝑎𝑥! Original Power Model log 𝑦 = log (𝑎𝑥!) Take the log of both sides log 𝑦 = log 𝑎 + log (𝑥!) Expand using the Product Property of Logarithms log 𝑦 = log 𝑎 + 𝑝log (𝑥) Simplify using the Power Property of Logarithms As a result, you will end up plotting the transformed data (log 𝑥 , log𝑦)

A linear model has the form y = a + bx, where the value of y increases or decreases at a constant rate as x increases. The slope b describes the constant rate of change of a linear model. You can think of a linear model as describing the repeated addition (or subtraction) of a constant amount. Sometimes, the relationship between y and x is based on repeated multiplication by a constant factor b. An exponential model of the form 𝑦 = 𝑎𝑏! describes such multiplicative growth (or decay). Populations of living things and money (when interest is being compounded) display exponential growth. If an exponential model of the form 𝑦 = 𝑎𝑏! describes the relationship between x and y, we can use logarithms to transform the data to produce a linear relationship. Start by taking the logarithm (we’ll use base 10, but the natural log ln using base e would work just as well). Then use algebraic properties of logarithms to simplify the resulting expressions:

𝑦 = 𝑎𝑏! Original Exponential Model log 𝑦 = log 𝑎𝑏! Take the log (or ln) of both sides log 𝑦 = log 𝑎 + log (𝑏) Expand using the product property of logarithms log 𝑦 = log 𝑎 + 𝑥 log 𝑏 Simplify using the power property of logarithms log 𝑦 = log 𝑎 + log (𝑏) ∙ 𝑥 Rearrange the final equation where log(a) and log(b) are just constants. As a result, you will end up plotting the transformed data (𝑥, log𝑦)

Activity/Groupwork: Complete the handout activity on exploring the linearization of curved data with logarithms in your group, then answer the following question. Putting It All Together: Which transformation should we use? Suppose that a scatterplot shows a curved relationship between the two quantitative variables x and y. How can we decide whether a power model or an exponential model better describes the relationship? Explain. Big Idea #1: If a power model truly describes the relationship between two variables x and y, then we can expect the scatterplot of (logx, logy ) to be roughly linear. Big Idea #2: If an exponential model truly describes the relationship between two variables x and y, then we can expect the scatterplot of (x, logy ) to be roughly linear.

Page 7: Section 12.2 Notes - Mrs. Alexandra Voinea

Example 1: Braking Distance How is the braking distance for a motorcycle related to the speed the motorcycle was going when the brake was applied? Statistics teacher Aaron Waggoner gathered data to answer this question. The table and scatterplot below shows the speed (in miles per hour, or mph) and the distance (in feet) needed to come to a complete stop when the brake was applied. The graphs below show the results of two different transformations of the data. The first graph plots the logarithm of the distance against speed and the second graph plots the logarithm of distance against the logarithm of speed. (a) Explain why a power model would provide a more appropriate description of the relationship between braking distance and speed than an exponential model. (b) Minitab output from a linear regression analysis on the second set of transformed data is shown below. Give the equation of the least-squares regression line. Be sure to define any variables you use. Predictor Coef SE Coef T P Constant -1.35063 0.09697 -13.93 0.000 logSpeed 2.03610 0.07118 28.61 0.000 S = 0.0594632 R-Sq = 99.4% R-Sq(adj) = 99.3%

(c) Use your model from part (b) to predict the braking distance if the motorcycle was going 55 miles per hour. Show your work. (d) A residual plot for the linear regression in part (b) is shown. Do you expect your prediction in part (c) to be too high, too low, or just right? Justify your answer.

Speed (mph) Distance (feet) 6 1.42 9 4.92

19 18.00 30 44.75 32 52.08 40 84.00 48 110.33

Page 8: Section 12.2 Notes - Mrs. Alexandra Voinea

Check Your Understanding One sad fact about life is that we’ll all die someday. Many adults plan ahead for their eventual passing by purchasing life insurance. Many different types of life insurance policies are available. Some provide coverage throughout an individual’s life (whole life), while others last only for a specified number of years (term life). The policyholder makes regular payments (premiums) to the insurance company in return for the coverage. When the insured person dies, a payment is made to designated family members or other beneficiaries. How do insurance companies decide how much to charge for life insurance? They rely on a staff of highly trained actuaries—people with expertise in probability, statistics, and advanced mathematics—to establish premiums. For an individual who wants to buy life insurance, the premium will depend on the type and amount of the policy as well as on personal characteristics like age, sex, and health status. The table shows monthly premiums for a 10-year term-life insurance policy worth $1,000,000. The Fathom screen shots below show three possible models for predicting monthly premium from age. Option 1 is based on the original data, while Options 2 and 3 involve transformations of the original data. Each screen shot includes a scatterplot with a least-squares regression line added and a residual plot.

Page 9: Section 12.2 Notes - Mrs. Alexandra Voinea

1. Use each model to predict how much a 58-year-old would pay for such a policy. Show your work. 2. Which type of function—linear, power, or exponential—does the best job of describing the relationship between age and monthly premium? Explain why it does the best job, and explain why the other two functions are not as good.

Page 10: Section 12.2 Notes - Mrs. Alexandra Voinea

IN SUMMARY: Hang in there…

Refer to online notes POWER EXPONENTIAL

Page 11: Section 12.2 Notes - Mrs. Alexandra Voinea

x

log(y)

log(x)

log(y)

AP Statistics Chapter 12 . Section 12.2 Handout: Exploring Transformations to Achieve Linearity of Curved Data (Using Logarithms) Directions: For each example below, enter the data into L1 and L2. Then, READ PG. 781 TECHNOLOGY CORNER! Example 1:

(a) Transform the data using an exponential model. Hint: plot (𝑥, log𝑦) with L1: X and L4: log(L2). Draw a rough

sketch of what you see on the scatterplot. Describe what you notice – did you achieve linearity? (b) Next, transform the data using a power model. Hint: plot (log 𝑥 , log𝑦) with L3: log(L1) and L4: log(L2). Draw a

rough sketch of what you see on the scatterplot. Describe what you notice – did you achieve linearity? (c) What type of model – linear, power, or exponential – describes the relationship between x and y in the following

table and scatterplot? Explain.

Page 12: Section 12.2 Notes - Mrs. Alexandra Voinea

x

log(y)

log(x)

log(y)

Example 2:

(a) Transform the data using an exponential model. Hint: plot (𝑥, log𝑦) with L1: X and L4: log(L2). Draw a rough sketch of what you see on the scatterplot. Describe what you notice – did you achieve linearity?

(b) Next, transform the data using a power model. Hint: plot (log 𝑥 , log𝑦) with L3: log(L1) and L4: log(L2). Draw a

rough sketch of what you see on the scatterplot. Describe what you notice – did you achieve linearity? (c) What type of model – linear, power, or exponential – describes the relationship between x and y in the following

table and scatterplot? Explain.