Cars These data include the engine size or displacement (in liters) and horsepower (HP) of 318 vehicles sold in the United States in 2011. Fit a multiple regression with the log10
of the combined mileage rating as the response and the log10
of the horsepower of the engine (HP), the log10
of the weight the car, and the log10
of the engine displacement as explanatory variables.
(a) Does it seem natural to find correlation among these explanatory variables, either on a log scale or in the original units?
(b) How will collinearity on the log scale affect the standard error of the slope of the log of displacement in the multiple regression?
(c) Describe the effects of collinearity on the three estimated coefficients. Which coefficients are most/least influenced by collinearity?
(d) We can see the effects of collinearity by constructing a plot that shows the slope of the multiple regression. To do this, we have to remove the effect of two of the explanatory variables from the other variables. Here’s how to make a so-called partial regression leverage plot for these data. First, regress Log10
MPG on Log10
HP and Log10
Weight and save the residuals. Second, regress Log10
Displacement on Log10
HP and Log10
Weight and save these residuals. Now, make a scatterplot of the residuals from the regression of Log10
MPG on the two explanatory variables on the residuals from the regression of Log10
Displacement on the other two explanatory variables. Fit the simple regression for this scatterplot, and compare the slope in this fit to the partial slope for Log10
Displacement in the multiple regression. Are they different?
(e) Compare the scatterplot of Log10
MPG on Log10
Displacement to the partial regression plot constructed in part (d). What has changed?