Random versus fixed resampling in regression:
(a) Recall (from Chapter 2) Davis’s data on measured and reported weight for 101 women engaged in regular exercise. Bootstrap the least-squares regression of reported weight on measured weight, drawing r = 1000 bootstrap samples using (1) random-X resampling and (2) fixed-X resampling. In each case, plot a histogram (and, if you wish, a density estimate) of the 1000 bootstrap slopes, and calculate the bootstrap estimate of standard error for the slope. How does the influential outlier in this regression affect random resampling? How does it affect fixed resampling?
(b) Randomly construct a data set of 100 observations according to the regression model Y_i = 5 + 2x_i + ε_i, where x_i = 1, 2, ..., 100, and the errors are independent (but seriously heteroscedastic), with ε_i ~ N(0, x_i^2). As in (a), bootstrap the least-squares regression of Y on x, using (1) random resampling and (2) fixed resampling. In each case, plot the bootstrap distribution of the slope coefficient, and calculate the bootstrap estimate of standard error for this coefficient. Compare the results for random and fixed resampling. For a few of the bootstrap samples, plot the least-squares residuals against the fitted values. How do these plots differ for fixed versus random resampling?
(c) Why might random resampling be preferred in these contexts, even if (as is not the case for Davis’s data) the X-values are best conceived as fixed?
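As a starting point for part (b), the two resampling schemes might be sketched in Python as follows. This is only an illustrative outline under stated assumptions: the seed, the helper name ols, and the variable names are hypothetical choices, not part of the exercise. Random-X resampling is taken to mean drawing (x, Y) pairs with replacement; fixed-X resampling is taken to mean holding the x-values fixed and attaching resampled residuals from the original fit to the fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed (assumption), for reproducibility


def ols(x, y):
    """Least-squares fit of y on x; returns array [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


n, r = 100, 1000                     # sample size and number of bootstrap samples
x = np.arange(1, n + 1, dtype=float)
# Heteroscedastic errors: standard deviation x_i, i.e. variance x_i^2
y = 5 + 2 * x + rng.normal(0.0, x)

b = ols(x, y)                        # fit to the original sample
fitted = b[0] + b[1] * x
resid = y - fitted

slopes_random = np.empty(r)
slopes_fixed = np.empty(r)
for k in range(r):
    # (1) Random-X: resample whole observations (x_i, Y_i) with replacement
    idx = rng.integers(0, n, size=n)
    slopes_random[k] = ols(x[idx], y[idx])[1]

    # (2) Fixed-X: keep x fixed, add resampled residuals to the fitted values
    e_star = rng.choice(resid, size=n, replace=True)
    slopes_fixed[k] = ols(x, fitted + e_star)[1]

# Bootstrap standard errors: standard deviation of the bootstrap slopes
se_random = slopes_random.std(ddof=1)
se_fixed = slopes_fixed.std(ddof=1)
print(f"random-X SE: {se_random:.4f}   fixed-X SE: {se_fixed:.4f}")
```

The arrays slopes_random and slopes_fixed can then be passed to any plotting routine to draw the histograms the exercise asks for. Note that the fixed-X scheme, as sketched here, scatters residuals across all x-values regardless of where they originated, which is worth keeping in mind when examining the residual plots.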