Hospital records were examined to assess the link between smoking and duration of illness. The data reported in the table are the number of hospital days (per 1,000 person-years) for several classes of individuals, the average number of cigarettes smoked per day, and the number of hospital days for control groups of nonsmokers for each class. (The control groups consist of individuals matched as nearly as possible to the smokers for several primary health factors other than smoking.)
(a) Plot the logarithm of number of hospital days (for the smokers) against number of cigarettes. Do you think a linear regression will adequately represent the relationship?
(b) Plot the logarithm of number of hospital days for smokers minus the logarithm of number of hospital days for the control group against number of cigarettes. Do you think a linear regression will adequately represent the relationship? Has subtraction of the control group means reduced the dispersion?
(c) Define Y = ln(# days for smokers) − ln(# days for nonsmokers) and X = (# cigarettes)^2. Fit the linear regression of Y on X. Make a test of significance to determine if the intercept can be set to zero. Depending on your results, give the regression equation, the standard errors of the estimates, and the summary analysis of variance.
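The computation in part (c) can be sketched as follows. This is a minimal illustration, not the textbook's own solution: the function names `fit_log_ratio_regression` and `fit_through_origin` and the dictionary keys are hypothetical, and the data must be supplied from the table, which is not reproduced here. The code uses ordinary least squares and reports the t statistic for the intercept; compare it with a t table on the residual degrees of freedom to decide whether the intercept can be set to zero.

```python
import math


def fit_log_ratio_regression(days_smokers, days_controls, cigarettes):
    """Regress Y = ln(smoker days) - ln(control days) on X = cigarettes**2.

    All three arguments are equal-length sequences, one entry per class.
    """
    y = [math.log(s) - math.log(c) for s, c in zip(days_smokers, days_controls)]
    x = [c ** 2 for c in cigarettes]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Corrected sums of squares and cross products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Analysis of variance: total (corrected) = regression + residual.
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_residual = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_regression = ss_total - ss_residual
    df_residual = n - 2
    ms_residual = ss_residual / df_residual

    se_slope = math.sqrt(ms_residual / sxx)
    se_intercept = math.sqrt(ms_residual * (1 / n + x_bar ** 2 / sxx))

    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "se_intercept": se_intercept,
        # t statistic for H0: intercept = 0, on df_residual degrees of freedom.
        "t_intercept": intercept / se_intercept,
        "ss_total": ss_total,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "df_residual": df_residual,
        "ms_residual": ms_residual,
    }


def fit_through_origin(x, y):
    """Fit Y = slope * X with no intercept (use if H0: intercept = 0 is retained)."""
    n = len(x)
    sum_xx = sum(xi ** 2 for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum_xx
    ss_residual = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    df_residual = n - 1  # only one parameter is estimated
    ms_residual = ss_residual / df_residual
    return {
        "slope": slope,
        "se_slope": math.sqrt(ms_residual / sum_xx),
        "ss_residual": ss_residual,
        "df_residual": df_residual,
    }
```

If the intercept test is not significant, the no-intercept fit from `fit_through_origin` gives the regression equation to report. Note that its analysis of variance is then based on uncorrected sums of squares.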