The following data relate the number of motor vehicle deaths occurring in 12 counties in the northwestern United States in the years 1988 and 1989.
| County | Deaths in 1988 | Deaths in 1989 |
|--------|----------------|----------------|
| 1 | 121 | 104 |
| 2 | 96 | 91 |
| 3 | 85 | 101 |
| 4 | 113 | 110 |
| 5 | 102 | 117 |
| 6 | 118 | 108 |
| 7 | 90 | 96 |
| 8 | 84 | 102 |
| 9 | 107 | 114 |
| 10 | 112 | 96 |
| 11 | 95 | 88 |
| 12 | 101 | 106 |
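For readers without access to the text's Program 12-1, the following is a minimal sketch of how a least-squares line could be fitted to the tabulated data by hand. The names `deaths_1988`, `deaths_1989`, and `least_squares_line` are illustrative choices, not part of the original program, and the formulas used are the standard ones for simple linear regression (slope equal to S_xy / S_xx, intercept equal to the mean of y minus the slope times the mean of x).

```python
# Motor vehicle deaths by county, copied from the table above
# (counties 1 through 12, in order).
deaths_1988 = [121, 96, 85, 113, 102, 118, 90, 84, 107, 112, 95, 101]
deaths_1989 = [104, 91, 101, 110, 117, 108, 96, 102, 114, 96, 88, 106]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross products and of squared deviations about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares_line(deaths_1988, deaths_1989)
print(f"y = {intercept:.3f} + {slope:.3f}x")  # prints: y = 74.589 + 0.276x
```

Here x is the 1988 count and y the 1989 count for each county, so a slope below 1 means that an extra death in 1988 is associated with less than one extra death in 1989.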
The scatter diagram for this data set appears in Fig. 12.6.
FIGURE 12.6 Scatter diagram of 1989 deaths versus 1988 deaths.
A glance at Fig. 12.6 indicates that in 1989 there was, for the most part, a reduction in the number of deaths in those counties that had a large number of motor vehicle deaths in 1988. Similarly, there appears to have been an increase in those counties that had a low value in 1988. Thus, we would expect that a regression to the mean is in effect. In fact, running Program 12-1 yields the estimated regression equation
y = 74.589 + 0.276x
which shows that the estimated value of β indeed appears to be less than 1.
One must be careful when considering the reason behind the phenomenon of regression to the mean in the preceding data. For instance, it might be natural to suppose that those counties that had a large number of deaths caused by motor vehicles in 1988 would have made a large effort to reduce this number, perhaps by improving the safety of their roads or by making people more aware of the potential dangers of unsafe driving. In addition, we might suppose that those counties that had the fewest deaths in 1988 might have “rested on their laurels” and not made much of an effort to further improve their numbers, and as a result had an increase in the number of casualties the following year.
While the foregoing suppositions might be correct, it is important to realize that a regression to the mean would probably have occurred even if none of the counties had done anything out of the ordinary. Indeed, it could very well be the case that those counties having large numbers of casualties in 1988 were just very unlucky in that year, and thus a decrease in the next year was just a return to a more normal result for them. (For an analogy, if 9 heads result when 10 fair coins are flipped, then it is quite likely that another flip of these 10 coins will result in fewer than 9 heads.) Similarly, those counties having few deaths in 1988 might have been “lucky” that year, and a more normal result in 1989 would thus lead to an increase.
The mistaken belief that regression to the mean is due to some outside influence, when it is in reality just due to “chance,” is heard frequently enough that it is often referred to as the regression fallacy.
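The role of chance can be illustrated with a short simulation. The sketch below first evaluates the coin analogy exactly (the probability of fewer than 9 heads in 10 fair flips is 1013/1024, about 0.989) and then generates two years of data under a purely hypothetical model in which no county changes its behavior. The model, its parameters (2,000 counties, underlying levels with mean 100 and standard deviation 8, yearly chance variation with standard deviation 10), and all names are assumptions made for illustration; none of them come from the data above.

```python
import random
from math import comb

# Coin analogy from the text: chance that 10 fair coins show fewer than 9 heads.
p_fewer_than_9 = 1 - (comb(10, 9) + comb(10, 10)) / 2**10
print(f"P(fewer than 9 heads) = {p_fewer_than_9:.3f}")

# Hypothetical model (an assumption, not taken from the source): each county
# has a fixed underlying level, and each year's value is that level plus
# independent chance variation. Nothing changes between the two years.
rng = random.Random(0)  # fixed seed so the run is repeatable
N_COUNTIES = 2000
levels = [rng.gauss(100, 8) for _ in range(N_COUNTIES)]
year1 = [level + rng.gauss(0, 10) for level in levels]
year2 = [level + rng.gauss(0, 10) for level in levels]


def mean(values):
    return sum(values) / len(values)


def slope_of_fit(x, y):
    """Slope of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


b = slope_of_fit(year1, year2)

# Average year-to-year change for the counties with the lowest and the
# highest first-year values (bottom and top quarter).
order = sorted(range(N_COUNTIES), key=lambda i: year1[i])
quarter = N_COUNTIES // 4
change_low = mean([year2[i] - year1[i] for i in order[:quarter]])
change_high = mean([year2[i] - year1[i] for i in order[-quarter:]])

print(f"estimated slope: {b:.3f}")
print(f"mean change, lowest quarter in year 1:  {change_low:+.1f}")
print(f"mean change, highest quarter in year 1: {change_high:+.1f}")
```

Under these assumptions the fitted slope should fall well below 1, with the highest first-year counties decreasing on average and the lowest increasing, even though the simulated counties did nothing differently from one year to the next.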