In this example, we use R to verify the distribution of the statistic
X^2 given in (2.22), as used. To do this, first choose a pair of
bases r_1 r_2, and calculate the appropriate value of c by following the recipe
after (2.22) for the given base frequencies p = (p_1, ..., p_4). Now use R to
repeatedly simulate strings of 1000 letters having distribution p, calculate O
(the number of times the pair of letters r_1 r_2 is observed) and E, and hence
X^2/c. Plot a histogram of these values, and compare it to the theoretical distribution
(which is the chi-square (χ^2) distribution with 1 degree of freedom). Remark:
This simulation approach provides a useful way to estimate percentage points
of the distribution of any test statistic.
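
One possible way to set up the simulation is sketched below. It is only a sketch under stated assumptions: equation (2.22) and the recipe for c are not reproduced in this exercise, so the code assumes X^2 = (O - E)^2 / E with E = (n - 1) p_r1 p_r2, and c = 1 + 2 p_r1 - 3 p_r1^2 when r_1 = r_2 or c = 1 - 3 p_r1 p_r2 when r_1 differs from r_2. Check both against the text before relying on them. The base frequencies, the chosen pair, the number of replicates, and all variable names are illustrative choices, not values from the text.

```r
# Sketch only: the forms of X^2, E and c below are assumptions; verify them
# against (2.22) and the recipe that follows it.
set.seed(1)                                   # for reproducibility

bases <- c("a", "c", "g", "t")
p     <- c(a = 0.3, c = 0.2, g = 0.2, t = 0.3)  # illustrative base frequencies
r1    <- "a"                                  # illustrative pair r1 r2
r2    <- "c"
n     <- 1000                                 # string length, as in the exercise
nsim  <- 5000                                 # number of simulated strings (arbitrary)

# Assumed scaling constant c (named scale_const to avoid masking R's c()).
scale_const <- if (r1 == r2) {
  1 + 2 * p[[r1]] - 3 * p[[r1]]^2
} else {
  1 - 3 * p[[r1]] * p[[r2]]
}

# Assumed expected count of the pair among the n - 1 adjacent positions.
E <- (n - 1) * p[[r1]] * p[[r2]]

# Simulate one string and return X^2 / c for it.
sim_stat <- function() {
  s <- sample(bases, n, replace = TRUE, prob = p)
  O <- sum(s[-n] == r1 & s[-1] == r2)         # observed count of r1 followed by r2
  (O - E)^2 / E / scale_const
}

x2_scaled <- replicate(nsim, sim_stat())

# Histogram on the density scale, with the chi-square(1) density overlaid.
hist(x2_scaled, breaks = 50, freq = FALSE,
     xlab = "X^2 / c", main = "Simulated X^2 / c")
curve(dchisq(x, df = 1), from = 0.01, to = max(x2_scaled), add = TRUE, lwd = 2)

# Simulated versus theoretical percentage points (see the Remark).
sim_pts    <- quantile(x2_scaled, c(0.90, 0.95, 0.99))
theory_pts <- qchisq(c(0.90, 0.95, 0.99), df = 1)
print(rbind(simulated = sim_pts, theoretical = theory_pts))
```

Because O takes only integer values, the simulated statistic is discrete, so the histogram will look somewhat lumpy next to the smooth density; agreement should be judged by the overall shape and the percentage points. The `quantile` call illustrates the Remark: the same approach gives approximate percentage points for a statistic whose theoretical distribution is not known.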