The genome composition π of E. coli can be computed from
Take the first 1000 bp of the E. coli sequence you used in the previous
exercise. We are going to use a variant of (2.22) to test whether this 1000 bp
sequence has an unusual base composition when compared with π. The statistic to use is

    X^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i},

where O_i denotes the number of times base i is observed in the sequence, and
E_i denotes the expected number (assuming that frequencies are given by π).
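As a rough illustration of the quantities just defined, the sketch below counts bases and forms the statistic, assuming it takes the usual Pearson form, the sum over the four bases of (O_i - E_i)^2 / E_i. The function names, the toy sequence, and the uniform composition are placeholders introduced here for illustration; they are not E. coli data and are not taken from the text.

```python
from collections import Counter

BASES = "ACGT"  # fixed base order used for every list below


def observed_counts(seq):
    """Return O_i: how many times each base in BASES occurs in seq."""
    counts = Counter(seq.upper())
    return [counts[b] for b in BASES]


def expected_counts(n, pi):
    """Return E_i = n * pi_i for a sequence of n counted bases."""
    return [n * p for p in pi]


def chi_square_statistic(observed, expected):
    """Pearson form (assumed here): sum of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Placeholder inputs for illustration only (NOT real E. coli values).
toy_seq = "A" * 300 + "C" * 200 + "G" * 250 + "T" * 250
toy_pi = [0.25, 0.25, 0.25, 0.25]

O = observed_counts(toy_seq)
# n is the number of A/C/G/T characters, so other symbols (e.g. N) are ignored.
E = expected_counts(sum(O), toy_pi)
X2 = chi_square_statistic(O, E)
print(O, E, X2)
```

For the exercise itself, `toy_seq` would be replaced by the first 1000 bp of the sequence and `toy_pi` by the genome composition π, listed in the same base order as `BASES`.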
(a) Calculate O_i and E_i, i = 1, . . . , 4, and then X^2.
(b) Values of X^2 that correspond to unusual base frequencies are determined
by the large values of the χ^2 distribution with 4 − 1 = 3 degrees of freedom.
Using a 5% level of significance, are the data consistent with π or not? [Hint:
percentage points of the χ^2 distribution can be found using R.]
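The decision in part (b) can be sketched as follows. In R, `qchisq(0.95, df = 3)` returns the upper 5% point, approximately 7.815. The Python sketch below avoids external libraries by using a closed-form upper-tail probability that holds only for 3 degrees of freedom. The function names and the example value of X^2 are hypothetical and chosen for illustration.

```python
import math

# Upper 5% point of the chi-square distribution with 3 df (rounded).
CRITICAL_5PCT_DF3 = 7.815


def chi2_sf_df3(x):
    """P(chi-square with 3 df > x); this closed form is valid for 3 df only."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


def consistent_with_pi(x2, alpha=0.05):
    """True when X^2 does not fall in the upper alpha tail (3 df)."""
    return chi2_sf_df3(x2) > alpha


x2 = 20.0  # hypothetical value of the statistic, not a result from real data
print(x2 > CRITICAL_5PCT_DF3, chi2_sf_df3(x2), consistent_with_pi(x2))
```

Comparing X^2 with the critical value and comparing the tail probability with 0.05 are two views of the same test; only large values of X^2 count as evidence against π.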