To test the probability of irrelevant hits from a BLAST search,
download the first two paragraphs of Jane Austen’s Emma from a Web page
located with a search engine of your choice. Remove all spaces and punctuation,
and then replace the letters “o” and “u” with “a” (alanine) and the letters “b”,
“x”, “j”, and “z” with “g” (glycine), since these six letters are not among the
20 standard one-letter amino acid codes. This should yield a string of about
500 characters. Add >emma as the first line to convert the string to FASTA format, and then
run WU-BLAST2 for proteins on the server at the European Bioinformatics
Institute (see Appendix B). What are the E values and percentage identities
for the top three sequences? How do these compare with real biological
sequences?
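
The text-to-sequence conversion described above can be sketched in Python. This is only one possible reading of the instructions: the function name `text_to_fasta`, the `width` parameter, the conversion to uppercase, and the 60-character line wrapping are assumptions made for illustration and are not specified by the exercise. Digits and accented characters are discarded along with spaces and punctuation.

```python
import re

# The six letters that are not standard amino acid codes, mapped as the
# exercise specifies: o, u -> a (alanine); b, x, j, z -> g (glycine).
SUBSTITUTIONS = str.maketrans(
    {"o": "a", "u": "a", "b": "g", "x": "g", "j": "g", "z": "g"}
)


def text_to_fasta(text, name="emma", width=60):
    """Turn ordinary English text into a pseudo-protein FASTA record.

    `name` and `width` are illustrative defaults, not part of the exercise.
    """
    # Keep only the letters a-z; this drops spaces, punctuation, and digits.
    letters = re.sub(r"[^a-z]", "", text.lower())
    # Apply the substitutions, then use uppercase (an assumed convention).
    sequence = letters.translate(SUBSTITUTIONS).upper()
    # Wrap the sequence into fixed-width lines below the header line.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "Emma Woodhouse, handsome, clever, and rich"
    print(text_to_fasta(sample), end="")
```

Pasting the first two paragraphs of the novel into `sample` should give a record that can be submitted to the search form directly. Checking the sequence length against the expected figure of about 500 characters is a quick way to confirm that the right amount of text was copied.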