Answer To: stories/alices_adventures_in_wonderland-ch1.txt CHAPTER I. Down the Rabbit-Hole Alice was beginning...
Mohit answered on Aug 20 2021
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CSC3501-S2-2020: Assignment 1 (Part B)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Literary Scrabble [65 marks]\n",
"\n",
"In Part B of Assignment 1, we are programming (in Python 3) and playing a round of Literary Scrabble: a game of Scrabble where you can only play words that appear in selected literary classics. \n",
"\n",
"**Your Tasks**:\n",
"\n",
"Answer **Q1-Q7** by completing each of the **Word Analysis** and **Story Analysis** functions below:\n",
"- The **Word Analysis** functions will provide the primary text analysis to help you answer the questions defined in this notebook. \n",
"- The code you write for each **Story Analysis** function will need to call the appropriate **Word Analysis** function(s) and then complete any additional processing necessary to answer the specific question. \n",
"\n",
"You can test your code via the inputs and expected outputs in **\"Example\"** in each question.\n",
"\n",
"Text files for creating word lists are available in the `stories` folder."
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [],
"source": [
"from __future__ import print_function\n",
"\n",
"import string\n",
"from itertools import permutations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### I. Word Analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Q1. Create Story Word List [12 Marks]\n",
"\n",
"Complete the function below, which should read a story's text file and return a sorted list (descending order) of unique words (i.e., no duplicates) extracted from the story's `text_file` that also exist in the official Sowpods list of approved Scrabble words. \n",
"\n",
"To create your story's word list: \n",
"- convert all characters to uppercase; \n",
"- replace hyphens and underscores with a single space: `' '`; \n",
"- split hyphenated words into separate words; \n",
"- strip all contractions and possessives from words (`'s`, `'re`, etc.); \n",
"- remove all punctuation, whitespace characters and numbers;\n",
"- only keep words which also occur in the official Sowpods list (i.e., `sowpods.txt`).\n",
"\n",
"HINT: The Python Standard Library provides various string constants, such as `whitespace` and `punctuation`. You may want to review the Python Standard Library's sections on string methods and constants.\n",
"- [String constants](https://docs.python.org/3/library/string.html#string-constants)\n",
"- [String methods](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str)\n",
"\n",
"**NOTE: Creating a story's word list may take several seconds of processing time**. We recommend using the shorter story file `\"Baa_Baa_Black_Sheep.txt\"` while you are testing your code.\n",
"\n",
"**Example:**\n",
"`create_wordlist(\"stories/Baa_Baa_Black_Sheep.txt\")` returns:\n",
"\n",
"['YOU',\n",
" 'YES',\n",
" 'WOOL',\n",
" 'THREE',\n",
" 'THE',\n",
" 'THAT',\n",
" 'SIR',\n",
" 'SHEEP',\n",
" 'ONE',\n",
" 'MY',\n",
" 'MASTER',\n",
" 'LIVES',\n",
" 'LITTLE',\n",
" 'LANE',\n",
" 'HAVE',\n",
" 'FULL',\n",
" 'FOR',\n",
" 'DOWN',\n",
" 'DAME',\n",
" 'BOY',\n",
" 'BLACK',\n",
" 'BAGS',\n",
" 'BAA',\n",
" 'ANY',\n",
" 'AND']"
]
},
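{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cleaning steps listed for Q1 can be illustrated with a minimal sketch. It is not the assignment's reference solution: `clean_words` is a hypothetical helper that takes the story text as a string and a small set standing in for the Sowpods list (both are assumptions, so no files are read), and it assumes straight apostrophes only.\n",
"\n",
"```python\n",
"import string\n",
"\n",
"\n",
"def clean_words(text, approved):\n",
"    # Hypothetical helper: `approved` is a set of uppercase words that\n",
"    # stands in for the contents of sowpods.txt.\n",
"    # Uppercase, then treat hyphens and underscores as word separators.\n",
"    text = text.upper().replace('-', ' ').replace('_', ' ')\n",
"    words = set()\n",
"    for token in text.split():\n",
"        # Strip surrounding punctuation, then drop a contraction or\n",
"        # possessive suffix by keeping the part before the first apostrophe.\n",
"        token = token.strip(string.punctuation).split(\"'\")[0]\n",
"        # Remove any remaining punctuation and digits.\n",
"        token = ''.join(ch for ch in token if ch.isalpha())\n",
"        if token in approved:\n",
"            words.add(token)\n",
"    # The set removed duplicates; sort in descending order.\n",
"    return sorted(words, reverse=True)\n",
"\n",
"\n",
"approved = {'BAA', 'BLACK', 'SHEEP', 'HAVE', 'YOU', 'ANY', 'WOOL', 'DAME'}\n",
"sample = \"Baa, baa, black-sheep, have you any wool? The dame's 3 bags.\"\n",
"print(clean_words(sample, approved))\n",
"```\n",
"\n",
"With the sample sentence above, the call prints `['YOU', 'WOOL', 'SHEEP', 'HAVE', 'DAME', 'BLACK', 'BAA', 'ANY']`. A set handles deduplication, and `sorted(..., reverse=True)` gives the descending order shown in the Q1 example."
]
},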
{
"cell_type": "code",
"execution_count": 169,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['YOU', 'YES', 'WOOL', 'THREE', 'THE', 'THAT', 'SIR', 'SHEEP', 'ONE', 'MY', 'MASTER', 'LIVES', 'LITTLE', 'LANE', 'HAVE', 'FULL', 'FOR', 'DOWN', 'DAME', 'BOY', 'BLACK', 'BAGS', 'BAA', 'ANY', 'AND']\n"
]
}
],
"source": [
"def create_wordlist(text_file):\n",
" \"\"\" Return a descending, duplicate-free list of the story's words that are valid Sowpods words.\n",
" First create a list of words from the text file, reading it line by line.\n",
" Then loop through the words to remove numbers, punctuation and contractions.\n",
" Remove duplicates.\n",
" Keep only the words that appear in the Sowpods file.\n",
" \"\"\"\n",
" with open(text_file) as f:\n",
" word_list=[word.upper() for line in f for word in line.split()]\n",
" \n",
" for i in range(len(word_list)):\n",
" if word_list[i].isdigit():\n",
" word_list.pop(i)\n",
" for j in string.punctuation:\n",
" if j in word_list[i]:\n",
" if j == '_' or j == '-':\n",
" for word in word_list[i].split(j):\n",
" word_list.append(word)\n",
" ...