
In this project, you will create a Python program that counts word frequencies in a list of New York Times (NYT) articles, both per category and overall.

First, make sure to download the following files to your working folder (where you save your program):

Stopwords:
https://drive.google.com/file/d/1V9rAioz980HuIigNV5tZlOmAC9qeB1BK/view?usp=sharing
NYT articles (csv):
https://drive.google.com/file/d/1s-c75Uzzme8irdYuZW9z9kH-j9gp_m0X/view?usp=sharing
NYT article (text file, UTF-8):
https://drive.google.com/file/d/1rwFzwcSP3L2B8VSTkOEXFUjuQU-aR7Vq/view?usp=sharing
NYT article (text file, ANSI):
https://drive.google.com/file/d/1ry598L-YdtXV8DgntLd8DE8UVJ7lvLlP/view?usp=sharing
Part I - word count

In the first part you will read the external files and count the word frequencies.

The program should:
Display a message stating its goal

Read the StopWords.txt file (make sure to follow the right encoding)

Output how many stop words are in the file

Read ONE of the NYT article files. You can use EITHER the text files or the csv file, whichever is more convenient. They contain the same data. Each file has the field names in the first line and the articles in the following lines. All values are separated by " | ".
For each article, read through ArticleTitle, ArticleSubtitle and ArticleKeywords and extract all the unique words and their frequencies (ignore case, so "Word" and "word" count as the same word).

For each unique word, count its total frequency in each ArticleCategory, as well as overall total frequency (a sum of all the category counts).

Hint: Use dictionaries + nested dictionaries!
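The reading and counting steps above can be sketched with plain dictionaries as follows. This is only a sketch under assumptions: the function names are hypothetical, the encoding must be chosen to match the file you downloaded, punctuation is trimmed in a simple way the assignment does not specify, and the field names are taken from the text above.

```python
# Sketch only: function names, the default encoding, and the punctuation
# handling are assumptions, not part of the assignment.

def read_stop_words(path, encoding="utf-8"):
    """Return the stop words in the file as a list of lowercase strings."""
    with open(path, "r", encoding=encoding) as f:
        return [line.strip().lower() for line in f if line.strip()]


def split_words(text):
    """Lowercase the text and split it into words, trimming punctuation."""
    words = []
    for token in text.lower().split():
        token = token.strip(".,;:!?\"'()[]")
        if token:
            words.append(token)
    return words


def count_words(lines):
    """Build the nested dictionary {word: {category: count}}.

    lines[0] holds the field names; every other line is one article.
    Values are assumed to be separated by '|' with optional spaces.
    """
    header = [name.strip() for name in lines[0].split("|")]
    cat_col = header.index("ArticleCategory")
    text_cols = [header.index(name) for name in
                 ("ArticleTitle", "ArticleSubtitle", "ArticleKeywords")]
    counts = {}
    categories = []
    n_articles = 0
    for line in lines[1:]:
        if not line.strip():
            continue
        fields = [value.strip() for value in line.split("|")]
        if len(fields) < len(header):
            continue  # assumption: skip rows with missing fields
        n_articles += 1
        category = fields[cat_col]
        if category not in categories:
            categories.append(category)
        for col in text_cols:
            for word in split_words(fields[col]):
                if word not in counts:
                    counts[word] = {}
                counts[word][category] = counts[word].get(category, 0) + 1
    return counts, categories, n_articles
```

To use it, read the article file into a list of lines (for example with `f.read().splitlines()`) and pass that list to `count_words`. The number of stop words is then `len(read_stop_words(...))`.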

Output the following:

For the whole list:

How many articles are in the files?

How many different categories are in the file?

How many unique words are in the file? (Remember to only count the words from the ArticleTitle, ArticleSubtitle and ArticleKeywords fields.)
What is the total number of words (the sum of all unique word frequencies)?
The top ten most frequent words that ARE NOT stop words + their frequency

For each category:

Total number of unique words in the specific category

Total number of words in the category (sum of frequencies)
The top ten most frequent words in the category that ARE NOT stopwords + their frequency

Again, don't forget: for each article only count the words from the ArticleTitle, ArticleSubtitle and ArticleKeywords fields
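The summary figures listed above can be derived from a nested dictionary of the form {word: {category: count}}. The helpers below are a minimal sketch with hypothetical names. Ties in the top ten are broken alphabetically, which is an assumption since the assignment does not say how to order them. The sort uses a `lambda` key, so replace it with a manual loop if that has not been covered in your module.

```python
# Sketch only: helper names and the tie-breaking rule are assumptions.

def total_counts(counts):
    """Collapse {word: {category: count}} into {word: total count}."""
    totals = {}
    for word in counts:
        totals[word] = sum(counts[word].values())
    return totals


def category_counts(counts, category):
    """Return {word: count} for the words that occur in one category."""
    result = {}
    for word in counts:
        if category in counts[word]:
            result[word] = counts[word][category]
    return result


def top_words(freq, stop_words, n=10):
    """Return the n most frequent (word, count) pairs, skipping stop words."""
    pairs = [(word, freq[word]) for word in freq if word not in stop_words]
    # Highest count first; equal counts are ordered alphabetically.
    pairs.sort(key=lambda pair: (-pair[1], pair[0]))
    return pairs[:n]
```

For any of these frequency dictionaries, the number of unique words is `len(freq)` and the total number of words is `sum(freq.values())`. Stop words are excluded only from the top ten list, not from those two totals.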
Part II - save list to file
In this part you will save the word frequency list to a new csv file.

The program should:
Display a message stating its goal

Create a new csv file with the student name

In the file, create fields for the word list (where you'll store the unique words), for each of the different categories (where you'll store the word count for each specific ArticleCategory), and for the total count (the overall frequency sum for the unique word).

Based on the lists / dictionaries created in the previous part, fill your csv file with values: all the unique words + their frequency count in each category + their total frequency count (sum of frequencies in all the categories)

Save and close the file.
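One way to write the file is sketched below, using Python's standard `csv` module. The function name and the column titles "Word" and "Total" are assumptions, and the output path should be replaced with your own name. If the `csv` module has not been covered in your module, the same rows can be written with plain `f.write()` calls instead.

```python
import csv

# Sketch only: the function name and the "Word" / "Total" column titles
# are assumptions; the assignment only says which fields must exist.

def write_frequency_csv(path, counts, categories):
    """Write one row per word: word, count per category, total count."""
    # The with-statement closes the file automatically when the block ends.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Word"] + categories + ["Total"])
        for word in sorted(counts):
            row = [counts[word].get(category, 0) for category in categories]
            writer.writerow([word] + row + [sum(row)])
```

A word that never appears in a category gets 0 in that column, so every row has the same number of fields.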

Part III - word search
In the last part, your program should ask the user for words and output the frequency counts of each one.

The program should:

Display a message stating its goal

Ask the user to input a word
Check if the word is in the database (ignore case)
Notify the user if the word cannot be found

If the word is in the database, output its frequency in each category, as well as total frequency

Ask the user to input “1” to try another word or “0” to exit the program
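The search steps above can be sketched as a lookup function plus a loop around `input()`. The names `lookup` and `search_loop` are hypothetical, and the sketch assumes the word counts are stored as {word: {category: count}} with lowercase keys. Treating any answer other than "1" as a request to exit is also an assumption.

```python
# Sketch only: function names and message wording are assumptions.

def lookup(word, counts, categories):
    """Return (per-category counts, total) for word, or None if absent."""
    key = word.strip().lower()  # ignore case and surrounding spaces
    if key not in counts:
        return None
    per_category = {}
    for category in categories:
        per_category[category] = counts[key].get(category, 0)
    return per_category, sum(per_category.values())


def search_loop(counts, categories):
    """Keep asking for words until the user enters anything other than 1."""
    print("This part looks up the frequency of a word you enter.")
    choice = "1"
    while choice == "1":
        word = input("Enter a word: ")
        result = lookup(word, counts, categories)
        if result is None:
            print("'" + word + "' was not found.")
        else:
            per_category, total = result
            for category in categories:
                print(category + ": " + str(per_category[category]))
            print("Total: " + str(total))
        choice = input('Enter "1" to try another word or "0" to exit: ').strip()
```

Keeping the lookup separate from the loop means the case handling can be checked without typing anything, for example by calling `lookup("STORM", counts, categories)` directly.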

Remember:

Only use the material covered in this module -- do not use more advanced functions not covered yet in the course

Make sure to include comments that explain all your steps (starts with #). Also use a comment to sign your name at the beginning of the program!
Work individually and only submit original work
Run the program a few times to make sure it executes and meets all the requirements

Submit one .py file!

Answered Same Day, Oct 28, 2021


Sudipta answered on Oct 31 2021