pdf explaining project attached
CSE 231 Fall 2021
Computer Project #8

This assignment focuses on the design, implementation, and testing of a Python program to process data files and extract meaningful information from them.

It is worth 50 points (5% of course grade) and must be completed no later than 11:59 PM on Monday, November 15, 2021.

Assignment Background

In this project, we will explore economic statistics from countries around the world. From the World Bank website, we have collected the following statistics from the year 2018: electricity access (% of population), fertility rate (births per woman), GDP per capita, and life expectancy. The countries are divided into seven regions. The list of the seven regions is:

REGION_LIST = ['East Asia & Pacific', 'Europe & Central Asia', 'Latin America & Caribbean', 'Middle East & North Africa', 'North America', 'South Asia', 'Sub-Saharan Africa']

In this assignment, you will practice with dictionaries and lists to store and manipulate the data.

Assignment Specifications

The overall algorithm is:
1. Read the data.csv file to create a dictionary whose key is a region and whose value is a list of lists, with each inner list containing the name of the country and its four values
2. Loop prompting for a valid region
3. Loop prompting for a specific option to display the data

Provide the following functions.

def open_file() -> file_pointer
a. Open a file for reading. Repeatedly prompt for a file name until a file is successfully opened. Use the try-except command: except FileNotFoundError
The prompt: 'Input a file: '
The error message: 'Invalid filename, please try again.'
b. Parameters: None
c. Returns: file_pointer
d. Display: prompt and error message as appropriate

def read_file(fp) -> dictionary
a. The parameter is a file pointer. Read the whole file to extract data from columns 1 to 7. The file is a comma-separated-value (csv) file; use the csv module. There is one header line that needs to be skipped.
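
The file-opening loop and the header skip described so far can be sketched roughly as follows. This is only an illustration under stated assumptions, not the required solution: skip_header is a hypothetical helper name that is not part of the assignment, and the in-memory sample stands in for the real data.csv.

```python
import csv
import io


def open_file():
    """Prompt until a file opens successfully, then return the file object."""
    while True:
        filename = input("Input a file: ")
        try:
            return open(filename, "r")
        except FileNotFoundError:
            print("Invalid filename, please try again.")


def skip_header(fp):
    """Hypothetical helper: wrap fp in a csv reader and discard the header row."""
    reader = csv.reader(fp)
    # The None default keeps next() from raising StopIteration on an empty file.
    next(reader, None)
    return reader


# Illustrative in-memory data (an assumption, not the real data.csv).
sample = io.StringIO("Country Name,Country Code\nIndia,IND\nCanada,CAN\n")
rows = list(skip_header(sample))
print(rows)
```

In a real run, the file object returned by open_file() would be passed in place of the in-memory sample.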

For this project, we are interested in the following columns:

country name                           Column 1
region of the country                  Column 7
electricity access (% of population)   Column 3
fertility rate (births per woman)      Column 4
GDP per capita                         Column 5
life expectancy                        Column 6

Convert each numeric value to a float. As you read each line, use the region from column 7 as a dictionary key; the value is a list of lists, with each inner list containing the country name (Column 1), electricity access (Column 3), fertility rate (Column 4), GDP per capita (Column 5), and life expectancy (Column 6). Use the try-except command to make sure every value in the inner list exists before appending to the outer list. If any values (string or float) are missing, skip that line of the file and do not add the information to the dictionary.

Each country list will have the following data in this order:

L = [country_name,electricity,fertility,GDP,life]

The region dictionary will have the region as the key and a list of country lists as the value. For example, if the file has a header followed by these 4 lines:

Country Name | Country Code | Access to electricity (% of population) | Fertility rate, total (births per woman) | GDP per capita (current US$) | Life expectancy at birth, total (years) | Region
India | IND | 95.23586 | 2.222 | 2005.863 | 69.416 | South Asia
Canada | CAN | 100 | 1.4988 | 46313.17 | 81.94878 | North America
Maldives | MDV | 100 | 1.87 | 10276.93 | 78.627 | South Asia
Tunisia | TUN | 99.8 | 2.197 | 3438.789 | 76.505 |

The dictionary D will be:

D = {'South Asia': [['India', 95.2358551, 2.222, 2005.863005, 69.416],
                    ['Maldives', 100, 1.87, 10276.93282, 78.627]],
     'North America': [['Canada', 100, 1.4988, 46313.17, 81.94878]]}

Don’t add the region to the dictionary unless it is in the file. (Hint: Make sure the region value is not an empty string.)

Notes:
(1) The region field in the file may be empty (skip that line if it is).
(2) The country field always has a value (no need to check!)
(3) An easy way to skip the header is to use next(reader,None), whereas an easy way to skip a line when reading a file is to use continue when a value is missing (a column is empty)
b. Parameters: file_pointer
c. Returns: dictionary
d. Display: Nothing

def get_min_max(master_dict,region,option) -> min_tup, max_tup
a. The three parameters are master_dict (dictionary), region (string), and option (int). Extract the data for the specified region from the dictionary master_dict. We want the min and max of the region based on the option.
If option 1, sort the countries in the region based on electricity access (Index 1).
If option 2, sort the countries in the region based on fertility rate (Index 2).
If option 3, sort the countries in the region based on GDP per capita (Index 3).
If option 4, sort the countries in the region based on life expectancy (Index 4).
Return the countries with the lowest (min) and highest (max) values for the specified option as two tuples: (min_country,min_value), (max_country,max_value). Return None,None if the region parameter does not exist in the master_dict dictionary or the option is not 1, 2, 3, or 4.
Notes:
(1) First check the region and option parameters, and return None,None if either is invalid.
(2) Next create a list of countries sorted by the specified option value. The first and last items in the list will be your min and max values. Yank them out and put them in tuples.
(3) optional challenge: use list comprehension together with sorted and option and itemgetter to build your list in one line.
b. Parameters: dictionary, string, integer
c. Returns: min_tuple, max_tuple
d. Display: Nothing

def display_all_countries(master_dict,region) -> None
a. Displays all the countries and their values in the specified region.
Print the region on the first line with the text "\nDisplaying {} Region:\n".
Then skip a line and print the header using the string formatting:
"{:32s}{:>20s}{:>20s}{:>17s}{:>18s}"
The header is:
'Country', 'Electricity Access', 'Fertility rate', 'GDP per capita', 'Life expectancy'
Sort the countries in the region in alphabetical order and then print each country using the following string formatting:
"{:32s}{:>20.2f}{:>20.2f}{:>17.2f}{:>18.2f}"
Notes: Optional Challenge: try using * in the format for the list
b. Parameters: dictionary, string
c. Returns: Nothing
d. Display: All the countries in the region

def get_top10(master_dict) -> list of tuples
a. Sort all the countries in the world by GDP per capita. Return the top 10 countries with the highest GDP per capita. Each tuple in the list will be in the form: (country, GDP per capita).
Notes:
(1) first create one list of all the countries,
(2) use master_dict.items() to extract the lists from the dictionary, then sort using itemgetter
(3) Hint: you might want to start by printing out the keys (regions) and the values (countries in the region) so you can better understand what you are dealing with.
(4) optional challenge: build the return list of tuples using list comprehension on a slice of the sorted list
b. Parameters: dictionary
c. Returns: list of tuples
d. Display: None

def display_options() -> None
a. Provided. DON’T MODIFY. Displays the options
b. Parameters: None
c. Returns: None
d. Display: Displays the options

def main()
1. Call open_file to open an input file for reading
2. Call read_file to read the desired data into a dictionary
3. Loop prompting for a region to display the options; 'q' or 'Q' to quit looping (the prompt string is provided in the provided code as PROMPT. The prompt will reprompt until a valid region or 'q' is entered). Make sure to check if it is a valid region. If the region is incorrect, ask for input again without printing any additional text.
4. Print the region with the text "\nRegion: ".
5. Call display_options to display the menu of options
6.
Loop prompting for an option with the text '\n\nChoose an option, R to select a different region, q to quit: '; 'q' or 'Q' or 'r' or 'R' to quit looping.

IMPORTANT (but leave this for the last test): When 'q' or 'Q' is entered at the region prompt or the option prompt, the program will terminate. This means that when 'q' or 'Q' is entered at the option prompt you must exit both loops. You are not allowed to import sys and use sys.exit() to exit the program. If you use sys.exit(), your TA will deduct five points. Hint: what values of region and option will exit both loops?

i. If option is 1, call get_min_max to determine the countries in the specified region with the lowest and highest access to electricity. Use the following two string formats to display the information:
"\n{:s} has the highest access to electricity of {:.2f}%"
"{:s} has the lowest access to electricity of {:.2f}%"
ii. If option is 2, call get_min_max to determine the countries in the specified region with the lowest and