Hi, could you explain the code with # comments and make corrections if anything is wrong?



{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "#\n", "# Import packages needed for processing\n", "#\n", "import numpy as np # import the numpy package\n", "import csv # this package needed for processing csv file\n", "from collections import Counter # this is for dictionary construction with counting functionality\n", "import matplotlib.pyplot as plt # this is for plotting and other descriptive statistics\n", "import datetime # this package is for handling time\n", "#\n", "# If you need add any additional packages, then add them below this line\n", "#\n", "\n", "import pandas as pd\n", "import timeit\n", "import matplotlib as mpl\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: you must place your data file in the same folder with your python notebook." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Declare the path to the data file\n", "DATA_FILE = \"owid-covid-data.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**. The following segment of codes is to load the data from the .csv file using csv package. It has been provided for you to get started. Please **do not** change this piece of code as well the variable names as we will need these variables to complete subsequent tasks. 
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dict_keys(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases', 'new_cases_smoothed', 'total_deaths', 'new_deaths', 'new_deaths_smoothed', 'total_cases_per_million', 'new_cases_per_million', 'new_cases_smoothed_per_million', 'total_deaths_per_million', 'new_deaths_per_million', 'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients', 'icu_patients_per_million', 'hosp_patients', 'hosp_patients_per_million', 'weekly_icu_admissions', 'weekly_icu_admissions_per_million', 'weekly_hosp_admissions', 'weekly_hosp_admissions_per_million', 'new_tests', 'total_tests', 'total_tests_per_thousand', 'new_tests_per_thousand', 'new_tests_smoothed', 'new_tests_smoothed_per_thousand', 'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations', 'people_vaccinated', 'people_fully_vaccinated', 'new_vaccinations', 'new_vaccinations_smoothed', 'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred', 'people_fully_vaccinated_per_hundred', 'new_vaccinations_smoothed_per_million', 'stringency_index', 'population', 'population_density', 'median_age', 'aged_65_older', 'aged_70_older', 'gdp_per_capita', 'extreme_poverty', 'cardiovasc_death_rate', 'diabetes_prevalence', 'female_smokers', 'male_smokers', 'handwashing_facilities', 'hospital_beds_per_thousand', 'life_expectancy', 'human_development_index'])\n" ] } ], "source": [ "# Load CSV file using DictReader\n", "input_file = csv.DictReader(open(DATA_FILE))\n", "fieldnames = input_file.fieldnames\n", "data_dict = {fn: [] for fn in fieldnames}\n", "print(data_dict.keys())\n", "for line in input_file:\n", " for k, v in line.items():\n", " if (v == ''): #quick fix for missing values\n", " v=0\n", " try:\n", " data_dict[k].append(int(v))\n", " except ValueError:\n", " try:\n", " data_dict[k].append(float(v))\n", " except ValueError:\n", " 
data_dict[k].append(v)\n", " \n", "for k, v in data_dict.items():\n", " data_dict[k] = np.array(v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## **Instruction 1**\n", "We first examine the variables and the overall data size. The tasks:\n", "\n", "1. Write your code to print the type for the following variables (**4 marks**):\n", " * input_file\n", " * data_dict\n", " * data_dict['iso_code']\n", " * data_dict['reproduction_rate']\n", "\n", "\n", "2. Write your code to print out the number of data records (**1 mark**):\n", "\n", "[**Total mark: 5**]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "\n" ] } ], "source": [ "# INSERT YOUR CODE HERE\n", "#\n", "#1. Write you code to print the type for the following variables:\n", "# * input_file\n", "# * data_dict\n", "# * data_dict['iso_code']\n", "# * data_dict['reproduction_rate']\n", "#\n", "\n", "print(type(input_file))\n", "print(type(data_dict))\n", "print(type(data_dict['iso_code']))\n", "print(type(data_dict['reproduction_rate']))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "79668\n" ] } ], "source": [ "# INSERT YOUR CODE HERE\n", "#\n", "# 2. Write your code to print out the number of data records\n", "#\n", "\n", "file = open(\"owid-covid-data.csv\")\n", "numline = len(file.readlines())\n", "print (numline)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Instruction 2\n", "The following code segment is designed to calculate some basic statistics from the data for a recent date. The data is stored in **data_filtered_by_date** dictionary.\n", "\n", "**Your task:**\n", "1. Learn how the code, which extracts the data for all countries at a required date, works. Why **try-except** construction is required in this code? Provide the answer. 
(**2 marks**)\n", "1. Write your code to find and print the minimum and the maximum values of mortality per million in the **total_deaths_per_million** column. (**1 mark**)\n", "2. Write your code to find and print the minimum and the maximum values of cases per million in the **total_cases_per_million** column. (**1 mark**)\n", "3. Write your code to find and print the mean and median mortality per million, and the standard deviation from the **total_deaths_per_million** column. (**2 marks**)\n", "4. Write your code to construct a box plot for the **total_deaths_per_million**. (**2 marks**) \n", "4. You will find that mean and median values for mortality are quite different. Briefly describe the meaning of the difference between the mean and median mortality per million. Why are they different? (**2 marks**)\n", "\n", "**[Total mark: 10]**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# THIS PART EXTRACTS THE DATA FOR ALL COUNTRIES AT A REQUIRED DATE.\n", "# LEARN HOW IT WORKS.\n", "# data_filtered_by_date DICTIONARY CONTAINS ALL DATA FOR ALL COUNTRIES AT required_date\n", "\n", "required_date = '2021-03-09'\n", "\n", "index_for_date = (data_dict['date'] == required_date)\n", "data_filtered_by_date = {}\n", "for key in data_dict.keys():\n", " try:\n", " data_filtered_by_date[key] = np.float_(data_dict[key][index_for_date])\n", " except:\n", " data_filtered_by_date[key] = data_dict[key][index_for_date]" ] }, {
Answered 2 days after Apr 14, 2021 | SIT112 | Deakin University


Sudipta answered on Apr 16 2021
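One note before the notebook: Instruction 2 asks for statistics at `required_date`, and the assignment stores that slice in `data_filtered_by_date`. The following is a minimal sketch of computing the statistics on that slice with NumPy. It uses a small made-up dictionary in place of the real data, so the numbers are illustrative only and are not taken from the dataset. Dropping zeros is an assumption based on the loader, which replaces missing values with 0.

```python
import numpy as np

# Hypothetical stand-in for data_filtered_by_date; the values are invented for illustration.
data_filtered_by_date = {
    "iso_code": np.array(["AAA", "BBB", "CCC", "DDD", "EEE"]),
    "total_deaths_per_million": np.array([0.0, 10.0, 20.0, 30.0, 940.0]),
}

deaths = data_filtered_by_date["total_deaths_per_million"]

# The loader stores 0 for missing values, so a zero may mean "no data" rather than
# "no deaths". Here such entries are excluded (an assumption, adjust as needed).
reported = deaths[deaths > 0]

print("Minimum:", reported.min())
print("Maximum:", reported.max())
print("Mean:", reported.mean())
print("Median:", np.median(reported))
print("Standard deviation:", reported.std())
```

In this toy example a single large value pulls the mean far above the median, which is the same effect the last task of Instruction 2 asks about.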
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#\n",
"# Import packages needed for processing\n",
"#\n",
"import numpy as np # import the numpy package\n",
"import csv # this package needed for processing csv file\n",
"from collections import Counter # this is for dictionary construction with counting functionality\n",
"import matplotlib.pyplot as plt # this is for plotting and other descriptive statistics\n",
"import datetime # this package is for handling time\n",
"#\n",
"# If you need add any additional packages, then add them below this line\n",
"#\n",
"\n",
"import pandas as pd\n",
"import timeit\n",
"import matplotlib as mpl\n",
"import seaborn as sns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: you must place your data file in the same folder as your Python notebook."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Declare the path to the data file\n",
"DATA_FILE = \"owid-covid-data.csv\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**. The following code segment loads the data from the .csv file using the csv package. It has been provided for you to get started. Please **do not** change this piece of code or the variable names, as we will need these variables to complete subsequent tasks."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases', 'new_cases_smoothed', 'total_deaths', 'new_deaths', 'new_deaths_smoothed', 'total_cases_per_million', 'new_cases_per_million', 'new_cases_smoothed_per_million', 'total_deaths_per_million', 'new_deaths_per_million', 'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients', 'icu_patients_per_million', 'hosp_patients', 'hosp_patients_per_million', 'weekly_icu_admissions', 'weekly_icu_admissions_per_million', 'weekly_hosp_admissions', 'weekly_hosp_admissions_per_million', 'new_tests', 'total_tests', 'total_tests_per_thousand', 'new_tests_per_thousand', 'new_tests_smoothed', 'new_tests_smoothed_per_thousand', 'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations', 'people_vaccinated', 'people_fully_vaccinated', 'new_vaccinations', 'new_vaccinations_smoothed', 'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred', 'people_fully_vaccinated_per_hundred', 'new_vaccinations_smoothed_per_million', 'stringency_index', 'population', 'population_density', 'median_age', 'aged_65_older', 'aged_70_older', 'gdp_per_capita', 'extreme_poverty', 'cardiovasc_death_rate', 'diabetes_prevalence', 'female_smokers', 'male_smokers', 'handwashing_facilities', 'hospital_beds_per_thousand', 'life_expectancy', 'human_development_index'])\n"
]
}
],
"source": [
"# Load CSV file using DictReader\n",
"input_file = csv.DictReader(open(DATA_FILE))\n",
"fieldnames = input_file.fieldnames\n",
"data_dict = {fn: [] for fn in fieldnames}\n",
"print(data_dict.keys())\n",
"for line in input_file:\n",
"    for k, v in line.items():\n",
"        if (v == ''): #quick fix for missing values\n",
"            v=0\n",
"        try:\n",
"            data_dict[k].append(int(v))\n",
"        except ValueError:\n",
"            try:\n",
"                data_dict[k].append(float(v))\n",
"            except ValueError:\n",
"                data_dict[k].append(v)\n",
"    \n",
"for k, v in data_dict.items():\n",
"    data_dict[k] = np.array(v)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## **Instruction 1**\n",
"We first examine the variables and the overall data size. The tasks:\n",
"\n",
"1. Write your code to print the type for the following variables (**4 marks**):\n",
" * input_file\n",
" * data_dict\n",
" * data_dict['iso_code']\n",
" * data_dict['reproduction_rate']\n",
"\n",
"\n",
"2. Write your code to print out the number of data records (**1 mark**):\n",
"\n",
"[**Total mark: 5**]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'csv.DictReader'>\n",
"<class 'dict'>\n",
"<class 'numpy.ndarray'>\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"# INSERT YOUR CODE HERE\n",
"#\n",
"#1. Write your code to print the type for the following variables:\n",
"# * input_file\n",
"# * data_dict\n",
"# * data_dict['iso_code']\n",
"# * data_dict['reproduction_rate']\n",
"#\n",
"# type() returns the class of an object; print() displays it.\n",
"print(type(input_file))\n",
"print(type(data_dict))\n",
"print(type(data_dict['iso_code']))\n",
"print(type(data_dict['reproduction_rate']))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# INSERT YOUR CODE HERE\n",
"#\n",
"# 2. Write your code to print out the number of data records\n",
"#\n",
"# Every column in data_dict holds one entry per data record, so the length of any\n",
"# column is the number of records. Counting the lines of the file with readlines()\n",
"# would also count the header row and so overstate the result by one.\n",
"numline = len(data_dict['iso_code'])\n",
"print(numline)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## Instruction 2\n",
"The following code segment is designed to calculate some basic statistics from the data for a recent date. The data is stored in the **data_filtered_by_date** dictionary.\n",
"\n",
"**Your task:**\n",
"1. Learn how the code, which extracts the data for all countries at a required date, works. Why is the **try-except** construction required in this code? Provide the answer. (**2 marks**)\n",
"2. Write your code to find and print the minimum and the maximum values of mortality per million in the **total_deaths_per_million** column. (**1 mark**)\n",
"3. Write your code to find and print the minimum and the maximum values of cases per million in the **total_cases_per_million** column. (**1 mark**)\n",
"4. Write your code to find and print the mean and median mortality per million, and the standard deviation from the **total_deaths_per_million** column. (**2 marks**)\n",
"5. Write your code to construct a box plot for the **total_deaths_per_million**. (**2 marks**) \n",
"6. You will find that mean and median values for mortality are quite different. Briefly describe the meaning of the difference between the mean and median mortality per million. Why are they different? (**2 marks**)\n",
"\n",
"**[Total mark: 10]**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# THIS PART EXTRACTS THE DATA FOR ALL COUNTRIES AT A REQUIRED DATE.\n",
"# LEARN HOW IT WORKS.\n",
"# data_filtered_by_date DICTIONARY CONTAINS ALL DATA FOR ALL COUNTRIES AT required_date\n",
"\n",
"required_date = '2021-03-09'\n",
"\n",
"index_for_date = (data_dict['date'] == required_date)\n",
"data_filtered_by_date = {}\n",
"for key in data_dict.keys():\n",
"    try:\n",
"        data_filtered_by_date[key] = np.float_(data_dict[key][index_for_date])\n",
"    except:\n",
"        data_filtered_by_date[key] = data_dict[key][index_for_date]"
]
},
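{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Why is the **try-except** construction required in this code? (2 marks)\n",
"#\n",
"# np.float_ converts a filtered column to floating point. This works for the numeric\n",
"# columns, but columns that hold text (for example iso_code, continent, location, date\n",
"# and tests_units) cannot be converted and raise a ValueError. The except branch catches\n",
"# that error and stores the filtered column unchanged, so the loop can process every key\n",
"# without stopping.\n",
"# Note: np.float_ was removed in NumPy 2.0; np.float64 is the equivalent name there."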
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Maximum: 2526.571\n",
"Minimum: 0.001\n"
]
}
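],
"source": [
"# INSERT YOUR CODE HERE\n",
"#\n",
"# 2. Write your code to find and print the minimum and the maximum values of mortality \n",
"# per million (key name 'total_deaths_per_million'). (1 mark)\n",
"#\n",
"# Load the CSV file into a pandas DataFrame.\n",
"# Note: df holds the rows for every date, so the values below are the extremes over the\n",
"# whole file, not only for required_date. To restrict them to that date, np.max and\n",
"# np.min can be applied to data_filtered_by_date['total_deaths_per_million'] instead.\n",
"df = pd.read_csv(\"owid-covid-data.csv\")\n",
"# max() and min() skip missing values (NaN) by default.\n",
"print(\"Maximum: \",df[\"total_deaths_per_million\"].max())\n",
"print(\"Minimum: \",df[\"total_deaths_per_million\"].min())\n"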
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Maximum: 159011.195\n",
"Minimum: 0.001\n"
]
}
],
"source": [
"# INSERT YOUR CODE HERE\n",
"#\n",
"# 3. Write your code to find and print the minimum and the maximum values of cases per million \n",
"# in the total_cases_per_million column. (1 mark)\n",
"#\n",
"#we print the max and min value using max,min function\n",
"print(\"Maximum: \",df[\"total_cases_per_million\"].max())\n",
"print(\"Minimum: \",df[\"total_cases_per_million\"].min())"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean: 202.02780434368864\n",
"Median: 35.193000000000005\n",
"Standard Deviation: 356.63652697858066\n"
]
}
],
"source": [
"# INSERT YOUR CODE HERE\n",
"# Write your code to find and print the mean and median mortality per million, and the standard \n",
"# deviation from the total_deaths_per_million column. (2 marks)\n",
"#we find mean,median and SD on the data frame.\n",
"print(\"Mean: \",df[\"total_deaths_per_million\"].mean())\n",
"print(\"Median: \",df[\"total_deaths_per_million\"].median())\n",
"print(\"Standard Deviation: \",df[\"total_deaths_per_million\"].std())"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Figure: box plot titled 'Box Plot for COVID mortality', x-axis 'Total Deaths per Million']"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# INSERT YOUR CODE HERE\n",
"#\n",
"# Write your code to construct a box plot for the COVID mortality per million.\n",
"#\n",
"# Draw the box plot; the data is passed as the keyword argument x, which seaborn 0.12 and later require.\n",
"sns.boxplot(x=df['total_deaths_per_million'], color='plum')\n",
"# plt.title sets the title of the plot\n",
"plt.title('Box Plot for COVID mortality')\n",
"# plt.xlabel sets the x-axis label\n",
"plt.xlabel('Total Deaths per Million')\n",
"plt.tight_layout()"
]
},
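{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Briefly describe the meaning of the difference between the mean and median COVID mortality per million.\n",
"# Why are they different? (2 marks)\n",
"#\n",
"# The mean (about 202) is much larger than the median (about 35) because the distribution\n",
"# is strongly right-skewed: most values are small, while a minority of very large values\n",
"# pull the mean upwards. The median is the middle value of the sorted data, so it is not\n",
"# affected by those extremes and better describes a typical value here.\n",
"#\n"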
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## **Instruction 3**\n",
"This part of the assignment is designed to examine COVID cases and mortality data for different countries. The countries are given by their ISO alpha-3 codes, given here: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3 .\n",
"As in the previous task, we will be using **total_cases_per_million** and **total_deaths_per_million** columns.\n",
"\n",
"**Your tasks**:\n",
"\n",
"1. Write your code to find and print out the number of countries in the dataset. (**1 mark**)\n",
"2. Population mortality from seasonal influenza is generally thought to be about 0.01% (1 death per 10000 people). Write your code to find and print out the list of countries with a population mortality greater than ten times the population mortality of seasonal influenza. (**2 marks**)\n",
"3. Write your code to find and print out the country with the highest population mortality. Both the country code and the mortality value for this country need to be printed out. (**2 marks**)\n",
"4. Write your code to construct a dictionary variable ***data_large_countries*** whose keys are **iso_code**, **total_cases_per_million**, **total_deaths_per_million**, **population**, **population_density**. You can start from the original dataset **data_dict**. You can adapt the code given to you in Instruction 2. Print the dictionary. (**2 marks**)\n",
"5. Using the new dictionary ***data_large_countries***, plot a bar chart of mortality, sorted in descending order, for the twenty countries with the highest mortality among those with a **population** of more than 1 million. (**3 marks**)\n",
"6. Using the same dictionary, plot a bar chart of population density for the mortality leaders. Use the same order of countries as in Task 5. (**3 marks**)\n",
"7. Discuss the bar charts. What conclusion can you draw if you look at all of them together? (**2 marks**)\n",
"\n",
"[**Total mark: 15**]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of unique countries in the dataset: 215\n"
]
}
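],
"source": [
"# INSERT YOUR CODE HERE \n",
"# Write your code to find and print out the number of countries in the dataset. (1 mark)\n",
"# To be sure that you get a list of unique countries, you may want to use np.unique\n",
"\n",
"# Counter builds a dictionary keyed by each distinct iso_code, so the number of keys\n",
"# is the number of distinct codes. Note that this count also includes any codes used\n",
"# for aggregate rows (for example continents), not only individual countries.\n",
"items = Counter(data_dict['iso_code']).keys()\n",
"print(\"Number of unique countries in the dataset:\", len(items))"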
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['Andorra', 'Argentina', 'Armenia', 'Austria', 'Belgium', 'Bolivia', 'Bosnia and Herzegovina', 'Brazil', 'Bulgaria', 'Chile', 'Colombia', 'Croatia', 'Czechia', 'Europe', 'European Union', 'France', 'Hungary', 'Italy', 'Latvia', 'Liechtenstein', 'Lithuania', 'Luxembourg', 'Mexico', 'Moldova', 'Montenegro', 'North America', 'North Macedonia', 'Panama', 'Peru', 'Poland', 'Portugal', 'Romania', 'San Marino', 'Slovakia', 'Slovenia', 'South America', 'Spain', 'Sweden', 'Switzerland', 'United Kingdom', 'United States']\n"
]
}
],
"source": [
"# INSERT YOUR CODE HERE \n",
"#\n",
"# 2. Population mortality from seasonal influenza is generally thought to be about 0.01% \n",
"# (1 death per 10000 people). Write your code to find and print out the list of countries \n",
"# with the population mortality is more than ten times of the seasonal influenza. (2 marks)\n",
"#\n",
"\n",
"mortality = pd.read_csv('owid-covid-data.csv')\n",
"# 0.01% of the population is 100 deaths per million, so ten times that is 1000 per million.\n",
"# Select the location of every row whose total_deaths_per_million exceeds 1000.\n",
"locations = mortality[mortality.total_deaths_per_million > 1000].location\n",
"# Keep each location once, in order of first appearance.\n",
"# Note: the result also contains aggregate rows such as 'Europe' and 'South America'.\n",
"res = []\n",
"for i in locations:\n",
"    if i not in res:\n",
"        res.append(i)\n",
"\n",
"print(str(res))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2526.571\n",
"CZE\n"
]
}
],
"source": [
"# INSERT YOUR CODE HERE \n",
"# Write your code to find and print out the country with the highest population mortality. \n",
"# Both the country code and the mortality value for this country need to be printed out (2 marks)\n",
"# HINT: You can use np.where or np.argmax functions.\n",
"\n",
"column = df['total_deaths_per_million']\n",
"max_value = column.max()\n",
"\n",
"print(max_value)\n",
"\n",
"# idxmax() returns the row label of the maximum, so the country code is looked up\n",
"# from the data instead of being hard-coded.\n",
"print(df.loc[column.idxmax(), 'iso_code'])\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# INSERT YOUR CODE HERE \n",
"# 4. Write your code to construct a dictionary variable ***data_large_countries*** whose keys are **iso_code**,\n",
"# **total_cases_per_million**, **total_deaths_per_million**, **population**, **population_density**.\n",
"# You can start from the original dataset **data_dict**.\n",
"# You can adapt the code given to you in Instruction 2.\n",
"# Print the dictionary.\n",
"# (**2 marks**)\n",
"\n",
"# Build the dictionary by copying the five requested columns from data_filtered_by_date,\n",
"# so that each key maps to its array of values at required_date.\n",
"keys = ['iso_code', 'total_cases_per_million', 'total_deaths_per_million', 'population', 'population_density']\n",
"data_large_countries = {key: data_filtered_by_date[key] for key in keys}\n",
"print(data_large_countries)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png":...
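For Tasks 4 to 6 of Instruction 3, one possible approach is to keep the five requested columns, mask out countries with a population of 1 million or less, and sort the rest by mortality with `np.argsort`. The sketch below shows the idea under stated assumptions: the dictionary is a hypothetical stand-in for `data_filtered_by_date`, and every value in it is invented for illustration.

```python
import numpy as np

# Hypothetical stand-in for data_filtered_by_date; all values are invented for illustration.
data_filtered_by_date = {
    "iso_code": np.array(["AAA", "BBB", "CCC", "DDD"]),
    "total_cases_per_million": np.array([5000.0, 80000.0, 60000.0, 90000.0]),
    "total_deaths_per_million": np.array([50.0, 1500.0, 900.0, 2000.0]),
    "population": np.array([5.0e6, 2.0e7, 8.0e6, 4.0e5]),
    "population_density": np.array([30.0, 120.0, 250.0, 400.0]),
    "median_age": np.array([30.0, 40.0, 35.0, 45.0]),
}

keys = ["iso_code", "total_cases_per_million", "total_deaths_per_million",
        "population", "population_density"]

# Task 4: keep only the five requested columns.
data_large_countries = {key: data_filtered_by_date[key] for key in keys}

# Task 5: positions of the countries with more than 1 million inhabitants.
idx = np.flatnonzero(data_large_countries["population"] > 1_000_000)

# Sort those positions by mortality in descending order and keep at most twenty.
deaths = data_large_countries["total_deaths_per_million"]
order = idx[np.argsort(deaths[idx])[::-1]][:20]

top_codes = data_large_countries["iso_code"][order]
top_deaths = deaths[order]
# Task 6: reuse the same order so both charts list the countries identically.
top_density = data_large_countries["population_density"][order]

print(list(zip(top_codes, top_deaths, top_density)))
```

The arrays `top_codes` and `top_deaths` can then be passed to `plt.bar` for the mortality chart, and `top_codes` with `top_density` for the population density chart.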