Create an animated choropleth plot using plotly that analyzes a seven-day moving average of cases for some geographic unit and sub-unit (e.g. USA and states) Create a second, non-animated, choropleth...



  • Create an animated choropleth plot using plotly that analyzes a seven-day moving average of cases for some geographic unit and sub-unit (e.g. USA and states)

  • Create a second, non-animated, choropleth plot that shows cumulative cases or vaccinations per 100,000 people for the most recent date in the data file.

  • Important: It is your decision whether to analyze cases or vaccinations, depending on your data source.


Requirements:



  • Find an appropriate data source that includes new COVID-19 cases per day for the geographic region. (Use a direct link, not a downloaded file.)

  • Find a data source that estimates the population for the geographic region. (Use a direct link, not a downloaded file.)

  • Load both to a pandas dataframe

  • Calculate cumulative cases per 100,000 population for the sub-region (i.e., state)

  • Calculate 7-day moving average of new cases. (You might need to research methods in pandas.)

  • PLOT 1: Plot 7-day moving average of cases on Plotly plot and animate by day (older dates on left of slider)

  • PLOT 2: Create a separate plot of cumulative cases per 100,000 population. This should be for the maximum date in the dataframe and should not be animated.

  • Plots will include relevant title and hover text.

  • Colors will be a continuous scale of your choice.
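The two calculations in the requirements can be sketched in pandas as follows. This is a minimal example on made-up toy data, not the assignment's real data source; the column names (`new_case`, `population`, `ma7`, `cum_per_100k`) are assumptions chosen for illustration.

```python
import pandas as pd

# Toy data (made-up numbers): ten days of new cases for two states.
dates = pd.date_range("2021-01-01", periods=10, freq="D")
cases = pd.DataFrame({
    "date": list(dates) * 2,
    "state": ["CA"] * 10 + ["NY"] * 10,
    "new_case": list(range(10, 110, 10)) + [5] * 10,
})
population = pd.DataFrame({
    "state": ["CA", "NY"],
    "population": [1_000_000, 500_000],  # made-up populations
})

# Rolling windows depend on row order, so sort by state and date first.
cases = cases.sort_values(["state", "date"]).reset_index(drop=True)

# 7-day moving average of new cases, computed separately for each state.
# The first six days of each state are NaN because the window is incomplete.
cases["ma7"] = cases.groupby("state")["new_case"].transform(
    lambda s: s.rolling(7).mean()
)

# Cumulative cases per state, then scaled to cases per 100,000 people.
cases["cum_cases"] = cases.groupby("state")["new_case"].cumsum()
merged = cases.merge(population, on="state", how="left")
merged["cum_per_100k"] = merged["cum_cases"] / merged["population"] * 100_000

# Rows for the most recent date only (input for the non-animated plot).
latest = merged[merged["date"] == merged["date"].max()]
print(latest[["state", "cum_per_100k"]])
```

Passing `min_periods=1` to `rolling` is an alternative if a value is wanted for the first six days as well, at the cost of averaging over fewer than seven days there.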

Answered 1 day after Oct 07, 2021

Answer To: Create an animated choropleth plot using plotly that analyzes a seven-day moving average of cases...

Dinesh answered on Oct 08 2021
145 Votes
4-plotly-covid19-US-data/plotly-covid19-US-data.ipynb
{
"cells": [
{
"cell_type": "markdown",
"id": "f22df480-5030-49d0-8d32-91ae936b2658",
"metadata": {},
"source": [
"Create an animated choropleth plot using plotly that analyzes a seven-day moving average of cases for some geographic unit and sub-unit (e.g. USA and states)\n",
"\n",
"Create a second, non-animated, choropleth plot that shows cumulative cases or vaccinations per 100,000 people for the most recent date in the data file. Important: It is your decision whether to analyze cases or vaccinations, depending on your data source.\n",
"\n",
"Requirements:\n",
"\n",
"* Find appropriate data source that includes new COVID-19 cases per day for the geographic region. (Use a direct link, not downloaded file.)\n",
"\n",
"* Find a data source that estimates the population for the geographic region. (Direct link not downloaded file)\n",
"\n",
"* Load both to a pandas dataframe\n",
"\n",
"* Calculate cumulative cases per 100,000 population for the sub-region (i.e., state)\n",
"\n",
"* Calculate 7-day moving average of new cases. (You might need to research methods in pandas.)\n",
"\n",
"* PLOT 1: Plot 7-day moving average of cases on Plotly plot and animate by day (older dates on left of slider)\n",
"\n",
"* PLOT 2: Create a separate plot of cumulative cases per 100,000 population. This should be for the maximum date in the dataframe and should not be animated.\n",
"\n",
"* Plots will include relevant title and hover text.\n",
"\n",
"* Colors will be a continuous scale of your choice."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1dc193e-0b32-4f1e-bc43-0abf3dbdf23a",
"metadata": {},
"outputs": [],
"source": [
"# installing necessary packages\n",
"#! pip install pandas\n",
"#! pip install plotly"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "feac02e2-d019-4fdd-a31a-d428d71dd56e",
"metadata": {},
"outputs": [],
"source": [
"# import necessary python packages\n",
"import pandas as pd\n",
"import plotly.express as px"
]
},
{
"cell_type": "code",
"execution_count": 122,
"id": "a347a446-c171-412d-a7da-79c6118c5603",
"metadata": {},
"outputs": [],
"source": [
"# Load Covid data \n",
"\n",
"covid_cases_df = pd.read_csv('https://data.cdc.gov/api/views/9mfq-cb36/rows.csv')\n",
"\n",
"population_df = pd.read_csv('https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/state/detail/SCPRC-EST2019-18+POP-RES.csv')"
]
},
{
"cell_type": "markdown",
"id": "bf59fe11-16ba-43ba-b0ec-b0d489e2ec06",
"metadata": {},
"source": [
"## Data Cleaning"
]
},
{
"cell_type": "code",
"execution_count": 124,
"id": "69b58038-4c87-442c-8e7f-d69ef0559b61",
"metadata": {},
"outputs": [],
"source": [
"population_df = population_df[['NAME', 'POPESTIMATE2019']]"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "013ea112-987e-42e9-ab35-3d73cf43b84b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 37440 entries, 0 to 37439\n",
"Data columns (total 15 columns):\n",
" #   Column           Non-Null Count  Dtype  \n",
"---  ------           --------------  -----  \n",
" 0   submission_date  37440 non-null  object \n",
" 1   state            37440 non-null  object \n",
" 2   tot_cases        37440 non-null  int64  \n",
" 3   conf_cases       19694 non-null  float64\n",
" 4   prob_cases       19623 non-null  float64\n",
" 5   new_case         37440 non-null  int64  \n",
" 6   pnew_case        33437 non-null  float64\n",
" 7   tot_death        37440 non-null  int64  \n",
" 8   conf_death       19587 non-null  float64\n",
" 9   prob_death       19587 non-null  float64\n",
" 10  new_death        37440 non-null  int64  \n",
" 11  pnew_death       33376 non-null  float64\n",
" 12  created_at       37440 non-null  object \n",
" 13  consent_cases    31193 non-null  object \n",
" 14  consent_deaths   31824 non-null  object \n",
"dtypes: float64(6), int64(4), object(5)\n",
"memory usage: 4.3+ MB\n"
]
}
],
"source": [
"# Information about dataframe\n",
"covid_cases_df.info()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "962af6d5-b587-43a8-9d82-a8f1bf930ff9",
"metadata": {},
"outputs": [],
"source": [
"# Some values in the new_case column are negative, so replace them with 0\n",
"covid_cases_df['case'] = covid_cases_df['new_case'].mask(covid_cases_df['new_case'].lt(0), 0)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "e3200d3b-ea2b-4c70-ab1a-1f155cfca35f",
"metadata": {},
"outputs": [],
"source": [
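"# Keep only the needed columns; .copy() avoids a SettingWithCopyWarning when a column is added later\n",
"covid_cases_df = covid_cases_df[['submission_date', 'case', 'state']].copy()"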
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "5f46ff7e-7081-446d-b646-9ee24f32c375",
"metadata": {},
"outputs": [],
"source": [
"covid_cases_df['date'] = pd.to_datetime(covid_cases_df['submission_date'], format = \"%m/%d/%Y\").dt.strftime('%Y-%m-%d')"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "ce9f4562-0d32-498d-acff-4c07650c8d51",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" submission_date case state date\n",
"37435 07/19/2020 0 NJ 2020-07-19\n",
"37436 02/10/2021 0 FSM 2021-02-10\n",
"37437 12/29/2020 1290 RI 2020-12-29\n",
"37438 04/14/2020 103 DC 2020-04-14\n",
"37439 07/06/2021 117 OR 2021-07-06"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"covid_cases_df.tail()"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "730f6167-7b81-4cbe-a4bc-f76443f0dce5",
"metadata": {},
"outputs": [],
"source": [
"df = covid_cases_df.sort_values(by=['date'])"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "79ba8b80-9fde-4ebd-a9ce-db61fcf6c368",
"metadata": {},
"outputs": [],
"source": [
"# Calculate the 7-day moving average of new cases within each state\n",
"df['7ma'] = df.groupby('state').case.transform(lambda c: c.rolling(7).mean())"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "95afb4aa-e6f2-4d8b-8c3b-de42411aecbf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(63010, 0)"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['case'].max(), df['case'].min()"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "0c448f71-cf1e-4789-86f8-38ade3ac61f3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(47248.28571428572, 0.0)"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['7ma'].max(), df['7ma'].min()"
]
},
{
"cell_type": "markdown",
"id": "619edfc9-6f93-40b4-8c07-f15c15c77145",
"metadata": {},
"source": [
"#### Plot 7-day moving average of cases on Plotly plot and animate by day"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "d944f7e8-f1f7-4bc8-8da3-df1523464942",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"