Create a Python program, with associated comments to make the code understandable, which analyzes a medical dataset loaded in a Pandas dataframe. The dataset contains the variables provided on this...


Create a Python program, with associated comments to make the code understandable, which analyzes a medical dataset loaded in a Pandas dataframe. The dataset contains the variables provided on this documentation page: https://vincentarelbundock.github.io/Rdatasets/doc/Ecdat/DoctorContacts.html.


An iPython notebook is provided with comments asking for the tasks to complete, with associated output. It is attached both as a notebook and a Python script.


Your task consists of providing the code associated with the comments, which is summarized here (see the similar example in the worksheet for week 5):


1) read the dataset from https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/DoctorContacts.csv into the 'medical' variable
2) view some of the rows in the medical dataset (for example, the first 2 rows)
3) format numbers to be displayed with two decimal positions
4) view the first 5 rows in the medical dataset
5) view the last 5 rows in the medical dataset
6) customize the column names: add a name for the first column, for example 'id'
   # hint: assign the name using the index of the column ix with: medical.columns.values[ix] = 'id'
7) medical.head()
8) provide descriptive statistics of all variables in this dataset
9) provide descriptive statistics for the patients in excellent health
10) get_ipython().run_line_magic('matplotlib', 'inline')
11) draw a histogram for variable 'educdec'
12) draw a histogram comparing education level (educdec) by gender
   # hint: use the parameter by='variable name' in hist()
   histogram = medical.hist(column='educdec', by='sex')
13) access column 'child' of the dataframe in two manners
15) select the first row of the dataset
16) slice the dataset to extract the first 10 rows
17) slice the dataset to extract the first 10 rows and only columns 'health' and 'educdec'
18) display the health status of the patient with id 9
19) sort the dataset by increasing value of column 'age'
   # hint: use the 'by' argument in the sort_values method
20) sort the dataset by decreasing value of column 'age'
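
The column-naming, selection, slicing and sorting tasks (6 and 13 to 20) can be sketched on a small stand-in dataframe, so the example runs without downloading the CSV. The rows below are made up for illustration and reuse only a few of the dataset's column names. `rename` is used here instead of assigning through `columns.values`, which relies on mutating the index's underlying array. This is one possible approach, not the expected solution.

```python
import pandas as pd

# Hypothetical stand-in for the 'medical' dataframe: made-up rows that reuse
# a few column names from the DoctorContacts dataset.
medical = pd.DataFrame({
    "Unnamed: 0": [1, 2, 3, 4],
    "health": ["good", "excellent", "fair", "excellent"],
    "educdec": [12.0, 6.0, 12.0, 16.0],
    "age": [42.88, 0.40, 30.25, 7.10],
    "child": [False, True, False, True],
})

# Task 6: name the first column 'id' (rename returns a new dataframe).
medical = medical.rename(columns={medical.columns[0]: "id"})

# Task 13: two ways to access the 'child' column.
child_by_key = medical["child"]
child_by_attr = medical.child

# Task 15: select the first row by position.
first_row = medical.iloc[0]

# Tasks 16-17: position-based slices (2 rows here; the assignment asks for 10).
first_rows = medical.iloc[:2]
first_rows_two_cols = medical.loc[medical.index[:2], ["health", "educdec"]]

# Task 18: health status of the patient with a given id (3 here, 9 in the task).
health_of_patient = medical.loc[medical["id"] == 3, "health"]

# Tasks 19-20: sort by 'age', ascending and then descending.
by_age_up = medical.sort_values(by="age")
by_age_down = medical.sort_values(by="age", ascending=False)

print(by_age_up[["id", "age"]])
print(by_age_down[["id", "age"]])
```

With the real dataset, the same calls would apply after `pd.read_csv(...)`, replacing `:2` with `:10` and the id `3` with `9`.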


Turn in your program as an attached IPython notebook (.ipynb extension), or as an attached Python file (.py extension).



Answered 4 days after Jun 23, 2021

Pratap answered on Jun 24 2021
146 Votes
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "municipal-motor",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "random-james",
"metadata": {},
"outputs": [],
"source": [
"# task 1\n",
"# read the dataset \n",
"# from 'https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/DoctorContacts.csv'\n",
"# to 'medical' variable\n",
"medical = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/DoctorContacts.csv')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "restricted-direction",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Unnamed: 0 mdu lc idp lpi fmde physlim ndisease health \\\n",
"0 1 0 0.0 True 6.907755 0.0 False 13.73189 good \n",
"1 2 2 0.0 True 6.907755 0.0 False 13.73189 good \n",
"\n",
" linc lfam educdec age sex child black \n",
"0 9.528776 1.386294 12.0 42.87748 male False True \n",
"1 9.528776 1.386294 12.0 43.87748 male False True "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# task 2\n",
"# view some of the rows in the medical dataset (for example first 2 rows)\n",
"medical.head(2)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "signal-toyota",
"metadata": {},
"outputs": [],
"source": [
"# task 3\n",
"# format numbers to be displayed with two decimal positions\n",
"# note: round() changes the stored values, not only how they are displayed\n",
"medical = medical.round(decimals=2)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "southern-leather",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Unnamed: 0 mdu lc idp lpi fmde physlim ndisease health linc \\\n",
"0 1 0 0.0 True 6.91 0.0 False 13.73 good 9.53 \n",
"1 2 2 0.0 True 6.91 0.0 False 13.73 good 9.53 \n",
"2 3 0 0.0 True 6.91 0.0 False 13.73 good 9.53 \n",
"3 4 0 0.0 True 6.91 0.0 False 13.73 good 9.53 \n",
"4 5 0 0.0 True 6.91 0.0 False 13.73 good 9.53 \n",
"\n",
" lfam educdec age sex child black \n",
"0 1.39 12.0 42.88 male False True \n",
"1 1.39 12.0 43.88 male False True \n",
"2 1.39 12.0 44.88 male False True \n",
"3 1.39 12.0 45.88 male False True \n",
"4 1.39 12.0 46.88 male False True "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# task 4\n",
"# view first 5 rows in the medical dataset\n",
"medical.head(5)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "broadband-laser",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Unnamed: 0 mdu lc idp lpi fmde physlim ndisease health \\\n",
"20181 20182 2 0.00 False 5.38 0.00 True 10.58 excellent \n",
"20182 20183 0 0.00 False 5.38 0.00 True 10.58 excellent \n",
"20183 20184 8 3.26 False 6.87 8.01 True 10.58 excellent \n",
"20184 20185 8 3.26 False 5.16 6.54 True 10.58 excellent \n",
"20185 20186 6 3.26 False 6.62 8.01 True 10.58 excellent \n",
"\n",
" linc lfam educdec age sex child black \n",
"20181 8.82 1.95 6.0 0.40 female True False \n",
"20182 8.82 2.08 6.0 1.40 female True False \n",
"20183 8.92 1.61 12.0 0.31 male True False \n",
"20184 7.55 1.79 12.0 0.06 male True False \n",
"20185 8.82 1.39 12.0 0.11 female True True "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# task 5\n",
"# view last 5 rows in the medical dataset\n",
"medical.tail(5)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "timely-access",
"metadata": {},
"outputs": [],
"source": [
"# task 6\n",
"# customize the column names: add a name for the first column,\n",
"# for example id\n",
"# hint: assign name using the index of the column ix with: medical.columns.values[ix] = 'id'\n",
"columns = medical.columns.values\n",
"columns[0] = 'id'\n",
"medical.columns = columns"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "twenty-mitchell",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" id mdu lc idp lpi fmde physlim ndisease health linc lfam \\\n",
"0 1 0 0.0 True 6.91 0.0 False 13.73 good 9.53 1.39 \n",
"1 2 2 0.0 True 6.91 0.0 False 13.73 good 9.53 1.39 \n",
"2 3 0 0.0 True 6.91 0.0 False 13.73 good 9.53 1.39 \n",
"3 4 0 0.0 True 6.91 0.0 False 13.73 good 9.53 1.39 \n",
"4 5 0 0.0 True 6.91 0.0 False 13.73 good 9.53 1.39 \n",
"\n",
" educdec age sex child black \n",
"0 12.0 42.88 male False True \n",
"1 12.0 43.88 male False True \n",
"2 12.0 44.88 male False True \n",
"3 12.0 45.88 male False True \n",
"4 12.0 46.88 male False True "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# task 7\n",
"# medical.head()\n",
"medical.head()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "agricultural-ozone",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" ...
              id       mdu        lc       idp       lpi      fmde   physlim  ndisease
count   20186.00  20186.00  20186.00     20186  20186.00  20186.00     20186  20186.00
unique       NaN       NaN       NaN         2       NaN       NaN         2       NaN
top          NaN       NaN       NaN     False       NaN       NaN     False       NaN
freq         NaN       NaN       NaN     14938       NaN       NaN     16747       NaN
mean    10093.50      2.86      2.38       NaN      4.71      4.03       NaN     11.24
std      5827.34      4.50      2.04       NaN      2.70      3.47       NaN      6.74
min         1.00      0.00      0.00       NaN      0.00      0.00       NaN      0.00
25%      5047.25      0.00      0.00       NaN      4.06      0.00       NaN      6.90
50%     10093.50      1.00      3.26       NaN      6.11      6.10       NaN     10.58
75%     15139.75      4.00      4.56       NaN      6.62      6.96       NaN     13.73

          health      linc      lfam   educdec       age       sex     child     black
count      20186  20186.00  20186.00  20186.00  20186.00     20186     20186     20186
unique         4       NaN       NaN       NaN       NaN         2         2         2
top    excellent       NaN       NaN       NaN       NaN    female     False     False
freq       11015       NaN       NaN       NaN       NaN     10435     12083     16354
mean         NaN      8.71      1.25     11.97     25.72       NaN       NaN       NaN
std          NaN      1.23      0.54      2.81     16.77       NaN       NaN       NaN
min          NaN      0.00      0.00      0.00      0.00       NaN       NaN       NaN
25%          NaN      8.58      1.10     11.00     11.46       NaN       NaN       NaN
50%          NaN      8.98      1.39     12.00     24.20       NaN       NaN       NaN
75%          NaN      9.26      1.61     13.00     37.40       NaN       NaN
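
A summary that combines count/mean/std rows with unique/top/freq rows is the shape that `describe(include='all')` returns. The sketch below reproduces that shape on a small made-up dataframe (an assumption for illustration, not the real data) and also covers the "excellent health" subset from task 9. It sets the display format through `pd.options.display.float_format`, which changes only how floats are printed. This is an alternative to rounding the data and is not necessarily what the answer used.

```python
import pandas as pd

# Hypothetical stand-in rows; column names follow the DoctorContacts dataset.
medical = pd.DataFrame({
    "mdu": [0, 2, 8, 6],
    "health": ["good", "excellent", "excellent", "fair"],
    "age": [42.87748, 0.40, 0.31, 30.25],
    "sex": ["male", "female", "male", "female"],
})

# Show floats with two decimals; the stored values are left untouched.
pd.options.display.float_format = "{:.2f}".format

# Task 8: statistics for every column, numeric and non-numeric alike.
summary_all = medical.describe(include="all")

# Task 9: keep only patients in excellent health, then summarize
# (describe() without arguments covers the numeric columns only).
excellent = medical[medical["health"] == "excellent"]
summary_excellent = excellent.describe()

print(summary_all)
print(summary_excellent)

# Restore the default display format so later output is unaffected.
pd.reset_option("display.float_format")
```

Non-numeric columns get NaN in the mean/std/quantile rows and numeric columns get NaN in the unique/top/freq rows, which matches the pattern of NaN cells in the table above.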