The Jupyter notebook has detailed instructions for the assignment. The attached CSV data file contains the data you need to complete the visualizations.
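The skills the notebook lists (boolean masks, aggregation, rescaling to `[a, b]`, colors from a colormap, and plotting) can be sketched on toy data roughly as follows. This is only an illustrative sketch, not the expected solution: the toy DataFrame, the `price > 0` mask, the `rescale` helper, and the `viridis` colormap are all assumptions made for the example; only the `room_type` and `price` column names come from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for AB_NYC_2019.csv (values are made up).
toy = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Shared room",
                  "Private room", "Entire home/apt", "Shared room"],
    "price": [100, 200, 40, 80, 240, 60],
})

# Boolean mask: keep only rows whose price is strictly positive (assumed filter).
mask = toy["price"] > 0
selected = toy[mask]

# Aggregate: mean price per room type (groupby sorts the group keys).
mean_price = selected.groupby("room_type")["price"].mean()


def rescale(values, a=0.0, b=1.0):
    """Min-max rescale values linearly into the range [a, b] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal, so map everything to the lower bound.
        return np.full_like(values, a)
    return a + (values - values.min()) * (b - a) / span


# Normalize the aggregated values into [0, 1], the input range colormaps expect.
normalized = rescale(mean_price.to_numpy(), 0.0, 1.0)

# Map each normalized value to an RGBA color using a colormap.
colors = plt.cm.viridis(normalized)

# Bar chart of the aggregated values, one color per bar.
fig, ax = plt.subplots()
ax.bar(mean_price.index, mean_price.to_numpy(), color=colors)
ax.set_xlabel("room_type")
ax.set_ylabel("mean price")
ax.set_title("Mean price by room type (toy data)")
# In a notebook, plt.show() would display the figure.
```

With the real dataset, `toy` would be replaced by the DataFrame returned by `pd.read_csv`, and the mask and aggregate would follow whatever the individual tasks ask for.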
{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise \\#3: Making plots of aggregated data\n", "\n", "## Total Points : 50 \n", "### Due: Tuesday 03/23/2021 11:59 PM\n", "#### Objective : Assess your ability to use the Pandas package to query and aggregate data from some rows and make simple plots of the data using Matplotlib\n", "\n", "##### Skills needed: In this homework assignment, you'll be using the concepts you've learned to\n", "1. Edit and run Jupyter notebooks\n", "2. Write Python code using Pandas to :\n", " * Use boolean masks to select rows from the Pandas dataframe\n", " * Get the aggregate of some column values\n", " * Normalize data in the range `[a,b]`\n", " * Create colors using normalized data and color maps\n", " * Use Matplotlib to plot the data\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------\n", "###
Type your name here:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "-----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Requirements & Submission:\n", "1. Use this notebook as a template for your answers. Then rename it as `<your first name>-<last name>-Ex3.ipynb`. \n", "2. Submit on WTClass __“Resources >> Tests, exercises and Quizzes >> Exercise\\#3”__\n", "3. Each task's answer must be given in the cells after the task description\n", "4. Answers and explanations must be properly formatted in the appropriate cell type\n", "5. You can add as many cells as you need\n", "6. You should submit the final ***clean version*** of your notebook, i.e. remove any experimental cells" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "-------------\n", "-------------" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading and cleaning the dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* For this exercise you are required to use the attached `New York City Airbnb 2019` dataset. 
We are specifically interested in the `room_type` and `price` columns \n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Use na_values='na' to instruct read_csv to treat `na` as `np.nan`\n", "notclean = pd.read_csv(\"AB_NYC_2019.csv\", na_values='na')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(48895, 16)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Verify the shape\n", "notclean.shape" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['id', 'name', 'host_id', 'host_name', 'neighbourhood_group',\n", " 'neighbourhood', 'latitude', 'longitude', 'room_type', 'price',\n", " 'minimum_nights', 'number_of_reviews', 'last_review',\n", " 'reviews_per_month', 'calculated_host_listings_count',\n", " 'availability_365'],\n", " dtype='object')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# See the column labels\n", "notclean.columns" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2539 | Clean & quiet apt home by the park | 2787 | John | Brooklyn | Kensington | 40.64749 | -73.97237 | Private room | 149 | 1 | 9 | 10/19/2018 | 0 |