

Brooklyn Housing Analysis



Dataset:

CSV here




Provide a short narrative describing the Brooklyn Housing Analysis problem. You can use any methods or tools you think are most appropriate. Write the step-by-step instructions for completing the Dimensionality and Feature Reduction, and Model Evaluation and Selection parts of your case study.



Add the remaining steps (10-15) to the current Jupyter Notebook file.


Provide a short narrative describing the Brooklyn Housing Analysis problem.


1. I want to see if I can create a map that displays geographical areas or regions colored, shaded, or patterned according to a data variable (a choropleth map).






Dimensionality and Feature Reduction


2. Some of my questions have been answered by seeing the charts, but in some ways looking at this much data has created even more questions.


a. Now it's time to reduce some of the features so we can concentrate on the things that matter! The features we will get rid of are: "Unnamed", "apartment_number", "Ext ", "Landmark", etc.


b. Fill in missing values. (apartment_number has some missing values, but we are dropping that feature.) If there is a missing value in a column representing the year in which alterations were carried out on a property, it may make more sense to assume that no alteration has been carried out.
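Steps 2a and 2b can be sketched with pandas roughly as follows. This is only an illustration on a made-up toy frame, not the notebook's actual code: "Unnamed: 0" is the column name that appears in the notebook output, while "YearAlter1" and "sale_price" are assumed names, and encoding "never altered" as 0 is one possible choice.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real table; all values are made up.
toy = pd.DataFrame({
    "Unnamed: 0": [1, 2, 3],
    "apartment_number": [np.nan, "2B", np.nan],
    "Landmark": [np.nan, np.nan, np.nan],
    "YearAlter1": [1998.0, np.nan, np.nan],   # assumed column name
    "sale_price": [450000, 1200000, 310000],  # assumed column name
})

# Step 2a: drop the features we do not plan to model on.
# errors="ignore" skips any listed name that is absent from the frame
# ("Ext" is not in this toy frame, so it is silently skipped).
cols_to_drop = ["Unnamed: 0", "apartment_number", "Ext", "Landmark"]
reduced = toy.drop(columns=cols_to_drop, errors="ignore")

# Step 2b: treat a missing alteration year as "never altered" (encoded as 0).
reduced["YearAlter1"] = reduced["YearAlter1"].fillna(0)

print(reduced)
```

Using 0 as the placeholder keeps the column numeric; a separate boolean "was altered" flag would be an alternative if a year of 0 could confuse a later model.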


3. If you go back and look at the histograms of sales, you'll see that the distribution is very skewed: many low real estate sales and not very many high ones. A log transformation is a good method to use on highly skewed data.
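As a minimal sketch of the log transformation in step 3, the snippet below applies `np.log1p` to a handful of made-up prices and compares the skewness before and after. The series name "sale_price" is an assumption, and the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up prices: several modest sales and one very large one.
prices = pd.Series([90_000, 150_000, 400_000, 950_000, 25_000_000],
                   name="sale_price")

# log1p computes log(1 + x), so a price of 0 would map to 0 instead of -inf.
log_prices = np.log1p(prices)

# expm1 is the inverse, used to bring predictions back to dollars.
recovered = np.expm1(log_prices)

print("skew before:", prices.skew())
print("skew after: ", log_prices.skew())
```

If the model is trained on the log-transformed target, its predictions are on the log scale and need `np.expm1` before being read as prices.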


4. Convert your categorical data into numbers. For other categorical columns, I filled the missing data with the modal value of the respective column; for the remaining numerical variables, I used a mixture of soft-impute imputation and filling missing data with the median value.
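The mode and median filling described in step 4, followed by converting categories to numbers, might look like the sketch below. It uses a made-up toy frame; "gross_sqft" is an assumed column name, one-hot encoding via `pd.get_dummies` is just one of several possible encodings, and the soft-impute step is not shown.

```python
import numpy as np
import pandas as pd

# Toy frame with one categorical and one numeric column, each with a gap.
toy = pd.DataFrame({
    "neighborhood": ["MILL BASIN", np.nan, "BROOKLYN HEIGHTS", "MILL BASIN"],
    "gross_sqft": [1200.0, np.nan, 3400.0, 1800.0],  # assumed column name
})

# Categorical column: fill gaps with the most frequent value (the mode).
mode_value = toy["neighborhood"].mode().iloc[0]
toy["neighborhood"] = toy["neighborhood"].fillna(mode_value)

# Numeric column: fill gaps with the median of the observed values.
toy["gross_sqft"] = toy["gross_sqft"].fillna(toy["gross_sqft"].median())

# Convert the categorical column into 0/1 indicator (one-hot) columns.
encoded = pd.get_dummies(toy, columns=["neighborhood"], dtype=int)

print(encoded)
```

One-hot encoding avoids implying an order between neighborhoods; for columns with very many distinct values, integer codes or grouping rare categories may be more practical.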



Model Evaluation and Selection


5. Training: split the data into two sets, Training and Testing.


6. Evaluation: remember that we are trying to predict the selling prices of houses.
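Steps 5 and 6 can be illustrated with a small, self-contained sketch. It uses synthetic data and plain NumPy least squares so that it runs without a modelling library; in a notebook, scikit-learn's `train_test_split` and `LinearRegression` would be the more usual choice. The 80/20 split, the single feature, and the log-price target are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: one feature (square footage) and a log-price target.
n = 200
sqft = rng.uniform(500, 5000, size=n)
log_price = 11.0 + 0.0004 * sqft + rng.normal(0, 0.1, size=n)

# Step 5: shuffle the row indices and hold out 20% of them for testing.
idx = rng.permutation(n)
n_test = int(0.2 * n)
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Fit ordinary least squares on the training rows only.
X = np.column_stack([np.ones(n), sqft])  # intercept column + feature
coef, *_ = np.linalg.lstsq(X[train_idx], log_price[train_idx], rcond=None)

# Step 6: evaluate on the held-out rows with RMSE and R^2.
y_test = log_price[test_idx]
pred = X[test_idx] @ coef
resid = y_test - pred
rmse = np.sqrt(np.mean(resid ** 2))
r2 = 1 - np.sum(resid ** 2) / np.sum((y_test - y_test.mean()) ** 2)

print(f"test RMSE (log scale): {rmse:.3f}")
print(f"test R^2: {r2:.3f}")
```

Scoring only on rows the model never saw during fitting is what makes the RMSE and R^2 an estimate of how well selling prices would be predicted for new sales.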




Format: The completed task must be in a Jupyter Notebook with all cells run and results displayed.




Resources:


https://www.kaggle.com/tianhwu/brooklynhomes2003to2017


https://hackernoon.com/predicting-the-price-of-houses-in-brooklyn-using-python-1abd7997083b


https://towardsdatascience.com/closing-the-sale-predicting-home-prices-via-linear-regression-2eac62c72818


https://medium.com/geoai/house-hunting-the-data-scientist-way-b32d93f5a42f




Answered Same Day
Oct 08, 2021


Ximi answered on Oct 12 2021
158 Votes
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import string\n",
"import re\n",
"import matplotlib.pyplot as plt\n",
"from collections import Counter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Step 1: Load data into a dataframe"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3020: DtypeWarning: Columns (40,41,43,45,46,47,86) have mixed types. Specify dtype option on import or set low_memory=False.\n",
" interactivity=interactivity, compiler=compiler, result=result)\n"
]
}
],
"source": [
"housing_data = pd.read_csv('brooklynhomes2003to2017/brooklyn_sales_map.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Step 2: Check the dimension of the table and view the data"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The dimension of the table is: (390883, 111)\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>Unnamed: 0</th><th>borough</th><th>neighborhood</th><th>building_class_category</th><th>tax_class</th><th>block</th><th>lot</th><th>easement</th><th>building_class</th><th>address</th><th>...</th><th>EDesigNum</th><th>APPBBL</th><th>APPDate</th><th>PLUTOMapID</th><th>FIRM07_FLA</th><th>PFIRM15_FL</th><th>Version</th><th>MAPPLUTO_F</th><th>SHAPE_Leng</th><th>SHAPE_Area</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1</td><td>3</td><td>DOWNTOWN-METROTECH</td><td>28 COMMERCIAL CONDOS</td><td>4</td><td>140</td><td>1001</td><td>NaN</td><td>R5</td><td>330 JAY STREET</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>2</td><td>3</td><td>DOWNTOWN-FULTON FERRY</td><td>29 COMMERCIAL GARAGES</td><td>4</td><td>54</td><td>1</td><td>NaN</td><td>G7</td><td>85 JAY STREET</td><td>...</td><td>NaN</td><td>3.000540e+09</td><td>12/06/2002</td><td>1.0</td><td>NaN</td><td>NaN</td><td>17V1.1</td><td>0.0</td><td>1559.889144</td><td>140131.577176</td></tr>\n",
"<tr><th>2</th><td>3</td><td>3</td><td>BROOKLYN HEIGHTS</td><td>21 OFFICE BUILDINGS</td><td>4</td><td>204</td><td>1</td><td>NaN</td><td>O6</td><td>29 COLUMBIA HEIGHTS</td><td>...</td><td>NaN</td><td>0.000000e+00</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>17V1.1</td><td>0.0</td><td>890.718521</td><td>34656.447240</td></tr>\n",
"<tr><th>3</th><td>4</td><td>3</td><td>MILL BASIN</td><td>22 STORE BUILDINGS</td><td>4</td><td>8470</td><td>55</td><td>NaN</td><td>K6</td><td>5120 AVENUE U</td><td>...</td><td>NaN</td><td>0.000000e+00</td><td>NaN</td><td>1.0</td><td>1.0</td><td>1.0</td><td>17V1.1</td><td>0.0</td><td>3729.786857</td><td>797554.847834</td></tr>\n",
"<tr><th>4</th><td>5</td><td>3</td><td>BROOKLYN HEIGHTS</td><td>26 OTHER HOTELS</td><td>4</td><td>230</td><td>1</td><td>NaN</td><td>H8</td><td>21 CLARK STREET</td><td>...</td><td>NaN</td><td>0.000000e+00</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>17V1.1</td><td>0.0</td><td>620.761169</td><td>21360.147631</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 111 columns</p>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 borough neighborhood building_class_category \\\n",
"0 1 3 DOWNTOWN-METROTECH 28 COMMERCIAL CONDOS \n",
"1 2 3 DOWNTOWN-FULTON FERRY 29 COMMERCIAL GARAGES \n",
...