Washington, D.C. is the capital of the United States. Washington's population is approaching 700,000 people and has been growing since 2000 following a half-century of population decline. The city is...

May I have it done in 24 hours?


Washington, D.C. is the capital of the United States. Washington's population is approaching 700,000 people and has been growing since 2000 following a half-century of population decline. The city is highly segregated and features a high cost of living. In 2017, the average price of a single-family home in the district was $649,000. The dataset (DC_Property_Train.CSV) provides insight on the housing stock of the district.

Explanations for Columns
ID: House ID
BATHRM: Number of Full Bathrooms
HF_BATHRM: Number of Half Bathrooms (no bathtub or shower)
HEAT: Heating
AC: Cooling
NUM_UNITS: Number of Units
ROOMS: Number of Rooms
BEDRM: Number of Bedrooms
AYB: The earliest time the main portion of the building was built
YR_RMDL: Year structure was remodeled
EYB: The year an improvement was built more recent than actual year built
STORIES: Number of stories in primary dwelling
PRICE: Price of most recent sale
GBA: Gross building area in square feet
BLDG_NUM: Building Number on Property
STYLE: Style
STRUCT: Structure
LANDAREA: Land area of property in square feet
ASSESSMENT_NBHD: Neighborhood ID

In this problem, you are required to finish the following tasks. All necessary steps need to be clearly documented in your report. You use the data set DC_Property_Train.CSV for questions 1 to 6.

1. Plot a histogram for EYB. Describe the plotted pattern. (4 marks)
2. Plot a histogram for PRICE. Describe the plotted pattern and analyze the potential reasons for the high-priced properties. (4 marks)
3. Summarize the average PRICE for each ASSESSMENT_NBHD. Sort the processed data and make a bar plot of average prices for the top 10 neighborhoods. (6 marks)
4. Plot boxplots of PRICE by ASSESSMENT_NBHD for the top 10 neighborhoods. Explain the pros of using boxplots instead of average prices. (6 marks)
5. Plot boxplots PRICE by the categories of STRUCT using the facet approach. Compare these boxplots and summarize your findings. (6 marks)
6. Visualize the relationship between PRICE and GBA. Identify outliers based on the visualization and list their IDs. (6 marks)
7. Create a regression model for predicting PRICE through selected variables (you decide which ones to use) from the data set DC_Property_Train.CSV. You may exclude the identified outliers from the previous steps. Quantitatively evaluate the model performance using R2 and MSE. Fill the PREDICTED_PRICE column of the data set DC_Property_Test.CSV using the predicted values from your model. (8 marks)
Answered Same Day, Nov 29, 2021

Hemanth answered on Nov 30 2021
# Installing required packages
install.packages("dplyr")
install.packages("ggplot2")
install.packages("caret")
# Loading required packages
library(dplyr)
library(ggplot2)
library(caret)
# Removing all objects from the workspace (global environment)
rm(list = ls())
# Reading dataset
Property_Train <- read.csv("DC_Property_Train.csv", header = TRUE, sep = ",")
# Showing first SIX records
head(Property_Train)
# Showing structure of the data
str(Property_Train)
# Plotting histogram of EYB variable
par(mfrow = c(1,1))
hist(Property_Train$EYB,
     xlab = 'EYB',
     ylab = 'Frequency',
     main = 'Histogram of EYB',
     col = rainbow(7))
# Plotting histogram of PRICE variable
hist(Property_Train$PRICE,
     xlab = 'PRICE',
     ylab = 'Frequency',
     main = 'Histogram of PRICE',
     col = rainbow(7))
# Summarizing the average PRICE for each ASSESSMENT_NBHD.
# Sorting prices and selecting the top 10 neighborhoods.
Top_10_Avg <- Property_Train %>%
  group_by(ASSESSMENT_NBHD) %>%
  summarise(Average = as.integer(mean(PRICE))) %>%
  arrange(desc(Average)) %>%
  head(n = 10)
# Making bar plot of average Prices
ggplot(Top_10_Avg, aes(x = ASSESSMENT_NBHD, y = Average)) +
  geom_bar(stat = 'identity', fill = rainbow(10)) +...
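
The listing breaks off at the bar plot, so the regression step of question 7 is not shown. As a rough illustration of how R2 and MSE could be computed for a linear model, here is a minimal base-R sketch. It is not the original answer's code: the data frame Synthetic is randomly generated to stand in for DC_Property_Train.csv, the predictors GBA, BATHRM and BEDRM are only an assumed selection, and the 80/20 hold-out split is one possible evaluation choice.

```r
# Sketch only: synthetic data stands in for DC_Property_Train.csv,
# and the predictors (GBA, BATHRM, BEDRM) are an assumed selection.
set.seed(42)
n <- 500
Synthetic <- data.frame(
  GBA    = round(runif(n, 600, 4000)),
  BATHRM = sample(1:4, n, replace = TRUE),
  BEDRM  = sample(1:6, n, replace = TRUE)
)
# Made-up price relationship plus noise, purely for illustration
Synthetic$PRICE <- 100000 + 250 * Synthetic$GBA +
  40000 * Synthetic$BATHRM + 15000 * Synthetic$BEDRM +
  rnorm(n, mean = 0, sd = 80000)

# Holding out 20% of the rows for evaluation
test_idx  <- sample(seq_len(n), size = 0.2 * n)
Train_Set <- Synthetic[-test_idx, ]
Valid_Set <- Synthetic[test_idx, ]

# Fitting a linear regression model on the training rows
Model <- lm(PRICE ~ GBA + BATHRM + BEDRM, data = Train_Set)

# Predicting on the held-out rows and computing MSE and R2
Predicted <- predict(Model, newdata = Valid_Set)
Residuals <- Valid_Set$PRICE - Predicted
MSE <- mean(Residuals^2)
R2  <- 1 - sum(Residuals^2) /
  sum((Valid_Set$PRICE - mean(Valid_Set$PRICE))^2)

cat("Hold-out MSE:", MSE, "\n")
cat("Hold-out R2: ", R2, "\n")
```

With the real data, the same predict() call on a data frame read from DC_Property_Test.CSV (for example a hypothetical Property_Test) could fill the PREDICTED_PRICE column, provided that data frame contains the predictor columns used in the model.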