The attached file Real estate valuation data set.xls contains historical real estate valuation data collected in New Taipei City, Taiwan.


Based on the property features, which include the house age, the distance to the nearest station, the number of convenience stores in the living circle, etc., the user can predict the price of the property they are willing to buy (dataset information is given below).


Answer the following questions:




  •  Apply the K-means clustering algorithm to the given dataset to cluster the data points into natural groups. You need to determine the optimal number of clusters, then justify your answer by explaining why you consider the chosen number of clusters optimal, showing all the steps you took to reach it.




  •  Apply an association rule algorithm to discover interesting relations between the variables in the given dataset, and discuss the results in terms of all the evaluation measures you used in developing your rules.




  •  Apply a decision tree algorithm to create a model that can predict the house price. Your answer should include the decision tree developed, the model accuracy, and the confusion matrix.




  •  For each previous step, you need to post one or more messages to the LMS forum (out_Class_Assessment_discussion_forum) that has been created for you on your course LMS page, discussing with your group members which part of the project you are working on and why you think this process works or doesn't work, and answering others' questions without revealing any code or solutions.
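
For the association rule task, all the variables are numeric, so they would have to be turned into categorical items before any rules can be mined. The base R sketch below illustrates what the three usual evaluation measures (support, confidence and lift) mean for a single candidate rule. The records are a tiny made-up example that is not taken from the dataset, the near/high labels are an assumed binning, and `rule_measures` is a hypothetical helper; in practice a package such as `arules` would mine the rules.

```r
# Made-up, already discretised records (NOT from the real dataset).
items <- data.frame(
  near_mrt   = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE),
  high_price = c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Hypothetical helper: measures for the rule {lhs} => {rhs}, where lhs and
# rhs are logical vectors marking which rows contain each item.
rule_measures <- function(lhs, rhs) {
  support    <- mean(lhs & rhs)         # share of rows containing both items
  confidence <- support / mean(lhs)     # share of lhs rows that also have rhs
  lift       <- confidence / mean(rhs)  # confidence relative to the rhs base rate
  c(support = support, confidence = confidence, lift = lift)
}

# Rule: {near_mrt} => {high_price}
rule_measures(items$near_mrt, items$high_price)
```

A lift above 1 indicates that the two items occur together more often than they would if they were independent, which is one way to argue that a rule is interesting.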


Note the following:




  •  All code should be developed in the R language.




  •  Include your R code with your answer as one zipped folder.




  •  Any preprocessing that is done during your assessment work should be clearly explained and justified.




  •  No more than two students should work in one group.




  •  All messages that you have posted to the discussion forum should be included in your answer with their posting date and time. The messages should reveal your work, in a positive, constructive way, and at times/dates that show your contributions throughout the assessment period.


    Attribute Information:


    The inputs are as follows:
    X1=the transaction date (for example, 2013.250=2013 March, 2013.500=2013 June, etc.)
    X2=the house age (unit: year)
    X3=the distance to the nearest MRT station (unit: meter)
    X4=the number of convenience stores in the living circle on foot (integer)
    X5=the geographic coordinate, latitude. (unit: degree)
    X6=the geographic coordinate, longitude. (unit: degree)


    The output is as follows:
    Y=house price of unit area (10000 New Taiwan Dollar/Ping, where Ping is a local unit, 1 Ping = 3.3 square meters)
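
As a quick check of the units above, the following R sketch decodes a decimal transaction date into a year and month and converts a unit price from 10000 NTD per Ping to NTD per square meter. The helper names (`decode_date`, `price_per_sqm`) are hypothetical and the sample price is made up. The month mapping simply follows the two examples given for X1 (2013.250 = March, 2013.500 = June), which is an assumption about how the fraction was encoded.

```r
# Hypothetical helper: decode X1 assuming the fractional part is month/12
# (so .250 -> March and .500 -> June, matching the description's examples).
decode_date <- function(x1) {
  year  <- floor(x1)
  month <- round((x1 - year) * 12)
  # Assumption: a fraction of 0 denotes December of the previous year.
  rolled <- month == 0
  year[rolled]  <- year[rolled] - 1
  month[rolled] <- 12
  data.frame(year = year, month = month)
}

# Hypothetical helper: Y is in 10000 NTD per Ping and 1 Ping = 3.3 m^2,
# so divide the NTD-per-Ping price by 3.3 to get NTD per square meter.
price_per_sqm <- function(y) {
  y * 10000 / 3.3
}

decode_date(c(2013.250, 2013.500))
price_per_sqm(33)  # made-up value of Y
```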



Answered Same Day Aug 30, 2021

Pritam answered on Aug 31 2021
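library(xlsx)
library(readxl) # provides read_xlsx(), used below
library(cluster)
library(caTools)
library(rpart)
library(tree)
# Load the workbook (file name as given in the original answer)
dt = readxl::read_xlsx("real-estate-valuation-data-set-mkojbszm-2pjzzhcn.xlsx")
sum(is.na(dt)) # no missing values in the data
dt$No = NULL # drop the row index column
# Express the transaction date as an offset from the earliest date
dt$`X1_transaction date` = dt$`X1_transaction date` - min(dt$`X1_transaction date`)
# Columns 5 and 6 are latitude and longitude once No has been dropped
d1 = dt[, c(5, 6)]
set.seed(123) # make the k-means result reproducible
km = kmeans(x = d1, centers = 6)
y_km = km$cluster # cluster label for each row
d1$region =...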
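
The code above fixes `centers = 6` without showing how that number was chosen. One common way to justify the number of clusters is the elbow method: run `kmeans` for a range of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The sketch below is only an illustration under stated assumptions: `coords` holds synthetic points standing in for the latitude and longitude columns, and the blob centres, spread and range of k are made up rather than taken from the dataset.

```r
set.seed(123)

# Synthetic stand-in for the latitude/longitude columns: three made-up
# blobs of points. Replace `coords` with the real two-column data.
centres <- rbind(c(24.95, 121.50), c(24.97, 121.54), c(25.00, 121.52))
coords <- do.call(rbind, lapply(1:3, function(i) {
  cbind(lat = rnorm(50, centres[i, 1], 0.003),
        lon = rnorm(50, centres[i, 2], 0.003))
}))

# Total within-cluster sum of squares (WSS) for k = 1..8.
# nstart = 25 reruns k-means from several random starts and keeps the best.
ks <- 1:8
wss <- sapply(ks, function(k) {
  kmeans(coords, centers = k, nstart = 25)$tot.withinss
})

# The "elbow" is the k after which adding clusters barely reduces WSS.
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The average silhouette width, available through `silhouette` in the `cluster` package that the answer already loads, is a common cross-check on whatever k the elbow suggests.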