Online Discussion forum (5 marks): Post your proposed topic and chosen dataset as well as a short...

Question

Online Discussion forum (5 marks):
Post your proposed topic and chosen dataset, together with a short plan for the project. Explain whether it falls into the supervised or unsupervised learning category and whether it is a regression or classification problem. This post is required for approval of the topic. As discussed, students must select unique topics; if any assignments overlap, they will not be accepted. This should be done by the end of week 10. Any queries about the assignment deliverables should also be posted in the discussion forum so that other students can benefit from the responses.

Oral Presentation (15 marks):
You will be required to present a brief (10-minute) executive summary of your project in class. This is a mandatory component of the assignment.

Data Mining technical report (80 marks):
The marks for the report section are split into three areas:

a. Data understanding and preparation (20%)
b. Algorithms/techniques chosen and implemented in the R programming language for data analysis (30%)
c. Presentation, discussion and quality of the results: explanation of interesting patterns found (50%)

Notes:
All work must be submitted in ONE Word document.
No email submissions are allowed unless specific permission has been granted.
Do not explain how to perform the techniques or provide instructions in your report; this is what the books are for. Instead, spend your time explaining your findings.

Document Preview:

MURDOCH UNIVERSITY
ICT515 Foundations of Data Science
Semester 1, 2017
ASSIGNMENT 2

Assignment Information
For this assignment, students should work in pairs. You should submit your assignment from the ICT515 LMS site using the Assignment unit tool. Late submissions will be penalised at the rate of 10 marks per day late or part thereof. You must keep a copy of the final version of your assignment as submitted and be prepared to provide it on request.

The University treats plagiarism, collusion, theft of other students’ work and other forms of dishonesty in assessment seriously. Any instances of dishonesty in this assessment will be forwarded immediately to the Faculty Dean. For guidelines on honesty in assessment, including avoiding plagiarism, see: http://our.murdoch.edu.au/Educational-technologies/Academic-integrity/

Overview
For this assignment, students will work in pairs. Each group needs to choose a real dataset (two different datasets) that the group members find interesting, in the sense that they believe it contains data which can provide useful information if explored. Students then need to implement, via the R programming language, different techniques that we have covered in this unit to try to find the best way to answer their questions about the dataset and extract the useful information.

There are numerous datasets available online, and a link to a good repository will be given in LMS during the semester. You are free, however, to choose any dataset you prefer, the conditions being that:
The dataset must be freely available online so that I can download it and perform the analysis myself.
Students must each choose unique projects; this generally means different datasets entirely.

If you have another preferred source of data, then you may request to use that instead and I’ll have a look. I can also propose other datasets, if students need additional...

005_3xltazo-wdghgyln.docx

Answered Same Day · Dec 26, 2021

Robert · Accepted Answer

Data Mining Technical Report

PART A: SUPERVISED CLASSIFICATION (using random forest)
a) Data understanding and preparation
The multivariate data relates to the direct marketing campaigns of a Portuguese banking institution. The data consists of 45,211 observations from May 2008 to November 2010. These are records of phone calls and are not unique per client; each one records whether the term deposit was subscribed to or not. In the phone campaign, 17 categorical and non-categorical variables were collected from the client. The objective is to predict whether the client will subscribe to a term deposit or not.
Of the 17 variables, 16 are independent variables and 1 is the dependent variable. The 16 independent variables are listed below:
