Assessment Type: Group Assignment
Assessment Number: A4
Assessment Name: Data Mining & BI Report
Weighting: 25%
Alignment with Unit and Course: ULO1, ULO2, ULO3, ULO4
Due Date and Time: ...

Answered 3 days after May 31, 2021 (BISY3001)


Mohd answered on Jun 03 2021
Introduction
In this project we build predictive models of banking customer churn using decision tree, random forest, and naive Bayes classifiers. First we identify the significant contributors to attrition among the nineteen independent variables (predictors). We performed feature engineering to clean the data and chose three classification algorithms to estimate future attrition on the banking churn data. Insignificant and unnecessary predictors were eliminated from the model; some variables were removed to avoid multicollinearity, which severely degrades model performance and predictive ability. We partitioned the data into two groups, training and validation. In future, boosting algorithms could be applied to this data to improve model performance.
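The pipeline described above can be sketched as follows. This is a minimal illustration on a small synthetic stand-in for the banking churn data (the real dataset has nineteen predictors); the feature names and the churn rule below are assumptions, not the actual Kaggle columns.

```python
# Sketch: fit the three classifiers from the report on a synthetic churn dataset,
# with a training/validation partition as described above.
import random

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB

random.seed(42)

# Synthetic customers: [months_on_book, total_transactions, credit_utilisation]
# (illustrative features, not the real dataset's columns)
X, y = [], []
for _ in range(400):
    months = random.randint(6, 60)
    txns = random.randint(10, 140)
    util = random.random()
    # Assumed rule for illustration: low activity + high utilisation -> churn
    churned = 1 if (txns < 50 and util > 0.6) else 0
    X.append([months, txns, util])
    y.append(churned)

# Partition into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=1)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=1),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "naive_bayes": GaussianNB(),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)  # validation accuracy
    print(name, round(scores[name], 3))
```

Comparing the three validation accuracies is one simple way to choose among the candidate classifiers before any boosting is attempted.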
Importance of the chosen area:
Because of exponential growth in the banking sector, companies want to keep a close eye on their customers: what are their spending patterns, and what factors lead particular customers to terminate the bank's services? We have seen the success of American Express, Citibank, and other banking service providers, which have built models to classify customers who are likely to attrite and to identify the likely drivers of attrition. Better insight into credit card customers helps banks make sound decisions about the credit card services they offer.
We could build a prototype web page into which essential customer information is entered, such as past transaction history, customer origin, and income. With models trained on existing data, whenever bank officials or executives enter the required information they would receive an estimate of that customer's churn risk.
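The scoring step behind such a prototype could look like the sketch below: a trained model is wrapped in a function that the web page would call. The feature names and the tiny stand-in training set are assumptions for illustration only.

```python
# Sketch of the scoring function a churn-risk web prototype could call.
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data: [months_on_book, total_transactions, credit_utilisation]
X_train = [[12, 30, 0.8], [48, 120, 0.2], [24, 40, 0.7], [60, 130, 0.1]]
y_train = [1, 0, 1, 0]  # 1 = churned

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def churn_risk(months_on_book, total_transactions, credit_utilisation):
    """Return the model's estimated probability that this customer churns."""
    proba = model.predict_proba(
        [[months_on_book, total_transactions, credit_utilisation]])
    return float(proba[0][1])  # probability of the "churned" class

# An inactive, high-utilisation customer entered by a bank official:
print(round(churn_risk(10, 25, 0.9), 3))
```

In a real deployment the model would of course be trained on the full dataset and persisted, with the web form supplying the feature values.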
Why this data set is interesting:
The data was retrieved from Kaggle, where earlier analyses were done in Python and R. The data set contains an extensive set of the required features (attributes).
What has been done so far:
Earlier researchers investigated housing price data to measure respondents' willingness to pay for clean air. They used hedonic price models fitted to the Boston housing data and calculated estimates of that willingness to pay.
We use these data to predict prices from the Boston housing variables, for example the crime rate in the area. A linear regression model can estimate house prices for many applications, such as the rental market, real estate, and government purposes.
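A minimal sketch of that regression idea, using a single predictor (area crime rate) and the closed-form least-squares solution; the numbers below are illustrative, not actual Boston housing records.

```python
# Sketch: simple linear regression of median house price on crime rate.
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical (crime rate, median price in $1000s) pairs:
# price falls as crime rises.
crime = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0]
price = [34.0, 31.0, 28.0, 24.0, 19.0, 12.0]

intercept, slope = fit_simple_linear(crime, price)
predicted = intercept + slope * 3.0  # estimated price at a crime rate of 3.0
print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```

The fitted slope is negative, matching the intuition that higher crime rates depress the median price; the full model in the report would extend this to all the Boston housing predictors.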
Description of the present experiment:
We use feature engineering, feature selection, and linear regression modelling techniques to build a predictive model of the median price of Boston houses. We assess model performance on a validation dataset to identify whether the model is underfitted or overfitted; both situations must be avoided to achieve the desired results.
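The underfit/overfit check mentioned above amounts to comparing the model's error on the training data with its error on held-out validation data. The sketch below illustrates this on synthetic noisy linear data; the data, split, and thresholds are illustrative assumptions.

```python
# Sketch: detect under/overfitting by comparing training vs validation MSE.
import random

random.seed(0)

def fit(data):
    """Closed-form least squares for y = a + b*x over (x, y) pairs."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in data) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def mse(model, data):
    """Mean squared error of the fitted line on (x, y) pairs."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in data) / len(data)

# Synthetic noisy linear data, split 70/30 into training and validation
points = [(x, 30 - 2.5 * x + random.gauss(0, 1.5))
          for x in [random.uniform(0, 8) for _ in range(100)]]
train, val = points[:70], points[70:]

model = fit(train)
train_err, val_err = mse(model, train), mse(model, val)
print(round(train_err, 2), round(val_err, 2))

# A validation error much larger than the training error suggests overfitting;
# high error on both sets suggests underfitting.
```

Here the two errors are of similar magnitude, the pattern one hopes to see on the validation partition of the housing data.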
Data preparation and Feature extraction:
Select data:...