
Check the attached file.
IS6052 - Descriptive and Predictive Analytics 2022-2023
Individual CA Project
Due Date: Thursday December 8th (a typewritten annotation on the brief reads "Monday December 19th")
Submit your project report as a single pdf file on Canvas

Loan Appraisal for FNB Bank

You are a credit analyst working for FNB Bank. Your responsibilities include analysing the loan applications and making recommendations to management, based upon your findings, to help them make data-driven decisions on lending.

"Loan.csv" file includes data on 40,000 FNB customers that were granted a loan in the past and the respective outcome, i.e. whether they were identified as "write-offs" or "not write-offs".

Using this data, analyse the new loan applications whose data are provided in "NewApplications.csv" and try to predict whether each new applicant will repay the requested loan if it is approved.

Data available:
1. Gender: M=male; F=female
2. Age: an integer parameter
3. marital_status: widowed; married; single; divorced
4. education: basic; highsch; univ; postgrad
5. nb_depend_child: number of dependent children (0,1,2,3)
6. employ_status: employment status (full_time; part_time; unemployed; self_employ; retired)
7. yrs_current_job: years at the current employment
8. yrs_employed: total number of years employed so far
9. net_income: an integer parameter
10. spouse_work: yes; no
11. spouse_income: if the spouse works, what is his/her income?
12. residential_status: home owner (owner); tenant; home owner with a mortgage (owner_morg); living with parents (w_parents)
13. yrs_current_address: years at the current address
14. loan_amount: an integer parameter
15. loan_purpose: debt consolidation (debt_consol); wedding; home improvement (home_improv); vehicle; holidays; other
16. loan_length: the duration of the loan
17. collateral: yes; no
18. writeoff: yes; no

Your report should contain:
1. an investigation of the data and a summary of your descriptive analyses; (18 pts)
2. a discussion on the pros and cons of the prediction methods that can be used to address FNB's loan appraisal problem; (6 pts)
3. a brief description (and the assumptions made, if any) of how selected prediction methods are applied; (7 pts)
4. R codes developed; (12 pts)
5. an evaluation of the results obtained by each prediction method tried on the data; (20 pts)
6. a comparative analysis of the results; (20 pts)
7. your final recommendation to FNB on which customers should be granted loan; (10 pts)
8. a discussion on any additional data that you think would be useful, if collected, to make better predictions in the future. (7 pts)
Answered 13 days after Dec 01, 2022


Subhanbasha answered on Dec 15 2022
47 Votes
FNB Bank – prediction model
Data and summary
The data describe FNB Bank customers who were granted a loan in the past and record whether each loan was written off. Here I used descriptive and predictive analytics to find patterns in the past behaviour of these customers and used them to predict, for each new applicant, whether a loan should be granted.
Inspecting the collected data shows a mix of numerical and categorical variables. There are 40,000 customer records and 18 columns in total, which provide the main inputs to the model. The variables fall into the following types.
Character variables:
· Gender
· marital_status
· education
· employ_status
· spouse_work
· residential_status
· loan_purpose
· collateral
· writeoff
Numerical/integer variables:
· age
· nb_depend_child
· yrs_current_job
· yrs_employed
· net_income
· spouse_income
· yrs_current_address
· loan_amount
· loan_length
In the data, age ranges from a minimum of 20 to a maximum of 65. The maximum number of dependent children is 3, and the maximum number of years in the current job is 25. The average net income is 42956 and the maximum is 178500. The average spouse's income is 10266 and the maximum is 167298. The average loan amount is 30702 and the maximum is 272343. The average loan tenure is 37 months and the maximum is 96 months.
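A summary of this kind could be produced along the following lines. This is only a minimal sketch, written in Python for illustration (the brief itself asks for R). The three-row sample and the helper names `is_number`, `column_kinds` and `summarise` are hypothetical; only the column names come from the assignment, and none of the values are taken from Loan.csv.

```python
import csv
import io
import statistics

# Hypothetical three-row sample using a subset of the Loan.csv column names.
# These values are made up for illustration only.
SAMPLE = """age,net_income,loan_amount,writeoff
25,30000,12000,no
40,45000,30000,yes
61,52000,8000,no
"""

def is_number(text):
    """Return True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def column_kinds(rows):
    """Label a column 'numeric' if every non-empty value parses as a number."""
    kinds = {}
    for name in rows[0]:
        values = [r[name] for r in rows if r[name] != ""]
        all_numeric = bool(values) and all(is_number(v) for v in values)
        kinds[name] = "numeric" if all_numeric else "categorical"
    return kinds

def summarise(rows, column):
    """Minimum, mean and maximum of one numeric column."""
    values = [float(r[column]) for r in rows if r[column] != ""]
    return {"min": min(values), "mean": statistics.mean(values), "max": max(values)}

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kinds = column_kinds(rows)
age_summary = summarise(rows, "age")
print(kinds)
print(age_summary)
```

With the real file, the same functions could be applied to rows read by `csv.DictReader` from Loan.csv, assuming its header names match those listed in the brief.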
Prediction Methods
We have historical data on customers covering personal, household, employment and loan details, together with the outcome of each loan. Several machine learning algorithms can be used to build a model from this data and make predictions. Because the outcome to be predicted (writeoff: yes or no) is a category, this is a classification problem, so classification algorithms are used.
The methods used for prediction are as follows:
· Decision Trees
· Random Forest
· Naïve Bayes
· Support Vector Machine
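Whichever of these methods is applied, its predictions of writeoff have to be compared with the known outcomes. As a rough sketch (in Python for illustration, although the brief asks for R), the confusion counts and accuracy of a baseline that always predicts the majority class could be computed as follows. The labels are made up and the helper names `confusion_counts` and `accuracy` are hypothetical.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives, treating `positive` as a write-off."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Share of predictions that match the actual outcome."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0

# Hypothetical outcomes, not taken from Loan.csv.
actual = ["no", "no", "yes", "no", "yes"]

# Baseline: always predict the most common class in the data.
majority = Counter(actual).most_common(1)[0][0]
baseline_pred = [majority] * len(actual)

counts = confusion_counts(actual, baseline_pred)
baseline_accuracy = accuracy(counts)
print(counts, baseline_accuracy)
```

A model is only useful if it improves on this baseline. The baseline here also misses every write-off (all of them are counted as false negatives), which is why accuracy alone can be a misleading measure for loan appraisal.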
Decision Trees:
Pros:
· It takes relatively little time and effort to build, and the algorithm is straightforward.
· It is easy for users to understand.
· The algorithm does not require the raw data to be scaled or normalised.
· Missing values have relatively little effect on the process of building the tree.
Cons:
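· A small change in the data can cause a large change in the structure of the tree.
· Decision trees are less well suited to continuous variables: continuous inputs have to be split at thresholds, and a single tree predicts continuous values only in coarse steps.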
· This will take some time to make...
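To illustrate the point about continuous variables, the sketch below shows one common way a tree can choose a threshold on a continuous input: it tries the midpoints between sorted values and keeps the one with the lowest weighted Gini impurity. This is a simplified illustration in Python (the brief asks for R), not the method used in the answer. The income values and labels are made up, and the function names `gini` and `best_threshold` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Returns (None, parent_gini) if no split improves on the unsplit data.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2  # candidate threshold: midpoint of neighbouring values
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data, not taken from Loan.csv.
net_income = [18000, 22000, 25000, 40000, 52000, 61000]
writeoff = ["yes", "yes", "yes", "no", "no", "no"]

threshold, impurity = best_threshold(net_income, writeoff)
print(threshold, impurity)
```

Because the threshold depends on exactly which values appear in the sample, adding or relabelling a few rows can move the chosen split, which is one source of the instability listed among the cons.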