Question

Peer-graded Assignment: The best classifier

Submit by Oct 31, 11:59 PM PDT

Submit your assignment soon

Even though your assignment is due on Oct 31, 11:59 PM PDT, try to submit it 1 or 2 days early if you can. Submitting early gives you a better chance of getting the peer reviews you need in time.

Instructions

Now that you have spent five weeks learning to use different machine learning algorithms, you will have the opportunity to practice and apply them to a dataset. In this project, you will complete a notebook in which you build a classifier to predict whether a loan case will be paid off or not.

You will load a historical dataset from previous loan applications, clean the data, and apply different classification algorithms to it. You are expected to use the following algorithms to build your models:

k-Nearest Neighbour

Decision Tree

Support Vector Machine

Logistic Regression
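
As a rough illustration of the four algorithms listed above, the sketch below fits each one with scikit-learn and records its accuracy on a held-out split. The synthetic dataset from `make_classification`, the 80/20 split, and every hyperparameter shown are assumptions made for the example; they are not taken from the assignment or from the loan data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (an assumption, not the real dataset).
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are illustrative, not tuned values.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model and record its accuracy on the held-out split.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Keeping the models in a dictionary makes it easy to evaluate all four with the same loop once the real features and labels are in place.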

The results are reported as the accuracy of each classifier, using the following metrics where applicable:

Jaccard index

F1-score

LogLoss
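
The three metrics above can be computed with `sklearn.metrics`. The sketch below uses made-up labels and probabilities (assumptions for illustration only). Log loss needs predicted probabilities rather than hard labels, which is presumably why the metrics are listed as applying only where applicable.

```python
from sklearn.metrics import f1_score, jaccard_score, log_loss

# Toy values, invented for this example.
y_true = [1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
y_prob = [0.9, 0.4, 0.2, 0.8, 0.6, 0.7]  # predicted P(class == 1)

# Jaccard index for the positive class: TP / (TP + FP + FN) = 3 / 5.
jaccard = jaccard_score(y_true, y_pred, pos_label=1)

# F1 averaged over both classes, weighted by class support.
f1 = f1_score(y_true, y_pred, average="weighted")

# Log loss is computed from probabilities, not from hard predictions.
ll = log_loss(y_true, y_prob)

print(f"Jaccard: {jaccard:.3f}")  # 0.600
print(f"F1:      {f1:.3f}")       # 0.667
print(f"LogLoss: {ll:.3f}")
```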

Review criteria

This final project will be graded by your peers who are completing this course during the same session. This project is worth 25 marks of your total grade, broken down as follows:

Building a model using KNN, finding the best k, and evaluating its accuracy (7 marks)

Building a model using Decision Tree and evaluating its accuracy (6 marks)

Building a model using SVM and evaluating its accuracy (6 marks)

Building a model using Logistic Regression and evaluating its accuracy (6 marks)
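
One common way to approach the "best k" criterion is to try a range of values and keep the one with the highest accuracy on a validation split. The sketch below is a minimal version of that idea; the synthetic data, the range 1 to 15, the 75/25 split, and the use of `StandardScaler` are all assumptions for the example, not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the loan features and labels (an assumption).
X, y = make_classification(n_samples=300, n_features=6, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# KNN is distance-based, so features are scaled; the scaler is fit on
# the training split only and then applied to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# Record validation accuracy for each candidate k.
accuracy_by_k = {}
for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train)
    accuracy_by_k[k] = knn.score(X_val_s, y_val)

# Pick the k with the highest validation accuracy (smallest k wins ties).
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
print(f"best k = {best_k}, validation accuracy = {accuracy_by_k[best_k]:.3f}")
```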

Pritam Kumar · Accepted Answer

Loan data classification task
  
    
Data downloaded from: https://www.kaggle.com/burak3ergun/loan-data-set
In [1]:
    
import pandas as pd
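# Raw string so the backslashes in the Windows path are not read as escapes ("\N" is a syntax error in Python 3)
data = pd.read_csv(r"D:\New\loan_data_set.csv")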
data.head()
    
    Out[1]:
		Loan_ID	Gender	Married	Dependents	Education	Self_Employed	ApplicantIncome	CoapplicantIncome	LoanAmount	Loan_Amount_Term	Credit_History	Property_Area	Loan_Status
	0	LP001002	Male	No	0	Graduate	No	5849	0.0	NaN	360.0	1.0	Urban	Y
	1	LP001003	Male	Yes	1	Graduate	No	4583	1508.0	128.0	360.0	1.0	Rural	N
	2	LP001005	Male	Yes	0	Graduate	Yes	3000	0.0	66.0	360.0	1.0	Urban	Y
	3	LP001006	Male	Yes	0	Not Graduate	No	2583	2358.0	120.0	360.0	1.0	Urban	Y
	4	LP001008	Male	No	0	Graduate	No	6000	0.0	141.0	360.0	1.0	Urban	Y
In [2]:
    
for col in data.columns:
    print(col)

Loan_ID
Gender
Married
Dependents
Education
Self_Employed
ApplicantIncome
CoapplicantIncome
LoanAmount
Loan_Amount_Term
Credit_History
Property_Area
Loan_Status
In [3]:
    
data.dtypes
    
    Out[3]:
Loan_ID               object
Gender                object
Married               object
Dependents            object
Education             object
Self_Employed         object
ApplicantIncome        int64
CoapplicantIncome    float64
LoanAmount           float64
Loan_Amount_Term     float64
Credit_History       float64
Property_Area         object
Loan_Status           object
dtype: object
Missing values handling
In [4]:
    
data = data.dropna()
data.head()
    
    Out[4]:
		Loan_ID	Gender	Married	Dependents	Education	Self_Employed	ApplicantIncome	CoapplicantIncome	LoanAmount	Loan_Amount_Term	Credit_History	Property_Area	Loan_Status
	1	LP001003	Male	Yes	1	Graduate	No	4583	1508.0	128.0	360.0	1.0	Rural	N
	2	LP001005	Male	Yes	0	Graduate	Yes	3000	0.0	66.0	360.0	1.0	Urban	Y
	3	LP001006	Male	Yes	0	Not Graduate	No	2583	2358.0	120.0	360.0	1.0	Urban	Y
	4	LP001008	Male	No	0	Graduate	No	6000	0.0	141.0	360.0	1.0	Urban	Y
	5	LP001011	Male	Yes	2	Graduate	Yes	5417	4196.0	267.0	360.0	1.0	Urban	Y
In [5]:
    
data.reset_index(drop=True, inplace=True)
data.head()
    
    Out[5]:
		Loan_ID	Gender	Married	Dependents	Education	Self_Employed	ApplicantIncome	CoapplicantIncome	LoanAmount	Loan_Amount_Term	Credit_History	Property_Area	Loan_Status
	0	LP001003	Male	Yes	1	Graduate	No	4583	1508.0	128.0	360.0	1.0	Rural	N
	1	LP001005	Male	Yes	0	Graduate	Yes	3000	0.0	66.0	360.0	1.0	Urban	Y
	2	LP001006	Male	Yes	0	Not Graduate	No	2583	2358.0	120.0	360.0	1.0	Urban	Y
	3	LP001008	Male	No	0	Graduate	No	6000	0.0	141.0	360.0	1.0	Urban	Y
	4	LP001011	Male	Yes	2	Graduate	Yes	5417	4196.0	267.0	360.0	1.0	Urban	Y
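
Before dropping rows, it can help to see how many values are missing in each column and how many rows the drop will cost. The sketch below shows this on a small hand-made frame; the `toy` data and the variable names are hypothetical and only borrow three column names from the dataset above.

```python
import numpy as np
import pandas as pd

# Tiny hand-made frame with two missing values (not the real loan data).
toy = pd.DataFrame({
    "Loan_ID": ["LP1", "LP2", "LP3", "LP4"],
    "LoanAmount": [np.nan, 128.0, 66.0, 120.0],
    "Credit_History": [1.0, 1.0, np.nan, 1.0],
})

# Count missing values per column before deciding how to handle them.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Drop every row with at least one missing value, then renumber from 0.
cleaned = toy.dropna().reset_index(drop=True)
rows_dropped = len(toy) - len(cleaned)
print(f"dropped {rows_dropped} of {len(toy)} rows")
```

Dropping rows is the simplest option; if the share of dropped rows turns out to be large, imputing values is an alternative worth considering.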
Data preprocessing
In [6]:
    
from sklearn import preprocessing
    
In [7]:
    
df = pd.DataFrame(data['Gender'])
    
In [8]:
    
le = preprocessing.LabelEncoder()
le.fit(df['Gender'])
    
    Out[8]:
LabelEncoder()
In [9]:
    
df['Gender'] = le.transform(df['Gender'])
    
In [10]:
    
# Reuse the same encoder for 'Married'; refitting replaces the classes learned for 'Gender'
df['Married'] = data['Married']
le.fit(df['Married'])
df['Married'] = le.transform(df['Married'])
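
The cells above refit a single `LabelEncoder` for each column, so the mapping for an earlier column is lost once the next one is fitted. A possible alternative, sketched below on a hypothetical three-row frame, keeps one encoder per column so that codes can be mapped back to the original labels later. The `toy` frame and the `encoders` dictionary are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample using three of the categorical column names.
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Married": ["Yes", "No", "Yes"],
    "Education": ["Graduate", "Not Graduate", "Graduate"],
})

# Keep a separate fitted encoder for every column.
encoders = {}
encoded = pd.DataFrame(index=toy.index)
for col in toy.columns:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(toy[col])

print(encoded)

# Classes are sorted alphabetically, so "Female" -> 0 and "Male" -> 1.
original_gender = encoders["Gender"].inverse_transform(encoded["Gender"])
print(list(original_gender))  # ['Male', 'Female', 'Male']
```

scikit-learn documents `LabelEncoder` as intended for target labels; `OrdinalEncoder` is its counterpart for feature columns and can encode several columns in one call.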
    
In [11]:
