Answer To: Introduction For this week’s take-home lab, you will work on the same data set from Week 4 Take-Home...
Santosh Vasant answered on Feb 13 2022
Q1. Summarize the model/feature selection process used to fit your SVM model
A correlation matrix was used to identify highly correlated features, and features with a high correlation coefficient (>0.85) were removed. Of the 24 initial features, 6 were found to be highly correlated, so 18 features were used to build the model.
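The correlation filter described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used for the lab. The helper names (pearson, drop_correlated), the toy values, and the greedy keep-the-first-seen rule are all assumptions; the source does not say which member of a correlated pair was dropped.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.85):
    """Greedy filter (an assumed rule): keep a feature only if its absolute
    correlation with every feature kept so far is at or below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data, invented for illustration only (not taken from the data set).
toy = {
    "BILL_AMT6": [11.0, 19.0, 31.0, 42.0, 49.0],
    "BILL_AMT5": [10.0, 20.0, 30.0, 40.0, 50.0],  # nearly collinear with BILL_AMT6
    "AGE": [35.0, 22.0, 41.0, 29.0, 30.0],
}
kept = drop_correlated(toy)
print(kept)  # ['BILL_AMT6', 'AGE']
```

Because the rule is greedy, the result depends on column order: whichever feature of a correlated pair comes first is the one retained.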
The continuous features, such as LIMIT_BAL, AGE, BILL_AMT* and PAY_AMT*, were scaled using the scaler function of the dplyr library. Categorical features such as SEX, MARRIAGE and EDUCATION were kept as they are.
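As a rough illustration of the scaling step, the sketch below standardizes one continuous column to zero mean and unit standard deviation. It assumes that "scaled" means z-score standardization, which the source does not state, and it is written in Python rather than R; the function name standardize and the sample values are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Z-score a numeric column: subtract the mean, divide by the sample
    standard deviation (n - 1 denominator). Assumes the column is not constant."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical LIMIT_BAL values, for illustration only.
limit_bal = [20000.0, 120000.0, 90000.0, 50000.0]
scaled = standardize(limit_bal)
print([round(v, 3) for v in scaled])
```

Putting the continuous features on a common scale matters for an SVM because the kernel is computed from distances or inner products, so a feature with a large numeric range, such as a credit limit, would otherwise dominate.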
The following input features were considered:
LIMIT_BAL,
AGE,
BILL_AMT6,
PAY_AMT1, PAY_AMT2, PAY_AMT3, PAY_AMT4, PAY_AMT5, PAY_AMT6,
SEX,
EDUCATION,
MARRIAGE,
PAY_0, PAY_2, PAY_3, PAY_4, PAY_5, PAY_6
The correlation plot is shown in Figure 1.
Figure 1: Correlation plot of all the features and the target variable
The data were split into a training set and a test set in an 8:2 ratio. Repeated 10-fold cross-validation with 2 repeats was then used. Hyperparameters were tuned using the train function of caret, with two levels of model parameter values.
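The resampling scheme above (an 8:2 split followed by 10-fold cross-validation repeated twice) can be sketched by generating row indices. This is a minimal sketch in Python using only the standard library, not the caret call used for the lab. The function names, the seed, and the row count of 100 are hypothetical, and applying cross-validation to the training portion only is an assumption.

```python
import random

def split_train_test(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of them as the test set."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = round(n_rows * test_fraction)
    return idx[n_test:], idx[:n_test]

def repeated_kfold(indices, k=10, repeats=2, seed=0):
    """Yield (training, validation) index lists for k-fold CV, repeated
    `repeats` times with a fresh shuffle for each repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            training = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield training, folds[i]

train_idx, test_idx = split_train_test(100)   # 80 training rows, 20 test rows
splits = list(repeated_kfold(train_idx))      # 10 folds x 2 repeats = 20 resamples
print(len(train_idx), len(test_idx), len(splits))  # 80 20 20
```

Each candidate hyperparameter setting would be fitted on every training part and scored on the matching validation part, with the scores averaged over the 20 resamples; the held-out test rows are only used once the setting has been chosen.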
Q2. Provide a summary of a fitted SVM model.
Initially, the problem was solved as regression; however, as the target variable is...