{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "LBdSH0Yg6XoH" }, "source": [ "Load Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "52VtTLi0EE7r" },...

You need to write a research report on a machine learning project. The analysis has been done, and the script for this project is uploaded as [MLProject.ipynb]. Please review and check it, and feel free to modify it. For information about this project, please see the uploaded document [Project_Proposal.pdf]. The other uploaded document is a template; you can refer to it and follow it when writing the report. Do not include any code in the report, and write at least 6 pages excluding appendices. Thank you!


{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "LBdSH0Yg6XoH" }, "source": [ "Load Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "52VtTLi0EE7r" }, "outputs": [], "source": [ "import pandas as pd\n", "import seaborn as sns\n", "import numpy as np\n", "from collections import Counter\n", "import matplotlib.pyplot as plt\n", "from matplotlib.pyplot import figure\n", "from pickle import dump\n", "from pickle import load" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "btQItuEQMuS_" }, "outputs": [], "source": [ "def warn(*args, **kwargs):\n", " pass\n", "import warnings\n", "warnings.warn = warn\n", "\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.neighbors import KNeighborsClassifier\n", "from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n", "from sklearn.decomposition import PCA\n", "from sklearn.naive_bayes import GaussianNB\n", "from sklearn.svm import SVC\n", "from sklearn.model_selection import KFold\n", "from sklearn.model_selection import cross_val_score\n", "from sklearn.model_selection import cross_validate\n", "from sklearn.metrics import cohen_kappa_score\n", "from sklearn.metrics import confusion_matrix\n", "from sklearn.metrics import classification_report\n", "from sklearn.metrics import plot_confusion_matrix\n", "from sklearn import tree\n", "from sklearn.metrics import accuracy_score\n", "from sklearn.metrics import roc_curve\n", "from sklearn.metrics import roc_auc_score\n", "from sklearn.metrics import plot_roc_curve\n", "from sklearn.metrics import precision_recall_curve\n", "from sklearn.metrics import f1_score\n", "from sklearn.metrics import auc\n", "from sklearn.metrics import precision_score\n", "from sklearn.metrics import recall_score\n", "from sklearn.metrics import make_scorer" ] }, { "cell_type": "code", 
"execution_count": 3, "metadata": { "id": "DQ4gQmpQMv7X" }, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split, GridSearchCV, RepeatedStratifiedKFold\n", "from sklearn.feature_selection import RFECV, SelectKBest\n", "from imblearn.under_sampling import RandomUnderSampler\n", "from imblearn.over_sampling import SMOTE\n", "from sklearn.preprocessing import StandardScaler\n", "from imblearn.pipeline import Pipeline" ] }, { "cell_type": "markdown", "metadata": { "id": "EvaThE8LQkIH" }, "source": [ "# Python Project Template\n", "# 1. Prepare Problem\n", "# a) Load libraries\n", "# b) Load dataset\n", "# 2. Summarize Data\n", "# a) Descriptive statistics\n", "# b) Data visualizations\n", "# 3. Prepare Data\n", "# a) Data Cleaning\n", "# b) Feature Selection\n", "# c) Data Transforms\n", "# 4. Evaluate Algorithms\n", "# a) Split-out validation dataset\n", "# b) Test options and evaluation metric\n", "# c) Spot Check Algorithms\n", "# d) Compare Algorithms\n", "# 5. Improve Accuracy\n", "# a) Algorithm Tuning\n", "# b) Ensembles\n", "# 6. Finalize Model\n", "# a) Predictions on validation dataset\n", "# b) Create standalone model on entire training dataset\n", "# c) Save model for later use" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Load dataset" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "2wUOXeYtEhIH" }, "outputs": [], "source": [ "data = pd.read_csv(\"data.csv\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 372 }, "id": "BjTO4L74FiXn", "outputId": "c87a1064-600d-4126-cf48-fcfdbec4dd65" }, "outputs": [ { "data": { "text/html": [ "
Output preview (first row of the DataFrame, index 0; middle columns elided as "..."):

Bankrupt?: 1
ROA(C) before interest and depreciation before interest: 0.370594
ROA(A) before interest and % after tax: 0.424389
ROA(B) before interest and depreciation after tax: 0.405750
Operating Gross Margin: 0.601457
Realized Sales Gross Margin: 0.601457
Operating Profit Rate: 0.998969
Pre-tax net Interest Rate: 0.796887
After-tax net Interest Rate: 0.808809
Non-industry income and expenditure/revenue: 0.302646
...
Net Income to Total Assets: 0.716845
Total assets to GNP price: 0.009219
No-credit Interval: 0.622879
Gross Profit to Sales, Net Income to Stockholder's Equity, Liability to Equity, Degree of Financial Leverage (DFL), Interest Coverage Ratio (Interest expense to EBIT), Net Income Flag, Equity to Liability: ...

Answered 2 days after May 05, 2021


Deepti answered on May 08 2021
Abstract
Researchers worldwide have studied the prediction of companies' financial health. Many prediction models, mostly focused on the national economy, have been developed in the US. The main purpose of this paper is to propose a prediction model that helps investors predict whether a particular business is nearing bankruptcy, given the condition of the economy after the global pandemic, and thus decide on investment. The paper proposes a prediction model based on different algorithms using a historical dataset. The results of the analysis can be used for early prediction of the financial standing of companies irrespective of their size, which would be useful to investors.
Keywords: Prediction model, machine learning, logistic regression, prediction algorithms, dataset visualization, random sampling, spot checking, ROC curve
Introduction
This paper proposes a model that would help business investors, whether their acumen is weak or strong, to assess substantial opportunities. The prediction of whether a particular business is heading for bankruptcy is based on the analysis and evaluation of the results achieved by the proposed model. The model is intended to point out threats in advance and so support investors in their decision making. Such analysis should be carried out for companies of all sizes, regardless of business activity or other specifics, during the pandemic. With the development of machine learning techniques, genetic algorithms are proving beneficial in financial prediction. These algorithms can accurately classify whether a particular investment would be prosperous or non-prosperous.
The paper is organized as follows. The literature review describes the six methods used by researchers in predicting bankruptcy. The methodology describes the technique used and the main characteristics of the code. Following this, the results of the prediction model for small and large companies are described, based on the machine learning algorithms. The conclusion summarizes the results, discusses the strengths and weaknesses of the model, and recommends further tasks as future work.
Methodology
The relevant libraries were loaded, along with a dataset drawn from the Taiwan economy. The methods used in the assessment include logistic regression (LR), k-NN, Linear Discriminant Analysis, Classification Trees, Support Vector Machines and Gaussian Naïve Bayes. Correctly predicting and interpreting the indicators of a company's financial development by analyzing data with these methods is the essence of the prediction task. The LR method was used to create the model that predicts the probability of a company failure using the Taiwan data set. Each of the methods was aimed at achieving a high ability to predict the future financial troubles of a particular company on the basis of the datasets used for such analysis. The detailed characteristics were used, and classification was achieved by adding the estimated probability of financial difficulties as a new variable. The benefit of this research would be creation of...
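The spot check of the six methods and the added probability variable described above might look roughly like the following. This is a minimal sketch rather than the notebook's actual code: the synthetic data from `make_classification`, the column names `ratio_i` and `bankruptcy_probability`, the 5-fold stratified split, the ROC AUC metric and all hyperparameters are assumptions chosen only for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the financial-ratio data (assumption:
# the real project reads data.csv with a "Bankrupt?" target column).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
features = pd.DataFrame(X, columns=[f"ratio_{i}" for i in range(X.shape[1])])

# The six methods named in the methodology, with illustrative settings.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "k-NN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "NB": GaussianNB(),
}

# Spot check: mean cross-validated ROC AUC per model, with scaling
# fitted inside each fold to avoid leaking test-fold statistics.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    fold_scores = cross_val_score(pipeline, features, y, cv=cv, scoring="roc_auc")
    scores[name] = fold_scores.mean()

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: mean ROC AUC = {score:.3f}")

# Fit logistic regression and add the estimated probability of the
# positive (bankrupt) class as a new variable.
lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
lr.fit(features, y)
result = features.assign(bankruptcy_probability=lr.predict_proba(features)[:, 1])
```

ROC AUC is used here instead of plain accuracy because bankrupt companies are a small minority class, so a model that never predicts bankruptcy could still score a high accuracy.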