For the following assignments, please provide as much evidence of the results as possible, including the code, screenshots (only plots – not text or code) and documentation. Submit only one pdf file...

Please find instructions for this assignment in the Word document. You need to complete all questions; for short-answer questions, please provide explanations in words.






For the following assignments, please provide as much evidence of the results as possible, including the code, screenshots (only plots – not text or code) and documentation. Submit only one PDF file and the .ipynb / .py files containing the code with documentation.

1.a. [20 points] Choose one of the cleaned datasets at https://www.kaggle.com/annavictoria/ml-friendly-public-datasets. Split it into training and test data. Write from scratch and apply any ML algorithm that you learned in class to this dataset. You can use Python to implement it. For the implementation, you may use any classes, modules, and functions in Python libraries such as NumPy for math and linear algebra operations, but you may not use ML classes or functions directly. Apply another algorithm that you learned to the same dataset. For this one, you are free to implement it from scratch or to use the ML classes and functions directly from ML packages. Which of the two algorithms fares better? Use as many evaluation metrics as possible to discuss the performance of the algorithms. Write your comments in your script.

1.b. [5 points] Derive an equation for accuracy in terms of specificity and sensitivity. In addition to accuracy, specificity and sensitivity, the equation can include quantities such as the number of true positives or the number of false positives. Give an interpretation of the equation.

2.(a) [15 points] Assume we have only two features in our dataset. The transposes of the feature vectors are the first 10 consecutive pairs of primes: [2 3], [5 7], …, [67 71]. For k=2, show step by step (either manually or programmatically) the iterations of k-means clustering when the centroids are initialized to (i) [2 3] and [5 7] and to (ii) [2 3] and [67 71]. Compare and comment on the results in both cases.
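The iterations asked for in 2.(a) can be traced programmatically. The following is a minimal sketch, assuming plain Lloyd-style k-means with squared Euclidean distance; the function name `kmeans` and its parameters are illustrative choices, not part of the assignment or of any library.

```python
import numpy as np

# The ten points from question 2.(a): consecutive pairs of the first 20 primes.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
          31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
X = np.array(PRIMES, dtype=float).reshape(-1, 2)  # rows: [2 3], [5 7], ..., [67 71]


def kmeans(points, init_centroids, max_iter=100, verbose=False):
    """Plain Lloyd iterations (hypothetical helper); returns (labels, centroids, n_iterations)."""
    centroids = np.array(init_centroids, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for iteration in range(1, max_iter + 1):
        # Assignment step: squared Euclidean distance from every point to every centroid.
        sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        if verbose:
            print(f"iteration {iteration}: labels={labels.tolist()}, "
                  f"centroids={new_centroids.round(3).tolist()}")
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    return labels, centroids, iteration


labels_i, centroids_i, n_i = kmeans(X, [[2, 3], [5, 7]], verbose=True)       # case (i)
labels_ii, centroids_ii, n_ii = kmeans(X, [[2, 3], [67, 71]], verbose=True)  # case (ii)
```

With these ten points, the sketch settles on a 5/5 split from initialization (i) and a 6/4 split from initialization (ii), which illustrates how sensitive k-means is to the starting centroids.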
2.(b) [10 points] The K-means algorithm is applied to the wines dataset in the tutorial available at https://www.kaggle.com/xvivancos/clustering-wines-with-k-means. However, K-means assumes that the mean is representative of the cluster. In the real world, though, it is most often the “most” vociferous, “most” influential, “most” wealthy, or some other “most” ____ person who gets elected to represent the people of a constituency. Using visualization tools (as in the tutorial) and words, explain what difference it makes if the clustering algorithm is based on the “mode” instead of the “mean”, on the above wines dataset or on any categorical dataset such as https://www.kaggle.com/sl6149/data-scientist-job-market-in-the-us. Based on the cluster analysis, explain whether the mode, which is also a measure of central tendency, can represent a real-world cluster of similar data points. You can use the k-modes algorithm described here: https://www.kaggle.com/ashydv/bank-customer-clustering-k-modes-clustering. If you are using the wine dataset, you may have to convert it into a categorical dataset by binning for the results to make sense.
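To make the mode-based idea in 2.(b) concrete, here is a minimal from-scratch sketch of k-modes, assuming Hamming distance (number of mismatching attributes) and column-wise modes as cluster representatives. It is not the implementation used in the linked tutorial; the helper names and the six-row toy table of binned attributes are hypothetical.

```python
import numpy as np


def column_modes(rows):
    """Most frequent value in each column (ties go to the first value in sorted order)."""
    modes = []
    for col in rows.T:
        values, counts = np.unique(col, return_counts=True)
        modes.append(values[counts.argmax()])
    return np.array(modes)


def kmodes(data, init_idx, max_iter=20):
    """Hypothetical minimal k-modes: returns (labels, modes)."""
    modes = data[list(init_idx)].copy()  # initial modes are copies of chosen rows
    labels = None
    for _ in range(max_iter):
        # Hamming distance from every row to every mode.
        dist = (data[:, None, :] != modes[None, :, :]).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Replace each mode by the column-wise mode of its members.
        for j in range(len(modes)):
            members = data[labels == j]
            if len(members) > 0:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical binned data: three categorical attributes per row.
toy = np.array([
    ["low", "dry", "red"],
    ["low", "dry", "red"],
    ["low", "sweet", "red"],
    ["high", "sweet", "white"],
    ["high", "sweet", "white"],
    ["high", "dry", "white"],
])
labels, modes = kmodes(toy, init_idx=(0, 3))
print(labels.tolist())
print(modes.tolist())
```

Unlike a mean, each resulting mode is built from attribute values that actually occur in the cluster, which is the property the question asks you to assess.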
Answered 2 days after Apr 20, 2021

Vicky answered on Apr 23 2021
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1.a. [20 points]\n",
"Choose one of the cleaned datasets at https://www.kaggle.com/annavictoria/ml-friendly-public-datasets. Split it into training and test data. Write from scratch and apply any ML algorithm that you learned in the class to this dataset. You can use Python to implement it.\n",
"\n",
"For the implementation, you may use any classes, modules, and functions in Python libraries such as NumPy to do various math / linear algebra operations, but not use the ML classes or functions directly.\n",
"\n",
"Apply another algorithm that you learned to the same dataset. For this one, you are free to implement it from scratch or use the ML class and functions directly from the ML packages.\n",
"\n",
"Which one of the algorithms fares better? Use as many evaluation metrics as possible to discuss the performance of the algorithms. Write down your comments in your script.\n"
]
},
{
"cell_type": "code",
"execution_count": 172,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import sklearn.metrics as metrics\n",
"from sklearn.metrics import confusion_matrix, classification_report\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.preprocessing import LabelEncoder\n",
"from sklearn.model_selection import train_test_split\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" ACTIVITY TIME SL EEG BP HR CIRCLUATION\n",
"0 3 4722.92 4019.64 -1600.00 13 79 317\n",
"1 2 4059.12 2191.03 -1146.08 20 54 165\n",
"2 2 4773.56 2787.99 -1263.38 46 67 224\n",
"3 4 8271.27 9545.98 -2848.93 26 138 554\n",
"4 4 7102.16 14148.80 -2381.15 85 120 809"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = pd.read_csv('falldeteciton.csv')\n",
"data.head()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 16382 entries, 0 to 16381\n",
"Data columns (total 7 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 ACTIVITY 16382 non-null int64 \n",
" 1 TIME 16382 non-null float64\n",
" 2 SL 16382 non-null float64\n",
" 3 EEG 16382 non-null float64\n",
" 4 BP 16382 non-null int64 \n",
" 5 HR 16382 non-null int64 \n",
" 6 CIRCLUATION 16382 non-null int64 \n",
"dtypes: float64(3), int64(4)\n",
"memory usage: 896.0 KB\n"
]
}
],
"source": [
"data.info()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" count mean std min 25% \\\n",
"ACTIVITY 16382.0 2.361494 1.739195 0.000000e+00 0.000 \n",
"TIME 16382.0 10936.842478 5261.463601 1.954230e+03 7263.685 \n",
"SL 16382.0 75271.979925 127502.241688 4.222420e+01 9941.170 \n",
"EEG 16382.0 -5621.125313 108220.924420 -1.262600e+07 -5630.000 \n",
"BP 16382.0 58.251068 48.292926 0.000000e+00 25.000 \n",
"HR 16382.0 211.536992 129.949521 3.300000e+01 119.000 \n",
"CIRCLUATION 16382.0 2894.341472 3825.927830 5.000000e+00 587.000 \n",
"\n",
" 50% 75% max \n",
"ACTIVITY 3.000 4.00 5.0 \n",
"TIME 9769.355 13481.65 50895.5 \n",
"SL 31189.200 80761.45 2426140.0 \n",
"EEG -3361.275 -2150.00 1410000.0 \n",
"BP 44.000 78.00 533.0 \n",
"HR 180.000 271.00 986.0 \n",
"CIRCLUATION 1581.000 3539.00 52210.0 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.describe().T"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(16382, 7)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.shape"
]
},
{
"cell_type": "code",
"execution_count": 187,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "[image data omitted: count plot of ACTIVITY classes]",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# ACTIVITY distribution\n",
"plt.figure(figsize=(8,5))\n",
"sns.countplot(x=\"ACTIVITY\", data=data)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
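"# Use the six sensor readings as features and ACTIVITY as the class label\n",
"X = data[['TIME','SL','EEG','BP','HR','CIRCLUATION']]\n",
"y = data['ACTIVITY']\n",
"\n",
"# Hold out 30% of the rows for testing; fix the seed for reproducibility\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)"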
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Logistic Regression"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"D:\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\_logistic.py:764: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
"\n",
"Increase the number of iterations (max_iter) or scale the data as shown in:\n",
" https://scikit-learn.org/stable/modules/preprocessing.html\n",
"Please also refer to the documentation for alternative solver options:\n",
" https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
" extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)\n"
]
}
],
"source": [
"lr_clf = LogisticRegression().fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 3, 0, ..., 3, 0, 3], dtype=int64)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_pred=lr_clf.predict(X_test)\n",
"y_pred"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.3709053916581892"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.accuracy_score(y_test, y_pred)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" precision recall f1-score support\n",
"\n",
" 0 0.49 0.74 0.59 1372\n",
" 1 0.00 0.00 0.00 144\n",
" 2 0.00 0.00 0.00 716\n",
" 3 0.28 0.71 0.40 1115\n",
" 4 0.88 0.01 0.01 1081\n",
" 5 0.00 0.00 0.00 487\n",
"\n",
" accuracy 0.37 4915\n",
" macro avg 0.27 0.24 0.17 4915\n",
"weighted avg 0.39 0.37 0.26 4915\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"D:\\Anaconda3\\lib\\site-packages\\sklearn\\metrics\\_classification.py:1221: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n",
" _warn_prf(average, modifier, msg_start, len(result))\n"
]
}
],
"source": [
"report = metrics.classification_report(y_test, y_pred)\n",
"print(report)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "[image data omitted: confusion-matrix heatmap]",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(10,8))\n",
"cf_matrix = metrics.confusion_matrix(y_test, y_pred)\n",
"sns.heatmap(cf_matrix, fmt='.4g', annot=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Decision Trees"
]
},
{
"cell_type": "code",
"execution_count": 188,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import cross_val_score\n",
"from sklearn import metrics\n",
"import sklearn.tree as dt\n",
"from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor, export_graphviz\n",
"from sklearn.model_selection import GridSearchCV"
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {},
"outputs": [],
"source": [
"clf_tree = DecisionTreeClassifier( max_depth = 5)\n",
"clf_tree=clf_tree.fit( X_train, y_train )"
]
},
{
"cell_type": "code",
"execution_count": 190,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 4, 0, ..., 3, 0, 4], dtype=int64)"
]
},
"execution_count": 190,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_pred=clf_tree.predict(X_test)\n",
"y_pred"
]
},
{
"cell_type": "code",
"execution_count": 191,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.43662258392675485"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.accuracy_score(y_test, y_pred)"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {},
"outputs": [],
"source": [
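"# Fine-tune the tree depth and the number of features considered per split\n",
"# (X has six feature columns, so max_features values above 6 are out of range)\n",
"param_grid = {'max_depth': np.arange(3, 9),\n",
" 'max_features': np.arange(3,9)}"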
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"GridSearchCV(cv=5, estimator=DecisionTreeClassifier(),\n",
" param_grid={'max_depth': array([3, 4, 5, 6, 7, 8]),\n",
" 'max_features': array([3, 4, 5, 6, 7, 8])})"
]
},
"execution_count": 193,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree = GridSearchCV(DecisionTreeClassifier(), param_grid, cv = 5)\n",
"tree.fit( X_train, y_train )"
]
},
{
"cell_type": "code",
"execution_count": 194,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5619603425154682"
]
},
"execution_count": 194,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree.best_score_"
]
},
{
"cell_type": "code",
"execution_count": 195,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DecisionTreeClassifier(max_depth=8, max_features=4)"
]
},
"execution_count": 195,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree.best_estimator_"
]
},
{
"cell_type": "code",
"execution_count": 196,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'max_depth': 8, 'max_features': 4}"
]
},
"execution_count": 196,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree.best_params_"
]
},
{
"cell_type": "code",
"execution_count": 197,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DecisionTreeClassifier(max_depth=8, max_features=5)"
]
},
"execution_count": 197,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
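"# Refit with max_depth=8; note that max_features=5 is used here, whereas the grid search above reported max_features=4\n",
"clf_tree = DecisionTreeClassifier( max_depth = 8, max_features=5 )\n",
"clf_tree.fit( X_train, y_train )"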
]
},
{
"cell_type": "code",
"execution_count": 198,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([4, 4, 0, ..., 4, 0, 4], dtype=int64)"
]
},
"execution_count": 198,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_pred=clf_tree.predict(X_test)\n",
"y_pred"
]
},
{
"cell_type": "code",
"execution_count": 199,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5749745676500508"
]
},
"execution_count": 199,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.accuracy_score(y_test, y_pred)"
]
},
{
"cell_type": "code",
"execution_count": 200,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 200,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png":...