Hello, I have another assignment (statistics coded in Python) for you. It is due November 8. Please advise on price. BTW, I really liked how you did the last one; it helped me understand the concepts, which will greatly help in my upcoming exam. I have attached the file. - Ed
Answered Same Day Oct 31, 2021

Kshitij answered on Nov 03 2021
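The notebook below builds squared, square-root, exponential and log copies of the data so that a linear relation can be picked out visually. As a rough complement, the same choice can be sanity-checked numerically by fitting a one-predictor least-squares line to each candidate and comparing R-squared. This is only a sketch under stated assumptions: the data are synthetic (not the assignment CSV), only y is transformed, and `fit_ols`, `candidates`, `results` and `best` are names made up for illustration.

```python
import numpy as np

def fit_ols(x, y):
    """One-predictor OLS via the closed form: returns (slope, intercept, r_squared)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    slope = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r_squared

# Synthetic stand-in (NOT the assignment data): y grows exponentially in x.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=100)
y = np.exp(0.3 * x + rng.normal(0.0, 0.05, size=100))

# Candidate transformations of y; x is left untransformed in this sketch.
candidates = {
    "y": y,
    "sqrt(y)": np.sqrt(y),
    "log(y)": np.log(y),
}

results = {name: fit_ols(x, transformed) for name, transformed in candidates.items()}
for name, (slope, intercept, r2) in results.items():
    print(f"{name:8s} slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")

# The transformation with the highest R^2 is the most linear by this criterion.
best = max(results, key=lambda name: results[name][2])
print("most linear candidate:", best)
```

R-squared alone does not check the OLS assumptions, so the diagnostic charts the assignment asks for are still needed. For the full OLS report, `statsmodels.api.OLS(y, X).fit().summary()` is a common choice, although the part of the notebook shown here does not import statsmodels.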
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Assignment 4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This assignment asks you to use resources at hand to apply module 6 - Linear Regression to several sets of data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Learning Outcomes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Exploratory analysis for regression\n",
"- Understand difference between linear and non-linear models\n",
"- Carry out OLS regression model\n",
"- Evaluate model\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Question 1**\n",
"\n",
"* For each data set in Assignment4_linear_regression_data.xlsx:\n",
"\n",
"- Create a scatter plot and visually decide if a linear model is appropriate (a matrix scatter plot would be most efficient).\n",
"\n",
"* If the relation is not linear, transform the data accordingly. \n",
" - Try logarithm, exponential, square root, square, etc., for X and/or Y until you see a linear relation. You only need to report the transformation chosen, not all the attempts. \n",
" Note: most of the time, you can guess visually. A systematic way is to create a matrix scatter plot of the different transformations. A generic way we did not cover is to use a Box-Cox transformation. \n",
" \n",
"* Create an OLS model for the original and transformed data if required. \n",
" - Evaluate if the OLS assumptions are met: normality of errors centered around zero, equal variance, etc., for the original data and transformed data if appropriate. \n",
"\n",
" - Comment on how the transformation impacted the different assumptions. (This should be done only by looking at the output diagnostic charts created by the software.)\n",
" - If datasets have outliers, remove the outliers and see the effect on the model (slope, intercept and R-squared).\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"The output of the assignment should be: \n",
"\n",
"- OLS full report for the original and transformed data if appropriate (only two datasets should need transformation).\n",
"\n",
"- A short comment on the validity of the linear assumptions for the original and transformed data set when appropriate (it should not need to be longer than a couple of sentences).\n",
"\n",
"- An interpretation of the slope and intercept in relation to the original data (i.e., if the model is linear, [intercept value] is the expected value when the independent variable is zero, etc.). If the model is not linear, you need to transform the equation back to its original form. \n",
"\n",
"Check out the following if you need further guidance:\n",
"http://www.bzst.com/2009/09/interpreting-log-transformed-variables.html\n",
"\n",
"https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqhow-do-i-interpret-a-regression-model-when-some-variables-are-log-transformed/\n",
"\n",
"https://stats.idre.ucla.edu/sas/faq/how-can-i-interpret-log-transformed-variables-in-terms-of-percent-change-in-linear-regression/\n",
"\n",
"https://stats.stackexchange.com/questions/266722/interpretation-of-linear-regression-results-where-dependent-variable-is-transfor\n",
"\n",
"- If the dataset has outliers, determine whether each outlier has leverage by comparing the OLS fit with and without the outlier.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"from scipy import stats\n",
"import math\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>y</th><th>x</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>38.858144</td><td>7.266278</td></tr>\n",
"    <tr><th>1</th><td>40.891148</td><td>7.985333</td></tr>\n",
"    <tr><th>2</th><td>48.971648</td><td>9.387120</td></tr>\n",
"    <tr><th>3</th><td>46.410124</td><td>9.382849</td></tr>\n",
"    <tr><th>4</th><td>25.333391</td><td>5.240903</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
" y x\n",
"0 38.858144 7.266278\n",
"1 40.891148 7.985333\n",
"2 48.971648 9.387120\n",
"3 46.410124 9.382849\n",
"4 25.333391 5.240903"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(\"Assignment4_linear_regresion_data1.csv\")\n",
"df = df[['y','x']]\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 100 entries, 0 to 99\n",
"Data columns (total 2 columns):\n",
"y 100 non-null float64\n",
"x 100 non-null float64\n",
"dtypes: float64(2)\n",
"memory usage: 1.6 KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>y</th><th>x</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>100.000000</td><td>100.000000</td></tr>\n",
"    <tr><th>mean</th><td>29.193214</td><td>5.809318</td></tr>\n",
"    <tr><th>std</th><td>13.186232</td><td>2.617712</td></tr>\n",
"    <tr><th>min</th><td>6.288716</td><td>1.163897</td></tr>\n",
"    <tr><th>25%</th><td>18.414725</td><td>3.873405</td></tr>\n",
"    <tr><th>50%</th><td>27.597232</td><td>5.586470</td></tr>\n",
"    <tr><th>75%</th><td>40.879564</td><td>7.996223</td></tr>\n",
"    <tr><th>max</th><td>50.887917</td><td>9.910843</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
" y x\n",
"count 100.000000 100.000000\n",
"mean 29.193214 5.809318\n",
"std 13.186232 2.617712\n",
"min 6.288716 1.163897\n",
"25% 18.414725 3.873405\n",
"50% 27.597232 5.586470\n",
"75% 40.879564 7.996223\n",
"max 50.887917 9.910843"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAATgAAAD8CAYAAADjcbh8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAEgVJREFUeJzt3X+sX/Vdx/Hniw4GDBCk0wkliFuNudHKkB9LdA5mpu0wIlQDKAxU7LKMzMyAg2i2pLNWlGiiwy3NVh1mjC34q38wqan8MqC2UctgC2uDW7jtlsE2wbGR8r337R/f0+3r3d39ntve29v74fkgn3C+n8/nnPP5prnvvD/nc873pKqQpBYds9QDkKTFYoCT1CwDnKRmGeAkNcsAJ6lZBjhJzTLASVpQSbYm+XKSx79Le5L8WZK9SR5Lct5I23VJ9nTlupH6n0jy6W6fP0uSPmMxwElaaH8FrJ2jfR2wuisbgA8CJPle4H3ARcCFwPuSnNbt88Gu78H95jr+txjgJC2oqnoI+OocXS4D7qyhfwVOTfIDwM8B/1RVX62qrwH/BKzt2k6pqkdr+GTCncAv9hnLKw7rm/Tw0rNP+ajEMnLCGW9c6iHoEAwO7Os1Zftu5vN3etyrX/t2htnUQVuqass8Tncm8PTI58mubq76yVnqx1r0ACepLV0wm09Am2m2YFyHUD+WU1RJMD3Vvxy+SeCskc+rgP1j6lfNUj+WAU4STA36l8O3DXhbt5r6BuC5qvoicB/ws0lO6xYXfha4r2v73yRv6FZP3wb8Q58TOUWVRNX0gh0ryceBi4GVSSYZroweOzxPfQi4F3grsBf4BvBrXdtXk7wf2NkdamNVHVyseAfD1dkTgE91ZfxYFvvnklxkWF5cZFieDneR4cDkp/svMqz6scM615FkBicJFjCDO5oY4CQt1OLBUccAJ8kMTlK7amFWR486BjhJMG0GJ6lVTlElNctFBknNMoOT1CwXGSQ1y0UGSa2q8hqcpFZ5DU5Ss5yiSmqWGZykZk29tNQjWBQGOElOUSU1zCmqpGaZwUlqlgFOUqvKRQZJzfIanKRmOUWV1CwzOEnNMoOT1CwzOEnNGviDl5JaZQYnqVleg5PULDM4Sc0yg5PULDM4Sc1qdBX1mKUegKSjQFX/0kOStUmeTLI3yS2ztJ+dZEeSx5I8kGTVSNttSR7vypUj9T+T5D+S/FeSf0nyunHjMMBJGl6D61vGSLICuANYB0wAVyeZmNHtduDOqloDbAQ2d/teCpwHnAtcBNyc5JRunw8Cv1pV5wJ3Ab83biwGOEkLGuCAC4G9VfVUVR0A7gYum9FnAtjRbd8/0j4BPFhVg6p6AdgNrO3aCjgY7L4H2D9uIAY4ScNFhr5lvDOBp0c+T3Z1o3YD67vty4GTk5ze1a9LcmKSlcAlwFldvxuAe5NMAtcCfzhuIAY4STA11bsk2ZBk10jZMONomeUMMy/e3QS8Kcl/Am8C9gGDqtoO3As8AnwceBQ4uALybuCtVbUK+EvgT8Z9LVdRJc3rPriq2gJsmaPLJN/OugBWMWM6WVX7gSsAkpwErK+q57q2TcCmru0uYE+SVwM/XlX/1h3iE8A/jhurGZykhb4GtxNYneScJMcBVwHbRjskWZnkYPy5Fdja1a/opqokWQOsAbYDXwO+J8kPd/u8BfjsuIGYwUla0Bt9q2qQ5EbgPmAFsLWqnkiyEdhVVduAi4HNSQp4CHhnt/uxwMNJAJ4HrqmqAUCS3wT+Jsk0w4D36+PGkup5X8uheunZpxb3BFpQJ5zxxqUegg7B4MC+2a579faNLe/u/Xd64oY/PaxzHUlmcJJ8FlVSw6amlnoEi8IAJ8kMTlLDDHCSmrXIi41LZex9cEluTHLakRiMpCWysPfBHTX63Oj7GmBnkk92P4GybJaIJfU0Xf3LMjI2wFXV7wGrgY8A1zN8bOIPkrx2kccm6U
iZx7Ooy0mvR7VqeDfwl7oyAE4D7knyR7P1H30Y98N3fnzBBitpcdT0dO+ynIxdZEjyLuA64Fngw8DNVfVS9xzZHuB3Zu4z+jCuTzJIy8Aym3r21WcVdSVwRVV9YbSyqqaT/PziDEvSEfVyfelMVb13jraxT/NLWgZexhmcpNYNltfiQV8GOEkv3ymqpJcBp6iSWrXcbv/oywAnyQxOUsMMcJKatcwewerLACeJMoOT1CwDnKRmuYoqqVlmcJKaZYCT1KqacooqqVVmcJJa5W0iktplgJPUrDYvwRngJEEN2oxwBjhJZnCS2uUig6R2NZrB9Xrxs6S21XT1Ln0kWZvkySR7k9wyS/vZSXYkeSzJA0lWjbTdluTxrlw5Up8km5J8Lslnu3c2z8kMTtKCZnBJVgB3AG8BJoGdSbZV1WdGut0O3FlVH03yZmAzcG2SS4HzgHOBVwIPJvlUVT0PXA+cBfxI917m7xs3FjM4SdSgf+nhQmBvVT1VVQeAu4HLZvSZAHZ02/ePtE8AD1bVoKpeAHYDa7u2dwAbq4avAKuqL48biAFOEjXdvyTZkGTXSNkw43BnAk+PfJ7s6kbtBtZ325cDJyc5vatfl+TEJCuBSxhmbQCvBa7szvmpJKvHfS+nqJLmNUWtqi3Aljm6ZLbdZny+CfhAkuuBh4B9wKCqtie5AHgEeAZ4FDiYN74SeLGqzk9yBbAVeONcYzWDkzSvDK6HSb6ddQGsAvb/v/NV7a+qK6rq9cDvdnXPdf/fVFXnVtVbGAbLPSPH/Ztu+++ANeMGYoCTtNABbiewOsk5SY4DrgK2jXZIsjLJwfhzK8NsjCQruqkqSdYwDGLbu35/D7y5234T8LlxA3GKKomamm1WeYjHqhokuRG4D1gBbK2qJ5JsBHZV1TbgYmBzkmI4RX1nt/uxwMNJAJ4Hrqn61tLGHwIfS/Ju4OvADePGkqrFvYP5pWefavMW6UadcMaclzR0lBoc2HdYEepLP31x77/T1zz0wMJFw0VmBieJml42MWteDHCS+l5bW3YMcJKoMoOT1CgzOEnNml7AVdSjiQFOkosMktplgJPUrEW+HXbJGOAkmcFJape3iUhq1pSrqJJaZQYnqVleg5PULFdRJTXLDE5Ss6am2/xxbwOcJKeokto17SqqpFZ5m4ikZjlFPUS+xGR5+eb+h5d6CFoCTlElNctVVEnNanSGaoCT5BRVUsNcRZXUrEZfqmWAkwSFGZykRg2cokpqlRmcpGZ5DU5Ss8zgJDWr1QyuzeczJM3LFOld+kiyNsmTSfYmuWWW9rOT7EjyWJIHkqwaabstyeNduXKWff88ydf7jMMAJ4np9C/jJFkB3AGsAyaAq5NMzOh2O3BnVa0BNgKbu30vBc4DzgUuAm5OcsrIsc8HTu37vQxwkpgmvUsPFwJ7q+qpqjoA3A1cNqPPBLCj275/pH0CeLCqBlX1ArAbWAvfCpx/DPxO3+9lgJNEzaMk2ZBk10jZMONwZwJPj3ye7OpG7QbWd9uXAycnOb2rX5fkxCQrgUuAs7p+NwLbquqLfb+XiwyS5rXIUFVbgC1zdJktzZv5gyU3AR9Icj3wELAPGFTV9iQXAI8AzwCPAoMkZwC/DFw8j6Ea4CTBdBb0NpFJvp11AawC9o92qKr9wBUASU4C1lfVc13bJmBT13YXsAd4PfA6YG+GYz0xyd6qet1cAzHASWJqYQ+3E1id5ByGmdlVwK+Mduimn1+tqmngVmBrV78COLWqvpJkDbAG2F5VA+A1I/t/fVxwAwOcJPqtjvZVVYMkNwL3ASuArVX1RJKNwK6q2sZwqrk5STGcor6z2/1Y4OEuS3seuKYLbofEACep7+pob1V1L3DvjLr3jmzfA9wzy34vMlxJHXf8k/qMwwAnyZ8sl9SuhZyiHk0McJKafRbVACeJKTM4Sa0yg5PULAOcpGY1+koGA5wkMzhJDVvgR7WOGgY4Sd4HJ6ldTlElNcsAJ6lZPosqqV
leg5PULFdRJTVrutFJqgFOkosMktrVZv5mgJOEGZykhg3SZg5ngJPkFFVSu5yiSmqWt4lIalab4c0AJwmnqJIaNtVoDmeAk2QGJ6ldZQYnqVVmcJKa5W0ikprVZniDY8Z1SDIxS93FizIaSUtiQPUuy8nYAAd8Msl7MnRCkj8HNi/2wCQdOTWP//pIsjbJk0n2Jrlllvazk+xI8liSB5KsGmm7LcnjXblypP5j3TEfT7I1ybHjxtEnwF0EnAU8AuwE9gM/OebLbUiyK8mu6ekXepxC0lKankcZJ8kK4A5gHTABXD3LTPB24M6qWgNspEuaklwKnAecyzD23JzklG6fjwE/AvwYcAJww7ix9AlwLwHf7A54PPDfVTXn96yqLVV1flWdf8wxr+pxCklLaYEzuAuBvVX1VFUdAO4GLpvRZwLY0W3fP9I+ATxYVYOqegHYDawFqKp7qwP8O7CKMfoEuJ0MA9wFwE8xjMb39NhP0jKxkBkccCbw9Mjnya5u1G5gfbd9OXByktO7+nVJTkyyEriE4QzyW7qp6bXAP44bSJ9V1N+oql3d9peAy5Jc22M/ScvEVPVfPEiyAdgwUrWlqraMdpllt5knuAn4QJLrgYeAfcCgqrYnuYDhJbFngEeBwYx9/wJ4qKoeHjfWsQFuJLiN1v31uP0kLR/zuQ+uC2Zb5ugyyf/PulYxvHY/eoz9wBUASU4C1lfVc13bJmBT13YXsOfgfkneB7waeHufsfaZokpq3AJfg9sJrE5yTpLjgKuAbaMdkqxMcjD+3Aps7epXdFNVkqwB1gDbu883AD8HXD1uHeAgb/SVtKCPalXVIMmNwH3ACmBrVT2RZCOwq6q2ARcDm5MUwynqO7vdjwUeTgLwPHBNVR2con4I+ALwaNf+t1W1ca6xGOAkLfijWlV1L3DvjLr3jmzfA3zHYmVVvchwJXW2Y847XhngJPlrIpLaNZ9V1OXEACfJXxOR1C5/D05Ss7wGJ6lZTlElNatcZJDUKl8bKKlZTlElNcspqqRmmcFJapa3iUhqlo9qSWqWU1RJzTLASWqWq6iSmmUGJ6lZrqJKatZUv3e4LDsGOEleg5PULq/BSWqW1+AkNWvaKaqkVpnBSWqWq6iSmuUUVVKznKJKapYZnKRmmcFJatZUTS31EBaFAU6Sj2pJapePaklqVqsZ3DFLPQBJS2+6qnfpI8naJE8m2Zvkllnaz06yI8ljSR5Ismqk7bYkj3flypH6c5L8W5I9ST6R5Lhx4zDASaLm8d84SVYAdwDrgAng6iQTM7rdDtxZVWuAjcDmbt9LgfOAc4GLgJuTnNLtcxvwp1W1Gvga8BvjxmKAk8RUTfcuPVwI7K2qp6rqAHA3cNmMPhPAjm77/pH2CeDBqhpU1QvAbmBtkgBvBu7p+n0U+MVxAzHASaKqepckG5LsGikbZhzuTODpkc+TXd2o3cD6bvty4OQkp3f165KcmGQlcAlwFnA68D9VNZjjmN/BRQZJ83qSoaq2AFvm6JLZdpvx+SbgA0muBx4C9gGDqtqe5ALgEeAZ4FFg0POY38EAJ2mhV1EnGWZdB60C9s84337gCoAkJwHrq+q5rm0TsKlruwvYAzwLnJrkFV0W9x3HnI1TVElMU71LDzuB1d2q53HAVcC20Q5JViY5GH9uBbZ29Su6qSpJ1gBrgO01jMD3A7/U7XMd8A/jBmKAkzSva3A9jjUAbgTuAz4LfLKqnkiyMckvdN0uBp5M8jng++kyNuBY4OEkn2E4Db5m5Lrbe4DfTrKX4TW5j4wbSxb7Br9XHHdmm3cQNuqb+x9e6iHoEBy78odmu0bV26tO/MHef6cvfOPzh3WuI8lrcJL8uSRJ7Wr1US0DnCR/D05Su8zgJDWr1Wtwi76K2rIkG7q7urUM+O/18uN9cIdn5jN4Orr57/UyY4CT1CwDnKRmGeAOj9dzlhf/vV5mXGSQ1CwzOEnNMsBJapYBTlKzDHCSmmWAm6ck70/yWy
OfNyV511KOSXNLckH3/s3jk7wqyRNJfnSpx6XF5yrqPCX5QeBvq+q87ieX9wAXVtVXlnRgmlOS3weOB04AJqtq8xIPSUeAD9vPU1V9PslXkrye4U8t/6fBbVnYyPBdAS8CZtwvEwa4Q/Nh4HrgNXQvy9BR73uBkxj+5v/xwAtLOxwdCU5RD0H3pqBPM/xjWV1VU0s8JI2RZBvDN6yfA/xAVd24xEPSEWAGdwiq6kCS+xm+advgdpRL8jaGLxW+K8kK4JEkb66qf17qsWlxmcEdgm5x4T+AX66qPUs9Hkmz8zaReUoyAewFdhjcpKObGZykZpnBSWqWAU5SswxwkpplgJPULAOcpGb9H4q2lUaz+OfSAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"corrmat = df.corr()\n",
"sns.heatmap(corrmat, square=True)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
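"# Candidate transformations, applied to both columns, to check for a linear relation\n",
"df1 = df**2       # square\n",
"df2 = df**0.5     # square root\n",
"df3 = np.exp(df)  # exponential\n",
"df4 = np.log(df)  # natural logarithm"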
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0,0.5,'y')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png":...
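For the outlier part of the question (remove the outliers and compare slope, intercept and R-squared), one possible approach is sketched here. It is not necessarily the method used in the notebook above. The assumptions are mine: the data are synthetic rather than the assignment CSV, an outlier is taken to be any point whose standardized residual exceeds 3 in absolute value, and `ols_summary` and `flag_outliers` are hypothetical helper names.

```python
import numpy as np

def ols_summary(x, y):
    """Return (slope, intercept, r_squared) for a one-predictor least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit: [slope, intercept]
    residuals = y - (intercept + slope * x)
    r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - np.mean(y)) ** 2)
    return slope, intercept, r_squared

def flag_outliers(x, y, z_cutoff=3.0):
    """Boolean mask of points whose standardized residual exceeds z_cutoff (assumed rule)."""
    slope, intercept, _ = ols_summary(x, y)
    residuals = y - (intercept + slope * x)
    z_scores = residuals / residuals.std(ddof=2)  # ddof=2: two fitted parameters
    return np.abs(z_scores) > z_cutoff

# Synthetic data (NOT the assignment data): a clean line plus one injected point.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=100)
y = 5.0 * x + rng.normal(0.0, 1.0, size=100)
x = np.append(x, 30.0)  # far from the other x values
y = np.append(y, 20.0)  # far below the trend, which is about 150 at x = 30

flagged = flag_outliers(x, y)
full_fit = ols_summary(x, y)
trimmed_fit = ols_summary(x[~flagged], y[~flagged])

for label, (slope, intercept, r2) in [("with outlier", full_fit),
                                      ("without outlier", trimmed_fit)]:
    print(f"{label:16s} slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

If the slope or intercept moves noticeably once the flagged point is dropped, the point has leverage in the assignment's sense. If the fit barely changes, it is an outlier without much leverage.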