DASC 512, Homework 7 Instructions: We will be using a rather popular dataset for regression that is predicting the progression of diabetes. You will be responsible for a (no more than 2 pages)...

Instructions are in the Homework7 file


DASC 512, Homework 7

Instructions: We will be using a popular regression dataset for predicting the progression of diabetes. You will be responsible for a regression analysis of no more than 2 pages.

Description of the data: Ten baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements) were obtained for each of n = 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after baseline.

1. Data Preprocessing: Save the file 'diabetes.txt' in the same folder as your Python script and run the following code to gather your data. We haven't discussed it much in class, but it is useful to split your data (with large datasets) into a training group and a testing group. This lets us assess how useful the model is for prediction; specifically, we can see overfitting or underfitting.

    import pandas as pd
    from patsy import dmatrices
    from sklearn.model_selection import train_test_split

    # Read the whitespace-delimited data file
    diabetes = pd.read_table('diabetes.txt', sep=r'\s+')
    # Build the response and design matrices; SEX is treated as categorical
    y, X = dmatrices('Y ~ AGE+C(SEX)+BMI+BP+S1+S2+S3+S4+S5+S6',
                     data=diabetes, return_type='dataframe')
    X.columns = ['Intercept', 'SEX', 'AGE', 'BMI', 'BP',
                 'S1', 'S2', 'S3', 'S4', 'S5', 'S6']
    # Hold out 33% of the rows as the test set
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)

2. Build a regression model predicting 'y_train' by using 'X_train'. You may transform any variables and include any interactions or higher-order terms you deem necessary. Include the following in your writeup:

   • Steps you took to create the model (feature selection),
   • Assessment of regression assumptions (tests and plots as required),
   • Remedial steps (if any) taken to create a valid model that passes assumption tests,
   • Overall assessment of your final model (measures of usefulness, inference and interpretations of features, etc.).

3. Use the following function to calculate and report on your RMSE.
RMSE is the root mean square error of your model on your test set:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$$

The individual(s) with the lowest RMSE will get a bonus point. Remember that any transformations and additional terms applied to your 'X_train' and 'y_train' datasets need to be applied to their respective 'test' datasets in order for this code to work.

    import numpy as np
    from sklearn.metrics import mean_squared_error

    def calculate_rmse(model, X_test, y_test):
        '''
        Function for calculating the RMSE of a test set

        Parameters
        ----------
        model : OLS model from Statsmodels (formula or regular)
        X_test : Test inputs (MUST MATCH MODEL INPUTS)
        y_test : Test outputs (MUST INCLUDE ANY TRANSFORMATIONS TO MODEL OUTPUTS)

        Returns
        -------
        None; the RMSE is printed to the console
        '''
        predicted = model.predict(X_test)
        MSE = mean_squared_error(y_test, predicted)
        rmse = np.sqrt(MSE)
        print(rmse)

Grading: I will be grading you on the quality of your writing. This includes the effectiveness of your communication of your model building and assumption testing steps, as well as your reporting on the usefulness of your model. Don't spend a ridiculous amount of time building your model (find a decent model and move on). While model quality is important, your effectiveness at communicating it is more important for this assignment.
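The RMSE definition in item 3 can be checked by hand on a few numbers. The sketch below is only an illustration: the helper name `rmse` and the toy values are hypothetical and do not come from 'diabetes.txt' or from the assignment's `calculate_rmse` function. It also shows why the test response has to be on the same scale as the model output, here assuming a log transform purely as an example.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Return sqrt(sum((yhat_i - y_i)^2) / n) as a float."""
    # ravel() guards against an (n, 1) column minus an (n,) vector,
    # which would otherwise broadcast to an (n, n) matrix.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


# Hypothetical toy values, not taken from diabetes.txt.
y_toy = np.array([3.0, 5.0, 7.0, 9.0])
yhat_toy = np.array([2.0, 5.0, 8.0, 11.0])

# Errors are -1, 0, 1, 2, so the squared errors sum to 6 and RMSE = sqrt(6 / 4).
toy_rmse = rmse(y_toy, yhat_toy)
print(toy_rmse)

# ASSUMPTION for illustration: if a model were fit to log(Y), its predictions
# would be on the log scale, so the test response needs the same log
# transform before the two are compared.
log_rmse = rmse(np.log(y_toy), np.log(yhat_toy))
print(log_rmse)
```

An RMSE computed on a transformed response is in the transformed units, so it is not directly comparable with an RMSE computed on the original scale of Y.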
HW 7, DASC 512 AGESEXBMIBPS1S2S3S4S5S6Y 59232.110115793.23844.859887151 48121.687183103.27033.89186975 72230.59315693.64144.672885141 24125.384198131.44054.890389206 50123101192125.45244.290580135 23122.68913964.86124.18976897 362229016099.65033.951282138 66226.2114255185564.554.24859263 60232.183179119.44244.477394110 291308518093.44345.384588310 22118.69711457.64623.951283101 5622885184144.83263.58357769 53123.792186109.26234.304181179 50226.297186105.44945.062688185 6112491202115.47234.290573118 34224.7118254184.23975.03781171 47130.3109207100.27035.214998166 68227.51112141473954.941691144 38125.4841621034244.44278797 41124.783187108.26034.543378168 35121.18215687.85034.51099568 25224.39516298.65433.85018749 2512692187120.45633.97038868 61232103.6721085.23566.107124245 31129.788167103.44844.356778184 30225.283178118.43454.85283202 19119.287124545724.174490137 42131.98315887.65334.465910185 63124.47316091.44834.634778131 67225.811315854.26425.2933104283 32130.589182110.65634.343889129 42120.37116181.26624.23418159 58238103150107.22274.644498341 57121.794157588224.44279287 53120.57814784.25233.9897565 62223.580.33225112.8862.624.875296102 52128.511019597.26035.241785265 46127.478171885834.828390276 48233123253163.64465.42597252 48227.773191119.44644.8529290 50225.6101229162.24354.7791114100 21120.163135695434.09438955 32225.490.33153100.4344.54.53268361 54124.2742041098224.174410992 61232.797177118.42964.997287259 56223.1104181116.44744.47737953 33125.385155855134.553970190 27119.678128684334.442771142 67222.598191119.26133.9898675 37227.793180119.43065.030488142 58125.79915791.64934.406793155 65227.910315996.84244.615186225 34125.5932181445744.44278859 46124.9115198129.65444.2767103104 35128.797204126.86434.189793182 37121.8841841017333.91293128 37130.28716696404.155.01068752 41120.58012448.86424.02547537 60120.410519878.49924.634779170 6622498236146.45845.062696170 291268314165.26424.07758361 37226.879157982865.043496144 41225.783181106.66633.73778552 
39122.977204143.24644.304174128 672248314377.24934.43089471 36224.11121931253565.105995163 46224.785174123.23064.644496150 6022589.67185120.8464.024.51099297 59223.6831651004744.499892160 53122.19313476.24634.077596178 48119.991189109.66933.951210148 48129.5131207132.24744.9345106270 6622691264146.66545.568387202 52224.594217149.44854.58589111 52226.6111209126.46134.682110985 46223.587181114.84444.70959842 402291159747.2352.774.304195170 221237316197.85433.828691200 501218814071.83545.11271252 20122.987191128.25343.891885113 68127.5107241149.66444.9290143 52224.386197133.64454.57479151 44123.187213126.47733.87127252 38127.38114681.64734.465981210 49122.765.3316896.2622.713.89186065 6113395182114.85434.189774141 29219.483152105.83943.58358355 61125.898235125.87635.11282134 34222.67516691.86034.262710842 36121.989189105.26834.369496111 521248316786.67123.85019498 61131.279235156.84755.049996164 43126.8123193102.26734.77919448 35120.465187105.6672.794.27677896 27124.891189106.86934.18976990 2912171156973844.65490162 64227.3109186107.63855.308399150 41134.687.33205142.64154.6728110279 49225.991178106.65234.57477592 48120.498209139.44654.77077883 5312888233143.85845.049991128 53222.2113197115.26734.3041100102 2312990216131.46534.58591302 65230.298219160.64054.521884198 41132.494171104.45633.97037695 55223.483166101.64644.52189653 22119.38215693.25233.98971134 5613178.67187141.4345.54.060490144 54230.6103.3314479.8304.85.1417101232 59225.595.33190139.4355.434.356711781 60223.48815389.85833.258195104 54126.8872061226834.3828059 25128.3871931284944.38292246 54227.7113200128.43755.1533113297 55136.611319994.4434.635.730197258 40226.5932361473775.560792229 62231.8115199128.64454.882898275 65124.4120222135.63765.5094124281 33225.41022061413954.8675105179 5312294175885934.941698200 35126.898162103.64544.204786200 66128101195129.24054.859894173 62233.9101221156.43564.9972103180 50229.694.33300242.4339.094.812210984 47128.69716490.65634.465988121 47225.69416574.84045.525593161 
24120.78714980.66123.61097899 58226.291217124.27134.691368109 34120.687185112.25834.304174115 51127.996196122.24255.0689120268 31235.3125187112.44844.8903109274 22119.975175108.65434.127172158 53224.4922141465044.499897107 37221.48312869.64933.85018483 28130.485198115.66734.343880103 47131.68415488305.15.1985105272 23118.878145726323.9128685 501311231781054844.828388280 58236.711716693.84444.9488109336 55132.111016484.24245.241790281 60227.7107167114.63844.276795118 41130.881214152287.65.1358123317 60227.5106229143.85145.141791235 40126.992203119.87034.18978160 57230.790204147.83464.709593174 37138.311316594.65334.465979259 40231.995198135.63854.80493178 3313589200130.4424.764.9273101128 32227.889216146.25544.30419196 35225.981174102.43165.313282126 55132.9102164106.24144.430889288 4912693183100.26434.54338888 39226.3115218158.23274.9345109292 60222.3113186125.84644.26279471 67228.393204132.24944.736292197 41232109251170.64955.0562103186 44125.49516292.65334.40678325 48223.389.33212142.8464.614.75369884 45120.374.33190126.2493.884.30417996 47130.41201991204645.105987195 46120.6731721075134.24858053 36232.3115286199.43975.4723112217 34129.273172108.24944.304191172 53233.11171831194844.382106131 61124.6101209106.87734.836388214 37120.28116287.86334.02548859 33220.88412570.24633.78426670 68132.8105.67205116.4405.135.4931117220 49231.994234155.83475.3982122268 48123.9109232105.23766.10796152 55224.584179105.86633.58358747 43122.16613477.24534.07758074 6023397217125.64555.4467112295 3121993137734734.442778101 53227.382119553934.828393151 67122.88716698.65234.343892127 61228.21062041325244.605296237 62128.987.33206127.2336.245.433799225 60125.687207125.86934.11098481 42124.991204141.83854.795889151 38226.8105181119.23754.820391107 62122.479222147.45944.35677664 61226.9111236172.43964.812289138 61223.1113186114.44744.8122105185 53128.68817198.84145.049999265 28224.79717599.63255.379987101 26230.389218152.23175.159182137 30121.387134636323.688966143 
50126.1109243160.66244.62589141 48120.295187117.45344.41888579 51125.2103176112.23754.897890292 47222.58213166.84134.753689178 64223.5972031295934.31757791 51225.9762401693965.075296116 30120.910415283.84734.66349786 56228.799208146.43954.727497122 42122.185213138.66044.27679472 62226.71151831243554.7875100129 34131.48714993.84633.828677142 60122.2104.67221105.4603.685.62769390 6412192.33227146.8653.494.3307102158 39221.290182110.46034.06049839 71226.5105281173.65555.568384196 48229.2110218151.63964.9298222 79227103169110.83754.6634110277 40130.79917785.45045.33758599 49228.8922071404454.744992196 51130.6103198106.65735.1475100202 57130.1117202139.64254.625120155 59224.7114152104.82954.51098877 51127.799229145.66934.276777191 74129.8101171104.85034.39448670 67126.7105225135.46934.63479673 49119.888188114.85734.39449349 57123.38815563.67824.20477865 56235.1123164953845.0434117263 52229.7109228162.83185.1417103248 69129.31242231395445.0106102296 37120.383185124.63854.718588214 24122.589141685234.65484185 55222.79315494.25333.52647578 36122.8871781164144.6548293 42224107150854434.65496252 21124.276147775334.442779150 41120.262153895034.24858977 57229.410916087.63155.332792208 20222.18717199.65834.20477877 67223.6111.33189105.4702.74.219593108 34125.277189120.65344.343879160 41224.9861921156134.3829453 3823378301215506.025.193108220 51123.51011951215144.744994154 52226.491.33218152395.594.905399259 67129.88017293.46334.35678290 61130108194100523.735.3471105246 67225111.6714693.4334.424.585103124 56127105247160.65455.08769467 6412074.67189114.8623.054.11099172 58225.5112163110.62964.762286257 55128.291250140.26745.366103262 62233.31141821143855.010696275 57225.696200133523.854.3175105177 20224.28812672.24533.78427471 53222.198165105.24744.15898147 32231.48915384.25634.158990187 41123.186148785834.094360125 60123.476.67247148653.85.13587778 26118.883191103.66934.52186951 37130.8112282197.24375.3423101258 45132110224134.24555.411693215 67131.611617990.44145.4723100303 
34235.5120233146.63475.5683101243 50131.978.33207149.2385.454.59518491 71129.597227151.64555.0239108150 57231.6117225107.64065.9584113310 49120.3931841036134.605293153 35141.381168102.83754.948894346 41221.2102184100.46434.5857963 70224.182.33194149.2316.264.234110589 52123107179123.742.54.214.15899350 60125.67819595.49123.76128739 62122.5125215999824.499895103 44238.2123201126.64455.023992308 28219.28115594.65133.850187116 5822985156109.23643.98986145 3922489.67190113.6523.654.80410174 34220.698183928323.68899245 65126.370244166.25154.897898115 66234.6115204139.43664.9628109264 51123.487220108.89324.51098287 50229.211916285.25434.736295202 59227.21071581023944.442793127 5212778.3313473443.054.442769182 69224.5108243136.44065.8081100241 53124.1105184113.44644.81229566 47225.398173105.64444.762210894 52128.81132801746745.27386283 39120.99515065.66824.40679564 67223701841283554.65499102 59224.19617098.65434.465985200 51228.1106202122.25544.820387265 2321878171964844.90539294 68125.993253181.25354.543398230 44121.58515792.25533.891884181 60224.310314186.63344.672878156 52124.5901981292975.298386233 38121.37216560.28824.43089060 61125.890280195.45554.997290219 68224.8101221151.46043.87128780 28231.583228149.43865.31328368 65233.5102190126.23554.9698102332 69128.1113234142.85245.278177248 51124.385.3315371.6712.153.95128284 2913598.33204142.6504.084.043191200 55223.593177126.84143.82868355 3423083185107.25334.82039285 67120.78317099.85934.02547789 49125.67616199.85133.93187831 55222.98112367.24134.304188129 59225.190163101.44644.35679183 53133.282.67186106.8464.045.112102275 48224.1110209134.65844.406710065 52129.5104.33211132.8494.314.983698198 69129.6122231128.45645.45186236 60222.8110245189.83964.394488253 46222.783183125.83264.836375124 51226.210116199.64834.20478844 67223.596207138.24254.8978111172 49122.18513663.4622.193.970372114 46226.594247160.25944.9345111142 47132.4105188125464.094.442799109 75130.178222154.2445.054.779197180 28124.293174106.45434.219584144 
65231.31102131284755.24791163 42130.191182114.84944.510982147 51124.579212128.66534.52189197 53227.795190101.84155.4638101220 54123.2110.67238162.8484.964.9127108190 731271022111216734.744999109 54126.810817680.66734.9558106191 42129.293249174.24565.003992122 75131.2117.67229138.8297.95.7236106230 55232.1112.6720792.4258.286.1048111242 68225.7109233112.63576.0568105248 57126.998246165.23875.36696249 48131.475.33242151.6386.375.5683103192 61225.685184116.23954.969898131 69137103207131.45544.634790237 38132.677168100.64744.6259678 45221.29416996.85534.4543102135 51229.21071871393264.38295244 712248413885.83944.189790199 57136.1117181108.23455.2679100270 56225.8103177114.43454.962899164 322228813778.64833.95127872 50121.991190111.26734.07757796 43134.384256172.63385.5294104306 54225.21151811203954.70059291 31123.385190130.84344.394477214 56125.780244151.65945.1189595 44125.11331821135534.248584216 57231.9111173116.24144.369487263 64228.41111841274144.38297178 43128.11211921216034.007393113 19125.383225156.64654.718584200 71226.185220152.44754.634791139 50228104282196.84465.327995139 59223.673180107.45144.68218488 57124.59318696.67134.521891148 492218211985.42353.97037488 41232126198104.24945.4116124243 25222.685130714834.00738171 52219.78115253.48224.41888277 34121.284254113.45256.093692109 42230.6101269172.25055.4553106272 28225.599162101.64644.27679460 47223.390195125.85444.33077354 3223110017796.24545.187477221 43118.58716393.6612.673.73778090 59226.9104194126.64354.804106311 53128.31011791074844.7875101281 60125.710315884.66423.850197182 54236.111516398.44344.6821101321 35224.194.6715597.4324.844.8529458 49225.889182118.63954.804115262 58122.891196118.84844.9836115206 36239.190219135.83865
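The dump above has lost the separators between columns, so each whitespace-separated token is one full record. As a hedged illustration of the layout that the preprocessing code expects, the sketch below restores the column breaks by hand for the first three records only. The split shown is an assumed reading of the fused digits (for example, "59232.1..." read as AGE = 59, SEX = 2, BMI = 32.1) and should be verified against the actual 'diabetes.txt' file. The names `sample_text` and `sample` are hypothetical, and `pd.read_csv` on an in-memory string stands in for the `pd.read_table` call on the file.

```python
import io

import pandas as pd

# First three records of the dump, with column breaks restored by hand.
# ASSUMPTION: this split is one reading of the fused digits above;
# check it against diabetes.txt before relying on it.
sample_text = """AGE SEX BMI BP S1 S2 S3 S4 S5 S6 Y
59 2 32.1 101 157 93.2 38 4 4.8598 87 151
48 1 21.6 87 183 103.2 70 3 3.8918 69 75
72 2 30.5 93 156 93.6 41 4 4.6728 85 141
"""

# Same whitespace separator as in the assignment's read_table call.
sample = pd.read_csv(io.StringIO(sample_text), sep=r"\s+")

print(sample.shape)   # rows x columns of the three-record sample
print(sample.dtypes)  # confirms every column parsed as numeric
```

With the full file, the same check should report 442 rows and 11 columns (ten baseline variables plus the response Y), matching the data description in the instructions.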
Answered 3 days after Aug 20, 2021


Pritam Kumar answered on Aug 24 2021
diabetes analysis answer