Exam 3 (submit by Dec 1, 11:59 pm EST on Classes as a single PDF file)

Instruction: This is the third of the 4 exams in BMB 620. Complete the following questions in RMarkdown and submit the generated PDF on Classes by the deadline. Make sure to show both the input and output.

1. Download the "leukemia_remission.txt" data from Classes. The data contain information on 27 leukemia patients. The response variable is whether leukemia remission has occurred (REMISS: 1=remiss; 0=no remiss). The predictors are cellularity of the marrow clot section (CELL), smear differential percentage of blasts (SMEAR), percentage of absolute marrow leukemia cell infiltrate (INFIL), percentage labeling index of the bone marrow leukemia cells (LI), absolute number of blasts in the peripheral blood (BLAST), and the highest temperature prior to start of treatment (TEMP).
a. Draw a scatterplot showing the relationship between the response and one of the predictors. Interpret the plot.
b. Fit a logistic regression with REMISS as the response. Interpret the result.
c. Convert the model coefficients to probability of remiss. Interpret the result.
d. Use predict() to find the predicted probability of remiss using a new data set.
e. Build another logistic regression with different predictors. Compare the models in b and e.
f. Check the assumptions of your model in b.

2. Download the "CivilInjury_0.csv" data from Classes. The data contain the number of injuries during a fire in a day.
a. Extract the year and month from the `Injury Date` variable.
b. Count the number of injuries within each month of each year.
c. Fit a Poisson regression with the number of injuries within a month of each year as the response. Interpret the result.
d. Check the assumptions of your model in c.

3. We want to randomly sample male and female college undergraduate students and ask them if they consume alcohol at least once a week. Our null hypothesis is no difference in the proportion. Our alternative hypothesis is that there is a difference. This is a two-sided alternative; one gender has a higher proportion but we don't know which.
a. We would like to detect a difference as small as 5%. How many students do we need to sample in each group if we want 80% power and a significance level of 0.05?
b. It turns out we were able to survey 543 males and 675 females. Find the power of our test if we're interested in being able to detect a "small" effect size with 0.05 significance.
c. Let's say we previously surveyed 763 female undergraduates and found that p% said they consumed alcohol once a week. We would like to survey some males and see if a significantly different proportion respond yes. How many do I need to sample to detect a small effect size (0.2) in either direction with 80% power and a significance level of 0.05?

4. Download the "USHospitals.txt" data from Classes. The data list outcome data on US hospitals, as reported by Medicare. In addition to the Provider variable (hospital id), there are 16 variables measured. The information of the 16 variables is below:
a. Use pairs() function to make pair-wise scatterplots between all 16 variables.
b. Prepare the data for a principal component analysis (PCA).
c. Determine the number of components needed in your PCA.
d. Perform a PCA on the data. Extract the results and interpret the components.
e. Rotate the principal components and interpret the result.

5. Download the "hemangioma.txt" data from Classes. The data contain information on infants who were surgically treated for hemangioma. Hemangioma is the most common of all childhood cancer diagnoses. It appears as a lump of blood vessels on the skin. If left untreated it usually resolves within a few years. Some parents opt for surgical removal. Age (in days) and expression of genetic markers are given in the data.
a. Determine the number of factors in your exploratory factor analysis (EFA).
b. Perform an EFA on the data.
c. Rotate the factors and interpret the result.

[Data listing for "hemangioma.txt"; columns: Age, RB, p16, DLK, Nanog, C-Myc, EZH2, IGF-2. The values are fused together in the source and cannot be separated reliably.]

[Data listing for "USHospitals.txt"; columns: Provider, SDeathN, SDeathR, LungN, LungR, ClotN, ClotR, SplitN, SplitR, CutsN, CutsR, HADeathR, HADeathN, HFDeathR, HFDeathN, PDeathR, PDeathN. The values are fused together in the source and the listing is cut off.]
Subhanbasha answered on Dec 02 2021
Exam3
Catherine
December 2021
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Question 1.a
library(tinytex)
# Reading file
leuk <- read.table('leukemiaremission.txt',header = TRUE)

plot(REMISS~CELL,data=leuk,
main="Scatter Plot")
From the scatter plot above, we can observe that remission (REMISS = 1) tends to
occur at higher values of CELL.
Question 1.b
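# Model fit. No family argument is given, so glm() uses its default,
# family = gaussian; a logistic regression needs family = binomial.
logi <- glm(REMISS~.,data=leuk)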

# Summary of the model
summary(logi)
##
## Call:
## glm(formula = REMISS ~ ., data = leuk)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.73148 -0.28640 0.01317 0.28326 0.61075
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.001776 6.717355 0.745 0.4652
## CELL -0.222161 1.779991 -0.125 0.9019
## SMEAR -1.528849 3.387515 -0.451 0.6566
## INFIL 1.584228 3.850919 0.411 0.6852
## LI 0.535005 0.266740 2.006 0.0586 .
## BLAST -0.009178 0.335354 -0.027 0.9784
## TEMP -4.949192 6.692905 -0.739 0.4682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.1953698)
##
## Null deviance: 6.0000 on 26 degrees of freedom
## Residual deviance: 3.9074 on 20 degrees of freedom
## AIC: 40.433
##
## Number of Fisher Scoring iterations: 2
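In the model summary, the p-value of every predictor is greater than 0.05, so none of
the variables is significant at the 5% level; LI comes closest (p = 0.0586). Note that
the output reports a gaussian family, so these estimates come from a linear model on the
0/1 response rather than from a logistic (logit-scale) fit.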
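The summary above reports a gaussian family and t values, which is what glm() produces when no family argument is supplied. The following is a minimal sketch, on simulated data, of how the two fits differ. The data frame `sim` and the variables `x` and `y` are hypothetical stand-ins and are not taken from the leukemia file.

```r
# Simulated stand-in data (hypothetical; not the leukemia data set)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
sim <- data.frame(y = y, x = x)

# Default family: gaussian (a linear model on the 0/1 response)
fit_gauss <- glm(y ~ x, data = sim)

# Logistic regression: binomial family with its default logit link
fit_logit <- glm(y ~ x, data = sim, family = binomial)

family(fit_gauss)$family   # "gaussian"
family(fit_logit)$family   # "binomial"
family(fit_logit)$link     # "logit"

# Fitted values of the logistic model are probabilities in (0, 1)
range(fitted(fit_logit))
```

With the binomial family, summary() reports z values and coefficients on the log-odds scale, which is the scale the conversion in Question 1.c assumes.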
Question 1.c
# Inverse logit: convert a log-odds value to a probability
logit2prob <- function(logit) {
  odds <- exp(logit)
  prob <- odds / (1 + odds)
  return(prob)
}
logit2prob(coef(logi))
## (Intercept) CELL SMEAR INFIL LI BLAST
## 0.993318947 0.444687136 0.178162226 0.829802429 0.630649643 0.497705515
## TEMP
## 0.007039233
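Among the predictors, INFIL has the largest converted value (0.83), followed by LI (0.63).
Strictly, only the converted intercept can be read as a probability (the predicted
probability when every predictor is zero, and only for a logit-scale model); a converted
slope is not by itself a probability of remission.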
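As a small worked sketch of how logit-scale coefficients are usually read, the example below uses made-up values `b0` and `b1` (hypothetical, not estimates from the leukemia model). Base R's plogis() is the same inverse-logit transformation as logit2prob() above.

```r
# Hypothetical logit-scale coefficients, for illustration only
b0 <- -1.0   # intercept: log-odds when x = 0
b1 <- 0.8    # slope: change in log-odds per one-unit increase in x

p_at_0 <- plogis(b0)        # probability when x = 0
p_at_1 <- plogis(b0 + b1)   # probability when x = 1
odds_ratio <- exp(b1)       # multiplicative change in the odds per unit of x

# plogis() matches the hand-written inverse logit exp(z) / (1 + exp(z))
z <- c(-2, 0, 2)
all.equal(plogis(z), exp(z) / (1 + exp(z)))

# The change in probability depends on where x starts, so it is computed
# from two predicted probabilities rather than from plogis(b1) alone
p_at_1 - p_at_0
```

The slope is therefore usually reported as an odds ratio, exp(b1), while probabilities are computed for specific predictor values.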
Question 1.d
New_data<-data.frame(CELL=0.7,SMEAR=0.6,INFIL=0.8,LI=0.5,BLAST=0.6,TEMP=0.7)
predict(logi,newdata = New_data)
## 1
## 1.993898
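The value printed above (1.993898) lies outside the interval [0, 1], so it cannot be read as a probability. For a binomial glm, predict() returns log-odds by default and probabilities only with type = "response". A minimal sketch on simulated data follows; `sim`, `fit`, and `new_obs` are hypothetical names, not objects from the answer.

```r
# Simulated stand-in data (hypothetical)
set.seed(2)
sim <- data.frame(x = rnorm(100))
sim$y <- rbinom(100, size = 1, prob = plogis(0.3 + 1.5 * sim$x))

fit <- glm(y ~ x, data = sim, family = binomial)
new_obs <- data.frame(x = c(-1, 0, 1))

# Default: linear predictor on the link (log-odds) scale
eta <- predict(fit, newdata = new_obs)

# Probability scale
p <- predict(fit, newdata = new_obs, type = "response")

# The two are related by the inverse logit
all.equal(p, plogis(eta))
```

It also helps to keep the new predictor values inside the range observed in the training data, since predictions far outside that range are extrapolations.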
Question 1.e
logi1 <- glm(REMISS~INFIL+LI+BLAST+CELL,data=leuk)
# Summary of the model
summary(logi1)
##
## Call:
## glm(formula = REMISS ~ INFIL + LI + BLAST + CELL, data = leuk)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.83660 -0.23042 -0.06619 0.23535 0.65026
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.60229 0.45452 -1.325 0.1987
## INFIL 0.05614 0.56418 0.100 0.9216
## LI 0.54888 0.22975 2.389 0.0259 *
## BLAST -0.05043 0.26502 -0.190 0.8508
## CELL 0.43947 0.56918 0.772 0.4483
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.1849362)
##
## Null deviance: 6.0000 on 26 degrees of freedom
## Residual deviance: 4.0686 on 22 degrees of freedom
## AIC: 37.524
##
## Number of Fisher Scoring iterations: 2
Compared with the initial model, LI is now significant (p = 0.0259). The AIC of this
model is 37.524, lower than the 40.433 of the model in 1.b, so the reduced model is
preferred on this criterion.
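Because the predictors in the second model are a subset of those in the first, the two models are nested, and they can be compared with a likelihood-ratio test as well as by AIC. The sketch below shows one way to do this on simulated data; the variables `x1`, `x2`, `x3` and the objects `full` and `reduced` are hypothetical.

```r
# Simulated stand-in data (hypothetical): only x1 truly affects y
set.seed(3)
n <- 150
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(-0.2 + 1.0 * sim$x1))

full    <- glm(y ~ x1 + x2 + x3, data = sim, family = binomial)
reduced <- glm(y ~ x1, data = sim, family = binomial)

# AIC side by side (lower is better)
aic_tab <- AIC(full, reduced)
aic_tab

# Likelihood-ratio (chi-squared) test of the nested models
lrt <- anova(reduced, full, test = "Chisq")
lrt
```

A large p-value in the anova() table suggests the extra predictors do not improve the fit enough to justify keeping them.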
Question 1.f
plot(logi)
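The default diagnostic plots can be supplemented with numeric checks. As one hedged example, the sketch below computes Cook's distances and deviance residuals for a binomial fit on simulated data. The 4/n cutoff is a common rule of thumb assumed here, not something specified in the exam, and all object names are hypothetical.

```r
# Simulated stand-in data (hypothetical)
set.seed(4)
n <- 100
sim <- data.frame(x = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(0.5 * sim$x))

fit <- glm(y ~ x, data = sim, family = binomial)

# Influence: Cook's distance for every observation
cd <- cooks.distance(fit)
flagged <- which(cd > 4 / n)   # assumed rule-of-thumb cutoff
flagged

# Deviance residuals; unusually large absolute values deserve a closer look
dres <- residuals(fit, type = "deviance")
summary(dres)
```

Flagged observations are candidates for inspection rather than automatic removal.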
Question 2.a
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
civi <- read.csv('civilinjury0.csv')
unique(year(civi$Injury.Date))
## [1] 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
unique(month(civi$Injury.Date))
## [1] 1 2 4 5 7 8 9 10 11 12 3 6
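For comparison, the same extraction can be sketched in base R without lubridate. The dates below are made up and assumed to be in ISO format; the date format of the real file is not shown in the answer, so as.Date() may need a format argument there.

```r
# Hypothetical dates in ISO format (not taken from the injury file)
injury_date <- as.Date(c("2005-01-14", "2005-01-30", "2006-03-02"))

# format() returns character strings, so convert to integer
yr  <- as.integer(format(injury_date, "%Y"))
mon <- as.integer(format(injury_date, "%m"))

data.frame(injury_date, yr, mon)
```

Storing the year and month as integer columns makes them directly usable as grouping variables in aggregate().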
Question 2.b
civi$year <- year(civi$Injury.Date)
civi$mon <- month(civi$Injury.Date)
agg <- aggregate(civi["Total.Injuries"], by=civi[c("year","mon")],FUN=sum)
agg
## year mon Total.Injuries
## 1 2005 1 11
## 2 2006 1 2
## 3 2007 1 2
## 4 2008 1 4
## 5 2009 1 6
## 6 2010 1 1
## 7 2011 1 4
## 8 2012 1 9
## 9 2013 1 2
## 10 2014 1 1
## 11 2015 1 5
## 12 2005 2 3
## 13 2006 2 3
## 14 2007 2 7
## 15 2009 2 2
## 16 2010 2 1
## 17 2011 2 3
## 18 2012 2 3
## 19 2013 2 5
## 20 2014 2 2
## 21 2015 2 2
## 22 2006 3 4
## 23 2007 3 1
## 24 2008 3 2
## 25 2009 3 1
## 26 2010 3 1
## 27 2011 3 1
## 28 2012 3 4
## 29 2013 3 1
## 30 2014 3 2
## 31 2015 3 1
## 32 2016 3 1
## 33 2005 4 3
## 34 2006 4 7
## 35 2007 4 1
## 36 2008 4 3
## 37 2010 4 7
## 38 2011 4 3
## 39 2013 4 1
## 40 2015 4 1
## 41 2005 5 2
## 42 2006 5 3
## 43 2007 5 3
## 44 2008 5 2
## 45 2009 5 5
## 46 2010 5 7
## 47 2011 5 1
## 48 2012 5 1
## 49 2013 ...