10/22/2020 Final Project file:///C:/Users/leesu/Downloads/project.html 1/2 Final Project Important: Please create your GitHub repository on the AU-R-Programming organization by Monday (Nov. 2nd 2020)...

Please see attached file


10/22/2020 Final Project

Important: Please create your GitHub repository on the AU-R-Programming organization by Monday (Nov. 2nd 2020) at 2pm and use this repository to work on this assignment. Your final submission will be done via Canvas in a single html file (in which you will specify the name of the corresponding GitHub repository) and is due by Friday, Dec. 4th 2020 at 11.59pm (no late work is accepted). The version submitted on Canvas will have to correspond to the last version on the GitHub repository and you will receive zero points if you make modifications to your work after Dec. 4th 2020 at 11.59pm.

The final project will be evaluated on 100 points and the goal is to develop an R package implementing linear regression as highlighted in Section 6.4 of the book (https://smac-group.github.io/ds/section-functions.html#section-example-continued-least-squares-function). The package must contain the basic functions to perform linear regression (e.g. estimate the coefficient vector $\beta$) and obtain different statistics from the procedure. Using the notation from the book and without using any of the linear regression functions already available in R (i.e. all outputs must be produced using formulas provided in the book and in this document), the basic outputs from the procedure must be the following:

- Confidence intervals: the user must be able to choose the significance level $\alpha$ to obtain the $1 - \alpha$ confidence intervals for $\beta$ and whether to use the asymptotic or bootstrap approach for this.
- Plots (with e.g. ggplot2) including:
  1. Residuals vs fitted-values (fitted values are $\hat{y} = X\hat{\beta}$).
  2. qq-plot of residuals
  3. Histogram (or density) of residuals
- Mean Square Prediction Error (MSPE) computed in matrix form:
  $$\text{MSPE} := \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
  where $n$ is the number of observations in the data (i.e. number of rows).
- F-test: compute the statistic in matrix form and output the corresponding p-value. With $\bar{y}$ representing the sample mean of $y$, let
  $$\text{SSM} := \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \text{SSE} := \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
  and $\text{DFM} = p - 1$ and $\text{DFE} = n - p$. Then we can define $\text{MSM} = \text{SSM}/\text{DFM}$ and $\text{MSE} = \text{SSE}/\text{DFE}$ and obtain the F-statistic as follows:
  $$F^* = \frac{\text{MSM}}{\text{MSE}}.$$
  Using the appropriate distribution in R, compute $P(F > F^*)$ which corresponds to the p-value.
- Help documentation for all functions (for example using the roxygen2 package)

The package will be made available for download on a GitHub repository in the AU-R-Programming organization and the submission will be an html file on Canvas. The html file will be a so-called vignette which indicates the name of the GitHub repository (and package) where you explain and give examples of how to use the package functions for all the desired outputs using one of the datasets on the Canvas course page. Up to 20 bonus points will be given for the final projects if other features are added for the package (e.g. a website with vignette, an example Shiny app that uses the package, the use of the Rcpp package).
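As a rough illustration of the MSPE and F-test formulas in the assignment, the following R sketch computes both from a design matrix. The function name `f_test_sketch` and its return fields are hypothetical and are not taken from the assignment or from the answer's package. It assumes the first column of `X` is a column of ones (an intercept), which is what makes DFM = p - 1 appropriate. The data are the small vectors used in the answer's roxygen example.

```r
# Hypothetical helper (an assumption, not the answer's my_MSPE/my_F):
# MSPE and F-test from a design matrix X that includes an intercept column.
f_test_sketch <- function(y, X) {
  y <- as.vector(y)
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  # Least-squares estimate and fitted values: y_hat = X beta_hat
  beta_hat <- solve(t(X) %*% X, t(X) %*% y)
  y_hat <- as.vector(X %*% beta_hat)
  y_bar <- mean(y)
  # Sums of squares written as inner products (matrix form)
  ssm <- as.numeric(crossprod(y_hat - y_bar))
  sse <- as.numeric(crossprod(y - y_hat))
  mspe <- sse / n
  dfm <- p - 1
  dfe <- n - p
  f_star <- (ssm / dfm) / (sse / dfe)
  # p-value: P(F > F*) under the F distribution with (DFM, DFE) degrees of freedom
  p_value <- pf(f_star, df1 = dfm, df2 = dfe, lower.tail = FALSE)
  list(mspe = mspe, f_star = f_star, p_value = p_value, dfm = dfm, dfe = dfe)
}

# Example data taken from the roxygen example in the answer
y <- c(25, 36, 12, 45, 26, 82, 14, 35, 21, 45, 32)
x <- c(10, 35, 62, 42, 15, 32, 18, 24, 38, 26, 43)
X <- cbind(1, x)  # add the intercept column explicitly
out <- f_test_sketch(y, X)
print(out)
```

The p-value uses `lower.tail = FALSE` so that it is the upper-tail probability $P(F > F^*)$ rather than its complement.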
Answered Same Day Oct 22, 2021


Naveen answered on Oct 31 2021
Testing1/.Rbuildignore
^.*\.Rproj$
^\.Rproj\.user$
Testing1/.Rproj.user/E408E4C2/build_options
auto_roxygenize_for_build_and_reload="1"
auto_roxygenize_for_build_package="1"
auto_roxygenize_for_check="1"
live_preview_website="1"
makefile_args=""
preview_website="1"
website_output_format="all"
Testing1/.Rproj.user/E408E4C2/cpp-definition-cache
[]
Testing1/.Rproj.user/E408E4C2/pcs/debug-breakpoints.pper
{
"debugBreakpointsState": {
"breakpoints": []
}
}
Testing1/.Rproj.user/E408E4C2/pcs/files-pane.pper
{
"sortOrder": [
{
"columnIndex": 2,
"ascending": true
}
],
"path": "F:/Projects/GreyNodes/69502/Testing1/R"
}
Testing1/.Rproj.user/E408E4C2/pcs/source-pane.pper
{
"activeTab": 0
}
Testing1/.Rproj.user/E408E4C2/pcs/windowlayoutstate.pper
{
"left": {
"splitterpos": 258,
"topwindowstate": "NORMAL",
"panelheight": 612,
"windowheight": 650

},
"right": {
"splitterpos": 522,
"topwindowstate": "NORMAL",
"panelheight": 612,
"windowheight": 650
}
}
Testing1/.Rproj.user/E408E4C2/pcs/workbench-pane.pper
{
"TabSet1": 3,
"TabSet2": 3,
"TabZoom": {}
}
Testing1/.Rproj.user/E408E4C2/persistent-state
build-last-errors="[]"
build-last-errors-base-dir="F:/Projects/GreyNodes/69502/Testing1/"
build-last-outputs="[{\"type\":0,\"output\":\"==> devtools::document(roclets = c('rd', 'collate', 'namespace'))\\n\\n\"},{\"type\":2,\"output\":\"Updating Testing1 documentation\\r\\n\"},{\"type\":2,\"output\":\"Loading Testing1\\r\\n\"},{\"type\":1,\"output\":\"Writing my_lm.Rd\\r\\nWriting my_qqplot.Rd\\r\\nWriting my_resid_fit.Rd\\r\\nWriting my_hist_resid.Rd\\r\\nWriting my_MSPE.Rd\\r\\nWriting my_F.Rd\\r\\n\"},{\"type\":2,\"output\":\"Warning: The existing 'NAMESPACE' file was not generated by roxygen2, and will not be overwritten.\\r\\n\"},{\"type\":1,\"output\":\"Documentation completed\\n\\n\"},{\"type\":0,\"output\":\"==> Rcmd.exe INSTALL --no-multiarch --with-keep.source Testing1\\n\\n\"},{\"type\":1,\"output\":\"* installing to library 'C:/Users/dell/Documents/R/win-library/4.0'\\r\\n\"},{\"type\":1,\"output\":\"* installing *source* package 'Testing1' ...\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** using staged installation\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** R\\r\\n\"},{\"type\":1,\"output\":\"** byte-compile and prepare package for lazy loading\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** help\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"*** installing help indices\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\" converting help for package 'Testing1'\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\" finding HTML links ...\"},{\"type\":1,\"output\":\" my_F html \\r\\n\"},{\"type\":1,\"output\":\" my_MSPE html \"},{\"type\":1,\"output\":\" done\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"\\r\\n\"},{\"type\":1,\"output\":\" my_hist_resid html \\r\\n\"},{\"type\":1,\"output\":\" my_lm html \\r\\n\"},{\"type\":1,\"output\":\" my_qqplot html \\r\\n\"},{\"type\":1,\"output\":\" my_resid_fit \"},{\"type\":1,\"output\":\" html \"},{\"type\":1,\"output\":\"\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** 
building package indices\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** testing if installed package can be loaded from temporary location\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** testing if installed package can be loaded from final location\\r\\n\"},{\"type\":1,\"output\":\"\"},{\"type\":1,\"output\":\"** testing if installed package keeps a record of temporary installation path\\r\\n\"},{\"type\":1,\"output\":\"* DONE (Testing1)\\r\\n\"},{\"type\":1,\"output\":\"\"}]"
compile_pdf_state="{\"tab_visible\":false,\"running\":false,\"target_file\":\"\",\"output\":\"\",\"errors\":[]}"
files.monitored-path=""
find-in-files-state="{\"handle\":\"\",\"input\":\"\",\"path\":\"\",\"regex\":false,\"ignoreCase\":false,\"results\":{\"file\":[],\"line\":[],\"lineValue\":[],\"matchOn\":[],\"matchOff\":[],\"replaceMatchOn\":[],\"replaceMatchOff\":[]},\"running\":false,\"replace\":false,\"preview\":false,\"gitFlag\":false,\"replacePattern\":\"\"}"
imageDirtyState="1"
saveActionState="-1"
Testing1/.Rproj.user/E408E4C2/rmd-outputs
Testing1/.Rproj.user/E408E4C2/saved_source_markers
{"active_set":"","sets":[]}
Testing1/.Rproj.user/E408E4C2/sources/prop/1381EF53
{
"cursorPosition": "153,1",
"scrollLine": "140"
}
Testing1/.Rproj.user/E408E4C2/sources/prop/8674A834
{}
Testing1/.Rproj.user/E408E4C2/sources/prop/A8436F53
{
"cursorPosition": "49,13",
"scrollLine": "44"
}
Testing1/.Rproj.user/E408E4C2/sources/prop/E11E1D39
{
"cursorPosition": "11,0",
"scrollLine": "0"
}
Testing1/.Rproj.user/E408E4C2/sources/prop/FA90EA93
{
"cursorPosition": "20,0",
"scrollLine": "9"
}
Testing1/.Rproj.user/E408E4C2/sources/prop/INDEX
F%3A%2FProjects%2FGreyNodes%2F69502%2FTesting1%2FDESCRIPTION="E11E1D39"
F%3A%2FProjects%2FGreyNodes%2F69502%2FTesting1%2FR%2FTesting1.R="1381EF53"
F%3A%2FProjects%2FGreyNodes%2F69502%2FTesting1%2FR%2Fhello.R="A8436F53"
F%3A%2FProjects%2FGreyNodes%2F69502%2FTesting1%2Fman%2Fhello.Rd="8674A834"
F%3A%2FR%20Package%2FSA%2FR%2FSA%20-%20Copy.R="FA90EA93"
Testing1/.Rproj.user/E408E4C2/sources/s-0DD8B8D6/1ECE7019-contents
#' Simple Linear Regression
#'
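#' The function \code{my_lm()} fits a linear regression model by least squares.
#'
#' @param response A numeric vector (or one-column matrix) of observed responses
#' @param covariates A numeric vector or matrix of covariates (include a column of ones if an intercept is wanted)
#' @param alpha level of significance for the confidence intervals, the default significance level is **0.05**.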
#'
#' @return Returns \code{Residuals}, \code{beta hat},
#' \code{sigma hat}, \code{Variance of beta hat},
#' \code{Confidence Intervals of beta}, \code{fitted values},
#' \code{response} and \code{covariate} values
#'
#' @author Naveen Kumar M.Sc., \emph{Email}: \email{[email protected]} OR \emph{WhatsApp}: \href{https://wa.me/918688896472}{Click Here}
#' @examples
#' y <- c(25,36,12,45,26,82,14,35,21,45,32)
#' x <- c(10,35,62,42,15,32,18,24,38,26,43)
#' lm_model <- my_lm(y, x, alpha = 0.1)
#' print(lm_model)
#'
#' @seealso \link{my_qqplot} for the \code{Normal Q-Q} plot, \link{my_resid_fit} for the residuals vs fitted values plot,
#' \link{my_hist_resid} for the histogram of residuals, \link{my_MSPE} for the Mean Square Prediction Error and
#' \link{my_F} for the F-test.
#'
#' @export
my_lm = function(response, covariates, alpha = 0.05) {
  # Make sure data formats are appropriate
  response <- as.vector(response)
  covariates <- as.matrix(covariates)
  # Define parameters
  n <- length(response)
  p <- ncol(covariates)
  df <- n - p
  # Estimate beta through Eq. (6.1)
  xtx.inv <- solve(t(covariates)%*%covariates)
  beta.hat <- xtx.inv%*%t(covariates)%*%response
  # Compute fitted values and residuals
  fitted.val <- covariates%*%beta.hat
  resid <- response - fitted.val
  # Estimate of the residual variance (sigma2) from Eq. (6.3);
  # as.numeric() turns the 1 x 1 matrix into a scalar so that the
  # product below is also valid when there is more than one covariate
  sigma2.hat <- as.numeric(t(resid)%*%resid)/df
  # Estimate of the variance of the estimated beta from Eq. (6.2)
  var.beta <- sigma2.hat*xtx.inv
  # Asymptotic confidence intervals based on alpha, using the standard
  # errors (square roots of the diagonal of var.beta, not of the whole matrix)
  quant <- 1 - alpha/2
  se.beta <- sqrt(diag(var.beta))
  ci.beta <- cbind(lower = as.vector(beta.hat) - qnorm(p = quant)*se.beta,
                   upper = as.vector(beta.hat) + qnorm(p = quant)*se.beta)
  # Return all estimated values
  return(list(residuals = resid, beta = beta.hat,
              sigma2 = sigma2.hat, variance_beta = var.beta,
              ci = ci.beta, fitted.values = fitted.val,
              Response = response, Covariates = covariates))
}
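The function above only covers the asymptotic confidence intervals, while the assignment also asks for a bootstrap option. One possible way to get bootstrap intervals is sketched below. It is an assumption, not the answer's implementation: the name `boot_ci_sketch`, the argument `B`, and the choice of a pairs (row-resampling) bootstrap with percentile intervals are all illustrative.

```r
# Hypothetical helper (not from the original answer): percentile bootstrap
# confidence intervals for beta, obtained by resampling rows with replacement.
boot_ci_sketch <- function(response, covariates, alpha = 0.05, B = 1000) {
  response <- as.vector(response)
  covariates <- as.matrix(covariates)
  n <- length(response)
  p <- ncol(covariates)
  beta_boot <- matrix(NA_real_, nrow = B, ncol = p)
  for (b in seq_len(B)) {
    # Draw n row indices with replacement and refit by least squares.
    # Note: solve() fails if a resample happens to give a singular design.
    idx <- sample.int(n, size = n, replace = TRUE)
    Xb <- covariates[idx, , drop = FALSE]
    yb <- response[idx]
    beta_boot[b, ] <- solve(crossprod(Xb), crossprod(Xb, yb))
  }
  # Empirical alpha/2 and 1 - alpha/2 quantiles of each coefficient
  ci <- t(apply(beta_boot, 2, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
  colnames(ci) <- c("lower", "upper")
  ci
}

# Example data taken from the roxygen example, with an explicit intercept column
set.seed(1)
y <- c(25, 36, 12, 45, 26, 82, 14, 35, 21, 45, 32)
x <- c(10, 35, 62, 42, 15, 32, 18, 24, 38, 26, 43)
ci_boot <- boot_ci_sketch(y, cbind(1, x), alpha = 0.1, B = 500)
print(ci_boot)
```

A package function could expose this through an extra argument that switches between the asymptotic and bootstrap approaches; that argument would be a design choice of the package author and is not shown in the answer.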
Testing1/.Rproj.user/E408E4C2/sources/s-0DD8B8D6/29C2EAED
{
"id": "29C2EAED",
"path": "F:/Projects/GreyNodes/69502/Testing1/R/Testing1.R",
"project_path": "R/Testing1.R",
"type": "r_source",
"hash": "597828291",
"contents": "",
"dirty": false,
"created": 1604044128430.0,
"source_on_save": false,
"relative_order": 1,
"properties": {
"cursorPosition": "153,1",
"scrollLine": "140"
},
"folds": "",
"lastKnownWriteTime": 1604143560,
"encoding": "ISO8859-1",
"collab_server": "",
"source_window": "",
"last_content_update": 1604143560011,
"read_only": false,
"read_only_alternatives": []
}
Testing1/.Rproj.user/E408E4C2/sources/s-0DD8B8D6/29C2EAED-contents
#' Simple Linear Regression
#'
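#' The function \code{my_lm()} fits a linear regression model by least squares.
#'
#' @param response A numeric vector (or one-column matrix) of observed responses
#' @param covariates A numeric vector or matrix of covariates (include a column of ones if an intercept is wanted)
#' @param alpha level of significance for the confidence intervals, the default significance level is **0.05**.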
#'
#' @return Returns \code{Residuals}, \code{beta hat},
#' \code{sigma hat}, \code{Variance of beta hat},
#' \code{Confidence Intervals of beta}, \code{fitted values},
#' \code{response} and \code{covariate} values
#'
#' @author Naveen Kumar M.Sc., \emph{Email}: \email{[email protected]} OR \emph{WhatsApp}: \href{https://wa.me/918688896472}{Click Here}
#' @seealso \link{my_qqplot} for the \code{Normal Q-Q} plot, \link{my_resid_fit} for the residuals vs fitted values plot,
#' \link{my_hist_resid} for the histogram of residuals, \link{my_MSPE} for the Mean Square Prediction Error and
#' \link{my_F} for the calculated F, critical F and p-values.
#'
#'
#' @examples
#' y <- c(25,36,12,45,26,82,14,35,21,45,32)
#' x <- c(10,35,62,42,15,32,18,24,38,26,43)
#' lm_model <- my_lm(response, covariates, alpha =...