Please use RStudio. Question 1 uses the tips data (http://www.ggobi.org/book/data/tips.csv)



Pritam answered on Oct 01 2021
---
title: "STAT 4410/8416 Homework 2"
author: "lastName firstName"
date: "Due on Sep 29, 2019"
output:
  html_document: default
  pdf_document: default
  word_document: default
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(fig.align='center', message=FALSE, cache=TRUE)
output <- opts_knit$get("rmarkdown.pandoc.to")
if(!is.null(output)) {
  if(output=="html") {
    opts_chunk$set(out.width = '400px')
  } else {
    opts_chunk$set(out.width='.6\\linewidth')
  }
}
```
**1.** The data set `tips` contains tip amounts for different party sizes as well as total bill amounts per payment. We can get the data from the reshape2 package as follows:
```{r}
library(reshape2)
tips.dat <- tips
```
Now answer the following questions:
a. Compute the tip rate, dividing tip by total bill, and create a new column called `tip.rate` in the dataframe `tips.dat`. Demonstrate your results by showing the head of `tips.dat`.
```{r}
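library(reshape2)
tips.dat = tips
# attach() makes the columns (tip, total_bill, size, ...) visible by bare name
attach(tips.dat)
# Tip rate: the tip as a fraction of the total bill
tips.dat$tip.rate = tips.dat$tip / tips.dat$total_bill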
head(tips.dat)
```
b. Draw a side-by-side violin plot of the tip rate for each party size. Order the party sizes by the median tip rate. Provide your code as well as your plot. Which party size is responsible for the highest median tip rate?
```{r}
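library(ggplot2)
# Order the party sizes by their median tip rate, as the question asks
tips.dat$size_fac = reorder(as.factor(tips.dat$size), tips.dat$tip.rate, FUN = median)
p = ggplot(tips.dat, aes(x = size_fac, y = tip.rate)) +
  geom_violin()
# Mark each median with a red point (`fun` replaces the deprecated `fun.y` in ggplot2 >= 3.3.0)
p + stat_summary(fun = median, geom = "point", size = 2, color = "red")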
```
One can see that size 1 has the highest median tip rate.
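The reading from the plot can also be checked numerically. The following is a minimal sketch, assuming the `reshape2` package is installed; the name `med` is introduced here for illustration and is not part of the original answer.

```r
library(reshape2)  # provides the tips data set

tips.dat <- tips
tips.dat$tip.rate <- tips.dat$tip / tips.dat$total_bill

# Median tip rate for each party size, largest first
med <- tapply(tips.dat$tip.rate, tips.dat$size, median)
med <- med[order(med, decreasing = TRUE)]
med
```

The first name in `med` is the party size with the highest median tip rate.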
c. Generate a similar plot to the one you created in question 1b for each day (instead of party size) and facet by sex and smoker. Is the shape of the violin plot similar for each faceted condition?
```{r}
p_1 = ggplot(tips.dat, aes(x = day, y = tip.rate)) +
  geom_violin()
# Facet by both variables at once: rows are sex, columns are smoker
p_1 + facet_grid(sex ~ smoker)
```

No. The shape of the violin differs noticeably from one faceted condition to another.
**2.** We can generate an $n \times k$ matrix $M$ and a vector $V$ of length $k$ for some specific values of $n$ and $k$ as follows:
```{r}
set.seed(321)
n <- 9
k <- 5
V <- sample(seq(50), size = k, replace = TRUE)
M <- matrix(rnorm(n * k), ncol = k)
```
a. Now, carefully review the following for-loop. Rewrite the code so that you perform the same job without a loop.
```{r}
X <- M
for(i in seq(n)) {
  X[i, ] <- round(M[i, ] / V, digits = 4)
}
```
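Here is the code without the for-loop. Because R recycles `V` down the columns of `t(M)`, each row of `M` is divided elementwise by `V`; the outer `t()` restores the original shape.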
```{r}
X1 = round(t(t(M)/V), digits = 4)
```
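As a sanity check, the loop and the loop-free version can be compared directly. The sketch below also shows two other loop-free spellings, `sweep()` and `rep(V, each = n)`, which are alternatives added here for illustration and not part of the original answer; the names `X2` and `X3` are likewise hypothetical.

```r
set.seed(321)
n <- 9
k <- 5
V <- sample(seq(50), size = k, replace = TRUE)
M <- matrix(rnorm(n * k), ncol = k)

# Reference result from the original for-loop
X <- M
for (i in seq(n)) {
  X[i, ] <- round(M[i, ] / V, digits = 4)
}

# Loop-free version from the answer: transpose, divide, transpose back
X1 <- round(t(t(M) / V), digits = 4)

# Alternative: sweep() divides each column j (MARGIN = 2) by V[j]
X2 <- round(sweep(M, 2, V, "/"), digits = 4)

# Alternative: expand V so it lines up with M's column-major storage
X3 <- round(M / rep(V, each = n), digits = 4)

# All three loop-free results should match the loop
c(isTRUE(all.equal(X, X1)), isTRUE(all.equal(X, X2)), isTRUE(all.equal(X, X3)))
```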

b. Now do the same experiment for $n=900$ and $k=500$. Which runs faster, your code or the for-loop? Demonstrate this using the function `system.time()`.

```{r}
set.seed(321)
n = 900
k = 500
V = sample(seq(50), size = k, replace = TRUE)
M = matrix(rnorm(n * k), ncol = k)
X = M
system.time(for(i in seq(n)) {
  X[i, ] = round(M[i, ] / V, digits = 4)
})
system.time(round(t(t(M)/V), digits = 4))
```
The loop-free code takes less time than the for-loop.
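A single `system.time()` call on an operation this short can be noisy. One way to get a steadier comparison, sketched here under the assumption that repeating each version 20 times is enough, is to time the repetitions together. The helper names `loop_version` and `vec_version` and the count `reps` are introduced for illustration only.

```r
set.seed(321)
n <- 900
k <- 500
V <- sample(seq(50), size = k, replace = TRUE)
M <- matrix(rnorm(n * k), ncol = k)

# The original row-by-row loop, wrapped in a function
loop_version <- function(M, V) {
  X <- M
  for (i in seq_len(nrow(M))) {
    X[i, ] <- round(M[i, ] / V, digits = 4)
  }
  X
}

# The loop-free version from the answer
vec_version <- function(M, V) round(t(t(M) / V), digits = 4)

# Time several repetitions of each so that the elapsed times are less noisy
reps <- 20
t_loop <- system.time(for (r in seq_len(reps)) loop_version(M, V))["elapsed"]
t_vec  <- system.time(for (r in seq_len(reps)) vec_version(M, V))["elapsed"]
c(loop = unname(t_loop), vectorized = unname(t_vec))
```

The exact timings depend on the machine, so no expected numbers are given here.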

**3.** We want to generate a plot of US arrest data (USArrests). Please provide detailed code to answer the following questions.
a. Obtain USA state boundary coordinates data for generating a USA map using function `map_data()` and store the data in `mdat`. Display the first few rows of data from `mdat`, noticing that there is a column called `order` that contains the true order of the coordinates.
```{r}
data("USArrests")
d1 = USArrests
# map_data() comes from ggplot2 and needs the maps package to be installed
mdat = map_data("state")
head(mdat)
```
b. \label{standardize-rate} You will find USA crime data in the data frame called `USArrests`. Standardize the crime rates and create a new column called `state` so that all state names are in lower case. Store this new data in an object called `arrest` and report the first few rows of `arrest`.
```{r}
d1 = as.data.frame(scale(d1))
arrest = d1
arrest$state = tolower(rownames(d1))
head(arrest)
```
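To confirm that the rates are standardized, each numeric column should have mean 0 and standard deviation 1. A minimal sketch using only base R follows; `num_cols` is a name introduced here for illustration.

```r
# Rebuild arrest from the built-in USArrests data
arrest <- as.data.frame(scale(USArrests))
arrest$state <- tolower(rownames(USArrests))

# Every column except the state name holds a standardized rate
num_cols <- setdiff(names(arrest), "state")

# Means should be 0 and standard deviations 1, up to floating-point error
round(colMeans(arrest[num_cols]), 10)
sapply(arrest[num_cols], sd)
```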
c. \label{order-data} Merge the two data sets `mdat` and `arrest` by state name. Note: merging will change the order of the coordinates data. So, order the data back to the original order and store the merged-ordered data in `odat`. Report the first few rows of data...
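The merge-then-reorder step described in this part can be sketched on toy data. The data frames `mdat_toy` and `arrest_toy` below are hypothetical stand-ins with made-up values, not the real `mdat` and `arrest`; the sketch assumes the state-name column of the map data is called `region`, as it is in the output of `map_data("state")`.

```r
# Hypothetical stand-in for mdat: a few boundary points and their drawing order
mdat_toy <- data.frame(
  region = c("texas", "texas", "alabama", "alabama", "ohio", "ohio"),
  long   = c(-100.0, -99.5, -87.5, -87.0, -83.0, -82.5),
  lat    = c(31.0, 31.5, 30.4, 30.9, 40.0, 40.5),
  order  = 1:6
)

# Hypothetical stand-in for arrest: one made-up standardized rate per state
arrest_toy <- data.frame(
  state  = c("ohio", "alabama", "texas"),
  Murder = c(-0.1, 1.2, 1.1)
)

# merge() sorts by the key, which scrambles the coordinate order
merged <- merge(mdat_toy, arrest_toy, by.x = "region", by.y = "state")

# Restore the original drawing order using the order column
odat_toy <- merged[order(merged$order), ]
head(odat_toy)
```

Sorting on `order` matters because polygons drawn from out-of-order coordinates produce tangled state outlines.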