• Import the package nycflights13. Merge the flights and planes data into a new data frame. Answer the following questions by the merged data. • Regress arr_delay on log(distance), seats, and origin....


• Import the package nycflights13. Merge the flights and planes data into a new data frame. Answer the following questions using the merged data.


• Regress arr_delay on log(distance), seats, and origin. (1) Which airport is the base category? (2) Explain the meaning of the coefficient of log(distance). (3) Is the coefficient of seats significantly different from -0.5 at the 5% significance level?


• Regress arr_delay on log(distance), seats, and carrier. Which carrier is the worst carrier (namely, has the longest delay time)?


• Select only two carriers: AA and DL. Randomly sample 100 observations. Use set.seed(777) to fix the random seed.


• Create a scatterplot of arr_delay (y-axis) against dep_delay (x-axis) based on the random sample from the previous step. (1) Color the points by carrier. (2) Add a single regression line. (3) Label each point with its destination. (4) Make the size of the points depend on the variable seats. (5) Apply the Wall Street Journal theme.
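
The sampling and plotting steps above could be sketched roughly as follows. This is only one possible reading of the task: the helper name `sample_carriers` and the data frame name `merged` are made up for illustration, the merge key (`tailnum`) is an assumption, and the plot assumes the ggplot2 and ggthemes packages are installed. Rows with missing delays are not filtered before sampling, so ggplot2 may drop a few points with a warning.

```r
# Hypothetical helper: keep the given carriers, then draw n rows at random.
sample_carriers <- function(data, carriers = c("AA", "DL"), n = 100, seed = 777) {
  keep <- data[data$carrier %in% carriers, , drop = FALSE]
  set.seed(seed)  # fix the seed so the sample is reproducible
  keep[sample(nrow(keep), n), , drop = FALSE]
}

# Only run the data-dependent part when the required packages are available.
pkgs <- c("nycflights13", "ggplot2", "ggthemes")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  library(ggplot2)

  # Assumption: flights and planes are joined on tailnum only.
  merged <- merge(nycflights13::flights, nycflights13::planes,
                  by = "tailnum", suffixes = c("_flight", "_plane"))
  samp <- sample_carriers(merged)

  p <- ggplot(samp, aes(x = dep_delay, y = arr_delay)) +
    # colour and size are set here only, so the smoother below is not split by carrier
    geom_point(aes(colour = carrier, size = seats), alpha = 0.7) +
    # one regression line fitted to all sampled points
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black") +
    # label each point with its destination airport code
    geom_text(aes(label = dest), vjust = -0.8, size = 3, check_overlap = TRUE) +
    ggthemes::theme_wsj()
  print(p)
}
```

Keeping `colour` and `size` inside `geom_point()` rather than in the top-level `aes()` is what yields a single regression line; mapping `colour = carrier` globally would give one line per carrier.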


Suraj answered on Apr 25 2021
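# Merging
install.packages("nycflights13")  # only needed once
library("nycflights13")
df1 <- data.frame(planes)
head(planes)
df2 <- data.frame(flights)
head(flights)
# Both tables have a `year` column (manufacture year in planes, flight year in
# flights), so merge on tailnum only; otherwise merge() would also match on year.
df <- merge(df1, df2, by = "tailnum", suffixes = c("_plane", "_flight"))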
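# Regression
# origin is categorical, so enter it as a factor rather than a numeric code;
# lm() then creates one dummy variable per non-base airport.
df$origin <- factor(df$origin)
model <- lm(arr_delay ~ log(distance) + seats + origin, data = df)
summary(model)
# R uses the first factor level in alphabetical order as the reference, so the
# summary shows originJFK and originLGA but no EWR term. EWR is the base
# category: its effect is absorbed into the intercept.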
#The coefficient of...
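
For part (3) of the regression question, `summary()` only reports the test of each coefficient against zero, so a test against -0.5 has to be computed separately. A minimal sketch, assuming a two-sided t-test is what the question intends; the helper name `test_coef` and the data frame name `merged` are hypothetical, and the merge on `tailnum` is an assumption:

```r
# Two-sided t-test of H0: beta_term = h0 for a fitted lm object.
test_coef <- function(model, term, h0 = 0) {
  ctab <- coef(summary(model))            # matrix of estimates and std. errors
  est  <- ctab[term, "Estimate"]
  se   <- ctab[term, "Std. Error"]
  t_stat  <- (est - h0) / se
  p_value <- 2 * pt(abs(t_stat), df = df.residual(model), lower.tail = FALSE)
  list(estimate = est, std_error = se, t_stat = t_stat, p_value = p_value)
}

# Apply it to the flights regression when nycflights13 is available.
if (requireNamespace("nycflights13", quietly = TRUE)) {
  merged <- merge(nycflights13::flights, nycflights13::planes,
                  by = "tailnum", suffixes = c("_flight", "_plane"))
  fit <- lm(arr_delay ~ log(distance) + seats + origin, data = merged)

  res <- test_coef(fit, "seats", h0 = -0.5)
  print(res)
  # H0 (coefficient equals -0.5) is rejected at the 5% level when this is TRUE.
  print(res$p_value < 0.05)
}
```

An equivalent check is whether -0.5 lies outside the 95% interval returned by `confint(fit, "seats")`.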