#--------------------#
# Question 1 | 3pts: #
#--------------------#
# What are two reasons for why the following code throws an error:
10 plus "six"
# One reason the code throws an error is that the addition command is written out as plus instead of
# the addition sign +, which is the operator for adding two numbers together.
# Another reason the code throws an error is that you cannot perform arithmetic on a string,
# in this case the string "six".
answerQ1
print(answerQ1)

#--------------------#
# Question 2 | 3pts: #
#--------------------#
# There are three parts to this question that will be based off of the code below.
testVectorQ2A
testVectorQ2A[2,]

# (a | 1pt) Why does the second line of the code throw an error?
answerQ2A
print(answerQ2A)

# (b | 1pt) Please correct the second line of code such that the 2nd value in the vector is returned.
testVectorQ2A[2]

# (c | 1pt) What is the difference between the dimensions of a vector and a dataframe/array?
answerQ2C
print(answerQ2C)

#--------------------#
# Question 3 | 3pts: #
#--------------------#
# There are three parts to this question that will be based off of the three for loops below.
# For-Loop A
for (i in 1:length(testVector)){
  print(testVector[i])
}
# For-Loop B
for (i in 1:10){
  print(testVector[i])
}
# For-Loop C
for (i in testVector){
  print(i)
}

# (a | 1pt) Which of the for-loops achieve the same result AND why?
answerQ3A
print(answerQ3A)

# (b | 1pt) Please explain what the variable 'i' takes on in the iterations of each for-loop.
answerQ3B
print(answerQ3B)

# (c | 1pt) Between for-loop A & B, which one is considered dynamic and why?
answerQ3C
print(answerQ3C)

#--------------------#
# Question 4 | 3pts: #
#--------------------#
# There are two parts to this question about data frames.

# (a | 1.5pts) If there are two vectors, one with 10 values, and the second with 12 values,
#              why won't R allow you to cbind them together to create a data frame while using
#              as.data.frame()?
answerQ4A
print(answerQ4A)

# (b | 1.5pts) What are two ways to access a column from a data frame?
answerQ4B
print(answerQ4B)

#--------------------#
# Question 5 | 3pts: #
#--------------------#
# There are two parts to this question.

# (a | 1.5pts) Say you have a vector of 10 values called vector5Q. If calculating the length
#              of vector5Q outputs a different number than the length of the unique values of
#              vector5Q, what does this mean?
answerQ5A
print(answerQ5A)

# (b | 1.5pts) Please explain in detail what a UniqueID/Key/etc. does when working with multiple
#              datasets within a database. Why is it important that you preserve the status of
#              this field and keep it unaltered?
answerQ5B
print(answerQ5B)

#---------------------------------------#
# Section 2: Applications of Functions #
#---------------------------------------#

#--------------------#
# Question 1 | 4pts: #
#--------------------#

vector
for (i in 1:length(vector)){
  print(i)
}
# Please correct the above code such that the contents of the vector are printed to the console.
# Leave the for-loop above unaltered, enter your code and answer below:

#--------------------#
# Question 2 | 4pts: #
#--------------------#

# Please read in the uber_jun2014.txt data for this question. (This includes July 1st which is okay)
# Create a new column called WeekdayString that is the text version of the corresponding
# weekday value. Key: 1 = Sunday | 2 = Monday | 3 = Tuesday | 4 = Wednesday | 5 = Thursday
#                     6 = Friday | 7 = Saturday.

uber$WeekdayString
# Do not use a for-loop. Enter code below:

#--------------------#
# Question 3 | 4pts: #
#--------------------#

# Using the uber data from Q2, execute the following four steps:
# (a | 1pt) - Change the name of the column 'Date/Time' to something more reflective of the contents:
# (b | 1pt) - Format your new column from part (a) to be of the Date format.
# (c | 1pt) - Plot the price column against the new column from part (a).
# (d | 1pt) - Explain in text what the plot suggests about price fluctuations across the standard week in June.

# Part A - Enter code below:

# Part B - Enter code below:

# Part C - Enter code below:

# Part D - Enter text below:
answerS2Q3D

#--------------------#
# Question 4 | 4pts: #
#--------------------#

# Using the uber data from Q2, write a for-loop that runs through the 7 unique values of the WeekdayString
# column and prints out the range of the prices and durations for each weekday - Sunday, Monday, Tuesday, etc.
# Make sure to specify the price and duration print statements, ex: "Price - Sunday: range" & "Duration - Sunday: range"

#--------------------#
# Question 5 | 4pts: #
#--------------------#

# Using the uber data set from Q2, subset the data into two new data frames based on:
# Subset 1 - subUber612AM: The hours between 6AM - 12 noon.
# Subset 2 - subUber612PM: The hours between 6PM - 12 midnight.
# Print the average price and duration of the two subsets and the difference between the two.

# Extra credit ( 2pts ):
# Explain what the differences between the two average prices and durations suggest.

#--------------------#
# Question 6 | 4pts: #
#--------------------#

uncleanedVector
# Using the vector above, clean all of the unwanted values from the data, change the structure to numeric,
# then convert the values from millions to hundred thousands.

#--------------------#
# Question 7 | 4pts: #
#--------------------#

rVals
# Using the vector above, write a for-loop to print "Yes this is true." to the console when
# a value is between 0.9 and 1.0. Please print the value along with the text in the line above.

#---------------------#
# Question 8 | 12pts: #
#---------------------#
mtcarsDf

# (a | 8pts) Using the data above and partially done for-loop below, please update the code below such that:
#            - the for loop runs 6 times, one time for each column in mtcarsDf except for mpg.
#            - update the if statement to run if the current regression model's R-squared is greater than 0.70.
# (b | 4pts): Explain in text what is happening in this code.
# (c | 4pts EXTRA CREDIT): Please update the for loop to also store the column name within each iteration
#                          so that you keep track of what the [x] value is in each model. Have this such
#                          that your print statement also prints out the corresponding column name.
for (){
  curModel
  curSummary
  if (){
    print(paste0("R-Squared: ", round(curSummary$r.squared, digits = 2)))
  }
}

#---------------------------#
# Section 3: Mini-activity #
#---------------------------#

#---------------------------------------------#
# Please read in the following two data sets: #
# - abp_2015_price.csv                        #
# - abp_locations.csv                         #
#---------------------------------------------#

# After reading in the data, complete/answer the following items in R:
# You must complete each action item in your R script for me to be able to review your work for credit.
# - Create multiple new data frames based on yearly profit ***in 2020 dollars*** for each of the below:
#     - Profit of each individual product. (5)
#     - Profit of each location subdivision - please break out by city rather than subdivision code. (5)
#     - Profit of each state. (5)
#     - Profit of each product type. (5)
#         - This new column needs to be derived in R. Please do not do this in Excel or outside of R.
#         - Ex: Soup, Sandwich, etc.
# - Which product produces the most profit in each state? (5)
# - Create a profit-margins data frame of each product. (10)
#     - Profit Margins should be price - total cost of each individual product.
#     - Based on this data frame, which product has the best profit-margins? (2.5)
#     - How many more need to be sold to eclipse the leading product in terms of product profit? (2.5)
# - Write code for a 'finished' visual/graphic depicting something from the overall abp data set, (5)
#   write code to save this to your working directory for me to execute. Explain what the graphic shows.
#     - 'Finished': Cleaned up axis labels (e.g. do not use df$Price as the y-axis label), properly
#       formatted title, etc.
#     - This cannot be a spatial graph from Week 6.
# - Extra credit ( 10pts ): Create a spatial graph of each individual ABP location that has circle markers
#   with different sizes and shades of green to show their respective overall profit and total products sold.