High School Intro to R Homework

Question

---
title: "HW 5"
subtitle: 'Introduction to R'
output:
  pdf_document: default
  html_notebook: default
---

Run this code to load the R objects for this homework:

```{r}
load( "HW5 R Objects.Rdata" )
```

# Problem 1: Sum of Squares

In this problem, we'll calculate the same value in 3 different ways.

The sum of the squares of the first $n$ positive integers is:

$$
1^2 + 2^2 + 3^2 + \ldots + n^2\ =\ \frac{ n \times (n+1) \times (2n + 1)}{6}
$$

For instance, suppose $n = 5$. Then the sum of the squares of the first $n = 5$ positive integers is:

$$
1^2 + 2^2 + 3^2 + 4^2 + 5^2\ =\ 1 + 4 + 9 + 16 + 25\ =\ 55
$$

The right-hand side of the formula is:

$$
\frac{5 \times (5 + 1) \times (2 \times 5 + 1)}{6}\ =\ \frac{5 \times 6 \times 11}{6}\ =\ 55
$$

## Part (a)

Use the formula to calculate the sum of the squares of the first 70 positive integers. Report your result using a `cat()` statement.

**Solution**

## Part (b)

Use a vectorized approach to calculate the sum of the squares of the first 20 positive integers. Report your result using a `cat()` statement.

**Solution**

## Part (c)

Use a `for` loop to calculate the sum of the squares of the first 20 positive integers. Report your result using a `cat()` statement.

**Solution**
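The three approaches in Problem 1 can be sketched as follows. This is only an illustration, not the assignment's official solution: it uses $n = 5$ from the worked example above so the result can be checked against 55, and the variable names (`formula.result`, `vectorized.result`, `loop.result`) are made up for this sketch. Substitute the value of $n$ that each part asks for.

```r
# n = 5 matches the worked example in the problem statement (expected sum: 55).
n <- 5

# Approach 1: closed-form formula n(n+1)(2n+1)/6
formula.result <- n * (n + 1) * (2 * n + 1) / 6

# Approach 2: vectorized; square every element of 1..n, then add them up
vectorized.result <- sum(seq_len(n)^2)

# Approach 3: for loop with a running total
loop.result <- 0
for (i in seq_len(n)) {
  loop.result <- loop.result + i^2
}

cat("Formula:   ", formula.result, "\n")
cat("Vectorized:", vectorized.result, "\n")
cat("Loop:      ", loop.result, "\n")
```

All three lines should report 55 for $n = 5$, which agrees with the hand calculation in the problem statement.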
\newpage

End of problem 1

\newpage

# Problem 2: Removing -9 and -99

## Part (a)

Construct a stripchart of the values in `problem.2.data`. Notice that the data contains some values that are -9 and some values that are -99.

**Solution**

## Part (b)

Replace the values in `problem.2.data` that are equal to -9 with the special value `NA`. Write a short sentence explaining which locations had a -9 value.

**Solution**

## Part (c)

Now replace the values in `problem.2.data` that are equal to -99 with the special value `NA`. Write a short sentence explaining which locations had a -99 value.

At the end of this problem, the vector `problem.2.data` should have the same values as it originally did, except that both -9 and -99 values have been replaced with `NA` values.

**Solution**

## Part (d)

Now create a new stripchart of the data in `problem.2.data`. Since the -9 and -99 values have been converted to `NA` values, you should not see them in the graph.

**Solution**
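A minimal sketch of the sentinel-to-`NA` repair described in Problem 2. The real `problem.2.data` lives in the `.Rdata` file and is not reproduced here, so `example.data` below is a hypothetical stand-in, and the locations it reports are not the homework's answer.

```r
# Hypothetical stand-in data; NOT the contents of problem.2.data.
example.data <- c(12, -9, 15, -99, 14, -9, 13)

# which() returns the positions (locations) that hold each sentinel value.
locations.minus.9  <- which(example.data == -9)
locations.minus.99 <- which(example.data == -99)
cat("Locations with -9: ", locations.minus.9, "\n")
cat("Locations with -99:", locations.minus.99, "\n")

# Positive integer indexing replaces exactly those positions with NA.
example.data[locations.minus.9]  <- NA
example.data[locations.minus.99] <- NA

# The NA entries are not plotted, so only the valid values appear.
stripchart(example.data, method = "jitter", pch = 19, col = "navy")
```

Recording the locations with `which()` before overwriting the values keeps the information needed for the written sentence in parts (b) and (c).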
\newpage

End of problem 2

\newpage

# Problem 3: Temperature Conversion

Let $F$ denote a temperature measurement in degrees Fahrenheit, and $C$ denote a temperature measurement in degrees Centigrade. Then we have:

$$
F\ =\ \frac{9}{5} \cdot C + 32
$$

Conversely, we also have:

$$
C\ =\ \frac{5}{9} \cdot (F - 32)
$$

## Part (a)

The vector `problem.3.a.data` consists of a sequence of temperature measurements, recorded in degrees Fahrenheit. Using vectorized operations, convert these measurements to degrees Centigrade. Then report the sample mean of this data using a `cat()` statement, rounding to 5 decimal places.

**Solution**

## Part (b)

The vector `problem.3.b.data` consists of a sequence of temperature measurements, recorded in degrees Centigrade. Using vectorized operations, convert these measurements to degrees Fahrenheit. Then report the sample maximum and sample minimum of this data using a separate `cat()` statement for each value, rounding to 5 decimal places.

**Solution**
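The two conversion formulas in Problem 3 apply element by element when written with ordinary arithmetic operators. The sketch below assumes a made-up vector `fahrenheit.example` (not the homework's `problem.3.a.data` or `problem.3.b.data`) and converts it in both directions.

```r
# Hypothetical readings in degrees Fahrenheit; NOT the homework data.
fahrenheit.example <- c(32, 50, 98.6, 212)

# C = 5/9 * (F - 32), applied to the whole vector at once
centigrade.example <- 5 / 9 * (fahrenheit.example - 32)
cat("Sample mean (C):", round(mean(centigrade.example), 5), "\n")

# F = 9/5 * C + 32, converting back to check the round trip
fahrenheit.again <- 9 / 5 * centigrade.example + 32
cat("Sample maximum (F):", round(max(fahrenheit.again), 5), "\n")
cat("Sample minimum (F):", round(min(fahrenheit.again), 5), "\n")
```

Converting back and comparing against the starting vector is a quick sanity check that the two formulas were typed consistently.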
\newpage

End of problem 3

\newpage

# Problem 4: Graphing the Logistic Function

The *logistic* function is defined as:

$$
f(x)\ =\ \frac{ e^x }{1 + e^x}
$$

Draw a graph of this function:

* First, draw the function curve itself, with the $x$-axis ranging from -6 to +6 and the $y$-axis ranging from 0 to 1.5. Use a solid line for the curve, and choose a nice color.
* Draw the horizontal reference line $y = 0$ from $x = -6$ to $x = +6$.
* Draw a vertical reference line $x = 0$ from $y = 0$ to $y = 1.5$.
* Draw the horizontal asymptote $y = 1$ from $x = -6$ to $x = +6$.

**Solution**
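One possible way to draw the graph described in Problem 4 with base R graphics. The helper name `logistic`, the grid of 200 points, and the colors and line types are choices made for this sketch, not requirements from the assignment.

```r
# Logistic function f(x) = e^x / (1 + e^x); the name `logistic` is hypothetical.
logistic <- function(x) exp(x) / (1 + exp(x))

# Evaluate the curve on a fine grid so the line looks smooth.
x.values <- seq(-6, 6, length.out = 200)

plot(x.values, logistic(x.values),
     type = "l", lty = "solid", lwd = 2, col = "darkgreen",
     xlim = c(-6, 6), ylim = c(0, 1.5),
     xlab = "x", ylab = "f(x)", main = "Logistic function")

# Reference lines drawn as segments with explicit end points.
segments(-6, 0, 6, 0)                                  # horizontal line y = 0
segments(0, 0, 0, 1.5)                                 # vertical line x = 0
segments(-6, 1, 6, 1, lty = "dashed", col = "gray50")  # asymptote y = 1
```

`segments()` is used instead of `abline()` because the problem specifies where each reference line starts and ends.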
\newpage

End of problem 4

\newpage

# Problem 5: Filtering Extreme Values

Systolic blood pressures are typically in the range of about 120 to 140, although in extreme cases they can be as high as 180.

## Part (a)

The variable `problem.5.data` contains numeric values representing systolic blood pressures. Unfortunately, due to data entry errors, there are some values in this dataset that are too large to represent a valid systolic blood pressure.

For the first part of this problem, construct a histogram of the data in `problem.5.data`.

**Solution**

## Part (b)

Using the histogram that you drew, remove the unusual values in the data. You'll have to determine which values are too large, although this should be clear. Don't set these values to `NA` -- instead, actually create a vector which has these values removed. The result should be a vector that has a length that is less than the original vector.

When you've finished this filtering operation, report the length of the filtered vector, along with the sample mean of the values in the filtered vector. Use a separate `cat()` statement for each value, rounding to 5 decimal places.

**Solution**

## Part (c)

Now create a histogram for the filtered data from part (b). Make sure you include a main title as well as titles for the horizontal and vertical axes, select a nice color, and choose the number of breaks.

**Solution**
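A hedged sketch of the filtering step in Problem 5. The vector `bp.example` and the threshold `cutoff` are invented for illustration. With the real `problem.5.data`, the cutoff should be read off the histogram from part (a).

```r
# Hypothetical blood pressures with two data-entry errors; NOT problem.5.data.
bp.example <- c(122, 135, 128, 1400, 131, 126, 13900, 138)

# Part (a) idea: the histogram makes the implausibly large values stand out.
hist(bp.example, main = "Raw values (hypothetical data)",
     xlab = "Systolic blood pressure")

# Assumed threshold for this sketch; the text says valid values can reach about 180.
cutoff <- 200

# Logical indexing keeps only the plausible values, so the result is shorter.
bp.filtered <- bp.example[bp.example <= cutoff]
cat("Length of filtered vector:", length(bp.filtered), "\n")
cat("Sample mean of filtered vector:", round(mean(bp.filtered), 5), "\n")

# Part (c) idea: titled, colored histogram of the filtered values.
hist(bp.filtered, breaks = 5, col = "skyblue",
     main = "Filtered values (hypothetical data)",
     xlab = "Systolic blood pressure", ylab = "Frequency")
```

Unlike the `NA` replacement in Problem 2, subsetting with a logical condition drops the elements entirely, which is why the filtered vector has fewer values than the original.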
\newpage

End of problem 5

\newpage

# Problem 6: Grouping Categories

## Part (a)

The variable `problem.6.data` contains data on support requests at each of the four offices of WiDgT. Create a table of these values, summarizing the number of requests for each of the offices, and display this table directly (i.e.\ you don't need to do anything like a `cat()` statement).

**Solution**

## Part (b)

Create a pie chart using the tabulated data from part (a). Be sure to give your pie chart a main title, and to choose nice colors for the pie slices.

**Solution**

## Part (c)

Group the categories "Boston" and "Salt Lake City" together into a category named "Domestic". Group the categories "London" and "Shanghai" together into a category named "International". Then construct and display a table summarizing the total number of requests for these two grouped categories.

**Solution**

## Part (d)

Display the grouped data from part (c) not as a table of raw counts, but instead as a table of the relative proportions. Round the values to 2 decimal places.

**Solution**

## Part (e)

Now create a pie chart using the grouped data from part (c).
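The tabulating and grouping steps in Problem 6 might look like the sketch below. The factor `offices.example` is invented for illustration (it is not `problem.6.data`), and `ifelse()` with `%in%` is just one of several ways to collapse categories.

```r
# Hypothetical support-request data; NOT the contents of problem.6.data.
offices.example <- factor(c("Boston", "London", "Boston", "Shanghai",
                            "Salt Lake City", "London", "Boston",
                            "Salt Lake City", "London", "Boston"))

# Parts (a) and (b): counts per office, then a pie chart of those counts.
office.table <- table(offices.example)
office.table
pie(office.table, main = "Support requests by office",
    col = c("steelblue", "tomato", "gold", "seagreen"))

# Part (c): collapse the four offices into two groups.
# Assumes every value is one of the four offices named in the problem;
# anything not in the Domestic list is labeled International.
grouped.example <- ifelse(offices.example %in% c("Boston", "Salt Lake City"),
                          "Domestic", "International")
grouped.table <- table(grouped.example)
grouped.table

# Part (d): relative proportions rounded to 2 decimal places.
round(prop.table(grouped.table), 2)

# Part (e): pie chart of the grouped counts.
pie(grouped.table, main = "Support requests by region",
    col = c("steelblue", "tomato"))
```

`prop.table()` divides each count by the total, so the proportions always sum to 1 before rounding.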
\newpage

End of problem 6

\newpage

# Problem 7: Stratified Boxplot

So far, we've seen 3 ways to repair data:

* If missing data is represented by a value such as -9 or -99, we can convert that to `NA`.
* We can convert outliers to `NA`.
* We can convert the value `Missing` in a factor to `NA`.

In this problem, we will put all of these ideas to work, and then make a nice graph at the end.

## Part (a)

The variable `problem.7.data.vector` contains numeric data representing sales. Find all the locations where there is a -9, and replace these with the special value `NA`. Also, find all the locations where there is a 99999, and replace these with the special value `NA`. Save this repaired vector in a variable. Then write one or two sentences and tell us which locations had a -9, and which locations had a 99999.

At the end of this part, you should have constructed a vector which has all the values of `problem.7.data.vector`, except that -9 and 99999 values have been converted to `NA` values.

**Solution**

## Part (b)

The variable `problem.7.data.factor` contains factor data representing one of the four office locations of WiDgT, as well as the category "Missing". Replace the "Missing" values with the special value `NA`. Then summarize this categorical data using a table.

**Solution**

## Part (c)

Using the numeric vector that you created in part (a) and the factor that you created in part (b), create a vertical stratified boxplot. Include a main title and a title for the vertical axis, and choose nice colors for the boxes.

**Solution**
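A sketch of how the three repair steps in Problem 7 could fit together. Both `sales.example` and `office.example` are hypothetical stand-ins for `problem.7.data.vector` and `problem.7.data.factor`, and calling `droplevels()` to discard the now-empty "Missing" level is a design choice of this sketch rather than something the assignment requires.

```r
# Hypothetical stand-in data; NOT the homework objects.
sales.example <- c(120, -9, 135, 99999, 110, 140, 128, 125, 150, 115, 132, 145)
office.example <- factor(c("Boston", "London", "Missing", "Shanghai",
                           "Boston", "London", "Salt Lake City", "Missing",
                           "Shanghai", "Salt Lake City", "London", "Boston"))

# Part (a) idea: record where the sentinels are, then replace them with NA.
locations.minus.9 <- which(sales.example == -9)
locations.99999   <- which(sales.example == 99999)
cat("Locations with -9:   ", locations.minus.9, "\n")
cat("Locations with 99999:", locations.99999, "\n")
sales.repaired <- sales.example
sales.repaired[c(locations.minus.9, locations.99999)] <- NA

# Part (b) idea: turn "Missing" into NA, then drop the unused factor level.
office.repaired <- office.example
office.repaired[which(office.repaired == "Missing")] <- NA
office.repaired <- droplevels(office.repaired)
table(office.repaired)

# Part (c) idea: one box per office; pairs containing an NA are left out.
boxplot(sales.repaired ~ office.repaired,
        main = "Sales by office (hypothetical data)",
        xlab = "Office", ylab = "Sales",
        col = c("steelblue", "tomato", "gold", "seagreen"))
```

Without `droplevels()`, the factor would still carry an empty "Missing" level, which would show up as a zero count in the table and an empty slot in the boxplot.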
\newpage

End of problem 7

\newpage

# Problem 8: Smiley Face

In this problem, we will graph a sequence of points that will make a nice design.

The variable `problem.8.x.data` contains the $x$-coordinates for a sequence of points. The variable `problem.8.y.data` contains the $y$-coordinates for the same sequence of points.

## Part (a)

First, create an empty plot with no data. The $x$ values should range from -3 to 3, and the $y$ values should range from 0 to 4. You don't have to give your graph a main title, and the $x$- and $y$-axis titles can just be empty strings i.e.\ just use "".

Then graph the sequence of points by making a single call to the `points()` function. I suggest using solid circular points, and you should explicitly select a nice color for the points.

**Solution**

## Part (b)

Write a `for` loop that iterates over the two vectors, taking corresponding values of the `problem.8.x.data` vector and the `problem.8.y.data` vector and plotting a single point at that location:

* As before, first create an empty plot with no data. The $x$ values should range from -3 to 3, and the $y$ values should range from 0 to 4. You don't have to give your graph a main title, and the $x$- and $y$-axis titles can just be empty strings i.e.\ just use "".
* Then graph the sequence of points by iterating over the two vectors. (Hint: you can do this by iterating with an index, and then using positive integer indexing to select the elements from the two vectors.)
    - In the first iteration of the `for` loop, plot a point at the location where the $x$-coordinate is the first value of the `problem.8.x.data` vector and the $y$-coordinate is the first value of the `problem.8.y.data` vector.
    - In the second iteration of the `for` loop, plot a point at the location where the $x$-coordinate is the second value of the `problem.8.x.data` vector and the $y$-coordinate is the second value of the `problem.8.y.data` vector.
    - In the third iteration of the `for` loop, plot a point at the location where the $x$-coordinate is the third value of the `problem.8.x.data` vector and the $y$-coordinate is the third value of the `problem.8.y.data` vector.

Your `for` loop should iterate over all the points, so make sure you get the upper limit right. (Hint: the $x$ and $y$ data vectors must have the same number of values, so you can just calculate the

Kshitij · Accepted Answer

hw5-r-objects-2fnbrp2g-1.rdata
hw5-s3pejenh.docx
HW 5
Introduction to R
Run this code to load the R objects for this homework:
#load( "HW5 R Objects.Rdata" )
load("~/Downloads/hw5-r-objects-2fnbrp2g.rdata")
Problem 1: Sum of Squares
In this problem,
