# Fix the Shoe.Size typo (113 should be 11) and treat missing coffee counts as 0
KimData <- KimData %>%
  mutate(Shoe.Size = as.numeric(sub("113", "11", Shoe.Size, fixed = TRUE))) %>%
  replace_na(list(Cups.of.Coffee = 0))
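To see what these two cleaning steps do, here is a small base R sketch on a made-up data frame. The `toy` data frame and its values are hypothetical stand-ins for KimData, and the base R assignments are used here only so the example runs without extra packages.

```r
# Made-up rows standing in for KimData (hypothetical values)
toy <- data.frame(
  Shoe.Size      = c("9.5", "113", "8"),
  Cups.of.Coffee = c(2, NA, 1),
  stringsAsFactors = FALSE
)

# Replace the literal text "113" with "11", then convert to numeric
toy$Shoe.Size <- as.numeric(sub("113", "11", toy$Shoe.Size, fixed = TRUE))

# Treat a missing coffee count as zero cups
toy$Cups.of.Coffee[is.na(toy$Cups.of.Coffee)] <- 0

toy
```

Because `fixed = TRUE` matches literal text rather than a pattern, any entry that merely contains "113" (say "1135") would also be changed, so it is worth checking the column for such values first.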
1. The pairs command had an additional function that would put a histogram on its diagonal:
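panel.hist <- function(x, ...)
{
    usr <- par("usr")
    par(usr = c(usr[1:2], 0, 1.5) )
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nB <- length(breaks)
    y <- h$counts; y <- y/max(y)
    rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
}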
For each line of the function, explain what it’s doing.
2. Write a function that calculates the five-number summary of a variable and returns it as a vector of five numbers.
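As a hint for the shape such a function might take, here is a minimal sketch, not a model answer. The name `five_num` is hypothetical, and dropping missing values before computing is an assumption the exercise does not state.

```r
# Hypothetical helper: returns min, Q1, median, Q3, max as a plain numeric vector
five_num <- function(x) {
  x <- x[!is.na(x)]                                   # drop missing values (assumption)
  as.numeric(quantile(x, c(0, 0.25, 0.5, 0.75, 1)))   # default (type 7) quantiles
}

five_num(c(2, 4, 4, 5, 7, 9, NA))
# 2.0 4.0 4.5 6.5 9.0
```

Base R's built-in fivenum() uses Tukey's hinges instead of quantile(), so its quartiles can differ slightly from this version on small samples.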
3. Write a function that makes a snazzy jittered scatterplot (geom_jitter) from KimData, with specific colors for gender, smoothing, and a couple of other awesome features.
4. Looking above, which of the two methods do you prefer: Lucky (using replicate) or Loopy (using for)? Why?
5. Write a function that repeats either luckysibs or loopysibs for any variable in KimData. Apply it to several variables: Shoe.Size, Politically.Liberal, Gender.
6. UCLA students have ACT composite scores that are roughly normally distributed, with a mean of 26.5 and a standard deviation of 3.7 points. Remember that ACT scores are rounded to the nearest whole point, and must be between 6 and 36. UCLA’s incoming class is roughly 1400 students.
a) Simulate an incoming class's worth of ACT composite scores. Make a histogram of the scores with ggplot.
b) Simulate 100 incoming classes of 1400 ACT composite scores each. For each, calculate the 80th percentile and save the results as a new vector, ACT80. Find the bootstrap confidence interval for the 80th percentile of an incoming class's ACT composite scores.
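One possible starting point for this exercise, offered as a sketch under stated assumptions rather than a model answer: draw normal values, round them to whole points, and force them into the 6 to 36 range. The helper name `simulate_class` is hypothetical, and clamping out-of-range draws (instead of redrawing them) is an assumption the exercise does not prescribe. The interval at the end takes the middle 95% of the 100 simulated values, which is one reading of "bootstrap confidence interval"; the 95% level is also an assumption.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical helper: one incoming class of ACT composite scores
simulate_class <- function(n = 1400, mu = 26.5, sigma = 3.7) {
  scores <- round(rnorm(n, mean = mu, sd = sigma))  # whole points only
  pmin(pmax(scores, 6), 36)                         # clamp to [6, 36] (assumption)
}

# (a) one simulated class
class1 <- simulate_class()

# Histogram with ggplot, drawn only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(data.frame(ACT = class1), ggplot2::aes(x = ACT)) +
    ggplot2::geom_histogram(binwidth = 1)
  print(p)
}

# (b) 80th percentile of each of 100 simulated classes
ACT80 <- replicate(100, quantile(simulate_class(), 0.80, names = FALSE))

# Middle 95% of the simulated 80th percentiles
quantile(ACT80, c(0.025, 0.975))
```

Because scores are whole numbers, ACT80 will take only a handful of distinct values, so the resulting interval is likely to be narrow.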
7. Bonus. Instead of matplot, use ggplot to make a chart of the random walk.