data-viz-challenge-4-FatimahAldhamen-main/data-viz-challenge-4.Rmd

---
title: "ISTA 320 Data Visualization Challenge 4"
author: "ENTER YOUR NAME HERE"
date: "Fall 2021"
output: html_document
---

```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = TRUE)
```

For this data viz challenge, you will be working with the same dataset as the [Line Plots Case Study -- 2016 Advanced Placement Classes Part 1](https://d2l.arizona.edu/d2l/le/content/1064958/viewContent/11083549/View), which was downloaded from [2016 Advanced Placement Exam Scores](https://www.kaggle.com/collegeboard/ap-scores?select=exams.csv)

# Data Wrangling Part 1

In the next code block, make sure you:

1. load the `tidyverse` library
1. read "data/exams.csv" in using `read_csv()`
1. inspect data to get an idea of what it looks like

```{r}
# ENTER YOUR CODE HERE
```

# Data Visualization 1

> Question 1: How many students, from `All Students (2016)` (i.e., from all the students that took AP exams in 2016), got each score (from 1 to 5) across different exam subjects?

1. start with the object you created when you `read_csv()`
1. filter out Scores that are "All" and "Average" (keep scores that are from 1 to 5)
1. start a ggplot with the following in mind: score is your sequential variable, `All Students (2016)` is your numeric variable (with student counts), and `Exam Subject` is your categorical variable.
1. use `geom_line` to draw a line plot
1. make any other adjustments to make your plot clearer (e.g., use `facet_wrap`, change scales, add a caption)

```{r fig.height=20}
# ENTER YOUR CODE HERE
```

# Data Wrangling Part 2

> Question 2: What is the distribution of Average scores across exam subjects and gender (male vs. female)?

First, create a new dataframe with exam subject and average scores for male and female students.
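
As a rough sketch of the filter-then-select pattern this step relies on, the example below uses a small inline tibble holding four rows copied from `exams.csv`, rather than the full file read with `read_csv()`. The object names `exams_sample` and `average_by_gender` are hypothetical, chosen only for illustration, and this is one possible way to structure the step rather than a required solution.

```r
library(dplyr)
library(tibble)

# Toy data: four rows and four columns copied from exams.csv (assumption:
# the real data frame has the same column names, including the spaces).
exams_sample <- tribble(
  ~`Exam Subject`, ~Score,    ~`Students (Male)`, ~`Students (Female)`,
  "ART HISTORY",   "All",     8344,               16526,
  "ART HISTORY",   "Average", 2.87,               2.98,
  "BIOLOGY",       "All",     90525,              141451,
  "BIOLOGY",       "Average", 3.01,               2.73
)

# Keep only the "Average" rows, then keep the three columns of interest.
# Column names that contain spaces or parentheses must be wrapped in backticks.
average_by_gender <- exams_sample %>%
  filter(Score == "Average") %>%
  select(`Exam Subject`, `Students (Male)`, `Students (Female)`)

average_by_gender
```

After the filter, the male and female columns hold average scores rather than student counts, which is why the "All" rows drop out of the result.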
First you need to filter your original data to keep Score that is equal to "Average" only -- when you do, you are keeping rows that display average scores only (instead of student counts). Then use `select()` to keep the following columns:

- `Exam Subject`
- `Students (Male)`
- `Students (Female)`

```{r}
# create a new data frame that holds the results of:
# start with the original data you read through read_csv and then
# filter the data to keep only Score that is equal to "Average" and then
# select Exam Subject, Students (Male), and Students (Female)
# (remember to use backticks for column names inside select())

# inspect your data to make sure it looks good
```

Create a pivoted dataframe, starting with the selected dataframe you just created, and `pivot_longer()` the columns "Students (Male)" and "Students (Female)". Make any changes to the new gender column (e.g., clean it up so it only says "male" or "female").

```{r}
# ENTER YOUR CODE HERE
```

You should now have a tidy dataframe with the following three variables: exam subject, gender, and average score.

# Data Visualization 2

To answer question 2, plot a line plot of "Average" scores by student count, across gender:

1. start with the tidy dataframe you created in the previous section
1. start a ggplot with the following in mind: gender is your "sequential" variable, your average score is your numeric variable, and `Exam Subject` is your categorical variable for group mapping.
1. use `geom_line` to draw a line plot
1. use `facet_wrap` to split your plot into subplots by `Exam Subject`
1. make any other adjustments to make your plot clearer (e.g., add a caption, change number of columns in your facet_wrap, add geom_label)

```{r fig.height=30}
# ENTER YOUR CODE HERE
```

data-viz-challenge-4-FatimahAldhamen-main/data/exams.csv

Exam Subject,Score,Students (11th Grade),Students (12th Grade),Students (Male),Students (Female),Students (White),Students (Black),Students (Hispanic/Latino),Students (Asian),Students (American Indian/Alaska Native),Students (Native Hawaiian/Pacific Islander),Students (Two or More Races),All Students (2016)
ART HISTORY,5,897,1260,815,1889,1631,34,321,514,5,5,154,2704
ART HISTORY,4,1835,2608,1800,3787,3290,132,918,877,13,8,273,5587
ART HISTORY,3,2311,3282,2283,4657,3898,287,1347,992,19,10,307,6940
ART HISTORY,2,2252,3248,2374,4494,3211,450,1972,809,15,21,280,6868
ART HISTORY,1,901,1352,1072,1699,942,307,1074,292,13,10,87,2771
ART HISTORY,All,8196,11750,8344,16526,12972,1210,5632,3484,65,54,1101,24870
ART HISTORY,Average,2.95,2.93,2.87,2.98,3.11,2.29,2.55,3.15,2.72,2.57,3.12,2.94
BIOLOGY,5,5769,6396,8118,6776,8585,153,788,4443,11,7,733,14894
BIOLOGY,4,18585,21195,22244,25943,29999,976,3966,10367,75,49,2139,48187
BIOLOGY,3,30119,34631,30786,47150,46377,3305,10511,13051,183,97,3477,77936
BIOLOGY,2,25986,29963,21590,45639,31140,6083,16454,9477,268,136,2790,67229
BIOLOGY,1,7936,12645,7787,15943,6098,4935,9157,2057,146,55,734,23730
BIOLOGY,All,88395,104830,90525,141451,122199,15452,40876,39395,683,344,9873,231976
BIOLOGY,Average,2.87,2.8,3.01,2.73,3.03,2.05,2.29,3.14,2.32,2.47,2.93,2.84
CALCULUS AB,5,27199,38937,39784,31619,42748,1405,6432,16956,87,78,2883,71403
CALCULUS AB,4,15229,33814,25936,25041,31455,1597,6189,9069,72,66,1993,50977
CALCULUS AB,3,13400,36373,25532,25707,31004,1960,7599,8105,108,73,1901,51239
CALCULUS AB,2,6749,21469,14222,14728,16824,1386,4912,4276,87,42,1106,28950
CALCULUS AB,1,17014,72777,43519,48368,42029,9031,24615,11013,367,192,3512,91887
CALCULUS AB,All,79591,203370,148993,145463,164060,15379,49747,49419,721,451,11395,294456
CALCULUS AB,Average,3.36,2.73,3.03,2.84,3.1,2.02,2.29,3.34,2.2,2.55,2.97,2.94
CALCULUS BC,5,17903,33010,34020,20640,28740,764,3749,18354,47,44,2281,54660
CALCULUS BC,4,4255,12725,9912,7645,9814,403,1815,4577,26,26,720,17557
CALCULUS BC,3,4361,14667,10632,8957,10878,615,2411,4569,35,29,836,19589
CALCULUS BC,2,1332,5063,3490,3068,3453,274,981,1476,19,14,271,6558
CALCULUS BC,1,2649,11301,7480,6851,6612,913,3086,2917,35,18,592,14331
CALCULUS BC,All,30500,76766,65534,47161,59497,2969,12042,31893,162,131,4700,112695
CALCULUS BC,Average,4.1,3.67,3.91,3.68,3.85,2.94,3.18,4.07,3.19,3.49,3.81,3.81
CHEMISTRY,5,8160,3434,9378,4645,6961,142,650,5438,17,14,593,14023
CHEMISTRY,4,12850,6159,12543,9260,12189,337,1399,6605,21,21,923,21803
CHEMISTRY,3,23413,12322,20487,19363,23358,1105,3783,9441,66,40,1609,39850
CHEMISTRY,2,21081,12301,16709,19877,20626,1829,5431,6637,103,48,1545,36586
CHEMISTRY,1,16630,13108,13464,19066,13626,3693,9305,3985,158,74,1327,32530
CHEMISTRY,All,82134,47324,72581,72211,76760,7106,20568,32106,365,197,5997,144792
CHEMISTRY,Average,2.69,2.46,2.83,2.45,2.72,1.79,1.96,3.09,2,2.25,2.65,2.64
CHINESE LANGUAGE & CULTURE,5,2021,1181,2597,3525,67, ,16,5763,2,6,75,6122
CHINESE LANGUAGE & CULTURE,4,554,513,875,847,108,2,18,1459, ,1,68,1722
CHINESE LANGUAGE & CULTURE,3,450,853,825,814,396,25,75,959,3,2,114,1639
CHINESE LANGUAGE & CULTURE,2,64,226,151,168,137,18,37,95, , ,23,319
CHINESE LANGUAGE & CULTURE,1,84,325,240,214,207,48,64,91,1,3,25,454
CHINESE LANGUAGE & CULTURE,All,3173,3098,4688,5568,915,93,210,8367,6,12,305,10256
CHINESE LANGUAGE & CULTURE,Average,4.38,3.65,4.16,4.31,2.66,1.8,2.45,4.52,3.33,3.58,3.48,4.24
COMPUTER SCIENCE A,5,3859,3799,8880,2252,5396,112,563,4310,16,7,528,11132
COMPUTER SCIENCE A,4,3898,4457,8537,2500,5676,190,787,3648,12,7,531,11037
COMPUTER SCIENCE A,3,4470,5404,9589,2922,6470,371,1248,3610,22,16,551,12511
COMPUTER SCIENCE A,2,2254,3238,5160,1641,3468,299,878,1759,6,14,287,6801
COMPUTER SCIENCE A,1,4289,6528,9571,3327,5688,1055,2780,2586,32,27,518,12898
COMPUTER SCIENCE A,All,18770,23426,41737,12642,26698,2027,6256,15913,88,71,2415,54379
COMPUTER SCIENCE A,Average,3.04,2.82,3.05,2.9,3.06,2.02,2.28,3.34,2.7,2.34,3.11,3.01
MACROECONOMICS,5,3735,14744,12731,7104,11503,286,1532,5469,26,19,777,19835
MACROECONOMICS,4,4689,22905,17212,11846,16827,891,3534,6271,51,38,1115,29058
MACROECONOMICS,3,2955,16498,11458,8933,11561,882,3268,3614,39,31,781,20391
MACROECONOMICS,2,2772,18089,11685,10136,11396,1211,4681,3322,61,34,835,21821
MACROECONOMICS,1,2612,29739,15378,18396,11789,3630,12878,3674,91,69,1093,33774
MACROECONOMICS,All,16763,101975,68464,56415,63076,6900,25893,22350,268,191,4601,124879
MACROECONOMICS,Average,3.25,2.75,3,2.63,3.08,1.98,2.08,3.29,2.48,2.5,2.92,2.83
MICROECONOMICS,5,2645,7343,7501,3390,6022,118,628,3541,7,4,437,10891
MICROECONOMICS,4,3898,14127,12051,7245,11771,397,1664,4453,27,27,738,19296
MICROECONOMICS,3,3141,12072,9472,6691,9896,558,1994,2840,30,25,603,16163
MICROECONOMICS,2,1817,7728,5449,4689,5816,493,1720,1582,23,13,377,10138
MICROECONOMICS,1,1843,11321,6848,7233,5827,1433,4241,1771,33,19,468,14081
MICROECONOMICS,All,13344,52591,41321,29248,39332,2999,10247,14187,120,88,2623,70569
MICROECONOMICS,Average,3.28,2.97,3.19,2.82,3.16,2.09,2.29,3.45,2.6,2.82,3.11,3.04
ENGLISH LANGUAGE & COMPOSITION,5,48747,6352,24563,32488,35886,1242,4911,11441,66,61,2683,57051
ENGLISH LANGUAGE & COMPOSITION,4,81534,9215,37778,56609,60784,3119,11525,13110,176,129,4367,94387
ENGLISH LANGUAGE & COMPOSITION,3,127496,12595,54104,92051,87962,7800,25570,15701,404,262,6616,146155
ENGLISH LANGUAGE & COMPOSITION,2,151005,14305,59531,113998,82622,17020,48304,14252,791,418,7621,173529
ENGLISH LANGUAGE & COMPOSITION,1,57275,7086,25651,42584,18019,14036,27980,3307,495,202,2539,68235
ENGLISH LANGUAGE & COMPOSITION,All,466057,49553,201627,337730,285273,43217,118290,57811,1932,1072,23826,539357
ENGLISH LANGUAGE & COMPOSITION,Average,2.81,2.87,2.88,2.77,3.05,2.09,2.3,3.26,2.24,2.47,2.88,2.81
ENGLISH LITERATURE & COMPOSITION,5,3428,25321,9723,19402,18906,516,2371,5476,33,22,1472,29125
ENGLISH LITERATURE & COMPOSITION,4,8032,61378,25086,45300,46948,2063,7740,9692,120,61,3081,70386
ENGLISH LITERATURE & COMPOSITION,3,14192,100653,43479,73374,73232,5882,18682,12405,276,169,4981,116853
ENGLISH LITERATURE & COMPOSITION,2,18035,112113,49228,83999,63309,13838,36649,11391,602,301,5530,133227
ENGLISH LITERATURE & COMPOSITION,1,4948,41789,20079,28035,11719,11697,19090,2638,298,126,1613,48114
ENGLISH LITERATURE & COMPOSITION,All,48635,341254,147595,250110,214114,33996,84532,41602,1329,679,16677,397705
ENGLISH LITERATURE & COMPOSITION,Average,2.73,2.75,2.7,2.78,2