STAT 250 Spring 2021 Data Analysis Assignment 1 Your solutions document should include the following items. Points will be deducted if the following are not included. 1. Type your Name and STAT 250...

I attached the file with the assignment I need done. It is an entry level statistics assignment.


STAT 250 Spring 2021 Data Analysis Assignment 1
Your solutions document should include the following items. Points will be deducted if the following are not included.
1. Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx) right justified and then Data Analysis Assignment #1 centered on the top of page 1 below your name to begin your solutions document.
2. Number your pages across your entire solutions document.
3. Your solutions document should include the ANSWERS ONLY with each answer labeled by its corresponding number and subpart. Keep the answers in order.
4. Generate all requested graphs and tables using StatCrunch.
5. Upload your solutions document onto Blackboard as a Word (docx) file or pdf file using the link provided by your instructor. It is your responsibility for uploading a readable file.
6. You may not work with other individuals on this assignment. It is an honor code violation if you do.
Elements of good technical writing:
Use complete and coherent sentences to answer the questions.
Graphs must be appropriately titled and should refer to the context of the question.
Graphical displays must include labels with units if appropriate for each axis.
Units should always be included when referring to numerical values.
When making a comparison you must use comparative language, such as “greater than”, “less than”, or “about the same as.”
Ensure that all graphs and tables appear on one page and are not split across two pages.
Type all mathematical calculations when directed to compute an answer ‘by-hand.’
Pictures of actual handwritten work are not accepted on this assignment.
When writing mathematical expressions into your solutions document you may use either an equation editor or common shortcuts such as: √x can be written as sqrt(x), p̂ can be written as p-hat, x̄ can be written as x-bar.
Problem 1: Streaming Services Survey
A marketing researcher wants to investigate whether there is any relationship between preferred streaming service and age in order to properly advertise a particular streaming service. A sample of 552 individuals who responded yes to subscribing to a streaming service were asked their most watched service and age. The options listed for streaming subscriptions were: Netflix, Hulu, Amazon Prime Video, Disney Plus, and Other. The individuals were grouped into three age ranges: 18-35, 36-53 and 54-71. The data set found in our StatCrunch Group presents these 552 individuals’ responses. The data set is called “Streaming Services Survey.”
a) Use StatCrunch to create a one-way table for the variable “Streaming Subscription” using both counts and percentages. Select Stat > Tables > Frequency. Select “Streaming Subscription” in the Select Column(s) box, and select both ‘Frequency’ and ‘Percent of total’ in the Statistic(s) box by holding down the Ctrl Key (Command Key on Macs) when making these selections. Copy your table into your solutions document and then manually round the values in the “Percent of total” column to two decimal places in the StatCrunch table that you have copied into your solutions document.
b) Interpret your findings from the table in 1(a) by identifying the subscription service with the largest and smallest percentage. Use complete sentences with context and include the streaming subscription name and percentage in the sentences.
c) Use StatCrunch to generate a two-way table for the variables “Streaming Subscription” and “Age”. Go to Stat > Tables > Contingency > With Data (since you have the raw data in StatCrunch). Select “Streaming Subscription” as your row variable and “Age” as your column variable. In the “Display:” box, select only “Percent of Total.” Lastly, unclick (or deselect) “Chi-Square test for independence” since it is highlighted by default by holding the Ctrl key and clicking on it.
Copy the entire table into your solutions document.
d) Identify the largest and smallest percentages of individuals sampled (552) from the two-way table you constructed in part (c) (i.e. the percentages found inside the table). Explain the references of those percentages in both context and calculations. Provide both the value of the variables that make up the cell and the percentage in your answer in two complete sentences.
e) What values are the same when looking at both your one-way table from part (a) and your two-way table from part (c)? Be specific if referencing rows or columns.
f) Now, create two more two-way tables keeping “Streaming Subscription” as your row variable and “Age” as your column variable. One table needs to include row percentages and the other needs to include column percentages. To do this, select “Row percent” in the display box for the first table and “Column percent” for the second table. Include both tables in your solutions document.
g) Specifically interpret the meaning of the row percentage found in the “Netflix” and “18-35” cell. Note that there are 74 observations in that cell.
h) Now, specifically interpret the meaning of the column percentage found in the “Netflix” and “18-35” cell. Note that there are 74 observations in that cell.
Problem 2: Streaming Services Survey
At the next board meeting, the marketing researcher will present the findings from the survey to the executives of the subscription service in an organized and easy to view manner. Use the “Streaming Services Survey” data set posted in our StatCrunch group to answer the following questions.
a) Using the variable named “Streaming Subscription”, produce a relative frequency Pareto chart using Graph > Bar Plot > With Data. Remember to order the bars by changing “Order by” to Count Descending. Properly title and label your graph and copy it into your solutions document.
b) Using the variable “Streaming Subscription”, produce a frequency bar chart.
Properly title and label your graph and copy it into your solutions document.
c) Using the variable “Streaming Subscription”, produce a Pie Chart using Graph > Pie Chart > With Data. Add an appropriate title and copy this entire graph including the legend into your solutions document.
d) Using the variable “Streaming Subscription”, produce a Pie Chart only for individuals in the “18-35” age range using Graph > Pie Chart > With Data. To the right of the box marked “Where:” click the “Build” button. Under “Columns:” click on “Age” and click “Add Column” to bring it to the box. Next, click the equal sign (=) on the calculator section. Then, click the arrow at right end of the “Values:” box to get a drop down menu and select “18-35” and click “Add.” Finally, click Okay. Add an appropriate title and copy this entire graph including the legend into your solutions document.
e) Explain how the percentages displayed for “Disney+” differ based on the calculations from both Pie Charts in part (c) and part (d).
f) Now produce a grouped relative frequency bar chart (to copy to your solutions document) by following the directions below. Go to Graph > Bar Plot > With Data. For this grouped bar chart, graph the variable “Streaming Subscription” and group by “Age.” To group by a particular variable click the arrow next to the “Group by:” box (the third box down) and from the drop down menu select the variable you are asked to group by. In the “Type:” box (5th box down from the top) choose “Relative Frequency (within category).” Title this graph clearly. You may keep the default labels for the x and y-axis. Copy this graph including the legend into your solutions document.
g) For this next grouped bar chart, graph the variable “Age” and group by “Streaming Subscription.” Again, in the Type box (5th box down from the top) choose “Relative Frequency (within category).” Title this graph clearly. You may keep the default labels for the x and y-axis.
Copy this graph including the legend into your solutions document.
h) The two graphs you made in 2(f) and 2(g) are another representation of row and column percentage two-way tables. If we consider the row variable to be “Streaming Subscription” and the column variable to be “Age,” which graph would correspond to a row percentage two-way table and which graph would correspond to the column percentage two-way table? Answer in two complete sentences. You may create these two-way tables for this data set to help you answer the questions, but the tables do not need to be copied into your solutions document.
i) Which graph that you created in Problem 2 would you recommend the marketing researcher present to the executives at the meeting? Include your choice and a compelling argument in two to three sentences.
Problem 3: Age at Death of Australian Males
The Australian Bureau of Statistics collects data on the age of death for every individual in their country yearly. A sample of 1000 male ages at death were taken from the year 2011 and data are provided under the title “Sample of Ages at Death” in StatCrunch.
a) Create a frequency histogram for the variable “Age” by using Graph > Histogram. Properly title and label your graph and copy it into your solutions document.
b) Interpret the shape of this distribution in one complete sentence.
c) Use StatCrunch to obtain the sample size (n), mean, and standard deviation for the “Age” variable by using Stat > Summary Stats > Columns. Note: in the Statistics box, select the summary statistics listed above in the exact order given. Copy the entire table into your solutions document and manually round each value to two decimal places when necessary.
d) Use StatCrunch to obtain the five number summary and the IQR for the “Age” variable (the five number summary includes Min, Q1, Median, Q3, Max). Go to Stat > Summary Stats > Columns to obtain these values.
Note: in the Statistics box, select the summary statistics listed above in the exact order given. Copy the entire table into your solutions document.
e) Choose the appropriate summary statistics for center and spread (presented in either 3(c) or 3(d)) based on your stated shape of the distribution in 3(b). Please define the names of the statistics and their values in your answer.
f) Use your summary statistics from part 3(d) and determine the fences used to mathematically identify outliers for the “Age” variable. To do this, type all steps in your calculations manually, including how you obtained the upper and lower fences.
g) Construct a horizontally oriented boxplot of the “Age” variable by using Graph > Boxplot. To do this, under “Other options:” click the “Draw boxes horizontally” button. Properly title and label and copy this graph into your solutions document.
h) How many outliers do you identify (please use both the boxplot and your results from 3(f) to answer this question)? Hint: you can sort your data by “Age” (using Data > Sort in StatCrunch) to get the correct number or hover your
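The fence calculation asked for in 3(f) can be sketched in Python. This is only an illustration, assuming the usual 1.5 × IQR rule for fences; the quartile values and the short list of ages below are hypothetical placeholders, not values from the “Sample of Ages at Death” data set.

```python
def outlier_fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule (assumed here)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical quartiles, chosen only to show the arithmetic.
q1, q3 = 70.0, 88.0
lower, upper = outlier_fences(q1, q3)

# Hypothetical ages; a value is flagged only if it lies strictly beyond a fence.
ages = [25, 40, 43, 71, 90, 101]
outliers = [age for age in ages if age < lower or age > upper]

print(f"IQR = {q3 - q1}, lower fence = {lower}, upper fence = {upper}")
print("outliers:", outliers)
```

With the real Q1 and Q3 from the StatCrunch summary table substituted in, the same two lines of arithmetic give the fences to type out by hand.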
Answered Same Day Feb 20, 2021


Pooja answered on Feb 21 2021
Problem 1: Streaming Services Survey
A marketing researcher wants to investigate whether there is any relationship between preferred streaming service and age in order to properly advertise a particular streaming service. A sample of 552 individuals who responded yes to subscribing to a streaming service were asked their most watched service and age. The options listed for streaming subscriptions were: Netflix, Hulu, Amazon Prime Video, Disney Plus, and Other. The individuals were grouped into three age ranges: 18-35, 36-53 and 54-71. The data set found in our StatCrunch Group presents these 552 individuals’ responses. The data set is called “Streaming Services Survey.”
a) Use StatCrunch to create a one-way table for the variable “Streaming Subscription” using both counts and percentages. Select Stat > Tables > Frequency. Select “Streaming Subscription” in the Select Column(s) box, and select both ‘Frequency’ and ‘Percent of total’ in the Statistic(s) box by holding down the Ctrl Key (Command Key on Macs) when making these selections. Copy your table into your solutions document and then manually round the values in the “Percent of total” column to two decimal places in the StatCrunch table that you have copied into your solutions document.
Frequency table results for Streaming Subscription:
Count = 552

Streaming Subscription    Frequency    Percent of Total
Amazon                    112          20.289855
Disney+                   69           12.5
Hulu                      136          24.637681
Netflix                   185          33.514493
Other                     50           9.057971
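As a cross-check on the StatCrunch output, the percentages can be recomputed from the counts and rounded to two decimal places as part (a) requests. The following is a minimal Python sketch; the counts are copied from the table above, and the variable names are illustrative only.

```python
# Counts copied from the one-way frequency table above.
counts = {"Amazon": 112, "Disney+": 69, "Hulu": 136, "Netflix": 185, "Other": 50}

total = sum(counts.values())  # sample size, 552

# Percent of total for each service, rounded to two decimal places.
percents = {service: round(100 * n / total, 2) for service, n in counts.items()}

# Services with the largest and smallest share of the sample.
largest = max(percents, key=percents.get)
smallest = min(percents, key=percents.get)

for service, pct in percents.items():
    print(f"{service:<8} {counts[service]:>4} {pct:>6.2f}%")
print("largest:", largest, "| smallest:", smallest)
```

The rounded values (20.29, 12.50, 24.64, 33.51, 9.06) agree with the “Total” column of the two-way table in part (c).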
b) Interpret your findings from the table in 1(a) by identifying the subscription service with the largest and smallest percentage. Use complete sentences with context and include the streaming subscription name and percentage in the sentences.
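Netflix has the largest percentage: 33.51% of the 552 individuals sampled named it as their most-watched streaming service. “Other” has the smallest percentage: only 9.06% of the individuals sampled chose a service outside the four named options.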
c) Use StatCrunch to generate a two-way table for the variables “Streaming Subscription” and “Age”.
Go to Stat > Tables > Contingency > With Data (since you have the raw data in StatCrunch). Select “Streaming Subscription” as your row variable and “Age” as your column variable. In the “Display:” box, select only “Percent of Total.” Lastly, unclick (or deselect) “Chi-Square test for independence” since it is highlighted by default by holding the Ctrl key and clicking on it. Copy the entire table into your solutions document.
Contingency table results:
Rows: Streaming Subscription
Columns: Age
Cell format: Count (Percent of total)

           18-35           36-53           54-71           Total
Amazon     44 (7.97%)      46 (8.33%)      22 (3.99%)      112 (20.29%)
Disney+    52 (9.42%)      11 (1.99%)      6 (1.09%)       69 (12.5%)
Hulu       26 (4.71%)      47 (8.51%)      63 (11.41%)     136 (24.64%)
Netflix    74 (13.41%)     59 (10.69%)     52 (9.42%)      185 (33.51%)
Other      12 (2.17%)      18 (3.26%)      20 (3.62%)      50 (9.06%)
Total      208 (37.68%)    181 (32.79%)    163 (29.53%)    552 (100%)
d) Identify the largest and smallest percentages of individuals sampled (552) from the two-way table you constructed in part (c) (i.e. the percentages found inside the table). Explain the references of those percentages in both context and calculations. Provide both the value of the variables that make up the cell and the percentage in your answer in two complete sentences.
The largest percentage inside the table is 13.41%: 74 of the 552 individuals sampled are in the 18-35 age group and most watch Netflix (74/552 ≈ 13.41%). The smallest percentage inside the table is 1.09%: only 6 of the 552 individuals sampled are in the 54-71 age group and most watch Disney+ (6/552 ≈ 1.09%).
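The largest and smallest cells can be confirmed by dividing each count in the part (c) table by the sample size. The sketch below assumes the counts as printed in that table; the dictionary layout and names are illustrative choices, not StatCrunch output.

```python
ages = ["18-35", "36-53", "54-71"]

# Cell counts copied from the two-way table in part (c); one list per row.
counts = {
    "Amazon":  [44, 46, 22],
    "Disney+": [52, 11, 6],
    "Hulu":    [26, 47, 63],
    "Netflix": [74, 59, 52],
    "Other":   [12, 18, 20],
}

total = sum(sum(row) for row in counts.values())  # 552

# Percent of total for every (service, age) cell, rounded to two decimals.
cell_pct = {
    (service, age): round(100 * n / total, 2)
    for service, row in counts.items()
    for age, n in zip(ages, row)
}

largest_cell = max(cell_pct, key=cell_pct.get)
smallest_cell = min(cell_pct, key=cell_pct.get)

print("largest: ", largest_cell, cell_pct[largest_cell])
print("smallest:", smallest_cell, cell_pct[smallest_cell])
```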
e) What values are the same when looking at both your one-way table from part (a) and your two-way table from part (c)? Be specific if referencing rows or columns.
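The row totals of the two-way table in part (c), found in its “Total” column, are the same as the values in the one-way table in part (a). Each streaming subscription has the same count and percent of total in both tables (for example, Netflix has 185 individuals, or 33.51%), and the overall total is 552 in both.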
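This agreement can be checked by summing each row of the two-way table and comparing the result with the one-way counts. A short Python sketch, assuming the counts printed in parts (a) and (c):

```python
# One-way counts from part (a).
one_way = {"Amazon": 112, "Disney+": 69, "Hulu": 136, "Netflix": 185, "Other": 50}

# Two-way cell counts from part (c), ordered 18-35, 36-53, 54-71.
two_way = {
    "Amazon":  [44, 46, 22],
    "Disney+": [52, 11, 6],
    "Hulu":    [26, 47, 63],
    "Netflix": [74, 59, 52],
    "Other":   [12, 18, 20],
}

# Row (marginal) totals of the two-way table.
row_totals = {service: sum(row) for service, row in two_way.items()}

# Column totals give the age-group margins, which the one-way table does not show.
column_totals = [sum(col) for col in zip(*two_way.values())]

margins_match = row_totals == one_way
print("row totals match one-way table:", margins_match)
print("age-group totals:", column_totals)
```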
f) Now, create two more two-way tables keeping “Streaming Subscription” as your row variable and “Age” as your column variable. One table needs to include row percentages and the other needs to include column percentages. To do this, select “Row percent” in the display box for the first table and “Column percent” for the second table. Include both tables in your solutions document.
Contingency table results:
Rows: Streaming Subscription
Columns: Age
Cell format: Count (Row...
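The row and column percentages for the “Netflix” and “18-35” cell, which parts (g) and (h) ask about, can be worked out from the counts in the part (c) table. The sketch below is a hedged illustration of that arithmetic, not a reproduction of the StatCrunch tables; the variable names are assumptions.

```python
# Values taken from the two-way table in part (c).
netflix_18_35 = 74      # count in the Netflix / 18-35 cell
netflix_total = 185     # row total for Netflix
age_18_35_total = 208   # column total for ages 18-35

# Row percent: share of Netflix viewers who are aged 18-35.
row_pct = round(100 * netflix_18_35 / netflix_total, 2)

# Column percent: share of 18-35 year olds who most watch Netflix.
col_pct = round(100 * netflix_18_35 / age_18_35_total, 2)

print(f"row percent:    {row_pct}%")   # 40.0%
print(f"column percent: {col_pct}%")   # 35.58%
```

The two percentages share a numerator but use different denominators, which is why they answer different questions: the row percent describes the age make-up of Netflix viewers, while the column percent describes the service preference of the 18-35 age group.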