There is an Excel file attached. It has 4 tabs (ANOVA, Regression, CHI, and Cleaning Data with Outlier) with a total of 20 questions.
HW11 MGMT 650 Fall 21
Week 11 Homework Questions (Last updated 8/16/2021)

Chi Square

An analyst at a local bank wonders if the age distribution of customers coming for service at his branch in town is the same as at a branch located near the mall. He selects 100 transactions at random from each branch and researches the age information for the associated customer. These are the data:

Age        Less than 30   30-55   56 or older   Total
In town    20             40      40            100
Mall       30             50      20            100
Total      50             90      60            200

1) What is the null hypothesis if you want to check if the age patterns of customers are independent of bank location?
2) What are the expected numbers for each cell in a 2 by 3 table if the null hypothesis is true?
3) Use the chi square test to accept or reject the null hypothesis. What is the chi square test statistic?
4) What is the chi square critical value and how many degrees of freedom does it have? Assume alpha is .05.
5) What do you conclude?

ANOVA

Saeko owns a yarn shop and wants to expand her color selection. Before she expands her colors, she wants to find out if her customers prefer one brand over another brand. Specifically, she is interested in three different types of bison yarn. As an experiment, she randomly selected 21 different days and recorded the sales of each brand. At the .10 significance level, can she conclude that there is a difference in preference between the brands?

         Misa's Bison   Yak-et-ty-Yaks   Buffalo Yarns
         799            776              799
         784            640              931
         807            822              794
         675            856              920
         795            616              731
         875            893              837
Total    4,735.00       4,603.00         5,012.00

6) What is the null hypothesis? What is the alternative hypothesis? What is the level of significance?
7) Use Tools - Data Analysis - ANOVA: Single Factor to find the F statistic:
8) From the ANOVA output: What is the F value? What is the F critical value?
9) What is your decision? Explain in statistical terms.

Regression

Studies have shown that the frequency with which shoppers browse Internet retailers is related to the frequency with which they actually purchase products and/or services online.
The following data show respondents' age and answer to the question “How many minutes do you browse online retailers per year?”

Age (X)   Time (Y)
16        307
17        285
19        267
22        343
22        393
22        287
22        253
28        364
28        251
28        248
28        433
30        319
33        226
34        321
35        336
35        302
35        476
36        395
39        473
39        342
40        539
42        455
43        326
44        565
48        385
50        590
50        507
51        333
52        426
54        261
58        625
59        252
60        615

10) Use Data > Data Analysis > Correlation to compute the correlation, checking the Labels checkbox.
11) Use the Excel function =CORREL to compute the correlation. If the answers for #10 and #11 do not agree, there is an error.

The strength of the correlation motivates further examination.

12) a) Insert Scatter (X, Y) plot linked to the data on this sheet with Age on the horizontal (X) axis.
    b) Add to your chart: the chart name, vertical axis label, and horizontal axis label.
    c) Complete the chart by adding Trendline and checking boxes

Read directly from the chart:
13) a) Intercept =
    b) Slope =
    c) R² =

Perform Data > Data Analysis > Regression.
14) Highlight the Y-intercept with yellow. Highlight the X variable in blue. Highlight the R Square in orange.
15) Use Excel to predict the number of minutes spent by a 22-year-old shopper. Enter = followed by the regression formula. Enter the intercept and slope into the formula by clicking on the cells in the regression output with the results.
16) Is it appropriate to use this data to predict the amount of time that a 9-year-old will be on the Internet? If yes, what is the amount of time? If no, why?

Cleaning Data with Outlier

17) On this worksheet, make an XY scatter plot linked to the following data:

X      Y
1.01   2.8482
1.48   4.2772
1.8    4.788
1.81   5.3757
1.07   2.5252
1.53   3.0906
1.46   4.3362
1.38   3.2016
1.77   4.3542
1.88   4.8692
1.32   3.8676
1.75   3.9375
1.94   5.7424
1.19   2.4752
1.31   26.2
1.56   4.5708
1.16   2.842
1.22   2.44
1.72   5.1256
1.45   4.3355
1.43   4.2471
1.19   3.5343
2      5.46
1.6    3.84
1.58   3.8552

18) Add trendline, regression equation and r squared to the plot. Add this title.
("Scatterplot of X and Y Data")
19) The scatterplot reveals a point outside the point pattern. Copy the data to a new location in the worksheet. You now have 2 sets of data. Data that are more than 1.5 IQR below Q1 or more than 1.5 IQR above Q3 are considered outliers and must be investigated. It was determined that the outlying point resulted from data entry error. Remove the outlier in the copy of the data. Make a new scatterplot linked to the cleaned data without the outlier, and add title ("Scatterplot without Outlier"), trendline, and regression equation label.

X      Y
1.01   2.8482
1.48   4.2772
1.8    4.788
1.81   5.3757
1.07   2.5252
1.53   3.0906
1.46   4.3362
1.38   3.2016
1.77   4.3542
1.88   4.8692
1.32   3.8676
1.75   3.9375
1.94   5.7424
1.19   2.4752
1.56   4.5708
1.16   2.842
1.22   2.44
1.72   5.1256
1.45   4.3355
1.43   4.2471
1.19   3.5343
2      5.46
1.6    3.84
1.58   3.8552

20) Compare the regression equations of the two plots. How did removal of the outlier affect the slope and R²? Explain why the slope and R Square change the way they did.
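For the Chi Square questions, the expected counts and the test statistic can be cross-checked outside Excel. What follows is a minimal Python sketch, assuming the observed counts for the two branches and three age groups given in the Chi Square table; the variable names are illustrative and not part of the assignment. The critical value line relies on the fact that a chi-square distribution with 2 degrees of freedom is exponential, so its upper-tail cutoff is -2 ln(alpha); any other degrees of freedom would need a table or a statistics library.

```python
import math

# Observed counts from the Chi Square table: rows are branches, columns are age groups.
observed = [
    [20, 40, 40],  # In town: less than 30, 30-55, 56 or older
    [30, 50, 20],  # Mall
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: this closed form is valid only because dof == 2 here.
alpha = 0.05
critical_value = -2 * math.log(alpha)

print("expected counts:", expected)
print("chi-square:", round(chi_square, 4), "with", dof, "degrees of freedom")
print("critical value:", round(critical_value, 4))
print("reject the null hypothesis:", chi_square > critical_value)
```

The same figures should appear if the expected table is built by hand in the worksheet, which makes this a convenient check on cell formulas.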
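For the ANOVA questions, the F statistic that Excel's ANOVA: Single Factor tool reports can be reproduced from the sums of squares. The sketch below is one way to do that in Python, assuming the three columns of six sales figures listed in the ANOVA table; the dictionary and variable names are made up for illustration. It does not compute the F critical value, which would still come from the Excel output or an F table.

```python
# Daily sales per brand, copied from the ANOVA table (six observations each).
sales = {
    "Misa's Bison": [799, 784, 807, 675, 795, 875],
    "Yak-et-ty-Yaks": [776, 640, 822, 856, 616, 893],
    "Buffalo Yarns": [799, 931, 794, 920, 731, 837],
}

groups = list(sales.values())
all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Between-group sum of squares: group size times squared distance
# of each group mean from the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: squared distance of each value from its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1               # number of groups minus 1
df_within = len(all_values) - len(groups)  # observations minus number of groups

# F is the ratio of the between-group mean square to the within-group mean square.
f_statistic = (ss_between / df_between) / (ss_within / df_within)

print("df between:", df_between, "df within:", df_within)
print("F statistic:", round(f_statistic, 4))
```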
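For the Regression questions, the correlation, slope, and intercept that Excel reports can be checked with the ordinary least-squares formulas. This is a rough Python sketch, assuming the 33 (Age, Time) pairs are read as a two-digit age followed by a three-digit time; `predict` and `in_range` are hypothetical helper names, not Excel functions. The range check reflects the usual caution that a fitted line is only supported inside the span of ages that were observed.

```python
import math

# (Age, Time) pairs from the Regression data.
ages = [16, 17, 19, 22, 22, 22, 22, 28, 28, 28, 28, 30, 33, 34, 35, 35, 35,
        36, 39, 39, 40, 42, 43, 44, 48, 50, 50, 51, 52, 54, 58, 59, 60]
minutes = [307, 285, 267, 343, 393, 287, 253, 364, 251, 248, 433, 319, 226,
           321, 336, 302, 476, 395, 473, 342, 539, 455, 326, 565, 385, 590,
           507, 333, 426, 261, 625, 252, 615]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(minutes) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in minutes)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, minutes))

slope = sxy / sxx                      # least-squares slope
intercept = mean_y - slope * mean_x    # least-squares intercept
r = sxy / math.sqrt(sxx * syy)         # Pearson correlation, comparable to =CORREL
r_squared = r ** 2


def predict(age):
    """Predicted minutes for a given age from the fitted line."""
    return intercept + slope * age


def in_range(age):
    """True if the age lies within the observed ages (no extrapolation)."""
    return min(ages) <= age <= max(ages)


print("r:", round(r, 4), "R squared:", round(r_squared, 4))
print("intercept:", round(intercept, 4), "slope:", round(slope, 4))
print("prediction for age 22:", round(predict(22), 2), "in range:", in_range(22))
print("age 9 in range:", in_range(9))
```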
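The 1.5 IQR rule quoted in question 19 can be applied mechanically before deciding which row to remove. The following Python sketch is one possible check, under two assumptions: the flattened X/Y pairs split so that X has at most two decimals, and the rule is applied to the Y column only. It uses the inclusive quartile method, which corresponds to Excel's QUARTILE.INC; a different quartile convention shifts the fences slightly.

```python
import statistics

# (X, Y) pairs from the Cleaning Data with Outlier sheet, including the suspect row.
points = [
    (1.01, 2.8482), (1.48, 4.2772), (1.8, 4.788), (1.81, 5.3757),
    (1.07, 2.5252), (1.53, 3.0906), (1.46, 4.3362), (1.38, 3.2016),
    (1.77, 4.3542), (1.88, 4.8692), (1.32, 3.8676), (1.75, 3.9375),
    (1.94, 5.7424), (1.19, 2.4752), (1.31, 26.2), (1.56, 4.5708),
    (1.16, 2.842), (1.22, 2.44), (1.72, 5.1256), (1.45, 4.3355),
    (1.43, 4.2471), (1.19, 3.5343), (2, 5.46), (1.6, 3.84),
    (1.58, 3.8552),
]

ys = [y for _, y in points]

# Quartiles of Y; method="inclusive" interpolates like Excel's QUARTILE.INC.
q1, _, q3 = statistics.quantiles(ys, n=4, method="inclusive")
iqr = q3 - q1

# Fences: anything beyond 1.5 IQR from the quartiles is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [(x, y) for x, y in points if y < lower_fence or y > upper_fence]
cleaned = [p for p in points if p not in outliers]

print("Q1:", q1, "Q3:", q3, "IQR:", round(iqr, 4))
print("fences:", round(lower_fence, 4), round(upper_fence, 4))
print("flagged:", outliers)
print("rows kept:", len(cleaned))
```

The `cleaned` list can then be compared against the copy of the data in the worksheet to confirm that only the flagged row was removed.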