1. A bank is experimenting with processes to improve customer service during peak periods of business (for example, during the lunch hour). Two alternative processes were implemented at two different bank branches. A random sample of 15 customers was chosen at each branch, and their waiting times were recorded. These data, and the Excel output, can be found in the Excel file Bank_waiting_time.xls. Note: ignore the sample size issue.
a. Set up the null and alternative hypotheses.
b. Using the Excel output, is there evidence that the mean waiting times for the two branches are not equal? Interpret the sample t-value and the two-tailed p-value for the test.
c. Given your answer to part b. above, which type of error are you vulnerable to, Type I or Type II? Explain in a single sentence.
d. Are the samples for this analysis “independent samples” or “related samples”? Explain briefly.
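For question 1, it can help to see where a two-sample t-value comes from. The sketch below computes the t statistic and degrees of freedom by hand, assuming the equal-variance (pooled) form of the test; the Excel output may use a different form. The function name `pooled_t` and the waiting times are made up for illustration and are not taken from Bank_waiting_time.xls.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic assuming equal population variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical waiting times in minutes (NOT the assignment data)
branch_1 = [4.0, 5.0, 6.0]
branch_2 = [7.0, 8.0, 9.0]

t_stat, df = pooled_t(branch_1, branch_2)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sign of t only reflects which branch is subtracted from which; the two-tailed p-value depends on the absolute value of t and the degrees of freedom.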
2. Integrated circuits are manufactured on silicon wafers through a process that involves a series of steps. An experiment was carried out to determine the effect on the yield of using three methods in the cleansing step. The results of the experiment can be found in the Yield-OneWay.xls Excel file. The reported values are an index of production (higher is better!).
a. Set up the null and alternative hypotheses.
b. Using the Excel file indicated above, is there evidence that the results of the cleaning methods are statistically different? In answering the question, interpret the p-value of this test.
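For question 2, the p-value in a one-way ANOVA table is derived from an F statistic that compares variation between groups with variation within groups. The sketch below shows that calculation. The function name `one_way_anova` and the yield values are hypothetical and are not taken from Yield-OneWay.xls.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of observations around their own group mean
    ss_within = sum(
        (x - statistics.mean(group)) ** 2
        for group in groups
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical yield index values for three methods (NOT the assignment data)
method_1 = [1.0, 2.0, 3.0]
method_2 = [2.0, 3.0, 4.0]
method_3 = [3.0, 4.0, 5.0]

f_stat, df_between, df_within = one_way_anova([method_1, method_2, method_3])
print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ by more than the within-group scatter would suggest, which corresponds to a small p-value.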
3. Note: There is no data or data file with this question. It is a “thought question”. Consider an athletic league that is attempting to reduce injuries. In the off-season, volunteers are randomly assigned to three different dietary regimens (high carb, high protein, and gluten free). At the same time, the volunteers are randomly assigned to two different workout regimens (short wind sprints with intense lighter weight training; longer interval training and heavier weight training). If you were using 2-way ANOVA to analyze the results of these “trials”, explain what it would mean if there were a statistically significant interactive effect between the dietary regimens and the workout regimens. In other words, describe how the “interaction” might manifest itself.
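One way to think about question 3 is to compare the effect of the workout regimen within each diet. The sketch below does this with a table of cell means. All numbers are invented purely for illustration (the question has no data), and the short labels "sprints" and "intervals" are stand-ins for the two workout regimens.

```python
# Hypothetical mean injuries per athlete for each diet/workout cell
cell_means = {
    ("high carb", "sprints"): 4.0,
    ("high carb", "intervals"): 6.0,
    ("high protein", "sprints"): 5.0,
    ("high protein", "intervals"): 7.0,
    ("gluten free", "sprints"): 8.0,
    ("gluten free", "intervals"): 3.0,
}

diets = ["high carb", "high protein", "gluten free"]

# Effect of switching from sprints to intervals, computed within each diet
workout_effect = {
    diet: cell_means[(diet, "intervals")] - cell_means[(diet, "sprints")]
    for diet in diets
}

# With no interaction, the workout effect would be the same for every diet
effects_differ = len(set(workout_effect.values())) > 1

for diet in diets:
    print(f"{diet}: workout effect = {workout_effect[diet]:+.1f}")
print("Workout effect depends on diet:", effects_differ)
```

In these invented numbers the workout effect is the same for two diets but reverses for the third. A pattern like that in sample means is what an interaction looks like; the 2-way ANOVA test decides whether it is larger than chance variation would explain.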
4. The Lottery data Excel file provides data on the topic of “playing the lottery”. It is often claimed that individuals who play the lottery are those who can least afford to do so, making the impact of the lottery “regressive”. In this context, regressive implies that lower income individuals play the lottery more than do higher income individuals.
The data for this analysis include 100 randomly chosen men from a major metropolitan area. For each individual, the following information has been recorded:
· The number of times in the preceding month the individual has played a state-sponsored lottery. This is the dependent variable in the accompanying regression analysis.
· The education of the individual, measured by the number of years of schooling completed. Note that 12 years of schooling completed means that the person has graduated from “high school”, and 16 years of schooling completed means that the person has graduated from college.
· Age in years at the person’s most recent birthday.
· The number of children currently living in the person’s household.
· The person’s annual “wage and salary” income in the previous year (reported in thousands of dollars).
Use the Excel file to address the questions below.
- Write out the regression equation, with specific intercept and slope estimates.
- For the first row of actual data only, use the independent variable values to “predict” the value of the dependent variable. For this row, also compute the “residual”.
- For the independent variables “Age” and “Income” interpret the numerical value of the slope.
- Evaluate the statistical significance of each of the four slope estimates. This can be done in a very summary way. Start out by indicating the “null value” against which you will be testing each of the four slope estimates. Then, next to the name of each independent variable, state whether the slope is “significant”, and why. The “why” should be stated in no more than a few words or a single sentence.
- For the independent variable “Income”, interpret the p-value of the estimated slope, briefly and specifically (by specifically I mean with specific reference to the numerical value).
- Interpret the R-square for this regression.
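The mechanics of the prediction and residual steps in question 4 can be sketched as follows. The intercept, slopes, and data row below are placeholders chosen for easy arithmetic; they are not the estimates or observations from the Lottery data file, and the variable names are assumptions about how the columns might be labeled.

```python
# Placeholder regression estimates (NOT taken from the Excel output)
intercept = 10.0
slopes = {
    "Education": -0.5,
    "Age": 0.1,
    "Children": 0.5,
    "Income": -0.05,  # income is measured in thousands of dollars
}

# Placeholder observation (NOT the first row of the actual data)
row = {"Education": 12, "Age": 40, "Children": 2, "Income": 50}
observed_plays = 5.0

# Predicted value: intercept plus each slope times its variable's value
predicted_plays = intercept + sum(slopes[name] * row[name] for name in slopes)

# Residual: observed value minus predicted value
residual = observed_plays - predicted_plays

print(f"predicted = {predicted_plays:.2f}, residual = {residual:.2f}")
```

A negative residual means the individual played the lottery less often than the regression equation predicts for someone with those characteristics.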