The data is attached.
chgtestscores | curriculum | income ($000) | school
0.19 | Old | 27.7 | 3
2.67 | New | 93.1 | 14
2.46 | New | 124 | 17
1.28 | New | 73.8 | 20
-0.24 | New | 32.4 | 22
0.95 | New | 32.9 | 36
2.31 | New | 86.6 | 39
0.63 | Old | 69.8 | 40
-0.54 | Old | 158.8 | 43
0.34 | Old | 19.5 | 54
-0.6 | Old | 35.9 | 57
-0.88 | Old | 68.9 | 58
-0.3 | Old | 9.6 | 73
1.14 | Old | 32.5 | 75
0.28 | Old | 87.2 | 76
0.29 | Old | 50.2 | 78
0.12 | New | 31.2 | 81
0.91 | New | 49.3 | 86
-1.09 | Old | 63.3 | 91
-0.02 | Old | 95.4 | 97
1.71 | New | 75.4 | 112
0.07 | Old | 56 | 113
-0.12 | Old | 15.2 | 115
1.04 | New | 28.4 | 118
0.85 | Old | 22.8 | 120
-0.02 | Old | 38.3 | 134
-0.17 | Old | 68.6 | 135
-0.89 | Old | 63.3 | 138
1.25 | New | 70.5 | 141
1.99 | New | 114.5 | 145
0.42 | New | 33.1 | 153
-1.13 | Old | 25 | 155
-0.32 | New | 16.2 | 169
1.09 | New | 49.9 | 170
0.46 | New | 53 | 172
0.95 | New | 90.4 | 173
-0.38 | New | 32.2 | 175
1.19 | Old | 43.7 | 182
-0.24 | New | 27.7 | 193
-1.26 | Old | 94.6 | 194
-1.56 | Old | 54.5 | 201
1.52 | New | 71.3 | 204
2.35 | New | 86.6 | 209
1.07 | Old | 8.3 | 224
1.82 | New | 72.6 | 225
1.45 | New | 65.1 | 227
-0.44 | Old | 8.7 | 230
-0.45 | Old | 100.8 | 231
1.66 | New | 94.1 | 243
0.38 | Old | 250 | 245
-0.31 | Old | 101 | 246
0.34 | New | 51.4 | 247
0.51 | New | 33.1 | 249
-0.55 | Old | 41 | 258
0.85 | Old | 19.4 | 264
0.76 | Old | 61.8 | 269
0.12 | New | 34.7 | 270
1.17 | New | 39.5 | 271
-0.16 | Old | 51.7 | 276
1.43 | New | 39.1 | 285
0.44 | New | 27.3 | 298
0.26 | New | 49.7 | 299
1.01 | Old | 7.4 | 302
0.39 | New | 43.1 | 316
-0.05 | New | 31.4 | 319
-0.9 | Old | 99.4 | 320
-0.91 | Old | 45.2 | 332
0.42 | Old | 139.2 | 333
1.69 | New | 48.6 | 336
0.44 | New | 4.6 | 340
0.64 | Old | 74.5 | 351
1.86 | New | 59.4 | 354
0.28 | New | 37.1 | 355
0.37 | New | 60.1 | 370
0.71 | Old | 38.2 | 371
-0.73 | New | 13.3 | 372
0.85 | Old | 36.3 | 374
-0.06 | Old | 84.8 | 387
0.33 | New | 51.3 | 388
-0.97 | Old | 109.3 | 390
|
Question
Assume you are working on a team that has been commissioned by a large school district to collect and analyze data related to a recent curriculum experiment designed to improve student scores on region-wide standardized tests. The schools in this district are predominantly large, urban schools. School officials are interested in knowing how successful the experiment was, and if the new curriculum should be incorporated districtwide.
Three years ago, the school district rolled out the experimental curriculum to 100 of the 400 elementary schools in the district. Those 100 schools were selected via a simple random sample. Your working budget is not large enough to collect data on the population of 400 schools. You can only afford to collect a sample of 80 schools. Unless the question states otherwise, conduct all analyses at the 95% confidence level (α = .05).
1. You think that the best sampling strategy is stratified sampling. You’d like to list the characteristics of schools in this district and then randomly select 80 schools that roughly match the demographic characteristics of the entire population of schools in this district. Forty (40) of these would come from schools that had the experimental curriculum, and 40 would come from schools that kept the old curriculum. However, people from the school board have made the following statements regarding sampling:
A. One well-meaning school board member has argued that "If it ain't broken, there's no reason to fix it! We should sample 80 random schools with the old curriculum; as long as the students in those schools are performing acceptably, there is no reason to change curricula."
B. A second school board member has argued that "since you are looking for improvement, what you should do is simply choose the 80 schools that had the biggest improvement from three years ago to now and look at how many of those used the new vs. the old curriculum."
C. Finally, the School District Superintendent has told you that "as you know, I'm currently running for re-election, and it is important that this research turn out a certain way. I have a list of schools that I personally handpicked for your sample that I think really represent what is going on here."
Describe what is wrong methodologically with each of the three suggestions you received from the various school board members. You will want to focus on issues such as sampling error, the sorts of biases that will be introduced from such sampling methods, how you might expect to see those biases manifested in the data/data analysis, and how these issues will affect your ability to comment on the district's original question as to whether or not it would be a good idea to roll out the new curriculum to all schools in the district.
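For contrast with the three suggestions, the stratified draw described at the start of question 1 can be sketched in a few lines. This is only an illustration: the sampling frame below is hypothetical (school IDs 1 to 400 with 100 randomly flagged as "New"), the function name `stratified_sample` is made up for this example, and the sketch stratifies on curriculum only, not on the demographic characteristics the question also mentions.

```python
import random

def stratified_sample(schools, per_stratum, seed=0):
    """Draw `per_stratum` schools at random from each curriculum group."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for label in ("New", "Old"):
        stratum = [s for s in schools if s["curriculum"] == label]
        sample.extend(rng.sample(stratum, per_stratum))  # without replacement
    return sample

# Hypothetical sampling frame: 400 school IDs, 100 of which were randomly
# assigned the new curriculum (mirrors the setup described in the question).
frame_rng = random.Random(1)
new_ids = set(frame_rng.sample(range(1, 401), 100))
schools = [{"school": i, "curriculum": "New" if i in new_ids else "Old"}
           for i in range(1, 401)]

sample = stratified_sample(schools, per_stratum=40)
counts = {label: sum(s["curriculum"] == label for s in sample)
          for label in ("New", "Old")}
print(counts)
```

Because every school within a stratum has the same chance of selection, neither the outcome (suggestion B) nor anyone's preferences (suggestion C) can influence which schools end up in the sample, and both curricula are represented (unlike suggestion A).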
2. Having convinced the board members that a stratified sample is the most appropriate, you collect data from 80 schools. The collected data is provided in this file on the worksheet labeled "Data."
The following variables are contained in the data set:
i. chgtestscores is the difference between average school test scores three years ago and average school test scores today. A positive value for chgtestscores indicates an increase in test scores at the school compared with three years ago, while a negative value indicates that the school is now performing worse on these tests.
ii. curriculum is a categorical variable that can have a value of either "Old" or "New." New means that the experimental curriculum is being used.
iii. income ($000) represents the average annual income (in thousands of dollars) of the households of students from each school.
iv. school is simply the ID number of the elementary school.
The first step to your analysis is to generate some descriptive measures. For each of the following, create a chart or graph that best displays the data:
A. Show the makeup/composition of your sample by curriculum type (old vs. new).
B. Show the distribution of the change in test scores (chgtestscores) across all 80 schools.
C. Show the distribution of income across all 80 schools.
Additionally, you want to generate summary/descriptive statistics for your variables. You must show a minimum of 5 measures for each variable.
D. Create a table that shows summary/descriptive statistics for the chgtestscores and income variables using all 80 schools.
E. Create a table that shows summary/descriptive statistics for the chgtestscores and income variables separated by curriculum type.
Based on the graphs and tables created in parts A-E, what preliminary conclusions can you draw regarding the effectiveness of the experimental curriculum?
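As a rough illustration of parts D and E, the summary tables could be produced along these lines. The sketch uses only the first six "Old" and first six "New" rows of the data table, so the numbers it prints are not the answer for the full sample of 80; the helper name `describe` and the choice of six measures are assumptions made for this example.

```python
import statistics

# First six "Old" and first six "New" rows of the data table:
# (chgtestscores, curriculum, income in $000)
rows = [
    (0.19, "Old", 27.7), (0.63, "Old", 69.8), (-0.54, "Old", 158.8),
    (0.34, "Old", 19.5), (-0.60, "Old", 35.9), (-0.88, "Old", 68.9),
    (2.67, "New", 93.1), (2.46, "New", 124.0), (1.28, "New", 73.8),
    (-0.24, "New", 32.4), (0.95, "New", 32.9), (2.31, "New", 86.6),
]

def describe(values):
    """Return a handful of descriptive measures for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# One table per curriculum type, covering both numeric variables (part E).
summary = {}
for label in ("Old", "New"):
    group = [r for r in rows if r[1] == label]
    summary[label] = {
        "chgtestscores": describe([r[0] for r in group]),
        "income": describe([r[2] for r in group]),
    }

for label, stats in summary.items():
    for var, measures in stats.items():
        print(label, var, {k: round(v, 2) for k, v in measures.items()})
```

Calling `describe` on all rows at once, without the grouping loop, gives the pooled table asked for in part D.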
3. One of the criticisms levied upon the old curriculum is that it is outdated. It is so outdated, the board members argue, that it is causing standardized test scores to fall. You decide to test this hypothesis.
A. First, state the null and alternative hypotheses (H0 and H1)
B. Next, perform the statistical test of the hypothesis that the mean change in schools using the old curriculum is less than 0.
C. Calculate the p-value associated with your test statistic from Part B.
D. Interpret your results.
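A minimal sketch of the lower-tailed test in question 3, assuming the hypotheses are H0: mean change >= 0 versus H1: mean change < 0 for the old-curriculum schools. The test statistic is t = (sample mean - 0) / (s / sqrt(n)) with n - 1 degrees of freedom, and the p-value is the area to the left of t. To stay within the standard library, the t distribution's CDF is approximated here by numerical integration; the function names are made up for this example, and only six "Old" rows from the table are used, so the printed result is illustrative only.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry around 0."""
    a = abs(x)
    h = a / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def lower_tail_t_test(values, mu0=0.0):
    """One-sample t test of H0: mean >= mu0 against H1: mean < mu0."""
    n = len(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    t_stat = (statistics.mean(values) - mu0) / std_err
    p_value = t_cdf(t_stat, n - 1)  # left-tail area
    return t_stat, p_value

# First six "Old" chgtestscores values from the data table (a subset only).
old_subset = [0.19, 0.63, -0.54, 0.34, -0.60, -0.88]
t_stat, p_value = lower_tail_t_test(old_subset)
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.3f}")
```

H0 is rejected at the 5% level only if the p-value falls below .05. If SciPy is available, `scipy.stats.ttest_1samp(values, 0, alternative="less")` should give the same statistic and p-value without the hand-rolled CDF.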
4. Because the school board's primary concern is whether or not the experimental curriculum leads to better standardized test scores, your next step is to conduct a simple analysis comparing test scores from schools with the old curriculum to the test scores from schools with the new/experimental curriculum.
5.
A. Conduct an ANOVA to evaluate whether or not there is a significant difference in test scores between schools with the old curriculum and schools with the new curriculum.
B. Summarize and interpret the results of your test.
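The ANOVA comparison can be sketched as follows. The F statistic is the between-group mean square divided by the within-group mean square; with only two groups it equals the square of the pooled two-sample t statistic. The function name `one_way_anova_f` is an assumption for this example, and the data are again the first six rows of each curriculum type, not the full sample.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# chgtestscores for the first six "Old" and first six "New" rows of the table.
old_subset = [0.19, 0.63, -0.54, 0.34, -0.60, -0.88]
new_subset = [2.67, 2.46, 1.28, -0.24, 0.95, 2.31]

f_stat, df_b, df_w = one_way_anova_f([old_subset, new_subset])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The p-value is the upper-tail area of the F distribution with those degrees of freedom; with SciPy installed, `scipy.stats.f_oneway(old, new)` reports both the statistic and the p-value.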
6. One board member is concerned about the impact of the experimental curriculum on schools in low income neighborhoods. To investigate his concerns, you need to run a multiple regression model.
A. Create a dummy variable for the new/experimental curriculum. Also, create an interaction variable that interacts your experimental dummy variable and the income variable.
B. Run a multiple regression that includes the experimental curriculum dummy, income, and the interaction variable as independent variables. Chgtestscores will be your dependent variable.
C. Calculate predicted values for the chgtestscores variable under both the new and old curriculum for income levels of $15,000, $30,000, $60,000, and $120,000.
Summarize and interpret the results of this model. What do you tell the board member about the effect of the new curriculum across different income levels?
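The model in question 6 can be written as chgtestscores = b0 + b1*new + b2*income + b3*(new*income), where new is the 0/1 dummy. A minimal standard-library sketch follows. It relies on the fact that, for this particular specification, the least-squares coefficients coincide with those from fitting a separate straight line to each curriculum group (b0 and b2 from the old-curriculum line, b1 and b3 as the differences in intercept and slope). That shortcut gives point estimates only; standard errors and p-values need a full regression routine. Function names are assumptions, and the data are a twelve-row subset of the table, so the printed predictions are illustrative.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_interaction_model(rows):
    """rows: (chgtestscores, curriculum, income). Returns (b0, b1, b2, b3)."""
    old = [(r[2], r[0]) for r in rows if r[1] == "Old"]
    new = [(r[2], r[0]) for r in rows if r[1] == "New"]
    a_old, s_old = fit_line(*zip(*old))
    a_new, s_new = fit_line(*zip(*new))
    # b1 and b3 are the shifts in intercept and slope under the new curriculum.
    return a_old, a_new - a_old, s_old, s_new - s_old

def predict(coefs, new, income):
    """Predicted chgtestscores; `new` is the 0/1 dummy, income is in $000."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * new + b2 * income + b3 * new * income

# First six "Old" and first six "New" rows of the data table (a subset only).
rows = [
    (0.19, "Old", 27.7), (0.63, "Old", 69.8), (-0.54, "Old", 158.8),
    (0.34, "Old", 19.5), (-0.60, "Old", 35.9), (-0.88, "Old", 68.9),
    (2.67, "New", 93.1), (2.46, "New", 124.0), (1.28, "New", 73.8),
    (-0.24, "New", 32.4), (0.95, "New", 32.9), (2.31, "New", 86.6),
]
coefs = fit_interaction_model(rows)

for income in (15, 30, 60, 120):  # $15,000 to $120,000, in thousands
    print(income,
          round(predict(coefs, 0, income), 2),   # old curriculum
          round(predict(coefs, 1, income), 2))   # new curriculum
```

The gap between the two predictions at a given income is b1 + b3*income, which is what shows whether the new curriculum's effect grows or shrinks as income rises.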
7. Shortly after you publish your findings in a report, you receive a call from Mensa, an organization whose members have high IQs. Mensa wants to open a series of schools for gifted kids across the country; children with IQs above 160 would be offered places at these schools completely free of charge. Mensa is intrigued by your research, and wants to know, based on your findings, whether or not they might expect similar results if they use the experimental curriculum at its new chain of schools. What do you tell Mensa and why?
8. The variable the school board is most interested in understanding/explaining is the change in school-wide standardized test scores. You were also given variables that indicated the curriculum type and the income of the households of students from each school. If this were a real research project, what other data would you collect to use as control variables? Why?
In other words, aside from income and the curriculum type, what other variables might you collect that could have an impact on the change in test scores? Give at least 4 examples. For each variable you list, explain why you would include it.