I. Background
The applied project is the most important assignment of the course.
The main goal of the project is to apply the statistical skills and theory learned in the class to a research problem that is significant for the student.
This assignment follows the maxim of this class that the best way to learn statistics is through conducting actual statistical procedures to explain a relevant phenomenon.
In particular, the final report of the applied project is cumulative and requires that students apply the conceptual and technical knowledge they have learned throughout the whole course.
The emphasis of this report will be on evaluating the student’s analytical capacity to examine quantitative data related to their research question and to determine whether, and to what degree, such data provide statistically significant and substantively robust support for their hypothesis.
II. Learning Objectives
To successfully prepare this assignment, the student needs to demonstrate knowledge of the basic conceptual dimension of statistical analysis and scientific inquiry, namely:
- Write a hypothesis about a relationship of interest between two variables.
- Identify potential threats to a causal explanation and control for rival factors.
- Conduct multivariate descriptive statistics, based on the interpretation and analysis of control tables and multivariate figures.
- Conduct and interpret the results of basic bivariate significance tests and measures of associations.
The student also needs to demonstrate technical knowledge in the use of R:
- Manipulation of variables of interest.
- Preparation of multivariate figures and tables.
- Conducting basic bivariate inferential methods.
- Conducting basic bivariate and multivariate linear regression analysis (Optional).
III. Activities
First Part: Introduces the report, states the hypothesis or research question, and describes the research design and empirical strategy (e.g. the variables used). The student can reuse the first report here, with the relevant modifications suggested by the instructor (if any).
Second Part: Advanced descriptive statistical analysis. The student will prepare control tables and multivariate figures to explore: (1) whether the X-Y relationship of interest finds support in the data and (2) whether that relationship remains after controlling for Z. In particular, the student needs to identify if the X-Y-Z relationship of interest is spurious, additive, or conditional (an interaction effect).
Third Part: Bivariate inferential analysis. The student will use (1) significance tests to examine whether the X-Y relationship of interest is statistically significant; (2) measures of association to determine the strength (and sign) of the X-Y relationship; and (3) provide further evidence whether the control variable Z has a statistically significant effect on Y.
First Part (10 points)
1.1 Write (reframe if necessary) the hypothesis of interest in clear terms and following the standards discussed in the class.
Remember that your goal is to propose a testable empirical statement of how an explanatory variable affects an outcome of interest. You need to be specific about how you expect a change in X to affect the levels of Y. You also need to clearly define X and Y.
1.2 Clearly state which additional factor can potentially affect your relationship of interest. This variable is Z. Justify why Z is important and explain what kind of effect you think it has on the X-Y relationship. Be specific in the use of terms; explain whether you expect that Z will reveal spuriousness or help to explain the phenomenon of interest in an additive or a conditional manner.
1.3 Cite the dataset you will use to explore this hypothesis and the role of Z in your dependent variable of interest.
1.4 Describe explicitly which variables of the dataset you will use to explore this relationship.
1.5 If you conducted specific data transformations (rescaling, collapsing categories, etc.) or modified the dataset (deleting missing data, subsetting, etc.), clearly describe what you did.
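As an illustration of the kind of transformations subtask 1.5 asks you to describe, the following is a minimal sketch in base R. The data frame `raw`, its variables (`educ`, `age`, `voted`), and the cut points are all hypothetical and stand in for whatever dataset you actually use.

```r
# Hypothetical raw data: names and values are invented for illustration only.
raw <- data.frame(
  educ  = c("No HS", "HS", "College", "Postgrad", "HS", NA, "College", "No HS"),
  age   = c(23, 45, 67, 34, NA, 51, 19, 72),
  voted = c("Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# 1. Delete cases with missing data on the variables used in the analysis.
clean <- raw[complete.cases(raw[, c("educ", "age", "voted")]), ]

# 2. Collapse the four education categories into two.
clean$educ2 <- ifelse(clean$educ %in% c("College", "Postgrad"),
                      "College or more", "HS or less")
clean$educ2 <- factor(clean$educ2, levels = c("HS or less", "College or more"))

# 3. Rescale age (in years) into three ordered groups.
clean$age_group <- cut(clean$age, breaks = c(17, 39, 64, Inf),
                       labels = c("18-39", "40-64", "65+"))

# Numbers worth reporting in the text of subtask 1.5.
nrow(raw) - nrow(clean)          # cases dropped because of missing data
table(clean$educ, clean$educ2)   # check that the collapsing worked as intended
table(clean$age_group)
```

Reporting the number of dropped cases and the cross-check table makes it easy for a third party to follow what was done to the data.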
Do not write this part assuming that the instructor knows the data. Even if you use the CCES, explain the variables used in the analysis as if you were submitting this report to a third party.
Each subtask is worth 2 points. If you got full credit in the first report, you only need to reorganize the information according to these five points. If you didn’t obtain full credit, make the necessary modifications suggested by the instructor.
To prepare this part of the report, you can use the materials presented in Lecture 2.1 (Foundations of Research Design) and Lecture 4.1 (Descriptive Statistics -Part I-), as well as the readings assigned to those classes.
In the event you make a mistake in responding to one of these subtasks, you will still obtain 60% of the score for that subtask. The only way to receive less than that (in fact, zero) is to submit nothing for the subtask. Thus, ‘try your best, don’t leave any subtask empty’.
Second Part (10 points)
2.1.1 Prepare a bivariate table to represent the X-Y relationship of interest.
I will emphasize three features of your table: (1) you selected the correct type of figure according to the dependent variable, (2) the structure of the table is correct, according to Pollock’s standards, and (3) the table provides both frequencies and percentages determined by the columns.
I will not penalize you for aesthetics. Yet, a nice table according to what was taught in the course will receive an extra 0.5 point.
The table is worth 1 point.
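A bivariate table with frequencies and column percentages could be sketched in base R as follows. The `survey` data frame is a hypothetical toy dataset of 200 invented respondents (education as X, age as Z, turnout as Y); it is an assumption for illustration, not course data, and it places X on the columns so that percentages are computed within each column.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  age       = factor(rep(c("Young", "Old"), each = 100),
                     levels = c("Young", "Old")),              # Z
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

# Rows = dependent variable (Y), columns = independent variable (X).
freq <- table(Turnout = survey$turnout, Education = survey$education)

# Column percentages: margin = 2 divides each cell by its column total.
col_pct <- round(100 * prop.table(freq, margin = 2), 1)

addmargins(freq, margin = 1)  # frequencies, with a "Sum" row of column totals
col_pct                       # percentages; each column adds up to 100
```

Comparing the percentages across columns (here, the share of "Yes" among Low versus High) is what allows the table to speak to the hypothesis.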
2.1.2 Prepare a bivariate figure to represent the X-Y relationship of interest.
You can select the type of figure you prefer (bar graph, line graph, scatterplot). I will emphasize three features of your figure: (1) the plot displays the relevant piece of information to evaluate your hypothesis; (2) the scale of the y-axis is correct (starting at zero); and (3) X corresponds to the horizontal axis and Y to the vertical axis.
I will not penalize you for aesthetics. Yet, a nice figure according to what was taught in the course will receive an extra 0.5 point.
The figure is worth 1 point.
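One possible way to meet the three features listed above is a bar graph of the percentage in the category of interest of Y across the values of X. This is only a sketch with base R graphics; the `survey` data frame and its variables are hypothetical, and a bar graph is just one of the permitted figure types.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

# Percentage who voted within each education group (column percentages).
tab <- table(survey$turnout, survey$education)
pct_voted <- 100 * prop.table(tab, margin = 2)["Yes", ]

# When run as a script (not interactively), draw on a null device so no file is written.
if (!interactive()) pdf(NULL)

barplot(pct_voted,
        ylim = c(0, 100),                  # y-axis starts at zero
        xlab = "Education (X)",            # X on the horizontal axis
        ylab = "Percent who voted (Y)",    # Y on the vertical axis
        main = "Turnout by education")
```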
2.2 Respond to the following two items, using the terms, concepts, and theory taught in the course:
- Do the descriptive statistics materials provide preliminary evidence for your hypothesis? Why? Use the percentages, trends, etc.
- Interpret the zero-order effects of X on Y.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
2.3.1 Prepare a control table to represent the X-Y-Z relationship of interest.
I will emphasize three features of your table: (1) you selected the correct type of figure according to the dependent variable, (2) the structure of the table is correct, according to Pollock’s standards, and (3) the table provides both frequencies and percentages determined by the columns.
I will not penalize you for aesthetics. Yet, a nice table according to what was taught in the course will receive an extra 0.5 point.
The table is worth 1 point.
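A control table repeats the X-Y table separately for each value of Z. The sketch below uses base R and the same hypothetical toy `survey` data (education as X, age as Z, turnout as Y); the names and counts are assumptions for illustration only.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  age       = factor(rep(c("Young", "Old"), each = 100),
                     levels = c("Young", "Old")),              # Z
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

# Three-way table: Y in the rows, X in the columns, one layer per value of Z.
freq3 <- table(Turnout = survey$turnout,
               Education = survey$education,
               Age = survey$age)

# Percentages within each X-by-Z column (margin = c(2, 3)).
pct3 <- round(100 * prop.table(freq3, margin = c(2, 3)), 1)

# Flatten into a single table: Z is the outer column heading, X the inner one.
ftable(freq3, row.vars = "Turnout", col.vars = c("Age", "Education"))
ftable(pct3,  row.vars = "Turnout", col.vars = c("Age", "Education"))
```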
2.3.2 Prepare a multivariate figure to represent the X-Y-Z relationship of interest.
You can select the type of figure you prefer (bar graph, line graph, scatterplot). I will emphasize three features of your figure: (1) the plot displays the relevant piece of information to evaluate your hypothesis; (2) the scale of the y-axis is correct (starting at zero); and (3) X corresponds to the horizontal axis, Y to the vertical axis, and the lines or bars are determined by the values of Z.
I will not penalize you for aesthetics. Yet, a nice figure according to what was taught in the course will receive an extra 0.5 point.
The figure is worth 1 point.
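A grouped bar graph is one way to put X on the horizontal axis, Y on the vertical axis, and one bar per value of Z. The following is a sketch with base R graphics on the hypothetical toy `survey` data; the variable names and counts are invented for illustration.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  age       = factor(rep(c("Young", "Old"), each = 100),
                     levels = c("Young", "Old")),              # Z
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

# Percent who voted within each combination of X and Z.
tab3 <- table(survey$turnout, survey$education, survey$age)
pct3 <- 100 * prop.table(tab3, margin = c(2, 3))

# barplot() groups bars by column, so columns must be X and rows must be Z.
pct_voted <- t(pct3["Yes", , ])

# When run as a script (not interactively), draw on a null device so no file is written.
if (!interactive()) pdf(NULL)

barplot(pct_voted,
        beside = TRUE,                     # side-by-side bars, one per value of Z
        ylim = c(0, 100),                  # y-axis starts at zero
        legend.text = TRUE,                # legend labels come from the row names (Z)
        args.legend = list(title = "Age (Z)", x = "topleft"),
        xlab = "Education (X)",
        ylab = "Percent who voted (Y)",
        main = "Turnout by education, controlling for age")
```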
2.4 Respond to the following two items, using the terms, concepts, and theory taught in the course:
- Does the X-Y relationship of interest change after holding the values of Z constant? Explain.
- What kind of X-Y-Z relationship does Z reveal (spurious, additive, or conditional, i.e., an interaction)? Explain.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
2.5 Respond to the following two items, using the terms, concepts, and theory taught in the course:
- Interpret the partial effects of X on Y, controlled for Z.
- Interpret the partial effects of Z on Y, controlled for X.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
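For a categorical Y, the zero-order and partial effects asked about in 2.2, 2.4, and 2.5 can be read as differences in percentages. The sketch below computes them in base R for the hypothetical toy `survey` data; the names and counts are invented, and the comments state a rule of thumb rather than a substitute for the interpretation taught in class.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  age       = factor(rep(c("Young", "Old"), each = 100),
                     levels = c("Young", "Old")),              # Z
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

pct2 <- 100 * prop.table(table(survey$turnout, survey$education), margin = 2)
pct3 <- 100 * prop.table(table(survey$turnout, survey$education, survey$age),
                         margin = c(2, 3))

# Zero-order effect of X on Y: difference in percent "Yes", ignoring Z.
zero_order <- pct2["Yes", "High"] - pct2["Yes", "Low"]

# Partial effects of X on Y: the same difference within each value of Z.
partial_x <- pct3["Yes", "High", ] - pct3["Yes", "Low", ]

# Partial effects of Z on Y: difference across Z within each value of X.
partial_z <- pct3["Yes", , "Old"] - pct3["Yes", , "Young"]

zero_order
partial_x
partial_z

# Rule of thumb for reading the results:
# - partial effects of X near zero in every value of Z suggest a spurious relationship;
# - nonzero partial effects of X of similar size across Z suggest an additive relationship;
# - partial effects of X that differ in size or sign across Z suggest an interaction.
```

In this toy example the partial effects of X are the same in both age groups and the partial effects of Z are the same in both education groups, which is the pattern an additive relationship would produce.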
To prepare these five subtasks, review the materials of Lecture 4.1, Lecture 6.2, and Lecture 7.2, as well as the readings assigned to those classes and the replication materials provided by the instructor.
Note: Your hypothesis need not be correct. The emphasis is on evaluating your capacity to evaluate empirical evidence with quantitative methods, rather than your ability to propose a valid explanation of a phenomenon of study.
Third Part (8 points)
3.1 Conduct a significance test of the X-Y relationship. Copy-paste the results of the test in the document. Respond to the following two questions:
- Is the effect of X on Y statistically significant, according to the test? Justify or explain using the results of the test.
- Do these results corroborate or disconfirm the preliminary analysis you conducted with descriptive statistics? Why?
You need to select the type of significance test that fits the scale of measurement of the dependent variable and the research question of the project. Selecting the wrong test will automatically invalidate your analysis, and you will obtain just 60% of the points for this subtask.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
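As a sketch of subtask 3.1 for the case in which both X and Y are categorical, a chi-square test of independence can be run with base R's `chisq.test()`. The toy `survey` data are hypothetical, and the choice of test is an assumption that fits this toy example only; for an interval-level Y, a difference-of-means test such as `t.test()` would be the analogous choice.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

tab <- table(Turnout = survey$turnout, Education = survey$education)

# correct = FALSE turns off the Yates continuity correction that
# chisq.test() applies by default to 2 x 2 tables.
test_xy <- chisq.test(tab, correct = FALSE)

test_xy                  # this printed output is what gets pasted into the report
test_xy$expected         # expected frequencies under the null hypothesis
test_xy$p.value < 0.05   # TRUE means the null is rejected at the .05 level
```

The same call with Z in place of X gives the test for subtask 3.3.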
3.2 Estimate a measure of association of the X-Y relationship. Copy-paste the results of the test in the document. Respond to the following two questions:
- Is the effect of X on Y substantively significant, according to the test? Justify or explain using the results of the test. Even if the previous test yielded insignificant results, interpret the results of the measure of association.
- Do these results corroborate or disconfirm the preliminary analysis you conducted with descriptive statistics? Why?
You need to select the type of measure of association that fits the scale of measurement of the dependent variable and the research question of the project. Selecting the wrong measure of association will automatically invalidate your analysis, and you will obtain just 60% of the points for this subtask.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
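For two categorical variables, one common measure of association is Cramér's V, which rescales the chi-square statistic to run from 0 to 1. The sketch below computes it by hand in base R on the hypothetical toy `survey` data; whether Cramér's V is the appropriate measure for a given project is an assumption that depends on the level of measurement of the variables and on the measures covered in class.

```r
# Hypothetical toy survey: all names and counts are invented for illustration.
survey <- data.frame(
  education = factor(rep(c("Low", "High", "Low", "High"), each = 50),
                     levels = c("Low", "High")),               # X
  turnout   = factor(rep(rep(c("No", "Yes"), times = 4),
                         times = c(30, 20, 20, 30, 20, 30, 10, 40)),
                     levels = c("No", "Yes"))                  # Y
)

tab <- table(Turnout = survey$turnout, Education = survey$education)

# Cramer's V = sqrt( chi-square / (n * (k - 1)) ),
# where k is the smaller of the number of rows and columns.
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
n    <- sum(tab)
k    <- min(dim(tab))
cramers_v <- sqrt(chi2 / (n * (k - 1)))

round(cramers_v, 3)  # 0 = no association, 1 = perfect association
```

Because V has no sign, the direction of the relationship still has to be read from the percentages in the table.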
3.3 Conduct a significance test of the Z-Y relationship. Copy-paste the results of the test in the document. Respond to the following two questions:
- Is the effect of Z on Y statistically significant, according to the test? Justify or explain using the results of the test.
- Do these results corroborate or disconfirm the preliminary analysis you conducted with descriptive statistics related to the role of the control variable in the X-Y relationship of interest?
You need to select the type of significance test that fits the scale of measurement of the dependent variable and the research question of the project. Selecting the wrong test will automatically invalidate your analysis, and you will obtain just 60% of the points for this subtask.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
3.4 Estimate a measure of association of the Z-Y relationship. Copy-paste the results of the test in the document. Respond to the following two questions:
- Is the effect of Z on Y substantively significant, according to the test? Justify or explain using the results of the test. Even if the previous test yielded insignificant results, interpret the results of the measure of association.
- Do these results corroborate or disconfirm the preliminary analysis you conducted with descriptive statistics? Why?
You need to select the type of measure of association that fits the scale of measurement of the dependent variable and the research question of the project. Selecting the wrong measure of association will automatically invalidate your analysis, and you will obtain just 60% of the points for this subtask.
Each item is worth 1 point. It is critical that you use the terms, concepts, and theory of the class. Failure to use them will decrease your grade.
To prepare these four subtasks you can review Lecture 6.1, Lecture 7.1, and Lecture 4.1, as well as the reading materials assigned to those classes and the replication materials provided by the instructor.
IV. Basic Formal Requirements
- Write the document in a .doc format (preferably a Word document).
- Use titles for each part of the report and subtitles for each subtask.
- Pay attention to grammar and style. Think about this as an academic report, so be formal.
- Include a bibliography and use a correct citation style.
- You don’t need to write a long document to get full credit. Concise, specific responses that demonstrate appropriate use of the language and theory learned in the class, as well as a correct use and interpretation of the empirical evidence, are better than long essays.
V. Replication Materials
Together with the Word document, the student needs to upload:
- The .R file with the coding, from setwd() to the final line for preparing the final subtasks.
- The dataset downloaded.
I cannot emphasize enough the importance of providing replication materials for quantitative analysis in the academic community. Even for the first report of the project, some students submitted no replication materials. This cannot happen this time.
In fact, submitting replication materials is in your best interest. You have seen how sometimes I am able to rerun the code and identify why you failed in preparing an object in R. In those cases, I have given you partial credit as a way to recognize your effort. With no replication materials, I cannot do anything to evaluate your knowledge of the specific topic of the subtask.
This time, I will also try the ‘carrots’ policy: submit your replication materials and you will get the remaining 2 points of the final report of the applied project.