Assignment #5

1. Below are two columns of numbers: values of x and y for five observations. The variable x has a mean of 3.0 and a standard deviation of 1.6. The variable y has a mean of 13.0 and a standard deviation of 3.9. In this exercise you will hand-calculate Pearson's correlation coefficient (also known as Pearson's r), R-square, and adjusted R-square for the relationship between x and y.

x    y
1    17
2    14
3    16
4    10
5    8

A. Copy the columns onto a sheet of paper (or better yet, into a table in a word processor document). Label three additional columns: “Z score of x,” “Z score of y,” and “Z score of x * Z score of y.”
B. Calculate and write in the appropriate values in each column.
C. Show your calculation of Pearson's r for the relationship.
D. Calculate R-square for this relationship.
E. Showing your work, calculate adjusted R-square for this relationship.

2. Are Catholics more likely to oppose abortion rights than are non-Catholics? To find out, a researcher first constructs an abortion scale from responses to the General Social Survey (GSS). Scores range from 0 (abortion should be permitted in all circumstances) to 7 (abortion should not be permitted under any circumstances). Unlike the abortion scale analyzed in this chapter, respondents with lower scores on the GSS scale are more pro-abortion rights, and respondents with higher scores are more anti-abortion rights. The researcher uses the GSS scale as the dependent variable, y. The researcher then creates a dummy variable. Catholics are coded 1 on this dummy, and non-Catholics are coded 0. This is the independent variable, x, which the researcher names “Catholic.” Here are the regression results:

y = 2.6 + .56(Catholic)
Standard error of b = .15
Adjusted R-square = .01

A. Based on these findings, the researcher concludes: “While non-Catholics averaged 2.6 on the abortion scale, Catholics averaged only .56 on the scale.
Therefore, Catholics are more pro-abortion rights than are non-Catholics.” Is this inference correct? Why or why not?
B. Another conclusion reached by the researcher: “The independent variable does not have a statistically significant effect on the dependent variable.” Is this inference correct? Why or why not?
C. Yet another of the researcher's conclusions: “The independent variable explains very little of the variation in the dependent variable.” Is this inference correct? Why or why not?

3. Another researcher, after viewing the puny value of adjusted R-square in Exercise 2, suggests that another variable – the frequency with which individuals attend religious services – may contribute to the explanation of abortion beliefs. This researcher defines a dummy variable, which is coded 1 for individuals who report high levels of religious attendance and coded 0 for people who have low levels of attendance. The regression is estimated:

y = a + b1(Catholic) + b2(high attendance),

where “Catholic” is the Catholic/non-Catholic dummy (Catholics are coded 1, non-Catholics are coded 0) and “high attendance” is the high attendance/low attendance dummy. Here are the results (the standard errors for the regression coefficients are in parentheses):

y = 2.14 + .50(Catholic) + 1.95(high attendance)
           (.14)           (.14)
Adjusted R-square = .12

A. What is the partial effect of Catholicism on the abortion scale, controlling for attendance at religious services? Is it reasonable to infer that, in the population, Catholics are more opposed to abortion than are non-Catholics? Explain.
B. What is the partial effect of attendance on the abortion scale, controlling for differences between Catholics and non-Catholics? Is it reasonable to infer that, in the population, people who attend services more frequently are more opposed to abortion than are people who attend less frequently? Explain.
C. Based on this regression, what is the mean abortion score for non-Catholic low-attenders?
For Catholic high-attenders?
D. The adjusted R-square value is .12. This means that ___________ percent of the variation in abortion scores is explained by both variables in the model. It also means that ____________ percent is explained by variables not in the model.
E. Name one other variable that may account for differences in the dependent variable. Briefly describe why you think this variable may contribute to the explanation of abortion attitudes.

4. Suppose you want to model a set of interaction relationships between Catholicism, religious attendance, and abortion beliefs. You think that the positive effect of religious attendance on anti-abortion attitudes is significantly stronger for Catholics than non-Catholics. To construct the interaction model, you will build on the base effects of the model shown in Exercise 3:

y = a + b1(Catholic) + b2(high attendance),

where “Catholic” is a Catholic/non-Catholic dummy (Catholics are coded 1, non-Catholics coded 0) and “high attendance” is a high attendance/low attendance dummy (frequent attenders are coded 1, infrequent attenders are coded 0). Before you specify the model, you will need to compute an interaction variable.

A. The interaction variable is computed by multiplying __________ by _________. Which of the following groups of respondents will have a value of 0 on the interaction variable: Catholic low-attenders, non-Catholic low-attenders, Catholic high-attenders, non-Catholic high-attenders? Write down all answers that apply.
B. Which of the following groups of respondents will have a value of 1 on the interaction variable: Catholic low-attenders, non-Catholic low-attenders, Catholic high-attenders, non-Catholic high-attenders? Write down all answers that apply.
C. Write out the interaction model to be estimated.
D. Focus on the coefficient that estimates the interaction effect.
If your idea is correct – that the positive effect of religious attendance on anti-abortion attitudes is significantly stronger for Catholics than non-Catholics – then would you expect the sign on the coefficient to be negative, positive, or close to 0? Explain your answer.

Lab assignment:

Let's say that you are doing research on approval for Bill Clinton. You decide to use the Clinton thermometer rating from the 2004 NES (clinton_therm) to measure your dependent variable (how much the respondent approves of Bill Clinton). Since Bill Clinton was president during the passage of the Brady gun control bill, you hypothesize that approval will be higher among those who support greater restrictions on the ability to purchase a gun. Your hypothesis is: Among individuals in the US, those who believe it should be more difficult to purchase a gun will have a higher average approval for Bill Clinton than those who believe it should be easier to purchase a gun.

5. Run a bivariate regression of approval for Bill Clinton on the respondent's stance on gun control (guncontrol).
A. What is the direction of the relationship? Does it support or refute your hypothesis? [Note: Be sure to check how the gun control variable is coded in the codebook before answering this question. You can recode the variable so that its direction matches your hypothesis, but be sure to note in your answer that you did this.]
B. How would you interpret this substantively?
C. Is the relationship statistically significant? At what level?
D. What are the R-squared and adjusted R-squared? What does the R-squared mean?

6. One of your friends from class reads your results and says, “What about partisanship? Couldn't both a person's stance on gun control and their rating of Bill Clinton just be due to their affinity for the Democratic Party?” Test this counter-hypothesis using the democrat variable (democrat) in the NES data by adding it to your model.
A. What is the relationship between being a Democrat and a person's approval of Bill Clinton? Give both the direction and strength of this relationship.
B. Is the relationship statistically significant? At what level?
C. Does the R-squared value increase significantly? What does it now say about the model?
D. Is the relationship you observed in the previous question still statistically significant, and is it as strong as in the bivariate regression?

7. Let's say you think there may be an interaction between guncontrol and democrat – perhaps Clinton's stance on gun control is more important for non-Democrats than for Democrats.
A. Add an interaction term to the equation. Is it statistically significant? At what level?
B. Does it substantially change the R-squared?
C. Based on these results, is the relationship between opinion on gun control and approval of Bill Clinton, controlling for whether the person is a Democrat, spurious, additive, or interactive?
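
Checking the hand calculation in Exercise 1: once the z-score table has been completed by hand, the arithmetic can be verified with a short script. The following is a minimal sketch in Python, not part of the assignment itself. It rests on two assumptions: that the standard deviations are sample standard deviations (dividing by n - 1, which matches the stated values of 1.6 and 3.9), and that adjusted R-square is computed as 1 - (1 - R-square)(n - 1)/(n - k - 1), where k is the number of independent variables. Your course text may present the formula in a different but equivalent form. The function names are illustrative only.

```python
from math import sqrt

# Data from Exercise 1.
x = [1, 2, 3, 4, 5]
y = [17, 14, 16, 10, 8]


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (assumption: divide by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def z_scores(values):
    """Convert each value to a z score using the sample SD."""
    m, sd = mean(values), sample_sd(values)
    return [(v - m) / sd for v in values]


def pearson_r(xs, ys):
    """Sum of the z-score products, divided by n - 1."""
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(products) / (len(xs) - 1)


def adjusted_r_square(r_square, n, k):
    """Assumed formula: 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


r = pearson_r(x, y)
r_square = r ** 2
adj_r_square = adjusted_r_square(r_square, n=len(x), k=1)

# Print the three extra columns, then the summary statistics.
for zx, zy in zip(z_scores(x), z_scores(y)):
    print(f"{zx:8.3f} {zy:8.3f} {zx * zy:8.3f}")
print(f"r = {r:.3f}, R-square = {r_square:.3f}, adjusted R-square = {adj_r_square:.3f}")
```

Because the script uses unrounded means and standard deviations, its results can differ slightly in the second or third decimal place from a hand calculation based on the rounded values 1.6 and 3.9.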