There are ten questions, and I need help finishing the problem set by midnight. It is simple economics code, but I struggle with it. I would appreciate it if you could write out a description of what each command does. Here's the starting code.
/*
Name:
Group Members:
Professor Tom Vogl
ECON 121: Applied Econometrics
Date:

PROBLEM SET 4: EFFECT OF MILITARY CONSCRIPTION ON CRIME

I hereby declare that I worked with other group members on the Stata commands.
I hereby declare I wrote my own written answers and did not copy from someone else.
I understand that copying someone else's answer can result in negative infinity points for both parties.
*/

capture log close // this closes an open log file
log using pset4_log_file.smcl, replace // this creates new/replaces old logfile
clear all // this closes data currently open
use "https://github.com/tvogl/econ121/raw/main/data/crime_ps4.dta" // this opens desired dataset

/*
Many countries require young men to serve in the military. Proponents of these policies cite many benefits, including promoting national security and disciplining otherwise undisciplined young men. In this problem set, we will examine the second of these purported benefits by estimating the effect of military conscription on crime in Argentina.

Military service in Argentina was mandatory for young men throughout most of the twentieth century. The needs of the military varied over time, however, so it held an annual lottery to decide which newly eligible men would serve. The following paragraph (taken from a published source) describes the lottery in detail:

The eligibility of young males for military service was randomly determined, using the last three digits of their national IDs. Each year, for the cohort due to be conscripted the following year, a lottery assigned a number between 1 and 1,000 to each combination of the last three ID digits. The lottery system was run in a public session using a lottery drum filled with a thousand balls numbered 1–1,000. The first ball released from the lottery drum corresponded to ID number 000, the second released ball to ID number 001, and so on.
The lottery was administered by the National Lottery and supervised by the National General Notary in a public session. Results were broadcasted over the radio and published in the main newspapers. After the lottery, individuals were called for physical and mental examinations. Later, a cutoff number was announced. Individuals whose ID number had been assigned a lottery number higher than the cutoff number, and who had passed the medical examination, were mandatorily called to military service. Clerics, seminarians, novitiates, and any individual with family members dependent upon him for support were exempted from military service.

To produce the dataset, researchers started with all men born in 1958-1962, divided them into cells by birth year and last three ID digits, and then calculated crime rates for each of these cells. Thus, each observation in the dataset represents a set of men with the same birth year and last three ID digits. (The data are aggregated in this way to ensure confidentiality.)

The following table defines the variables in the dataset:

Variable name   Description
birthyr         Birth year
draftnumber     Draft number (1-1000)
conscripted     Fraction conscripted
crimerate       Fraction with criminal record by 2005
property        Fraction with property crime conviction in 2000-2005
murder          Fraction with murder conviction in 2000-2005
drug            Fraction with drug conviction in 2000-2005
sexual          Fraction with sex crime conviction in 2000-2005
threat          Fraction with threat conviction in 2000-2005
arms            Fraction with weapons-related conviction in 2000-2005
whitecollar     Fraction with white collar crime conviction in 2000-2005
argentine       Fraction non-indigenous Argentinean
indigenous      Fraction indigenous Argentinean
naturalized     Fraction naturalized citizens

Our main outcome variable will be crimerate, which reflects the probability of ever having a criminal record.
We will also disaggregate by type of crime, although these data are only available for crimes committed starting in the year 2000.
*/

*1.
*Describe the data. Are there differences in conscription rates or crime rates across birth years?

*2.
*Use OLS to estimate the relationship between conscription rates and crimerate, controlling for observable covariates. Does the result reflect the causal effect of conscription? Describe possible biases.

*3.
*The lottery assigned a draft number to each last three ID digit combination, and the military then set a cutoff based on the needs of the military, such that all draft numbers at or above the cutoff were eligible for conscription. Based on the following cutoffs, code a variable that equals 1 if eligible, 0 if not:
*Year:   1958 1959 1960 1961 1962
*Cutoff:  175  320  341  350  320

*4.
*Estimate the "first stage" effect of eligibility on conscription. Think carefully about the regression specification. Do you need to control for birth year indicators? Do you need to control for ethnic composition?

*5.
*Estimate the "reduced form" effect of eligibility on crimerate. Does this result reflect the causal effect of conscription?

*6.
*Based on your results for questions (4) and (5), calculate the instrumental variables estimate of the effect of conscription on crimerate. You need only calculate a point estimate, not standard errors.

*7.
*Confirm your calculations by running a two-stage least squares regression. Are there differences between the 2SLS (question 7) and OLS (question 2) results? Why or why not?

*8.
*Given your knowledge of the Argentine draft (from the description above), assess the validity of eligibility as an instrument for conscription. Does it satisfy all the criteria for a valid instrument? (WORDS ONLY, NO CODING.)

*9.
*Interpret the 2SLS result. Which sub-population's average treatment effect does it estimate? Is it reasonable to call it a local average treatment effect?
*Is it reasonable to call it a treatment-on-the-treated effect? (WORDS ONLY, NO CODING.)

*10.
*Suppose we are concerned that ID numbers (and therefore draft numbers) are correlated with characteristics that raise a person's risk of committing crimes. Estimate the effect of conscription on crimerate in a fuzzy regression discontinuity design. Because the problem set is getting a little long, I will guide you through it.
*(a) To motivate the regression discontinuity design, draw scatter plots of conscription rates against draft numbers by birth year:
*scatter conscripted draftnumber, by(birthyr)
*(b) Generate a new variable distance that measures the distance from the birth-year-specific cutoff. Drop observations with distance greater than 100 or less than -100. This step effectively restricts our analysis to a bandwidth of 100.
*(c) Draw a scatter plot of conscripted against distance. Do your results suggest that crossing the cutoff raises conscription?
*(d) Draw a scatter plot of crimerate against distance. Do your results suggest that crossing the cutoff raises crime rates?
*(e) Now run a two-stage least squares regression to estimate the effect of conscripted on crimerate in a regression discontinuity design. Use local linear regression with a bandwidth of 100. To allow the slopes to be different above and below the cutoff, first generate the interaction of distance with eligible:
*gen distanceXelig = distance*eligible
*Then estimate:
*ivregress 2sls crimerate (conscripted = eligible) distance distanceXelig, r first
*Explain the command, and interpret both the first- and second-stage results. Do the results confirm your interpretation of the scatter plots in parts (c) and (d) of the question?
*(f) Why do you think the regression discontinuity results are different from the results earlier in the problem set?

translate pset4_log_file.smcl pset4_log_file.pdf, replace // This turns your logfile into a pdf to submit on Gradescope
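A minimal sketch of the eligibility coding asked for in question 3 and the distance variable from question 10(b), run on a few made-up rows rather than the real dataset. It assumes birthyr is stored as a four-digit year and that distance is defined as draftnumber minus the cutoff; the problem set confirms neither, and the variable name cutoff is introduced here only for illustration.

```stata
* Toy rows only: these birthyr and draftnumber values are made up for illustration
clear
input birthyr draftnumber
1958 174
1958 175
1960 340
1960 341
1962 500
end

* Birth-year-specific cutoffs, copied from the table in question 3
* (assumes birthyr holds four-digit years)
gen cutoff = .
replace cutoff = 175 if birthyr == 1958
replace cutoff = 320 if inlist(birthyr, 1959, 1962)
replace cutoff = 341 if birthyr == 1960
replace cutoff = 350 if birthyr == 1961

* eligible = 1 when the draft number is at or above the cutoff, 0 otherwise
gen eligible = (draftnumber >= cutoff) if !missing(draftnumber, cutoff)

* Distance from the cutoff (assumed sign: positive means at or above the cutoff)
gen distance = draftnumber - cutoff

* Keep only observations within the bandwidth of 100
drop if distance > 100 | distance < -100

list birthyr draftnumber cutoff eligible distance
```

On the real data, the same lines from gen cutoff onward would run after the use command in place of the clear and input block. The 1962 toy row with draft number 500 sits 180 above its cutoff, so the bandwidth step drops it.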