DEPARTMENT OF ECONOMICS
ECON 4041H – RESEARCH METHODOLOGY
Fall 2021, Peterborough

Assignment #3
Due date: November 17, 2021

Instructions: You must provide your own unique solution. You may work with others, but each of you is responsible for submitting your own problem set solution. Each question is 50 marks and each part is of equal value. Submit solution through SafeAssign. Submission of one file generated using RMarkdown is best, but acceptable alternatives are allowed.

1. Use the dataset “klemA3.csv” to estimate the aggregate production function for an entire economy. This is a nice application of the basic economic theory of production. Start with a basic Cobb-Douglas production function

$Y = A K^{\alpha} L^{\beta}$

where Y is output (value-added), K is capital stock, and L is labour. We can estimate this Cobb-Douglas production function as a linear model by log-transforming [1] it into

$\log(Y) = \log(A) + \alpha \log(K) + \beta \log(L)$.

A more flexible functional form allows for interaction among factors, yielding:

$\log(Y) = \log(A) + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L))$.

The term log(A) represents a productivity parameter and is the coefficient on the constant in the regression, so relabel it as $\alpha_0$:

$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L))$.

Note the equation has $(\log(L))^2$, not $\log(L^2)$.

The variables in the dataset are:
• ind: industry label
• indnum: an integer identifying an industry
• year: year
• y: value of gross output ($ millions)
• k: value of capital input ($ millions)
• l: value of labour input ($ millions)
• int: value of intermediate inputs ($ millions)

[1] log is natural log (or ln) in R.

a. Estimate the production function

$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \beta_1 \log(L) + \alpha_2 (\log(K))^2 + \beta_2 (\log(L))^2 + \gamma (\log(K) \times \log(L)) + \varepsilon$

where Y is value-added, calculated as gross output minus intermediate inputs (y − int). Report your results, and comment briefly.

b. Is the Cobb-Douglas production function sufficient?
Or is the full flexible functional form appropriate? Use a formal test (or tests) to support your conclusion.

c. Generate predicted value-added levels Y using emmeans() for
   i. mean values of K and L.
   ii. values of K and L equal to half their mean values.
   iii. values of K and L equal to twice their mean values.

Remember from micro theory that for a function $y = a K^{\alpha} L^{\beta}$, if doubling both inputs yields
• double the output, the function displays constant returns to scale.
• less than double the output, the function displays decreasing returns to scale.
• more than double the output, the function displays increasing returns to scale.

Does this estimated production function display decreasing, constant, or increasing returns to scale?

Note: you specify the values of K and L, and emmeans() will apply the log() transformation. So provide values of K and L in the “at =” parameter, not values of log(K) or log(L). Also, add the option type = "response" as an additional parameter to the emmeans() command. That option will convert the predicted means from log() values back into levels, essentially applying the exp() function to all output. To see that, try it without the option.

d. Estimate the marginal products of the two inputs. Since the marginal product of a factor of production is the partial derivative of output Y with respect to the factor ($\partial Y / \partial K$ and $\partial Y / \partial L$), you can use the margins() function for this.
   i. Estimate the marginal effect of capital (K) on value-added (Y), and graph the resulting estimates. Just like for emmeans() above, specify the values for K, not log(K), in the “at =” parameter. You will need to specify a vector of values of K to generate a vector of values of $\partial Y / \partial K$ to graph. Note that because the function takes log(K), your vector of values of K will have to be a geometric series (1, 2, 4, 8, ...; or 10, 100, 1000, ...) and not a linear series (1000, 2000, 3000, ...). Play around with this until you get it right.
   ii.
Repeat 1.d.i, but now with respect to labour L. The same details from above apply.
   iii. What do the graphs reveal, and are they consistent with the economic theory of production?

2. Using the labour force survey file, “lfs21.rds”, explore whether wages differ for immigrants and non-immigrants. We will also explore interaction effects of immigrant status with other potential explanatory variables. Some data processing is required.
• The variable immig has three categories. Two of the categories are for immigrants and identify time since they arrived in Canada. The third category represents non-immigrants. Recode the two immigrant categories into one, so that the new variable is a binary categorical variable identifying immigrant status only. In other words, the new variable will combine the two “Immigrant” status categories into one and the variable will be coded “immigrant” and “non-immigrant”. Let’s identify this new variable as im2.
• The variable union has three categories: union member, not unionized but under a collective agreement, and non-unionized. Recode this variable into a new binary variable combining the first two categories together into one that captures presence of a collective agreement. The variable will now code as either covered by a collective agreement or non-unionized. Let’s refer to it as ca2.
• Convert age_12 into a numeric variable, and drop the top age category “70 and over”. Let’s refer to it as age.

a. Run a regression with wages (hrlyearn) as the dependent variable, and use the following variables as explanatory variables: the numeric age in both linear and quadratic terms, education (educ), sex, sector of employment (cowmain), collective agreement status (ca2 from above), firmsize, immigrant status (im2 from above), and province (prov). You will have nine explanatory variables, including age as both linear and quadratic. Discuss the estimates and what the coefficients mean.
Provide a complete discussion for all but province. We will address province next.

b. Do wages differ by province? Do they differ for every province, or are some similar? Answer this part using both lht() and emmeans().

c. Now interact the immigration status variable with all the categorical variables in the model except province, so use: educ, sex, cowmain, ca2, and firmsize. Very briefly characterize the interaction terms.

d. Explaining interactions is challenging, so now use emmeans() to calculate the interaction effect of immigration status on wages for each of the other five categorical explanatory variables with which im2 is interacted. Run the interaction for each separately, i.e., first run emmeans() for immigration status and education, then immigration status and sex, etc. Feel free to add any additional analysis that helps provide explanation. It is often useful to graph the results of emmeans() (hint, hint). You may also find the contrast() function helpful after running emmeans(). I leave this part a bit open-ended and invite you to explore these economic relationships using the tools we have been reviewing.
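
A note on Question 1, parts c and d: the algebra behind those parts can be sketched as follows, assuming the flexible functional form from part a of Question 1. The scale factor $\lambda > 0$ is notation introduced here for illustration only and does not appear in the assignment. This is a way to check intuition, not the expected solution.

```latex
% Scaling both inputs by \lambda shifts each log input by \log\lambda.
% Expanding the squared and interaction terms gives the change in log output:
\begin{align*}
\log Y(\lambda K, \lambda L) - \log Y(K, L)
  &= (\alpha_1 + \beta_1)\,\log\lambda
   + (\alpha_2 + \beta_2 + \gamma)\,(\log\lambda)^2 \\
  &\quad + (2\alpha_2 + \gamma)\,\log\lambda\,\log(K)
   + (2\beta_2 + \gamma)\,\log\lambda\,\log(L).
\end{align*}

% Marginal products follow from the chain rule,
% \partial Y / \partial K = (Y / K)\,\partial\log(Y) / \partial\log(K):
\begin{align*}
\frac{\partial Y}{\partial K}
  &= \frac{Y}{K}\,\bigl(\alpha_1 + 2\alpha_2 \log(K) + \gamma \log(L)\bigr), \\
\frac{\partial Y}{\partial L}
  &= \frac{Y}{L}\,\bigl(\beta_1 + 2\beta_2 \log(L) + \gamma \log(K)\bigr).
\end{align*}
```

With $\alpha_2 = \beta_2 = \gamma = 0$ the first expression reduces to $(\alpha_1 + \beta_1)\log\lambda$, the Cobb-Douglas case, where returns to scale are the same at every input level. Otherwise the change in log output depends on $\log(K)$ and $\log(L)$, so comparing the predictions at half, mean, and twice-mean values can give different answers at different points.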