Assignment 1: Simulations In this first assignment, I want you to analyze the properties of a few estimators. I want you to create a Rmd file, compile it and upload both the Rmd and PDF files before...


Assignment 1: Simulations


In this first assignment, I want you to analyze the properties of a few estimators. I want you to create a Rmd


file, compile it and upload both the Rmd and PDF files before Sunday October 3 at 10:00pm. I usually do


not answer emails during the weekend, so do not start at the last minute in case you need help. You are


allowed to discuss with each other, but I expect each copy to be unique.


If some simulations are too long to compile, you can generate the solution and save it into an .Rda file and


load the file in your document to present the result. This will make it easier for you to compile the PDF. If


you choose to proceed this way, include the code for the simulation with the option eval=FALSE. Here is an


example. You first show the simulation code without running it (you can run it once on your side to get the


result):


Xvec numeric()


for (i in 1:10000)


{


x rnorm(5000)


Xvec[i] mean(x)


}


save(Xvec, file="A1Res.rda")


Then, you print the result:


load("A1Res.rda")


mu mean(Xvec)


rmse sqrt(mean(Xvecˆ2))


rmse


## [1] 0.01405928


If you use this method, upload all .rda files needed to compile the PDF.


Important: Before you start any simulation, set the seed to your student number. Also, set the number of


iterations to 2,000 for all your simulations.


Estimator of the covariance


We saw that the unbiased estimator of the variance, S2, is less efficient than ˆ 2 = (n − 1)S2/n. We also saw


in a simulation that the RMSE of ˆ 2 is smaller. In this question, we want to see if we get the same result for


the following estimators of the covariance:


Sxy = 1


n − 1


Xn


i=1


(xi − ¯x)(yi − ¯y)


 xy = 1


n


Xn


i=1


(xi − ¯x)(yi − ¯y)


Set the sample size to n = 25 and try three different pairs of distributions for X and Y . For each pair, report


the RMSE of both estimators. Interpret your result.


1


Estimator of the mean for non-identically distributed samples.


We saw that ¯X is the best linear unbiased estimator of the mean, which means that it is has the smallest


variance among all estimators that can be defined as:


ˆμ =


Pn


Pi=1 wixi n


i=1 wi


However, this is true only for iid samples. In your simulation, consider a sample of 100 observations with


xi   iid(0,  2


1) for i = 1, ..., 50 and xi   iid(0,  2


2) for i = 51, ..., 100.


• In the first simulation, compare the bias, standard error and RMSE for ¯X and ˆμ using wi = 1/ 2 for


i = 1, ..., 50 and wi = 1/ 2


2 for i = 51, ..., 100. We also assume that  2


i ’s are known. You can try a few


distributions and see if it affects the properties.


• In the second simulation, we use the same weights but we assume that the  2


i ’s are unknown. Therefore,


we replace them by estimators. Since it is a different estimator, we call it ˜μ. Using the same realized


samples as in the previous simulation, compare the bias, standard error and RMSE of ˜μ and ¯x.


Do we see a difference when the true variances are replaced by estimates? Explain.


Asymptotic distribution


For this question, we want to analyze the large sample properties of the fourth moment estimator:


ˆm4 = 1


n


Xn


i=1


X4


i


To analyze the properties, use two distributions: one with the first 8 moments being finite and one with the


8th moment being either infinite or undetermined. Then:


• Compare the consistency of ˆm4 for the two distributions when n goes from 10 to 2000. Explain your


result.


• Compare the distributions of p


n( ˆm4 − m4) for n = 10, ..., 2000. Do they seem to be asymptotically


normal? Explain why.


If you want the true m4, you can get it from the web, or you can compute it numerically. Since the definition


is


R


f(x)x4dx, where f(x) is the density, we could get it in R using the integrate function. For example, the


fourth moment of the N(0,4) is:


f function(x) dnorm(x, 0, 2)*xˆ4


m4 integrate(f, -Inf, Inf)


m4


## 48 with absolute error


It is therefore 48.


The Delta Method


Suppose X > 0 and we want to show that:


p


n(¯X 3 − μ3) ! N(0,  2(3μ2)2)


First, use the Delta method to show that it does converge to N(0,  2(3μ2)2). Then, use a simulation to see if


the distribution gets closer and closer to this normal distribution when n increases. Interpret your result


(e.g. is it a good approximation in small samples?).


2


For your population, you can choose any distribution with μ 6= 0 that satisfies the CLT for ¯X , but you cannot


use the normal distribution. If you want, you can try more than one distribution and compare the results.


Theoretically, what happens if μ = 0? Can you very that numerically?


Hypothesis tests


We want to test the Null hypothesis H0 : μ = c against the alternative H1 : μ 6= c at  . We know that when


the population is known to be normal, we have the following:


Test 1:


p


n(¯X − μ)


S


  tn−1


If the distribution of the population is unknown, we rely on the asymptotic distribution:


Test 2:


p


n(¯X − μ)


ˆ


! N(0, 1)


where ˆ 2 = (n − 1)S2/n. Answer the following question:


• Suppose xi   N(μ,  2). Using a simulation, compare the power curve of Test 1 and Test 2 for   = 5%.


You can choose any μ and  2. Set the sample size to 25.


• Answer the same question when xi is not normal. Choose any distribution.


What happens when n=300? Explain.


Bootstrap


Here, we want to see how good is the Bootstrap method to estimate the standard deviation of


ˆm4 = 1


n


Xn


i=1


X4


i


For this simulation, assume that xi   N(0, 1). Since we know all moments of the normal distribution, it


implies that V ar( ˆm4) = 96/n (prove it). Also, set the sample size of each sample to 20 and the number of


Bootstrap samples to 500.


• Estimate the bias of the Bootstrap estimator of the standard deviation.


• Using the 2,000 ˆm4 and 2,000 Bootstrap estimates of the standard deviation, estimate the probability


of rejecting H0 : m4 = 3 against H0 : m4 6= 3 using the asymptotic distribution when the size of the


test is   = 1%, 5% and 10=%. The asymptotic distribution is:


p


n( ˆm4 − 3)


SD( ˆm4) ! N(0, 1)


3

Sep 29, 2021
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here