Is anyone available to do this before 23:30 CET?
Problem Set 1
Deadline: Wednesday, September 28, 2022, 8:50 am

Problem 1 (1 point)
In this year’s Econometrics I class at the CEU Department of Economics and Business, there are 12 students: 9 boys and 3 girls.
a) Let us randomly choose one student and define the random variable X1 as this student’s gender (i.e. X1 = 0 if the student is a boy and X1 = 1 if the student is a girl). What is the probability distribution of this random variable? Does it belong to any type of distribution mentioned during the lectures?
b) Let us randomly choose a group of 4 students and define the random variable X2 as the number of girls in this group. What are the possible values of this random variable? Does this random variable belong to any type of distribution mentioned during the lectures?

Problem 2 (1 point)
a) Let X be an exponentially distributed random variable with parameter θ > 0. Find the median of this distribution.
b) Let U be a uniformly distributed random variable on the interval [a; b], where a < b are real numbers. Find the mean absolute deviation and the standard deviation of this random variable, and compare them.

Problem 3 (1 point)
Let X and Y be two random variables with E(X) = 1, E(Y) = 2, Var(X) = 4, Var(Y) = 9, and Corr(X; Y) = 0.5.
a) Let Z = 3X − 2Y − 4. Calculate E(Z) and Var(Z).
b) Let U = X + 5Y. Find the correlation coefficient between U and Z.

Problem 4 (3 points)
In this problem you will investigate whether, in the English Premier League, the number of goals scored by the visiting (or away) team depends on the number of goals scored by the home team. For this purpose, please use the Stata data file called epl_games.dta, taken directly from the data repository of the textbook Data Analysis for Business, Economics and Policy by Gábor Békés and Gábor Kézdi. Using the data in this file, please write a mini report that answers the question asked in the first sentence.

Basic information for the data file:
1. It contains data on all English Premier League games between the seasons of 2008/2009 and 2018/2019 (11 seasons with 380 games per season, altogether 4,180 observations).
2. The variable called season contains the beginning year of the season in which the match was played. (See the output of tab season.)
3. The variables of interest, for which you investigate whether they are independent or not, are goals_home and goals_away. The first shows the number of goals scored by the home team, while the second shows the number of goals scored by the away team. (See the output of tab goals_home and tab goals_away.)

Hints for the solution (please feel free to use Stata functions other than those recommended here if you find them more appropriate):
1. Because there are relatively few matches in which either team scores more than 4 goals, you can do your analysis by considering the following five categories for both variables: 0 goals / 1 goal / 2 goals / 3 goals / at least 4 goals.
2. Start by presenting conditional distributions: plot these conditional distributions and speculate about what they indicate for the independence of the variables. The Stata function tab [variable] if [condition] should be appropriate for calculating these conditional distributions, while with the Stata function hist you can graph them.
3. Then present conditional expected values (or conditional means), and discuss what they imply for independence. The Stata function sum [variable] if [condition], d will calculate these expected values, together with other basic statistics, for you.
4. Next, calculate the appropriate correlation coefficient(s). The Stata function pwcorr [variable1] [variable2], sig star(0.05) should do the job. (The option “sig star” will tell you if the coefficient is “substantially” different from zero. If it is indeed substantially different from zero, then you will see a star (*) next to the estimated coefficient. We will learn the technical details of this later in the course.)
5. Finally, briefly evaluate how robust your result is: is it true in all 11 seasons, or just in some seasons but not in others? Present some empirical evidence that supports your claim in a convincing way.

To be submitted by everybody:
1. Solutions to Problems 1-3 (printed, or readable hand-written).
2. A short report for Problem 4, containing simple graphs and/or tables that support your claims.
3. A Stata do-file for your calculations for Problem 4. These files (or pictures of them) should be sent by email to the following email addresses:
reiffa@ceu.edu and mark lili@phd.ceu.edu.
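
For anyone picking this up, the hinted commands for Problem 4 could be strung together in a do-file roughly like this. It is only a sketch, not a finished solution. It assumes the data file is named epl_games.dta and sits in the working directory, and that the variables are spelled goals_home and goals_away (the pasted text shows spaces where underscores are assumed here). The names home_cat and away_cat are made up for this example.

```stata
* Sketch of a do-file for Problem 4.
* Assumptions: file name epl_games.dta and variable names
* goals_home / goals_away (underscores assumed).
clear all
use "epl_games.dta", clear

* Hint 1: five categories, 0 / 1 / 2 / 3 / at least 4 goals.
* home_cat and away_cat are hypothetical helper variables.
gen home_cat = min(goals_home, 4) if !missing(goals_home)
gen away_cat = min(goals_away, 4) if !missing(goals_away)

* Hint 2: distribution of away goals conditional on home goals,
* first as one table of row percentages, then one table per category.
tab home_cat away_cat, row nofreq
forvalues h = 0/4 {
    tab away_cat if home_cat == `h'
}
* One histogram panel per home-goal category.
hist away_cat, discrete percent by(home_cat)

* Hint 3: conditional means of away goals (plus detailed statistics).
forvalues h = 0/4 {
    sum goals_away if home_cat == `h', d
}

* Hint 4: correlation, starred if significant at the 5% level.
pwcorr goals_home goals_away, sig star(0.05)

* Hint 5: the same correlation, season by season.
bysort season: pwcorr goals_home goals_away, sig star(0.05)
```

The interpretation of the output (whether the conditional distributions and means look alike, and whether the correlation is starred in every season) still has to go into the written report.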