Week 5 Homework (DATA 310)

Problem 1
1. Load in the ACSCountyData.Rdata dataset. Run a regression where ‘percent.college’ (the percent within a county that graduated from college) is the independent variable and ‘median.income’ (the median income in each county) is the dependent variable.
2. Looking at the output of the regression, first interpret what the estimated coefficient of percent.college means.
3. What null hypothesis is being tested automatically for percent.college? What is the alternative hypothesis? What is the result of that hypothesis test?
Problem 2
1. Using the same acs data, run a multiple regression where ‘percent.walk.commute’ is your dependent variable and ‘census.region’ is your independent variable.
2. Looking at the output of your regression, interpret what the estimated coefficients mean (don’t forget to interpret your Intercept!).
3. Construct a regression formula using your regression output.
4. Using your formula, estimate the percent who walk to work in the Northeast region.
5. What hypothesis test(s) are being run automatically in this regression?
6. What are the findings of those hypothesis tests?
Answered 3 days after Feb 10, 2023

Banasree answered on Feb 14 2023
Problem 1
1. Ans.
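R script
# Load the dataset. The file is assumed to be in the working directory and
# to contain a data frame named `acs` (the name used in the output below).
load("ACSCountyData.Rdata")
# Run the regression: median.income is the dependent variable,
# percent.college is the independent variable
model <- lm(median.income ~ percent.college, data = acs)
# Print the results
summary(model)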
Output
Call:
lm(formula = median.income ~ percent.college, data = acs)

Residuals:
   Min     1Q Median     3Q    Max
-43499  -5704   -104   6261  52977

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)     28897.69     482.15   59.94   <2e-16 ***
percent.college  1015.14      20.51   49.50   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10890 on 3217 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.4324,  Adjusted R-squared:  0.4322
F-statistic:  2451 on 1 and 3217 DF,  p-value: < 2.2e-16
2. Ans.
1. The estimated coefficient of `percent.college` is 1015.14. A one-percentage-point increase in the share of a county's residents who graduated from college is associated with a median income that is $1,015.14 higher, on average. Because this is a simple regression with a single predictor, no other variables are being held constant.
2. This coefficient is statistically significant (p < 2e-16): a slope this far from zero would be very unlikely to appear by chance if there were no linear relationship between `percent.college` and `median.income`.
3. The R-squared value of 0.4324 means that about 43% of the variance in `median.income` across counties is explained by `percent.college` in this model.
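The slope interpretation can be checked by plugging the reported estimates into the fitted line. The following is a minimal sketch that hard-codes the coefficients from the output above; the helper name `predict_income` and the example percentages (20 and 21) are illustrative assumptions, not part of the original answer.

```r
# Coefficients copied from the regression output above
intercept <- 28897.69
slope     <- 1015.14

# Hypothetical helper: fitted median income for a given percent.college
predict_income <- function(pct_college) {
  intercept + slope * pct_college
}

# Two illustrative counties that differ by one percentage point
income_20 <- predict_income(20)
income_21 <- predict_income(21)

# The gap between the two fitted values equals the slope
diff_income <- income_21 - income_20
print(round(diff_income, 2))
```

With the real data loaded, `coef(model)` and `predict(model, newdata = ...)` would give the same comparison without copying numbers by hand.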
3. Ans.
1. The null hypothesis tested automatically for `percent.college` is that the true population slope (i.e., coefficient) of `percent.college` is equal to zero. In other words, there is no linear relationship between the percentage of people who graduated from college within a county and the median income in that county.
2. The alternative hypothesis is that the true population slope is not equal to zero, i.e., that there is a linear relationship between the two variables. The test is two-tailed: the slope may be either greater than or less than zero.
3. The result of the hypothesis test is reported in the "Coefficients" table of the regression output. The t-value for `percent.college` is 49.50, with a corresponding p-value of less than 2e-16, far below the typical significance level of 0.05. We therefore reject the null hypothesis and conclude that there is a statistically significant linear relationship between `percent.college` and `median.income`.
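The t-value and p-value in the table come from dividing the estimate by its standard error and comparing the result with a t distribution on the residual degrees of freedom. The sketch below redoes that arithmetic from the printed numbers; since those inputs are rounded, it reproduces the reported t-value only approximately. The variable names are illustrative.

```r
# Values copied from the Coefficients table and the residual degrees of freedom
estimate  <- 1015.14
std_error <- 20.51
df_resid  <- 3217

# t statistic for H0: slope = 0
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

print(round(t_value, 1))  # close to the reported 49.50
print(p_value < 0.05)     # TRUE means H0 is rejected at the 5% level
```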
R script
# Fit the regression model
model <- lm(median.income ~ percent.college, data = acs)
# Create a scatter plot of the data with the regression line
plot(acs$percent.college, acs$median.income,
     xlab = "Percentage of College Graduates",
     ylab = "Median Income")
abline(model, col = "blue")
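Output (scatter plot of median income against percentage of college graduates, with the fitted regression line in blue; figure not reproduced)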
Problem 2
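1. Ans.
R script
# Load the dataset. The file is assumed to be in the working directory and
# to contain a data frame named `acs`.
load("ACSCountyData.Rdata")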
# Fit the regression model
model <- lm(percent.walk.commute ~ census.region, data = acs)
# Print the model...
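
With a categorical predictor such as `census.region`, `lm()` treats one level as the reference category by default: the intercept is that group's mean, and every other coefficient is the difference between a group's mean and the reference mean. The sketch below illustrates this with a small made-up data frame. The values, the level names, and the names `toy` and `toy_model` are hypothetical and are not ACS results.

```r
# Hypothetical toy data, NOT taken from the ACS dataset
toy <- data.frame(
  region = factor(c("midwest", "midwest", "northeast", "northeast",
                    "south", "south")),
  walk   = c(2, 4, 5, 7, 1, 3)
)

# Regression on a factor: R creates dummy variables for all levels
# except the first one ("midwest"), which becomes the reference
toy_model <- lm(walk ~ region, data = toy)
coefs <- coef(toy_model)

# Group means, for comparison with the coefficients
group_means <- tapply(toy$walk, toy$region, mean)

# Intercept = mean of the reference level;
# Intercept + regionnortheast = mean of the "northeast" group
northeast_fit <- unname(coefs["(Intercept)"] + coefs["regionnortheast"])

print(coefs)
print(group_means)
```

The same pattern applies to estimating a region's percentage from a regression formula: set that region's dummy to 1, all other dummies to 0, and add the corresponding coefficient to the intercept.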