Answer To: Excel file attached
Saravana answered on May 15 2021
Ques 7.13.docx
Ques 7.13
A. Estimate the regression equation with months since last service as the only independent variable. Express the relationship in an equation. Test the coefficient at 0.05 level of significance. Interpret it. What is the coefficient of determination? Interpret it.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.730874
R Square            0.534177
Adjusted R Square   0.475949
Standard Error      0.781022
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   1    5.596033   5.596033   9.173887   0.016338
Residual     8    4.879967   0.609996
Total        9    10.476

                            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                   2.147273       0.604977         3.549344   0.007517   0.752193    3.542353
Months Since Last Service   0.304132       0.100412         3.028842   0.016338   0.072582    0.535683
Repair time (hours) = 2.15 + (Months since last service) * 0.30
Intercept: The intercept coefficient is 2.15 and it is significantly different from zero (t(8) = 3.55, p = 0.0075). It is the predicted repair time, about 2.15 hours, when months since last service is zero.
The months since last service coefficient is 0.30 and it is significant at the 0.05 level (t(8) = 3.03, p = 0.016). Each additional month since the last service increases the predicted repair time by 0.30 hours.
The coefficient of determination, R squared = 0.53, means that 53% of the variance in repair time is explained by months since last service.
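The part A estimates can be reproduced by hand from the ten observations listed in this solution. The following is a minimal sketch in plain Python, not the Excel procedure used above; the function name `simple_ols` is only illustrative.

```python
# Ten service calls, in the order of the dummy-coded tables in this solution.
months = [2, 6, 8, 3, 2, 7, 9, 8, 4, 6]
hours = [2.9, 3.0, 4.8, 1.8, 2.9, 4.9, 4.2, 4.8, 4.4, 4.5]


def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # With a single predictor, R squared is the squared correlation.
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r_squared


b0, b1, r2 = simple_ols(months, hours)
print(f"intercept={b0:.6f} slope={b1:.6f} R^2={r2:.6f}")
```

The printed values should agree with the Coefficients and R Square entries of the summary output to the digits shown there.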
B. Use model in a to predict repair time in hours for the below data. Compute the residuals. Do you see any pattern in residuals related to type of repair or the repairperson? Create two scatter diagrams of actual repair time against months since last service.
Repair Time in Hours   Months Since Last Service   Type of Repair   Repairperson   Predicted Repair Time in Hours   Residuals
1.8                    3                           Mechanical       Donna Newton   3.05                             -1.25
3                      6                           Mechanical       Donna Newton   3.97                             -0.97
4.2                    9                           Mechanical       Bob Jones      4.88                             -0.68
2.9                    2                           Electrical       Donna Newton   2.75                             0.14
2.9                    2                           Electrical       Donna Newton   2.75                             0.14
4.8                    8                           Electrical       Bob Jones      4.58                             0.21
4.8                    8                           Mechanical       Bob Jones      4.58                             0.21
4.5                    6                           Electrical       Donna Newton   3.97                             0.52
4.9                    7                           Electrical       Bob Jones      4.27                             0.62
4.4                    4                           Electrical       Bob Jones      3.36                             1.03
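The model over-predicts the repair time for three of the four mechanical repairs, so the residuals for mechanical repairs are mostly negative, while all six electrical repairs have positive residuals. The pattern by repairperson is weaker: Donna Newton has the two largest negative residuals and Bob Jones the two largest positive ones, but both repairpersons have residuals of either sign.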
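The residual pattern can be checked numerically by averaging the residuals within each group. This is a small sketch assuming the part A coefficients at full precision; the helper `mean_residual_by` is a made-up name. Because it uses unrounded predictions, individual residuals can differ from the table in the last digit.

```python
# (repair time, months since last service, type of repair, repairperson)
rows = [
    (1.8, 3, "Mechanical", "Donna Newton"),
    (3.0, 6, "Mechanical", "Donna Newton"),
    (4.2, 9, "Mechanical", "Bob Jones"),
    (2.9, 2, "Electrical", "Donna Newton"),
    (2.9, 2, "Electrical", "Donna Newton"),
    (4.8, 8, "Electrical", "Bob Jones"),
    (4.8, 8, "Mechanical", "Bob Jones"),
    (4.5, 6, "Electrical", "Donna Newton"),
    (4.9, 7, "Electrical", "Bob Jones"),
    (4.4, 4, "Electrical", "Bob Jones"),
]

# Coefficients of the part A model, taken from the summary output.
INTERCEPT, SLOPE = 2.147273, 0.304132


def mean_residual_by(rows, key_index):
    """Average residual (actual - predicted) for each value of one column."""
    groups = {}
    for row in rows:
        residual = row[0] - (INTERCEPT + SLOPE * row[1])
        groups.setdefault(row[key_index], []).append(residual)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


by_type = mean_residual_by(rows, 2)
by_person = mean_residual_by(rows, 3)
print(by_type)
print(by_person)
```

A clearly negative mean for one group and a positive mean for the other suggests that the grouping variable carries information the one-predictor model leaves out.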
Scatter plot of Repair time vs Months since last service split across Mechanical and electrical type of repairs
Scatter plot of Repair time vs Months since last service split across different repair person Donna Newton and Bob Jones
C. Create a dummy variable for type of repair. Estimate a regression model with months since last service and type of repair as independent variables. Express the relationship in an equation. Test the parameters of the model at 0.05 level of confidence. Interpret the parameters. Find the coefficient of determination. Interpret it.
Dummy Coded Data:
Repair Time in Hours   Months Since Last Service   Type of Repair Dummy Variable   Type of Repair
2.9                    2                           0                               Electrical
3.0                    6                           1                               Mechanical
4.8                    8                           0                               Electrical
1.8                    3                           1                               Mechanical
2.9                    2                           0                               Electrical
4.9                    7                           0                               Electrical
4.2                    9                           1                               Mechanical
4.8                    8                           1                               Mechanical
4.4                    4                           0                               Electrical
4.5                    6                           0                               Electrical
Linear Regression output:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.926928
R Square            0.859195
Adjusted R Square   0.818964
Standard Error      0.459048
Observations        10

ANOVA
             df   SS         MS         F        Significance F
Regression   2    9.000923   4.500461   21.357   0.001048
Residual     7    1.475077   0.210725
Total        9    10.476

                                Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                       2.193189       0.355761         6.164787   0.000461   1.351948    3.034429
Months Since Last Service       0.387616       0.062565         6.195396   0.000447   0.239673    0.535559
Type of Repair Dummy Variable   -1.26269       0.314127         -4.0197    0.005062   -2.00549    -0.5199
Repair time (hours) = 2.19 + (Months since last service) * 0.39 - (Type of repair dummy) * 1.26, where the dummy is 0 for electrical and 1 for mechanical repairs.
Months since last service still has a positive coefficient of 0.39, and the coefficient is significant (t(7) = 6.20, p = 0.0004). Holding type of repair constant, each additional month since the last service increases the repair time by 0.39 hours.
The coefficient of the dummy-coded type of repair variable is the difference in mean repair time between mechanical repairs (dummy coded 1) and electrical repairs (dummy coded 0), holding months since last service constant. The coefficient is -1.26, so mechanical repairs take about 1.26 hours less; in other words, electrical repairs take longer than mechanical repairs. The t-test here tests whether this difference between the two groups is significantly different from zero, and it is (t(7) = -4.02, p = 0.0051).
The coefficient of determination, R Square = 0.859195, means that about 86% of the variance in repair times is explained by the two predictors: months since last service and type of repair (electrical or mechanical).
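The part C coefficients can be recomputed from the dummy-coded data by solving the normal equations (X'X) b = X'y. The sketch below is an illustration in plain Python rather than the Excel routine; `solve` and `ols` are hypothetical helper names.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


months = [2, 6, 8, 3, 2, 7, 9, 8, 4, 6]
mechanical = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0]  # 0 = electrical, 1 = mechanical
hours = [2.9, 3.0, 4.8, 1.8, 2.9, 4.9, 4.2, 4.8, 4.4, 4.5]

# Each design row is [1, months, dummy]; the leading 1 gives the intercept.
design = [[1.0, m, d] for m, d in zip(months, mechanical)]
coef = ols(design, hours)

fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
y_bar = sum(hours) / len(hours)
sse = sum((y - f) ** 2 for y, f in zip(hours, fitted))
sst = sum((y - y_bar) ** 2 for y in hours)
r2 = 1 - sse / sst
print([round(c, 6) for c in coef], round(r2, 6))
```

The three coefficients are returned in the order intercept, months since last service, type of repair dummy, and should match the summary output above.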
D. Create a dummy variable for repairperson. Estimate a regression model with months since last service and repairperson as independent variables. Express the relationship in an equation. Test the parameters of the model at 0.05 level of confidence. Interpret the parameters. Find the coefficient of determination. Interpret it.
The dummy coded data:
Repair Time in Hours   Months Since Last Service   Repairperson Dummy Variable   Repairperson
2.9                    2                           1                             Donna Newton
3.0                    6                           1                             Donna Newton
4.8                    8                           0                             Bob Jones
1.8                    3                           1                             Donna Newton
2.9                    2                           1                             Donna Newton
4.9                    7                           0                             Bob Jones
4.2                    9                           0                             Bob Jones
4.8                    8                           0                             Bob Jones
4.4                    4                           0                             Bob Jones
4.5                    6                           1                             Donna Newton
Repairperson: Donna Newton is coded as 1 and Bob Jones as 0.
Linear regression output:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.824936
R Square            0.680519
Adjusted R Square   0.589238
Standard Error      0.691467
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   2    7.129114   3.564557   7.455258   0.018431
Residual     7    3.346886   0.478127
Total        9    10.476

                              Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                     3.526329       0.93808          3.75909    0.007083   1.308121    5.744537
Months Since Last Service     0.151899       0.123006         1.234884   0.256713   -0.13897    0.442763
Repairperson Dummy Variable   -1.08354       0.605112         -1.79065   0.116467   -2.51441    0.347318
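Repair time (hours) = 3.53 + (Months since last service) * 0.15 - (Repairperson dummy) * 1.08, where the dummy is 1 for Donna Newton and 0 for Bob Jones.
Though the regression model as a whole is significant (F(2,7) = 7.46, p = 0.018), we cannot find any significant individual predictor at the 0.05 level. The months since last service coefficient is 0.15 and is not significant (t(7) = 1.23, p = 0.257). The repairperson coefficient is -1.08, which indicates longer repair times for Bob than for Donna at the same months since last service, but this difference is not significant (t(7) = -1.79, p = 0.116).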
The model has an R square = 0.68 meaning that the model as a whole explains 68% of variance in the repair times.
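The significance tests in part D can be retraced from the coefficients and standard errors in the output. The sketch below assumes the two-sided 5% critical value of Student's t with 7 degrees of freedom, about 2.3646; that value is not printed in the Excel output but is consistent with its 95% bounds.

```python
# Two-sided 5% critical value of Student's t with 7 df (assumed, see above).
T_CRIT_DF7 = 2.3646

# (coefficient, standard error) copied from the part D summary output.
estimates = {
    "Intercept": (3.526329, 0.93808),
    "Months Since Last Service": (0.151899, 0.123006),
    "Repairperson Dummy Variable": (-1.08354, 0.605112),
}


def summarize(coef, se, t_crit=T_CRIT_DF7):
    """Return (t statistic, lower 95%, upper 95%, significant at 0.05)."""
    t_stat = coef / se
    lower = coef - t_crit * se
    upper = coef + t_crit * se
    return t_stat, lower, upper, abs(t_stat) > t_crit


results = {name: summarize(c, se) for name, (c, se) in estimates.items()}
for name, (t_stat, lower, upper, significant) in results.items():
    print(f"{name}: t={t_stat:.3f}, 95% CI=({lower:.3f}, {upper:.3f}), "
          f"significant={significant}")
```

A coefficient is significant at the 0.05 level exactly when its 95% interval excludes zero, which is why both slope intervals here straddle zero.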
E. Estimate a regression model with months since last service, type of repair and repairperson as independent variables. Express the relationship in an equation. Test the parameters of the model at 0.05 level of confidence. Interpret the parameters. Find the coefficient of determination. Interpret it.
The dummy coded data:
Repair Time in Hours   Months Since Last Service   Type of Repair Dummy Variable   Repairperson Dummy Variable   Type of Repair   Repairperson
2.9                    2                           0                               1                             Electrical       Donna Newton
3.0                    6                           1                               1                             Mechanical       Donna Newton
4.8                    8                           0                               0                             Electrical       Bob Jones
1.8                    3                           1                               1                             Mechanical       Donna Newton
2.9                    2                           0                               1                             Electrical       Donna Newton
4.9                    7                           0                               0                             Electrical       Bob Jones
4.2                    9                           1                               0                             Mechanical       Bob Jones
4.8                    8                           1                               0                             Mechanical       Bob Jones
4.4                    4                           0                               0                             Electrical       Bob Jones
4.5                    6                           0                               1                             Electrical       Donna Newton
Linear regression Output:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.948789
R Square            0.9002
Adjusted R Square   0.8503
Standard Error      0.417434
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   3    9.430492   3.143497   18.04002   0.002091
Residual     6    1.045508   0.174251
Total        9    10.476

                                Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                       2.962567       0.587176         5.045452   0.002344   1.5258      4.399334
Months Since Last Service       0.291444       0.083598         3.486238   0.013043   0.086886    0.496002
Type of Repair Dummy Variable   -1.10241       0.303344         -3.63418   0.010911   -1.84466    -0.36015
Repairperson Dummy Variable     -0.60909       0.38793          -1.5701    0.167444   -1.55832    0.34014
Repair time (hours) = 2.96 + (Months since last service) * 0.29 - (Type of repair dummy) * 1.10 - (Repairperson dummy) * 0.61, where the type of repair dummy is 1 for mechanical repairs and the repairperson dummy is 1 for Donna Newton.
Months since last service is a significant predictor in the model (t(6) = 3.49, p = 0.013). Its coefficient is 0.29, which indicates that each additional month since the last service increases the repair time by 0.29 hours, holding the other predictors constant.
The difference between mechanical and electrical repairs is significant (t(6) = -3.63, p = 0.011). The coefficient is -1.10, so mechanical repairs take about 1.10 hours less than electrical repairs; that is, repair times are longer for electrical repairs.
There is no significant difference between the repair times of Bob and Donna (t(6) = -1.57, p = 0.167). The coefficient is negative, which indicates a trend of longer repair times for Bob, but this trend is not significant.
The R squared of the model with three predictors is R-square = 0.9002. The model explains 90% of variance in repair times.
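To see what the three-predictor equation implies for individual service calls, the fitted coefficients can be applied to each row of the dummy-coded data. This is only a sketch using the coefficients as printed in the output; `predict` is a made-up helper name.

```python
# Coefficients copied from the part E summary output.
COEF = {
    "intercept": 2.962567,
    "months": 0.291444,
    "mechanical": -1.10241,  # 1 = mechanical, 0 = electrical
    "donna": -0.60909,       # 1 = Donna Newton, 0 = Bob Jones
}


def predict(months, mechanical, donna):
    """Predicted repair time in hours from the three-predictor model."""
    return (COEF["intercept"]
            + COEF["months"] * months
            + COEF["mechanical"] * mechanical
            + COEF["donna"] * donna)


# (repair time, months since last service, type dummy, repairperson dummy)
data = [
    (2.9, 2, 0, 1), (3.0, 6, 1, 1), (4.8, 8, 0, 0), (1.8, 3, 1, 1),
    (2.9, 2, 0, 1), (4.9, 7, 0, 0), (4.2, 9, 1, 0), (4.8, 8, 1, 0),
    (4.4, 4, 0, 0), (4.5, 6, 0, 1),
]

actual = [row[0] for row in data]
fitted = [predict(*row[1:]) for row in data]
y_bar = sum(actual) / len(actual)
sse = sum((y - f) ** 2 for y, f in zip(actual, fitted))
sst = sum((y - y_bar) ** 2 for y in actual)
r2 = 1 - sse / sst

example = predict(6, 1, 1)  # mechanical repair by Donna, 6 months since service
print(f"example prediction={example:.2f} hours, R^2={r2:.4f}")
```

The example call corresponds to the second row of the data, whose actual repair time was 3.0 hours.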
F. Which model would you use? Why?
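The model with three predictors (months since last service, type of repair and repairperson) has an R square of 0.9002. The two-predictor model with months since last service and type of repair has an R square of 0.859, and the two-predictor model with months since last service and repairperson has an R square of 0.68. Based on the largest R square value we can choose the model with three predictors as the best model. Because R square never decreases when a predictor is added, adjusted R square is the fairer comparison; it is also largest for the three-predictor model (0.850, against 0.819 and 0.589), although the repairperson coefficient in that model is not individually significant (p = 0.167).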
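Since R square can only go up as predictors are added, a comparison that penalizes model size is a useful cross-check. The sketch below recomputes adjusted R square from the R Square values in the four summary outputs; the dictionary labels are informal names for the models, not Excel output.

```python
N = 10  # observations in every model

# (R square from the summary output, number of predictors)
models = {
    "months": (0.534177, 1),
    "months + type of repair": (0.859195, 2),
    "months + repairperson": (0.680519, 2),
    "months + type of repair + repairperson": (0.9002, 3),
}


def adjusted_r2(r2, k, n=N):
    """Adjusted R square for a model with k predictors and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


adjusted = {name: adjusted_r2(r2, k) for name, (r2, k) in models.items()}
best = max(adjusted, key=adjusted.get)

for name, value in adjusted.items():
    print(f"{name}: adjusted R^2 = {value:.4f}")
print("largest adjusted R^2:", best)
```

The recomputed values should match the Adjusted R Square rows of the outputs, so the ranking by adjusted R square can be read off directly.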
[Chart: Repair time (hours) vs Months since last service; series: Mechanical Repair, Electrical Repair]
[Chart: Repair time (hours) vs Months since last service; series: Donna, Bob]
Ques 7.18.docx
Ques 7.18:
A. Develop a scatter chart, treating asking rent as independent variable. Does a simple linear regression model appear to be appropriate?
B. Develop a simple linear regression to explain monthly mortgage. Express the relationship in an equation. Plot the residuals. Does a simple linear regression model appear to be appropriate based on the residual plot.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.869565
R Square            0.756143
Adjusted R Square   0.725661
Standard Error      78.78191
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   1    153961.7   153961.7   24.80616   0.001079
Residual     8    49652.72   6206.59
Total        9    203614.4

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   -197.958       187.695          -1.05468   0.322379   -630.784    234.8671
Rent ($)    1.069929       0.21482          4.980579   0.001079   0.574553    1.565305
Mortgage ($) = -197.958 + (Rent ($)) * 1.07
RESIDUAL OUTPUT

Observation   Predicted Mortgage ($)   Residuals
1             700.7819                 -161.782
2             938.306                  63.69396
3             682.5931                 -56.5931
4             635.5162                 75.4838
5             653.705                  1.295014
6             947.9354                 29.0646
7             821.6838                 -45.6838
8             712.5511                 -17.5511
9             617.3274                 33.67259
10            575.6002                 78.39981
Residual Plot:
The residual plot has one prominent feature: the residuals follow a curvilinear pattern against the fitted values, which hints at non-linearity in the data.
Thus, based on the curvilinear pattern in the residual plot, we can conclude that a simple linear regression model is not appropriate here; the relationship is likely to be better explained by a quadratic model.
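The curvilinear pattern can also be seen without the plot by ordering the residuals by fitted value and looking at their signs. This is a rough sketch using the RESIDUAL OUTPUT values above; the sign-string check is an informal device, not a formal test of non-linearity.

```python
# (predicted mortgage, residual) pairs from the RESIDUAL OUTPUT table.
residual_output = [
    (700.7819, -161.782),
    (938.306, 63.69396),
    (682.5931, -56.5931),
    (635.5162, 75.4838),
    (653.705, 1.295014),
    (947.9354, 29.0646),
    (821.6838, -45.6838),
    (712.5511, -17.5511),
    (617.3274, 33.67259),
    (575.6002, 78.39981),
]

# Sort by fitted value, then record the sign of each residual in that order.
ordered = sorted(residual_output)
signs = "".join("+" if residual > 0 else "-" for _, residual in ordered)
print(signs)
```

Under a well-specified linear model the signs would be scattered with no obvious order; long runs of the same sign at the low end, the middle and the high end of the fitted values are what a U-shaped residual plot looks like in this form.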
C. Develop a quadratic regression model. Express the relationship in an equation.
Quadratic regression...