This assignment is essentially the multiple regression analysis portion of your project. This means that I expect you to develop a good regression model with more than one independent variable (X). Ideally, if you made a good choice of variables in your proposal, you should be able to include all three X variables in your regression equation. Be sure to complete each part and write your responses supported by Minitab/Excel work. This assignment should be turned in to me as a Word document. You should include Excel and Minitab tables and graphs in the Word document as required. Be sure to comment on each of the numbered points below.
1. Run scatter plots and a correlation matrix on your project variables and comment on their values and significance. If you have done this earlier, you may use that analysis here.
2. Note any seasonality in your Y data with ACFs (autocorrelation analysis of Y). You may use ACFs that you previously developed.
3. Determine if any of your variables require transformation. If they do, calculate the transformed values, create a scatter plot with a regression line, and run a correlation with Y for each transformed X. Create a table for the Y, X, and transformed X values.
4. Determine if your model requires dummy variables (e.g., for Y variable seasonality or significant events) and include a table of the dummy variable values for regression analysis. To handle seasonality, you may either use Decomposition, with the centered moving average (CMA) of Y and seasonal indices (SI), to seasonally adjust your Y variable, or use dummy X variables in the regression.
5. Use regression to evaluate the variable combinations to determine the best regression model. Note that if any seasonal dummy variables are used, all of the seasonal dummy variables must be used. Use R-squared and F as the primary determinants of the best model.
Note the significance of each slope term in the model. Rule: if a coefficient is not significant, then you may not use the model to forecast.
6. Evaluate model fit with 2 error measures (RMSE and MAPE).
7. Investigate your best model using appropriate statistics or graphs to comment on possible:
a. Autocorrelation (serial correlation) with the DW statistic
b. Heteroscedasticity with a residuals versus order plot (look for a megaphone effect)
c. Multicollinearity with the VIF statistic
8. Determine the best remedies for any of the problems identified in 7 above and make the appropriate changes to your regression model if required. Rerun the model and evaluate the fit again, including error measures, adjusted R-squared, F value, slope coefficient significance, DW, and VIF.
9. Evaluate the model fit residuals and comment on their randomness using autocorrelation functions (ACFs), a histogram, and a normality plot. (You may use a four-in-one graph set along with residual ACFs.)
10. Forecast for the holdout period using your holdout X values to forecast Y. You can use the Minitab Regression - Options menu by placing the columns for the X variables' holdout values and any dummy variable predictions in the "Prediction intervals for new observations" area. If you used the Decomposition indices, make sure you seasonalize the holdout forecast Y values.
11. Evaluate the forecast error measures and residuals to determine if the error is acceptable or has systematic variation. Write your conclusion relative to the acceptability of the forecast.
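
If you want to cross-check the correlation, transformation, and VIF work outside Minitab or Excel, the following is a minimal sketch in plain Python. It is not part of the required deliverable. The function names (pearson_r, vif_two_predictors) and all of the numbers are made up for illustration, and the VIF shortcut shown here assumes a model with exactly two X variables; with more predictors, rely on the VIF values that Minitab reports.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2),
    where r is the correlation between the two X variables."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical data: Y grows with the logarithm of X, not with X itself.
x = [1.0, 10.0, 100.0, 1000.0]
y = [1.0, 2.0, 3.0, 4.0]
log_x = [math.log10(v) for v in x]  # transformed X values

r_raw = pearson_r(x, y)             # correlation of Y with the original X
r_transformed = pearson_r(log_x, y) # correlation of Y with the transformed X

# Hypothetical pair of predictors for the VIF check.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
vif = vif_two_predictors(x1, x2)

print("r(X, Y)       =", round(r_raw, 4))
print("r(log10 X, Y) =", round(r_transformed, 4))
print("VIF           =", round(vif, 4))
```

A transformation is worth keeping when the correlation of the transformed X with Y is clearly stronger than that of the original X, as it is in this contrived example.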
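
The table of seasonal dummy variable values can be built by hand in Excel, but the pattern is easy to see in a short sketch. The example below assumes quarterly data and treats the last quarter as the baseline season, so three dummy columns cover four quarters; the function name seasonal_dummies and the sample quarters are hypothetical.

```python
def seasonal_dummies(periods, n_seasons=4):
    """Return one row of (n_seasons - 1) 0/1 dummies per observation.
    The last season is the baseline and gets all zeros."""
    rows = []
    for p in periods:
        rows.append([1 if p == s else 0 for s in range(1, n_seasons)])
    return rows

# Hypothetical quarter labels for six consecutive observations.
quarters = [1, 2, 3, 4, 1, 2]
table = seasonal_dummies(quarters)

print("Quarter  Q1 Q2 Q3")
for q, row in zip(quarters, table):
    print(f"{q:7d}  {row[0]:2d} {row[1]:2d} {row[2]:2d}")
```

Keeping the whole set of dummy columns together in the regression matches the rule above that if any seasonal dummy variable is used, all of them must be used.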
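
The error measures (RMSE and MAPE) and the DW statistic named above can also be checked by hand. The sketch below uses the common textbook definitions and made-up numbers. Note that this RMSE divides by the number of observations, so it will not match Minitab's S value, which divides by the residual degrees of freedom.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error: square root of the average squared error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

def durbin_watson(residuals):
    """DW = sum of squared successive residual differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Hypothetical actual and fitted Y values, in time order.
y_actual = [100.0, 200.0, 400.0, 500.0]
y_fitted = [110.0, 190.0, 380.0, 520.0]
residuals = [a - f for a, f in zip(y_actual, y_fitted)]

print("RMSE =", round(rmse(y_actual, y_fitted), 3))
print("MAPE =", round(mape(y_actual, y_fitted), 3), "%")
print("DW   =", round(durbin_watson(residuals), 3))
```

A DW value near 2 suggests little first-order serial correlation, while values well below 2 point toward positive autocorrelation. The same rmse and mape functions can be applied to the holdout period by passing the holdout actuals and the holdout forecasts.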