Assignment-1 MIS771 Descriptive Analytics and Visualisations Page 1 of 10

MIS771 Descriptive Analytics and Visualisation
DEPARTMENT OF INFORMATION SYSTEMS AND BUSINESS ANALYTICS
DEAKIN BUSINESS SCHOOL
FACULTY OF BUSINESS AND LAW, DEAKIN UNIVERSITY

Assignment Two

Background
This is an individual assignment. You need to analyse a given dataset, and then interpret and draw conclusions from your analysis. You then need to convey your findings in a written report to an expert in Business Analytics.

Percentage of the final grade
35%

The Due Date and Time
8 pm Thursday 17th September 2020

Submission instructions
The assignment must be submitted by the due date, electronically in CloudDeakin. When submitting electronically, you must check that you have submitted the work correctly by following the instructions provided in CloudDeakin. Please note that we will NOT accept any paper or email copies, or part of the assignment submitted after the due date.

Information for students seeking an extension BEFORE the due date
If you wish to seek an extension for this assignment before the due date, you need to apply directly to the Unit Chair by completing the Assignment and Online Test Extension Application Form before Friday 5 pm 17th Thursday September 2020. Please make sure you attach all supporting documentation and a draft of your assignment. The request for extension needs to occur as soon as you become aware that you will have difficulty in meeting the due date.

Please note: Unit Chairs can only grant extensions of up to two weeks beyond the original due date. If you require more than two weeks, or have already been provided with an extension by the Unit Chair and require additional time, you must apply for Special Consideration via StudentConnect within 3 business days of the due date.

Conditions under which an extension will usually be considered include:
• Medical – to cover medical conditions of a severe nature, e.g. hospitalisation, serious injury or chronic illness.
Note: temporary minor ailments such as headaches, colds and minor gastric upsets are not serious medical conditions and are unlikely to be accepted. However, serious cases of these may be considered.
• Compassionate – e.g. death of a close family member, significant family and relationship problems.
• Hardship/Trauma – e.g. sudden loss or gain of employment, severe disruption to domestic arrangements, a victim of crime.
Note: misreading the due date, assignment anxiety, or multiple assignments will not be accepted as grounds for consideration.
https://www.deakin.edu.au/students/faculties/buslaw/student-support/assignment-extensions

Information for students seeking an extension AFTER the due date
If the due date has passed; you require more than two weeks extension, or you have already been provided with an extension and require additional time, you must apply for Special Consideration via StudentConnect. Please be aware that applications are governed by University procedures and must be submitted within three business days of the due date or extension due date. Please be aware that in most instances the maximum amount of time that can be granted for an assignment extension is three weeks after the due date, as Unit Chairs are required to have all assignment submitted before results/feedback can be released back to students.

Penalties for late submission
The following marking penalties will apply if you submit an assessment task after the due date without an approved extension:
• 5% will be deducted from available marks for each day, or part thereof, up to five days.
• Work that is submitted more than five days after the due date will not be marked; you will receive 0% for the task.
Note: 'Day' means calendar day. The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the task after the due date.
Additional information: For advice regarding academic misconduct, special consideration, extensions, and assessment feedback, please refer to the document "Rights and responsibilities as a student" in the "Unit Guide and Information" folder under the "Resources" section in the MIS771 CloudDeakin site.

The assignment uses the dataset file T22020MIS771_A2Data.xlsx, which can be downloaded from CloudDeakin. Analysis of the data requires the use of techniques studied in Module-2.

Assurance of Learning
This assignment assesses the following Graduate Learning Outcomes and related Unit Learning Outcomes:

Graduate Learning Outcome (GLO)
GLO1: Discipline-specific knowledge and capabilities - appropriate to the level of study related to a discipline or profession.
GLO2: Communication - using oral, written and interpersonal communication to inform, motivate and effect change.
GLO5: Problem Solving - creating solutions to authentic (real world and ill-defined) problems.
GLO6: Self-Management - working and learning independently, and taking responsibility for personal actions.

Unit Learning Outcome (ULO)
ULO 1: Apply quantitative reasoning skills to solve complex problems.
ULO 2: Plan, monitor, and evaluate own learning as a data analyst.
ULO 3: Deduce clear and unambiguous solutions in a form that they are useful for decision making and research purposes and for communication to the wider public.

Feedback before submission
You can seek assistance from the teaching staff to ascertain whether the assignment conforms to submission guidelines.

Feedback after submission
An overall mark, together with feedback, will be released via CloudDeakin, usually within 15 working days. You are expected to refer and compare your answers to the feedback to understand any areas of improvement.
The Case Study
ANALYTICS7, a leading data analysis consulting company, has extensive experience in analysing data for both local and global, small to medium companies. By solving their business problems, ANALYTICS7 helps these businesses to plan ahead and thrive.

Your Role in ANALYTICS7
Dr Hugo Barra, the lead data scientist at ANALYTICS7, has engaged you to lead the modelling component for the TPM and AP projects and construct a report of your key findings and recommendations in response to the questions posed in the meeting minutes of the last team meeting on the next page.

Datasets (accessible via T22020MIS771_A2Data.xlsx file)
There are two datasets available for this assignment: TPM_Employee_Attrition and Monthly_EnergyCon_MW.

Employee Survey data (TPM_Employee_Attrition) – TassPaperMill (TPM), a subsidiary of Pinnon Paper Industries (PPI), is an Australian company with a long history of manufacturing paper rolls. To address numerous concerns raised in their recent employee survey, TPM is currently reviewing how they calculate salary increments for their employees. TPM has hired ANALYTICS7 to extract a random sample of 1470 employee records from their HR database. Their ultimate goal is to adopt a more holistic rewarding system factoring the key relations between remuneration indicators and demographic characteristics, employment history and various other potential contributors to boost performance. In addition, the human resource manager at TPM reported in her recent presentation to the company executive management team that the staff turnover rate at TPM is higher compared to their competitors. Thus, TPM wants to identify key contributing factors before they lose more talented, motivated and focused employees who contribute to the organisation's overall success.

Energy consumption data (Monthly_EnergyCon_MW) – Australian Power (AP) is one of the largest generators of electricity in Australia, servicing more than three million households in Victoria.
AP operates an electric transmission system that covers much of Victoria and serves over 30% of the electricity demand in Victoria. This dataset consists of monthly power consumption data in megawatts (MW) from AP’s data warehouse for 2010-2019. AP wishes to review their current resources allocation strategy to plan and prioritise the provision of resources based on rapidly growing energy demand in Victoria.

A complete listing of variables is provided in the T22020MIS771_A2Data.xlsx file.

Note: All data, reports, people and scenarios in this assignment are either fictitious or have been modified from their original state. Any similarity to actual events is purely coincidental. It has been produced for the sole purpose of assessing performance of summative assessment task 2.

Form 210-3 ANALYTICS7 Team Meeting
ANALYTICS7, 727 Collins St, Docklands VIC 3008
Phone: (+61 3 212 66 000) [email protected]
Reference: AP-211 TPM Project
Revised: 27th August 2020
Level: Expert Analysis
Meeting Chair: Dr Hugo Barra
Date: 24 August 2020
Time: 10:00 AM
Location: ANALYTICS7 L4.320
Topic: TPM and AP Research Projects – Analytics

Details
Meeting Purpose: Specifying and Allocating Data Analytics Tasks
Discussion items:
1. Variable(s) description
2. Modelling PercentSalaryHike
3. Modelling the likelihood of an employee leaving the company
4. Forecasting monthly energy consumption in Megawatts
5. Producing a technical report

Detailed Action Items
Who: Modeller
What:
1. Providing an overall summary of the following two variables:
1.1. Percentage increase in salary (PercentSalaryHike)
1.2. Attrition
2. Identify potential variables that may influence PercentSalaryHike:
2.1. Identify a list of possible variables that influence percentage increase in salary. Which three independent variables have the more impactful linear relationship with PercentSalaryHike?
What form of relationship(s) exist between the independent variable(s) and PercentSalaryHike? Are there any potential multi-collinearity problems? If so, which variables are they?
2.2. Build a regression model to estimate percentage increase in salary.
2.3. Perform residual analysis. Based on your residual plots, does there appear to be any problems with the regression model?
3. Hugo has performed some preliminary analysis and discovered that the performance rating is a significant predictor of the Percentage increase in salary. Prior research shows that the strength of the relationship between performance rating and percentage increase in salary may vary according to satisfaction with the job. Generally speaking increased job satisfaction creates a more productive workforce as they are more motivated to improve their job performance. Therefore, Hugo believes that the relationship between performance rating and percentage increase in salary should be stronger for employees who
Answered Same Day | Sep 05, 2021 | MIS771 | Deakin University

Biswajit answered on Sep 09 2021
Assignment Solutions
Introduction: TassPaperMill (TPM), a subsidiary of Pinnon Paper Industries (PPI), is an Australian company with a long history of manufacturing paper rolls. To address numerous concerns raised in their recent employee survey, TPM is currently reviewing how they calculate salary increments for their employees. We analyse a random sample of 1470 employee records extracted from their HR database. The ultimate goal is to adopt a more holistic rewarding system factoring the key relations between remuneration indicators and demographic characteristics, employment history and various other potential contributors to boost performance. In addition, the human resource manager at TPM reported in her recent presentation to the company executive management team that the staff turnover rate at TPM is higher compared to their competitors. Thus, TPM wants to identify key contributing factors before they lose more talented, motivated and focused employees who contribute to the organisation's overall success.
Main Body:
Variable Name | Description of variables
Age | Measured to the nearest year
Attrition | Employee leaving the company (0=No, 1=Yes)
Education | 1 Below College; 2 College; 3 Bachelor; 4 Master
EnvironmentSatisfaction | Satisfaction with the environment (working conditions; development opportunities; company culture; management styles; hierarchies and human resources policies etc.) - responses were obtained on a 4-point scale ranging from 1 to 4, with 1 indicating low and 4 indicating very high
Gender | (1=Female, 2=Male)
JobInvolvement | Degree to which a job is central to a person's identity - responses were obtained on a 4-point scale ranging from 1 to 4, with 1 indicating low and 4 indicating very high
JobLevel | Responsibility level and expectations of roles - 1 'Low' 2 'Medium' 3 'High' 4 'Very High'
JobSatisfaction | Satisfaction with the job - responses were obtained on a 4-point scale ranging from 1 to 4, with 1 indicating low and 4 indicating very high
MaritalStatus | (1=Divorced, 2=Married, 3=Single)
NumCompaniesWorked | No. of companies worked at
OverTime | (1=No, 2=Yes)
PercentSalaryHike | Percentage increase in salary
PerformanceRating | Performance rating - responses were obtained on a 4-point scale ranging from 1 to 4, with 1 indicating low and 4 indicating very high
WorkLifeBalance | Time spent between work and outside - 1 'Bad' 2 'Good' 3 'Better' 4 'Best'
YearsAtCompany | Total number of years at the company
YearsInCurrentRole | Years in current role
YearsSinceLastPromotion | Years since last promotion
One JobSatisfaction rating of 11, which lies outside the 1 to 4 scale, was replaced with an imputed value of 2, based on rows with the same gender and similar scores on the other variables.
Q1. Description of variables:
There are 17 variables in total in the dataset. Of these, 6 are categorical variables and the remaining 11 are continuous/discrete variables.
We analysed the two variables Attrition and PercentSalaryHike. The details are below.
Statistic | PercentSalaryHike
Mean | 15.21
Standard Error | 0.10
Median | 14.00
Mode | 11.00
Standard Deviation | 3.66
Sample Variance | 13.40
Kurtosis | -0.30
Skewness | 0.82
Range | 14.00
Minimum | 11.00
Maximum | 25.00
Sum | 22358.00
Count | 1470.00
Table 1
Assuming that the sample is representative of the entire employee population, the average salary hike across all employees was 15.21% with a standard deviation of 3.66. The data is right-skewed (skewness 0.82) and has negative kurtosis (-0.30), so it does not follow the standard bell curve used in HR. The maximum salary increase is 25% and the minimum is 11%.
As evident from the histogram, it is a long-tailed distribution. Around 55% of employees have a salary hike between 11% and 14.3%, and around 27% of employees have a salary hike between 11% and 12.1%.
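The figures in Table 1 can be cross-checked against each other, since the mean, standard error and variance all follow from the sum, count and standard deviation. The short sketch below does only that; it uses the rounded values printed in Table 1, so small rounding differences are expected, and the variable names are illustrative.

```python
import math

# Values copied from Table 1 (rounded as reported there).
count = 1470
total = 22358.0
std_dev = 3.66

mean = total / count                      # sum divided by count
std_error = std_dev / math.sqrt(count)    # standard error of the mean
variance = std_dev ** 2                   # sample variance from the reported SD

print(f"mean = {mean:.2f}")
print(f"standard error = {std_error:.2f}")
print(f"variance = {variance:.2f}")
```

The recomputed variance differs from the tabled 13.40 only in the second decimal because the standard deviation itself is rounded.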

The box plot above compares the percent salary hike of employees who left with that of employees who stayed. A hypothesis test would be a better way to check whether the difference is statistically significant.
Attrition | Average of PercentSalaryHike | StdDev of PercentSalaryHike | Max of PercentSalaryHike | Min of PercentSalaryHike | Count of Employees
No | 15.23 | 3.64 | 25.00 | 11.00 | 1233
Yes | 15.10 | 3.77 | 25.00 | 11.00 | 237
Grand Total | 15.21 | 3.66 | 25.00 | 11.00 | 1470
Table 2
In Table 2, Yes stands for employees who left the organisation and No stands for employees who stayed.
Conclusion: Of the 1470 employees, 237 left, which is around 16.12%. From Table 2, the minimum salary hike is 11% for both groups (those who left and those who stayed), and the maximum salary hike is 25% for both groups. The average salary hike of employees who left is 15.10%, against 15.23% for those who stayed. The standard deviation of salary hike is 3.77 for employees who left and 3.64 for those who stayed. The coefficient of variation of salary hike is 24.97% for employees who left, compared with 23.89% for employees who stayed, so the salary hike varies slightly more among employees who left.
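The group comparison above can be reproduced from the Table 2 summaries alone. The sketch below recomputes the attrition rate and the coefficients of variation, and adds a Welch two-sample t statistic as one possible form of the hypothesis test suggested earlier. The choice of Welch's test and the normal approximation for the p-value (reasonable only because both groups are large) are assumptions made here, not part of the original analysis.

```python
import math

# Group summaries copied from Table 2 (rounded values).
stayed = {"mean": 15.23, "sd": 3.64, "n": 1233}
left = {"mean": 15.10, "sd": 3.77, "n": 237}

attrition_rate = left["n"] / (left["n"] + stayed["n"])

def coefficient_of_variation(group):
    """Standard deviation expressed as a fraction of the mean."""
    return group["sd"] / group["mean"]

cv_left = coefficient_of_variation(left)
cv_stayed = coefficient_of_variation(stayed)

# Welch t statistic: difference in means over the unpooled standard error.
se_diff = math.sqrt(stayed["sd"] ** 2 / stayed["n"] + left["sd"] ** 2 / left["n"])
t_stat = (stayed["mean"] - left["mean"]) / se_diff

# Two-sided p-value using a normal approximation (assumption: large samples).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"attrition rate = {attrition_rate:.2%}")
print(f"CV left = {cv_left:.2%}, CV stayed = {cv_stayed:.2%}")
print(f"Welch t = {t_stat:.3f}, approximate p = {p_value:.3f}")
```

Under these assumptions the p-value is well above 0.05, which would suggest that the small gap in average salary hike between the two groups is not statistically significant.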
Q2.1
Introduction: In this case we have 17 variables, of which PercentSalaryHike is the dependent variable and the remaining 16 are independent variables. Of the 16 independent variables, 6 are categorical and the remaining 10 are continuous. Before building the regression model, we carry out feature/variable selection using a correlation matrix, checking the correlation coefficient between each independent variable and the dependent variable. We pick only the variables that are significantly correlated with the dependent variable, which identifies the continuous variables that matter. The categorical variables are entered into the model directly: a variable is kept if its p-value is below the assumed 0.05 level and dropped otherwise.
Description :
Correlation Matrix (lower triangle; columns are numbered in the same order as the rows):
Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13)
(1) Age | 1
(2) Education | 0.04752484 | 1
(3) EnvironmentSatisfaction | 0.01014643 | -0.00754 | 1
(4) JobInvolvement | -0.01183258 | 0.002385 | -0.0095 | 1
(5) JobLevel | 0.50960423 | 0.031794 | 0.001212 | -0.056062233 | 1
(6) JobSatisfaction | 0.01479432 | -0.05039 | -0.03417 | 0.398666134 | -0.018348 | 1
(7) NumCompaniesWorked | 0.07866392 | 0.052225 | 0.013452 | -0.351162338 | 0.036036 | -0.388621527 | 1
(8) PercentSalaryHike | 0.00363358 | -0.04893 | -0.0317 | 0.550392895 | -0.03473 | 0.689970561 | -0.510569943 | 1
(9) PerformanceRating | -0.00686535 | -0.04805 | -0.03536 | 0.335945013 | -0.010407 | 0.431331828 | -0.245052431 | 0.764765885 | 1
(10) WorkLifeBalance | -0.02149003 | -0.02017 | 0.027627 | -0.019344333 | 0.037818 | 0.023805154 | 0.014317645 | -0.003279636 | 0.002898784 | 1
(11) YearsAtCompany | 0.31130877 | -0.00607 | 0.001458 | -0.068457051 | 0.534739 | -0.028841981 | -0.001827975 | -0.035991262 | 0.003300928 | 0.012089185 | 1
(12) YearsInCurrentRole | 0.21290106 | 0.048399 | 0.018007 | -0.030374295 | 0.389447 | 0.000304992 | -0.002212759 | -0.001520027 | 0.033066195 | 0.049856498 | 0.758753737 | 1
(13) YearsSinceLastPromotion | 0.21651337 | 0.008263 | 0.016194 | -0.065300624 | 0.353885 | -0.006542608 | 0.009712666 | -0.022154313 | 0.01199545 | 0.008941249 | 0.618408865 | 0.548056248 | 1
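One way to read the matrix for variable selection is to rank the independent variables by the absolute value of their correlation with PercentSalaryHike. The sketch below does this with the coefficients copied from the matrix above; ranking by absolute correlation is an illustrative screening step, and the dictionary and function names are made up for this example.

```python
# Correlations with PercentSalaryHike, copied from the matrix above.
corr_with_hike = {
    "Age": 0.00363358,
    "Education": -0.04893,
    "EnvironmentSatisfaction": -0.0317,
    "JobInvolvement": 0.550392895,
    "JobLevel": -0.03473,
    "JobSatisfaction": 0.689970561,
    "NumCompaniesWorked": -0.510569943,
    "PerformanceRating": 0.764765885,
    "WorkLifeBalance": -0.003279636,
    "YearsAtCompany": -0.035991262,
    "YearsInCurrentRole": -0.001520027,
    "YearsSinceLastPromotion": -0.022154313,
}

def rank_by_strength(correlations):
    """Sort variable names by absolute correlation, strongest first."""
    return sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)

ranking = rank_by_strength(corr_with_hike)
top_three = ranking[:3]

for name in ranking[:4]:
    print(f"{name}: r = {corr_with_hike[name]:+.3f}")
```

The fourth variable, NumCompaniesWorked, is printed as well because its negative correlation is close in magnitude to the third.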
Q2.2
First we create multiple linear regression with all variables
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.909365
R Square | 0.826945
Adjusted R Square | 0.825039
Standard Error | 1.530891
Observations | 1470

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 16 | 16272.18 | 1017.011 | 433.9475618 | 0
Residual | 1453 | 3405.29 | 2.343627
Total | 1469 | 19677.47

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | -2.45267 | 0.435049 | -5.63769 | 2.0675E-08 | -3.306066038 | -1.59928
Age | 0.012813 | 0.0052 | 2.464003 | 0.013854579 | 0.002612631 | 0.023014
Education2 | -0.01757 | 0.113468 | -0.15486 | 0.876952406 | -0.24015004 | 0.205007
Education3 | 0.040894 | 0.110801 | 0.369079 | 0.712122274 | -0.176453037 | 0.258242
Education4 | -0.18423 | 0.300032 | -0.61403 | 0.539293303 | -0.772769475 | 0.404314
Gender | -0.8442 | 0.092629 | -9.11376 | 2.56733E-19 | -1.02589697 | -0.6625
JobInvolvement | 0.633671 | 0.049803 | 12.72344 | 3.07961E-35 | 0.535976447 | 0.731365
Joblevel2 | -0.09591 | 0.096155 | -0.9975 | 0.318686742 | -0.28453263 | 0.092703
Joblevel3 | -0.08817 | 0.128929 | -0.68388 | 0.494158156 | -0.341078021 | 0.164734
Joblevel4 | -0.15145 | 0.180404 | -0.83953 | 0.401310951 | -0.505333689 | 0.202426
Joblevel5 | -0.41499 | 0.211314 | -1.96385 | 0.049737758 | -0.829503976 | -0.00048
JobSatisfaction | 0.934493 | 0.042254 | 22.1161 | 1.18132E-93 | 0.85160792 | 1.017379
Married | -0.07691 | 0.093049 | -0.82659 | 0.408605586 | -0.25943844 | 0.105612
Divorced | -0.16533 | 0.110929 | -1.49041 | 0.136334201 | -0.382927229 | 0.052269
NumCompaniesWorked | -0.27864 | 0.018575 | -15.001 | 2.11031E-47 | -0.315076903 | -0.2422
OverTime | -0.0642 | 0.089028 | -0.72111 | 0.47096053 | -0.238834902 | 0.110438
PerformanceRating | 4.799953 | 0.12132 | 39.56434 | 6.2769E-233 | 4.561971183 | 5.037934
From the above output, the adjusted R square is around 82.5%, the standard error is 1.53, and the model is statistically significant, as shown by the ANOVA p-value (Significance F) being below 0.05. The variables that are significant are marked green in the original output.
We then build a more refined model by removing the insignificant variables, i.e. those with p-value > 0.05. The model below is the refined model, in which all variables are significant: Female gender, JobInvolvement, JobSatisfaction, NumCompaniesWorked and PerformanceRating.
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.908644
R Square | 0.825634
Adjusted R Square | 0.825038
Standard Error | 1.530895
Observations | 1470

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 5 | 16246.38 | 3249.276 | 1386.424 | 0
Residual | 1464 | 3431.087 | 2.343639
Total | 1469 | 19677.47

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | -2.17115 | 0.392657 | -5.52937 | 3.8E-08 | -2.94138 | -1.40092
Female | -0.84082 | 0.092404 | -9.0994 | 2.89E-19 | -1.02207 | -0.65956
JobInvolvement | 0.6423 | 0.049547 | 12.96332 | 1.84E-36 | 0.545108 | 0.739491
JobSatisfaction | 0.939155 | 0.042092 | 22.31182 | 3.68E-95 | 0.856587 | 1.021722
NumCompaniesWorked | -0.27491 | 0.018407 | -14.9352 | 4.71E-47 | -0.31102 | -0.2388
PerformanceRating | 4.795484 | 0.121134 | 39.58816 | 1.2E-233 | 4.557869 | 5.033099
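The two regression outputs can be compared directly from their summary figures. The sketch below recomputes adjusted R square for both models and a partial F statistic for the 11 terms that were dropped. It assumes the refined model is nested in the full model (that is, that the Female dummy carries the same information as the Gender term), which the source does not state explicitly; the function names are illustrative.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R square: penalises R square for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def partial_f(sse_reduced, sse_full, df_dropped, mse_full):
    """F statistic for the joint contribution of the dropped terms."""
    return ((sse_reduced - sse_full) / df_dropped) / mse_full

n = 1470

# Figures copied from the two SUMMARY OUTPUT blocks above.
adj_full = adjusted_r2(0.826945, n, 16)
adj_refined = adjusted_r2(0.825634, n, 5)

f_dropped = partial_f(
    sse_reduced=3431.087,   # residual SS, refined model
    sse_full=3405.29,       # residual SS, full model
    df_dropped=16 - 5,      # number of terms removed
    mse_full=2.343627,      # residual MS, full model
)

print(f"adjusted R2 full = {adj_full:.6f}, refined = {adj_refined:.6f}")
print(f"partial F for the 11 dropped terms = {f_dropped:.3f}")
```

A partial F close to 1 with nearly identical adjusted R square values indicates that, under the nesting assumption, the dropped terms add essentially no explanatory power as a group, which supports the simpler model.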
Q2.3
Residual analysis shows that the dependent variable PercentSalaryHike is not exactly normally distributed, and that the residuals for the independent variables JobInvolvement, PerformanceRating and JobSatisfaction are not homoscedastic. The residual analysis graphs are given in the Excel sheet.
Conclusion: From the correlation matrix above, PerformanceRating, JobSatisfaction, JobInvolvement and NumCompaniesWorked are the variables most strongly correlated with PercentSalaryHike (correlation coefficients of 0.76, 0.69, 0.55 and -0.51 respectively); the significant variables are marked green in the original matrix. The individual scatter plots between these independent variables and the dependent variable show that the relationships are linear. Most pairs of independent variables have correlation coefficients below 0.5, so the chance of multicollinearity is low. The exceptions in the matrix are Age with JobLevel (0.51), JobLevel with YearsAtCompany (0.53) and the three tenure variables with each other (0.55 to 0.76), none of which appear in the final model. Multicollinearity can still be checked by computing the VIF (variance inflation factor) in the regression. Even though the model violates a few of the assumptions of multiple linear regression, it can still be accepted for descriptive purposes.
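The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. That regression needs the raw data, but a pairwise correlation r already gives a lower bound, because R² can never be smaller than r² with any single other predictor. The sketch below computes this lower bound for a few pairs from the correlation matrix; it is a rough screen under that assumption, not a substitute for the full VIF.

```python
def vif_lower_bound(pairwise_r):
    """Lower bound on VIF implied by one pairwise correlation: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - pairwise_r ** 2)

# Pairwise correlations copied from the correlation matrix.
pairs = {
    ("YearsAtCompany", "YearsInCurrentRole"): 0.758753737,
    ("YearsAtCompany", "YearsSinceLastPromotion"): 0.618408865,
    ("Age", "JobLevel"): 0.50960423,
    ("PerformanceRating", "JobSatisfaction"): 0.431331828,
    ("JobInvolvement", "JobSatisfaction"): 0.398666134,
}

bounds = {pair: vif_lower_bound(r) for pair, r in pairs.items()}

for (a, b), value in bounds.items():
    print(f"{a} vs {b}: VIF >= {value:.2f}")
```

A common rule of thumb treats VIF values above 5 or 10 as a concern. The bounds here are all far below that, although the true VIFs could be higher once all predictors are considered jointly.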
As per the final model, females have on average a PercentSalaryHike that is 0.84 percentage points lower than that of males. A 1-unit increase in job involvement is associated with a 0.64 percentage point increase in PercentSalaryHike, and a 1-unit increase in job satisfaction with a 0.94 percentage point increase on average. Every additional company worked at is associated with a 0.27 percentage point decrease in PercentSalaryHike. A 1-unit increase in performance rating is associated with a 4.80 percentage point increase in PercentSalaryHike.
The model is statistically significant, as the p-value in the ANOVA test is less than 0.05, and the adjusted R square is 82.5%.
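To make the coefficient interpretation concrete, the refined model can be written as a small prediction function. The coefficients below are copied from the refined SUMMARY OUTPUT; the example employee is hypothetical and chosen only to illustrate the arithmetic.

```python
# Coefficients copied from the refined regression output.
INTERCEPT = -2.17115
COEFFICIENTS = {
    "Female": -0.84082,             # 1 if female, 0 if male
    "JobInvolvement": 0.6423,
    "JobSatisfaction": 0.939155,
    "NumCompaniesWorked": -0.27491,
    "PerformanceRating": 4.795484,
}

def predict_salary_hike(employee):
    """Predicted PercentSalaryHike for a dict of predictor values."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * employee[name] for name in COEFFICIENTS
    )

# Hypothetical employee, invented for illustration.
example = {
    "Female": 0,
    "JobInvolvement": 3,
    "JobSatisfaction": 3,
    "NumCompaniesWorked": 2,
    "PerformanceRating": 3,
}

male_hike = predict_salary_hike(example)
female_hike = predict_salary_hike({**example, "Female": 1})

print(f"predicted hike (male) = {male_hike:.2f}%")
print(f"predicted hike (female) = {female_hike:.2f}%")
```

The gap between the two predictions equals the Female coefficient, which is what the interpretation of a dummy variable means in practice.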
Q3.
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.91014884
R Square | 0.828370911
Adjusted R Square | 0.827667032
Standard Error | 1.519350169
Observations | 1470

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 6 | 16300.24098 | 2716.707 | 1176.866 | 0
Residual | 1463 | 3377.225683 | 2.308425
Total | 1469 | 19677.46667

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 3.627857474 | 1.262195506 | 2.874244 | 0.004109 | 1.151951411 | 6.103763536
Female | -0.825286881 | 0.091763043 | -8.99367 | 7.22E-19 | -1.005288056 | -0.645285707
JobInvolvement | 0.633468834 | 0.049207801 | 12.87334 | 5.28E-36 | 0.53694346 | 0.729994208
JobSatisfaction | -0.826316075 | 0.367873788 | -2.24619 | 0.02484 | -1.547932447 | -0.104699703
NumCompaniesWorked | -0.276258441 | 0.018270092 | -15.1208 | 4.18E-48 | -0.312096813 | -0.240420069
PerformanceRating | 2.877047033 | 0.414958503 | 6.933337 | 6.15E-12 | 2.063069905 | 3.691024161
PerformanceRating x JobSatisfaction | 0.57910428 | 0.119888274 | 4.830366 | 1.51E-06 | 0.343933023 | 0.814275538
The final model with the interaction term between performance rating and job satisfaction is
PercentSalaryHike = 3.63 - 0.83 Female + 0.63 JobInvolvement - 0.83 JobSatisfaction - 0.28 NumCompaniesWorked + 2.88 PerformanceRating + 0.58 PerformanceRating x JobSatisfaction.
As the p-value associated with the interaction term is less than 0.05, this term is statistically significant. The effect of an increase in performance rating on the percentage salary hike therefore differs depending on job satisfaction.
Conclusion:
The final model shows that Female gender, job involvement, performance rating, job satisfaction, number of companies worked and the interaction between performance rating and job satisfaction are important for salary hike.
Females have on average a salary hike that is 0.83 percentage points lower than that of males. A 1-unit increase in job involvement increases the salary hike by 0.63 percentage points. A 1-unit increase in job satisfaction changes the salary hike by (0.58 x PerformanceRating - 0.83) percentage points, so job satisfaction results in a higher salary hike in the presence of a higher performance rating. Similarly, a 1-unit increase in performance rating changes the salary hike by (2.88 + 0.58 x JobSatisfaction) percentage points, so a higher performance rating results in a higher salary hike in the presence of high job satisfaction.
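Because of the interaction term, the effect of one variable depends on the level of the other. The sketch below tabulates both marginal effects on the 1 to 4 scale using the coefficients from the interaction model output; the function names are illustrative.

```python
# Coefficients copied from the interaction model output.
B_PERFORMANCE = 2.877047033
B_SATISFACTION = -0.826316075
B_INTERACTION = 0.57910428

def effect_of_performance(job_satisfaction):
    """Change in PercentSalaryHike per 1-unit rise in PerformanceRating."""
    return B_PERFORMANCE + B_INTERACTION * job_satisfaction

def effect_of_satisfaction(performance_rating):
    """Change in PercentSalaryHike per 1-unit rise in JobSatisfaction."""
    return B_SATISFACTION + B_INTERACTION * performance_rating

for level in range(1, 5):
    print(
        f"level {level}: "
        f"performance effect = {effect_of_performance(level):.3f}, "
        f"satisfaction effect = {effect_of_satisfaction(level):.3f}"
    )
```

The performance effect grows steadily with job satisfaction, which is the pattern Hugo's hypothesis describes. The satisfaction effect is negative only at a performance rating of 1 and positive from 2 upward.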
Q4. The employee attrition problem:
We fitted a logistic regression for the dependent variable Attrition (1=Yes, 0=No) with all other variables as independent variables. Among these, Education, Gender, JobLevel, MaritalStatus and OverTime are categorical variables, for which dummy codes were used. Education has 4 levels, so 3 dummy codes were created with Education1 as the reference level. For Gender, female was coded 1 with male as the reference level. Similarly, job level 1 was the reference category, with the other levels dummy coded. Marital status is dummy coded with single as the reference, and likewise overtime, where "No" is the reference category.
Variable name | Coeff value | odds
Intercept | 1.706330944 |
Age | -0.030831391 | 0.970
Education2 | 0.096060844 | 1.101
Education3 | 0.307956911 | 1.361
Education4 | 0.227199341 | 1.255
EnvironmentSatisfaction | -0.367395095 | 0.693
Female | -0.319340636 | 0.727
JobInvolvement | -0.175083026 | 0.839
Joblevel2 | -0.961435136 | 0.382
Joblevel3 | -0.319954909 | 0.726
Joblevel4 | -1.682080948 | 0.186
Joblevel5 | -1.08606482 | 0.338
JobSatisfaction | -0.1699398 | 0.844
Married | -0.857916847 | 0.424
Divorced | -1.194120663 | 0.303
NumCompaniesWorked | 0.045318512 | 1.046
OverTime | 1.631282149 | 5.110
PercentSalaryHike | 0.036695796 | 1.037
PerformanceRating | 0.049571141 | 1.051
WorkLifeBalance | -0.227926049 | 0.796
YearsAtCompany | -0.00855152 | 0.991
YearsInCurrentRole | -0.153279842 | 0.858
YearsSinceLastPromotion | 0.142827598 | 1.154
The table above lists each coefficient together with its odds, i.e. the exponential of the coefficient. Where the odds are greater than 1, a higher value of the variable results in a greater probability of exit; where the odds are below 1, a higher value results in a lower probability of exit.
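The odds column is obtained by exponentiating each coefficient. The sketch below recomputes it for a handful of rows from the table as a check; only a subset is shown to keep the example short.

```python
import math

# A few coefficients copied from the logistic regression table.
coefficients = {
    "Age": -0.030831391,
    "EnvironmentSatisfaction": -0.367395095,
    "Married": -0.857916847,
    "OverTime": 1.631282149,
    "YearsInCurrentRole": -0.153279842,
    "YearsSinceLastPromotion": 0.142827598,
}

# Odds ratio: multiplicative change in the odds of leaving per 1-unit increase.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: odds ratio = {ratio:.3f} ({direction} the odds of leaving)")
```

For example, the OverTime value of about 5.11 means that, holding the other variables fixed, the odds of leaving for an employee who works overtime are roughly five times those of one who does not.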
Different cases:
1. An employee with medium satisfaction with the job and the work environment and 5 years since the last promotion has a higher probability of leaving, as the odds are higher (around 2.17 when medium is entered as 3 and years since last promotion as 5).
2. Working overtime results in higher odds of exit (odds of 5.110), whereas a higher number of years in the current role results in lower odds of exit (odds of 0.858, below 1).
3. A 45-year-old married employee with a very-high level job classification who maintains a good work-life balance has lower odds of leaving.
The final equation is log(p/(1-p)) = 1.706 - 0.031 Age + 0.096 Education2 + 0.308 Education3 + 0.227 Education4 - 0.367 EnvironmentSatisfaction - 0.319 Female - 0.175 JobInvolvement - 0.961 Joblevel2 - 0.320 Joblevel3 - 1.682 Joblevel4 - 1.086 Joblevel5 - 0.170 JobSatisfaction - 0.858 Married - 1.194 Divorced + 0.045 NumCompaniesWorked + 1.631 OverTime + 0.037 PercentSalaryHike + 0.050 PerformanceRating - 0.228 WorkLifeBalance - 0.009 YearsAtCompany - 0.153 YearsInCurrentRole + 0.143 YearsSinceLastPromotion, where p is the probability of attrition.
If we plug the individual values of an employee in the above equation,it will...
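A minimal sketch of how the fitted equation could be evaluated for one employee is given below. The coefficients are copied from the logistic regression table; the employee profile is hypothetical and invented only to show the calculation, and any variable left out of the profile is treated as 0 (the reference level for dummies).

```python
import math

INTERCEPT = 1.706330944
COEF = {
    "Age": -0.030831391, "Education2": 0.096060844, "Education3": 0.307956911,
    "Education4": 0.227199341, "EnvironmentSatisfaction": -0.367395095,
    "Female": -0.319340636, "JobInvolvement": -0.175083026,
    "Joblevel2": -0.961435136, "Joblevel3": -0.319954909,
    "Joblevel4": -1.682080948, "Joblevel5": -1.08606482,
    "JobSatisfaction": -0.1699398, "Married": -0.857916847,
    "Divorced": -1.194120663, "NumCompaniesWorked": 0.045318512,
    "OverTime": 1.631282149, "PercentSalaryHike": 0.036695796,
    "PerformanceRating": 0.049571141, "WorkLifeBalance": -0.227926049,
    "YearsAtCompany": -0.00855152, "YearsInCurrentRole": -0.153279842,
    "YearsSinceLastPromotion": 0.142827598,
}

def attrition_probability(features):
    """Probability of leaving for a dict of variable values (missing = 0)."""
    logit = INTERCEPT + sum(COEF[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical employee: values are illustrative, not taken from the dataset.
employee = {
    "Age": 45, "Married": 1, "Joblevel4": 1, "WorkLifeBalance": 2,
    "EnvironmentSatisfaction": 3, "JobInvolvement": 3, "JobSatisfaction": 3,
    "NumCompaniesWorked": 2, "PercentSalaryHike": 15, "PerformanceRating": 3,
    "YearsAtCompany": 10, "YearsInCurrentRole": 5, "YearsSinceLastPromotion": 2,
}

p_without_overtime = attrition_probability(employee)
p_with_overtime = attrition_probability({**employee, "OverTime": 1})

print(f"P(leave), no overtime = {p_without_overtime:.3f}")
print(f"P(leave), overtime = {p_with_overtime:.3f}")
```

Switching only the OverTime dummy from 0 to 1 shows how a single factor with a large odds ratio shifts the predicted probability for an otherwise identical employee.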