MIS771 Descriptive Analytics and Visualisations Page 1 of 8 MIS771 Descriptive Analytics and Visualisation DEPARTMENT OF INFORMATION SYSTEMS AND BUSINESS ANALYTICS DEAKIN BUSINESS SCHOOL FACULTY OF...

Answered Same Day | Aug 03, 2021 | MIS771 | Deakin University

Biswajit answered on Aug 09 2021
Analysis
1. Provide your insights on how the specific attributes of the whole insured population are affecting their insurance premiums, based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
Ans: The estimated difference in mean medical costs (female minus male) is -1387. The 95% confidence interval for the difference runs from -2682 to -92.
Our hypotheses were:
Null Hypothesis (H0): There is no difference in mean medical costs between females and males.
Alternate Hypothesis (Ha): Mean medical costs for females are not the same as those for males.
As the p-value of 0.0359 is less than the 0.05 level of significance, we reject the null hypothesis.
Because the whole confidence interval lies below zero, we conclude at the 5% level of significance that mean medical costs for females are lower than those for males.
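As a rough consistency check on the figures above, the standard error and a two-sided p-value can be recovered from the reported estimate and its 95% confidence interval. This is only a sketch under a large-sample normal approximation (an assumption; the original analysis was done in Excel and may have used a t distribution, so the last digit can differ). The function name `p_value_from_ci` is hypothetical.

```python
from statistics import NormalDist

def p_value_from_ci(estimate, lower, upper, level=0.95):
    """Recover the standard error and two-sided p-value (H0: difference = 0)
    from a point estimate and its confidence interval.
    Assumes a symmetric, normal-based interval (large-sample approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)              # half-width / critical value
    z = estimate / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p_two_sided

# Figures reported above for female versus male medical costs
se, z, p = p_value_from_ci(-1387, -2682, -92)
print(f"SE = {se:.0f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

The result should land close to the reported p-value of 0.0359; a small gap is expected because of the normal approximation and the rounding of the interval endpoints.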
(b) An estimate of the difference in medical costs for a single person versus someone with a family
Ans: The estimated difference in mean medical costs (single person minus someone with a family) is -1584. The 95% confidence interval for the difference runs from -2894 to -274.
Our hypotheses here were:
Null Hypothesis (H0): There is no difference in mean medical costs between a single person and someone with a family.
Alternate Hypothesis (Ha): Mean medical costs for a single person are not the same as those for someone with a family.
As the p-value of 0.0178 is less than the 0.05 level of significance, we reject the null hypothesis.
Because the whole confidence interval lies below zero, we conclude at the 5% level of significance that mean medical costs for a single person are lower than those for someone with a family.
(c) Males with no dependents have claimed that they have, on average, been charged more than their female counterparts. Can you check whether this claim is possibly true?
Ans:
Here our hypotheses were:
Null Hypothesis (H0): Males with no dependents have been charged less than or equal to females with no dependents.
Alternate Hypothesis (Ha): Males with no dependents have been charged more than females with no dependents.
As the p-value of 0.1781 is greater than the 0.05 level of significance, we fail to reject the null hypothesis.
In other words, at the 5% level of significance there is not enough evidence that males with no dependents are charged more than females with no dependents. This does not prove that the two groups are charged the same.
The estimated difference in mean medical costs between males and females with no dependents is 927, and the 95% confidence interval for the difference runs from -1045 to 2899, which includes zero.
So the claim of males with no dependents that they have been charged more than their female counterparts is not supported by the sample data.
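The one-sided p-value for this claim can be approximated from the reported estimate and interval. The sketch below assumes a normal approximation and that the reported difference is male minus female; `one_sided_p_greater` is a hypothetical helper, not part of the original Excel analysis.

```python
from statistics import NormalDist

def one_sided_p_greater(estimate, se):
    """P-value for H0: difference <= 0 versus Ha: difference > 0,
    using a normal approximation to the test statistic."""
    z = estimate / se
    return 1 - NormalDist().cdf(z)

# Reported figures: estimate 927, 95% CI from -1045 to 2899
estimate, lower, upper = 927, -1045, 2899
se_c = (upper - lower) / (2 * NormalDist().inv_cdf(0.975))  # SE implied by the CI
p_one_sided = one_sided_p_greater(estimate, se_c)
print(f"z = {estimate / se_c:.2f}, one-sided p = {p_one_sided:.3f}")
print("CI contains zero:", lower < 0 < upper)
```

A one-sided p-value well above 0.05, together with an interval that straddles zero, is what "fail to reject" looks like numerically.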
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater proportion of males who are smokers compared to females? Can you check whether this claim can also be substantiated? Briefly advise the findings in regard to the proportion of males and females who are smokers.
Ans: Here our hypotheses were:
Null Hypothesis (H0): The proportion of male smokers is less than or equal to the proportion of female smokers.
Alternate Hypothesis (Ha): The proportion of male smokers is greater than the proportion of female smokers.
As the p-value of 0.0027 is less than the 0.05 level of significance, we reject the null hypothesis.
In other words, at the 5% level of significance the proportion of smokers among males is higher than among females.
The estimated difference in the proportion of smokers between males and females is 6.15%, and the reported 95% confidence interval runs from 6.01% to 6.29%.
So the claim that a greater proportion of males than females are smokers is substantiated.
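The test used here is a one-sided two-proportion z-test. A minimal sketch follows. The smoker counts are ASSUMPTIONS chosen for illustration (they sum to the 1338 people in the sample and give a difference of about 6.15%); the answer above does not report the underlying counts.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided z-test of H0: p1 <= p2 versus Ha: p1 > p2,
    using the pooled proportion for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)
    return p1 - p2, z, p_value

# ASSUMED counts (not given in the answer above): 159 smokers among 676 males,
# 115 smokers among 662 females.
diff, z_prop, p_value = two_proportion_z_test(159, 676, 115, 662)
print(f"difference = {diff:.2%}, z = {z_prop:.2f}, one-sided p = {p_value:.4f}")
```

The pooled proportion is used in the standard error because the null hypothesis treats the two proportions as equal at the boundary.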
2. Can you further analyse to see whether the beneficiary's residential area/region in the US affects how health insurance providers bill their medical costs?
Ans: We used a one-way ANOVA with the following hypotheses:
Null Hypothesis (H0): Mean medical costs are the same across all 4 regions.
Alternate Hypothesis (Ha): Mean medical costs differ for at least one pair of regions.
Excel's built-in ANOVA tool gives a p-value of 0.0308, which is less than 0.05, so we reject the null hypothesis: mean medical costs differ for at least one pair of regions.
The Tukey-Kramer test shows that the pair that differs is southeast and southwest.
So the beneficiary's residential region does appear to affect how much the health insurance provider bills for medical costs.
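The F statistic behind a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes it from scratch on HYPOTHETICAL toy numbers (three made-up regions, not the real insurance data); `one_way_anova_f` is an illustrative name.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# HYPOTHETICAL toy charges (in thousands) for three regions, not the real data
toy_regions = {
    "region_a": [1, 2, 3],
    "region_b": [2, 3, 4],
    "region_c": [5, 6, 7],
}
f_stat, df_b, df_w = one_way_anova_f(list(toy_regions.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value then comes from the F distribution with these degrees of freedom, which is what Excel's ANOVA tool reports; in Python, `scipy.stats.f_oneway` returns both the statistic and the p-value.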
3. We believe that individual medical costs billed by health insurance differ significantly across age groups of the primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood: 56 years and older) together with their smoking behaviour. Is there any evidence to support this assertion?
What associations, if any, exist focusing on smokers in the diverse age groups?
Ans: Here we used a two-way ANOVA, which tests three pairs of hypotheses.
Null Hypothesis (H0): Mean medical costs of the different age groups are equal.
Alternate Hypothesis (Ha): Mean medical costs of at least two age groups are different.
Null Hypothesis (H0): Mean medical costs of the smoking and non-smoking groups are the same.
Alternate Hypothesis (Ha): Mean medical costs of the smoking and non-smoking groups are different.
Null Hypothesis (H0): Age group and smoking status do not interact to affect medical costs.
Alternate Hypothesis (Ha): Age group and smoking status interact to affect medical costs.
In the two-way ANOVA, the associated p-values (2.64E-94, 3.87E-14 and 1.71E-06) are all less than 0.05, so we reject all 3 null hypotheses.
We conclude that mean medical costs differ between at least two age groups, that mean medical costs differ between smokers and non-smokers, and that age group and smoking behaviour interact.
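To make the idea of an interaction concrete, the sketch below bins ages into the three groups defined in the question and compares the smoker versus non-smoker gap within each group. The records are HYPOTHETICAL and the helper names are illustrative; this only describes cell means and is not the two-way ANOVA itself.

```python
from statistics import mean

def age_group(age):
    """Bin an age into the three groups defined in the question."""
    if 18 <= age <= 35:
        return "young adult"
    if 36 <= age <= 55:
        return "middle age"
    if age >= 56:
        return "older adult"
    raise ValueError("age below 18 is outside the defined groups")

def cell_means(records):
    """Mean charge for each (age group, smoker) cell."""
    cells = {}
    for age, smoker, charge in records:
        cells.setdefault((age_group(age), smoker), []).append(charge)
    return {key: mean(values) for key, values in cells.items()}

# HYPOTHETICAL records: (age, smoker?, charge)
records = [
    (25, False, 4000), (30, False, 5000), (28, True, 26000), (33, True, 28000),
    (40, False, 8000), (50, False, 9000), (45, True, 32000), (52, True, 34000),
    (60, False, 13000), (62, False, 15000), (58, True, 40000), (64, True, 42000),
]
means = cell_means(records)
for group in ("young adult", "middle age", "older adult"):
    gap = means[(group, True)] - means[(group, False)]
    print(f"{group}: smoker minus non-smoker = {gap:.0f}")
```

If the gap were the same in every age group there would be no interaction; gaps that change from group to group are what the interaction term in the two-way ANOVA tests formally.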
Focusing on smokers only: a one-factor ANOVA on smokers' medical costs across the three age groups gives a p-value of 0.025, which is less than 0.05, so at least two age groups among smokers have statistically different mean medical costs.
The Tukey-Kramer test shows that the 18-35 and 36-55 age groups have statistically different mean costs, with an absolute difference of 5034.
4. We are interested in comparing the smoking behaviour of primary beneficiaries across each of the Body Mass Index levels (Under, Normal, Overweight, and Obese). Could we say that there are differences in the proportion of smokers across the four BMI levels?
Further, some studies have claimed that heavy smoking is often associated with higher body mass index (BMI). Could we reach a similar conclusion for all beneficiaries?
Ans: Here we used a chi-square test for differences in proportions.
Null Hypothesis (H0): There is no difference in the proportion of smokers across the body mass index levels.
Alternate Hypothesis (Ha): The proportion of smokers differs across the body mass index levels.
The Marascuilo table shows that the proportion of smokers differs between the Normal and Overweight groups.
While the difference in the proportion of smokers between the Normal and Overweight groups is statistically significant, we cannot conclude that heavy smoking is associated with higher body mass index, because no other pair of groups shows a statistically significant difference.
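The chi-square test and the follow-up Marascuilo comparisons can be sketched as below. The counts are HYPOTHETICAL (made up for illustration, not the real data) and the function names are illustrative. The critical value 7.815 is the upper 5% point of the chi-square distribution with 3 degrees of freedom (four groups minus one).

```python
from itertools import combinations
from math import sqrt

CHI2_CRIT_DF3 = 7.815  # upper 5% critical value of chi-square with 3 df

def chi_square_homogeneity(counts):
    """counts: {group: (smokers, total)}. Returns the chi-square statistic."""
    total_smokers = sum(s for s, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    p_bar = total_smokers / total_n  # overall proportion of smokers
    chi2 = 0.0
    for smokers, n in counts.values():
        expected_yes, expected_no = n * p_bar, n * (1 - p_bar)
        chi2 += (smokers - expected_yes) ** 2 / expected_yes
        chi2 += ((n - smokers) - expected_no) ** 2 / expected_no
    return chi2

def marascuilo(counts, chi2_crit=CHI2_CRIT_DF3):
    """Return the pairs whose absolute difference in proportions exceeds
    the Marascuilo critical range."""
    significant = []
    for (g1, (s1, n1)), (g2, (s2, n2)) in combinations(counts.items(), 2):
        p1, p2 = s1 / n1, s2 / n2
        critical_range = sqrt(chi2_crit) * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
        if abs(p1 - p2) > critical_range:
            significant.append((g1, g2))
    return significant

# HYPOTHETICAL counts (smokers, group size), not the real data
toy_counts = {"Under": (10, 100), "Normal": (30, 100),
              "Overweight": (12, 100), "Obese": (15, 100)}
chi2 = chi_square_homogeneity(toy_counts)
pairs = marascuilo(toy_counts)
print(f"chi-square = {chi2:.2f}, reject H0: {chi2 > CHI2_CRIT_DF3}")
print("pairs that differ:", pairs)
```

The overall chi-square test only says that some proportions differ; the Marascuilo comparisons identify which pairs are responsible, which is why both steps appear in the answer.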
Introduction:
America's most prominent health insurance provider, UnitedHealth Group, aims to identify the characteristics of its insured population in order to better understand how these characteristics may influence the high medical costs billed by an insurance provider. The company has access to a sample of US health insurance data covering 1338 insured individuals, with their Age, Sex, Body Mass Index, Number of Children, Smoking status, Region and Insurance charges. We will analyse the correlations and patterns in the data and derive significant insights to help improve the business.
We...