The files are attached


MIS771 Descriptive Analytics and Visualisation
DEPARTMENT OF INFORMATION SYSTEMS AND BUSINESS ANALYTICS
DEAKIN BUSINESS SCHOOL
FACULTY OF BUSINESS AND LAW, DEAKIN UNIVERSITY

Assignment One

Background
This is an individual assignment. You need to analyse a given data set, and then interpret and draw conclusions from your analysis. You then need to convey your conclusions in a written report to a Business professional with very little or no knowledge of Business Analytics.

Percentage of the final grade
30%

The Due Date and Time
8pm Thursday 20th August 2020

Submission instructions
The assignment must be submitted by the due date, electronically in CloudDeakin. When submitting electronically, you must check that you have submitted the work correctly by following the instructions provided in CloudDeakin. Please note that we will NOT accept any paper or email copies, or part of the assignment submitted after the deadline.

Information for students seeking an extension BEFORE the due date
If you wish to seek an extension for this assignment prior to the due date, you need to apply directly to the Unit Chair by completing the Assignment and Online Test Extension Application Form (PDF, 188.6KB): https://www.deakin.edu.au/__data/assets/pdf_file/0006/2055552/BL_AssignmentExtensionForm_Feb2020.pdf
Please make sure you attach all supporting documentation and a draft of your assignment. This needs to occur as soon as you become aware that you will have difficulty in meeting the due date.
Please note: Unit Chairs can only grant extensions up to two weeks beyond the original due date. If you require more than two weeks, or have already been provided an extension by the Unit Chair and require additional time, you must apply for Special Consideration via StudentConnect within 3 business days of the due date.
Conditions under which an extension will normally be considered include:
• Medical – to cover medical conditions of a serious nature, e.g. hospitalisation, serious injury or chronic illness. Note: temporary minor ailments such as headaches, colds and minor gastric upsets are not serious medical conditions and are unlikely to be accepted. However, serious cases of these may be considered.
• Compassionate – e.g. death of a close family member, significant family and relationship problems.
• Hardship/Trauma – e.g. sudden loss or gain of employment, severe disruption to domestic arrangements, victim of crime.
Note: misreading the due date, assignment anxiety or returning home will not be accepted as grounds for consideration.

Information for students seeking an extension AFTER the due date
If the due date has passed, you require more than two weeks extension, or you have already been provided with an extension and require additional time, you must apply for Special Consideration via StudentConnect. Please be aware that applications are governed by University procedures and must be submitted within three business days of the due date or extension due date.
Please be aware that in most instances the maximum amount of time that can be granted for an assignment extension is three weeks after the due date, as Unit Chairs are required to have all assignments submitted before results/feedback can be released back to students.

Penalties for late submission
The following marking penalties will apply if you submit an assessment task after the due date without an approved extension:
• 5% will be deducted from available marks for each day, or part thereof, up to five days.
• Work that is submitted more than five days after the due date will not be marked; you will receive 0% for the task.
Note: 'Day' means calendar day. The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the task after the due date.

Additional information
For advice regarding academic misconduct, special consideration, extensions, and assessment feedback, please refer to the document “Rights and responsibilities as a student” in the “Unit Guide and Information” folder under the “Content” section in the MIS771 CloudDeakin site.
The assignment uses the dataset file Insurance.xlsx, which can be downloaded from CloudDeakin. Analysis of the data requires the use of techniques studied in Module-1.

Assurance of Learning
This assignment assesses the following Graduate Learning Outcomes and related Unit Learning Outcomes:
Graduate Learning Outcomes (GLO)
• GLO1: Discipline-specific knowledge and capabilities - appropriate to the level of study related to a discipline or profession.
• GLO2: Communication - using oral, written and interpersonal communication to inform, motivate and effect change.
• GLO5: Problem Solving - creating solutions to authentic (real world and ill-defined) problems.
• GLO6: Self-Management - working and learning independently, and taking responsibility for personal actions.
Unit Learning Outcomes (ULO)
• ULO 1: Apply quantitative reasoning skills to solve complex problems.
• ULO 2: Plan, monitor, and evaluate own learning as a data analyst.
• ULO 3: Deduce clear and unambiguous solutions in a form that they are useful for decision making and research purposes and for communication to the wider public.

Feedback before submission
You can seek assistance from the teaching staff to ascertain whether the assignment conforms to submission guidelines.

Feedback after submission
An overall mark, together with feedback, will be released via CloudDeakin, usually within 15 working days. You are expected to refer and compare your answers to the feedback to understand any areas of improvement.
The Case Study
The United States has one of the highest healthcare costs in the world, spending trillions of dollars on healthcare, which typically exceeds $10,000 per individual. Studies report that healthcare costs have gone up from 5% of gross domestic product (GDP) to 18% during the period 1960 to 2018. Moreover, a number of articles have suggested that predisposing personal characteristics, such as income, age, state, job etc. could possibly be related to the cost of health services.

“The Americans dying because they can't afford medical care - A December 2019 poll conducted by Gallup found 25% of Americans say they or a family member have delayed medical treatment for a serious illness due to the costs of care.”
Illustration: Mikyung Lee/The Guardian 2018-02-21

“Millions of Americans – as many as 25% of the population – are delaying getting medical help because of skyrocketing costs”
Michael Sainato, Tue 7 Jan 2020 20.00 AEDT, last modified on Wed 8 Jan 2020 22.13 AEDT

“Young people, who are expected to benefit from lower premiums should the GOP repeal-and-replace efforts succeed, already pay the least. But even their costs can be considerable, depending on where they live. In 2016, the financial data site ValuePenguin found that the average costs for coverage for a 21-year-old go from $180 a month in Utah, plus a $2,160 deductible (potentially $4,320 a year, total), to $426 a month in Alaska, with a $5,112 deductible (potentially $10,224 a year, total).”
Published Fri, Jun 23 2017 10:52 AM EDT, updated Mon, Oct 14 2019 11:55 AM EDT

“Average annual costs per person hit $10,345 in 2016. In 1960, the average cost per person was only $146 - and, adjusting for inflation, that means costs are nine times higher now than they were then.”
Published Fri, Jun 23 2017 10:52 AM EDT, updated Mon, Oct 14 2019 11:55 AM EDT

“Americans pay a lot for healthcare. Depending on where they live, typical workers shelled out between $4,500 and $8,300 for healthcare in 2017. But the US government pays even more.”
Tanza Loudenback, Mar 8, 2019, 12:25 AM

The UnitedHealth Group: America’s most prominent health insurance provider aims to identify the characteristics of the population to improve their understanding of the potential influence of these characteristics on their high medical costs billed by an insurance provider. They have access to a sample of US Health Insurance data containing 1338 insured personnel with their Age, Sex, Body Mass Index, Number of Children, Smoking status, Region and Insurance charges.
You are a Data Analyst working for UnitedHealth Group. Your Manager, Edmond Kendrick, has asked you to conduct a preliminary analysis. In particular, you are expected to apply a series of statistical techniques and produce a report based on your findings. Edmond’s email is reproduced below.
https://www.theguardian.com/us-news/2020/jan/07/americans-healthcare-medical-costs#img-1

Email from Edmond Kendrick
To: >
From: Edmond Kendrick
Subject: Analysis of US Health Insurance data

Hi,
As per our conversation, I have spoken with our reporting team and we have following questions relating to the US health insurance data. Please complete the following analysis for me. Your responses will assist them in writing the feature section of our next issue.

1. Provide your insights on how the specific attributes of the whole insured population is affecting their insurance premiums based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
(b) An estimate of the difference in medical costs for a single person versus someone with a family
(c) Males with no dependents have claimed that they have, on average, been charged more than their female counterparts. Can you check whether this claim is possibly true?
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater proportion of males who are smokers compared to females? Can you check whether this claim can also be substantiated? Briefly advise the findings in regard to the proportion of males and females who are smokers.

2. Can you further analyse to see whether the beneficiary's residential area/region in the US affect how health insurance provider bill their medical costs?

3. We believe that individual medical costs billed by health insurance differs significantly across age group of primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood:
Answered Same Day | Aug 03, 2021 | MIS771 | Deakin University

Biswajit answered on Aug 09 2021
153 Votes
Analysis
1. Provide your insights on how the specific attributes of the whole insured population is affecting their insurance premiums based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
Ans: The estimated difference in mean medical costs (female minus male) is -1387. The 95% confidence interval for the difference is -2682 to -92.
Our hypotheses were:
Null Hypothesis (H0): There is no difference in mean medical costs between females and males.
Alternate Hypothesis (Ha): Mean medical costs for females are not the same as for males.
As the p-value of 0.0359 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, mean medical costs differ between females and males. The negative estimate, together with an interval that lies entirely below zero, indicates that females are billed less on average than males.
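The reported estimate, interval and p-value can be checked against each other. The sketch below is only a consistency check under a normal approximation (an assumption that is reasonable for a sample of 1338); it is not the calculation used to produce the answer, and the function name `check_interval` is made up for illustration.

```python
from statistics import NormalDist

def check_interval(estimate, lower, upper, confidence=0.95):
    """Recover the standard error, z statistic and two-sided p-value
    implied by an estimate and its confidence interval (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    std_err = (upper - lower) / (2 * z_crit)             # half-width divided by z_crit
    z = estimate / std_err
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return std_err, z, p_two_sided

# Values reported for part (a): estimate -1387, 95% CI -2682 to -92
se, z, p = check_interval(-1387, -2682, -92)
print(f"SE = {se:.1f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

The p-value this prints should land close to the reported 0.0359. A large gap would suggest that the interval and the p-value came from different calculations.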
(b) An estimate of the difference in medical costs for a single person versus someone with a family
Ans: The estimated difference in mean medical costs (single person minus someone with a family) is -1584. The 95% confidence interval for the difference is -2894 to -274.
Our hypotheses were:
Null Hypothesis (H0): There is no difference in mean medical costs between a single person and someone with a family.
Alternate Hypothesis (Ha): Mean medical costs for a single person are not the same as for someone with a family.
As the p-value of 0.0178 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, mean medical costs differ between the two groups. The negative estimate, together with an interval that lies entirely below zero, indicates that a single person is billed less on average than someone with a family.
(c) Males with no dependents have claimed that they have, on average, been charged more than their female counterparts. Can you check whether this claim is possibly true?
Ans:
Our hypotheses were:
Null Hypothesis (H0): Males with no dependents are charged, on average, less than or equal to females with no dependents.
Alternate Hypothesis (Ha): Males with no dependents are charged, on average, more than females with no dependents.
As the p-value of 0.1781 is greater than the 0.05 level of significance, we fail to reject the null hypothesis.
At the 5% level of significance, there is insufficient evidence that males with no dependents are charged more, on average, than females with no dependents.
The estimated difference in mean medical costs (males with no dependents minus females with no dependents) is 927, and the 95% confidence interval of -1045 to 2899 includes zero.
So the sample does not support the claim that males with no dependents are charged more than their female counterparts. This does not show that the two groups are charged the same, only that the observed difference could be due to sampling variation.
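A one-tailed comparison of two group means can be sketched as below. This is a Welch-style test written for illustration, not necessarily the procedure used in the answer. The charges are hypothetical (not taken from Insurance.xlsx), and the upper-tail p-value uses a normal approximation that is only adequate for large groups.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_one_tailed(males, females):
    """Test H0: mean(males) <= mean(females) against Ha: mean(males) > mean(females)."""
    n1, n2 = len(males), len(females)
    v1, v2 = variance(males) / n1, variance(females) / n2  # squared standard errors
    diff = mean(males) - mean(females)
    t_stat = diff / sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Normal approximation to the upper tail; for small groups use a t table with df
    p_upper = 1 - NormalDist().cdf(t_stat)
    return diff, t_stat, df, p_upper

# Hypothetical charges, NOT taken from Insurance.xlsx
males_no_dep = [2100.0, 3300.0, 12800.0, 4500.0, 21000.0, 1700.0]
females_no_dep = [2500.0, 3100.0, 11900.0, 4200.0, 18700.0, 2200.0]

diff, t_stat, df, p = welch_one_tailed(males_no_dep, females_no_dep)
print(f"difference = {diff:.1f}, t = {t_stat:.2f}, df = {df:.1f}, upper-tail p = {p:.3f}")
```

Only the upper tail counts as evidence here, because the claim being tested is directional (males charged more).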
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater proportion of males who are smokers compared to females? Can you check whether this claim can also be substantiated? Briefly advise the findings in regard to the proportion of males and females who are smokers.
Ans: Our hypotheses were:
Null Hypothesis (H0): The proportion of males who smoke is less than or equal to the proportion of females who smoke.
Alternate Hypothesis (Ha): The proportion of males who smoke is greater than the proportion of females who smoke.
As the p-value of 0.0027 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, the proportion of smokers is higher among males than among females.
The estimated difference between the male and female proportions of smokers is 6.15%. A one-tailed p-value of 0.0027 corresponds to z ≈ 2.78, so the standard error of the difference is about 2.2 percentage points and the 95% confidence interval is roughly 1.8% to 10.5%, rather than 6.01% to 6.29%.
So the claim that a greater proportion of males than females are smokers is substantiated.
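A two-proportion z-test of this kind can be sketched as follows. The counts are an assumption made for illustration: they are not given in the answer and were chosen because they reproduce the reported 6.15% difference. They should be replaced with the counts from Insurance.xlsx.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, confidence=0.95):
    """One-tailed z-test of H0: p1 <= p2 against Ha: p1 > p2, plus a CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # The test statistic uses the pooled proportion, as is usual under H0
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_upper = 1 - NormalDist().cdf(z)
    # The confidence interval uses the unpooled standard error
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = NormalDist().inv_cdf(0.5 + confidence / 2) * se_unpooled
    return diff, z, p_upper, (diff - margin, diff + margin)

# ASSUMED counts (smokers, group size) for males and females, for illustration only
diff, z, p, (low, high) = two_proportion_test(159, 676, 115, 662)
print(f"difference = {diff:.2%}, z = {z:.2f}, one-tailed p = {p:.4f}")
print(f"95% CI: {low:.2%} to {high:.2%}")
```

Under H0 the two proportions are equal, which is why the test pools them, while the interval estimates the actual difference and so keeps them separate.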
2. Can you further analyse to see whether the beneficiary's residential area/region in the US affect how health insurance provider bill their medical costs?
Ans: We used a one-way ANOVA with the following hypotheses:
Null Hypothesis (H0): Mean medical costs are the same in all 4 regions.
Alternate Hypothesis (Ha): Mean medical costs differ between at least one pair of regions.
The ANOVA run with Excel's built-in function gives a p-value of 0.0308, which is less than 0.05, so mean medical costs differ between at least one pair of regions.
The Tukey-Kramer test shows that mean medical costs differ significantly between the southeast and southwest regions.
So the beneficiary's residential region is associated with the medical costs billed by their health insurance provider.
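The F statistic behind a one-way ANOVA can be computed by hand, as in the sketch below. The charges are hypothetical, and the region names other than southeast and southwest are assumed. The code stops at the F statistic, because the p-value requires the F distribution (an F table or statistical software).

```python
from statistics import mean

def one_way_anova(groups):
    """groups: dict mapping a label to a list of values.
    Returns (F statistic, df between groups, df within groups)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values())
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical charges by region, NOT taken from Insurance.xlsx
charges_by_region = {
    "northeast": [9800.0, 14200.0, 11500.0, 16100.0],
    "northwest": [8700.0, 12900.0, 10400.0, 15300.0],
    "southeast": [13900.0, 18800.0, 12100.0, 21500.0],
    "southwest": [7600.0, 11800.0, 9900.0, 13200.0],
}
f_stat, df_b, df_w = one_way_anova(charges_by_region)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A significant F only says that some means differ. A follow-up such as the Tukey-Kramer test is still needed to identify which pairs.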
3. We believe that individual medical costs billed by health insurance differs significantly across age group of primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood: 56 years and older) with their smoking behaviour. Is there any evidence to support this assertion?
What associations, if any, exists focusing on smokers in the diverse age groups?
Ans: Here we used a two-way ANOVA, which tests three pairs of hypotheses.
Age group:
Null Hypothesis (H0): Mean medical costs of the different age groups are equal.
Alternate Hypothesis (Ha): Mean medical costs of at least two age groups are different.
Smoking behaviour:
Null Hypothesis (H0): Mean medical costs of the smoking and non-smoking groups are the same.
Alternate Hypothesis (Ha): Mean medical costs of the smoking and non-smoking groups are different.
Interaction:
Null Hypothesis (H0): Age group and smoking behaviour do not interact to affect medical costs.
Alternate Hypothesis (Ha): Age group and smoking behaviour interact to affect medical costs.
In the two-way ANOVA, the associated p-values (2.64E-94, 3.87E-14 and 1.71E-06) are all less than 0.05, so we reject all 3 null hypotheses.
We conclude that mean medical costs differ between at least two age groups, that mean medical costs differ between smokers and non-smokers, and that age group and smoking behaviour interact.
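Before running a two-way ANOVA it helps to look at the mean cost in each age group and smoking status cell. The sketch below builds that table using the age bands from the email. It does not perform the ANOVA itself, the records are hypothetical, and the group labels are chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

def age_group(age):
    """Age bands as defined in the email: 18-35, 36-55, 56 and older."""
    if age <= 35:
        return "young adult (18-35)"
    if age <= 55:
        return "middle age (36-55)"
    return "older adult (56+)"

def cell_means(records):
    """records: iterable of (age, smoker, charges).
    Returns the mean charge for each (age group, smoker) cell."""
    cells = defaultdict(list)
    for age, smoker, charges in records:
        cells[(age_group(age), smoker)].append(charges)
    return {cell: mean(values) for cell, values in cells.items()}

# Hypothetical records, NOT taken from Insurance.xlsx
sample = [
    (19, "yes", 16000.0), (27, "yes", 18000.0), (30, "no", 3000.0), (24, "no", 4000.0),
    (40, "yes", 22000.0), (50, "yes", 26000.0), (45, "no", 8000.0), (52, "no", 10000.0),
    (60, "yes", 30000.0), (58, "no", 13000.0), (63, "no", 15000.0),
]
for cell, value in sorted(cell_means(sample).items()):
    print(cell, round(value, 1))
```

If the gap between smokers and non-smokers changes from one age group to the next, that is the pattern the interaction term is testing for.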
Focusing on smokers only, a one-factor ANOVA across the three age groups gives a p-value of 0.025, which is less than 0.05, so at least two age groups among smokers have statistically different mean medical costs.
The Tukey-Kramer test shows that the 18-35 and 36-55 age groups have statistically different mean costs, with an absolute difference of 5034.
4. We are interested in comparing the smoking behaviour of primary beneficiaries across each of the Body mass index levels (Under, Normal, Overweight, and Obese). Could we say that there are differences in the proportion of smokers across the four BMI levels?
Further, some studies have claimed that the heavy smoking is often associated with higher body mass index (BMI). Could we reach a similar conclusion for all beneficiaries?
Ans: Here we used a chi-square test for differences among proportions.
Null Hypothesis (H0): There is no difference in the proportion of smokers across the body mass index levels.
Alternate Hypothesis (Ha): The proportion of smokers differs for at least one body mass index level.
The Marascuilo table shows a significant difference in the proportion of smokers between the normal and overweight groups.
Although the proportion of smokers differs significantly between the normal and overweight groups, no other pair of BMI levels differs significantly, so we cannot conclude that smoking is associated with higher body mass index for all beneficiaries.
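The Marascuilo procedure compares each pair of proportions against a critical range derived from the chi-square critical value. The sketch below assumes a 5% significance level and four groups (3 degrees of freedom, critical value 7.815). The counts are hypothetical and do not come from Insurance.xlsx.

```python
from itertools import combinations
from math import sqrt

CHI2_CRIT_DF3 = 7.815  # upper 5% point of the chi-square distribution, 3 degrees of freedom

def marascuilo(groups, chi2_crit=CHI2_CRIT_DF3):
    """groups: dict mapping a label to (smokers, total).
    Returns one row per pair: (label_a, label_b, |difference|, critical range, significant)."""
    rows = []
    for (a, (xa, na)), (b, (xb, nb)) in combinations(groups.items(), 2):
        pa, pb = xa / na, xb / nb
        crit_range = sqrt(chi2_crit) * sqrt(pa * (1 - pa) / na + pb * (1 - pb) / nb)
        abs_diff = abs(pa - pb)
        rows.append((a, b, abs_diff, crit_range, abs_diff > crit_range))
    return rows

# Hypothetical (smokers, total) counts per BMI level
bmi_counts = {
    "Under": (5, 20),
    "Normal": (50, 220),
    "Overweight": (75, 390),
    "Obese": (145, 700),
}
for a, b, abs_diff, crit, significant in marascuilo(bmi_counts):
    print(f"{a} vs {b}: |diff| = {abs_diff:.3f}, critical range = {crit:.3f}, significant = {significant}")
```

A pair is flagged only when its absolute difference exceeds its own critical range, so small groups such as an underweight category need a much larger difference to be flagged.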
Introduction:
America’s most prominent health insurance provider, UnitedHealth Group, aims to identify the characteristics of the population to improve its understanding of how these characteristics may influence the high medical costs billed by an insurance provider. It has access to a sample of US Health Insurance data containing 1338 insured personnel with their Age, Sex, Body Mass Index, Number of Children, Smoking status, Region and Insurance charges. We will analyse the correlations and patterns in the data and derive significant insights to help improve our business.
We...