My college has requested me to do a project using script and I am seeking your help in this regard. The instructionsare as follows:You expected to pick a topic of interest, find an open-source...

1 answer below »
My college has requested me to do a project using script and I am seeking your help in this regard.
The instructionsare as follows:You expected to pick a topic of interest, find an open-source dataset, define a business analytics question, apply appropriate business analytics methods, interpret results, and share the insights in your project report and presentations. There are many open-source datasets. Here are just a few examples, and you are welcome to explore other sources.1. Kagglehttps://www.kaggle.com/datasets2. UCI Machine Learning Repositoryhttps://archive.ics.uci.edu/ml/datasets.php3. Google Public Data Explorerhttps://www.google.com/publicdata/directory4. National Information Center on Health Services Research and Health Care Technologyhttps://www.nlm.nih.gov/hsrinfo/datasites.html#164Databases/Repositories5. Centers for Disease Control and Prevention (CDS) Public datasetshttps://www.cdc.gov/nchs/data_access/ftp_data.htm6. U.S. Census Bureauhttps://www.census.gov/topics/income-poverty/income/data/datasets.html
https://www.bls.gov/data/

7. Open Data Bostonhttps://data.boston.gov/dataset8. Data Worldhttps://data.world/datasets/professional9. Airline Industryhttps://catalog.data.gov/dataset?tags=flight10. Statlib datasetshttp://lib.stat.cmu.edu/datasets/11. van der Maaten’s web page provides high dimensional data setshttps://lvdmaaten.github.io/tsne/12. Open Gov. Datahttps://www.data.gov/
https://www.data.gov.uk13. Federal Reserve Economic Datahttps://fred.stlouisfed.org/14. World Bankhttps://data.worldbank.org/15. Youtube labeled Video Datasethttps://research.google.com/youtube8m/16. Past KDD Cupshttps://www.kdd.org/kdd-cup17. AwesomePublic Datasetshttps://github.com/awesomedata/awesome-public-datasets18. WRDS and Bloomberg Access to finance data for all students and faculty of SOM
Could you kindly suggest me a topic, please be advised that this is a beginners level project?Once we decide on topic we can talk about the price and everything.Thank you!
Answered 26 days AfterMar 26, 2022

Answer To: My college has requested me to do a project using script and I am seeking your help in this regard....

Suraj answered on Mar 27 2022
108 Votes
Data Analysis using R
Data Analysis using R
Introduction:
Since, the project is an open-source internet type project. So, I have selected data related to medical field where many variables are measured on the health state
of people and the response variable is the death event. That is the person will survive or not. The population of the data set are all the patients who suffered from the cardiovascular disease earlier and whether they survived or not. A brief explanation about the variables given as follows:
Description Table
There are 299 rows and 12 different variables in the data set. The data set does not contain any missing value for any row. It is check using R-Studio as follows:
Missing Value table
Here, many variables are of categorical type. So, our research question will be based on these categorical variables.
The data set is available from the following link:
https://www.kaggle.com/datasets/andrewmvd/heart-failure-clinical-data?select=heart_failure_clinical_records_dataset.csv
Descriptive statistics:
The descriptive statistics of the different variables is calculated as follows:
Why this data set important?
This is a quite interesting question which can build our analytical thinking skills. This data set is medical data set, so we are interested to analyse some medical aspects of the human life. That is to check about the whether a diabetic person has high chances of death or not. Whether the presence of some disease features leads to heart attach and also the person will be survived or not. Here, all the variables are important for the analysis purpose, so no single variable is deleted from the data set.

Research question:
Since, we are interested in the analysis of the categorical variables. Thus, our variables of interest are Diabetes, High blood pressure and death event. Thus, the research question is that is there any association of Diabetes and High blood pressure and death event. In other words, we can say that is there any dependence of diabetes and high blood pressure on the death event variable.
Theory:
Since, the question is based on the categorical variable and we are checking about the association. Thus, the best approach to do this type of analysis is by using...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here