

For this assignment, we will analyze an open dataset describing the passengers aboard the Titanic.



The data file for this assignment, train.csv, can be downloaded from the Kaggle website: https://www.kaggle.com/c/titanic/data. It is also attached to the assignment page. The definitions of all variables can be found on the same Kaggle page, in the Data Dictionary section.



Read the data from the file into a pandas DataFrame. Analyze, clean, and transform the data to answer the following question:
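As a starting point, the loading step might look like the minimal sketch below. It assumes the column names follow the Kaggle Data Dictionary (`Survived`, `Pclass`, `Sex`, `Age`, and so on). The helper `load_passengers` and the inline `SAMPLE_CSV` rows are hypothetical and made up for illustration, so that the snippet runs without the real file.

```python
import io

import pandas as pd

# Tiny MADE-UP sample in the assumed train.csv layout (not real passenger data).
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked
1,0,3,male,22,1,0,7.25,S
2,1,1,female,38,1,0,71.0,C
3,1,3,female,,0,0,8.0,S
4,0,2,male,35,0,0,13.0,
"""


def load_passengers(source):
    """Read passenger data from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# For the real assignment, pass the path to the downloaded file instead,
# e.g. load_passengers("train.csv").
df = load_passengers(io.StringIO(SAMPLE_CSV))

# First checks: size of the table and the number of missing values per column.
print(df.shape)
print(df.isna().sum())
```

Counting missing values right after loading shows early which attributes will need cleaning before they can be used.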




What categories of passengers were most likely to survive the Titanic disaster?




Question 1. (4 points)



  • The answer to the main question - What categories of passengers were most likely to survive the Titanic disaster? (2 points)

  • The detailed explanation of the logic of the analysis (2 points)
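One possible way to approach the main question is to compare survival rates across categories. The sketch below assumes the `Survived` column is coded 0/1, so that its mean within a group is the share of survivors in that group. The helper `survival_rate` is hypothetical, and the rows are made up for illustration; they are not results from the real data.

```python
import pandas as pd

# MADE-UP rows for illustration only; they do not reflect the real dataset.
df = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 1, 0, 0],
    "Sex": ["female", "male", "female", "male",
            "female", "male", "female", "male"],
    "Pclass": [1, 1, 2, 2, 3, 3, 3, 3],
})


def survival_rate(frame, by):
    """Share of survivors and group size for each value of `by`."""
    return (
        frame.groupby(by)["Survived"]
        .agg(rate="mean", n="size")  # mean of a 0/1 column is a proportion
        .sort_values("rate", ascending=False)
    )


by_sex = survival_rate(df, "Sex")
by_sex_and_class = survival_rate(df, ["Sex", "Pclass"])
print(by_sex)
print(by_sex_and_class)
```

Reporting the group size `n` next to each rate helps when explaining the logic of the analysis, because a high rate in a very small group is weak evidence.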




Question 2. (3 points)



  • What other attributes did you use for the analysis? Explain how you used them and why you decided to use them.

  • Provide a complete list of all attributes used.




Question 3. (3 points)



  • Did you engineer any attributes (that is, create new attributes)? If yes, explain the rationale and how the new attributes were used in the analysis.

  • If you have excluded any attributes from the analysis, provide an explanation of why you believe they can be excluded.
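As an illustration of attribute engineering and exclusion, the sketch below derives a family-size attribute from `SibSp` and `Parch` (column names assumed from the Kaggle Data Dictionary) and drops two identifier-like columns. The new names `FamilySize` and `IsAlone`, the helper `add_family_features`, and the choice of columns to drop are assumptions made for this example, not requirements of the assignment. The rows are made up.

```python
import pandas as pd

# MADE-UP rows for illustration only.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "SibSp": [1, 0, 3],   # siblings / spouses aboard
    "Parch": [0, 0, 2],   # parents / children aboard
    "Ticket": ["A1", "B2", "C3"],
})


def add_family_features(frame):
    """Return a copy with hypothetical FamilySize and IsAlone attributes."""
    out = frame.copy()
    # Relatives aboard plus the passenger themselves.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    out["IsAlone"] = out["FamilySize"] == 1
    return out


# Example exclusion: identifier-like columns that do not describe a
# passenger category (an assumption to justify in the written answer).
features = add_family_features(df).drop(columns=["PassengerId", "Ticket"])
print(features)
```

Working on a copy keeps the original DataFrame unchanged, which makes it easier to explain which attributes were original, engineered, or excluded.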




Question 4. (5 points)



  • How did you treat missing values for those attributes that you included in the analysis (for example, the `age` attribute)? Provide a detailed explanation in the comments.
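One common treatment, shown here only as a sketch, is to fill missing ages with the median age of a comparable group and to keep a flag marking which values were imputed. Grouping by `Pclass` is an assumption made for this example; other groupings, or dropping the rows, may be equally defensible. The column names `AgeMissing` and `AgeFilled` are hypothetical, and the rows are made up.

```python
import numpy as np
import pandas as pd

# MADE-UP rows for illustration only.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Age": [40.0, np.nan, 50.0, 20.0, 30.0, np.nan],
})

# Record which ages were missing before anything is filled in.
df["AgeMissing"] = df["Age"].isna()

# Median age within each class, aligned row by row with the original frame.
# The median ignores missing values and is less sensitive to outliers
# than the mean.
group_median = df.groupby("Pclass")["Age"].transform("median")

# Fill only the missing entries; observed ages stay unchanged.
df["AgeFilled"] = df["Age"].fillna(group_median)
print(df)
```

Keeping the original `Age` column and the `AgeMissing` flag makes the imputation easy to document in the comments and to reverse if a different treatment is chosen later.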

Feb 21, 2021