Question 1. The wine data set at https://archive.ics.uci.edu/ml/datasets/wine has 13 features. Develop in Python and apply your own version of the PCA algorithm to this data set, to visualize how PCA...

There are 3 ML questions that require both coding and documentation. Please read them carefully and make sure everything is covered and answered. Use a notebook for the code and a Word document for the write-up. In the document, please paste screenshots of your outputs and give clear explanations/interpretations in words for each question. In the notebook, please write clear comments and use Markdown to indicate where each question begins. Many thanks amigo.


Question 1. The wine data set at https://archive.ics.uci.edu/ml/datasets/wine has 13 features. Develop in Python and apply your own version of the PCA algorithm to this data set, to visualize how PCA helps with dimensionality reduction. Explain how many principal components you will choose and why. What percentage of the variance in the data do the selected principal components cover? For the implementation, you may use any objects, modules, and functions in NumPy, SciPy, and other Python libraries for operations such as computing eigenvalues and eigenvectors or any other math / linear algebra operation, but do not use the PCA function available in scikit-learn directly.

Question 2. AutoML. Follow the Automated Machine Learning (AutoML) Walkthrough using DataRobot given here: https://community.datarobot.com/t5/knowledge-base/automated-machine-learning-automl-walkthrough/tac-p/8251 Use the free trial version of DataRobot on a prediction or classification problem that you solved in an ML domain in the past. By what percentage does the accuracy increase when you use DataRobot? Which models listed on the leaderboard perform the best and why? List 5 takeaways from this exercise.

Question 3. Graph Databases for ML. Here's an implementation of the K-means algorithm using Neo4j's Cypher query language: https://medium.com/neo4j/k-means-clustering-with-neo4j-b0ec54bf0103 Implement K-means for the same dataset in Python and, using similar visualizations, compare its performance with the implementation in Cypher.
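For Question 1, the PCA steps (standardize, form the covariance matrix, eigendecompose, rank components by explained variance, project) could be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the posted solution: the function name `pca`, the 95% cumulative-variance threshold, and the synthetic 178 x 13 array standing in for the wine data are all hypothetical choices made here. It uses only NumPy and does not call scikit-learn's PCA.

```python
import numpy as np


def pca(X, var_threshold=0.95):
    """Hypothetical helper: PCA via eigendecomposition of the covariance matrix.

    Returns the projected data, the explained-variance ratio of every
    component, and the number of components k needed to reach var_threshold.
    """
    X = np.asarray(X, dtype=float)

    # Standardize each feature so that large-scale features do not dominate.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std

    # Covariance matrix of the standardized features (columns are features).
    cov = np.cov(Z, rowvar=False)

    # eigh is meant for symmetric matrices and returns ascending eigenvalues,
    # so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Fraction of total variance carried by each component.
    explained = eigvals / eigvals.sum()
    cumulative = np.cumsum(explained)

    # Smallest k whose cumulative explained variance reaches the threshold.
    k = min(int(np.searchsorted(cumulative, var_threshold)) + 1, len(eigvals))

    # Project the standardized data onto the first k principal directions.
    scores = Z @ eigvecs[:, :k]
    return scores, explained, k


# Synthetic stand-in for the wine data (assumption: 178 rows, 13 features).
# Replace X_demo with the real feature matrix loaded from the UCI file.
rng = np.random.default_rng(0)
latent = rng.normal(size=(178, 3))
X_demo = latent @ rng.normal(size=(3, 13)) + 0.3 * rng.normal(size=(178, 13))

scores, explained, k = pca(X_demo, var_threshold=0.95)
print("components kept:", k)
print("variance covered:", np.cumsum(explained)[k - 1])
```

The threshold is a judgment call; plotting `np.cumsum(explained)` against the component index (a scree-style plot) is one common way to justify the chosen number of components, and the first two columns of `scores` can be scatter-plotted to visualize the reduction.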
Answered 2 days after Nov 18, 2021

Pritam Kumar answered on Nov 20 2021
135 Votes