
Follow the instructions given in the PDF, and download the dataset using the link given in the document.


CS 4375 Homework 6
April 14, 2022
Deadline for the submission: April-27-2022

All assignments MUST have your name, student ID, and course name/number at the beginning of your documents. Your homework MUST be submitted via eLearning with the following file format and naming convention: HW#_Name_code.ipynb (for the coding part).

Question 1 (Clustering): Consider the following Iris dataset with 150 points. Each point has four numerical features, shown in the second to fifth columns (sepal_length to petal_width), and one categorical feature, shown in the last column. The categorical feature is named “species” and has three possible categories: Iris Setosa, Iris Versicolour, and Iris Virginica. We use only the four numerical features for the clustering task described below and ignore the categorical feature.

Iris dataset link: https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html

Figure 1: Visualization example of the Iris dataset

Apply the following three clustering methods to the Iris dataset and generate k clusters: k-means, hierarchical clustering, and Gaussian mixture models (GMM).

(a) Choose an appropriate number of clusters for the dataset using the SSE-based technique introduced in class.
(b) For each method, show a scatter plot of the clusters in the space spanned by the first two numerical features, in which the data points belonging to each cluster are shown in a different color.

The following is a scatter plot of the dataset in the first two features, in which the three colors correspond to the three categorical values.

Figure 2: Scatter plot of the dataset in the first two features

The following are some useful links:

Tutorial about Gaussian mixture model-based clustering in Python: https://www.analyticsvidhya.com/blog/2019/10/gaussian-mixture-models-clustering/
Tutorial about hierarchical clustering in Python: https://www.analyticsvidhya.com/blog/2019/05/beginners-guide-hierarchical-clustering/
Tutorial about k-means clustering in Python: https://realpython.com/k-means-clustering-python/
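One possible way to approach parts (a) and (b) is sketched below with scikit-learn. It is a starting point under stated assumptions, not the official solution: the "SSE-based technique introduced in class" is assumed here to be the elbow method applied to the k-means within-cluster sum of squared errors (`inertia_`), the value `k = 3` is a placeholder to be replaced by whatever the SSE curve suggests, and the helper names `sse_curve`, `cluster_all`, and `plot_clusters` are hypothetical. The hierarchical step uses agglomerative clustering with scikit-learn's default (Ward) linkage, which is also an assumption.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

# Use only the four numerical features; the species column is ignored.
X = load_iris().data  # shape (150, 4)


def sse_curve(X, k_values, seed=0):
    """Return the k-means SSE (inertia_) for each candidate k."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ]


def cluster_all(X, k, seed=0):
    """Fit the three methods with k clusters; return one label array per method."""
    return {
        "k-means": KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X),
        "hierarchical": AgglomerativeClustering(n_clusters=k).fit_predict(X),
        "GMM": GaussianMixture(n_components=k, random_state=seed).fit_predict(X),
    }


def plot_clusters(X, labels_by_method):
    """Scatter the first two features, one panel per method, colored by cluster."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, axes = plt.subplots(
        1, len(labels_by_method), figsize=(5 * len(labels_by_method), 4),
        squeeze=False,
    )
    for ax, (name, labels) in zip(axes[0], labels_by_method.items()):
        ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=20)
        ax.set_title(name)
        ax.set_xlabel("sepal_length")
        ax.set_ylabel("sepal_width")
    return fig


# Part (a): compute SSE for k = 1..10 and look for the "elbow" in the curve.
k_values = list(range(1, 11))
sse = sse_curve(X, k_values)

# Part (b): cluster with the chosen k (placeholder value; pick it from the SSE curve).
k = 3
labels = cluster_all(X, k)
```

In a notebook, plotting `k_values` against `sse` shows the curve for part (a), and calling `plot_clusters(X, labels)` produces the three scatter plots for part (b). Note that cluster labels are arbitrary integers, so the same group of points may receive a different color in each panel.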
Apr 27, 2022