1. (40 points) Implement the KNN classifier.

Note: This question carries an additional 10-point bonus if you implement the model in Python. You may call existing Python functions for normalization and Euclidean distance calculations, but you are not allowed to call the KNN model provided by Python libraries.

Your implementation should accept two data files as input (both are posted with the assignment): a spam_train.csv file (weka_spam_train.arff for Weka users) and a spam_test.csv file (weka_spam_test.arff for Weka users). Both files contain examples of e-mail messages, with each example having a class label of either “1” (spam) or “0” (no-spam). Each example has 57 numeric features that characterize the message. Your classifier should examine each example in the spam_test set and classify it as one of the two classes. The classification will be based on an unweighted vote of its k nearest examples in the spam_train set. We will measure all distances using regular Euclidean distance:

$$d(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$$

(a) Report test accuracies when k = 1, 5, 11, 21, 41, 61, 81, 101, 201, 401 without normalizing the features.

(b) Report test accuracies when k = 1, 5, 11, 21, 41, 61, 81, 101, 201, 401 with z-score normalization applied to the features.

(c) For case (b), output the KNN predicted labels for the first 50 test instances (i.e. t1 - t50) when k = 1, 5, 11, 21, 41, 61, 81, 101, 201, 401 (in this order). For example, if t5 is classified as “spam” when k = 1, 5, 11, 21, 41, 61 and as “no-spam” when k = 81, 101, 201, 401, then your output line for t5 should be:

t5 spam, spam, spam, spam, spam, spam, no, no, no, no

(d) What can you conclude by comparing the KNN performance in (a) and (b)?
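The unweighted vote and the z-score normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution. It assumes the features and 0/1 labels have already been loaded into NumPy arrays (file parsing is omitted because the column layout of the CSV files is not specified here), that the normalization statistics are taken from the training set only, and that every k is no larger than the number of training examples. The function names `zscore`, `knn_predict`, `accuracy`, and `format_line` are hypothetical, and the toy data is made up for demonstration.

```python
import numpy as np

def zscore(train_x, test_x):
    """Standardize both sets using the training set's mean and standard deviation."""
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # assumption: constant features are left unscaled
    return (train_x - mean) / std, (test_x - mean) / std

def knn_predict(train_x, train_y, test_x, ks):
    """Return {k: predicted labels} from an unweighted vote of the k nearest training rows."""
    preds = {k: np.empty(len(test_x), dtype=int) for k in ks}
    for i, x in enumerate(test_x):
        # Euclidean distance from this test example to every training example
        dist = np.sqrt(((train_x - x) ** 2).sum(axis=1))
        order = np.argsort(dist, kind="stable")
        # spam_votes[j] = number of spam labels among the j + 1 nearest neighbours
        spam_votes = np.cumsum(train_y[order])
        for k in ks:
            preds[k][i] = 1 if 2 * spam_votes[k - 1] > k else 0
    return preds

def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(pred == truth))

def format_line(i, preds, ks):
    """Build the part (c) output line for test instance i (0-based index)."""
    labels = ["spam" if preds[k][i] == 1 else "no" for k in ks]
    return f"t{i + 1} " + ", ".join(labels)

# Toy data (made up): two well-separated clusters with two features each.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])
test_x = np.array([[0.5, 0.5], [5.5, 5.5]])
test_y = np.array([0, 1])

ks = [1, 3, 5]
train_z, test_z = zscore(train_x, test_x)
preds = knn_predict(train_z, train_y, test_z, ks)
for k in ks:
    print(k, accuracy(preds[k], test_y))
for i in range(len(test_x)):
    print(format_line(i, preds, ks))
```

Sorting the distances once per test example and taking a cumulative sum of the sorted labels lets all values of k share a single distance computation. Every k in the assignment is odd, so a two-class vote cannot tie.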