CAP4770 – Module 2 Assignment

Objective

The purpose of the assignment is to help students acquire skills in working with (1) methods used to understand important features of the application data and (2) methods used to preprocess data to improve its quality.

Assignment Questions

1. (This exercise is a variation of Exercise 2.2 in Chapter 2 of the textbook) Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order):

   13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70

   (a) Complete the table below (round results to two decimal places):

       Mean    Mode    Midrange    Minimum    First Quartile    Median    Third Quartile    Maximum

   (b) Construct a boxplot of the data.

2. (Exercise 2.6 in Chapter 2 of the textbook) Consider two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). Complete the table below; round results to two decimal places:

       Euclidean Distance    Manhattan Distance    Minkowski Distance    Supremum Distance

3. (This exercise is a variation of Exercise 2.8 in Chapter 2 of the textbook) Consider the data as 2-D data points. Given a new data point x = (1.4, 1.6) as a query, rank the database points based on the cosine similarity measure.

             A1     A2
       x1    1.5    1.7
       x2    2      1.9
       x3    1.6    1.8
       x4    1.2    1.5
       x5    1.5    1

   In other words, find the cosine similarity for each data point and sort in decreasing order:

                         Similarity    Vector
       Closest
       Second closest
       Third closest
       Fourth closest
       Farthest

4. (This exercise is a variation of Exercise 3.3 in Chapter 3 of the textbook) Using the data set given in Exercise 1 of this assignment, use smoothing by bin means to smooth this data, using a bin depth of 3. Round results to two decimal places.

       Bins    Smoothed by Bin Means

5. (This exercise is a variation of Exercise 3.7 in Chapter 3 of the textbook) Using the data set given in Exercise 1 of this assignment, use min-max, z-score, and decimal scaling normalizations to transform the value 35 (use [0, 1] as the new range for the min-max normalization and 12.94 as the standard deviation for the z-score normalization). Round results to three decimal places.

       Normalization      Normalized Value
       min-max
       z-score
       decimal scaling

Guidelines

The assignment is to be completed individually. Questions are based on Module 2 readings.

Deliverables:

1) This document with the answers entered in the table or space of each question.
2) The document (or documents) with the details of how the solutions were obtained (do not work the problems out in this document but in a separate one). Submit Excel spreadsheet(s) if Excel was used for the calculations.

Name your files and. If there is more than one worksheet, simply add a number to the name. For example:

   Smith_John_Module2_Assignment.docx
   Smith_John_Module2_Worksheet1.docx
   Smith_John_Module2_Worksheet2.xlsx

Create a folder, name it Module 2 Assignment, and place your files in it. Compress it and drop the resulting zipped folder into the Dropbox.

Make sure you write your full name, Panther ID, date, and class section in the first lines of each document, in that order:

   Student Name: __________
   Panther ID: __________
   Date: __________
   Section: __________

Grading Rubric

Question 1 is worth 2 points and questions 2 to 5 are worth 1 point each. Each question will be graded based on correctness, completion, and organization.
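
The proximity measures in questions 2 and 3 each reduce to a few lines of arithmetic. The following is a minimal Python sketch, not the official solution. The function and variable names are illustrative, and the Minkowski order h = 3 is an assumption (the assignment text does not state an order).

```python
import math

def minkowski(a, b, h):
    """Minkowski distance of order h; h=1 is Manhattan, h=2 is Euclidean."""
    return sum(abs(p - q) ** h for p, q in zip(a, b)) ** (1.0 / h)

def supremum(a, b):
    """Supremum (Chebyshev) distance: the largest coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

# Question 2: the two tuples from the assignment text.
t1, t2 = (22, 1, 42, 10), (20, 0, 36, 8)
distances = {
    "euclidean": round(minkowski(t1, t2, 2), 2),
    "manhattan": round(minkowski(t1, t2, 1), 2),
    "minkowski_h3": round(minkowski(t1, t2, 3), 2),  # h = 3 is an assumption
    "supremum": round(supremum(t1, t2), 2),
}

# Question 3: rank the database points by cosine similarity to the query.
query = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
similarities = {name: cosine_similarity(query, p) for name, p in points.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)

print(distances)
print(ranking)
```

Under these assumptions the ranking comes out as x1, x3, x4, x2, x5. Cosine similarity compares direction only, so it can order points differently from a distance measure.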

Rajeswari answered on Sep 22 2021
Sheet1
Summary statistics (age data)

   Statistic    Value
   Mean         29.963
   Mode         25, 35
   Midrange     41.5
   Minimum      13
   Q1           20.5
   Median       25
   Q3           35
   Maximum      70

Bins (bin depth 3)

   Bin values        Mean
   13    15    16    14.6666666667
   16    19    20    18.3333333333
   20    21    22    21
   22    25    25    24
   25    25    30    26.6666666667
   33    33    35    33.6666666667
   35    35    35    35
   36    40    45    40.3333333333
   46    52    70    56

Distance (Euclidean) from each point to the query x = (1.4, 1.6)

         A1     A2     Distance
   x1    1.5    1.7    0.1414213562
   x2    2      1.9    0.6708203932
   x3    1.6    1.8    0.2828427125
   x4    1.2    1.5    0.2236067977
   x5    1.5    1      0.608276253
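
The summary statistics for the age data, and the normalizations of question 5 that depend on them, can be reproduced with Python's standard library. This is a sketch under stated assumptions: the quartiles use the "inclusive" interpolation method, which yields Q1 = 20.5 as in the sheet (other quartile conventions can give slightly different values), and the helper names are illustrative. The standard deviation 12.94 is the value given in the assignment text.

```python
import statistics

ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# Question 1: central tendency and five-number summary.
mean = statistics.mean(ages)
modes = statistics.multimode(ages)           # the data is bimodal
midrange = (min(ages) + max(ages)) / 2
# "inclusive" interpolation is an assumption about the quartile convention.
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

def min_max(v, lo, hi, new_lo=0.0, new_hi=1.0):
    """Min-max normalization of v from [lo, hi] onto [new_lo, new_hi]."""
    return (v - lo) / (hi - lo) * (new_hi - new_lo) + new_lo

def z_score(v, mu, sigma):
    """Z-score normalization: distance from the mean in standard deviations."""
    return (v - mu) / sigma

def decimal_scaling(v, max_abs):
    """Divide by the smallest power of 10 that brings max_abs below 1."""
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return v / 10 ** j

# Question 5: normalize the value 35.
normalized = {
    "min-max": round(min_max(35, min(ages), max(ages)), 3),
    "z-score": round(z_score(35, mean, 12.94), 3),   # 12.94 given in the assignment
    "decimal scaling": round(decimal_scaling(35, max(abs(a) for a in ages)), 3),
}

print(round(mean, 2), modes, midrange, min(ages), q1, median, q3, max(ages))
print(normalized)
```

With the 27 values above, the sum is 809, so the mean is 809 / 27, or about 29.96.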
(These are copied from the Mean column of the top-right table.)
        Smoothed by bin...
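
Smoothing by bin means with equal-depth bins, as asked for in question 4, can be sketched as follows. The function name is illustrative, and the sketch assumes every value in a bin is replaced by that bin's mean rounded to two decimal places.

```python
def smooth_by_bin_means(values, depth):
    """Partition sorted values into equal-depth bins and replace each
    value with the mean of its bin (rounded to two decimal places)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        bin_mean = round(sum(bin_values) / len(bin_values), 2)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed

ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

smoothed = smooth_by_bin_means(ages, depth=3)

# Print one line per bin: original values and their smoothed replacements.
for start in range(0, len(ages), 3):
    print(ages[start:start + 3], "->", smoothed[start:start + 3])
```

The first bin (13, 15, 16) becomes 14.67, 14.67, 14.67, and the last bin (46, 52, 70) becomes 56.0, 56.0, 56.0.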