This is a major assignment that needs someone proficient in Tableau, Weka, etc. It must satisfy all the requirements given.
MITS5509 Intelligent Systems for Analytics
Assignment 3

Copyright © 2015-2018 VIT, All Rights Reserved.

NOTE: This document is used in conjunction with MITS5509.

Objective(s)

This assessment item relates to the unit learning outcomes as in the unit descriptor. This assessment is designed to improve student collaborative skills in a team environment and to give students experience in constructing a range of documents as deliverables from different stages of Intelligent Systems for Analytics.

INSTRUCTIONS

Assignment 3: Group Assignment (30%), submission in Week 12

In this assignment students will work in groups of 3-4 students to develop components of the documents discussed in lectures. Student groups should be formed by Session four. Each group needs to complete the group participation form attached to the end of this document. Assignments will not be graded unless a group participation form is completed.

Carefully read the following two questions and provide the appropriate answers.

Question 1:

The bankruptcy-prediction problem can be viewed as a problem of classification. The data set you will be using for this problem includes one ratio that has been computed from the financial statements of real-world firms. Such ratios have been used in studies involving bankruptcy prediction. The first sample (the training set) contains 68 data values for firms that went bankrupt and firms that did not. The second sample (the testing set) of 68 firms also consists of some bankrupt firms and some non-bankrupt firms. Your goal is to use different classifiers to build a model from 40 randomly selected training data points (20 points from category 1 and 20 points from category 0), and then to test its performance on 40 data points randomly selected from the testing set. (Try to analyze the new cases yourself manually before you run the neural network and see how well you do.)
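The random selection described above (an equal number of points from each category) can be sketched in plain Python. This is only an illustration, not a prescribed procedure: the function name `stratified_sample`, the `(firm, WC, category)` tuple layout, and the fixed seed are assumptions made for the example, and the demo draws 2 points per category from a handful of rows copied from the training table rather than 20 per category from all 68.

```python
import random
from collections import defaultdict


def stratified_sample(records, per_class, seed=None):
    """Randomly pick `per_class` records from each category.

    `records` is a list of (firm, wc, category) tuples (assumed layout).
    A seed can be passed so the selection is reproducible in a report.
    """
    rng = random.Random(seed)

    # Group the records by their category label (third field).
    by_category = defaultdict(list)
    for record in records:
        by_category[record[2]].append(record)

    sample = []
    for category in sorted(by_category):
        group = by_category[category]
        if len(group) < per_class:
            raise ValueError(
                f"category {category} has only {len(group)} records"
            )
        # random.sample draws without replacement.
        sample.extend(rng.sample(group, per_class))
    return sample


# A few rows copied from the training table; the assignment uses all 68
# rows with per_class=20.
rows = [
    (1, 309.577, 1), (2, 363.79, 1), (3, 341.399, 1), (4, 363.616, 1),
    (35, 189.826, 0), (36, 651.65, 0), (37, 487.494, 0), (38, 254.899, 0),
]

picked = stratified_sample(rows, per_class=2, seed=42)
for firm, wc, category in picked:
    print(firm, wc, category)
```

Recording the seed alongside the screenshots makes it possible to reproduce the same 40 points later; most of the listed tools offer an equivalent random-seed setting.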
Students must use the following classifiers. The number of classifiers depends on the number of group members; e.g. if the group has four members, they will use four of the following five classifiers.

1. Neural network
2. Support vector machine
3. Nearest neighbor algorithm
4. Decision tree
5. Naive Bayes

The following tables show the training sample and test data you should use for this major assignment.

Training Sample Data

Firm    WC          Category
1       309.577     1
2       363.79      1
3       341.399     1
4       363.616     1
5       323.673     1
6       323.353     1
7       350.371     1
8       240.602     1
9       220.057     1
10      287.837     1
11      274.6       1
12      278.494     1
13      234.267     1
14      284.923     1
15      190.62      1
16      327.76      1
17      211.94      1
18      373.571     1
19      219.891     1
20      193.489     1
21      204.333     1
22      205.657     1
23      362.361     1
24      285.562     1
25      352.649     1
26      400.44      1
27      307.301     1
28      240.314     1
29      322.995     1
30      408.197     1
31      209.027     1
32      198.979     1
33      340.418     1
34      320.154     1
35      189.826     0
36      651.65      0
37      487.494     0
38      254.899     0
39      575.646     0
40      160.712     0
41      269.729     0
42      513.301     0
43      1996.866    0
44      683.512     0
45      377.246     0
46      289.579     0
47      171.851     0
48      205.39      0
49      203.593     0
50      365.159     0
51      266.962     0
52      461.943     0
53      215.392     0
54      235.794     0
55      881.477     0
56      463.897     0
57      475.693     0
58      540.01      0
59      612.817     0
60      140.277     0
61      396.541     0
62      271.185     0
63      507.039     0
64      733.641     0
65      612.455     0
66      499.495     0
67      290.715     0
68      171.447     0
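To make one of the listed classifiers concrete, here is a minimal nearest-neighbor sketch on the single WC feature, together with a confusion-matrix count for evaluating it. It is an illustration under stated assumptions, not the workflow of Weka or any other tool: the function names `knn_predict` and `confusion_matrix`, the choice of k = 3, and the split of a few training-table rows into "fit" and "held-out" sets are all hypothetical choices made for the example.

```python
from collections import Counter


def knn_predict(train, x, k=3):
    """Classify value x by majority vote among its k nearest training values.

    `train` is a list of (wc, category) pairs; distance is |wc - x|.
    Use an odd k so a two-category vote cannot tie.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(category for _, category in nearest)
    return votes.most_common(1)[0][0]


def confusion_matrix(actual, predicted):
    """Return counts keyed by (actual, predicted) for categories 0 and 1."""
    counts = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


# Rows copied from the training table (firms 1-6 and 35-40) used for fitting.
fit_rows = [
    (309.577, 1), (363.79, 1), (341.399, 1),
    (363.616, 1), (323.673, 1), (323.353, 1),
    (189.826, 0), (651.65, 0), (487.494, 0),
    (254.899, 0), (575.646, 0), (160.712, 0),
]

# Rows copied from the training table (firms 7, 8, 41, 42) held out
# to check the predictions; the choice of rows is arbitrary.
held_out = [(350.371, 1), (240.602, 1), (269.729, 0), (513.301, 0)]

actual = [category for _, category in held_out]
predicted = [knn_predict(fit_rows, wc, k=3) for wc, _ in held_out]

matrix = confusion_matrix(actual, predicted)
correct = matrix[(0, 0)] + matrix[(1, 1)]
print("confusion matrix (actual, predicted):", matrix)
print("accuracy:", correct / len(held_out))
```

With only one feature, distance reduces to the absolute difference in WC, so the outlying values in category 0 (for example firm 43) have little influence on predictions for typical firms. The same confusion-matrix counts can be read off the output of whichever tool the group chooses.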
Testing Sample Data

Firm    WC
1       367.325
2       347.513
3       330.226
4       178.106
5       378.899
6       257.212
7       333.088
8       182.324
9       238.099
10      329.643
11      294.644
12      281.666
13      308.086
14      317.079
15      245.139
16      354.662
17      292.256
18      306.79
19      222.396
20      367.628
21      342.115
22      353.326
23      336.39
24      298.008
25      266.396
26      243.554
27      172.184
28      362.479
29      249.981
30      327.877
31      286.696
32      182.762
33      338.347
34      302.57
35      299.651
36      247.595
37      339.311
38      366.139
39      398.295
40      205.129
41      371.419
42      175.406
43      476.159
44      359.144
45      315.97
46      329.629
47      399.552
48      442.799
49      255.405
50      408.036
51      497.195
52      249.674
53      292.026
54      481.193
55      394.76
56      273.175
57      311.517
58      238.067
59      292.459
60      2010.227
61      637.604
62      379.869
63      268.318
64      416.08
65      377.011
66      355.757
67      319.223
68      240.423

From the above data set, the group must prepare a report that includes the following:

1. Explain the process of building each classifier using the training set (add screenshots).
2. Explain how you evaluated the classifier.
3. Create the confusion matrix based on a 70% (training) / 30% (testing) split.
4. Predict the category of any 40 randomly chosen values from the testing set table.
5. Compare the results of the different classifiers and discuss which one is the best and why.

Note: Students can use any open-source or free data mining software, such as Statistica Data Miner, Weka, RapidMiner, KNIME and MATLAB.

Question 2:

Create a DASHBOARD. For creating a dashboard, the group can use the above database or any other database. The group must prepare a report that includes the following:

1. Write an introduction about the dataset used and add the reference (link).
2. Create at least four figures (different graphs) and add them to the dashboard.
3. Add a screenshot of each of the steps.
4. Describe the figures in the dashboard.
Students can use any software to create the dashboard, such as Microsoft Excel, Tableau, etc.

The above list of documents is not necessarily in any order. The chronological order in which we cover these topics in lectures is not meant to dictate the order in which you collate them into one coherent document for your assignment.

Your report must include:

- A title page with the title of the assignment and the names and ID numbers of all group members.
- A contents page showing page numbers and titles of all major sections of the report.
- Captions and figure numbers for all figures, each of which must be referenced within the document. Captions for figures are placed below the figure; captions for tables are placed above the table.
- A footer with the page number.
- 1.5 line spacing with a 12 point Times New Roman font.
- References where appropriate. Citation of sources (if any are used) is mandatory and must be in the Harvard style.

Only one submission is to be made per group. The group should select a member to submit the assignment by the due date and time. All members of the group will receive the same grade unless a special arrangement is made due to group conflicts. Any conflict should be resolved by the group; failing that, please contact your lecturer, who will then resolve any issues, which may involve specific assignment of work tasks or removal of group members.

What to Submit

All submissions are to be submitted through turn-it-in. Drop-boxes linked to turn-it-in will be set up in the Unit of Study Moodle account. Assignments not submitted through these drop-boxes will not be considered. Submissions must be made by the due date and time (which will be in the session detailed above), as determined by your Unit coordinator. Submissions made after the due date and time will be penalized at the rate of 10% per day (including weekend days). The turn-it-in similarity score will be used in determining the level, if any, of plagiarism.
Turn-it-in will check conference websites, journal articles, the Web and your own class members' submissions for plagiarism. You can see your turn-it-in similarity score when you submit your assignment to the appropriate drop-box. If this is a concern, you will have a chance to change your assignment and re-submit. However, re-submission is only allowed prior to the submission due date and time. After the due date and time have elapsed, you cannot make re-submissions and you will have to live with the similarity score, as there will be no chance to change it. Thus, plan early and submit early to take advantage of this feature. You can make multiple submissions, but please remember that we only see the last submission.