Task



For this assessment, you are required to use Weka 3.6 (or a later version), which you will then use throughout the duration of this subject.






Task 1: Weka data exploration [5 marks]


In the Weka workbench (the Workbench option in version 3.8, or the Explorer option in older versions), load the diabetes.arff dataset to answer the following questions.






(a) How many instances and attributes (including the class attribute) does this dataset have? [1 mark]






(b) How many classes are present in the dataset, and how many instances are there for each class? [2 marks]






(c) Use a histogram (with default settings) to show which age group has the highest number of samples. [2 marks]
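The counts asked for in (a) and (b) can be cross-checked outside Weka by reading the .arff file as text. The following is a minimal sketch, not part of the assessment: it assumes a simple dense ARFF file with comma-separated rows, no quoted commas, unquoted attribute names, and the class as the last attribute. The function name summarize_arff and the tiny sample text are hypothetical; the sample is not the real diabetes.arff.

```python
from collections import Counter

# Tiny hypothetical ARFF text for illustration only; NOT the real diabetes.arff.
SAMPLE_ARFF = """\
@relation toy
@attribute age numeric
@attribute mass numeric
@attribute class {tested_negative,tested_positive}
@data
21,30.1,tested_negative
45,28.4,tested_positive
23,25.0,tested_negative
% a comment line inside the data section
33,31.2,tested_negative
"""


def summarize_arff(text):
    """Return (n_instances, attribute_names, class_counts) for dense ARFF text.

    Assumes the class is the last attribute and that values contain no commas.
    """
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>": the second token is the name
            attributes.append(line.split(None, 2)[1])
        elif lower.startswith("@data"):
            in_data = True
    class_counts = Counter(row[-1] for row in rows)
    return len(rows), attributes, class_counts


n_instances, attributes, class_counts = summarize_arff(SAMPLE_ARFF)
print(n_instances, len(attributes), dict(class_counts))
```

To apply it to a real file, read the file's contents into a string and pass that string to summarize_arff; the results should agree with what the Weka Preprocess panel reports.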






Task 2: Working with a new data file in Weka [5 marks]


This task requires you to make your own .arff file for the following dataset and explore its features.






a) Open the iris.arff file from the ~/weka/data/ folder in a text editor, remove the ‘petal_width’ attribute, and save the result as iris.3D.arff. Please make sure that the Attribute-Relation File Format (.arff) is correctly preserved. [4 marks]








Hints:

You may use any of the *.arff files as a template for this conversion. Several *.arff files can be found in the ~/weka/data/ folder of your distribution.
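Removing an attribute means deleting both its @attribute declaration and the matching column in every data row; if only one of the two is removed, the file no longer loads. The sketch below illustrates that idea under stated assumptions: dense comma-separated rows, no quoted commas, and an attribute name spelled exactly as in the header (check the spelling in your own copy of iris.arff, which may differ from ‘petal_width’). The function name drop_attribute and the two-row sample are hypothetical and for illustration only; the task itself asks for the edit to be made in a text editor.

```python
# Two-row illustrative sample in ARFF layout; not the full iris.arff.
SAMPLE_IRIS = """\
@relation iris
@attribute sepal_length numeric
@attribute sepal_width numeric
@attribute petal_length numeric
@attribute petal_width numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""


def drop_attribute(arff_text, name):
    """Return ARFF text without the named attribute and its data column."""
    out = []
    seen = 0            # number of @attribute lines read so far
    drop_index = None   # column position of the attribute to remove
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if not line or line.startswith("%"):
            out.append(raw)  # keep blank lines and comments unchanged
        elif in_data:
            values = line.split(",")
            del values[drop_index]  # remove the matching column
            out.append(",".join(values))
        elif lower.startswith("@attribute"):
            attr_name = line.split(None, 2)[1].strip("'\"")
            if attr_name == name:
                drop_index = seen  # remember the column, skip the declaration
            else:
                out.append(raw)
            seen += 1
        elif lower.startswith("@data"):
            if drop_index is None:
                raise ValueError(f"attribute {name!r} not found in header")
            in_data = True
            out.append(raw)
        else:
            out.append(raw)  # @relation and any other header lines
    return "\n".join(out) + "\n"


print(drop_attribute(SAMPLE_IRIS, "petal_width"))
```

The output keeps the @relation, the remaining @attribute lines, and @data in their original order, which is what Weka needs in order to load the file.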






b) Load this file in the workbench and include a screenshot of the histograms (with default settings) for each attribute in this dataset. [1 mark]








Hints:

After loading the file in the workbench, you may use the ‘Visualize All’ button to generate histograms for each attribute.






Task 3: Visual analysis [5 marks]






a) Load the file (iris.3D.arff) that you created in the previous task in the workbench and use the ‘Visualize’ menu option to generate scatter plots showing the data distribution for each pair of attributes in a two-dimensional visualisation. [2 marks]






b) Visually compare the plots for (sepal_length, sepal_width) and (sepal_length, petal_length) and comment on which of them shows better class separability in this dataset. Justify your answer with screenshots. [3 marks]






Rationale



This assessment task will assess the following learning outcome/s:



  • be able to identify and analyse business requirements for the identification of patterns and trends in data sets.

  • be able to appraise the different approaches and categories of data mining problems.

  • be able to compare and evaluate output patterns.

  • be able to explore and critically analyse data sets and evaluate their data quality, integrity and security requirements.

  • be able to compare and evaluate appropriate techniques for detecting and evaluating patterns in a given data set.

  • be able to identify and evaluate the security, privacy and ethical implications in data mining.


Marking criteria and standards


Grade bands: HD (>=85%), DI (>=75%), CR (>=65%), PS (>=50%), FL.

Task 1: Weka data exploration [5 marks]

  • HD: The answers are correct and complete, demonstrating thorough and comprehensive understanding of the specified dataset and the usage of WEKA, and insightful observations.

  • DI: The answers are correct and complete, demonstrating good understanding of the specified dataset and the usage of WEKA, and insightful observations.

  • CR: The answers are correct and complete, demonstrating understanding of the specified dataset and the usage of WEKA and some insightful observations.

  • PS: The answers are correct, demonstrating understanding of the specified dataset and the usage of WEKA and some observations.

  • FL: Answers are incorrect/incomplete or partially complete.

Task 2: Making a data file for Weka [5 marks]

  • HD: The file is correctly formatted and can be loaded in WEKA and included comprehensive demonstration of visual analysis.

  • DI: The file is correctly formatted and can be loaded in WEKA and included a good demonstration of visual analysis.

  • CR: The file is correctly formatted and can be loaded in WEKA and included some demonstration of visual analysis.

  • PS: The file is correctly formatted and can be loaded in WEKA and included visual analysis.

  • FL: The file is incorrectly formatted and cannot be loaded in WEKA.

Task 3: Visual Analysis [5 marks]

  • HD: The answers demonstrated comprehensive visual analysis and insightful observations in the specified dataset.

  • DI: The answers demonstrated good visual analysis and insightful observations in the specified dataset.

  • CR: The answers demonstrated some visual analysis and insightful observations in the specified dataset.

  • PS: The answers demonstrated minimal visual analysis and observations in the specified dataset.

  • FL: Answers are incorrect/incomplete or partially complete.







Presentation




You are advised to write your answers in a Word document and submit it in either Word format (.doc or .docx) or PDF format.



All diagrams that are required should be inserted into the document in appropriate positions with descriptive titles. Your answers to the questions should be precise but complete and informative.



Answered Same Day | Aug 23, 2021 | ITC516 | Charles Sturt University


Pritam answered on Aug 24 2021
Task-1:
a) There are 9 attributes and 768 instances.
b) The dataset has two classes, tested_positive and tested_negative. There are 268 instances of the positive class and 500 instances of the negative class.
c) The histogram of the age attribute was generated in the workbench and is attached below.
From the histogram, the age group from 21 to approximately 26 has the highest number of samples, 267. The distribution of the age attribute is positively skewed, with most of the data contained by the...