Microsoft Word - T2 2021 BISY3001 A4 Briefing - Block.docx Unit Assessment Type Group Assignment Assessment Number A4 Assessment Name Data Mining & BI Report Weighting 25% Alignment with Unit and...

The deliverables are a report and a PowerPoint presentation, and the two are related.
We need two KNIME data sets and one Excel data set. An example data set is attached, but please don't use the same one: for Excel, use data similar to the attached Excel data set, and for KNIME, use data similar to the Iris dataset.


T2 2021 BISY3001 A4 Briefing - Block

Unit:
Assessment Type: Group Assignment
Assessment Number: A4
Assessment Name: Data Mining & BI Report
Weighting: 25%
Alignment with Unit and Course: ULO1, ULO2, ULO3, ULO4
Due Date and Time:
• Report (10%): Friday, 20 August 2021, 5 PM via Moodle.
• Presentation and QA Session (15%): Session 12, in last class.

Assessment Description
In this assessment, the students will extend their previous work from assessment A3 Business case understanding. Here, the students have to submit a report of the data mining process on a real-world scenario and a presentation and QA Session will be held based on the report written. The report will consist of the details of every step followed by the students.

Detailed Submission Requirements

Cover Page
• Title
• Group members

Introduction
• Importance of the chosen area
• Why this data set is interesting
• What has been done so far
• Which can be done
• Description of the present experiment

1. Data preparation and Feature extraction:
1.1 Select data
o Task: Select data
1.2 Clean data
o Task: Clean data
o Output: Data cleaning report
1.3 Construct data/ feature extraction
o Task: Construct data
o Output: Derived attributes
o Activities: Derived attributes
o Add new attributes to the accessed data
o Activities: Single-attribute transformations
o Output: Generated records

2 Modeling
2.1 Select modeling technique
o Task: Select Modelling Technique
2.2 Output: Modeling technique
o Record the actual modeling technique that is used.
2.3 Output: Modeling assumption
o Activities: Define any built-in assumptions made by the technique about the data (e.g. quality, format, distribution). Compare these assumptions with those in the Data Description Report. Make sure that these assumptions hold and step back to the Data Preparation Phase if necessary. You can explain the data file here, even when it is pre-prepared.

3 Generate test design
3.1 Task: Generate test design
o Activities: Check existing test designs for each data mining goal separately. Decide on necessary steps (number of iterations, number of folds etc.). Prepare data required for test. (You can use 66% of records for model Building and rest for Testing)
3.2 Build model
o Task: Build model. Run the modeling tool on the prepared dataset to create one or more models. (Using Knime Tool as shown in the lab).
3.3 Output: Parameter settings
o Activities: Set initial parameters. Document reasons for choosing those values.
o Activities: Run the selected technique on the input dataset to produce the model. Post-process data mining results (e.g. editing rules, display trees).
3.4 Output: Model description
o Activities: Describe any characteristics of the current model that may be useful for the future. Give a detailed description of the model and any special features.
o Activities: State conclusions regarding patterns in the data (if any); sometimes the model reveals important facts about the data without a separate Assessment process (e.g. that the output or conclusion is duplicated in one of the inputs).

4 Evaluation and Conclusion
Previous evaluation steps dealt with factors such as the accuracy and generality of the model. This step assesses the degree to which the model meets the business objectives and seeks to determine if there is some business reason why this model is deficient. It compares results with the evaluation criteria defined at the start of the project.
A good way of defining the total outputs of a data mining project is to use the equation:

RESULTS = MODELS + FINDINGS

In this equation we are defining that the total output of the data mining project is not just the models (although they are, of course, important) but also findings which we define as anything (apart from the model) that is important in meeting objectives of the business (or important in leading to new questions, line of approach or side effects (e.g. data quality problems uncovered by the data mining exercise)). Note: although the model is directly connected to the business questions, the findings need not be related to any questions or objective, but are important to the initiator of the project.

~ End of Assessment Details ~

Marking Criteria
Activities: Rank the possible actions. Select one of the possible actions. Document reasons for the choice.

Content                                        Marks
Cover Page, Table of contents                  0.5
Executive Summary                              0.5
Introduction                                   0.5
Data Pre-processing and feature extraction     2.5
Experiment                                     3
Result analysis                                2.5
Conclusion                                     0.5
Presentation and QA                            15

Rubrics
Marking criteria:
• ULO1: Demonstrate broad understanding of data mining and business intelligence and their benefits to business practice
• ULO2: Choose and apply models and key methods for classification, prediction, reduction, exploration, affinity analysis, and customer segmentation that can be applied to data mining as part of a business intelligence strategy
• ULO3: Analyse appropriate models and methods for classification, prediction, reduction, exploration, affinity analysis, and customer segmentation to data mining
• ULO4: Propose a data mining approach using real business cases as part of a business intelligence strategy

Grades:
• HD (21-25 marks): Report, presentation and QA outcome address all the tasks. Report consists of no/minor mistakes.
• D (18-20 marks): Report, presentation and QA outcome address all the tasks. Report consists of a few number of mistakes.
• C (15-17 marks): Report, presentation and QA outcome address most of the contents. Report consists of a few number of mistakes.
• P (13-14 marks): Report, presentation and QA outcome address a few of the contents. Report consists of a good number of mistakes.
• F (0-12.5 marks): Incomplete report. Unable to perform the experiment/data pre-processing/conclude result. Unable to answer to the question of QA Session and unable to present the work that has been done.

Misconduct
• Engaging someone else to write any part of your assessment for you is classified as misconduct.
• To avoid being charged with Misconduct, students need to submit their own work.
• Remember that this is a Turnitin assignment and plagiarism will be subject to severe penalties.
• The AIH misconduct policy and procedure can be read on the AIH website (https://aih.nsw.edu.au/about-us/policies-procedures/).

Late Submission
• Late submission is not permitted, practical submission link will close after 1 hour.

Special consideration
• Students whose ability to submit or attend an assessment item is affected by sickness, misadventure or other circumstances beyond their control, may be eligible for special consideration. No consideration is given when the condition or event is unrelated to the student's performance in a component of the assessment, or when it is considered not to be serious.
• Students applying for special consideration must submit the form within 3 days of the due date of the assessment item or exam.
• The form can be obtained from the AIH website (https://aih.nsw.edu.au/current-students/student-forms/) or on-campus at Reception.
• The request form must be submitted to Student Services. Supporting evidence should be attached. For further information please refer to the Student Assessment Policy and associated Procedure available on (https://aih.nsw.edu.au/about-us/policies-procedures/).

Rainfall and Umbrella

Month   Rainfall (mm)   Umbrellas sold
Jan     82              15
Feb     92.5            25
Mar     83.2            17
Apr     97.7            28
May     131.9           41
Jun     141.3           47
Jul     165.4           50
Aug     140             46
Sep     126.7           37
Oct     97.8            22
Nov     86.2            20
Dec     99.6            30
Jan     87              14
Feb     97.5            27
Mar     88.2            14
Apr     102.7           30
May     123             43
Jun     146.3           49
Jul     160             49
Aug     145             44
Sep     131.7           39
Oct     118             36
Nov     91.2            20
Dec     104.6           32
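
As a quick check of the rainfall and umbrella sample above, the following Python sketch fits an ordinary least-squares line of umbrellas sold against rainfall and computes the Pearson correlation. Treating the sample as a simple linear regression problem is an assumption made for illustration; the brief does not prescribe a technique for the Excel data, and the helper name `fit_line` is hypothetical.

```python
from math import sqrt

# (rainfall in mm, umbrellas sold) for the 24 months in the sample above
data = [
    (82, 15), (92.5, 25), (83.2, 17), (97.7, 28), (131.9, 41), (141.3, 47),
    (165.4, 50), (140, 46), (126.7, 37), (97.8, 22), (86.2, 20), (99.6, 30),
    (87, 14), (97.5, 27), (88.2, 14), (102.7, 30), (123, 43), (146.3, 49),
    (160, 49), (145, 44), (131.7, 39), (118, 36), (91.2, 20), (104.6, 32),
]


def fit_line(points):
    """Return (slope, intercept, r) for the least-squares line y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squared deviations and of cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


slope, intercept, r = fit_line(data)
print(f"umbrellas = {slope:.3f} * rainfall + {intercept:.3f} (r = {r:.3f})")
```

The same fit could be reproduced in Excel with a scatter chart and a linear trendline; the script is only a way to sanity-check the direction and strength of the relationship.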
Answered 9 days after Aug 12, 2021 (BISY3001)


Shreyan answered on Aug 18 2021
BUSINESS CASE UNDERSTANDING

Group Members:
TABLE OF CONTENTS
Introduction
Data preparation and Feature extraction
Data selection
Data cleaning
Data construction/ feature extraction
Modelling
Choice of Model
Modelling assumptions
Test Design
Building the model
Parameter settings
Model description
Evaluation and Conclusion
1. Introduction
Data mining applications are varied and can be incorporated into almost all industries. Adoption has been slow, however, and although it is growing steadily, it will take a significant amount of time to reach its peak. Online retail is one industry where data mining is already used heavily. Companies that provide retail services, such as Amazon and Facebook Marketplace, have invested heavily in data analytics and now run it every day to gain insight into their customers’ preferences.
Data mining has been widely used in industry to predict customers’ preferences (Kim et al., 2012), predict sales of a new product (Cheriyan et al., 2018) or even identify new customers. Many companies find new customers through social media platforms, which use their users’ data to infer individual preferences and target ads accordingly, thereby increasing the probability of a sale. These techniques have proven extremely useful and have brought companies like Facebook to the global forefront.
In this experiment, we take the role of a clothing retailer trying to understand how one of its products is perceived. To do that, we use topic extraction on the reviews left by our customers. The data used is real-world data which was collected and anonymized. We perform several different tasks on it: we identify the different groups of customers, find the topics and terms used in the reviews, and try to predict the rating of the product given the age of the user. Although these are simple experiments, they can help the retailer decide whom to advertise to, what to expect from customers who have already bought the product, and possibly how to change the product to meet customers’ expectations.
2. Data Preparation and Feature Extraction
In the data preparation stage, we select the data, clean the data and construct the data. This is done sequentially and sets the stage for the actual model building. Reliable and clean data is essential for the model to perform well, since faulty or missing data makes predictions unreliable and can cause the accuracy to fall dramatically.
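
The three sequential steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the workflow actually built in KNIME: the records are invented toy rows (not taken from the dataset), the field names and the functions `select`, `clean` and `construct` are hypothetical, and the derived attribute (a word count of the review) is just one example of feature construction.

```python
# Toy records invented for illustration only; they are NOT rows from the real dataset.
raw_records = [
    {"clothing_id": 1, "age": 34, "title": "Love it",
     "review_text": "Soft fabric and a great fit", "rating": 5},
    {"clothing_id": 1, "age": None, "title": "",
     "review_text": "Runs small", "rating": 3},
    {"clothing_id": 2, "age": 51, "title": "Okay",
     "review_text": "", "rating": 4},
    {"clothing_id": 2, "age": 27, "title": "Disappointed",
     "review_text": "Colour faded after one wash", "rating": 2},
]


def select(records, columns):
    """Step 1 (select data): keep only the columns needed for the experiment."""
    return [{c: r[c] for c in columns} for r in records]


def clean(records):
    """Step 2 (clean data): drop records with a missing age or an empty review."""
    return [
        r for r in records
        if r["age"] is not None and r["review_text"].strip()
    ]


def construct(records):
    """Step 3 (construct data): add a derived attribute, the review word count."""
    return [{**r, "review_words": len(r["review_text"].split())} for r in records]


# Run the steps in order; each step consumes the output of the previous one.
selected = select(raw_records, ["age", "review_text", "rating"])
cleaned = clean(selected)
prepared = construct(cleaned)

print(f"{len(raw_records)} raw records -> {len(prepared)} prepared records")
for row in prepared:
    print(row)
```

Keeping each step as its own function mirrors the sequential structure described above and makes it easy to report how many records each step removes, which is the kind of information a data cleaning report needs.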
Data Selection
The data we have gathered is from one of Kaggle’s open-source datasets. The Kaggle dataset in question was provided under the CC0 license and contains real-world data from a women’s e-commerce clothing brand. It has 23,486 rows and 10 feature variables. The feature variables are as follows:
· Clothing ID: Integer Categorical variable that refers to the specific piece being reviewed.
· Age: Positive Integer variable of the reviewer’s age.
· Title: String variable for the...