Microsoft Word - T2 2021 BISY3001 A4 Briefing - Block.docx Unit Assessment Type Group Assignment Assessment Number A4 Assessment Name Data Mining & BI Report Weighting 25% Alignment with Unit and...

Answered 9 days after Aug 12, 2021, BISY3001


Shreyan answered on Aug 18 2021
BUSINESS CASE UNDERSTANDING

Group Members:
TABLE OF CONTENTS
Introduction
Data preparation and Feature extraction
  Data selection
  Data cleaning
  Data construction/ feature extraction
Modelling
  Choice of Model
  Modelling assumptions
  Test Design
  Building the model
  Parameter settings
  Model description
Evaluation and Conclusion
1. Introduction
Data mining applications are varied and can be incorporated into almost every industry. Adoption has been slow, however, and although it is growing steadily, it will take a significant amount of time to reach its peak. Online retail is one industry where data mining is already used heavily. Companies that provide retail services, such as Amazon and Facebook Marketplace, have invested heavily in data analytics and now run such analyses every day to gain insight into their customers’ preferences.
Data mining has been widely used in the industry to predict customers’ preferences (Kim et al., 2012), predict sales of a new product (Cheriyan et al., 2018) or even identify new customers. Most companies identify new customers through social media platforms, which use their users’ data to infer individual preferences and target ads accordingly, thereby increasing the probability of a sale. These techniques have proven extremely useful and have brought companies like Facebook to the global forefront.
In this experiment, we assume the role of a clothing retailer trying to understand how one of its products is perceived. To do that, we use topic extraction on the reviews left by our customers. The data used is real-world data that was collected and anonymized. We perform several tasks on it: we identify the different groups of customers, find the topics and terms used in the reviews, and try to predict the rating of the product given the age of the user. Although these are simple experiments, they can help the retailer decide whom to advertise to, anticipate what existing buyers expect, and possibly change the product to match customers’ expectations.
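As a rough illustration of the "terms used in the reviews" task, the sketch below counts the most frequent words across a few reviews. It is not the topic-extraction model used in this report: plain term frequency is swapped in as a simpler stand-in, and the sample reviews, the stop-word list and the function name top_terms are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample reviews; the real dataset is not reproduced here.
REVIEWS = [
    "Love this dress, the fabric is soft and the fit is perfect",
    "The fit is too small and the fabric feels cheap",
    "Perfect dress for summer, soft fabric",
]

# A small illustrative stop-word list (an assumption, not from the report).
STOP_WORDS = {"the", "is", "and", "this", "too", "for", "a", "an", "of"}


def top_terms(reviews, n=3):
    """Return the n most frequent non-stop-word terms across all reviews."""
    counts = Counter()
    for review in reviews:
        # Lowercase the text and split it into alphabetic tokens.
        tokens = re.findall(r"[a-z']+", review.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_terms(REVIEWS))
```

A proper topic model would group co-occurring terms into themes; raw counts only show which words dominate, which is still a useful first check on what customers talk about.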
2. Data Preparation and Feature Extraction
In the data preparation stage, we select, clean and construct the data. These steps are done sequentially and set the stage for the actual model building. Reliable and clean data is essential for the model to perform well, since faulty or missing data would make predictions unreliable and cause accuracy to fall dramatically.
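The cleaning step might look something like the minimal sketch below, which drops rows whose age is missing or malformed and normalises blank titles. The sample rows, the cleaning rules and the function name clean_rows are assumptions made for illustration; the report's actual cleaning procedure may differ.

```python
import csv
import io

# Hypothetical raw extract; the values are illustrative, not real data.
RAW_CSV = """Clothing ID,Age,Title
1,33,
2,,Love it
3,60,Runs small
4,abc,Great shirt
"""


def clean_rows(csv_text):
    """Keep rows with a valid positive integer age; blank titles become None."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        age = (row["Age"] or "").strip()
        if not age.isdecimal() or int(age) <= 0:
            continue  # drop rows whose age is missing or malformed
        title = (row["Title"] or "").strip()
        cleaned.append({
            "Clothing ID": int(row["Clothing ID"]),
            "Age": int(age),
            "Title": title or None,
        })
    return cleaned


if __name__ == "__main__":
    for record in clean_rows(RAW_CSV):
        print(record)
```

Dropping rows is the simplest policy; imputing a missing age instead would keep more data at the cost of introducing guessed values.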
Data Selection
The data we have gathered is from one of Kaggle’s open-source datasets. The Kaggle dataset in question was provided under the CC0 license and contains real-world data from a women’s e-commerce clothing brand. It has 23486 rows and 10 feature variables. The feature variables are as follows:
· Clothing ID: Integer Categorical variable that refers to the specific piece being reviewed.
· Age: Positive Integer variable of the reviewer’s age.
· Title: String variable for the...
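To make the difference between a categorical and a numeric variable concrete, the sketch below groups records by Clothing ID and averages Age within each group. The records and the function name mean_age_by_item are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (clothing_id, age) pairs; values are illustrative only.
RECORDS = [(101, 25), (101, 35), (202, 40), (202, 50), (202, 60)]


def mean_age_by_item(records):
    """Group by the categorical Clothing ID and average the numeric Age."""
    ages = defaultdict(list)
    for clothing_id, age in records:
        if age <= 0:
            raise ValueError("Age must be a positive integer")
        ages[clothing_id].append(age)
    return {item: mean(values) for item, values in ages.items()}


if __name__ == "__main__":
    print(mean_age_by_item(RECORDS))
```

Clothing ID is stored as an integer but only identifies an item, so it is used as a grouping key rather than averaged, whereas Age is a true quantity.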