DESCRIPTION

One of the leading retail stores in the US, Walmart, would like to predict sales and demand accurately. Certain events and holidays affect sales on each day. Sales data are available for 45 Walmart stores. The business faces a challenge from unforeseen demand and sometimes runs out of stock because its machine learning algorithm is inappropriate. An ideal ML algorithm will predict demand accurately and ingest factors such as economic conditions, including CPI, the Unemployment Index, etc.

Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.

Dataset Description

This is the historical data, covering sales from 2010-02-05 to 2012-11-01, in the file Walmart_Store_sales. The file contains the following fields:

  • Store - the store number
  • Date - the week of sales
  • Weekly_Sales - sales for the given store
  • Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week)
  • Temperature - temperature on the day of sale
  • Fuel_Price - cost of fuel in the region
  • CPI - prevailing consumer price index
  • Unemployment - prevailing unemployment rate

Holiday Events

Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
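
One way to use these dates is to turn them into a lookup table, label each week with its holiday, and compare each holiday's mean sales against the non-holiday mean. The following is a minimal base-R sketch under stated assumptions: the holiday dates are copied from the list above, while the `toy` data frame, its sales figures, and the helper `label_holiday` are hypothetical and only illustrate the mechanics.

```r
# Holiday dates copied from the "Holiday Events" list; everything else
# in this sketch (toy data, helper name) is an illustrative assumption.
holidays <- data.frame(
  Holiday = rep(c("Super Bowl", "Labour Day", "Thanksgiving", "Christmas"),
                each = 4),
  Date = as.Date(c(
    "2010-02-12", "2011-02-11", "2012-02-10", "2013-02-08",
    "2010-09-10", "2011-09-09", "2012-09-07", "2013-09-06",
    "2010-11-26", "2011-11-25", "2012-11-23", "2013-11-29",
    "2010-12-31", "2011-12-30", "2012-12-28", "2013-12-27"
  ))
)

# Hypothetical helper: holiday name for each date, NA for ordinary weeks.
label_holiday <- function(dates) {
  holidays$Holiday[match(format(as.Date(dates)), format(holidays$Date))]
}

# Made-up weekly sales, not values from the Walmart file.
toy <- data.frame(
  Date = as.Date(c("2010-02-05", "2010-02-12", "2010-09-10",
                   "2010-11-26", "2010-12-31")),
  Weekly_Sales = c(100, 120, 90, 150, 80)
)
toy$Holiday <- label_holiday(toy$Date)

# Mean sales over non-holiday weeks, then mean sales per holiday.
non_holiday_mean <- mean(toy$Weekly_Sales[is.na(toy$Holiday)])
holiday_means <- tapply(toy$Weekly_Sales, toy$Holiday, mean)

# Holidays whose mean sales exceed the non-holiday mean.
above <- names(holiday_means)[holiday_means > non_holiday_mean]
print(above)
```

With real data, `toy` would be replaced by the loaded sales table, and the comparison would run over all stores together, as the task asks.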

Analysis Tasks

Basic Statistics tasks

  • Which store has maximum sales?
  • Which store has the maximum standard deviation, i.e., the sales vary a lot? Also, find the coefficient of mean to standard deviation.
  • Which store/s have a good quarterly growth rate in Q3’2012?
  • Some holidays have a negative impact on sales. Find the holidays which have higher sales than the mean sales in the non-holiday season for all stores together.
  • Provide a monthly and semester view of sales in units and give insights.

Statistical Model

For Store 1 – Build prediction models to forecast demand

  • Linear Regression – Utilize variables like date and restructure dates as 1 for 5 Feb 2010 (starting from the earliest date in order). Hypothesize if CPI, unemployment, and fuel price have any impact on sales.
  • Change dates into days by creating a new variable.

Select the model which gives the best accuracy.
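
The date restructuring and regression described above could be sketched as follows. This is a minimal illustration, not a reference solution: the `store1` data frame is synthetic (generated with `rnorm`), and the column names `Week_Index` and `Days` are assumptions introduced here. The coefficient table from `summary()` includes p-values, which is one way to examine whether CPI, unemployment, and fuel price appear to affect sales.

```r
# Synthetic stand-in for Store 1; none of these values come from the
# Walmart file. Column names mirror the dataset description.
set.seed(1)
n <- 20
store1 <- data.frame(
  Date = seq(as.Date("2010-02-05"), by = "week", length.out = n),
  CPI = 211 + rnorm(n, sd = 0.5),
  Unemployment = 8 + rnorm(n, sd = 0.1),
  Fuel_Price = 2.7 + rnorm(n, sd = 0.1)
)
store1$Weekly_Sales <- 1.5e6 + 2000 * seq_len(n) + rnorm(n, sd = 1e4)

# Order by date so the earliest week comes first.
store1 <- store1[order(store1$Date), ]

# New variable: days elapsed since the earliest date (0, 7, 14, ...).
store1$Days <- as.integer(store1$Date) - as.integer(min(store1$Date))

# Restructured date: 1 for 5 Feb 2010, 2 for the following week, and so on.
store1$Week_Index <- store1$Days %/% 7L + 1L

# Linear regression of weekly sales on the date index and economic factors.
fit <- lm(Weekly_Sales ~ Week_Index + CPI + Unemployment + Fuel_Price,
          data = store1)

# Estimates, standard errors, t values, and p-values per predictor.
print(summary(fit)$coefficients)
```

To compare candidate models, the same data can be refit with different predictor sets and the fits compared on a held-out portion of the weeks.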


Answered 2 days after Mar 08, 2023


Mohd answered on Mar 11 2023
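# Load the weekly sales data; every column is numeric except Date.
library(readxl)
walmartstoresales <- read_excel("walmartstoresales.xlsx",
    col_types = c("numeric", "date", "numeric",
                  "numeric", "numeric", "numeric",
                  "numeric", "numeric"))
View(walmartstoresales)  # opens an interactive viewer; can be skipped in scripts
# Drop rows with missing values before summarising.
walmartstoresales <- na.omit(walmartstoresales)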
Which store has maximum sales
library(magrittr)
library(dplyr)
walmartstoresales %>%
    group_by(Store) %>%
    summarise(Total_Sales = sum(Weekly_Sales)) %>%
    arrange(desc(Total_Sales)) %>%
    head()
## # A tibble: 6 × 2
##   Store Total_Sales
##   <dbl>       <dbl>
## 1    20  123654116.
## 2     4  119047759.
## 3    14  118285800.
## 4    13  113895601.
## 5     2  111261408.
## 6    10  108599830.
Which store has maximum standard deviation i.e., the sales vary a lot. Also, find out the coefficient of mean to standard deviation
walmartstoresales %>%
    group_by(Store) %>%
    summarise(Total_Sales = sum(Weekly_Sales),
              stddev = sd(Weekly_Sales),
              # ratio of mean to standard deviation, as the task words it
              coeff_mean_var = mean(Weekly_Sales) / sd(Weekly_Sales)) %>%
    arrange(desc(stddev)) %>%
    head()
## # A tibble: 6 × 4
##   Store Total_Sales  stddev coeff_mean_var
##   <dbl>       <dbl>   <dbl>          <dbl>
## 1    14  118285800. 224426.           9.25
## 2    27  101618599. 180950.           9.85
## 3    10  108599830. 178732.          10.7
## 4    20  123654116. 167719.          12.9
## 5    35   52739752. 165835.           5.58
## 6    23   79390180. 157337.           8.85
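
Note that the last column is the ratio of mean to standard deviation, following the task's wording. The more conventional coefficient of variation is the inverse, standard deviation divided by mean, often shown as a percentage. A minimal base-R sketch of both, using a made-up `toy` data frame (the stores and sales values are assumptions, not Walmart data):

```r
# Two hypothetical stores with made-up weekly sales.
toy <- data.frame(
  Store = c(1, 1, 1, 2, 2, 2),
  Weekly_Sales = c(90, 100, 110, 50, 100, 150)
)

# Per-store mean and standard deviation.
means <- tapply(toy$Weekly_Sales, toy$Store, mean)
sds   <- tapply(toy$Weekly_Sales, toy$Store, sd)

# Ratio used in the answer (mean / sd): higher means steadier sales.
mean_to_sd <- means / sds

# Conventional coefficient of variation (sd / mean), as a percentage:
# higher means more variable sales.
cv_percent <- 100 * sds / means

print(data.frame(Store = names(means), mean_to_sd, cv_percent))
```

The two measures rank stores in opposite directions, so it helps to state which one a report uses.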
Which store/s has good quarterly growth rate in Q3’2012
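# Label each week with its calendar quarter, e.g. "2012/03" for Q3 2012.
walmartstoresales$quarters <- paste(
    format(walmartstoresales$Date, "%Y"),
    sprintf("%02i", (as.POSIXlt(walmartstoresales$Date)$mon) %/% 3L + 1L),
    sep = "/")
library(tidyr)
# Keep Q2 and Q3 of 2012. %in% tests membership; == would recycle the
# two-element vector against the rows and silently drop matching weeks.
walmart <- walmartstoresales %>%
    filter(quarters %in% c("2012/02", "2012/03")) %>%
    pivot_wider(names_from = quarters, values_from = Weekly_Sales)
colnames(walmart)[c(8, 9)] <- c("y2012q2", "y2012q3")
walmart_growth_2012Q3 <- walmart %>%
    group_by(Store) %>%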
...
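
The remainder of the answer's code is not shown. As a separate illustration, and not a reconstruction of the elided code, the quarterly growth rate could be computed along these lines in base R. The `toy` data frame and its values are made up; the quarter labels follow the same "YYYY/0Q" form used above.

```r
# Two hypothetical stores with made-up weekly sales in Q2 and Q3 of 2012.
toy <- data.frame(
  Store = rep(c(1, 2), each = 4),
  Date = as.Date(rep(c("2012-05-04", "2012-06-01",
                       "2012-07-06", "2012-08-03"), times = 2)),
  Weekly_Sales = c(100, 100, 110, 110, 200, 200, 180, 180)
)

# Quarter label such as "2012/02" (Q2) or "2012/03" (Q3).
toy$quarters <- paste(
  format(toy$Date, "%Y"),
  sprintf("%02i", as.POSIXlt(toy$Date)$mon %/% 3L + 1L),
  sep = "/"
)

# Total sales per store and quarter, as a Store-by-quarter matrix.
totals <- tapply(toy$Weekly_Sales, list(toy$Store, toy$quarters), sum)

# Growth of Q3 over Q2, in percent, for each store.
growth_q3 <- 100 * (totals[, "2012/03"] - totals[, "2012/02"]) /
  totals[, "2012/02"]

# Stores sorted from strongest to weakest Q3 growth.
print(sort(growth_q3, decreasing = TRUE))
```

Aggregating to one total per store and quarter before taking the ratio avoids the missing values that arise when individual weeks are spread into separate quarter columns.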