The purpose of this study is to compare the VAR, ARIMA and SARIMA methods in order to forecast sales at Store xyz with high accuracy. The study compares the sales forecasts produced by three time series models: Vector Auto Regression (VAR), Autoregressive Integrated Moving Average (ARIMA) and Seasonal Autoregressive Integrated Moving Average (SARIMA). VAR and ARIMA models remain accurate when the time series covers only a short period; they are accurate for short-period forecasting but less accurate for long-period forecasting. SARIMA, meanwhile, is more accurate for forecasting seasonal time series data, whether or not its pattern shows a trend. All three models are compared on data showing seasonal patterns. The data used are the sales of a super mart retail store from 2017 to 2022. The accuracy of each model is measured by comparing the forecast values with the actual values as a percentage. This value is called the Mean Absolute Deviation (MAD). Based on the comparison, the best model, with the smallest MAD value, is the SARIMA model (0,1,0)(0,1,0)12 with a MAD value of 0.122. From the comparison results it can be concluded that the SARIMA model is the optimal model for further forecasting.
I. INTRODUCTION
In the strip mall business, it is commonly known that consumer demand is normally very volatile. Demand differed considerably before and after the COVID pandemic, and a consumer's choice of mart is generally based on price. To cope with this, store management tries to forecast demand and to reduce prices by cutting maintenance costs or by buying goods directly from the manufacturer. Machine learning is an innovative way to forecast sales or demand. It is one of the effective solutions for preparing a complete data set to address different challenging situations in the organization. In a machine learning system, models such as VAR (a multivariate forecasting algorithm), ARIMA (for non-seasonal time series data) and SARIMA (for seasonal or non-seasonal time series data) provide different algorithms whose accuracy for the business can be assessed. Such techniques have been used over the last few decades to maintain an organization's potential. Artificial intelligence and computer algorithms help to create programs for autonomous activities in the organization. Python and Jupyter are the two software tools used here for sales prediction.
The aim of this research paper is to determine the impact of machine learning on sales prediction for the enhancement of business profitability. A sound data preparation process is essential to determine the sales rate within the same year. Machine learning is important for projecting future sales revenues and thereby the mart's profitability. Moreover, this process is needed to develop innovative sales management strategies for future performance.
II. LITERATURE REVIEW
In 2021, DontiReddy et al. [1] discussed machine learning as an effective approach to sales forecasting. They used Jupyter and Python to implement different algorithms for securing business profitability. Models such as GARCH, SARIMA and SARIMAX helped to promote business profitability for management. In 2020, Purvika Bajaj et al. [2] discussed predicting the future sales of Big Mart companies from the sales of previous years. They carried out a comprehensive study of sales prediction using machine learning models such as Linear Regression, K-Neighbors Regressor, XGBoost Regressor and Random Forest Regressor. The prediction uses data parameters such as item weight, item fat content, item visibility, item type, item MRP, outlet establishment year, outlet size and outlet location type. In 2021, Ashutosh Kumar Dubey et al. [3] presented data analytics on collected smart meter measurements and then predicted daily energy consumption using ARIMA, seasonal ARIMA (SARIMA) and LSTM. The results indicate a feasible method for forecasting energy consumption.
In 2021, Prabhat Sharma et al. examined the sales trend of various automobile companies during the difficult period of COVID-19, using various machine learning techniques to predict whether sales would move downward or upward; the project successfully met its aim. In 2020, Rachana Mohite et al. [4] compared market basket analysis using the Apriori algorithm with market basket analysis without the algorithm for creating rules that generate new knowledge; with the help of these concepts, a retailer can easily set up a retail shop and develop the business in future. In 2020, Jiantao Zhao et al. [5] discussed a combination model in which the Prophet and SARIMA models each forecast the sales data and the two forecasts are then combined with weights to obtain the final result; they found the weighted combination of the two single models to be optimal. In 2018, G A N Pongdatu et al. [6] compared the sales forecasts of the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and Holt Winter's Exponential Smoothing method; the resulting model was SARIMA (1,1,0)(0,1,0)12, which gave accurate forecasts.
III. DATA AND ANALYSIS
A. VAR (Vector Auto Regression)
Vector auto regression (VAR) is a multivariate forecasting algorithm that is used when two or more time series influence each other. We set the first estimation period to be 2022:5 and forecast with each VAR recursively, applying 1-, 6- and 12-month-ahead forecasts in each month of the sample 2017:12 through 2021:12. When factors are extracted from the 55 predictor variables, these are estimated using the same sample period as the VAR.
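The recursive multi-step forecasting described above can be sketched for the simplest case, a VAR of order 1 with two series. This is a minimal illustration only: the function name `var1_forecast`, the intercept `c`, the coefficient matrix `A` and the starting values are made-up assumptions, not the coefficients estimated in this study.

```python
def var1_forecast(c, A, y_last, steps):
    """Iterate y_{t+1} = c + A * y_t for `steps` periods (VAR(1) point forecasts)."""
    path = []
    y = list(y_last)
    for _ in range(steps):
        # Each series depends on the previous value of every series.
        y = [c[i] + sum(A[i][j] * y[j] for j in range(len(y)))
             for i in range(len(c))]
        path.append(y)
    return path


# Hypothetical coefficients for two series (e.g. grocery and other sales).
c = [10.0, 5.0]
A = [[0.5, 0.1],
     [0.2, 0.4]]
y_last = [100.0, 50.0]  # last observed values, made up for illustration

path = var1_forecast(c, A, y_last, steps=12)

# Read off the 1-, 6- and 12-month-ahead forecasts from the same recursion.
for h in (1, 6, 12):
    print(h, [round(v, 2) for v in path[h - 1]])
```

Each longer-horizon forecast reuses the forecasts of the earlier horizons as inputs, which is why forecast errors tend to accumulate as the horizon grows.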
B. ARIMA (Autoregressive Integrated Moving Average)
Our dataset contained sales data for grocery and other (non-essential) items, covering over five years and 41,248 order lines. For all products, we used the sales per month.
Pre-processing has to take place in order to convert data to the appropriate format for the ARIMA model. During pre-processing the following steps are taken:
1) Sales data are ordered by date and time.
2) Data are reduced to one-dimensional information, so extra information such as average price and other product attributes is removed.
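The pre-processing steps above, followed by aggregation to sales per month, can be sketched as follows. The record layout and the sample values in `order_lines` are hypothetical; the actual column names of the store's dataset are not given in this paper.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical order lines: (timestamp, product group, average price, sales amount).
order_lines = [
    ("2017-02-03 10:15", "grocery", 2.5, 40.0),
    ("2017-01-20 09:00", "grocery", 2.4, 25.0),
    ("2017-01-05 17:30", "grocery", 2.6, 30.0),
    ("2017-02-11 12:45", "grocery", 2.5, 10.0),
]

# Keep only the timestamp and the sales amount (one-dimensional information),
# then order the records by date and time.
parsed = sorted(
    (datetime.strptime(ts, "%Y-%m-%d %H:%M"), amount)
    for ts, _group, _avg_price, amount in order_lines
)

# Aggregate the ordered records to sales per month.
monthly = defaultdict(float)
for ts, amount in parsed:
    monthly[(ts.year, ts.month)] += amount

monthly_sales = sorted(monthly.items())
print(monthly_sales)
```

The result is a single series indexed by (year, month), which is the shape an ARIMA-type model expects.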
Since the time series plot of the historical data exhibited seasonal variations that show a similar pattern every year, SARIMA was chosen as the appropriate approach for developing a prediction model.
SARIMA formula used for forecasting
The general form of seasonal model SARIMA(p, d, q) (P, D, Q)s is given by:
$$\Phi_P(B^s)\,\phi(B)\,\nabla_s^D \nabla^d x_t = \Theta_Q(B^s)\,\theta(B)\,w_t$$
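The differencing operators in the general form can be illustrated numerically. The sketch below assumes s = 12, d = 1 and D = 1 with no autoregressive or moving-average terms, in which case the model reduces to x_t = x_{t-1} + x_{t-12} - x_{t-13} + w_t. The helper `difference` and the synthetic series are illustrative assumptions, not the study's data.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) to a series: x_t - x_{t-lag}."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Synthetic monthly series (made up): linear trend plus a repeating 12-month pattern.
season = [5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 4, 6]
x = [100 + 2 * t + season[t % 12] for t in range(36)]

seasonal_diff = difference(x, lag=12)    # removes the repeating yearly pattern
both = difference(seasonal_diff, lag=1)  # removes what is left of the trend

# One-step point forecast for a (0,1,0)(0,1,0)12 model with the noise term set to zero.
next_value = x[-1] + x[-12] - x[-13]

print(seasonal_diff[:3], both[:3], next_value)
```

On this synthetic series the doubly differenced values are all zero, which shows the two operators removing seasonality and trend respectively; real sales data would leave a noisy remainder for the AR and MA terms to model.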
IV. RESULT AND DISCUSSION
A. VAR
In this method, the sales for the period January 2017 to March 2022, recorded on a daily scale, were used as the basis. To obtain the maximum exploratory information and to reduce volatility, the data were transformed to a monthly scale. Data from January 2017 to March 2022 are used for in-sample estimation, and data from April 2022 to December 2022 are used for out-of-sample forecasting.
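The in-sample and out-of-sample split described above can be sketched on a monthly index. Only the dates come from the text; the variable names are illustrative.

```python
# Monthly index from January 2017 to December 2022 as (year, month) pairs.
months = [(year, month) for year in range(2017, 2023) for month in range(1, 13)]

split = (2022, 3)  # last in-sample month: March 2022

in_sample = [ym for ym in months if ym <= split]      # used for estimation
out_of_sample = [ym for ym in months if ym > split]   # held out for forecasting

print(len(in_sample), len(out_of_sample))
```

Splitting by date rather than at random keeps the held-out months strictly after the estimation period, so the evaluation mimics real forecasting.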
Figure 5 shows the study variables, grocery and other non-essential products, together with the forecasting result.
Where:
OLS = Ordinary least squares
BIC = Bayesian information criterion
HQIC = Hannan-Quinn information criterion
FPE = Final prediction error
The resulting standard error is 45% for grocery and 41% for other products.
B. ARIMA
Here, predictions are made from the dataset after first converting the data to a stationary series. To make the series stationary, we difference the number of sales so that its mean is stable over time. The final graph plots the best-fit ARIMA model of the number of sales for the following years. The output shows the forecast values of grocery sales in blue, with the expected error displayed as orange lines on either side of the blue forecast line.
Root Mean Squared Error: 35%
C. SARIMA
The final SARIMA model (0,1,0)(1,0,1)12 was used to forecast 55 months ahead; the forecast values are presented in Table 2, and the whole forecasting plot is shown in Fig. 8.
Error Check
The accuracy of the forecast can be evaluated using error measures, obtained by comparing the original data with the forecast values. In this paper, the Mean Absolute Percentage Error (MAPE) was used as the error measure. The MAPE value for the selected model was 12.2%. Thus, the empirical result indicates that the model represents the historical COVID sales dataset accurately.
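As a reference for the error measure, MAPE can be computed as in the minimal sketch below. The function name `mape` and the sample values are illustrative assumptions and are not the study's data.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up monthly sales and forecasts for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [90.0, 220.0, 400.0]

print(round(mape(actual, forecast), 2))
```

Because each error is divided by the actual value, MAPE is scale-free, which makes it convenient for comparing models across product groups with different sales volumes; it is undefined when an actual value is zero.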
V. CONCLUSION
In our case study, we considered different machine learning approaches to time series forecasting. Using the SARIMA algorithm for sales forecasting often gives better results than ARIMA and VAR. In this research paper we use five years of data containing both seasonal and non-seasonal series. The aim of this research paper is to find the optimal method among them. The use of different algorithms and software has greatly changed how effective resource plans are carried out in the organization. A secondary data collection method was used to identify the impact of machine learning. With the SARIMA method we obtained a mean standard error between 12% and 12.5%, whereas with the other models the error was much higher.
The model can hence provide the following benefits to the greenmart company if the results of this research paper are adopted:
1) Accurate sales prediction ahead of an upcoming pandemic, when sales depend on the decisions taken by the government
2) Easier stock maintenance
3) Increased customer satisfaction, because demand and sales are interconnected
References
[1] DontiReddy Sai Rakesh Reddy, Katanguru Shreya Reddy, S. Namrata Ravindra, B. Sai Sahithi (2021) “Prediction and Forecasting of Sales Using Machine Learning Approach”, International Research Journal of Engineering and Technology (IRJET), vol. 8, pp. 377
[2] Purvika Bajaj, Renesa Ray, Shivani Shedge, Shravani Vidhate, Nikhilkumar Shardoor (2020) “Sales Prediction Using Machine Learning Algorithms”, International Research Journal of Engineering and Technology (IRJET), vol. 7, issue 6, pp. 380
[3] Ashutosh Kumar Dubey, Abhishek Kumar, Vicente García-Díaz, Arpit Kumar Sharma, Kishan Kanhaiya (2021) “Study and analysis of SARIMA and LSTM in forecasting time series data”, Elsevier, vol. 47
[4] Rachana Mohite, Kevin Shah, Gaurav Kolhe, Madhura Mokashi, Prajakta Rokade (2020) “Sales Prediction of Market using Machine Learning”, International Journal of Engineering Research & Technology (IJERT), vol. 9
[5] Jiantao Zhao, Chunwei Zhang (2020) “Research on Sales Forecast Based on Prophet-SARIMA Combination Model”, Journal of Physics: Conference Series, doi:10.1088/1742-6596/1616/1/012069
[6] G A N Pongdatu and Y H Putra (2018) “Seasonal Time Series Forecasting using SARIMA and Holt Winter’s Exponential Smoothing”, IOP Conference Series: Materials Science and Engineering, doi:10.1088/1757-899X/407/1/012153
[7] M'Amanja, Daniel; Lloyd, Tim; Morrissey, Oliver (2005) “Fiscal aggregates, aid and growth in Kenya: A vector autoregressive (VAR) analysis”, The University of Nottingham, Centre for Research in Economic Development and International Trade (CREDIT), vol. 05.07