This work studies the purchase behavior of customers (the dependent variable) across different products using their demographic information as features, most of which are self-explanatory. The dataset contains null values as well as redundant and unstructured data.
Sales prediction is one of the most common applications of machine learning in the retail industry. A predictor of this kind has distinct commercial value to shop owners, as it helps with inventory management, financial planning, advertising and marketing.
The entire process of developing a model includes preprocessing, modelling, training, testing and evaluating. Hence, frameworks will be developed to automate parts of this process and reduce its complexity. The proposed algorithm, a Random Forest regressor, achieved an average accuracy of 83.6% and a minimum RMSE (Root Mean Squared Error) of 2829 on the Black Friday sales dataset.
I. INTRODUCTION
“Black Friday” is the name given to the shopping day after Thanksgiving. The day came to be called “Black Friday” because the large number of shoppers caused traffic accidents and sometimes even violence [1], [2].
Police coined the phrase to describe the disorder surrounding the congestion of pedestrian and automobile traffic in downtown shopping areas. In the retail industry, the number of sales plays an important part in deciding the profit or loss of a company. Predicting sales accurately enables efficient management of the business.
Black Friday is like a carnival sale in the USA. On this day, high-demand products are sold in large volumes at very low prices. To increase sales, a prediction model is built to identify the type of product that sells in the largest numbers. A customer’s behavior must be analyzed in order to predict the amount he or she will spend on a particular day. In this paper, we predict the sales of a company on “Black Friday” [3].
To predict the sales of different products from their independent variables, we need to analyze the relationships between the variables and organize the data well, so that a model can perform its calculations and predict sales accurately.
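The data organization step described above can be illustrated with a minimal sketch. It assumes the pandas library; the column names (Gender, Age, Occupation, Product_Category_2, Purchase) and the toy rows are hypothetical and chosen only for illustration, not taken from the actual dataset. The sketch drops redundant rows, fills null values, encodes the categorical columns and then inspects how each feature correlates with the target.

```python
import pandas as pd

# Toy rows made up for illustration; the column names are assumptions.
raw = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "M"],
    "Age": ["26-35", "18-25", "46-50", "36-45", "26-35"],
    "Occupation": [4, 7, 2, 1, 4],
    "Product_Category_2": [8.0, None, 5.0, 14.0, 8.0],
    "Purchase": [8370, 15200, 1057, 1422, 8370],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop redundant rows, fill nulls, and encode categorical columns."""
    out = df.drop_duplicates().copy()
    # Assumption: a missing secondary category means "no category" (0).
    out["Product_Category_2"] = out["Product_Category_2"].fillna(0).astype(int)
    # One-hot encode the text-valued columns so a model can use them.
    out = pd.get_dummies(out, columns=["Gender", "Age"], dtype=int)
    return out.reset_index(drop=True)


clean = preprocess(raw)

# Correlation of every feature with the target variable.
correlations = clean.corr()["Purchase"].drop("Purchase")
print(correlations.sort_values())
```

Filling the missing category with 0 is only one possible choice; dropping the column or imputing the most frequent value would be equally reasonable depending on the data.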
II. MOTIVATION
Predicting customer behavior is one of the most popular applications of Machine Learning in fields such as finance, sales and marketing. By building such predictive models, we can estimate the impact of decisions on the growth of our organization.
III. OBJECTIVES
Analyzing the data of all the customers and finding the relationship of the independent variables with the target variable
Predicting the expected sales by training and testing a model
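The second objective can be sketched as follows, assuming scikit-learn and NumPy. The data are synthetic and generated only for illustration; the three integer-coded features (gender, age bracket, occupation), the coefficients and the hyperparameters are assumptions, not the configuration used in this work. The sketch holds out 20% of the rows for testing, fits a Random Forest regressor on the rest and reports the RMSE on the held-out rows.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only; feature meanings are assumptions.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(0, 2, n),   # gender code
    rng.integers(0, 7, n),   # age-bracket code
    rng.integers(0, 21, n),  # occupation code
])
y = 5000 + 1500 * X[:, 0] + 300 * X[:, 1] + rng.normal(0, 500, n)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the regressor on the training split only.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# RMSE is the square root of the mean squared prediction error.
pred = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"Test RMSE: {rmse:.1f}")
```

Because RMSE is expressed in the same unit as the target, it can be read directly as a typical error in the predicted purchase amount.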
IV. SYSTEM ARCHITECTURE
A. System Architecture
VI. HARDWARE AND SOFTWARE REQUIREMENTS
A. Software Requirements Specification
Operating system : Windows 10.
Coding Language : Python
Tool : PyCharm, Visual Studio Code
Database : SQLite
B. Hardware Requirements Specification
System : Intel i5, 6 cores.
Hard Disk : 500 GB SSD.
Monitor : 15" LED.
Input Devices : Keyboard, Mouse.
RAM : 32 GB.
VII. APPLICATIONS
Application for sales prediction
Website showing the periods in which product prices vary the most
E-Commerce sites
VIII. ACKNOWLEDGMENT
It gives us great pleasure to present the preliminary project report on ‘Machine Learning Application For Black Friday Sales Prediction Framework’. We would like to take this opportunity to thank our internal guide, Prof. S.B. Nimbekar, and the Head of Department, Dr. R. V. Babar, for giving us all the help and guidance we needed. We are really grateful to them for their kind support. Their valuable suggestions were very helpful.
Conclusion
Machine Learning (ML) can be used for a variety of tasks. This research work presents the use of an ML algorithm to predict the amount that a customer is likely to spend on the next “Black Friday” sale. Exploratory data analysis was performed to find interesting trends in the dataset. This work also considers predicting the product that a customer is more likely to purchase based on the customer’s gender, age and occupation. Experiments show that our method produces more accurate predictions than techniques such as decision trees and ridge regression. A comparison of the various methods is summarized. We conclude that our model, which has the lowest RMSE, performs better than existing models.
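A comparison of this kind can be sketched as below, assuming scikit-learn and NumPy. The data are synthetic, and the model settings are illustrative assumptions rather than the ones used in our experiments, so the ranking printed by this sketch says nothing about the results reported above. Each candidate is scored by 5-fold cross-validated RMSE and the candidates are listed from lowest to highest error.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.integers(0, 10, size=(400, 3)).astype(float)
y = 4000 + 800 * X[:, 0] + 200 * X[:, 1] * X[:, 2] + rng.normal(0, 300, 400)

# Candidate models; hyperparameters are assumptions.
models = {
    "ridge": Ridge(alpha=1.0),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}


def cv_rmse(model) -> float:
    """Mean RMSE over 5 cross-validation folds (lower is better)."""
    scores = cross_val_score(
        model, X, y, cv=5, scoring="neg_root_mean_squared_error"
    )
    # scikit-learn returns negated RMSE so that higher is better; undo that.
    return float(-scores.mean())


results = {name: cv_rmse(model) for name, model in models.items()}
best = min(results, key=results.get)

for name, score in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:>14}: RMSE = {score:.1f}")
print("Lowest RMSE:", best)
```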
References
[1] Beheshti-Kashi, S., Karimi, H.R., Thoben, K.D., Lutjen, M., Teucke, M.: “A survey on retail sales forecasting and prediction in fashion markets,” Systems Science & Control Engineering 3(1), 154-161 (2015)
[2] Smith, Oliver, and Thomas Raymen. “Shopping with violence: Black Friday sales in the British context.” Journal of Consumer Culture 17.3 (2017): 677-694.
[3] Majumder, Goutam. “Analysis and Prediction of Consumer Behaviour on Black Friday Sales.” Journal of the Gujarat Research Society 21.10s (2019): 235-242.
[4] Challagulla, Venkata Udaya B., et al. “Empirical assessment of machine learning based software defect prediction techniques.” International Journal on Artificial Intelligence Tools 17.02 (2008): 389-400.
[5] Chu, C.W., Zhang, G.P.: “A comparative study of linear and nonlinear models for aggregate retail sales forecasting,” International Journal of Production Economics 86(3), 217-231 (2003)
[6] Makridakis, S., Wheelwright, S.C., Hyndman, R.J.: “Forecasting methods and applications,” John Wiley & Sons (2008)
[7] Correia, Alvaro, Robert Peharz, and Cassio P. de Campos. “Joints in Random Forests.” Advances in Neural Information Processing Systems 33 (2020).
[8] Kvalheim, Olav Martin, et al. “Determination of optimum number of components in partial least squares regression from distributions of the root mean squared error obtained by Monte Carlo resampling.” Journal of Chemometrics 32.4 (2018): e2993.
[9] Sheridan, Robert P., et al. “Extreme gradient boosting as a method for quantitative structure–activity relationships.” Journal of Chemical Information and Modeling 56.12 (2016): 2353-2360.
[10] Ngiam, Kee Yuan, and Wei Khor. “Big data and machine learning algorithms for health-care delivery.” The Lancet Oncology 20.5 (2019): e262-e273.
[11] Domingos, P.M.: “A few useful things to know about machine learning,” Communications of the ACM 55(10), 78-87 (2012)
[12] Langley, P., Simon, H.A.: “Applications of machine learning and rule induction,” Communications of the ACM 38(11), 54-64 (1995)
[13] Website URL: https://machinelearningmastery.com/gentle-introduction-xgboost-applied-machine-learning, accessed on 20th Sept, 2020
[14] Xiang Gao, Junhao Wen, Cheng Zhang, “An Improved Random Forest Algorithm for Predicting Employee Turnover,” Mathematical Problems in Engineering, vol. 2019, Article ID 4140707, 12 pages, 2019. https://doi.org/10.1155/2019/4140707
[15] Das, P., Chaudhury, S.: “Prediction of retail sales of footwear using feedforward and recurrent neural networks,” Neural Computing and Applications 16(4-5), 491-502 (2007)
[16] Loh, W.Y.: “Classification and regression trees,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1(1), 14-23 (2011)