Predictive analytics is a term mainly used in statistical and analytics techniques to detect the relationships and patterns in data in order to predict the future by analyzing the past and taking better preventive decisions. This term is drawn from statistics, machine learning, database techniques and optimization techniques. The most benefits of using predictive analytics are it reduces and prevent risk, save time, cost and better management of resources in addition to the ability to take better strategic decisions based on fact not on intuition. This review paper discusses the overview of the researches done in this area. The benefit of implementing this technique would be that making business orientated decisions in a respective domain of data.
Introduction
I. INTRODUCTION
Prediction is booming. It reinvents industries and runs the world. More and more, predictive analytics (PA) drives commerce, manufacturing, healthcare, government, and law enforcement. In these spheres, organizations operate more effectively by way of predicting behavior—i.e., the outcome for each individual customer, employee, patient, voter, and suspect. For the past ten years Artificial Intelligence (AI) has experienced a renaissance and particularly Machine Learning (ML) has been the subject of great attention. The ultimate goal of AI is to make machines capable of performing tasks previously considered intelligent [1], [2] better than humans.
Predictive analytics, which is a branch in the domain of advanced analytics, used to predict the future events. It analyzes the current and historical data to make predictions about the future by applying the techniques from statistics, data mining, machine learning, and artificial intelligence [3]. It brings together the information technology, business modeling process, and management to make prediction about the future. Predictive analytics has not a limited application in particular domain but it has a wide range of application in many domains like insurance company, banking, healthcare, financial services etc. Predictive analytics improves decision making and helps to increase the profit rates of business and reduces risk by identifying them at the early stage.
II. PREDICTIVE ANALYTICS PROCESS
Predictive analytics involves several steps through which a data analyst can predict the future based on the current and historical data.
Define Problem Statement: Define the project outcomes, the scope of the effort, objectives; identify the data sets that are going to be used.
Data Collection: Data collection involves gathering the necessary details required for the analysis. It involves the historical or past data from an authorized source over which predictive analysis is to be performed.
Data Cleaning: Data Cleaning is the process in which our data sets are refined. In the process of data cleaning, un-necessary and erroneous data are removed. It involves removing the redundant data and duplicate data from the data sets.
Data Analysis: It involves the exploration of data. The data are explored and analyze it thoroughly in order to identify some patterns or new outcomes from the data set
Build Predictive Model: In this stage of predictive analysis, various algorithms are used to build predictive models based on the patterns observed. It requires knowledge of python, R, Statistics and MATLAB and so on.
Validation: It is a very important step in predictive analysis. In this step, the efficiency of the model are checked by performing various tests. The model needs to be evaluated for its accuracy in this stage.
Deployment: In deployment the model work are made real and it helps in everyday discussion making and make it available to use.
Model Monitoring: Regularly monitor the models to check performance and ensure that we have proper results. It is seeing how model predictions are performing against actual data sets [4].
III. CATEGORIES OF PREDICTIVE ANALYTICS MODELS
Today, there are a variety of predictive data models that have been developed to meet specific requirements and applications. Below, we examine some of the key models analytics professionals use to generate useful insights.
A. Classification Model
The classification model is one of the most popular predictive analytics models. These models perform categorical analysis on historical data. Various industries adopt classification models because they can retrain these models with current data and as a result, they obtain useful and detailed insights that help them build appropriate solutions. Classification models are customizable and are helpful across industries, including banking and retail.
B. Clustering Model
The clustering model gathers data and divides it into groups based on common characteristics. Hard clustering facilitates data classification, determining if each data point belongs to a cluster, and soft clustering allocates a probability to each data point.
C. Outliers Model
Unlike the classification and forecast models, the outlier model deals with anomalous data items within a dataset. It works by detecting anomalous data, either on its own or with other categories and numbers. Outlier models are essential in industries like retail and finance, where detecting abnormalities can save businesses millions of dollars. Outlier models can quickly identify anomalies, so predictive analytics models are efficient in fraud detection.
D. Forecast Model
One of the most prominent predictive analytics models is the forecast model. It manages metric value predictions by calculating new data values based on historical data insights. Forecast models also generate numerical values in historical data if none are present. One of the most powerful features of forecast models is that they can manage multiple parameters at a time. As a result, they're one of the most popular predictive models in the market. Various industries can use a forecast model for different business purposes. For example, a call center can use forecast analytics to predict how many support calls they will receive in a day, or a retail store can forecast inventory for the upcoming holiday sales periods, etc.
E. Time Series Model
Time series predictive models analyze datasets where the input parameter is time sequences. The time series model develops a numerical value that predicts trends within a specific period by combining multiple data points (from the previous year's data). A Time Series model outperforms traditional ways of calculating a variable's progress because it may forecast for numerous regions or projects at once or focus on a single area or task, depending on the organization's needs.
Time Series predictive models are helpful if organizations need to know how a specific variable changes over time. For example, if a small business owner wishes to track sales over the last four quarters, they will need to use a Time Series model. It can also look at external factors like seasons or periodical variations that could influence future trends [5].
IV. PREDICTIVE ANALYTICS APPLICATIONS
A few examples and real-world use cases of how various industries are leveraging predictive models to accelerate workflows and boost revenue are discussed below
Retail: Predictive analytics helps retailers in multiple regions with inventory planning and dynamic pricing, evaluating the performance of promotional campaigns, and deciding which personalized retail offers are best for customers.
Healthcare:The healthcare industry employs predictive analytics and modeling to analyze and forecast future population healthcare needs by leveraging healthcare data. Predictive models in the healthcare industry help identify activities that increase patient satisfaction, resource usage, and budget control. Predictive modeling also enables the healthcare industry to improve financial management to optimize patient outcomes.
Banking:The banking industry benefits from predictive analytics by creating a credit risk-aware mindset, managing capital and liquidity, and satisfying regulatory obligations. Predictive analytics models provide more significant detection and protection and better control and compliance. Predictive models allow banks and other financial organizations to tailor each client interaction, reduce customer churn, earn customer trust, and generate remarkable customer experiences.
Manufacturing: Manufacturing companies use predictive modeling to forecast maintenance risks and reduce costs on sudden breakdowns. Predictive analytics models help businesses improve their performance and overall equipment efficiency, and also allow companies to enhance product quality and boost consumer experience[5][6].
Detecting Fraud: Detection and prevention of criminal behavior patterns can be improved by combining the multiple analysis methods. The growth in cybersecurity is becoming a concern. The behavioral analytics may be applied to monitor the actions on the network in real time. It may identify the abnormal activities that may lead to a fraud. Threats may also be detected by applying this concept [7].
The table below show the usage of predictive analytics in term of sectors, the purpose of use and the most algorithms and tools applied
Sector
Goals of PA use
Algorithms
Tools
Business/marketing
Marketer can use PA to predict the customers’ response to advertising campaign
Banks
To predict the most profitable customers or to alert credit card customer to a probable fraud
Education
Predicting student abnormal behavior, Predicting students’ results, Predicting the performance of students in a specific course
Classification Decision tree Feature selection
WEKA
Public Transportation
predicting the time of bus arrival
Clustering model
Stock market
optimizing prediction of products and stock market indications
Healthcare sector
benefiting from predictive analytics in oncology and cancer care
Conclusion
There has been a long history of using predictive models in the tasks of predictions. For the past ten years Artificial Intelligence (AI) has experienced a renaissance and particularly Machine Learning (ML) has been the subject of great attention. The ultimate goal of AI is to make machines capable of performing tasks previously considered intelligent [8], [9] better than humans. A common element in most AI use cases is prediction[10]. Earlier, the statistical models were used as the predictive models which were based on the sample data of a large-sized data set. With the improvements in the field of computer science and the advancement of computer techniques, newer techniques of artificial intelligence and machine learning have changed the world of computation where intelligent computation techniques and algorithms are introduced over the period of time. Hence this paper opens a scope of development of new models for the task of predictive analytics.
References
[1] Montreal University: http://www.iro.umontreal.ca/ nie/IFT3335/Introduction.html, accessed on 13 june 2020.
[2] CNRS LE JOURNAL, Yaroslav Pigenet: https://lejournal.cnrs.fr/articles/des-machines-enfin-intelligentes.
[3] Charles Elkan, 2013, “Predictive analytics and data mining”, University of California, San Diego.
[4] https://www.geeksforgeeks.org/step-by-step-predictive-analysis-machine-learning/.
[5] https://www.projectpro.io/article/predictive-modelling-techniques/598
[6] https://www.analyticsinsight.net/predictive-analytics-models-and-algorithms-describing-the-types/.
[7] M Nigrini, 2011, “Forensic Analytics: Methods and John Willey and Sons Ltd. Techniques for Forensic Accounting Investigations”,
[8] Montreal University: http://www.iro.umontreal.ca/ nie/IFT3335/Introduction.html, accessed on 13 june 2020.
[9] CNRS LE JOURNAL, Yaroslav Pigenet : https://lejournal.cnrs.fr/articles/des-machines-enfin-intelligentes.
[10] Zinebi, K., Souissi, N., and Tikito, K. (2018). Driver Behavior Analysis Methods: Applications oriented study. In Proceedings of the 3rd International Conference on Big Data, Cloud and Application (BDCA 2018), April 2018 in Kenitra Morocco.
[11] mining”, University of California, San Diego