Abstract: The stock market is a place where shares of publicly listed companies are bought and sold. Stocks, also known as equities, represent ownership in a company. A stock exchange is the intermediary that enables the buying and selling of shares. Stock market prediction is the act of trying to determine the future value of a company's stock.
Machine learning makes stock market prediction feasible by analyzing historical data to predict the future values of company stocks. Most stock prediction techniques use the linear regression model. It is simple and easy to handle, but its main limitation is the assumption of linearity between the dependent and independent variables, which is often incorrect. It is also very sensitive to outliers and can overfit the data. This project predicts stock market prices using an LSTM model, which is intended to overcome these limitations of linear regression. Along with plotting the predictions of the LSTM model, I also built a dashboard in this project to analyse the stock. Using Dash, the graphical analysis of the stock market can be viewed in an interactive web application.
Keywords: Stock market, LSTM model, Dash, Machine learning, Python
I. INTRODUCTION
The stock market is the collection of markets and exchanges where shares of publicly listed companies are regularly bought and sold. The primary market is where companies float shares to the general public in an initial public offering (IPO) in order to raise capital.
People mainly buy stocks in the expectation that their price will rise in the future. However, the stock market is always uncertain, which makes many people unwilling to invest their money in it. A technique that can predict stock market prices is therefore needed, so that people can invest their money in the most promising stocks.
This project predicts stock market prices using an LSTM model and uses Dash to visualize the stock market analysis, including the actual and predicted prices, as a web application.
LSTM stands for Long Short-Term Memory. LSTMs are a type of recurrent neural network (RNN) capable of learning long-term dependencies, and they are mainly used for processing and making predictions from time series data. An LSTM has a chain-like structure. Instead of a single neural network layer, each repeating module contains four interacting layers. The working of an LSTM consists of four main steps:
1) The first step is to identify the information that is no longer required and will be discarded from the cell state. This decision is made by a sigmoid layer called the forget gate layer.
2) The next step is to decide what new information will be stored in the cell state, which is the key component of the LSTM. This consists of two parts:
a) A sigmoid layer called the input gate layer decides which values will be updated.
b) A tanh layer creates a vector of new candidate values that could be added to the state. The tanh function is a squashing function: it maps its input into the range -1 to +1.
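3) The old cell state $C_{t-1}$ is now updated into the new cell state $C_t$. First, the old state $C_{t-1}$ is multiplied by the forget gate output $f_t$, forgetting the things we decided to forget earlier. Then $i_t * \tilde{C}_t$ is added, where $i_t$ is the input gate output and $\tilde{C}_t$ is the vector of candidate values; this is the new candidate information, scaled by how much we decided to update each state value. In short, $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$.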
4) Finally, a sigmoid layer decides which parts of the cell state will be output. The cell state is passed through tanh and multiplied by the output of this sigmoid layer, so that only the selected parts are output.
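The four steps above can be sketched as a single time step in NumPy. This is only an illustration under stated assumptions, not the project's implementation: the function name lstm_step, the stacked weight matrix W, the bias b and the sample prices are all hypothetical, and in practice a library layer such as the Keras LSTM layer would be used instead.

```python
import numpy as np

def sigmoid(x):
    """Logistic function; squashes values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W maps the concatenated [h_prev, x_t] to the four pre-activations,
    stacked as [forget, input, candidate, output]; b is the matching bias.
    (This stacking order is an assumption made for the sketch.)
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0 * n:1 * n])       # step 1: forget gate
    i_t = sigmoid(z[1 * n:2 * n])       # step 2a: input gate
    c_tilde = np.tanh(z[2 * n:3 * n])   # step 2b: candidate values in (-1, 1)
    c_t = f_t * c_prev + i_t * c_tilde  # step 3: updated cell state
    o_t = sigmoid(z[3 * n:4 * n])       # step 4: output gate
    h_t = o_t * np.tanh(c_t)            # step 4: filtered output
    return h_t, c_t

rng = np.random.default_rng(0)
hidden, n_in = 4, 1
W = rng.normal(scale=0.1, size=(4 * hidden, hidden + n_in))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for price in [0.10, 0.12, 0.11]:        # made-up normalized prices
    h, c = lstm_step(np.array([price]), h, c, W, b)
```

Because the output is a sigmoid value multiplied by a tanh value, every entry of h stays strictly between -1 and +1, which is one reason input prices are usually normalized before training.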
Dash is a Python framework for building interactive web application dashboards. The layout of a Dash application describes the page content using components that correspond to HTML elements. To use Dash, the dash package and its components need to be installed.
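A minimal sketch of such a dashboard is given below, assuming Dash 2.x or later. The helper names build_figure and build_app, the component id price-graph and the sample values are hypothetical and chosen only for illustration; the actual dashboard built in this project may be organized differently.

```python
def build_figure(dates, actual, predicted):
    """Return a Plotly-style figure dict with actual and predicted prices."""
    return {
        "data": [
            {"x": dates, "y": actual, "type": "scatter",
             "mode": "lines", "name": "Actual"},
            {"x": dates, "y": predicted, "type": "scatter",
             "mode": "lines", "name": "Predicted"},
        ],
        "layout": {"title": "Actual vs. predicted closing price"},
    }

def build_app(figure):
    """Create a Dash app whose layout holds a heading and one graph."""
    # Imported here so that build_figure can be used without Dash installed.
    from dash import Dash, dcc, html

    app = Dash(__name__)
    app.layout = html.Div([
        html.H1("Stock price analysis"),
        dcc.Graph(id="price-graph", figure=figure),
    ])
    return app

# Made-up values for illustration only.
dates = ["2020-01-01", "2020-01-02", "2020-01-03"]
figure = build_figure(dates, [100.0, 101.5, 99.8], [100.4, 101.0, 100.2])

# To serve the dashboard locally: build_app(figure).run(debug=True)
```

Keeping the figure construction separate from the layout means the same figure dict can be produced from the model's predictions and handed to the graph component unchanged.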
II. LITERATURE SURVEY
In [1], stock market price prediction using the SVM model is explained. Data preprocessing is one of the important steps in this paper. The approach was proposed to overcome a disadvantage of one of the traditional machine learning methods, back propagation. The main aim of the paper is to output the best stock price prediction value, and it mainly tries to bring out the advantages of random forest and the support vector machine (SVM), both of which are machine learning techniques. Researchers in various fields have shown great interest in stock market price prediction, and machine learning, with its wide range of applications, plays a major role in it.
In [2], SVM and reinforcement learning are used for stock market price prediction. Instead of a local dataset, a dataset of the global stock market is used.
[3] compares various machine learning techniques that can be used for stock market prediction: support vector machines, linear regression, prediction using decision stumps, expert weighting and online weighting. The advantages and disadvantages of each of these methods are described in the paper.
[4] uses linear regression, one of the machine learning algorithms, to predict future stock prices. Various open source libraries and preexisting algorithms are used to make seemingly unpredictable stock prices more predictable.
[5] proposes trained traditional machine learning algorithms along with trained deep learning methods. News sentiments and historical prices are the two data sources considered in the paper. Tick data and news data covering ten years were collected. After the data sources are selected, data preprocessing is carried out. The news is then aligned with the tick data, and feature generation is performed. The minimum, maximum, average and standard deviation are the features considered in the paper. After feature extraction, the data is normalized.
III. PROPOSED SYSTEM
The stock price prediction system using LSTM and Dash consists of the following stages:
1) Raw Data: As a first step, the data sets required to train the model are collected. Historical stock data, that is, data obtained in the previous year, is collected; it is used for comparison and for predicting future stock prices. For initialization, libraries such as NumPy, Matplotlib, pandas and Keras are used. NumPy applies mathematical functions and operations to arrays. Matplotlib is used for visualization. pandas is used as a data analysis and manipulation tool; it reads the data from a CSV file and creates a data frame.
2) Data Preprocessing: The preprocessing stage involves data discretization, data transformation, data cleaning and data integration. Once the data set has been transformed into a clean data set, it is divided into training and testing sets for evaluation.
3) Feature Extraction: In the feature extraction layer, the features that are fed into the neural network are chosen. Dropout, a regularization technique for reducing over-fitting in neural networks, is used here.
4) Train Neural Network: In this stage, the weights and biases of the neural network are initialized randomly, and the network is then trained on the data for prediction.
5) Optimizer: An optimizer is specified when the model is compiled. The choice of optimizer can greatly affect how fast the algorithm converges to a minimum of the loss. Here the Adam optimizer is used. Adam (adaptive moment estimation) combines the advantages of two optimizers, AdaGrad and RMSprop. In AdaGrad, the learning rate of each parameter is adapted based on all the past gradients computed for that parameter. RMSprop addresses the resulting diminishing learning rate by using an exponentially decaying average of recent squared gradients instead. Adam likewise computes an adaptive learning rate for each parameter from its past gradients, keeping decaying averages of both the gradients and their squares.
6) Output Generation: In this layer, the output value generated by the output layer of the RNN is compared with the target value. The error, that is, the difference between the target and the obtained output, is minimized using the back propagation algorithm.
7) Visualization: A rolling analysis of a time series model is often used to assess the stability of the model over time. When analyzing financial time series data with a statistical model, a key assumption is that the parameters of the model are constant over time; a rolling analysis helps to check whether this assumption holds.
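For the Raw Data stage, reading a CSV file into a pandas data frame might look as follows. This is a sketch only: the Date/Open/High/Low/Close/Volume column layout and the values are made up for illustration, and an in-memory string stands in for the actual file, whose name and columns are not given here.

```python
import io

import pandas as pd

# Made-up sample rows; a real data set would be read from a CSV file path.
csv_text = """Date,Open,High,Low,Close,Volume
2020-01-03,101.0,102.5,100.2,101.8,12000
2020-01-01,100.0,101.0,99.5,100.5,10000
2020-01-02,100.5,101.7,100.1,101.0,11000
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
df = df.sort_values("Date").set_index("Date")  # oldest row first
close = df["Close"].to_numpy()                 # series used for prediction
```

Sorting by date before extracting the closing prices matters because the model treats the series as ordered in time.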
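For the Data Preprocessing stage, one common way to prepare a price series for an LSTM is sketched below. The split ratio, scaling method and window length used in the project are not stated above, so the 80/20 chronological split, the min-max scaling and the window of 3 steps are assumptions, as are the helper names min_max_scale and make_windows.

```python
import numpy as np

def min_max_scale(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1."""
    return (values - lo) / (hi - lo)

def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y

# Made-up closing prices for illustration only.
prices = np.array([100.0, 101.0, 103.0, 102.0, 105.0,
                   107.0, 106.0, 108.0, 110.0, 109.0])

split = int(len(prices) * 0.8)        # chronological split, no shuffling
train, test = prices[:split], prices[split:]

lo, hi = train.min(), train.max()     # fit the scaler on training data only
train_scaled = min_max_scale(train, lo, hi)
test_scaled = min_max_scale(test, lo, hi)

X_train, y_train = make_windows(train_scaled, window=3)
```

The scaling bounds are taken from the training portion alone so that no information from the test period leaks into training; as a consequence, scaled test values may fall outside the range 0 to 1.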
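The dropout technique mentioned in the Feature Extraction stage can be illustrated with a short NumPy sketch of inverted dropout. In the project itself a library dropout layer would normally be used; the function below, its name and the rate of 0.2 are hypothetical and only show the idea.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations            # dropout is disabled at prediction time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones(1000)                # stand-in for a layer's activations
dropped = dropout(hidden, rate=0.2, rng=rng)
```

Dividing by the keep probability keeps the expected activation unchanged, so the network needs no rescaling when dropout is switched off for prediction.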
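The Adam update described in the Optimizer stage can be written out in a few lines. The sketch below follows the standard form of the algorithm with its usual default hyperparameters; the function name adam_step and the toy quadratic objective are assumptions made for illustration, and in the project the optimizer supplied by the deep learning library would be used.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # decaying average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # decaying average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize sum((theta - 3)^2), whose gradient is 2 * (theta - 3).
theta = np.array([0.0, 10.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
```

Each parameter is divided by the square root of its own average squared gradient, which is what gives every parameter an individual effective learning rate.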
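For the Visualization stage, a rolling analysis can be sketched by computing statistics over a sliding window, for example on the prediction errors. The example assumes NumPy 1.20 or later for sliding_window_view; the helper name rolling_stats, the window of 3 points and the error values are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_stats(series, window):
    """Mean and standard deviation over each full window of `window` points."""
    views = sliding_window_view(series, window)  # shape: (len - window + 1, window)
    return views.mean(axis=1), views.std(axis=1)

# Made-up prediction errors (predicted minus actual) for illustration only.
errors = np.array([0.5, -0.2, 0.1, 0.4, -0.3, 2.0, 2.2, 1.9])
roll_mean, roll_std = rolling_stats(errors, window=3)
```

If the rolling mean of the errors drifts away from zero over time, as it does at the end of this made-up series, the assumption of constant model parameters is questionable and the model may need retraining.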
IV. CONCLUSIONS
The popularity of stock market trading is growing rapidly, which encourages researchers to find new prediction methods using new techniques. Forecasting techniques help not only researchers but also investors and anyone else dealing with the stock market. To help predict stock indices, a forecasting model with good accuracy is required. In this proposed system I have used an RNN with LSTM units as the forecasting technique, which can help investors or anyone interested in the stock market by giving them a reliable picture of the future state of the stock market.
REFERENCES
[1] S. K. Sushanth Kurdekar, “Stock price forecasting and recommendation system using machine learning techniques and sentiment analysis,” 2008.
[2] S. Shen, H. Jiang, and T. Zhang, “Stock market forecasting using machine learning algorithms,” Department of Electrical Engineering, Stanford University, Stanford, CA, pp. 1–5, 2012.
[3] V. H. Shah, “Machine learning techniques for stock prediction,” Foundations of Machine Learning, Spring, vol. 1, no. 1, pp. 6–12, 2007.
[4] K. Pahwa and N. Agarwal, “Stock market analysis using supervised machine learning,” in 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon). IEEE, 2019, pp. 197–200.
[5] W. E.-H. M. Mokalled and M. Jaber, “Automated stock price prediction using machine learning,” in Proceedings of the Second Financial Narrative Processing Workshop (FNP 2019), 2019, pp. 16–24.