A number of studies have modelled stock market indices using pure time series models or regression models based on macroeconomic variables. In this study, instead of modelling the actual levels of stock market indices, I focus on predicting their direction (up/down), as investors who rely on technical analysis are more interested in the direction of a stock market index than in its predicted value. I therefore examine which modelling approach is best for direction prediction: a time series model (ARMA), a macro factor model, or a combination of both (ARDL). My study shows that macro factor models outperform ARMA and ARDL models for direction prediction. The study was performed on the stock indices of three South Asian countries: India, Pakistan and Malaysia. The macroeconomic factors considered for direction prediction are inflation, unemployment and exchange rate, using monthly data from March 2016 to September 2021.
I. INTRODUCTION
In general, developing a predictive model involves: (a) taking the known data (a historical time series or observed independent/exogenous data), (b) developing a model by optimising some cost/error function and (c) using that model to predict new observations that occur in the future. There are a number of techniques for building prediction models and, generally, the best model is the one that minimises the prediction error on test data. However, in some cases the stakeholders are more interested in the direction of the prediction (up/down) than in the actual predicted values. This is particularly true for trading exotic derivatives such as binary options, where traders care more about getting the direction of the stock market right than about the actual values/levels. Therefore, in this study I examine which modelling techniques (time series models, regression on macroeconomic variables, or a combination of both) are best suited to direction prediction. I have chosen the stock market indices of three South Asian countries, India, Pakistan and Malaysia, using monthly data over 5 years from March 2016 to September 2021. Using this observed time series data, prediction models are built using (i) a pure time series model (ARMA), (ii) a regression model based on macroeconomic factors and (iii) a combination of both (ARDL), and these models are then tested on direction prediction. The macroeconomic variables selected for this study are inflation, exchange rate, consumer price index and unemployment, with monthly data downloaded from the Global Economy [10] website. The study shows that the regression model based on macroeconomic variables consistently outperforms the time series and ARDL models on direction prediction for all three countries. The paper is organised as follows.
Section 2 briefly reviews the current literature, section 3 describes the data, section 4 describes the methodology for modelling stock market direction prediction, section 5 discusses the results and, finally, section 6 concludes and outlines the next steps.
II. LITERATURE SURVEY
The stock market index of any country depends largely on a number of macroeconomic factors such as interest rates, industrial production, output, money supply, the political environment and government stability. In the literature, I found that a large number of studies have examined the causal relationship between stock price movements and macroeconomic variables. Some notable ones are [1] [2] [3] [4], and their main conclusion is that macroeconomic factors do influence stock price movements. The literature survey also reveals that the causal relationship between stock price movements and macroeconomic data varies from one country to another. For example, Muhammad and Rasheed [5], who test the relationship between stock returns and exchange rates, conclude that a causal relationship exists only for Bangladesh and Sri Lanka but not for Pakistan and India.
Moreover, a large number of studies have tried to predict stock price movements using time series models such as ARMA and ARIMA. One such study, by Mondal et al. [6], tried to predict the stock prices of 56 different stocks from seven sectors using a time series model such as ARMA and concluded that the ARIMA model gives a prediction accuracy of over 85%.
In another study, Erfani [7] used the Stock Price Index (TISP) to find the long ..
Rounaghi and Nassir Zadeh [8] used a simple ARMA model to test the market efficiency and financial stability of the London Stock Exchange and S&P 500 stock indices using monthly and annual data. They concluded that the prediction of monthly stock returns outperforms the prediction of annual returns. In addition, their study reveals that prediction for the S&P 500 outperforms prediction for the London Stock Exchange.
III. METHODOLOGY
A. Approach for Modelling
In this study, to predict the direction of the stock market indices of the selected countries, I have applied three variations of multivariate linear regression. The three models are:
1. Auto Regressive Moving Average Model (ARMA): In this variation, the time series is modelled using two lags. That is, the observation at time $t_n$ is a function of the observations at times $t_{n-1}$ and $t_{n-2}$. This can be mathematically expressed as:
$$y(t_n) = \alpha_0 + \alpha_1 \cdot y(t_{n-1}) + \alpha_2 \cdot y(t_{n-2}) + e(t_n) \tag{1}$$
In the above equation, $y(t_n)$ is the observation at time $t_n$, and $y(t_{n-1})$, $y(t_{n-2})$ are the observations at times $t_{n-1}$ and $t_{n-2}$ respectively. The term $e(t_n)$ represents the error term. The $\alpha$ terms are regression coefficients, which are estimated by minimising the least-squares error between the actual and predicted values. Partial autocorrelation function plots are used to decide the number of lag terms to include in the ARMA model. One such plot is shown in Figure 1 for India's stock and macroeconomic data.
2. Macroeconomic Factor based Multivariate Linear Regression Model: In this variation, the stock market index is modelled as a function of the macroeconomic variables inflation rate, unemployment rate and exchange rate. This can be mathematically expressed as:
$$y(t_n) = \alpha_0 + \alpha_1 \cdot I(t_n) + \alpha_2 \cdot U(t_n) + \alpha_3 \cdot ER(t_n) + e(t_n) \tag{2}$$
In the above equation, $y(t_n)$ is the stock market observation at time $t_n$, $I(t_n)$ is the inflation rate, $U(t_n)$ is the unemployment rate and $ER(t_n)$ is the exchange rate of the respective country's currency against 1 USD. The term $e(t_n)$ represents the error term. The $\alpha$ terms are regression coefficients, which are estimated by minimising the least-squares error between the actual and predicted values.
3. Auto Regressive Distributed Lag Model (ARDL): This variation is a combination of models 1 and 2, where a regression model is built by combining the lag terms of the stock market index from previous time periods (the autoregressive part) with the macroeconomic terms (the distributed lag part). Mathematically, it is expressed as:
$$y(t_n) = \alpha_0 + \alpha_1 \cdot y(t_{n-1}) + \alpha_2 \cdot y(t_{n-2}) + \beta_1 \cdot I(t_n) + \beta_2 \cdot U(t_n) + \beta_3 \cdot ER(t_n) + e(t_n) \tag{3}$$
In the above equation, the $\alpha$ terms are the regression coefficients of the autoregressive lag terms and the $\beta$ terms are the regression coefficients of the macroeconomic factor terms, all estimated by minimising the least-squares error between the actual and predicted values.
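The three variations differ only in which columns enter the least-squares design matrix. The following is a minimal sketch of that idea, assuming NumPy and synthetic placeholder series rather than the data used in this study. The function names are illustrative and are not part of the original analysis; the two-lag and macro factor forms follow my reading of the descriptions above, and the ARDL form follows equation (3).

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X @ coef (X includes the intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def design_two_lags(y):
    """Columns: intercept, y(t_{n-1}), y(t_{n-2}); the target is y[2:]."""
    return np.column_stack([np.ones(len(y) - 2), y[1:-1], y[:-2]])

def design_macro(infl, unemp, fx):
    """Columns: intercept, I(t_n), U(t_n), ER(t_n); the target is y."""
    return np.column_stack([np.ones(len(infl)), infl, unemp, fx])

def design_ardl(y, infl, unemp, fx):
    """Two lags of y plus same-period macro terms, as in equation (3)."""
    return np.column_stack([design_two_lags(y), infl[2:], unemp[2:], fx[2:]])

# Synthetic placeholder data (NOT the study's data): 65 monthly points.
rng = np.random.default_rng(0)
n = 65
infl, unemp, fx = rng.normal(size=(3, n))
y = np.cumsum(rng.normal(size=n)) + 0.5 * infl

coef_two_lags = fit_ols(design_two_lags(y), y[2:])        # alpha_0..alpha_2
coef_macro = fit_ols(design_macro(infl, unemp, fx), y)    # intercept + 3 slopes
coef_ardl = fit_ols(design_ardl(y, infl, unemp, fx), y[2:])  # alphas then betas
```

Because the lagged models lose the first two observations, their targets are `y[2:]` while the macro factor model can use all of `y`.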
B. Approach for Direction Prediction
The following methodology is adopted to test the model accuracy in terms of stock market directional movement prediction.
1. The given time series of 65 observations is divided into training and test data, comprising 59 observations for the model fit and 6 observations for testing.
2. Ignoring the actual levels of the estimates, a directional prediction is treated as correct if the changes in the actual and estimated values between $t_{n-1}$ and $t_n$ have the same sign. That is, if $\hat{y}(t_n) - \hat{y}(t_{n-1}) > 0$ and $y(t_n) - y(t_{n-1}) > 0$, or $\hat{y}(t_n) - \hat{y}(t_{n-1}) < 0$ and $y(t_n) - y(t_{n-1}) < 0$, then the direction prediction is correct; otherwise it is incorrect. Here $\hat{y}(t)$ represents the predicted value from the model and $y(t)$ represents the actual value.
3. The directional prediction accuracy is measured over both the training and test data for each of the regression models described in the previous section, for comparison.
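The counting rule in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: it assumes NumPy, the helper names are hypothetical, and moves are counted within each segment, so the move from the last training point to the first test point is not scored.

```python
import numpy as np

def direction_accuracy(actual, predicted):
    """Share of period-to-period moves whose sign the model gets right."""
    actual_move = np.sign(np.diff(np.asarray(actual, dtype=float)))
    predicted_move = np.sign(np.diff(np.asarray(predicted, dtype=float)))
    # Correct only when both moves are strictly up or both strictly down;
    # a flat move counts as incorrect, following the rule in step 2.
    hits = (actual_move == predicted_move) & (actual_move != 0)
    return float(hits.mean())

N_TRAIN = 59  # 59 observations for fitting, the remaining 6 for testing

def split_accuracy(actual, predicted, n_train=N_TRAIN):
    """Direction accuracy on the training and test segments separately."""
    return (
        direction_accuracy(actual[:n_train], predicted[:n_train]),
        direction_accuracy(actual[n_train:], predicted[n_train:]),
    )

# Toy illustration with made-up numbers (not the study's data).
actual = [100, 102, 101, 105, 107]
predicted = [99, 101, 102, 104, 103]
toy_accuracy = direction_accuracy(actual, predicted)  # 2 of the 4 moves match
```

With only 6 test observations there are 5 scored moves per model, so each test-set hit or miss shifts the reported accuracy by 20 percentage points.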
In addition to the direction prediction, the $R^2$ goodness of fit of each regression model is also captured for model comparison.
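For reference, the goodness-of-fit measure can be computed as in the sketch below. It uses the standard textbook definitions and assumes NumPy; the adjusted variant is included because the results are reported as adjusted values, and the exact formula used in the study is an assumption on my part. The numbers are made up for illustration.

```python
import numpy as np

def r_squared(actual, fitted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    actual = np.asarray(actual, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    ss_res = np.sum((actual - fitted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_regressors):
    """Penalise R^2 for the number of regressors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

# Made-up values, only to show the calls.
r2 = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
adj_r2 = adjusted_r_squared(r2, n_obs=4, n_regressors=1)
```

The adjustment matters when comparing the models here, since the combined model has more regressors than either of the other two and plain $R^2$ never decreases when regressors are added.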
The overall mean of the series is around zero. To confirm that the data is truly stationary, Augmented Dickey-Fuller (ADF) tests were applied to the first-differenced data, and the null hypothesis H0 that the series is not stationary was rejected based on the test statistics and the 5% critical values.
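The stationarity check can be illustrated with a simplified, un-augmented Dickey-Fuller regression. This is a sketch under assumptions, not the test run in this study: it uses NumPy only, includes a constant but no extra lag terms, works on a synthetic random walk, and compares against the approximate asymptotic 5% critical value of -2.86 for the constant-only case (the value for short samples is slightly more negative). In practice a library routine such as `adfuller` in statsmodels would normally be used instead.

```python
import numpy as np

def dickey_fuller_stat(x):
    """t-statistic of rho in  dx(t) = c + rho * x(t-1) + e(t)  (no augmentation lags)."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])
    coef, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ coef
    sigma2 = resid @ resid / (len(dx) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return coef[1] / np.sqrt(cov[1, 1])

# Synthetic placeholder series (NOT the study's data): a random walk in levels.
rng = np.random.default_rng(0)
levels = np.cumsum(rng.normal(size=200))
first_diff = np.diff(levels)

CRIT_5PCT = -2.86  # approximate asymptotic 5% critical value, constant-only case
stat = dickey_fuller_stat(first_diff)
reject_unit_root = stat < CRIT_5PCT  # True means the non-stationarity null is rejected
```

The null hypothesis is rejected when the statistic is more negative than the critical value, which is the sense in which the first-differenced series is treated as stationary.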
V. RESULTS
The three linear regression models described in the Methodology section were fitted to the stock and macroeconomic data of the three South Asian countries: India, Pakistan and Malaysia. The adjusted $R^2$ and the stock market direction prediction results for each model and country are tabulated below.
VI. CONCLUSION
The results summarised in Table 4 show that the Auto Regressive Distributed Lag (ARDL) model consistently gives a better regression fit than either the ARMA model or the regression model that considers only macroeconomic variables, for the stock market indices of all three countries. This conclusion makes logical sense because the ARDL model has both the information about the stock market levels from the previous periods and the macroeconomic information with which to predict the stock market levels for the next period. On the other hand, when the regression model was fitted purely on macroeconomic data, the goodness-of-fit measure dropped consistently for the stock market indices of all three countries. From these tables we can conclude that if the focus of the modelling is on predicting the actual levels of stock market indices, then ARDL is the first choice and ARMA the second. This also implies that the previous levels of a stock market index are very important for predicting its levels in future periods.
The results summarised in Tables 1, 2 and 3 show that for stock market direction prediction, the ARDL model outperforms both the regression model that considers only macroeconomic data and the ARMA model. This conclusion is also consistent, because the ARDL model has all the information (current levels and macro data) needed to predict the directional movement of the stock market in future periods. From these results we can conclude that if the focus of the modelling is on predicting the directional movement, ARDL is the first choice, followed by the macroeconomic model.
References
[1] Fama E. F. Stock returns, real activity, inflation and money. 71(4):545–565, 1981.
[2] Fama E. F. Stock returns, expected returns and real activity. 45(4):1089–1108, 1990.
[3] Fama E. F. and French K. R. Business conditions and expected returns on stocks and bonds. 25:23–49, 1989.
[4] Ross. stock market title. 4(2), 2014.
[5] Naeem Muhammad and Abdul Rasheed. Stock prices and exchange rates: Are they related? Evidence from South Asian countries. 41(4):535–550, 2002.
[6] Prapanna Mondal, Labani Shit and Saptarsi Goswami. Study of effectiveness of time series modeling (ARIMA) in forecasting stock prices. 4(2), 2014.
[7] Alireza Erfani and Ahmad Jafari Samimi. Long memory forecasting of stock price index using a fractionally differenced ARMA model. 5(10):1721–1731, 2009.
[8] Mohammad Mahdi Rounaghi and Farzaneh Nassir Zadeh. Investigation of market efficiency and financial stability between S&P 500 and London Stock Exchange: Monthly and yearly forecasting of time series stock returns using ARMA model. 456(10):10–21, 2016.
[9] J. M. Smith and A. B. Jones. Book Title. Publisher, 7th edition, 2012.
[10] Neven Valev. Business Economic Data for 200 Countries. The Global Economy, https://www.theglobaleconomy.com/.