Weather directly affects farming activities. The temperature and humidity of a region play a crucial role in determining which crops can be cultivated, so farmers need to know future weather trends to plan their farming activities. There are several classical weather prediction services, but all of them depend on complex modelling and are operated by government authorities. Time series forecasting is an effective alternative for predicting these temperature trends because it requires minimal computational resources and only past weather data. In this paper, we implement both additive and regression-based models and compare their performance to decide which approach is best suited to weather prediction.
I. INTRODUCTION
Time series data consists of observations collected at successive timestamps. The core of time series analysis is using past observations to predict future values without considering any additional factors. Research on time series has become a hotspot in recent years because of its many applications, from stock market index prediction to real estate price prediction.
One application of time series data is weather prediction. Weather conditions are changing rapidly around the world [1], and forecasting these trends is an exhaustive task. The current approach to weather prediction relies heavily on complex physical models, and their forecasts require large amounts of data drawn from sources such as satellite weather reports and precipitation reports [2]. The equipment needed for this is not readily available to the public, so forecasts are mostly produced by government or non-profit authorities. Prediction using time series is not a new problem, but with the advent of deep learning models and statistical models like ARIMA, time series methods can be a cheaper yet viable way to predict future trends in weather data. Moreover, these predictions can be combined with regular forecast data to produce rolling forecasts, which are the most accurate form of prediction.
Weather is a seasonal quantity: its patterns repeat after a certain period of time. Deep learning models and statistical time series models such as ARIMA and Prophet exploit this seasonality to learn complex seasonal patterns and predict the weather parameters accurately.
While writing this article, we studied various research methods and selected three models for weather prediction, namely Prophet, ARIMA and LSTM. The first two are additive models, while the last is a memory-based deep learning model. This study compares the performance of both categories of ML models. Comparing the forecasted values with the actual values helps us decide whether time series approaches are cheap and reliable enough to replace the classical weather forecasting approaches.
The paper is structured as follows. Section II describes the methodology of data collection and preprocessing. Section III describes the models used, and Section IV discusses the methodology used to train the models for the weather forecast system. Section V presents the results of our models, and Section VI summarizes the conclusions.
II. DATA COLLECTION AND PREPROCESSING
For our research problem, we chose to predict temperature and humidity, since they are the most basic yet crucial weather parameters. We considered weather data for the Delhi region because it is the capital city and shows wide variation in temperature throughout the year.
The weather data was collected from the India Meteorological Department, which indexes daily weather data for different Indian cities.
A. Data Preprocessing
We fetched temperature and humidity data for the past 20+ years (1996-2018), which consisted of hourly readings, as seen in Fig. 1. Hourly readings are too fine-grained for training models that predict weather over long periods into the future. Hence, we replaced the hourly data with the daily mean temperature and humidity of the readings. We also observed that temperature and humidity readings were missing for a few days. These missing values were replaced with the weekly mean of the weather data, as shown in Fig. 2.
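The preprocessing described above can be sketched with pandas as follows. This is a minimal illustration rather than our exact pipeline: it assumes the hourly readings are held in a DataFrame with a DatetimeIndex, that readings are averaged per calendar day, and that a missing day is filled with the mean of the available days in the same calendar week. The function name `to_daily_filled`, the column names and the sample values are hypothetical.

```python
from datetime import datetime, timedelta

import pandas as pd


def to_daily_filled(hourly: pd.DataFrame) -> pd.DataFrame:
    """Average hourly readings per day, then fill missing days with that week's mean."""
    # Days without any readings become NaN rows after resampling.
    daily = hourly.resample("D").mean()
    # Mean over the available days of each calendar week, aligned back to the daily rows.
    weekly_mean = daily.groupby(pd.Grouper(freq="W")).transform("mean")
    return daily.fillna(weekly_mean)


# Made-up sample: one week of hourly readings with the third day missing entirely.
start = datetime(2018, 1, 1)
stamps = [start + timedelta(hours=h) for h in range(7 * 24) if h // 24 != 2]
sample_hourly = pd.DataFrame(
    {
        "temperature": [20.0 + (t - start).days for t in stamps],
        "humidity": [50.0] * len(stamps),
    },
    index=pd.DatetimeIndex(stamps),
)
sample_daily = to_daily_filled(sample_hourly)
```

In this sketch a week with no readings at all would remain unfilled; such gaps would need a separate rule.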
We split the data into two sets for training and testing: the time series from 1996-2018 was used to train the models, and the time series from 2018-2020 was used to verify the predictions obtained from the models.
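A chronological split of this kind can be sketched as below. The exact boundary date is not stated here, so the cutoff `2018-01-01`, the function name `chronological_split` and the sample values are assumptions made only for illustration. The split is by date rather than at random, so that the test period lies strictly after the training period.

```python
from datetime import datetime, timedelta

import pandas as pd


def chronological_split(daily: pd.DataFrame, cutoff: str):
    """Return (train, test), where train holds rows before the cutoff date."""
    boundary = pd.Timestamp(cutoff)
    train = daily.loc[daily.index < boundary]
    test = daily.loc[daily.index >= boundary]
    return train, test


# Made-up sample: five daily rows straddling the assumed cutoff.
first_day = datetime(2017, 12, 30)
sample = pd.DataFrame(
    {"temperature": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.DatetimeIndex([first_day + timedelta(days=d) for d in range(5)]),
)
train, test = chronological_split(sample, "2018-01-01")
```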
III. MODELS
There are various methods for time series forecasting, including both regression and deep learning models. According to the survey by Zhenyu Liu et al. [3], ARIMA is the most widely used method for time series forecasting. Hence, we selected ARIMA as the baseline against which to compare the other approaches. Other forecasting approaches include ANNs, SVMs, fuzzy time series and RNNs. A major challenge is predicting weather parameters over a long duration, for which LSTMs [4] have proven well suited. Prophet is an additive model like ARIMA and provides automatic hyperparameter selection. Thus, we implement the ARIMA, Prophet and LSTM approaches and compare their performance.
VI. CONCLUSION
We implemented and compared deep learning and regression models for forecasting temperature and humidity trends. Comparing the plots in Fig. 4 and Fig. 5 with the other figures, we found that the ARIMA model gave the least accurate predictions. Prophet came next and showed a significant improvement, as shown in Fig. 8 and Fig. 9; its plots align well with the actual weather data. Both regression models fail to capture the granular details in the weather data that are captured only by deep learning models like LSTM. From the LSTM plots in Fig. 6 and Fig. 7, we conclude that deep learning models like LSTM are highly effective at predicting time series trends. The granular accuracy of LSTM arises because it constantly uses past sequence data to predict future trends, acting as a rolling prediction system, which is generally considered the most accurate form of time series prediction.
This trend is further confirmed by the error metrics of each model, shown in Table 2. Compared with our baseline ARIMA approach, LSTM exhibited the best performance, with an MAE of 1.479 degrees for temperature and 5.126 percentage points for relative humidity.
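For reference, the mean absolute error (MAE) is the average of the absolute differences between actual and forecasted values, so it is expressed in the same unit as the quantity being predicted. A minimal sketch of the computation follows; the function name and the sample values are made up for illustration and are not taken from our results.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up daily mean temperatures and forecasts, for illustration only.
example_actual = [30.0, 32.0, 31.0]
example_forecast = [29.0, 33.5, 31.0]
example_mae = mean_absolute_error(example_actual, example_forecast)
```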
Given the democratization of computing resources, deep learning models have become popular in recent years. Their performance in time series prediction is impressive, and they can be considered a viable and more practical approach to weather forecasting. These deep learning approaches can also be applied to other problems, such as stock value prediction [9] in the finance sector.
References
[1] F. V. Davenport and N. S. Diffenbaugh, "Using machine learning to analyze physical causes of climate change: A case study of U.S. Midwest extreme precipitation," Geophys. Res. Lett., vol. 48, no. 15, Aug. 2021, Art. no. e2021GL093787.
[2] M. Krouma, P. Yiou, C. Déandreis and S. Thao, "Assessment of stochastic weather forecast of precipitation near European cities, based on analogs of circulation," Geosci. Model Dev., vol. 15, pp. 4941-4958, 2022, doi: 10.5194/gmd-15-4941-2022.
[3] Z. Liu, Z. Zhu, J. Gao and C. Xu, "Forecast Methods for Time Series Data: A Survey," IEEE Access, vol. 9, pp. 91896-91912, 2021, doi: 10.1109/ACCESS.2021.3091162.
[4] A. Srivastava and A. S, "Weather Prediction Using LSTM Neural Networks," 2022 IEEE 7th International Conference for Convergence in Technology (I2CT), Mumbai, India, 2022, pp. 1-4, doi: 10.1109/I2CT54291.2022.9824268.
[5] J. Pant, R. K. Sharma, A. Juyal, D. Singh, H. Pant and P. Pant, "A Machine-Learning Approach to Time Series Forecasting of Temperature," 2022 6th International Conference on Electronics, Communication and Aerospace Technology, Coimbatore, India, 2022, pp. 1125-1129, doi: 10.1109/ICECA55336.2022.10009165.
[6] S. J. Taylor and B. Letham, "Forecasting at scale," PeerJ Preprints, vol. 5, Art. no. e3190v2, 2017, doi: 10.7287/peerj.preprints.3190v2.
[7] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Comput., vol. 9, no. 8, pp. 1735-1780, 1997, doi: 10.1162/neco.1997.9.8.1735.
[8] Z. Li, H. Zou and B. Qi, "Application of ARIMA and LSTM in Relative Humidity Prediction," 2019 IEEE 19th International Conference on Communication Technology (ICCT), Xi'an, China, 2019, pp. 1544-1549, doi: 10.1109/ICCT46805.2019.8947142.
[9] Kumar Prakhar, Sountharrajan S, Suganya E and Karthiga M, "Effective Stock Price Prediction using Time Series Forecasting," 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), IEEE, 2022.