Reducing greenhouse gas emissions can slow global warming and support sustainability. It is therefore helpful to forecast CO2 emission in the southern region states of India. A comparative analysis across the states may encourage a shift in power generation towards renewable sources. The data were taken from the Southern Region Load Dispatch Centre (SRLDC). Carbon dioxide emission values were estimated using the method proposed by the Central Electricity Authority (CEA). Random forest regression was used to forecast the emission values. The data were divided into training and testing sets. The results showed an increasing trend in CO2 emission from power stations. The model can be used to forecast the emission values for either the next week or the next month. The accuracy of the model was favorable, ranging from 0.80 to 0.92. Comparing the results across the states shows which state has used its renewable power stations effectively to reduce greenhouse gas emissions.
I. INTRODUCTION
Excessive carbon dioxide emission has created great challenges for the sustainable development of the world [1]. Global warming has become a serious issue around the world because of greenhouse gases. About eight billion tons of carbon per year are emitted in the form of CO2 by the burning of fossil fuels and the generation of electricity [2]. The biggest increase in CO2 emission is due to electricity and heat production. As the population grows, energy demand also rises, which requires more electricity generation. The major source of electricity generation in India is coal, which results in more CO2 emission [3]. The emission affects health and causes serious diseases. Forecasting CO2 emission might raise awareness and encourage people to use electricity consciously. The proposed work therefore forecasts CO2 emission values for four states of India, namely Andhra Pradesh, Karnataka, Tamil Nadu, and Telangana.
There are various forecasting methods, such as linear regression, random forest, and support vector machines. Here, the method is chosen based on performance metrics and analysis. The average power generation data were gathered for each state on a daily basis and multiplied by a generic emission factor to estimate the CO2 emission. The emission trends of the states were examined and compared to see whether the source of electricity generation has changed. This analysis would provide not only awareness of the emission but also an incentive to switch to renewable resources for power generation. Switching to renewable resources might reduce greenhouse gas emissions and promote sustainable development.
A. The Dataset
We collected the data from SRLDC (Southern Region Load Dispatch Centre). The data consist of the daily power generation reports of the states under SRLDC (Andhra Pradesh, Karnataka, Tamil Nadu, and Telangana). The CO2 emission corresponding to the total average power of the stations was estimated using the method proposed by the Central Electricity Authority (CEA). The random forest regression model is used to predict the emission values for future periods. The minimum, maximum, and median values indicate an increasing trend in CO2 emission. The data are split into training and testing samples: 80% of the data is used for training and 20% for testing.
The data used in the model are from the year 2022. The emission values are estimated by multiplying the generation values by the carbon emission factor of 0.85 proposed by CEA. Since the data are non-linear, a random forest regression model is suitable.
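The estimation and splitting steps described above can be illustrated with a minimal Python sketch. It is not the authors' code. The generation values are made-up placeholders rather than SRLDC data, the function names are hypothetical, and the random (shuffled) split is an assumption, since the text does not say whether the split was random or chronological. The units of the factor are not stated in the text, so none are assumed here.

```python
import random

# Emission factor stated in the text (as proposed by CEA). The source does not
# give its units, so the result is simply "generation x 0.85".
EMISSION_FACTOR = 0.85


def estimate_emissions(daily_generation, factor=EMISSION_FACTOR):
    """Multiply each daily average generation value by the emission factor."""
    return [g * factor for g in daily_generation]


def split_80_20(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows reproducibly, then split into training and testing sets.

    Shuffling is an assumption; a chronological split would skip the shuffle.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical daily average generation values (placeholders, not SRLDC data).
generation = [100.0, 120.0, 90.0, 110.0, 130.0, 95.0, 105.0, 115.0, 125.0, 85.0]
emissions = estimate_emissions(generation)
train, test = split_80_20(list(zip(generation, emissions)))
print(len(train), len(test))  # 8 2
```

With ten placeholder rows, the split yields eight training rows and two testing rows, matching the 80/20 proportion.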
B. Proposed Model
Random Forest Regression
Random forest is a supervised learning model that can be used for regression, including non-linear problems. It builds many decision trees in a randomized way. In general, more trees lead to better regression accuracy [4]. The benefits of this model are that it maintains accuracy and can handle missing values.
Random forest works in two stages: creating the random forest and forecasting with the trained forest [5].
1. Creation of the random forest
a. Select K features at random from the m available features.
b. Divide the node into daughter nodes using the selected features.
c. Repeat steps a and b on each daughter node until a node cannot be divided further.
d. Repeat the above procedure to grow further trees; in this way the forest grows.
2. Forecast using the trained random forest.
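The two stages can be sketched in plain Python as follows. This is an illustrative toy under stated assumptions, not the model used in the paper. The split criterion (lowest squared error), the bootstrap sampling of rows, the stopping rule, and all function names and data are assumptions introduced for the example. In practice a library implementation such as scikit-learn's RandomForestRegressor would normally be used.

```python
import random
from statistics import mean


def sum_squared_error(values):
    """Sum of squared deviations from the mean of the values."""
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values)


def best_split(rows, targets, feature_ids):
    """Return the (feature, threshold) with the lowest squared error, or None."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            if not left or not right:
                continue  # this threshold does not separate the node
            error = sum_squared_error(left) + sum_squared_error(right)
            if best is None or error < best[0]:
                best = (error, f, threshold)
    return None if best is None else (best[1], best[2])


def grow_tree(rows, targets, k, rng):
    """Grow one regression tree; a leaf is a number, a node is a tuple."""
    if len(rows) < 2 or len(set(targets)) == 1:
        return mean(targets)  # nothing left to divide
    feature_ids = rng.sample(range(len(rows[0])), k)  # step a: K random features
    split = best_split(rows, targets, feature_ids)
    if split is None:
        return mean(targets)  # chosen features cannot divide this node
    f, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[f] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] > threshold]
    # step b: divide into daughter nodes; step c: repeat on each daughter
    return (f, threshold,
            grow_tree([r for r, _ in left], [t for _, t in left], k, rng),
            grow_tree([r for r, _ in right], [t for _, t in right], k, rng))


def predict_tree(node, row):
    """Follow the splits down to a leaf and return its value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def grow_forest(rows, targets, n_trees=25, k=1, seed=0):
    """Step d: grow many trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx],
                                [targets[i] for i in idx], k, rng))
    return forest


def predict_forest(forest, row):
    """Stage 2: the forecast is the average of the individual tree outputs."""
    return mean(predict_tree(tree, row) for tree in forest)


# Hypothetical toy data with two features per row (not SRLDC data).
rows = [[i, (i * 7) % 5] for i in range(20)]
targets = [2.0 * a + b for a, b in rows]
forest = grow_forest(rows, targets, n_trees=25, k=1, seed=0)
prediction = predict_forest(forest, rows[5])
print(round(prediction, 2))
```

Averaging over trees grown on different samples and feature subsets is what makes the forest less sensitive to individual data points than a single tree.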
C. Flow Chart
Description
Collection of Data: The dataset is collected from SRLDC and carbon dioxide values are estimated as per CEA guidelines.
Random Forest Model: The prepared dataset is given as input to the model. The model is fitted on the training set and checked on the testing set. With the fitted model, we can forecast the values for a week or a month.
Performance Metrics: The designed model is analyzed using Root Mean Squared Error and checked for its effectiveness.
Finally, graphs were plotted and compared.
D. Performance Metrics
Performance metrics measure the effectiveness of a regression model. Of the several evaluation metrics available, we used the root mean squared error (RMSE) to analyze the efficacy of the model. To compute the RMSE, we first square the residual errors, i.e., the differences between the actual and predicted values. We then take the average of these squared errors. Finally, taking the square root of that average gives the RMSE value.
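The three steps of the RMSE calculation translate directly into a short Python function. The sketch below is illustrative only. The function name and the sample values are hypothetical and are not results from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1: square the residuals (actual minus predicted).
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Step 2: average the squared residuals.
    mean_squared_error = sum(squared_errors) / len(squared_errors)
    # Step 3: take the square root of the average.
    return math.sqrt(mean_squared_error)


# Hypothetical values: residuals are 1, 0 and -2, so the mean square is 5/3.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
print(round(rmse(actual, predicted), 4))  # 1.291
```

Because the errors are squared before averaging, large individual errors weigh more heavily on the RMSE than small ones, and the result is expressed in the same units as the emission values.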
Figs. 1 to 4 allow the emission trends of the states to be compared.
Fig. 1 shows the emission plot of Andhra Pradesh for the year 2022. The emission values are consistent; they neither increase nor decrease.
In Karnataka (Fig. 2), emission values started to decrease gradually after May 2022, as the state started to use renewable sources of power generation, mainly hydropower.
In the case of Tamil Nadu (Fig. 3), the values increase steadily throughout the year. This suggests that Tamil Nadu should start to use renewable means of power generation to decrease its CO2 emission.
Finally, for Telangana (Fig. 4), the emission decreases between June and September and then gradually returns to the level of the months before June.
B. Analysis of Forecast and Evaluation Metrics
After training the model with the data of the year 2022, a forecast is made for the next month, i.e., January 2023. The predicted values for January 2023 are evaluated against the original values of that month using the RMSE metric.
The following plots show the predicted and original CO2 emission values for the first month of 2023.
Conclusion
In this paper, the machine learning model random forest regression is used to forecast carbon dioxide emission values. The paper describes the nature of CO2 emission in Andhra Pradesh, Karnataka, Tamil Nadu, and Telangana. It also shows how sustainably each state is shifting towards renewable power generation. Finally, it describes the results of the experiment carried out to evaluate the designed model using RMSE and R2 scores. In the future, the model can be extended to forecast not only for weeks and months but also to account for the efficiency of a power station, because CO2 emission decreases when the efficiency of a power station increases. The CO2 emission could then be controlled by planning the operation of the stations in each state according to their efficiency.
References
[1] N. Stern, The Economics of Climate Change: The Stern Review. Cambridge, U.K.: Cambridge Univ. Press, 2007, pp. 1-8.
[2] Dayaratne S P and Gunavardana K D 2014 Carbon Footprint Reduction: a critical study of rubber production in small and medium scale enterprises in Sri lanka J. Clean. Prod.
[3] IEA (2022), Global Energy Review: CO2 Emissions in 2021, IEA, Paris License: CC BY 4.0
[4] Fang X, Liu W, Ai J, He M, Wu Y, Shi Y, Shen W, Bao C (2020) Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China. BMC Infectious Diseases 20(1):1–8
[5] Prof. Swapnil Wani, Mr. Akash Akhilesh Yadav, Mr. Mihir Mukesh Panchal, Mr. Prashant Vinod Pandey, Predicting CO2 emission using Machine Learning, International Journal for Research in Engineering Application & Management, Vol-08, Apr 2022, 86-88
[6] Bakay MS, Ağbulut Ü (2021) Electricity production based forecasting of greenhouse gas emissions in Turkey with deep learning, support vector machine and artificial neural network algorithms. J Clean Prod 285:125324
[7] D. N. Moriasi, J. G. Arnold, M. W. Van Liew, R. L. Bingner, R. D. Harmel, T. L. Veith, Model evaluation guidelines for systematic quantification of accuracy in watershed simulations, 2007, American Society of Agricultural and Biological Engineers, ISSN 0001-2351, Vol. 50(3): 885-900