Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Tanmya Vishvakarma
DOI Link: https://doi.org/10.22214/ijraset.2022.47098
This is an extremely contentious subject of study, as many people across the world don't believe that humans are responsible for climate change. The effects of climate change on human society will create an urgent need for adaptation, as this is the only way that humans can hope to survive the increasingly chaotic weather of the future. The rapid development of machine learning (ML) algorithms has sparked innovations in numerous fields of study and has been proposed as a way to improve climate studies. Since global warming affects not just people but also many other species, efforts to reduce it may benefit everyone on Earth. To examine the global climate's development since 1800, we use machine learning methods. The Kaggle datasets we use cover a wide range of temperature metrics since 1750, each paired with its uncertainty (the 95% CI around the mean): land average temperature, land maximum temperature, land minimum temperature, and land and ocean average temperature. Another dataset measures the average worldwide concentration of carbon dioxide since 1958; it was compiled by the Earth System Research Laboratory of the United States Government. We propose focusing equally on machine learning and artificial intelligence to better interpret and benefit from current data and simulation. Our findings provide insight into the difficulties and potential benefits of data-driven climate modeling, particularly regarding the future integration of increasingly enormous model datasets.
I. INTRODUCTION
The effects of climate change are becoming increasingly obvious. Extreme weather events, such as hurricanes, tornadoes, hail, lightning, fires, and floods, have become more frequent and more intense in recent years. Humanity's access to the natural resources and agricultural practices that sustain it is in jeopardy as global ecosystems undergo rapid transformation. According to projections made in the 2018 international report on climate change, catastrophic results are certain if greenhouse gas emissions are not completely eliminated over the next thirty years. However, these emissions continue to climb year after year. Both reducing emissions and adapting to the effects of climate change (preparing for unavoidable consequences) are necessary, and both have complex facets. Reducing greenhouse gas (GHG) emissions requires changes to electrical systems, transportation, buildings, industries, and land use. Adaptation requires planning for resilience and disaster management, informed by knowledge of climate and catastrophic events. The wide variety of issues can be considered a positive sign, as it means there are many ways to make a difference. [1]
To help decision makers better manage present and future climate change risks, we need a deeper comprehension of risk interconnections and dynamics. As a response to this challenge, researchers have begun to experiment with novel analytical methods and techniques, such as the use of Machine Learning (ML) to make the most of the potential offered by the abundance and diversity of spatio-temporal big data in environmental contexts. This paper covers the current state of the art and future possibilities for ML approaches in Climate Change Risk Assessment (CCRA), a growing area of interest. Together, sciento-metric and systematic analysis were used to conduct a comprehensive literature search covering the years 2000–2020. The results of the study revealed that several different ML algorithms have been used before in CCRA, with Decision Tree, Random Forest, and Artificial Neural Network being the most common. Most flood and landslide risk events are analyzed using these algorithms in an ensemble or hybridized fashion. The examined CCRA applications all use ML to process remote sensing data in a consistent and efficient manner, which facilitates the detection of environmental and structural aspects, as well as the identification and categorization of targets. In contrast to research evaluating dangers under present conditions, literature concerning future climate change scenarios does not seem to be highly pervasive in scientific production. Since these notions have just recently emerged in CCRA literature, they have not yet been combined with ML-based applications, leading to the same conclusion. [2]
Visualization Techniques for Climate Change with Machine Learning and Artificial Intelligence focuses on the effects of climate change and how they might be avoided or lessened. There is hope that various algorithms, mathematical relations, and software models can help us make sense of our world, forecast the weather, and develop novel products and services that reduce our negative impact on the planet while simultaneously increasing our opportunities for success in areas like healthcare and environmental sustainability. It discusses several methods for predicting climate change and alternative systems that might mitigate the risks climate scientists have detected. Such research also helps advance climate action, one of the 17 sustainable development goals. [3] The Intergovernmental Panel on Climate Change (IPCC), established by the United Nations in 1988 and honored with the Nobel Peace Prize in 2007 (shared with Al Gore, the former Vice President of the United States), evaluates climate projections based on the results of approximately 50 climate models, which are supported by approximately 25 laboratories around the world. We are curious whether, even at our level, we could use some basic machine learning models to evaluate climate change. As a result, the machine learning algorithms we use are more straightforward than those used by climate scientists. The public may have a better understanding of the gravity of environmental issues thanks to Big Data.
As more data is collected, we share graphs and plots that even those without scientific training can understand. In this project, we therefore place equal emphasis on the novelty and usability of the plotting, primarily through the use of a remarkable library called Bokeh. The majority of our graphics are included in the report; however, we encourage readers to explore them directly to take full advantage of their interactivity. To proceed, a large database had to be located. Kaggle, an online competition platform, provided us with one. The dataset consists of five files, each of which is a record of monthly temperatures beginning in 1750: (1) global, (2) by nation, (3) by state, (4) by big city, and (5) by individual city. Three of these files are used in this project, among them (4). Finally, we include a new factor, the amount of CO2 pollution, and assess its effect on our models.
II. REVIEW OF LITERATURE
[4] (Adedeji, 2014) Changes in the climate are already having serious consequences for the world around us, and those repercussions are only expected to increase. Global climate change has unparalleled implications, from changing weather patterns that threaten food production to increasing sea levels that increase the danger of catastrophic flooding. Taking preventative measures now will make future adaptation to these impacts easier and cheaper than if we wait. In this introduction, we'll go over what "Global Climate Change" is, some related terms, some potential impacts on human health, and some potential solutions. It demonstrates the critical importance of taking swift action to prevent the catastrophic buildup of greenhouse gasses (GHGs) and the resulting global warming, which might have devastating effects on economies and societies throughout the world. In order to combat climate change, "historic levels of collaboration" are needed between nations, governments, businesses, and individuals.
[5] (Klingelhöfer, 2020) Safe to say that climate change is one of humanity's most pressing problems. Greenhouse gasses accelerate climate change, and their accumulation in the atmosphere is mostly attributable to human activity, especially the burning of fossil fuels. Damage from climate change is already being felt, and it will have far-reaching consequences. Each country has unique options to avoid unpleasant repercussions based on its economic history since global effects vary widely and will lead to extremely varied national susceptibility to climate impacts. Depending on their level of readiness and susceptibility, governments will need to make decisions to lessen the effects of climate change. For this, we need a basis in hard science. To help set the stage, a bibliometric study was carried out to showcase international research on climate change using known and specialized measures.
[6] (Margherita Grasso Manera, 2012) This research aims to do just that by reviewing the best available literature on the topic of how climate change affects the spread of infectious and non-infectious illnesses, using malaria as an example. Time series models, panel data and spatial models, and non-statistical techniques are the three main categories into which the methodological details of each study will be broken down.
[7] (Oppenheimer, 2021) Climatic Change examines the whole scope of the problem of climatic variability and change, from its descriptions to its causes to its consequences and linkages. This publication's mission is to improve collaboration among experts working to address climate change's effects by publishing their findings in a single venue. This opens the door for authors to communicate the gist of their studies to those in other climate-related disciplines and to interested non-specialists, and it also facilitates the reporting of research in which the novelty lies in the combination of (not necessarily original) work from several disciplines. Besides its regular articles, this journal also has lively editorial and book review sections.
[8] (Castro, 2019) This study examined the vulnerability of El Colli to climate change using a variety of research tools and techniques, with a particular emphasis on the city's vulnerability to flooding. Findings shed insight on how vulnerability manifests itself in the setting of urban poverty. We argue that the evidence requires the climate change discourse to include local urban concerns, especially those affecting informal settlements.
[9] (Reckien, 2017) Climate change is widely regarded as the greatest threat to our civilizations in the future decades, and it may threaten the lives of the large and diverse people who have made cities their homes in this century of urbanization. As the impacts of climate change become more dire, discussions about cities are likely to shift from an overarching perspective to concentrate on specific populations and how they will be impacted. This is because urban areas are home to widely varied individuals with varying vulnerabilities. The issue of urban equality is therefore thrust into the spotlight.
[10] (Glavovic, 2021) There has been a breach of the agreement between science and society. Weather patterns are shifting. Science provides evidence for the causes, the pace of worsening, the effects on people and on social-ecological systems, and the necessity of taking action. Across governments, there is consensus that this issue has been thoroughly researched and resolved. Despite mounting data, new warnings, and innovative approaches, indications of harmful global change continue to climb, making climate change science a tragic field.
[11] (Massazza, 2022) We then highlight areas, such as intervention research, climate change assessments, and the collection of mental health metrics, where new methods may need to be implemented in the near future. Both public mental health and environmental epidemiology frameworks are drawn upon in this article. The purpose is not to provide precise definitions of each methodology, but rather to highlight opportunities for combining different methods, encouraging collaboration across disciplines, and inspiring novel approaches to research design.
[12] (L. Kaack, 2022) The possible influence of growing AI and ML on global greenhouse gas emissions is the subject of increased speculation. The complex nature of the consequences of these emissions makes it difficult to quantify and predict them. In this study, we provide a technique for comprehensively documenting the effects of ML on GHG emissions, categorizing these results as follows: computing-related impacts; direct effects of utilizing ML; system-level impacts. Utilizing this methodology, we may identify which facets of ML's effect on climate change mitigation require more investigation via impact assessment and scenario analysis, and then recommend appropriate policy levers to bring about these changes. Climate change is only one area where the development of AI is having a significant impact on society. In this Perspective, we provide a framework for assessing AI's effect on GHG emissions, as well as recommendations for aligning AI with efforts to mitigate climate change.
[13] (Tahmineh Ladi, 2022) The climate catastrophe presents a number of difficult problems, and this document will outline some of them and the potential role that artificial intelligence might play in finding answers. In particular, it will highlight a variety of real-world applications of AI and future areas where the technology shows great potential. To further elaborate on the goals of the AI and Climate Change Impact Weekend 2020, this document serves as an addendum to the Challenge Statement. As well as setting the stage for the issue at hand, this reading is meant to spark students' imaginations about the potential applications of AI in the fight against climate change.
[14] (Tom Beucler, I. Ebert-Uphoff, 2020) Machine learning (ML) techniques are strong tools for developing models of clouds and climate that are more faithful to the rapidly increasing amounts of Earth system data than commonly used semiempirical models. Here, we examine ML techniques, including interpretable and physics-guided ML, and detail how they might be applied to cloud-related processes in the climate system, including radiation, microphysics, convection, and cloud detection, classification, simulation, and uncertainty quantification.
[15] (Kyle Tilbury, 2020) With the help of machine learning, we may be able to mitigate some of the worst effects of climate change. Approaches such as informing individuals of their carbon footprint and of measures to lessen it are examples of how machine learning has been used in the past to address the human dimension of climate change. These strategies will be most effective if they take into account the specific social and psychological characteristics of each individual. Affect is a social-psychological factor in climate change that has been shown to play a significant role in people's perspectives and motivation to take action to mitigate its effects. We propose a study investigating the potential of incorporating emotion into machine learning-based solutions for climate change.
[16] (Zhongkai Shangguan, Zihe Zheng, 2021) In this study, we assembled a large Twitter dataset about climate change and analyzed it extensively using machine learning. We use topic modeling and natural language processing to demonstrate how the volume of tweets concerning climate change correlates with the occurrence of big weather events, as well as to reveal the most often discussed aspects of this issue and the general tone of online discourse.
III. RESEARCH METHODOLOGY
We started by collecting temperature data from throughout the globe. We also use several fundamental plotting libraries, namely Matplotlib, Seaborn, and Bokeh, and the data analysis libraries NumPy and Pandas. Lastly, the scikit-learn and XGBoost machine learning libraries were used for our models.
After that, we de-noise our data and give it comprehensible visual representation. The occurrence of NaN values was the first problem we noticed. Inferring the global land average temperature in the 18th century is likely to have been a far more challenging task. After using the dropna() technique of data cleansing, we discover that the dataset is only of use after the year 1800. Consistent with this is the fact that compiled 19th century temperature data are often used by climate experts. Only the Land Average Temperature and the 95% Uncertainty of that Temperature are of significance to us. We are aware that if we preserve the data in this format, it will be difficult to present the yearly temperatures. Hence we generated a smaller data frame and resized the data.
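The cleaning and resizing step described above can be sketched as follows. This is a minimal illustration on a small synthetic frame, not the author's code: the column names `dt`, `LandAverageTemperature`, and `LandAverageTemperatureUncertainty` are assumed to match the Kaggle file, and all values are made up.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the monthly global temperature file (values are made up).
dates = pd.date_range("1799-01-01", "1801-12-01", freq="MS")
monthly = pd.DataFrame({
    "dt": dates,
    "LandAverageTemperature": np.linspace(7.0, 9.0, len(dates)),
    "LandAverageTemperatureUncertainty": 0.5,
})
# Simulate the gaps found in the early part of the record.
monthly.loc[monthly["dt"].dt.year == 1799, "LandAverageTemperature"] = np.nan

# Drop rows with a missing temperature, then keep only the columns of interest.
clean = monthly.dropna(subset=["LandAverageTemperature"])
clean = clean[["dt", "LandAverageTemperature", "LandAverageTemperatureUncertainty"]]

# Collapse the monthly records into one mean value per year.
yearly = (
    clean.assign(year=clean["dt"].dt.year)
    .groupby("year")[["LandAverageTemperature", "LandAverageTemperatureUncertainty"]]
    .mean()
)
```

Averaging by year is one plausible reading of "resized the data"; other aggregations (for example a rolling mean) would follow the same pattern.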
A. Dataset Description
The first (temperature) dataset originates from the online platform Kaggle [17]. The second is a record of the average worldwide concentration of carbon dioxide (CO2) since 1958 by the United States Government's Earth System Research Laboratory, Global Monitoring Division. The Kaggle dataset repackages a recent compilation by Berkeley Earth, which is affiliated with the Lawrence Berkeley National Laboratory [18]. Researchers at Berkeley Earth combined 1.6 billion temperature observations from 16 distinct sources to construct the Earth Surface Temperature Study. The data is packaged neatly and may be sliced into subsets for additional investigation (for example by country). They released both the original data and the software used to transform it. In addition, they use methods that permit the inclusion of meteorological observations from shorter time series, so fewer observations need to be discarded.
B. Data Pre-processing
To maximize the effectiveness of the machine learning algorithm, the data must be pre-processed. The first step in the pre-processing pipeline is normalizing the data. The process is useful for linear data transformation. Missing data or missing values occur when there is no recorded value for a certain attribute of a dataset. Unfortunately, missing data is common, and this might affect the conclusions that can be drawn from data sets.
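As an illustration of the normalization step, the sketch below applies min-max scaling, a linear transformation, using statistics from the training data only. The paper does not say which normalization was used, so min-max scaling is an assumption, and `min_max_scale` is a hypothetical helper.

```python
import numpy as np

def min_max_scale(train, test):
    """Linearly rescale both arrays using the range observed in the training data."""
    lo, hi = np.min(train), np.max(train)
    span = hi - lo if hi > lo else 1.0  # avoid division by zero for constant input
    return (train - lo) / span, (test - lo) / span

# Made-up temperatures in degrees Celsius.
train = np.array([8.0, 8.5, 9.0])
test = np.array([9.5])
train_scaled, test_scaled = min_max_scale(train, test)
```

Computing the range on the training set alone keeps information from the test period out of the model; test values can then fall outside [0, 1], as they do here.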
C. Model Generation
In our proposed approach, we try out a wide variety of machine learning techniques. We develop a metric that measures how well the model fits the given data to ascertain their level of accuracy.
1. Linear Regression: As can be seen in the graph, the regression fits temperatures well over a wide range when it is trained on the whole dataset. We then separate the data into a training set and a test set in order to assess it as a prediction model. The training set covers 1800-1949 and the test set covers 1950-2015. Based on the data and the visualization, it is clear that linear regression performs poorly in this test. The model's parameters gave a good match to the temperature data when built using information from the whole dataset, but not when using just a portion of the data. This makes it evident that linear regression is not a viable method of making forecasts.
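The chronological split described above can be sketched as follows. The temperature series is synthetic (a made-up curve that warms faster in later years), so the numbers only illustrate the effect and do not reproduce the paper's results; the explained variance score is used here as one common regression metric.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score

# Synthetic yearly temperatures: nearly flat early on, warming faster later (made up).
years = np.arange(1800, 2016)
temps = 8.0 + 1e-4 * (years - 1800) ** 2

X = years.reshape(-1, 1)
train = years <= 1949   # 1800-1949 for training
test = ~train           # 1950-2015 for testing

model = LinearRegression().fit(X[train], temps[train])
score_train = explained_variance_score(temps[train], model.predict(X[train]))
score_test = explained_variance_score(temps[test], model.predict(X[test]))
```

Because the straight line is fitted only to the slowly warming early period, it underestimates the later trend and the score on the test years drops.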
2. Polynomial Regression: Replacing the basic linear model with a polynomial function has traditionally been the go-to strategy for extending linear regression to settings where the relationship between predictors and response is nonlinear. A nonlinear link between year and temperature is apparent in our data. As a result, we construct a polynomial regression with deg(P) = 3. The initial step is to train the model using the whole dataset. Once again, we use visual aids and common regression metrics to determine how well the model fits the data.
This model seems to fit the data better than the first linear regression. We next test whether it can forecast accurately. Once again, the training set begins in 1800 and goes through 1949, whereas the test set extends to 2015.
The variance score here is substantially higher than the one obtained from the second linear regression (the one trained on 1800-1949). Unfortunately, the model is still not suitable for forecasting. We have the same problem as with linear regression.
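A degree-3 polynomial regression with the same chronological split might look like the sketch below. The data is synthetic (a made-up exponential-like warming curve), and the rescaling of the year is an added numerical convenience rather than something the paper describes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

years = np.arange(1800, 2016)
temps = 8.0 + 0.1 * np.exp((years - 1900) / 50.0)  # synthetic, made-up curve

# Centre and rescale the years so the cubic terms stay numerically well behaved.
X = ((years - 1900) / 100.0).reshape(-1, 1)
train = years <= 1949

poly = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())

poly.fit(X, temps)                    # fit on the whole record
full_fit_mae = np.mean(np.abs(temps - poly.predict(X)))

poly.fit(X[train], temps[train])      # refit on 1800-1949 only
forecast_mae = np.mean(np.abs(temps[~train] - poly.predict(X[~train])))
```

The cubic follows the curve closely when it sees every year, but extrapolating beyond the training period amplifies small fitting errors, which is consistent with the forecasting difficulty noted above.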
3. Random Forest: Here we use the Random Forest Regressor provided by the scikit-learn package. We tried a few different values of n (the number of trees) and found that 10 was optimal. To improve upon the accuracy of our prior regressions, we use a more extensive set of input variables: we regress on the previous month's minimum and maximum land temperatures and the land and ocean average temperature. If the inputs (min, max, Land_ocean_average) come only from the same year as the output (average), the result will be skewed. The training set covers the years 1800-1980, whereas the test set includes data from 1981 onward. Since then, we've been able to fine-tune our model even further. The Mean Euclidean Distance (MED), the metric defined in Section IV, is used to evaluate the goodness of model fit.
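The lagged-feature setup can be sketched as below, assuming a one-step lag between inputs and output. All values are synthetic, and the column order (land minimum, land maximum, land and ocean average) is an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
years = np.arange(1800, 2016)
trend = 8.0 + 0.005 * (years - 1800)

# Columns: land min, land max, land-and-ocean average (synthetic, made-up values).
features = np.column_stack([trend - 6, trend + 6, trend + 7])
features = features + rng.normal(0, 0.1, features.shape)
target = trend + rng.normal(0, 0.1, years.size)  # land average temperature

# Lag the inputs by one step: the features of period t-1 predict the target of period t.
X, y, y_years = features[:-1], target[1:], years[1:]

train = y_years <= 1980
forest = RandomForestRegressor(n_estimators=10, random_state=0)
forest.fit(X[train], y[train])

pred = forest.predict(X[~train])
med = np.mean(np.abs(y[~train] - pred))  # MED for scalar targets
```

Lagging the inputs avoids predicting the average from minimum and maximum values recorded in the same period, which would make the task trivially easy.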
4. XGBoost: The eXtreme Gradient Boosting Regressor (XGBRegressor) is applied after the Random Forest Regressor. The input and output variables, as well as the training and test sets, remain unchanged. We run the model with its default settings and evaluate the results. The resulting MED shows that the model outperforms the Random Forest Regressor.
5. KNN: Next, we use a machine learning strategy called K-Nearest Neighbors. After a great deal of experimentation, we find that the optimal value of k for MED minimization is 2. Aside from the model and the choice of k, everything else, including the input and output variables and the test and training datasets, is the same. Compared to Random Forest and XGBoost, the MED produced by KNN is higher, that is, worse. The XGB Regressor is currently the top performing model, so we aim to beat it by tuning the KNN hyperparameters to get a lower MED. To find the optimal values for our hyperparameters, we use a grid search and a hand-optimized minimization. The scikit-learn package provides the GridSearchCV feature.
6. GridSearchCV: GridSearchCV is a technique for determining the optimal values of a model's hyperparameters. The previous section highlighted the importance of hyperparameter settings in determining a model's output. The optimal settings cannot be known in advance, so it is necessary to test every combination in a chosen grid of candidate values. We use GridSearchCV to automate the process of adjusting hyperparameters, since doing it manually would require a significant investment of time and energy.
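A grid search over KNN hyperparameters could be set up as in the following sketch. The data and the parameter grid are made up for illustration; `TimeSeriesSplit` and the mean-absolute-error scoring are assumptions chosen to mirror a chronological split and a MED-style objective, not settings reported in the paper.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))          # synthetic inputs
y = X.sum(axis=1) + rng.normal(0, 0.1, 200)    # synthetic target

# Hypothetical grid; every combination is fitted and scored.
param_grid = {"n_neighbors": [1, 2, 3, 5, 8], "weights": ["uniform", "distance"]}

search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid,
    scoring="neg_mean_absolute_error",  # maximising this minimises the mean absolute error
    cv=TimeSeriesSplit(n_splits=3),     # folds that respect temporal order
)
search.fit(X, y)

best_params = search.best_params_
best_med = -search.best_score_
```

The cost grows with the product of the grid sizes times the number of folds, which is why larger grids quickly become expensive.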
a. Impact of CO2 on our ML Models: We expanded our data set to include a CO2 variable. CO2 measurements are available from 1958 to 2016. We have to exclude 2016 since the temperature dataset only contains data through 2015. The Random Forest Regressor and the XGBoost Regressor, the two models that have had the best outcomes so far, are used again in our evaluation with the CO2 data. To find the best hyperparameters in this situation, we directly employ the optimization techniques described above. We merge the new CO2 data with the existing dataset.
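Combining the two sources amounts to a join on the year, as in this small sketch. The column names `land_avg` and `co2_ppm` are hypothetical and the values are made up.

```python
import pandas as pd

# Made-up yearly values; temperatures end in 2015, CO2 runs from 1958 to 2016.
temps = pd.DataFrame({"year": [1957, 1958, 1959, 2015],
                      "land_avg": [8.7, 8.8, 8.7, 9.8]})
co2 = pd.DataFrame({"year": [1958, 1959, 2015, 2016],
                    "co2_ppm": [315.0, 316.0, 401.0, 404.0]})

# An inner join keeps only the years present in both sources.
merged = temps.merge(co2, on="year", how="inner")
```

With an inner join, 1957 (no CO2 value) and 2016 (no temperature value) are dropped automatically, which matches the exclusion described above.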
Now that the dataset has been updated, we may utilize our prediction models. This time, we choose to use the strategies for hyperparameters optimization directly. Following a number of attempts, the set on which we are minimizing was selected.
IV. RESULT AND ANALYSIS
We test several machine learning models. We establish a statistic that gauges the model's fit to the data in order to assess how accurate it is. We refer to it as Mean Euclidean Distance (MED), and its definition is as follows:
$$\mathrm{MED} = \mathbb{E}\,\lVert \beta - \hat{\beta} \rVert$$
Here $\beta$ is the temperature's actual value, $\hat{\beta}$ is its predicted value, and $\lVert \cdot \rVert$ is the Euclidean norm.
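One way to compute the MED is sketched below, reading the definition as the average over samples of the Euclidean norm of the prediction error. `mean_euclidean_distance` is a hypothetical helper; for scalar temperatures the norm reduces to the absolute value, so the MED coincides with the mean absolute error.

```python
import numpy as np

def mean_euclidean_distance(actual, predicted):
    """Average Euclidean norm of the per-sample difference between actual and predicted."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    if diff.ndim == 1:
        # Scalar targets: the Euclidean norm of a scalar is its absolute value.
        return float(np.mean(np.abs(diff)))
    # Vector targets: one norm per row, then the mean over rows.
    return float(np.mean(np.linalg.norm(diff, axis=1)))

med = mean_euclidean_distance([8.0, 9.0, 10.0], [8.5, 9.0, 9.0])
```

In this toy case the absolute errors are 0.5, 0 and 1, so the MED is their mean, 0.5.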
With default settings, XGBoost produced a MED of 0.2051560, which makes it more accurate than the Random Forest Regressor (0.21349). KNN outputs a MED of 0.2376090, which is greater, and therefore worse, than both Random Forest and XGBoost. After adding the CO2 variable and tuning the hyperparameters, however, the XGBoost result, with a best MED of 0.284059, is markedly inferior to the best MED of 0.154456 for the Random Forest. Recall that the XGB Regressor outperforms the Random Forest when the new variable is not present.
We offer two explanations for this counterintuitive finding. First, because the new variable has no values prior to 1958, the training set is much smaller than it was previously. Second, given the constraints of our computational capacity, we could only minimize over a limited set of hyperparameter values; we think that searching a larger set would have produced more encouraging results for the XGBRegressor. For the Random Forest Regressor, we infer that the addition of the new variable significantly affects the MED.
Table 1. Result comparison with traditional method.
Algorithms | MED
Random Forest | 0.21349
XGBoost | 0.20515
KNN | 0.23760
Table 2. Result comparison with hyper-parameter tuning and new variable.
Algorithms | Best parameter | Accuracy | MED
Random Forest | {'max_depth': 13, 'n_estimators': 95} | 0.9995 | 0.154456
XGBoost | {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 520} | 0.9333 | 0.284059
V. ACKNOWLEDGEMENTS
I would like to express my deep gratitude to my Professor Ms. Divya Sharma, my research supervisor, for her patient guidance, enthusiastic encouragement and constructive criticism for this study effort.
VI. CONCLUSION
The first part of the results shows that, like CO2 concentration, temperatures have been on the rise. In addition, a correlation study between CO2 concentration and temperature shows that an increase in CO2 concentration is associated with a temperature rise. A number of successful machine learning strategies were examined, and we compared the performance of several models. Judged by MED on the primary dataset, the most effective models are the Random Forest Regressor and the XGBoost Regressor. The Random Forest Regressor's output improves when we add the CO2 variable, whereas the XGBoost Regressor's does not. Both GridSearch and manual optimization are very time-consuming approaches to MED reduction, which is a significant problem for us; access to more processing power would have helped us achieve better outcomes. In relative terms, the MED is rather small, at roughly 0.1%. Further research into other machine learning methods, including additional ensemble-based approaches, might be conducted in the future in an attempt to discover improved models. To further lessen the impact of the temperature rise, we need to account for other associated factors.
REFERENCES
[1] D. R. and P. L. D. et al., “Tackling Climate Change with Machine Learning,” 2019, [Online]. Available: https://arxiv.org/pdf/1906.05433.pdf
[2] F. E. et al., “Exploring machine learning potential for climate change risk assessment,” ScienceDirect, 2021, [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0012825221002531
[3] A. D. et al. Arun Srivastav, “Visualization Techniques for Climate Change with Machine Learning and Artificial Intelligence,” Elsevier, 2022, [Online]. Available: https://www.elsevier.com/books/visualization-techniques-for-climate-change-with-machine-learning-and-artificial-intelligence/srivastav/978-0-323-99714-0
[4] O. Adedeji, “Global Climate Change,” ResearchGate, 2014, [Online]. Available: https://www.researchgate.net/publication/276495677_Global_Climate_Change
[5] D. R. M. et al. Klingelhöfer, “Climate change: Does international research fulfill global demands and necessities?,” SpringerOpen, 2020, [Online]. Available: https://enveurope.springeropen.com/articles/10.1186/s12302-020-00419-1
[6] M. Margherita Grasso Manera, “The Health Effects of Climate Change: A Survey of Recent Quantitative Research,” Natl. Libr. Med., 2012, [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3386570/
[7] M. Oppenheimer, “Climatic Change,” Springer Link, 2021, [Online]. Available: https://www.springer.com/journal/10584
[8] J. A. G. Castro, “Climate change and flood risk: vulnerability assessment in an urban poor community in Mexico,” 2019, [Online]. Available: https://journals.sagepub.com/doi/full/10.1177/0956247819827850
[9] D. Reckien, “Climate change, equity and the Sustainable Development Goals: an urban perspective,” 2017, [Online]. Available: https://journals.sagepub.com/doi/full/10.1177/0956247816677778
[10] B. C. Glavovic, “The tragedy of climate change science,” 2021, [Online]. Available: https://www.tandfonline.com/doi/full/10.1080/17565529.2021.2008855
[11] A. Massazza, “Quantitative methods for climate change and mental health research: current trends and future directions,” [Online]. Available: https://www.thelancet.com/journals/lanplh/article/PIIS2542-5196(22)00120-6/fulltext#
[12] P. D. L. Kaack, “Aligning artificial intelligence with climate change mitigation,” Semant. Sch., 2022, [Online]. Available: https://www.semanticscholar.org/paper/Aligning-artificial-intelligence-with-climate-Kaack-Donti/8e45e5d8e4d7ca24699b516105414b29f71431e2
[13] S. J. et al. Tahmineh Ladi, “Applications of machine learning and deep learning methods for climate change mitigation and adaptation,” Semant. Sch., 2022, [Online]. Available: https://www.semanticscholar.org/paper/Applications-of-machine-learning-and-deep-learning-Ladi-Jabalameli/3e44af63f0c79f46b40ac38668ad79f67fd4c383
[14] E. a. Tom Beucler, I. Ebert-Uphoff, “Machine Learning for Clouds and Climate,” Semant. Sch., 2020, [Online]. Available: https://www.semanticscholar.org/paper/Machine-Learning-for-Clouds-and-Climate-Beucler-Ebert?Uphoff/ea0ccde42b1f4517e87ee2e3f77fb6c06671666b
[15] J. H. Kyle Tilbury, “The Human Effect Requires Affect: Addressing Social-Psychological Factors of Climate Change with Machine Learning,” Semant. Sch., 2020, [Online]. Available: https://www.semanticscholar.org/paper/The-Human-Effect-Requires-Affect%3A-Addressing-of-Tilbury-Hoey/cf0fd2632dff13bc69daeeb98118d1b7644418a0
[16] E. a. Zhongkai Shangguan, Zihe Zheng, “Trend and Thoughts: Understanding Climate Change Concern using Machine Learning and Social Media Data,” Semant. Sch., 2021, [Online]. Available: https://www.semanticscholar.org/paper/Trend-and-Thoughts%3A-Understanding-Climate-Change-Shangguan-Zheng/fd85cdaf951693c76dde2974a9c2502fe6c8bcc9
[17] B. Earth, “Climate Change: Earth Surface Temperature Data.” https://www.kaggle.com/datasets/berkeleyearth/climate-change-earth-surface-temperature-data
[18] B. Earth, “climate data.” http://berkeleyearth.org/data/
Copyright © 2022 Tanmya Vishvakarma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET47098
Publish Date : 2022-10-17
ISSN : 2321-9653
Publisher Name : IJRASET