A Survey of Data Science Approaches in Physics

Authors: Abhishek Kumar Sarkar, Anuj Gupta, Yuvraj Joshi

DOI Link: https://doi.org/10.22214/ijraset.2022.43717

Abstract

Data is now growing rapidly, and data science has become a central attraction for many scientists involved in recent discoveries. This paper presents a review of various data science approaches that can be used in physics. It describes how data science (especially machine learning) has helped physics discoveries in the past and how it can help physics in the future.

I. INTRODUCTION

Data science can be defined as a combination of mathematics, a variety of tools, algorithms and the basics of machine learning, which helps us uncover hidden insights and patterns in raw data that can be used in making major decisions, including business decisions. Data science has contributed to many recent high-profile discoveries in physics, especially in particle physics and astronomy. In particle physics, collider data sets are very large and contain a lot of noise. Machine learning can be used to filter this data for the particles or events of interest, which allows researchers to obtain results from experiments costing billions of dollars more quickly and with less wasted time. It also allows researchers to filter out events that are not directly related to the process they are studying but that nevertheless occur within the same experiment.

II. SUBFIELDS OF DATA SCIENCE THAT CAN BE USED IN PHYSICS

Several subfields of data science can be used in physics discoveries:

A. Machine Learning

Machine learning (ML) is rapidly providing new and efficient tools for scientists to extract important information from large amounts of data, whether from experiments or simulations. Progress in every branch of the physical sciences can be achieved by adopting, developing and applying machine learning techniques to investigate highly complex data in ways that were not possible before. Physics (indeed, science in general) is largely about finding patterns in data and interpreting them. That is central to data science as well, though with a different twist. Both involve a fair amount of coding and, depending on the type of data science, a fair amount of mathematics; depending on the type of physics, even more. In physics one can occasionally get by with less rigorous data analysis, because from time to time the effect is strong enough to stand out.

B. Deep Learning

Deep learning is a relatively new area of machine learning that attempts to model the high-level abstractions present in raw data, in order to detect a variety of high-level features in the data and to make accurate predictions for unseen data. This is achieved by transforming the raw data through multiple layers of deep architectures such as neural networks. Deep learning pursues the goal of true artificial intelligence and has recently attracted great interest from machine learning researchers. Tech giants like Google, Microsoft, Facebook and Baidu are investing hundreds of millions of dollars in cutting-edge deep learning research and in expanding its applications.

III. HOW DATA SCIENCE HAS HELPED PHYSICS IN THE PAST

Data science has helped physics in past discoveries, most notably in finding the Higgs boson, which we discuss below.

The discovery of the Higgs boson is a first-rate achievement in an extremely active area of physics. Machine learning techniques have been proposed to address the problem of separating signal from background events. To clarify, the signal is the decay of exotic particles, a region of the feature space that is not explained by background processes. The background consists of decays of particles that were already discovered in previous experiments.

A. Physics Background

At the ATLAS detector at CERN, high-energy protons are accelerated along a circular trajectory in both directions and made to collide, with the collisions producing hundreds of particles per second. Physicists at CERN have been studying the Higgs boson since its discovery in 2012 in order to investigate the properties of this unique particle. A Higgs boson created in the proton collisions of the Large Hadron Collider disintegrates (a process called decay) almost instantly into other particles. One of the most important strategies for studying the properties of the Higgs boson is to analyse how it decays into the various particles and the rate of decay.

B. Machine Learning Background

High-energy physicists use a variety of ML techniques to optimize the selection region that produces these signal events. Classifiers are trained on signal and background events, each of which is given a weight that accounts for the difference between the prior probability of the event and its probability in the simulator. The Higgs Boson Machine Learning Challenge started in May 2014 and ran until 15th September 2014. The purpose of the challenge was to develop methods that optimize the selection region for signal events. The objective to be maximized was called the Approximate Median Significance (AMS). It is a function of the weights of the selected events that are true positives and false positives. The problem is formally defined in the technical documentation provided by the organizers, which can be accessed here:

http://higgsml.lal.in2p3.fr/documentation/
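As an illustration of the objective described above, the following is a minimal Python sketch of the AMS computation. It assumes the commonly cited form AMS = sqrt(2((s + b + b_reg) ln(1 + s / (b + b_reg)) - s)), where s and b are the summed weights of true positives and false positives among the selected events. The regularization term b_reg = 10 and the function names are assumptions not stated in this paper; the challenge documentation linked above remains the authoritative definition.

```python
import math


def ams(s, b, b_reg=10.0):
    """Approximate Median Significance for weighted counts s and b.

    s: sum of weights of selected events that are truly signal.
    b: sum of weights of selected events that are truly background.
    b_reg: regularization term (assumed value, see lead-in).
    """
    if s < 0 or b < 0:
        raise ValueError("s and b must be non-negative")
    radicand = 2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s)
    return math.sqrt(radicand)


def ams_from_predictions(y_true, y_pred, weights):
    """Compute AMS from labels (1 = signal, 0 = background).

    Only events predicted as signal (y_pred == 1) are 'selected'.
    """
    s = 0.0  # weighted true positives
    b = 0.0  # weighted false positives
    for truth, pred, weight in zip(y_true, y_pred, weights):
        if pred == 1:
            if truth == 1:
                s += weight
            else:
                b += weight
    return ams(s, b)
```

For example, `ams_from_predictions([1, 1, 0, 0], [1, 0, 1, 0], [2.0, 3.0, 5.0, 7.0])` selects one signal event (s = 2) and one background event (b = 5). A classifier that selects no signal at all scores an AMS of zero.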

C. Implementation

The organizers provided a starter kit on Kaggle, and the baseline submission was made with that kit using a binned naive Bayes classifier. Solutions to this classification problem were implemented using the open-source scikit-learn library in Python.

In total, about 12 different models were used. They are listed below.

K-Nearest Neighbours

K-Means Clustering

Affinity Propagation

Spectral Clustering

AdaBoost Classifier

Decision Tree Classifier

Support Vector Machines

Naive Bayes Classifier

Gaussian Mixture Models

Random Forest Classifier

Gradient Boosting Classifier

Bagging Classifier

The highest AMS on the public leaderboard obtained by any of these classifiers was 3.38, achieved by the Gradient Boosting Classifier with a cut-off value of 85.5. The classification accuracy we obtained was 84%. The hyperparameter settings for this best submission are as follows.

n_estimators = 100

max_depth = 5

min_samples_leaf = 200

max_features = 10

learning_rate = 0.5

For details of the scikit-learn API, visit: www.scikit-learn.org
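The hyperparameters listed above map directly onto the constructor of scikit-learn's GradientBoostingClassifier. The sketch below shows one way such a model might be configured and fitted. The synthetic data set from make_classification, its size, the number of features, the train/test split and the function names are all placeholders chosen for illustration; they do not reproduce the challenge data or the reported scores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split


def build_model(random_state=0):
    """Gradient boosting classifier with the hyperparameters listed above."""
    return GradientBoostingClassifier(
        n_estimators=100,
        max_depth=5,
        min_samples_leaf=200,
        max_features=10,
        learning_rate=0.5,
        random_state=random_state,
    )


def run_demo():
    """Fit the model on synthetic stand-in data and score held-out events."""
    # Synthetic binary classification data (placeholder, not the Higgs data).
    X, y = make_classification(
        n_samples=4000, n_features=30, n_informative=10, random_state=0
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    model = build_model()
    model.fit(X_train, y_train)
    # Probability of the positive ("signal") class for each held-out event.
    signal_proba = model.predict_proba(X_test)[:, 1]
    accuracy = model.score(X_test, y_test)
    return model, signal_proba, accuracy
```

The per-event signal probabilities returned here are the kind of score to which a cut-off would then be applied in order to select the signal region.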

D. Deep Learning Implementation

The main focus of this research work centred on the analysis and implementation of different deep learning strategies for searching for high-energy particles in this competition. Different types of deep neural networks were used. Part of the motivation came from P’s work. Further inspiration came from the work of Geoffrey Hinton of the University of Toronto, Yoshua Bengio of the University of Montreal, Yann LeCun of New York University and Andrew Ng of Stanford University.

The implementation is based on Python, using the open-source NumPy and SciPy libraries, and runs on a cluster of 12 nodes running Red Hat Enterprise Linux with Xeon processors and 64 GB of memory (Rustam3). Training was performed using Stochastic Gradient Descent with mini-batches of size 50. The learning rate was decayed by a factor of 0.0005 per epoch. Momentum was increased linearly over the first hundred epochs from 0.9 to 0.99 and remained unchanged thereafter. The RMSProp method was used with a beta value of 0.9. Weights were drawn from a Gaussian distribution with mean 0 and a variance of 0.1 in the first layer, 0.05 in the remaining hidden layers and 0.01 in the output layer. All hidden layers were pre-initialized using a greedy layer-wise pre-training algorithm with stacked auto-encoders, each with one hidden layer. Network hyperparameters were tuned using different subsets of the training data of sizes 1,000, 10,000 and 50,000.
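The training settings described above (mini-batches of size 50, momentum rising from 0.9 to 0.99 over 100 epochs, a small per-epoch learning rate decay, and zero-mean Gaussian weight initialization) can be sketched with NumPy as follows. This is only an illustration under several assumptions: the multiplicative form of the decay, the base learning rate of 0.05, the reading of the quoted spread values as variances, the layer sizes and all function names are hypothetical and are not taken from the original implementation.

```python
import numpy as np


def momentum_at(epoch, start=0.9, end=0.99, ramp_epochs=100):
    """Linear ramp from start to end over ramp_epochs, constant afterwards."""
    fraction = min(epoch / ramp_epochs, 1.0)
    return start + fraction * (end - start)


def learning_rate_at(epoch, base_lr=0.05, decay=0.0005):
    """Assumed decay rule: shrink the rate by (1 - decay) every epoch."""
    return base_lr * (1.0 - decay) ** epoch


def init_layer(rng, n_in, n_out, variance):
    """Zero-mean Gaussian weights; the standard deviation is sqrt(variance)."""
    return rng.normal(loc=0.0, scale=np.sqrt(variance), size=(n_in, n_out))


def iterate_minibatches(n_samples, batch_size=50, rng=None):
    """Yield shuffled index arrays of at most batch_size elements."""
    rng = rng if rng is not None else np.random.default_rng()
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]


def init_network(rng, layer_sizes):
    """Initialize every weight matrix with the layer-dependent variance."""
    weights = []
    last = len(layer_sizes) - 2  # index of the output layer's weight matrix
    for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        if i == 0:
            variance = 0.1    # first layer
        elif i == last:
            variance = 0.01   # output layer
        else:
            variance = 0.05   # remaining hidden layers
        weights.append(init_layer(rng, n_in, n_out, variance))
    return weights
```

For instance, `init_network(np.random.default_rng(0), [30, 300, 300, 1])` would create three weight matrices for a hypothetical network with two hidden layers, and `momentum_at(50)` gives the momentum halfway through the ramp.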

IV. HOW DATA SCIENCE CAN HELP PHYSICS IN THE FUTURE

A. In Astronomy

Data-Driven Astronomy (DDA) is the production of astronomical knowledge based on archived data sets. DDA closely resembles data science in industry in that the data do not all come from designed experiments but are instead the product of archived observations.

B. In Material Physics

In machine learning, supervised learning can be used to identify materials. The workflow is shown above.

Conclusion

A physicist working in data science will spend much of his or her time analysing data and designing and developing models that predict how something will behave based on data about its behaviour. Over the past few years, the volume of data in physics has grown, especially in experimental physics. Data that were recorded by hand a few years ago are now collected by a laptop or a computer program, which yields many more parameters and data points. One second of measurement can produce anywhere from a few hundred to tens of millions of data points. As the data grow, so does the role of data science in turning them into knowledge, in physics and also in astronomy. Data science (and machine learning) is often used to identify interesting processes or objects in astronomy such as new galaxies, potential black holes, supernovas and more. In addition, the data sets are usually large and the signatures are uncertain.

References

[1] Baldi, P., Sadowski, P., and Whiteson, D. (2014). "Searching for Exotic Particles in High-energy Physics with Deep Learning." Nature Communications, 5 (July 2, 2014).

[2] Adam-Bourdarios, C., Cowan, G., Germain-Renaud, C., Guyon, I., Kégl, B., and Rousseau, D. (2015). "The Higgs Machine Learning Challenge." Journal of Physics: Conference Series, 664: 072015. doi:10.1088/1742-6596/664/7/072015.

[3] Chen, T., and He, T. (2015). "Higgs Boson Discovery with Boosted Trees." JMLR: Workshop and Conference Proceedings, 42: 69–80.

[4] www.scikit-learn.org

[5] http://higgsml.lal.in2p3.fr/documentation/

Copyright

Copyright © 2022 Abhishek Kumar Sarkar, Anuj Gupta, Yuvraj Joshi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET43717

Publish Date : 2022-06-01

ISSN : 2321-9653

Publisher Name : IJRASET
