Analysts need to be able to extract insights from data as it grows and becomes more complex. In the business world, it is common for organizations, even small ones, to be overwhelmed by data. They hold many spreadsheets, databases, and other documents that must be examined to support decision-making [5]. Unfortunately, this procedure takes a great deal of time and manual labor.
It is also costly, especially when the user carries it out. The systematic application of statistical [3] and logical techniques to describe, illustrate, summarize, and evaluate data is known as data analysis [22-23]. Data analysis is one of the fastest-growing techniques for identifying trends in data [13], and its popularity rests on the speed and accuracy that form its foundation.
This paper details the procedure by which the UNkNOT tool could take the place of the current method. The tool brings these fields together by integrating the various tools used by individual users. UNkNOT is designed to integrate flexibly into existing security systems and frameworks, with the aim of guaranteeing data integrity and confidentiality.
I. INTRODUCTION
Unknot Data Analysis Tool [23] is a highly customizable program that helps users unknot their databases. The tool provides users with a straightforward approach to exploring and researching their data, which can help them source new information or discover patterns within their databases.
The tool will allow users to define their own parameters for exploration and analysis. Several performance-monitoring tools are on the market, but many of them do not make it simple to monitor databases. By making it possible to handle and manipulate large amounts of data efficiently, automate tasks, and train and deploy AI models, this tool can support AI-based data analysis. As a result, data scientists and AI engineers can focus on building better models and analyzing the results rather than getting bogged down in data management and manipulation.
The tool will provide an easy way to access, process, and share centralized information at various stages of the user's workflow, and will help with managing security policies. It also allows users to personalize the tool by altering its functionality and user interface. UNkNOT is a data validation and extraction tool designed to provide users with robust and thorough results. It provides a holistic solution in four phases: the first phase focuses on working with existing data on a local device; the second phase focuses on taking data from the user (local or global) while the device remains local; the third phase focuses on securing data integrity and confidentiality; and finally, after the third phase, the tool can be hosted on a global platform.
II. FRAMEWORK USED
Several approaches and techniques are commonly used in data analysis, including statistical analysis, machine learning, and data visualization [15-16]. Statistical analysis is particularly useful for understanding patterns and trends in data and can be applied to various fields, including social science, medicine, and engineering. Machine learning is a rapidly growing field in which algorithms learn from data to make predictions or decisions. It can be applied to a wide range of problems, including image recognition, natural language processing, and predictive modeling. Python [4-10-11-12] is a powerful and versatile programming language widely used for data analysis. It offers a wide range of libraries and frameworks that make it easy to work with data, perform complex calculations, and visualize results. One of the most popular Python libraries for data analysis is NumPy [8-9-21]. It is an open-source library that supports large, multi-dimensional arrays and matrices of numerical data, together with a collection of mathematical functions that operate on them. The table below lists the libraries used, along with their merits and demerits.
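As a brief illustration of the multi-dimensional arrays and mathematical functions that NumPy provides, the following minimal sketch aggregates a small two-dimensional array along each axis. The sales figures and variable names are hypothetical and are not taken from UNkNOT itself.

```python
import numpy as np

# Hypothetical data: 3 products (rows) x 4 quarters (columns) of sales figures.
sales = np.array([
    [120.0, 135.0, 150.0, 165.0],
    [80.0, 75.0, 90.0, 95.0],
    [200.0, 210.0, 190.0, 220.0],
])

# axis=1 collapses the quarters, giving one mean value per product.
mean_per_product = sales.mean(axis=1)

# axis=0 collapses the products, giving one total per quarter.
total_per_quarter = sales.sum(axis=0)

# Element-wise arithmetic on whole arrays: quarter-over-quarter growth rate.
growth = np.diff(sales, axis=1) / sales[:, :-1]

print(mean_per_product)
print(total_per_quarter)
print(growth.round(3))
```

The same operations written with plain Python lists would require explicit loops; expressing them as whole-array operations is what makes NumPy convenient for the numerical side of data analysis.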
III. PROPOSED METHODOLOGY
UNkNOT has a unique methodology for data analysis [18-19]. Users can use UNkNOT's analytics [5] tools to review their own data, or they can use UNkNOT's charts and graphs to gain an understanding of the overall trends in their dataset. In its initial phases, UNkNOT works on a single dataset. The user is asked to supply a dataset, whose contents can be any kind of data. For analytics, UNkNOT provides several charts and graphs that help users understand their data easily. Users can also work on a specific part of the result and download it, as shown in Fig. 3.2.
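The workflow described above (accept an arbitrary dataset, summarize it, work on a specific part of the result, and download that part) can be sketched with pandas as follows. This is an illustrative sketch under stated assumptions, not UNkNOT's actual implementation: the function names `load_dataset`, `summarize`, `select_part`, and `to_download`, as well as the sample data, are hypothetical.

```python
import io

import pandas as pd


def load_dataset(csv_text: str) -> pd.DataFrame:
    """Read an uploaded CSV (passed here as text) into a DataFrame."""
    return pd.read_csv(io.StringIO(csv_text))


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Descriptive statistics for whichever columns happen to be numeric."""
    return df.select_dtypes(include="number").describe()


def select_part(df: pd.DataFrame, column: str, value) -> pd.DataFrame:
    """Keep only the rows where `column` equals `value`."""
    return df[df[column] == value].reset_index(drop=True)


def to_download(df: pd.DataFrame) -> bytes:
    """Serialize a result to CSV bytes that a user interface could offer as a file."""
    return df.to_csv(index=False).encode("utf-8")


# Hypothetical upload; any tabular data with a header row would work.
uploaded = """region,units,price
north,10,2.5
south,4,3.0
north,6,2.0
"""

data = load_dataset(uploaded)
overview = summarize(data)
north_only = select_part(data, "region", "north")
payload = to_download(north_only)

print(overview)
print(north_only)
```

Because `summarize` selects numeric columns at run time rather than naming them, the sketch makes no assumption about the contents of the dataset. A charting step, for example with plotly, could consume either `data` or `north_only` in the same way.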
Conclusion
UNkNOT is a data analysis [1-2-7] tool that helps users understand their data easily and intuitively. We have completed the current phase with the help of libraries such as pandas [8-21], plotly [20], and openpyxl. The dashboard [6] is a great way to display the data and provide an overall view. Updates planned for the upcoming version of UNkNOT include analysis and visualization of a specific number of rows or a particular set of data, a more interactive environment, predictions based on data, analysis of additional attributes, and multiple-dataset analysis [8]. The current version of UNkNOT is in its initial state: it takes data from the end user and visualizes it. With the aim of making UNkNOT available to users with no prior knowledge of data analysis [16-17], it is planned that the tool will be made available around the world. UNkNOT's functionality is currently at the end of phase 1 and the beginning of phase 2, but many features from phase 3 have also been included. The security focus will remain in the new version, which will also offer more functionality. Phase 1, in its simplicity, only presents visual representations of the data it is given. Phase 2 advances the project by working with user-provided data. For now, this data is stored locally; in the future, it will be possible for users in other locations to supply data, enabling global analysis. The Visualizer [24] will undergo several upgrades during its development. At that point, a new user with no prior knowledge of data analysis will be able to work with UNkNOT.
References
[1] K. Johnson, B. Lee, and J. Smith. (2020). Data analysis methods for large datasets. Journal of Big Data, 7(2), 23-38.
[2] S. Chen, X. Zhang, and Y. Liu. (2021). Machine learning approaches for predictive analytics. Data Mining and Knowledge Discovery, 35(1), 73-8
[3] W. McKinney, "pandas: a foundational Python library for data analysis and statistics", Python for High Performance and Scientific Computing, vol. 14, no. 9, 2011
[4] X. Cai, H. Langtangen and H. Moe, "On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations", Scientific Programming, vol. 13, no. 1, pp. 31-56, 200
[5] J. Van Der Donckt, J. Van der Donckt, E. Deprost and S. Van Hoecke, "Plotly-Resampler: Effective Visual Analytics for Large Time Series," 2022 IEEE Visualization and Visual Analytics (VIS), Oklahoma City, OK, USA, 2022, pp. 21-25, doi: 10.1109/VIS54862.2022.00013
[6] G. Iyer, S. DuttaDuwarah and A. Sharma, "DataScope: Interactive visual exploratory dashboards for large multidimensional data," 2017 IEEE Workshop on Visual Analytics in Healthcare (VAHC), Phoenix, AZ, USA, 2017, pp. 17-23, doi: 10.1109/VAHC.2017.8387496
[7] Kabita Sahoo, Abhaya Kumar Samal, Jitendra Pramanik, and Subhendu Kumar Pani. Exploratory data analysis using python. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2019
[8] Wes McKinney. Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012
[9] Fabio Nelli. Python data analytics: Data analysis and science using PANDAs, Matplotlib and the Python Programming Language. Apress, 2015.
[10] Dr Ossama Embarak, Embarak, and Karkal. Data analysis and visualization using python. Springer, 2018.
[11] Pramanik, Jitendra & Samal, Abhaya Kumar & Sahoo, Kabita & Pani, Dr. Subhendu. (2019). Exploratory Data Analysis using Python. International Journal of Innovative Technology and Exploring Engineering. 8. 4727-4735
[12] Kiranbala Nongthombam, Deepika Sharma, 2021, Data Analysis using Python, INTERNATIONAL JOURNAL OF ENGINEERING RESEARCH & TECHNOLOGY (IJERT) Volume 10, Issue 07 (July 2021
[13] Wes McKinney and the Pandas Development Team, pandas: powerful Python data analysi
[14] Stancin, Igor and Alan Jović. “An overview and comparison of free Python libraries for data mining and big data analysis.” 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (2019): 977-982
[15] Harshal S. Kudale, Mihir V. Phadnis, Pooja J. Chittar, Kalpesh P. Zarkar, DATA ANALYSIS AND VISUALIZATION OF OLYMPICS USING PYSPARK AND DASH-PLOTLY, 202
[16] Pritchard, L., White, J. A., Birch, P. R. J., Toth, I. K. GenomeDiagram: a python package for the visualization of large-scale genomic data. Bioinformatics, Volume 22, Issue 5, 1 March 2006, Pages 616–617. DOI: 10.1093/bioinformatics/btk021
[17] Shammamah Hossain, Visualization of Bioinformatics Data with Dash Bio, 201
[18] Nagpal, Abhinav & Gabrani, Goldie. (2019). Python for Data Analytics, Scientific and Technical Applications. 140-145. 10.1109/AICAI.2019.8701341
[19] Wes McKinney, Python for Data Analysis (BookZZ.org), 201
[20] Carson Sievert, Interactive web-based data visualization with R, plotly, and shiny (CRC press), 202
[21] Nelli, Fabio. (2018). Python Data Analytics: With Pandas, NumPy, and Matplotlib. 10.1007/978-1-4842-3913-1
[22] "Data Wrangling with Python" by Jacqueline Kazil and Katharine Jarmul (2017) - O'Reilly Media, ISBN: 978-1491948811
[23] "Data Analysis with Pandas and Python" by Fabio Nelli (2017) - Packt Publishing, ISBN: 978-1787125933
[24] "Hands-On Data Analysis with Pandas" by Kevin Markham (2019) - Packt Publishing, ISBN: 978-1801092913
[25] "Python for Data Analysis and Visualization: A Hands-On Guide to Pandas, Matplotlib, Seaborn and Plotly" by Hadelin de Ponteves (2021) - Udemy, ISBN: 978-1801249073