Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Anusha A. K. R. S., Jenefa Joy A. B.
DOI Link: https://doi.org/10.22214/ijraset.2024.63520
In the era of digital transformation, the exponential growth of data from many different sources has changed decision-making and strategic planning across global industries. Big Data, distinguished by its immense volume, rapid velocity, diverse variety, and critical veracity, introduces new opportunities alongside formidable challenges. This paper explores the foundational concepts, methodologies, and applications of Big Data and analytics. It covers the types and characteristics of Big Data, examines sources and collection methods, outlines analytical steps and tools, and addresses the inherent challenges. Through this analysis, the paper aims to provide insights into leveraging Big Data for informed decision-making, innovation, and competitive advantage in various sectors.
I. INTRODUCTION
Big Data typically refers to datasets that are so large and complex that traditional data processing software often cannot manage them effectively. These datasets can include both structured data (like numbers, dates, and strings that fit into traditional databases) and unstructured data (like text, images, and videos).
Big Data Analytics involves applying sophisticated analytical techniques to vast and varied datasets that encompass structured, semi-structured, and unstructured data.
II. BIG DATA AND ITS TYPES
A. Big Data
The term Big Data refers to huge volumes of data, both structured and unstructured, that cannot be stored and processed by any traditional data storage or processing units. Big data can be analyzed to gain insights that enhance decision-making and provide confidence for strategic business initiatives.
The term 'Big Data' has been in use since the early 1990s. Although it is not known exactly who first used it, John R. Mashey, who worked at Silicon Graphics at the time, is widely credited with popularizing the term. Roger Magoulas is often credited with bringing it into wider use in 2005.
B. Types of Big Data
Big data is classified into three types:
1. Structured Data
2. Semi-Structured Data
3. Unstructured Data
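As a rough illustration of the three types, the sketch below holds one small record in each form. The field names and sample values are hypothetical and are not taken from any dataset discussed in this paper; the point is only how access differs once a schema is fixed, partial, or absent.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table or CSV file (hypothetical values).
structured_csv = "order_id,amount,order_date\n1001,49.90,2024-06-30\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, nesting, and optional fields (JSON).
semi_structured = json.loads(
    '{"order_id": 1001, "customer": {"name": "A. Buyer"}, "tags": ["gift"]}'
)

# Unstructured: free text with no predefined schema.
unstructured = "Great product, arrived two days early and works as described."

print(rows[0]["amount"])                    # access by column name
print(semi_structured["customer"]["name"])  # access by nested key
print(len(unstructured.split()))            # only generic operations, such as a word count
```

Structured and semi-structured records can be queried by field, whereas the free-text review needs further processing (for example text analytics) before it yields comparable fields.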
III. CHARACTERISTICS OF BIG DATA
Big Data is commonly characterized by seven V's, which include volume, velocity, variety, and veracity. They are described below:
IV. SOURCES OF BIG DATA
The bulk of big data generated comes from the following primary sources:
Predict Consumer Preferences, Changing Trends
2. Machine-generated Data
3. Human and Computer-generated Data
a. Media as Big Data Source: Media is the most popular source of big data, as it provides valuable insights into consumer preferences and changing trends. It covers social media and online platforms such as Google, Facebook, Twitter, YouTube, and Instagram, as well as generic media such as images, audio, videos, and podcasts.
b. The Web as Big Data Source: The World Wide Web is a widely accessible source of big data and supports diverse uses.
c. Cloud as Big Data Source: Cloud storage accommodates both structured and unstructured data and provides businesses with real-time information and on-demand insights.
d. IoT as Big Data Source: Machine-generated content, or data created by IoT devices, constitutes a valuable source of big data. The data is usually generated by sensors connected to electronic devices. IoT is gaining momentum and includes big data generated by computers, smartphones, and any other device that can emit data. With IoT, data can now be obtained from medical devices, vehicular processes, video games, meters, cameras, household appliances, and the like.
e. Database as Data Source: Databases support the extraction of insights that are used to drive business profits. Popular databases include MS Access, DB2, Oracle, SQL, and Amazon Simple.
f. Transactional Data as Data Source: Transactional data are gathered when a user makes an online purchase and record details such as the product, time of purchase, and payment method.
g. Time Series Data as Data Source: Time series data help in observing trends over time. Example: stock exchange data.
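To make the point about time series and trends concrete, the following is a minimal sketch of a simple moving average, one common way to smooth a series so that a trend is easier to see. The function name `moving_average` and the closing prices are hypothetical, invented for this example; they are not real stock exchange data.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int) -> List[float]:
    """Return the simple moving average over a sliding window."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf: deque = deque(maxlen=window)  # keeps only the last `window` values
    out: List[float] = []
    for v in values:
        buf.append(v)
        if len(buf) == window:
            out.append(sum(buf) / window)
    return out


# Hypothetical daily closing prices, invented for this example.
closing_prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0]
smoothed = moving_average(closing_prices, window=3)
print(smoothed)
```

With a window of three, the dip on the third day is averaged out and the smoothed series rises steadily, which is the kind of trend observation the text refers to.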
V. CHALLENGES WITH BIG DATA
Big data poses several challenges despite its substantial benefits. These include new privacy and security concerns, accessibility for business users, and choosing the right solutions for an organization's needs. To make the most of the incoming data, organizations will have to address the following:
VI. BIG DATA ANALYTICS AND ITS TYPES
Big Data Analytics is the use of advanced analytic techniques against very large, diverse data sets that include unstructured, semi-structured, and structured data, from different sources, and in different sizes from terabytes to zettabytes.
Big data analytics comprises techniques to examine big data to uncover information -- such as correlations, hidden patterns, market trends, and customer preferences -- that can help organizations make informed business decisions.
Big data analytics provides benefits such as effective marketing, new revenue opportunities, customer personalization, and improved operational efficiency. With an effective strategy, these benefits can provide a competitive advantage over rivals.
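As a small illustration of uncovering a correlation, the sketch below computes the Pearson correlation coefficient in plain Python. The variables `ad_spend` and `sales` and their values are hypothetical, chosen only to show the calculation; real analyses at big data scale would typically run on a distributed engine rather than on in-memory lists.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical weekly figures, invented for this example.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 41.0, 62.0, 79.0, 103.0]
r = pearson(ad_spend, sales)
print(round(r, 3))
```

A coefficient close to 1 indicates that the two quantities move together; it does not by itself show that one causes the other.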
A. Types of Big Data Analytics
Examples
VII. STEPS IN BIG DATA ANALYTICS
VIII. BIG DATA ANALYTICS TOOLS
Several tools are available for big data analytics. Some of the popular tools include the following:
2. Apache Hive
3. MongoDB
4. MapReduce
5. Apache Cassandra
6. Apache Spark
7. Apache Storm
8. Apache Kafka
9. Talend
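To make one entry in this list concrete, the MapReduce programming model can be sketched in plain Python as a word count. This is only an in-memory illustration of the map, shuffle, and reduce phases, not the Hadoop API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


# Hypothetical input documents, invented for this example.
documents = ["big data needs big tools", "data tools process data"]
mapped = (pair for doc in documents for pair in map_phase(doc))
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data over the network; the single-process version above only shows the data flow.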
IX. BENEFITS OF BIG DATA ANALYTICS
The importance of big data lies not in the quantity of data a company possesses, but in how effectively it utilizes this data. Each company leverages data uniquely; the more efficiently a company uses its data, the greater its potential for growth. By analyzing data from any source, a company can find answers that enable it to:
X. APPLICATIONS OF BIG DATA
Big Data is among the most widespread technologies and is used in almost all business sectors.
A. Key Functions of Data Science
Data science involves the following key functions:
B. Functions of Data Scientist
Data Scientists perform the following functions:
C. Skillset of Data Scientist
Data Scientists require the following skills:
Big Data and Big Data Analytics represent a transformative force across industries, driven by the ability to handle vast volumes of data and diverse data types and to apply advanced analytical techniques. This paper has explored several key aspects of Big Data and Big Data Analytics:
1) Types and Characteristics: Big Data encompasses structured, semi-structured, and unstructured data, originating from sources such as sensors, social media platforms, and transaction records. Its defining characteristics, including Volume, Velocity, and Variety, underscore its complexity and the challenges it poses for traditional data processing methods.
2) Sources: The sources of Big Data are diverse and expansive, ranging from machine-generated data streams to human-generated content on social media platforms. These sources contribute to the exponential growth of data, creating opportunities for organizations to extract valuable insights.
3) Tools and Techniques: Advanced tools and techniques such as machine learning algorithms, data mining, natural language processing, and predictive analytics are essential in making sense of Big Data. These tools enable organizations to uncover patterns, trends, and correlations that facilitate informed decision-making and strategic planning.
4) Applications: The applications of Big Data and Big Data Analytics are broad and impactful. Industries leverage these technologies for purposes including customer segmentation, personalized marketing, operational efficiency improvements, healthcare diagnostics, and fraud detection. The ability to derive actionable insights from Big Data empowers organizations to innovate, optimize processes, and gain a competitive edge in their respective markets.
In summary, while Big Data presents challenges related to data management, privacy, and scalability, it has considerable potential to change business operations and drive innovation. As technology continues to evolve, so too will the capabilities and applications of Big Data Analytics, shaping the future of data-driven decision-making across industries worldwide.
Copyright © 2024 Anusha A. K. R. S., Jenefa Joy A. B. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET63520
Publish Date : 2024-06-30
ISSN : 2321-9653
Publisher Name : IJRASET