Big data refers to data sets that are too large or complex to be handled by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data analysis challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data source. Big data was originally associated with three key concepts: volume, variety, and velocity. The analysis of big data presents challenges in sampling, whereas previously only observations and sampling were possible. Therefore, big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and value. Current usage of the term big data tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from big data, and seldom to a particular size of data set. “There is no doubt that the quantities of data now available are indeed large. But that’s not the most relevant characteristic of this new data ecosystem.” Analysis of data sets can find new correlations to “spot business trends, prevent diseases, combat crime and so on”. Scientists, business executives, medical practitioners, advertisers, and governments alike regularly meet difficulties with large data sets in areas including Internet searches, fintech, healthcare analytics, geographic information systems, urban informatics, and business informatics.
Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology, and environmental research. The size and number of available data sets have grown rapidly as data is collected by devices such as mobile devices, cheap and numerous information-sensing Internet of Things devices, aerial (remote sensing) equipment, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, 2.5 exabytes (2.5×2^60 bytes) of data are generated every day. Based on an IDC report prediction, the global data volume was predicted to grow exponentially from 4.4 zettabytes to 44 zettabytes between 2013 and 2020. By 2025, IDC predicts there will be 163 zettabytes of data.
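The growth figures quoted above can be turned into yearly rates with a little arithmetic. The following is a minimal sketch, not part of the original sources; the function and variable names are illustrative assumptions, and only the numbers (40 months, 4.4 and 44 zettabytes, 2013 to 2020) come from the text.

```python
# Figures quoted in the text above.
DOUBLING_PERIOD_MONTHS = 40      # storage capacity doubles every 40 months
ZB_2013, ZB_2020 = 4.4, 44.0     # IDC data-volume figures, in zettabytes


def annual_factor_from_doubling(months: float) -> float:
    """Yearly growth factor for a quantity that doubles every `months` months."""
    return 2 ** (12 / months)


def compound_annual_growth_rate(start: float, end: float, years: float) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


storage_factor = annual_factor_from_doubling(DOUBLING_PERIOD_MONTHS)
volume_cagr = compound_annual_growth_rate(ZB_2013, ZB_2020, 2020 - 2013)

print(f"Storage capacity grows about {storage_factor - 1:.1%} per year")
print(f"Data volume grows about {volume_cagr:.1%} per year")
```

Under these assumptions, doubling every 40 months corresponds to roughly 23% growth per year, while a tenfold increase over seven years corresponds to roughly 39% per year.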
I. INTRODUCTION
Big data is the combination of structured, semi-structured, and unstructured data collected by organizations that can be mined for information and used in machine learning projects and other advanced analytics applications. Big data burst upon the scene in the first decade of the 21st century. In organizations, the systems that process and store big data have become a common component of data management architectures. Big data analytics is supported by a combination of tools.
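The difference between structured, semi-structured, and unstructured data can be illustrated with a small example. This is only a sketch: the sample records, field names, and the regular expression are hypothetical and chosen for illustration.

```python
import csv
import io
import json
import re

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured = "customer_id,amount\n101,25.50\n102,7.99\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, but fields may vary between records.
semi_structured = '{"customer_id": 101, "tags": ["retail", "online"]}'
record = json.loads(semi_structured)

# Unstructured: free text with no schema; any structure has to be extracted.
unstructured = "Customer 101 called to say the order arrived late."
customer_ids = re.findall(r"Customer (\d+)", unstructured)

print(rows[0]["amount"])
print(record["tags"])
print(customer_ids)
```

The structured rows can be queried by column name directly, the JSON record carries its own field names, and the free text needs a pattern (or a more advanced technique) before anything can be looked up in it.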
II. WHAT IS BIG DATA
A. It is similar to ‘small data’ but bigger in size.
B. Because the data is bigger, it requires different approaches: techniques, tools, and architecture.
C. Big data aims to solve new problems, or old problems in a better way.
D. It generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
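One example of a "different approach" is to process data as a stream instead of loading it all into memory. The sketch below is an assumption-laden illustration, not a method described in the text: `read_values` is a hypothetical stand-in for a large data source, and `streaming_mean` keeps only a count and a running total.

```python
from typing import Iterable, Iterator


def read_values(n: int) -> Iterator[float]:
    """Hypothetical data source: yields one value at a time instead of a full list."""
    for i in range(n):
        yield float(i % 100)


def streaming_mean(values: Iterable[float]) -> float:
    """Compute the mean in one pass, keeping only a count and a running total."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    if count == 0:
        raise ValueError("no values to average")
    return total / count


mean = streaming_mean(read_values(1_000_000))
print(mean)
```

Because the generator produces values on demand, memory use stays constant however many values are read. Real big data systems distribute this kind of single-pass aggregation across many machines.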
III. HISTORY
The earliest examples we have of humans storing and analyzing data are tally sticks, which date back to 18,000 BCE. The Ishango Bone, discovered in 1950 in what is now the Democratic Republic of the Congo, is believed to be one of the earliest pieces of evidence of prehistoric data storage. Paleolithic tribespeople would mark notches into sticks or bones to keep track of trading activity or supplies. They would compare sticks and notches to perform rudimentary calculations, enabling them to make predictions such as how long their food supplies would last.
Then, in 2400 BCE, came the abacus, the first dedicated device constructed specifically for performing calculations. The first libraries also appeared around this time, representing our first attempts at mass data storage. The ancient Egyptians, around 300 BC, already tried to capture all existing ‘data’ in the Library of Alexandria. Moreover, the Roman Empire used to carefully analyze statistics of its military to determine the optimal distribution for its armies.
In more recent times, however, data has revolutionized the modern business environment. The first major data project was created in 1937 and was ordered by the Franklin D. Roosevelt administration after the Social Security Act became law. The government had to keep track of contributions from 2 million employers. IBM got the contract to develop a punch card-reading machine for this massive bookkeeping project. The first data-processing machine appeared in 1943 and was developed by the British to decipher Nazi codes during World War II. This device, named Colossus, searched for patterns in intercepted messages at a rate of 5,000 characters per second, reducing the time the task took from weeks to merely hours.
Then, in 1965, the United States government decided to build the first ever data center to store over 742 million tax returns and 175 million sets of fingerprints. They decided to do this by transferring those records onto magnetic computer tape that had to be stored in a single location. The project was later dropped but is generally accepted as the beginning of the electronic data storage era.
IV. STRUCTURE OF BIG DATA
V. APPLICATION
Tracking Customer Spending Habit, Shopping Behavior: In big retail stores, the management team must keep data on customers’ spending habits, shopping behavior, and most-liked products. Based on which product is searched for or sold most, the production or stocking rate of that product is set. The banking sector uses data on its customers’ spending behavior so that it can offer a particular customer a discount or cashback for buying a product they like with the bank’s credit or debit card. In this way, banks can send the right offer to the right person at the right time.
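As a minimal sketch of this kind of tracking, the following counts which product sells most and how much each customer spends. The purchase log, customer IDs, and amounts are hypothetical; a real retailer would read them from a transaction database.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (customer_id, product, amount_spent).
purchases = [
    ("c1", "bed cover", 30.0),
    ("c2", "lamp", 12.5),
    ("c1", "pillow", 8.0),
    ("c3", "bed cover", 30.0),
    ("c2", "bed cover", 28.0),
]

# How many times each product was sold.
units_sold = Counter(product for _, product, _ in purchases)

# Total amount spent by each customer.
spend_by_customer = defaultdict(float)
for customer, _, amount in purchases:
    spend_by_customer[customer] += amount

top_product, top_count = units_sold.most_common(1)[0]
print(top_product, top_count)
print(dict(spend_by_customer))
```

The most-sold product could then drive stocking decisions, and the per-customer totals could feed the targeted offers described above.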
Recommendation: By tracking customer spending habits and shopping behavior, big retail stores provide recommendations to customers. E-commerce sites like Amazon, Walmart, and Flipkart do product recommendation. They track what products a customer searches for and, based on that data, recommend that type of product to that customer. As an example, suppose a customer searches for a bed cover on Amazon. Amazon then has data indicating that the customer may be interested in buying a bed cover. The next time that customer visits a Google page, an advertisement for the right product can be shown to the right customer.
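A very simple version of this idea can be sketched as follows. It is an illustrative assumption, not how any of the named sites actually work: the `CATALOG` mapping and the `recommend` function are hypothetical, and the rule is simply "suggest unseen products from the category searched most often".

```python
from collections import Counter

# Hypothetical catalog mapping each product to a category.
CATALOG = {
    "floral bed cover": "bedding",
    "plain bed cover": "bedding",
    "pillow case": "bedding",
    "desk lamp": "lighting",
    "floor lamp": "lighting",
}


def recommend(search_history: list[str], limit: int = 2) -> list[str]:
    """Suggest unseen products from the category the customer searched most."""
    categories = Counter(CATALOG[p] for p in search_history if p in CATALOG)
    if not categories:
        return []
    favorite, _ = categories.most_common(1)[0]
    seen = set(search_history)
    candidates = [p for p, c in CATALOG.items() if c == favorite and p not in seen]
    return candidates[:limit]


print(recommend(["floral bed cover", "desk lamp", "plain bed cover"]))
```

Production recommenders typically use far richer signals (purchases, ratings, behavior of similar customers), but the principle of turning tracked searches into suggestions is the same.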
Smart Traffic System: Data about traffic conditions on different roads is collected through cameras placed beside the road and at the entry and exit points of the city, and through GPS devices placed in vehicles. All such data is analyzed, and jam-free or less congested, less time-consuming routes are recommended. In this way, a smart traffic system can be built in a city through big data analysis.
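Recommending the least time-consuming route can be sketched as a shortest-path search over current travel times. The example below uses Dijkstra's algorithm, which is one standard choice and not necessarily what any particular traffic system uses; the `ROADS` network and its travel times are hypothetical.

```python
import heapq

# Hypothetical road network: travel times in minutes under current traffic.
ROADS = {
    "A": {"B": 10, "C": 4},
    "B": {"D": 3},
    "C": {"B": 2, "D": 12},
    "D": {},
}


def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, cost in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (minutes + cost, neighbor, path + [neighbor]))
    return float("inf"), []


print(fastest_route(ROADS, "A", "D"))
```

In a live system the edge weights would be refreshed continuously from camera and GPS data, so the recommended route changes as congestion changes.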
Secure Air Traffic System: At various places on an aircraft, sensors capture data such as flight speed, moisture, temperature, and other conditions. Based on analysis of such data, environmental parameters within the aircraft are set and adjusted. By analyzing the aircraft’s machine-generated data, it can be estimated how long a machine can operate flawlessly and when it should be replaced or repaired.
Auto Driving Car: Big data analysis helps drive a car without human intervention. Cameras and sensors are placed at various spots on the car to gather data such as the size of surrounding cars, obstacles, and the distance from them. This data is analyzed, and various calculations are carried out, such as how many degrees to turn, what the speed should be, and when to stop. These calculations help the car take action automatically.
VI. ADVANTAGES
Better Decision Making: Companies use big data in many ways to improve their B2B operations, advertising, and communication. Many businesses, including travel, real estate, finance, and insurance, mainly use big data to improve their decision-making capabilities. Since big data reveals more information in a usable format, businesses can use that data to make accurate decisions about what consumers do and do not want and about their behavioral tendencies.
Reduce costs of business processes: Surveys conducted by NewVantage and Syncsort reveal that big data analytics has helped businesses reduce their expenses significantly. 66.7% of survey respondents from NewVantage claimed that they have started using big data to reduce expenses. Furthermore, 59.4% of survey respondents from Syncsort claimed that big data tools helped them reduce costs and increase operational efficiency.
Fraud Detection: Fraud detection is especially important for credit card companies, which need to identify misuse of account information, materials, or product access. Any industry, including finance, can better serve its customers by identifying fraud early, before something goes wrong.
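One common way to identify possible fraud early is to flag transactions that differ sharply from the usual pattern. The text does not name a specific technique, so the following is only an assumed illustration: a simple z-score check over a hypothetical list of transaction amounts, with an arbitrarily chosen threshold.

```python
from statistics import mean, stdev


def flag_unusual(amounts, threshold=2.0):
    """Return amounts whose z-score against the whole list exceeds `threshold`."""
    if len(amounts) < 2:
        return []
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return []
    return [a for a in amounts if abs(a - mu) / sigma > threshold]


# Hypothetical transaction amounts for one card.
history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 500.0]
print(flag_unusual(history))
```

A flagged amount is only a signal for further review, not proof of fraud. Production systems generally combine many features and machine learning models in place of a single threshold.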
A. Increased Productivity
According to a survey from Syncsort, 59.9% of survey respondents claimed that they were using big data analytics tools like Spark and Hadoop to increase productivity. This increase in productivity has, in turn, helped them improve customer retention and boost sales. Modern big data analytics also helps data scientists and data analysts gain more information about themselves so that they can identify how to be more productive in their activities and job responsibilities.
B. Improved Customer Service
Improving customer interactions is crucial for any business as part of its marketing efforts. Since big data analytics provides businesses with more information, they can use that data to create more targeted marketing campaigns and special, highly personalized offers for each individual client. The major sources of big data are social media, email transactions, customers’ CRM systems, etc. So, it exposes a wealth of information to businesses about their customers’ pain points, touch points, values, and trends, helping them serve their customers better.
VII. DISADVANTAGES
A. Lack of Talent
According to a survey by AtScale, the lack of big data experts and data scientists has been the biggest challenge in this field for the past three years. Currently, many IT professionals do not know how to carry out big data analytics because it requires a different skill set. Thus, finding data scientists who are also experts in big data is challenging. Big data expert and data scientist are two highly paid careers in the data science field. Therefore, hiring big data analysts can be very expensive for companies, especially for startups. Some companies have to wait a long time to hire the staff they need to continue their big data analytics tasks.
B. Security Risks
Most of the time, companies collect sensitive information for big data analytics. That data needs protection, and security risks can arise from a lack of proper maintenance. Besides, having access to huge data sets can attract unwanted attention from hackers, and your business may become the target of a cyber attack. As you know, data breaches have become the biggest threat to many companies today.
C. Compliance
The need to comply with government legislation is also a drawback of big data. If big data contains personal or confidential information, the company must make sure that it follows government requirements and industry standards to store, handle, maintain, and process that data.
VIII. FUTURE SCOPE
Since big data first entered the scene, its definition, its use cases, and the technology and strategy for harnessing its value have evolved significantly across different industries. Innovations in cloud computing, quantum computing, the Internet of Things (IoT), AI, and so on will allow big data to evolve further as we find new ways of harnessing its potential.
IX. CONCLUSION
The availability of big data, low-cost commodity hardware, and new information management and analytic software has produced a unique moment in the history of data analysis. The convergence of these trends means that we have the capabilities required to analyze astonishing data sets quickly and cost-effectively for the first time in history. These capabilities are neither theoretical nor trivial. They represent a genuine leap forward and a clear opportunity to realize enormous gains in terms of efficiency, productivity, revenue, and profitability.
The age of big data is here, and these are truly revolutionary times if both business and technology professionals continue to work together and deliver on the promise. As more and more data is generated and collected, data analysis requires scalable, flexible, and high-performing tools to provide insights in a timely fashion. However, organizations are facing a growing big data ecosystem where new tools emerge and become outdated very quickly. Therefore, it can be very difficult to keep pace and choose the right tools.