BIG DATA
Lots of data is being collected and warehoused:
- Web data, e-commerce
- Purchases at department/grocery stores
- Bank/credit card transactions
- Social networks
Big data is a term that describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis. But it isn’t the amount of data that’s important. It’s what organizations do with the data that counts. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
A Single View to the Customer
How much data are we talking about?
- Google processes 20 PB per day (2008)
- Wayback Machine has 3 PB + 100 TB/month (3/2009)
- Facebook has 2.5 PB of end user data + 15 TB/day (4/2009)
- eBay has 6.5 PB of end user data + 50 TB/day (5/2009)
- CERN’s Large Hadron Collider (LHC) produces 15 PB per year
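To put these figures on a common scale, the short sketch below estimates how long each archive would take to double at the quoted growth rate. It assumes decimal units (1 PB = 1000 TB), a 30-day month for the Wayback Machine figure, and a constant growth rate; none of these assumptions come from the sources quoted above, and the function name is purely illustrative.

```python
# Assumption: decimal units, 1 PB = 1000 TB.
TB_PER_PB = 1000

# (stored data in PB, growth in TB per day), taken from the list above.
# The Wayback Machine rate assumes a 30-day month.
sources = {
    "Wayback Machine": (3.0, 100 / 30),
    "Facebook": (2.5, 15.0),
    "eBay": (6.5, 50.0),
}


def days_to_double(stored_pb: float, growth_tb_per_day: float) -> float:
    """Days until the archive doubles, assuming the growth rate stays constant."""
    return stored_pb * TB_PER_PB / growth_tb_per_day


for name, (stored_pb, rate) in sources.items():
    print(f"{name}: about {days_to_double(stored_pb, rate):.0f} days to double")
```

Under these assumptions the Facebook archive doubles in roughly 167 days and the eBay archive in 130 days, which shows that the daily inflow matters as much as the stored total.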
Why Is Big Data Important?
The importance of big data doesn’t revolve around how much data you have, but what you do with it. You can take data from any source and analyze it to find answers that enable cost reductions, time reductions, new product development and optimized offerings, and smart decision making. When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:
- Determining root causes of failures, issues, and defects in near-real time.
- Generating coupons at the point of sale based on the customer’s buying habits.
- Recalculating entire risk portfolios within minutes.
- Detecting fraudulent behavior before it affects your organization.
How Does It Work?
Before discovering how big data can work for your business, you should first understand where it comes from. The sources for big data generally fall into one of three categories:
Streaming data
This category includes data that reaches your IT systems from a web of connected devices. You can analyze this data as it arrives and decide what data to keep, what not to keep, and what requires further analysis.
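The keep / discard / analyze-further decision can be sketched as a small triage step applied to each reading as it arrives. This is only an illustration: the `Reading` type, the `triage` function, and the thresholds (`low`, `high`, `tolerance`) are hypothetical names and values chosen for the example, not part of any particular streaming product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Reading:
    """A hypothetical sensor reading from one connected device."""
    device_id: str
    value: float


def triage(
    readings: Iterable[Reading],
    low: float = 0.0,
    high: float = 100.0,
    tolerance: float = 0.5,
) -> Iterator[Tuple[str, Reading]]:
    """Label each reading as it arrives: 'keep', 'discard', or 'review'."""
    last_kept: Dict[str, float] = {}
    for r in readings:
        if not (low <= r.value <= high):
            # Outside the expected range: set aside for further analysis.
            yield "review", r
        elif r.device_id in last_kept and abs(r.value - last_kept[r.device_id]) <= tolerance:
            # Nearly identical to the last kept value from this device: redundant.
            yield "discard", r
        else:
            last_kept[r.device_id] = r.value
            yield "keep", r


stream = [
    Reading("a", 20.0),
    Reading("a", 20.2),
    Reading("a", 150.0),
    Reading("b", 20.0),
]
for label, reading in triage(stream):
    print(label, reading)
```

Because `triage` is a generator, it handles one reading at a time and never needs the whole stream in memory, which matches the requirement to decide while the data is still arriving.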
Social media data
The data on social interactions is an increasingly attractive set of information, particularly for marketing, sales, and support functions. It’s often in unstructured or semistructured forms, so it poses a unique challenge when it comes to consumption and analysis.
Big data has been defined by the three Vs:
They are as follows:
- Volume (Scale):
The amount of data. While volume indicates more data, it is the granular nature of the data that is unique. Big data requires processing high volumes of low-density, unstructured Hadoop data, that is, data of unknown value, such as Twitter data feeds, click streams on a web page and a mobile app, network traffic, sensor-enabled equipment capturing data at high speed, and much more. It’s the task of big data to convert such Hadoop data into valuable information. For some organizations, this may be tens of terabytes; for others, it could be hundreds of petabytes.
- Data Volume
- 44x increase from 2009 to 2020
- From 0.8 zettabytes to 35 zettabytes
- Data volume is increasing exponentially
- Velocity:
Data streams in at an unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors, and smart metering are driving the need to deal with torrents of data in near-real time.
- Variety:
Data comes in all types of formats, from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, stock ticker data, and financial transactions.
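The volume figures quoted above (0.8 zettabytes in 2009 growing to 35 zettabytes in 2020) can be turned into an annual growth rate. The sketch below assumes smooth compound growth over the 11 years, which is a simplification made for the example rather than a claim from the source.

```python
import math

# Figures from the Data Volume list above.
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009

# Overall growth factor (the "44x" in the text is this value, rounded).
growth_factor = end_zb / start_zb

# Assumption: growth compounds at a constant annual rate.
annual_rate = growth_factor ** (1 / years) - 1
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"growth factor: {growth_factor:.2f}x")
print(f"implied annual growth: {annual_rate:.1%}")
print(f"implied doubling time: {doubling_years:.2f} years")
```

Under this assumption the data volume grows by about 41% per year, doubling roughly every two years, which is what "increasing exponentially" means in concrete terms.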
Some Make It Four Vs:
Big Data Analytics:
- Big data is more real-time in nature than traditional data warehouse (DW) applications
- Traditional DW architectures aren’t well-suited for big data applications.
- Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data applications.
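The partition-then-merge pattern behind shared-nothing designs can be sketched in a few lines. In this illustration each worker receives only its own partition and never reads another worker's data; threads stand in for separate nodes, so this is a single-machine approximation rather than a real cluster. The function names and the sample records are hypothetical.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Record = Tuple[str, int]  # (customer key, purchase amount)


def partition(records: List[Record], n: int) -> List[List[Record]]:
    """Route each record to a partition by a stable hash of its key."""
    parts: List[List[Record]] = [[] for _ in range(n)]
    for key, amount in records:
        parts[zlib.crc32(key.encode("utf-8")) % n].append((key, amount))
    return parts


def local_totals(part: List[Record]) -> Counter:
    """Work done by one 'node': it sees only its own partition."""
    totals: Counter = Counter()
    for key, amount in part:
        totals[key] += amount
    return totals


def parallel_totals(records: List[Record], n: int = 4) -> Dict[str, int]:
    parts = partition(records, n)
    # Threads stand in for independent nodes in this sketch.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(local_totals, parts))
    merged: Dict[str, int] = {}
    for result in results:
        # Keys are disjoint across partitions, so a plain union suffices.
        merged.update(result)
    return merged


records = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3)]
print(parallel_totals(records))
```

Because all records for a given key land in the same partition, no worker needs to coordinate with another, and adding capacity means adding partitions and workers, which is the scale-out property the list above refers to.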