Big Data
Big data is an evolving term that describes any large volume of structured, semi-structured, and unstructured data that has the potential to be mined for information.
It is characterized by the 3Vs: the extreme volume of data, the wide variety of data types, and the velocity at which the data must be processed. Although big data doesn't equate to any specific volume of data, the term is often used to describe terabytes, petabytes, and even exabytes of data captured over time.
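To give a feel for the volumes mentioned above, the following is a minimal sketch that converts raw byte counts into decimal (SI) units. The function name human_size and the ingestion rate of 40 TB per day are illustrative assumptions, not figures from any particular system.

```python
# Decimal (SI) byte units, largest first. Binary units (TiB, PiB, EiB)
# would use powers of 1024 instead of powers of 1000.
UNITS = [
    ("EB", 10**18),  # exabyte
    ("PB", 10**15),  # petabyte
    ("TB", 10**12),  # terabyte
    ("GB", 10**9),   # gigabyte
    ("MB", 10**6),   # megabyte
    ("kB", 10**3),   # kilobyte
]


def human_size(num_bytes: int) -> str:
    """Format a byte count using the largest unit that fits (hypothetical helper)."""
    for name, factor in UNITS:
        if num_bytes >= factor:
            return f"{num_bytes / factor:.1f} {name}"
    return f"{num_bytes} B"


# Assumed ingestion rate, for illustration only: 40 TB captured per day.
BYTES_PER_DAY = 40 * 10**12
bytes_per_year = BYTES_PER_DAY * 365

print(human_size(BYTES_PER_DAY))   # one day of capture
print(human_size(bytes_per_year))  # one year of capture
```

Under this assumed rate, a stream measured in terabytes per day accumulates to petabytes within a year, which is why the term covers data "captured over time" rather than a fixed size.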
Big data in databases
NoSQL, MPP (massively parallel processing) databases, and Hadoop are complementary: NoSQL systems should be used to capture big data and provide operational intelligence to users, while MPP databases and Hadoop should be used to provide analytical insight for analysts and data scientists.
Benefits
Big data plays an increasingly critical role in everyday life, and it is emerging as one of the most important technologies in the modern world.
Some benefits:
- Marketing agencies use the information kept on social sites such as Facebook to learn how audiences respond to their campaigns, promotions, and other advertising media.
- Product companies and retail organizations use social network information, such as consumers' preferences and product perception, to plan their production.
- Hospitals use patients' previous medical history to provide better and quicker service.
Technologies
Big data technologies are important because they provide more precise analysis, which may lead to sounder decision-making and, in turn, greater operational efficiency, lower costs, and reduced risk for the business.
To harness the power of big data, you need an infrastructure that can manage and process huge volumes of structured and unstructured data in real time while protecting data privacy and security. Various vendors, including Amazon, IBM, and Microsoft, offer technologies for handling big data.
Operational Big Data
This includes systems like MongoDB that provide operational capabilities for real-time, interactive workloads where data is primarily captured and stored. NoSQL big data systems are designed to take advantage of the cloud computing architectures that have emerged over the past decade, which allow massive computations to be run economically and efficiently. This makes operational big data workloads much easier to manage, cheaper, and faster to implement. Some NoSQL systems can give insights into patterns and trends based on real-time data with minimal coding and without the need for data scientists or additional infrastructure.
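The operational pattern described above (capture a record, then read it back interactively) can be sketched with a toy in-memory document store. The class DocumentStore and its methods are hypothetical names chosen for this illustration; they are not MongoDB's API, and a real system would add persistence, indexing, and distribution across machines.

```python
class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}  # maps document id -> document (a plain dict)

    def insert(self, doc_id, doc):
        # Store a copy so later changes by the caller do not alter stored data.
        self._docs[doc_id] = dict(doc)

    def find_by_id(self, doc_id):
        # Point lookup by key: the typical low-latency operational read.
        return self._docs.get(doc_id)

    def find(self, **criteria):
        # Return every document whose fields match all the given criteria.
        return [
            doc for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


store = DocumentStore()
# Documents need not share a schema: the second one has an extra field.
store.insert("u1", {"name": "Ana", "city": "Lisbon"})
store.insert("u2", {"name": "Ben", "city": "Lisbon", "plan": "premium"})

print(store.find_by_id("u1"))
print(store.find(city="Lisbon"))
```

Each operation touches only one or a few records, in contrast to analytical workloads that may scan most or all of the data.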
Analytical Big Data
This includes systems like massively parallel processing (MPP) databases and MapReduce that provide analytical capabilities for retrospective and complex analysis that may touch most or all of the data. MapReduce provides a method of analyzing data that is complementary to the capabilities provided by SQL. A system based on MapReduce can be scaled from a single server to thousands of high-end and low-end machines.
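The MapReduce idea mentioned above can be illustrated with a single-process word count. This is a minimal sketch of the programming model only: the function names are chosen for this example, and a real framework such as Hadoop would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Shuffle: group all emitted values by key so each key is reduced once.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


print(word_count(["big data", "big insight"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread across machines, which is what lets such a system scale from one server to thousands.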
Big Data Challenges
- Capturing data
- Curation
- Storage
- Searching
- Sharing
- Transfer
- Analysis
- Presentation