I. Definition
Big Data refers to the dynamic, large and disparate volumes of data being created by people, tools, and machines. It requires new, innovative, and scalable technology to collect, host, and analytically process the vast amount of data gathered in order to derive real-time business insights that relate to consumers, risk, profit, performance, productivity management, and enhanced shareholder value.
- Ernst and Young
II. 5 Vs
Big Data has five attributes, known as the 5 Vs:
- Velocity
- Volume
- Variety
- Veracity
- Value
1. Velocity
Velocity is the speed at which data accumulates: data is generated extremely fast and never stops. Cloud-based and real-time streaming technologies are therefore needed to handle and process this data quickly.
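To make the idea of processing data as it arrives more concrete, here is a minimal sketch in plain Python. It is only an illustration, not a real streaming framework: the function name `rolling_average`, the `window` parameter, and the sample readings are all assumptions made up for this example.

```python
from collections import deque


def rolling_average(stream, window=3):
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # old readings fall out automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical sensor readings, consumed one at a time
readings = [10, 20, 30, 40]
averages = list(rolling_average(readings, window=3))
```

The point of the generator is that each result is produced as soon as a reading arrives, without waiting for the full data set, which is the basic idea behind stream processing.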
2. Volume
Volume is the scale of the data (the amount of data stored). The drivers of volume are the growing number of data sources, higher-resolution sensors, and scalable infrastructure.
3. Variety
Variety is the diversity of the data: it covers different data types as well as data coming from different sources.
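As a small illustration of variety, the sketch below reads the same kind of information from two differently shaped sources (CSV and JSON) and maps both to one common record layout. The field names (`user`, `amount`, `name`, `total`) and the sample data are hypothetical, chosen only for this example.

```python
import csv
import io
import json


def from_csv(text):
    """Parse CSV rows with 'user' and 'amount' columns into common records."""
    return [
        {"user": row["user"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Parse a JSON list with 'name' and 'total' keys into common records."""
    return [
        {"user": item["name"], "amount": float(item["total"])}
        for item in json.loads(text)
    ]


# Two hypothetical sources describing the same kind of data differently
csv_source = "user,amount\nalice,12.5\n"
json_source = '[{"name": "bob", "total": 7}]'

records = from_csv(csv_source) + from_json(json_source)
```

Once every source is mapped to the same layout, later processing steps no longer need to know where a record came from.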
4. Veracity
Veracity covers the origin and the quality of the data: how accurate it is and how well it conforms to facts.
Attributes:
- Consistency
- Completeness
- Integrity
- Ambiguity
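Two of the attributes above, completeness and consistency, can be measured with very simple checks. The sketch below is one possible way to do it, under assumptions of my own: the function names `completeness` and `inconsistent_keys`, the record fields, and the sample data are hypothetical and not taken from any particular tool.

```python
def completeness(records, fields):
    """Return the fraction of required field values that are present."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0  # nothing required, so nothing is missing
    filled = sum(
        1 for r in records for f in fields if r.get(f) not in (None, "")
    )
    return filled / total


def inconsistent_keys(records, key, field):
    """Return the keys whose records disagree on the value of `field`."""
    first_seen = {}
    conflicts = set()
    for r in records:
        k, v = r.get(key), r.get(field)
        if k in first_seen and first_seen[k] != v:
            conflicts.add(k)
        first_seen.setdefault(k, v)
    return conflicts


# Hypothetical records: id 1 has two different countries, id 2 lacks an age
people = [
    {"id": 1, "country": "FR", "age": 30},
    {"id": 1, "country": "DE", "age": 30},
    {"id": 2, "country": "US", "age": None},
]

score = completeness(people, ["id", "country", "age"])
conflicts = inconsistent_keys(people, "id", "country")
```

Here eight of the nine required values are present, and the two records with id 1 disagree on the country, so that id is flagged as inconsistent.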
5. Value
Possible values from Big Data:
- Profit
- Medical
- Social
- Satisfaction
The main reason to invest time in Big Data is to derive value from it.