What exactly is “big data”? Two interesting posts on the definition. First Andrew Brust on ZDNet offers a common definition:
Big Data is about the technologies and practice of handling data sets so large that conventional database management systems cannot handle them efficiently, and sometimes cannot handle them at all.
He goes on to note that the term “Big Data” is entrenched, even if the definition is not settled. He also mentions the related terms business intelligence, decision support, and data mining.
Now for the second article: Robert Hilliard of Deloitte, in a piece titled It’s Time for a new definition of Big Data, notes that
[Wikipedia defines Big Data in this way] “In information technology, big data consists of datasets that grow so large that they become awkward to work with using on-hand database management tools”. This approach to describing the term constrains the discussion of big data to scale and fails to realise the key difference between regular data and big data.
Hilliard goes on to make two points:
- Sometimes big data is small: If you had 100K sensors on an airplane and each took a reading every second during a 1-hour flight, you would have big data, yet only about 3GB of it. Common database and storage technology would handle this easily.
- Large datasets are sometimes actually small: If you look at something like telephone calls or internet connections, there is a lot of data, but it is quite structured and simple to process with conventional RDBMS (database) technology.
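The airplane-sensor arithmetic above is easy to check. A minimal sketch, assuming each reading is an 8-byte value (the post does not say how large a reading is):

```python
# Back-of-the-envelope check of the airplane-sensor example.
# Assumption (not in the original post): each reading is an 8-byte value.

SENSORS = 100_000          # sensors on the airplane
READINGS_PER_SECOND = 1    # one reading per sensor per second
FLIGHT_SECONDS = 60 * 60   # a 1-hour flight

readings = SENSORS * READINGS_PER_SECOND * FLIGHT_SECONDS
BYTES_PER_READING = 8      # assumed size of one reading
total_bytes = readings * BYTES_PER_READING

print(f"{readings:,} readings")           # 360,000,000 readings
print(f"{total_bytes / 1e9:.2f} GB")      # 2.88 GB -- roughly the 3GB cited
```

Hundreds of millions of data points sounds like “big data,” but at under 3GB it fits comfortably on a laptop, which is exactly Hilliard’s point.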
He then makes the critical point: big data is more about the complexity of the data sets, especially as large numbers of discrete data points interact.
For me, big data is about data sets so large and intricate that it is difficult to extract meaning from them quickly enough to act on it. In short, as the quantity of data goes up, the ability to analyze and understand it at a deep level often goes down. What do you think?