What counts as "big data"?

One relatively common definition of big data is data too large to fit on a single machine. However, we prefer to think of big data in terms of IBM's four V's:

  • volume: how much data you have
  • variety: how many types of data you have
  • veracity: how faithfully your data capture the target behavior
  • velocity: how quickly your data move along the collection and analysis pipeline

The data resources highlighted by Data on the Mind all vary along these four dimensions. Some may be much higher in volume but lower in variety, perhaps capturing only a single type of data. Others may be smaller in volume but capture a richer set of behaviors and have better veracity. With the right mindset and tools, researchers can take advantage of the strengths of any particular dataset.