Frequently Asked Questions
Currently, we welcome cognitive scientists and psychologists at any career stage to contribute to our goals by pointing us to data resources and tools. We would also love to hear any other suggestions for Data on the Mind through our general suggestions form. We plan to open more formal involvement and affiliation with us in the future, and we will provide more information as it becomes available.
One relatively common definition of big data is that it is data too large to fit on a single machine. However, we prefer to think of big data in terms of IBM's four V's:
- volume: how much data you have
- variety: how many types of data you have
- veracity: how faithfully your data capture the target behavior
- velocity: how quickly your data move along the collection and analysis pipeline
The data resources highlighted by Data on the Mind all vary along these four dimensions. Some may be much higher in volume but lower in variety, perhaps capturing only a single type of data. Others may be smaller but capture a richer set of behaviors and have better veracity. With the right mindset and tools, researchers can take advantage of the strengths of any particular dataset.
We are interested in highlighting datasets, repositories, and lists that can shed new light on human behavior and cognition. Although we certainly welcome links to large-scale experimentally derived datasets or repositories, we are especially interested in naturally occurring datasets -- that is, data that have been generated outside of the lab. For example, restaurant reviews can shed light on language and emotion, and betting data could reveal dynamics of decision-making and risk-taking. Datasets do not have to be truly massive to be considered for inclusion: True to the spirit of the four V's of big data proposed by IBM, we are interested in highlighting datasets that are interesting, large-scale, and complex.
Unfortunately, we do not have data storage capabilities at Data on the Mind, so all data resources must be hosted elsewhere. More information on how to submit a data resource to our lists can be found on our data resource suggestion form.