Complete Public Reddit Comments Corpus (2007-2015)
Complete dataset of public comments posted to Reddit (http://www.reddit.com) from October 2007 to May 2015.
Repository of televised news, much of it with captions and rough content statistics
Dataset of 8 million annotated YouTube videos, including a variety of audio and visual features.
Transcripts of British speeches (1895-2015), categorized by date, speaker, party, and title
Data from audio recordings of human interaction across various regions of the United States, covering a variety of speakers and contexts
Repository of multimodal data on child language and communication (subset of TalkBank)
Video data from head-mounted camera (first-person or egocentric perspective) during various tasks
Video data from head-mounted camera (first-person or egocentric perspective) during activities at a theme park
Video data from head-mounted camera (first-person or egocentric perspective) during interactions with inanimate objects, with some coupled eye-tracking data