University of Virginia Miller Center Presidential Speech Archive
Data from U.S. presidential speeches (1789 - 2010), including transcript, audio, and/or video (available modalities vary by speech)
Data from public speeches, including transcript, audio, and/or video (varies by speech)
Data from audio recordings of human interaction across various regions of the United States, covering a variety of speakers and contexts
Various Twitter datasets collected for academic studies (largely focusing on news)
Dataset of timestamped tweets and corresponding demographic information about authors (i.e., gender and location)
List of lexicons for word-emotion, word-sentiment, and word-color associations derived from a variety of sources (including Amazon, Yelp, Amazon Mechanical Turk, and Twitter)
Crowdsourced dataset of word-emotion and word-valence associations (in English and other languages), with some visualization tools
List of language-related corpora and databases across multiple languages
Repository of spoken and text corpora in multiple languages (including Arabic, English, German, Japanese, Mandarin, Spanish, and more)
Word and concept similarity data