Complete Public Reddit Comments Corpus (2007-2015)
Complete dataset of public comments posted to Reddit (http://www.reddit.com) from October 2007 to May 2015.
Repository of televised news broadcasts, many of which include captions and rough content statistics
Unstructured dataset of open-source media articles
List of datasets used to study opinion mining, sentiment analysis, and opinion spam detection
Dataset of over 5.8 million Amazon product reviews (including information on product, rating, review text, and more)
Transcripts from British speeches (1895-2015), categorized by date, speaker, party, and title
Data from U.S. presidential speeches (1789-2010), including transcript, audio, and/or video (available modalities vary by speech)
Data from public speeches, including transcript, audio, and/or video (available modalities vary by speech)