This page provides the dataset used in the paper Real-Time Classification of Twitter Trends. The dataset is available for download at the following link:

[Download dataset (31MB)]

The tar.gz package contains:

To comply with Twitter's Terms of Service, tweet texts are not redistributed; only tweet IDs and author screen names are provided. Tweet texts can be downloaded with any of the following tools:

  1. SemEval-2013 Task 2 Download script (in Python)
    http://www.cs.york.ac.uk/semeval-2013/task2/index.php?id=data
  2. RepLab 2013 Twitter Texts Downloader (in Java)
    http://nlp.uned.es/replab2013/replab2013_twitter_texts_downloader_latest.tar.gz
  3. TREC Microblog Track (in Java)
    https://github.com/lintool/twitter-tools
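
Because only tweet IDs and author screen names are distributed, each record can be mapped to the public address of its tweet before handing it to one of the tools above. The following is a minimal sketch in Python, not part of the dataset or of any of the listed tools. It assumes a tab-separated file with the tweet ID in the first column and the screen name in the second; the actual layout of the files in the package may differ. The function name tweet_urls and the sample record are hypothetical.

```python
import csv
import io


def tweet_urls(lines, delimiter="\t"):
    """Yield (tweet_id, screen_name, url) for each row with at least two fields.

    Assumed layout (not confirmed by the dataset description):
    column 0 = tweet ID, column 1 = author screen name.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        if len(row) < 2:
            # Skip blank or malformed lines.
            continue
        tweet_id = row[0].strip()
        screen_name = row[1].strip()
        url = f"https://twitter.com/{screen_name}/status/{tweet_id}"
        yield tweet_id, screen_name, url


# Hypothetical sample record, for illustration only.
sample = io.StringIO("123456789\texample_user\n")
for tweet_id, screen_name, url in tweet_urls(sample):
    print(url)
```

To process a real file, pass an open file object instead of the in-memory sample, adjusting the delimiter and column order to match the files in the package.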


Citation

Please cite the article below if you use this resource in your research:
A. Zubiaga, D. Spina, V. Fresno, R. Martínez.
Real-Time Classification of Twitter Trends
Journal of the American Society for Information Science and Technology (JASIST). In Press.

BibTeX

@article{zubiaga2014realtime,
  author = {Zubiaga, A. and Spina, D. and Fresno, V. and Mart{\'i}nez, R.},
  journal = {{Journal of the American Society for Information Science and Technology}},
  title = {{Real-Time Classification of Twitter Trends}},
  year = {{In Press.}}
}