1. About RepLab
RepLab is a competitive evaluation exercise for Online Reputation Management systems. As in previous years, the third RepLab campaign (RepLab 2014) will be organized as an activity of CLEF, and the results of the exercise will be discussed at the CLEF 2014 conference in Sheffield, on 15-18 September. In 2012 and 2013, RepLab focused on the problem of monitoring the reputation of entities (typically companies) on Twitter, and dealt with the tasks of entity name disambiguation (Is the tweet about the entity?), reputation polarity (Does the tweet have positive or negative implications for the entity’s reputation?), topic detection (What issue relating to the entity is discussed in the tweet?) and topic ranking (Is the topic a reputation alert that deserves immediate attention?).
RepLab 2014 will still focus on Reputation Monitoring on Twitter, targeting two new tasks: the categorization of messages with respect to standard reputation dimensions (Performance, Leadership, Innovation, etc.) and the characterization of Twitter profiles (author profiling) with respect to a certain activity domain, classifying authors as journalists, professionals, etc. and finding the opinion makers in the domain. The dataset will contain tweets in two languages: English and Spanish.
Note that Twitter profile classification forms part of the shared PAN-RepLab author profiling task. Besides the characterization of profiles from a reputation analysis perspective, participants can also attempt the classification of authors by gender and age, which is the focus of PAN 2014.
The papers of RepLab 2014 (including the overview) are available online at the CLEF 2014 Working Notes: http://ceur-ws.org/Vol-1180/.
RepLab 2014 will include two tasks: (1) classification of Twitter posts and (2) search and classification of Twitter profiles. Participants are welcome to present systems that attempt one or both tasks.
Workplace: "We are sadly going to be loosing Sarah Smith from HSBC Bank, as she has been successful in moving forward into a...http://fb.me/18FKDLQIr"
Innovation: "HSBC to upgrade 10,000 POS terminals for contactless payments http://bit.ly/K9h6QW"
From a reputation analysis perspective, aspects that determine the influence of an author on Twitter include the number of followers, the number of comments about a domain, and the type of author. As an example, below is the profile description of an influential financial journalist:
Description: New York Times Columnist & CNBC Squawk Box (@SquawkCNBC) Co-Anchor. Author, Too Big To Fail. Founder, @DealBook. Proud father. RTs ≠ endorsements
Location: New York, New York · nytimes.com/dealbook
Whitney Tilson: Evaluating the Dearth of Female Hedge Fund Managers http://nyti.ms/1gpClRq @dealbook
Dina Powell, Goldman’s Charitable Foundation Chief to Lead the Firm's Urban Investment Group http://nyti.ms/1fpdTxn @dealbook
Systems can also participate in the shared author profiling task RepLab@PAN. In order to do so, participants will need to classify profiles by gender and age. Two categories, female and male, will be used for gender. Regarding age, the following classes will be considered: 18-24, 25-34, 35-49, 50-64, and 65+.
RepLab 2014 used Twitter data in English and Spanish.
Each opinion maker will be categorized as journalist, professional, authority, activist, investor, company, or celebrity. The data set will be split into training and test sets. The estimated proportions are 30% for training and 70% for test, although the exact splits will be given later.
The RepLab 2014 Dataset is publicly available at http://nlp.uned.es/replab2014/replab2014-dataset.tar.gz.
4. Evaluation Measures
The reputation dimensions task and the categorization of profiles by type of author in the author profiling task will be evaluated as classification problems. Accuracy and precision/recall measures over each class will be reported, using accuracy as the main measure.
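The classification measures named above can be illustrated with a minimal sketch. It assumes single-label classification (one gold label and one predicted label per item) and is not the official RepLab evaluation package; the function names and the toy labels are hypothetical.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_class_precision_recall(gold, pred):
    """Return {label: (precision, recall)} for every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    gold_count = Counter(gold)   # items that truly belong to each class
    pred_count = Counter(pred)   # items the system assigned to each class
    true_pos = Counter(g for g, p in zip(gold, pred) if g == p)
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / gold_count[label] if gold_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


if __name__ == "__main__":
    # Toy data (hypothetical), using reputation dimension names as labels.
    gold = ["Innovation", "Workplace", "Performance", "Innovation"]
    pred = ["Innovation", "Performance", "Performance", "Workplace"]
    print("accuracy:", accuracy(gold, pred))
    for label, (p, r) in per_class_precision_recall(gold, pred).items():
        print(label, "precision:", p, "recall:", r)
```

For the author categorization subtask, the same functions would be applied only to the profiles annotated as influencers in the gold standard, by filtering both lists before scoring.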
Note that for the categorization subtask, systems are expected to return the type of author category for every profile. However, as pointed out above, this categorization will be evaluated only over the profiles annotated as “Influencers” in the gold standard.
In the author profiling task, the detection of opinion makers will be evaluated as a traditional ranking information retrieval problem, using the MAP, DCG, RBP and Reliability/Sensitivity measures. The systems’ output will be a ranking of profiles.
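As a rough illustration of three of the ranking measures mentioned above, the sketch below computes average precision (averaged over rankings to give MAP), DCG and RBP for a ranked list of profiles. It assumes binary relevance (a profile either is or is not an opinion maker), the log2 discount for DCG, and a hypothetical RBP persistence value of 0.8; the official evaluation package may use different formulations or cutoffs. Reliability/Sensitivity is not covered here.

```python
import math


def average_precision(ranking, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """MAP over several (ranking, relevant_set) pairs, e.g. one per domain."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def dcg(ranking, relevant):
    """Discounted cumulative gain with binary gains and a log2(rank + 1) discount."""
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranking, start=1)
        if item in relevant
    )


def rbp(ranking, relevant, persistence=0.8):
    """Rank-biased precision; persistence=0.8 is an assumed default."""
    return (1 - persistence) * sum(
        persistence ** (rank - 1)
        for rank, item in enumerate(ranking, start=1)
        if item in relevant
    )


if __name__ == "__main__":
    # Hypothetical profile identifiers; "a" and "c" are the opinion makers.
    ranking = ["a", "b", "c", "d"]
    relevant = {"a", "c"}
    print("AP: ", average_precision(ranking, relevant))
    print("DCG:", dcg(ranking, relevant))
    print("RBP:", rbp(ranking, relevant))
```

All three measures reward placing opinion makers near the top of the ranking; they differ in how quickly the contribution of lower ranks decays.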
5. Important Dates
6. How to Submit Runs
This section contains the instructions on the submission of results (please note that the deadline is May 5 and cannot be further postponed).
7. TORM - Track for Online Reputation Management
This year, RepLab will explore new scenarios and offer new tasks: classification of tweets by reputation dimension and author profiling (http://nlp.uned.es/replab2014/). However, RepLab 2014 will also include the Track for Online Reputation Management (TORM), in order to give participants an opportunity to keep working on the data sets of past campaigns (http://nlp.uned.es/replab2013/).
RepLab is an activity sponsored by the EU project LiMoSINe.
Julio Gonzalo (UNED, Madrid)
Eugene Agichtein, Emory University, USA
RepLab 2014 Dataset available
08/10/2014
The RepLab 2014 Dataset is publicly available at http://nlp.uned.es/replab2014/replab2014-dataset.tar.gz.
RepLab 2014 papers online!
12/09/2014
The papers of RepLab 2014 --including the overview-- are available online at the CLEF 2014 Working Notes: http://ceur-ws.org/Vol-1180/.
2/05/2014
Please note that the deadline for submitting system results has been extended to May 9th! Happy hacking!
Call for papers TORM
31/03/2014
TORM, Track for Online Reputation Management, will focus on work that makes substantial progress in one or more tasks addressed in the first two RepLab campaigns. It will serve as a basis for a special issue on Online Reputation Management in an indexed journal. The deadline for paper submission is June 7. For more information, please see TORM.
RepLab 2014 Facebook event!
Just a reminder that we have created a Facebook event for RepLab 2014 to share experiences, doubts, problems, etc. Please join us at the following link: https://www.facebook.com/events/593775794030878/
RepLab 2014 test set and evaluation package available!
We are pleased to announce that the RepLab 2014 test set and evaluation package are now available. To access the dataset and the evaluation package, please register for the lab at CLEF. If you have already registered and have not received an email from the organizers, please contact enrique_at_lsi.uned.es and jcalbornoz_at_lsi.uned.es.
Due to technical problems with the registration form on the CLEF website, we have opened an alternative way of registering for RepLab (until the CLEF server is up again). If you want to participate and could not register through the CLEF website, please write to the lab organizers (enrique_at_lsi.uned.es and jcalbornoz_at_lsi.uned.es), indicating in the mail the task(s) you want to participate in:
- Task 1: Reputation Dimensions
- Task 2: Author Profiling
RepLab 2014 training dataset already available!!!
We are pleased to announce that the RepLab 2014 training set is now available. To access the dataset, please register for the lab at CLEF. If you have already registered and have not received an email from the organizers, please contact enrique_at_lsi.uned.es and jcalbornoz_at_lsi.uned.es.
Virtual machines available
Thanks to the shared author profiling task PAN-RepLab, RepLab participants will be able to claim a virtual machine, an option offered by the PAN organisers. This will allow research groups to submit running software and deploy it in a virtual machine at PAN's site. Instructions on how to prepare the software are given at http://pan.webis.de/, in the submission box of the author profiling task.