Sentiment Analysis

eDiseases Dataset

The eDiseases dataset contains patient data from MedHelp. We extracted 146 posts on allergies, 191 posts on Crohn's disease, and 142 posts on breast cancer, which comprise 983, 1780, and 1029 sentences respectively. Each sentence in the dataset is labeled with Factuality (OPINION, FACT, EXPERIENCE) and Polarity (POSITIVE, NEUTRAL, NEGATIVE).
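
As a rough illustration of the annotation scheme described above, the following Python sketch models one labeled sentence and sums the stated counts. The class and field names (LabeledSentence, condition, text) are hypothetical, and the example sentence is invented; the actual file format of the dataset is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Factuality(Enum):
    OPINION = "OPINION"
    FACT = "FACT"
    EXPERIENCE = "EXPERIENCE"


class Polarity(Enum):
    POSITIVE = "POSITIVE"
    NEUTRAL = "NEUTRAL"
    NEGATIVE = "NEGATIVE"


@dataclass(frozen=True)
class LabeledSentence:
    # Hypothetical record layout, assumed for illustration only.
    condition: str
    text: str
    factuality: Factuality
    polarity: Polarity


# Per-condition counts as stated in the dataset description.
POSTS = {"allergies": 146, "Crohn's disease": 191, "breast cancer": 142}
SENTENCES = {"allergies": 983, "Crohn's disease": 1780, "breast cancer": 1029}

total_posts = sum(POSTS.values())
total_sentences = sum(SENTENCES.values())

# Invented example sentence, not taken from the dataset.
example = LabeledSentence(
    condition="allergies",
    text="The new antihistamine worked well for me.",
    factuality=Factuality.EXPERIENCE,
    polarity=Polarity.POSITIVE,
)

if __name__ == "__main__":
    print(total_posts, total_sentences)
    print(example.factuality.value, example.polarity.value)
```

Using enumerations rather than free strings means a mistyped label such as "POSITVE" fails at load time instead of silently creating a fourth class.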

SentiSense Affective Lexicon

The SentiSense Affective Lexicon consists of 5,496 words and 2,190 synsets from WordNet 2.1, each labeled with an emotional category. Most of the lexicon consists of nouns and adjectives, followed by verbs and a small set of adverbs. SentiSense is available in English (WordNet 2.1 and WordNet 3.0) and in Spanish (WordNet 3.0). Polar words are also provided in both languages.

SentiSense Affective Tools

SentiSense comes with a set of tools that allow users to visualize the lexicon, view statistics about the distribution of synsets and emotions in SentiSense, and easily expand the lexicon. These tools are only available for the English version of SentiSense that uses WordNet 2.1.

Hotel Review Corpus

The HotelReview Corpus is a corpus of 1000 reviews. Each review has been manually tagged with a five-class category (Excellent, Good, Fair, Poor, Very poor) and with a three-class category (Good, Fair, Poor).