The analysis of the training data raises several research issues that need to be addressed during system development.
First, as Ortony [8] discussed, while some words (eg, miserable, painful) bear fairly unambiguous affective meaning, other words act only as indirect references to emotional states, depending on the contexts in which they appear. Interestingly, we also found that even words with the same sense can evoke different emotions in different contexts. Consider, for example, the affect word forgive in sentences (E1) and (E2). It evokes two emotions of different polarity, guilt and forgiveness, depending on the pronoun that follows it. Therefore, detecting affective text requires considering the neighboring context of the affect word.
E1:
Tell him to forgive me if I ever treated him bad. [Emotion: guilt]
E2:
Tell him I forgive him for all my heart aches. [Emotion: forgiveness]
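The role of neighboring context can be illustrated with a minimal sketch that collects the tokens around an affect word as position-tagged features. The function name `context_features`, the window size, and the tokenization are assumptions made for illustration only; they are not the feature set used in this work.

```python
import re

def context_features(sentence, affect_word, window=2):
    """Collect tokens within `window` positions of each occurrence of
    `affect_word`, tagged with their signed offset (e.g. "+1:me")."""
    # Simplistic tokenizer (an assumption): lowercase alphabetic tokens.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    features = []
    for i, tok in enumerate(tokens):
        if tok != affect_word:
            continue
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                features.append(f"{offset:+d}:{tokens[j]}")
    return features

e1 = context_features("Tell him to forgive me if I ever treated him bad.", "forgive")
e2 = context_features("Tell him I forgive him for all my heart aches.", "forgive")
print(e1)  # ['-2:him', '-1:to', '+1:me', '+2:if']
print(e2)  # ['-2:him', '-1:i', '+1:him', '+2:for']
```

For (E1) and (E2), the two feature lists differ in the token right after the affect word ("+1:me" versus "+1:him"), which is the kind of contextual signal a classifier could use to separate guilt from forgiveness.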
Second, although the sentiment of many sentences is indicated by the presence of affect words, a considerable number of sentences contain no such words and instead convey affect through their underlying meaning. Sentence (E3) below is an example that does not contain an expected affect word. Automatically detecting such pragmatic information is a hard challenge, and language models that rely on surface features of sentences are very weak at detecting sentences with implicit emotion expressions of this kind.
E3:
I don’t know where she put my clothes from my dresser. [Emotion: anger]
Third, as mentioned earlier, a considerable number of sentences contain two or more emotion expressions. For example, in sentence (E4), the first clause conveys fear through the phrase “afraid of”, but the second clause conveys love through the verb “love”. Because the number of multi-emotion instances is small, it is impractical to build multi-emotion classifiers that distinguish multi-emotion sentences from the rest of the text. One feasible solution might be to build multiple binary classifiers, each targeting one particular emotion. However, for a sentence-level binary emotion classifier, the text fragments depicting other emotions become noise, which is likely to degrade the accuracy of the classifier. Therefore, finer-grained emotion analysis at the level of smaller text units (ie, emotion cues) is required. For example, emotion cues (eg, “I am afraid of you”, “I love you”) that convey affective meaning with respect to a particular emotion need to be annotated separately from the sentences. The annotation of emotion cues is discussed in a later section.
E4:
It is just that I am afraid of you both at times, but I love you both very much. [Emotions: fear, love]
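The idea of one binary classifier per emotion can be sketched as a data transformation: each multi-label sentence is turned into a positive or negative instance for every emotion. The function `to_binary_datasets` and the two-sentence toy corpus are hypothetical and only illustrate the decomposition; no actual classifier is trained here.

```python
def to_binary_datasets(examples):
    """Split multi-label examples into one binary dataset per emotion.

    `examples` is a list of (sentence, set_of_emotions) pairs. For each
    emotion seen in the data, every sentence becomes a positive (1) or
    negative (0) instance of that emotion.
    """
    emotions = sorted({e for _, labels in examples for e in labels})
    return {
        emotion: [(text, int(emotion in labels)) for text, labels in examples]
        for emotion in emotions
    }

# Toy corpus built from sentences (E4) and (E1).
examples = [
    ("It is just that I am afraid of you both at times, "
     "but I love you both very much.", {"fear", "love"}),
    ("Tell him to forgive me if I ever treated him bad.", {"guilt"}),
]
datasets = to_binary_datasets(examples)
for emotion, rows in datasets.items():
    print(emotion, [label for _, label in rows])
```

In this sketch, sentence (E4) is a positive instance for both the fear and the love datasets, so the fear classifier also sees the clause about love. This is the source of noise described above and the motivation for annotating emotion cues at a finer granularity.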
Fourth, we found that the affective text of some emotions (eg, hopelessness) is sensitive to negation expressions. Certain phrases that contain negation words, eg, “cant go on”, “can’t stand”, and “can not take it any more”, intensify the strength of the emotion. Moreover, negation words can sometimes shift the polarity of an emotion, as in “I do not blame him”. Therefore, it is necessary to incorporate negation detection into the identification of emotion expressions.
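A minimal sketch of negation detection, assuming a small hand-picked list of negation markers and a fixed scope of a few tokens after each marker, might look as follows. The marker list, the scope size, and the function name `negated_terms` are illustrative assumptions, not the method used in this work. Whether a detected negation intensifies the emotion or shifts its polarity would have to be decided by a later step.

```python
import re

# Hypothetical, deliberately small set of negation markers; it includes the
# misspelled form "cant" because such spellings occur in the quoted phrases.
NEGATORS = {"not", "no", "never", "cannot", "can't", "cant", "don't", "dont"}

def tokenize(text):
    # Normalize the typographic apostrophe so that "can’t" matches "can't".
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower().replace("’", "'"))

def negated_terms(sentence, terms, scope=3):
    """Return the members of `terms` that appear within `scope` tokens
    after a negation marker, in order of first appearance."""
    tokens = tokenize(sentence)
    found = []
    for i, tok in enumerate(tokens):
        if tok in NEGATORS:
            for nxt in tokens[i + 1 : i + 1 + scope]:
                if nxt in terms and nxt not in found:
                    found.append(nxt)
    return found

print(negated_terms("I do not blame him", {"blame"}))  # ['blame']
print(negated_terms("I blame him", {"blame"}))         # []
print(negated_terms("I can’t stand it", {"stand"}))    # ['stand']
```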
Fifth, while machine learning-based models may classify emotions effectively when given a sufficient number of training instances (eg, love, hopelessness, guilt), they do not work well on emotions that have few training examples (eg, forgiveness, abuse, pride). With the help of a pre-compiled emotion lexicon, a keyword spotting approach with a weighted score function may offer an alternative solution to the problem of scarce training samples in emotion classification.
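As a rough illustration of keyword spotting with a weighted score function, the sketch below sums per-emotion weights for every lexicon word found in a sentence and reports the emotions whose total reaches a threshold. The lexicon entries, the weights, the threshold, and the additive scoring are all assumptions made for this example; the text above does not specify the actual lexicon or score function.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: word -> {emotion: weight}. Entries and weights are
# invented for illustration and do not come from a real emotion lexicon.
LEXICON = {
    "forgive": {"forgiveness": 1.0, "guilt": 0.4},
    "proud": {"pride": 1.0},
    "afraid": {"fear": 0.9},
    "love": {"love": 1.0},
}

def score_emotions(sentence, lexicon=LEXICON):
    """Sum the lexicon weights of all matching tokens, per emotion."""
    scores = defaultdict(float)
    for tok in re.findall(r"[a-z']+", sentence.lower()):
        for emotion, weight in lexicon.get(tok, {}).items():
            scores[emotion] += weight
    return dict(scores)

def spot_emotions(sentence, threshold=0.5, lexicon=LEXICON):
    """Return the emotions whose summed score reaches `threshold`."""
    scores = score_emotions(sentence, lexicon)
    return sorted(e for e, s in scores.items() if s >= threshold)

print(spot_emotions("Tell him I forgive him for all my heart aches."))
print(spot_emotions("It is just that I am afraid of you both at times, "
                    "but I love you both very much."))
```

Because the scores come from the lexicon rather than from labeled sentences, an emotion such as forgiveness can be detected even when it has few training examples. A sentence with no lexicon words yields no emotion, which matches the limitation noted for sentences with implicit emotion expressions.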