In this paper we addressed a strategy that is regularly adopted in multi-item questionnaires, namely the use of reverse-worded items. Many questionnaire developers adopt this strategy with the intention of avoiding response bias, particularly acquiescence. We found evidence that this goal is not met. We also discussed an often unintended consequence of reversing some items, namely the debate arising in the literature about the presence of two related but different conceptual features, such as fitness and fatigue. We will first discuss the goal of reversing items and then the consequence.
Response bias is a difficult phenomenon to detect as well as to prevent. We distinguished two types of response bias, response set and response style, and focused on the characteristics of three response styles that are often confused or overlooked in the literature: acquiescence, inattention, and confusion due to item verification difficulty. An acquiescent respondent has the tendency to affirm all statements. Pure acquiescence cannot be prevented by reversing half of the items. When all items assess fatigue and are worded in the same direction, an acquiescent person who is fatigued will receive a sumscore close to the true state. An acquiescent person who is not at all fatigued, however, will receive a biased sumscore. Thus, reversing half of the items will lead to a less biased sumscore for the latter respondent, but only at the cost of a more biased sumscore for the fatigued respondent. Therefore, reversing some items cannot be presumed to lead to better assessments in the case of acquiescent respondents.
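The acquiescence argument can be made concrete with a small arithmetic sketch. The numbers are illustrative assumptions, not data from this study: a hypothetical 10-item fatigue scale answered on a 1 to 5 agreement scale, a purely acquiescent respondent who fully agrees with every statement, and reverse-worded items that are recoded before summing. The function names are likewise hypothetical.

```python
def acquiescent_sumscore(n_items, n_reversed, scale_min=1, scale_max=5):
    """Fatigue sumscore of a respondent who fully agrees with every item."""
    agree = scale_max
    # Items worded toward fatigue keep the raw answer.
    straight = (n_items - n_reversed) * agree
    # Reverse-worded items are recoded so that a high score still means fatigued.
    recoded = n_reversed * (scale_min + scale_max - agree)
    return straight + recoded


def bias(true_sumscore, n_items, n_reversed):
    """Observed minus true sumscore for a purely acquiescent respondent."""
    return acquiescent_sumscore(n_items, n_reversed) - true_sumscore


n_items = 10
# Hypothetical true states: maximally fatigued (50) and not at all fatigued (10).
true_scores = {"fatigued": 50, "not fatigued": 10}
for n_reversed in (0, 5):
    for label, true_score in true_scores.items():
        print(n_reversed, label, bias(true_score, n_items, n_reversed))
```

Under these assumptions the bias is 0 and +40 points without reversed items, and -20 and +20 points when half of the items are reversed. The error is redistributed between the two respondents rather than removed.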
Inattention can relate to several different aspects of an item, such as the specific content (is it about mental or physical fatigue?), the time frame (is it about today or the last week?), the direction of the item (is it about being fatigued or not being fatigued?), or the answer categories (do they range from always to never, or the other way around?).
On closer examination, several characteristics can be distinguished, all covered by the concept of inattention. Some respondents may be less precise in general. Others may become more inattentive while answering, depending on the length of the questionnaire and the extent to which the items resemble each other. And some may become frustrated by having to answer more or less the same items, in either the same or the opposite direction. Thus, in some cases extending the questionnaire with more items, in the same or the opposite direction, may be counterproductive.
Confusion is a response style dependent on the difficulty of the item and the cognitive strategies the respondent has to employ to give an answer that is in accordance with the true state.
We argued why reversing a portion of the items is an ineffective way of dealing with response bias. Reversing items by using negative particles or affixal morphemes will lead to increased difficulty, and thus more bias, without any clear advantage. Reversing some items by means of reverse wording may decrease item difficulty for those respondents who can agree with the reversed items, but at the same time will lead to more bias due to confusion for the other respondents, together with increased bias due to inattention for all respondents. In any case, acquiescence will not be avoided, only detected at best. To a great extent, the confusion caused by reversing items is due to the custom of presenting original and reversed items intermixed.
Finally, we demonstrated that a particular instrument, the MFI-20, designed to prevent response bias by reverse wording half of the items, does not achieve this goal. The MFI-20 is a widely used instrument for reliable and valid assessment of fatigue in general and of several types of fatigue. The results of the study raise the question of whether the addition of ten reverse-worded items, intended to prevent response bias, is justified. An added value of the negatively formulated items over the positive items, or vice versa, could not be demonstrated. With respect to content, the ten negatively and ten positively formulated items measure almost the same, if not exactly the same, aspects of fatigue. Since the developers of the MFI explicitly stated that they added reverse-worded items in order to tackle response bias, it would be useless to focus on any potential difference with respect to content or responsiveness of these ten items.
Instead of preventing response bias, the addition of ten reverse-worded items appears to increase the risk of inattention and confusion. We did not focus intensively on potential subtle differences in content between reverse-worded items and their counterparts, for two reasons. Firstly, the developers of the questionnaire explicitly stated that the purpose of adding reverse-worded items was to prevent response bias. Secondly, any difference in content that is considered important should be assessed by items that are all formulated in the same direction, in order to maximize the opportunity to detect subtle differences and to avoid artifacts due to accidental misreading.
The addition of ten reverse-worded items did lead to slightly higher values of Cronbach’s alpha. This is, however, to be expected when scales become twice as long. With one exception, mean inter-item correlations decreased when the ten reverse-worded items were added, whereas an increase was to be expected, considering the reason for adding reversed items.
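Why alpha can rise while the mean inter-item correlation falls can be seen from the standardized form of Cronbach's alpha, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r the mean inter-item correlation. The sketch below uses this standardized form as a simplifying assumption, and the item counts and correlations are hypothetical values chosen for illustration, not the MFI-20 results.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized Cronbach's alpha from item count and mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# Hypothetical scale before and after its length is doubled.
short_scale = standardized_alpha(2, 0.60)  # fewer items, higher mean correlation
long_scale = standardized_alpha(4, 0.50)   # twice as many items, lower mean correlation

print(round(short_scale, 2), round(long_scale, 2))
```

In this example alpha increases from about 0.75 to 0.80 even though the mean inter-item correlation drops from 0.60 to 0.50, so a higher alpha after doubling the scale length is not in itself evidence that the added items improve the scale.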
Considering the findings of Swain et al., respondents seem to make fewer errors with items that reflect their experience or situation than with items that describe the opposite. Since this instrument is designed to measure fatigue, it will probably be used more often among persons with a certain level of fatigue. Therefore, the negatively formulated items are to be preferred. The psychometric qualities of these ten items are acceptable, if not good.
A consequence of reversing items is the identification of two related but unipolar concepts where only one was intended. We expect some validity to the claim of unipolarity, and thus of two related concepts that are tapped by asking for both fitness and fatigue, positivism and negativism, happiness and sadness, or being relaxed and being nervous. However, we consider it a serious weakness that these claims emerge from data analysis rather than from a theoretical position. If distinguishing between two related but opposite concepts is truly relevant, it would be helpful to take precautions to assess these concepts unambiguously. In accordance with Roszkowski and Soven, we suggest presenting ordinary items and reverse-worded items separately, instead of in a list in which these items are all mixed up.
Even when a multi-item questionnaire consists of items stated in the same direction, there are problems to be addressed that hamper a straightforward relationship between the theoretical concept and the sumscore obtained by adding the item scores. Some aspects commonly seen in multi-item instruments that deserve to be addressed are:
- Differences in item difficulty and their consequences for the interpretation of summed scores, an issue that Item Response Theory addresses.
- Sometimes some aspects are addressed with more items than others in the same questionnaire, leading to an often unknown and implicit weighting of their contribution to the total score.
- The rationale and consequences of using different answer categories for items that are supposed to belong to the same scale.
- The rationale and consequences of using both items asking for frequency and items asking for intensity.
All these phenomena deserve to be addressed. This discussion will be more fruitful if it is not obscured by effects resulting from reversed worded items.
In conclusion, we consider reversing items in order to prevent response bias a counterproductive strategy. Acquiescence cannot be prevented by reversing, and more errors will be made due to inattention or confusion. An instrument with all items formulated in the same direction and referring to the intended concept (i.e. fatigue or fitness, depression or happiness) is to be preferred. If a researcher is concerned about respondents missing subtle differences between the items, other strategies are to be considered.
It is surprising that reversing items, introduced several decades ago, is still predominant in many popular questionnaires. Discussion about the pros and cons of this practice should be revived. Consider, on a rainy day, all cows in a pasture tending to stand facing in the same direction, with their backs to the wind. We admit that one cow standing in the opposite direction would be conspicuous immediately. Unfortunately, items do not have a head and a tail.