To investigate demographic correlates of fatigue in the US general population using a new instrument developed by the Patient-Reported Outcomes Measurement Information System (PROMIS). First, we examined correlations between the new PROMIS instrument and the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) and the SF-36v2 Vitality subscale. Based on prior findings, we further examined two demographic correlates of fatigue: whether women would report higher levels of fatigue than men, and whether married people would experience lower levels of fatigue than unmarried people. We also explored the relationship between age, education, and fatigue.
Analyses were based on fatigue ratings by 666 individuals from the general population. Fatigue was assessed with the new PROMIS instrument, the FACIT-F, and the SF-36v2 Vitality subscale. Differences in fatigue were examined with independent samples t-tests and univariate ANOVAs.
The three fatigue instruments were highly intercorrelated. Confirming prior reports, women reported higher levels of fatigue than men. Married participants reported significantly less fatigue than their unmarried counterparts. Univariate ANOVAs yielded a main effect for participants’ age; younger participants gave significantly higher fatigue ratings. We also found a main effect for participants’ education. Participants with a master’s or doctoral degree had significantly lower ratings of fatigue than participants with some college education and those with education up to high school.
Female gender, not being married, younger age, and lower educational attainment were each associated with increased fatigue in the general population, and the three fatigue instruments performed equally well in detecting the observed associations.
Fatigue is a common reason for seeking medical care and a source of considerable economic burden (1, 2). The prevalence of fatigue in the general population has been reported to range from 7% to 45% (see 1, 2); a recent study found that 38% of US workers reported being fatigued (2). Adequate treatment of fatigue has proven challenging, and fatigue is often overlooked by healthcare providers due to its diagnostically non-specific nature (3). Fatigue is a common pathological feature of various medical conditions, including chronic heart disease, cancer, multiple sclerosis, chronic insomnia, and depression (4), as well as chronic fatigue syndrome (5).
Fatigue can be broadly characterized as either a subjective feeling or a decrement in a person’s ability to perform up to a certain standard (6). At pathological levels, fatigue can be overwhelming, debilitating, and lead to a sustained sense of exhaustion (7–11). Non-pathological fatigue has lower intensity, shorter duration, and less disabling effects on functional activities.
Fatigue has been extensively studied in medical conditions. Numerous fatigue instruments have been developed for disease-specific use, such as for rheumatoid arthritis (12), cancer (13–16), multiple sclerosis (17–19), chronic fatigue syndrome (20), and myasthenia gravis (21). Research has also sought to differentiate fatigue experienced by clinical samples from that experienced by the community (22); the development of tools to capture this difference has been initiated (23, 24). Little is known, however, about the utility of these questionnaires across medical conditions and their applicability to healthy populations (25).
Recently, an effort has been made to advance fatigue measurement across healthy and medical populations as part of the Patient-Reported Outcomes Measurement Information System (PROMIS), funded within the National Institutes of Health (NIH) Roadmap for Medical Research Initiative. PROMIS is a multi-center, collaborative project to improve the measurement of clinically important symptoms and outcomes (e.g., fatigue, pain, emotional distress, physical functioning). Its goal is to develop and standardize a set of item banks that allow the assessment of key symptoms and health concepts across a wide range of populations (26, 27). The PROMIS measurement tools are being developed using a standardized step-wise series of methods including qualitative item review and sophisticated quantitative methods of advanced psychometric modeling (26–28). The following phases of item development were completed and yielded a core pool of items for each of the PROMIS domains: identification of extant items, classification and selection, review of items and item revision, focus groups on appropriate domain coverage, cognitive interviews for item refinement, and final revisions before field testing (described in 26).
The PROMIS item banks were developed using item response theory (IRT). In IRT, an individual’s true score is defined on the latent trait or construct of interest rather than on the test score, as is the case in classical test theory. A major concept of the IRT approach is the “item characteristic curve”, which describes the association between an individual’s level on the trait/construct (for example, fatigue) and the probability that the individual will select a particular response option on a particular item (see 29). The IRT-calibrated PROMIS item banks consist of an exhaustive set of carefully calibrated questions that define and quantify a common trait/construct (see 30). IRT has distinct advantages over classical test theory approaches: 1) One does not need the full set of items to appropriately capture the construct. Assessments may be accomplished with fewer items compared to static instruments; 2) Items that may be particularly relevant for one specific disease can be selected from the bank (for example, cognitive fatigue aspects for multiple sclerosis); 3) All PROMIS item banks are normed to match the 2000 United States Census by gender, age, and education, allowing comparison across domains and people with different conditions; 4) The banks can be administered in multiple formats including dynamic computerized adaptive testing (CAT) to allow individually tailored assessment. CAT is a specific type of computer-based assessment, similar in approach to clinical interviewing, that selects the most informative questions about the individual based on his/her previous response choices (see 29). As such, PROMIS provides a sound, yet flexible, opportunity to stimulate and standardize the assessment of patient-reported outcomes (PROs) and to assist clinicians in evaluating treatment response (27).
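The item characteristic curve described above can be illustrated with a short sketch. It assumes a graded response model, one common IRT model for ordered response options; the model choice, the function name, and the item parameters below are illustrative assumptions and are not taken from the PROMIS calibration.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Probability of each ordered response option under a graded response model.

    theta          -- respondent's level on the latent trait (e.g., fatigue)
    discrimination -- item slope (hypothetical value)
    thresholds     -- increasing category boundaries (hypothetical values)
    """
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Cumulative curves: P(response at or above each boundary), padded with 1 and 0.
    cumulative = (
        [1.0]
        + [logistic(discrimination * (theta - b)) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 5-option item evaluated at a low and a high trait level.
probs_low = grm_category_probabilities(-1.0, 2.0, [-1.0, 0.0, 1.0, 2.0])
probs_high = grm_category_probabilities(2.0, 2.0, [-1.0, 0.0, 1.0, 2.0])
```

In this sketch, a respondent with a higher trait level is more likely to choose the highest response option, which is the relationship the item characteristic curve expresses.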
Studies using more traditional measures have tried to shed light on the etiology of fatigue by identifying demographic correlates. Research in medical settings has demonstrated a consistent relationship between gender and fatigue. Women generally experience more fatigue than men with a ratio as high as 3:1 (31–33). Evidence also suggests that married people experience less fatigue compared to unmarried people (31). Results for age and educational status, however, are less consistent. Some studies suggest that older individuals and those who have less education experience less fatigue (32), while others observe no relationship (33).
Demographic correlates have also been investigated in the general population, with similar findings for gender and marital status (2, 34–37); however, more population-based research is needed. In the US, fatigue has risen to become the main reason for approximately 5 to 10% of visits to primary care settings, and is a secondary issue for an additional 10 to 20% of visits (1, 2). American workers with fatigue are estimated to cost employers more than 136 billion dollars annually in health-related lost productive time, over 101 billion dollars more than non-fatigued workers (2). Insight into the prevalence of fatigue in the general population can facilitate a better understanding of who is more or less likely to seek healthcare and why. This, in turn, has the potential to inform tailored treatment plans and more cost-effective utilization of healthcare resources.
The present study had three goals. The first was to examine correlations between the newly developed PROMIS instrument and two widely used fatigue legacy scales, the FACIT-F and the Vitality subscale of the SF-36v2. The second goal was to test the PROMIS instrument in a large US population study to confirm previously observed associations between fatigue, gender, and marital status. We expected that women would report higher levels of fatigue than men and that people who are married would experience lower levels of fatigue than unmarried people. Finally, we explored the relationship between age, educational status, and fatigue.
Data collection was conducted from July 2006 to March 2007. Institutional Review Board (IRB) approval was obtained for the conduct of the study. The present study employed secondary analysis of de-identified data. A detailed overview of the sampling strategy used for the PROMIS data collection is described elsewhere (27). In brief, data could be classified into either full-bank testing (i.e., sample completed all items included in the testing forms) or block-testing (i.e., sample completed some items from all domains). Item calibration was conducted via a T-score metric with the mean of the US general population equal to 50 and a standard deviation fixed at 10 (38).
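As a minimal illustration of the T-score metric, the sketch below rescales a score to a mean of 50 and a standard deviation of 10. It assumes the input is expressed relative to the mean and standard deviation of the reference population; the function and parameter names are hypothetical.

```python
def to_t_score(score, reference_mean=0.0, reference_sd=1.0):
    """Rescale a score so the reference population has mean 50 and SD 10."""
    z = (score - reference_mean) / reference_sd  # standardize against the reference
    return 50.0 + 10.0 * z


# A respondent one reference SD above the reference mean has a T-score of 60.
t_one_sd_above = to_t_score(1.0)
```

On this metric, a score of 60 indicates fatigue one standard deviation above the mean of the reference population.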
The sample used for the current study completed the PROMIS fatigue full-bank testing (n = 666). The majority of participants (also called panelists) were recruited by YouGovPolimetrix, an internet survey company (a small subset of participants were recruited at the University of North Carolina and Stanford University, n = 38). People who visit Polimetrix’s website (www.pollingpoint.com) can provide their views on public policy and other issues. Over one million individuals regularly participate in online surveys hosted by the website and have provided the company with their contact information. Pollingpoint panelists are recruited by a variety of methods (e.g., e-random digit dialing, invitations via web newsletters, and Internet poll-based recruitment) but typically receive email invitations for various survey studies. Interested participants are directed to a study-specific website and engage in the online study protocol. YouGovPolimetrix uses a sample matching procedure to recruit a random sample of panel members representative of the population of interest. The procedure starts with a listing of all respondents in the population of interest. Next, a random sample is selected from these respondents. Then, the closest match to the selected characteristics is selected from the panel. This method has been shown effective, even for underrepresented groups on the Internet (39). Panel members receive minimal compensation. YouGovPolimetrix targeted recruitment so that invited panelists met key demographic characteristics outlined by PROMIS. The targeted population for PROMIS was as follows: gender (50% female), age (20% in each of 5 age groups: 18–29, 30–44, 45–59, 60–74, 75+), race/ethnicity (10% black and Hispanic), and education (10% less than high school graduate).
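The sample matching procedure can be sketched as a nearest-neighbor search. The version below is a simplified illustration rather than the vendor's actual algorithm: it assumes categorical characteristics, counts mismatches as the distance, and matches each sampled target to one panelist without replacement. All names and records are hypothetical.

```python
def mismatch_count(target, panelist, keys):
    """Number of characteristics on which a panelist differs from a target."""
    return sum(target[k] != panelist[k] for k in keys)


def match_sample(targets, panel, keys):
    """For each sampled target, pick the closest still-available panelist."""
    available = list(panel)
    matched = []
    for target in targets:
        if not available:
            break  # panel exhausted; remaining targets stay unmatched
        best = min(available, key=lambda p: mismatch_count(target, p, keys))
        matched.append(best)
        available.remove(best)  # each panelist is used at most once
    return matched


# Hypothetical records for illustration only.
keys = ("gender", "age_group")
targets = [{"gender": "F", "age_group": "30-44"}]
panel = [
    {"id": 1, "gender": "M", "age_group": "30-44"},
    {"id": 2, "gender": "F", "age_group": "30-44"},
]
matched = match_sample(targets, panel, keys)
```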
The development of the PROMIS fatigue item bank began with a review of more than 80 fatigue questionnaires, resulting in a list of over 1000 potential fatigue items. A series of qualitative and quantitative methods were used to select and modify these potential items to create item banks characterized by items with a high degree of both precision and range (11, 26, 28, 40). The current version of the PROMIS item bank consists of 95 items that include 13 modified items from the FACIT fatigue scale (these items can be accessed at http://www.assessmentcenter.net/ac1/; the 82 PROMIS items included in the present study are labeled with the prefix FATEXP or FATIMP); the recall period and response options of the FACIT items were modified to fit the PROMIS standardization requirements (e.g. using a 7-day reporting period). We excluded these 13 FACIT items to avoid overlap with the original FACIT fatigue scale administered in the present study. This yielded a total of 82 items that did not include any of the original or modified FACIT or SF-36v2 items. Table 1 displays some sample items from the PROMIS fatigue item bank.
The 82 PROMIS fatigue items were categorized a priori into two sub-domains: the experience of fatigue and the impact of fatigue (described in 41). Factor analysis was used to assess the factor structure of the PROMIS fatigue item bank. Specifically, we used a bi-factor analysis that assessed a model of fatigue that included one general factor (i.e., overall fatigue) and two local sub-domains (i.e., experience and impact). The results of the bi-factor analysis showed that all items had higher loadings on the general factor than on the two local sub-domains, supporting the conclusion that fatigue is sufficiently unidimensional. However, investigators can decide whether to report one single fatigue score or two separate scores based upon their research or clinical applications (42). Details of the item bank development and factor analysis are reported in Lai et al (41). Additional detail regarding the application of a bi-factor model to health-related concerns can be found in Lai et al (42, 43) and Cook et al (44). Responses for each item were scored on a 1 to 5 rating scale and, where necessary, items were reverse scored so that higher scores always indicated greater fatigue. Scale scores for the full PROMIS item bank as well as the two sub-domains were calculated by averaging item response scores. The internal consistencies for the full PROMIS item bank and for the experience and impact subscales were all 0.99.
The FACIT-F is one of the most widely used self-report instruments to assess fatigue (16). Respondents are asked to complete 13 items and rate their fatigue on a 5-point scale. The validity and reliability of the scale have been reported across numerous studies (45–47) on patients with different conditions as well as in the US general population. The internal consistency of the FACIT-F in the present study was 0.96.
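The scoring steps above (reverse scoring on a 1 to 5 scale, then averaging item responses) can be sketched as follows. The internal consistency function assumes Cronbach's alpha, which the text does not name explicitly; the function names and the toy responses are hypothetical.

```python
from statistics import mean, variance


def reverse_score(response, low=1, high=5):
    """Flip a response on a low..high rating scale (1 <-> 5, 2 <-> 4)."""
    return low + high - response


def scale_score(responses, reversed_items=()):
    """Average the item responses after reverse scoring the flagged positions."""
    scored = [
        reverse_score(r) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return mean(scored)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-response lists."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: the item at index 1 is reverse scored before averaging.
example_score = scale_score([1, 5, 3, 3], reversed_items={1})
example_alpha = cronbach_alpha([[1, 1, 1], [3, 3, 3], [5, 5, 5]])
```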
The SF-36 version 2 (48, 49) assesses several aspects of functional status including physical, social, and psychological functioning. It was designed as a health survey that could be self-administered, would accurately depict overall health functioning and well-being across the range from full health through chronic illness, and could detect clinical change in health. Internal consistency for the four-item Vitality subscale has been reported at r = 0.86 (48). The internal consistency of the Vitality subscale in the present study was 0.92.
Participants were also administered approximately 21 auxiliary items consisting of global health rating items and sociodemographic variables including age, income, number of hospitalizations, disability days, whether they take prescribed medicines, body mass index, gender, race/ethnicity, marital status, education, and employment. There was also a series of questions about the presence and degree of limitations related to 25 chronic medical conditions (see Table 2).
Participants with no missing data on any item were included in the present analyses (n = 666); scale scores for each instrument were created by averaging item responses. First, we inspected correlations among the fatigue instruments. Second, we examined whether there were demographic differences in fatigue by performing independent samples t-tests and univariate ANOVAs (demographic characteristics = independent variables; mean fatigue ratings = dependent variables). When a significant main effect was found, post-hoc comparisons were examined. Due to the large number of possible comparisons, Tukey's Studentized Range (HSD) Test was used to control the Type I experiment-wise error rate. We also computed Cohen’s d (where d = .20 small, d = .50 medium, d = .80 large effect), defined as the difference between two means divided by their pooled standard deviation (50). Finally, we conducted five separate univariate ANOVAs, one per fatigue scale, with all four demographic variables as predictors. All data analyses were conducted using SAS version 9.1.
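The effect size definition above, together with the independent samples t-test, can be sketched in a few lines. The analyses reported here were run in SAS; the Python below is only an illustration of the formulas, with hypothetical function names and toy data, and it assumes the pooled-variance form of the t-test.

```python
import math
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return math.sqrt(pooled_var)


def cohens_d(a, b):
    """Difference between two means divided by their pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def independent_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom."""
    standard_error = pooled_sd(a, b) * math.sqrt(1 / len(a) + 1 / len(b))
    return (mean(a) - mean(b)) / standard_error, len(a) + len(b) - 2


# Toy ratings for two groups (not study data).
group_a = [2, 3, 4]
group_b = [1, 2, 3]
d_value = cohens_d(group_a, group_b)
t_value, df = independent_t(group_a, group_b)
```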
Table 3 displays the demographic characteristics. The present sample conformed adequately to the PROMIS target distributions. In terms of medical characteristics, Table 2 shows how many participants were ever told by a physician/health professional that they had a specific condition and how many participants were currently limited by this condition.
Table 4 presents the correlations among the three fatigue instruments. The PROMIS fatigue bank correlated highly with the FACIT-F (r = .95, p < .001) and the Vitality subscale of the SF-36v2 (r = .89, p < .001). Figure 1 presents histograms describing the distributions of each of the fatigue scales. Visual inspection of the scale scores showed a higher mean for the Vitality subscale compared to the other instruments. Paired samples t-tests demonstrated that the Vitality subscale mean was significantly higher than all of the other fatigue means (all p values < .001).
Men and women differed significantly on all fatigue scales (Table 5). Women reported higher levels of fatigue than men on the full PROMIS item bank (t(664) = 5.07, p < .0001, d = 0.40), the PROMIS impact items (t(664) = 4.80, p < .0001, d = 0.38), and the PROMIS experience items (t(664) = 5.33, p < .0001, d = 0.41). This result was also evident on the FACIT-F (t(664) = 4.74, p < .0001, d = 0.37) and the Vitality subscale (t(664) = 5.02, p < .0001, d = 0.38).
Married participants reported significantly less fatigue compared to their unmarried counterparts. This result was evident for the full PROMIS item bank (t(664) = 2.78, p < .01, d = 0.22), the impact items (t(664) = 2.61, p < .01, d = 0.20), the experience items (t(664) = 2.95, p < .01, d = 0.23), the FACIT-F (t(664) = 3.01, p < .01, d = 0.25), and the Vitality subscale (t(664) = 2.93, p < .01, d = 0.24).
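For a pooled-variance independent samples t-test, Cohen's d can be recovered from the t statistic as d = t × sqrt(1/n1 + 1/n2). The sketch below applies this relation to the t value reported for the full PROMIS item bank. The group sizes are a hypothetical equal split of the 666 participants, used only for illustration; the actual numbers of men and women are given in Table 3.

```python
import math


def d_from_t(t, n1, n2):
    """Cohen's d implied by a pooled-variance t statistic and the group sizes."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Hypothetical equal split of n = 666; t value taken from the text.
d_full_bank = d_from_t(5.07, 333, 333)
```

Under this assumed split the result is about 0.39, close to the reported d of 0.40; the exact value depends on the actual group sizes.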
Participants’ age was negatively but weakly correlated with the full PROMIS bank (r = −.12, p < .01), the experience and impact items (r = −.14, p < .001 and r = −.11, p < .05, respectively), the FACIT-F (r = −.14, p < .001), and the Vitality scale (r = −.18, p < .0001). Univariate ANOVAs on the full PROMIS item bank yielded a significant main effect for participants’ age (F(4, 661) = 4.81, p < .001). This effect was also evident for the impact (F(4, 661) = 3.98, p < .01) and experience items (F(4, 661) = 5.93, p = .0001) as well as the FACIT-F (F(4, 661) = 4.58, p < .01) and the Vitality subscale (F(4, 661) = 8.56, p < .0001). Younger participants between 30 and 44 years of age reported higher levels of fatigue on all scales compared to participants between 60 and 74 years (p < .05, d range = 0.40 to 0.55) and participants 75 years and older (p < .05, d range = 0.37 to 0.62). In addition, participants between 45 and 59 years of age reported higher levels of fatigue on the PROMIS experience items and the Vitality subscale compared to participants between 60 and 74 years (p < .05, d = 0.33 and 0.36, respectively) and participants 75 years and older (p < .05, d = 0.28 and 0.44, respectively). Finally, participants younger than 30 years rated their fatigue higher than participants 75 years and older (p < .05, d = 0.26 and 0.47, respectively).
Figure 2 depicts the mean levels of fatigue by the individual age groups. The pattern of mean levels was suggestive of a cubic trend, which was confirmed in subsequent analyses. This result was evident for the full PROMIS bank (t(665) = 3.44, p < .001), the PROMIS experience bank (t(665) = 3.67, p < .001), the PROMIS impact bank (t(665) = 3.23, p < .01), the FACIT-F (t(665) = 2.73, p < .01), and the Vitality subscale (t(665) = 2.95, p < .01).
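One standard way to test a cubic trend across five ordered groups is an orthogonal polynomial contrast. The sketch below illustrates that approach under the assumption of equally spaced groups and a pooled within-group error term; the exact procedure used for the analyses reported here is not specified in the text, and the function name and toy data are hypothetical.

```python
import math
from statistics import mean

# Orthogonal polynomial coefficients for a cubic trend over 5 equally spaced levels.
CUBIC = (-1, 2, 0, -2, 1)


def cubic_trend_t(groups):
    """t statistic and error df for the cubic contrast across five groups."""
    means = [mean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    df_error = n_total - len(groups)
    # Pooled within-group mean square error.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = sse / df_error
    estimate = sum(c * m for c, m in zip(CUBIC, means))
    standard_error = math.sqrt(
        mse * sum(c * c / len(g) for c, g in zip(CUBIC, groups))
    )
    return estimate / standard_error, df_error


# Toy ratings by age group (not study data): means of 2, 4, 3, 2, 4.
toy_groups = [[1, 3], [3, 5], [2, 4], [1, 3], [3, 5]]
t_cubic, df_cubic = cubic_trend_t(toy_groups)
```

A purely linear pattern of group means yields a contrast estimate of zero, so a large t value indicates a cubic component in the group means.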
ANOVAs on the full PROMIS item bank yielded a significant main effect for participants’ educational degree (F(3, 662) = 4.99, p < .01). The same effect was evident for the PROMIS impact (F(3, 662) = 4.77, p < .01) and experience items (F(3, 662) = 5.10, p < .01) as well as the FACIT-F (F(3, 662) = 5.74, p < .001) and the Vitality subscale (F(3, 662) = 6.59, p < .001). Participants with an advanced degree (i.e., a master’s or doctoral-level degree) had significantly less fatigue than participants with some college education (p < .05, d range = 0.30 to 0.34) and those with education up to high school (p < .0001, d range = 0.47 to 0.59) across all five scales.
Finally, we conducted five separate univariate ANOVAs with all four demographic variables as predictors for each of the scales. When entered simultaneously, each of the demographic variables continued to make significant and independent contributions in the prediction of fatigue. This result was evident for all five fatigue scales (i.e., age: p values ranging from <.0001 to .01; gender: all p values <.001; marital status: p values ranging from .01 to <.05; education: all p values <.01). However, the total amount of variance explained by the four predictors was small and ranged between 8% and 10%.
We administered the newly developed PROMIS fatigue scale and two established fatigue instruments to investigate the association between fatigue and demographic characteristics in a US general population sample. Our first goal was to examine correlations between the new PROMIS instrument and two widely used fatigue legacy scales, the FACIT-F and the Vitality subscale of the SF-36v2. The overall pattern of results demonstrates the similarity between the measures. The three scales correlated highly with one another. Although many of the PROMIS items were derived from well-established extant fatigue measures, the items in the present paper did not include any items of the FACIT or SF-36v2. Nevertheless, the consistency across the fatigue measures supports the validity of the new PROMIS instrument. The mean for the Vitality subscale was significantly higher than the means for the PROMIS item bank and the FACIT-F. We can only speculate as to why this was the case. It may be that the difficulty level of the Vitality subscale items is lower than that of the remaining instruments yielding higher mean scores. Furthermore, 50% of the Vitality subscale items pertain to an individual’s energy level compared to around 8% of the PROMIS items and around 9% of the FACIT items. These items may tap into a different aspect of fatigue potentially contributing to the fact that the Vitality subscale behaved somewhat differently from the other instruments.
Second, we wanted to replicate previously observed associations between fatigue, gender, and marital status (2, 34–37). As expected, women reported higher levels of fatigue than men on all scales. While some have suggested that gender differences may be an artifact of sampling strategy and may reflect differences in illness behavior rather than differences in the experience of fatigue (1), studies across various settings (e.g., medical and community) have consistently documented a higher prevalence of fatigue in women compared to men (1, 2, 51–53). Predisposing vulnerabilities, such as endocrine and stress-related factors (1) as well as social-contextual determinants (54), have been proposed to explain this phenomenon; however, further investigation of the specific risk factors and how they may vary across the lifespan is warranted (7). We also found that married individuals reported significantly less fatigue compared to unmarried people on all scales. This result is consistent with other literature investigating the role of romantic relationships in the experience of fatigue (31, 34, 37). Shared responsibility of household tasks and duties may alleviate feelings of mental and physical tiredness and exhaustion. That marriage can act as a buffer against health complaints and somatic symptoms is not unique to the study of fatigue. For example, being single has been found to be associated with elevated depressed symptomatology in chronic pain (55), decreased quality of life in coronary heart disease (56), and higher mortality rates in the elderly (57).
Finally, we also explored the relation of fatigue to age and education and found that older age and higher academic degree were associated with less fatigue. Previous literature has reported inconsistent results in this regard (32, 35, 51, 54, 30). The divergent findings may at least partly be explained by variations in sampling strategy, sample size, and study setting. In addition, the distributions of these two demographic descriptors vary to a great extent across studies and have produced different cut-offs to define cohorts and educational domains. The PROMIS sampling strategy included a wide range of age groups with a balanced proportion of individuals in each group. Participants in the 30–44 years age group had the highest fatigue ratings on all scales. This may be partly explained by the fact that this age group has to juggle multiple demands from their involvement in the work force to raising children, etc. It is important to note, however, that the correlations between age and fatigue, albeit significant, were very low, which raises caution about the clinical meaningfulness of this relationship.
This study has several limitations. Our design was cross-sectional and cannot provide insight into the development of fatigue over time. Our sample was drawn from the general US community but participants were not free of self-reported health complaints and medical conditions. The low rates of current medical limitations in the present sample precluded analyses involving types of conditions; future research would benefit from examining differences in fatigue levels by disease type and comparisons of fatigue ratings between healthy and medical samples. The majority of our participants were recruited via the internet; it is unclear if our results generalize to other recruitment settings. This recruitment strategy also precluded examination of differences between responders and non-responders. Finally, detailed information about participants’ sleep cycle (such as work hours) that may have impacted their fatigue ratings was not collected.
In terms of strengths, the new PROMIS instrument proved successful for the measurement of fatigue in a representative sample of the US general population and has some advantages in comparison to other instruments. For example, it is being jointly normed to the US population in conjunction with the other PROMIS domains. The PROMIS item banks can be administered via multiple modes of administration (e.g., paper and pencil, web-based). In addition, all items included in the final item banks are calibrated by using sophisticated IRT models. Advantages of using IRT calibrated item banks include flexibility in choice of questions, ability to administer a brief-yet-precise individualized tailored assessment (real-time clinical monitoring) via CAT, and availability of multiple short-forms. Since items included in the short-forms and CAT are calibrated onto the same continuum using IRT models, results are comparable (28, 30).
The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Evanston Northwestern Healthcare, PI: David Cella, PhD, U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, PhD, U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, PhD, U01AR52170; and University of Washington, PI: Dagmar Amtmann, PhD, U01AR52171). NIH Science Officers on this project have included Deborah Ader, Ph.D., Susan Czajkowski, PhD, Lawrence Fine, MD, DrPH, Laura Lee Johnson, PhD, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, Susana Serrate-Sztein, MD, and James Witter, MD, PhD. This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. See the website at www.nihpromis.org for additional information on the PROMIS cooperative group. AAS is a consultant for invivodata, inc. and a senior scientist at the Gallup Organization.