Preference-weighted health-related quality-of-life (HRQoL) indexes produce a summary score from discrete health states determined by questions falling into several attributes, such as pain and mobility. Values of HRQoL are used alongside other health outcomes to monitor the health of populations.
The purpose of this study was to examine, among US adults, the underlying factor structure of HRQoL attribute scores across 5 indexes of HRQoL: EuroQol-5 Dimension, Health Utilities Index Mark 2, Health Utilities Index Mark 3, Short Form-6 Dimension, and Quality of Well-Being Scale Self-Administered form.
The National Health Measurement Study surveyed a nationally representative sample of 3844 noninstitutionalized adults aged 35 to 89 years residing in the continental US. Simultaneous data on all 5 indexes were collected cross-sectionally from June 2005 to August 2006. Exploration of underlying dimensions of HRQoL was done by categorical exploratory factor analysis of HRQoL indexes' attribute scores. Item response theory was applied to explore the amount of information HRQoL attributes contribute to the underlying latent dimensions.
Three main dimensions of HRQoL emerged: physical, psychosocial, and pain. Most HRQoL index attributes contributed to the physical or psychosocial dimension. The 3 dimensions were correlated: 0.47 (physical and psychosocial), 0.57 (physical and pain), 0.46 (psychosocial and pain). Some HRQoL index attributes (HUI3 hearing, speech, and vision) displayed relatively more unique variance, and some contributed to more than 1 dimension. The identified factor structure fit the HRQoL data well (Comparative Fit Index = 0.98, Tucker-Lewis Index = 0.98, and Root Mean Square Error of Approximation = 0.042).
The attributes of 5 commonly used HRQoL indexes share 3 underlying latent dimensions of HRQoL: physical, psychosocial, and pain.
Health-related quality-of-life (HRQoL), a broad and multidimensional construct, is measured as a single score by several generic indexes constructed to reflect community preferences for the health state reported by a person.1 Five indexes commonly used in the United States are EuroQol-5 Dimension (EQ-5D),2,3 the Health Utilities Index Mark 2 (HUI2)4 and Mark 3 (HUI3),5 the Short Form-6 Dimension (SF-6D),6,7 and the Quality of Well-Being Scale Self-Administered form (QWB-SA).8,9
For each index, the health state of an individual is determined by different attributes, with verbal descriptors reflecting the levels of the respective aspect of HRQoL. A person's level on an attribute is in turn based on responses to 1 or several items in the instrument used for the index. Items are summarized into attribute levels, and index scores are computed as a function of attribute levels. Hence, attributes are the building blocks of preference-based index scores.
Several attributes are conceptually similar across the indexes (physical activity/functioning and ambulation; emotion, mental health, and anxiety/depression; and pain), whereas others seem different. The HUI3 contains attributes for vision, hearing, speech, and dexterity, whereas SF-6D has attributes for role limitation, social functioning, and vitality. QWB-SA is the only generic preference-based measure with an attribute determined by a broad range of acute and chronic symptoms. Even similarly labeled attributes tend to differ in the wording of response-level descriptors and may not correspond to the same range of underlying health or activity limitation. The National Health Measurement Study (NHMS)1 is unique in having administered all 5 indexes simultaneously to a population-based sample and presents an opportunity to conduct an exploratory psychometric analysis of the attribute scores of the 5 most commonly used indexes of HRQoL.
Previous psychometric analyses have focused on non-preference scored HRQoL questionnaires. Disease-specific health status and HRQoL questionnaires have been shown to capture 2, 3, and even 4 dimensions of health.10–20 The most widely used HRQoL questionnaire, the SF-36, was analyzed using factor analytical techniques to derive summary scores for mental and physical health.14
The preference scoring algorithms of the 5 HRQoL indexes investigated here were not developed through psychometric methods but by econometric approaches, such as time-tradeoff (TTO), standard gamble (SG), or visual analogue scale (VAS) valuation techniques. The purpose of this study is not to produce alternative summary scores, but rather to examine what latent health dimensions are reflected in the HRQoL attribute scores, taking advantage of the simultaneous administration of the 5 indexes in the NHMS. After identifying relevant health dimensions through exploratory factor analysis, item response theory is used21,22 to determine where in the range of the underlying dimensions the specific descriptors used in each attribute and index contribute information.
The NHMS was a nationally representative random-digit-dialed telephone survey of 3844 noninstitutionalized adults between 35 and 89 years, residing in the continental US, administered by a computer-adaptive telephone interview between June 2005 and August 2006.1 Analytic weights to produce nationally representative estimates were developed based on the sampling strategy and poststratification by gender, race (black, white, and other), and age (35–44 years, 45–66 years, and older than 65 years). Four HRQoL questionnaires (EQ-5D, HUI2/3, SF-36v2,23 and QWB-SA) were administered in random order to all respondents.1
Ordinal attribute scores, coded so that higher levels indicate better health, were used in the analyses (Table 1). The SF-6D, HUI2, HUI3, and QWB-SA attribute levels are individually preference scored, leading to an ordering of their categorical descriptors. For the EQ-5D, ordering is inherent in the 3 levels for each of the 5 attributes.
Five items on the EQ-5D referring to health “today” represent attributes: mobility, anxiety/depression, pain/discomfort, self-care, and usual activities.2,3 Preference scoring for EQ-5D, developed using the time-trade-off, is available only at the summary score level. The ordinal attribute scores represented by single items were used for our analyses.
The HUI2 and HUI3 indexes, developed using the SG, are both estimated from the same HUI questionnaire,4,5 which refers to health “in the past week.” The HUI2 combines responses into 6 attributes (mobility, emotion, cognition, pain, self-care, and sensation) and the HUI3 into 8 attributes (ambulation, emotion, cognition, pain, vision, hearing, speech, and dexterity). Our analyses excluded HUI2 cognition, pain, emotion, and mobility because these differ from the HUI3 only by classifying item responses differently into levels of attributes. The HUI2 sensation was excluded from the analyses because HUI3 hearing, speech, and vision constitute the sensation attribute of HUI2. Categories 2 and 3 on HUI3 cognition have reversed but very similar preference scores (0.93 and 0.95) and were combined. The 8 HUI3 attribute scores with 5 to 6 levels and 1 HUI2 attribute score (self-care) with 4 levels were analyzed.
For the SF-6D,6,7 11 of the 36 questions in the SF-36v223 referring to health in the “past 4 weeks” are combined into 6 attributes: physical function, mental health, bodily pain, role limitation, social function, and vitality. These were preference scored using the SG. Some levels are assigned identical preference scores: levels 2 and 3 on physical functioning, pain, and mental health and levels 2 to 4 on role limitation and vitality.7 We combined identically scored levels and analyzed only distinct preference levels for each SF-6D attribute as ordinal variables with 2 to 5 levels.
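The combining of identically scored levels described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `collapse_levels` and the example preference scores are hypothetical and are not taken from the SF-6D scoring algorithm. Levels that share a preference score receive the same ordinal code, and higher codes indicate better health.

```python
def collapse_levels(level_scores):
    """Map each original level to an ordinal code based on its preference score.

    level_scores: dict of {original_level: preference_score}, where a higher
    score means better health. Levels with identical scores share one code.
    """
    # Distinct scores, sorted from worst (lowest) to best (highest)
    distinct = sorted(set(level_scores.values()))
    code_for_score = {score: i + 1 for i, score in enumerate(distinct)}
    return {level: code_for_score[score] for level, score in level_scores.items()}


# Hypothetical scores for a 5-level attribute in which levels 2 and 3 are tied
example_scores = {1: 1.00, 2: 0.95, 3: 0.95, 4: 0.90, 5: 0.80}
ordinal_codes = collapse_levels(example_scores)
```

In this hypothetical case the 5 original levels reduce to 4 distinct ordinal categories.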
The QWB-SA index, developed using a visual analogue scale, is a weighted summary of 4 preference-scored attributes (physical activity, self-care and mobility, self-care and usual activity, and chronic and acute symptoms) assessed during the past 3 days.8,9 The QWB-SA has 2 attributes that incorporate self-care, albeit by different items. The attribute scores of the QWB-SA take on a much larger number of discrete values than those of other indexes, with the number at or above the maximum capability of software for factor analysis of ordinal scales. Each of the 3 days was utility scored; hence, the 4 QWB-SA attribute scores were averaged over the 3 days, producing preference-based utility decrement weights ranging from 0 to 0.16 (12 unique values) on physical activity, 0 to 0.09 (10 unique values) on self-care and mobility, 0 to 0.10 (10 unique values) on self-care and usual activity, and 0 to 0.56 (485 unique values) on acute and chronic symptoms. To avoid sparsely populated levels and to produce attribute scores with numbers of levels similar to those of other indexes, categories were formed using breaks in each of the 4 scales and grouping of values as follows: physical activity into 5 levels (≤0, 0.02–0.05, 0.07–0.08, 0.10–0.11, ≥0.13), mobility into 3 levels (≤0, 0.01–0.02, ≥0.03), self-care and usual activities into 4 levels (≤0, 0.02–0.04, 0.05–0.07, ≥0.08), and acute and chronic symptoms into 10 levels (0–0.07, 0.07–0.19, 0.19–0.23, 0.23–0.27, 0.27–0.32, 0.32–0.36, 0.36–0.38, 0.38–0.41, 0.41–0.52, and 0.52–0.56).
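The grouping of averaged decrement weights into ordinal levels can be illustrated for the physical activity attribute. The sketch below is an assumption-laden example, not the study's recoding program: the cut points in `PHYSICAL_ACTIVITY_CUTS` are hypothetical values placed in the gaps between the value groups listed above, and the function name is invented for illustration. Because the weights are decrements, a larger weight maps to a lower (worse) level.

```python
from bisect import bisect_right

# Hypothetical cut points lying in the gaps between the reported groups
# (<=0, 0.02-0.05, 0.07-0.08, 0.10-0.11, >=0.13)
PHYSICAL_ACTIVITY_CUTS = [0.01, 0.06, 0.09, 0.12]


def physical_activity_level(decrement):
    """Return an ordinal level from 1 (worst) to 5 (best) for a decrement weight."""
    # Number of cut points at or below the decrement = number of groups passed
    groups_passed = bisect_right(PHYSICAL_ACTIVITY_CUTS, decrement)
    return len(PHYSICAL_ACTIVITY_CUTS) + 1 - groups_passed
```

For example, a decrement of 0 falls in the best level and a decrement of 0.13 or more falls in the worst level.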
The NHMS sample (N = 3844) was split into 2 random halves with 1922 observations each. One random half was used for exploratory factor analysis (EFA) and the other random half for confirmatory factor analysis (CFA).
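A random split of this kind can be sketched as follows. The seed, the function name `split_halves`, and the use of integer respondent identifiers are assumptions made for illustration; the source does not describe how the random halves were drawn.

```python
import random


def split_halves(ids, seed=0):
    """Shuffle respondent identifiers and return two equal-sized random halves."""
    rng = random.Random(seed)  # fixed seed (hypothetical) keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers 0..3843 standing in for the 3844 respondents
efa_half, cfa_half = split_halves(range(3844))
```

One half would then be used for the exploratory analysis and the other for the confirmatory analysis.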
We used analytic techniques specifically geared to discrete ordinal values, rather than approximations assuming these descriptors to be continuous and normally distributed. An EFA was conducted on all attribute scores using oblique (Promax) rotation and Mean-adjusted Weighted Least Squares robust estimation compatible with modeling ordinal data while using survey weights.24–28 We limited EFA to extract up to 4 factors.10 A factor structure solution was identified based on the eigenvalue rule (eigenvalues >1.0) and interpretability of the rotated loadings.29,30 Oblique rotation was chosen based on the expectation that dimensions of health would be associated.20,31–33
An ordered categorical variable CFA model was fit restricting the variables to load on the factor structure identified in the EFA and estimated with the Mean-adjusted Weighted Least Squares robust estimator, delta parameterization, and probit link.22 Attributes were allowed to cross load34,35 if EFA showed loading ≥0.25 on more than 1 dimension, and crossloading was dropped if the attribute CFA crossloading was <0.1. The CFA model, equivalent to a 2-parameter item response theory model, was used to examine the amount of information contributed to the underlying dimensions by the HRQoL attribute scores.21,22,36,37 All factor loadings and item thresholds were allowed to be estimated, whereas factor variances and means were fixed at 1 and 0, respectively.28
The CFA model fit was evaluated by commonly used goodness-of-fit statistics: Tucker-Lewis Index (TLI), Comparative Fit Index (CFI), and Root Mean Square Error of Approximation (RMSEA). TLI and CFI values ≥0.95 and RMSEA point estimate ≤0.05 are indicative of good model fit.28,30
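The fit criteria stated above amount to a simple decision rule, sketched here in Python. The function name `indicates_good_fit` is hypothetical; the cutoffs are those given in the text.

```python
def indicates_good_fit(cfi, tli, rmsea):
    """Apply the stated cutoffs: CFI and TLI >= 0.95, RMSEA point estimate <= 0.05."""
    return cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.05
```

Applied to the values reported in this article for the final model (CFI = 0.98, TLI = 0.98, RMSEA = 0.042), the rule is satisfied.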
We present graphs of item information to show the contributions of individual attributes and of all attributes in each index (except HUI2) combined, to measuring the identified dimensions. Information curves indicate the precision with which the latent dimension is estimated by the attribute scores. The information contributed by an attribute depends on (1) the loading of the attribute on the dimension and (2) the “severity levels” defined by the attribute categories.21 Differences in specific category levels may cause similar attributes to contribute in different ranges of the dimension, by the percentage of individuals that fall into the specific categories of each attribute at a given value of the dimension.21 The total index information curve for a dimension is obtained by adding the information of all index attributes that load on that dimension.
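To make the two determinants of information concrete, the following Python sketch computes an information curve for one ordinal attribute under a normal-ogive graded response model, using the standard result that item information equals the sum over categories of the squared derivative of the category probability divided by that probability. This is an illustration under assumptions, not the Mplus computation used in the study: the discrimination `a` (which grows with the loading), the thresholds (which play the role of the severity levels), and all numeric values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def item_information(theta, a, thresholds):
    """Information of one ordinal attribute at latent value theta.

    a: discrimination; thresholds: increasing category boundaries on the
    latent scale. Cumulative probabilities P(Y >= k) follow a probit curve.
    """
    cum = [1.0] + [norm_cdf(a * (theta - b)) for b in thresholds] + [0.0]
    dcum = [0.0] + [a * norm_pdf(a * (theta - b)) for b in thresholds] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]      # probability of category k
        dp = dcum[k] - dcum[k + 1]   # its derivative with respect to theta
        if p > 1e-12:                # skip categories with negligible probability
            info += dp * dp / p
    return info


def index_information(theta, attributes):
    """Total information: the sum over attributes loading on the dimension."""
    return sum(item_information(theta, a, bs) for a, bs in attributes)


# Hypothetical attribute whose category boundaries all lie below the mean
low_targeted = (1.5, [-2.0, -1.0, 0.0])
```

With thresholds concentrated below 0, this hypothetical attribute is more informative at a latent value of -1 than at 1.5, which is the pattern of an attribute whose descriptors target poorer health.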
Descriptive statistics and recoding of variables were done by the SAS/STAT System for Windows Version 9.1 (copyright 2002–2003 SAS Institute Inc., Cary, NC). All psychometric analyses were done in Mplus Version 5.2 using survey weights.37
The survey-weighted sample in this analysis had a distribution similar to that of the general US population1 by gender (women: 52.9%), age categories (35–44 years: 31.6%; 45–54 years: 23.8%; 55–64 years: 19.9%; 65–74 years: 14.2%; and 74–89 years: 10.6%), and race (white: 81.2%; black: 10.3%; and other: 7.6%) but included a greater proportion of respondents with a college degree or higher (40.9%) or with income above $35,000 (69.1%) (not shown).
Table 2 shows attributes organized in groups of similar content: physical, mental, pain, self-care/usual activities, psychosocial, sensation, and unique attributes. The EFA oblique rotated loadings of HRQoL attributes (Table 2) indicate a 3-factor model with a dominant factor and first 4 eigenvalues: 13.06, 1.71, 1.51, and 0.99. The 3 factors were interpreted as physical, psychosocial, and pain dimensions. Most HRQoL attributes had relatively larger positive rotated loadings on one of the factors. However, 15 attributes potentially loaded on more than 1 factor (bold EFA, Table 2).
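The eigenvalue rule applied to the 4 reported eigenvalues can be written as a short Python check. The function name `factors_to_retain` is hypothetical; the eigenvalues are those reported above. The rule is only one of the two criteria used, the other being interpretability of the rotated loadings.

```python
def factors_to_retain(eigenvalues, cutoff=1.0):
    """Count eigenvalues strictly greater than the cutoff (eigenvalue > 1.0 rule)."""
    return sum(1 for value in eigenvalues if value > cutoff)


reported_eigenvalues = [13.06, 1.71, 1.51, 0.99]
n_factors = factors_to_retain(reported_eigenvalues)  # 3
```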
Latent correlations between the factors were significant (P < 0.001), indicating these factors are not orthogonal: r = 0.47 (physical and psychosocial), r = 0.57 (physical and pain), and r = 0.46 (psychosocial and pain).
The CFA model was fit based on the factor structure identified in EFA (Table 2). The final CFA model was fit after removing regression paths with loadings <0.1. This CFA model fit the data well: CFI = 0.98, TLI = 0.98, and RMSEA = 0.042 (Table 2).
Published literature commonly supports 2-factor solutions,10–14,38 hence a 2-factor CFA (not bifactor) model was fit that retained the psychosocial dimension and restricted the remaining attributes to load on a combined second dimension (not shown). This model had a slightly worse fit than the final 3-factor CFA model: CFI = 0.97, TLI = 0.98, and RMSEA = 0.045. The χ² difference test37 for these nested models revealed that the restrictions imposed by the 2-factor model resulted in a significantly worse model fit (χ² = 107, df = 5, P < 0.0001).
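The P value attached to the reported test statistic can be checked with a short Python sketch that evaluates the upper tail of a χ² distribution using closed-form expressions for integer degrees of freedom. This only converts a reported statistic (107 on 5 degrees of freedom) into a tail probability; it does not reproduce the Mplus difference-testing procedure, which with robust weighted least squares estimators is not a simple subtraction of two model χ² values. The function name `chi2_sf` is an assumption.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in sqrt(x)
    chi = math.sqrt(x)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total


p_value = chi2_sf(107.0, 5)
```

The resulting tail probability is far below 0.0001, in line with the reported P value.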
Based on the final CFA model, information curves were built for each attribute and index (Figs. 1 and 2). In each graph the abscissa is 1 of the 3 dimensions scaled as a standardized latent variable with population mean 0 and standard deviation 1. Values less than 0 correspond to lower than average health in the population and values greater than 0 to above average health on the respective dimension. Hence, higher ranges of the latent physical, mental, and pain dimension spectrums reflect better physical and mental health and less pain. The contribution of the attribute score to the respective dimension is reflected in the height of the graph, and the height varies across the dimension according to how the attribute levels capture a given range of the dimension. Attributes that loaded on the pain dimension (SF-6D bodily pain; EQ-5D usual activities, pain/discomfort, and mobility; and HUI2 self-care, HUI3 pain, and dexterity; QWB-SA self-care/usual activities and physical activity) had scores that contributed most information in the middle range of their underlying continuum, in a region reflecting population average levels of pain. Attributes that loaded on the psychosocial dimension (SF-6D mental health, social functioning, vitality, and role limitation; EQ-5D anxiety/depression and self-care; HUI2 self-care; HUI3 cognition, emotion, speech, and dexterity; and QWB-SA self-care/usual activities and acute/chronic symptoms) had scores that contributed most information in the lower to middle range of the latent continuum. Most attributes loading on the physical dimension (SF-6D physical functioning, role limitation, and social functioning; EQ-5D mobility, self-care, and usual activities; HUI3 ambulation, hearing, and vision; and QWB-SA self-care/usual activities, physical activity, self-care/mobility, and acute/chronic symptoms) had scores that contributed the highest amount of information from the lower to middle range of the latent spectrum.
The QWB-SA contributed the most information on the latent physical dimension, whereas EQ-5D had most information up to the population mean of 0 on the latent pain dimension. The SF-6D had most information on the psychosocial dimension. On all dimensions, the SF-6D contributed slightly more information in the higher range than did the other indexes.
The aim of this article was to explore the factor structure of the pooled attributes of 5 commonly used HRQoL indexes, EQ-5D, HUI2, HUI3, SF-6D, and QWB-SA. Three related underlying dimensions of HRQoL emerged, which we interpreted as physical, psychosocial, and pain. The results, based on a US nationally representative sample of adults aged 35 years and older, indicate that the content of these indexes is highly related.
The dimensions we found were primarily defined by the expected specific attributes (Physical: SF-6D physical functioning, EQ-5D mobility, HUI3 ambulation, QWB-SA physical activity, and self-care/mobility; Pain: SF-6D, EQ-5D, and HUI3 pain; and Psychosocial: SF-6D mental health, EQ-5D anxiety/depression, and HUI3 emotion). Other attributes that contributed to a dimension could be seen as consequences of physical or pain issues, such as self-care, usual activities and social functioning, and tended to contribute to more than 1 dimension. Research on the factor structure of SF-36v2 attributes has also revealed that attributes may load on both physical and mental dimensions.38,39
Most literature on self-reported health supports 2 to 4 underlying dimensions of HRQoL.10 Hays and Stewart11 found dimensions of mental and physical health in their study of self-reported health. The commonly used Medical Outcomes Study SF-36, developed through factor analytic techniques, summarizes health into mental and physical components.14 This structure was retained in SF-36v2. Cella et al10 found physical and mental well-being dimensions in their factor analysis of 5 HRQoL questionnaires administered to people with cancer and HIV disease. Beyond physical and mental dimensions, the literature presents a third, social dimension, such as the interpersonal dimension found by Padilla et al10,15,40 and the social well-being dimension endorsed by the World Health Organization. The 3-factor model illuminates 3 distinct sources of common variance in the attribute scores of 5 HRQoL indexes. The HRQoL attributes did not capture a separate factor of social interaction in these analyses, although our model shows the attributes that may be tapping into a social construct (SF-6D role limitation and social functioning; EQ-5D self-care and usual activities; HUI2 self-care; and QWB-SA self-care/usual activities) contributing to more than 1 underlying dimension. A pain dimension arises primarily because most of the indexes contain a distinct attribute for pain. In the QWB-SA, pain is one of several symptoms grouped into 1 attribute and so perhaps diluted in impact.
This study is unique in performing factor analysis simultaneously on the attribute scores of 5 commonly used indexes of HRQoL. This is also the first time that information content of the HRQoL index attribute scoring has been analyzed to show how it reflects latent health at upper and lower ranges of their underlying health-related dimensions. The amount of information provided by HRQoL attributes is affected not only by loadings but also by the level of health problems reflected in the attribute descriptors, which tends to be on the lower side of the health range for most HRQoL indexes and attributes. These information curves reflect ordinal values in categorical scoring of attributes and do not reflect magnitudes of difference in actual utility scores. Utility scoring constitutes a rescaling of the information captured by the attribute scores. A high level of information of an attribute score in a certain range of a dimension does not necessarily imply a large impact on utility scores. On the other hand, a low level of information of an attribute score does imply that the corresponding part of the range of the underlying dimension is not well queried by the score and that the impact of the score on utility of health in that range of the dimension must be low.
Although the HRQoL attributes clearly share a substantial proportion of variance, there were attributes that had a relatively greater amount of unique variance. Unique variance may reflect not only measurement error but also aspects of health captured by a single attribute. On the physical dimension, the sensation attributes of HUI3 (hearing and vision) and acute and chronic symptom attributes of QWB-SA displayed unique variance. The dexterity attribute of HUI3 and cognition attribute of HUI3 had relatively more unique variance not explained by the pain and psychosocial dimensions, respectively. Each captures concepts distinguished by only 1 HRQoL index.
The relatively different construction of QWB-SA attributes merits further discussion. We noted earlier that the QWB-SA attribute of acute and chronic symptoms was fundamentally different because of its diverse item content. An exploratory factor analysis (EFA) of QWB-SA acute and chronic symptoms items (not shown) revealed that this attribute is a microcosm of at least all 3 dimensions, although most items are associated with a physical-related construct. This supports the crossloading of the QWB-SA attribute on both the physical and psychosocial dimensions. The QWB-SA attribute scores were also somewhat arbitrarily categorized for this analysis. Additional exploration of QWB-SA attribute scores (not shown) involved allowing up to 10 categories on the QWB-SA physical activity attribute, but this did not change its information curve.
Generally, our results indicate that the scoring of attributes tends to target the lower end of the latent health spectrum, particularly below the population mean. However, the SF-6D attribute scores capture a slightly higher range than those of other indexes on the latent dimensions. The 3 dimensions correlated significantly (r = 0.46−0.57, P < 0.001; EFA results), a finding consistent with other studies that reported mental and physical components correlating at 0.3 to 0.7 from version 1 of SF-36.31–33
Our analysis focuses on attributes as the basic building blocks for the indexes. Alternatively, the analysis could have been based on items. An analysis of items may reveal a greater number of factors, particularly greater unique variance, as all the item content may not be used in the attributes. Such an analysis might yield different information than ours and be more relevant to future modification of attribute levels. However, it would be less relevant to the indexes as presently computed. Several technical and legal issues would need to be overcome in an item-based analysis as the number of items differs greatly between indexes, skip conditions necessitate assumptions to be made for some items, and because the items of the HUI questionnaire are proprietary.
The institutionalized portion of the US population is not represented in our results. Hence, the full spectrum of health states, defined by the HRQoL indexes' classification systems, was not observed and the performance of the attributes in such populations cannot be assessed using this study (eg, HUI3 sensation and self-care of EQ-5D, HUI2, and QWB-SA). We are limited in generalizing our results to certain other subgroups of the US population because our sample is highly educated and has an unclear representation of respondents falling into various “other” race/ethnic subgroups. The correlations among the underlying axes may have limited our interpretation of these dimensions.
A strength of this study is the simultaneous use of 5 commonly used indexes of HRQoL allowing a greater breadth of HRQoL content to explore their underlying dimensionality. The ability to generalize the results to the adult US population (with the caveats noted above) is also a strength of the study. Overall, this study contributes useful information to users of HRQoL instruments. Particularly, (1) the joint content of these HRQoL measures captures physical, psychosocial, and pain dimensions of health and (2) HRQoL attributes do not differentiate among relatively healthy individuals as well as they do among relatively unhealthy individuals. Individuals at the less healthy end of the continuum have more information on all attributes than those who are relatively healthier. The finding of a 3-factor solution captured by the HRQoL attributes in this article opens a discussion about what the indexes cover jointly and what is unique about each.
Our analyses indicate that the joint attribute content of commonly used HRQoL indexes, EQ-5D, HUI2, HUI3, SF-6D, and QWB-SA, reflects 3 underlying health-related dimensions that can be interpreted as physical, psychosocial, and pain. These 3 dimensions and especially the physical and pain dimensions are related. Among these 5 indexes, most attributes loaded on the physical and psychosocial dimensions. Some HRQoL attributes, in particular attributes of sensation, dexterity, and cognition on HUI3, displayed relatively more unique variance not captured by the 3 main identified dimensions. HRQoL attribute scores had most information at the lower to middle range of the latent continuum of dimensions and least amount of information at the high end. Overall, the 3 dimensions of HRQoL captured the variability of most HRQoL attribute scores well. The appendix for this article, which defines the ordinal attribute variables for each of the five HRQoL indexes, is available online only as Supplemental Digital Content 1 at: http://links.lww.com/MLR/A96.
Supported by grant no. P01-AG020679 (to the University of Wisconsin, Madison) from the National Institute on Aging and grant no. T32 HS000046 (to the University of California, Los Angeles, and to the RAND Corporation, Santa Monica) from the Agency for Healthcare Research and Quality.
Presented at the 31st Annual Meeting of the Society for Medical Decision Making (October 18–21, 2009), Hollywood (Los Angeles), California.
Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Web site (http://www.lww-medicalcare.com).