To create upper-extremity and mobility subdomain scores from the Patient-Reported Outcomes Measurement Information System (PROMIS) physical functioning adult item bank.
Expert reviews were used to identify upper-extremity and mobility items from the PROMIS item bank. Psychometric analyses were conducted to assess empirical support for scoring upper-extremity and mobility subdomains.
Data were collected from the U.S. general population and multiple disease groups via self-administered surveys.
The sample (N=21,773) included 21,133 English-speaking adults who participated in the PROMIS wave 1 data collection and 640 Spanish-speaking Latino adults recruited separately.
Main Outcome Measures
We used English- and Spanish-language data and existing PROMIS item parameters for the physical functioning item bank to estimate upper-extremity and mobility scores. In addition, we fit graded response models to calibrate the upper-extremity items and mobility items separately, compare separate to combined calibrations, and produce subdomain scores.
After eliminating items because of local dependency, 16 items remained to assess upper extremity and 17 items to assess mobility. The estimated correlation between upper extremity and mobility was .59 using existing PROMIS physical functioning item parameters (r=.60 using parameters calibrated separately for upper-extremity and mobility items).
Upper-extremity and mobility subdomains shared about 35% of the variance in common, and produced comparable scores whether calibrated separately or together. The identification of the subset of items tapping these 2 aspects of physical functioning and scored using the existing PROMIS parameters provides the option of scoring these subdomains in addition to the overall physical functioning score.
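The "about 35%" figure follows directly from the reported correlation of .59, since the shared variance between two measures is the square of their correlation. A one-line check (illustrative arithmetic only):

```python
# Shared variance between the two subdomains, from the correlation reported above.
r = 0.59                                # estimated upper-extremity/mobility correlation
shared_variance = r ** 2                # proportion of variance shared
print(round(shared_variance * 100, 1))  # prints 34.8, i.e., "about 35%"
```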
Lower extremity; Psychometrics; Rehabilitation; Upper extremity
To estimate the effect of survey mode (mail versus telephone) on reports and ratings of hospital care.
Data Sources/Study Setting
The total sample included 20,826 patients discharged from a group of 24 distinct hospitals in three states (Arizona, Maryland, New York). We collected CAHPS® data in 2003 by mail and telephone from 9,504 patients, of whom 39 percent responded by telephone and 61 percent by mail.
We estimated mode effects in an observational design, using both propensity score blocking and (ordered) logistic regression on covariates. We used variables derived from administrative data (either included as covariates in the regression function or used in estimating the propensity score) grouped in three categories: individual characteristics, characteristics of the stay and hospital, and survey administration variables.
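As a hedged sketch of the propensity-score blocking described above (not the study's actual code; the covariates, coefficients, and data below are invented for illustration), one might estimate a survey-mode effect by modeling the propensity to respond by telephone and then averaging mode differences within propensity strata:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
age = rng.normal(70, 10, n)              # stand-in individual characteristic
stay_length = rng.exponential(4, n)      # stand-in characteristic of the stay
X = np.column_stack([age, stay_length])

# Mode "assignment" depends on covariates, as in an observational design.
p_phone = 1 / (1 + np.exp(-(0.05 * (age - 70) - 0.1 * (stay_length - 4))))
phone = rng.binomial(1, p_phone)

# Outcome: probability of a "yes" report, with a built-in +0.10 mode effect.
y = rng.binomial(1, np.clip(0.6 + 0.1 * phone + 0.002 * (age - 70), 0, 1))

# Stage 1: estimate each patient's propensity to respond by telephone.
ps = LogisticRegression().fit(X, phone).predict_proba(X)[:, 1]

# Stage 2: block on propensity quintiles; average within-block mode differences.
blocks = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
effects = []
for b in range(5):
    m = blocks == b
    if phone[m].sum() and (1 - phone[m]).sum():   # both modes present in block
        effects.append(y[m & (phone == 1)].mean() - y[m & (phone == 0)].mean())
effect = float(np.mean(effects))   # blocked estimate of the mode effect
```

With the simulated +0.10 effect, the blocked estimate lands near 0.10, analogous in spirit to the percentage-point differences reported below.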
Data Collection/Extraction Methods
We mailed a 66-item questionnaire to everyone in the sample and followed up by telephone with those who did not respond.
We found significant (p<.01) mode effects for 13 of the 21 questions examined in this study. The maximum magnitude of the survey mode effect was an 11 percentage-point difference in the probability of a “yes” response to one of the survey questions. Telephone respondents were more likely to rate care positively and health status negatively, compared with mail respondents. Standard regression-based case-mix adjustment captured much of the mode effects detected by propensity score techniques in this application.
Telephone mode increases the propensity for more favorable evaluations of care for more than half of the items examined. This suggests that mode of administration should be standardized or carefully adjusted for. Alternatively, further item development may minimize the sensitivity of items to mode of data collection.
Patient evaluations of hospital care; CAHPS; propensity score; mode effects
To create an efficient imputation algorithm for imputing the SF-12 physical component summary (PCS) and mental component summary (MCS) scores when patients have one to eleven SF-12 items missing.
Primary data collection was performed between 1996 and 1998.
Multi-pattern regression was conducted to impute the scores using only available SF-12 items (simple model), and then supplemented by demographics, smoking status and comorbidity (enhanced model) to increase the accuracy. A cut point of missing SF-12 items was determined for using the simple or the enhanced model. The algorithm was validated through simulation.
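A minimal sketch of the multi-pattern idea in the "simple model" (this is not the authors' algorithm or the real SF-12 scoring; items, scores, and missingness here are synthetic): for each observed pattern of missing items, fit a regression of the summary score on whichever items are available, using complete cases, and predict for cases with that pattern.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 6                              # 6 stand-in "items"
items = rng.normal(0, 1, (n, k))
score = items.mean(axis=1) * 10 + 50       # stand-in summary score

# Knock out items at random to create missingness patterns.
miss = rng.random((n, k)) < 0.1
obs_items = np.where(miss, np.nan, items)

imputed = np.full(n, np.nan)
complete = ~miss.any(axis=1)
for pattern in {tuple(row) for row in miss}:
    avail = ~np.array(pattern)                          # items observed in pattern
    rows = (miss == np.array(pattern)).all(axis=1)      # cases with this pattern
    if avail.all():                                     # nothing missing: keep score
        imputed[rows] = score[rows]
        continue
    # Pattern-specific regression fit on complete cases ("simple model").
    X = np.column_stack([np.ones(complete.sum()), items[complete][:, avail]])
    beta, *_ = np.linalg.lstsq(X, score[complete], rcond=None)
    Xp = np.column_stack([np.ones(rows.sum()), obs_items[rows][:, avail]])
    imputed[rows] = Xp @ beta
```

The "enhanced model" described above would simply append demographics, smoking status, and comorbidity indicators to each pattern's design matrix.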
A total of 30,308 patients from 63 physician groups were surveyed for a quality-of-care study in 1996, which collected the SF-12 and other information. The patients were classified as “chronic” patients if they reported that they had diabetes, heart disease, asthma/chronic obstructive pulmonary disease, or low back pain. A follow-up survey was conducted in 1998.
Thirty-one percent of the patients were missing at least one SF-12 item. Mean prediction variances and standard errors of the mean imputed scores increased with the number of missing SF-12 items. Correlations between the observed and the imputed scores derived from the enhanced model were consistently higher than those derived from the simple model, and the increments were significant for patients with ≥6 missing SF-12 items (p<.03).
Missing SF-12 items are prevalent and lead to reduced analytical power. Regression-based multi-pattern imputation using the available SF-12 items is efficient and can produce good estimates of the scores. The enhancement from the additional patient information can significantly improve the accuracy of the imputed scores for patients with ≥6 items missing, leading to estimated scores that are as accurate as those of patients with <6 missing items.
Health related quality of life; SF-12 health survey; imputation; validation
In 2004 NIH awarded contracts to initiate the development of high-quality psychological and neuropsychological outcome measures for improved assessment of health-related outcomes. The workshop introduced these measurement development initiatives, the measures created, and the NIH-supported resource (Assessment Center) for internet or tablet-based test administration and scoring. Presentations covered: (a) item response theory (IRT) and assessment of test bias, (b) construction of item banks and computerized adaptive testing, and (c) the different ways in which qualitative analyses contribute to the definition of construct domains and the refinement of outcome constructs. The panel discussion included questions about representativeness of samples and assessment of cultural bias.
To compare health care experiences of Medicare beneficiaries with and without symptoms of depression and investigate the role of patient confidence in shaping these experiences.
Data came from the 2009 CAHPS Medicare 4.0 Fee-for-Service (FFS) Survey, which was fielded to a national probability sample of 298,492 FFS Medicare beneficiaries.
Linear regression was used to model associations of depression with four global ratings and three composite measures of health care and to test whether beneficiaries' confidence in their ability to recognize the need for care mediates these associations.
Beneficiaries with depressive symptoms reported worse experiences with care across the full range of patient experience covered by the CAHPS survey. Depressive symptoms were associated with decreased patient confidence and decreased confidence was in turn associated with poorer reports of care.
Our study identifies depressive symptoms as a risk factor for poorer experiences of health care and suggests that depressed patients' confidence in recognizing their need for care is a promising target for programs to improve the health care of this population.
Depression; Medicare population; patient confidence; patient experience
Despite the increasing use of panel surveys, little is known about the differences in data quality across panels.
The aim of this study was to characterize panel survey companies and their respondents based on (1) the timeliness of response by panelists, (2) the reliability of the demographic information they self-report, and (3) the generalizability of the characteristics of panelists to the US general population. A secondary objective was to highlight several issues to consider when selecting a panel vendor.
We recruited a sample of US adults from 7 panel vendors using identical quotas and online surveys. All vendors met prespecified inclusion criteria. Panels were compared on the basis of how long the respondents took to complete the survey from time of initial invitation. To validate respondent identity, this study examined the proportion of consented respondents who failed to meet the technical criteria, failed to complete the screener questions, and provided discordant responses. Finally, characteristics of the respondents were compared to US census data and to the characteristics of other panels.
Across the 7 panel vendors, 2% to 9% of panelists responded within 2 days of invitation; however, approximately 20% of the respondents failed the screener, largely because of the discordance between self-reported birth date and the birth date in panel entry data. Although geographic characteristics largely agreed with US Census estimates, each sample underrepresented adults who did not graduate from high school and/or had annual incomes less than US $15,000. Except for 1 vendor, panel vendor samples overlapped one another by approximately 20% (ie, 1 in 5 respondents participated through 2 or more panel vendors).
The results of this head-to-head comparison provide potential benchmarks in panel quality. The issues to consider when selecting panel vendors include responsiveness, failure to maintain sociodemographic diversity and validated data, and potential overlap between panels.
survey methods; community surveys; sampling bias; selection bias; Internet; data sources
Consumer assessment of health care is an important metric for evaluating quality of care. These assessments can help purchasers, health plans and providers deliver care that fits patients’ needs.
To examine differences in reports and ratings of care delivered to adults and children and whether they vary by site.
This observational study compares adult and child experiences with care at a large west coast medical center and affiliated clinics and a large mid-western health plan using Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Clinician & Group 1.0 Survey data.
Office staff helpfulness and courtesy were perceived more positively for adult than pediatric care in the west coast site. In contrast, more positive perceptions of pediatric care were observed in both sites for coordination of care, shared decision making, overall rating of the doctor, and willingness to recommend the doctor to family and friends. In addition, pediatric care was perceived more positively in the Midwest site for access to care, provider communication, and office staff helpfulness and courtesy. The differences between pediatric care and adult care were larger in the mid-western site than the west coast site.
There are significant differences in the perception of care for children and adults, with care provided to children tending to be perceived more positively. Further research is needed to identify the reasons for these differences and to provide more definitive information at sites throughout the U.S.
patient evaluation of care; consumer assessment of health care; quality of care; patient satisfaction
In 2008, HealthPlus of Michigan introduced an online primary care provider (PCP) report that displays clinical quality data and patients’ ratings of their experiences with PCPs on a public website.
Design and Procedure
A randomized encouragement design was used to examine the impact of HealthPlus’s online physician quality report on new plan members’ choice of a PCP. This study evaluated the impact of an added encouragement to utilize the report by randomizing half of new adult plan members in 2009–2010 who were required to select a PCP (N = 1347) to receive a one-page letter signed by the health plan’s chief medical officer emphasizing the importance of the online report and a brief phone call reminder. We examined use of the report and the quality of PCPs selected by participants.
Twenty-eight percent of participants in the encouragement condition versus 22% in the control condition looked at the online report prior to selecting a PCP. Although participants in the encouragement condition selected PCPs with higher patient experience ratings than did control participants, this difference was not explained by their increased likelihood of accessing the online report.
Health plan members can be encouraged successfully to access physician-level quality data using an inexpensive letter and automated phone call. However, a large proportion of missing data in HealthPlus’s online report may have limited the influence of the physician-quality report on consumer choice.
CAHPS; choice of PCP; encouragement design; physician-level quality data; physician quality report
This study examined the physical and mental health of 126,685 men and women age 65 or over, with and without cancer, who completed a Medicare Health Outcomes Survey (MHOS) between 1998 and 2002. Cancer information was ascertained through NCI’s Surveillance, Epidemiology and End Results (SEER) program and linked to MHOS data. Results indicated that across most cancer types, cancer patients reported significantly more comorbid conditions and poorer physical and mental health compared with patients without cancer. Negative associations were most pronounced in those with two or more comorbidities and in those diagnosed with cancer within the past year.
Adjust for subgroup differences in extreme response tendency (ERT) in ratings of health care, which otherwise obscure disparities in patient experience.
117,102 respondents to the 2004 Consumer Assessment of Healthcare Providers and Systems (CAHPS) Medicare Fee-for-Service survey.
Multinomial logistic regression is used to model respondents' use of extremes of the 0–10 CAHPS rating scales as a function of education. A new two-stage model adjusts for both standard case-mix effects and ERT. Ratings of subgroups are compared after these adjustments.
Medicare beneficiaries with greater educational attainment are less likely to use both extremes of the 0–10 rating scale than those with less attainment. Adjustments from the two-stage model may differ substantially from standard adjustments and resolve or attenuate several counterintuitive findings in subgroup comparisons.
Addressing ERT may be important when estimating disparities or comparing providers if patient populations differ markedly in educational attainment. Failures to do so may result in misdirected resources for reducing disparities and inaccurate assessment of some providers. Depending upon the application, ERT may be addressed by the two-stage approach developed here or through specified categorical or stratified reporting.
Health disparities; education; vulnerable populations; response bias
The Centers for Medicare and Medicaid Services will introduce the reporting of patient surveys in 2008. The Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Hospital Survey contains 18 questions about hospital care. The internal consistency reliability of the discharge information scale is relatively low, and some important domains of care are not represented.
To determine whether adding questions increases the reliability and validity of the survey.
Data Sources and Study Setting
Surveys of patients at 181 hospitals participating in the California Hospitals Assessment and Reporting Taskforce (CHART), an initiative for voluntary public reporting of hospital performance in California.
CHART added nine questions to the CAHPS Hospital Survey; two to improve reliability of the discharge information domain, five to create a coordination of care domain, and two relating to interpreter services.
Surveys were sent to randomly selected patients from each CHART hospital.
A total of 40,172 surveys were included. Adding the new discharge information questions improved the internal consistency reliability from 0.45 to 0.72 and the hospital-level reliability from 0.75 to 0.81. New coordination of care composites had good internal consistency reliabilities ranging from 0.58 to 0.70 and hospital-level reliabilities ranging from 0.84 to 0.87. The new coordination of care composites were more closely correlated with overall hospital ratings and willingness to recommend than six of the seven original domains.
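The internal consistency figures quoted above are Cronbach's alpha. A self-contained illustration of the statistic (the item data below are synthetic, not CHART responses):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items matrix of numeric responses."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(3)
trait = rng.normal(0, 1, 400)
# Four correlated stand-in items sharing a common factor plus noise.
items = trait[:, None] + rng.normal(0, 1, (400, 4))
alpha = cronbach_alpha(items)   # around 0.8 for this signal-to-noise setup
```

Adding items that load on the same factor raises alpha, which is the mechanism behind the improvement from 0.45 to 0.72 reported for the discharge information scale.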
The additional discharge information questions and the new coordination of care questions significantly improved the psychometric properties of the CAHPS Hospital Survey.
Patient experience; hospital performance; public reporting; quality of care; CAHPS® hospital survey
To study the associations of eye diseases and visual symptoms with the most widely used health-related quality of life (HRQOL) generic profile measure.
HRQOL was assessed using the SF-36® version 1 survey administered to a sample of patients receiving care provided by a physician group practice association.
Eye diseases, ocular symptoms, and general health were assessed in a sample of patients from 48 physician groups. A total of 18,480 surveys were mailed out and 7,093 returned; 5,021 of these had complete data. Multiple linear regression models were used to examine the decrements in self-reported physical and mental health associated with eye diseases and symptoms, including trouble seeing and blurred vision.
Nine percent of the respondents had cataracts, 2% had age-related macular degeneration, 2% glaucoma, 8% blurred vision, and 13% trouble seeing. Trouble seeing and blurred vision both had statistically unique associations with worse scores on the SF-36 mental health summary score. Only trouble seeing had a significant association with the SF-36 physical health summary score. While these ocular symptoms were significantly associated with SF-36® scores, having an eye disease (cataracts, glaucoma, macular degeneration) was not, after adjusting for other variables in the model.
Our results suggest an important link between visual symptoms and general HRQOL. The study extends the findings of prior research to show that both trouble seeing and blurred vision have independent, measurable associations with HRQOL, while the presence of specific eye diseases may not.
We evaluate the effects of mode and order of administration on health-related quality of life (HRQOL) scores.
We analyzed HRQOL data from the Clinical Outcomes and Measurement of Health Study (COMHS). In COMHS, we enrolled patients with heart failure or cataracts at three sites (University of California, San Diego, UCLA, and University of Wisconsin). Patients completed self-administered HRQOL instruments at baseline and months 1 and 6 post-baseline, including the EQ-5D, Health Utilities Index (HUI), Quality of Well-Being Scale—self-administered (QWB-SA) and the SF-36v2™. At the 6 month follow-up, individuals were randomized to mail or telephone administration first, followed by the other mode of administration. We used repeated measures mixed effects models, adjusting for site, patient age, education, gender and race.
The sample included 121 individuals entering a heart failure program and 326 individuals scheduled for cataract surgery who completed the survey by mail or phone at the 6-month follow-up. The majority of the sample was female (53%) and white (86%). About a quarter of the sample had a high school education or less (26%). The average age was 66 (range, 36–91). HRQOL scores were higher (more positive) for phone administration following mail administration. The largest differences in scores between phone and mail responses occurred for comparisons of telephone responses from those randomized to a mail survey first with mail responses from those randomized to a telephone survey first (i.e., mode effects for responses given on the second administration of the HRQOL measures). The QWB-SA was the only measure that did not display this pattern of mode effects. The biggest differences between modes were 4 points on the SF-36v2™ Physical Health and Mental Health Component Summary Scores, 0.06 on the SF-6D, 0.03 on the QWB-SA, 0.08 on the EQ-5D, 0.04 on the HUI2, and 0.10 on the HUI3.
Telephone administration yields significantly more positive HRQOL scores for all of the generic HRQOL measures except for the QWB-SA. The magnitude of effects was clearly important, with some differences as large as a half-standard deviation. These findings confirm the importance of considering mode of administration when interpreting HRQOL scores.
mode effects; HRQOL; generic measures
Assess proxy respondent effects on health care evaluations by Medicare beneficiaries.
110,215 respondents from the nationally representative 2001 CAHPS® Medicare Fee-for-Service Survey.
Study Design/Data Collection/Extraction Methods
We compare the effects of both proxy respondents and proxy assistance (reading, writing, or translating) on 23 “objective” report items and four “subjective” global measures of health care experiences using propensity-score-weighted regression. We assess whether proxy effects differ among spouses, other relatives, or nonrelatives.
Proxy respondents provide less positive evaluations of beneficiary health care experiences than otherwise similar self-reporting beneficiaries for more subjective global ratings (average effect of 0.21 standard deviations); differences are smaller for relatively objective and specific report items. Proxy assistance differences are similar, but about half as large. Reports from spouse proxy respondents are more positive than those from other proxies and are similar to what would have been reported by the beneficiaries themselves. Standard regression techniques may overestimate proxy effects in this instance.
One should treat proxy responses to subjective ratings cautiously. Even seemingly innocuous reading, writing, and translation by proxies may influence answers. Spouses may be accurate proxies for the elderly in evaluations of health care.
Beneficiary evaluation of health care experiences; methodological study; consumer reports
This study uses the Consumer Assessments of Healthcare Providers and Systems (CAHPS®) survey to examine the experiences of Hispanics enrolled in Medicare managed care. Evaluations of care are examined in relationship to primary language (English or Spanish) and region of the country.
CAHPS 3.0 Medicare managed care survey data collected in 2002.
The dependent variables consist of five CAHPS multi-item scales measuring timeliness of care, provider communication, office staff helpfulness, getting needed care, and health plan customer service. The main independent variables are Hispanic primary language (English or Spanish) and region (California, Florida, New York/New Jersey, and other states). Ordinary least squares regression is used to model the effect of Hispanic primary language and region on CAHPS scales, controlling for age, gender, education, and self-rated health.
Data Collection/Extraction Methods
The analytic sample consists of 125,369 respondents (82 percent response rate) enrolled in 181 Medicare managed care plans across the U.S. Of the 125,369 respondents, 8,463 (7 percent) were self-identified as Hispanic. The survey was made available in English and Spanish, and 1,353 Hispanics completed one in Spanish.
Hispanic English speakers had less favorable reports of care than whites for all dimensions of care except provider communication. Hispanic Spanish speakers reported more negative experiences than whites with timeliness of care, provider communication, and office staff helpfulness, but better reports of care for getting needed care. Spanish speakers in all regions except Florida had less favorable scores than English-speaking Hispanics for provider communication and office staff helpfulness, but more positive assessments for getting needed care. There were greater regional variations in CAHPS scores among Hispanic Spanish speakers than among Hispanic English speakers. Spanish speakers in Florida had more positive experiences than Spanish speakers in other regions for most dimensions of care.
Hispanics in Medicare managed care face barriers to care; however, their experiences with care vary by language and region. Spanish speakers (except in Florida) have less favorable experiences with provider communication and office staff helpfulness than their English-speaking counterparts, suggesting language barriers in the clinical encounter. On the other hand, Spanish speakers reported more favorable experiences than their English-speaking counterparts with the managed care aspects of their care (getting needed care and plan customer service). Medicare managed care plans need to address the observed disparities in patient experiences among Hispanics as part of their quality improvement efforts. Plans can work with their network providers to address issues related to timeliness of care and office staff helpfulness. In addition, plans can provide incentives for language services, which have the potential to improve communication with providers and staff among Spanish speakers. Finally, health plans can reduce the access barriers faced by Hispanics, especially among English speakers.
CAHPS; patient experiences; Medicare managed care; Hispanics; language; ethnic disparities; geographic variations
Item response theory (IRT) has a number of potential advantages over classical test theory in assessing self-reported health outcomes. IRT models yield invariant item and latent trait estimates (within a linear transformation), standard errors conditional on trait level, and trait estimates anchored to item content. IRT also facilitates evaluation of differential item functioning, inclusion of items with different response formats in the same scale, and assessment of person fit and is ideally suited for implementing computer adaptive testing. Finally, IRT methods can be helpful in developing better health outcome measures and in assessing change over time. These issues are reviewed, along with a discussion of some of the methodological and practical challenges in applying IRT methods.
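A minimal sketch of the "standard errors conditional on trait level" point, using an invented two-parameter logistic (2PL) item (the discrimination a and difficulty b below are arbitrary illustration values, not estimates from any bank):

```python
import numpy as np

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1 / (1 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information contributed by the item at theta."""
    p = p_endorse(theta, a, b)
    return a**2 * p * (1 - p)   # peaks where theta equals the difficulty b

theta = np.linspace(-3, 3, 7)
info = item_information(theta, a=1.5, b=0.0)
# The conditional standard error of the trait estimate is 1/sqrt(total
# information), so measurement is most precise near the item's difficulty.
```

This trait-dependent information function is also what computer adaptive testing exploits: the next item administered is the one most informative at the respondent's current trait estimate.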
item response theory; health outcomes; differential item functioning; computer adaptive testing
To identify a parsimonious subset of reliable, valid, and consumer-salient items from 33 questions asking for patient reports about hospital care quality.
CAHPS® Hospital Survey pilot data were collected during the summer of 2003 using mail and telephone from 19,720 patients who had been treated in 132 hospitals in three states and discharged from November 2002 to January 2003.
Standard psychometric methods were used to assess the reliability (internal consistency reliability and hospital-level reliability) and construct validity (exploratory and confirmatory factor analyses, strength of relationship to overall rating of hospital) of the 33 report items. The best subset of items from among the 33 was selected based on their statistical properties in conjunction with the importance assigned to each item by participants in 14 focus groups.
Confirmatory factor analysis (CFA) indicated that a subset of 16 questions proposed to measure seven aspects of hospital care (communication with nurses, communication with doctors, responsiveness to patient needs, physical environment, pain control, communication about medication, and discharge information) demonstrated excellent fit to the data. Scales in each of these areas had acceptable levels of reliability to discriminate among hospitals and internal consistency reliability estimates comparable with previously developed CAHPS instruments.
Although half the length of the original, the shorter CAHPS hospital survey demonstrates promising measurement properties, identifies variations in care among hospitals, and deals with aspects of the hospital stay that are important to patients' evaluations of care quality.
CAHPS hospital survey; patient self-reports; survey; hospital care; psychometric analysis; patient focus groups; confirmatory factor analysis
To examine the predictors of unit and item nonresponse, the magnitude of nonresponse bias, and the need for nonresponse weights in the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Hospital Survey.
A common set of 11 administrative variables (41 degrees of freedom) was used to predict unit nonresponse and the rate of item nonresponse in multivariate models. Descriptive statistics were used to examine the impact of nonresponse on CAHPS Hospital Survey ratings and reports.
Unit nonresponse was highest for younger patients and patients other than non-Hispanic whites (p<.001); item nonresponse increased steadily with age (p<.001). Fourteen of 20 reports and ratings of care had significant (p<.05) but small negative correlations with nonresponse weights (median −0.06; maximum −0.09). Nonresponse weights do not improve overall precision below sample sizes of 300–1,000 and are unlikely to improve the precision of hospital comparisons. In some contexts, case-mix adjustment eliminates most observed nonresponse bias.
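As a hedged illustration of what such nonresponse weights are (not the study's actual weighting model; covariates and data are simulated), each respondent is weighted by the inverse of a modeled response propensity estimated from administrative variables:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 5000
age = rng.normal(60, 15, n)        # stand-in administrative covariate
X = age.reshape(-1, 1)

# Simulate: younger patients respond less often, as in the findings above.
p_resp = 1 / (1 + np.exp(-(0.04 * (age - 60))))
responded = rng.binomial(1, p_resp).astype(bool)

# Model the response propensity and weight respondents by its inverse.
prop = LogisticRegression().fit(X, responded).predict_proba(X)[:, 1]
weights = 1.0 / prop[responded]                 # inverse-propensity weights
weights *= responded.sum() / weights.sum()      # normalize to respondent n
```

Reweighting pulls the respondent mean age back toward the full-sample mean; the trade-off flagged in the findings is that variable weights inflate variance, which is why they fail to improve precision at small hospital sample sizes.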
Nonresponse weights should not be used for between-hospital comparisons of the CAHPS Hospital Survey, but may make small contributions to overall estimates or demographic comparisons, especially in the absence of case-mix adjustment.
Bias; weighting; missing data; mean-squared error; design effects; satisfaction
To review the existing literature (1980–2003) on survey instruments used to collect data on patients' perceptions of hospital care.
Eight literature databases were searched (PubMed, MEDLINE Pro, MEDSCAPE, MEDLINEplus, MDX Health, CINAHL, ERIC, and JSTOR). We undertook 51 searches in each of the eight databases, for a total of 408 searches. The abstract of each identified publication was examined to determine its applicability for review.
Methods of Analysis
For each instrument used to collect information on patient perceptions of hospital care we provide descriptive information, instrument content, implementation characteristics, and psychometric performance characteristics.
The number of institutional settings and patients used in evaluating patient perceptions of hospital care varied greatly. The majority of survey instruments were administered by mail. Response rates varied widely from very low to relatively high. Most studies provided limited information on the psychometric properties of the instruments.
Our review reveals a diversity of survey instruments used in assessing patient perceptions of hospital care. We conclude that it would be beneficial to use a standardized survey instrument, along with standardization of the sampling, administration protocol, and mode of administration.
Patient reports of hospital care; patient satisfaction instruments; hospital quality; patient care
To estimate the associations among hospital-level scores from the Consumer Assessments of Healthcare Providers and Systems (CAHPS®) Hospital pilot survey within and across different services (surgery, obstetrics, medical), and to evaluate differences between hospital- and patient-level analyses.
CAHPS Hospital pilot survey data provided by the Centers for Medicare and Medicaid Services.
Responses to 33 questionnaire items were analyzed using patient- and hospital-level exploratory factor analytic (EFA) methods to identify patient-level and hospital-level composite structures for the CAHPS Hospital survey. The latter EFA was corrected for patient-level sampling variability using a hierarchical model. We compared the results of these analyses with each other and with separate EFAs conducted at the service level. To quantify the similarity of assessments across services, we compared correlations of different composites within the same service with those of the same composite across different services.
Cross-sectional data were collected during the summer of 2003 via mail and telephone from 19,720 patients discharged from November 2002 through January 2003 from 132 hospitals in three states.
Six factors provided the best description of inter-item covariation at the patient level. Analyses that assessed variability across both services and hospitals suggested that three dimensions provide a parsimonious summary of inter-item covariation at the hospital level. Hospital-level factor structures also differed across services; as much variation in quality reports was explained by service as by composite.
Variability of CAHPS scores across hospitals can be reported parsimoniously using a limited number of composites. There is at least as much distinct information in composite scores from different services as in different composite scores within each service. Because items cluster slightly differently in the different services, service-specific composites may be more informative when comparing patients in a given service across hospitals. When studying individual-level variability, a more differentiated structure is probably more appropriate.
CAHPS Hospital survey; factor analysis; hospital quality; hierarchical model
To describe translation and cultural adaptation procedures, and examine the degree of equivalence between the Spanish and English versions of the Agency for Healthcare Research and Quality's (AHRQ) Consumer Assessments of Healthcare Providers and Systems (CAHPS®) Hospital Survey (H-CAHPS®) of patient experiences with care.
Cognitive interviews on survey comprehension with 12 Spanish-speaking and 31 English-speaking subjects. Psychometric analyses of 586 responses to the Spanish version and 19,134 responses to the English version of the H-CAHPS survey tested in Arizona, Maryland, and New York in 2003.
A forward/backward translation procedure followed by committee review and cognitive testing was used to ensure a translation that was both culturally and linguistically appropriate. Responses to the two language versions were compared to evaluate equivalence and assess the reliability and validity of both versions.
Data Collection/Extraction Methods
Comparative analyses were carried out on the 32 items of the shortened survey version, focusing on 16 items that comprise seven composites representing different aspects of hospital care quality (communication with nurses, communication with doctors, communication about medicines, nursing services, discharge information, pain control, and physical environment); three items that rate the quality of the nursing staff, physician staff, and the hospital overall; and one item on intention to recommend the hospital. The other 12 items used in the analyses addressed mainly respondent characteristics. Analyses included item descriptives, correlations, internal consistency reliability of composites, factor analysis, and regression analysis to examine construct validity.
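Internal consistency reliability of a multi-item composite is conventionally summarized with Cronbach's alpha. A minimal sketch, using invented toy responses rather than the H-CAHPS data:

```python
def cronbach_alpha(items):
    """Cronbach's alpha.
    items: list of item score lists, all the same length (one score per respondent)."""
    k = len(items)

    def var(x):  # sample variance
        m = sum(x) / len(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)

    # Composite total per respondent (respondents are the columns of `items`).
    total = [sum(resp) for resp in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(total))

# Toy example: 3 items answered by 4 respondents (hypothetical values).
alpha = cronbach_alpha([[4, 5, 3, 4], [4, 4, 3, 5], [5, 5, 2, 4]])
```

Computing alpha separately by language group is one way to make the Spanish-versus-English reliability comparison reported below.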
Responses to both language versions exhibit similar patterns with respect to item–scale correlations, factor structure, content validity, and the association of each of the seven quality-of-care composites with both the hospital rating and intention to recommend the hospital. Internal consistency reliability was slightly yet significantly lower for Spanish-language respondents for five of the seven composites, but overall the composites were generally equivalent across language versions.
The results provide preliminary evidence of the equivalence between the Spanish and English versions of H-CAHPS. The translated Spanish version can be used to assess hospital quality of care for Spanish speakers, and compare results across these two language groups.
Survey translation and adaptation; patient survey; language equivalence; hospital care quality; Spanish-language survey
Patients in the U.S. often turn to complementary and alternative medicine (CAM) and may use it concurrently with conventional medicine to treat illness and promote wellness. However, clinicians vary in their openness to the merging of treatment paradigms. Because integration of CAM with conventional medicine can have important implications for health care, we developed a survey instrument to assess clinicians' orientation toward integrative medicine.
A convenience sample of 294 acupuncturists, chiropractors, primary care physicians, and physician acupuncturists in academic and community settings in California.
Data Collection Methods
We used a qualitative analysis of structured interviews to develop a conceptual model of integrative medicine at the provider level. Based on this conceptual model, we developed a 30-item survey (IM-30) to assess five domains of clinicians' orientation toward integrative medicine: openness, readiness to refer, learning from alternate paradigms, patient-centered care, and safety of integration.
Two hundred and two clinicians (69 percent response rate) returned the survey. The internal consistency reliability for the 30-item total scale and the five subscales ranged from 0.71 to 0.90. Item-scale correlations were higher for the hypothesized subscale than for the other subscales 75 percent or more of the time. Construct validity was supported by the association of the IM-30 total scale score (0–100 possible range, with a higher score indicative of greater orientation toward integrative medicine) with hypothesized constructs: physician acupuncturists scored higher than physicians (71 versus 50, p<.001), dual-trained practitioners scored higher than single-trained practitioners (71 versus 62, p<.001), and practitioners' self-perceived “integrativeness” was significantly correlated (r=0.60, p<.001) with the IM-30 total score.
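The item-scale correlation check used here, often called scaling success, asks whether an item correlates more strongly with its own subscale (corrected for overlap by excluding the item from the subscale total) than with a competing subscale. A sketch with hypothetical data and helper names:

```python
def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def item_scale_success(item, own_rest_items, other_items):
    """True if `item` correlates more strongly with the total of the remaining
    items in its own subscale than with a rival subscale's total."""
    own_rest = [sum(vals) for vals in zip(*own_rest_items)]
    other = [sum(vals) for vals in zip(*other_items)]
    return pearson(item, own_rest) > pearson(item, other)

# Toy responses from 5 respondents (invented, not the IM-30 data):
success = item_scale_success([1, 2, 3, 4, 5],
                             [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]],   # rest of own subscale
                             [[5, 4, 3, 2, 1], [1, 1, 2, 1, 2]])   # rival subscale
```

Tallying `success` over all item–rival pairs yields the "75 percent or more of the time" figure reported above.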
This study provides support for the reliability and validity of the IM-30 as a measure of clinicians' orientation toward integrative medicine. The IM-30 survey, which we estimate as requiring 5 minutes to complete, can be administered to both conventional and CAM clinicians.
Integrative medicine; complementary and alternative medicine; clinicians' orientation; reliability; validity
To assess patients' use of and preferences for information about technical and interpersonal quality when using simulated, computerized health care report cards to select a primary care provider (PCP).
Data Sources/Study Setting
Primary data collected from 304 adult consumers living in Los Angeles County in January and February 2003.
Study Design/Data Collection
We constructed computerized report cards for seven pairs of hypothetical individual PCPs (including two internal validity check pairs). Participants selected the physician they preferred. A questionnaire collected demographic information and assessed participant attitudes toward different sources of report card information. The relationship between patient characteristics and the number of times a participant selected the physician who excelled in technical quality was estimated using an ordered logit model.
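For readers unfamiliar with the ordered logit model, a minimal sketch of how it assigns probabilities to an ordered outcome (here, a count of technical-quality choices) under the proportional-odds parameterization; the cutpoint values below are illustrative, not estimates from the study:

```python
import math

def ordered_logit_probs(xb, cutpoints):
    """Category probabilities under an ordered (proportional-odds) logit model.
    xb: linear predictor for one respondent (e.g. from covariates like age).
    cutpoints: ascending thresholds separating the ordered categories."""
    # Cumulative probabilities P(y <= j) via the logistic CDF.
    cdf = [1.0 / (1.0 + math.exp(-(c - xb))) for c in cutpoints]
    cdf = [0.0] + cdf + [1.0]
    # Category probabilities are differences of adjacent cumulative probabilities.
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# Illustrative: 4 ordered categories, linear predictor 0.5, toy cutpoints.
p = ordered_logit_probs(0.5, [-1.0, 0.0, 1.0])
```

In the study's setting, the fitted coefficients in `xb` would quantify how demographics shift a respondent toward choosing the technically superior physician more often.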
Ninety percent of the sample selected the dominant physician for both validity checks, indicating a level of attention to task comparable with prior studies. When presented with pairs of physicians who varied in technical and interpersonal quality, two-thirds of the sample (95 percent CI: 62, 72 percent) chose the physician who was higher in technical quality at least three out of five times (one-sample binomial test of proportion). Age, gender, and ethnicity were not significant predictors of choosing the physician who was higher in technical quality.
These participants showed a strong preference for physicians of high technical quality when forced to make tradeoffs, but a substantial proportion of the sample preferred physicians of high interpersonal quality. Individual physician report cards should contain ample information in both domains to be most useful to patients.
Quality; health care; primary care; physician profiling
The National Institutes of Health (NIH) Patient-Reported Outcomes Measurement Information System (PROMIS®) is a standardized set of patient-reported outcomes (PROs) that cover physical, mental, and social health. The aim of this study was to develop the NIH PROMIS gastrointestinal (GI) symptom measures.
We first conducted a systematic literature review to develop a broad conceptual model of GI symptoms. We complemented the review with 12 focus groups including 102 GI patients. We developed PROMIS items based on the literature and input from the focus groups followed by cognitive debriefing in 28 patients. We administered the items to diverse GI patients (irritable bowel syndrome (IBS), inflammatory bowel disease (IBD), systemic sclerosis (SSc), and other common GI disorders) and a census-based US general population (GP) control sample. We created scales based on confirmatory factor analyses and item response theory modeling, and evaluated the scales for reliability and validity.
A total of 102 items were developed and administered to 865 patients with GI conditions and 1,177 GP participants. Factor analyses provided support for eight scales: gastroesophageal reflux (13 items), disrupted swallowing (7 items), diarrhea (5 items), bowel incontinence/soilage (4 items), nausea and vomiting (4 items), constipation (9 items), belly pain (6 items), and gas/bloat/flatulence (12 items). The scales correlated significantly with both generic and disease-targeted legacy instruments and demonstrated evidence of reliability.
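The item response theory modeling behind scales like these is typically Samejima's graded response model, in which each polytomous item has a discrimination parameter and ascending category thresholds. A minimal sketch of its category probabilities; the parameter values are illustrative, not calibrated PROMIS item parameters:

```python
import math

def grm_category_probs(theta, a, b):
    """P(response = k | theta) under Samejima's graded response model.
    theta: latent trait level; a: item discrimination;
    b: ascending category thresholds (len = number of categories - 1)."""
    def p_star(bk):  # boundary response function P(response >= k)
        return 1.0 / (1.0 + math.exp(-a * (theta - bk)))

    star = [1.0] + [p_star(bk) for bk in b] + [0.0]
    # Each category probability is the difference of adjacent boundary curves.
    return [star[k] - star[k + 1] for k in range(len(star) - 1)]

# Illustrative 4-category item at average trait level (theta = 0).
probs = grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])
```

Summing expected item scores over a calibrated bank of such items is what lets IRT-based scales place respondents on a common metric.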
Using the NIH PROMIS framework, we developed eight GI symptom scales that can now be used for clinical care and research across the full range of GI disorders.
Because gastrointestinal (GI) illnesses can cause physical, emotional, and social distress, patient-reported outcomes (PROs) are used to guide clinical decision making, conduct research, and seek drug approval. It is important to develop a mechanism for identifying, categorizing, and evaluating the over 100 GI PROs that exist. Here we describe a new, National Institutes of Health (NIH)-supported, online PRO clearinghouse—the GI-PRO database.
Using a protocol developed by the NIH Patient-Reported Outcome Measurement Information System (PROMIS®), we performed a systematic review to identify English-language GI PROs. We abstracted PRO items and developed an online searchable item database. We categorized symptoms into content “bins” to evaluate a framework for GI symptom reporting. Finally, we assigned a score for the methodological quality of each PRO represented in the published literature (0–20 range; higher indicates better).
We reviewed 15,697 titles (κ > 0.6 for title and abstract selection), from which we identified 126 PROs. Review of the PROs revealed eight GI symptom “bins”: (i) abdominal pain, (ii) bloat/gas, (iii) diarrhea, (iv) constipation, (v) bowel incontinence/soilage, (vi) heartburn/reflux, (vii) swallowing, and (viii) nausea/vomiting. In addition to these symptoms, the PROs covered four psychosocial domains: (i) behaviors, (ii) cognitions, (iii) emotions, and (iv) psychosocial impact. The quality scores were generally low (mean 8.88±4.19 on the 0–20 range). In addition, 51% did not include patient input in developing the PRO, and 41% provided no information on score interpretation.
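The inter-rater agreement statistic κ reported for title and abstract selection is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A self-contained sketch with invented labels, not the review's screening data:

```python
def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' categorical labels on the same items."""
    n = len(r1)
    cats = sorted(set(r1) | set(r2))
    # Observed agreement: proportion of items both raters labeled identically.
    po = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    pe = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)
    return (po - pe) / (1 - pe)

# Toy include/exclude decisions from two screeners (hypothetical).
kappa = cohens_kappa(["include", "exclude", "include", "exclude"],
                     ["include", "exclude", "exclude", "exclude"])
```

Values above 0.6, as reported here, are conventionally read as substantial agreement.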
GI PROs cover a wide range of biopsychosocial symptoms. Although plentiful, GI PROs are limited by low methodological quality. Our online PRO library (www.researchcore.org/gipro/) can help in selecting PROs for clinical and research purposes.