The use of global health items permits an efficient way of gathering general perceptions of health. These items provide useful summary information about health and are predictive of health care utilization and subsequent mortality.
We analyzed 10 self-reported global health items obtained from an internet survey conducted as part of the Patient-Reported Outcome Measurement Information System (PROMIS) project. We derived summary scores from the global health items. We estimated the associations of the summary scores with the EQ-5D index score and the PROMIS physical function, pain, fatigue, emotional distress, and social health domain scores.
Exploratory and confirmatory factor analyses supported a two-factor model. Global physical health (GPH; 4 items on overall physical health, physical function, pain, and fatigue) and global mental health (GMH; 4 items on quality of life, mental health, satisfaction with social activities, and emotional problems) scales were created. The scales had internal consistency reliability coefficients of 0.81 and 0.86, respectively. GPH correlated more strongly with the EQ-5D than did GMH (r = 0.76 vs. 0.59). GPH correlated most strongly with pain impact (r = −0.75) whereas GMH correlated most strongly with depressive symptoms (r = −0.71).
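The internal consistency coefficients reported above can be illustrated with a short sketch. It assumes the coefficient is Cronbach's alpha, which the abstract does not state explicitly, and the response matrix is made-up illustration data, not PROMIS data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses to a 4-item scale (5 respondents, 1-5 ratings).
responses = [
    [4, 4, 3, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 4],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is commonly read as a lower-bound estimate of scale reliability.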
Two dimensions representing physical and mental health underlie the global health items in PROMIS. These global health scales can be used to efficiently summarize physical and mental health in patient-reported outcome studies.
Global health; PROMIS; Item response theory; EQ-5D
The Patient-Reported Outcomes Measurement Information System (PROMIS) aims to develop self-reported item banks for clinical research. The PROMIS pediatrics (aged 8–17) project focuses on the development of item banks across several health domains (physical function, pain, fatigue, emotional distress, social role relationships, and asthma symptoms). The psychometric properties of the anxiety and depressive symptom item banks are described.
Participants (n = 1,529) were recruited in public school settings, hospital-based outpatient and subspecialty pediatrics clinics. The anxiety (k = 18) and depressive symptoms (k = 21) items were split between two test administration forms. Hierarchical confirmatory factor-analytic models (CFA) were conducted to evaluate scale dimensionality and local dependence. IRT analyses were then used to finalize item banks and short forms.
CFA results confirmed that anxiety and depressive symptoms are separate constructs and indicative of negative affect. Items with local dependence and differential item functioning (DIF) were removed, resulting in 15 anxiety and 14 depressive symptoms items. The psychometric differences between short forms and simulated computer adaptive tests are presented.
PROMIS pediatric item banks were developed to provide efficient assessment of health-related quality of life domains. This sample provides initial calibrations of anxiety and depressive symptoms item banks and creates PROMIS pediatric instruments, version 1.0.
PROMIS; Anxiety; Depressive symptoms; HRQOL; PRO; Scale development; Surveys; Pediatrics
To illustrate how measurement practices can be advanced using as an example the fatigue item bank (FIB) and its applications (short-forms and computerized adaptive test) that were developed via the NIH Patient Reported Outcomes Measurement Information System (PROMIS) Cooperative Group.
Psychometric analysis of data collected by an internet survey company using Item Response Theory (IRT) related techniques.
A United States general population representative sample collected via internet.
803 respondents were used for dimensionality evaluation of the PROMIS FIB, and 14,931 respondents were used for item calibrations.
Main Outcome Measures
112 fatigue items developed by the PROMIS fatigue domain working group, 13-item Functional Assessment of Chronic Illness Therapy-Fatigue, and 4-item SF-36 Vitality scale.
The PROMIS FIB version 1, which consists of 95 items, demonstrated acceptable psychometric properties. Computerized Adaptive Testing (CAT) showed consistently better precision than short-forms. However, all three short-forms showed good precision for the majority of participants, in that more than 95% of the sample could be precisely measured with a reliability greater than 0.9.
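The reliability threshold of 0.9 can be related to a standard error of measurement. The sketch below uses the common approximation reliability = 1 - SE^2 when the latent trait has unit variance. The conversion to a T-score metric with SD = 10 is an assumption brought in for illustration; this abstract does not state the reporting metric.

```python
import math

def reliability_from_se(se, sd=1.0):
    """Approximate reliability implied by a standard error, given the score SD."""
    return 1.0 - (se / sd) ** 2

def max_se_for_reliability(reliability, sd=1.0):
    """Largest standard error that still reaches the requested reliability."""
    return sd * math.sqrt(1.0 - reliability)

se_theta = max_se_for_reliability(0.9)        # latent (theta) metric, SD = 1
se_t = max_se_for_reliability(0.9, sd=10.0)   # assumed T-score metric, SD = 10
print(round(se_theta, 3), round(se_t, 2))
```

Under these assumptions, a respondent is measured with reliability above 0.9 when the standard error of the score falls below roughly 0.32 on the latent metric.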
Measurement practice can be advanced by using a psychometrically sound measurement tool and its applications. This example shows that CAT and short-forms derived from the PROMIS FIB can reliably estimate fatigue reported by the US general population. Evaluation in clinical populations is warranted before the item bank can be used for clinical trials.
PROMIS; fatigue; CAT; short-form
An aim of the National Institutes of Health (NIH) Patient Reported Outcomes Measurement Information System (PROMIS) initiative is to develop item banks and computerized adaptive tests (CAT) that are applicable across a wide variety of chronic disorders. The PROMIS Pediatric Cooperative Group has concentrated on the development of pediatric self-report item banks for ages 8-17 years. The objective of the present study is to describe the Item Response Theory (IRT) analysis of the NIH PROMIS pediatric pain item bank and the measurement properties of the new unidimensional PROMIS Pediatric Pain Interference Scale. Test forms containing pediatric pain items were completed by a total of 3,048 respondents. IRT analyses regarding scale dimensionality, item local dependence, and differential item functioning were conducted. A pain item pool was developed to yield scores on a T-score scale with a mean of 50 and standard deviation of 10. The recommended 8-item unidimensional short form for the PROMIS Pediatric Pain Interference Scale contains the item set which provides the maximum test information at the mean (50) on the T-score metric. A simulated CAT was computed that provides the most information at five possible score locations (30, 40, 50, 60, and 70 on the T-score metric).
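The T-score metric described above can be sketched as a linear rescaling of a standardized latent score. The function names are hypothetical, and the sketch assumes the latent score (theta) has mean 0 and standard deviation 1 in the reference sample.

```python
def theta_to_t(theta, mean=50.0, sd=10.0):
    """Convert a standardized latent score to the T-score metric."""
    return mean + sd * theta

def t_to_theta(t, mean=50.0, sd=10.0):
    """Convert a T-score back to the standardized latent metric."""
    return (t - mean) / sd

# The five score locations named for the simulated CAT, expressed in SD units.
cat_targets = [30, 40, 50, 60, 70]
thetas = [t_to_theta(t) for t in cat_targets]
print(thetas)
```

On this reading, the five CAT target locations span two standard deviations below to two standard deviations above the reference mean.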
Pain; pediatrics; PROMIS; pain interference; Item Response Theory
One of the PROMIS (Patient-Reported Outcome Measurement Information System) network's primary goals is the development of a comprehensive item bank for patient-reported outcomes of chronic diseases. For its first set of item banks, PROMIS chose to focus on pain, fatigue, emotional distress, physical function, and social function. An essential step for the development of an item pool is the identification, evaluation, and revision of extant questionnaire items for the core item pool. In this work, we also describe the systematic process wherein items are classified for subsequent statistical processing by the PROMIS investigators. Six phases of item development are documented: identification of extant items, item classification and selection, item review and revision, focus group input on domain coverage, cognitive interviews with individual items, and final revision before field testing. Identification of items refers to the systematic search for existing items in currently available scales. Expert item review and revision was conducted by trained professionals who reviewed the wording of each item and revised as appropriate for conventions adopted by the PROMIS network. Focus groups were used to confirm domain definitions and to identify new areas of item development for future PROMIS item banks. Cognitive interviews were used to examine individual items. Items successfully screened through this process were sent to field testing and will be subjected to innovative scale construction procedures.
patient-reported outcomes; cognitive interviews; qualitative methods; questionnaire development
The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health initiative to develop item banks measuring patient-reported outcomes (PROs) and to create and make available a computerized adaptive testing system (CAT) that allows for efficient and precise assessment of PROs in clinical research and practice.
Aims of the Study
Based on the presentation from a symposium on “Evidence-based Outcomes in Psychiatry: Updates on Measurement Using Patient-Reported Outcomes (PRO)” at the 2011 American Psychiatric Association Convention, this paper provides an overview of PROMIS and its application to mental health research.
The PROMIS methodology for item bank development and testing is described, with a focus on the implications of this work for mental health research.
Utilizing qualitative item review and state-of-the-art applications of item response theory (IRT), PROMIS investigators have developed, tested, and released item banks measuring physical, mental, and social health components. Ongoing efforts continue to add new item banks and further validate existing banks.
PROMIS provides item banks measuring several domains of interest to mental health researchers including emotional distress, social function, and sleep. PROMIS methodology also provides a rigorous standard for the development of new mental health measures.
Implications for Health Care Provision
Web-based CAT or administration of short forms derived from PROMIS item banks provides efficient and precise dimensional estimates of clinical outcomes that can be used to monitor patient progress and assess quality improvement.
Implications for Future Research
Use of the dimensional PROMIS metrics (and co-calibration of the PROMIS item banks with existing PROs) will allow comparisons of mental health and related health outcomes across disorders and studies.
The measurement of pain behavior is a key component of the assessment of persons with chronic pain; however, few self-reported pain behavior instruments have been developed. We developed a pain behavior item bank as part of the Patient Reported Outcome Measurement Information System (PROMIS). For the Wave I testing, because of the large number of PROMIS items, a complex sampling approach was used in which participants were randomly assigned to respond either to two full item banks or to multiple 7-item blocks of items. A web-based survey was designed and completed by 15,528 members of the general population and 967 individuals with different types of chronic pain. Item response theory (IRT) analysis models were used to evaluate item characteristics and to scale both items and individuals on the pain behavior domain. The pain behavior item bank demonstrated good fit to a unidimensional model (Comparative Fit Index = 0.94). Several iterations of IRT analyses resulted in a final 39-item pain behavior bank, and different IRT models were fit to the total sample and to those participants who experienced some pain. The results indicated that these items demonstrated good coverage of the pain behavior construct. Pain behavior scores were strongly related to pain intensity and moderately related to self-reported general health status. Mean pain behavior scores varied significantly by groups based on pain severity and general health status. The PROMIS pain behavior item bank can be used to develop static short-form and dynamic measures of pain behavior for clinical studies.
Pain behavior; item response theory analysis; patient reported outcomes; psychometric analysis; chronic pain; item banks
Recently, the National Institutes of Health Roadmap for Medical Research initiative led a large-scale effort to develop the Patient-Reported Outcomes Measurement Information System (PROMIS). PROMIS’s main goal was to develop a set of item banks and computerized adaptive tests for the clinical research community. Asthma, as the most common chronic childhood disease, was chosen for a disease-specific pediatric item bank.
The primary objective of this research is to present the details of the psychometric analyses of the asthma domain items.
Item response theory (IRT) analyses were conducted on a 34-item asthma bank. Test forms containing PROMIS Pediatric Asthma domain items were completed by 622 children ages 8 to 12. Items were subsequently evaluated for local dependence, scale dimensionality, and differential item functioning.
A 17-item pool and an 8-item short form for the new PROMIS Pediatric Asthma Impact Scale (PAIS) were generated using IRT. The recommended 8-item short form contains the item set that provides the maximum test information at the mean (50) on the T-score metric. If more score precision is required, the complete 17-item pool is recommended and may be used in toto or as the basis of a computerized adaptive test (CAT). A shorter test form can also be created and scored on the same scale.
The present study presents the PROMIS Pediatric Asthma Impact Scale (PAIS) developed with IRT, and provides the initial calibration data for the items.
asthma; pediatric; quality-of-life; item response theory
In 2002, the NIH launched the ‘Roadmap for Medical Research’. The Patient-Reported Outcomes Measurement Information System (PROMIS®) is one of the Roadmap’s key aspects. To create the next generation of patient-reported outcome measures, PROMIS utilizes item response theory (IRT) and computerized adaptive testing. In 2009, the NIH funded the second wave of PROMIS studies (PROMIS II). PROMIS II studies continue PROMIS’s agenda, but also include new features, including longitudinal analyses and more sociodemographically diverse samples. PROMIS II also includes increased emphasis on pediatric populations and evaluation of PROMIS item banks for clinical research and population science. These aspects bring new psychometric challenges. To address this, investigators associated with PROMIS gathered at the Third Psychometric Summit in September 2010 to identify, describe and discuss pressing psychometric issues and new developments in the field, as well as make analytic recommendations for PROMIS. The summit addressed five general themes: linking, differential item functioning, dimensionality, IRT models for longitudinal applications and new IRT software. In this article, we review the discussions and presentations that occurred at the Third PROMIS Psychometric Summit.
computerized adaptive testing; dimensionality; factor analysis; item response theory; patient-reported outcomes; PROMIS; psychometrics; structural equation modeling
The NIH Patient-Reported Outcomes Measurement Information System (PROMIS) Roadmap initiative is a cooperative group program of research designed to develop, evaluate, and standardize item banks to measure patient-reported outcomes relevant across medical conditions. For adults, 11 domains have been developed in physical, mental, and social health.
The objective of the current study was to assess the feasibility and construct validity of PROMIS item banks versus legacy measures in an observational study in systemic sclerosis (SSc).
Patients with SSc in a single academic center completed computerized adaptive technology (CAT) administered PROMIS item banks during the clinic visit and legacy domains (using paper-and-pencil). The construct validity of PROMIS items was evaluated by examining correlations with corresponding legacy measures using multitrait-multimethod analysis.
Participants consisted of 143 SSc patients with an average age of 51.5 years; 71% were female and 68% were Caucasian. The average number of items completed for each CAT-administered item bank ranged from 5 to 8 (69 CAT items per patient), and the average time to complete each CAT-administered item bank ranged from 48 seconds to 1.9 minutes per patient (average time = 11.9 minutes per patient for 11 banks). All correlations between PROMIS domains and respective legacy measures were large and in the hypothesized direction (ranging from .61 to .82).
Our study supports the construct validity of the CAT-administered PROMIS item banks and shows that they can be administered successfully in a clinic with support staff. Future studies should assess the feasibility of PROMIS item banks in a busy clinical practice.
Systemic sclerosis; PROMIS; health-related quality of life; construct validity
The Health Assessment Questionnaire Disability Index (HAQ) and the SF-36 PF-10, among other instruments, yield sensitive and valid Disability (Physical Function) endpoints. Modern techniques, such as Item Response Theory (IRT), now enable development of more precise instruments using improved items. The NIH Patient Reported Outcomes Measurement Information System (PROMIS) is charged with developing improved IRT-based tools. We compared the ability to detect change in physical function using original (Legacy) instruments with Item-Improved and PROMIS IRT-based instruments.
We studied two Legacy (original) Physical Function/Disability instruments (HAQ, PF-10), their item-improved derivatives (Item-Improved HAQ and PF-10), and the IRT-based PROMIS Physical Function 10- (PROMIS PF 10) and 20-item (PROMIS PF 20) instruments. We compared sensitivity to detect 12-month changes in physical function in 451 rheumatoid arthritis (RA) patients and assessed relative responsiveness using P-values, effect sizes (ES), and sample size requirements.
The study sample was 81% female, 87% Caucasian, 65 years of age, had 14 years of education, and had moderate baseline disability. All instruments were sensitive to detecting change (P < 0.05) in physical function over one year. The most responsive instruments in these patients were the Item-Improved HAQ and the PROMIS PF 20. IRT-improved instruments could detect a 1.2% difference with 80% power, while reference instruments could detect only a 2.3% difference (P < 0.01). The best IRT-based instruments required only one-quarter of the sample sizes of the Legacy (PF-10) comparator (95 versus 427). The HAQ outperformed the PF-10 in more impaired populations; the reverse was true in more normal populations. Considering especially the range of severity measured, the PROMIS PF 20 appears the most responsive instrument.
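The link between responsiveness and sample size can be sketched with a generic power calculation. This is a textbook two-group normal approximation, not the calculation used in the study, and the effect sizes passed in are hypothetical. It shows why a more responsive instrument, which yields a larger standardized effect, needs fewer participants.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison
    of means with standardized effect size `effect_size`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Hypothetical effect sizes: doubling the effect roughly quarters the sample.
n_small_effect = n_per_group(0.25)
n_large_effect = n_per_group(0.50)
print(n_small_effect, n_large_effect)

# Effect size ratio implied by the reported sample sizes (427 vs. 95),
# if required n scales with 1 / effect_size**2 as in this approximation.
implied_ratio = math.sqrt(427 / 95)
print(round(implied_ratio, 2))
```

Under this approximation, the reported fourfold reduction in sample size would correspond to an effect size roughly twice as large for the best IRT-based instrument.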
Physical Function scales using item improved or IRT-based items can result in greater responsiveness and precision across a broader range of physical function. This can reduce sample size requirements and thus study costs.
Short-form patient-reported outcome measures are popular because they minimize patient burden. We assessed the efficiency of static short forms and computer adaptive testing (CAT) using data from the Patient-Reported Outcomes Measurement Information System (PROMIS) project.
We evaluated the 28-item PROMIS depressive symptoms bank. We used post hoc simulations based on the PROMIS calibration sample to compare several short-form selection strategies and the PROMIS CAT to the total item bank score.
Compared with full-bank scores, all short forms and CAT produced highly correlated scores, but CAT outperformed each static short form in almost all criteria. However, short-form selection strategies performed only marginally worse than CAT. The performance gap observed in static forms was reduced by using a two-stage branching test format.
Using several polytomous items in a calibrated unidimensional bank to measure depressive symptoms yielded a CAT that provided marginally superior efficiency compared to static short forms. The efficiency of a two-stage semi-adaptive testing strategy was so close to CAT that it warrants further consideration and study.
Computer adaptive testing; PROMIS; Item response theory; Short form; Two-stage testing
The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed as one of the first projects funded by the NIH Roadmap for Medical Research Initiative to re-engineer the clinical research enterprise. The primary goal of PROMIS is to build item banks and short forms that measure key health outcome domains that are manifested in a variety of chronic diseases which could be used as a “common currency” across research projects. To date, item banks, short forms and computerized adaptive tests (CAT) have been developed for 13 domains with relevance to pediatric and adult subjects. To enable easy delivery of these new instruments, PROMIS built a web-based resource (Assessment Center) for administering CATs and other self-report data, tracking item and instrument development, monitoring accrual, managing data, and storing statistical analysis results. Assessment Center can also be used to deliver custom researcher developed content, and has numerous features that support both simple and complicated accrual designs (branching, multiple arms, multiple time points, etc.). This paper provides an overview of the development of the PROMIS item banks and details Assessment Center functionality.
Childhood obesity is a growing health concern known to adversely affect quality of life in children and adolescents. The Patient Reported Outcomes Measurement Information System (PROMIS) pediatric measures were developed to capture child self-reports across a variety of health conditions experienced by children and adolescents. The purpose of this study is to begin the process of validation of the PROMIS pediatric measures in children and adolescents affected by obesity.
The pediatric PROMIS instruments were administered to 138 children and adolescents in a cross-sectional study of patient reported outcomes in children aged 8–17 years with age-adjusted body mass index (BMI) greater than the 85th percentile in a design to establish known-group validity. The children completed the depressive symptoms, anxiety, anger, peer relationships, pain interference, fatigue, upper extremity, and mobility PROMIS domains utilizing a computer interface. PROMIS domains and individual items were administered in random order and included a total of 95 items. Patient responses were compared between patients with BMI 85 to < 99th percentile versus ≥ 99th percentile.
136 participants were recruited and had all necessary clinical data for analysis. Of the 136 participants, 5% ended the survey early resulting in missing domain scores at the end of survey administration. In multivariate analysis, patients with BMI ≥ 99th percentile had worse scores for depressive symptoms, anger, fatigue, and mobility (p < 0.05). Parent-reported exercise was associated with better scores for depressive symptoms, anxiety, and fatigue (p < 0.05).
Children and adolescents ranging from overweight to severely obese can complete multiple PROMIS pediatric measures using a computer interface in the outpatient setting. In the 5% with missing domain scores, the missing scores were consistently found in the domains administered last, suggesting the length of the assessment is important. The differences in domain scores found in this study are consistent with previous reports investigating the quality of life in children and adolescents with obesity. We show that the PROMIS instrument represents a feasible and potentially valuable instrument for the future study of the effect of pediatric obesity on quality of life.
Quality of life; Obesity; Patient Reported Outcomes Measurement Information System (PROMIS); Child; Depression
Patient-reported physical function is an established outcome domain in clinical studies in rheumatology. To overcome the limitations of the current generation of questionnaires, the Patient-Reported Outcomes Measurement Information System (PROMIS®) project in the USA has developed calibrated item banks for measuring several domains of health status in people with a wide range of chronic diseases. The aim of this study was to translate and cross-culturally adapt the PROMIS physical function item bank to the Dutch language and to pretest it in a sample of patients with arthritis.
The items of the PROMIS physical function item bank were translated using rigorous forward-backward protocols and the translated version was subsequently cognitively pretested in a sample of Dutch patients with rheumatoid arthritis.
Few issues were encountered in the forward-backward translation. Only 5 of the 124 items to be translated had to be rewritten because of culturally inappropriate content. Subsequent pretesting showed that overall, questions of the Dutch version were understood as they were intended, while only one item required rewriting.
Results suggest that the translated version of the PROMIS physical function item bank is semantically and conceptually equivalent to the original. Future work will be directed at creating a Dutch-Flemish final version of the item bank to be used in research with Dutch speaking populations.
In order to fully capture the impact of a disease or condition on the lives of individuals, patient-reported outcomes (PROs) are considered a necessary component of health measurement in rehabilitation. This article provides an overview of the involvement of rehabilitation stakeholders in the development of sound measurement tools for the Patient-Reported Outcomes Measurement Information System (PROMIS), a National Institutes of Health-funded initiative. PROMIS is a multi-site study that included many different populations. Here we focus on the involvement of people with several chronic conditions, including multiple sclerosis, spinal cord injury, and arthritis in the development of PROMIS measures. We describe both qualitative and quantitative methods used, including expert panels, focus groups, cognitive interviews and item response theory modeling, which resulted in enhanced utility of PROMIS measures in rehabilitation. The measures include a set of global health items and twelve item banks representing six domains. Scores are reported in the T-score metric (Mean = 50, SD=10) and centered on means from the U.S. general population. The PROMIS item banks measure quality of life and symptoms of people with chronic conditions and have the potential to enhance research and clinical practice by facilitating comparisons of scores across domains and populations.
Outcome Assessment (Health Care); rehabilitation; Disabled Persons
Patient-reported outcomes (PROs) are essential when evaluating many new treatments in health care, yet current measures have been limited by a lack of precision, standardization and comparability of scores across studies and diseases. The Patient-Reported Outcomes Measurement Information System (PROMIS™) provides item banks that offer the potential for measurement of commonly-studied PROs that is efficient (minimizes item number without compromising reliability), flexible (enables optional use of interchangeable items), and precise (has minimal error in estimate). We report results from the first large-scale testing of PROMIS items.
Study Design and Setting
Fourteen item pools were tested in the U.S. general population and clinical groups using an online panel and clinic recruitment. A scale-setting sub-sample was created reflecting demographics proportional to the 2000 U.S. census.
Using item response theory (graded response model), 11 item banks were calibrated on a sample of 21,133, measuring components of self-reported physical, mental and social health, along with a 10-item global health scale. Short forms from each bank were developed and compared to the overall bank as well as with other well-validated and widely accepted (“legacy”) measures. All item banks demonstrated good reliability across the majority of the score distributions. Construct validity was supported by moderate to strong correlations with legacy measures.
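The graded response model named above can be sketched as follows. The code implements Samejima's model for category probabilities and item information, then ranks a small bank by information at the population mean as one possible way to assemble a short form. The item names and parameters are hypothetical, and the selection rule is an illustration, not the procedure reported for the PROMIS short forms.

```python
import math

def _cumulative(theta, a, thresholds):
    """P(X >= k) for k = 0..m, padded with 1.0 at k = 0 and 0.0 at k = m."""
    inner = [1 / (1 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]

def grm_category_probs(theta, a, thresholds):
    """Probability of each ordered response category."""
    cum = _cumulative(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def grm_item_information(theta, a, thresholds):
    """Fisher information: sum over categories of (dP/dtheta)^2 / P."""
    cum = _cumulative(theta, a, thresholds)
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        dp = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p > 0:
            info += dp * dp / p
    return info

def short_form(bank, theta=0.0, n=2):
    """Names of the n items with the most information at theta."""
    ranked = sorted(bank, key=lambda name: grm_item_information(theta, *bank[name]),
                    reverse=True)
    return ranked[:n]

# Hypothetical bank: name -> (discrimination a, ordered thresholds b_k).
bank = {
    "item_A": (2.5, [-1.0, -0.3, 0.4, 1.1]),
    "item_B": (1.0, [-1.5, -0.5, 0.5, 1.5]),
    "item_C": (1.8, [1.5, 2.0, 2.5, 3.0]),
    "item_D": (2.0, [-0.8, 0.0, 0.8, 1.6]),
}
selected = short_form(bank, theta=0.0, n=2)
print(selected)
```

Items with high discrimination and thresholds near the target score contribute the most information there, so a form built this way is most precise around the mean and less precise in the tails.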
PROMIS item banks and their short forms provide evidence they are reliable and precise measures of generic symptoms and functional reports comparable to legacy instruments. Further testing will continue to validate and test PROMIS items and banks in diverse clinical populations.
Outcome Measures; Quality of life; Chronic disease
The Patient Reported Outcomes Measurement Information System (PROMIS) aims to develop patient-reported outcome (PROs) instruments for use in clinical research. The PROMIS pediatrics (ages 8–17) project focuses on the development of PROs across several health domains (physical function, pain, fatigue, emotional distress, social role relationships, and asthma symptoms). The objective of the present study is to report on the psychometric properties of the PROMIS Pediatric Anger Scale.
Participants (n=759) were recruited in public school settings, hospital-based outpatient and subspecialty pediatrics clinics. The anger items (k=10) were administered on one test form. A hierarchical confirmatory factor analytic model (CFA) was conducted to evaluate scale dimensionality and local dependence. Item response theory (IRT) analyses were then used to finalize the item scale and short form.
CFA confirmed that the anger items are representative of a unidimensional scale and items with local dependence were removed resulting in a six-item short form. The IRT-scaled scores from summed scores and each score’s conditional standard error were calculated for the new six-item PROMIS Pediatric Anger Scale.
This study provides initial calibrations of the anger items and creates the PROMIS Pediatric Anger Scale, version 1.0.
PROMIS; Anger; HRQOL; PRO; Scale Development; Surveys; Pediatrics
Pediatric self-report should be considered the standard for measuring patient reported outcomes (PRO) among children. However, circumstances exist when the child is too young, cognitively impaired, or too ill to complete a PRO instrument and a proxy-report is needed. This paper describes the development process including the proxy cognitive interviews and large-field-test survey methods and sample characteristics employed to produce item parameters for the Patient Reported Outcomes Measurement Information System (PROMIS) pediatric proxy-report item banks.
The PROMIS pediatric self-report items were converted into proxy-report items before undergoing cognitive interviews. These items covered six domains (physical function, emotional distress, social peer relationships, fatigue, pain interference, and asthma impact). Caregivers (n = 25) of children between the ages of 5 and 17 years provided qualitative feedback on proxy-report items to assess any major issues with these items. From May 2008 to March 2009, the large-scale survey enrolled children ages 8-17 years to complete the self-report version and caregivers to complete the proxy-report version of the survey (n = 1548 dyads). Caregivers of children ages 5 to 7 years completed the proxy-report survey (n = 432). In addition, caregivers completed other proxy instruments: the PedsQL™ 4.0 Generic Core Scales Parent Proxy-Report version, the PedsQL™ Asthma Module Parent Proxy-Report version, and the KIDSCREEN Parent-Proxy-52.
Item content was well understood by proxies and did not require item revisions, but some proxies clearly noted that determining an answer on behalf of their child was difficult for some items. Dyads and caregivers of children ages 5-17 years old were enrolled in the large-scale testing. The majority were female (85%), married (70%), Caucasian (64%) and had at least a high school education (94%). Approximately 50% had children with a chronic health condition, primarily asthma, which was diagnosed or treated within 6 months prior to the interview. The PROMIS proxy sample scored similar or better on the other proxy instruments compared to normative samples.
The initial calibration data was provided by a diverse set of caregivers of children with a variety of common chronic illnesses and racial/ethnic backgrounds. The PROMIS pediatric proxy-report item banks include physical function (mobility n = 23; upper extremity n = 29), emotional distress (anxiety n = 15; depressive symptoms n = 14; anger n = 5), social peer relationships (n = 15), fatigue (n = 34), pain interference (n = 13), and asthma impact (n = 17).
PROMIS; HRQOL; PRO; Scale development; Parent Proxy; Pediatrics
The authors report on the development and calibration of item banks for depression, anxiety, and anger as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®). Comprehensive literature searches yielded an initial bank of 1,404 items from 305 instruments. After qualitative item analysis (including focus groups and cognitive interviewing), 168 items (56 for each construct) were written in a first person, past tense format with a 7-day time frame and five response options reflecting frequency. The calibration sample included nearly 15,000 respondents. Final banks of 28, 29, and 29 items were calibrated for depression, anxiety, and anger, respectively, using item response theory. Test information curves showed that the PROMIS item banks provided more information than conventional measures in a range of severity from approximately −1 to +3 standard deviations (with higher scores indicating greater distress). Short forms consisting of seven to eight items provided information comparable to legacy measures containing more items.
depression; anxiety; anger; item response theory; measurement
Patient-reported outcomes (PROs) play an increasingly important role in clinical practice and research. Modern psychometric methods such as item response theory (IRT) enable the creation of item banks that support fixed-length forms as well as computerized adaptive testing (CAT), often resulting in improved measurement precision and responsiveness. Here we describe and discuss the case for developing an international core set of PROs building from the US PROMIS® network.
PROMIS is a U.S.-based cooperative group of research sites and centers of excellence convened to develop and standardize PRO measures across studies and settings. If extended to a global collaboration, PROMIS has the potential to transform PRO measurement by creating a shared, unifying terminology and metric for reporting of common symptoms and functional life domains. Extending a common set of standardized PRO measures to the international community offers great potential for improving patient-centered research, clinical trials reporting, population monitoring, and health care worldwide. Benefits of such standardization include the possibility of: international syntheses (such as meta-analyses) of research findings; international population monitoring and policy development; access by health services administrators and planners to relevant information on the populations they serve; better assessment and monitoring of patients by providers; and improved shared decision making.
The goal of the current PROMIS International initiative is to ensure that item banks are translated and culturally adapted for use in adults and children in as many countries as possible. The process includes 3 key steps: translation/cultural adaptation, calibration, and validation. A universal translation, an approach focusing on commonalities, rather than differences across versions developed in regions or countries speaking the same language, is proposed to ensure conceptual equivalence for all items. International item calibration using nationally representative samples of adults and children within countries is essential to demonstrate that all items possess expected strong measurement properties. Finally, it is important to demonstrate that the PROMIS measures are valid, reliable and responsive to change when used in an international context.
IRT item banking will allow for tailoring within countries and facilitate growth and evolution of PROs through contributions from the international measurement community. A number of opportunities and challenges of international development of PRO item banks are discussed.
Patient-reported outcomes; Health-related quality of life research; Patients’ experiences; Questionnaires; Cross-cultural equivalence; Health information systems; Clinical decision making; Comparative effectiveness research; Patient empowerment; Cross-national comparisons
This study examined the measurement invariance of responses to the Patient-Reported Outcomes Measurement Information System (PROMIS) pain interference (PI) item bank. The original PROMIS calibration sample (Wave I) was augmented with a sample of persons recruited from the American Chronic Pain Association (ACPA) to increase the number of participants reporting higher levels of pain. Establishing measurement invariance of an item bank is essential for the valid interpretation of group differences in the latent concept being measured.
Multi-group confirmatory factor analysis (MG-CFA) was used to evaluate successive levels of measurement invariance: configural, metric, and scalar invariance.
Support was found for configural and metric invariance of the PROMIS-PI, but not for scalar invariance.
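Why a lack of scalar invariance matters for group comparisons can be illustrated with a small sketch. It assumes a simplified linear one-factor model with a continuous indicator, in which the model-implied item mean is the intercept plus the loading times the latent mean; ordinal items would use thresholds in place of intercepts. All numbers are hypothetical and are not estimates from the PROMIS-PI data.

```python
def expected_item_mean(intercept, loading, latent_mean):
    """Model-implied item mean in a linear one-factor model."""
    return intercept + loading * latent_mean


def observed_gap(intercept_a, intercept_b, loading, latent_mean_a, latent_mean_b):
    """Difference in expected item means between group B and group A."""
    mean_a = expected_item_mean(intercept_a, loading, latent_mean_a)
    mean_b = expected_item_mean(intercept_b, loading, latent_mean_b)
    return mean_b - mean_a


# Hypothetical values: equal loadings (metric invariance holds) and equal
# latent means, but unequal intercepts (scalar invariance fails).
gap_noninvariant = observed_gap(
    intercept_a=2.0, intercept_b=2.4, loading=0.8,
    latent_mean_a=0.0, latent_mean_b=0.0,
)

# Same setup with equal intercepts (scalar invariance holds).
gap_invariant = observed_gap(
    intercept_a=2.0, intercept_b=2.0, loading=0.8,
    latent_mean_a=0.0, latent_mean_b=0.0,
)

if __name__ == "__main__":
    print("gap with unequal intercepts:", round(gap_noninvariant, 3))
    print("gap with equal intercepts:  ", round(gap_invariant, 3))
```

In this sketch the two groups have identical latent means, yet the item means differ by about 0.4 when the intercepts differ. The observed gap then reflects the items rather than the construct, which is why scalar invariance is usually required before comparing group means.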
Conclusions and recommendations
Based on our results of MG-CFA, we recommend retaining the original parameter estimates obtained by combining the community sample of Wave I and ACPA participants. Future studies should extend this study by examining measurement equivalence in an item response theory framework such as differential item functioning analysis.
Factor analysis; Pain interference; Pain measurement; Patient outcome measures; Psychometrics
We provide detailed instructions for analyzing patient-reported outcome (PRO) data collected with an existing (legacy) instrument so that scores can be calibrated to the PRO Measurement Information System (PROMIS) metric. This calibration facilitates migration to computerized adaptive test (CAT) PROMIS data collection, while enabling research that uses historical legacy data alongside new PROMIS data.
A cross-sectional convenience sample (n = 2,178) from the Universities of Washington and Alabama at Birmingham HIV clinics completed the PROMIS short form and Patient Health Questionnaire (PHQ-9) depression symptom measures between August 2008 and December 2009. We calibrated the tests using item response theory. We compared measurement precision of the PHQ-9, the PROMIS short form, and simulated PROMIS CAT.
Dimensionality analyses confirmed the PHQ-9 could be calibrated to the PROMIS metric. We provide code used to score the PHQ-9 on the PROMIS metric. The mean standard errors of measurement were 0.49 for the PHQ-9, 0.35 for the PROMIS short form, and 0.37, 0.28, and 0.27 for 3-, 8-, and 9-item-simulated CATs.
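The reported standard errors can be read more intuitively as approximate reliabilities. The sketch below assumes the standard errors are expressed on a standardized latent metric (mean 0, standard deviation 1), in which case reliability is roughly one minus the squared standard error. It also applies the PROMIS T-score convention (mean 50, standard deviation 10) to rescale them. This is not the scoring code the authors refer to, and the function names are illustrative.

```python
def reliability_from_se(se, latent_sd=1.0):
    """Approximate reliability implied by a standard error of measurement.

    Assumes the standard error and the latent standard deviation are on
    the same metric, so reliability = 1 - (se / latent_sd) ** 2.
    """
    return 1.0 - (se / latent_sd) ** 2


def theta_to_t_score(theta):
    """Convert a standardized score to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


def se_in_t_units(se):
    """Rescale a standard error from the standardized metric to T-score units."""
    return 10.0 * se


# Mean standard errors as reported in the abstract above.
REPORTED_SE = {
    "PHQ-9": 0.49,
    "PROMIS short form": 0.35,
    "3-item CAT": 0.37,
    "8-item CAT": 0.28,
    "9-item CAT": 0.27,
}

if __name__ == "__main__":
    for name, se in REPORTED_SE.items():
        rel = reliability_from_se(se)
        print(f"{name}: reliability ~ {rel:.2f}, SE ~ {se_in_t_units(se):.1f} T-score points")
```

Under that assumption, the 0.49 reported for the PHQ-9 corresponds to a reliability of roughly 0.76, and the 0.27 reported for the 9-item CAT to roughly 0.93.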
The strategy described here facilitated migration from a fixed-format legacy scale to PROMIS CAT administration and may be useful in other settings.
Calibration; Computerized adaptive testing; Depression; Item banks; Item response theory; PROMIS
The Patient-Reported Outcomes Measurement Information System (PROMIS) allows assessment of the impact of chronic conditions on health-related quality of life (HRQL) across diseases. We report on the HRQL impact of individual and comorbid conditions as well as conditions that are described as limiting activity.
Study Design and Setting
Data were collected through online and clinic recruitment as part of the PROMIS item calibration sample (n = 21,133). Participants reported the presence or absence of 24 chronic health conditions and whether or not their activity was limited by each condition.
Across health status domains, participants with a chronic condition had poorer scores than those without a diagnosis, particularly when they reported that their condition was disabling. The detriment in HRQL was more pronounced for individuals with two or more chronic conditions and could not be explained by sociodemographic factors. Patterns of HRQL deficits varied across disease and comorbidity status.
Chronic conditions, particularly when experienced with comorbid disease, are associated with detriments in HRQL. The negative impact on HRQL varies across symptoms and functional areas within a given condition.
Chronic disease; Comorbidity; Quality of life; Outcome measures
The National Institutes of Health's Patient-Reported Outcomes Measurement Information System (PROMIS) has developed several scales measuring symptoms and function for use by the clinical research community. One advantage of PROMIS is the ability to link other scales to the PROMIS metric.
The objectives of this research are to provide evidence of validity for one of the PROMIS measures, the Pediatric Asthma Impact Scale (PAIS), and to link the PedsQL™ Asthma Symptoms Scale with the metric of the PAIS.
Descriptive statistics were computed describing the relationships among scores on the PAIS, the PedsQL™ Asthma Symptoms, Treatment, Worry, and Communication Scales, and the DISABKIDS Asthma Impact and Worry Scales for approximately 300 children ages 8–17. A novel linkage method based on item response theory (IRT), calibrated projection, was used to link scores on the PedsQL™ Asthma Symptoms Scale with the metric of the PAIS.
The PAIS exhibited strong convergent validity with the PedsQL™ Asthma Symptoms Scale, and less strong relations with the other five scales. The linkage system uses scores on the PedsQL™ Asthma Symptoms Scale to produce relatively precise score estimates on the metric of the PAIS.
Results of this study provide evidence for the validity of the PAIS, and a method to use scores on the PedsQL™ Asthma Symptoms Scale to estimate scores on the metric of the PAIS, in partial fulfillment of the PROMIS goal to provide a lingua franca for health-related quality of life.
PROMIS; HRQOL; PRO; Scale development; Pediatrics; Asthma