Cost-effectiveness/cost-utility analyses are increasingly needed to inform decisions about care. Algorithms have been developed using the Functional Assessment of Cancer Therapy (FACT) quality of life instrument to estimate utility weights for cost analyses. This study was designed to compare these algorithms in the setting of ovarian cancer.
GOG-0152 was a 550-patient randomized phase III trial of interval cytoreduction, and GOG-0172 was a 415-patient randomized phase III trial comparing intravenous versus intraperitoneal therapy among women with advanced ovarian cancer. QOL data were collected via the FACT at four time points in each study. Two published mapping algorithms (Cheung and Dobrez) and a linear transformation method were applied to these data. The agreement between measures was assessed by the concordance correlation coefficient (rCCC), and paired t-tests were used to compare means.
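As an illustration of the agreement statistic named above, the following is a minimal sketch of Lin's concordance correlation coefficient. The function name `concordance_ccc` and the utility scores are hypothetical, chosen only for illustration; they are not data or code from the study.

```python
import numpy as np

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired scores."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    # Population (ddof=0) variances and covariance, as in Lin's definition
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    # The squared mean difference in the denominator penalizes systematic shift
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Hypothetical utility scores from two algorithms for the same five patients
a = np.array([0.60, 0.70, 0.80, 0.90, 1.00])
b = a - 0.10  # perfectly correlated with a, but shifted down by 0.10

ccc_shifted = concordance_ccc(a, b)
ccc_identical = concordance_ccc(a, a)
```

In this toy case the two score sets have a Pearson correlation of 1, yet the concordance coefficient falls below 1 because of the constant offset. This is why good correlation-type agreement and significant mean differences can coexist.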
While agreement between the estimation algorithms was good (rCCC ranged from 0.72 to 0.81), there were statistically significant (p<0.001) and clinically meaningful differences between the scores: mean scores were higher with Dobrez than with Cheung or the linear transformation method. Scores were also statistically significantly different (p<0.001) between studies.
In the absence of prospectively collected utility data, the use of mapping algorithms is feasible; however, the optimal algorithm is not clear. There were significant differences between studies, which highlights the need for validation of these algorithms in specific settings. If cost analyses incorporate mapping algorithms to obtain utility estimates, investigators should take the variability into account.
health utilities; ovarian cancer; quality of life; methodology; comparative effectiveness research
Metastatic renal cell cancer is associated with poor long-term survival and has no cure. Traditional clinical endpoints are best supplemented by patient-reported outcomes designed to assess symptoms and function. We obtained normative data on the NCCN - Functional Assessment of Cancer Therapy – Kidney Symptom Index (NFKSI) to aid in score interpretation and planning of future trials.
General population data were obtained from 2000 respondents, who completed the NFKSI-19 as well as the SF-36 and the PROMIS-29, both general health status measures. Basic demographic and self-reported comorbidity data were also collected.
The sample was 50% female, 85.7% Caucasian, with an equal distribution across age bands 18–75+. Most respondents (62.8%) had more than a high school education and reported an ECOG performance status of normal activity without symptoms (63.4%). Score distributions on the NFKSI-19, its subscales, and individual items are summarized.
The NFKSI-19 and its subscales now have scores for the general US population, allowing comparability to generic questionnaires such as the SF-36 and PROMIS-29. These data can be used to guide treatment expectations and plan future comparative effectiveness research using the scales.
quality of life; questionnaire; renal cell cancer; general population
Neuro-QOL provides a clinically relevant and psychometrically robust health-related quality of life (HRQL) assessment tool for both adults and children with common neurological disorders. We now report the psychometric results for the adult tools.
An extensive research, survey, and consensus process was used to produce a list of 5 priority adult neurological conditions (stroke, multiple sclerosis, Parkinson’s disease, epilepsy and ALS). We identified relevant health-related quality of life (HRQL) domains through multiple methods and data sources, including a comprehensive literature review, expert interviews and surveys, and patient and caregiver focus groups. The final domain framework consisted of 17 domains of Physical, Mental and Social health. There were five phases of item development: (1) identification of 3,482 extant items, (2) item classification and selection, (3) item review and revision, (4) cognitive interviews with 63 patients to assess their understanding of individual items and (5) field testing of 432 representative items.
Participants and Procedures
Participants were drawn from the US general population and clinical settings, and included both English and Spanish speaking subjects (N = 3,246). Confirmatory factor analysis (CFA) was used to evaluate the dimensionality of unidimensional domains. Where the domain structure was previously unknown, the dataset was split and first analyzed with exploratory factor analysis and then CFA. Samejima’s graded response model (GRM) was used to calculate IRT parameters. We further evaluated differential item functioning (DIF) on gender, education and age.
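To make the graded response model mentioned above concrete, here is a minimal sketch of how category probabilities follow from an item's slope and thresholds. The function name `grm_category_probs` and the parameter values are assumptions for illustration, not Neuro-QOL calibration results.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta      : latent trait level of the respondent
    a          : item slope (discrimination)
    thresholds : ordered thresholds b_1 < ... < b_{K-1} for K categories
    """
    b = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(X >= k) for k = 1..K-1 (logistic form)
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= 0) = 1 and P(X >= K) = 0
    cum = np.concatenate(([1.0], cum, [0.0]))
    # Each category probability is the difference of adjacent cumulatives
    return cum[:-1] - cum[1:]

# Hypothetical 4-category item evaluated at an average trait level
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

With thresholds symmetric around the respondent's trait level, the two extreme categories receive equal probability, and the four probabilities sum to 1.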
Thirteen unidimensional calibrated item banks consisting of 297 items were developed. All of the tested item banks had high reliability and few or no locally dependent items. The range of item slopes and thresholds with good information are reported for each of the item banks. The banks can support CAT and the development of short forms.
The Neuro-QOL measurement system provides item banks and short forms that enable PRO measurement in neurological research, minimize patient burden, and can be used to create multiple instrument types minimizing standard error. The 17 adult measures include 13 calibrated item banks, 3 item pools available for calibration work by others, and 1 stand-alone scale (index). The Neuro-QOL instruments provide a “common metric” of representative concepts for use across patient groups in different studies.
Outcome measures; Quality of life; Neurological disorders; Computerized adaptive testing; Item banking
This paper reports the development and evaluation of a perceived cognitive function (pedsPCF) item bank reported by parents of the pediatric US general population.
Based on feedback from clinicians, parents, and children, we developed a scale sampling concerns related to children’s cognitive functioning. We administered the scale to 1,409 parents of children aged 7–17 years; of them, 319 had a neurological diagnosis. Dimensionality of the pedsPCF was evaluated via factor analyses, and its clinical utility was studied by comparing parent ratings in patient groups and symptom clusters defined by the Child Behavior Checklist (CBCL).
Forty-four of 45 items met criteria for unidimensionality. The pedsPCF significantly differentiated samples defined by medication use, repeated grades, special education status, neurologic diagnosis, and relevant symptom clusters with large effect sizes (>0.8). It predicted children’s symptoms with correct prediction rates ranging from 79% to 89%.
We have provided empirical support for the unidimensionality of the pedsPCF item bank and evidence for its potential clinical utility. The pedsPCF is a promising measurement tool to screen children for further comprehensive cognitive tests.
Perceived cognitive function; Children; Brain tumor; Neuro-oncology; Item bank
Given the importance of fatigue in cancer, stroke and HIV, we sought to assess the measurement properties of a single, well-described fatigue scale in these populations. We hypothesized that the psychometric properties of the Functional Assessment of Chronic Illness Therapy – Fatigue (FACIT-F) subscale would be favorable and that the scale could serve as a useful indicator of fatigue in these populations.
Patients were eligible for the study if they were outpatients, aged 18 or older, with a diagnosis of cancer (n=297), stroke (n=51), or HIV/AIDS (n=51). All participants were able to understand and speak English. Patients answered study-related questions, including the FACIT-F using a touch-screen laptop, assisted by the research assistant as necessary. Clinical information was abstracted from patients’ medical records.
Item-level statistics on the FACIT-F were similar across the groups and internal consistency reliability was uniformly high (α>0.91). Correlations with performance status ratings were statistically significant across the groups (range r=−0.28 to −0.80). Fatigue scores were moderately to highly correlated with general quality of life (range r=0.66–0.80) in patients with cancer, stroke, and HIV. Divergent validity was supported in low correlations with variables not expected to correlate with fatigue.
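For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha. The function name `cronbach_alpha` and the response matrix are hypothetical; they are not FACIT-F data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)  # rows = respondents, columns = items
    k = items.shape[1]
    # Sample variance of each item and of the summed scale score
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Hypothetical responses: 5 respondents, 3 items scored 0-4
responses = [
    [0, 1, 1],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(responses)
```

Alpha approaches 1 when items rise and fall together across respondents, as they do in this toy matrix.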
Originally developed to assess cancer-related fatigue, the FACIT-F has utility as a measure of fatigue in other populations, such as stroke and HIV. Ongoing research will soon allow for comparison of FACIT-F scores to those obtained using the fatigue measures from the Patient-Reported Outcomes Measurement Information System (PROMIS®; www.nihpromis.org) initiative.
fatigue; assessment; psychometrics; cancer; stroke; HIV
Stroke is a leading cause of long-term disability in the USA; however, we have an incomplete understanding of how stroke affects long-term quality of life.
We report here findings from focus groups with 9 long-term stroke survivors and 6 caregivers addressing patients’ post-stroke quality of life.
Key themes identified by patients were: social support, coping mechanisms, communication, physical functioning and independence. Role changes in patients were important to caregivers. Much of the discussion with patients and caregivers described specific ways in which the stroke altered social relationships.
These findings are consistent with prior research indicating the importance of social factors to quality of life following stroke. Our findings suggest that measures of stroke-related quality of life should include assessment of social function and social support.
stroke; quality of life; qualitative analysis; social function
Patient-reported outcomes (PROs) play an increasingly important role in clinical practice and research. Modern psychometric methods such as item response theory (IRT) enable the creation of item banks that support fixed-length forms as well as computerized adaptive testing (CAT), often resulting in improved measurement precision and responsiveness. Here we describe and discuss the case for developing an international core set of PROs building from the US PROMIS® network.
PROMIS is a U.S.-based cooperative group of research sites and centers of excellence convened to develop and standardize PRO measures across studies and settings. If extended to a global collaboration, PROMIS has the potential to transform PRO measurement by creating a shared, unifying terminology and metric for reporting of common symptoms and functional life domains. Extending a common set of standardized PRO measures to the international community offers great potential for improving patient-centered research, clinical trials reporting, population monitoring, and health care worldwide. Benefits of such standardization include the possibility of: international syntheses (such as meta-analyses) of research findings; international population monitoring and policy development; health services administrators and planners access to relevant information on the populations they serve; better assessment and monitoring of patients by providers; and improved shared decision making.
The goal of the current PROMIS International initiative is to ensure that item banks are translated and culturally adapted for use in adults and children in as many countries as possible. The process includes 3 key steps: translation/cultural adaptation, calibration, and validation. A universal translation, an approach focusing on commonalities, rather than differences across versions developed in regions or countries speaking the same language, is proposed to ensure conceptual equivalence for all items. International item calibration using nationally representative samples of adults and children within countries is essential to demonstrate that all items possess expected strong measurement properties. Finally, it is important to demonstrate that the PROMIS measures are valid, reliable and responsive to change when used in an international context.
IRT item banking will allow for tailoring within countries and facilitate growth and evolution of PROs through contributions from the international measurement community. A number of opportunities and challenges of international development of PROs item banks are discussed.
Patient-reported outcomes; Health-related quality of life research; Patients’ experiences; Questionnaires; Cross-cultural equivalence; Health information systems; Clinical decision making; Comparative effectiveness research; Patient empowerment; Cross-national comparisons
The 45-item Functional Assessment of Cancer Therapy – Hepatobiliary (FACT-Hep) questionnaire assesses health-related quality of life in patients with liver, bile duct and pancreatic cancers. Although the FACT-Hep was initially derived from patient input, we sought to verify adequate coverage of items by soliciting open-ended input from patients with advanced disease.
As part of a larger study in collaboration with the NCCN, 50 people (60% male, 80% Caucasian, average age 60.4 yrs) with Stage 3 or 4 hepatobiliary or pancreatic cancer were recruited. Participants generated and ranked up to 10 important symptoms and concerns that physicians should monitor when assessing the value of chemotherapy. Patients were also able to provide open-ended, qualitative information that was evaluated systematically. Ten expert physicians also provided input on priority symptoms.
The resulting 18-item NCCN-FHSI (NFHSI-18) demonstrated high internal consistency (α = 0.89) and moderate to strong correlations with measures of physical well-being (ρ = 0.76), emotional well-being (ρ = 0.52), and functional well-being (ρ = 0.57). Scores on the NFHSI-18 were also highly correlated with the original hepatobiliary scale of the FACT-Hep (ρ = 0.82; all p<0.001). Compared to patients with better performance status, patients with poor performance status had worse NFHSI-18 symptom scores, F (3, 47) = 9.74; p=0.0003.
The NFHSI-18 assesses symptoms of importance to patients with hepatobiliary and pancreatic cancers and demonstrates promising measurement properties. The scale is a good candidate for brief symptom assessment in clinical trials.
advanced cancer; symptom assessment; hepatobiliary cancer; pancreatic cancer
The field of solid organ transplantation has historically concentrated research efforts on basic science and translational studies. However, there has been increasing interest in health services and outcomes research. The aim was to build an effective and sustainable inter- and transdisciplinary health services and outcomes research team (NUTORC) that leveraged institutional strengths in social science, engineering, and management disciplines, coupled with an internationally recognized transplant program. In 2008, leading methodological experts across the university were identified and intramural funding was obtained for the NUTORC initiative. Inter- and transdisciplinary collaborative teams were created across departments and schools within the university. Within 3 years, NUTORC became fiscally sustainable, yielding more than tenfold return of the initial investment. Academic productivity included funding for 39 grants, publication of 60 manuscripts, and 166 national presentations. Sustainable educational opportunities for students were created. Inter- and transdisciplinary health services and outcomes research in transplant can be innovative and sustainable.
Transdisciplinary research teams; Health Services and Outcomes Research; Educational opportunities; Academic productivity; Sustainable research efforts
The importance of faith and its associations with health are well-documented. As part of the Patient Reported Outcomes Measurement Information System, items tapping positive and negative impact of illness (PII & NII) were developed across four content domains: Coping/Stress Response; Self-Concept; Social Connection/Isolation; Meaning and Spirituality. Faith items were included within the concept of meaning and spirituality.
This measurement model was tested on a heterogeneous group of 509 cancer survivors. To evaluate dimensionality, we applied two bi-factor models, specifying a general factor (PII or NII) and four local factors: Coping/Stress Response, Self-Concept, Social Connection/Isolation, and Meaning and Spirituality.
Bi-factor analysis supported sufficient unidimensionality within PII and NII item sets. The unidimensionality of both PII and NII item sets was enhanced by extraction of the faith items from the rest of the questions. Of the 10 faith items, 9 demonstrated higher local than general factor loadings (range for local factor loadings= .402 to .876), suggesting utility as a separate but related “faith” factor. The same was true for only 2 of the remaining 63 items across the PII and NII item sets.
While conceptually and to a degree empirically related to Meaning and Spirituality, Faith appears to be a distinct subdomain of PII and NII, better-handled by distinct assessment. A 10-item measure of the impact of illness upon faith (II-Faith) was therefore assembled.
cancer; oncology; measurement; psychosocial impact; faith
Despite the increasing use of panel surveys, little is known about the differences in data quality across panels.
The aim of this study was to characterize panel survey companies and their respondents based on (1) the timeliness of response by panelists, (2) the reliability of the demographic information they self-report, and (3) the generalizability of the characteristics of panelists to the US general population. A secondary objective was to highlight several issues to consider when selecting a panel vendor.
We recruited a sample of US adults from 7 panel vendors using identical quotas and online surveys. All vendors met prespecified inclusion criteria. Panels were compared on the basis of how long the respondents took to complete the survey from time of initial invitation. To validate respondent identity, this study examined the proportion of consented respondents who failed to meet the technical criteria, failed to complete the screener questions, and provided discordant responses. Finally, characteristics of the respondents were compared to US census data and to the characteristics of other panels.
Across the 7 panel vendors, 2% to 9% of panelists responded within 2 days of invitation; however, approximately 20% of the respondents failed the screener, largely because of the discordance between self-reported birth date and the birth date in panel entry data. Although geographic characteristics largely agreed with US Census estimates, each sample underrepresented adults who did not graduate from high school and/or had annual incomes less than US $15,000. Except for 1 vendor, panel vendor samples overlapped one another by approximately 20% (ie, 1 in 5 respondents participated through 2 or more panel vendors).
The results of this head-to-head comparison provide potential benchmarks in panel quality. The issues to consider when selecting panel vendors include responsiveness, failure to maintain sociodemographic diversity and validated data, and potential overlap between panels.
survey methods; community surveys; sampling bias; selection bias; Internet; data sources
L-carnitine, a popular complementary and alternative medicine product, is used by patients with cancer for the treatment of fatigue, the most commonly reported symptom in this patient population. The purpose of this study was to determine the efficacy of L-carnitine supplementation as a treatment for fatigue in patients with cancer.
Patients and Methods
In this double-blind, placebo-controlled trial, patients with invasive malignancies and fatigue were randomly assigned to either 2 g/d of L-carnitine oral supplementation or matching placebo. The primary end point was the change in average daily fatigue from baseline to week 4 using the Brief Fatigue Inventory (BFI).
Three hundred seventy-six patients were randomly assigned to treatment with L-carnitine supplementation or placebo. L-carnitine supplementation resulted in significant carnitine plasma level increase by week 4. The primary outcome, fatigue, measured using the BFI, improved in both arms compared with baseline (L-carnitine: −0.96, 95% CI, −1.32 to −0.60; placebo: −1.11, 95% CI −1.44 to −0.78). There were no statistically significant differences between arms (P = .57). Secondary outcomes, including fatigue measured by the Functional Assessment of Chronic Illness Therapy–Fatigue instrument, depression, and pain, did not show significant difference between arms. A separate analysis of patients who were carnitine-deficient at baseline did not show statistically significant improvement in fatigue or other outcomes after L-carnitine supplementation.
Four weeks of 2 g of L-carnitine supplementation did not improve fatigue in patients with invasive malignancies and good performance status.
To present responses to sexual function items contained within the quality of life (QOL) survey of the Gynecologic Oncology Group (GOG) LAP2 study, to investigate associations between sexual function and other factors (such as relationship quality and body image), and to explore patterns of response in endometrial cancer patients.
Participants enrolled in the LAP2 QOL study arm completed a self-report QOL survey, which contained sexual function items, before surgery and at 1, 3, and 6 weeks and 6 months after surgery. Responses to sexual function questions were classified into three patterns (responder, intermittent responder, and non-responder) based on whether the sexual function items were answered when the QOL survey was completed.
Of 752 patients who completed the QOL survey, 225 completed the sexual function items within the QOL survey, 224 responded intermittently, and 303 did not respond at all. No significant differences of sexual function were found between the patients randomized to laparoscopy compared to laparotomy. Among those who responded completely or intermittently, sexual function scores declined after surgery and recovered to pre-surgery levels at 6 months. Sexual function was positively associated with better quality of relationship (P<0.001), body image (P<0.001), and QOL (P<0.001), and negatively associated with fear of sex (P<0.001).
Our findings suggest that younger patients, those who were married, and those who had quality relationships were more likely to answer the sexual function items and have better quality of sexual function. Factors such as age, relationship quality, body image, and pain may place women with endometrial cancer at risk for sexual difficulties in the immediate recovery period; however, sexual function improved by 6-months postoperatively in our cohort of early-stage endometrial cancer patients.
sexual function; endometrial cancer; gynecologic cancer treatment; laparoscopy; laparotomy
Considerable research has demonstrated the negative psychosocial impact of cancer. Recent work has highlighted positive psychosocial outcomes. Research is now needed to evaluate the relationship between negative and positive impacts. This paper reports the development and validation of a measurement model capturing positive and negative psychosocial illness impacts.
The sample included 754 cancer patients on- or post-treatment. Item development was informed by literature review, expert input, patient interviews, and the results of a pilot study of 205 cancer patients, resulting in 43 positive and 46 negative items. Factor analyses were used to evaluate the dimensionality of the item pools. Analysis of variance (ANOVA) was used to examine relationships between psychosocial illness impact and other variables.
Unidimensionality was demonstrated within but not across negative and positive impact items. ANOVA results showed differential relationships between negative and positive impacts, respectively, and patient sociodemographic and clinical variables.
Positive and negative psychosocial illness impacts are best conceptualized and measured as two independent factors. Computerized adaptive tests and short-form measures developed from this comprehensive psychosocial illness impact item bank may benefit future research and clinical applications.
Psychosocial sequelae; Cancer; Cancer survivors; Bi-factor analysis
Cancers of the head and neck are associated with detriments in health-related quality of life (HRQOL); however, little is known about different experiences between African Americans and non-Hispanic whites.
HRQOL was measured by the Functional Assessment of Cancer Therapy – Head and Neck approximately five months post diagnosis among 222 cancer patients from North Carolina. Higher scores represent better HRQOL. Regression models included sociodemographic characteristics and clinical factors.
African Americans reported higher Physical Well-Being than Caucasians (adjusted means 23.1 vs 20.9). African Americans with incomes <$20,000 reported higher Emotional Well-Being (21.4) and fewer head and neck symptoms (22.0). Non-Hispanic whites making <$20,000 reported the poorest Emotional Well-Being (17.3) while African Americans making >$20,000 reported the most head and neck symptoms (18.7).
Further investigation is needed to explore variation in HRQOL experiences among different race and socio-economic groups that may inform resource allocation to improve cancer care.
health-related quality of life; head and neck cancer; African Americans
Purpose of review
In this review, we briefly summarize three fruitful, emerging areas in liver transplantation research: quality of life; risk assessment; and patient safety. Our goal is to highlight recent findings in these areas, with a call for increased integration of social scientists and transplant clinicians to address how best to shape policy and improve outcomes.
After liver transplantation, recipients generally experience clinically significant, sustained improvement in their physical, social and emotional well-being. However, a sizeable minority of patients do experience excess morbidity that may benefit from ongoing surveillance and/or intervention. There is a growing body of research that describes risks associated with liver transplantation, which can be useful aids to better inform decision making by patients, clinicians, payers, and policy makers. In contrast, there has been a relative lack of empirical data on transplant patient safety vulnerabilities, placing the field of surgery in stark contrast to other high risk industries, where such assessments inform continuous process improvement.
Health services and outcomes research has grown in importance in the liver transplantation literature, but several important questions remain unanswered that merit programmatic, interdisciplinary research.
liver transplantation; recipients; quality of life; risk assessment; patient safety; outcomes
Women diagnosed with ovarian cancer are at risk for reduced quality of life (QOL). It is imperative to further define these declines to interpret treatment outcomes and design appropriate clinical interventions.
The primary objective of this study was to compare data obtained from ovarian cancer patients with normative data to assess the degree to which QOL differs from the norm. Secondary objectives were to examine demographic variables and determine if there was a correlation between physical/functional and social/emotional scores during chemotherapy.
Patients with Stage III/IV ovarian cancer on Gynecologic Oncology Group Protocols 152 and 172 who underwent surgery followed by intravenous paclitaxel and cisplatin completed the Functional Assessment of Cancer Therapy-Ovarian. The Functional Assessment of Cancer Therapy scale includes the four domains of physical, functional, social, and emotional well-being (PWB, FWB, SWB, and EWB, respectively).
Ovarian cancer patients had a total QOL (Functional Assessment of Cancer Therapy-General) score similar to the U.S. female adult population. However, the reported subscale scores were 2.0 points (95% confidence interval [CI] 1.4–2.5, P < 0.001, effect size = 0.37) lower in PWB, 0.9 points (95% CI 0.3–1.5, P = 0.005, effect size = 0.13) lower in FWB, 5.0 points (95% CI 4.6–5.3, P < 0.001, effect size = 0.74) higher in SWB, and 0.8 points (95% CI 0.3–1.2, P < 0.001, effect size = 0.16) lower in EWB. Correlation between the sum of PWB and FWB and the sum of SWB and EWB was r = 0.53 (P < 0.001). Age was positively correlated with EWB (r = 0.193; 95% CI 0.09–0.29).
Ovarian cancer patients have decreased QOL in physical, functional, and emotional domains; however, they may compensate with increased social support. At the time of diagnosis and treatment, patients’ QOL is affected by inherent characteristics. Assessment of treatment outcomes should take into account the effect of these independent variables.
Ovarian cancer; chemotherapy; quality of life
A comprehensive, reliable, and valid measurement system is needed to monitor changes in children with neurological conditions who experience lifelong functional limitations.
This article describes the development and psychometric properties of the pediatric version of the Quality of Life in Neurological Disorders (Neuro-QOL) measurement system.
The pediatric Neuro-QOL consists of generic and targeted measures. Literature review, focus groups, individual interviews, cognitive interviews of children and consensus meetings were used to identify and finalize relevant domains and item content. Testing was conducted on 1018 children aged 10 to 17 years drawn from the US general population for generic measures and 171 similarly aged children with muscular dystrophy or epilepsy for targeted measures. Dimensionality was evaluated using factor analytic methods. For unidimensional domains, item parameters were estimated using item response theory models. Measures with acceptable fit indices were calibrated as item banks; those without acceptable fit indices were treated as summary scales.
Ten measures were developed: 8 generic or targeted banks (anxiety, depression, anger, interaction with peers, fatigue, pain, applied cognition, and stigma) and 2 generic scales (upper and lower extremity function). The banks reliably (r > 0.90) measured 63.2% to 100% of the children tested.
The pediatric Neuro-QOL is a comprehensive measurement system with acceptable psychometric properties that could be used in computerized adaptive testing. The next step is to validate these measures in various clinical populations.
health-related quality of life; children; neurological disorders; item bank; item response theory
The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health initiative to develop item banks measuring patient-reported outcomes (PROs) and to create and make available a computerized adaptive testing system (CAT) that allows for efficient and precise assessment of PROs in clinical research and practice.
Aims of the Study
Based on the presentation from a symposium on “Evidence-based Outcomes in Psychiatry: Updates on Measurement Using Patient-Reported Outcomes (PRO)” at the 2011 American Psychiatric Association Convention, this paper provides an overview of PROMIS and its application to mental health research.
The PROMIS methodology for item bank development and testing is described, with a focus on the implications of this work for mental health research.
Utilizing qualitative item review and state-of-the-art applications of item response theory (IRT), PROMIS investigators have developed, tested, and released item banks measuring physical, mental, and social health components. Ongoing efforts continue to add new item banks and further validate existing banks.
PROMIS provides item banks measuring several domains of interest to mental health researchers including emotional distress, social function, and sleep. PROMIS methodology also provides a rigorous standard for the development of new mental health measures.
Implications for Health Care Provision
Web-based CAT or administration of short forms derived from PROMIS item banks provides efficient and precise dimensional estimates of clinical outcomes that can be used to monitor patient progress and assess quality improvement.
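As a rough illustration of why adaptive administration can be efficient, the sketch below picks the next item by maximum Fisher information at the current trait estimate. It is a simplification under stated assumptions: it uses a dichotomous two-parameter logistic (2PL) model rather than any model named in this abstract, and the item names and parameters are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing an item under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, bank, administered):
    """Return the not-yet-administered item that is most informative at theta."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))

# Hypothetical bank: item name -> (discrimination a, location b).
bank = {"easy": (1.5, -2.0), "medium": (1.5, 0.0), "hard": (1.5, 2.0)}

first = next_item(0.0, bank, administered=set())      # start at the scale midpoint
second = next_item(1.5, bank, administered={first})   # after a hypothetical upward update
```

An operational CAT would also re-estimate the trait level after each response and apply a stopping rule; both are omitted here.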
Implications for Future Research
Use of the dimensional PROMIS metrics (and co-calibration of the PROMIS item banks with existing PROs) will allow comparisons of mental health and related health outcomes across disorders and studies.
To illustrate how measurement practices can be advanced, using as an example the fatigue item bank (FIB) and its applications (short forms and a computerized adaptive test), which were developed by the NIH Patient-Reported Outcomes Measurement Information System (PROMIS) Cooperative Group.
Psychometric analysis of data collected by an internet survey company using Item Response Theory (IRT) related techniques.
A United States general population representative sample collected via internet.
803 respondents used for dimensionality evaluation of the PROMIS FIB and 14,931 respondents used for item calibrations.
Main Outcome Measures
112 fatigue items developed by the PROMIS fatigue domain working group, 13-item Functional Assessment of Chronic Illness Therapy-Fatigue, and 4-item SF-36 Vitality scale.
The PROMIS FIB version 1, which consists of 95 items, demonstrated acceptable psychometric properties. Computerized adaptive testing (CAT) showed consistently better precision than the short forms. However, all three short forms showed good precision for the majority of participants: more than 95% of the sample could be measured with a reliability greater than 0.9.
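The reliability threshold above can be related to a standard error of measurement. The sketch below is a minimal illustration, assuming the common approximation reliability = 1 - (SE / SD)^2 on a scale with a known standard deviation. The abstract does not state how reliability was computed, and the standard errors shown are hypothetical.

```python
import math

def reliability_from_se(se, scale_sd=1.0):
    """Approximate reliability implied by a standard error (assumed formula)."""
    return 1.0 - (se / scale_sd) ** 2

def se_for_reliability(reliability, scale_sd=1.0):
    """Standard error that corresponds to a target reliability."""
    return scale_sd * math.sqrt(1.0 - reliability)

def share_precisely_measured(standard_errors, threshold=0.9, scale_sd=1.0):
    """Fraction of respondents whose score reliability exceeds the threshold."""
    hits = [se for se in standard_errors
            if reliability_from_se(se, scale_sd) > threshold]
    return len(hits) / len(standard_errors)

# Hypothetical per-respondent standard errors on a scale with SD = 1.
example_se = [0.20, 0.25, 0.30, 0.35, 0.28]
share = share_precisely_measured(example_se)

# Under this approximation, reliability 0.9 means SE of about 0.32 SD units.
cutoff = se_for_reliability(0.9)
```

On a T-score metric with a standard deviation of 10, the same approximation would put the cutoff at a standard error of about 3.2 points.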
Measurement practice can be advanced by using a psychometrically sound measurement tool and its applications. This example shows that CAT and short-forms derived from the PROMIS FIB can reliably estimate fatigue reported by the US general population. Evaluation in clinical populations is warranted before the item bank can be used for clinical trials.
PROMIS; fatigue; CAT; short-form
The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed as one of the first projects funded by the NIH Roadmap for Medical Research Initiative to re-engineer the clinical research enterprise. The primary goal of PROMIS is to build item banks and short forms that measure key health outcome domains manifested in a variety of chronic diseases and that could be used as a “common currency” across research projects. To date, item banks, short forms and computerized adaptive tests (CAT) have been developed for 13 domains with relevance to pediatric and adult subjects. To enable easy delivery of these new instruments, PROMIS built a web-based resource (Assessment Center) for administering CATs and other self-report data, tracking item and instrument development, monitoring accrual, managing data, and storing statistical analysis results. Assessment Center can also be used to deliver custom researcher-developed content, and has numerous features that support both simple and complicated accrual designs (branching, multiple arms, multiple time points, etc.). This paper provides an overview of the development of the PROMIS item banks and details Assessment Center functionality.
The purpose of this study was to examine health-related quality of life (HRQL) among women with metastatic breast cancer treated on E2100 with paclitaxel or paclitaxel plus bevacizumab. Trial participants (N = 670) completed the Functional Assessment of Cancer Therapy-Breast (FACT-B) pre-treatment and following 4 and 8 cycles of treatment to assess HRQL and breast cancer-specific concerns. A significantly higher proportion of missing FACT-B assessments was observed among patients receiving paclitaxel only, due to faster time to death. To account for this non-ignorable pattern of missing data, we conducted a survival-adjusted HRQL analysis by jointly modeling the longitudinal HRQL outcome and time to non-ignorable dropout using a two-stage model. FACT scores assessing HRQL did not differ following 4 and 8 cycles of treatment; however, mean scores on the 9-item Breast Cancer Scale were significantly higher after 4 and 8 cycles of treatment among patients receiving paclitaxel plus bevacizumab. No differences were observed between treatment arms on FACT-B total scores. The addition of bevacizumab was not associated with additional side effect burden from the patient perspective and was associated with a greater reduction in breast cancer-specific concerns. No other differences were noted.
Health-related quality of life; Metastatic breast cancer; Survival-adjusted quality of life; Patient-reported outcomes
To describe the lessons learned in the initial development of PROMIS social function item banks.
Development and testing of two item pools within a general population to create item banks that measure ability-to-participate and satisfaction-with-participation in social activities.
Administration via the Internet.
General population members (N=956) of a national polling organization registry participated; data for 768 and 778 participants were used in the analyses.
Main Outcome Measures
Measures of ability-to-participate and satisfaction-with-participation in social activities.
Fifty-six items measuring ability-to-participate were essentially unidimensional but did not fit an IRT model. As a result, item banks were not developed for these items. Of the 56 items measuring satisfaction-with-participation, 14 items measuring social roles and 12 items measuring discretionary activities were unidimensional and met IRT model assumptions. Two 7-item short forms were also developed.
Four lessons, mostly concerning item content, were learned in the development of banks measuring social function. These lessons led to item revisions that are being tested in subsequent studies.
Quality of life; Rehabilitation
Content validity of patient-reported outcomes (PROs) is evaluated primarily during item development, but subsequent psychometric analyses, particularly for item-response theory (IRT)-derived scales, often result in considerable item pruning and potential loss of content. After selecting items for the PROMIS banks based on psychometric and content considerations, we invited external content expert reviews of the degree to which the initial domain names and definitions represented the calibrated item bank content.
A minimum of four content experts reviewed each item bank and recommended a domain name and definition based on item content. Domain names and definitions then were revealed to the experts who rated how well these names and definitions fit the bank content and provided recommendations for definition revisions.
These reviews indicated that the PROMIS domain names and definitions remained generally representative of bank content following item pruning, but modifications to two domain names and minor to moderate revisions of all domain definitions were needed to optimize fit with the item bank content.
This reevaluation of domain names and definitions following psychometric item pruning, although not previously documented in the literature, appears to be an important procedure for refining conceptual frameworks and further supporting content validity.
content validity; conceptual framework; domain definition; item-response theory
Pain is prevalent among patients with cancer, yet pain management patterns in outpatient oncology are poorly understood.
Patients and Methods
A total of 3,123 ambulatory patients with invasive cancer of the breast, prostate, colon/rectum, or lung were enrolled onto this prospective study regardless of phase of care or stage of disease. At initial assessment and 4 to 5 weeks later, patients completed a 25-item measure of pain, functional interference, and other symptoms. Providers recorded analgesic prescribing. The pain management index was calculated to assess treatment adequacy.
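The abstract does not spell out how the pain management index was computed. The sketch below shows one commonly described formulation, in which a 0 to 3 analgesic level is compared with a 0 to 3 pain level and a negative difference is read as inadequate treatment. The cut points on the 0 to 10 pain rating and the analgesic category labels are assumptions made for illustration, not details taken from this study.

```python
def pain_level(worst_pain):
    """Map a 0-10 worst-pain rating to 0 (none), 1 (mild), 2 (moderate), 3 (severe).

    The cut points (1-4, 5-6, 7-10) are assumed for illustration only.
    """
    if worst_pain == 0:
        return 0
    if worst_pain <= 4:
        return 1
    if worst_pain <= 6:
        return 2
    return 3

# Assumed ordering of the most potent analgesic prescribed.
ANALGESIC_LEVEL = {"none": 0, "non-opioid": 1, "weak opioid": 2, "strong opioid": 3}

def pain_management_index(worst_pain, strongest_analgesic):
    """Analgesic level minus pain level; ranges from -3 to +3."""
    return ANALGESIC_LEVEL[strongest_analgesic] - pain_level(worst_pain)

def is_inadequate(worst_pain, strongest_analgesic):
    """A negative index is read as potentially inadequate analgesic prescribing."""
    return pain_management_index(worst_pain, strongest_analgesic) < 0

# Hypothetical patient: severe pain (8/10) with only a non-opioid prescribed.
example_index = pain_management_index(8, "non-opioid")
```

The index compares prescribing with reported pain intensity only, so it says nothing about dosing or adherence.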
Of the 3,023 patients we identified to be at risk for pain, 2,026 (67%) reported having pain or requiring analgesics at initial assessment; of these 2,026 patients, 670 (33%) were receiving inadequate analgesic prescribing. We found no difference in treatment adequacy between the initial and follow-up visits. Multivariable analysis revealed that the odds of a non-Hispanic white patient having inadequate pain treatment were approximately half those of a minority patient after adjusting for other explanatory variables (odds ratio, 0.51; 95% CI, 0.37 to 0.70; P = .002). Other significant predictors of inadequate pain treatment were having a good performance status, being treated at a minority treatment site, and having nonadvanced disease without concurrent treatment.
Most outpatients with common solid tumors must confront issues related to pain and the use of analgesics. There is significant disparity in pain treatment adequacy, with the odds of undertreatment twice as high for minority patients. These findings persist over 1 month of follow-up, highlighting the complexity of these problems.
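The "twice as high" statement follows from inverting the reported odds ratio, and the percentages follow from the reported counts. The sketch below reproduces that arithmetic from the figures given in the abstract. It assumes a two-group comparison (non-Hispanic white versus minority), so that swapping the reference group amounts to taking reciprocals; the function and variable names are illustrative.

```python
def invert_odds_ratio(odds_ratio, ci_low, ci_high):
    """Re-express an odds ratio and its CI with the comparison groups swapped."""
    # Reciprocals reverse the order of the interval bounds.
    return 1.0 / odds_ratio, 1.0 / ci_high, 1.0 / ci_low

# Reported: OR 0.51 (95% CI, 0.37 to 0.70) for non-Hispanic white vs minority.
or_minority, low, high = invert_odds_ratio(0.51, 0.37, 0.70)

# Reported counts: 3,023 at risk, 2,026 with pain or analgesics, 670 inadequate.
share_with_pain = 2026 / 3023
share_inadequate = 670 / 2026

print(f"minority vs white OR: {or_minority:.2f} ({low:.2f} to {high:.2f})")
print(f"pain or analgesics: {share_with_pain:.0%}; inadequate: {share_inadequate:.0%}")
```

The inverted estimate is close to 2, which matches the wording of the conclusion.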