We describe the development and validation of the PROMIS Sexual Function and Satisfaction (PROMIS SexFS) measures version 1.0 for cancer populations.
To develop a customizable self-report measure of sexual function and satisfaction as part of the U.S. National Institutes of Health PROMIS® Network.
Our multidisciplinary working group followed a comprehensive protocol for developing psychometrically robust patient-reported outcome (PRO) measures, including qualitative (scale development) and quantitative (psychometric evaluation) development. We performed an extensive literature review, conducted 16 focus groups with cancer patients and multiple discussions with clinicians, and evaluated candidate items in cognitive testing with patients. We administered items to 819 cancer patients. Items were calibrated using item response theory and evaluated for reliability and validity.
The PROMIS Sexual Function and Satisfaction (PROMIS SexFS) measures version 1.0 include 79 items in 11 domains: interest in sexual activity, lubrication, vaginal discomfort, erectile function, global satisfaction with sex life, orgasm, anal discomfort, therapeutic aids, sexual activities, interfering factors, and screener questions.
In addition to content validity (patients indicate that items cover important aspects of their experiences) and face validity (patients indicate that items measure sexual function and satisfaction), the measure shows evidence for discriminant validity (domains discriminate between groups expected to be different), convergent validity (strong correlations between scores on PROMIS and scores on conceptually similar older measures of sexual function), and favorable test-retest reliability among people not expected to change (intraclass correlations from 2 administrations of the instrument, 1 month apart).
The PROMIS SexFS offers researchers a reliable and valid set of tools to measure self-reported sexual function and satisfaction among diverse men and women. The measures are customizable; researchers can select the relevant domains and items comprising those domains for their study.
Cancer and other chronic diseases, as well as their treatments, often have adverse effects on sexual function including sexual interest, arousal, orgasm, and comfort during intercourse.1, 2 In addition, chronic diseases can have indirect effects on sexual function by altering relationships and self-image as a result of experiencing pain, fatigue, disfigurement, and dependency.1, 3–5 Tools to assess sexual function and satisfaction in a comprehensive, valid and precise manner are required to develop targeted and effective treatments. A recent review highlighted the critical need for improved patient-reported outcome measures of sexual function.6
The National Institutes of Health’s Patient-Reported Outcomes Measurement Information System® (PROMIS®) Network was created in 2004 to develop standardized, precise, and valid measures of patient-reported outcomes for use in research across chronic diseases.7 Unique to the PROMIS initiative is the extensive, comprehensive development of its item banks, i.e., a large set of questions and response options (items) for each measured domain. PROMIS item banks undergo qualitative evaluation using focus groups, cognitive testing, and psychometric evaluation incorporating item response theory (IRT). Content represents substantial input from patients, clinicians, and survey methodologists, making the PROMIS measures both clinically relevant and valid for research. Finally, PROMIS item banks provide an unusual degree of flexibility to researchers, allowing them to select relevant items to accurately measure the domains of interest for their target population. In 2004, the National Cancer Institute identified sexual function as highly relevant for cancer survivors and provided support for creating a new measure of sexual function through the PROMIS network.
We describe the development and validation of the PROMIS Sexual Function and Satisfaction (PROMIS SexFS) measures version 1.0 for cancer populations. Our goal was to develop a customizable self-report set of measures that was a) comprehensive in scope, b) flexible to meet diverse research needs, c) broadly applicable with respect to age, gender, sexual orientation, partner status, and literacy level, and d) disease-neutral yet useful across a wide spectrum of diseases with symptoms that potentially interfere with functioning.
Our multidisciplinary working group followed the comprehensive PROMIS protocol for developing psychometrically robust patient-reported outcome (PRO) measures.8, 9 The qualitative steps (scale development) are reported in detail elsewhere6, 10, 11 and described in brief here. The quantitative steps (psychometric evaluation) are described in detail here. All patient participants provided informed consent. The Institutional Review Board of the XXX Health System approved this study.
Using a consensus-driven approach, we conducted a literature search for articles published from 1991–2007 that reported the administration of a self-reported measure of sexual function in a cancer population; see Jeffery 2009 for detail.6 Based on this literature review, we developed a preliminary conceptual model to reflect domains to be included in the measures: interest in sexual activity, lubrication, vaginal discomfort, erectile function, orgasm, anal discomfort, frequency of sexual activity, and sexual satisfaction. We categorized more than 1100 items from existing measures into these domains and selected ~50 clinically relevant items for further testing.
Item banks are dynamic and can incorporate extant items on the same scoring metric. We incorporated into the PROMIS SexFS some items that are publicly available or for which the copyright holders granted permission. Thus, some PROMIS SexFS instruments include modified items from other sexual function instruments (e.g., UCLA-Prostate Cancer Index,12 Female Sexual Function Index [FSFI]13).
We conducted 16 focus groups with 109 patients to establish the content validity of the PROMIS SexFS measures.10 Patient focus groups focused on physical and psychosocial impacts of cancer on sexual function and intimate relationships; see Flynn 2011 for detail. We also discussed the clinical relevance of the proposed conceptual model with oncology researchers and clinicians and sought their views on how cancer and its treatment affected patients’ sexual health. We published our working conceptual measurement model10 and provide an adapted version to show the domains covered and how they may relate to each other (Figure). We wrote ~40 new items based on the results of these groups/discussions and added domains describing symptoms and other factors that interfere with sexual satisfaction.
To evaluate face validity, we tested items in cognitive interviews. Each of 83 candidate items was evaluated for clarity, sensitivity, and relevance by 5 or more participants, at least 2 of whom had less than a 9th grade education or reading level as tested by the Wide Range Achievement Test 4;14 see Fortune-Greeley 2009 for detail.11 The 39 participants (16 low literacy) were diverse with respect to sex and race as well as cancer type and stage. Items were revised based on results of the interviews. Substantially revised items were retested in additional interviews.
The team responsible for translating all PROMIS measures reviewed all items and minor changes were made to reduce potential difficulty with translation to non-English languages.15 We also convened 7 additional clinical and academic experts on sexual function and cancer to review the conceptual model and item development to date.
Patients who participated in item testing were recruited from the Duke University tumor registry (mailed invitation, 56% of sample), Duke’s private diagnostic oncology clinics (in person, 14% of sample), as well as the NexCura internet panel (emailed invitation, 30% of sample). For patients recruited through NexCura, we verified cancer diagnosis and treatment status with their treating physician. We also targeted recruitment of additional lesbian, gay, and bisexual cancer patients and survivors through online communities, though this strategy yielded few participants (n=14). Patients were eligible if they were 18 years of age or older, had been diagnosed with cancer, and were able to speak English. We aimed for a diverse sample regarding sex, race, tumor site, and whether the person was undergoing active treatment for cancer or in post-treatment follow-up (Table 1). The initial target sample comprised ~250 men and ~250 women, but roughly 40% of this sample had not been sexually active in the past 30 days. Thus, we recruited a supplemental sample of ~300 sexually active cancer patients and survivors from the same sources. Participants completed 2 screener items, 79 PROMIS SexFS items, and 49 items from other commonly used measures of sexual function (FSFI, UCLA Prostate Cancer Index, Medical Outcomes Study Sexual Problems Survey,16 International Index of Erectile Function [IIEF]17).
We performed a psychometric evaluation of the SexFS domains of: Interest in Sexual Activity, Lubrication [women], Vaginal Discomfort [women], Erectile Function [men], Orgasm, Anal Discomfort, and Global Satisfaction with Sex Life using the established PROMIS methodology based on classical test theory and IRT.9 We first examined the distribution of item responses and examined item-total correlations within the same SexFS domain. Multiple statistical software programs were used, including Mplus (version 5.21, Muthen & Muthen) and MULTILOG (ver 7.03, Scientific Software International) as noted below, and SAS software 9.2 (SAS Institute Inc., Cary, NC, USA) for other analyses. Psychometric analyses were not conducted for the subdomains not thought to comprise latent variables (Sexual Activities, Interfering Factors, Therapeutic Aids, and Screener Items).
After verifying that the distribution of item responses and the item-total correlations within each SexFS domain were adequate, we conducted confirmatory factor analyses (CFA) in Mplus to supply statistics and factor score estimates for evaluating the assumptions applied in IRT modeling, namely unidimensionality, local independence, and monotonicity. Unidimensionality means that the items measure a common underlying aspect of functioning. This was tested by examining the fit of a single-factor CFA model for ordinal categorical item responses.18 For this, multiple indices were computed, including the comparative fit index (criterion for good fit is >0.95), Tucker-Lewis index (>0.95 for good fit), and root mean square error of approximation (<0.06 for good fit). When CFA showed a poor fit, exploratory factor analysis (EFA) was used to explore alternative models. Local independence requires that there are no significant associations among item responses once the dominant factor influencing a person’s response to an item is controlled. Item pairs with high residual correlations (>0.2) were flagged as possible cases of local dependence. Monotonicity assumes that the probability of endorsing an item response indicative of better health status should increase as the underlying level of health increases. For each item, we examined the relationship between the item score and the factor score graphically and via a polyserial correlation.
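The screening rules in this paragraph can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs, not the Mplus workflow used in the study: the function names and the example values are hypothetical, and applying the 0.2 cutoff to the absolute value of the residual correlation is an assumption.

```python
import numpy as np

# Cutoffs quoted in the text; all names below are illustrative.
CFI_MIN, TLI_MIN, RMSEA_MAX, RESID_MAX = 0.95, 0.95, 0.06, 0.2

def good_fit(cfi, tli, rmsea):
    """Return a pass/fail flag for each fit index against the stated criteria."""
    return {"cfi": cfi > CFI_MIN, "tli": tli > TLI_MIN, "rmsea": rmsea < RMSEA_MAX}

def flag_local_dependence(resid, cutoff=RESID_MAX):
    """Return (i, j) item pairs whose residual correlation exceeds the cutoff.

    Assumption: the cutoff is applied to the absolute value of the residual.
    Only the upper triangle is scanned, so each pair is reported once.
    """
    resid = np.asarray(resid, dtype=float)
    rows, cols = np.triu_indices_from(resid, k=1)
    mask = np.abs(resid[rows, cols]) > cutoff
    return [(int(i), int(j)) for i, j in zip(rows[mask], cols[mask])]

# Hypothetical values for a three-item domain.
fit = good_fit(cfi=0.97, tli=0.96, rmsea=0.09)
pairs = flag_local_dependence([[0.0, 0.25, 0.05],
                               [0.25, 0.0, -0.10],
                               [0.05, -0.10, 0.0]])
print(fit, pairs)
```

In this made-up case the comparative fit and Tucker-Lewis indices pass while the RMSEA does not, and only the first item pair is flagged for possible local dependence.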
IRT modeling was used to evaluate the psychometric properties of each item within a scale and to calibrate the metric for the items, i.e., to generate statistical parameters that allow translations of a person’s item responses into an estimate of that person’s underlying functioning. The probability of choosing each item-response category is modeled as a function of the underlying latent trait, denoted by theta (θ). Samejima’s Graded Response Model (GRM) was chosen as the primary IRT model, following PROMIS conventions.19, 20 In the GRM, each item is assigned a set of parameter estimates: a slope parameter for discrimination and a number of between-category thresholds. Item characteristic curves, as well as item and test information curves, were produced and examined to ensure adequate properties of items and measures. Scores for each respondent were calculated based on their responses to each of the items within the SexFS item bank and rescaled to a T distribution (mean = 50, standard deviation = 10). IRT analysis was performed by using MULTILOG.
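As a rough illustration of how the GRM turns a slope and a set of thresholds into category probabilities, the following Python sketch uses the standard logistic form of the model. The parameter values are invented for the example and are not PROMIS SexFS calibrations. The T-score conversion assumes that θ is standardized (mean 0, standard deviation 1) in the calibration sample.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait value; a: slope (discrimination);
    thresholds: ordered between-category thresholds b_1 < ... < b_{m-1}.
    Returns an array of m probabilities that sum to 1.
    """
    b = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(X >= k) for k = 1..m-1.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= 0) = 1 and P(X >= m) = 0, then difference.
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]

def theta_to_t(theta):
    """Rescale a standardized theta to the T metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta

# Hypothetical item with four response categories.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print(probs, probs.sum(), theta_to_t(0.5))
```

A larger slope makes the category probabilities change more sharply around each threshold, which is why the slope is read as the item's discrimination.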
Differential item functioning (DIF) occurs when people from different groups with the same latent trait have a different probability of giving a certain response on a question.21 Differences between groups (e.g., males and females) in individual item behavior were assessed via an analysis of DIF. We tested DIF for males vs females (on domains that both men and women answered, including Global Satisfaction with Sex Life and Interest in Sexual Activity) and web- versus phone-based mode of administration (on all domains). We used ordinal logistic regression to examine DIF at the item level. An item was classified as displaying DIF if (1) the likelihood-ratio chi-square test in logistic regression had a p-value ≤ 0.01, and (2) the Zumbo-Thomas effect size, which is the increment in R-square, was ≥ 0.130.22
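The two-part classification rule can be written as a small Python predicate. This is only a sketch of the decision rule stated above: the function name and example inputs are hypothetical, and it takes the likelihood-ratio p-value and the R-square increment as already computed by the ordinal logistic regression.

```python
def shows_dif(lr_p_value, delta_r2, alpha=0.01, effect_cutoff=0.130):
    """Flag an item for DIF only if both criteria from the text are met:
    a significant likelihood-ratio test and a large enough R-square increment."""
    return lr_p_value <= alpha and delta_r2 >= effect_cutoff

# Hypothetical items: a significant test with a small effect is not flagged.
print(shows_dif(lr_p_value=0.0005, delta_r2=0.02))   # False
print(shows_dif(lr_p_value=0.0005, delta_r2=0.15))   # True
```

Requiring both conditions keeps statistically significant but practically negligible differences from being classified as DIF.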
We also used multiple-group analysis in Mplus to examine measurement invariance at the domain level. Because our sample size for some DIF analyses was smaller than typically desired, we also relied on graphical methods, producing item-predicted factor score mean curves to visually verify the magnitude of the DIF and whether non-uniform DIF exists. Intra-class correlation coefficients were calculated to assess the agreement between factor scores generated from group-specific and common IRT parameter estimates.
To establish concurrent validity, we first examined Pearson correlation coefficients between subdomains of the PROMIS SexFS and other widely used measures of conceptually similar constructs, namely the 19-item FSFI13 and the 15-item IIEF.17 Second, we examined whether scores on selected subdomains of the PROMIS SexFS could discriminate between groups that should, in theory, differ in terms of their sexual experiences. During item testing, participants were asked whether they had ever asked an oncology professional about sexual problems. We hypothesized that asking for help with sexual problems may indicate a clinically meaningful decrement in function.
We calculated Cronbach’s alpha for men and women separately to demonstrate internal consistency. To examine the validity of the 30-day recall period and test-retest reliability, we conducted a separate 30-day diary study in 202 men and women for whom we did not expect changes over the month.23 To express the consistency of scores over time, we report intraclass correlation coefficients (ICCs) and their corresponding 95% confidence intervals (CIs) for two administrations of the measure, one month apart.
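For readers unfamiliar with the internal-consistency statistic, a minimal Python sketch of Cronbach’s alpha follows. It uses the standard formula based on item variances and the variance of the summed score. The function name and toy data are illustrative and do not come from the study.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]                              # number of items
    item_var = x.var(axis=0, ddof=1).sum()      # sum of sample item variances
    total_var = x.sum(axis=1).var(ddof=1)       # sample variance of summed scores
    return k / (k - 1) * (1.0 - item_var / total_var)

# Toy data: 5 respondents answering 3 items on a 1-5 scale.
toy = [[1, 2, 1],
       [2, 2, 3],
       [3, 4, 3],
       [4, 4, 5],
       [5, 5, 4]]
print(round(cronbach_alpha(toy), 3))
```

Alpha approaches 1 as items within a domain rise and fall together across respondents. In the study it would be computed separately for men and women, as described above.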
A summary of fit statistics is shown in Table 2. We expect future data collection efforts will allow for calibration of the Orgasm and Anal Discomfort subdomains, but there were not enough responses (Anal Discomfort) or items (Orgasm) to support calibration at present. For the other domains, all calibrated items satisfied the basic assumptions for adequate distribution (none displayed sparseness or non-monotonicity, and higher item scores corresponded to higher scale sum-scores). All calibrated item banks had high comparative fit indices, supporting unidimensional models, though some lack of fit was suggested by the RMSEA. With rare exception, items had factor loadings and polyserial correlations of 0.8 or greater. Item properties (from item calibration) are available online.24
For DIF analysis by survey mode (phone versus internet), there was mode equivalence (i.e., no DIF) between phone and internet surveys at both the domain- and item-level; however, the multiple group models failed to converge for the Vaginal Discomfort and Erectile Function domains due to inadequate sample size. For Satisfaction, many of the likelihood-ratio tests from ordinal logistic regression were significant for DIF (p < 0.001); however, the effect sizes were small and did not reach the pre-specified 0.130 cut-point, tempering the finding of differences in item response by gender. Multiple-group and graphical analyses confirmed the lack of noticeable DIF. The intraclass correlation between the factor scores predicted by the common and sex-specific IRT models was greater than 0.99, which reflected an almost perfect psychometric agreement between males and females. For Interest in Sexual Activity, none of the items displayed significant DIF by the ordinal logistic regression method, though the multiple-group analysis suggested possible DIF. Graphical analyses did not reveal any substantial DIF, though the ICC between the factor scores predicted by the common and gender-specific IRT models was 0.88. This ambiguity regarding gender-related DIF for Interest in Sexual Activity suggests the need for more research.
Calibrated subdomain scores are expressed as T scores (mean = 50, standard deviation = 10). A T score of 50 corresponds to the mean response among the cancer survivors used for item testing. Higher scores correspond to higher levels of the item or domain.
The items in the PROMIS SexFS item banks are not all intended to be administered together; rather, researchers have the flexibility to select the items that are relevant to their specific sample. For the calibrated scales, if one or more items from within that instrument are administered, a respondent’s score will be calculated using IRT parameters (either through the PROMIS Assessment Center or look-up tables provided in the user manual). For the 6 non-calibrated item banks (e.g., Interfering Factors), the items within those instruments are not combined in any way to create a score. Each item in these instruments measures a very specific construct corresponding only to that item (e.g., how much has fatigue affected satisfaction with sex life). For any given item in these uncalibrated instruments, the researcher can use the raw item responses directly for analyses.
In addition to the full banks, from which researchers can select items, we produced a brief “off the shelf” measure, the PROMIS Sexual Function and Satisfaction Brief Profile. It includes 1 to 3 of the best general-purpose items from each of the key domains of sexual function (8 total items for men, 10 for women).
In general, the correlations between subdomains of the PROMIS SexFS and the corresponding subdomains of the FSFI and IIEF provide strong evidence for the construct validity of the PROMIS SexFS and the brief profile measures (Table 3). Additionally, patients who had asked a provider for help with sexual problems were significantly different from those who had not asked and in the directions we expected (Table 4). Askers had significantly greater interest in sexual activity, increased vaginal discomfort, and lower levels of erectile function, lubrication, orgasm, and overall satisfaction. These effect sizes were greater than or equal to the effects for the corresponding subscales of the FSFI and IIEF. In three cases, the PROMIS SexFS and brief profiles detected statistically significant (p<.05) differences between those who did and did not ask, whereas the FSFI or IIEF did not.
Estimates of internal consistency (Cronbach’s alpha) were calculated separately for men and women and were high, ranging from 0.87–0.95 (Table 5). Test-retest reliability was also favorable, based on intraclass correlation coefficients (ranging from 0.71–0.87).
The PROMIS SexFS is a customizable self-reported measure of sexual function with demonstrated validity for use in cancer populations. Version 1.0 consists of 11 domains. It has 79 items in 5 calibrated scales (Interest in Sexual Activity, Lubrication [women], Vaginal Discomfort [women], Erectile Function [men], Global Satisfaction with Sex Life) and 6 collections of stand-alone items (Sexual Activities, Orgasm, Interfering Factors, Therapeutic Aids, Anal Discomfort, Screener Items). The PROMIS SexFS instruments are available for download on the Assessment Center™ website (http://assessmentcenter.net/).
The PROMIS SexFS offers researchers an outcome measure for use with men and women with common domains where possible (e.g., Satisfaction). It allows researchers to record frequencies of specific sexual activities but does not generally refer to specific activities when measuring function, making it neutral to the respondent’s partner status or sexual orientation. It offers a group of items (Interfering Factors) to capture reasons for respondent dissatisfaction with sex life. This greater conceptual precision in understanding the problems experienced by individuals could lead to more focused interventions.
While it was developed primarily as an outcome measure, the PROMIS SexFS may offer benefit when used in clinical practice. For instance, screening patients for sexual problems or tracking the trajectory of patients’ sexual activities, function, satisfaction, and/or use of therapeutic aids could help practitioners identify patients who may be in need of intervention or treatment. Administering this measure may also provide psycho-educational value by helping patients to rate and track their sexual satisfaction and to identify specific aspects of sexual function that they wish to improve in order to increase the quality of their sexual lives. Finally, clinicians may find that offering their patients the opportunity to report on their sexual function and satisfaction in a confidential manner will facilitate conversations with patients about this sensitive topic.
Most PROMIS SexFS items are not specific to cancer but have thus far only been validated in cancer samples. However, we designed the measure to accommodate various health states or conditions that could affect sexual function. Ongoing research is testing the PROMIS SexFS measure in other targeted groups (e.g., adults with diabetes, heart disease, anxiety, or depression; adults older than 65; lesbian, gay, or bisexual adults; and a nationally representative sample of U.S. adults). We are also testing the expansion of the existing domains (e.g., Orgasm) and the addition of new domains (e.g., subjective arousal, oral discomfort, and vulvar discomfort).
The PROMIS SexFS measures version 1.0 offer researchers a reliable and valid tool to measure self-reported sexual function and satisfaction among diverse men and women with cancer. The measure is customizable in that researchers can select the relevant SexFS domains and items comprising those domains for their study. The measures are comprehensive in scope, covering both physical and psychological components. They are broadly applicable with respect to age, gender, sexual orientation, partner status, and literacy level. Finally, they are disease-neutral yet also able to capture relevant symptoms of cancer and its treatment that interfere with sexual satisfaction. These features should enhance our ability to describe and intervene on specific aspects of the sexual function and satisfaction of patients with cancer.
This work was funded by grant U01AR052186 from the National Institute of Arthritis and Musculoskeletal and Skin Diseases, with additional support from the National Cancer Institute.
Conflict of Interest: None