Extending scientific research outside the clinical setting in SLE remains a challenging task. The relative rarity and complexity of the disease remain barriers for population studies, as does the lack of suitable case finding and disease assessment tools. Although measures of general health, such as the SF-36, provide some insight into disease status in epidemiologic studies, more specific tools have the potential to better detect health-related outcome changes. The development and validation of patient-reported instruments hold promise in addressing this gap. In this study, we report our methods for developing and performing an initial validation study of a patient-reported instrument, the BILD, designed to assess disease damage. Our findings suggest that the BILD is acceptable to respondents, is efficient to administer, and has content, criterion and construct validity.
We designed the BILD to capture the overall concept of damage in SLE for epidemiological research. It is important to note that it is not a direct substitute for the SDI, since the BILD omits many items from the SDI that were either not suitable for patient self-report or were not informative because of their frequent reporting by patients. Instead, among a group of patients (rather than in any individual patient), the BILD is able to differentiate between those with high or low degrees of SLE damage. We found good agreement between items in the BILD and the corresponding items in the SDI, and an overall moderately high correlation between the two measures (0.64), suggesting criterion validity. In addition, through both pilot testing and administration of the instrument to over 700 individuals with SLE, we found that the instrument was acceptable to patients, as evidenced by the very high individual item response rate.
To ascertain construct validity, we compared patients in the four quartiles of BILD scores on measures found in previous literature to relate to disease damage as measured by the SDI. Consistent with studies of the SDI, we found that BILD scores in the LOS were higher among older individuals, those with longer disease duration, and those living below the federal poverty level (9). Some previous studies have suggested greater damage among certain racial/ethnic minority groups (22), but many have not, once poverty was accounted for (27); we also did not find a statistically significant association between damage and race/ethnicity in our multivariable analyses. As expected, those with a higher mean disease activity score over a four-year period had higher damage scores (24). Higher BILD scores were also associated with worse self-rated health, a lower SF-36 physical component score, work disability and employment (6). Finally, individuals with higher BILD scores had significantly greater health care utilization, including a greater number of hospitalizations and physician visits over the last four years. This is consistent with health care utilization studies involving the SDI (8).
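The quartile comparison described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names `quartile_index` and `mean_by_quartile` are hypothetical, the cut points use the default method of Python's `statistics.quantiles`, and any scores passed in would be placeholders rather than study data.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def quartile_index(scores):
    """Assign each score to a quartile group 0..3 (hypothetical helper).

    Cut points come from statistics.quantiles with its default method;
    a score equal to a cut point is placed in the upper group.
    """
    cuts = quantiles(scores, n=4)  # three cut points
    return [bisect_right(cuts, s) for s in scores]


def mean_by_quartile(scores, outcome):
    """Mean of an outcome measure within each score quartile."""
    groups = {q: [] for q in range(4)}
    for q, value in zip(quartile_index(scores), outcome):
        groups[q].append(value)
    # Skip empty groups, which can occur when many scores are tied.
    return {q: mean(values) for q, values in groups.items() if values}
```

Because damage scores are small integers with many ties, quartile groups built this way can be uneven in size; a real analysis would need to state how ties at the cut points were handled.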
A written survey to assess patient-reported damage, the LDIQ, was developed concurrently with our effort to develop and test the BILD for telephone or interviewer administration. LDIQ investigators allowed us to simultaneously test the criterion validity of that instrument in our clinic sample, providing a second U.S. validation for that instrument. We found that both the LDIQ and BILD correlated acceptably with the SDI (r_s for LDIQ = 0.54, r_s for BILD = 0.64). The correlation of the LDIQ and SDI in our study sample was similar to the published LDIQ criterion validity assessments performed in the United States (r_s = 0.48). Important differences between the BILD and LDIQ include the mode of administration (LDIQ is a written survey, BILD is designed for administration by an interviewer in-person or on the telephone) and length (LDIQ has 56 items, the final BILD instrument has 26 items). Given four large international patient samples, criterion validity testing for the LDIQ has been significantly more extensive; the BILD will require further testing to confirm criterion validity in larger, independent samples. Construct validity testing for the two instruments has been comparable in two large community-based samples (the National Databank of Rheumatic Diseases for the LDIQ and the Lupus Outcomes Study for the BILD) (18).
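For readers less familiar with rank-based criterion validity coefficients, the following minimal sketch shows how a rank correlation between a patient-reported score and a physician-assessed score could be computed. It assumes that the reported coefficients are Spearman rank correlations; the helper names `average_ranks` and `spearman` are hypothetical, and the sketch is an illustration rather than the code used in this study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Calling `spearman(patient_scores, physician_scores)` on two equal-length lists of paired scores returns a value between -1 and 1. Average ranks are used because damage scores are discrete and frequently tied.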
Although the analyses presented here support the content, criterion and construct validity of the BILD, it is important to note that characterization of the other psychometric properties of the instrument will require further research. For example, we did not assess the reliability of the BILD (either test-retest or inter-interviewer reliability). Assessment of external validity in an independent sample with different sociodemographic or clinical characteristics should also be performed. The clinic-based sample used to assess criterion validity and the LOS sample differed significantly in race/ethnicity, disease duration and age. Theoretically, the correlation between the BILD and the physician-assessed SDI could differ among these subgroups, and future studies of criterion validity with a larger, more heterogeneous sample should investigate this possibility. Finally, an important strength of the SDI is its association with significant long-term clinical outcomes, such as mortality. It remains to be seen whether either of the two newly developed patient-reported measures of disease damage will have similar predictive validity.
In summary, we have developed and performed a preliminary validation study of a new patient-reported instrument of disease damage in SLE. The BILD, which is designed for telephone or interviewer administration, had content, criterion and construct validity in this study. Although further studies are needed to examine its reliability and to document its psychometric properties in populations with different sociodemographic or clinical characteristics, the BILD appears to represent a promising tool for studies of SLE outside the clinical setting.