Test forms containing PROMIS pediatric pain items were completed by a total of 3,048 respondents. The sample was about 52% female, and 58% of the children were between 8 and 12 years old. Sixty percent were Caucasian, 21% Black, 6% multi-racial, and 13% other races (Asian/Pacific Islanders, Native Americans, and other races). Eighteen percent of the sample was of Hispanic ethnicity. The vast majority of the adults providing informed consent for the children were parents (92%) or grandparents (4%) of the child. The educational attainment of these parents or guardians ranged from less than high school (8%) to an advanced degree (13%), with 25% reporting a college degree, 33% some college, and 21% a high school diploma. Approximately 23% of the children participating in the survey had a chronic illness diagnosis during the past 6 months. Participant characteristics are summarized in .
There were adequate numbers of pain items on each of the four forms to permit a separate factor analysis of each form. The two tables of factor loadings and error covariances (one for Forms 1, 2, and 4, and one for Form 3) report the loadings from models that fit well. The models indicate that the items on each form are generally unidimensional, though with some evidence of local dependence. Local dependence, or nuisance multidimensionality, is modeled in Forms 1, 2, and 4 by error covariances (in this case between two items, or “doublets”). Form 3 contains three items (a “triplet”) pertaining to the physical limitations caused by pain; the triplet was therefore modeled as a second factor (with a correlation between the general pain interference factor and the “difficulty moving” subfactor). Goodness-of-fit indicators suggest that all four models fit the data well, using the indices suggested by Reeve et al. [29]: for Form 1, χ²(7) = 9, CFI = 1.00, TLI = 1.00, RMSEA = 0.02; for Form 2, χ²(12) = 10, CFI = 1.00, TLI = 1.00, RMSEA = 0.00; for Form 3, χ²(10) = 8, CFI = 1.00, TLI = 1.00, RMSEA = 0.00; and for Form 4, χ²(13) = 21, CFI = 1.00, TLI = 1.00, RMSEA = 0.03.
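As a rough plausibility check on the reported fit indices, the RMSEA point estimate can be recomputed from each χ² value and its degrees of freedom. The sketch below uses the common formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). The per-form sample size is not stated in this section, so `ASSUMED_N = 762` (3,048 respondents split evenly over four forms) is purely a hypothetical value, and the estimator used in the original analysis may apply a different adjustment.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate from a model chi-square, its df, and sample size n."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square values and degrees of freedom as reported (rounded) in the text.
reported = {
    "Form 1": (9, 7),
    "Form 2": (10, 12),
    "Form 3": (8, 10),
    "Form 4": (21, 13),
}

# ASSUMPTION: hypothetical per-form sample size; not given in the source.
ASSUMED_N = 762

rmsea_by_form = {
    form: rmsea(chi2, df, ASSUMED_N) for form, (chi2, df) in reported.items()
}

for form, value in rmsea_by_form.items():
    print(f"{form}: RMSEA = {value:.2f}")
```

Under that assumed sample size the recomputed values round to the reported 0.02, 0.00, 0.00, and 0.03. Whenever χ² is smaller than its degrees of freedom (Forms 2 and 3), the estimate is truncated at zero regardless of N.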
Factor Loadings and Error Covariances for Pain Interference Items on Forms 1, 2, and 4
Factor Loadings and Error Covariances for Pain Interference Items on Form 3.
The local dependence in Forms 1 through 4 occurs primarily because items share similar wording or have shared content that differs from the content of the scale’s other items. As an example of shared item content, Form 3 contains a “triplet,” or 3 items with responses that are more related than expected given the items’ relationship with the pain interference dimension. In this case the triplet measures physical limitations caused by pain. In other instances, local dependence may result from shared content or from the response scale used. Form 2 contains two items measuring pain intensity on a 0 to 10 scale. In addition to being similarly worded and assessed on a unique response scale, these items measure pain intensity, while the scale’s other items assess interference with daily activities caused by pain. To ensure unidimensionality of the final scales, only one item from each doublet or triplet was included in the final item pool.
Following the factor analyses, locally independent sets of items from Forms 1 through 4 were calibrated using the graded response model (GRM). To control for the local dependence identified in the item factor analyses, separate item calibrations were completed for each collection of unidimensional items. This process resulted in two sets of calibrations for each form (three in the case of Form 3). To avoid capitalization on chance, we conservatively selected, for each item, the parameter estimates from the calibration with the lower estimated slope. The item parameter table shows the item parameter estimates, item fit statistics (S-X²), and DIF statistics (LR X²) for the items comprising the final pool (sorted in order of magnitude of slope parameters), and for the items set aside.
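For readers less familiar with the GRM, the following minimal sketch shows how a slope and a set of ordered thresholds translate into category probabilities and an expected item score on a 0-4 response scale. The function names and the example parameter values are hypothetical and are not taken from the calibrated item pool.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities under the graded response model.

    `thresholds` must be increasing; m thresholds give m + 1 categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = m + 1).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def expected_score(theta, slope, thresholds):
    """Expected item response (0..m) at a given level of the latent variable."""
    probs = grm_category_probs(theta, slope, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# HYPOTHETICAL item: slope 2.0 and four thresholds (five response categories).
example_slope = 2.0
example_thresholds = [0.0, 0.8, 1.6, 2.4]

probs = grm_category_probs(1.0, example_slope, example_thresholds)
print([round(p, 3) for p in probs])
print(round(expected_score(1.0, example_slope, example_thresholds), 3))
```

A larger slope makes the cumulative curves steeper, so the item separates respondents more sharply near its thresholds. Choosing the lower of two estimated slopes is conservative because it attributes less precision to the item.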
Item Parameters, Fit Indices, and DIF Statistics for the Pain Interference Items
The Benjamini-Hochberg correction for multiplicity was used with the fit and DIF statistics. Two items had either significant DIF or lack of fit as indicated by the S-X² statistic; however, these items were retained when considered in relation to the relatively good fit of the items comprising the final pool. As indicated in the item parameter table, there were 15 items set aside. Five were set aside from locally dependent item sets. An additional five were set aside due to low discrimination parameters. Interestingly, these items measured pain intensity, and as such discriminate poorly between levels of pain interference. Finally, four items were set aside for DIF (both threshold and slope DIF). As an interpretive example of threshold DIF, boys were less willing to endorse the item “It was hard to do sports or exercise when I had pain,” after controlling for mean and variance differences between boys and girls. Additionally, slope DIF occurred for the item “I felt grumpy when I had pain,” indicating that “feeling grumpy” is a poor indicator of pain interference for boys. The remaining 13 items comprise the final pain item pool.
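The Benjamini-Hochberg step-up procedure mentioned above can be sketched as follows. The false discovery rate level (`alpha=0.05`) and the p-values in the example are assumptions chosen for illustration; the source does not report the level used or the individual p-values.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is significant at FDR `alpha`."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is declared significant.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# HYPOTHETICAL p-values, e.g. from item fit or DIF tests.
example_p = [0.02, 0.021, 0.9]
flags = benjamini_hochberg(example_p)
print(flags)
```

In this example the smallest p-value (0.02) exceeds its own rank threshold (0.05/3), yet it is still flagged because the second-ranked p-value (0.021) falls below its threshold (0.10/3). That step-up behavior is what distinguishes the procedure from a simple per-test cutoff.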
In the analysis of DIF by age, five of the 13 items in the pool exhibited significant DIF. For three of those items, the aggregate effect size of the DIF is very small: For the items “It was hard for me to pay attention when I had pain,” “I had trouble doing schoolwork when I had pain,” and “I felt angry when I had pain,” the difference between older and younger children in the expected value of the item response on the 0-4 scale is much less than a half point across the entire range of the latent variable pain interference. To a large extent, the tendency is for those three items to be slightly more discriminating for older than younger children. For the item “It was hard to remember things when I had pain,” younger children tend to give slightly higher responses than older children; the difference, which varies as a function of the latent variable, is around a half point on the 0-4 scale. For “It was hard to get along with other people when I had pain,” older children tend to select slightly higher responses than younger children (again, the difference is a fraction of a point on the 0-4 scale, and is only observed for respondents at high levels of pain interference).
The accompanying figure shows test information functions for the pain item pool and four potential short forms on a T-score scale with a mean of 50 and standard deviation of 10 (the metric on which all PROMIS scales are reported). Test information is the expected value of the inverse of the squared standard error of measurement, and indicates the precision of scores on a scaled metric. A standard error of measurement of approximately 0.32 (on a standardized metric, or 3.2 on a T-score metric) is associated with a test information value of 10 and hence a reliability coefficient of approximately 0.90. Three 8-item short forms provide test information greater than 10 for scores between approximately 45 and 70 on the T-score scale. The recommended 8-item short form in the Appendix contains the item set that provides the maximum test information at the mean (50) on the T-score metric. However, if more score precision is required (or “broader” precision), the complete item pool is contained in the item parameter table and may be used to compute IRT response pattern scores or IRT-scaled scores from summed scores.
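The arithmetic linking test information, standard error of measurement, and reliability can be made explicit with a short sketch. It assumes the latent variable is standardized (variance 1), so that reliability is approximately 1 minus the squared standard error; the function names are illustrative only.

```python
import math


def sem_from_information(information):
    """Standard error of measurement on the standardized (theta) metric."""
    return 1.0 / math.sqrt(information)


def reliability_from_information(information):
    """Approximate reliability, ASSUMING a latent variance of 1."""
    return 1.0 - 1.0 / information


def theta_to_t_score(theta):
    """Convert a standardized score to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


info = 10.0
sem_theta = sem_from_information(info)             # about 0.32
sem_t = 10.0 * sem_theta                           # about 3.2 T-score points
reliability = reliability_from_information(info)   # about 0.90

print(f"SEM (theta) = {sem_theta:.2f}")
print(f"SEM (T-score) = {sem_t:.1f}")
print(f"reliability = {reliability:.2f}")
```

Because the standard error shrinks with the square root of information, doubling test information from 10 to 20 only reduces the T-score standard error from about 3.2 to about 2.2.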
Test information functions for Pediatric Pain Interference Scale.
The test information figure also serves as a simulated Computer Adaptive Test (CAT). A CAT selects items based on an individual’s responses to previous items. As such, a CAT can theoretically choose the most informative items for an individual depending on their level of the trait being measured, in this case, pain interference. For this simulation, separate test information functions are computed from the 8 items that provide the most information at five possible score locations (30, 40, 50, 60, and 70 on the T-score metric). In other words, the items used to generate the test information function at T = 50 are those that a perfect CAT would select for an individual at the mean of pain interference. To consider the usefulness of CAT given these items, one may compare both the range and the magnitude of score precision across the separate potential short forms. In this case, because the items in the final pool generally discriminate in the same range, little score precision is gained from one potential short form to another. However, the PROMIS Assessment Center contains the item pool and is capable of administering these items as a CAT if the researcher desires to do so.
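The selection step of this simulation can be sketched as follows: for a target T-score, compute each item's GRM information at the corresponding standardized score and keep the 8 most informative items. The 13-item bank below is generated from made-up slopes and thresholds purely for illustration; it does not reproduce the calibrated parameters, and the function names are hypothetical.

```python
import math


def grm_item_information(theta, slope, thresholds):
    """Fisher information of one graded response model item at `theta`."""
    # Cumulative curves P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cum.append(0.0)
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]  # category probability
        # Derivative of the category probability with respect to theta.
        d_k = slope * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p_k > 0.0:
            info += d_k ** 2 / p_k
    return info


def most_informative_items(item_bank, t_score, n_items=8):
    """Names of the `n_items` items with the most information at `t_score`."""
    theta = (t_score - 50.0) / 10.0  # T-score metric -> standardized metric
    ranked = sorted(
        item_bank,
        key=lambda name: grm_item_information(theta, *item_bank[name]),
        reverse=True,
    )
    return ranked[:n_items]


# HYPOTHETICAL 13-item bank: name -> (slope, four increasing thresholds).
item_bank = {
    f"item_{i:02d}": (1.5 + 0.1 * i, [-0.5 + 0.1 * i + 0.7 * j for j in range(4)])
    for i in range(13)
}

for t in (30, 40, 50, 60, 70):
    chosen = most_informative_items(item_bank, t)
    theta = (t - 50.0) / 10.0
    total = sum(grm_item_information(theta, *item_bank[name]) for name in chosen)
    print(f"T = {t}: test information of best 8 items = {total:.2f}")
```

When most items in a bank are informative over the same score range, as the text reports for the final pool, the sets selected at neighboring locations overlap heavily and the resulting test information curves differ little.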