Although the results of the psychometric analyses were not as good as we had hoped, we have learned some valuable lessons and have the beginnings of two item sets that can be developed into banks in future studies. The social function domain was the least developed and tested of all the PROMIS domains [22], but through this work the definitions of these domains have been refined. We are optimistic that the ordering of items suggested by these results, from those involving home and family activities to those involving activities outside the home and a larger social circle, will be borne out with additional research. We further believe that the distinction between items measuring social roles and discretionary activities is conceptually meaningful and can lead to more precise measurement of participation in social activities. In addition, we learned a number of valuable lessons regarding the content of social function items.
Lesson 1: Combining Positively- and Negatively-Worded Items
Although the use of both positively- and negatively-worded items may widen the range of the traits that an instrument could reliably measure and reduce response set, combining these item types in this study threatened the unidimensionality of the combined item set. After the initial CFA showed unacceptable dimensionality, exploratory analyses were conducted to examine the factor structure of the item sets. In both the ability-to-participate and satisfaction-with-participation domains, positively- and negatively-worded items loaded on different factors. The bifactor model was used to determine if the two item types were secondary factors under a general factor, that is, whether the items were ‘essentially unidimensional.’ However, even with the assumption of secondary factors, the item sets were not unidimensional. Closer examination of the items loading on each factor showed the relevant distinction to be whether the item was measuring ability-to-participate in social roles (activities with family and work activities) or ability-to-participate in discretionary activities (activities with friends and leisure activities). The positive-negative split in the ability-to-participate items appeared to be less pronounced in the social role items than in the discretionary activity items, suggesting an as-yet-unknown distinction in the perception of ability-to-participate in social activities with others. Negatively-worded satisfaction-with-participation items had been deleted from the item pool due to lack of unidimensionality; thus, it was not possible to determine whether there was a similar distinction in these items. Because data from the clinical sample were not used in the analysis of social function items, it is not clear whether the problem with combining positive and negative items was related to using a general population instead of a clinical sample or whether the two sets of items are measuring slightly different constructs.
Lesson 2: Effect of Using Non-Synonymous Terminology
While the intent was to choose synonyms in the wording of the ability-to-participate and satisfaction-with-participation items, in some items the wording turned out to be a problem. For ability, there seemed to be little effect from using synonyms in the wording of the items. Items with similar content had approximately the same likelihood of endorsement (e.g., the theta estimates for feel good about ability to do things for family and satisfied with ability to do things for family were .28 and .30, respectively), did not differ in their discrimination ability, and loaded on the same factor. However, for the satisfaction-with-participation items, the use of disappointed with and bothered by proved to be problematic. Indications of this problem were found in the relationship of responses to these items and a set of global items (one for each of the PROMIS domains) that were administered along with the domain item sets. The satisfaction-with-participation items using these synonyms not only loaded on a different factor than the positively-worded items but their correlations with the scores on the mental health global item (rate your mental health from excellent to poor) were as high or higher than with the scores on the social activity global item (rate your social activity/relationships from excellent to poor). Spearman rho correlations of these satisfaction items with the global social activity item ranged from .34 to .54 and with the mental health global item ranged from −.32 to −.51. As a result of this ambiguity, the 29 negatively-worded satisfaction items were deleted from the item sets, reducing content coverage and leaving fewer items for examination.
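The comparison described above, correlating an item with two different global ratings, can be illustrated with a small Spearman rho calculation. This is a minimal sketch only: the response vectors are invented for illustration and are not PROMIS data, and the function names are hypothetical rather than taken from the study's analysis software.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ordinal responses (1-5) from eight respondents.
satisfaction_item = [1, 2, 2, 3, 4, 4, 5, 5]
global_social = [1, 1, 2, 3, 3, 4, 4, 5]
global_mental = [2, 1, 3, 2, 4, 3, 5, 4]

rho_social = spearman_rho(satisfaction_item, global_social)
rho_mental = spearman_rho(satisfaction_item, global_mental)
print(f"rho with global social item: {rho_social:.2f}")
print(f"rho with global mental health item: {rho_mental:.2f}")
```

If an item correlates about as strongly with the mental health rating as with the social activity rating, as reported for the disappointed with and bothered by items, it is unclear which construct the item reflects.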
Lesson 3: Effect of Wording Modifications
Modifiers were used with the ability items to reduce ceiling effects, but they had little effect. The correlations among four versions of an item asking about ability-to-participate in leisure activities with friends ranged from .79 to .83. Asking about ability-to-participate in leisure activities or those with friends showed a similar pattern in the likelihood of endorsing the item. For the social function domains, likelihood of endorsement refers to the extent to which respondents are more or less able to participate (or satisfied with their participation) in various activities. Activities that are less likely to be endorsed are those in which the sample reported less ability (or satisfaction), and activities that are more likely to be endorsed are those in which the sample reported more ability (or satisfaction). In both cases, using the (activity) that is important to me modifier tended to make the item slightly more likely to be endorsed and using the (activity) that I want to do modifier tended to make the item slightly less likely to be endorsed, but not substantially so. The likelihood of endorsement when framing items in terms of one’s own or others’ expectations was similar to the likelihood of endorsement when no modifiers were used. The pattern was different for items asking about ability-to-participate in work activities. In these items, ability to meet people’s expectations was the most likely to be endorsed and ability to meet one’s own expectations was the least likely. A problem with using these modifications is that they confound the construct definition, such that harder-to-endorse items measure a slightly different construct (one reflecting personal expectations) than easier-to-endorse items (one reflecting societal expectations).
Lesson 4: Confirming an a Priori Item Ordering
In a previous study in which similar items were administered to a sample of cancer patients [2], discretionary activities and social roles loaded on a single factor. However, in this study, which used data from a general population in which the majority of the participants reported some common chronic conditions and few reported more serious conditions such as stroke or spinal cord injury, these items did not load on a single factor. The difference in results could be attributed not only to the type of sample (clinical versus general population) but perhaps also to the IRT model used (Rasch analysis in the original study and 2PL in this study). The results from this study, however, suggest a partial confirmation of the hypothesized ordering of items in terms of likelihood of endorsement, in that social role items (participation in work and family activities) are more likely to be endorsed than discretionary activities items (participation in leisure activities and activities with friends). That is, individuals tended to receive higher social role scores than discretionary activities scores. Furthermore, within the ability and satisfaction subdomains, the previously suggested hierarchy was also supported. Items measuring satisfaction-with-participation in social roles involving family or household activities were more likely to be endorsed than those involving work activities. Similarly, items measuring participation in discretionary activities involving activities around the home were more likely to be endorsed than those involving activities outside the home. In both the social role and discretionary activities subdomains, there was a similar progression of activities; that is, activities in which people were more able to participate or be satisfied with their participation tended to be those that could be performed at home or with one’s family, and activities in which people were less able to participate or be satisfied with their participation were those performed outside of the home or with a larger social network.
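The difference between the two IRT models mentioned above can be made concrete with their item response functions. The forms below are the standard dichotomous versions, shown only for illustration; the symbols are assumptions introduced here (ability \theta_j for person j, location b_i and discrimination a_i for item i), and items with several response categories would require polytomous extensions of these models.

```latex
% Rasch model: all items share the same discrimination
P(X_{ij} = 1 \mid \theta_j) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}

% Two-parameter logistic (2PL) model: each item has its own discrimination a_i
P(X_{ij} = 1 \mid \theta_j) = \frac{\exp\bigl(a_i(\theta_j - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_j - b_i)\bigr)}
```

Setting a_i = 1 for every item reduces the 2PL form to the Rasch form. Because the 2PL model lets discrimination vary across items, the two models can disagree about which items hang together, which is one way the choice of model could contribute to the differing results.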
Lesson 5: Unidimensionality Versus Item Response Theory Model Fit
While not an issue in content development, discrepancies between dimensionality and model fit may be related to the consistency of the ordering of items by likelihood of endorsement across individuals. Items at the bottom of the scale (easier items to endorse) were consistently endorsed by most individuals while those at the top of the scale (harder items to endorse) were consistently endorsed by few individuals; however, the endorsement of items in the middle of the scale may be less consistent. Thus, despite all items essentially measuring the same construct, this lack of a common ordering of items across respondents can cause a problem. This suggests that a consistent item ordering across individuals is as important as unidimensionality in developing an item set that has acceptable IRT model fit. In the social function domains, item sets that were essentially unidimensional using the bifactor model did not necessarily have acceptable IRT model fit. In the satisfaction-with-participation subdomains that had acceptable model fit, the hierarchy was more evident than in the ability-to-participate subdomains. Perhaps this is an issue with the model assumptions, in that a unidimensional IRT model may only work well with items that strictly fit a 1-factor CFA model (that is, a single general factor), whereas a multidimensional IRT model may be needed with items identified as only essentially unidimensional (that is, reflecting specific secondary factors in addition to the general factor). Perhaps the IRT model fit assumptions are too stringent for a small number of items and a large sample. Or perhaps preferences in the choices people make regarding the activities in which they participate preclude the development of a hierarchy of items measuring participation in social activities. Further research is required to examine these possibilities.
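One simple way to look at the consistency of item ordering across individuals is to count Guttman errors, that is, cases where a respondent endorses a harder item but not an easier one. The sketch below is an illustration under stated assumptions, not the analysis used in this study: responses are treated as dichotomous (0 = not endorsed, 1 = endorsed), the data are invented, and the function names are hypothetical.

```python
def guttman_errors(responses):
    """Count (easier item not endorsed, harder item endorsed) pairs.

    `responses` holds one respondent's 0/1 answers, with items already
    sorted from easiest to hardest to endorse.
    """
    errors = 0
    zeros_seen = 0
    for r in responses:
        if r == 0:
            zeros_seen += 1
        else:
            # Each earlier (easier) unendorsed item forms one error pair.
            errors += zeros_seen
    return errors


def order_items_by_endorsement(matrix):
    """Return item indices sorted from most to least frequently endorsed."""
    n_items = len(matrix[0])
    rates = [sum(row[i] for row in matrix) / len(matrix) for i in range(n_items)]
    return sorted(range(n_items), key=lambda i: -rates[i])


def errors_per_respondent(matrix):
    """Guttman error count for each respondent under the sample-level ordering."""
    order = order_items_by_endorsement(matrix)
    return [guttman_errors([row[i] for i in order]) for row in matrix]


# Hypothetical data: five respondents (rows) by four items (columns).
sample = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
print(errors_per_respondent(sample))
```

Respondents with zero errors follow the sample-level ordering exactly. Many nonzero counts concentrated among the mid-scale items would correspond to the inconsistent ordering described above.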
Insights and Future Directions
The PROMIS experience with measuring social function had mixed results. Little foundation existed on which to build these banks, necessitating the development of many new items. Although the items developed for the two original social function domains failed to meet PROMIS model assumptions, much has been learned that should help in future development.
The optimal level of social function may not be determinable a priori. While more satisfaction is considered better, more activity may not necessarily be better. Perhaps a narrower definition of the domains is needed to isolate aspects of participation that are hierarchical in nature. Or perhaps a certain threshold of limitation needs to be exceeded before one curtails participation in social activities. Satisfaction-with-participation may be more relevant than ability-to-participate. Regardless of level of activity, being more satisfied may be a better indicator of someone’s social function than the level of their activity. The fact that both satisfaction-with-participation subdomains fit the IRT model may suggest this is the case.
Personal preference plays a large part in the extent to which people engage in different social activities and additional research is needed to distinguish what people want to do from what they actually do. Perhaps discrepancies between wanting and doing can be used to better understand the role that activity preference plays in measuring activity participation and result in hypotheses regarding the ordering of activities into a generalizable and meaningful hierarchy.
It may be that differences in the activities in which people participate cancel out when summarized at the group level. Other than identifying activities endorsed by most people and those endorsed by very few people, the levels of endorsement of activities between these extremes may be too similar to be hierarchical. The hierarchies may differ for as-yet-unidentified subsets of people and would need to be further explored.
Social function in a general population may differ in substantive ways from social function in a clinical sample. Physical limitations may not result in reduced participation until the limitation is more severe. Additional testing and development with a more heterogeneous sample may detect a currently hidden hierarchy.
Two types of participation that are separate but related were identified in both social function domains: participation in social roles (what you do for others) and participation in discretionary activities (what you do with others). Although the constructs being measured were distinct, social role items were more likely to be endorsed than discretionary activity items.
The sizes of the current item sets are insufficient for banking purposes and more work is needed to develop and refine items that measure both types of social function. Sets of 12 and 14 satisfaction-with-participation items can be considered a basis upon which to develop banks that are of adequate depth for this trait across the continuum. However, different ways of measuring ability-to-participate may be needed. Explorations of alternative item types, such as asking about interference or building attributions to the health condition into the items, should be considered in developing such banks.
Work is already under way to apply what we learned in the first wave of PROMIS testing. Items have been rewritten so that each subdomain contains only positively- or negatively-worded items; disappointment and bother are no longer used in the wording of satisfaction-with-participation items; and modifiers regarding expectation have been eliminated. The new items are being administered to persons with cancer, arthritis, and stroke as part of a supplemental PROMIS study. Analysis of the data from these administrations will provide a test of the insights we’ve gained and should go a long way in improving our measures of social function.