There are many types of pain assessments available to researchers conducting clinical trials, ranging from simple, single-item Visual Analog Scale (VAS) questions through extensive, multidimensional inventories. The primary question addressed in this survey of top-tier medical journals was: Which pain assessments are most commonly used in trials? Articles addressing chronic musculoskeletal pain in clinical trials were identified in seven major medical journals for the year 2003. A total of 50 studies (total original research articles reviewed: 1,476) met selection criteria, and from these we identified 28 types of pain assessments. Selected studies were classified according to the dimensions of pain assessed, the type of scale and descriptors/anchors used, and the reporting period specified. The most frequently used assessments were the single-item VAS scale and the Numeric Rating Scale (NRS); multidimensional inventories were used infrequently. There was considerable variability in the instructions patients received about the period to consider when evaluating their pain, and many studies provided only cursory information about their assessments in the methods. Overall, it appears that clinical trials utilize simple measures of pain and that there is no widely accepted standard for clinical pain assessment that would facilitate comparison of outcomes across trials.
This review highlights the heterogeneity of pain outcome measures used and the abundance of single-item measures in clinical trials. While there are many pain outcome measures available to clinical researchers, more consistency in the field should be encouraged so that results between studies can be compared.
Pain is a leading public health problem in the United States, affecting over 50 million Americans at an annual cost exceeding $100 billion.46 This translates into 70 million healthcare visits a year, making pain the leading cause of health care utilization. Fifteen percent of adults in the United States have had chronic low back pain at some point in their lives, resulting in 250 million lost days of work and disability for over 10 million Americans.20 These staggering figures account for only one of many chronic pain diagnoses. The American Pain Society and the Joint Commission on Accreditation of Healthcare Organizations have designated pain as the “fifth vital sign” in an effort to raise awareness of the need for assessment.18
Pain assessments rely on self-report measures intended to quantify qualities of pain such as intensity, sensory characteristics, affective responses, and coping.30 Many multidimensional measures are available, including the McGill Pain Questionnaire (MPQ40) and the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC2). Alternatively, simpler single-item measures are available, such as the Visual Analog Scale (VAS) and Numeric Rating Scale (NRS), which are meant to measure particular qualities of pain such as intensity or unpleasantness. A great deal of information is available about the psychometric qualities of these questions and scales, and many are considered reliable and valid (for a review see62). It is also noteworthy that researchers are actively pursuing more accurate alternatives for the measurement of pain, including advanced scaling,21 real-time assessments,59 and alternatives to self-report.31,43 In addition, initiatives in the pain research community have recently generated recommendations on the types of outcome measures that should be used in future research.12
With the large number of options available to clinical researchers, our interest was in determining which self-report pain measures were employed in clinical trials. One of our motivations for posing this question was the possible disconnect between the availability of sophisticated, well-validated, multidimensional instruments and the instruments that are actually used in trials. Our impression of this literature is that simple assessments (e.g., one-item questions addressing pain intensity) are often used as outcomes in important clinical trials. Knowing that casual impressions can be mistaken, we conducted a systematic review of the trials published in 2003 that used pain as an outcome.
Because randomized controlled clinical trials (RCTs) and clinical trials (CTs) are used to determine clinical effectiveness of treatments and are, therefore, particularly influential studies, we chose to review pain outcome measures used in RCT and CT research published in several leading medical journals. We assumed these journals would be indicative of best research practice. We limited our review to studies of patients with chronic pain and only included studies that examined chronic musculoskeletal, non-malignant pain to prevent a high degree of heterogeneity in the measures observed.
A systematic review was conducted for all original articles in three types of medical journals (pain, general medicine, and musculoskeletal disease) for the year 2003. Our goal was to choose top-tier journals for each of the three groups; thus, we reviewed the AIM (Abridged Index Medicus) Core Clinical Journal list and the impact factors (from 2004) for each journal. For the pain and general medicine journals, we chose the two most influential in their respective fields, and for the disease-specific journals, we chose those with high impact factors that focused on musculoskeletal, non-malignant, non-headache pain. Thus, the seven journals reviewed and their impact factors were Journal of Pain (impact factor 2.0), Pain (4.1), Journal of the American Medical Association (JAMA; 24.8), New England Journal of Medicine (NEJM; 38.57), Arthritis and Rheumatism (7.4), Rheumatology (4.1), and Spine (2.7).
All original research articles in these journals for the year 2003 were reviewed. Articles had to meet the following inclusion criteria: (1) the study was a randomized, controlled clinical trial (RCT; patients were randomized to treatment arms), a clinical trial (CT; prospective treatment study where patients were non-randomly assigned to one or more groups), or a secondary analysis of an RCT or CT; (2) participants were adults diagnosed with chronic (more than 12 weeks) musculoskeletal, non-malignant pain; and (3) self-reported pain was an outcome (not necessarily the primary outcome).
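The three inclusion criteria can be read as a single screening rule applied to each article. The following is a minimal sketch of that rule, not the procedure the coders actually followed (screening was done by hand); the `Article` record and all of its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical screening record; field names are illustrative assumptions.
@dataclass
class Article:
    design: str                     # "RCT", "CT", "secondary", or anything else
    adults: bool                    # participants were adults
    pain_duration_weeks: float      # duration used to call the pain "chronic"
    musculoskeletal: bool
    malignant: bool
    self_reported_pain_outcome: bool

# Criterion 1: RCT, CT, or a secondary analysis of either.
ELIGIBLE_DESIGNS = {"RCT", "CT", "secondary"}

def is_eligible(a: Article) -> bool:
    design_ok = a.design in ELIGIBLE_DESIGNS
    # Criterion 2: adults with chronic (more than 12 weeks),
    # musculoskeletal, non-malignant pain.
    population_ok = (
        a.adults
        and a.pain_duration_weeks > 12
        and a.musculoskeletal
        and not a.malignant
    )
    # Criterion 3: self-reported pain is an outcome (primary or not).
    return design_ok and population_ok and a.self_reported_pain_outcome
```

All three criteria must hold at once, so an otherwise suitable trial of acute pain, or a cohort study with a pain outcome, would be screened out.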
The first and second authors conducted preliminary screening for inclusion criteria and coded all articles that were eligible. To ensure consistency of article coding, the first and second authors each reviewed several issues of Pain and found that there were very few discrepancies in their results. The articles were then all independently coded; the few discrepancies were resolved by re-reviewing the article together.
Information was coded about study characteristics and about the pain measure(s) employed. For each study, we coded: 1) the duration of the study protocol; 2) the chronic pain diagnosis of the sample; 3) the length of time patients were in pain or had a diagnosed chronic pain condition; 4) the definition of chronic pain reported in the study; and, 5) the outcome measure(s) used to assess pain. The following information was coded for each pain outcome: 1) the type of scaling used (e.g., Visual Analog Scale, Numeric Rating Scale); 2) the response options for the scale (e.g., “no pain” to “most pain” as the endpoints for a VAS); 3) the type of quality of pain assessed (e.g., pain intensity or level of usual pain); 4) the reporting period assessed for making the rating (e.g., right now, or over the last week); 5) whether or not the measure is a standard validated measure (e.g., the MPQ, the WOMAC); and, 6) whether or not the article reports/uses the measure as it was originally developed or was revised to meet the needs of the authors' protocol. If the measure used was a standard measure (i.e., MPQ) and the article did not provide information to the contrary, we assumed it was used as designed.
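One way to picture the coding scheme is as two record types, one per study and one per pain outcome measure. The sketch below is an assumption about how such codes could be stored, not a description of the authors' actual data entry; every class and field name is hypothetical. It also encodes the stated default that a standard measure is treated as used as designed unless the article says otherwise.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PainMeasureCode:
    scale_type: str                   # e.g. "VAS", "NRS"
    response_options: Optional[str]   # e.g. '"no pain" to "most pain"'
    quality_assessed: Optional[str]   # e.g. "pain intensity"
    reporting_period: Optional[str]   # e.g. "right now"; None if not stated
    is_standard: bool                 # standard validated measure (MPQ, WOMAC, ...)
    modified: Optional[bool] = None   # None means the article gives no information

    def used_as_designed(self) -> Optional[bool]:
        # Explicit information in the article always wins.
        if self.modified is not None:
            return not self.modified
        # Default from the text: a standard measure with no information
        # to the contrary is assumed to have been used as designed.
        return True if self.is_standard else None

@dataclass
class StudyCode:
    protocol_duration: Optional[str]        # None if not reported
    diagnosis: str
    pain_duration: Optional[str]
    chronic_pain_definition: Optional[str]  # None if the article gives none
    measures: List[PainMeasureCode] = field(default_factory=list)
```

Using `None` for unreported fields keeps "not stated in the article" distinct from any real value, which matters later when counting how many studies omitted a reporting period or a definition of chronic pain.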
From a total of 1476 original research articles reviewed, 50 met our eligibility criteria (see Table 1 for a description of article selection and distribution of articles by journal).1,5-8,10,11,13-17, 19, 22-29, 33-38, 41, 44, 45, 47-58, 60, 61, 63-68 Of the 50 studies selected for review, 66% were RCTs or secondary analyses of RCTs; the other 34% were CTs. The duration of the study protocol was less than 6 months for almost half (42% of the 50 eligible studies), and it was 6 months to 12 months for 20%, >12 months to 2 years for 26%, and > 2 years for 10% (one study did not report the duration of the protocol). The most frequent diagnoses in these studies were chronic low back pain (n=13; 26%), rheumatoid arthritis (n=9; 18%), chronic pain not specified (n=7; 14%), osteoarthritis (n=5; 10%), fibromyalgia (n=3; 6%), and thirteen other diagnoses that were reported by only one study. The mean duration of time patients were in pain (or with a chronic pain diagnosis) was greater than 3 years for 48% of the studies. With regard to the definition of chronic pain, only 23 of 50 (46%) studies provided descriptions. Of these, 9 used American College of Rheumatology (ACR) criteria for rheumatoid arthritis, osteoarthritis, or fibromyalgia, 4 used a criterion of pain for at least 3 months, and 2 used pain for at least 6 months. No other definitions of chronic pain were used more than once.
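As a quick arithmetic check on the figures in this paragraph, the reported percentages can be converted back to study counts. This is only a sketch of the bookkeeping, using the numbers stated above; the variable names are illustrative.

```python
TOTAL_REVIEWED = 1476
ELIGIBLE = 50

# Share of all reviewed articles that met the eligibility criteria.
inclusion_rate = round(100 * ELIGIBLE / TOTAL_REVIEWED, 1)  # about 3.4 percent

# Protocol duration, as percentages of the 50 eligible studies.
duration_pct = {
    "<6 months": 42,
    "6-12 months": 20,
    ">12 months-2 years": 26,
    ">2 years": 10,
}
duration_counts = {k: round(ELIGIBLE * p / 100) for k, p in duration_pct.items()}
# The remainder is the one study that did not report protocol duration.
not_reported = ELIGIBLE - sum(duration_counts.values())

# Diagnoses: the named groups plus thirteen diagnoses reported once each.
diagnosis_counts = {
    "chronic low back pain": 13,
    "rheumatoid arthritis": 9,
    "chronic pain not specified": 7,
    "osteoarthritis": 5,
    "fibromyalgia": 3,
    "other (one study each)": 13,
}
diagnosis_total = sum(diagnosis_counts.values())
```

The duration categories account for 49 studies, leaving exactly the one study with an unreported duration, and the diagnosis counts sum to all 50 eligible studies.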
Forty-four percent of the 50 eligible studies used one measure to assess pain as an outcome, 36% used two measures, 12% used three measures, and 8% used four measures. The majority of the eligible studies included pain as their primary outcome measure (64%). The pain outcome measures that were coded from the 50 studies were broken into 28 distinct classifications for measuring pain (e.g., VAS, McGill Pain Questionnaire, Pain Frequency). Table 2 lists each of the measures and their frequencies. Some articles did not provide sufficient information to determine under which classification a measure should be grouped. For example, when a measure was simply described as a “5-point categorical scale,” it may have been an NRS, but could have been another type of scale (e.g., a VRS). We always coded scales as specifically as possible, but when in doubt, we created a separate category that matched exactly what the authors reported in the methods section of the article (in the above example, the measure was coded as a “5-point categorical scale”).
Table 2 also classifies measures as single-item, multi-item or multidimensional. To determine in which category to place each measure, we used the information provided in the methods section as well as the citation used for the measure (when provided). As the name implies, single-item measures consist of only one item and measure only one dimension of pain. Examples are VAS, Verbal Rating Scales (VRS), Numeric Rating Scales (NRS) and single-item categorical variables. Multi-item measures likewise measure only one dimension of pain, but use more than one item to assess this construct. An example is the Roland-Morris Disability Questionnaire, which measures pain-related disability. Another example of a multi-item measure is the SF-36 pain subscale, since it assesses only one dimension of pain (namely bodily pain). One article stated in the methods section that Pain Frequency was measured using the sum of four items, and we categorized this as multi-item. Finally, we categorized measures as multidimensional if they assessed more than one dimension of pain. For example, the MPQ assesses sensory and affective qualities of pain along with pain intensity. There were measures that we could not categorize based on the information in the methods section and the citations, and we include them under the heading ‘unknown.’
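The classification described here reduces to two questions per measure: how many items it has and how many dimensions of pain it assesses. The function below is a minimal sketch of that decision rule under the assumption that both counts can be read from the methods section or the cited source; the function name and the item and dimension counts used with it are hypothetical, not values taken from the reviewed articles.

```python
from typing import Optional

def classify_measure(n_items: Optional[int], n_dimensions: Optional[int]) -> str:
    # Not enough information in the methods section or citation.
    if n_items is None or n_dimensions is None:
        return "unknown"
    # More than one dimension of pain, regardless of length.
    if n_dimensions > 1:
        return "multidimensional"
    # One dimension: distinguish by the number of items.
    if n_items == 1:
        return "single-item"
    return "multi-item"
```

For example, a pain frequency score built from the sum of four items would be passed as `classify_measure(4, 1)` and fall under multi-item, while a single VAS rating of intensity would be `classify_measure(1, 1)`.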
Table 2 demonstrates the propensity for using single-item measures to assess pain in clinical trials. Over half (54%) of all the pain outcome measures we coded from the 50 eligible studies were single-item measures, and the vast majority were VASs. In contrast, only 16% were multidimensional scales such as the MPQ, and 27% were multi-item scales that measured one dimension of pain (e.g., the Neck Disability Index). Note that one article reported using only one of the pain subscales from a multidimensional pain scale, the MPI, and it was unclear whether the whole measure was administered. For this case, we classified it as “one subscale from a multidimensional pain measure.” We were unable to classify two measures as single-item, multi-item, or multidimensional because not enough information was provided within the article (one we assumed was multi-item; see footnote for Table 2). Finally, three articles reported that they assessed “Pain Frequency”; however, the three assessed this construct in different ways and, thus, it appears in different categories in Table 2. One article stated that pain frequency was assessed using a single item, another that it was the sum of four 6-point Likert questions, and the last did not give enough information about the measure for us to determine which category was appropriate.
The pain outcome measure most frequently used in the 50 eligible studies was the VAS, with over half of the studies (60%) including a VAS to measure pain as an outcome in their trial. Remarkably, the VAS was used as the only pain outcome measure in 20% of the 50 studies. Since it was the most frequently used measure, we have provided details about the characteristics of the VAS reported in the studies. Table 3 presents the breakdown of the dimensions assessed, the description of the scale provided, the absence or presence of scale anchors, and the reporting periods for the VAS measures.
The purpose of this review was to increase understanding about how pain has been assessed in recent clinical trials. To accomplish this task, we reviewed 1,476 papers published in 7 major medical journals for a single year. Fifty studies used a pain measure as an outcome in the context of a clinical trial (RCT or CT) and met our inclusion criteria. This study was primarily exploratory; we did not undertake this review with a set of hypotheses about what we would find, although we had several impressions and general questions about pain measurement in published studies. For example, would the results show that one of the sophisticated, multidimensional pain questionnaires available to researchers, such as the McGill Pain Questionnaire, dominates the field?
Before addressing these questions, we report an unanticipated finding that emerged from the coding task. The task of coding the studies was quite challenging because many articles provided sparse descriptions of the assessments of their outcome measure(s). While this clearly made it more difficult to conduct the review, it suggests a more important point. A good methods section of an article should contain enough detail to allow for the study to be replicated, but the descriptions we found would often have been inadequate for conducting a replication study. Furthermore, comparison of results with other trials would be very difficult. We originally planned to examine the ways that pain outcome measures were modified for specific protocols; unfortunately, because of the frequently inadequate descriptions provided in the methods sections, it was difficult to ascertain whether measures were modified and, if so, how. Comments during the review process by journal editors and reviewers could remedy this problem, and we recommend improved reporting of assessment methods in future studies.
This survey plainly showed that one or two instruments did not dominate the pain outcome scene. Of the 50 articles included in this review, 28 different classifications of pain measures were derived. We did not expect this degree of assessment heterogeneity, and we contrast this finding to another area we are familiar with – the assessment of depression. In that area of research, there are just a handful of instruments (e.g., the Beck Depression Inventory and the Center for Epidemiological Studies Depression Scale) used in clinical trials. We interpret this as signifying that there is not an accepted standard for the assessment of chronic pain in clinical trials. An unfortunate consequence of this situation is that there is an inherent lack of comparability among the trials, because so few studies assessed pain in the same way. This observation is consistent with a recent review article examining the psychometric properties of osteoarthritis (OA) questionnaires. It included 37 articles (after reviewing 1,930) and consistent with our results, found 32 measures to assess pain and physical functioning in those papers.62 Clearly, this shows that there is not a ‘gold standard’ for measuring pain in either the broad category of chronic pain or within the specific chronic pain disease literature of OA.
The most compelling finding was that single-item measures were used as the only pain outcome measure in 34% of the trials (17 out of 50). Another important finding is that 60% of all eligible studies (30 out of 50) used a VAS as their pain outcome measure, and 10 of these used a VAS as their only pain outcome measure. This is in contrast to our expectation at the outset of the review when we thought that multidimensional pain scales, such as the MPQ, would be typical. Instead, we found that only 16% of the trials utilized a multidimensional pain measure. Thus, despite the vast amount of conceptual and empirical development that has gone into the multidimensional scales, they do not appear to be the choice of clinical pain researchers.
Our conclusion is that clinical researchers are opting for very brief pain assessments, and we speculate that this may be driven by the need to minimize patient assessment burden. We question whether this is a good decision in all cases. In some trials, pain may not be a primary outcome and therefore a comprehensive assessment may not be warranted. However, when pain is an outcome of importance to the trial, researchers need to ask what aspects of pain their treatment is expected to affect and then consider whether that outcome will be adequately measured with single items or scales.
Our review allowed additional insights into how VAS scales are used in trials, and there was considerable heterogeneity in the way they were employed (e.g., time frames of the assessments) and how they were reported in methods sections. The VAS was originally designed to measure current pain with either a 10cm or 100mm vertical or horizontal line with anchors on either end representing the extremes of the pain dimension being assessed. Almost half of the studies using a VAS (47%) assessed the concept of pain intensity or severity, and 27% measured “general pain,” “global pain,” or “pain” in general. In terms of the reporting period provided for rating the question, 17% asked about current pain, 17% about pain over the last week, and 7% about “estimated daily pain.” Of concern, 50% did not specify the reporting period used in the VAS. This further underscores the observation that there is a lack of consensus regarding measurement of pain in clinical trials, including the dimensions of pain and a recommended time frame. It is also another example of the inadequate level of description in the methods.
A challenging question emerges from the prior observation: are empirical evaluations of new treatments being shortchanged by the use of simple scales or are the simple VAS scales adequate? For at least 40 years, the multidimensional nature of the pain experience has been recognized,39 and measures that incorporate assessment of several pain factors have been developed.9,32,40 Pain intensity, the most commonly measured factor, is but one of these dimensions. Sensory, affective, behavioral, social and attitudinal factors are among the other dimensions that have demonstrated importance for pain measurement and that are not captured through the measurement of pain intensity. Morley and Williams (2002) discuss the commonly observed gap between the dimensions of pain that are expected to be impacted by a treatment and the measures selected to index treatment outcome.42 When a treatment is designed to reduce the experience of pain and consequently to improve functioning and quality of life, then it does seem that we are shortchanging the impact of a clinical trial by limiting pain assessment to a single item of pain intensity. Yet, this appears to be the current state of pain assessment in the majority of clinical trials.
Despite the potential shortcomings of single-item measures of pain, a group of pain researchers recently published recommendations on how chronic pain should be assessed.12 Interestingly, the outcome measures they recommend include a single-item measure of pain intensity (an NRS as opposed to a VAS) and assessments of rescue analgesic use. The report also listed several questionnaires to assess the other dimensions of pain (i.e., physical functioning, emotional impact, global improvement) to be used depending on the requirements of a particular protocol. While these recommendations agree with the current findings in that most pain researchers are already relying on single-item assessments of pain, it is notable that multidimensional measures were not recommended as the gold-standard outcome measure for chronic pain. Perhaps this is because pain researchers acknowledge that these lengthier measures are more burdensome to patients, or perhaps because clinical trials are more interested in assessing pain intensity and the use of rescue pain medication.
In addition to these recommendations, two studies that examined the utility and responsiveness of several pain measures in patients with OA and RA taking a non-steroidal anti-inflammatory medication reported that single-item measures like a Likert item or VAS were more responsive than more complex multidimensional measures.3,4 Therefore, perhaps single-item measures are appropriate for certain clinical trials as a way to assess pain relief and/or changes in pain severity/intensity. Thus the current finding that single-item measures were used in the majority of the eligible studies included in this review may not be as alarming as we first thought. However, we still maintain that for trials where pain is the primary outcome, understanding how the treatment changes all aspects of the pain experience (e.g., affective, sensory, analgesic) is important.
Finally, it is worth noting that increased rigor in the measurement of pain in clinical trials is now expected in some research settings. In 2006, for the first time, the US Food and Drug Administration issued preliminary guidance for trials submitted for their review using patient reported outcomes (FDA Docket No. 2006D-0044). The issues covered in that guidance (including a variety of psychometric and assessment standards) will have far-reaching implications for the acceptability of pain assessments utilized in trials submitted to this regulatory agency. While investigators conducting pharmaceutical trials and pain researchers are undoubtedly apprising themselves of these new expectations, this review highlights the importance of more consistent measurement of pain for all clinical trials.
This work was supported by a grant from the NIH PROMIS Network (grant number here). Special thanks to Doerte Junghaenel and Natascha Santos for their assistance with data entry and to Dr. Dagmar Antmann for her comments and encouragement. Arthur A. Stone is the Associate Chair of the scientific advisory board of invivodata, inc.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.