Recovery is commonly used as an outcome measure in low back pain (LBP) research. There is, however, no accepted definition of what recovery involves or guidance as to how it should be measured. The objective of this study was to appraise the LBP literature from the last 10 years and review the methods used to measure recovery. The research design included electronic searches of Medline, EMBASE, CINAHL, Cochrane database of clinical trials and PEDro from the beginning of 1999 to December 2008. All prospective studies of subjects with non-specific LBP that measured recovery as an outcome were included. The way in which recovery was measured was extracted and categorised according to the domain used to assess recovery. Eighty-two included studies used 66 different measures of recovery. Fifty-nine of the measures did not appear in more than one study. Seventeen measures used pain as a proxy for recovery, seven used disability or function and seventeen were based on a combination of two or more constructs. There were nine single-item recovery rating scales. Eleven studies used a global change scale that included an anchor of ‘completely recovered’. Three measures used return to work as the recovery criterion, two used time to insurance claim closure and six used physical performance. In conclusion, almost every study that measured recovery from LBP in the last 10 years did so differently. This lack of consistency makes interpretation and comparison of the LBP literature problematic. It is likely that the failure to use a standardised measure of recovery is due to the absence of an established definition, and highlights the need for such a definition in back pain research.
The concept of ‘recovery’ from a disease or health condition is central to health care. Within the low back pain (LBP) discipline, the concept of recovery is used in studies examining diagnosis, charting prognosis and determining the effect of treatments. Although the term ‘recovery’ is used commonly, there is no accepted definition of what recovery from LBP means or agreement on how it should be measured.
Despite the apparent simplicity of the idea, forming a coherent and appropriate definition of recovery from LBP is not a straightforward task. For example, in some studies the term is used synonymously with global improvement, in others with improvement on various indicators such as disability and return to work. There is also a fundamental consideration regarding the meaning of recovery: whether recovery requires return to a prior health state or whether attainment of a fulfilling and satisfying life within the limitations of the condition is enough [21, 80]. The fact that LBP commonly follows an episodic or recurrent pattern adds complexity to how recovery is conceptualised and measured.
It is worthwhile at this point to note the distinction between the definition of recovery and its measurement. While the problems with measurement of a concept in the absence of a standardised definition are self-evident, LBP researchers frequently measure recovery without an explicit statement of their definition of recovery. This omission makes the process of reviewing definitions of recovery used by researchers problematic. Nevertheless, we can make inferences about definitions from the way in which recovery is currently measured; this information can then be used as a first step in formulating an acceptable definition. The aim of this study was to systematically review the LBP literature from the last ten years for measures used to assess recovery from LBP.
Studies were identified for inclusion in the review via sensitive searches of electronic databases. Medline, EMBASE, CINAHL, Cochrane database of clinical trials and PEDro were searched from the beginning of 1999 to December 2008. Keywords describing LBP (LBP OR back pain OR backache OR low back injury OR sciatica OR lumbago) AND recovery (recover$) OR resolution (resol$) were used to identify papers that measured recovery from LBP as an outcome.
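The keyword combination described above can be sketched as a Boolean query string. This is only an illustration: the terms are those listed in the text, but the grouping of the recovery and resolution terms into a single OR group is an assumption about the intended operator precedence, and the helper names `or_group` and `build_query` are hypothetical. Each database has its own syntax, so the string would need adapting before use.

```python
# Terms as listed in the search description. "$" is the truncation
# symbol written in the original search (e.g. recover$).
LBP_TERMS = ["LBP", "back pain", "backache", "low back injury",
             "sciatica", "lumbago"]
RECOVERY_TERMS = ["recover$", "resol$"]


def or_group(terms):
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(condition_terms, outcome_terms):
    """Require at least one condition term AND one outcome term.

    Assumption: the recovery and resolution terms form one OR group
    that is combined with the LBP terms by AND.
    """
    return or_group(condition_terms) + " AND " + or_group(outcome_terms)


query = build_query(LBP_TERMS, RECOVERY_TERMS)
print(query)
```

Writing the grouping out explicitly avoids the ambiguity of an unparenthesised AND/OR sequence, which different search interfaces may evaluate in different orders.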
To be included, studies needed to meet all of the following criteria.
Two authors reviewed the database searches and excluded clearly ineligible studies based on titles and abstracts. Full reports of the remaining records were obtained and assessed for eligibility according to the inclusion criteria by the same two reviewers. Disagreements were resolved via consensus and consultation with a third author.
Measures of recovery were extracted from each of the included studies. Where sufficient information was reported, the domain and measurement tool used to measure recovery were also recorded. Where several measures were used, all were extracted and classified according to domain.
Figure 1 presents the numbers of papers screened and included in the review. From the electronic database search, a total of 5,504 papers were identified, of which 82 [2–4, 6–15, 17–20, 23, 24, 26–40, 42–44, 46–49, 52–56, 59–64, 66–68, 70, 71, 73, 74, 76–79, 81–87, 91, 93, 95–102, 104, 105] met the inclusion criteria. In some instances, several papers reported on the same dataset; for the purposes of this study such papers were treated as a single study.
The 82 included studies reported 76 measures of recovery, of which 66 were different measures. One of the measures was used in five different studies [9, 42, 48, 52, 62, 63] and six other measures were used in two studies. However, the remaining 59 measures of recovery were not used in more than one study. The majority of studies used one measure of recovery; however, six studies measured recovery in two ways [8, 13, 20, 34, 35, 60, 83], one study used three measures [19, 68] and one study had four different measures of recovery.
Of the 66 different measures reported, recovery was determined by a defined cut-off value on an established measurement instrument on 36 occasions. Five recovery measures were based on the answer to a direct question [8, 61, 68, 79, 101] (e.g. ‘have you had back pain in the previous week?’). Administrative data (e.g. time until insurance claim closure) were used to measure recovery on three occasions [14, 34, 35, 91], and three studies described and quantified a physical performance test [4, 29, 47] (e.g. isokinetic muscle test). Nineteen measures were described in a vague or uninformative manner that would preclude replication [6, 11, 26, 31, 33, 43, 49, 59, 70, 74, 81, 87, 93, 95, 100].
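As a quick consistency check, the counts reported above can be tallied in a few lines. The numbers are taken directly from the text; the dictionary keys and variable names are illustrative labels only, not categories defined by the review.

```python
# Counts as reported in the text; labels are illustrative.
DISTINCT_MEASURES = 66

by_definition_type = {
    "cut-off on established instrument": 36,
    "direct question": 5,
    "administrative data": 3,
    "physical performance test": 3,
    "vague or uninformative description": 19,
}

# The five ways of specifying recovery should account for all measures.
total = sum(by_definition_type.values())

# Share of measures described too vaguely to replicate.
vague_share = (by_definition_type["vague or uninformative description"]
               / DISTINCT_MEASURES)

# Reuse of measures across studies.
used_in_five_studies = 1
used_in_two_studies = 6
used_in_one_study = 59
reused = used_in_five_studies + used_in_two_studies

print(f"total measures accounted for: {total}")
print(f"vaguely described: {vague_share:.1%}")
print(f"measures used in more than one study: {reused}")
```

The categories sum to 66, the single-use and reused measures also sum to 66, and the vaguely described measures make up a little under 29% of the total.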
Seventeen studies used a minimum level of pain or ‘symptoms’ as a proxy for recovery; however, no two studies did so in exactly the same way (Table 1). Three recovery measures required the complete absence of pain, whereas three others fixed a cut-off score on the instrument [39, 40, 60, 66] that categorised subjects with minimal pain levels as recovered. The remaining studies gave a description of the symptomatic state necessary to indicate recovery. Seven studies determined recovery based on low or zero scores on disability questionnaires or required a return to previous levels of self-rated function (Table 1).
Seventeen studies determined recovery based on a combination of two or more domains, most commonly low scores on pain and disability measures (Table 2). As with the pain-based measures, however, no two were exactly alike. Ten studies asked subjects to fill in a single-item recovery rating scale [7, 10, 46, 53–55, 67, 77, 83, 96, 97, 102, 104, 105] (ranging from 4 to 15 points); nine variants of this scale were used in different studies (Table 3). Eleven studies used a global rating of change scale [9, 30, 42, 48, 52, 62–64, 76, 78, 84–86, 97, 98]; five variants of this scale were used. While the global rating of change scales were not designed to explicitly measure recovery, the scales include an anchor of ‘Completely Recovered’. Two studies used a dichotomous self-report measure of recovery, by directly asking patients whether or not they had recovered [8, 68]. Administrative data were used in four studies; two used return to work as the criterion [34, 35, 91], and two used time to insurance claim closure [14, 34, 35], and one further study used a self-rating of return to work (Table 4). Six studies measured physical performance or absence of neurological deficits [4, 11, 29, 47, 74, 95]; however, in only three of these studies [4, 29, 47] was the test clearly described.
Another aspect of recovery that varied widely among included studies was the duration for which patients had to meet the recovery criteria to be regarded as recovered. This feature was infrequently reported (in 18 out of 67 measures); one study based its measure on recall over 10 years, while in all others the duration ranged from 1 week to 12 months.
The principal finding of this review is the striking lack of consistency among measures of recovery from LBP. Of the 82 studies published in the last 10 years that measured recovery as an outcome, very few did so in exactly the same way. These data perhaps reflect the paucity of investigation into the concept of recovery from LBP [5, 51]. Irrespective of the reason, this lack of standardisation has important implications for the comparability and interpretation of the LBP literature.
Researchers assessed various related domains as surrogate measures of recovery; examples include pain, disability and return to work, alone or in combination. The use of a range of domains reflects differing ideas among researchers as to how best to conceptualise recovery from LBP. For example, should the absence of pain denote recovery, or the absence of disability, or are both domains relevant? Even when recovery is based on a single domain, e.g. pain, there remains the question of whether low levels of residual symptoms indicate recovery or complete absence of symptoms is necessary. Decisions regarding the domain, instrument and cut-off appear in most cases to have been made arbitrarily by researchers, perhaps because there is no uniform definition of recovery.
A range of methods were used to measure recovery, including previously validated instruments, administrative/insurance data and direct questions. There were, however, also a substantial number of reports (more than 25% of measures) that provided only an imprecise description of their recovery criterion. Further, only a minority of studies reported the duration for which subjects must meet the specified criteria in order to be regarded as recovered. These findings highlight an important limitation in the recent literature: inadequate reporting of outcome measures provides a barrier to interpretability and comparability of research.
A number of studies assessed recovery via a single-item question, with either a dichotomous response or scored on a continuous/Likert scale anchored by ‘completely recovered’ or similar. Single-item measures may suffer from poorer reliability than multi-item measures, and there is also a conceptual obstacle in that it would seem unlikely that a complex process such as recovery can be adequately captured by a single-item measure. This method does, however, enable the researcher to assess the subject’s overall perspective of their recovery, ensuring relevance of the measure. This is in contrast to the approach outlined above, where the researchers determine what domains they regard as important in the subject’s recovery. Although this prescriptive approach offers advantages in terms of subject-to-subject comparability, the importance of incorporating patients’ views into outcome measurement has been increasingly recognised recently [5, 89, 94]. Research in this area suggests patients’ perspectives of recovery are idiosyncratic and often determined by individual appraisal of the impact of symptoms on daily activities and quality of life.
A relevant question is whether recovery should be considered a dichotomous or a continuous construct. Most of the included studies described recovery as dichotomous, dividing participants into exclusive categories of ‘recovered’ or ‘not recovered’ according to set criteria. On the other hand, several studies (see Table 3) used a recovery scale with between 4 and 14 points to place subjects along a recovery continuum. The former approach offers the advantage of simplicity for interpretation, but will almost certainly provide a less responsive measure of patient recovery. This consideration, along with their particular conceptualisation of recovery, will direct researchers’ decision on what type of scale to use.
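The difference between the two approaches can be illustrated with a minimal sketch. All scores, cut-offs and scale lengths here are hypothetical and are not taken from any included study; the function names are likewise invented for illustration.

```python
def dichotomous_recovery(pain_score, cutoff):
    """Classify a subject as recovered when pain is at or below a cut-off."""
    return pain_score <= cutoff


def scale_recovery(rating, n_points):
    """Place a rating on a 1..n_points scale onto a 0-1 recovery continuum."""
    if not 1 <= rating <= n_points:
        raise ValueError("rating must lie between 1 and n_points")
    return (rating - 1) / (n_points - 1)


# Hypothetical subject with residual pain of 1 on a 0-10 scale.
pain = 1

# Two hypothetical criteria: complete absence of pain vs. minimal pain.
strict = dichotomous_recovery(pain, cutoff=0)
lenient = dichotomous_recovery(pain, cutoff=1)
print(strict, lenient)  # the same subject is classified differently

# A continuous rating keeps intermediate states that a cut-off discards.
print(scale_recovery(3, n_points=4))
```

The same subject counts as ‘not recovered’ under the strict criterion and ‘recovered’ under the lenient one, which is the comparability problem in miniature; the continuous rating, by contrast, retains the intermediate position at the cost of a less simple interpretation.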
It is perhaps not surprising that wide variability exists in the measurement of recovery in LBP. Indeed, this situation is not uncommon in health-care research. Lack of standardised definitions for key terms and outcomes is noted in studies of whiplash-associated disorders, drowning, falls, spasticity, peptic ulcers and schizophrenia. This finding is likely due to the lack of a standardised measure for recovery as well as the absence of a clear and agreed-upon definition of what recovery from LBP means.
It is possible that studies including definitions of recovery were missed by this review so we may have underestimated the variability in measurement of recovery. This would not influence the main finding of our study that there is a lack of consensus in this area.
This study highlights the lack of consistency among measures of recovery in LBP studies. Of the 66 different measures of recovery extracted, only 7 were used in more than one study. This variability is patently detrimental to the interpretability of the LBP literature. It is likely that the lack of an agreed definition for recovery from LBP contributes to this problem. Thus, it is recommended that efforts be directed toward formulating a definition, this step being a necessary precursor to selection or creation of a reliable and valid measure of recovery. Previous studies have used a Delphi process to arrive at a definition of various terms related to health-care research, e.g. complaints of the arm, neck and shoulder, functional capacity evaluation, and an episode of LBP. An alternative method may be a discussion process among experts in the area, as used for, e.g., outcome measures for chronic pain studies and disease activity in rheumatoid arthritis. Development of a definition for recovery from LBP may be amenable to either of these processes.
The authors would like to thank those that assisted with translation of the non-English language articles: Prof Rob Smeets, Ms Luciana Macedo, Dr Leo Costa, Dr Christine Lin and Mr Fred Zmudski. SJK’s scholarship and CGM’s fellowship are funded by the National Health and Medical Research Council of Australia, TRS’s scholarship is funded by the University of Sydney.
Conflict of interest statement: The authors have no conflict of interest to declare.