To systematically review the psychometric properties of outcome measures used in stroke self-management interventions (SMIs) to (1) inform researchers, clinicians and commissioners about the properties of the measures in use and (2) make recommendations for the future development of self-management measurement in stroke.
Electronic databases, government websites, generic internet search engines and hand searches of reference lists. Abstracts were selected against inclusion criteria and retrieved for appraisal and systematically scored, using the COSMIN checklist.
Thirteen studies of stroke self-management originating from six countries were identified. Forty-three different measures (mean 5.08/study, SD 2.19) were adopted to evaluate stroke SMIs. No studies measured self-management as a discrete concept. Six (46%) studies included untested measures. Eleven (85%) studies included at least one measure without reported reliability and validity in stroke populations.
The use of outcome measures that are related, indirect or proxy indicators of self-management, and that have questionable reliability and validity, contributes to an inability to sensitively evaluate the effectiveness of stroke SMIs. Further enquiry into how the concept of self-management in stroke operates would help to clarify the nature and range of specific self-management activities to be targeted and aid the selection of existing appropriate measures or the development of new measures.
Stroke is a major cause of death and disability worldwide. By 2020, stroke and coronary artery disease are predicted to be the leading causes of global lost healthy life-years. Stroke represents an often devastating disruption to life, with the majority of survivors experiencing some degree of impairment requiring additional care or support 1 year post-stroke.
Stroke is an acute event, but may result in significant long-term impact for the individual, such as social isolation, mood disturbance, communication difficulties and reduction in mobility and life roles [5,6]. Recovery following stroke is complex and multidimensional [3,7,8], encompassing bio-medical, psychological and sociological elements [9–11]. Engagement in self-management practices by individuals with long-term conditions has been suggested as key to promoting recovery and is cited as a means of empowerment and facilitator of improved health outcomes [13,14].
Self-management is a prominent issue in UK health policy [15–17] and has been identified as a key priority for health by organisations independent of the UK government [18,19]. Self-management can be defined as the “active management by individuals of their treatment, symptoms, lifestyle, physical and psychological consequences inherent with living with a chronic condition”. Self-management is an attractive initiative in managing the increasing burden on health and social care resources and reducing associated costs; the assumption being that effective self-management by an individual reduces their healthcare utilisation [21–23]. Healthcare professionals are well placed to promote the effective self-management of stroke [9,24,25].
Self-management interventions (SMIs) are designed to enable people to manage their health more effectively. Evaluation therefore must consider two key areas: firstly, whether people develop the skills to manage their own health and, secondly, whether this results in better health. SMIs operate over multiple dimensions and within different contexts. As such, evaluation is complex, not least due to variation in delivery, culture of the sponsoring healthcare organisation and anticipated goals and outcomes [26,27].
The UK Medical Research Council advocates establishing the theoretical basis of an intervention as a first step in estimating its possible outcomes. Currently, evidence suggests that the mediators of change and theoretical premises in SMIs are unclear [28–30]. This poses difficulty in the evaluation and operation of interventions for two key reasons. Firstly, doubt exists regarding the appropriate outcome(s) to monitor to assist evaluation of the intervention and aid determination of cost-effectiveness and clinical impact. Secondly, if the theoretical premises underpinning the intervention are uncertain, intervention fidelity is difficult to monitor and maintain. Questions then exist regarding what influences change and how this can be appropriately measured and SMIs evaluated.
SMIs may be evaluated by examining the effect on health outcomes that potentially change as a consequence of better self-management. Using patient reported outcome measures (PROMs) (e.g. functional status, symptom control, mood and health-related quality of life) is an important way of ensuring that evaluation considers outcomes important to patients. Preliminary investigation of self-management suggests that effective self-management corresponds with positive changes in health behaviour [20,28]. More recently there has been a focus upon measuring attitudes since these are thought to modify health behaviour [29–31]. Additionally, measurement may facilitate understanding of the relationships between attitudes and behaviour [32,33]. PROMs endeavour to capture information that is not directly observable and is unmediated by healthcare professionals; consequently, accurate measurement is contingent on the extent to which the PROM is an accurate reflection of the variable in question. Therefore, it is vital to evaluate whether the measures adopted in SMI studies provide legitimate information to evaluate self-management, covering both the process of obtaining skills to better manage health and subsequent potential improvements in health.
Before using an outcome measure in research or clinical practice, it should be assessed and considered to possess adequate psychometric properties. Despite the recognised value of reliable and valid outcome measures and the increasing importance of identifying effective SMIs in stroke, we know of no review that has systematically evaluated international research for the quality of outcome measures used in stroke self-management. The purpose of this article is to systematically review outcome measures used in stroke SMIs, with the aim of informing researchers, healthcare professionals and policy-makers and making recommendations for the design of future outcome measures suitable for use in stroke self-management.
This review seeks to systematically examine the outcome measures adopted in stroke SMIs in terms of the methodology adopted in their development and subsequent strength of their psychometric properties for use with stroke populations. Differing criteria have been adopted to evaluate the psychometric properties of outcome measures [34,35]. Reviews of outcome measures often use differing assessment standards, creating confusion for researchers and clinicians.
A recent international Delphi study of 57 experts (63% response rate) resulted in a tool to assess the methodological quality of studies on measurement properties, referred to as the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) checklist. The COSMIN list has good inter-rater agreement and reliability and represents the first critical appraisal tool that is based on the consensus of experts in psychometric theory. COSMIN has been used in other systematic reviews examining the measurement properties of outcome measures in a range of health conditions [39–41]. The Delphi study consisted of four rounds and sought to reach consensus on the terminology and definitions to be adopted with regard to psychometrics. This consensus offers both researchers and clinicians guidance with regard to some of the complexities of measurement properties. COSMIN also addresses modern psychometric theory methodology, such as Item Response Theory, as well as Classical Test Theory. Further information on COSMIN can be accessed via www.cosmin.nl.
The properties examined in this systematic review are defined by COSMIN and consist of nine items: internal consistency; reliability; measurement error; content validity; structural validity; hypothesis testing; cross-cultural validity; criterion validity; and responsiveness.
The purpose of the review is not to make a judgement on the quality of the SMI studies, or to synthesize findings to answer questions regarding the effectiveness of the interventions in the review. Instead the review focuses upon the value of the outcome measures adopted within stroke SMI studies according to their methodological quality, reliability and validity for stroke populations as outlined by COSMIN. This is vital since judgments about the results and impact attributed to SMIs are dependent on valid and reliable measurement.
In order to examine the properties of the outcome measures used in stroke SMIs, it was first necessary to identify which measures were used to evaluate the SMIs. Stroke self-management literature was systematically searched on the following electronic databases by one author (E.J.B.): Medline, PsychInfo, Science Direct, Web of Science and CINAHL. The following terms were used to identify existing stroke SMI studies:
Search terms were chosen to represent concepts often linked to self-management (education, rehabilitation); however, studies were excluded unless they specifically stated that their purpose was to enhance self-management. Article reference lists, the website of the UK government health department, generic internet search engines, and the websites of stroke-specific organisations were also searched. Dissertations and conference abstracts were excluded; however, searches for publications by dissertation or conference abstract authors were conducted. Selected articles either (1) described stroke SMI development and/or implementation or (2) presented outcomes of stroke SMIs. Identified interventions and associated outcome measures were extracted and tabulated. Authors screened each abstract to eliminate articles that were not relevant, based on the following inclusion criteria:
One reviewer (E.J.B.) assessed all relevant full text articles obtained and a second reviewer (S.D.) assessed 10% to check reliability. Outcome measures included in studies were then identified and recorded (the version adopted by the SMI was selected for review). A second literature search then sought to find evidence for the measurement properties (as outlined by COSMIN) of those outcome measures identified. The following electronic databases were used: Medline, PsychInfo, Science Direct, Web of Science and CINAHL. The following search terms were used in conjunction with the title of the identified outcome measure:
For example, to find evidence on the validity of the Geriatric Depression Scale, a search using the terms “Stroke” AND “Validity” AND “geriatric depression scale” was performed. Search terms were chosen to represent the properties of outcome measure quality as outlined by the COSMIN checklist. No time limitations were set, as advocated by COSMIN, since older literature on measurement properties is still relevant. Searches were conducted to specifically find evidence of those properties with stroke populations. This is crucial since a measure’s reliability and validity are on-going properties, dependent upon the context and population with which it is used [34,43]. For example, a measure developed to assess quality of life with a traumatic brain injury population will not necessarily possess acceptable content validity for stroke populations, since the issues faced by the two populations may differ as well as overlap.
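The search string construction in the Geriatric Depression Scale example above can be sketched in a few lines of Python. This is only an illustration: the function name `build_query` and the list of property terms are hypothetical (the full list of search terms is not reproduced here), and the exact syntax accepted by each database may differ.

```python
def build_query(measure_name, property_term, population="Stroke"):
    """Combine population, measurement property and measure title with AND."""
    return f'"{population}" AND "{property_term}" AND "{measure_name}"'


# Placeholder property terms for illustration only; "Validity" comes from the
# example in the text, "Reliability" is an assumed additional term.
property_terms = ["Validity", "Reliability"]

for term in property_terms:
    print(build_query("geriatric depression scale", term))
```

Keeping the population term fixed reflects the point made above that evidence was sought specifically in stroke populations.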
Article reference lists from the originating studies and generic internet search engines were also searched. Discussion between the authors sought to clarify any issues regarding terminology and interpretation of the COSMIN checklist and preceded scoring of the identified measurement tools. Studies were excluded if they investigated postal or proxy reliability and validity unless this was how they were used in the SMIs. For outcome measures with more than one result per COSMIN criteria, the article stating the most robust results was reviewed. Where it was not clear that the study populations were specifically stroke, articles were excluded. One reviewer (E.J.B.) assessed all relevant full text articles using a standard data extraction form advocated by COSMIN. To ensure consistency of interpretation and scoring, a second reviewer (S.D.) independently scored a random 10% of the articles, with discussion between the two reviewers regarding the scores attained. Disagreement regarding interpretation of COSMIN terminology was resolved through consensus meetings. Agreement between scores was consistent.
Identified interventions and associated outcome measures were extracted and tabulated. Paper authors were contacted for further details where the study reported on early phases, or cited unpublished work. Outcome measures included in identified stroke SMIs were rated using the COSMIN checklist. COSMIN consists of four steps and 12 items with different categories for scoring. Ten items are used to assess whether a study meets the standard for good methodological quality. Two items contain general requirements for articles in which Item Response Theory (IRT) methods are applied and general requirements for the generalisability of the results. Where a published paper does not report on a COSMIN item, the item is not scored. For example, if the responsiveness of the measure has yet to be determined, this item is not scored. Each item is rated as excellent (++++), good (+++), fair (++) or poor (+). Full details of the scoring system are available at www.cosmin.nl. The overall score per item is determined by the category with the lowest score.
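The “lowest score counts” rule described above can be illustrated with a minimal Python sketch. The function name `overall_rating` and the example ratings are hypothetical, and this is one reading of the rule as summarised here, not a reproduction of the official COSMIN scoring materials.

```python
# Rating labels from the text, ordered from worst to best.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def overall_rating(component_ratings):
    """Return the lowest rating among scored components.

    Components that were not reported are passed as None and are skipped,
    mirroring the rule that unreported items are not scored.
    """
    scored = [r for r in component_ratings if r is not None]
    if not scored:
        return None  # nothing reported, so nothing to score
    return min(scored, key=RATING_ORDER.index)


# Hypothetical component ratings for one measurement property.
print(overall_rating(["excellent", "good", None, "fair"]))  # fair
```

Under this rule a single weak component, such as a small sample, determines the overall rating regardless of how well the remaining components were handled.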
Eighty-nine records for possible stroke SMIs were identified. After exclusion of 46 duplicate records, 43 abstracts were identified as potentially relevant and were screened. From these, 19 articles were retrieved and reviewed for inclusion criteria and data extraction (studies were excluded because they did not meet the detailed criteria or if they reported on an earlier phase of the same study). Outcome measures within each study were then identified and grouped conceptually into different themes, using content analysis. A total of 13 studies met the eligibility criteria (Table I).
All studies included participants over 18 years of age who had experienced a stroke. Three studies of stroke SMIs originated from the UK; three from Australia; two from Canada; two from the USA; two from Sweden and one from Hong Kong. Nine studies reported upon interventions aimed at community-dwelling participants; two upon the acute recovery phase (<3 months post-stroke); one upon recovery for care home residents and one study did not report details of the setting. Four studies delivered individualised interventions; three utilised workbook interventions; four tested group SMIs designed specifically for stroke and two tested existing self-management programs adapted for stroke. Interventions were delivered primarily by allied health professionals (six), nurse specialists (four), researchers (two) and lay experts (one).
Four studies identified primary outcomes: health-related quality of life; self-efficacy; physical functioning and feasibility. Although all studies focused upon stroke self-management, none measured stroke self-management as a discrete concept. Instead, a range of concepts was measured, presumably selected to reflect the expected outcomes or process of self-management. Evaluation relies upon judgements concerning the process of the SMI and the outcome expected following participation in the SMI. The majority of measures used sought to measure health outcomes, e.g. physical functioning, mood, quality of life (Figure 1). Attitudes and behaviours that could be considered to reflect the process of self-management more readily (e.g. healthcare utilisation, medication compliance) were also measured, although the theoretical mechanisms linking self-management to these concepts were not elucidated by the authors.
The term “unreported” is used in this review to describe an outcome measure that has not, at time of writing, been published in peer reviewed publicly available media. Unreported measures relate to those developed either by the study authors or through modifications made to existing measures without examination to ensure such assumptions or modifications were valid. Therefore, it is not possible to determine if unreported measures meet any of the COSMIN criteria. Six studies adopted unreported measures, with unknown reliability, validity and responsiveness, of concepts presumably relating to the process of self-management (although the theoretical links were not explicitly stated by the authors).
Allen et al. included an unreported measure to assess condition management and patient and carer satisfaction with the intervention. Two studies used 10-point visual analogue rating scales (VASs) to assess usefulness and intelligibility and satisfaction related to the intervention. Whilst VASs are brief, simple to administer and minimal in terms of respondent burden, without established reliability or validity relating to the underlying construct purported to be measured, they remain of limited value. Ljungberg and colleagues designed four questions to assess life satisfaction pre- and post-participation in the SMI and Sit and colleagues modified an existing stroke knowledge scale; details of the modifications were absent in the paper. Marsden and colleagues used a stroke knowledge test, but stated clearly that they were not basing inferences on the data obtained using this measure.
Forty-three different outcome measures were adopted by studies in this review of measures used in stroke SMIs. Of these, 21 measures (49%) demonstrated some properties in stroke populations, according to the COSMIN checklist (Table II). For the remaining measures no evidence could be found for any of the COSMIN properties in stroke populations (n = 16, 39%), or the measures were observer-based assessments (n = 5, 12%).
A summary of how the measures included in the stroke SMI studies scored according to COSMIN, when examined for their measurement properties with stroke populations, is shown in Figure 2. Not every measure scored in each category on COSMIN. Of 21 measures, none scored in every category of COSMIN. Where more than one paper addressed a COSMIN category, the article which stated the most robust results was scored. The majority of measures scored either “fair” or “poor” in each category. The only category to obtain an “excellent” rating was content validity. Three measures scored “excellent” in this category: the Stroke Adapted Sickness Impact Profile (SA-SIP30), the Stroke Self-Efficacy Questionnaire (SSEQ) and the Subjective Index of Physical and Social Outcome (SIPSO).
This review examined the methodological quality of studies determining the psychometric properties of outcome measures used in stroke SMIs, according to criteria outlined by the COSMIN checklist. Consistent with measurement theory, we explored the validity and reliability of these measures for use in people with stroke, not their general use in broader populations. To our knowledge, this is the first review to systematically appraise and summarize the evidence on the quality of outcome measures used in stroke SMIs. Since no study adopted a measure of stroke self-management attitudes or behaviours, the theoretical concepts utilised by studies in the review to measure self-management will first be addressed.
The range and number of different published outcome measures adopted by studies in this review may suggest a current lack of consensus regarding the appropriate measures to assist evaluation of stroke SMIs. Alternatively, the use of heterogeneous measures may be reflective of recognition by researchers that self-management embraces a range of differing concepts. The current absence of consensus may in part reflect an underlying lack of consensus about the concept and operation of self-management in stroke. In addition, most SMIs have been developed for generic audiences, which may partly explain the lack of specific measures developed for stroke self-management. An argument exists for research to investigate the conceptual properties of stroke self-management, to examine which measurement concepts currently being used, if any, are appropriate.
A range of concepts was measured. Some captured health outcomes, such as physical functioning, which the study authors anticipated may be affected by the SMI; others attempted to capture behaviours, such as resource utilisation, or attitudes, such as changes in self-efficacy, thought to be associated with self-management processes (Figure 2). However, how the concepts measured align with the patient experience of stroke self-management remains unknown. Physical function (PF) was most often used as an indicator of effective self-management. Of the 21 measures possessing at least one property of the COSMIN checklist in this review, 11 related to PF (52%). This is potentially suggestive of an assumption that effective self-management results in improved PF, or that improved PF is a desired outcome. PF appears to remain a dominant concept within stroke rehabilitation, despite increasing evidence of the role of psychosocial factors in recovery [53–55]. For example, the measurement of PF is of limited value in studies that target speech disorder, depression, social participation or cognitive function, debatably all factors in effective stroke self-management. Questions regarding the differing priorities of rehabilitation between healthcare professionals and those affected by stroke have been raised before. Effective self-management extends beyond the ability to perform certain tasks, encompassing decision making and choices regarding health and behaviour. The role of PF in stroke self-management requires further clarification before it can be adopted as a robust indicator of effective self-management.
Six studies collected information on health behaviours and healthcare resource utilisation. However, issues of potential greater importance to patients, for example a change in confidence or increased awareness about how to manage fatigue, may not be captured in measures focused upon management of health behaviours or resource utilisation. There is a need to further conceptualise stroke self-management to ensure that self-management strategies pertinent to people recovering from stroke are captured in existing or new outcome measures.
Eight of the studies in this review explicitly cited a theoretical basis to the intervention adopted in the study (Table I). The most commonly cited theory was psychologist Albert Bandura’s Social Cognitive Theory and the concept of self-efficacy (n = 4 studies). Self-efficacy can be described as the belief in one’s capabilities to organise and execute the course of action required to produce given achievements [58,59]. The validity of outcome measures is contingent upon using them for the purpose for which they were intended.
Of the four studies citing self-efficacy as a theoretical premise underpinning the intervention, only two studies utilised outcome measures to reflect change attributed to this theoretical concept in the interventions [45,60]. The measure adopted by Kendall and colleagues, the Self-Efficacy Scale, has unknown psychometric properties in stroke populations, and therefore requires further examination to establish its validity for use in these populations. The Stroke Self-Efficacy Questionnaire adopted by Jones and colleagues was developed with stroke populations. However, questions exist concerning the relevance of the sample. Data were generated with people a relatively short time after stroke (mean duration was 4.2 weeks and 16 days post-stroke for two of the development phases). This may not represent sufficient time since stroke for individuals to adequately appraise their situation, especially as some were still in hospital.
The relationship to self-management of any of the measures in this review was not explicitly stated by any of the study authors. This suggests that further clarification is required to determine the extent to which they reflect the process or outcomes of self-management. Whilst potential theoretical bases for the self-management of long-term conditions, such as self-efficacy, have gained increasing acknowledgement, their role in stroke self-management remains unclear. This is in part due to a lack of robust outcome measures and, in addition, a lack of clarity regarding the purported theoretical foundations of stroke self-management [20,62].
A paucity of measures scored “excellent” or “good” for quality according to the criteria outlined by COSMIN (Figure 2). The COSMIN checklist does not advocate summarising the quality criteria into one overall quality score, as is often the case in other systematic reviews. An overall quality score would assume that all measurement properties are of equal importance. Since measurement properties are in part affected by the context in which they have been determined, this approach would be misleading. For example in our review, no measure was scored on cross-cultural validity, since the purpose of the review was not to assess how well a measure had been developed and validated in other languages or cultures.
Outcome measures should be developed with involvement of the target population to identify what is meaningful from their perspective and hence enhance content validity and clinical utility [61,63,64]. Three measures scored “excellent” in the content validity category (SSEQ; SIPSO; SA-SIP30). Measures that did not include involvement of users in the development of the measure scored “poor” on the COSMIN checklist, regardless of other aspects of the content validity process which may have been classed “fair”, “good” or “excellent”. This is partly a result of COSMIN’s scoring method, in which the lowest score in any given category counts, but is also indicative of the importance of involving potential users in measurement development. Arguably, measures developed without user involvement have questionable meaning and other types of validity, since without steps in the design to capture the experience of the population to be measured, the context of the measure remains largely that of the measure developers [65,66]. More recently, techniques such as cognitive interviewing have been used by researchers [68,69] to ensure the content validity of new measures is optimal.
Difficulty exists in determining which measures used in the stroke SMIs in this review reflect self-management with validity since most measures did not score well according to the COSMIN criteria.
Several studies included unreported measures, designed specifically by the authors for the purpose of the SMI study [44,48–52]. With the exception of Marsden et al., studies utilised data from unreported measures as indicators of outcomes. An absence of psychometric data confounds the ability to draw reliable inferences from studies adopting those measures. In addition, a lack of information regarding the development or modification of unreported measures limits the ability to make judgments upon the validity and appropriateness of the measure. A further possible limitation on interpreting data from unreported measures may be a tendency for reporting positive results. Without establishment of reliability and validity the outcome measure is little more than a collection of items that have meaning to the developer alone [33,71]. Given the lack of consensus in the literature on how stroke self-management operates, and a lack of consensus upon the theoretical premises grounding stroke SMIs, the assumptions underpinning unreported measures remain speculative. Researchers and clinicians should exercise caution in considering findings from studies adopting unreported measures.
Of note is that 11 studies (85%) within this review adopted at least one outcome measure without reported validity and reliability with stroke populations. The reporting of minimal, or non-significant, observed changes following stroke SMIs in those studies including measures without established psychometric properties in stroke populations may be indicative of a lack of relevance and meaningfulness of those measures to stroke populations. When such measures are used, it is difficult to determine whether a lack of observed change resulted from an ineffective intervention or from imprecise measurement.
Measures developed with intended user populations facilitate the gaining of information about health, illness and the effects of health-care interventions from the perspective of the patient [72,73]. As well as enhancing content validity, this can also facilitate shared decision making with healthcare professionals. This is of particular relevance to those involved in promoting self-management and increasing patient autonomy, such as nurses and therapists.
Responsiveness is a necessary property of instruments intended for measuring clinically meaningful change, such as in stroke SMIs [74,75]. Involvement of users in the development of outcome measures promotes the responsiveness of measures. Arguments exist that responsiveness should focus upon detecting change that is valued by the person rather than the clinician or researcher. This is of particular relevance to self-management.
Change attributed to an intervention is an important aspect of evaluating clinical effectiveness. In this review, 13 measures included information on responsiveness in stroke populations. None of these measures scored “excellent” for this property, and only 15% scored “good”. Aside from inadequate sample sizes, a common finding was that studies often were not clear about what happened to study populations between testing. Additionally, authors often did not specify how missing items from respondents were handled. The result is that judgments regarding responsiveness data were difficult to substantiate. This also affected how well measures scored for reliability and other areas of validity. There is, therefore, a need for future measurement developers to specify these overlooked aspects of development more clearly in subsequent reporting.
The majority of SMI study populations within this review experienced stroke <24 months previously, with a number of studies using populations experiencing stroke no more than 6 months previously [48,50,51,76]. This may be a result of sampling to reduce the influence of additional factors upon study outcomes, such as the development of unhelpful coping behaviours, the likelihood of which might increase over time, and out of an assumption that more change may be observed in those early in their recovery. However, in reality the number of people living in the community and recovering from stroke extends beyond those who are 6–24 months post-stroke. As engagement in self-management activities varies during recovery, particularly following adjustment to stroke as a long-term condition, there is a future need to consider outcome measures sensitive to change(s) at different durations since stroke.
The role of PROMs, developed using rigorous investigation with the population to be measured extends beyond validating patient experience . PROMs may improve the quality of interactions between health professionals and patients, assess levels of health and need, and provide evidence of outcomes of services, for the purposes of audit, quality assurance and comparative performance evaluation [78,79]. This is of particular importance when trying to capture the essence of self-management, since the experience of clients is vital in determining what is valued from their perspective. There is a need to focus upon the development of measures of self-management developed with people recovering from stroke.
This review focused upon the current state of measurement in stroke self-management. Consensus between reviewers was used to determine eligibility and inclusion of SMI articles. Whilst we were in agreement, selection bias remains possible. Our aim was not to make judgments on the quality of the SMIs identified. However, the use of a standardized critical appraisal tool may assist the selection of articles for future reviews. Where interpretation of the COSMIN criteria differed, agreement was reached by discussion and consensus. Additional reviewers may have further validated this process; however, the criteria within COSMIN are explicitly stated and differences were quickly resolved. Data extraction was facilitated by a standardized tool advocated by COSMIN, with extraction and scoring checked in a random 10% of articles. We acknowledge that checking only 10% may be viewed as a limitation of this review; however, we assert that a systematic process using a standard data extraction tool was followed throughout.
That COSMIN operates a “lowest score counts” scoring system may account for the lack of measures scoring well in the measurement property criteria. Some studies used otherwise appropriate methodologies, but were rated as “poor” due to inadequate sample sizes for analyses. For example, for a measure to be rated as “good” for reliability, measurement error, criterion validity and responsiveness, a sample size of n = 50–99 is required. To score “good” for internal consistency and structural validity, the required sample size rises to at least five times the number of items within a measure and ≥100, or five to seven times the number of items where the sample is <100. Therefore, if those studies were repeated with larger sample sizes, their ratings according to COSMIN could change dramatically. The tendency for measures to score poorly may reflect a floor effect of the COSMIN checklist. COSMIN was developed following consensus of experts in health measurement; therefore, if its stringent criteria are to be adopted, this indicates a need to debate the rigorous methods required for measure development. It is fair to comment that some of the measures in this review were developed before the focus upon involving potential users in measure development. It may be that the measures examined in this review require further development and investigation to establish adequate measurement properties for use with stroke populations.
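The effect of a “lowest score counts” rule can be illustrated with a minimal sketch. It assumes a four-level ordering of ratings (poor, fair, good, excellent) and uses hypothetical checklist item names; it is not the COSMIN scoring software, only an illustration of how a single weak item, such as sample size, determines the overall rating for a measurement property.

```python
# Ratings ordered from worst to best (assumed four-level scale).
RATINGS = ["poor", "fair", "good", "excellent"]


def overall_rating(item_ratings):
    """Return the overall rating under a 'lowest score counts' rule.

    item_ratings maps a checklist item name (hypothetical names here)
    to one of the values in RATINGS. The overall rating is the worst
    rating given to any single item.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings.values(), key=RATINGS.index)


# Hypothetical study: sound design, but too few participants.
responsiveness_items = {
    "hypotheses_stated": "excellent",
    "comparator_described": "good",
    "sample_size": "poor",
}

print(overall_rating(responsiveness_items))
```

In this hypothetical case the study is rated “poor” overall despite scoring well on every item other than sample size, which is the pattern described above.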
Our review points to existing limitations in the evaluation of stroke SMIs. Our recommendation for clinicians and researchers seeking to evaluate such interventions is firstly to clarify the theoretical premise of the intervention in question, as advocated elsewhere [27,80,81]. Without this step, it is difficult to identify the mechanisms by which the intervention may influence outcomes, and thus difficult to select an outcome measure which appropriately captures the potential outcome. Potential outcome measures should be selected on the basis that they appropriately reflect and capture the expected outcome change.
This review highlights that the reported theoretical drivers within stroke SMIs are unclear, not least because they are often not explicitly stated by researchers. The heterogeneity of the outcome measures utilised by SMIs in this review may indicate a difficulty in determining the expected outcomes of stroke SMIs. A systematic review demonstrated that interventions with specific aims, such as reduced systolic blood pressure in hypertension or glycosylated haemoglobin levels in diabetes, produced greater effect sizes than those without defined outcomes. Further work is therefore warranted to conceptualise stroke self-management and to examine the theoretical premises supporting such interventions and their expected outcomes, so that appropriate outcome measures which accurately reflect the concept can be selected and/or developed. Until such clarification, researchers and clinicians should, where possible, select outcome measures with reliability and validity data in the population to be tested in the intervention. The selection of outcome measures developed with involvement from the target population is also advocated. This ensures that what is meaningful to the patient is more likely to be captured appropriately, thus enhancing content validity.
In the meantime, researchers must support clinicians by conducting further work to examine the concept and theoretical premises of self-management and developing appropriate measures if required.
This is the first systematic review of international research on outcome measures used and selected in stroke SMI studies. We have identified important limitations in the measures used to evaluate the effectiveness of stroke SMIs, which have significant implications for the inferences we are currently able to draw about the evidence base. None of the measures used in studies of stroke SMIs purported to specifically measure self-management as a discrete concept. This is indicative of the difficulty in conceptualising and operationalising this concept, a view expressed elsewhere [13,84]. Further work is required to determine how the measures identified in this review align with the concept of self-management. The range of outcomes adopted, the lack of observed changes in outcomes following stroke SMIs and the lack of consensus surrounding which outcome measures to utilise indicate that the causal mechanisms of stroke SMIs remain imprecise. Stroke SMIs have raced ahead of the evidence to support their theoretical basis, operation and effective evaluation [85,86]. Work to conceptualise stroke self-management is required to help identify which outcomes are most appropriate for evaluating interventions, to further inform the theoretical basis for SMIs and to assist the development of interventions. There is a need for studies to explore the theoretical underpinnings of SMIs in stroke and for the development of robust outcome measures to enable evaluation of stroke SMIs.
Declaration of Interest: This work was supported by a University of Southampton research studentship. The authors report no declarations of interest.