(1) Establish a core domain set for fibromyalgia (FM) assessment in clinical trials and practice, (2) review outcome measures’ performance characteristics, (3) discuss development of a responder index for the assessment of FM in clinical trials, (4) review objective markers, (5) review the domain of cognitive dysfunction, (6) establish a research agenda for work regarding outcomes research.
(1) Results of univariate and multivariate analysis of 10 different FM clinical trials of four different drugs, mapping key domains identified in previously presented patient focus groups and Delphi exercises and a clinician/researcher Delphi exercise; breakout discussions to vote on possible essential domains and reliable measures. (2) Updates presented regarding outcome measures’ status. (3) Presented update on objective markers to measure FM disease state. (4) The issue of cognitive dysfunction (dyscognition) in FM was reviewed.
(1) Greater than 70% of OMERACT participants agreed that pain, tenderness, fatigue, patient global, multidimensional function, and sleep disturbance domains should be measured in all FM clinical trials and that dyscognition and depression should be measured in some trials; domains of research interest include stiffness, anxiety, functional imaging, and cerebrospinal fluid biomarkers. (2) FM domains’ outcome measures have generally proven to be reliable, discriminative, and feasible. More sophisticated and comprehensive measures are in development, as is a responder index for FM. (3) An increasing number of objective markers are being developed for FM assessment. (4) Assessment of cognitive dysfunction by self-assessed and applied outcome measures is being developed.
A multidimensional symptom core set is proposed for the evaluation of FM in clinical trials. There is ongoing research on improved measures of single domains and composite measures.
Fibromyalgia (FM), also known as fibromyalgia syndrome, is characterised by chronic widespread pain and tenderness on physical examination, as defined by the 1990 American College of Rheumatology (ACR) criteria1. These criteria have been beneficial in identifying a more homogeneous group of individuals with chronic widespread pain upon which to conduct research aimed at better understanding FM. Currently, separate clinical diagnostic criteria for FM do not exist. Applying the ACR criteria in clinical practice may overemphasise the importance of tenderness (e.g., oversampling for women), the importance of peripheral as opposed to central factors, and distress (e.g., distress raises tenderness). Clinically, patients with FM often complain of other symptoms beyond pain. Additional symptoms include fatigue, sleep disturbance, mood disturbance, cognitive dysfunction, and syndromes such as irritable bowel and bladder syndrome, and various forms of headache2. Each patient with FM experiences a number of different symptoms to varying degrees, which may change over time and with treatment, thus creating the need for continual assessment of the multidimensional nature of the condition. FM may occur on its own and also has been noted to be co-morbid with such diseases as rheumatoid arthritis, lupus, other chronic pain conditions, hypothyroidism, and infections such as Lyme disease or Hepatitis C. It is prevalent in at least 2% of the population, occurring more frequently in females than in males1. Current research posits that FM results from disordered central pain and sensory processing. Dysregulation of several neuropeptide and neurohormone networks has been identified, leading to a deficiency in pain inhibitory pathways and/or an increase in facilitatory networks3,4. The triggering and maintenance of FM appear to require both genetic disposition and environmental influences such as emotional or physical stressors or illness5.
Until the 1990s, there had been a paucity of well-controlled clinical trials of pharmacotherapy of FM. This was partly due to the lack of classification criteria for the condition, partly related to the lack of clear understanding about pathophysiology, uncertainty about what core symptom domains could be reliably measured, lack of objective markers of disease activity and severity, suboptimal confidence that measures could discriminate a therapeutic response, and perhaps a certain skepticism among some that the condition was in fact legitimate. Stemming from the work of Moldofsky and Smythe on sleep disorders in FM6, studies with tricyclic antidepressants (TCAs) were conducted and showed short-term benefit for pain and sleep in FM7. However, it was apparent that these agents were incomplete in their effectiveness and poorly tolerated.
In parallel with the increased understanding of the neuropathophysiology of FM, more specifically targeted and better tolerated pharmaceutical agents, which could potentially benefit FM symptomatology, became available. Examples of these include serotonin and norepinephrine reuptake inhibitors (SNRIs), which can augment the activity of these neurotransmitters, α2-δ subunit modulators that inhibit excitatory neuropeptides such as glutamate and substance P, and other neuromodulators as a means of diminishing pain and fatigue, improving sleep, and beneficially affecting other symptom domains of FM. Controlled trials of several of these agents have been conducted, and utilising a variety of measures, clinically meaningful improvements in pain, patient global impression of change, and function have been demonstrated compared to placebo. Two of these agents, pregabalin, an α2-δ modulator, and duloxetine, an SNRI, have been approved for the management of FM in the United States, and a third agent, milnacipran, an SNRI, has recently been approved8–12. This process has occurred, however, using a wide variety of outcome measures and approval has primarily been based on demonstration of efficacy in the domains of pain, patient global impression of change, and the total impact of FM as measured by the Fibromyalgia Impact Questionnaire (FIQ)13. There has been a need for scientific validation of a core set of domains that more fully constitute FM syndrome and should be addressed in clinical trials. Another need is for evaluation of the performance characteristics of domain measures to assure clinicians, regulators, and the public about the soundness of our ability to evaluate therapies in FM and to provide guidance to developers of new therapies.
Recognising this need, a group of clinician/researchers interested in FM gathered in 2004 to develop a workshop for Outcome Measures in Rheumatology Clinical Trials (OMERACT). The group included both academic and pharmaceutical-based researchers and focused on several areas. In order to gain a preliminary sense of the key domains needing to be assessed in FM, 23 clinician/researchers participated in a Delphi exercise based upon a list of domains developed by the expert steering committee of the working group. The results of this exercise and the voting which occurred in the workshop at OMERACT 7 are shown in Table 1 (ref. 14).
To gain some understanding of the performance characteristics of measures of these domains, a review of controlled clinical trials was conducted to determine the effect sizes of these measures15. Whereas pain and patient global measures appeared to be reliable and showed good effect sizes, this was not the case in other domains such as sleep, fatigue, and function, raising questions about either the effectiveness of therapy for these domains or the quality of measures. Other areas tackled by the group included a review of more objective measures being explored in FM, e.g., functional magnetic resonance imaging (fMRI), and potential linkages with patient reported outcomes developed by the World Health Organization International Classification of Functioning, Disability, and Health (ICF) project and the PROMIS network15. These projects represented broader and more in-depth attempts to characterise the full patient experience of disease, function, and health-related quality of life (HRQOL) impact of FM. The Delphi exercise concurred that the research agenda included continuation of each of these areas of work as well as gaining a more in-depth patient perspective on the outcomes domains relevant to FM.
A second FM workshop was conducted in 2006 at OMERACT 8. In preparation for this workshop, the now expanding working group, with the aid of MAPI Values, an independent research organization, conducted a series of patient focus groups in order to map the array of symptoms experienced and problems caused by FM16. Utilising the information gained in these discussions, a Delphi exercise was conducted amongst patients17. The key symptom areas which impacted the majority of these representative patients, although worded differently than in the clinician Delphi, had considerable conceptual overlap with the clinician Delphi exercise, thus providing face validity to the two different exercises. In addition, an updated review of the performance of outcome measures used in more recent clinical trials, objective measure data18, and linkage work with the ICF and PROMIS FM-extension projects (D Williams) was reported. The research agenda included the need to determine which of the key domains, identified by patients and clinician/researchers, represented the full core set of domains experienced in FM, whether areas of domains overlap, and to do the preliminary analysis necessary to develop a responder index for FM. Two FM OMERACT Steering Committee members (L Arnold and L Crofford) are co-PIs on an NIH funded project to develop such a responder index, which was outlined at this meeting. The core domain construct work subsequently completed based on the research agenda formulated at OMERACT 8 is outlined below and described in detail in the accompanying paper by Choy et al19. In addition, more complete understanding of the symptom domain of cognitive dysfunction (dyscognition) and appropriate measures for it was identified as a key subject for the research agenda. This history provides the foundation for the current report on the proceedings of the FM Workshop presented at OMERACT 9.
(1) Establish a core domain set for the assessment of fibromyalgia (FM) in clinical trials and practice, (2) review the performance characteristics of outcome measures, including patient reported outcomes, currently being used to assess FM domains, (3) discuss development of a responder index for the assessment of FM in clinical trials, (4) review objective markers of FM, (5) review the domain of cognitive dysfunction in FM and its potential assessment in clinical trials and practice, (6) establish a research agenda for further work to be done regarding FM outcomes research.
Since the OMERACT 8 meeting in May 2006, working group members met in regular teleconferences and in person at the ACR, EULAR and Myopain Society meetings. The working group, noted above, comprised clinician/researchers, statisticians, pharmaceutical industry representatives, and patients from North America, Europe, and Australia. There were four subgroups (leaders): (1) Domain construct (Ernest Choy, Philip Mease, Lesley Arnold, Dan Clauw, Jennifer Glass, Susan Martin, David Williams), (2) Outcome measures/Patient Reported Outcomes (PROs)/Responder index (David Williams, Susan Martin, Lesley Arnold), (3) Objective markers (Dan Clauw, Leslie Crofford, Jessica Morea), and (4) Cognition (Jennifer Glass). Liaison to the OMERACT executive committee was conducted by Lee Simon and Vibeke Strand. The group’s fellow was Jessica Morea. Patient participants were Lynne Matallana, Kathy Longley, Michael Peterman, and Sharon Waldrop.
As noted above, the working group had previously conducted a clinician/researcher Delphi exercise and patient focus groups and Delphi to determine the key domains considered to be important to assess in clinical trials of FM (Tables 1 and 2). In order to determine how well these domains approximate the totality of the FM experience for a patient (content validity) and how overlapping versus independent the domains were, data from FM clinical trials were analysed. Patient global impression of change (PGIC) was used as a surrogate of overall improvement and was the dependent variable in multivariate regression analyses, against which other domains were regressed. Outcome measures used in trials were mapped onto one or more of the following domains identified in the clinician/researcher Delphi: pain, patient global, fatigue, HRQOL, multidimensional function, sleep, depression, physical function, tenderness, dyscognition, and anxiety as well as stiffness, which had additionally been identified in the patient Delphi. Ten studies involving four pharmacological agents were analysed: two serotonin and norepinephrine reuptake inhibitors (i.e., duloxetine and milnacipran), one α2-δ modulator (i.e., pregabalin), and sodium oxybate (gamma hydroxybutyrate), all of which have shown efficacy in FM clinical trials. Details of this study are summarised in a separate paper in this issue of the Journal19.
Univariate analysis showed that instruments that measured these various domains showed moderate to high correlation with PGIC, highest with pain, fatigue, multidimensional function, physical function, and stiffness, and only moderate associations with depression, anxiety, and dyscognition. It should be recalled that in the majority of these trials, patients with major depressive disorder were excluded, which resulted in a lower effect size of change scores since baseline depression scores were low. In addition, only one trial utilised a measure of self-assessed cognition, partly because of uncertainty about how best to approach and assess this domain.
Multivariate analysis showed moderate to high values of R², with studies having more non-overlapping domains demonstrating higher values, suggesting that if key domains are not assessed, the variance accounted for in PGIC will be diminished. Pain, fatigue, physical function, multidimensional function, and depression were retained as separate domains in trials of all four compounds. Tenderness was retained as a domain separate from pain in all three trials in which it was assessed, suggesting that it is a sign of allodynia and/or hyperalgesia separate from the subjective impression of pain. Sleep was retained in two out of three clinical trial groups; stiffness, assessed by a single question in the FIQ, in two out of four; and dyscognition in none, the latter presumably related to either non- or insufficient assessment.
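The relationship between the number of domains assessed and the variance accounted for in PGIC can be illustrated with a small sketch. This is not the analysis reported in the accompanying paper: the change scores below are invented toy values, the variable and function names are hypothetical, and ordinary least squares is assumed as the regression method. The sketch only shows how R² for a PGIC model can fall when a domain that carries independent information is left out.

```python
# Toy illustration only: all scores are made-up numbers, not trial data.

def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def r_squared(predictors, outcome):
    """R^2 of an ordinary least squares fit of outcome on predictors (with intercept)."""
    n = len(outcome)
    design = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(design[i][a] * outcome[i] for i in range(n)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[a] * design[i][a] for a in range(k)) for i in range(n)]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Hypothetical change scores for eight patients (assumption, for illustration).
pain_change = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fatigue_change = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
# Toy PGIC score built to depend equally on both domains.
pgic_score = [0.5 * p + 0.5 * f for p, f in zip(pain_change, fatigue_change)]

r2_full = r_squared([pain_change, fatigue_change], pgic_score)
r2_pain_only = r_squared([pain_change], pgic_score)
print(f"R^2 with pain and fatigue: {r2_full:.3f}")
print(f"R^2 with pain only:        {r2_pain_only:.3f}")
```

In this toy example the two-domain model accounts for more of the variance in PGIC than the pain-only model, which mirrors the qualitative pattern described above. Real trial data would also need attention to scale direction, missing values, and collinearity between domains.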
The domain construct was discussed in breakout sessions, taking into account the clinician and patient Delphi exercises, the data analysis, and aided by the presence of patient participants. Voting, by audience response methodology, on the construct was done on two occasions: at the time of the module and in the plenary recap at the end of the meeting, when further clarification on key issues was offered.
There was little debate about whether core issues such as pain, fatigue, and patient global should be measured in all FM trials as relevant domains for the “inner” core set (Table 3). However, there was considerable discussion about other domains. One issue was the separation and overlap of the concepts of multidimensional function, physical function, and quality of life. The two principal instruments currently used for measuring these domains are the SF-36 and FIQ, which include both function and HRQOL questions. Work is underway to develop more sophisticated instruments that more comprehensively measure these domains through the PROMIS and PROMIS FM-extension project, and/or linkage with the ICF methodology. Since the current measures are primarily considered to be optimal instruments to assess the concept of multidimensional function, it was voted (63%), for the time being and until more optimal HRQOL instruments are available, to subsume these concepts under the phrase “multidimensional function”, which was voted to be a core domain item, keeping open the possibility of separating out HRQOL as more sensitive and specific instruments are developed. Tenderness separated from pain in the multivariate analysis and was considered by more than 60% in initial voting (Table 3) and 70% in revised voting (Table 4) to be in the core set as an essential domain to measure in all trials. Sleep disturbance has long been considered an important part of the FM experience, and was so endorsed in the clinician and patient Delphi exercises. However, in the data analysis, it did not correlate highly with PGIC and was somewhat insensitive to change. More careful analysis of the instruments used to assess sleep demonstrated that some subscales performed well and others, e.g., “snoring” in the Medical Outcomes Study (MOS) sleep scale, did not. Thus, the poor correlation with PGIC could have been due to dilution of quality of the scale by assessments that were irrelevant to FM patients.
It was agreed that there should be a focus on development and testing of more relevant measurements of sleep in FM and use of more sensitive subscales of existing measures. Thus, with further discussion, it was voted to include sleep disturbance in the inner core (Table 4).
Some domains were shown to be core domains in FM by the multivariate analysis but not considered by the majority of OMERACT attendees to be necessary to assess in all clinical trials in a development program. Depression was retained in the multivariate analysis as a core domain in FM; 65% voted that it should be assessed in some trials, but only 35% felt it should be assessed in all trials. Thus, in Figure 1 it is listed in the second circle. Cognitive dysfunction, or dyscognition, was noted to be an important domain by patients and to a lesser extent by clinician/researchers in previous Delphi exercises. However, full understanding of it as a domain and how best to assess it in FM trials is still uncertain and is an active research issue. Given its importance as a domain, 38% felt it should be in the core set and 45% thought that it should be measured in some trials. Thus, dyscognition was moved to the second circle (Figure 1).
Several domains were highlighted in discussions as being of potential interest to further explore and are listed in the third circle. Stiffness has been identified as an important symptom domain by patients. In the multivariate analysis it did not separate out as a domain distinct from pain in all trials, although it was only assessed with a single question in the FIQ. Thus, it is part of the research agenda (outer circle). Functional imaging and cerebrospinal fluid biomarkers are examples of potential objective markers that may be important and discriminative, although not currently feasible for all trials. These were, therefore, listed in the research agenda. Because anxiety was considered to be an essential part of the core set by just 18%, it was placed in the outer circle.
In previous FM workshops, adverse events (AEs) were listed as an important domain to assess in trials. Since AEs are naturally assessed in clinical trials, it was not felt to be necessary to be listed as a symptom in the core set.
The Outcomes Measurement (OM) committee within the FM Working Group of OMERACT works to identify patient reported outcome measures (PROs) that best assess the domains of most relevance to individuals with FM. The work of this group is informed by ongoing initiatives either within or outside of OMERACT and at the OMERACT 9 meeting this group presented data in the following areas: (1) building evidence supporting the valid use of existing PROs specifically for FM, (2) developing responder indices based upon existing PROs, (3) further refinement of the domain definitions of relevance specifically for FM, (4) developing new and next generation outcomes measures specific to FM, and (5) integrating the guidance of regulatory bodies to the work of improved outcomes measurement in FM.
Many of the outcomes measures currently used in FM research were developed and validated for use with other medical conditions. Thus many of the indices used to assess the domains of relevance were “adopted” from other conditions. Adopting instruments is neither uncommon nor inappropriate when exploring a relatively new and poorly understood condition such as FM. For example, a research definition for FM did not exist prior to 1990 and until the recent work within OMERACT, there was no consensus regarding the clinical domains of relevance to this condition. Lacking basic foundations in the understanding of this condition precluded the interest, time, and funding needed to develop assessment instruments more specific to FM. Borrowing and adopting assessment instruments has facilitated basic exploration into the nature and impact of FM, and represents a methodological advance over previous un-standardised methods of inquiry.
As interest in and understanding of FM mature, the need for greater rigor in assessment methods also advances. It is plausible to suspect that “adopted” instruments are suitable for use in FM; however, support of this suitability needs to be based upon performance within individuals with FM rather than upon assumptions of equivalence. Several studies are currently underway examining the performance characteristics in FM of instruments validated in other conditions. An example of one such effort is the ongoing work of V Strand and colleagues on the use of the SF-36 in FM. Importantly, as with other rheumatic diseases, the SF-36 represents a generic measure of health-related quality of life (HRQOL) that meets the OMERACT filter in RA, OA, SLE and FM, and may be well suited for use with other disease specific instruments, once developed.
The SF-36 is a brief, well established, self-administered patient questionnaire for the assessment of HRQOL that can also be viewed as a measure of multidimensional function, including “participation”20. The SF-36 measures eight domains of health status: physical functioning, role limitations because of physical problems, bodily pain, general health perceptions, energy/vitality, social functioning, role limitations due to emotional problems, and mental health. Summary scores for physical (PCS) and mental (MCS) health status can be calculated by combining and weighting the various individual scales21. Individual or group domain and summary scores can be compared to national norms for the US and other populations or contrasted for various medical conditions22.
To date, the SF-36 has been used in over 70 studies involving individuals with FM, including randomised controlled trials (RCTs) of tramadol, gabapentin, pregabalin, duloxetine, and milnacipran. The domains of coverage within the SF-36 map nicely with the domains identified as being relevant in the aforementioned Delphi studies. Domain scores have been consistently observed to improve in studies where active treatment arms can be compared to placebo, supporting the SF-36 as being responsive to change in individuals with FM when change is expected to occur. To date, there is much evidence supporting the use of the SF-36 as an index of multidimensional function as it satisfies the OMERACT filter for FM. Of interest, data from both RCTs and longitudinal observational studies demonstrate remarkably similar decrements in baseline domain and PCS and MCS summary scores compared with age/gender and population matched normative data. Trials of gabapentin, pregabalin, duloxetine and milnacipran have demonstrated treatment associated mean improvements in summary and domain scores that are remarkably similar and well exceed minimum clinically important differences (MCID)9,23–27.
Responder indices have become popular for identifying treatment successes in illnesses where improvement needs to occur across multiple domains. Such responder indices have a history of use in FM. However, consistent with the work on relevant domains, there has not been consensus regarding composition of these indices. For example, Simms et al28,29 reported on the use of an index requiring improvement on four out of six criteria defined as 50% improvement in pain, sleep, fatigue, patient global and physician global, dolorimeter improvements and improved myalgic score. These criteria were later used in RCTs of amitriptyline30. This initial response index for FM was a first attempt beyond assessment of pain and tenderness in clinical trials. However, the Simms criteria were not as sensitive as would be desired in part because physical function was not included. A second attempt at a responder index for FM was the work of Dunkl et al31, requiring improvements in three of four measures including FIQ, pain intensity, tender point count, and pain intensity.
Clinical trials of new compounds for FM have also used responder indices as primary efficacy endpoints. For example, in RCTs of milnacipran, a responder index required participants to report ≥30% improvement in pain intensity, a patient global change of “moderately improved or much improved” and ≥6 points improvement in SF-36 PCS score32. Clinical trials of sodium oxybate used a different responder index: ≥20% improvement in pain intensity, ≥20% improvement in FIQ, and a patient global assessment of “much better or very much better”33.
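A composite responder index of this kind is an all-or-nothing rule across several domains. The following is a minimal sketch of the milnacipran-style definition as quoted above (≥30% improvement in pain intensity, a patient global change of “moderately improved or much improved”, and ≥6 points improvement in SF-36 PCS). The class and function names, the assumption that lower pain scores are better, and the example patients are illustrative only and are not taken from the trial protocols.

```python
from dataclasses import dataclass

# Patient global ratings counted as a response, as quoted in the text above.
PGIC_RESPONSE_RATINGS = {"moderately improved", "much improved"}


@dataclass
class TrialOutcome:
    """Hypothetical per-patient record; field names are assumptions."""
    pain_baseline: float    # pain intensity at baseline (higher = worse)
    pain_endpoint: float    # pain intensity at endpoint
    pgic_rating: str        # patient global impression of change
    sf36_pcs_change: float  # endpoint minus baseline SF-36 PCS (higher = better)


def percent_pain_improvement(outcome: TrialOutcome) -> float:
    """Percentage reduction in pain from baseline; 0 if baseline is not positive."""
    if outcome.pain_baseline <= 0:
        return 0.0
    reduction = outcome.pain_baseline - outcome.pain_endpoint
    return 100.0 * reduction / outcome.pain_baseline


def is_composite_responder(outcome: TrialOutcome) -> bool:
    """True only if all three component criteria are met."""
    return (
        percent_pain_improvement(outcome) >= 30.0
        and outcome.pgic_rating.lower() in PGIC_RESPONSE_RATINGS
        and outcome.sf36_pcs_change >= 6.0
    )


# Invented example patients (not trial data).
meets_all = TrialOutcome(70.0, 45.0, "much improved", 7.0)
# "minimally improved" is a made-up rating label used only to fail the global criterion.
fails_global = TrialOutcome(70.0, 45.0, "minimally improved", 7.0)

print(is_composite_responder(meets_all))
print(is_composite_responder(fails_global))
```

The sodium oxybate definition quoted above could be expressed in the same way by swapping the thresholds and replacing the SF-36 PCS criterion with a ≥20% improvement in FIQ. Because every component must be met, a composite index of this form is stricter than any single-domain responder definition.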
Thus, to date, most responder indices have been rationally derived, based upon what investigators or regulatory bodies deemed to represent improvement in the context of a clinical trial involving FM. Given that consensus regarding relevant domains is only just evolving, most responder indices have not benefited from a data driven development process. Arnold, Crofford, and colleagues are currently working on an NIH/NIAMS sponsored project to develop a responder index for FM that is based upon both consensus and empirical data for eventual use in RCTs of FM. This project begins with the consensually derived domains for FM, links existing assessment instruments to each domain, evaluates each measure for five types of validity in FM, and evaluates the performance of each instrument as a member of a composite index. The project also establishes consensus among clinicians regarding criteria for improvement in FM, tests the consensually derived criteria with empirical data, and identifies which definitions of improvement result in the fewest placebo improvements. This project is ongoing and will inform the efforts of the working group in subsequent sessions of OMERACT.
Identification of consensually derived domains of relevance for FM is an important first step in gaining a better understanding of what needs to be assessed in FM. However, studies that attempt to validate adopted measures for use in FM must rely on several assumptions. First, instruments purporting to measure a given domain (e.g., fatigue) will in fact measure those facets of fatigue that are relevant to individuals with FM. Second, the domain names (e.g., fatigue) have shared meaning for individuals with FM, clinicians, and other medical populations from which existing measures may have been adopted. Early investigations into this area of inquiry suggest that neither assumption holds completely.
Perhaps the largest body of work in this area comes from the investigators associated with The International Classification of Functioning, Disability, and Health project (ICF) within the World Health Organization (ICF-WHO). The ICF developed a domain categorisation coding system that identifies the relevant domains of functional status assessment for medical illnesses in general34. This large system can be broken down into core sets for specific illnesses. Currently the closest core set to FM is the “chronic widespread pain (CWP) core set”. CWP affects 5–15% of the population and includes FM as an extreme subset35–37. When used, this coding system helps to identify relevant domains of functional limitations for different diseases/conditions and then provides a code (much like the ICD-10) that identifies the area of functioning that is affected by the condition.
Various standardised instruments used to assess domains in FM have been examined and items within each instrument have been mapped to specific categories (subdomains) within each broader domain (e.g., fatigue can be subcategorised into physical, mental, motivation, etc.). One recent study found that out of 42 RCTs in FM, 27 different questionnaires were used to assess FM. From the 27 different questionnaires, 1138 distinct health-related concepts could be identified based upon items. These concepts were linked to 113 ICF categories. The questionnaires differed greatly from one another with regard to the specific subdomain categories covered and the relative importance paid to the broader ICF domains of body structure, body function, activities/participation, and environmental factors. The least well covered broad domain across all existing questionnaires was environmental factors38.
A second manuscript explored differences in the ICF categories that were represented in PROs commonly used in FM research that purportedly assess the same construct. This manuscript applied ICF linkages to common indices of pain, fatigue, sleep function, and affect. In each case the domains were indexed by assessment tools that varied substantially, depending upon which assessment tool was chosen. Thus, quite disparate conclusions might be found for a given construct based upon which assessment instrument was used and which specific facets of the construct the instrument and its scales emphasise39.
That different instruments emphasise different facets of constructs is not always limiting. As we learn more about how patients with FM define and think about the various domains of relevance, we will be better able to match our assessment instruments to the way individuals with FM use these terms resulting in improved assessment ability with increased sensitivity in our measures of outcomes. That different instruments assess different facets of domains is also reason not to limit by decree which assessment tools must be used for FM, as the choice of instrument might be best driven by which facets of the domain a given intervention hopes to address.
Efforts to learn about how patients with FM think about the domains of relevance are currently in progress. Methodologies typically start with a consensually derived generic definition of the domain (e.g., ICF definitions or definitions from the NIH Roadmap PROMIS project) that is then agreed to or modified by focus groups of individuals with FM.
One such study recently presented at an NIH PROMIS conference found generally good agreement among patients with FM with generic definitions of pain, fatigue, negative mood, and physical functioning. For each domain however, insufficient depth of impact was expressed as a concern of the definition. For example, individuals with FM reported that most existing definitions of fatigue focused on simply being tired and failed to capture the profound unrecoverable and disabling exhaustion that accompanies FM40.
Perhaps the largest scale project aimed at developing new highly sensitive FM-specific measures for the domains of relevance to FM is the NIH/NIAMS sponsored project “FM-Specific extension of the PROMIS network”. PROMIS is an NIH Roadmap initiative that is building a next generation Patient-Reported Outcomes Measurement Information System (PROMIS). In the development of PROMIS, each domain is defined generally, and then patient reported outcome measures are developed and linked to those specific domain definitions. PROMIS, still under development, is to be a publicly available user-friendly computerised adaptive testing (CAT) system that will be used for the efficient generic measurement of patient reported outcomes (PROs) across a wide range of chronic diseases and dimensions41. While costly and time-consuming to develop and maintain a national public resource of this nature, the benefit of this system will be the ability to assess multiple domains using fewer items (i.e., less patient burden) with greater precision (i.e., increased power for clinical trials with fewer subjects).
PROMIS was established for the general assessment of chronic illnesses and, as might be expected, many of the domains identified in PROMIS are of relevance to FM, such as pain, fatigue, negative mood, and physical function. Several domains identified in the OMERACT Delphi exercises, however, were not included in the first iteration of PROMIS, such as sleep disturbance, dyscognition, stiffness, and tenderness. Williams and colleagues are currently participating in a cooperative agreement with NIH/NIAMS to develop an FM-specific extension of PROMIS. The goals of this initiative include the following: (1) determining whether the existing PROMIS definitions of pain, fatigue, physical function, and negative mood hold up or require modification for patients with FM, (2) developing new definitions for sleep disturbance, dyscognition, stiffness, and tenderness for FM, (3) developing new item banks for the new domains and/or supplementing existing banks with FM-specific items, (4) large-scale field testing following the methods of the larger PROMIS initiative, thus facilitating the development of FM-specific calibrations for existing and new item banks, and (5) developing static short forms and CAT assessments specific to the domains of relevance for FM. These new item banks and calibrations will be merged within the context of the larger PROMIS Roadmap initiative.
Many of the domain assessment tools currently in use for FM were developed in academia for purposes of exploring and gaining a better understanding of FM. With broader interest and new treatments for FM, researchers and PRO developers must become aware of not only methods of test development but also the guidelines of regulatory bodies such as the United States Food and Drug Administration (FDA) and the European Medicines Agency (EMEA) if the assessment device is to be used in a clinical trial for eventual product approval. One such regulatory body, the FDA, released a valuable draft regulatory guidance for (1) the use of existing measurement tools, (2) the development of new measurement tools, and (3) the transition of tools from one medium to another (e.g., paper to electronic formats)42. Of particular importance in the draft guidance is the documentation of patient input during the PRO instrument development process, both in the identification of the domains of importance in any particular disease area and in item-level development and evaluation. Understanding the current PRO instrument requirements from the perspective of regulatory bodies can guide decisions about whether to use currently available instruments or to develop new ones.
Participants at OMERACT 9 were presented an update on the current understanding of the underlying pathophysiology of FM and the biomarkers that relate to these pathophysiological processes. Researchers and clinicians now view FM as a common pain syndrome characterised by primarily central, non-nociceptive pain, and a variety of aberrant pain and sensory processing pathways have been identified that can lead to pain or sensory amplification.
All potential biomarkers that have been identified to date in FM are related in some way to this central amplification. Current biomarkers under study include, but are not limited to, experimental or evoked pain testing (EPT); MRI (sometimes during EPT); levels of neurotransmitters in cerebrospinal fluid, including Substance P, glutamate, serotonin, and norepinephrine; muscle biopsy; polysomnography (PSG); cytokines; and sensory testing.
The objective biomarker breakout session focused on three main issues: (1) the “objectivity” of biomarkers, (2) whether a marker belonged in the core domain of outcomes that must be measured in a clinical trial or required further study before becoming a core domain, and (3) the application of the OMERACT filter of truth, discrimination and feasibility to specific biomarkers.
Neurotransmitters and muscle biopsy were the only markers designated as totally objective, but no single marker was designated as a core domain. When applying the OMERACT filter, some markers were considered more useful in research than in clinical practice. For example, polysomnography was considered truthful and discriminating but might only be feasible in a clinical trial where the investigational intervention aims to improve sleep, and may not be feasible in clinical practice. In the case of neural imaging, some participants rated this biomarker a 7 out of 10 in terms of truth but rated it low on the feasibility scale due to cost, and did not believe there was yet enough evidence to assign a score on the discrimination scale.
The goal of OMERACT is to place a set of disease markers through a filter of truth, feasibility, and discrimination in order to achieve a succinct and practical set of outcomes with which to measure change in health status. With numerous available markers measured in so many different ways, it is impossible to compare the efficacy of potential treatments. In preparation for OMERACT, and during the workshop, it became evident that there were too many biomarkers with too little evidence to support the existence of a core set that would pass through the OMERACT filter. As such, the biomarker research agenda focused on a single class of biomarker that has the most support for feasibility, truth, and discrimination: experimental or evoked pain testing (EPT).
EPT, which encompasses multiple techniques, including tender point intensity, pressure pain thresholds, and heat/cold thresholds, is emerging as a promising evidence-based biomarker. The goal of experimental pain testing is to quantify the experience of pain objectively and to demonstrate that FM is related to aberrations in central, rather than peripheral, pain processing. Hyperalgesia (increased pain in response to normally painful stimuli) and allodynia (pain in response to non-painful stimuli) both implicate central pain mechanisms, and both are measured by EPT.
Research shows that some methods of EPT are correlated with reports of clinical pain in FM patients. For example, Geisser et al found that dolorimetry and pressure thresholds were associated with clinical pain, but heat stimuli were not43. Particularly, the use of the multiple random staircase (MRS) method for delivering pressure stimuli has been shown to be associated with patients’ reports of clinical pain. MRS uses an interactive software system to determine low, medium, and high pain intensity thresholds for each subject based on their response to random stimuli. Harris et al compared MRS to other evoked pain measures and found that it was the only “objective” technique that tracked with improvement during the course of treatment44. Such findings may indicate that experimental pain testing, and MRS specifically, correspond to a patient’s clinical condition, rendering this type of testing a potential biomarker of disease status, progression, and improvement. In addition, MRS is not subject to bias in terms of variation between clinicians or fluctuations within individual clinicians, as with tender point counts, and is not associated with patient distress45,46. With both dolorimetry and tender point count, the patient is aware of when the stimulation is forthcoming47, and such techniques have been shown to be influenced by patient distress48. EPT in FM yields a measure of objective pain that correlates with clinical pain, is less subject to bias, underscores the central pain mechanisms in FM, and is less invasive than other biomarkers (e.g., collecting cerebrospinal fluid; CSF).
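The idea of estimating low, medium, and high pain thresholds from unpredictably ordered stimuli can be illustrated with a generic interleaved up-down staircase. The sketch below is not the published MRS algorithm. The 0 to 20 rating scale, the target ratings, the step size, the function names, and the noise-free simulated subject are all assumptions chosen for illustration.

```python
import random


def run_interleaved_staircases(rate_pain, targets, start=1.0, step=0.25,
                               n_blocks=20, seed=0):
    """Run one up-down staircase per target rating, in random order.

    rate_pain: function mapping stimulus pressure to a pain rating.
    targets:   dict of staircase name -> target pain rating.
    Returns a dict of staircase name -> estimated threshold pressure.
    """
    rng = random.Random(seed)
    level = {name: start for name in targets}    # next stimulus per staircase
    history = {name: [] for name in targets}     # stimuli actually delivered

    for _ in range(n_blocks):
        order = list(targets)
        rng.shuffle(order)   # the subject cannot predict which staircase is next
        for name in order:
            pressure = level[name]
            history[name].append(pressure)
            if rate_pain(pressure) < targets[name]:
                level[name] = pressure + step              # too weak: go up
            else:
                level[name] = max(0.0, pressure - step)    # at or above: go down

    # Estimate each threshold as the mean of the last four stimuli delivered.
    return {name: sum(h[-4:]) / 4 for name, h in history.items()}


def simulated_subject(pressure):
    """Hypothetical noise-free subject rating pain on a 0-20 scale."""
    return min(20.0, 4.0 * pressure)


# Hypothetical target ratings for low, medium, and high pain intensity.
targets = {"low": 2.0, "medium": 8.0, "high": 14.0}
thresholds = run_interleaved_staircases(simulated_subject, targets)
```

Each staircase settles into an oscillation around the pressure at which the subject's rating crosses its target, so the three estimates come out ordered from low to high. Randomising the order of the staircases is what keeps the subject from anticipating the next stimulus, in contrast to dolorimetry and tender point counts.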
Although pain and fatigue are hallmark symptoms of FM, many patients find that problems with cognitive function (dyscognition) are just as troublesome5,15,49. A small but growing body of literature supports the presence of dyscognition in FM50. In this section the current state of knowledge about dyscognition is reviewed.
Measurement of dyscognition can be divided into two categories: self report of cognitive difficulties, and performance-based measures of cognition; most reports are performance-based50. About one dozen studies have been published that use either standardised neuropsychological tests or non-standardised but common measures from cognitive science. Although these studies have used a variety of measures, a pattern has begun to emerge in which deficits are seen in four separate cognitive systems. Most notably, problems with verbal working memory have been consistently reported. Working memory refers to a memory system that combines short-term storage (on the order of seconds) with other mental operations such as retrieving knowledge from semantic memory and deleting or adding items. Working memory is an important construct in cognition because it functions as a basic skill. Studies using four different measures of working memory, the Paced Auditory Serial Addition Test (PASAT)51–53, the Reading Span Test54, the Test of Everyday Attention55, and Consonant Trigrams53, have all found impairment in this crucial cognitive system.
Related to working memory are attention and executive control. Attention is the ability to maintain focus on a specific item, task or location. Executive control involves the many processes used to maintain focus, such as ignoring irrelevant items, suppressing responses not consistent with a goal, and planning. The results from the PASAT and the Test of Everyday Attention point to a problem with executive control of attention in FM. Ongoing work indicates greater memory impairment in FM patients when they are distracted56,57,53. It is important to point out that most standardised neuropsychological tests are conducted without distraction.
Deficits are also seen in memory systems with longer duration. Episodic memory refers to the ability to remember a specific episode. Many of our memory tasks fall into this category, such as remembering a list of items to buy at the grocery store. FM patients perform more poorly than controls on word list tasks54 as well as standardised tests of memory52,53,58,59.
The final area where deficits have been reported is in semantic memory, particularly the ability to access semantic memory. Semantic memory refers to our knowledge of facts. It is separate from episodic memory (e.g., you may remember that there are 12 inches in a foot, but not remember when you learned this fact). Patients anecdotally report word finding problems, and there are reports of decreased performance on both verbal fluency tasks54,58 and on vocabulary tests54.
There are now a number of computerised neuropsychological batteries (e.g., CANTAB, COGSTATE). Computerised batteries would help the ease of testing, data collection, and interpretation across clinical trials and other studies. To date, there is only one report that used a computerised battery, the Automated Neuropsychological Assessment Metrics60. Unfortunately, this battery did not yield any differences between FM patients and controls, perhaps due to the lack of distraction and working memory tests. Future work will be needed to assess the utility of other computerised neuropsychological batteries in FM research.
Self report of cognitive function is an important addition to performance-based measures because self report can be influenced by many factors, including effort required for performance, stress regarding performance, and depression. There is a surprising paucity of studies using self-report instruments of dyscognition in FM, although several studies include one or two items about memory or concentration. An exception is a study of memory beliefs in FM with the Metamemory in Adulthood Questionnaire, used frequently for studying memory in older adults56. FM patients reported lower memory capacity, more memory deterioration, lower self-efficacy over memory, higher anxiety about memory performance and more strategy use to support memory than age- and education-matched controls. Among FM patients, performance on a memory task was correlated with perceived memory capacity. Further work using other well-validated self-report measures of cognitive function would be very helpful in clinical trials, since self-report measures are easy to administer and fulfill the need for patient-reported outcomes.
To summarise, the existing data support the view that dyscognition is a salient symptom, and objective cognitive impairments can be demonstrated in FM patients. This will be important in future clinical trials, but the field is not yet at the point where we can recommend outcome measures that should be included in all trials. In addition, during break-out discussions, three important areas that have not been well studied were identified. First, there was considerable concern about how other factors (e.g., depression, anxiety, fatigue, and medications) could influence dyscognition. Second, some aspects of dyscognition described by patients have not been well studied, in particular the idea of mental exhaustion and feelings of dissociation. Finally, there was a good deal of discussion about the frequent lack of correspondence between objective cognitive testing and self report of dyscognition. The group noted that self report may also include other non-cognitive aspects. For example, someone with cognitive losses compared to their pre-illness state may still perform well on testing if their pre-illness state was above average.
FM is a condition characterised by chronic widespread pain, excessive tenderness, and a number of associated symptoms such as fatigue, sleep disturbance, mood disorder, and cognitive dysfunction with associated impairment of function and HRQOL. The symptom complex is caused by dysregulation of central sensory processing systems. Evidence points to genetic, environmental, and concomitant disease state factors in its etiology. As therapies are developed that not only address pain, but also other symptom domains, clinicians, regulatory agencies, patients, and others need to know the relative contribution of these various domains to the disease experience of the patient and how best to measure them in a reliable and feasible manner in clinical trials.
The primary objective of the OMERACT 9 FM module was to achieve relative consensus on a domain construct for FM clinical trials. This was accomplished through (1) review of work presented in previous OMERACT workshops (clinician/researcher Delphi, patient focus group and Delphi exercises), (2) presentation of a study in which the key clinical domains identified in these exercises were mapped against the patient global impression of change noted in 10 FM pharmacologic studies in order to determine the degree to which key domains both constituted the global patient experience of FM and were not completely overlapping, (3) presentation of the current status of outcome measures, objective biomarkers, and understanding about disease state, (4) discussion of the above in breakout groups, and (5) a voting process. Figure 1 demonstrates the outcome of this process. Domains considered essential to measure in all FM clinical trials include pain, tenderness, fatigue, patient global, multidimensional function, and sleep disturbance. Domains considered important to measure at some point in a clinical development program, but not essential to measure in all clinical trials, are depression and cognitive dysfunction, also known as dyscognition. Domains that are of research interest and considered elective to measure at this time, include stiffness, anxiety, and objective markers such as functional imaging, e.g., fMRI, and cerebrospinal fluid biomarkers. It is well recognised that this domain construct is a “work in progress”. For example, it is recognised that there are important elements of HRQOL that are not necessarily subsumed under the concept of “multidimensional function”, yet the best instruments that we currently have available, the SF-36 and FIQ, to measure these domains are primarily measures of function. 
Further, whereas both clinical experience and emerging research suggest that cognitive dysfunction is an important clinical domain in FM, the optimal methods to assess it are still very much in development; thus it is uncertain whether it will ultimately be considered an essential domain to measure in all trials. As new and more sophisticated instruments become available to measure more completely the totality of patient experience vis-à-vis these domains, and as we gain a fuller understanding of the disease process and better ways to measure the impact of therapeutic intervention, this framework is expected to evolve.
As in previous OMERACT meetings, an update was provided on the outcome measures used in FM trials and the current status of objective markers of FM disease state. The quality of their performance was discussed, and areas where improvement is necessary were reviewed, particularly the assessment of sleep, mood disturbance, tenderness, stiffness, multidimensional function and HRQOL. Since the majority of outcome measures are PROs, it is important that they fulfill the standards of evidence being developed by regulatory agencies. The working group reported on several projects underway, which will be more fully reviewed in future OMERACT meetings as part of the group’s research agenda: linkages with existing disease assessment networks such as PROMIS and the ICF, and the development of an FM responder index. These will be developed in the context of the OMERACT filter of truth (forms of validation), discrimination, and feasibility. Objective markers of FM disease state continue to be developed, such as CSF biomarkers and functional imaging. The relationship of these markers to disease state and their ability to reflect change with treatment remain on the research agenda.
Special focus was placed on the domain of cognitive dysfunction in FM during the course of the module. This domain is ranked highly by patients in terms of disease impact, and understanding of this problem is emerging. There have been fledgling attempts to measure change in this domain via self-assessment questionnaires. There are a number of more objective and potentially feasible applied measures, e.g., computer-based cognition assessment methods, which are beginning to be studied in FM clinical trials and will be reviewed at future OMERACT meetings.
We acknowledge the support of Nooshine Dayani, Qu Peng and Robert Palmer from Forest Laboratories Inc., Chinglin Lai, Yanping Zheng and Diane Guinta from Jazz Pharmaceuticals Inc., Daniel Kajdasz and Amy Chappell from Eli Lilly & Co., and Gergana Zlateva and Emir Birol from Pfizer Inc.
L Arnold, D Clauw, L Crofford, P Mease, and DA Williams are supported in part by Grant Number AR053207 from NIAMS/NIH. E Choy is supported in part by an Integrated Clinical and Academic Centre grant from the Arthritis Research Campaign, UK. DA Williams is supported in part by Grant Number U01AR55069 from NIAMS/NIH. L Arnold and L Crofford are supported in part by Grant Number AR053207 from NIAMS/NIH.
Robert Allen, Dennis Ang, Lesley Arnold, Annelise Boonen, Daniel Buskila, Larry Bradley, Alarcos Cieza, Ernest Choy, Dan Clauw, Leslie Crofford, Brian Cuffel, Michael Gauthier, Michael Gendreau, Jennifer Glass, Don Goldenberg, Richard Gracely, Diane Guinta, Kim Jones, Chinglin Lai, Geoff Littlejohn, Yves Mainguy, Susan Martin, Lynne Matallana, Philip Mease, Jamal Mikdashi, Jessica Morea, Robert Palmer, Daniel Radecki, I Jon Russell, Stuart Silverman, Lee Simon, Michael Spaeth, Tanja Stamm, Raj Tummala, Olivier Vitton, Brian Walitt, David Williams, Madelaine Wohlriech, Gergana Zlateva
Philip Mease, Seattle Rheumatology Associates, Chief, Division of Rheumatology Research, Swedish Medical Center, Clinical Professor of Rheumatology, University of Washington, Seattle, Washington, USA.
Lesley M Arnold, Department of Psychiatry, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA.
Ernest H Choy, Director, Sir Alfred Baring Garrod Clinical Trials Unit, Academic Department Rheumatology, King’s College London, London, UK.
Daniel J. Clauw, Professor of Medicine and Psychiatry, University of Michigan, Ann Arbor, Michigan, USA.
Leslie Crofford, Gloria W. Singletary Professor of Internal Medicine, Chief, Division of Rheumatology & Women’s Health, University of Kentucky, Lexington, Kentucky, USA.
Jennifer M Glass, Assistant Research Scientist, Research Center for Group Dynamics, Research Assistant Professor, Department of Psychiatry, Division of Substance Abuse, University of Michigan, Ann Arbor, Michigan, USA.
Susan A Martin, RTI-Health Solutions, Ann Arbor, Michigan, USA.
Jessica Morea, Oregon Health and Science University, Portland, Oregon, USA.
Lee Simon, Associate Professor of Medicine, Harvard Medical School, Director of Rheumatology Clinical Research, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.
Vibeke Strand, Clinical Professor of Medicine, Adj., Division of Immunology/Rheumatology, Stanford University, Portola Valley, California, USA.
David A Williams, Professor, Anaesthesiology, Medicine, Psychiatry, and Psychology, University of Michigan, Ann Arbor, Michigan, USA.