An operant conditioning protocol that bases reward on the electromyographic (EMG) response produced by a specific CNS pathway can change that pathway. For example, in both animals and people, an operant conditioning protocol can increase or decrease the spinal stretch reflex or its electrical analog, the H-reflex. Reflex change is associated with plasticity in the pathway of the reflex as well as elsewhere in the spinal cord and brain. Because these pathways serve many different behaviors, the plasticity produced by this conditioning can change other behaviors. Thus, in animals or people with partial spinal cord injuries, appropriate reflex conditioning can improve locomotion. Furthermore, in people with spinal cord injuries, appropriate reflex conditioning can trigger widespread beneficial plasticity. This wider plasticity appears to reflect an iterative process through which the multiple behaviors in the individual’s repertoire negotiate the properties of the spinal neurons and synapses that they all use. Operant conditioning protocols are a promising new therapeutic method that could complement other rehabilitation methods and enhance functional recovery. Their successful use requires strict adherence to appropriately designed procedures, as well as close attention to accommodating and engaging the individual subject in the conditioning process.
Operant conditioning is a powerful method to induce behavioral learning; through operant conditioning, modification of a behavior is induced by the consequence of that behavior. In 1983, Wolpaw and his colleagues (Wolpaw et al., 1983) showed for the first time that a properly designed operant conditioning protocol could change the spinal stretch reflex (SSR), a largely monosynaptic behavior arising from the excitation of muscle spindle afferents. Variations of this protocol have been applied to condition the SSR or its electrical analog, the H-reflex, in monkeys, rats, humans, and mice; they have confirmed that a specific change (i.e., up- or down-regulation) can be induced in the targeted reflex through operant conditioning (for review: Wolpaw, 2010; Thompson and Wolpaw, 2014a).
All the different versions of this conditioning protocol have three key features: (1) they require maintenance of a certain level of background (pre-stimulus) EMG activity in the target muscle; (2) the reward is based on the size of the reflex measured as EMG activity; and (3) the reward contingency (i.e., whether larger or smaller reflexes are rewarded) remains the same over days and weeks. These protocols are designed to induce and maintain a long-term change in descending influence over the spinal reflex pathway, and to thereby produce targeted neuroplasticity in that pathway (Wolpaw, 1997). A comparable operant conditioning protocol for the motor evoked potentials (MEPs) evoked by transcranial magnetic stimulation (TMS) has recently been developed to induce targeted neuroplasticity in a corticospinal pathway (Brangaccio et al., 2014; Favale et al., 2014).
Because these protocols can change the function of specific neural pathways, they can be designed to address the specific functional deficits of an individual with a spinal cord injury (SCI) or other CNS disorder. In a study of people with spastic hyperreflexia due to incomplete SCI, the soleus H-reflex was down-conditioned, because hyperactivity in this reflex pathway impaired their locomotion (Thompson et al., 2013; Thompson and Wolpaw, 2014c). In contrast, in a study of rats with limping due to partial SCI, the soleus H-reflex was up-conditioned because soleus weakness impaired the stance phase of locomotion (Chen et al., 2006). In both cases, the intervention was effective; both the humans and the rats walked better. Because it can focus on an individual’s particular deficits, the targeted neuroplasticity that can be induced and guided by operant conditioning protocols is distinguished from less focused interventions such as botulinum toxin or baclofen, which simply weaken muscles or reflexes and may have undesirable side effects (Dario et al., 2004; Dario and Tomei, 2004; Sheean, 2006; Ward, 2008; Thomas and Simpson, 2012).
While a reflex operant conditioning protocol does induce plasticity in the targeted pathway, studies in monkeys and rats show that plasticity at other sites in the spinal cord and brain is also involved in the reflex change (Wolpaw, 2010; Thompson and Wolpaw, 2014a). In the spinal cord, conditioning-induced H-reflex change is accompanied by changes in motoneuron properties (e.g., firing threshold and axonal conduction velocity), in GABAergic terminals and several other terminal populations on the motoneuron, and in spinal interneurons. In the brain, plasticity occurs in sensorimotor cortex and/or closely related areas. The corticospinal tract (CST) is the only major descending pathway that is essential for conditioning. Taken together, these findings indicate that operantly conditioned change in a spinal reflex rests on a hierarchy of plasticity: the reward contingency produces plasticity in the brain, which in turn induces and maintains the plasticity in the spinal cord that is directly responsible for the conditioned H-reflex change (Wolpaw, 2010; Thompson and Wolpaw, 2014b, c).
The mechanisms of reflex conditioning are most readily studied in animals, as summarized above (Thompson and Wolpaw, 2014a); the time course of reflex change, while discernible in animals, is best analyzed in humans. This skill acquisition (i.e., acquisition of a larger or smaller H-reflex) can be dissected into two components: a rapid component, termed “task-dependent adaptation,” in which the reward contingency modifies CST output to produce an acute reflex change; and a slow component in which the CST output gradually induces the spinal cord plasticity underlying long-term reflex change (Wolpaw and O'Keefe, 1984; Wolpaw et al., 1994; Chen et al., 2001; Thompson et al., 2009a). In the human reflex conditioning protocol, the rapid component can be readily turned on and off by subject instruction, while the slow component is left unaffected. By doing this repeatedly over the course of conditioning, it is possible to track the development of each component separately (Thompson et al., 2009a).
In the human protocol, reflex size is measured in two different situations: control trials and conditioning trials. In control trials, the reflex is simply measured (without feedback as to reflex size). In conditioning trials, the reflex is measured while the subject is encouraged to increase (up-conditioning) or decrease (down-conditioning) reflex size and is provided with immediate visual feedback as to whether s/he has succeeded in producing a reflex larger (up-conditioning) or smaller (down-conditioning) than a criterion. Thus, the task of changing reflex size in the rewarded direction is imposed only in conditioning trials. The within-session difference in size between the reflexes of the control and conditioning trials reflects rapid task-dependent adaptation, while the change in the control reflex across sessions reflects long-term plasticity in the targeted reflex pathway (Thompson et al., 2009a).
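The two measures defined above can be computed directly from the control and conditioning reflexes of a session. The following is a minimal Python sketch, assuming that reflex sizes are given in any consistent unit and are normalized to the mean control reflex of the baseline sessions; the function name, argument names, and normalization step are illustrative assumptions rather than part of the published protocol.

```python
from statistics import mean


def decompose_reflex_change(baseline_control, session_control, session_conditioning):
    """Split one session's reflex change into its two components.

    Each argument is a list of H-reflex sizes. All results are expressed
    in percent of the mean baseline control reflex (an assumed normalization).
    """
    baseline = mean(baseline_control)
    control_pct = 100.0 * mean(session_control) / baseline
    conditioning_pct = 100.0 * mean(session_conditioning) / baseline
    return {
        # Across-session change in the control reflex (long-term plasticity).
        "long_term_change": control_pct - 100.0,
        # Within-session difference: conditioning reflex minus control reflex.
        "task_dependent_adaptation": conditioning_pct - control_pct,
        # Conditioning reflex itself, in percent of baseline.
        "conditioned_reflex": conditioning_pct,
    }
```

By construction, the long-term change and the task-dependent adaptation sum to the total departure of the conditioning reflex from baseline.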
Task-dependent adaptation and long-term change begin at different points in the multi-session study protocol, and they develop at different rates. Furthermore, their relative contributions to the final magnitude of reflex change appear to correlate with their impacts on other important motor skills, such as locomotion (Thompson et al., 2009a; Thompson et al., 2013; Makihara et al., 2014; Thompson and Wolpaw, 2014b, c). Figure 1 shows the conditioning H-reflex, the control H-reflex, and the within-session task-dependent adaptation (i.e., conditioning H-reflex minus control H-reflex) across the course of H-reflex down-conditioning in people with and without chronic incomplete SCI (Thompson et al., 2009a; Thompson et al., 2013). In people with SCI (A), the conditioned reflex decreases to 69% of the baseline value over 30 conditioning sessions; in neurologically normal subjects (B), it decreases to the same value over 24 conditioning sessions. However, task-dependent adaptation (i.e., within-session change), which is thought to reflect immediate change in cortical influence (e.g., on presynaptic inhibition), is significantly smaller in people with SCI than in neurologically normal subjects (−7% vs. −15%; Figure 1, bottom). This difference may be due to SCI-related damage to the CST (reviewed in (Wolpaw, 2010; Thompson and Wolpaw, 2014c)). CST damage may also account for the slightly slower course of H-reflex decrease (i.e., 30 sessions vs. 24 sessions in normal subjects) in people with SCI. On the other hand, the long-term change in the control H-reflex (i.e., across-session change), which is thought to reflect spinal cord plasticity, is significantly greater in people with SCI than in normal subjects (−24% vs. −16%; Figure 1, middle) (Thompson et al., 2013). Interestingly, this difference between people with and without SCI in the magnitude of long-term plasticity is reflected in the difference between them in the locomotor effects of H-reflex conditioning. 
H-reflex down-conditioning markedly improved locomotion in people with SCI (Thompson et al., 2013), while it did not disturb normal locomotion in people without SCI (Makihara et al., 2014). (This is further discussed below in the section “Functional impact of conditioning: negotiation of plasticity”.)
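The group values reported above can be checked with simple arithmetic, since task-dependent adaptation is defined as the conditioning H-reflex minus the control H-reflex, both in percent of baseline. The short Python sketch below does this; the function name is hypothetical, and treating the two components as additive follows from that definition.

```python
def component_shares(task_dependent_pct, long_term_pct):
    """Return (final conditioned reflex in % of baseline,
    fraction of the total change that is long-term change)."""
    total_change = task_dependent_pct + long_term_pct
    return 100 + total_change, long_term_pct / total_change


# Values quoted in the text for Figure 1.
sci = component_shares(-7, -24)      # people with SCI
normal = component_shares(-15, -16)  # neurologically normal subjects
```

Under this reading, both groups reach 69% of baseline, with long-term change accounting for roughly three quarters of the decrease in people with SCI and about half of it in neurologically normal subjects.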
The human H-reflex operant conditioning protocol that allows tracking of the two distinct components of skill acquisition (Thompson et al., 2009a) has the three key features described above. At the same time, it differs from the animal protocols in several ways: (1) conditioning occurs in discrete 1-hr sessions of 225 conditioning trials each at a rate of 3/week over 8-10 weeks (thus people complete only 3-5% as many trials as the rats, which are continuously exposed to conditioning over 50 days); (2) the EMG recording and nerve stimulating electrodes are superficial rather than implanted; (3) the reward is visual feedback rather than a food pellet; and (4) each conditioning session begins with 20 control trials in which the subject is not asked to change the reflex and receives no feedback as to reflex size. The standard human conditioning protocol consists of 6 baseline (i.e., control) and 24 (or 30) conditioning sessions at a rate of three per week. In each trial, the reflex is elicited while the subject maintains a pre-determined level of background EMG and stable posture and joint angles (Figure 2). The effective strength of the stimulus that elicits the reflex is kept constant (at just above M-wave threshold) within and across the sessions. In each baseline session, three blocks of 75 control reflexes each (i.e., 225 total) are elicited. In each conditioning session, 20 control reflexes are elicited as in the baseline sessions and then three blocks of 75 conditioning reflexes (i.e., 225 total) are elicited. In these conditioning trials, the subject is asked to increase (up-conditioning) or decrease (down-conditioning) reflex size and is given immediate visual feedback as to whether the reflex was larger (up-conditioning) or smaller (down-conditioning) than a criterion value. Satisfying the criterion on more than a specific percent of the trials earns an additional monetary reward.
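The session and trial counts described above can be collected in one small structure, which makes the totals easy to verify. This is a sketch only: the class and field names are hypothetical, and the default values are simply the numbers quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditioningSchedule:
    baseline_sessions: int = 6
    conditioning_sessions: int = 24   # 24 or 30 in the text
    sessions_per_week: int = 3
    blocks_per_session: int = 3
    trials_per_block: int = 75
    control_trials_before_conditioning: int = 20

    @property
    def trials_per_session(self) -> int:
        # Three blocks of 75 reflexes, i.e., 225 per session.
        return self.blocks_per_session * self.trials_per_block

    @property
    def total_sessions(self) -> int:
        return self.baseline_sessions + self.conditioning_sessions

    @property
    def total_conditioning_trials(self) -> int:
        return self.conditioning_sessions * self.trials_per_session

    @property
    def conditioning_weeks(self) -> float:
        return self.conditioning_sessions / self.sessions_per_week
```

With the defaults, `ConditioningSchedule()` gives 225 trials per session, 30 sessions in total, and 8 weeks of conditioning; `ConditioningSchedule(conditioning_sessions=30)` gives 36 sessions and 10 weeks.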
For successful implementation of an operant conditioning protocol in human subjects, it should be kept in mind that operant conditioning is a method to induce learning. The person is being asked to learn how to change the brain’s descending influence (i.e., conveyed by the CST) over the spinal pathway of the H-reflex in a specific direction (i.e., to increase or decrease the reflex). Thus, the principles that govern general skill learning apply here as well. Successful operant conditioning of spinal reflexes (or other EMG responses, e.g., TMS-evoked MEPs) in humans requires that correct subject set-up and session procedures be meticulously followed in every session. In addition, the ongoing interactions between the subject and the investigator during the sessions are important; the investigator serves essentially as a coach who encourages and guides the subject in mastering and maintaining the change in CST activity that is responsible for task-dependent adaptation and the gradual long-term change. Without careful adherence to protocol procedures and good coaching, conditioning failures and subject withdrawal prior to study completion are more likely.
Effective operant conditioning of spinal reflexes (or other EMG responses) in human subjects requires meticulous repetition of completely standardized conditioning sessions. This is a challenge peculiar to human conditioning, since in animals the electrodes are chronically implanted and conditioning occurs throughout the day, without the need for daily preparation (Wolpaw and Herchenroder, 1990; Chen and Wolpaw, 1995; Carp et al., 2006). Reproducing the same experimental conditions over and over is an essential part of human conditioning. If the conditions vary between sessions, it is difficult for the subject to master and maintain the targeted direction of reflex change. It should also be recognized that each conditioning session is likely to have persistent impact; thus consistency across sessions is essential. If operant conditioning is to be successful, the investigator(s) must adhere to the same procedures throughout the 30-36 sessions of the study. Here is a brief summary of the most critical aspects of the set-up and procedures.
The area of the skin where the electrodes are placed is cleaned with alcohol and a paper towel. Because the next session will occur within a few days, dry shaving of the skin is not recommended (it may create scabs on the skin and thus affect recording and stimulation in subsequent sessions). Electrode positions are mapped in relation to landmarks on the skin (e.g., scars or moles), in order to avoid session-to-session variability in placement (Thompson et al., 2009a; Thompson et al., 2013; Makihara et al., 2014).
After electrode placement, single pulses of electrical stimulation are applied to the nerve that innervates the target muscle, to test the quality of EMG signals and the effectiveness of nerve stimulation. During this testing procedure, the subject may or may not be asked to produce the same background EMG activity as for the actual trials. While rapidly increasing the stimulus intensity from below H-reflex threshold to the maximum M-wave (Mmax) level, the investigator should determine whether H-reflex and M-wave recruitment and Mmax amplitude are similar to those from previous sessions. If they appear different, the skin should be cleaned again and the electrode placement should be rechecked to ensure that it is correct.
Maximum voluntary contraction (MVC) may be measured as the absolute EMG amplitude during maximum isometric contraction of the target muscle, with or without concurrent measurement of joint force.
Prior to the control and conditioning trials, a full H-reflex and M-wave recruitment curve of the target muscle is obtained while the subject maintains a defined level of EMG activity and posture (e.g., natural standing, or sitting in a chair with specific ankle, knee, and hip angles). Stimulus intensity is increased from H-reflex threshold to an intensity just above that needed to elicit the maximum M-wave (Mmax) (Zehr and Stein, 1999; Kido et al., 2004). At least four EMG responses are averaged at each intensity. With the same background EMG level and postural constraints, recruitment curves with other modes of stimulation may also be obtained. For example, in the protocol for conditioning the MEP to TMS, an MEP recruitment curve should be obtained prior to the control and conditioning MEP trials. The investigator may elect to repeat the recruitment curve measurements at the end of the session.
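The averaging step of the recruitment curve can be sketched as follows. This Python example assumes that each evoked response has already been reduced to one H-reflex size and one M-wave size; the function names and the tuple layout are hypothetical, and taking Hmax and Mmax as the largest averaged values on the curve is a simplifying assumption.

```python
from collections import defaultdict
from statistics import mean


def recruitment_curve(responses, min_repeats=4):
    """Average H-reflex and M-wave sizes at each stimulus intensity.

    `responses` is an iterable of (intensity, h_size, m_size) tuples.
    Raises ValueError if any intensity has fewer than `min_repeats`
    responses (the text calls for at least four).
    """
    grouped = defaultdict(list)
    for intensity, h_size, m_size in responses:
        grouped[intensity].append((h_size, m_size))

    curve = []
    for intensity in sorted(grouped):
        samples = grouped[intensity]
        if len(samples) < min_repeats:
            raise ValueError(f"only {len(samples)} responses at intensity {intensity}")
        mean_h = mean([h for h, _ in samples])
        mean_m = mean([m for _, m in samples])
        curve.append((intensity, mean_h, mean_m))
    return curve


def hmax_and_mmax(curve):
    """Return the largest averaged H-reflex and M-wave on the curve."""
    return max(h for _, h, _ in curve), max(m for _, _, m in curve)
```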
In a control trial, when the subject has maintained EMG activity in the target muscle within the specified range for at least 2 s, a stimulus is delivered to elicit the muscle response (e.g., H-reflex or MEP). For H-reflex conditioning, the stimulus intensity should produce an M-wave just above threshold and an H-reflex below the maximum H-reflex (Hmax). For MEP conditioning, a TMS intensity 5-10% above MEP threshold (with active background EMG) is appropriate. The interstimulus interval is at least 5 s. No visual feedback on the size of the evoked EMG response (e.g., H-reflex or MEP) is provided.
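The timing rule for stimulus delivery (background EMG within range for at least 2 s, and at least 5 s since the previous stimulus) can be sketched as a simple gating loop. The Python example below is an illustration under stated assumptions: the EMG is taken to be an already rectified and smoothed amplitude sampled at a fixed interval `dt`, all names are hypothetical, and requiring a fresh 2-s hold after every stimulus is a design choice of this sketch, not something the text specifies.

```python
def stimulus_times(emg, dt, low, high, hold_s=2.0, min_isi_s=5.0):
    """Return the times (s) at which a stimulus would be delivered.

    emg        -- sequence of background EMG amplitudes, one per sample
    dt         -- sampling interval in seconds
    low, high  -- bounds of the required background EMG range
    """
    times = []
    in_range_since = None   # time at which the current in-range run began
    last_stimulus = None    # time of the most recent stimulus

    for i, amplitude in enumerate(emg):
        t = i * dt
        if not (low <= amplitude <= high):
            in_range_since = None   # leaving the range resets the hold
            continue
        if in_range_since is None:
            in_range_since = t

        held = t - in_range_since >= hold_s
        rested = last_stimulus is None or t - last_stimulus >= min_isi_s
        if held and rested:
            times.append(t)
            last_stimulus = t
            in_range_since = None   # assumed: a new hold is needed each trial
    return times
```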
The conditioning trials are identical to the control trials (i.e., same stimulus intensity, background EMG, and posture), except that the subject is asked to increase (up-conditioning) or decrease (down-conditioning) the response size and is provided with immediate visual feedback that indicates his or her success in doing so. During the conditioning trials, the investigator’s coaching skills (see below) become very important.
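The feedback rule of a conditioning trial amounts to comparing the evoked response with a criterion in the rewarded direction. A minimal Python sketch follows; the function names are hypothetical, and how the criterion value is chosen is not specified here, so it is passed in as a parameter.

```python
def trial_succeeded(response_size, criterion, direction):
    """Return True if the response satisfies the criterion.

    direction -- "up" for up-conditioning (response larger than criterion),
                 "down" for down-conditioning (response smaller than criterion)
    """
    if direction == "up":
        return response_size > criterion
    if direction == "down":
        return response_size < criterion
    raise ValueError("direction must be 'up' or 'down'")


def success_rate(response_sizes, criterion, direction):
    """Fraction of trials in a block that satisfied the criterion."""
    outcomes = [trial_succeeded(r, criterion, direction) for r in response_sizes]
    return sum(outcomes) / len(outcomes)
```

For example, `success_rate([1, 2, 3, 4], 2.5, "down")` is 0.5, because two of the four responses fall below the criterion.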
It should be noted that the procedures summarized here reflect the current state of development of H-reflex and other EMG-response operant conditioning protocols in humans. We expect that growing understanding of the mechanisms and process of conditioning, and further technical developments, will soon allow the methodology to be refined and simplified, and to thereby become suitable for widespread clinical use.
In rats with a right lateral column lesion that weakened right stance and produced an asymmetrical gait (Chen et al., 2006), up-conditioning of the right soleus H-reflex increased the motoneuron excitation produced by group Ia input from muscle spindles. Because this input contributes to the stance phase of locomotion (Bennett et al., 1996; Stein et al., 2000), H-reflex up-conditioning strengthened right stance and restored right/left step symmetry in these rats with partial SCI (Chen et al., 2006). In people with spasticity (i.e., associated with a hyperactive soleus H-reflex) due to chronic incomplete SCI, successful down-conditioning of the H-reflex during standing decreased the H-reflex during walking, increased walking speed (by 59%), and improved right/left step symmetry (Thompson et al., 2013). These first results in animals and people (see also Manella et al., 2014) with incomplete SCI suggest that operant conditioning of spinal reflexes can improve gait recovery after chronic incomplete SCI, and possibly in other disorders as well (e.g., Chen et al., 2010).
Current understanding of the spinal cord plasticity associated with H-reflex down-conditioning provides some insight into the mechanisms that underlie the locomotor improvement in people with SCI. In these individuals with spasticity due to SCI, motoneuron excitation from muscle spindle afferents is exaggerated (Knutsson et al., 1973; Mailis and Ashby, 1990), motoneuron and interneuron properties are altered (Hultborn, 2003; Gorassini et al., 2004; Hornby et al., 2006; Onushko and Schmit, 2007; Heckman et al., 2008), and inhibitory interneuron activity is abnormal (Ashby and Wiens, 1989; Boorman et al., 1996; Morita et al., 2001; Crone et al., 2003; Thompson et al., 2009b; Knikou and Mummidisetty, 2011). Animal studies indicate that H-reflex down-conditioning raises motoneuron firing threshold, slightly decreases the primary afferent EPSP, and markedly increases the number of identifiable GABAergic terminals on the motoneuron and the number of identifiable GABAergic interneurons in the ventral horn (reviewed in (Wolpaw, 2010)). By counteracting the abnormalities associated with SCI, these effects appear to underlie the locomotor improvement produced by H-reflex down-conditioning. At the same time, these changes in the conditioned pathway of the H-reflex cannot in themselves fully account for the widespread improvement in locomotion noted in people with SCI (Thompson et al., 2013).
As summarized in Figure 3 (Thompson and Wolpaw, 2014c), the spinal cord is a multi-user system in which the users are the many different behaviors (i.e., skills) in the individual’s repertoire. A given spinal pathway is likely to participate in multiple behaviors. For each of these behaviors, the excitability (i.e., gain) of the spinal pathway is adjusted appropriately. For instance, in the case of the soleus H-reflex pathway in Figure 3, reflex gain decreases from sitting to standing (Kawashima et al., 2003) and from standing to walking (Capaday and Stein, 1986; Stein and Capaday, 1988). In some people, the gain is further adjusted to accommodate specific athletic skills, such as kicking a ball, jumping, and ballet dancing (Nielsen et al., 1993). Such task-dependent adaptation of reflex pathways is important in ensuring satisfactory execution of each behavior.
Each task-dependent adaptation affects only its specific behavior; it does not affect other behaviors. For example, after a person acquires the new behavior of a larger or smaller H-reflex through operant conditioning, task-dependent increase or decrease affects only the H-reflexes elicited in the context of the conditioning protocol. However, when this task-dependent adaptation is imposed repeatedly over multiple sessions, it changes the spinal pathway (i.e., it induces long-term plasticity in the pathway); this lasting change affects all the behaviors that use the pathway (Zehr, 2006; Wolpaw, 2010; Thompson and Wolpaw, 2014b). For example, when H-reflex conditioning produces long-term plasticity, it changes the central element in Figure 3A, the baseline strength of the H-reflex pathway. It thereby affects previously acquired behaviors (e.g., locomotion), which must now use an altered H-reflex pathway, a pathway that is stronger or weaker than it was previously and may not respond in the same way to the descending and sensory inputs associated with these older behaviors. The functional consequences of this impact on other behaviors differ substantially between people with incomplete SCI and people who are neurologically normal. This difference is likely to account for the fact that long-term plasticity is substantially greater in people with SCI than in neurologically normal people (Figure 1, (Thompson et al., 2013)).
The data to date indicate that the probability of conditioning success and the magnitude of reflex change are comparable in people with or without incomplete SCI (Chen et al., 2005; Chen et al., 2006; Thompson et al., 2009a; Thompson et al., 2013). However, these two populations differ markedly in the proportions of task-dependent adaptation and long-term change in the final conditioned H-reflex (Thompson et al., 2013). Specifically, the greater long-term H-reflex change found in people with SCI is reflected in the difference in the locomotor effects of H-reflex conditioning between the groups with and without SCI. H-reflex down-conditioning markedly improved locomotion in individuals with SCI (Thompson et al., 2013), while it did not disturb normal locomotion in neurologically normal subjects (Makihara et al., 2014). In normal subjects, the long-term change in the H-reflex pathway produced by H-reflex conditioning may disturb other behaviors (e.g., locomotion) and may thereby trigger additional compensatory plasticity to preserve key features of those behaviors. Thus, in normal subjects, the conditioned H-reflex change would ideally consist largely of task-dependent adaptation, with little long-term plasticity to disturb other behaviors. In contrast, in people with SCI, the conditioned change in the H-reflex would ideally consist largely of long-term change, because that change restores more normal locomotion, one of the most important skills in their very limited repertoire (Figure 3B).
This difference can be best understood in terms of the negotiated equilibrium hypothesis (Wolpaw, 2010). According to this hypothesis, spinal neurons and pathways are continually maintained in a state of “negotiated equilibrium,” a balance that supports the satisfactory performance of all the behaviors in an individual’s repertoire (Nielsen et al., 1993; Ozmerdivenli et al., 2002; Zehr, 2006). In normal subjects, the spinal cord plasticity that supports a new behavior (e.g., a smaller H-reflex) necessitates the achievement of a new equilibrium that produces a smaller H-reflex and still supports other behaviors (e.g., locomotion) satisfactorily. This new negotiation causes concurrent changes in the networks underlying the many behaviors that use the pathway. For a behavior such as locomotion, which is already satisfactory, these concurrent changes may reduce the long-term plasticity that changes the H-reflex. The outcome is that, in normal subjects, a large part of the final change in the conditioned H-reflex is due to task-dependent adaptation, which does not disrupt other behaviors.
In contrast, for people with SCI, the spinal cord plasticity underlying the long-term H-reflex decrease improves locomotion. Similarly, in rats in which a SCI has caused step-cycle asymmetry (i.e., limping), appropriate soleus H-reflex conditioning restores symmetry (Chen et al., 2006). In these SCI rats, as in the people with SCI, the long-term change in the H-reflex was doubly adaptive: it increased the probability of reward in the conditioning protocol and, in addition, it improved locomotion. It led to a new spinal cord equilibrium better than the one that existed prior to H-reflex conditioning. Thus, it is likely that long-term H-reflex change was greater in people with SCI than in normal subjects because it did more than support the new behavior (i.e., a smaller H-reflex); it also improved locomotion. A recent study in rats with partial spinal cord injuries provides additional support for this analysis (Chen et al., 2014).
The locomotor improvement produced by H-reflex down-conditioning in people with SCI was surprising in its extent: the muscle activity improved in both legs, and people walked faster and more symmetrically (Thompson et al., 2013). It is unlikely that the plasticity responsible for the smaller soleus H-reflex in one leg could by itself have such widespread salutary impact (e.g., on the locomotor behavior of proximal and distal muscles in the other leg). The breadth of the effect implies that, in these people with SCI, H-reflex conditioning led to additional plasticity in other pathways involved in locomotion, and thereby improved the entire behavior. The acquisition of the new behavior, a smaller soleus H-reflex, triggered a new negotiation among the behaviors using the injured spinal cord. The targeted beneficial change in the soleus H-reflex pathway apparently enabled the new negotiation to result in widespread adaptive plasticity. The result was a new negotiated equilibrium that decreased the H-reflex and also improved locomotion.
In summary, the studies to date in animals and people with spinal cord injuries indicate that operant conditioning protocols that change specific CNS pathways provide a valuable new therapeutic approach that can complement other rehabilitation methods and enhance recovery of function. At the same time, as the studies proceed, it is becoming apparent that the long-term impact of spinal reflex conditioning depends to a considerable degree on whether the patients who complete conditioning and gain improvements in function take advantage of these improvements in their daily lives. Doing this may require changes in lifestyle. For example, a person who could walk only with a walker prior to conditioning and can walk with a cane afterward will retain this improvement only if he continues to walk with a cane. If he still uses the walker, or uses only a wheelchair, in his daily life, the benefits of conditioning are likely to disappear. If reacquired capacities are to be retained, and perhaps to grow further, they must be used in daily life.
This work was supported in part by the New York State Spinal Cord Injury Research Trust [C023685 to AKT]; the National Institutes of Health [NS069551 to AKT, NS22189 to JRW, and NS061823 to JRW and Xiang Yang Chen]; and the Helen Hayes Hospital Foundation [to AKT].