The development of habitual behaviors requires repetition. It is interesting to note that the illicit drugs with the highest risk of addiction are those that have shorter half-lives (
O'Brien, 2001). For example, although cocaine and amphetamine-like psychostimulants are all highly addictive, the risk of addiction for cocaine is substantially higher than that for other psychostimulants (
O'Brien, 2001). Whereas psychostimulants have similar mechanisms of action in that they all increase extracellular levels of biogenic amines in the brain (
Parsons et al., 1996;
Ritz et al., 1988;
Schmidt and Pierce, 2006), the half-life of cocaine is an hour or less compared to ten hours or more for amphetamine and its derivatives (including methamphetamine and MDMA) (
O'Brien, 2001). The shorter half-life of cocaine, and therefore its shorter duration of action on aminergic neurotransmission, likely contributes to the binge consumption of this drug. Indeed, animal studies indicate that the pattern of cocaine taking is closely related to extracellular dopamine levels in the nucleus accumbens. That is, falling dopamine levels after cocaine is self-administered appear to trigger subsequent cocaine taking (
Wise et al., 1995). Moreover, the rapid delivery of cocaine and other drugs of abuse more readily promotes forms of neuronal and behavioral plasticity associated with compulsive drug seeking/taking (
Samaha et al., 2004;
Samaha and Robinson, 2005). Consistent with the idea that repetitive drug use ingrains habitual aspects of drug seeking/taking, thereby complicating efforts to remain abstinent, heavy tobacco smokers routinely administer nicotine hundreds of times per day, and the risk of addiction for cigarette smoking is by far the highest of any drug of abuse (approximately double the risk of addiction for cocaine and alcohol) (
O'Brien, 2001).
One common way of testing the extent to which responding for a positive reinforcer occurs according to a stimulus-response (i.e. habitual) or action-outcome (i.e. goal-directed) associative structure is by evaluating instrumental responding for a devalued reinforcer (
Adams, 1982;
Dickinson, 1985;
Killcross and Coutureau, 2003). Ingestive reinforcers, for example, can be devalued by pre-feeding the subject with the food (or liquid) reinforcer, by degrading the taste of the reinforcer using bitter substances like quinine, or by associating the food with illness (induced by post-ingestion injection with lithium chloride, for example). If responding for the reinforcer subsequently decreases, responding is interpreted as being mediated by an action-outcome process, because behavior is driven by a representation of the value of the outcome, which is reduced by devaluing the reinforcer. On the other hand, if after devaluation the subject shows no difference in responding for valued and devalued reinforcers, responding is interpreted as being under the control of a stimulus-response process, independent of outcome value. Research in this area has shown that drug seeking can indeed become dependent on a habitual stimulus-response associative structure, and that this may occur more quickly for drugs than for food. Thus, it has been shown that lever pressing for food or sucrose was markedly suppressed after pairing with lithium chloride, whereas responding for cocaine and ethanol was impervious to this kind of devaluation (
Dickinson et al., 2002;
Miles et al., 2003). In these studies, both the drug (ethanol, cocaine) and natural (food, sucrose) reinforcers were ingestive rewards (i.e. ethanol, cocaine and sucrose solutions, food pellets, with clear and distinguishable taste characteristics). It is therefore unlikely that differences in stimulus associability (contiguity) explain the differential sensitivity to lithium devaluation between ethanol and cocaine on the one hand, and food pellets and sucrose on the other. Remarkably, both during taste aversion conditioning with lithium (which took place in four daily sessions) and during re-acquisition of responding for drug after the extinction test, responding for and intake of the drug solution associated with lithium-induced malaise was markedly decreased (
Dickinson et al., 2002;
Miles et al., 2003). This demonstrates that intake of the reinforcer itself can remain sensitive to devaluation even when responding for a drug reinforcer in extinction, using just the internal representation of the reinforcer to guide behavior, is impervious to devaluation. Thus, after relatively modest operant training under a random-interval schedule, responding for drugs relies on a stimulus-response associative structure of behavior, whereas intake of the drug is still a goal-directed action (
Dickinson et al., 2002;
Miles et al., 2003). It is possible that the pharmacological effects of the drug itself cause drug-directed behavior to change from a habitual to a goal-directed structure. This is not to say that drug intake always remains goal-directed. Insensitivity to devaluation of intake of a drug reinforcer has been observed in studies investigating the oral intake of ethanol and amphetamine, but this took prolonged drug experience to develop (
Galli and Wolffgramm, 2004;
Wolffgramm, 1991). In these experiments, rats became insensitive to the aversive properties of quinine since they did not reduce their intake of the drug solution after it had been rendered bitter. In early phases of the experiments, intake of the drug solution was reduced by quinine. In fact, the insensitivity to devaluation required no less than nine months of drug experience followed by an abstinence period to develop (
Galli and Wolffgramm, 2004;
Wolffgramm, 1991). On the other hand, as described above, insensitivity to devaluation of operant responding for drug developed after only a relatively moderate amount of operant training (
Dickinson et al., 2002;
Miles et al., 2003). It should be borne in mind that there are methodological differences between the studies by
Dickinson et al. (2002) and
Miles et al. (2003) on the one hand, and
Galli and Wolffgramm (2004) and
Wolffgramm (1991) on the other, such as differences in the devaluation procedure (lithium vs. quinine) and output parameters (lever pressing vs. drug intake). Nevertheless, these studies, taken together, suggest that operant responding for drugs (an act relatively distal from the actual subjective drug effects) more readily gains habitual properties than intake of the drug itself. This points to the possibility that the development of inflexible, habitual drug use occurs in several distinct phases, where distal drug cues or acts gain habitual properties before intake of the drug itself does, perhaps representing a gradual worsening of the addiction syndrome with increasing drug experience. These results also indicate that drug seeking progresses more readily from a goal-directed action-outcome to a habitual stimulus-response associative structure than does the seeking of natural rewards.
Consistent with this notion, it has been shown that although instrumental behaviors directed at obtaining cocaine are initially goal-directed, after lengthy drug exposure aspects of these behaviors lose their flexibility and gain automatic, habitual characteristics (
Deroche-Gamonet et al., 2004;
Vanderschuren and Everitt, 2004). In order to evaluate continued drug seeking in the face of adverse consequences, which is one of the criteria of drug addiction as defined in the DSM-IV-TR (
American Psychiatric Association, 2000), cocaine seeking in rats with a history of limited or extended cocaine self-administration was assessed in the presence of a footshock-associated conditioned stimulus (CS). Among animals with limited cocaine self-administration experience, the aversive CS markedly suppressed cocaine seeking. In contrast, the footshock-associated CS had no effect on cocaine seeking in rats with an extended cocaine self-administration history (
Vanderschuren and Everitt, 2004). This progression from flexible to inflexible appetitive behavior was cocaine-specific since the footshock CS markedly suppressed sucrose seeking following a similarly extended period of sucrose ingestion (
Vanderschuren and Everitt, 2004). Likewise, punishment of cocaine seeking with footshock markedly suppressed instrumental behavior in animals with limited drug experience, but a subgroup of animals subsequently displayed insensitivity to punishment, whereas animals responding for sucrose remained highly sensitive to footshock throughout lengthy training (
Pelloux et al., 2007). Consonant with these results, pairing a footshock with delivery of cocaine suppressed self-administration to a much lesser extent after lengthy cocaine self-administration experience in rats that displayed high levels of cocaine-induced reinstatement (
Deroche-Gamonet et al., 2004). Notably, only rats with prolonged cocaine self-administration experience were insensitive to devaluation of cocaine (
Deroche-Gamonet et al., 2004;
Vanderschuren and Everitt, 2004). This insensitivity to devaluation was not a consequence of exaggerated motivation for the drug.
Vanderschuren and Everitt (2004) found no difference in rate of responding for cocaine (reflecting the motivation for the drug) between animals that displayed sensitivity (i.e. limited cocaine self-administration experience) and insensitivity (i.e. prolonged cocaine self-administration experience) to presentation of a footshock-associated cue during cocaine seeking. In the study by
Deroche-Gamonet et al. (2004, see also
Belin et al., 2009), the animals that were willing to endure footshock delivered together with self-administered cocaine had previously shown increased motivation for the drug. However, the increase in motivation for cocaine preceded the willingness to endure footshock by some 40 self-administration sessions, indicating that increased motivation for the drug occurs independently of willingness to endure footshocks. Thus, enhanced motivation for the drug may be necessary for habitual cocaine self-administration to develop, but it is not sufficient to explain its presence. Taken together, these results indicate that prolonged periods of drug taking result in the development of inflexible and habitual behaviors, in that drug seeking and taking become insensitive to a variety of manipulations aimed at devaluing the drug reinforcer. These aspects of drug use may depend on nuclei involved in stimulus-response habit learning such as the dorsolateral striatum. Below, we review evidence that dorsolateral regions of the striatum become involved in drug seeking.