Goal-directed behavior, unlike habits, is adjusted immediately and appropriately to changes in the value of the expected outcome. This reflects the finding that such behavior is based on associations between the response and the outcome or goal of the action, so that organisms can continuously re-evaluate their goal objects and dynamically change their actions to produce adaptive behavior (Dickinson, 1985). A rewarding goal’s value can be diminished by selective satiety and by induction of taste aversion (Colwill and Rescorla, 1986; Yin and Knowlton, 2002). Such manipulations do not produce a significant change in habitual behaviors; habits persist even if the reward becomes less attractive or if the action is not necessary to earn the reward (Adams and Dickinson, 1981; Adams, 1982). Thus, once lever pressing for a reward becomes habitual in this sense, induced taste aversion or unlimited exposure to the reward prior to a probe test has very little effect on subsequent lever-pressing behavior.
Since the discovery that organisms will seek and reinitiate electrical stimulation of certain brain areas (Olds and Milner, 1954; Olds, 1962), brain stimulation reward (BSR) has become the paradigm of choice for studying the neural reward circuitry. Two reasons for this are that the electrical stimulation can be precisely manipulated and that its parameters have neurophysiological meaning. The current passed through the electrode tip depolarizes nearby neurons, thereby triggering action potentials. If the train and pulse durations are held constant, the number of action potentials elicited in the neurons close to the electrode tip is determined by the pulse frequency, whereas the stimulation current or pulse amplitude determines the radius of effective stimulation, and thus the number of cells excited by the electrode (Gallistel et al., 1981).
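As a rough illustration of the parameter relationships described above, the following Python sketch counts the pulses delivered in one train. The class name `StimulationTrain`, its field names, and the example values are hypothetical and do not come from the cited studies. The sketch also assumes, as an idealization, that each pulse triggers at most one action potential in a neuron within the effective radius.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StimulationTrain:
    """Hypothetical container for the parameters of one stimulation train."""
    pulse_frequency_hz: float   # pulses per second
    train_duration_s: float     # length of the train
    pulse_duration_ms: float    # width of each pulse
    current_ua: float           # pulse amplitude

    def pulse_count(self) -> int:
        # Pulses per train = pulse frequency x train duration.
        return round(self.pulse_frequency_hz * self.train_duration_s)


# Illustrative values only (assumed, not taken from the cited studies).
baseline = StimulationTrain(100.0, 0.5, 0.1, 200.0)
higher_frequency = replace(baseline, pulse_frequency_hz=200.0)
higher_current = replace(baseline, current_ua=400.0)

print(baseline.pulse_count())          # 50
print(higher_frequency.pulse_count())  # 100
print(higher_current.pulse_count())    # 50
```

Under this idealization, doubling the frequency doubles the number of pulses each recruited neuron receives, whereas doubling the current leaves the pulse count unchanged. The effect of current on the number of recruited neurons is not modeled here.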
The behavior elicited and controlled by the electrical stimulation, unlike the behavior controlled by natural rewards (McSweeney and Roll, 1993), is stable both between and within sessions. The electrical signal is delivered directly into the brain, bypassing sensory inputs and the physiological feedback mechanisms that discount natural rewards over the course of the experimental session. Moreover, it is delivered with minimal delay after the behavior that procures the reward has occurred; therefore, the response–reward delays that degrade natural rewards are avoided. The behavior controlled by the rewarding signal that arises from the delivery of electrical pulses is very sensitive to changes in the stimulation parameters, and therefore to changes in rewarding efficacy.
Although BSR has distinctive characteristics, the rewarding signal delivered by the electrode and that of natural rewards are evaluated and compared on a similar scale. The rewarding signal produced by the stimulation can compete with, summate with (Conover and Shizgal, 1994; Conover et al., 1994), and substitute for (Green and Rachlin, 1991) natural rewards. Drugs that are used to devalue natural rewards, such as lithium chloride (LiCl), also decrease the rewarding effect of electrical brain stimulation. Specifically, studies using the curve shift paradigm have reported that injecting LiCl at relatively high doses (100 or 200 mg/kg, i.p.) increases the self-stimulation threshold, meaning that stronger stimulation is required to produce a response similar to that observed under vehicle conditions (Tomasiewicz et al., 2006; Mavrikaki et al., 2009). Thus, the curve that relates operant performance to stimulation frequency shifts rightward, without a significant disruption of performance capacity (Miliaressis et al., 1986).
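The rightward shift can be illustrated with a minimal Python sketch. It assumes one common summary of the rate-frequency curve, the frequency that supports half-maximal responding (often labeled M50), estimated by linear interpolation on a log-frequency axis. This is one convention among several and is not necessarily the threshold criterion used in the cited studies. The function name and all data values are invented for illustration.

```python
import math


def m50_threshold(freqs_hz, rates):
    """Frequency (Hz) at which responding first reaches half of its maximum.

    Assumes freqs_hz is sorted in ascending order and interpolates
    linearly between neighboring points on a log10 frequency axis.
    """
    half = max(rates) / 2.0
    points = list(zip(freqs_hz, rates))
    for (f_lo, r_lo), (f_hi, r_hi) in zip(points, points[1:]):
        if r_lo < half <= r_hi:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_f = math.log10(f_lo) + frac * (math.log10(f_hi) - math.log10(f_lo))
            return 10 ** log_f
    raise ValueError("curve never crosses half-maximal responding")


# Hypothetical rate-frequency data (responses per minute); not real results.
freqs = [20, 40, 80, 160, 320]
vehicle = [0, 10, 50, 60, 60]
drug = [0, 0, 10, 50, 60]   # same maximum, displaced toward higher frequencies

m50_vehicle = m50_threshold(freqs, vehicle)
m50_drug = m50_threshold(freqs, drug)
shift_log_units = math.log10(m50_drug / m50_vehicle)

print(round(m50_vehicle, 1), round(m50_drug, 1), round(shift_log_units, 2))
```

In this made-up example the maximal response rate is identical in both conditions, so the change is a lateral displacement of the curve (a higher threshold) rather than a drop in performance capacity.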
A similar increase in reinforcement threshold is observed when the post-reinforcement pause method is used (Cassens and Mills, 1973). In this method the experimental subjects are trained under a concurrent fixed ratio (FR)–continuous reinforcement (CRF) schedule of reinforcement, in which the stimulation for the FR schedule is kept at maximal intensity whereas the stimulation for the CRF schedule is varied between zero and maximal. Increasing and decreasing the stimulus intensity on the CRF schedule leads to a switch in schedule control over the behavior and to a gradual disappearance and reappearance of post-reinforcement pauses (PRPs) on the concurrent FR schedule. These PRPs provide a criterion for the changeover in schedule control and constitute a measure of the reinforcement threshold (Buscher et al., 1990). The threshold obtained through this method, like the one obtained with the curve shift method, is then used as a baseline against which the effects of various experimental manipulations are expressed quantitatively in psychophysical units, thereby avoiding the confounding effects of drugs on response rate (Bozarth, 1987).
These studies suggest that LiCl produces a hypofunction of brain reward systems and immediate effects on reward. One of the goals of the present study was to further characterize the devaluation of BSR by providing evidence of long-lasting effects of LiCl when non-contingent reward delivery is paired with this drug, using a paradigm commonly applied to natural rewards (Holland and Rescorla, 1975; Adams and Dickinson, 1981; Schoenbaum and Setlow, 2005; Nelson and Killcross, 2006). An advantage of this approach is that BSR is delivered in a context different from the one in which the rats are trained or tested (rather than measuring performance under the effects of the drug), thereby minimizing associations between training context and reward that could counteract the effects of LiCl.
We also evaluated the effects of AM251, a cannabinoid CB1 receptor antagonist. Behavioral output during the pursuit of reward can be potently modulated by activation of CB1 receptors, which are ubiquitous in brain circuitry associated with reward (Solinas et al., 2008). For example, injection of a CB1 agonist can reinstate drug-seeking behavior (De Vries et al., 2001). Similarly, CB1 receptor agonists can potentiate the rewarding effect of drugs of abuse and natural rewards (Gallate et al., 1999; Valjent et al., 2002; Solinas et al., 2005), whereas antagonists have the opposite effect (Fattore et al., 2003, 2007; Cippitelli et al., 2005; Economidou et al., 2006). When the role of CB1 receptors is evaluated in the context of BSR, the results are contradictory. Some studies using CB1 receptor agonists show small or no decreases in self-stimulation threshold (Lepore et al., 1996; Arnold et al., 2001), whereas other experiments report pronounced decreases in self-stimulation thresholds (Vlachou et al., 2005, 2006). When CB1 receptor antagonists are used, similarly contradictory results are observed: some studies report no effects (Vlachou et al., 2005; Xi et al., 2008), whereas others show significant increases (Deroche-Gamonet et al., 2001; De Vry et al., 2004). The contrast between the robust effects of CB1 receptor manipulations on the reinforcing effects of natural rewards and drugs of abuse and the inconsistent effects obtained with BSR could be an indirect indication of which factors are affected by CB1 receptor activation. It is possible that these receptors elicit a change in reinforcement by affecting the organism’s motivational state and not the reward’s intrinsic value. Indeed, it has recently been reported that CB1 receptors produce their effects on BSR by altering factors other than reward sensitivity (Trujillo-Pisanty et al., 2011). Therefore, we hypothesized that pairing AM251 with non-contingent rewarding stimulation would not produce enduring effects on the valuation of reward.