What is particularly worth remembering about a traumatic experience is what brought it about, and what made it cease. For example, fruit flies avoid an odor which during training had preceded electric shock punishment; on the other hand, if the odor had followed shock during training, it is later on approached as a signal for the relieving end of shock. We provide a neurogenetic analysis of such relief learning. Blocking, using UAS-shibirets1, the output from a particular set of dopaminergic neurons defined by the TH-Gal4 driver partially impaired punishment learning, but left relief learning intact. Thus, with respect to these particular neurons, relief learning differs from punishment learning. Targeting another set of dopaminergic/serotonergic neurons defined by the DDC-Gal4 driver on the other hand affected neither punishment nor relief learning. As for the octopaminergic system, the tbhM18 mutation, compromising octopamine biosynthesis, partially impaired sugar-reward learning, but not relief learning. Thus, with respect to this particular mutation, relief learning, and reward learning are dissociated. Finally, blocking output from the set of octopaminergic/tyraminergic neurons defined by the TDC2-Gal4 driver affected neither reward, nor relief learning. We conclude that regarding the used genetic tools, relief learning is neurogenetically dissociated from both punishment and reward learning. This may be a message relevant also for analyses of relief learning in other experimental systems including man.
Having no idea as to what will happen next is not only bewildering, but can also be dangerous. This is why animals learn about the predictors for upcoming events. For example, a stimulus that had preceded a traumatic event can be learned as a predictor for this event and is later on avoided. Such predictive learning qualitatively depends on the relative timing of events: a stimulus that occurred once a traumatic event had subsided later on supports opposite behavioral tendencies, such as approach, as it signals what may be called relief (Solomon and Corbit, 1974; Wagner, 1981) or safety (Sutton and Barto, 1990; Chang et al., 2003). Such opposing memories about the beginning and end of traumatic experiences are common to distant phyla (e.g., dog: Moskovitch and LoLordo, 1968, rabbit: Plotkin and Oakley, 1975, rat: Maier et al., 1976, snail: Britton and Farley, 1999, adult fruit fly: Tanimoto et al., 2004; Yarali et al., 2008, 2009; Murakami et al., 2010, larval fruit fly: Khurana et al., 2009), including man (Andreatta et al., 2010). This timing-dependency may reflect a universal adaptation to what one may call the “causal texture” of the world, such that whatever precedes X is likely to be the cause of X, and whatever follows X may be responsible for X's disappearance (Dickinson, 2001). Correspondingly, pleasant experiences, too, support opposing kinds of memory for stimuli that respectively precede and follow them (e.g., pigeon: Hearst, 1988; honeybee: Hellstern et al., 1998). Thus, to fully appreciate the behavioral consequences of affective experiences, it is necessary to study the mnemonic effects of their beginning and their end.
To do so, the fruit fly offers a fortunate possibility for fine grained behavioral analyses, combined with a small, experimentally accessible brain. Once trained with odor-electric shock pairings, fruit flies avoid this odor as a signal for punishment (Tully and Quinn, 1985); training with a reversed timing of events, that is first shock and then the odor, on the other hand, results in approach toward this odor as a predictor for relief (in adults: Tanimoto et al., 2004; Yarali et al., 2008, 2009; Murakami et al., 2010; in larvae: Khurana et al., 2009). Presenting an odor together with a sugar reward establishes conditioned approach, too (Tempel et al., 1983).
Punishment and reward learning are well-studied, including how the respective kinds of reinforcement are signaled. Shock activates a set of fruit fly dopaminergic neurons (Riemensperger et al., 2005), defined by the TH-Gal4 driver; blocking the output from these neurons impairs punishment learning, but not reward learning (in adults: Schwaerzel et al., 2003; Aso et al., 2010; in larvae: Honjo and Furukubo-Tokunaga, 2009; Selcho et al., 2009; regarding the former larval study, Gerber and Stocker (2007) filed caveats which may challenge the associative nature of the used paradigm). Also, loss of function of the dopamine receptor DAMB selectively impairs punishment rather than reward learning in fruit fly larvae (Selcho et al., 2009). Accordingly, in the cricket and the honey bee as well, punishment rather than reward learning is impaired by dopamine receptor antagonists (Unoki et al., 2005, 2006; Vergoz et al., 2007). Finally, activating a set of dopaminergic neurons, defined by the TH-Gal4 driver in adult (Claridge-Chang et al., 2009; Aso et al., 2010) and reportedly also in larval (Schroll et al., 2006) fruit flies substitutes for punishment during training. Altogether, these results point to dopamine as covered by the applied genetic tools, to be necessary and sufficient to signal punishment.
As for reward signaling, this reinforcing role seems to be fulfilled by octopamine. In the honeybee, activity of a sugar responsive octopaminergic neuron “VUMmx1,” innervating the olfactory pathway, is sufficient to substitute for the rewarding, but not the reflex-releasing, effects of sugar during training (Hammer, 1993), as does injecting octopamine at various sites along the olfactory pathway (Hammer and Menzel, 1998). In turn, interfering with the honey bee or cricket octopamine receptors impairs reward learning, but leaves punishment learning intact (Farooqui et al., 2003; Unoki et al., 2005, 2006; Vergoz et al., 2007). Accordingly, in the fruit fly, compromising octopamine biosynthesis via the tbhM18 mutation impairs reward learning, but not punishment learning (Schwaerzel et al., 2003; Sitaraman et al., 2010). Finally, in larval fruit flies, the output from a particular set of octopaminergic/tyraminergic neurons, defined by the TDC2-Gal4 driver seems to be required selectively for reward learning (see Honjo and Furukubo-Tokunaga, 2009, but see above); in turn, activating these neurons reportedly substitutes for the reward during training (Schroll et al., 2006).
These findings together suggest a double dissociation between the roles of dopamine and octopamine in signaling punishment and reward, respectively. This double dissociation however may need qualification, as the function of the fruit fly dopamine receptor dDA1 turns out to be required for both kinds of learning (in adults: Kim et al., 2007; in larvae: Selcho et al., 2009). The picture becomes more complicated with the additional role of dopaminergic neurons in signaling the state of hunger, which is a determinant for the behavioral expression of the sugar-reward memory in adult fruit flies (Krashes et al., 2009; in other insects, too, octopamine and dopamine affect the behavioral expression of memory, Farooqui et al., 2003; Mizunami et al., 2009; also in crabs: Kaczer and Maldonado, 2009). Finally, in a fruit fly operant place learning paradigm, where high temperature acts as punishment and preferred temperature as potential reward, neither dopamine nor octopamine signaling seems to be critical (Sitaraman et al., 2008, 2010). Thus, the scope of what octopamine and dopamine do for punishment and reward learning, memory, and retrieval remains open, including (except for the seminal case of the VUMmx1 neuron in the bee, Hammer, 1993, and a recent study on dopaminergic signaling in the fly, Aso et al., 2010) the assignment of these putative roles to specific amine-releasing and receiving neurons and the receptors involved, as well as the utility of the genetic tools available. Here, we ask for the neurogenetic bases of relief learning, comparing the underpinnings of relief learning to punishment and reward learning.
Drosophila melanogaster were reared in mass culture at 25°C and 60–70% relative humidity, under a 14:10 h light:dark cycle.
We used shibirets1 for temperature-controlled, reversible blockage of synaptic output (Kitamoto, 2001). shibirets1 expression was directed to different sets of neurons by crossing the males of the respective Gal4 strains (Table 1) to females of a UAS-shibirets1 strain (Kitamoto, 2001; first and third chromosomes); thus the offspring were heterozygous for both the Gal4-driver and UAS-shibirets1. We refer to these flies with the name of the Gal4-driver together with “shits1” (e.g., “TH/shits1”). To obtain proper genetic controls, we crossed each of the UAS-shibirets1 or the Gal4-driver strains to white1118 flies, thus obtaining flies heterozygous either for the Gal4-driver or for UAS-shibirets1. We refer to these as, e.g., “TH/+” and “shits1/+,” respectively.
To approximate the patterns of Gal4 expression, we used the respective drivers (Table 1) to express the UAS-controlled transgene mCD8GFP, which encodes a green fluorescent protein (GFP) that inserts into cellular membranes. To do this, we crossed males from each driver strain to females of a UAS-mCD8GFP strain (Lee and Luo, 1999; second chromosome) and stained the brains of the progeny against the Synapsin protein to visualize the neuropils and against GFP to approximate the pattern of Gal4 expression. Note however that the pattern of GFP-immunoreactivity does not necessarily reflect which neurons would be targeted had another effector, e.g., shibirets1, been expressed using the same Gal4 driver (Ito et al., 2003). First, UAS-mCD8GFP and UAS-shibirets1 may support different levels and patterns of background expression without any Gal4; this background expression then adds to the driven expression when the Gal4 is present. Second, the level of mCD8GFP expression sufficient for immunohistochemical detection may well be different from the level of shibirets1 expression sufficient to block neuronal output; thus potentially, not all neurons that are visualized by immunohistochemistry may be affected by shibirets1, or vice versa.
To test for an effect of an octopamine biosynthesis deficiency, we used the mutant strain tbhM18 (Monastirioti et al., 1996; also see Schwaerzel et al., 2003; Saraswati et al., 2004; Scholz, 2005; Brembs et al., 2007; Certel et al., 2007; Hardie et al., 2007; Sitaraman et al., 2010). These flies have reduced or no octopamine (Monastirioti et al., 1996), due to the deficiency of the tyramine β-hydroxylase enzyme, which catalyzes the last step of octopamine biosynthesis (Figure 2). Since the original tbhM18 strain (Monastirioti et al., 1996) contains an additional mutation in the white gene, we instead used a recombinant strain with a wild-type white+ allele, which was generated by Schwaerzel et al. (2003). As genetic control, we used a non-recombinant strain with wild-type tbh+ and white+ alleles, which was generated in parallel; we refer to this strain simply as “Control.”
Brains were dissected in saline and fixed for 2h in 4% formaldehyde with PBST as solvent (phosphate-buffered saline containing 0.3% Triton X-100). After a 1.5h incubation in blocking solution (3% normal goat serum [Jackson Immuno Research Laboratories Inc., West Grove, PA, USA] in PBST), brains were incubated overnight with the monoclonal anti-Synapsin mouse antibody SYNORF1, diluted 1:20 in PBST (Klagges et al., 1996) and polyclonal anti-GFP rabbit antibody, diluted 1:2000 in PBST (Invitrogen Molecular Probes, Eugene, OR, USA). These primary antibodies were detected after an overnight incubation with Cy3 goat anti-mouse Ig, diluted 1:250 in PBST (Jackson Immuno Research Laboratories Inc., West Grove, PA, USA) and Alexa488 goat anti-rabbit Ig, diluted 1:1000 in PBST (Invitrogen Molecular Probes, Eugene, OR, USA). All incubation steps were followed by multiple PBST washes. Incubations with antibodies were done at 4°C; all other steps were performed at room temperature. Finally, brains were mounted in Vectashield mounting medium (Vector Laboratories Inc., Burlingame, CA, USA) and examined under a confocal microscope (Leica SP1, Leica, Wetzlar, Germany).
Flies were collected from fresh food vials and kept for 1–4 days at 18°C and 60–70% relative humidity before experiments. For reward learning as well as for the punishment learning experiments shown in Figures 6B,B′, flies were instead starved overnight for 18–20 h at 25°C and 60–70% relative humidity in vials equipped with a moist tissue paper and a moist filter paper. Those experiments that did not use shibirets1 were performed at 22–25°C and 75–85% relative humidity. For inducing the effect of shibirets1, flies were first exposed to 34–36°C and 60–70% relative humidity for 30 min; then the experiment took place under these same conditions, which are referred to as “@ high temperature.” The condition referred to as “@ low temperature” in turn involved exposing the flies to 20–23°C and 75–85% relative humidity for 30 min; then the experiment followed also under these conditions.
The experimental setup was in principle as described by Tully and Quinn (1985) and Schwaerzel et al. (2003). Flies were trained and tested in groups of 100–150. Training took place under dim red light, which does not allow flies to see; tests were performed in complete darkness.
As odorants, 90 μl benzaldehyde (BA), 340 μl 3-octanol (OCT), 340 μl 4-methylcyclohexanol (MCH), 340 μl n-amyl acetate (AM) and 340 μl isoamyl acetate (IAA) (CAS 100-52-7, 589-98-0, 589-91-3, 628-63-7, 123-92-2; all from Fluka, Steinheim, Germany) were applied in 1 cm deep Teflon containers of 5, 14, 14, 14, and 14 mm diameter, respectively. For the experiments in Figures 6A,B,C, MCH and OCT were diluted 100-fold in paraffin oil (Merck, Darmstadt, Germany, CAS 8012-95-1), whereas for Figures 6A′,B′, AM and IAA were diluted 36-fold. All other experiments used undiluted BA and OCT.
For punishment learning (Figure 1A), flies received six training trials. Each trial started by loading the flies into the experimental setup (0:00 min). From 4:00 min on, the control odor was presented for 15 s. Then, from 7:15 min on, the to-be-learned odor was presented also for 15 s. From 7:30 min on, electric shock was applied as four pulses of 100 V; each pulse was 1.2 s long and was followed by the next with an onset-to-onset interval of 5 s. Thus the to-be-learned odor preceded shock with an onset-to-onset interval of 15 s. The control odor on the other hand preceded the shock by an onset-to-onset interval of 210 s, which does not result in a measurable association between the two (Tanimoto et al., 2004; Yarali et al., 2008, loc. cit. Figures 1D and 2F, Yarali et al., 2009, loc. cit. Figure 1B). For relief learning (Figure 1B), keeping all other parameters unchanged, we reversed the relative timing of events: that is, the to-be-learned odor was presented from 8:10 min on, thus following shock with an onset-to-onset interval of 40 s. At 12:00 min, flies were transferred out of the setup into food vials, where they stayed for 16 min until the next trial. At the end of the sixth training trial, after the usual 16 min break, flies were loaded back into the setup. After a 5 min accommodation period, they were transferred to the choice point of a T-maze, where they could escape toward either the control odor or the learned odor. After 2 min, the arms of the maze were closed and flies on each side were counted. A preference index (PREF) was calculated as:
$$\mathrm{PREF} = \frac{(\#_{\text{Learned odor}} - \#_{\text{Control odor}}) \times 100}{\#_{\text{Total}}} \tag{1}$$
# indicates the number of flies found in the respective maze-arm. Two groups of flies were trained and tested in parallel (Figure 1D). For one of these, e.g., 3-octanol (OCT) was the control odor and BA was to be learned; the second group was trained reciprocally. PREFs from the two reciprocal measurements were then averaged to obtain a final learning index (LI):
$$\mathrm{LI} = \frac{\mathrm{PREF}_{\text{BA}} + \mathrm{PREF}_{\text{OCT}}}{2} \tag{2}$$
Subscripts of PREF indicate the learned odor in the respective training. Positive LIs indicate conditioned approach to the learned odor; negative values reflect conditioned avoidance.
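The two indices can be illustrated with a minimal Python sketch. It assumes that PREF is the difference between the fly counts in the learned-odor and control-odor arms, scaled to percent of the total, and that the total is simply the sum of the two arms; both the scaling and the definition of the total are assumptions of this sketch. The function names and the fly counts are hypothetical and are not data from this study.

```python
def preference_index(n_learned: int, n_control: int) -> float:
    """PREF: difference between the two maze arms, in percent of all counted flies.

    Assumption: the total is the sum of the flies counted in the two arms.
    """
    total = n_learned + n_control
    if total == 0:
        raise ValueError("no flies counted in either arm")
    return (n_learned - n_control) * 100 / total


def learning_index(pref_first: float, pref_second: float) -> float:
    """LI: mean of the PREFs from the two reciprocally trained groups."""
    return (pref_first + pref_second) / 2


# Hypothetical counts for illustration only (not data from this study).
pref_ba = preference_index(n_learned=40, n_control=60)   # group with BA as learned odor
pref_oct = preference_index(n_learned=45, n_control=55)  # group with OCT as learned odor
li = learning_index(pref_ba, pref_oct)

# Negative LI: the learned odor was avoided (conditioned avoidance).
print(pref_ba, pref_oct, li)
```

Averaging the two reciprocally trained groups cancels any odor preference that is independent of training, so that the LI reflects only the associative component of the choice.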
Reward learning (Figure 1C) used two training trials. Each trial started by loading the flies into the setup (0:00 min). One minute later, flies were transferred to a tube lined with a filter paper which was soaked the previous day with 2 ml of 2 M sucrose solution, and then was left to dry overnight. This tube was scented with the to-be-learned odor. After 45 s, the to-be-learned odor was removed, and after 15 additional seconds flies were taken out of the tube. At the end of a 1 min waiting period, they were transferred into another tube lined with a filter paper which was soaked with pure water and then dried. This second tube was scented with the control odor. After 45 s, control odor was removed and 15 s later, flies were taken out of this second tube. The next trial started immediately. This transfer between the two kinds of tube during training should prevent the learning of an association between the control odor and the sugar. For half of the cases, training trials started with the to-be-learned odor and sugar; in the other half, control odor was given precedence. Once the training was completed, after a 3 min waiting period, flies were transferred to the choice point of a T-maze between the control odor and the learned odor. After 2 min, the arms of the maze were closed, flies on each side were counted and a preference index (PREF) was calculated according to Eq. 1. As detailed above (also see Figure 1D), two groups were trained reciprocally and the LI was calculated based on their PREF values according to Eq. 2.
Finally, a modified punishment training procedure (not shown in Figure 1) imitated the reward learning as in Figure 1C, but sugar presentation was replaced by 12 pulses of 100 V electric shock, each lasting 1.2 s and separated by an onset-to-onset interval of 5 s.
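The trial timings given for punishment and relief training can be checked with a few lines of arithmetic. The sketch below takes the clock times and pulse parameters stated in the text; the sign convention (negative when the odor starts before the shock) and all names are choices of this sketch, not of the original procedure.

```python
def to_seconds(clock: str) -> int:
    """Convert an 'm:ss' trial time into seconds from the start of the trial."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


SHOCK_ONSET = to_seconds("7:30")   # first shock pulse
N_PULSES = 4                       # pulses per trial
PULSE_DURATION_S = 1.2             # length of each pulse
PULSE_ONSET_INTERVAL_S = 5         # onset-to-onset interval between pulses


def onset_to_onset(odor_onset: str) -> int:
    """Odor onset relative to shock onset; negative means odor comes first."""
    return to_seconds(odor_onset) - SHOCK_ONSET


control_isi = onset_to_onset("4:00")      # control odor
punishment_isi = onset_to_onset("7:15")   # to-be-learned odor, punishment training
relief_isi = onset_to_onset("8:10")       # to-be-learned odor, relief training

# End of the last shock pulse, and the gap until the relief odor starts.
last_pulse_end = (SHOCK_ONSET
                  + (N_PULSES - 1) * PULSE_ONSET_INTERVAL_S
                  + PULSE_DURATION_S)
relief_gap = to_seconds("8:10") - last_pulse_end

print(control_isi, punishment_isi, relief_isi, round(relief_gap, 1))
```

With these parameters the script prints `-210 -15 40 23.8`: the stated onset-to-onset intervals of 210, 15, and 40 s are reproduced, and in relief training the odor begins roughly 24 s after the last shock pulse has ended.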
All data were analyzed using non-parametric statistics and are reported as box plots, showing the median as the midline, the 25th and 75th percentiles as box boundaries, and the 10th and 90th percentiles as whiskers. For comparing scores of individual groups to 0, we used one-sample sign tests. Mann–Whitney U-tests and Kruskal–Wallis tests were used for pair-wise and global between-group comparisons, respectively. When multiple tests of one kind were performed within a single experiment, we kept the experiment-wide error-rate at 5% by Bonferroni correction, that is, we divided the critical P of 0.05 by the number of tests. One-sample sign tests were done using a web-based tool (http://www.fon.hum.uva.nl/Service/Statistics/Sign_Test.html). All other statistical analyses were performed with the software Statistica (Statsoft, Tulsa, OK, USA). Sample sizes are reported in the figure legends.
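How a one-sample sign test combines with the Bonferroni correction can be sketched in plain Python. This is an illustration only, not the web-based tool or the Statistica routines used for the analyses. It assumes a two-sided exact binomial test in which scores equal to the reference value are dropped; the function names and the learning scores are hypothetical.

```python
from math import comb


def sign_test_p(scores, reference=0.0):
    """Two-sided one-sample sign test of `scores` against `reference`.

    Assumption: ties with the reference are discarded, and the P-value is
    twice the smaller binomial tail (capped at 1).
    """
    above = sum(1 for s in scores if s > reference)
    below = sum(1 for s in scores if s < reference)
    n = above + below
    if n == 0:
        return 1.0
    k = min(above, below)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni_threshold(alpha, n_tests):
    """Critical P-value after Bonferroni correction for `n_tests` tests."""
    return alpha / n_tests


# Hypothetical learning indices of one group (not data from this study).
scores = [-12.0, -8.5, -20.1, -3.2, -15.0, 4.1, -9.9, -11.3, -6.7, -18.4]

p = sign_test_p(scores)
threshold = bonferroni_threshold(0.05, n_tests=3)  # e.g., three genotypes
print(p, threshold, p < threshold)
```

In this made-up example nine of ten scores are negative, which gives P of about 0.021. That is below 0.05 but above the corrected threshold of 0.05/3, so the group would not count as significantly different from zero once three such tests are performed within one experiment.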
First, we compared relief learning to punishment learning in terms of the roles of dopaminergic neurons. We confirmed that blocking the output from a particular set of dopaminergic neurons, using the temperature-sensitive UAS-shibirets1 in combination with the TH-Gal4 driver (Friggi-Grelin et al., 2003; Table 1; Figures 2 and 3A), impairs punishment learning: when trained and tested at high temperature, TH/shits1 flies showed less negative learning scores than the genetic controls (Figure 4A @ high temperature: Kruskal–Wallis test: H=11.44, d.f.=2, P<0.05). This impairment in punishment learning, however, was obviously partial in the TH/shits1 flies (Figure 4A @ high temperature: one-sample sign tests: P<0.05/3 for each genotype), as was the case in previous studies (Schwaerzel et al., 2003; Aso et al., 2010). This residual learning ability may be due to incomplete coverage of dopaminergic neurons by the TH-Gal4 driver (Friggi-Grelin et al., 2003; Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009; see the Discussion for details) and/or to an incomplete block of neuronal output by shibirets1. At low temperature, as shibirets1 was benign, TH/shits1 flies performed comparably to the genetic controls in punishment learning (Figure 4A @ low temperature: Kruskal–Wallis test: H=2.06, d.f.=2, P=0.36).
Importantly, blocking output from TH-Gal4 neurons, a treatment which did impair punishment learning, left relief learning intact: with training and test at high temperature, we found relief learning scores of TH/shits1 flies to be indistinguishable from the genetic controls (Figure 4B @ high temperature: Kruskal–Wallis test: H=0.10, d.f.=2, P=0.96). Accordingly pooling the data, we found conditioned approach (Figure 4B @ high temperature: one-sample sign test for the pooled data set: P<0.05). One might argue that the generally low relief learning scores may not allow detecting a possible partial impairment due to neurogenetic intervention. This however does not apply to Figure 4B, as relief learning in the TH/shits1 flies does not even tend to be inferior to the genetic controls (similarly, see Figures 5B, 6C, and 7B). We note that punishment and relief learning procedures differ only with respect to the timing of the to-be-learned odor during training; otherwise they entail the same handling and stimulus–exposure. Therefore, intact relief learning in the TH/shits1 flies (Figure 4B) excludes sensory and/or motor problems as potential cause for the impairment in punishment learning (Figure 4A, left).
Next, we used an independent driver, DDC-Gal4 (Li et al., 2000; Table 1; Figures 2 and 3B), to express UAS-shibirets1 in a set of dopaminergic/serotonergic neurons. Blocking the output from these neurons left punishment learning unaffected: when trained and tested at high temperature, DDC/shits1 flies showed learning scores comparable to the genetic controls (Figure 5A @ high temperature: Kruskal–Wallis test: H=2.14, d.f.=2, P=0.34). Thus pooling the scores across genotypes, we observed conditioned avoidance (Figure 5A @ high temperature: one-sample sign test for the pooled data set: P<0.05). This lack of effect on punishment learning may be caused by (i) the DDC-Gal4 driver not covering all dopaminergic neurons; (ii) incomplete overlap with those dopaminergic neurons targeted by the TH-Gal4 (Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009; see the Discussion for details); (iii) incomplete block of synaptic output by shibirets1; (iv) a dominant-negative effect of DDC-Gal4, which is non-additive with the effect of shibirets1 expression in these neurons (see below).
In any case, we probed for an effect of blocking output from the DDC-Gal4 neurons on relief learning and found none: after training and test at high temperature, learning scores were not different between genotypes (Figure 5B @ high temperature: Kruskal–Wallis test: H=1.24, d.f.=2, P=0.54). We thus pooled the data and found weak yet significant conditioned approach (Figure 5B @ high temperature: one-sample sign test for the pooled data set: P<0.05). We note that the DDC/+ flies tended to show less pronounced punishment and relief learning when compared to the TH/+ flies (compare Figure 4 versus Figure 5) as well as when compared to the shits1/+ flies (Figure 5). In the case of punishment learning, as we used a Kruskal–Wallis test across all three experimental groups, this effect of the DDC-Gal4 driver construct may have obscured an actual effect of blocking the output from DDC-Gal4-targeted neurons (compare shits1/+ to DDC/shits1 in Figure 5A). For relief learning, however, no corresponding trend is noted (compare shits1/+ to DDC/shits1 in Figure 5B). In any case, with respect to the role of the neurons defined by DDC-Gal4, our results do not offer an argument to dissociate punishment from relief learning.
To summarize, concerning the neurons defined by TH-Gal4, we found a clear dissociation between punishment and relief learning (Figure 4), while for the DDC-Gal4 neurons the situation remains inconclusive (Figure 5). We would like to stress that this does not at all exclude a role for the dopaminergic system in relief learning, given that first, in neither experiment did we cover all dopaminergic neurons at once, and second, as a general concern, blockage of neuronal output by shibirets1 may well be incomplete (see the Discussion for details).
Next, we compared relief learning to reward learning in terms of the role of octopamine. We first confirmed that compromising octopamine biosynthesis via the tbhM18 mutation in the key enzyme tyramine β-hydroxylase (Monastirioti et al., 1996; Figure 2) impairs reward learning: after odor-sugar training, using the odors 3-octanol (OCT) and 4-methylcyclohexanol (MCH), the tbhM18 mutant showed significantly less conditioned approach than the genetic Control (Figure 6A: U-test: U=544.00, P<0.05). Residual reward learning ability was however detectable in the tbhM18 mutant (Figure 6A: one-sample sign tests: P<0.05/2 for each genotype). This contrasts with the report of Sitaraman et al. (2010), who had shown a complete loss of reward learning using the same odors; the discrepancy may be due to the different genetic backgrounds used in the two studies (i.e., the present study uses the strains from Schwaerzel et al., 2003, whereas Sitaraman et al., 2010 uses those from Certel et al., 2007). Schwaerzel et al. (2003) found no reward learning ability in the tbhM18 mutant, using the odors ethyl acetate and isoamyl acetate (IAA); indeed, using n-amyl acetate (AM) and IAA as odors, we also found a complete loss of reward learning in the tbhM18 mutant (Figure 6A′: U-test: U=33.00, P<0.05; one-sample sign tests: P<0.05/2 for Control, and P=0.58 for the tbhM18 mutant). Surprisingly however, when the odors OCT and benzaldehyde (BA) were used, tbhM18 mutant flies showed fully intact reward learning (Figure 6A′′: U-test: U=204.50, P=0.27; one-sample sign test for the pooled data set: P<0.05). This lack of effect in Figure 6A′′ should not be due to the relatively low learning indices of the Control flies, since in Figure 6A, we could detect even a partial effect of the tbhM18 mutation despite such low Control scores.
Note that using the present two-odor reciprocal training design (Figure 1D), the contribution of each odor to the LI, and hence the question whether the tbhM18 mutation affects learning about any one given odor but not the other, remains unresolved. We can however conclude that the reward learning impairment of the tbhM18 mutant can be partial, complete, or absent, depending on the combination of odors used and likely also on the genetic background; this suggests residual octopaminergic function and/or an octopamine-independent compensatory mechanism (see the Discussion for details).
To test for an effect of the tbhM18 mutation on punishment learning, we used a modified training, which entailed the same pre-starvation, handling, and stimulus–exposure as reward learning, except that the sugar presentation was replaced by shock pulses. In such modified punishment learning, the tbhM18 mutant performed comparably to the genetic Control, using either the odors OCT and MCH (Figure 6B: U-test: U=47.00, P=0.15; one-sample sign test for the pooled data set: P<0.05) or AM and IAA (Figure 6B′: U-test: U=38.00, P=0.82; one-sample sign test for the pooled data set: P<0.05). Thus, confirming Schwaerzel et al. (2003), we can conclude that reward and punishment learning are dissociated in terms of the effect of the tbhM18 mutation. In addition, normal performance of the tbhM18 mutant in this modified punishment learning makes deficiencies in odor perception or motor control unlikely as causes for the reward learning impairment (Figures 6A,A′).
In order to test for an effect of the tbhM18 mutation on relief learning, we used the odors OCT and MCH, because the odors AM and IAA do not support relief learning (Yarali et al., 2008, loc. cit. Figure 5D). Under conditions for which the tbhM18 mutant did show a reward learning impairment, however partial (i.e., using the odors OCT and MCH), relief learning ability remained unaffected: learning scores were statistically indistinguishable between genotypes (Figure 6C: U-test: U=168.00, P=0.40), with no apparent trend for lower scores in the tbhM18 mutant. We thus pooled the data and found weak yet significant conditioned approach (Figure 6C: one-sample sign test for the pooled data set: P<0.05).
As an additional, independent approach to the octopaminergic system, we blocked the output from a set of octopaminergic/tyraminergic neurons, using UAS-shibirets1, in combination with the TDC2-Gal4 driver (Cole et al., 2005; Table 1; Figures 2 and 3C). We first tested for an effect on reward learning: when trained and tested at high temperature, TDC2/shits1 flies performed comparably to the genetic controls (Figure 7A @ high temperature: Kruskal–Wallis test: H=3.03, d.f.=2, P=0.22). Accordingly pooling the learning scores across genotypes, we found conditioned approach (Figure 7A @ high temperature: one-sample sign test for the pooled data set: P<0.05). This lack of effect on reward learning may be because the TDC2-Gal4 driver does not target all octopaminergic neurons (Busch et al., 2009; see the Discussion for details) and/or because the output from the targeted neurons is not completely blocked by shibirets1.
Nevertheless, we probed for an effect on relief learning and found none: after training and test at high temperature, learning scores were statistically indistinguishable between genotypes (Figure 7B @ high temperature: Kruskal–Wallis test: H=2.43, d.f.=2, P=0.30). Accordingly pooling the data, we found conditioned approach (Figure 7B @ high temperature: one-sample sign test for the pooled data set: P<0.05). To summarize, while reward and relief learning are apparently dissociated when considering the tbhM18 mutant, we can draw no distinction between these two kinds of learning in terms of the role of the neurons covered by the TDC2-Gal4 driver. Again, this does not rule out a role for the octopaminergic system in relief learning, as these conclusions refer only to the specific genetic manipulations used.
We compared relief learning to both punishment learning and reward learning, focusing on the involvement of aminergic modulation by dopamine and octopamine.
As previously reported (Schwaerzel et al., 2003; Aso et al., 2010), directing the expression of UAS-shibirets1 to a particular set of dopaminergic neurons defined by the TH-Gal4 driver partially impaired punishment learning (Figure 4A). Relief learning however was left intact (Figure 4B). Expressing UAS-shibirets1 with another driver, DDC-Gal4, on the other hand affected neither punishment nor relief learning (Figure 5).
All dopaminergic neuron clusters in the fly brain are targeted by the TH-Gal4 driver; some clusters, however, are covered only partially, e.g., 80–90% of the anterior medial “PAM cluster” neurons are left out (Friggi-Grelin et al., 2003; Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009). In contrast, the DDC-Gal4 driver, along with serotonergic neurons, likely targets most of the PAM cluster dopaminergic neurons, while possibly leaving out dopaminergic neurons in other clusters (Sitaraman et al., 2008; Figure 3B). In a mixed classical-operant olfactory punishment learning task, Claridge-Chang et al. (2009) found no impairment upon blocking the activity of most PAM cluster neurons with an inwardly rectifying K+ channel (UAS-kir2.1), driven by HL9-Gal4. Although relying on both a different Gal4 driver and a different effector, this result is in agreement with the intact punishment learning we found when expressing UAS-shibirets1 with the DDC-Gal4 driver (Figure 5A). Thus, as far as short-term punishment learning is concerned, there is so far no evidence for a role for the PAM cluster neurons (for middle-term punishment learning, see Aso et al., 2010). Nevertheless, targeting the remaining dopaminergic neuron clusters by the TH-Gal4 driver only partially impairs punishment learning (Schwaerzel et al., 2003; Aso et al., 2010; Figure 4A). Conceivably, the TH-Gal4 driver may leave out a few dopaminergic neurons in clusters other than PAM; these may then carry a punishment signal, redundant to that carried by the TH-Gal4-targeted neurons. This scenario would readily accommodate Schroll et al.’s (2006) report that activity of the TH-Gal4-targeted neurons in larval fruit flies substitutes for punishment. The intact relief learning upon expressing UAS-shibirets1 with TH-Gal4 can also be explained by this scenario.
Alternatively, the level of shibirets1 expression driven by TH-Gal4 may fall short of effectively blocking the neuronal output required for relief learning, and/or an additional, shibirets1-resistant neurotransmission mechanism may be employed in relief learning. Further, if punishment were signaled by a shock-induced increase in the activity of the TH-Gal4 neurons and relief were signaled by a decrease in their activity below the baseline at shock offset, incomplete blockage of output from these neurons could partially impair punishment learning, while leaving relief learning intact. In the face of these caveats, we find it too early to exclude any role of dopamine or of the TH-Gal4 neurons. What then is a safe minimal conclusion? Given that punishment learning is partially impaired (Figure 4A), while relief learning does not even tend to be impaired (Figure 4B), these two kinds of learning do differ in terms of whether and which role the TH-Gal4-covered neurons play. This does dissociate punishment and relief learning in terms of their underlying mechanisms.
Turning to the octopaminergic system, we confirmed the finding of Schwaerzel et al. (2003) that the tbhM18 mutant with compromised octopamine biosynthesis is impaired in reward learning (Figures 6A,A′), but not in punishment learning (Figures 6B,B′). The effect on reward learning was however conditional on the kinds of odor used (Figures 6A,A′,A′′). Under the conditions that significantly impaired reward learning, we found relief learning intact (Figure 6C). Although the tbhM18 mutant we used revealed no octopamine content in immunohistochemical and high pressure liquid chromatography (HPLC) analyses (Monastirioti et al., 1996), it may retain an amount of octopamine below the detection thresholds of these methods but sufficient to signal reward and/or relief. Furthermore, HPLC analysis reveals a ~10-fold increase in the amount of the octopamine precursor tyramine in this mutant (Monastirioti et al., 1996); this excessive tyramine may compensate for the lack of octopamine (Uzzan and Dudai, 1982).
As an additional approach, we blocked the output from a set of octopaminergic/tyraminergic neurons, expressing UAS-shibirets1 with the TDC2-Gal4 driver; this impaired neither reward nor relief learning (Figure 7). The TDC2-Gal4 driver targets, along with tyraminergic neurons, octopaminergic neurons in three paired and one unpaired neuron clusters (Busch et al., 2009). Among these, the unpaired “VM cluster” harbors octopaminergic neurons innervating on the one hand the subesophageal ganglion (SOG), and on the other hand the antennal lobes, mushroom bodies, and the lateral horn (Busch et al., 2009); such connectivity would enable signaling gustatory reward onto the olfactory pathway. Indeed, in the honey bee, activation of a single octopaminergic neuron, VUMmx1, with such an innervation pattern, is sufficient to carry the reward signal for olfactory learning (Hammer, 1993). Surprisingly however, although all octopaminergic neurons in the VM cluster are targeted by the TDC2-Gal4 driver (Busch et al., 2009), using this driver with UAS-shibirets1, we found reward learning intact (Figure 7A). This may be because the level of UAS-shibirets1 expression falls short of completely blocking the neuronal output. Alternatively, given that activation of the TDC2-Gal4-targeted neurons in fruit fly larvae reportedly substitutes for reward (Schroll et al., 2006), the VM cluster neurons may indeed carry a reward signal, but other octopaminergic neurons outside this cluster, left out by the TDC2-Gal4 driver (Busch et al., 2009), may redundantly do so. Either kind of argument could also explain the lack of effect on relief learning (Figure 7B). Thus, although we find no evidence for a role for the octopaminergic system in relief learning, we refrain from excluding such a role. Still, given that the tbhM18 mutation affects reward learning, but not relief learning, these two forms of learning are to some extent dissociated in their genetic requirements.
Clearly, the question of whether dopaminergic and octopaminergic systems are involved in relief learning remains open. Follow-up studies should extend our neurogenetic approach to further tools. For example, dopamine biosynthesis can be specifically compromised in the fly nervous system using a tyrosine hydroxylase mutant in combination with a hypoderm-specific rescue construct (Hirsh et al., 2010). Also, for two different dopamine receptors, DAMB and dDA-1, loss of function mutations are available (Kim et al., 2007; Selcho et al., 2009). Notably, by means of the dDA-1 receptor loss of function mutant, the role of the dopaminergic system in reward learning was revealed (Kim et al., 2007; Selcho et al., 2009), which had been overlooked with the tools used in the present study. In addition, a pharmacological approach would be useful. Antagonists for the vertebrate D1 and D2 receptors have been successfully used in the fruit fly (Yellman et al., 1997; Seugnet et al., 2008) and other insects (Unoki et al., 2005, 2006; Vergoz et al., 2007) (regarding the octopamine receptors: Unoki et al., 2005, 2006; Vergoz et al., 2007). Such a pharmacological approach could be extended to other aminergic, as well as peptidergic, systems and could also test for the effects of human psychotherapeuticals. The results of such studies may then guide subsequent analyses at the cellular level.
To summarize, while this study has shed no light on how relief learning works, it did show that relief learning works in a way neurogenetically different from both punishment learning and reward learning, likely at the level of the roles of aminergic neurons. Interestingly, punishment and reward learning are also dissociated at this level. However, all three kinds of learning also share genetic commonalities, for example with respect to the role of the synapsin gene, likely critical for neuronal plasticity (Godenschwege et al., 2004; Michels et al., 2005; Knapek et al., 2010; T. Niewalda, Universität Würzburg, personal communication). Thus, punishment-, relief-, and reward-learning may conceivably rely on common molecular mechanisms of memory trace formation, which however are triggered by experimentally dissociable reinforcement signals, and/or operate in distinct neuronal circuits. This may be a message relevant also for analyses of relief learning in other experimental systems, including rodent (Rogan et al., 2005), monkey (Tobler et al., 2003; Belova et al., 2007; Matsumoto and Hikosaka, 2009), and man (Seymour et al., 2005; Andreatta et al., 2010).
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The continuous support of the members of the Würzburg group, especially of M. Heisenberg, K. Oechsener and H. Kaderschabek, is gratefully acknowledged. Special thanks to T. Niewalda, Y. Aso and M. Appel for comments on the manuscript. The authors are grateful to E. Münch for the generous support to Ayse Yarali during the startup phase of her PhD. This work was supported by the Deutsche Forschungsgemeinschaft (DFG) via CRC-TR 58 Fear, Anxiety, Anxiety Disorders, and a Heisenberg Fellowship (to Bertram Gerber); Ayse Yarali was supported by the Boehringer Ingelheim Fonds. Dedicated to our respective daughters.