Commun Integr Biol. 2010 Mar-Apr; 3(2): 95–100.
PMCID: PMC2889962

Reward expectations in honeybees

Abstract

The study of reward expectations helps us to understand the rules controlling goal-directed behavior as well as decision making and planning. I shall review a series of recent studies focusing on how the food gathering behavior of honeybees depends upon reward expectations. These studies document that free-flying honeybees develop long-term expectations of reward and use them to regulate their investment of energy/time during foraging. They also present a laboratory procedure suitable for analysis of neural substrates of reward expectations in the honeybee brain. I discuss these findings in the context of individual and collective foraging, on the one hand, and the neurobiology of learning and memory of reward, on the other.

Key words: honeybees, reward, expectations, foraging, proboscis extension response

Expectations of Reward

An expectation is said to be “a strong hope or belief that something that you want will happen”, or the action “to anticipate or look forward to the occurrence of an event.”1 Clearly, the notions of expectation and anticipation are linked to each other, and it is often taken for granted that the two words are interchangeable. However, although one needs to expect in order to anticipate, expecting per se does not imply anticipation. In psychology, the term ‘incentive’ designates internal correlates of specific rewards guiding the behavior of subjects pursuing such rewards.2 Incentive denotes what is referred to as a subject’s expectation of reward.2 Animals develop memories of specific properties of reward,2–4 alongside memories arising from the contingency between salient stimuli (as conditioned stimuli, or CSs) and reward (as an unconditioned stimulus, or US). Behaviorally, an expectation of reward is seen as an adjustment of a response that depends upon the formation and subsequent activation of memories about specific properties of reward, and the recollection of such memories is triggered by the cues and events predicting reward. In this scheme, expectations of reward are determined by past reward experience and can guide reward-induced behavior.2

The Study of Reward Expectations

Studying reward expectations helps us to understand the rules controlling goal-directed behavior as well as decision making and planning. Early studies showed that non-human primates5 and rodents2,6 learn to expect specific outcomes and that these expectations are linked to specific magnitudes or kinds of rewards. Monkeys trained in a simple choice task show “disappointment, hesitation, and searching behavior” when they find a non-preferred food item where a preferred food item used to be,5 and the running time of rats in a runway changes dramatically when they experience a sudden shift in reward magnitude.2,6 Since these initial findings, reward expectations have been addressed extensively in pigeons,7 rodents,8,9 non-human primates10 and humans.11 However, very little is known about reward expectations in invertebrate species. Here, I shall review a series of four recent studies focusing on how the food gathering behavior of honeybees depends upon expectations of reward. The first two studies concern the behavior of free-flying bees foraging under conditions closely mimicking a natural situation.12,13 Using an approach common in behavioral ecology, they help to assess the role of reward expectations in the ecology of foraging. The remaining two studies concern a laboratory procedure suitable for analysis of neural substrates of reward expectations in the honeybee brain.14,15 They make use of an approach common in the study of reward expectations, which relies heavily on experiments with restrained subjects under highly controlled conditions.
Using this approach, for example, studies in mammals have shown that interaction between the basolateral complex of the amygdala and the orbitofrontal cortex is necessary for the development and subsequent use of reward expectations involved in goal-directed behaviors.4,11,16,17 I shall discuss these studies in the context of individual and collective foraging, on the one hand, and the neurobiology of learning and memory of reward, on the other.

Reward Expectations in Foraging Honeybees

Honeybees live in large colonies whose primary source of energy is the nectar found within flowers. Nectar availability varies continuously, both in space and time, depending on species-specific flowering patterns, weather conditions, and the activity of other pollinators.18–23 In spite of such variability, honeybees gather energy efficiently using their learning and memory skills.24,25 They learn, for example, the location and time of day when flowers are productive as well as their odors, colors and shapes.26–30 But can they also learn that reward level increases or decreases over time? In a recent study,12 we trained bees to forage individually on an artificial flower patch offering increasing (small-medium-large), decreasing (large-medium-small) or constant (small, medium or large) reward levels (Fig. 1A). Next, after a long foraging pause, we recorded how persistently they searched for food at the patch in the absence of reward. We found that the bees that had previously experienced increasing reward levels searched for food more persistently than the bees that had experienced decreasing reward levels (Fig. 1B). This difference could not be explained by the bees’ most recent reward experience or by the total amount of reward that they had previously collected. This conclusion was drawn from the fact that the bees that had collected small, medium and large constant rewards searched for food with similar persistence during testing (even though their last reward experience and the total amount of reward that they had previously collected were clearly different, Fig. 1B). We also found that the difference in persistence between the increasing and decreasing groups was independent of classical and/or operant associations between the offered reward and its predicting signals (see Fig. 1 in Gil et al.12).
Taken together, the results of this study showed that bees can learn that the level of reward either increases or decreases over time and, subsequently, adjust their persistence during food searches accordingly. Can they also learn the magnitude of reward variations? We addressed this question in a second study13 in which we trained bees to forage individually on an artificial flower patch offering either a large (small-large) or a small (either small-medium or medium-large) increase in reward level (Fig. 1C). After a long foraging pause following training, we recorded how persistently they searched for food at the patch in the absence of reward. We found that the bees that had previously experienced a large increase in reward level searched for food more persistently than the bees that had experienced a small increase in reward level (Fig. 1D). As before, this difference could not be explained by the bees’ most recent reward experience or by the total amount of reward that they had previously collected. Honeybees, therefore, adjust their persistence to search for food in relation to both the sign and magnitude of past variations in reward level.12,13

Figure 1
(A) Reward schedules offered during training by the artificial flower patch in the first experiment.12 Each bee collected either increasing (I, small-medium-large), decreasing (D, large-medium-small), small (S), medium (M), or large (L) reward levels ...

The outcome of the above experiments is shown schematically in Figure 2. When honeybees forage on a flower patch offering variable reward levels, two parallel learning processes take place. On the one hand, bees learn the sign and magnitude of reward variations across successive foraging visits. They may do this using a built-in change detector that computes the difference in reward magnitude across foraging events. This computation leads to an estimate of an expected reward; we refer to such an estimate as a reward memory. On the other hand, bees associate the offered reward (as the US) with signals and cues present at the feeding site (as CSs) and an associative memory is formed. When a bee visits the feeding site after a long foraging pause, these memories are retrieved by reward-predicting stimuli. Associative memories are revealed through the bee’s choice behavior,12 whereas reward memories are revealed through the bee’s persistence in searching for food in the absence of reward.12,13 Hence, foraging honeybees adjust their investment of time/energy during food searches in relation to both the sign and magnitude of past variations in the level of reward. An increase in reward level leads to the formation of expectations of reward enhancing a forager’s reliance on a food source, and the strength of such reliance increases together with the magnitude of the past increase in reward level. This ability can make it more likely for foragers to compete successfully with other flower pollinators for limited resources as well as to maximize their individual rates of food collection by increasing their chances of finding food when forage is scarce. Because honeybees are social insects, one might ask how the colony as a whole benefits from a honeybee’s ability to develop expectations of reward. Such ability might enhance a colony’s selectivity among variable nectar sources.
It would be interesting to examine the within-hive individual behavior as well as the pattern of collective behavior of honeybees foraging on multiple feeders offering increasing, decreasing and constant reward levels. One might also compare the ability of different species of bees to develop and use reward memories, and study how such ability relates to the specificities of their particular environments.
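The change detector proposed above can be illustrated with a toy calculation. The sketch below is only an assumption-laden illustration, not a model taken from the reviewed studies: the numeric reward levels, the number of visits per schedule, the schedule names and the function `mean_change` are all hypothetical. It averages the differences between successive reward levels, so that the sign of the result reflects whether reward increased or decreased and its size reflects the magnitude of the change.

```python
from statistics import mean

# Hypothetical reward levels in arbitrary units; the studies only name
# them small, medium and large.
SMALL, MEDIUM, LARGE = 1.0, 2.0, 3.0


def mean_change(levels):
    """Average difference between successive reward levels.

    Returns 0.0 when fewer than two visits are available, because no
    change can be computed from a single visit.
    """
    if len(levels) < 2:
        return 0.0
    return mean(later - earlier for earlier, later in zip(levels, levels[1:]))


# Training schedules loosely modeled on the two free-flight experiments.
# The number of visits per schedule is an assumption.
schedules = {
    "increasing": [SMALL, MEDIUM, LARGE],
    "decreasing": [LARGE, MEDIUM, SMALL],
    "constant_small": [SMALL, SMALL, SMALL],
    "constant_large": [LARGE, LARGE, LARGE],
    "large_increase": [SMALL, LARGE],
    "small_increase_low": [SMALL, MEDIUM],
    "small_increase_high": [MEDIUM, LARGE],
}

change_signal = {name: mean_change(levels) for name, levels in schedules.items()}

for name, signal in change_signal.items():
    print(f"{name}: {signal:+.1f}")
```

In this toy version, constant schedules yield the same signal whatever their absolute reward level, which is at least consistent with the similar persistence reported for the small, medium and large constant groups. Whether bees compute anything resembling an average difference remains an open question.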

Figure 2
Schematic representation of the effects of constant (grey lines), decreasing (blue lines), and increasing sugar reward levels (red and orange lines for either a large or small increase in reward level, respectively) on a honeybee’s foraging behavior. ...

Optimal foraging theory attempts to predict foraging behavior in situations where sources of variable quality are heterogeneously distributed in space (reviewed in ref. 31). According to theory, foragers assess feeding site quality using an optimization rule that tends to maximize the rate of net energy intake.32 Thus, a forager’s investment of time/energy at a given patch is positively correlated with food availability.32–34 This rule does not capture a forager’s investment of time/energy in the absence of reward.31 When a honeybee searches for food at an empty source, searching is costly, and memories of past reward experiences at the site help the forager to determine how much time/energy ought to be invested in the ongoing task. The results presented above show that a honeybee’s persistence in searching for food on a negative energy budget relies on its already developed expectations of reward.12,13 Effort has been made to build models incorporating learning and memory phenomena into the context of foraging. One such model incorporates reward level variability into the forager’s evaluation of food patch quality.35 It predicts that the foraging behavior of animals that have previously experienced variable rewards at a given patch will depend upon their memories of either the most recent reward level or the average reward level experienced, depending on the time elapsed since the last encounter with reward.35 Our results do not match predictions from this model.12,13 Therefore, an alternative model is needed to explain how honeybees use reward memories during foraging.
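To make the mismatch concrete, the following sketch implements a deliberately simplified stand-in for a time-dependent averaging rule. It is not the published model's equations: the exponential weighting, the time constant `tau`, the numeric reward levels and the function `blended_estimate` are assumptions chosen only for illustration. The estimate equals the most recent reward level immediately after the last visit and drifts toward the average reward level as time elapses.

```python
import math


def blended_estimate(levels, elapsed, tau=1.0):
    """Blend the most recent and the average reward level.

    The weight on the most recent level decays exponentially with the
    time elapsed since the last reward (hypothetical weighting scheme).
    """
    weight_recent = math.exp(-elapsed / tau)
    average = sum(levels) / len(levels)
    return weight_recent * levels[-1] + (1.0 - weight_recent) * average


# Hypothetical reward levels in arbitrary units (small=1, medium=2, large=3).
increasing = [1.0, 2.0, 3.0]
decreasing = [3.0, 2.0, 1.0]

# Elapsed times are in units of tau; 50 stands in for a long foraging pause.
for pause in (0.0, 50.0):
    print(
        f"pause={pause}: increasing={blended_estimate(increasing, pause):.3f}, "
        f"decreasing={blended_estimate(decreasing, pause):.3f}"
    )
```

Under these assumptions, the two schedules become indistinguishable after a long pause because they share the same average, whereas the bees tested after a long foraging pause still searched more persistently following increasing rewards. The same rule would also separate the small, medium and large constant groups, which the behavioral data did not.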

Reward Expectations in Harnessed Bees

In addressing neural correlates of reward expectations in honeybees, one has to find a behavioral correlate of reward memories suitable for laboratory studies. Such a behavioral correlate can be the honeybee proboscis extension response (PER).36–38 This response allows bees to gather sugar solution and is triggered when the gustatory receptors of the antennae, proboscis and tarsi are stimulated with sucrose.37 In a recent study,14 we asked whether harnessed bees can learn the sign of reward variations so as to subsequently adjust their PERs. We used an experimental design similar to that of our initial experiment with free-flying bees.12 We first trained bees by coupling the stimulation of one antenna with either increasing, decreasing or constant reward levels offered to their proboscis throughout consecutive learning trials (Fig. 3A). We then recorded the bees’ proboscis extension (PE) reaction-time to sucrose stimulation of the antenna in the absence of reward. We found that the bees that had experienced increasing reward levels subsequently extended their probosces earlier than the bees that had experienced decreasing or constant reward levels (Fig. 3B). These results could not be accounted for by the bees’ most recent experience or the total amount of reward that they received during training. The bees that had experienced small, medium or large constant rewards showed similar reaction-times, although their last reward experience and the total amount of received reward were different (Fig. 3B). Therefore, one can conclude that harnessed bees can learn that reward level increases or decreases over time and adjust their PERs accordingly.14 However, further studies addressing neural correlates of reward memories in harnessed bees require within-animal controls. This is because recordings of neural activity are variable and, therefore, a reference from the same experimental subject is necessary for analysis of responses to any given stimulus.
In a new series of experiments, we aimed to incorporate within-animal controls into our laboratory procedure. To this end, we asked whether bees can learn side-specifically that reward level increases or decreases over time.15 Side-specific learning is well documented in honeybees.39–42 We developed a side-specific training in which bees were trained by coupling stimulation of one antenna with increasing reward levels and stimulation of the other antenna with decreasing reward levels throughout consecutive learning trials (Fig. 3C). Next, at different times following training, we recorded the bees’ PE reaction-time to sucrose stimulation of each antenna in the absence of reward. We found that the bees extended their probosces earlier after stimulation of the antenna that had been linked to increasing reward levels than after stimulation of the antenna that had been linked to decreasing reward levels (Fig. 3D). Therefore, bees can also learn side-specifically that reward increases or decreases over time. They develop both short- and long-term side-specific reward memories, and the long-term memories are extinguished by repetitive stimulation of their antennae (Fig. 3D). We also found that these side-specific adjustments of the PE response involve an interplay between gustatory and mechanosensory input, and correlate well with the activity of muscles responsible for controlling the movements of the proboscis.15 Taken together, these findings constitute a basis on which further analyses of reward memories can be built. Such analyses will include within-animal controls and physiological correlates of a robust behavioral measure.

Figure 3
(A) Reward schedules offered during training.14 Each bee was presented with either increasing (I, small-medium-large), decreasing (D, large-medium-small), small (S), medium (M) or large (L) reward levels throughout three consecutive training trials (first, ...

The events involved in this side-specific learning are schematically shown in Figure 4. Bees learn to associate gustatory and mechanical stimulation of each antenna with either increasing or decreasing rewards offered to their probosces throughout consecutive learning trials. They do this using a built-in change detector that computes differences in reward level (linked to each antenna) across feeding events. This computation leads to the formation of an internal estimate of an expected reward associated with each input side, and then, to the formation of side-specific reward memories. After training, gustatory and mechanosensory input can activate both short- and long-term side-specific reward memories. Activation of such memories leads to side differences in a honeybee’s PE reaction-time, also evinced by activity of the muscles (M17s) involved in movement of the proboscis. It would be interesting to address how the magnitude and frequency of reward variations relate to the adjustment of a honeybee’s PER as well as the relative involvement of mechanical and gustatory inputs in side-specific learning. In addition, the fact that the above procedure allows analysis of within-animal behavioral correlates of reward memories makes it suitable for pharmacological, electrophysiological and optophysiological study of neural substrates underlying these memories. These studies can be combined with neuro-anatomical studies identifying brain areas where projections of gustatory receptors from the antenna and proboscis, on the one hand, and mechanosensory receptors from the antenna, on the other, converge. 
Previous studies showed that gustatory receptors of the antenna project into the ipsilateral antennal lobe (AL), dorsal lobe (DL) and suboesophageal ganglion (SOG).43 The gustatory receptors of the proboscis project into the SOG and ascend to the DLs.44 The mechanoreceptors of the antenna project into the ipsilateral DL and SOG.43,45 Hence, the DL and the SOG appear to be the first-order neuropils for processing mechanosensory and gustatory input from the antennae and the proboscis. Electrophysiological and optophysiological experiments will address how neural activity in these neuropils relates to the adjustment of a honeybee’s PER occurring after activation of reward memories. Pharmacological approaches would also prove fruitful in this context. For example, it would be interesting to evaluate the role of octopamine (OA, a bioamine involved in associative learning, memory retrieval, and food arousal in honeybees)46–49 during formation and retrieval of reward memories.

Figure 4
Schematic representation of the events involved in side-specific learning. When a honeybee experiences gustatory and mechanical stimulation of each antenna coupled with either increasing or decreasing rewards (red and blue lines, respectively) throughout ...

Conclusions

The results of the above studies show that honeybees learn the sign and magnitude of reward variations and develop long-term reward expectations allowing adjustment of time/energy investment during foraging. The results also show that honeybees adjust their PE reaction-time in relation to the sign of reward variations; this form of learning involves the joint action of gustatory and mechanosensory input of the antennae, and can be side-specific. These studies constitute a basis on which three lines of investigation can be built. The first line concerns the role of reward expectations in honeybee foraging. In particular, further experiments are needed to address how the colony as a whole benefits from a honeybee’s ability to develop expectations of reward. The second line of investigation concerns the question of how reward expectations can be incorporated into theoretical accounts of individual and collective foraging. The third line concerns identifying neural substrates involved in the development of reward memories in honeybees. Progress in these three lines of investigation will bring together behavioral, theoretical and physiological data to better understand the role and underlying mechanisms of reward memories in honeybees.

Acknowledgements

I am indebted to R.J. De Marco for his permanent encouragement, fruitful discussions and valuable comments. I thank R. Menzel for his helpful comments and support. This work was supported by the Deutsche Forschungsgemeinschaft (DFG).

References

1. Merriam-Webster’s Collegiate Dictionary. Springfield MA, USA: Merriam-Webster, Inc; 1995.
2. Logan FA. Incentive. New Haven: Yale University Press; 1960.
3. Tolman EC. Principles of purposive behaviour. In: Koch S, editor. A study of a science. New York: McGraw-Hill; 1959. (Psychology Vol. 2).
4. Schultz W. Multiple reward signals in the brain. Nature Rev Neurosc. 2000;1:199–207. [PubMed]
5. Tinklepaugh OL. An experimental study of representative factors in monkeys. J Comp Psychol. 1928;8:197–236.
6. Crespi LP. Quantitative variation in incentive and performance in the white rat. Am J Psychol. 1942;40:467–517.
7. Peterson GB, Wheeler RL, Armstrong GD. Expectancies as mediators in the differential reward conditional discrimination performance of pigeons. Anim Learn Behav. 1978;6:279–285.
8. Trapold MA. Are expectancies based upon different positive reinforcing events discriminably different? Learn Motiv. 1970;1:129–140.
9. Holland PC, Straub JJ. Differential effects of two ways of devaluing the unconditioned stimulus after Pavlovian appetitive conditioning. J Exp Psychol. 1979;5:65–78. [PubMed]
10. Watanabe M, Cromwell H, Tremblay L, Hollerman JR, Hikosaka K, Schultz W. Behavioral reactions reflecting differential reward expectations in monkeys. Exp Brain Res. 2001;140:511–518. [PubMed]
11. O’Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosc. 2001;4:95–102. [PubMed]
12. Gil M, De Marco RJ, Menzel R. Learning reward expectations in honeybees. Learn Mem. 2007;14:491–496. [PubMed]
13. Gil M, De Marco RJ. Honeybees learn the sign and magnitude of reward variations. J Exp Biol. 2009;212:2830–2834. [PubMed]
14. Gil M, Menzel R, De Marco RJ. Does an Insect’s Unconditioned Response to Sucrose Reveal Expectations of Reward? PLoS ONE. 2008;3:2810. doi: 10.1371/journal.pone.0002810. [PMC free article] [PubMed]
15. Gil M, Menzel R, De Marco RJ. Side-specific reward memories in honeybees. Learn Mem. 2009;16:426–432. [PubMed]
16. Holland PC, Gallagher M. Amygdala-frontal interactions and reward expectancy. Curr Op Neurobiol. 2004;14:148–155. [PubMed]
17. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosc. 1999;19:6610–6614. [PubMed]
18. Núñez JA. Nectar flow by melliferous flora and gathering flow by Apis mellifera ligustica. J Insect Physiol. 1977;23:265–275.
19. Teuber LR, Barnes DK. Environmental and genetic influences on quantity and quality of alfalfa nectar. Crop Sci. 1979;19:874–878.
20. Vogel S. Ecophysiology of zoophilic pollination. In: Lange OL, Nobel PS, Osmond CB, Ziegler H, editors. Physiological plant ecology III. Berlin, Heidelberg, New York: Springer; 1983.
21. Baker HG, Baker I. A brief historical review of the chemistry of floral nectar. In: Bentley B, Elias T, editors. The Biology of Nectaries. New York: Columbia University Press; 1983.
22. Real LA, Rathcke BJ. Patterns of individual variability in floral resources. Ecol. 1988;69:728–735.
23. Rathcke B, Lacey EP. Phenological patterns of terrestrial plants. Annu Rev Ecol Sys. 1985;16:179–214.
24. Gould JL, Gould CG. The Honey Bee. New York: Scientific American Library; 1988.
25. Seeley TD. The Wisdom of the Hive. Cambridge, Mass: Harvard University Press; 1995.
26. Wahl O. Neue Untersuchungen über das Zeitgedächtnis der Bienen. Z vergl Physiol. 1932;16:529–589. (Ger).
27. Kleber E. Hat das Zeitgedächtnis der Bienen biologische Bedeutung? Z vergl Physiol. 1935;22:221–262. (Ger).
28. Kolterman R. Lern- und Vergessensprozesse bei der Honigbiene - aufgezeigt anhand von Duftdressuren. Z vergl Physiol. 1969;63:310–334. (Ger).
29. Frisch K. von. The Dance Language and Orientation of Bees. Harvard University Press; 1967.
30. Menzel R. Learning, Memory and ‘Cognition’ in Honey Bees. In: Kesner RP, Olten DS, editors. Neurobiology of Comparative Cognition. Hillsdale, NJ: Erlbaum; 1990.
31. Pyke GH. Optimal foraging theory: a critical review. Annu Rev Ecol Syst. 1984;15:523–575.
32. Charnov EL. Optimal foraging, the marginal value theorem. Theor Pop Biol. 1976;9:129–136. [PubMed]
33. Schmid-Hempel P, Kacelnik A, Houston AJ. Honeybees maximize efficiency by not filling their crop. Behav Ecol Sociobiol. 1985;17:61–66.
34. Varjú D, Núñez JA. What do foraging honeybees optimize? J Comp Physiol A. 1991;169:729–736.
35. Devenport LD, Devenport JA. Time-dependent averaging of foraging information in least chipmunks and golden-mantled ground squirrels. Anim Behav. 1994;47:787–802.
36. Takeda K. Classical conditioned response in the honey bee. J Insect Physiol. 1961;6:168–179.
37. Kuwabara M. Bildung des bedingten Reflexes von Pavlovs Typus bei der Honigbiene, Apis mellifica. J Fac Hokkaido Univ Ser VI Zol. 1957;13:458–464.
38. Bitterman ME, Menzel R, Fietz A, Schäfer S. Classical conditioning of proboscis extension in honeybees (Apis mellifera). J Comp Psychol. 1983;97:107–119. [PubMed]
39. Masuhr T, Menzel R. Learning Experiments on the Use of Side-Specific Information in the Olfactory and Visual System in the Honeybee (Apis mellifica). In: Wehner R, editor. Information Processing in the Visual Systems of Arthropods. Berlin-Heidelberg-New York: Springer; 1972.
40. Macmillan CS, Mercer AR. An investigation of the role of dopamine in the antennal lobes of the honeybee, Apis mellifera. J Comp Physiol. 1987;160:359–366.
41. Sandoz J-C, Hammer M, Menzel R. Side-specificity of olfactory learning in the honeybee: US input side. Learn Mem. 2002;9:337–348. [PubMed]
42. Giurfa M, Malun D. Associative Mechanosensory Conditioning of the Proboscis Extension Reflex in Honeybees. Learn Mem. 2004;11:294–302. [PubMed]
43. Suzuki H. Antennal movements induced by odour and central projection of the antennal neurons in. 21:831–847.
44. Haupt SS. Das Gustatorische System und antennales Lernen in der Honigbiene (Apis mellifera L.). PhD Thesis. Germany: Technische Universität Berlin; 2005.
45. Maronde U. Common projection areas of antennal and visual pathways in the honeybee brain, Apis mellifera. J Comp Neurol. 1991;309:328–340. [PubMed]
46. Hammer M. An identified neuron mediates the unconditioned stimulus in associative olfactory learning in honeybees. Nature. 1993;366:59–63. [PubMed]
47. Hammer M, Menzel R. Multiple sites of associative odor learning as revealed by local brain microinjections of octopamine in honeybees. Learn Mem. 1998;5:146–156. [PubMed]
48. Mercer AR, Menzel R. The effects of biogenic amines on conditioned and unconditioned responses to olfactory stimuli in the honeybee, Apis mellifera. J Comp Physiol. 1982;145:363–368.
49. Braun G, Bicker G. Habituation of an appetitive reflex in the honeybee. J Neurophysiol. 1992;67:588–598. [PubMed]

Articles from Communicative & Integrative Biology are provided here courtesy of Taylor & Francis