Humans and animals show cooperative behaviour, but our understanding of cooperation among unrelated laboratory animals is limited. A classic test of cooperation is the iterated Prisoner’s Dilemma (IPD) game, where two players receive varying payoffs for cooperation or defection in repeated trials. To determine whether unrelated rats cooperate in the IPD, we tested pairs of rats making operant responses to earn food reward in 25 trials/day. The operant chamber was bisected by a metal screen with a retractable lever and pellet dispenser on each side. When levers extended, rats had 2 s to respond. Mutual cooperation (Reward) delivered three pellets each, mutual defection (Punishment) provided no pellets, and unilateral defection (Temptation) gave five pellets to the defector, while the partner (Sucker) received none. In eight pairs of males (RM–) and females (RF–), cooperation was defined by withholding a response. In seven pairs of RM+ males, cooperation was defined by responding on the lever. In males, food restriction significantly inhibited both cooperation and pellets received. There was no effect of dominance status. Males and females made similar numbers of responses under ad libitum feeding. However, neither food restriction nor dominance status affected responses in females. Rats were subsequently tested for reciprocity in 24 alternating trials/day. A response on the lever within 5 s delivered three pellets to the partner. Females made significantly more responses for their cage-mate than males. Responses within pairs were significantly correlated for males, but not for females. For both sexes, responses declined significantly when paired with an unfamiliar partner who never reciprocated (‘bad stooge’). These results demonstrate that rats working for food show cooperation in IPD and direct reciprocity. Their responses depend on food availability and responses of their partner.
There is a long history of efforts to understand cooperative behaviour, because cooperation is an important dimension of social interactions in humans and animals. Cooperation can be understood from an economic perspective, where benefits to participants are measured in terms of resources gained or evolutionary fitness (see Schuster & Perelberg, 2004). Cooperation can also involve cognitive and emotional elements, including responses to risk and reward (Rilling, 2011). To explore neural mechanisms underlying cooperative behaviour in laboratory animals, new experimental models must be developed. Studies in laboratory animals have already elaborated brain circuits and signals that shape decision making under conditions of uncertainty, punishment and delay (Floresco, St Onge, Ghods-Sharifi, Winstanley, 2008). Although laboratory animal tests of decision making do not typically incorporate social interactions, social decision making is an important component of cooperative behaviour. In particular, individual participants can increase their benefit or reduce their risks by ‘gaming’ the system. Thus, game theory has been used to model interactions among participants (humans, animals, organizations, governments) in potential cooperative interactions (Axelrod, 2006). The present study tested cooperation in pairs of unrelated rats in an operant model of the iterated Prisoner's Dilemma (IPD) game and in a test of direct reciprocity.
Kin selection and reciprocal altruism have been proposed to explain how cooperation develops (see Ale, Brown, Sullivan, 2013). Kinship can promote cooperation when the benefit to the recipient increases the evolutionary fitness of the donor (Hamilton, 1964). Reciprocal altruism can promote cooperation when long-term benefits accrue to partners interacting repeatedly (Trivers, 1971). Field studies describe the flexible interplay of multiple partners working for rewards and punishments among social animals living in complex environments. However, the sheer complexity of such interactions makes it difficult to resolve the relative roles of kin selection and reciprocal altruism in understanding cooperative behaviour (Raihani & Bshary, 2011). Laboratory investigations of cooperation often simplify the interactions to pairs of conspecifics (Axelrod, 2006). Pairwise games include the Prisoner’s Dilemma, Hawk–Dove, and stag hunt. In their classic form, each of these games is both symmetric and simultaneous, so that neither player has knowledge of the actions of their partner. The games may be played in a single round, or may be repeated in multiple rounds with the same partners, as in the IPD (Raihani & Bshary, 2011). While cooperation has been extensively studied in human laboratory tests (Melis & Sammann, 2010), there currently exist only a handful of laboratory studies of IPD in animals, and these differ in terms of animal species and experimental design. Additional studies will help to refine methods to test IPD in laboratory animals and provide insight into the limits of cooperative behaviour in animals. IPD tests reciprocal altruism, where a cooperative response by each participant benefits the recipient, while reducing the immediate benefit to the donor (Trivers, 1971). Like IPD, direct reciprocity is a dyadic interaction, representing the repeated reciprocal exchange of equivalent benefits between two parties.
When delivering a benefit to their partner, each participant experiences a temporary net cost, which is exceeded by the benefit they subsequently receive from a partner working on their behalf (Nowak, 2006). Direct reciprocity is distinguished from generalized reciprocity, where one party offers benefits without expectation of return, and from pseudoreciprocity, where actions initiated by one party produce self-interested behaviour by the other party that conveys benefits to the initiator (Connor, 2010).
Testing participants in repeated trials, as with IPD, allows for development and expression of cooperative responses. Cooperation in a symmetric and simultaneous game such as IPD is limited by the cognitive abilities of the participants. The specific cognitive requirements for cooperation are not yet established. At a minimum, they include individual recognition and communication (Lopuch & Popik, 2011), as well as elements of cognitive flexibility (Floresco, 2013). Rats are social animals that possess these basic capabilities (Schuster & Perelberg, 2004). Unlike other recent animal models of IPD (Stevens & Stephens, 2004; St-Pierre, LaRose, Dubois, 2009; Viana, Gordo, Sucena, Moita, 2010), the model used here requires that the participants make a decision quickly without information about their partner's choice, and the model can be repeated in multiple trials per session. We tested cooperation in male and female rats working for food reward, under both food restriction and ad libitum feeding. The hypothesis was that cooperative responses by pairs of rats playing IPD and direct reciprocity vary according to the sex, dominance status, familiarity and satiety of the participants. Specifically, we predicted that cooperation would be greater in females, in subordinate rats and among well-fed, familiar partners. We compared responses of individual rats in repeated trials against successful strategies, such as Tit-for-tat (Axelrod, 2006) and Pavlov (Nowak & Sigmund, 1993). We also measured the effects of dominance status and partner familiarity on cooperative responses.
Adolescent male (N = 32) and female (N = 16) Long–Evans rats (6 weeks of age; ca. 200 g body weight at the start of the study; Charles River Laboratories, Wilmington, MA, U.S.A.) were pair-housed with a same-sex conspecific under a reversed 14:10 h light:dark cycle. To facilitate operant responding, male rats were maintained on a slow rate of growth (3–4 g/day) during training, as in Cooper, Goings, Kim, and Wood (2014). To eliminate cyclic fluctuations in ovarian steroid hormones and maintain chronic physiologic levels of oestrogen, female rats were ovariectomized via bilateral dorsal flank incision, and received a subcutaneous 4 mm Silastic oestradiol implant (inner diameter: 1.98 mm, outer diameter: 3.18 mm; Dow Corning, Midland, MI, U.S.A.; Bridges, 1984). Behaviour was tested under dim light during the first 4 h of the dark phase when activity peaks. Experimental procedures were approved by the University of Southern California’s Institutional Animal Care and Use Committee (protocol number 11773) and were conducted in accordance with the animal care guidelines of the National Research Council (2011).
Training and testing were conducted in operant conditioning chambers controlled by WMPC software (Med Associates, Fairfax, VT, U.S.A.), and enclosed in sound-attenuating boxes with fans for ventilation. Operant chambers were divided in half by a removable mesh screen. Each side of the chamber was equipped with a retractable lever and stimulus light adjacent to a food trough connected to a pellet dispenser. A house light and clicker were mounted in the centre of the ceiling.
Rats were trained individually to respond on the lever to receive 45 mg sucrose pellets (Bio-Serv Inc., Frenchtown, NJ, U.S.A.). They were habituated to lever insertion in daily 20 min sessions. Each trial began in darkness with the lever retracted in the intertrial interval (ITI) state. The stimulus light was illuminated 2 s before the lever was inserted into the chamber. Rats were required to press the lever within 10 s to receive a sucrose pellet, after which the lever retracted, the stimulus light turned off and the house-light was illuminated for 30 s. If a rat failed to respond within 10 s, the chamber reverted to ITI and the trial was counted as an omission. The response time was gradually decreased to 5 s, and then to 2 s. Final trial duration was 34 s. All rats met a criterion of 25 responses per 20 min session (35 trials) for 2 consecutive days before behavioural testing began.
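As a consistency check on the timing described above, the trial components can be summed. This is a sketch that assumes the full 2 s response window elapses on every trial (an assumption; a press may end the window early):

```latex
\begin{align*}
t_{\text{trial}} &= \underbrace{2\,\text{s}}_{\text{stimulus light}}
                 + \underbrace{2\,\text{s}}_{\text{response window}}
                 + \underbrace{30\,\text{s}}_{\text{house light}} = 34\,\text{s},\\
35 \times 34\,\text{s} &= 1190\,\text{s} \approx 19.8\,\text{min}.
\end{align*}
```

Under that assumption, 35 trials fit within the 20 min training session.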
Once both cage-mates were trained, they were tested as pairs in daily sessions of 25 trials each. At the start of each trial, stimulus lights were illuminated for 2 s before levers were inserted on both sides of the chamber. Rats had 2 s to respond before the levers retracted, and the house-light was illuminated for 30 s. For trial outcomes where pellets were delivered (mutual cooperation, unilateral defection), pellets were dispensed every 0.5 s, and an audible clicker on the cage top signified each pellet entry into a food trough so that both rats could recognize when pellets were delivered.
On every trial, each rat chooses to cooperate or defect. As defined by Rapoport and Chammah (1965), mutual cooperation is represented as Reward, unilateral defection is Temptation/Sucker and mutual defection is Punishment. For RM− male and RF− female rats (N = 8 pairs each), cooperation was signified by withholding a lever response (Reward−; Fig. 1a, left). Each rat received three pellets (Reward) on trials when both rats refrained from pressing their lever. They received no pellets on trials when both pressed their levers (Punishment). When one rat pressed a lever (Temptation) while the cage-mate refrained (Sucker), the Temptation rat received five pellets and the Sucker received none. In terms of pellets earned, Temptation > Reward > Punishment = Sucker. Because rats receive equal numbers of pellets with Punishment or Sucker, the payoff matrix is a weak Prisoner's Dilemma (Kuhn, 2014). However, the Sucker could hear and see the Temptation partner receiving and consuming pellets. In separate RM+ male rats (N = 7 pairs), the rules for cooperation were reversed (Reward+; Fig. 1a, right). Both rats were rewarded for pressing the levers and punished for withholding responses, and received five pellets (Temptation) for withholding a response when the cage-mate responded (Sucker). Rats were tested for IPD with their cage-mate for 10 days. Responses for the last 4 days were averaged for each pair. Rats were tested for 4 days under food restriction similar to the training regimen (3–4 g body weight gain/day).
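The payoff rules above can be written out as a short Python sketch. It is illustrative only and is not the WMPC control program used in the study; the function names and the group labels 'minus' and 'plus' are hypothetical, while the pellet values come from the text.

```python
from typing import Tuple

# Pellet payoffs taken from the text.
REWARD, TEMPTATION, SUCKER, PUNISHMENT = 3, 5, 0, 0


def cooperated(pressed: bool, group: str) -> bool:
    """Map a lever press onto cooperate/defect for a given response rule.

    'minus' (RM-, RF-): withholding the press counts as cooperation.
    'plus'  (RM+):      pressing the lever counts as cooperation.
    The group labels are hypothetical names chosen for this sketch.
    """
    if group not in ("minus", "plus"):
        raise ValueError("group must be 'minus' or 'plus'")
    return pressed if group == "plus" else not pressed


def trial_payoff(pressed_a: bool, pressed_b: bool, group: str) -> Tuple[int, int]:
    """Return (pellets to rat A, pellets to rat B) for one simultaneous trial."""
    coop_a = cooperated(pressed_a, group)
    coop_b = cooperated(pressed_b, group)
    if coop_a and coop_b:
        return REWARD, REWARD          # mutual cooperation
    if coop_a and not coop_b:
        return SUCKER, TEMPTATION      # A cooperates, B defects
    if not coop_a and coop_b:
        return TEMPTATION, SUCKER      # A defects, B cooperates
    return PUNISHMENT, PUNISHMENT      # mutual defection
```

For example, `trial_payoff(True, False, "minus")` describes an RM− trial in which rat A presses (defects) while rat B withholds (cooperates), so A receives the Temptation payoff and B the Sucker payoff.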
After completing IPD, all pairs of rats were subsequently tested for reciprocity (Fig. 1b, left). At the start of a trial, the stimulus light was illuminated and a lever was inserted on one side of the chamber (either left or right). The Donor rat had 5 s to respond; if it did not, the lever retracted and the trial was scored as an omission. A response on the lever dispensed three pellets to the Recipient rat at a rate of one pellet/0.5 s, while the Donor’s lever retracted and the house-light was illuminated for 30 s. An audible clicker on the cage top signified each pellet entry into a food trough. Trials alternated (Donor versus Recipient), as did the first trial of each daily session (left versus right lever). Similar to IPD (25 trials/day), each daily session consisted of 24 trials, representing 12 trials per rat per day.
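The alternation scheme described above can be sketched in Python. The function name is hypothetical, and the choice of the left lever for day 1 is an assumption made for illustration; the text states only that the starting side alternated between daily sessions.

```python
def donor_schedule(day: int, trials: int = 24) -> list:
    """Return the Donor side ('left' or 'right') for each trial of a session.

    Trials alternate between sides within a session, and the side of the
    first trial alternates from one day to the next.
    Assumption: day 1 starts with the left lever.
    """
    if day < 1:
        raise ValueError("day is counted from 1")
    sides = ("left", "right")
    first = (day - 1) % 2
    return [sides[(first + i) % 2] for i in range(trials)]
```

With 24 trials per session, each side (and therefore each rat) acts as Donor on 12 trials per day.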
Rats were tested for reciprocity with their cage-mate for 10 days. Each rat was then randomly assigned to be tested for an additional 10 days with an unfamiliar same-sex conspecific. Responses for the last 5 days were averaged for each pair. This was followed by 3 days with an unfamiliar rat (‘bad stooge’) who never delivered pellets, and 3 days with a different rat (‘good stooge’) who delivered pellets on every trial. To minimize variability, all rats were tested with different partners in the same order (cage-mate, unfamiliar, bad stooge, good stooge). It is important to acknowledge the potential for carryover effects, particularly because bad stooge always preceded good stooge.
Dominance was determined for each pair of rats in three trials of tube displacement (Lindzey, Winston, & Manosevitz, 1961). The two rats were introduced simultaneously into opposite ends of a ventilated plastic tube (100 cm long, 7 cm diameter). Since they are unable to pass each other, one rat must back up to exit the tube. The rat that backed out of the tube in the majority of trials was considered subordinate.
During reciprocity testing, trials alternated every 35 s on average, and pellet delivery to the Recipient was uncertain. To address the possibility that rats were responding individually for a delayed probabilistic reward, seven male rats were individually tested on a probability discounting schedule that mimicked the conditions for the reciprocity test. In particular, pairs of rats tested for reciprocity were required to respond on a lever to deliver pellets to their partner. Control rats were also required to respond on a lever to receive a delayed and uncertain pellet reward. For each trial, a single lever was inserted, and rats had 5 s to respond. A response on the lever delivered three pellets after a 35 s delay. Rats were tested in daily sessions of 12 trials/day with decreasing probability of pellet delivery: 100, 75, 50 and 25%. Each rat was tested for 3 days at each probability.
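To make the control schedule concrete, the sketch below computes the expected pellet return per session at each delivery probability. It assumes a rat that responds on a given number of trials and that delivery on each trial is independent; the function name is hypothetical, while the constants come from the text.

```python
PELLETS_PER_RESPONSE = 3      # pellets delivered after the 35 s delay
TRIALS_PER_SESSION = 12
DAYS_PER_PROBABILITY = 3
PROBABILITIES = [1.00, 0.75, 0.50, 0.25]  # tested in decreasing order


def expected_pellets(responses: int, probability: float) -> float:
    """Expected pellets in one session for a given number of lever responses."""
    if not 0 <= responses <= TRIALS_PER_SESSION:
        raise ValueError("responses must lie between 0 and 12")
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return responses * PELLETS_PER_RESPONSE * probability


# Expected return for a rat that responds on every trial, by probability.
full_response = {p: expected_pellets(TRIALS_PER_SESSION, p) for p in PROBABILITIES}
total_days = DAYS_PER_PROBABILITY * len(PROBABILITIES)
```

A rat responding on all 12 trials would expect 36, 27, 18 and 9 pellets per session as the probability falls from 100% to 25%, across 12 days of testing in total.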
For each pair of rats, we averaged data from 25 trials/day in the last 4 days of testing (100 trials total) during food restriction and under ad libitum feeding. Mean responses from each rat were averaged for all rats in each experimental group (RM−, RM+, RF−) and compared statistically. We compared the number of responses and pellets received per day in dominant and subordinate rats during ad libitum feeding and food restriction in each experimental group by repeated measures ANOVA, with feeding condition as the repeated measure. Data were analysed using JMP 9.0 statistical software (SAS Institute, Cary, NC, U.S.A.), and P < 0.05 was considered significant.
For each experimental group, we compared trial outcomes (Reward, Temptation/Sucker, Punishment) during ad libitum feeding and food restriction by repeated measures ANOVA, with feeding condition as the repeated measure. Subsequently, we compared trial outcomes under each feeding condition separately by ANOVA with post hoc analysis by Tukey–Kramer HSD. To evaluate rats’ decision rules, the transition vectors r, t, s and p reflect the probability of cooperation when the previous trial resulted in outcomes of Reward, Temptation, Sucker or Punishment, respectively (Stevens & Stephens, 2004). For each experimental group, we compared transition vectors during ad libitum feeding and food restriction by repeated measures ANOVA, with feeding condition as the repeated measure. We then compared transitions under each feeding condition separately by ANOVA with post hoc analysis by Tukey–Kramer HSD.
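The transition vectors described above can be estimated from a session's trial record. The following Python sketch shows one way to do this for a single rat; it is illustrative only, the function and variable names are hypothetical, and it does not reproduce the authors' analysis pipeline.

```python
from collections import defaultdict

# Previous-trial outcome from the focal rat's point of view:
# (focal cooperated, partner cooperated) -> transition label.
OUTCOME_LABEL = {
    (True, True): "r",    # Reward
    (False, True): "t",   # Temptation (focal defected, partner cooperated)
    (True, False): "s",   # Sucker (focal cooperated, partner defected)
    (False, False): "p",  # Punishment
}


def transition_vectors(focal, partner):
    """Estimate P(focal cooperates | outcome of the previous trial).

    focal, partner: equal-length sequences of booleans (True = cooperated),
    ordered by trial. Returns a dict with keys 'r', 't', 's', 'p'; a value
    is None when that outcome never preceded another trial.
    """
    if len(focal) != len(partner):
        raise ValueError("sequences must be the same length")
    counts = defaultdict(lambda: [0, 0])  # label -> [cooperations, opportunities]
    for i in range(1, len(focal)):
        label = OUTCOME_LABEL[(bool(focal[i - 1]), bool(partner[i - 1]))]
        counts[label][1] += 1
        if focal[i]:
            counts[label][0] += 1
    return {
        label: (counts[label][0] / counts[label][1] if counts[label][1] else None)
        for label in "rtsp"
    }
```

Returning None for outcomes that never occurred avoids reporting a misleading zero when a rat had no opportunity to respond after that outcome.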
For each rat, we averaged the number of responses from 12 trials/day in the last 5 days of testing (60 trials total) with their cage-mate and with an unfamiliar same-sex conspecific. Responses for bad stooge and good stooge were averaged across 3 days of testing. Responses in males and females for different partners (cage-mate, unfamiliar rat, bad stooge, good stooge) were compared by repeated measures ANOVA with partner as the repeated measure. Within each sex, we compared responses for different partners by ANOVA. We compared responses by males and females for each partner by Student’s t test, with Bonferroni correction for multiple comparisons. Correlated responses within each pair of rats (cage-mates, unfamiliar pairs) were evaluated by regression analysis.
For each rat, we averaged the number of responses in 12 trials/day across the 3 days of testing at each probability. Responses across the different probabilities (100, 75, 50 and 25%) were compared by repeated measures ANOVA with probability as the repeated measure.
Figure 2 illustrates the frequency of Reward outcomes in 25 trials/day during 10 days of testing in RM−, RM+ and RF− rats. Although all rats were initially trained to respond on the lever to receive pellets, cooperation in RM− and RF− rats playing IPD was defined by withholding a response. Therefore, RM− and RF− rats had relatively few Reward outcomes (3.3 ± 1.4 and 4.0 ± 1.0/25 trials, respectively) on the first day of testing. During the last 4 days of testing, average Reward outcomes were significantly increased to 8.7 ± 1.5/25 trials in RM− males (P < 0.05). A similar response was observed in RF− females, although the increase in Reward outcomes (to 9.1 ± 2.3/25 trials) was not significant (P = 0.07). For RM+ males, Reward outcomes on day 1 (7.3 ± 2.2 Reward outcomes/25 trials) reflected the requirement to respond on the lever for cooperation. During the last 4 days of testing for RM+ rats, half of the trials resulted in Reward outcomes (11.8 ± 1.7/25 trials).
Figure 3 compares the mean ± SE number of responses versus pellets received per 25 trials in RM−, RM+ and RF− rats by dominance status (Fig. 3a–c) and food availability (Fig. 3d–f). For all three experimental groups, there was no significant effect of dominance status on responses made or pellets received, and no interaction of dominance status with feeding condition. This contrasts with previous studies, where dominant males were more motivated for food reward (Davis, Krause, Melhorn, Sakai, & Benoit, 2009). However, food restriction differentially influenced the frequency of operant responses and pellets earned in the three groups of rats. With ad libitum feeding, RM− rats each made 11.0 ± 1.2 responses in 25 trials and earned 52.9 ± 4.9 pellets (Fig. 3d). Rats obtained 52.3 ± 6.3% of pellets from Reward trials (three pellets/trial), and the remainder from Temptation trials (five pellets/trial). Cooperation was reduced under food restriction: the rats made significantly more operant responses (16.3 ± 1.3 responses/25 trials; F1,14 = 13.47, P < 0.05, ηp² = 0.480), but received fewer pellets (36.7 ± 6.7 pellets/25 trials; F1,14 = 7.06, P < 0.05, ηp² = 0.322). Behaviour of RF− females during ad libitum feeding was similar to RM− males: 11.7 ± 1.6 responses/25 trials to receive 48.2 ± 5.4 pellets (Fig. 3f). However, food restriction in females had no effect on responses or pellets received. RM+ males made 17.1 ± 1.2 responses/25 trials during ad libitum feeding to obtain 62.2 ± 4.4 pellets (Fig. 3e). As with RM− rats, RM+ rats earned significantly fewer pellets under food restriction (47.8 ± 6.9 pellets/25 trials; F1,12 = 6.03, P < 0.05, ηp² = 0.314). Response rates in RM+ rats during food restriction (12.7 ± 1.7 responses/25 trials) showed a similar pattern, although the effect was not significant (F1,12 = 2.73, P = 0.12, ηp² = 0.183).
Figure 4 illustrates average trial outcomes in RM−, RM+ and RF− pairs under ad libitum feeding and food restriction. By repeated measures ANOVA, both RM− and RM+ male rats showed significant differences in trial outcomes (RM−: F2,21 = 8.10, P < 0.05, ηp² = 0.394; RM+: F2,18 = 3.98, P < 0.05, ηp² = 0.724), as well as a significant interaction between feeding condition and trial outcome (RM−: F2,21 = 10.37, P < 0.05, ηp² = 0.497; RM+: F2,18 = 7.00, P < 0.05, ηp² = 0.437). Among RM− pairs during ad libitum feeding (Fig. 4a), significantly more trials resulted in Temptation/Sucker payoffs (10.8 ± 0.8/25 trials) versus Punishment (5.6 ± 1.0/25 trials; P < 0.05). With food restriction (Fig. 4d), the frequencies of Temptation/Sucker (10.7 ± 1.3/25 trials) and Punishment (11.0 ± 1.5) outcomes were similar (P > 0.05), but Reward outcomes were infrequent (3.3 ± 0.5 per 25 trials; P < 0.05 versus other trial outcomes). Pairs of RM+ males showed a similar distribution of trial outcomes during ad libitum feeding (Fig. 4b). In particular, the frequency of Punishment outcomes (2.5 ± 1.8/25 trials) was significantly less (P < 0.05) than for either Reward (11.8 ± 1.7) or Temptation/Sucker (10.8 ± 1.1). However, when RM+ rats were food restricted (Fig. 4e), the frequency of Punishment (7.3 ± 2.6) and Temptation/Sucker (10.8 ± 1.5) outcomes did not differ (P > 0.05). The shift in trial outcome during food restriction in RM− and RM+ rats is consistent with the increase in response rates in Fig. 3, as rats increased defection to obtain the high payoff from Temptation. During ad libitum feeding, 5 of 32 males (16%, 3 RM−, 2 RM+) received more pellets than would be obtained from consistent mutual cooperation (75 pellets/25 trials). With food restriction, only three male rats (9%, 2 RM−, 1 RM+) received more than 75 pellets/25 trials. Among individual pairs of RM− and RM+ rats, there was considerable variation in response strategies (Table 1).
Regardless of the response requirements, the most cooperative pair of rats received the most pellets and the least cooperative pair received the fewest number of pellets. By contrast, in RF− females, there were no significant differences in trial outcome and no effect of feeding condition. Trial outcomes as Reward (9.1 ± 2.3), Temptation/Sucker (8.3 ± 0.9) or Punishment (7.5 ± 2.4) during ad libitum feeding were relatively balanced (P > 0.05).
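The reported pellet totals can be cross-checked against the reported outcome frequencies with a short Python sketch. It is an approximation that works from group means rather than from the individual-rat data behind the published values, and the function name is hypothetical. It relies on the payoff rules from the Methods: each rat earns three pellets on a Reward trial, and the two rats in a pair earn five and zero pellets on a Temptation/Sucker trial, or 2.5 on average.

```python
REWARD_PELLETS = 3       # per rat on a mutual-cooperation trial
TEMPTATION_PELLETS = 5   # to the defector on a unilateral-defection trial
SUCKER_PELLETS = 0       # to the cooperator on a unilateral-defection trial


def mean_pellets_per_rat(reward_trials: float, ts_trials: float) -> float:
    """Average pellets per rat in a pair, per 25-trial session.

    reward_trials: mean number of Reward outcomes per session.
    ts_trials:     mean number of Temptation/Sucker outcomes per session.
    Punishment trials contribute no pellets and are therefore omitted.
    """
    ts_average = (TEMPTATION_PELLETS + SUCKER_PELLETS) / 2
    return reward_trials * REWARD_PELLETS + ts_trials * ts_average


# Outcome frequencies reported in the text (mean trials per 25-trial session).
rm_minus_ad_lib = mean_pellets_per_rat(8.7, 10.8)
rm_minus_restricted = mean_pellets_per_rat(3.3, 10.7)
rm_plus_ad_lib = mean_pellets_per_rat(11.8, 10.8)
```

These estimates come to roughly 53.1, 36.65 and 62.4 pellets, close to the reported means of 52.9, 36.7 and 62.2 pellets per 25 trials.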
Figure 5 evaluates response strategies reflected by the transition vectors r, t, s and p, which describe the probability of cooperation following Reward, Temptation, Sucker and Punishment outcomes, respectively, in the previous trial. As illustrated in Fig. 5a, the Tit-for-tat strategy should produce high values for r and t, and low values for s and p. For the Pavlov strategy (Fig. 5b), values for r and p should be high, while t and s should be low. However, RM−, RM+ and RF− rats did not follow either the Tit-for-tat or the Pavlov strategy. RM− males were significantly more likely to cooperate when tested during ad libitum feeding versus food restriction (F1,60 = 37.18, P < 0.05, ηp² = 0.373; Fig. 5c, f). They were also more likely to cooperate after cooperating in a previous trial (average of r and s) than after defecting (average of t and p; F1,30 = 5.03, P < 0.05, ηp² = 0.443). With food restriction, the value of s (probability of cooperation following Sucker: 0.37 ± 0.07) was significantly greater than t (probability of cooperation following Temptation: 0.13 ± 0.03; P < 0.05). Like RM− males, RM+ males were more cooperative with ad libitum feeding than with restricted feeding (F1,42 = 11.61, P < 0.05, ηp² = 0.425; Fig. 5d, g). However, in well-fed RM+ rats, p (probability of cooperation following Punishment) was significantly lower (0.36 ± 0.08; P < 0.05) compared with other transition vectors. In RF− females, feeding condition had no effect on cooperation (P > 0.05). Even so, the value of r (probability of cooperation following Reward: 0.64 ± 0.06) during food restriction (Fig. 5h) was significantly greater than p (0.24 ± 0.09; P < 0.05).
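The idealized transition vectors for Tit-for-tat and Pavlov follow directly from the standard definitions of the two strategies, as the Python sketch below shows. It assumes deterministic play (probabilities of exactly 0 or 1), which real animals would not be expected to show; the function names are hypothetical.

```python
def tit_for_tat(my_last: bool, partner_last: bool) -> bool:
    """Copy the partner's previous move (True = cooperate)."""
    return partner_last


def pavlov(my_last: bool, partner_last: bool) -> bool:
    """Win-stay, lose-shift: repeat the last move after Reward or Temptation
    (the partner cooperated), switch after Sucker or Punishment."""
    won = partner_last
    return my_last if won else not my_last


# Previous outcome -> (focal rat's last move, partner's last move).
PREVIOUS_OUTCOME = {
    "r": (True, True),    # Reward
    "t": (False, True),   # Temptation
    "s": (True, False),   # Sucker
    "p": (False, False),  # Punishment
}


def ideal_vector(strategy) -> dict:
    """Probability of cooperation after each outcome under deterministic play."""
    return {
        label: 1.0 if strategy(*moves) else 0.0
        for label, moves in PREVIOUS_OUTCOME.items()
    }
```

Here `ideal_vector(tit_for_tat)` gives high r and t with low s and p, and `ideal_vector(pavlov)` gives high r and p with low t and s, matching the patterns described for Fig. 5a and 5b.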
Figure 6 depicts responses by a donor to deliver pellets to a recipient in alternating trials of 12 each. Female donors made significantly more responses (9.3 ± 0.5 responses/12 trials) than males (5.6 ± 0.4 responses/12 trials, P < 0.05), either for their cage-mate or for an unfamiliar same-sex conspecific. In both males and females, donors did not give more pellets to a familiar cage-mate over an unfamiliar rat (P > 0.05). Similarly, response rates for dominant and subordinate rats were not different (data not shown). However, when an unresponsive rat (bad stooge) was substituted for the unfamiliar partner, responses per 12 trials declined significantly in both males (2.6 ± 0.2 responses/12 trials, P < 0.05) and females (4.5 ± 0.6 responses/12 trials, P < 0.05), although females made more responses than males (P < 0.05). When bad stooge was replaced by good stooge, who delivered pellets on every trial, response rates in males increased within the 3 days of testing (4.4 ± 0.5 responses/12 trials, P < 0.05 versus bad stooge). Response rates of females did not change (P > 0.05).
Figure 7 compares responses of male and female rats paired with their cage-mate, and of control males responding for pellets with a comparable delay and probability. In males (Fig. 7a), there was a significant correlation for response rates within pairs of rats, regardless of whether rats were paired with their cage-mate (R² = 0.76, P < 0.05) or with an unfamiliar partner (R² = 0.77, P < 0.05; data not shown). The most generous pair of rats made 9.4 ± 1.0 and 8.4 ± 1.3 responses/12 trials, versus 2.6 ± 0.7 and 2.6 ± 1.0 responses/12 trials for the least generous pair. Response rates for dominant and subordinate rats were not different (P > 0.05). In females (Fig. 7b), response rates were not correlated within pairs (R² = 0.39, P > 0.05), although overall response rates were greater than in males (P < 0.05). Unlike males tested for reciprocity with their cage-mate, response rates in control males working for a delayed and uncertain pellet reward did not vary across different reward probabilities (P > 0.05; Fig. 7c).
The present study evaluated cooperative behaviour in pairs of male and female rats working for food reward in an operant test modelled on the IPD game and in a test of direct reciprocity. Regardless of response requirements (RM− versus RM+) in the rat IPD game, male rats earned similar numbers of pellets when tested during ad libitum feeding. There was no effect of dominance status on responses made or pellets received. Under food restriction, cooperative behaviour decreased, leading to a reduction in Reward outcomes, and an increase in Punishment. Responses in RF− females did not vary with dominance status or feeding condition. Rats did not use either Tit-for-tat or Pavlov response strategies. When tested for direct reciprocity, females were more likely than males to respond on a lever to deliver food pellets to their partner, but both males and females showed no significant preference to give pellets to their cage-mate over an unfamiliar same-sex partner. In males, responses of the two partners were correlated. Responses of both males and females were reduced when tested with an unresponsive partner (bad stooge). However, rats working individually for pellets under similar conditions of delay and uncertainty showed no decrease in responses when pellets were delivered infrequently. These data demonstrate that pairs of rats working for food reward in a laboratory show cooperative responses for non-kin.
Previous laboratory studies of IPD have tested pairs of birds or rats working for food reward using a variety of apparatus and payoff matrices. Cooperation has been demonstrated using perch choice in birds (St-Pierre et al, 2009; Stevens and Stephens, 2004) or selection of one of two chambers by rats in a T-maze (Viana et al, 2010) to signify cooperation or defection. In particular, St-Pierre et al (2009) demonstrated that zebra finches show high levels of cooperation using a strategy that resembles Tit-for-tat when tested with their pair-bonded social partner, but not when tested with an unfamiliar opposite-sex partner. Similarly, rats playing against an unfamiliar stooge adjust their responses according to the strategy of their opponent (Viana et al, 2010). However, these experiments depart from the classic Prisoner's Dilemma model in that test subjects can see the location (perch, T-maze chamber) selected by their partner.
By contrast, participants in a simultaneous game are unable to determine the decision of their partner. Instead, their response strategy is limited by the lack of information, a short duration to decide, and the cognitive limitations of the participants (Axelrod, 2006). In the present study, to limit information about their partner’s decision, levers were positioned on opposite sides of the chamber, and were presented for only 2 s. To control for response omission with this brief decision interval, we included males tested under either RM+ or RM− conditions. It is notable that these groups made opposite responses to food restriction: RM− males made more responses, whereas RM+ males made fewer responses. However, the effect on cooperation was similar: Punishment trials increased and Reward trials were reduced. This indicates that food restriction favours defection in both groups, regardless of the response requirement. Similar findings have been reported previously (Viana et al, 2010). It also suggests that rats favour the large short-term payoff of unilateral defection (Temptation: 5 pellets), and discount the long-term advantages (75 pellets/25 trials) of mutual cooperation (Stevens and Stephens, 2004). In this regard, only a minority of rats averaged >75 pellets/25 trials, and this number decreased with food restriction. When tested individually in a rat version of the Iowa Gambling Task that incorporates elements of reward magnitude, uncertainty, and delay, rats can develop a preference for the advantageous lever (Zeeb, Robbins, Winstanley, 2008). However, this test is cognitively less demanding than Prisoner’s Dilemma. Although rats choose among 4 different levers in the rat Iowa Gambling Task, the payoffs on any individual lever are consistent.
The differing response strategies that rats use under ad libitum feeding vs. food restriction highlight some of the challenges in studies of cooperation in animals. Laboratory studies of cooperation in adult humans typically use monetary reward, while food reward is usually the incentive in studies of animals and children. An advantage of monetary rewards is that they can accumulate over successive trials, and be reduced as a form of punishment. On the other hand, food has an immediate reinforcing effect, and the motivation to obtain a food reward varies with feeding condition, as demonstrated in the present study. Other studies in rats have used a payoff matrix combining reward (pellets) and punishment (tail pinch; Viana et al, 2010). However, it is difficult to establish the relative relationship between pellets and tail pinches. Ultimately, it can be difficult to equate results from humans working for monetary rewards with animals working for food rewards. In particular, adult humans make more impulsive choices when working for candy vs. money (Rosati, Stevens, Hare, Hauser, 2007), and adults and children show different reactions to unfair distribution of candy (McAuliffe, Blake, Warneken, 2014). In the present study, the payoff matrix was balanced so that Reward > (Temptation + Sucker)/2 (Rapoport and Chammah, 1965). Rats received no pellets for either Punishment or Sucker, as in Viana et al (2010). Nonetheless, the Sucker had visual and auditory evidence that the Temptation partner received pellets, thereby enhancing the inequality of the Sucker outcome relative to Punishment. Such first-order inequity aversion has previously been shown for primates, crows and dogs, and has been implied in rodents (Brosnan and de Waal, 2014). If our study were repeated using a payoff matrix in which Punishment > Sucker, it seems likely that we would obtain similar results. This is based on comparison with previous animal studies of IPD where Punishment > Sucker. 
In pairs of food-restricted blue jays where P = 2 and S = 0, values of r and s (the probabilities of cooperating after cooperating in the previous trial) exceeded those of t and p (the probabilities of cooperating after defecting in the previous trial; Stevens & Stephens, 2004), as with food-restricted RM− males in the present study. In food-restricted male zebra finches where P = 1 and S = 0 (St-Pierre et al, 2009), the value of s (0.375) was equivalent to that of food-restricted RM− males (0.37±0.07).
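The balance condition on the payoff matrix, and the difference from a matrix in which Punishment > Sucker, can be checked directly. The sketch below uses the pellet values from the task description. The strict ordering T > R > P > S is the textbook Prisoner's Dilemma ordering and is included here as an assumption for comparison, not as a condition stated by the authors.

```python
# Pellet values from the task description.
R, T, S, P = 3, 5, 0, 0

# Balance condition cited in the text: Reward > (Temptation + Sucker) / 2.
balanced = R > (T + S) / 2  # 3 > 2.5

# Textbook strict ordering (assumption for comparison): T > R > P > S.
strict_order = T > R > P > S

# Because Punishment and Sucker both pay 0 pellets here, only the weak form holds.
weak_order = T > R > P >= S

print(balanced, strict_order, weak_order)
```

With P = S = 0 the balance condition is met, while the distinction between Punishment and Sucker rests on factors other than pellet count, such as the inequity cues described above.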
Although rats playing IPD in the present study adjusted their response strategy to different conditions, they did not follow either Tit-for-tat or Pavlov strategies. It has been suggested that such strict bookkeeping strategies are difficult for many animals, both because of steeper delay discounting curves, and because of reduced memory potential relative to humans (Stevens, Rosati, Ross, Hauser, 2005). However, the fact that animals do not play Tit-for-tat does not mean that they are not cooperating. Instead, RM− males in the present study followed a strategy of cooperate after cooperation/defect after defection. The parameters r and s represent the probability of cooperating after cooperating in the previous trial, while t and p reflect the probability of cooperating after defecting in the previous trial. In RM− males, the average of r and s exceeded 50%, while the average of t and p was less than 50%. Similar responses have been observed with cooperation in blue jays (Cyanocitta cristata; Stevens and Stephens, 2004).
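As an illustration of how r, s, t and p could be estimated from a session, the following sketch labels each trial by its outcome for the focal animal (R, S, T or P) and then computes the proportion of cooperative choices on the next trial. It assumes the usual convention that each parameter is conditioned on the matching outcome (r after Reward, s after Sucker, t after Temptation, p after Punishment). The function names and the example sequences are hypothetical and are not the authors' analysis code or data.

```python
from collections import Counter


def outcome(own, partner):
    """Label a trial from the focal animal's view: R, S, T or P."""
    if own == "C":
        return "R" if partner == "C" else "S"
    return "T" if partner == "C" else "P"


def conditional_cooperation(own, partner):
    """Estimate r, s, t, p as P(cooperate on trial n+1 | outcome on trial n)."""
    seen = Counter()  # how often each outcome was followed by another trial
    coop = Counter()  # how often that next trial was a cooperative choice
    for i in range(len(own) - 1):
        prev = outcome(own[i], partner[i])
        seen[prev] += 1
        if own[i + 1] == "C":
            coop[prev] += 1
    # None marks outcomes that never occurred before the final trial.
    return {k.lower(): (coop[k] / seen[k] if seen[k] else None) for k in "RSTP"}


# Hypothetical choice sequences ("C" = cooperate, "D" = defect), for illustration only.
own_choices = "CCDCDDCC"
partner_choices = "CDCCDCDC"
probs = conditional_cooperation(own_choices, partner_choices)
print(probs)
```

Under this convention, strict Tit-for-tat corresponds to r = t = 1 and s = p = 0, and Pavlov to r = p = 1 and s = t = 0, whereas the pattern described for RM− males corresponds to high r and s with low t and p.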
Unlike Prisoner’s Dilemma, direct reciprocity offers no incentive to defect, and the cost of cooperating (pressing on a lever) is minimal. Accordingly, it is reasonable to expect that rats would show high rates of response on the lever, regardless of the likelihood of receiving a reward from their partner. Certainly, control rats in the present study were willing to work for pellets under similar conditions of delay and uncertainty. However, when tested in pairs, male rats were reluctant to respond for a partner who failed to reciprocate consistently, or who failed to deliver pellets (bad stooge). Instead, response rates in pairs of male rats were highly correlated. This is consistent with a strategy of attitudinal reciprocity, in which individuals mirror the recent response of their partner (de Waal, 2000). The modest increase in response for good stooge was probably related to a carryover effect from bad stooge. Future studies should test whether replacing the cagemate with good stooge increases reciprocity above baseline.
When tested for reciprocity, pairs of females were significantly more likely than males to respond on a lever to deliver food to a partner. Indeed, female rats responded for their cage-mate on over 75% of trials, consistent with direct reciprocity. In a related study, female rats also showed generalised reciprocity, in which their willingness to work on behalf of an unfamiliar partner was increased if they had previously benefited from work by another rat (Rutte and Taborsky, 2007). Human studies have also found a greater tendency for reciprocity in women than men (Yamasue et al, 2008). On the other hand, an increased likelihood to reciprocate for a partner does not necessarily mean that females are more cooperative in all situations. For IPD in the present study, Reward outcomes in RM− and RF− pairs were similar. Sex differences emerged when rats were challenged by food restriction, where cooperation declined in males only. In this regard, female rats trained to pull a stick to bring food to a partner were more likely to work for a partner that was underfed, suggesting that responses in females take into account the needs of the partner (Schneeberger, Dietz, Taborsky, 2012). Likewise, a recent meta-analysis of research on social dilemma games (including Prisoner’s Dilemma) found no overall sex difference (Balliet, Li, Macfarlan, Van Vugt, 2011). Instead, sex differences in human cooperative behaviour are complicated by contextual factors: same-sex vs. mixed-sex interactions, number of interactions, group size.
Laboratory animal models for testing IPD and direct reciprocity offer opportunities to manipulate the underlying circuits and signals to gain a better understanding of the neurobiology of social cooperation. The neurobiologic substrates for cooperative behaviour remain to be determined. Because participants work for food or monetary reward, it is likely that dopaminergic projections from the ventral tegmental area to the nucleus accumbens are involved (Tsoory, Youdim, Schuster, 2012). Decision making also involves connections between accumbens and the prefrontal cortex (Floresco, 2013). Complex bookkeeping strategies such as Tit-for-tat or Pavlov require substantial working memory capacity involving the hippocampal formation, although this capacity appears to be lacking in rats (Milinski and Wedekind, 1998). Cooperation also involves a social dimension that overlies basic circuits for decision and reward (Rilling, 2011). In this regard, the neuromodulator oxytocin promotes affiliative behaviour in animals (Carter, Grippo, Pournajafi-Nazarloo, Ruscio, Porges, 2008) and trust in humans (Macdonald and Macdonald, 2010), and projections from oxytocin neurons in the paraventricular nucleus of the hypothalamus to the mesolimbic dopamine system may be activated during cooperative games.
We thank Ms Kathryn G. Wallin and Ms Sydney P. Goings for assistance with experimental design and animal handling.