One possible reason for the between-species disparity in traditional operant studies of choice and self-control is a difference in the nature of the consequences used. In most studies with humans, behavior is maintained by points exchangeable for money, whereas in studies with nonhuman animals behavior is maintained by consumable consequences (e.g., food). Because all points earned are typically exchanged at the same time (i.e., at the end of the session or at the end of the entire experiment), there is no immediate utility to earning points sooner rather than later within the session (cf. Hyten et al., 1994). Some operant studies showing evidence of impulsive choice in adults have used consumable reinforcers in the form of escape from loud noise (Navarick, 1982; Solnick et al., 1980), juice (Jimura et al., 2009; McClure et al., 2007), or access to a video game (Millar and Navarick, 1984). The present research is an extension of this approach.
We designed a task in which an outcome would gradually increase in value as time passed, a paradigm that we will call an "escalating interest" task. Thus, a participant must decide when to cash in, a decision analogous to that faced by investors for whom an asset is continuously increasing in value. A decision to cash in at a delay of 5 s indicates that the outcome available at that delay was of higher subjective value than all of the smaller, sooner outcomes and all of the larger, later ones. The choice our participants faced is similar to that used in studies of deferred gratification in which the reward is always present during the waiting period (e.g., Mischel and Ebbesen, 1970; Reynolds and Schiffbauer, 2005), but in our task the reward available increased continuously during the delay. The only direct analogues we could uncover for our escalating interest task were (a) the single key impulsivity paradigm (SKIP; Swann et al., 2002), in which each mouse-click earned 1 cent for every 2 s since the previous response, and (b) an increasing food task in which nonhuman primates were given an increasing amount of food as the delay-to-consumption increased (Anderson et al., 2010; Beran and Evans, 2006; Pele et al., 2010). For example, Anderson et al. (2010) increased the amount of food available to a monkey over a delay, either by adding additional equally sized food items at regular intervals (thus magnitude was a linear function of delay) or by adding increasingly larger food items (thus magnitude was a power function of delay). Only the latter method induced waiting in their monkeys.
In order to study these types of choices, we designed a video game in which participants were told simply to destroy all of their targets (stationary monsters, "orcs," distributed throughout a game environment involving hills, a lake, and buildings) in each of four levels of the game (for a clip, see http://bcs.siuc.edu/facultypages/young/Research/Supplemental.html). The four levels of the game were identical; only the behavior of the player's weapon changed. Once a player completed all of the decisions in the game environment, the environment was reset (i.e., all orcs were resurrected), thus creating the next level of the game. This method ensured that the participant would learn the spatial layout of the targets during the first level, making navigation more efficient in subsequent levels.
The key variable was the way in which the player's weapon recharged; the weapon always reached its full charge (maximum damage) 10 s after its previous shot. The player could fire more quickly, doing less damage per shot but gaining the opportunity to fire again sooner, or fire more slowly, doing more damage with each shot. To create situations in which impulsive behavior was detrimental to performance (i.e., in which firing early decreased the overall rate of damage), we systematically changed the mathematical function dictating the recharge of the weapon. For some conditions (manipulated within-subject), the weapon recharged more slowly early in the 10 s interval, thus encouraging waiting; for other conditions, the weapon recharged more quickly early in the 10 s interval, thus encouraging firing sooner. Because multiple shots were required to destroy an orc, there were many decisions regarding how long to wait between shots, allowing ample opportunity for the participant to learn the properties of the weapon before the mathematical relationship changed.
The function producing the recharge behavior was the superellipsoid:

d(t) = 100 × [1 − (1 − t/10)^power]^(1/power), for 0 ≤ t ≤ 10,  (1)

where d(t) dictates the percentage of maximal damage done by firing the weapon t seconds after the previous shot (for t > 10 s, d(t) remains at 100). Examples of the application of this function are shown in Fig. 1. For powers greater than 1, it was beneficial to fire earlier. For powers less than 1, it was always beneficial to wait (i.e., impulsive behavior was detrimental to performance). Thus, under these conditions (power < 1), the procedure may have utility for assessing individual differences in sensitivity to delayed consequences. That is, a relatively "impulsive" participant would be more likely to "cash in" early, before the weapon is fully charged (t < 10 s), leading to more immediate reinforcement in the short run but less efficient performance overall.
Fig. 1 The four superellipsoid functions generated by Eq. (1) using a power value of 1.25 (top curve), 1.00, 0.75, and 0.50 (bottom curve).
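The shape of these curves can be sketched in a few lines, assuming the recharge follows the superellipsoid form d(t) = 100 × [1 − (1 − t/10)^power]^(1/power), which is consistent with the damage percentages reported below; the function and variable names here are illustrative:

```python
def recharge(t, power, max_delay=10.0):
    """Percentage of maximal damage for firing t seconds after the
    previous shot, following a superellipsoid recharge curve."""
    if t >= max_delay:
        return 100.0  # the weapon is fully charged at 10 s and stays there
    return 100.0 * (1.0 - (1.0 - t / max_delay) ** power) ** (1.0 / power)

# Damage available after a 5-s wait for the four power values in Fig. 1
# (1.25 is the top curve, 0.50 the bottom):
for power in (1.25, 1.00, 0.75, 0.50):
    print(f"power = {power:.2f}: {recharge(5, power):.1f}% of maximal damage")
```

Note that the curve for power = 1.00 is simply the straight line d(t) = 10t, so halving the wait time exactly halves the damage per shot.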
If we assume that the participants' goal is to destroy the targets as quickly as possible, then there is an optimal wait time for each level of power. Waiting has the benefit of increasing the weapon's damage but the cost of reducing the firing rate. For weapons with a power value of 1.0, the linear increase in damage for waiting is directly offset by the decrease in firing rate, and thus a participant will maximize the rate of target destruction with any wait time of 10 s or less (cf. the single key impulsivity paradigm, or SKIP; Dougherty et al., 2005). In contrast, for weapons with a power less than 1.0, an optimal participant should always wait the full 10 s and no longer. The degree of suboptimality is a function of both the deviation from this optimal value and the weapon's power. For example, at a power of 0.5, firing every 5 s produces a rate of destruction only 17% of optimal, whereas at a power of 0.75 the same rate of firing produces a rate of destruction that is 60% of optimal. Waiting 9 s between shots at a power of 0.5 produces a rate of destruction 52% of optimal, whereas at a power of 0.75 this longer wait time produces a rate of destruction that is 85% of optimal.

Finally, for weapons with a power greater than 1.0, the participant should shoot as rapidly as possible (in our game, players could shoot as rapidly as once every 0.25 s). Again, the degree of suboptimality for deviating from this optimum is a function of both the deviation from the optimal rate and the weapon's power. For example, at a power of 1.25, firing every 5 s produces a rate of destruction 52% of optimal, whereas at a power of 1.50 the same rate of firing produces a rate of destruction that is 33% of optimal. The penalty for waiting longer than 10 s is identical across all power values.
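These efficiency figures can be checked directly by computing the rate of destruction (damage per second) at a given wait time relative to the best achievable rate. The sketch below again assumes the superellipsoid recharge form d(t) = 100 × [1 − (1 − t/10)^power]^(1/power), with illustrative names:

```python
def recharge(t, power, max_delay=10.0):
    """Percentage of maximal damage for a shot fired t seconds after the last."""
    if t >= max_delay:
        return 100.0
    return 100.0 * (1.0 - (1.0 - t / max_delay) ** power) ** (1.0 / power)

def relative_efficiency(wait, power, fastest=0.25):
    """Rate of destruction (damage per second) at a given wait time,
    expressed as a proportion of the best rate achievable for that power."""
    rate = recharge(wait, power) / wait
    if power <= 1.0:
        best = recharge(10.0, power) / 10.0        # optimum: wait the full 10 s
    else:
        best = recharge(fastest, power) / fastest  # optimum: fire every 0.25 s
    return rate / best

# Compare with the percentages reported in the text:
for wait, power in [(5, 0.50), (5, 0.75), (9, 0.50), (9, 0.75),
                    (5, 1.25), (5, 1.50)]:
    eff = relative_efficiency(wait, power)
    print(f"firing every {wait} s at power {power:.2f}: {eff:.1%} of optimal")
```

The steep penalty at low powers (e.g., only about 17% of optimal when firing every 5 s at power 0.5) is what makes impulsive firing costly in those conditions.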
In Experiment 1, we sought to establish basic control of behavior by its consequences as a function of power (Eq. (1)) in the context of the video game. Will conditions in which impulsive choice is beneficial to fast task completion (powers > 1.0) elicit faster firing, whereas conditions in which impulsive choice is detrimental to quick completion (powers < 1.0) elicit slower firing? Once sensitivity to our key independent variable, power, was established, we were positioned to examine the effect of other environmental variables on both the sensitivity to power and the overall firing rate.
Experiment 2 was designed to assess behavior under two conditions. In the magnitude condition, the amount of damage increased as specified in Eq. (1), thus replicating Experiment 1. In the probability condition, the probability of doing maximal damage increased as specified in Eq. (1). In terms of expected value, these two conditions were matched: if waiting 5 s produced 4 points of damage (of 10) for a given power value in the magnitude condition, then waiting 5 s produced a 40% chance of 10 points of damage (0.4 × 10 points = an expected value of 4 points) for the same power value in the probability condition. Thus, if participants are sensitive only to the expected value of the outcome, then the two conditions should give rise to the same behavioral tendencies toward impulsive choice as a function of the weapon's power value. If, however, guaranteed small amounts are valued more highly than uncertain large amounts (Kahneman and Tversky, 1979; Shafir et al., 2008), then our players will cash in more quickly in the magnitude condition than in the probability condition, because in the latter the uncertainty of early outcomes will evoke a tendency to wait until maximal damage is certain.
1.1. Predicting game performance
A natural question concerns the degree to which other measures of impulsivity correlate with choice in our video game. Prior studies have shown moderate, weak, and no correlations among various measures of impulsivity (e.g., Lane et al., 2003b; Reynolds et al., 2006; Swann et al., 2002; Wingrove and Bond, 1997), so we did not anticipate strong correlations. Furthermore, we expected the task contingencies to be relatively strong, thus dominating any small differences among our participants' behavioral tendencies (whether from prior history or biological differences). In Experiment 1, participants completed a delay discounting task assessing preference between immediate and delayed hypothetical monetary amounts to determine whether discounting rate in the hypothetical choice task predicts choice in the video game (either sensitivity to consequences or the overall tendency to fire sooner rather than later). Participants also completed the Fagerström Test for Nicotine Dependence (Heatherton et al., 1991). Given that earlier studies have shown that smokers tend to have higher discount rates as assessed with traditional discounting procedures (Bickel et al., 1999; Johnson et al., 2007; Mitchell, 1999), smokers might produce shorter interresponse times in the video game. Our natural sampling of smokers in a college population produced small samples, however, and thus these results will be given little attention. In Experiment 2, participants completed the Barratt Impulsiveness Scale (BIS) as well as a behavioral inhibition task in which a prepotent response had to be withheld in the presence of a stop signal that occurred at random. The BIS is the most prevalent measure of impulsivity in the clinical literature and includes a set of questions (e.g., "I act on impulse" and "I am future oriented") intended to assess self-reported behavioral tendencies that transcend any given decision or situation (Barratt, 1959; Patton et al., 1995). Given that the SKIP, a behavioral task with some similarity to our game choices, has shown weak or no correlations with a choice-based delay discounting task, the BIS, and a behavioral inhibition task (e.g., Dougherty et al., 2009), we did not anticipate any significant correlations between these measures and choice in our video game. Regardless, we chose to revisit the issue in these experiments.