The novel QDOT showed systematic effects of delayed monetary reinforcement in the human laboratory across delays of up to 80 s, with delivery of the chosen consequence on 100% of trials, and with a task duration of less than 20 min. The effect of delay followed a hyperbolic decay model widely found to describe human and nonhuman animal delay discounting (e.g., Mazur, 1987
; Rachlin et al., 1991
). Human operant reinforcement studies have traditionally had difficulty in showing sensitivity to delay (e.g., Hyten, Madden, & Field, 1994
; A. W. Logue, King, Chavarro, & Volpe, 1990
; A. W. Logue, Pena-Correal, Rodriguez, & Kabela, 1986
). Moreover, human delay-discounting research using hypothetical and potentially real choice procedures has typically shown decreases in reward value only at relatively long delays measured in days, months, and years. In contrast, delays on the order of seconds have been shown to cause non-trivial decreases in reinforcer value in humans when immediately usable, non-monetary reinforcers are employed, including video clips (Navarick, 1996
; Navarick, 1998
) and consumable liquids (Jimura, Myerson, Hilgard, Braver, & Green, 2009
; McClure, Ericson, Laibson, Loewenstein, & Cohen, 2007
). One hypothesis is that the tendency towards self-control (i.e., larger later reinforcer) responding traditionally observed in human operant research is due to the use of token reinforcers that cannot be exchanged for other goods until after the session (Jackson & Hackenberg, 1996
). The EDT and QDOT may be able to show human delay discounting for monetary rewards over such short time frames because reinforcement entails actual coin delivery, not just token reinforcement (e.g., an earnings total on a screen that is exchanged for real money after the task). Although money itself is a token reinforcer and cannot be exchanged for other goods until leaving the laboratory, it is possible that money is so extensively generalized that its physical acquisition is delay discounted more similarly to immediately usable reinforcers than to other token reinforcers typically used in human operant research (e.g., points or earnings displayed on a screen). A comparison of QDOT performance with and without coin delivery throughout the task would provide a test of this unexamined hypothesis.
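To make the hyperbolic decay model concrete, the following is a minimal sketch of Mazur's (1987) equation, V = A / (1 + kD), fit by a simple least-squares grid search. The delays, the $0.25 amount, and the indifference points below are invented for illustration and are not data from the present study; the grid search is a stand-in for formal nonlinear least-squares fitting.

```python
def hyperbolic(delay, k, amount=0.25):
    """Mazur's hyperbolic model: discounted value V = A / (1 + k*D)."""
    return amount / (1.0 + k * delay)

delays = [0, 10, 20, 40, 80]             # illustrative delays in seconds
indiff = [0.25, 0.18, 0.14, 0.10, 0.06]  # illustrative indifference points ($)

def sse(k):
    """Sum of squared errors between model predictions and observed points."""
    return sum((hyperbolic(d, k) - v) ** 2 for d, v in zip(delays, indiff))

# Grid search over candidate discounting-rate parameters k.
k_est = min((k / 10000 for k in range(1, 5001)), key=sse)
print(f"estimated k = {k_est:.4f} per second")
```

A larger fitted k corresponds to steeper discounting; comparing fitted k values (or their logarithms) across groups is one common way data from such tasks are analyzed.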
Performance on the QDOT was significantly and positively correlated with performance on the EDT. Similarly, a trend was evident for discounting in the hypothetical $1,000 task to be positively correlated with discounting in the potentially real $10 task. Although previous research has shown EDT performance to be positively correlated with hypothetical delay-discounting performance (Reynolds, 2006a
), the present study did not find this effect. QDOT and EDT performance may be similar because both tasks provide operant reinforcement during the task, involve relatively small-magnitude reinforcers, and assess short time frames. In contrast, the hypothetical $1,000 and potentially real $10 tasks do not provide operant reinforcement during the task, involve larger rewards, and span timeframes ranging from one day to at least six months. Consistent with the distinction between the two types of tasks, the EDT but not a potentially real reward task was affected by ethanol administration (Reynolds et al., 2006
). Moreover, the EDT but not a hypothetical rewards task was affected by methylphenidate administration (Shiels et al., 2009
). Therefore, procedures such as the QDOT and EDT may be closer to “state” measures, while hypothetical and potentially real tasks may be closer to “trait” measures (these descriptors likely fall along a continuum), with shorter-timeframe contingencies being more malleable under experimental conditions. Cocaine-dependent individuals may tend to be less sensitive to delayed reinforcement over both short and long timeframes than controls, consistent with the group differences observed for the QDOT, EDT, and hypothetical tasks; this suggests that all of these tasks can also serve as “trait” measures. The lack of correlation between the short- and long-timeframe tasks could result from differential sensitivities to different time frames, or from methodological differences.
The QDOT resulted in an effect size between the control and cocaine-dependent groups (Cohen's d = 0.42) that compared favorably with the other tasks. That is, this effect was similar to that obtained with the EDT (0.50) and hypothetical $1,000 task (0.40), and substantially larger than that obtained with the potentially real $10 task (0.14). Moreover, the effect size obtained with the QDOT was similar to the effect size obtained with the hypothetical $1,000 task between cocaine-dependent and matched control participants in a previous study (0.53; Heil et al., 2006
). These data suggest that the QDOT is sensitive to the types of between-group differences that have been examined so frequently in the delay-discounting and drug-dependence literature. Although the effect size obtained with the EDT was somewhat larger than that obtained with the QDOT, the EDT involves probabilistic as well as delayed reinforcement. Therefore, differences in probability discounting may have contributed to this larger effect size. The small effect size (0.14) between the control and cocaine groups for the potentially real $10 task was surprising. Previous research has shown this task to correlate well with a hypothetical task (using the same reward magnitude for both tasks, which was not the case in the present study) (Johnson & Bickel, 2002
). Moreover, the $10 potentially real reward task showed greater discounting in heavy smokers than demographically matched nonsmokers, with Cohen's d = 0.84 (Baker et al., 2003
). The task also showed light smokers to discount more than demographically matched nonsmokers, with Cohen's d = 0.57 (Johnson et al., 2007
). It is not clear why the potentially real reward task showed large effects in these studies comparing tobacco smokers to controls, but showed a relatively small effect (and no significant difference) in the present study comparing cocaine-dependent and control participants.
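For readers comparing the effect sizes above, the following sketch shows the standard pooled-standard-deviation Cohen's d computation; the group means, standard deviations, and sample sizes are invented for illustration and are not the study's data.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Between-group Cohen's d using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Invented summary statistics for two hypothetical groups:
d = cohens_d(mean1=0.08, sd1=0.05, n1=25, mean2=0.06, sd2=0.045, n2=25)
print(f"Cohen's d = {d:.2f}")
```

By conventional benchmarks, d values near 0.2, 0.5, and 0.8 are often described as small, medium, and large, which places most of the between-group differences discussed here in the small-to-medium range.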
Despite the relatively large between-group effect size for delay discounting on the QDOT, mean total earnings on the QDOT for the control group were only $0.89 (~7%) greater than for the cocaine group, a relatively small difference. However, because task duration was fixed rather than choice-dependent (a fact told to participants before beginning) and session duration was extremely similar between groups, it cannot be argued that the cocaine group was behaving more efficiently.
Consistent with the effect size analysis, the QDOT, EDT, and hypothetical $1,000 tasks showed that the cocaine-dependent participants discounted significantly more than the control participants, replicating previous findings regarding cocaine-abusing/dependent individuals using hypothetical delay-discounting tasks (Coffey et al., 2003
; Heil et al., 2006
) and a potentially real reward delay discounting task (Kirby & Petry, 2004
). The lack of significant differences in discounting between the groups on the potentially real $10 task contrasts with previous results in cocaine abusers using a potentially real reward task (Kirby & Petry, 2004
), although it should be noted that the potentially real reward task used in that study differed from the one used in the present study.
The QDOT may provide advantages for particular human delay-discounting studies, although it would not be the best alternative for all experimental questions. First, the QDOT does not confound delay discounting with probability discounting. That is, the QDOT delivers the chosen reinforcer with certainty on all trials, and involves no aspect of probabilistic reinforcement. In contrast, in the EDT the larger delayed reward is probabilistic, a characteristic that pilot work suggested increases the reliability of observing a discounting gradient (Reynolds & Schiffbauer, 2004
). Although many real-life choices involve options that are both delayed and probabilistic, emerging evidence suggests that delay and probability discounting are similar yet independent processes. For example, reward magnitude is inversely related to discounting rate for delay discounting, but is directly related to discounting rate for probability discounting (Estle, Green, Myerson, & Holt, 2006
). Also, research suggests that delay and probability interact in a complex fashion. For example, when assessing rewards that are both delayed and probabilistic, the effect of probability is greater at smaller than at larger delays, and the effect of reward magnitude on discounting rate for these combined delayed and probabilistic rewards follows the direction normally observed for delay rather than probability discounting (Yi, de la Piedad, & Bickel, 2006).
The second advantage of the QDOT is its substantially shorter and less variable task duration. The mean duration of the QDOT, including instructions, was approximately 16 min, with a range that spanned only 2.5 min and a maximal duration of approximately 17 min. This suggests that the task may be readily utilized in a study within a 20-min time frame, which compares favorably with other human operant delay-discounting tasks (Lane et al., 2003
; Reynolds & Schiffbauer, 2004
) such as the EDT, which had a maximum duration of 70 min and a duration that varied over a 42-min range across participants in the present study. The shorter and more reliable duration of the QDOT allows for its use in studies with multiple other measures, or in drug-administration studies and other experimental manipulations in which one is interested in assessing effects on delay discounting at relatively discrete time points.
The third advantage is that the QDOT resulted in less variable money payments to volunteers than some human operant delay-discounting tasks. Although the difference in absolute amount earned between the QDOT and EDT is arbitrary given that the magnitude of earnings for either task could be manipulated, the variability in earnings was substantially less in the QDOT than in the EDT. That is, QDOT earnings varied over an approximately 2-fold range, while EDT earnings varied over an approximately 5-fold range. The potentially real $10 task (which is not an operant procedure) varied over an approximately 50-fold range. This decreased variability with the QDOT may be important in two respects. First, researchers can make a more accurate estimate of study costs. Second, there may be experimental advantages in reducing the variability of participant earnings, so that differences in study earnings per se are less likely to confound other study hypotheses.
The fourth advantage of the QDOT over previous human operant delay-discounting tasks is that the QDOT always converges on an indifference point. Some previous tasks have allowed for the possibility of indeterminate indifference points or multiple indifference points for a single delay (Lagorio & Madden, 2005
; Lane et al., 2003
; Reynolds & Schiffbauer, 2004
; Scheres et al., 2006
). To deal with this issue, one procedure has used independent raters to subjectively judge indifference points from the raw data, with differences resolved by discussion (Scheres et al., 2006
). As was seen for one participant in the present data set, indeterminate indifference points are possible with the EDT. The possibility that the EDT may fail to produce data for an a priori
unknown number of participants may pose challenges for power analyses and the determination of target sample size for a study. The participant (Cocaine 19) for whom the EDT failed to produce data provides an example worth exploring. That participant chose the immediate option on 100% of trials in all delay blocks, and therefore the titration algorithm did not converge on a solution. To address the fact that the delayed option is probabilistic while the immediate value is certain in the EDT, all non-zero-delay indifference points are divided by the zero-delay-block indifference point. Therefore, in the case of Cocaine 19, even if a value of 0 were assigned as the indifference point for each delay, normalization would result in undefined data (division by zero).
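The normalization failure just described can be made concrete with a small sketch; the function name and the example indifference-point values below are hypothetical illustrations, not the EDT's actual implementation.

```python
def normalize_edt(indiff_by_delay):
    """Divide each non-zero-delay indifference point by the zero-delay
    indifference point. Returns None when the zero-delay point is 0,
    because the normalization is then undefined (division by zero)."""
    zero_delay = indiff_by_delay.get(0)
    if not zero_delay:
        return None
    return {d: v / zero_delay for d, v in indiff_by_delay.items() if d != 0}

# Typical participant: normalization succeeds (ratios of about 0.7 and 0.5).
print(normalize_edt({0: 0.30, 10: 0.21, 20: 0.15}))

# Exclusive immediate-option choice (as for participant Cocaine 19) drives
# every indifference point to 0, so normalization cannot produce usable data.
print(normalize_edt({0: 0.0, 10: 0.0, 20: 0.0}))  # None
```

Because the QDOT titrates directly on the immediate amount and requires no such ratio step, it converges on a usable indifference point for every participant.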
Limitations of the QDOT should be considered. Although the QDOT is more similar to the typical methodology of nonhuman delay-discounting studies because rewards are contingent and experienced throughout the task, differences nonetheless remain. One is that nonhuman animal studies provide extensive experience with choices until stability in choice is achieved, whereas the QDOT provides only a single exposure to each choice within the procedure. Through both instructions and experience throughout the task, participants presumably learn that consequences will indeed be provided for every choice they make. It is possible that responses on the first trial or first few trials might have differed had those same choices been presented later in the task, given that there was no or limited experience with delays early in the task. Therefore, the results might differ if the task had been performed repeatedly by the participants (e.g., Lane et al., 2003
), a hypothesis that may be empirically addressed in future research. Ultimately, the order in the data, including the hyperbolic nature of the delay-discounting functions, and the differences in degree of discounting between control and cocaine-dependent participants, suggests that the task provides a valid assessment of delay discounting. Another limitation is that it is unknown to what extent individual trial duration could have impacted results. Although the instructions and the waiting period at the end of the task provided assurance that early termination of the task did not encourage choice for the smaller immediate reinforcer, inter-trial-interval (individual trial duration or the time between reinforcement deliveries) was decreased by selection of the smaller immediate reward. It is therefore unknown whether results would have differed if inter-trial-interval were held constant (i.e., if an adjustable waiting period were imposed after reinforcer delivery). Previous human contingent delay discounting research has shown that although results can differ depending on whether inter-trial-interval is constant or variable, both methods result in concordance regarding whether or not groups differ in delay discounting (Marco et al., 2009
; Scheres et al., 2006
). Future research may examine the effect of holding inter-trial-interval constant on the QDOT. Another limitation is that the reliability of the task is unknown. Assessing the test-retest reliability of the task will be important before incorporating the QDOT as a repeated measure in future studies.
Although the present results suggest that the QDOT may be appropriate in some experimental situations, other procedures will be appropriate for others. Each of the human operant tasks mentioned in the present study has made important scientific contributions and was appropriate for addressing the experimental questions in the referenced studies. Advantages of these other procedures have included exposure to forced-choice trials (Lagorio & Madden, 2005
; Lane et al., 2003
; Reynolds & Schiffbauer, 2004
); innovative graphical representation that may facilitate participation by children (Scheres et al., 2006
); assessing real consequences with delay durations of multiple days (Lagorio & Madden, 2005
); and the use of coin delivery, which was replicated in the present methods for the QDOT (Reynolds & Schiffbauer, 2004
). Because delayed consequences in life are typically uncertain as well as delayed, the EDT may provide a more face-valid model of these conditions (Reynolds & Schiffbauer, 2004
). Ultimately, research questions and experimental constraints should determine the nature of the delay-discounting task to be used for a particular study.
In summary, the QDOT showed systematic effects of delay that conformed well to a hyperbolic function, correlated well with the EDT, showed an effect size between cocaine-dependent and control participants that was similar to that of most other delay-discounting tasks, and showed that the cocaine-dependent participants discounted significantly more than the control participants. The QDOT may provide methodological advantages for particular human delay-discounting research because it is not confounded by probabilistic reinforcement, can be reliably administered in less than 20 min, yields a relatively consistent magnitude of session earnings, and consistently produces complete delay-discounting data.