The experimental analysis of delay of reinforcement is considered from the perspective of three questions that seem basic not only to understanding delay of reinforcement, but, also, by implication, the contributions of temporal relations between events to operant behavior. The first question is whether effects of the temporal relation between responses and reinforcers can be isolated from other features of the environment that often accompany delays, such as stimuli or changes in the temporal distribution or rate of reinforcement. The second question is that of the effects of delays on operant behavior. Beyond the common denominator of a temporal separation between reinforcers and the responses that produce them, delay of reinforcement procedures differ from one another along several dimensions, making delay effects circumstance dependent. The final question is one of interpreting delay of reinforcement effects. It centers on the role of the response–reinforcer temporal relation in the context of other, concurrently operating behavioral processes.
Along with rate, quality, and magnitude, delay has been considered a primary determinant of the effectiveness of a reinforcer (e.g., Catania, 1979; Kimble, 1961). The study of delay of reinforcement in the experimental analysis of behavior is a contemporary manifestation of the long-standing question in the history of ideas, from Aristotle to Hume and on to James, of how the temporal relations between events influence the actions of organisms. Early in the history of experimental psychology, Thorndike (1911) noted that response acquisition is negatively related to the interval between a response and its effect or consequence, and Watson (1917) studied delay of reinforcement experimentally by imposing a period of time between rats' responses of digging through sawdust and subsequent access to food. Thereafter, a plethora of experiments have examined delay of reinforcement using a variety of methods and from an equal variety of theoretical perspectives. Earlier work was placed in perspective in previous reviews (Renner, 1964; Tarpy & Sawabini, 1974), though the focus of neither of these reviews was on free-operant behavior.
The experimental analysis of delay of reinforcement has produced not only an extensive empirical literature but also results that have been incorporated into a number of integrative theories of reinforcement, such as delay-reduction theory (e.g., Fantino, 1969), the correlation-based law of effect (Baum, 1973; cf. Williams, 1983), and behavioral economics (e.g., Mazur, 1987). Practically, delay of reinforcement is part and parcel of many human endeavors, as well as the applications of behavioral research (e.g., Hayes & Hayes, 1993; Stromer, McComas, & Rehfeldt, 2000). Perhaps because of its long history of study, rather than in spite of it, delay of reinforcement continues as a fruitful area of research, application, and theory.
Delay of reinforcement is considered herein from the perspective of three questions. The first is one of separating the time period between a reinforcer and the response that produces it from other features of the environment that can, and often do, accompany the introduction of a delay, such as stimuli or changes in the temporal distribution and rate of reinforcement. The second question is that of the effects of delays on operant behavior. Beyond the common denominator of temporal separation, delay of reinforcement procedures differ from one another along several dimensions, making delay effects circumstance-dependent. The final question is one of interpreting delay of reinforcement effects. It revolves around the contribution of the response–reinforcer temporal relation in the context of other, concurrently operating behavioral processes. Before considering these three questions, however, answering a more basic question is in order.
Reinforcement is delayed whenever there is a period of time between the response producing the reinforcer and its subsequent delivery. This time period has been arranged in different ways. Skinner (1938) programmed the delay in the presence of the same stimuli that were in effect during the nondelay period, as did Dews (1960) and Azzi, Fix, Keller, and Rocha e Silva (1965). Ferster (1953) and many subsequent researchers (e.g., Chung, 1965; Chung & Herrnstein, 1967; Pierce, Hanford, & Zimmerman, 1972; Richards, 1981) correlated the delay interval with a stimulus change. These two procedures define, respectively, unsignaled and signaled delays of reinforcement. With both, there are two components: one before the reinforced response, and the other between it and the reinforcer. Thus, in the conventional schedule nomenclature of Ferster and Skinner (1957), an unsignaled delay of reinforcement may be categorized as a type of tandem schedule and a signaled delay of reinforcement as a type of chained schedule of reinforcement.
Other features of the delay are important in its discussion. When delays have been accompanied by a distinct stimulus, the stimulus most often is a blackout, but changes in both visual stimuli (e.g., key color lights, lever retraction) and auditory stimuli also have been investigated (Ferster, 1953; Pierce et al., 1972; Lattal, 1987). Delays also can be nonresetting or resetting. In the former, once the delay interval is initiated by a response, further responses have no effect. A resetting delay is one in which each response during the delay returns the delay to its initial value. Resetting and nonresetting delays are most often combined with unsignaled delays, because the stimuli present during a signaled delay usually control the behavior during the delay (Lattal, 1987). The delay duration also can be either fixed, that is, the same each time it is effected, or variable, such that it changes before successive reinforcers. Resetting delays typically are fixed, though in principle they need not be. A variable delay can be arranged such that the values are selected and then set for the duration of each specific delay by using a variable-time (VT) schedule (e.g., Cicerone, 1976). With a nonresetting unsignaled delay, the actual delay preceding reinforcers also will vary as a function of when responding occurs during the delay. In the latter case, the nominal delay value defines the upper limit of the delay, with actual or obtained delays typically being less than the programmed delay value (cf. Dews, 1981; Sizemore & Lattal, 1977, 1978).
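The difference between nominal and obtained delays under the two unsignaled arrangements can be made concrete with a small sketch. The Python below is illustrative only: the function names, the example response times, and the definition of obtained delay as the time from the last response to the reinforcer are assumptions made for this example, not details of any cited procedure.

```python
def nonresetting_delay(response_times, nominal_delay):
    """Unsignaled nonresetting delay: the first response starts the timer,
    and later responses have no effect on when the reinforcer arrives."""
    reinforcer_time = response_times[0] + nominal_delay
    # Assumed definition: obtained delay runs from the last response that
    # precedes the reinforcer to the reinforcer itself.
    last_response = max(t for t in response_times if t < reinforcer_time)
    return reinforcer_time, reinforcer_time - last_response


def resetting_delay(response_times, nominal_delay):
    """Unsignaled resetting delay: every response during the delay restarts
    the timer, so the reinforcer follows a response-free period."""
    reinforcer_time = response_times[0] + nominal_delay
    for t in response_times[1:]:
        if t >= reinforcer_time:
            break  # the delay elapsed before this response occurred
        reinforcer_time = t + nominal_delay
    # By construction the obtained delay equals the nominal delay.
    return reinforcer_time, nominal_delay


# Hypothetical response times (s); the first response initiates the delay.
responses = [0.0, 4.0, 9.0, 15.0]
print(nonresetting_delay(responses, 10.0))  # (10.0, 1.0)
print(resetting_delay(responses, 10.0))     # (25.0, 10.0)
```

With the same hypothetical responses and a 10-s nominal delay, the nonresetting arrangement delivers the reinforcer on time but with an obtained delay of only 1 s, whereas the resetting arrangement preserves the 10-s delay at the cost of postponing the reinforcer.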
The first question posed in the introduction is, in essence, whether there is a delay of reinforcement effect. Imposing either signaled or unsignaled delays of reinforcement can simultaneously alter other features of the environment, which in turn can contribute to the behavioral changes nominally attributable to imposing the delay. This is most obvious in signaled delays, where there is an immediate, response-produced stimulus change that can have conditioned reinforcing, eliciting, or overshadowing effects on operant responding, independently of the effects of the delay.
Breaking the chains of conditioned reinforcement, or these other stimulus functions, is easily accomplished by eliminating the signal. When unsignaled delay value is varied, a delay of reinforcement gradient is observed (Sizemore & Lattal, 1978) that qualitatively resembles the gradient obtained with signaled delays (Richards, 1981). The unsignaled delay gradient, based on obtained delay values, however, is characterized by lower response rates and a steeper slope than the gradient obtained with otherwise equivalent signaled delays (Richards; Sizemore, 1976).
Even with unsignaled delays, other potential problems remain. Consider a typical manipulation for imposing an unsignaled delay of reinforcement on responding maintained by a variable-interval (VI) 60-s schedule of immediate reinforcement. When a delay of, for example, 20 s is imposed, the schedule is converted to a tandem VI 60-s fixed-time (FT) 20-s schedule. One problem is that the structure of the schedule changes such that reinforcers that previously could occur within less than a second of one another now are always separated by 20 additional seconds. This lengthens postreinforcement pausing, which in turn can be reflected as lower overall response rates, the same effect that might be expected with a 20-s delay. Another problem is that the rate of reinforcement is reduced from an average of one per 60 s to one per 80 s, also potentially reducing response rate independently of the delay effects.
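The size of this reinforcement-rate confound follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the calculation assumes that a response occurs as soon as each VI interval elapses, so the figures are upper bounds on the programmed rate rather than obtained values.

```python
def reinforcers_per_hour(vi_mean_s, ft_delay_s=0.0):
    """Approximate programmed reinforcement rate for a tandem VI FT schedule,
    assuming a response occurs as soon as the VI interval elapses."""
    return 3600.0 / (vi_mean_s + ft_delay_s)


immediate = reinforcers_per_hour(60.0)      # VI 60 s, immediate reinforcement
delayed = reinforcers_per_hour(60.0, 20.0)  # tandem VI 60-s FT 20-s
print(immediate, delayed)                   # 60.0 45.0
print(1 - delayed / immediate)              # 0.25
```

Under these assumptions, adding the 20-s delay removes a quarter of the programmed reinforcers per hour, a change large enough to affect response rate on its own.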
Different solutions to these two problems have been employed. Chung (1965; see also Lattal, 1984) arranged a concurrent VI 1-min VI 1-min schedule in which responding on one of two keys produced scheduled reinforcers after a delay signaled by a blackout. Pecks on the second response key yielded reinforcers immediately after they were programmed. At the same time, according to a VI 1-min schedule, pecks on this same key also produced blackouts of identical duration to those of the signaled delay on the other key. These blackouts were not followed by reinforcement. Thus, reinforcement rates during the immediate and delayed reinforcement conditions were the same. At delay values greater than 1 s, relative response rates were lower on the key correlated with the delay of reinforcement. The procedure, however, converts the schedule on the nondelay key to a multiple VI Extinction schedule, with effects on responding both on that key and on the key associated with the delayed reinforcement. Lattal (1984) found that adding such a blackout to a VI schedule often increased response rates, an instance of behavioral contrast (Catania, 1961; Reynolds, 1961). Sizemore and Lattal (1977, 1978) used as the immediate reinforcement baseline a tandem VT FI schedule in which the values of the VT and FI schedules were equivalent to the values of the subsequent unsignaled delay condition (i.e., a tandem VI FT schedule). Thus, the distribution and rate of reinforcement remain unchanged from the immediate reinforcement baseline condition. This procedure works well for assessing the effects of a single delay value; however, when constructing a delay gradient by comparing different delay values, reinforcement rate still varies as the delays are varied. As a result, variations in reinforcement rate seem an inevitable confound of introducing delays in this manner.
Different approaches to assessing the role of delay relative to changes in rate of reinforcement were used by Shull, Spear, and Bryson (1981, Experiment 2; cf. also Moore, 1979) and Weil (1984). Shull et al. arranged for pigeons' key pecks during the initial link of a chained schedule to produce a terminal link composed of a fixed-duration reinforcement cycle, accompanied by a key color change. During this cycle three food presentations occurred. The first occurred at a fixed time after the choice response was made, as did the third, which was near the end of the reinforcement cycle. Because the time of the second reinforcer was varied across different conditions of the experiment, it was possible to examine the effects of different delays between second-reinforcer onset and the initial-link response while holding overall reinforcement rate constant. Responding varied as a function of these delays.
Weil (1984) employed a related technique to eliminate the potential confound between reinforcement rate changes and delay durations. He used a schedule derived from the work described by Schoenfeld and Cole (1972). Under this arrangement, a repeating time cycle, T, is composed of two time periods. During tD the first response results in reinforcement at the end of the period. During tΔ, responses have no consequence. By holding T constant and varying the placement (before or after tΔ) and duration of tD, Weil generated different obtained delays to reinforcement while holding reinforcement rate constant. For 3 of 4 pigeons, response rates were a decreasing monotonic function of increases in the obtained (as opposed to programmed, which was unrelated to response rate) delay value. This led Weil to conclude that “both obtained delay and reinforcer frequency appear to contribute to response rate, but obtained delay does so to a much greater degree” (p. 154).
Another potential confounding variable when interpreting delay of reinforcement effects occurs when delays are unsignaled. The response is free to occur during unsignaled nonresetting delays, making the obtained delays less than the nominal delays. Dews (1981, p. 216) suggested that results from such a procedure were difficult to interpret because they involve variable delays of reinforcement and the obtained delays are likely to be of brief duration. By implication, this latter point would mean that they do not generate a sufficient range of delay values to allow a functional relation to be established between responding and delay value. Unsignaled delays, however, do yield orderly delay gradients, based on either nominal or averaged obtained delays (Sizemore & Lattal, 1978). Nonresetting delays are indeed variable, rather than fixed. The available evidence, which is not extensive, suggests that fixed and variable delays have different behavioral effects (e.g., Cicerone, 1976; Logan, 1960). When delays are variable, mean delay values also may affect responding differently as a function of the distribution of those delays (e.g., as in the different effects on VI responding of an arithmetic or a constant probability distribution of interreinforcer intervals, cf. Catania & Reynolds, 1968).
For assessing the behavioral effects of delays of reinforcement, Dews (1981) favored resetting delays over either signaled delays in which the opportunity to respond was removed (cf. Pierce et al., 1972), or nonresetting delays. Although the resetting delay keeps the obtained delay value constant, it creates other problems. One is that reinforcement rates are determined by whether responses occur during the delay. This becomes a problem particularly with longer delays, because some responding during the unsignaled delay typically occurs and each response reduces reinforcement rate, which in turn can reduce response rates independently of the effects of the delay.
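How quickly responding during a resetting delay can erode reinforcement rate may be illustrated with a rough calculation. The sketch below rests on a strong assumption that is not part of any cited experiment: responses during the delay are treated as a Poisson process with a constant rate, in which case the expected wait for the first response-free period of length d at rate r is (e^(rd) - 1) / r. The function name and the example rates are hypothetical.

```python
import math


def expected_resetting_wait(response_rate, delay_s):
    """Expected time (s) from the delay-initiating response to the reinforcer
    under a resetting delay, ASSUMING Poisson responding during the delay."""
    if response_rate == 0:
        return delay_s  # no responding: the reinforcer arrives on schedule
    return (math.exp(response_rate * delay_s) - 1.0) / response_rate


# Hypothetical response rates (responses per second) with a 30-s resetting delay.
for rate in (0.0, 0.05, 0.1):
    print(rate, round(expected_resetting_wait(rate, 30.0), 1))
```

Under this assumption the expected wait grows exponentially with the product of response rate and delay duration, which is consistent with the point that the problem is most acute at longer delays.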
The response that produces the reinforcer initiates the delay, at the end of which the reinforcer is provided independently of further responding. In the case of unsignaled delays, this procedure raises the question of whether simply providing the established reinforcers independently of responding would have similar effects on behavior—that is, does the dependency matter in formulating delays of reinforcement? The effects of unsignaled delayed reinforcement therefore have been compared to those of response-independent reinforcers occurring at the same rate and with the same temporal distribution as the delayed reinforcers. The effect depends critically on the duration of the delay. If the programmed delays are relatively brief (e.g., 3 s), delayed reinforcement maintains more responding than does a similar schedule of response-independent food delivery (e.g., Catania & Keller, 1981; Sizemore & Lattal, 1977; Williams, 1976). If the programmed delays are relatively long (30 s), then the differences between the two conditions are more equivocal (Gleeson & Lattal, 1987). This latter finding, however, simply underlines the fact that at some point delays become sufficiently long that their effects are indistinguishable from those resulting from an absence of a response–reinforcer dependency.
Given that an effect of delays unconfounded by procedural variables can be established, the second question is: What are the effects? As a function of circumstances, delays of reinforcement can decrease or increase behavior, or leave it unchanged relative to immediate reinforcement. Furthermore, the same delay value may have different effects as a function of other parameters of both the delay and at least some of the maintaining conditions of reinforcement (but cf. Shahan & Lattal, 2005).
A primary consideration concerning how responding will be affected by delays of reinforcement is the baseline on which the delay is imposed. The most common procedure in the experimental analysis of behavior is to impose a delay of reinforcement after obtaining steady-state responding on some schedule with immediate reinforcement. Under these conditions, delays typically reduce response rates, but, even here, reductions are not invariably the outcome. A less common procedure is to impose delays of reinforcement in the absence of a previously established operant response.
Skinner (1938) was the first to differentiate operant responses in the absence of immediate reinforcement. He observed that lever-press responding of experimentally naive rats was established when reinforcement followed an unsignaled delay of up to 8 s. Lattal and Gleeson (1990) systematically investigated such acquisition. In their first experiment, they magazine trained experimentally naive, food-deprived rats and pigeons. Then, with some pigeons, tandem fixed-ratio (FR) 1 FT 30-s and with other pigeons and rats tandem FR 1 differential-reinforcement-of-other-behavior (DRO) schedules of reinforcement were implemented. Under the former schedule, the first response initiated an unsignaled 30-s delay that terminated with food delivery, regardless of whether further responding occurred during the delay interval. The arrangement was similar in the second schedule, except that every response during the delay (30 s for pigeons; 10 s for rats) reset the delay timer, thereby imposing a 10- or 30-s period of no operant responses immediately before food delivery. Without response shaping or any other form of response training, the response (key pecking by pigeons; lever pressing by rats) developed in most of the animals in periods of time ranging from a few minutes to a few hours. Thereafter, the response was maintained. Subsequent research ruled out the control of responding by such variables as simple exposure to the apparatus (Lattal & Gleeson), evocation of the response by food delivery per se (Lattal & Gleeson; Wilkenfield, Nickel, Blakely, & Poling, 1993), and auditory feedback associated with the response (Critchfield & Lattal, 1993; Schlinger & Blakely, 1994). Similar response differentiation with delayed reinforcement also has been reported using different strains of rats (e.g., Anderson & Elcoro, 2007; Hand, Fox, & Reilly, 2006), Siamese fighting fish (Elcoro, daSilva, & Lattal, 2008; Lattal & Metzger, 1994), and humans (Okouchi, 2009).
The shaping of responding by the differential reinforcement of successive approximations has been suggested to be most likely with immediate differential reinforcement of such approximations to the target or criterion response (Skinner, 1953). Given the preceding findings of response acquisition in the absence of explicit attempts to train the response, it was of interest to attempt to shape responding with delayed reinforcement. The procedure was identical to a conventional shaping procedure, except that the shaping switch was arranged so that when an appropriate approximation to the criterion response occurred, activating the shaping switch initiated a nonresetting unsignaled delay. When reinforcement was either immediate or delayed by 1 s from the to-be-reinforced approximation, shaping of naive pigeons' key peck responses occurred quickly and was completed within a single session of 1 hr or less. When, however, a 10-s delay was initiated by activating the shaping switch, responding was not shaped in 1 pigeon in the course of more than ten 1-hr sessions. Visual observation of the pigeon revealed that during the last training sessions, it continuously oriented to the chamber wall containing the response key and the grain magazine. The lateral distance of its movements appeared to be constrained by the 10-s delay in that when its beak was near the key, the shaping switch was activated. The pigeon then would continue to engage in other behavior during the delay interval, ensuring that there was considerable variability in the responses that occurred just prior to food delivery. The 10-s delay delineated the outer limits of the pigeon's physical distance from the key. A similar pattern occurred with the 1-s delay, only the distance from the key allowed by this delay before reinforcement was considerably less.
Thus, the delay contingency seemed to constrain behavior spatially, but the delay was sufficiently long in the 10-s case that key pecking failed to develop over a time period that was more than adequate to shape key pecking with immediate or even briefly delayed (1-s) reinforcement.
Response differentiation with delayed reinforcement increases operant response rates relative to the baseline condition, which is either zero or near zero. The more common effect associated with delays of reinforcement is response rate reduction; however, under some conditions responding has been observed to be unchanged from immediate reinforcement. Under others, response rates have increased when delays are imposed.
Delays of reinforcement most often are introduced at full value following an immediate reinforcement baseline. Leaving aside for the moment the complexity of interpretation introduced by signaled delays of reinforcement, Ferster (1953) suggested that the effects of signaled delays might be attenuated if the delay is introduced gradually and titrated as a function of the organism's behavior, a technique used with considerable success subsequently by Terrace (1963) to introduce negative discriminative stimuli during discrimination training. Using this titration technique, Ferster maintained responding of pigeons under VI 60-s schedules with blackout-signaled delays of up to 120 s. His data, however, were limited to a sample of cumulative records illustrating terminal performance, and details of the titrating procedure were not reported. Using a related procedure with baboons as subjects, Ferster and Hammer (1965) reported developing sustained responding with a 24-hr signaled delay between responding on one of two keys and delivery of reinforcement following a response on a second key when “the delay in reinforcement was increased slowly, paced with the monkeys' performance” (p. 249). Although responding was sustained, it “differed significantly from [that] usually recorded with immediate reinforcement” (p. 252) and, in a follow-up experiment, introducing an 18-hr delay suddenly as opposed to gradually showed that “performance maintained with long delays to reinforcement does not depend on a gradual, paced increase in the length of the delay” (p. 253).
Imposing brief (around 0.5 s) unsignaled nonresetting delays between a response and the reinforcer often increases, and characteristically changes the structure of, responding relative to that observed with immediate reinforcement. This has been observed when key-peck responses of pigeons were initially maintained by immediate reinforcement on either VI or differential-reinforcement-of-low-rate (DRL) schedules (Arbuckle & Lattal, 1988; Hall, Channell, & Schachtman, 1987; Lattal & Ziegler, 1982; Richards, 1981; Sizemore & Lattal, 1978). Lattal and his colleagues have shown that, regardless of whether response rates increase, there is a consistent increase in the number of short (< 0.5 s) interresponse times. Lattal and Ziegler (1982) suggested that, with brief delays to reinforcement, if the first peck of what is to be a series of pecks initiates a delay, this allows the remaining pecks in the burst to occur, thereby making the reinforcer contiguous with a burst of responses. This same effect occurs if the 0.5-s delays are resetting (Lattal & Ziegler). If, however, that same response starts a delay signaled by a blackout, potential bursts are more likely to be truncated and, as a result, neither IRT distributions change nor do response rates increase as they do with brief unsignaled delays (Lattal & Ziegler; Richards, 1981).
By far, the most typically reported effect of delay of reinforcement imposed following steady-state responding on schedules of immediate reinforcement is a reduction in response rates relative to those observed when reinforcement is immediate. In some cases, however, conclusions about the effects have been compromised because the experiments, particularly some of the early ones in which different parameters of both the delay and the maintaining conditions of reinforcement were manipulated, failed to employ appropriate control procedures as discussed in the previous section. Where the appropriate controls have been incorporated, the effects of the delay depend on parameters of both the delay and the reinforcement schedule. The former parameters include delay duration, whether the delay is fixed or variable (Cicerone, 1976), type of signal (Lattal, 1987), and whether the delay is signaled, unsignaled, or partially signaled (e.g., Richards, 1981; Schaal & Branch, 1988). The latter include the type of schedule (e.g., Gonzales & Newman, 1976; Kendall & Newby, 1978; Morgan, 1972; Richards, 1981) and parameters of reinforcement (e.g., Mazur, 1987). An exception is the finding of Shahan and Lattal (2005) that unsignaled 3-s delays reduced response rates maintained by different rates of reinforcement by the same proportion.
Historically, theoretical accounts of delay of reinforcement have distinguished between a primary and a secondary or derived gradient of reinforcement delay (see Renner, 1964, for a review). Hull (1943), for example, asserted that the derived delay of reinforcement gradient was mediated by conditioned reinforcement and the primary one was not. Spence (1947) initially treated all delays of reinforcement as involving conditioned reinforcement, including their mediation by proprioceptive stimuli, but later modified his position to distinguish two types of delay of reinforcement, a chaining and nonchaining type, each involving different behavioral mechanisms (Spence, 1956).
These historical themes recur in the experimental analysis of delay of reinforcement. Response–reinforcer temporal contiguity was assigned a primary role in the effectiveness of reinforcement by Skinner (1948, 1953; Ferster & Skinner, 1957) and thereafter by others (e.g., Peele, Casey, & Silberberg, 1984; see also Schneider, 1990). A number of subsequent experiments employed unsignaled delays of reinforcement to examine more systematically the functional relations between rate of response and the disruption of response–reinforcer temporal contiguity, in the form of delay of reinforcement, unencumbered by exteroceptive stimuli accompanying the delay (Sizemore & Lattal, 1977, 1978; Williams, 1976).
Ferster (1953) appealed to mediating behavior during the delay to account for his observations of sustained operant responding under delays of reinforcement up to 120 s, where a blackout or other stimulus change accompanied each delay. Eliminating the stimulus accompanying the delay, and thus a source of immediate conditioned reinforcement for operant responses, still led to interpretations in terms of mediating behavior during the delay. Azzi et al. (1965) observed mediating behavior in the form of “some sort of dipper contact, with a regularity similar to that which typifies ratio responding” (p. 161) during their unsignaled delay periods. A comment by Dews (1960), also studying unsignaled resetting and nonresetting delays of reinforcement, underlines the combined influence of contiguity and mediating behavior: “[p]resumably, some behavior occurring during [the] delay will be fortuitously reinforced” (p. 229).
A primary delay of reinforcement gradient based simply on the temporal relation between the reinforcer and the last operant response that produced it, that is, a “pure” effect of delay, isolated from behavior, seems unachievable, if not implausible. Certainly, with delays there is a structural or procedural temporal separation between operant responses and reinforcers that follow, but this does not necessarily correspond to a functional separation between responding and reinforcement. Delays of reinforcement do not dam the behavior stream. They simply rechannel it. Thus, the delay is not a period of behavioral emptiness through which time passes. Rather, as is suggested by the observations quoted in the preceding paragraph, responding invariably occurs during the delay, accompanied by a stimulus change or not, raising the possibility that such behavior contributes to the delay of reinforcement effect. Baum (1973) more broadly questioned the importance of response–reinforcer temporal contiguity in the maintenance of operant behavior. He proposed that delay of reinforcement has its effects because disrupting response–reinforcer temporal contiguity weakens the correlation between responding and reinforcement (cf. also Williams, 1976).
Regardless of the conceptualization of delays as disrupting correlations or actual contiguity, introducing a delay of reinforcement, by definition, relaxes or loosens the response–reinforcer relation. The response–reinforcer dependency constrains and focuses responding (e.g., Staddon, 1979; Timberlake & Allison, 1974; Zeiler & Buchman, 1979) when reinforcement is immediate. Relaxing this constraint in a schedule of reinforcement allows other forms of behavior to intrude, in a manner similar to the intrusion of nonnative species into an otherwise stable ecosystem. Like nonnative species, these new behavioral forms can either compete with the operant behavior or they can complement or even augment it.
How relaxing but not eliminating the response–reinforcer dependency affects response variability is illustrated by the pigeon's behavior during the attempt to shape behavior with delayed reinforcement described in the preceding section. A systematic analysis of such an effect was reported by Escobar and Bruner (2007), who studied response differentiation by experimentally naive rats in a chamber containing an array of seven levers. The center lever was associated with a tandem random-time FT x-s schedule, where x varied from 1–32 s for different groups. Responses on the other six levers were recorded, but had no other effect. At all delay values, responding was highest on the center lever. Responding on the other levers was more confined to the levers adjacent to the center lever at 0-s delay. Although the effect was variable, with 1–8 s delays responding on levers physically more distant from the operative lever tended to increase. With 16-s and 32-s delays, responding to all of the levers was low to zero. Both of these experiments suggest that longer delays allow greater behavioral variation, that is, the opportunity for different response forms to intrude and thereby interact with the extant contingencies.
Schaal, Shahan, Kovera, and Reilly (1998) provided an example of how the intrusion of other behavior can compete with the operant response during a delay of reinforcement procedure. In one experiment, pigeons' key pecking was maintained on a VI schedule. After the interfood interval expired, a peck to a second, food, key was required to activate the food hopper. In different conditions, the food key was or was not illuminated and a feedback sound was or was not produced by responses on the food key during the interfood interval arranged by the VI schedule. Response rates on the VI key were lower when the food key was illuminated and when the feedback sound was present during the interfood interval. Schaal et al. suggested that their results are analogous to what happens during unsignaled nonresetting delays to reinforcement: Rather than other behavior occurring at the time of reinforcement being adventitiously reinforced, hopper-related observing behavior can be explicitly reinforced by hopper access. This intermittent reinforcement of observing in competition with the operant responses thereby reduces operant response rates. These results also support Azzi et al.'s (1965) informal observation of feeder-directed behavior during and before the delay and related observations during signaled delays by Iversen (1981). Nor are they unique to delay of reinforcement. Nevin (1971), for example, exposed pigeons to two-key concurrent fixed-interval (FI) VI schedules. Negatively accelerated responding developed on the VI key as a function of the lapsed time on the concurrently available FI schedule. That is, the concurrently reinforced FI responding competed with the VI responding, changing the rate and pattern of the latter. Related results were reported by Lattal and Abreu-Rodrigues (1997) when FT schedules operated concomitantly with VI schedules.
Several experiments have also reported cases in which the behavior that intrudes when delays of reinforcement are introduced complements or augments operant behavior. As already noted, Lattal and Ziegler (1982) showed that brief delays of reinforcement reorganized temporal patterns of responding and often even increased response rates relative to those occurring with immediate reinforcement. Furthermore, there is considerable evidence that events occurring in proximity to reinforcement in chained and tandem schedules, of which operant delays of reinforcement are an example, do affect responding earlier in those schedules (cf. Lattal & Crawford-Godbey, 1985; Starin, 1987).
Lejeune, Richelle, and Wearden (2006) observed that mediating behavior often occurs in experimental analyses of temporally controlled responding, but they questioned whether it played a necessary role in such circumstances. A similar conclusion seems appropriate in the case of delay of reinforcement in that not all experiments reporting delay of reinforcement effects report systematic patterns of mediating behavior. The experiments in the preceding paragraphs certainly show that relaxing the response–reinforcer relation affects response variability, which in turn affects response rate. Whether such an increase in response variability in all cases constitutes evidence of mediating behavior seems questionable.
Delays of reinforcement are both imposed by contingencies and impose contingencies. Zeiler (1977) distinguished between direct, programmed variables in reinforcement schedules and indirect ones. The latter result from the interaction of the direct variables with behavior. Delay of reinforcement effects are determined in part by such direct variables as the schedule of reinforcement and parameters of the delay itself. Such effects, however, seem equally determined by the dynamic generated when these direct variables interact with behavior. Given the difficulties of isolating temporal contiguity and the ambiguities of mediating behavior, it seems useful to consider delayed reinforcement of operant behavior in functional terms, that is, in terms of the behavioral dynamics that emerge from the interaction of responding with the formal properties of both the maintaining conditions of reinforcement and the characteristics of the delay itself.
Thorndike's law of effect states that “[o]f several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected to the situation, so that, when it recurs, they will be more likely to recur” (1911, p. 244). Left for future generations was the task of exploring the implications of “closely followed by” for the understanding of reinforcement. The research discussed in this Perspective describes some of the empirical and interpretive fruits of Thorndike's bequeathal derived from the experimental analysis of behavior.
Separated from the structural confounds associated with their imposition, delays of reinforcement have diverse behavioral effects that depend on the circumstances of their construction and imposition. All delays involve a relaxing of the response–reinforcer temporal relation and hence, in a structural sense at least, a disruption of temporal contiguity between the response and the reinforcer that follows. Whether this structural change in temporal contiguity translates into functional change is ambiguous. Both correlational and mediational accounts of delay of reinforcement question, in different ways, the primacy of disruptions in temporal contiguity in determining delay of reinforcement effects. The ambiguity about the functional nature of response–reinforcer temporal contiguity, in combination with the varied effects that delays have as a function of the kinds of variables described previously, invites a broader view of delay of reinforcement. To wit, delay of reinforcement seems more usefully viewed as a dynamic behavioral process resulting from the actions of direct and indirect variables on behavior rather than as simply a static parameter of reinforcement.
Thanks to Alicia Roca, Rogelio Escobar, Carlos Cançado, Andrés Garcia Penagos, David Jarmolowicz, Toshikazu Kuroda, and Allison Tetrault for thoughtful discussions and reviews of earlier versions of this article.