An experiment with adult humans investigated the effects of response-contingent money loss (response-cost punishment) on monetary-reinforced responding. A yoked-control procedure was used to separate the effects on responding of the response-cost contingency from the effects of reduced reinforcement density. Eight adults pressed buttons for money on a three-component multiple reinforcement schedule. During baseline, responding in all components produced money gains according to a random-interval 20-s schedule. During punishment conditions, responding during the punishment component conjointly produced money losses according to a random-interval schedule. The value of the response-cost schedule was manipulated across conditions to systematically evaluate the effects on responding of response-cost frequency. Participants were assigned to one of two yoked-control conditions. For participants in the Yoked Punishment group, during punishment conditions money losses were delivered in the yoked component response independently at the same intervals that money losses were produced in the punishment component. For participants in the Yoked Reinforcement group, responding in the yoked component produced the same net earnings as produced in the punishment component. In 6 of 8 participants, contingent response cost selectively decreased response rates in the punishment component and the magnitude of the decrease was directly related to the punishment schedule value. Under punishment conditions, for participants in the Yoked Punishment group response rates in the yoked component also decreased, but the decrease was less than that observed in the punishment component, whereas for participants in the Yoked Reinforcement group response rates in the yoked component remained similar to rates in the no-punishment component. 
These results provide further evidence that contingent response cost functions similarly to noxious punishers in that it appears to suppress responding apart from its effects on reinforcement density.
Response cost, or the response-contingent removal of conditioned reinforcers such as tokens or money (Kazdin, 1972; Weiner, 1962), is widely used by social institutions to reduce or eliminate undesirable behavior. Laboratory studies designed to investigate the effects of response cost on response rates have shown that response cost can function as a punisher in humans (Bennett & Cherek, 1990; Bradshaw, Szabadi, & Bevan, 1977; 1978; Critchfield, Paletz, MacAleese, & Newland, 2003; Crosbie, Williams, Lattal, Anderson, & Brown, 1997; O'Donnell & Crosbie, 1998; Rasmussen & Newland, 2008; Weiner, 1962; 1964a, 1964b, 1964c) and pigeons (Pietras & Hackenberg, 2005; Raiff, Bullock, & Hackenberg, 2008). Applied studies with adults and children have also shown that response-cost is an effective behavioral intervention for reducing problem behavior (see Kazdin, 1972; Pazulinec, Meyerrose, & Sajwaj, 1983; Worsdell, 1998).
Although research has shown that response cost can decrease responding, it is uncertain whether response cost is functionally equivalent to more noxious, unconditioned punishers such as electric shock. Most of what is known about punishment processes has come from laboratory studies with nonhumans using electric shock as the aversive stimulus (see Azrin & Holz, 1966; Baron, 1991), and relatively few studies have sought to systematically replicate those findings with other punishers (see Lerman & Vorndran, 2002). Thus, whether functional relationships obtained with electric-shock punishment generalize to other punishers, including response cost, has not been fully established.
One reason to question the equivalence of response cost and electric-shock punishment is the possibility that response cost may decrease response rates via different behavioral mechanisms than electric shock. In response-cost studies, because the punishing stimulus is the removal of token reinforcers, under punishment conditions there is typically a reduction in net positive reinforcement. Because of this confound, it is difficult to determine whether the decrease in responding under response cost is attributable solely to the response-contingent production of response cost, or whether the reduction in reinforcement density also contributes to its rate-decreasing effects.
A similar question about the role of reduced reinforcement has been raised with respect to timeout punishment. Timeout has been defined as a signaled period in which reinforcement is unavailable (Azrin & Holz, 1966). Research has shown that response-contingent timeout may decrease response rates (e.g., Thomas, 1968; McMillan, 1967). As with response-cost, however, under timeout punishment reinforcement rates typically decrease from no-punishment conditions. The lower net reinforcement under punishment may generate lower response rates. It is therefore difficult to separate the punishing effects of timeout on responding from the effects of reduced reinforcement frequency.
To help determine variables responsible for the rate-decreasing effects of response cost and timeout, several studies have investigated the effects of timeout or response cost on responding while controlling for decreased reinforcement rates. In a study with children, Willoughby (1969) examined timeout punishment using a yoked procedure designed to separate the effects on responding of contingent timeout from those of reduced reinforcement. In the first experiment, responding in one group was reinforced and conjointly punished by timeout. Reinforcement rates in the timeout group were yoked to a second group that received reinforcement only. Responding was initially lower in the timeout group compared to the yoked group, but responding in the timeout group gradually increased. In a second study, preference was investigated between a schedule delivering reinforcement and timeout punishment and a schedule delivering reinforcement only but with equivalent net reinforcement. The reinforcement-only schedule was preferred. These results suggest that timeout functioned as an aversive stimulus independent of its effects on reinforcement frequency.
In a study with pigeons, Pietras and Hackenberg (2005) investigated the role of reduced net reinforcement on responding under response-cost punishment. Pigeons' keypecking was maintained on a two-component multiple schedule of token production. In one component responding produced tokens exchangeable for food, and in another component responding produced tokens and also token losses (response cost). In Experiment 1 keypecking was investigated under several fixed-ratio (FR) punishment schedule values and response rates decreased as a function of response-cost frequency. Because each punisher decreased the net food amount, however, the decrease in response rate could not be unambiguously attributed to the response-cost contingency. Thus, in Experiment 2 responding in the punishment component was compared to responding in a yoked-control component. Two yoked-control component types were investigated: Yoked Complete and Yoked Food. During Yoked Complete conditions, responses in the yoked-control component produced tokens but tokens were removed response independently at the same intervals at which response cost was produced during the punishment component. In Yoked Food conditions, responses in the yoked-control component produced tokens but the maximum number of food deliveries equaled the total number of tokens exchanged for food in the punishment component. It was found that response rates decreased in both the punishment and yoked components, but that response rates decreased more quickly and were usually lower in the punishment component. The results suggested that response cost, like electric shock, had a direct suppressive effect on responding apart from its effects on reinforcement rate. These findings were subsequently replicated in a study by Raiff et al. (2008) that used an across-condition rather than an across-component yoking procedure.
Several studies with humans have also reported that response cost decreased responding in the absence of significant reductions in net reinforcement rates (e.g., Crosbie et al., 1997; O'Donnell & Crosbie, 1998); however, no studies with humans have systematically investigated the role of decreased net reinforcement on responding under response-cost punishment. Therefore, a primary aim of the present research was to investigate the effects of response-cost punishment on human behavior while controlling for reductions in net reinforcement caused by response cost. This was accomplished with a within-subject yoked-control procedure modeled after that used by Pietras and Hackenberg (2005) and Raiff et al. (2008).
In the present study, responding was reinforced according to a three-component multiple schedule. In one component responding was never punished. In a second component, responding was punished by response cost. The third component was a yoked-control component designed to evaluate the effects on responding of the reduced net reinforcement experienced under response cost. As in the Pietras and Hackenberg (2005) study, two yoked-control conditions were investigated: Yoked Punishment and Yoked Reinforcement. In Yoked Punishment conditions, deliveries of response cost in the punishment component were yoked to the third component, but losses were delivered response-independently. In Yoked Reinforcement conditions the net earnings obtained in the punishment component were yoked to the third component. Participants were exposed to one of the two yoking procedures and response rates during the yoked components were compared to response rates during the no-punishment and response-cost components. It was predicted that if response cost had a suppressive effect on responding independent of the decreased reinforcement rate, then responding should be lower in the punishment component than in either yoked component.
Across conditions, the value of the response-cost schedule was manipulated to investigate responding under a range of response-cost frequencies. Punishment studies using electric-shock punishment have typically shown a direct relationship between punishment frequency and the degree of response suppression under FR (Azrin, Holz, & Hake, 1963), variable-ratio (VR) (Iida & Kimura, 2005), fixed-interval (FI) (Appel, 1968; Azrin, 1956), variable-interval (VI) (Azrin, 1956; Ferraro, 1967), and probabilistic-shock schedules (Vogel-Sprott, 1966). Only a few studies, however, have investigated the effects of response-cost frequency on responding (VR schedules: Bennett & Cherek, 1990; VI schedules: Critchfield et al., 2003; probabilistic punishment: Meier, Brigham, Ward, Myers, & Warren, 1996; FR schedules: Pietras & Hackenberg, 2005). These studies have shown that, as with electric-shock punishment, the frequency of response cost is an important determinant of response rate. With the exception of Meier et al. (1996), however, prior response-cost studies have investigated only a few punishment-schedule values. Additional research examining the effects of response-cost frequency on responding would allow better comparisons to nonhuman punishment research, and may be of interest to clinicians and practitioners. In applied research, typically every instance of problem behavior is punished to maximize punishment effectiveness. It is often more practical, however, to punish behavior less frequently (see Lerman & Vorndran, 2002). A few applied studies have investigated whether intermittent punishment will reduce problem behavior, but the results have been mixed (e.g., Lerman, Iwata, Shore, & DeLeon, 1997). Thus, a second aim of the present research was to further investigate how the frequency (i.e., random-interval schedule value) of response cost affects responding in humans.
In summary, the present study sought to examine response-cost punishment in humans to: (a) evaluate the mechanism responsible for the rate-reducing effects of response cost, and (b) systematically investigate the effects of response-cost frequency on responding. The results would provide additional information about the similarity of response-cost and electric-shock punishment.
Eight individuals (3 women and 5 men) completed the study. Two other participants showed a decrease in responding under response-cost conditions but quit the study before completing all experimental conditions. One other participant was dismissed from the study for failing to show a decrease in response rates under response cost, but unlike subsequent participants was dismissed before lower response-cost schedule values were attempted. Data from these 3 participants are therefore not presented. Participants were recruited by flyers posted around the university campus requesting volunteers 18–40 yrs of age for “Behavioral Research” or “Research on Decision Making.” Individuals reporting current illicit drug use or use of psychoactive medications were excluded. Participant information is shown in Table 1. To reduce attrition, during informed consent participants were told that they would be eligible for a completion bonus of $1.00 per session if they completed all scheduled sessions. Participants who quit the study forfeited this bonus. They were also told that earnings could vary day to day and that at the end of the study, if it was determined that their total earnings fell below a $6.00 per hr average, they would be paid an additional amount to bring their net earnings to a $6.00 per hr average. This bonus was required for Participants 50, 51, 63, 87, 99, and 102. Across participants, the average hourly earnings (without bonuses) was approximately $5.52 (± $0.95 SD). All earnings were in US dollars.
Participants were seated alone in one of two identical cubicles measuring 1.7 m wide by 1.3 m deep, with 2.1 m high walls, containing a swivel chair, desk, computer monitor, and response panel (10 cm by 43 cm by 25 cm) containing three push buttons (General Electric® P9CPNVS) labeled right to left “A”, “B”, and “C”. The cubicles were located in a 2.13 m by 3.51 m windowless room. Each cubicle also contained a white noise generator (Marsona TSC-330) to help mask extraneous noise and a camera for real-time observation. Participants were also asked to wear headphones during experimental sessions to reduce extraneous noise and to deliver auditory stimuli. All experimental events and data monitoring were controlled by computers located in another room using Microsoft Visual Basic© software.
All procedures were approved by Western Michigan University's Human Subjects Institutional Review Board. Each session consisted of a three-component multiple schedule of reinforcement. For ease of description, components were designated as no punishment, punishment, and yoked components, although in baseline (see below) no punishers were delivered in any component. Table 2 summarizes experimental contingencies in each component across conditions. Each component was signaled by a different background color on the computer screen (yellow, blue, or purple). The color associated with each component varied across participants. Components were always presented in the following order: no punishment, punishment, no punishment, yoked. Components were 240 s (excluding time for consummatory responses, money deliveries, and money subtractions) and were separated by 30-s intervals during which the screen was black and all stimuli were removed from the screen. At the start of each component, the total counter was shown in the upper middle of the computer screen. The text was in white font (about 0.7 cm in height) and was surrounded by a square black background. At the start of each session, the value of the total counter was set to “$0.00” and the letter “B” (about 2.2 cm in height), also colored white and surrounded by a black square background, appeared in the lower middle of the screen.
During baseline conditions, no punishers were delivered in any component and the contingencies in all three components were identical. Specifically, in all components responses on the button labeled “B” produced money delivery according to a random interval (RI) 20-s schedule. The RI schedule was programmed by sampling a .05 probability gate every second with the restriction that no interval could exceed 100 s. When the RI schedule requirement was met, the letter “B” disappeared from the screen and either the letter “A” appeared on the right side of the computer screen or the letter “C” appeared on the left side with equal probability. A single press on the corresponding “A” or “C” button (the consummatory response) removed the letter from the screen and caused a positive amount of money to be shown on the screen for 1 s (e.g., +$0.05) in black font. The money delivery was accompanied by a 500 ms tone (1500 Hz). The money amount was then added to the total counter and the letter “B” again appeared on the screen. The consummatory response was designed to maintain attending to the computer screen. As a form of response feedback, every response on the “B” button caused the font of the button to turn from white to grey for 25 ms.
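The probability-gate arrangement of the RI schedule can be illustrated with a short Python sketch. This is not the authors' software (the study was run in Visual Basic). The function name `ri_interval` is hypothetical, and the way the 100-s ceiling is enforced here (arming the schedule at the ceiling if the gate has not yet opened) is an assumption, since the text states only that no interval could exceed 100 s.

```python
import random


def ri_interval(p=0.05, max_s=100, rng=random):
    """Return the number of seconds until the schedule arms.

    A gate with probability p is sampled once per second. Assumption:
    if the gate has not opened by max_s seconds, the schedule arms then.
    """
    for second in range(1, max_s + 1):
        if rng.random() < p:  # gate opens on this 1-s sample
            return second
    return max_s  # ceiling reached (assumed handling)


# Example: draw a few programmed intervals for an RI 20-s schedule.
rng = random.Random(1)
intervals = [ri_interval(p=0.05, max_s=100, rng=rng) for _ in range(5)]
```

Once the schedule has armed, the next press on the "B" button would produce the consummatory-response prompt; that step is omitted here.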
In punishment conditions, responding in one component, the no-punishment component, continued to occasionally produce money. Responding in the punishment component occasionally produced money and money losses. When a money loss was scheduled, the next press on the “B” button caused the letter “B” to disappear from the screen and caused a negative amount of money to be shown on the screen for 1 s (e.g., −$0.02) in red font. The money loss was accompanied by a 500 ms tone (1000 Hz). The money amount was then subtracted from the total counter and the letter “B” again appeared on the screen. The total counter was not permitted to go into negative values; thus, money losses only occurred if the total earnings counter was greater than $0.00. If a money loss and money gain were scheduled simultaneously, the outcome presented after the response was determined randomly (p = .5) and the next response produced the other outcome.
In punishment conditions, the events scheduled in the yoked component depended on the group assignment. Participants were assigned to one of two yoked conditions: Yoked Reinforcement or Yoked Punishment. Participants P87, P99, and P100 were assigned to the Yoked Reinforcement group and participants P50, P51, P63, P102, and P103 were assigned to the Yoked Punishment group (P87 was accidentally exposed to one condition of Yoked Punishment before being switched to the Yoked Reinforcement group; see Table 3).
For participants assigned to the Yoked-Punishment group, responding produced money during the yoked component but occasionally response-independent money losses occurred, i.e., losses occurred according to a variable time (VT) schedule. The number and timing of money losses were yoked to the money losses produced in the preceding punishment component. During response-independent money losses the letter “B” was removed from the screen, the RI schedule was disabled, and the amount lost was shown on the screen in red font for 1 s. The money loss was accompanied by a 500-ms tone (1000 Hz).
For participants assigned to the Yoked-Reinforcement group, in punishment conditions responding produced money during the yoked component, but the total amount that could be earned in the yoked component was set (approximately) equal to the net amount earned during the previous punishment component. To calculate the number of money deliveries in the yoked component, net earnings in the punishment component were divided by the gain value (e.g., $0.05) and rounded to the nearest whole number. For example, if the participant earned $0.25 and lost $0.08 during the punishment component, the net earnings, $0.17, were divided by 0.05, yielding 3.4, which was then rounded to yield 3 money deliveries or $0.15. The temporal distribution of money deliveries was programmed according to a Fleshler-Hoffman (1962) distribution; that is, money deliveries were programmed according to a variable-interval (VI) schedule. For this calculation, the rate of reinforcement was determined by dividing the number of programmed money deliveries (e.g., 3) by the component time (240 s), and the number of intervals was set to the number of programmed money deliveries (e.g., 3). If only a single money delivery was scheduled, it was delivered at a randomly determined second within the component. For participants in both Yoked Punishment and Yoked Reinforcement groups, occasionally the duration of the yoked component had to be extended by several seconds beyond its normal duration of 240 s so that all of the scheduled reinforcers or punishers could be produced or delivered, respectively.
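The delivery-count arithmetic and the interval progression can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names, the use of integer cents, the half-up tie rule, and the clamp to zero deliveries when net earnings are not positive are all hypothetical choices. The interval formula is the Fleshler-Hoffman progression with the mean interval set to the component time divided by the number of deliveries.

```python
import math


def yoked_deliveries(earned_cents, lost_cents, gain_cents=5):
    """Number of money deliveries to program in the yoked component.

    Net earnings are divided by the gain value and rounded to the
    nearest whole number (ties rounded up; an assumed tie rule).
    """
    net = earned_cents - lost_cents
    if net <= 0:
        return 0  # assumed handling of zero or negative net earnings
    return (2 * net + gain_cents) // (2 * gain_cents)


def fleshler_hoffman(n_intervals, mean_s):
    """Return n_intervals VI intervals (in seconds) whose mean is mean_s."""
    def xlnx(x):
        return x * math.log(x) if x > 0 else 0.0  # convention: 0 * ln(0) = 0

    big_n = n_intervals
    return [
        mean_s * (1 + math.log(big_n) + xlnx(big_n - n) - xlnx(big_n - n + 1))
        for n in range(1, big_n + 1)
    ]


# Worked example from the text: earned $0.25, lost $0.08, gain value $0.05.
count = yoked_deliveries(25, 8)                 # 17 / 5 = 3.4, rounds to 3
intervals = fleshler_hoffman(count, 240 / count)
```

For the worked example, the three intervals come out at roughly 15, 57, and 168 s and sum to the 240-s component time. The single-delivery case, which the procedure handled by picking a random second, is not covered by this sketch.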
Across punishment conditions, the value of the response-cost schedule was manipulated and was RI 20 s, RI 10 s, or RI 5 s. These schedules were arranged by sampling .05, .10, and .20 probability gates every second, with the restriction that no interval could exceed five times the schedule value. Participants 51, 87, 99, and 102 were also exposed to slightly lower schedule values (RI 2 s, RI 1 s, or FR 1) when the RI 5-s schedule did not decrease responding, and P99 was exposed to a slightly higher schedule value (RI 40 s) when the RI 20-s schedule significantly decreased responding. Table 3 shows the sequence and number of sessions per condition. Typically, baseline conditions were programmed between punishment conditions.
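The gate probabilities are the reciprocals of the schedule values (1/20, 1/10, 1/5), and the ceiling is five times the schedule value. A short Python check, offered as a sketch rather than as part of the original method, suggests that this ceiling shortens the mean programmed interval only slightly. It assumes that an interval reaching the ceiling simply ends there, which the text does not specify; the function name is hypothetical.

```python
def mean_capped_interval(p, cap_s):
    """Mean of min(G, cap_s), where G is geometric on 1, 2, ... with gate probability p.

    E[min(G, c)] = sum_{k=0}^{c-1} (1 - p)**k = (1 - (1 - p)**c) / p
    """
    return (1 - (1 - p) ** cap_s) / p


# Nominal schedule values (s) mapped to the mean programmed interval under the ceiling.
means = {
    value: mean_capped_interval(1 / value, 5 * value)
    for value in (20, 10, 5)
}
```

Under this assumption each mean falls within 1% of its nominal schedule value.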
Because it was often difficult to obtain stability in all three components, priority was given to establishing stability in the punishment component. Stability was assessed using visual inspection. Conditions were changed after a minimum of five sessions and when the rate of responding in the punishment component across three sessions showed minimal session-to-session variability. (Response rates in the final three sessions of each condition are shown in Figure 2.)
The first day of participation, participants were read a set of scripted instructions about the experimental task, and the instructions remained in the chamber throughout the experiment. The instructions were as follows:
You will be able to earn money by working at the response console. The response panel contains three buttons labeled A, B and C. When the session starts, the letter B and a counter will appear on the computer screen. The counter will be at zero. Pushing the B button will cause the letter B to go off the screen, sound a tone, and cause other letters to appear. Pushing the button corresponding to the letter on the screen will add money to the counter. During the session, a tone may sound and money will be subtracted from your earnings. The amount of money shown on the counter at the top of the computer screen is the amount you have earned during the session. Please remain seated. When you see the words “session over” appear on the screen you may return to the waiting area.
Training sessions were designed to establish button pressing, as pilot data indicated that instructions alone failed to establish button pressing under an RI schedule. No punishers were delivered in training. During the first session, responding in all components was reinforced according to a random-ratio (RR) 2 or 3 schedule and component durations were brief (i.e., 30 s to 45 s). Across subsequent sessions, the component (and thus session) duration was gradually lengthened as the reinforcement schedule was increased and changed to RI (e.g., RR 5, RR 10, RR 15, RI 10 s, RI 15 s, and finally RI 20 s). Training was completed in 6–13 sessions (see Table 3).
Participants completed 4–6 sessions per day (see below), 3–4 times per week, at approximately the same hour. Sessions were separated by 5-min breaks. At the end of the study, participants completed several postexperimental questionnaires, were paid their completion bonus, and were debriefed.
Participants 50, 51, and 63 were exposed to eight components per session (two sequences of no punishment, punishment, no punishment, yoked components) and completed four sessions per day. To help reduce fatigue and boredom, subsequent participants were exposed to four components per session (one sequence of no punishment, punishment, no punishment, yoked components) and completed six sessions per day. Thus, sessions were 35 min in duration for Participants 50, 51, and 63, and were 17.5 min in duration for all others. Differences in the number of components a participant experienced per session did not have any discernable effect on responding. Table 1 shows session parameters for all participants. For Participants 50 and 51, the values of the money gain and money loss were equal and were set to $0.04. Because session earnings were low, for all other participants the value of the money gain was increased to $0.05 and the money loss was decreased to $0.02.
Under response-cost conditions, 2 participants (P50 and P102) showed no decrease in responding, even when the response-cost schedule was increased to FR 1 (P50 and P102) or when the response-cost magnitude was increased to $0.10 (P102). Data from these 2 participants were therefore omitted from subsequent analyses. For the remaining 6 participants, Figure 1 shows responding in the punishment and yoked components as a proportion of responding in the same session's no-punishment component across conditions. Mean values for conditions experienced by all participants are also shown. Under baseline, response rates were similar across all three components. Under response-cost conditions, responding typically decreased in the punishment component relative to the no-punishment component and, in most cases, the magnitude of the decrease was directly related to the response-cost schedule value. Participant 99 initially showed no decrease in responding in the punishment component, but response rates decreased when the response-cost magnitude was increased from $0.02 to $0.05 (see Table 3). In most subsequent figures, for P99 only conditions in which the loss magnitude was $0.05 are shown. In several cases response rates during response-cost punishment showed no decrease in the initial exposure to a condition but then decreased substantially in subsequent exposures (i.e., the first exposures to the RI 5-s punishment schedule for P87 and the first $0.05 exposure to RI 10-s punishment schedule for P99). Overall, when averaged across exposures and then participants, the average proportions of punishment to no-punishment response rates during the RI 20-s, RI 10-s, and RI 5-s punishment conditions were .55, .37, and .13, respectively.
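The proportion measure and the two-stage averaging ("across exposures and then participants") can be made concrete with a brief Python sketch. The function names and all numbers below are hypothetical illustrations, not data from the study. How rates from the two no-punishment components of a sequence were combined is not stated in the text and is not modeled here.

```python
from statistics import mean


def proportion(component_rate, no_punishment_rate):
    """Response rate in a component relative to the same session's no-punishment rate."""
    return component_rate / no_punishment_rate


def two_stage_mean(by_participant):
    """Average across exposures within each participant, then across participants."""
    return mean([mean(values) for values in by_participant.values()])


# Hypothetical rates (responses per min); S1 has two exposures, S2 has one.
data = {
    "S1": [proportion(30, 60), proportion(42, 60)],
    "S2": [proportion(12, 60)],
}
overall = two_stage_mean(data)
pooled = mean([p for values in data.values() for p in values])
```

With these made-up values the two-stage mean is 0.4, whereas pooling all exposures gives about 0.47. Averaging within participants first prevents a participant with more exposures from dominating the group mean.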
For 2 of the 3 participants in the Yoked Punishment group (P51 and P103), responding in the yoked component also decreased under response-cost conditions, but the decrease was less than that observed in the punishment component. For P63, responding in the yoked component remained at baseline levels across punishment conditions. For these 3 participants, when data were averaged across exposures and then participants, the average proportions of punishment to no-punishment response rates were .79, .45, and .22, and the average proportions of yoked to no-punishment response rates were .78, .64, and .53 during the RI 20-s, RI 10-s, and RI 5-s punishment schedules, respectively. Although the number and timing of money losses were matched across the punishment and yoked components, when participants in the Yoked Punishment group were asked to describe what happened during experimental sessions on the postexperimental questionnaire, none commented on the similarity.
For the 3 participants in the Yoked Reinforcement group, response rates in the yoked component decreased slightly under punishment conditions (P87) or remained at baseline levels (P99 and P100). For Yoked Reinforcement participants, when data were averaged across exposures and then participants (omitting the erroneous RI 5-s Yoked-Punishment condition for P87), the average proportions of punishment to no-punishment response rates were .31, .29, and .03, and the average proportions of yoked to no-punishment response rates were .93, .91, and .93 during the RI 20-s, RI 10-s, and RI 5-s punishment conditions, respectively. Thus, response rates in the yoked component during punishment conditions tended to be somewhat higher for participants exposed to Yoked Reinforcement compared to participants exposed to Yoked Punishment conditions.
Figure 2 shows absolute response rates in each component for the final three sessions of each condition. Conditions are presented in order of exposure. Absolute response rates under baseline conditions varied across participants from approximately 30 to 300 responses per min. There were no consistent effects of response cost on response rates in the no-punishment component. In some cases there was a slight decrease in responding in the no-punishment component during punishment conditions (e.g., P51 in RI 5-s, P87 in RI 2-s and RI 20-s, and P100 in RI 5-s punishment conditions), whereas in other cases there was a slight increase in responding (P87 in the second RI 5-s and P103 in RI 5-s, 10-s, and RI 20-s punishment conditions).
Because response rates during the unchanged, no-punishment component sometimes increased or decreased when conditions were changed from baseline to punishment, the proportion of responding in the punishment and yoked components to responding in the no-punishment component may have been inflated or deflated, respectively. Thus, response rates in the yoked and punishment components during punishment conditions were also analyzed in relation to response rates in the corresponding component during the preceding baseline condition (i.e., when no response cost occurred in any component). The overall pattern of responding in yoked and punishment components under punishment conditions was very similar to that shown in Figure 1.
Session earnings varied as a function of the punishment-schedule value and response rate during punishment conditions. Across participants, the average number of net money deliveries (gains minus losses) per session was 47.3, 32.5, 26.8, and 26.5 during baseline, RI 20-s, RI 10-s, and RI 5-s response cost conditions, respectively. Participants exposed to Yoked Punishment conditions tended to earn slightly more than participants in Yoked Reinforcement conditions during RI 10 s and RI 5 s response-cost conditions (averaging 28.8 and 29.2 compared to 24.8 and 23.8 money deliveries per session). This likely occurred because, for Yoked Punishment participants, when response rates decreased under response-cost conditions, few money losses were delivered in the yoked component, whereas for Yoked Reinforcement participants, when response rates decreased under response-cost conditions few money deliveries were produced, and thus few money deliveries were programmed in the yoked component.
As described above, the primary goal of the yoking procedure was to equate net earnings across punishment and yoked components. Figure 3 shows net reinforcement rates (obtained cents per min) in each component across conditions. Obtained cents per min were calculated by summing the total number of cents earned and subtracting the total number of cents lost. Net earnings were averaged across all sessions of each condition (instead of only the final three) because it was assumed that outcomes experienced across the entire condition would influence steady-state response rates. Figure 3 shows that during baseline, net earnings were similar across all components. During response-cost conditions, in the no-punishment component net earnings remained comparable to earnings during baseline. For participants exposed to Yoked Punishment, during response-cost conditions net earnings in the punishment and yoked components were often similar, but net earnings tended to be slightly lower in the punishment component. Earnings were lower in the punishment component because response rates in that component were lower than in the yoked component, and thus fewer money deliveries were produced (the number of money losses across punishment and yoked components were identical). For participants exposed to Yoked Reinforcement, during response-cost conditions net earnings in the punishment and yoked components were more comparable. The main exceptions were earnings in two response-cost conditions for P99. For this participant, response rates did not decrease during the first RI 10-s condition, and decreased only gradually in the second FR 1 condition. As a result, net earnings in the punishment components were negative and, consequently, few reinforcers were programmed in the yoked component. 
Overall, this figure shows that the yoking procedure was mostly successful in equating reinforcement rates across the yoked and punishment components, but that differences in response rates in the punishment and yoked components created occasional discrepancies in the Yoked Punishment group.
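The net-earnings measure described above can be illustrated with a minimal sketch. It assumes each session record for a component holds cents gained, cents lost, and minutes spent in that component; this record layout, the function name, and the example values are hypothetical and are not taken from the study. Per-session rates are averaged here, which is one plausible reading of "averaged across all sessions."

```python
def net_cents_per_min(sessions):
    """Mean net earning rate for one component across sessions.

    `sessions` is a list of (cents_gained, cents_lost, minutes) tuples;
    this record layout is an assumption made for illustration.
    """
    if not sessions:
        raise ValueError("at least one session is required")
    # Net rate per session: (gains minus losses) divided by time in the component.
    rates = [(gained - lost) / minutes for gained, lost, minutes in sessions]
    # Average over every session of the condition, not only the final three.
    return sum(rates) / len(rates)


# Hypothetical values, not data from the study.
example_sessions = [(60, 20, 4.0), (50, 30, 4.0), (40, 40, 4.0)]
print(net_cents_per_min(example_sessions))
```

A negative result is possible under this measure whenever losses exceed gains, as was reported for P99.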
Reductions in response rate under response-cost conditions could decrease rates of reinforcement. Because response cost involves the removal of positive reinforcers, however, decreases in response rate, by reducing response-cost frequency, could also increase positive reinforcement. Thus, rate of reinforcement was analyzed in relation to rate of responding. Figure 4 shows obtained cents per min (gains minus losses) per session plotted as a function of responses per min per session across conditions. Data are from all participants, from all sessions in each condition. Regression lines are included to show trends. There was little relationship between rate of responding and earnings under baseline or RI 20-s response-cost conditions. There was also no relationship between rate of responding and earnings under the RI 10-s condition except for P99; all of the data points below -5 cents per min are from this participant. For P99 (as well as P51), the response-cost magnitude equaled the reinforcer magnitude (see Table 1), and P99 responded at a relatively high rate under the first RI 10-s response-cost condition. As the dashed regression line in Figure 4 shows, without P99 the relationship between response rate and earnings under RI 10-s response cost is much weaker. Only under the RI 5-s response-cost condition was there a reliable negative relationship between response rates and earnings. Thus, lower response rates were not correlated with lower earnings, and in two of the three punishment conditions, low response rates were not correlated with higher earnings.
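The comparison of regression lines with and without one participant can be sketched as follows. This is an illustrative ordinary least-squares fit, not the authors' analysis code; the row layout (participant label, responses per min, cents per min), the labels, and the numbers are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_excluding(rows, excluded=None):
    """Fit cents per min on responses per min, optionally dropping one participant."""
    points = [(rate, cents) for pid, rate, cents in rows if pid != excluded]
    return fit_line(points)


# Hypothetical rows: (participant label, responses per min, cents per min).
rows = [
    ("P_a", 100.0, 1.0),
    ("P_a", 200.0, 1.0),
    ("P_b", 150.0, 1.0),
    ("P_b", 300.0, -7.0),  # one high-rate, low-earning session
]
slope_all, _ = fit_excluding(rows)
slope_without, intercept_without = fit_excluding(rows, excluded="P_b")
print(slope_all, slope_without)
```

In this toy data set a single high-rate, low-earning session produces the negative slope, which disappears when that participant's sessions are excluded.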
The latency to make the first response in each component was collected to evaluate stimulus control by component stimuli. These are shown in Figure 5. Latencies to make the first response in the first no-punishment component were omitted because they were the first response of the session. For each participant, data were averaged across the final three sessions of all exposures to a condition. Response latencies in the yoked component were usually similar to latencies in the no-punishment component. Response latencies were often longer in the punishment component than in the no-punishment and yoked components, but the effect was inconsistent. Short latencies in the punishment component sometimes occurred although response rates were lower than rates in the no-punishment component. An analysis of response patterning revealed that in some of these cases (e.g., RI 20-s for P103, see Figure 6 below), participants responded at a high rate until the first money loss occurred, and then response rates decreased.
To evaluate within-component response patterns, cumulative records of response rates were constructed from session data. Figure 6 shows sample cumulative graphs from P100 and P103 during final sessions of Baseline, RI 5-s, RI 10-s, and RI 20-s response-cost conditions. In these 2 participants, responding under response-cost conditions varied as a direct function of the response-cost schedule value. Under baseline conditions, responding was relatively steady and uniform across components. Under response-cost conditions, response rates decreased and response patterns became slightly more irregular in the punishment component. There were occasional pauses after money losses, but the decrease in responding under response cost appeared to be mainly the result of decreased run rates. Consistent with Figure 1 above, for P103, who was exposed to Yoked Punishment conditions, response rates in both the punishment and yoked components were lower than response rates in the no-punishment component, but slopes were shallower in the punishment component. For P100, who was exposed to Yoked Reinforcement conditions, response rates were also lower in the punishment component than in the yoked and no-punishment components, and response rates and patterns in the yoked component resembled those in the no-punishment component.
Figure 7 shows representative cumulative graphs from the remaining four participants under RI 10-s response-cost conditions. The patterns of responding across components in these 4 participants resembled those of P100 and P103, although P63 showed no decrease in responding in the punishment component in this condition. As described above, both P51 and P63 experienced eight components per session rather than four. There were few differences between the first and second sequence of component presentations for these 2 participants, although for P63 under RI 5-s punishment conditions (not shown) the decrease in responding in the punishment component occurred primarily in the second exposure.
Button pressing in 8 adult humans was investigated under a three-component multiple schedule of monetary reinforcement. Under response-cost conditions, in one component responses produced both money gains and losses. Six of eight participants showed a decrease in responding in the punishment component under response-cost conditions. The decrease in responding is consistent with prior studies with humans that have demonstrated that money loss is an effective punisher (e.g., Critchfield et al., 2003; Weiner, 1962). A few participants showed little sensitivity to response-cost punishment. This finding is not uncommon; several other studies with humans have also reported variability across participants in the sensitivity of behavior to electric shock (e.g., Scobie & Kaufman, 1969) and response-cost punishment (e.g., Crosbie et al., 1997; O'Donnell & Crosbie, 1998), with some participants showing no suppression or requiring more intense or frequent punishment to suppress responding.
Several variables may have produced the insensitivity to response cost shown by some participants. For instance, the relatively rich schedule of reinforcement may have contributed to the insensitivity. Bradshaw et al. (1978) showed, for example, that responding in humans under VI schedules of reinforcement was less sensitive to a VI schedule of response cost when the reinforcement frequency was high (e.g., VI 8 s) than when it was low (e.g., VI 720 s). Alternatively, some participants may have continued to respond during response cost because of strong instructional control. All participants were told at the beginning of the study to press buttons to earn money. Experimenter demand may therefore have produced a persistence in responding despite the loss in earnings (cf. Hackenberg & Joker, 1994). Participants were also told that if earnings fell below a $6 per hr average they would be compensated for the difference at the end of the experiment. Although participants were paid in cash following each day of participation, it is possible that the delayed payment may have reduced the aversiveness of immediate money losses. That responding in most participants decreased under response cost suggests, however, that if the delayed payment influenced performance, its effects were inconsistent.
Finally, it is possible that the particular sequence of exposure to response-cost schedules may have contributed to the insensitivity to response cost shown by some participants. For several participants, the first response-cost schedule that was experienced was a relatively high schedule value (i.e., punishers were relatively infrequent). This initial exposure to infrequent punishment may have reduced sensitivity to more frequent punishment. For example, for P50 the first response-cost condition was RI 20 s. When responding showed no decrease, the schedule value was decreased to RI 5 s, and then to FR 1, with no effect. Similarly, for P99 the first response-cost condition was RI 10 s. This participant showed no punishment effect, and the schedule value was gradually reduced. No decrease in responding was observed until the response-cost magnitude was increased. Prior studies with nonhumans using electric-shock punishment have reported similar sequence effects (e.g., Azrin et al., 1963; Banks, 1966; Banks & Torney, 1969). Not all participants showed this sequence effect, however. Participant 102, for example, was initially exposed to a relatively dense response-cost schedule (RI 5 s) and showed only minimal reductions in responding under this and lower schedule values. More research is therefore needed to explore sequence effects with response-cost punishment.
One goal of the present study was to investigate the effects on responding of the response-cost schedule value. For the 6 participants who showed a decrease in responding under response-cost conditions, response rates typically varied as a function of the RI punishment schedule, with lower punishment schedule values producing lower response rates. This finding is consistent with a prior study that has shown that responding by humans was sensitive to response-cost frequency under interval punishment schedules (Critchfield et al., 2003). These results are also consistent with nonhuman studies using electric-shock punishment that have shown that the degree of response suppression under interval punishment schedules varies as a direct function of punishment frequency (e.g., Ferraro, 1967). It should also be noted that for many participants response rates under response cost decreased to low levels despite the fact that responding was only intermittently punished.
Cumulative records showed that in the punishment component during the RI schedules of response cost, response patterns were somewhat more irregular than in the no-punishment component. Except when responding was completely eliminated, however, the reduction in responding was better characterized by an overall decrease in response rate than by an increase in pausing (see Figures 6–7). This pattern of responding resembled that shown by nonhumans (Azrin et al., 1963) under FR schedules and in humans (Scobie & Kaufman, 1969) under VI schedules of electric-shock punishment, and that shown by nonhumans under RI schedules of response cost (Pietras & Hackenberg, 2005; Raiff et al., 2008). Therefore, these data further show that response-cost and electric-shock punishment have comparable effects on response patterns. Additional parametric manipulations are needed to investigate response patterns under other response-cost schedules (e.g., fixed interval, fixed ratio).
The primary goal of the present study was to determine whether decreased net reinforcement rates under response-cost punishment contribute to the reduction in response rates typically observed under response cost. To accomplish this, a yoked-control component was programmed in which the net earnings equaled the net earnings in the punishment component. Two yoking procedures were investigated: Yoked Punishment and Yoked Reinforcement. Under Yoked Punishment conditions, net earnings were equated across components by delivering the same number of punishers obtained in the response-cost component in the yoked component, but delivering them response-independently. Under Yoked Reinforcement conditions, net earnings were equated across components by making the number of reinforcers available in the yoked component equal to the net earnings (gains minus losses) obtained under the response-cost component. Participants were exposed to one of the two yoking conditions. The yoking procedure was in most cases successful in equating net reinforcement rates across components. Under both yoking procedures, response rates were typically lower in the punishment component than in the yoked component (see Figure 1). This suggests that the contingency between responding and response-cost punishment had a suppressive effect on responding independent of reduced net earnings. This finding replicates prior response-cost studies with nonhumans (Pietras & Hackenberg, 2005; Raiff et al., 2008) and provides additional evidence that response-cost and electric-shock punishment may suppress behavior via comparable behavioral mechanisms.
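The logic of the two yoking procedures can be made concrete with a brief sketch. It is a simplified illustration under stated assumptions rather than the experimental software: the function names are hypothetical, counts of money deliveries stand in for scheduled events, and flooring the Yoked Reinforcement quota at zero when net earnings are negative is an assumption that the text does not specify.

```python
def yoked_punishment_net(yoked_gains, punishment_losses):
    """Net earnings in the yoked component under Yoked Punishment.

    Losses are copied from the punishment component and delivered
    response-independently; gains still depend on responding in the
    yoked component, so net earnings can differ when gains differ.
    """
    return yoked_gains - punishment_losses


def yoked_reinforcement_quota(punishment_gains, punishment_losses):
    """Number of reinforcers made available in the yoked component
    under Yoked Reinforcement (no losses are scheduled there).

    Assumption: a negative net is floored at zero reinforcers.
    """
    return max(0, punishment_gains - punishment_losses)


# Hypothetical counts from one punishment component: 10 gains, 4 losses.
punishment_net = 10 - 4
print(punishment_net)                          # net in the punishment component
print(yoked_punishment_net(12, 4))             # yoked net if 12 gains were produced
print(yoked_reinforcement_quota(10, 4))        # reinforcers programmed in the yoked component
```

The sketch shows why Yoked Punishment equates losses exactly but equates net earnings only approximately, whereas Yoked Reinforcement targets the net amount directly.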
The present results also correspond to the findings of a study by Rasmussen and Newland (2008) designed to investigate the symmetry of reinforcers (money gains) and punishers (money losses). In that study, participants were presented with choices between concurrent VI schedules of money gains under gain-only conditions and conditions in which a schedule of money loss (of equivalent magnitude) was superimposed on one of the options. A matching-law analysis revealed considerable bias towards the unpunished alternative. For example, when the net reinforcement on both options was equated, there was a strong preference for the option without punishment. Their data therefore suggest that the punishing effects of a money loss exceed the reinforcing effects of an equivalent money gain. In the present study, response rates under punishment conditions were lower in the punishment component than in the yoked component despite equivalent net reinforcement rates. These data also suggest, then, that losses had a greater effect on behavior than gains.
Compared to responding during the no-punishment component, response rates during the yoked component were lower for participants exposed to Yoked Punishment conditions than for participants exposed to Yoked Reinforcement conditions. Several aspects of the yoked-control conditions may have contributed to this effect. Under Yoked Punishment conditions, money losses were delivered independently of responding in the yoked component. Although response rates were typically higher in the yoked component than in the punishment component, response rates were lower than those obtained in the no-punishment component. One possible explanation for the lower response rates in the yoked component is that reinforcement rates were lower. The possibility that lower reinforcement rates were solely responsible for the low response rates seems unlikely, however, given that response rates in participants in the Yoked Reinforcement group showed little decrease in the yoked component although reinforcement rates were also low. It seems more probable that the reduction in response rates under Yoked Punishment was the result of adventitious punishment. Money losses were delivered according to a VT schedule in the yoked component, and occasionally, money losses may have immediately followed responses. In support of this, an analysis of the final session of the last exposure to RI 5-s punishment for participants exposed to Yoked Punishment revealed that the obtained delay between a response and money loss in the yoked component was less than 1 s for 65% to 100% of response-cost presentations.
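An analysis of obtained response-loss delays of this kind could be computed along the following lines. This is a minimal sketch that assumes response and money-loss timestamps (in seconds from component onset) are available as lists; the function name, the data layout, and the example times are hypothetical.

```python
from bisect import bisect_right


def fraction_losses_within(response_times, loss_times, window_s=1.0):
    """Fraction of money losses preceded by a response less than `window_s` earlier."""
    if not loss_times:
        raise ValueError("no money losses to analyze")
    responses = sorted(response_times)
    close = 0
    for loss in loss_times:
        # Index of the first response later than the loss; the response
        # just before that index is the most recent one at or before the loss.
        i = bisect_right(responses, loss)
        if i > 0 and loss - responses[i - 1] < window_s:
            close += 1
    return close / len(loss_times)


# Hypothetical timestamps in seconds, not data from the study.
responses = [0.5, 2.0, 7.0]
losses = [0.9, 4.0, 7.2, 7.9]
print(fraction_losses_within(responses, losses))
```

Losses that occur before any response are counted as not closely following a response, which is a design choice of this sketch.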
The finding that a VT response-cost schedule may decrease responding is consistent with studies with both nonhumans (e.g., Azrin, 1956) and humans (e.g., Vogel-Sprott & Burrows, 1969) reporting that response-independent electric shock had a suppressive effect on responding (although the effect was not as pronounced as contingent shock), an effect attributed to chance pairings between responses and punisher presentations. Several other studies with humans (Poetter & Lewis, 1972) and nonhumans (Branch, Nicholson, & Dworkin, 1977; Schuster & Rachlin, 1968), however, have reported little suppression in responding by response-independent punishment. Procedural differences (such as the frequency of response-independent punishment) and differences in baseline response rates may account for the different effects. Pietras and Hackenberg (2005) also found greater reduction in response rates under Yoked Complete conditions, in which tokens were removed response-independently, than under Yoked Food conditions in which there were no token losses and net food amounts were equated across punishment and yoked components, but a systematic replication by Raiff et al. (2008) found an opposite effect. Differences in yoking methods between those studies (within versus across conditions), as well as differences in the token-reinforcement schedule between those two studies and the present study, make direct comparisons difficult. More research is needed to explore variables that determine whether response-independent punishment will decrease responding.
Under Yoked Reinforcement conditions no money losses were scheduled, but the number of reinforcers delivered in the yoked component was programmed to equal the net amount earned during the punishment component. Because response rates typically decreased under punishment, the number of reinforcers delivered during the yoked component was often very low (see Figure 3). Despite this, response rates in the yoked component were often similar to no-punishment values. It is unclear why response rates remained high in the yoked component given the low reinforcement rates (and even extinction conditions) experienced by some participants. Possibly, the intermittent reinforcement experienced in the yoked component by several participants may have been sufficient to maintain responding. Alternatively, participants may not have been exposed to conditions long enough for responding to decrease or extinguish in the yoked component. Punishment conditions were typically alternated with baseline conditions. Responding was therefore frequently reinforced in the yoked component. This occasional reinforcement of responding in the yoked component may have contributed to the persistent responding.
The present procedure included yoked-control conditions to control for the reduction in net reinforcement rates under response cost. There is another confound, however, that is often present in response-cost procedures: During response cost, reductions in response rates often reduce the punishment frequency and increase reinforcement rates. Reductions in responding under response cost may therefore be attributed to increased reinforcement rate (positive reinforcement) rather than to punishment. A similar confound has been noted in timeout punishment studies (see Coughlin, 1972; Leitenberg, 1965). In the present procedure, decreases in responding under response cost, by reducing the number of punishers, could bring earnings closer to baseline reinforcement rates under the RI 5-s condition, but had little effect on earnings under the RI 10-s condition (except for P99 when the punisher magnitude was increased) or RI 20-s punishment condition (see Figure 4). As Figure 1 shows, response rates often decreased under RI 10-s and RI 20-s response-cost conditions. Thus, the reduction in response rates cannot be accounted for solely in terms of increased reinforcement rates. This finding corresponds to results of several other response-cost studies that have also reported response suppression without increased positive reinforcement (e.g., Trenholme & Baron, 1975).
Because responding was maintained on a multiple reinforcement schedule, it was possible to examine whether discriminative-stimulus control by punishment-correlated stimuli (the background color of the computer screen) developed. Prior studies investigating stimulus control by punishment-correlated stimuli have produced mixed results, with some showing (e.g., Doughty, Anderson, Doughty, Williams, & Saunders, 2007; Honig & Slivka, 1964; O'Donnell, Crosbie, Williams, & Saunders, 2000) and some failing to show suppression in the presence of stimuli signaling punishment (e.g., O'Donnell & Crosbie, 1998; Weisman, 1975). Only one study has shown stimulus control by stimuli correlated with response cost in humans (O'Donnell et al., 2000). In that study, punisher deliveries were delayed until the end of experimental sessions to prevent discriminative control by the punisher itself. In the present study, response latencies were examined to determine whether latencies were longer in the punishment component than in the no-punishment component. It was necessary to examine latencies to make the first response to evaluate the discriminative control by the background color separate from the discriminative control of response cost (see Doughty et al., 2007). Stimulus control by component stimuli was unreliable. In some conditions, response latencies were clearly longer in the punishment component than in the no-punishment component, but in other conditions, as in prior studies (e.g., O'Donnell & Crosbie, 1998), the delivery of the money loss rather than the component stimuli appeared to function as a discriminative stimulus for punishment. In the present study, the lack of consistent stimulus control by component stimuli might have been caused in part by the frequent alternation between baseline and punishment conditions. 
Under such changing contingencies, the response-cost contingency may have been a more reliable discriminative stimulus than the background color (see Weisman, 1975). It is also possible that in some cases the exposure to punishment conditions was too brief for stimulus control to develop. For example, for P100 stimulus control by component stimuli was observed under RI 10-s but not RI 5-s response-cost conditions, but the RI 10-s condition was in effect for 15 sessions (first exposure) and 12 sessions (second exposure), whereas the RI 5-s response cost was in effect for only 7 sessions during both exposures. Doughty et al. (2007) found that for 2 participants stimulus control developed after nine and fifteen 10-min sessions, but for a 3rd participant stimulus control developed only after extended training with reduced component durations. Thus, in the present study, a longer exposure to punishment conditions may have been needed for more robust stimulus control to develop.
Research with nonhumans has shown that when responding is maintained under a multiple reinforcement schedule, punishing responses in one component can affect responding in an unchanged component, either by increasing (contrast) or decreasing (induction) unpunished response rates (e.g., Brethower & Reynolds, 1962; Crosbie et al., 1997; Raiff et al., 2008). The variables that determine whether unpunished behavior changes, and the direction of the change, are uncertain (see Crosbie et al., 1997). Few studies with humans have investigated punishment contrast, but several studies using response-cost punishment have found contrast or, more often, induction in an unchanged component of a multiple reinforcement schedule (Crosbie et al., 1997; Emmendorfer & Crosbie, 1999; O'Donnell & Crosbie, 1998). In the present study, responding in the unpunished component typically did not change during punishment conditions, but occasionally responding showed contrast or induction. Of the two, induction was the more common outcome, but there was no obvious pattern to the effect. It is also interesting to note that although reinforcement rates in the punishment and yoked components decreased under low punishment schedule values, there was no consistent positive contrast in the no-punishment component (cf. Reynolds, 1961).
In summary, the present results show that response-cost punishment can decrease responding in humans apart from its effects on reinforcement density, and that the magnitude of the decrease varies as a function of response-cost frequency. It appears that, as with electric-shock punishment, the contingency between responding and the presentation of response cost is the primary mechanism by which response cost reduces responding. These findings therefore provide additional evidence for the functional equivalence of response-cost and electric-shock punishment.
This research was supported in part by a Faculty Research and Creative Activities award from Western Michigan University. Portions of these data were presented at the 2007 Association for Behavior Analysis International convention, San Diego, CA, and at the 2006 Mid-American Association for Behavior Analysis conference, Carbondale, IL. The authors thank Adam E. Fox and J. Adam Bennett for their assistance with this project.