Learning to discriminate stimuli can alter how one distinguishes related stimuli. For instance, training an individual to differentiate between two stimuli along a single dimension can alter how that individual generalizes learned responses. In this study, we examined the persistence of shifts in generalization gradients after training with sounds. University students were trained to differentiate two sounds that varied along a complex acoustic dimension. Students subsequently were tested on their ability to recognize a sound they experienced during training when it was presented among several novel sounds varying along this same dimension. Peak shift was observed in Experiment 1 when generalization tests immediately followed training, and in Experiment 2 when tests were delayed by 24 hours. These findings further support the universality of generalization processes across species, modalities, and levels of stimulus complexity. They also raise new questions about the mechanisms underlying learning-related shifts in generalization gradients.
Generalization is the tendency for organisms to differentially judge novel stimuli as predictive of a learned consequence based on their similarity to stimuli experienced during learning (Shepard, 1987). The phenomenon is often studied experimentally by training a subject/participant to respond to one stimulus (S+). Generalization is then measured by removing feedback, presenting novel stimuli, and measuring responses to stimuli that are similar to the S+. In a classic example, pigeons trained to respond to a key illuminated with a 580 nm light also responded to other wavelengths when reinforcement for responses was absent; the number of responses to newly presented wavelengths lessened with decreasing similarity to the conditioned stimulus (Guttman & Kalish, 1956). This phenomenon is seen in animals ranging from invertebrates (Cheng, 1999, 2000, 2002) to humans (reviewed by Thomas, 1993) with various testing strategies and dimensions (Shepard, 1987). Because of its ubiquity, generalization was described by Pavlov (1927) as being a fundamental associative process and by Shepard (1987) as psychology’s first law. The universality of generalization likely reflects the fact that an organism’s ability to react appropriately to new environmental conditions often depends on its capacity to predict possible outcomes based on prior experience (Ghirlanda & Enquist, 2003; Shepard, 1987).
Despite the pervasiveness and importance of generalization across species, experimental studies of this phenomenon in humans and nonhuman animals have proceeded somewhat independently. Processes postulated to underlie generalization by humans, such as identifying relational cues or rules (Ahn & Medin, 1992; Doll & Thomas, 1967; Helson, 1964; Imai & Garner, 1965; Shepard, Hovland, & Jenkins, 1961; Thomas, 1993; Thomas & Bistey, 1964; Verbeek, Spetch, Cheng, & Clifford, 2006; Wattenmaker, 1992), using abstract strategies (Gunzelmann, 2008; Rodrigues & Murre, 2007), and using language (Fagot, Goldstein, Davidoff, & Pickering, 2006; Purtle, 1973; Shepard et al., 1961), are often difficult or impossible to investigate across species. Studies of nonhuman animals instead have focused on measuring the form of generalization gradients, context effects, dimension variety, and training procedures (reviewed by Purtle, 1973; Honig & Urcuioli, 1981; Ghirlanda & Enquist, 2003). There are some comparable data sets from human participants (Cheng, Spetch, & Johnston, 1997; Doll & Thomas, 1967; Spetch & Cheng, 1998; Thomas & Bistey, 1964; Thomas & Mitchell, 1962), but they are relatively rare. Although differences in the processes underlying generalization gradients in humans and nonhumans undoubtedly exist, many fundamental mechanisms are likely shared (Cheng, 2002; Ghirlanda & Enquist, 1999, 2003; Shepard, 1987).
A basic process in all animals that strongly impacts generalization is learning. For example, Jenkins and Harrison (1960) found that conditioning pigeons to respond to a particular pitch led to narrower generalization gradients. Another effect of learning on generalization, described as the peak shift effect (Hanson, 1959; Purtle, 1973; Spence, 1937), typically occurs after intradimensional discrimination training in which one stimulus is reinforced (S+) and another stimulus is not (S−). When generalization tests are given after such training, subjects respond most not to S+, as one might expect, but to one of the unconditioned test stimuli. These shifts are often accompanied by a corresponding shift in gradient mean and overall proportion of responding (Bizo & McMahon, 2007; Newlin, Rodgers, & Thomas, 1979; Thomas & Bistey, 1964; Thomas, Mood, Morrison, & Wiertelak, 1991). In a classic example, pigeons trained on a wavelength discrimination between 550 nm light (S+) and 560 nm light (S−) responded most to shorter wavelengths such as 540 nm in a generalization test (Hanson, 1959). Such learning-related gradient shifts have been observed in bees (Lynn, Cnaani, & Papaj, 2005), horses (Dougherty & Lewis, 1991), rats, goldfish, guinea pigs, chickens, pigeons, and humans (reviewed by Purtle, 1973), and with various methods such as category learning (McLaren, Bennett, Guttman-Nahir, Kim, & Mackintosh, 1995; McLaren & Mackintosh, 2002; Wills & Mackintosh, 1998), varying distributions of test stimuli (Bizo & McMahon, 2007; Helson & Avant, 1967; Spetch, Cheng, & Clifford, 2004; Thomas & Bistey, 1964), and training with multiple S+ stimuli (Galizio & Baron, 1979; White & Thomas, 1979). The effect occurs for stimuli varying in brightness (Newlin et al., 1979; Thomas, Ost, & Thomas, 1960; White & Thomas, 1979; Thomas, Mood, Morrison, & Wiertelak, 1991), facial characteristics (Lewis & Johnston, 1999; McLaren & Mackintosh, 2002; Spetch et al., 2004), spatial location (Cheng & Spetch, 2002), pitch (Galizio, 1985; Galizio & Baron, 1979), line tilt (Winton & Beale, 1971; Spetch et al., 2004), floor tilt (Riccio, Urda, & Thomas, 1966; Thomas & Burr, 1969; Thomas & Lyons, 1968), numerosity (Honig & Stewart, 1993), and motor movement (Dickinson & Hedges, 1986). As with other generalization phenomena, the processes postulated to give rise to the peak shift effect have been different for humans and nonhumans, despite its apparent universality.
In nonhumans, Spence’s (1937) theory of gradient summation has been the prevailing explanation. Spence proposed that after training, an excitatory gradient surrounds the S+ stimulus and an inhibitory gradient surrounds the S−. The peak shift is proposed to result from the summation of these gradients. Specifically, although the S+ is located at the peak of the excitatory gradient, this peak overlaps with part of the inhibitory gradient. As a result, a location displaced from S+ that still has part of the excitatory gradient, but less of the inhibitory gradient, ends up with a higher total excitatory value. For humans, a more commonly cited explanation is the adaptation level account. According to adaptation level theory, people develop a psychological average of the presented stimuli during discrimination training, called an adaptation level (Thomas, 1993). Incoming exemplars are then judged based on this learned prototype, and participants respond to “the prototype + x units” during generalization tests. When the range and distribution of stimuli change from training to generalization tests, adaptation level also changes, but the rule for responding remains the same. Consequently, use of the rule learned during training produces peak responding to a novel stimulus. Both of these explanations fail to explain some instances of peak shift (Bizo & McMahon, 2007; Cheng & Spetch, 2002; Doll & Thomas, 1964; Hanson, 1959; Lazareva, Wasserman, & Young, 2005; McLaren et al., 1995; McLaren & Mackintosh, 2002; Thomas, 1993; Thomas & Bistey, 1964; Thomas et al., 1991), but they have been able to explain other aspects of generalization (Helson, 1964; Spence, 1937; Thomas, 1993). For this reason, it is worthwhile to study peak shift in ways that control for the effects of inhibitory gradients and stimulus range. By taking this approach one limits the possibility that aspects of Spence’s theory (1937) or adaptation level (Helson, 1964; Thomas, 1993) may mask alternative mechanisms underlying shift that are common across species.
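The gradient summation idea described above can be illustrated with a small numerical sketch. The Gaussian gradient shape, the width, and the inhibition weight below are illustrative assumptions, not values taken from Spence (1937) or from the present study; the stimulus positions are arbitrary ranks, with S+ at rank 5 and S− at rank 4.

```python
import math

RANKS = range(1, 9)  # eight hypothetical stimulus positions along one dimension

def gaussian(x, center, width):
    """Bell-shaped gradient centered on a trained stimulus (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def net_strength(x, s_plus=5, s_minus=4, width=1.5, inhibition=0.6):
    """Excitatory gradient around S+ minus a weighted inhibitory gradient around S-."""
    return gaussian(x, s_plus, width) - inhibition * gaussian(x, s_minus, width)

def peak_rank(**params):
    """Rank with the highest net strength for the given (hypothetical) parameters."""
    return max(RANKS, key=lambda x: net_strength(x, **params))

if __name__ == "__main__":
    print("peak with inhibition:", peak_rank())
    print("peak without inhibition:", peak_rank(inhibition=0.0))
```

With these assumed parameters the summed gradient peaks one step beyond S+, on the side away from S−; setting the inhibition weight to zero returns the peak to S+.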
One other factor that may have contributed to disparate theoretical accounts of learning-related shifts is methodologies that emphasize species differences. Traditionally, studies of the peak shift effect have involved “simple” stimuli varying along a single dimension. Such simple stimulus sets may be generalized differently by humans and other animals because humans can develop verbal response rules for differentiating such stimuli (Ahn & Medin, 1992; Imai & Garner, 1965; Shepard et al., 1961; Wattenmaker, 1992). More recently, experiments in which participants were trained to discriminate between two faces that morphed along a complex visuospatial dimension (Lewis & Johnston, 1999; McLaren & Mackintosh, 2002; Spetch et al., 2004) established that peak shift also occurs with multidimensional stimuli. As a result, questions have been raised about whether or not explanations necessitating response rules can explain shift with stimuli of greater complexity (Spetch et al., 2004). Further investigation into peak shift with other complex stimuli is needed before conclusions can be reached.
The current experiments further explore the generality of learning-related shifts in generalization gradients by measuring responses in individuals trained to identify temporally dynamic sounds varying along a complex acoustic dimension. Specifically, participants learned to identify sounds in which frequency was modulated periodically at a fixed rate. There appear to be no prior reports of a peak shift effect in humans after training with temporally dynamic stimuli in any modality, or after training with multidimensional acoustic stimuli. The first aim of these experiments was thus to assess whether learning-related shifts in generalization gradients similar to those observed after training with complex visual stimuli (e.g., faces) are also observed after training with complex acoustic stimuli. Additionally, we controlled for changes in adaptation level between training and testing as well as for asymmetries in excitatory or inhibitory gradients resulting from differential reinforcement. Observations of learning-related shifts with these controls in place may provide new clues about the mechanisms underlying generalization.
We also explored the time course of the peak shift phenomenon. Aside from a few studies with pigeons (Moye & Thomas, 1982; Spence, 1937; Thomas et al., 1960; Thomas, 1985), the time course of peak shift retention has not been explored. Tests of forgetting suggest that as time passes after training, novel stimuli become more likely to elicit the same responses as trained stimuli (reviewed by Bouton, Nelson, & Rosas, 1999; Riccio, Ackil, & Burch-Vernon, 1992). Similarly, generalization gradients tend to flatten over time in both humans and nonhumans. These results predict that shifts in generalization might also vary over time. However, Spence (1937) and Moye and Thomas (1982) reported peak shift effects after a 24 hr delay between training and test, and Thomas, Ost, and Thomas (1960) showed that this effect persisted for at least three weeks. Is peak shift similarly stable in humans? We addressed this question by inserting a 24 hr delay between training and testing.
Knowing how long learning-related shifts in gradients last can provide insights into their underlying mechanisms (Blough, 1975). If peak shift effects do not persist in humans, then short-term mechanisms such as attentional shifts may be sufficient to explain the effect. This finding would support current dissociations between generalization in humans and other animals. If the effect is still present after a 24 hr delay, however, then accounts based on temporary adjustments of stimulus processing or decision criteria are untenable, and additional similarities in mechanisms across species would be revealed.
Experiment 1 was conducted to determine whether peak shift would be observed in participants trained to distinguish complex sounds. We held objective adaptation levels constant between training and testing, and controlled for differences between excitatory and inhibitory gradients by using a two choice task. For ease of discussion and to remain consistent with the literature, we refer to the stimuli experienced during training as S+ and S− even though participants receive equal reinforcement for responses to both stimuli. Participants in the identification training condition were instructed to respond to S+ on every trial during training. Participants in the discrimination training condition responded differentially to S+ and S− during training. Both groups were then tested with sounds similar to S+ and S− and asked to identify the S+. We hypothesized that after identification training generalization gradients would peak at S+, but that after discrimination training gradients would peak at a stimulus other than S+. Specifically, we predicted that the generalization gradients of the discrimination training group would display a peak proportion of S+ responses at a location displaced from S+ in a direction further from S−.
Forty-four introductory psychology students at the University at Buffalo, SUNY, participated in the study for partial course credit. Three participants were dropped from the identification training group. One was dropped due to a computer error, and two others for clearly ignoring instructions to make only 15% S+ responses during the generalization phase. The criterion for elimination was an S+ response proportion in excess of 50%. Five participants were dropped from the discrimination training group because they failed to reach the training criterion of 70% correct. The 70% correct criterion guaranteed that we were observing people who learned the discrimination, and who were paying attention to the task. We felt a difficult discrimination was desirable to ensure that learning was required. A total of 36 participants, 18 in each training condition, were used for data analyses.
The sounds used in this study were one-second-long trains of frequency-modulated (FM) tonal sweeps that varied in repetition rate (see Figure 1).1 All sweep trains were generated using Matlab 6.5. Individual sweeps increased in frequency from 500 Hz to 4000 Hz. Thus, all stimuli spanned the same broad range of frequencies and were the same duration. There were several advantages to using FM sweep trains. First, participants were unlikely to have heard the stimuli prior to the experiment. Second, sweep trains generate predictable cortical responses in rats (Orduña, Mercado, Gluck, & Merzenich, 2005) and owl monkeys (deCharms, Blake, & Merzenich, 1998), suggesting that they likely do so in other mammals, including humans. Finally, the structure of these sounds allows them to be varied independently along several dimensions including repetition rate, modulation rate, modulation direction, and range of modulation (Mercado, Orduña, & Nowak, 2005).
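For readers who want a concrete picture of such a stimulus, the following is a minimal sketch of a one-second train of upward FM sweeps from 500 Hz to 4000 Hz. It is not the authors' Matlab code. The linear sweep shape, the 44.1 kHz sample rate, the assumption that each sweep fills exactly one repetition period, and the function names `sweep_train` and `sweep_slope` are all assumptions made for illustration.

```python
import math

def sweep_train(rate_hz, duration_s=1.0, f_lo=500.0, f_hi=4000.0, sample_rate=44100):
    """Return samples of a train of upward linear FM sweeps (assumed shape).

    Each sweep is assumed to last one repetition period (1 / rate_hz), so a
    non-integer rate leaves the final sweep truncated at the end of the sound.
    """
    period = 1.0 / rate_hz
    slope = (f_hi - f_lo) / period              # Hz per second within one sweep
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    for n in range(n_samples):
        t = (n / sample_rate) % period          # time since the current sweep began
        phase = 2.0 * math.pi * (f_lo * t + 0.5 * slope * t * t)
        samples.append(math.sin(phase))
    return samples

def sweep_slope(rate_hz, f_lo=500.0, f_hi=4000.0):
    """Steepness of each sweep in Hz per second under the same assumptions."""
    return (f_hi - f_lo) * rate_hz

if __name__ == "__main__":
    s_plus = sweep_train(7.9)
    print(len(s_plus), "samples;", sweep_slope(7.9), "Hz/s per sweep")
```

Under these assumptions a higher repetition rate produces shorter and steeper sweeps, since every sweep still covers the same 500 to 4000 Hz range.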
The repetition rates used were: 4, 7.9, and 14 Hz (used for pre-training), 6.9 and 7.9 Hz (used in training), and 4.5, 5.2, 6, 6.9, 7.9, 9.1, 10.5 and 12.1 Hz (used for testing). Rates other than those used in pre-training were selected so that the repetition rate of each sound was approximately 15% greater than the prior sound. The increase in rate not only increased the number of sweep repetitions per second, but also altered the length and steepness of each sweep. Therefore, these sounds differed on a number of acoustic dimensions.
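The approximately 15% spacing can be checked directly from the listed test rates. The sketch below only recomputes ratios between adjacent values reported above; the variable and function names are illustrative.

```python
# Test-phase repetition rates as listed in the text (Hz).
TEST_RATES_HZ = [4.5, 5.2, 6.0, 6.9, 7.9, 9.1, 10.5, 12.1]

def step_ratios(rates):
    """Ratio of each rate to the one before it."""
    return [high / low for low, high in zip(rates, rates[1:])]

ratios = step_ratios(TEST_RATES_HZ)

if __name__ == "__main__":
    for low, high, ratio in zip(TEST_RATES_HZ, TEST_RATES_HZ[1:], ratios):
        print(f"{low:>5} -> {high:>5} Hz: +{(ratio - 1) * 100:.1f}%")
```

Because each step is a roughly constant ratio rather than a constant difference, the stimuli are approximately evenly spaced on a logarithmic scale of repetition rate.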
Sounds were presented and responses were collected using DMDX experimental software (Forster & Forster, 2003) running on HP Pavilion a300n IBM-compatible desktop computers. Participants heard sounds at a normal conversational volume through Audio-Technica ATH-M40fs headphones, and indicated their responses using the shift keys on a keyboard.
Experiment 1 employed a 2 × (8) mixed factorial design. The between participants factor, training condition, had two levels (identification and discrimination), and the within participant factor, Hz, had eight levels. One of the dependent measures was the proportion of times a participant produced a response indicating S+ had occurred after hearing a particular sound. A second dependent measure was the mean of the participant’s obtained gradient.
All participants were instructed to respond by pressing the right shift key, labeled “8”, if they heard a repetition rate of 8 (7.9 Hz) and to respond with the left shift key, labeled “NOT 8”, if they heard a rate any slower or faster than 8. Participants were told to guess if they were unsure of whether or not the sound they heard was an “8”. Each participant engaged in a pre-training task, training, a pre-test reminder, and a testing phase. The pre-training task consisted of six trials meant to assist participants in understanding what they were to do in the task. During these six trials participants were presented with four sounds not heard in the training or test (two 4 Hz and two 14 Hz) and two 7.9 Hz (S+) sounds. In each pre-training trial, participants were given the correct answer before making their response (“NOT 8” or “8”).
During the training phase, the identification group received 20 presentations of the (S+) stimulus and the discrimination group received 20 (S+) stimuli and 20 (S−) stimuli. Training stimuli for the discrimination group were presented in pseudorandom order so that the same sound appeared no more than five times consecutively. The identification group heard 20 consecutive presentations of (S+). Participants were prompted to make a response during each trial and were given feedback after their responses. The word “correct” appeared on the screen after a correct response, along with the recorded reaction time in milliseconds. If an incorrect response was made, the word “wrong” appeared on the screen. After training was completed, participants were instructed that during the next phase of the experiment only 15% of the sounds would be “8” and the remaining 85% of sweeps would be “NOT 8”. Participants were also reminded that their task was to respond “NOT 8” to any sounds that were slower or faster than “8”. These instructions constituted the pre-test reminder. Participants then hit the spacebar to start the test. During the test, the eight repetition rates (4.5, 5.2, 6, 6.9, 7.9, 9.1, 10.5, and 12.1 Hz) were presented in pseudorandom order, so that the same stimulus did not occur more than two times in a row. Participants were not given feedback for their responses. Each stimulus was presented 12 times for a total of 96 test trials.
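One simple way to produce trial orders with a limit on consecutive repeats is to reshuffle until the constraint holds. The sketch below is an assumption about how such lists could be built; the text does not say how the DMDX item lists were actually generated, and the function names and seed are hypothetical.

```python
import random
from collections import Counter

def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that never equals a real stimulus
    for item in seq:
        run = run + 1 if item == prev else 1
        prev = item
        best = max(best, run)
    return best

def constrained_order(stimuli, repeats, max_run, seed=None):
    """Shuffle `repeats` copies of each stimulus until no run exceeds `max_run`."""
    rng = random.Random(seed)
    trials = [s for s in stimuli for _ in range(repeats)]
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)

TEST_RATES_HZ = [4.5, 5.2, 6.0, 6.9, 7.9, 9.1, 10.5, 12.1]

# 96 test trials, no stimulus more than twice in a row.
test_order = constrained_order(TEST_RATES_HZ, repeats=12, max_run=2, seed=1)
# 40 discrimination training trials, no sound more than five times in a row.
training_order = constrained_order(["S+", "S-"], repeats=20, max_run=5, seed=1)
```

Rejection sampling is adequate here because a sizeable fraction of random shuffles already satisfies both constraints; tighter constraints would call for a constructive method instead.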
The results from averaging the proportions of responding across participants for the discrimination and identification groups are shown in Figure 2.
All statistical tests were two tailed and employed an alpha level of .05. A 2 × (8) ANOVA was performed with training condition as the between and Hz as the within participant factor. The analysis revealed a significant main effect of Hz, F(7, 238)=29.766, p<.001, partial η²=.467, which shows that some sounds elicited more S+ responding than others. The main effect of condition was not significant, indicating that a difference in overall amount of S+ responding was not detected. There was also a significant Hz × condition interaction, F(7, 238)=3.674, p=.001, partial η²=.098. This interaction indicates that the discrimination and identification training groups differed with regard to the proportion of S+ responses they made to different Hz values.
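The degrees of freedom reported for the within-participant effects follow from the design. As a quick check, assuming a standard mixed ANOVA with no sphericity correction (the text does not mention one), the within factor has (levels − 1) degrees of freedom and its error term has (participants − groups) × (levels − 1). The function name is illustrative.

```python
def mixed_anova_df(n_participants, n_groups, n_levels):
    """Degrees of freedom for a within-participant effect in a mixed design.

    Assumes an uncorrected (sphericity-assumed) analysis.
    Returns (df_effect, df_error).
    """
    df_effect = n_levels - 1
    df_error = (n_participants - n_groups) * (n_levels - 1)
    return df_effect, df_error

if __name__ == "__main__":
    # Experiment 1: 36 participants, 2 training conditions, 8 Hz values.
    print(mixed_anova_df(36, 2, 8))
```

For Experiment 1 this gives 7 and 238, matching the reported F(7, 238).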
Mean response gradients were calculated and compared to determine whether the interaction reflected a peak shift in the discrimination group. To calculate the mean gradients the Hz values were rank-ordered on the dimension: 4.5 Hz (1), 5.2 Hz (2), 6 Hz (3), 6.9 Hz (4), 7.9 Hz (5), 9.1 Hz (6), 10.5 Hz (7), and 12.1 Hz (8). The mean of each participant’s response gradient was obtained by multiplying the number of S+ responses to each stimulus by that stimulus’s rank, summing these products, and dividing the sum by the total number of S+ responses. The 7.9 Hz S+ sweep had a rank of 5. Mean gradients were: discrimination group, M=6.09, SD=.49, identification group, M=5.36, SD=1.01. A planned comparison independent t-test revealed that the means of these groups differed significantly, t(34)=2.69, p=.011, Cohen’s d=.93. To determine whether either of the means was significantly different from the S+ value we conducted one sample t-tests on each condition’s mean gradient with 5 (S+) as the comparison value. The discrimination group’s mean of 6.09 was significantly different from 5, revealing a peak shift, t(17)=9.25, p<.001, Cohen’s d=2.39. The control group’s mean of 5.36 was not significantly different from 5, t(17)=1.53, p=.143, Cohen’s d=.37.2
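The gradient mean is a response-weighted average of stimulus ranks. A minimal sketch of the calculation follows; the response counts in the example are invented purely to show the arithmetic and are not data from the experiment.

```python
# Rank of each test stimulus along the repetition-rate dimension (from the text).
RANKS = {4.5: 1, 5.2: 2, 6.0: 3, 6.9: 4, 7.9: 5, 9.1: 6, 10.5: 7, 12.1: 8}

def gradient_mean(s_plus_counts):
    """Response-weighted mean rank for one participant.

    `s_plus_counts` maps a repetition rate (Hz) to the number of S+ ("8")
    responses given to that rate during the generalization test.
    """
    total = sum(s_plus_counts.values())
    if total == 0:
        raise ValueError("participant made no S+ responses")
    weighted = sum(RANKS[hz] * n for hz, n in s_plus_counts.items())
    return weighted / total

# Hypothetical participants (made-up counts, for illustration only).
centered = {6.9: 3, 7.9: 6, 9.1: 3}    # responses symmetric around S+
shifted = {7.9: 4, 9.1: 6, 10.5: 2}    # responses displaced away from S-

if __name__ == "__main__":
    print(gradient_mean(centered), gradient_mean(shifted))
```

A value of 5 indicates responding centered on S+, and values above 5 indicate responding displaced toward faster rates, away from S−.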
The results of Experiment 1 display a classic peak shift even though the factors important for either adaptation level or Spence’s theory to predict peak shift were eliminated. Explanations based on changes in adaptation level cannot account for the observed shift because the relational responding learned in training, if used later in test, should have yielded peak responding at the S+. In this case participants would have learned to respond to some average of the stimuli, presumably between 6.9 Hz and 7.9 Hz, plus some value. Because this average was objectively the same in both training and test, adding the same value throughout the session would have yielded responding at a consistent location. This result is consistent with studies performed by Spetch, Cheng, & Clifford (2004) that controlled for range effects with multidimensional visual stimuli. No evidence that adaptation level impacted learning related shifts was seen (see also Cheng & Spetch, 2002). Spence’s (1937) theory is also unable to explain this shift because participants in the discrimination training group were reinforced equally for responses to both discriminative stimuli. Neither stimulus should have been associated with inhibition because correct responses in the presence of both were followed with a “Correct”. The inhibition necessary for Spence’s theory to predict shift was thus not present.
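The claim that the stimulus average was the same in training and test can be illustrated for the discrimination group, which heard both 6.9 Hz and 7.9 Hz during training. The sketch assumes the average is taken on the rank scale or, nearly equivalently, on a logarithmic scale of repetition rate; the text does not state which scale was intended, so this is an assumption.

```python
import math

TEST_RATES_HZ = [4.5, 5.2, 6.0, 6.9, 7.9, 9.1, 10.5, 12.1]
TRAINING_RATES_HZ = [6.9, 7.9]  # S- and S+ for the discrimination group

def mean_rank(rates, all_rates=TEST_RATES_HZ):
    """Average rank (1-based) of `rates` within the full test set."""
    ranks = [all_rates.index(r) + 1 for r in rates]
    return sum(ranks) / len(ranks)

def geometric_mean(rates):
    """Average on a log scale, expressed back in Hz."""
    return math.exp(sum(math.log(r) for r in rates) / len(rates))

if __name__ == "__main__":
    print("mean rank:", mean_rank(TRAINING_RATES_HZ), mean_rank(TEST_RATES_HZ))
    print("geometric mean (Hz):",
          round(geometric_mean(TRAINING_RATES_HZ), 2),
          round(geometric_mean(TEST_RATES_HZ), 2))
```

On both scales the training and test sets are centered between 6.9 Hz and 7.9 Hz, so a rule of the form "average plus a fixed amount" would point to the same location in both phases.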
It is also clear that the results do not reflect participants counting the number of repetitions. At lower repetition values these stimuli are countable, but with the repetition values at or above the S+/S− counting is extremely difficult, if not impossible. There were a number of people who failed to meet criterion in training. If counting had been employed, we would not have seen this attrition. Additionally, if participants could accurately count the majority of the repetition values, there would be no expected difference between the identification and discrimination participants and no peak shift.
Alternative theoretical frameworks that emphasize the representational structure of stimuli provide a more plausible account of the current results. These frameworks assume that a stimulus is represented as graded activation over a set of elements (Blough, 1975; Ghirlanda & Enquist, 1999; McLaren & Mackintosh, 2002; Thorndike, 1932). Similar stimuli, such as those that vary along a particular dimension, share elements; the more similar two stimuli are, the more elements they share. Elemental explanations account for learning-related gradient shifts by assuming that certain elements of a stimulus that are predictive of the correct response become more salient than elements that are not predictive. This account has previously been used to explain generalization after training with static visual images. For the elemental explanation to apply to the current data, one must assume that elements of stimulus representations include dynamic features, and that variations over time activate overlapping sets of elements in the same way as different static features.
To our knowledge, Experiment 1 is the first to show peak shift in humans after training with stimuli that vary over time. This finding is consistent with recent reports of peak shift effects in humans trained to distinguish faces varying along a continuum (Lewis & Johnston, 1999; Spetch et al., 2004; McLaren & Mackintosh, 2002), and pigeons trained to discriminate multi-item visual displays (Honig & Stewart, 1993; Wills & Mackintosh, 1998). The current data afford more direct comparisons across species than studies of humans generalizing across faces, however, because faces are especially salient stimuli for humans (Jitsumori & Makino, 2004), and participants have an enormous amount of experience distinguishing faces prior to training. In contrast, participants were unlikely to have experienced FM sweep trains before participating in this experiment, and the sounds had no natural significance. Observations of a learning-related shift after training with these sounds thus increase confidence in the global nature of this phenomenon.
Experiment 2 assessed whether the peak shift effect observed in Experiment 1 would persist when there was a significant delay interval between discrimination training and generalization testing. Specifically, we planned to replicate Experiment 1 in a zero delay condition and then contrast these results to those of participants who experienced a 24 hr delay between training and test. Thomas and colleagues (1960; 1969; 1982; 1985) reported that pigeons showed the peak shift effect after this and longer delays. Comparable tests have not been performed in humans. Identifying similarities and differences in generalization across species can clarify whether the proposed dichotomy between mechanisms of human and nonhuman peak shift is justified. Additionally, information regarding the nature and function of learning-related gradient shifts may be obtained by observing their time course. For example, if the peak shift effect disappears after a delay, then short-term attentional shifts might be a plausible explanation for the effect. If peak shift remains after the 24 hr delay, then a long-term learning process that is common across species may be more probable. Given the findings of Thomas and colleagues (1960; 1969; 1982; 1985) and the results of Experiment 1, we hypothesized that both discrimination groups (with and without delay) would show a classic peak shift, and the identification groups would not.
Sixty-eight introductory psychology students at SUNY Buffalo participated in the study to partially fulfill their course requirements. Seven participants from the discrimination groups tested without delay and three from groups tested after a delay were dropped because of their failure to reach the training criterion of 70% correct. To create equal sample sizes, three participants from both the identification without delay and identification with delay groups were dropped randomly and blindly. The total number of participants in each group was 13 after these exclusions.
The stimuli and apparatus were the same as used in Experiment 1.
A 2 × 2 × (8) mixed factorial design was used. The between participants factor of training condition and the within participant factor of Hz were the same as in Experiment 1. The second between participants factor, delay time, had 2 levels (0 hours and 24 hours). The 0 hour delay was an exact replication of Experiment 1. The dependent measures were the same as in Experiment 1.
The procedures were also the same as in Experiment 1 except that instead of proceeding to the generalization period after training, the discrimination and identification training groups with the 24 hr delay were told to come back the next day at exactly the same time. Upon returning to the lab 24 hours later, these participants were given the pre-test reminder and then the generalization test.
The dependent measures, as in Experiment 1, were the proportion of S+ responses to each particular Hz value and the mean gradient of responding. The mean of participants’ responses for each group at each Hz value is shown in Figure 3.
All statistical tests were two tailed and used an alpha level of .05. A 2 × 2 × (8) ANOVA was conducted. The between participant factors were training condition and delay condition. The within participant factor was Hz. A significant main effect of Hz, demonstrating a difference in responding elicited by the Hz values, was found, F(7, 336)=28.860, p<.001, partial η²=.375. The ANOVA also revealed a significant Hz × condition interaction, F(7, 336)=5.978, p<.001, partial η²=.111. All other main effects and interactions were not significant, Fs<2. To be cautious in accepting peak shift effects at both levels of delay, we performed two separate 2 × (8) ANOVAs, with condition as the between and Hz as the within participant factor. The ANOVA conducted with the discrimination without delay and the identification without delay groups yielded a significant main effect of Hz, F(7, 168)=28.785, p<.001, partial η²=.365, and a significant condition × Hz interaction, F(7, 168)=5.962, p<.001, partial η²=.107. The main effect of condition was not significant, F<2, showing the same overall level of responding. The main effect of Hz illustrates that different Hz values elicited different proportions of S+ responding. The significant Hz × condition interaction supports a difference between the S+ responding of the training conditions elicited by the presentation of each sound. The ANOVA performed on the discrimination with delay and identification with delay groups also revealed a significant main effect of Hz, F(7, 168)=17.287, p<.001, partial η²=.419, and a significant condition × Hz interaction, F(7, 168)=3.691, p=.001, partial η²=.133, illustrating the same differences as the previous sub-comparison ANOVA. The main effect of condition was not significant, F<2.
Again, comparisons of the mean gradients were used to determine whether peak shifts in the discrimination conditions caused the significant interactions. Stimuli were rank ordered as done in Experiment 1. The mean gradients of the groups were: discrimination without delay, M=5.43, SD=.55, identification without delay, M=4.7, SD=1.14, discrimination with delay, M=5.7, SD=.81, identification with delay, M=4.7, SD=1.12. A 2 (training condition) × 2 (delay) ANOVA for the means revealed a significant main effect of condition, F(1, 48)=10.59, p=.002, partial η²=.181. The main effect of delay and the delay × condition interaction were not significant, F<2.
Planned comparison t tests revealed significant differences between the discrimination without delay and identification without delay groups, t(24)=2.08, p=.048, Cohen’s d=.85, and between the discrimination with delay and identification with delay groups, t(24)=2.50, p=.019, Cohen’s d=1.02. One sample t tests revealed that the means for the discrimination without delay, t(12)=2.80, p=.016, Cohen’s d=.85, and the discrimination with delay, t(12)=3.027, p=.011, Cohen’s d=.94, groups were significantly above 5 (S+), indicating a peak shift. The t tests performed on the identification without delay and identification with delay groups did not reach significance, t<1, showing that the identification groups’ means were not significantly displaced from 5 (S+).3,4
Experiment 2 replicated the findings of Experiment 1 and also showed that the peak shift effect remained when there was a 24 hr delay between discrimination training and generalization testing. Given that training nonhuman animals involves significantly more exposures to a stimulus dimension, and given the explanatory dichotomy in the literature, it was possible that peak shift with humans might not persist as previously reported in pigeons. On the contrary, we found that training humans for only forty trials was sufficient to induce a long lasting effect. This finding suggests the possibility that similar learning mechanisms may underlie peak shift in humans and other animals. Additionally, although prior generalization research in both humans and nonhumans has suggested that gradients flatten over time, and several theories of generalization assume this gradient change (Nairne, 1991; Riccio et al., 1992; Estes, 1997), our data give no indications of delay-dependent changes in generalization for either training condition.
In this study, participants learned to press particular keys after hearing dynamic acoustic stimuli. Having learned this task, they generalized learned responses to novel stimuli varying along a complex acoustic dimension. Individuals that learned to respond to a single sound responded most to the sound experienced during training and less to novel sounds. In contrast, individuals trained to respond to two sounds, and then tested on their ability to recognize one of the sounds, showed a generalization gradient that was shifted. Peak responding was to a novel stimulus rather than to either of the sounds experienced during training. This peak shift was evident immediately after training (Experiment 1), as well as 24 hours after training (Experiment 2). These findings are consistent with numerous past reports of a peak shift effect after discrimination learning, but extend this phenomenon to temporally dynamic stimuli. This also is, to our knowledge, the first demonstration in humans that learning-related shifts in generalization gradients can persist for at least a day.
Recent studies have convincingly demonstrated peak shift effects after discrimination training with complex, static, visual stimuli, including sets of icons (Wills & Mackintosh, 1998; Honig & Stewart, 1993), spatial layouts (Cheng & Spetch, 2002), faces (Lewis & Johnston, 1999; Spetch et al., 2004; McLaren & Mackintosh, 2002), and various other naturalistic images (Ghirlanda & Enquist, 2003; Lynn et al., 2005). Collectively, these studies strongly suggest a common mode of visual discrimination learning that impacts stimulus representations similarly across species and levels of stimulus complexity. Experiment 1 extends the generality of these past results by showing that similar shifts in generalization occur when individuals learn to discriminate (but not to identify) complex sounds. In addition, Spetch et al. (2004) speculated that discrimination training with multidimensional stimuli may be more likely to lead to shifts in generalization gradients that do not depend on asymmetries in the stimulus set used to test generalization. Our results are consistent with this hypothesis, but further experiments with complex stimuli are needed to identify how stimulus features impact learning-dependent generalization.
One stimulus feature that has rarely been incorporated into past studies of the peak shift effect is spatiotemporal variability. Bizo and McMahon (2007) recently reported that individuals trained to discriminate squares presented for different durations (e.g., 0.79 s versus 0.95 s) showed a peak shift effect along the dimension of duration. Russell and Kirkpatrick (2005) obtained similar results with rats. Although the discriminandum in these experiments was time, the stimuli they displayed were static. In contrast, our participants learned to distinguish temporally dynamic stimuli; the sounds they heard changed continuously. Comparable visual stimuli might include a shape that repeatedly moves around the screen in a figure-eight pattern, or lights that flash at different frequencies (Sloane, 1964). Early work by Pavlov (1927) suggested that generalization mechanisms apply to variations in inputs over time in ways that parallel how they apply to spatial variations in inputs across sensory receptors. For example, Pavlov found similar patterns of generalization in dogs conditioned to salivate in the presence of a particular metronome rate and in dogs conditioned to respond to touches of a particular body part (see also Konorski, 1968). Experiments 1 and 2 provide further support for the supposition that training with temporally dynamic stimuli leads to generalization gradients comparable to those observed after training with static stimuli.
How is it possible that complex stimuli that vary over time could be generalized in the same way as simple stimuli that do not, and that learning could affect both kinds of stimuli in similar ways? One possibility is that there are no “simple” static stimuli. For example, presenting a colored shape to an observer does not ensure that the inputs processed by that observer are simple or static. This will be determined by how the observer directs his or her gaze, head movements, and attention. In short, stimuli arise from a temporally dynamic stream of patterned receptor activation. Compared to visual reception, sound reception is less affected by both attention and body movements. Also, periodic sounds lead to predictable cortical responses (e.g., Orduña et al., 2005). From this perspective, stimuli with predictable temporal dynamics (e.g., FM sweeps) should generate the simplest stimulus representations (in the sense of being most consistent across trials). A better understanding of the stimulus representations that underlie discrimination learning and generalization may clarify when and how training shifts generalization gradients, and why learning to distinguish stimuli varying along multiple dimensions might modulate the probability of generating a peak shift effect (Spetch et al., 2004).
Numerous past studies of learning-related shifts in generalization gradients have involved go/no-go training followed by testing with a set of novel stimuli centered on the reinforced stimulus (Purtle, 1973). This experimental design has led to a long history of debates about the role of inhibitory gradients and adaptation level adjustments as mechanisms for observed shifts (Ghirlanda & Enquist, 1999; MacLeod, Dodd, Sheard, Wilson, & Bibi, 2003; Purtle, 1973; Thomas, 1993; Thomas et al., 1991). The current study controlled for differential reinforcement during training, as well as for asymmetries in stimulus sets during testing, and still found a learning-related shift in peak responding. Neither Spence’s (1937) nor Thomas and colleagues’ (1973; 1991) explanation of the peak-shift effect is a suitable account of these results. There are other, less discussed explanations of learning-related shifts, but these also have explanatory troubles in this case. For example, a statistical decision theory interpretation might propose that during training a distribution of internal events would develop for each training stimulus (Boneau & Cole, 1967). These discriminal distributions would overlap each other so that some internal events would be associated with responses to the S+, others would be associated with responses to the S−, and some would be associated with both (Boneau & Cole, 1967). A creature might respond in a way that maximizes its reward and minimizes its non-reward. To do so, it would set a criterion for a response that is not midway between the distributions, but shifted to minimize false alarms (responses in the presence of an S−). Such an explanation is consistent with findings that creatures will shift their criterion on the basis of how negative an S− is (Boneau & Cole, 1967; Heinemann, Avin, Sullivan, & Chase, 1969; Lynn, Cnaani, & Papaj, 2005). In our study, however, participants still needed to make a response in order to be correct in the presence of an S−. Any shift in a criterion would have led to more misses of the S+ and fewer correct rejections of S−.5
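The statistical decision theory argument above can be illustrated with a minimal sketch. It assumes equal-variance Gaussian discriminal distributions, equally frequent S+ and S− trials, and equal costs for both kinds of error (as when each stimulus requires its own key press). The distribution parameters and function names are hypothetical and are not estimates from the present data.

```python
from statistics import NormalDist

# Hypothetical discriminal distributions on an arbitrary internal scale.
S_MINUS = NormalDist(mu=0.0, sigma=1.0)
S_PLUS = NormalDist(mu=1.5, sigma=1.0)
MIDPOINT = (S_MINUS.mean + S_PLUS.mean) / 2  # 0.75

def total_error(criterion):
    """Overall error rate when events above the criterion trigger the S+ response.

    Both error types count equally because either stimulus demands a response.
    """
    wrong_on_s_plus = S_PLUS.cdf(criterion)          # S+ trials given the S- response
    wrong_on_s_minus = 1.0 - S_MINUS.cdf(criterion)  # S- trials given the S+ response
    return 0.5 * (wrong_on_s_plus + wrong_on_s_minus)

for criterion in (0.25, MIDPOINT, 1.25):
    print(f"criterion={criterion:.2f}  total error={total_error(criterion):.3f}")
```

Under these assumptions the overall error rate is lowest at the midpoint and rises with a shift in either direction, so a two-response task gives no incentive to move the criterion. In a go/no-go task, by contrast, false alarms may be weighted more heavily than misses, which is what motivates a shifted criterion in that design.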
Response competition offers another alternative. Here it is proposed that during training two gradients are developed for responses (Heinemann et al., 1969; Russell & Kirkpatrick, 2005). Peak shift occurs through a summation process similar to that proposed by Spence (1937). The advantage is that no inhibition is needed because the gradients, by being associated with different responses, subtract from each other to create shift (Russell & Kirkpatrick, 2005). Because the peaks of our gradients and others (Hanson, 1959; reviewed by Purtle, 1973) are not lower than those obtained after single stimulus training, response competition is not an optimal explanation. Alternatively, the competition between responses may cause participants’ stimulus representations to be in flux during training. This would be indicative of perceptual learning (Gibson, 1959; Goldstone, 1994; Hall, 1991; Saksida, 1999; Liu, Mercado, Church & Orduña, 2008). In training that involves different consequences of different stimuli, participants could develop more distinct perceptions of the sounds. Differentiating representations of the two stimuli would in turn make consequential differences between the training stimuli easier to separate. By separating, however, the representations of these stimuli may become overlapped with representations of other stimuli on the dimension. Shift would then not result from a summation of separate gradients, but from representational reorganization caused by the competition of responses.
The elemental model of discrimination learning (Blough, 1975; Ghirlanda & Enquist, 1999; McLaren & Mackintosh, 2002; Thorndike, 1932) is similar to this kind of process. It also appears to provide the simplest account of our findings, if it is assumed that the elements being analyzed include time-varying features. Previously employed techniques for measuring peak shift effects are insufficient for determining the nature of such a representational change (Petrov, Dosher & Lu, 2005), a fact partially due to the lack of investigation into the temporal dynamics of learning-related shifts.
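To make concrete why peak height matters for the summation version of response competition discussed above, the following sketch subtracts one Gaussian response gradient from another. All values (stimulus positions, gradient width, and the weight of the competing gradient) are hypothetical and chosen only for illustration. This is not a model fitted to the present data, nor an implementation of any cited model.

```python
import math

S_PLUS, S_MINUS = 5.0, 4.0  # hypothetical positions on the ranked stimulus dimension
WIDTH = 1.5                 # hypothetical gradient width, in rank units
WEIGHT = 0.6                # hypothetical strength of the competing gradient

def gaussian(x, center, width=WIDTH):
    """Unit-height Gaussian gradient centered on a training stimulus."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def net_gradient(x):
    """Gradient for the S+ response minus the gradient for the competing response."""
    return gaussian(x, S_PLUS) - WEIGHT * gaussian(x, S_MINUS)

grid = [1 + i * 0.01 for i in range(801)]  # ranks 1.00 through 9.00
peak_x = max(grid, key=net_gradient)
peak_height = net_gradient(peak_x)
single_peak_height = gaussian(S_PLUS, S_PLUS)  # peak after single-stimulus training

print(f"net gradient peaks at rank {peak_x:.2f} with height {peak_height:.2f}")
print(f"single-stimulus gradient peaks at rank {S_PLUS:.2f} with height {single_peak_height:.2f}")
```

In this sketch the net gradient peaks beyond the S+, on the side away from the S−, but its peak is necessarily lower than the single-stimulus peak because something is always subtracted. That is the prediction at odds with peaks that are as high as, or higher than, those seen after single-stimulus training.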
In many past studies of peak shift in humans, generalization tests occurred immediately after training, and no assessment was made of possible changes in gradients at different time points during or after training. The implicit assumption underlying this methodology is that the effects of training on generalization should be strongest soon after training. In fact, several models propose that generalization gradients are likely to flatten over time (Nairne, 1991; Riccio et al., 1992; Estes, 1997), which would predict that differences in gradients should decrease as the interval between training and testing increases. As noted previously, however, there are reports of the peak shift effect persisting in nonhuman animals when days or weeks intervened between training and testing (Moye & Thomas, 1982; Spence, 1937; Thomas et al., 1960), and similar reports of stable generalization gradients over months and years (reviewed by Bouton et al., 1999). If similar mechanisms underlie peak shift in humans and other animals, one would expect that delayed testing should have little impact on the peak shift effect. The results of Experiment 2 confirm this prediction.
It remains uncertain how long learning-induced shifts in generalization gradients might last; the effect was still observed after the longest interval tested in any species (three weeks in pigeons; Thomas et al., 1960). Stimulus complexity, familiarity, naturalness, modality, or spatiotemporal variability might impact the time course of experience-dependent changes in generalization, and the quality and quantity of training that precedes the interval are also likely to be critical factors. An example of this can be seen in prior work showing that overtraining can actually cause the peak shift effect to dissipate (Gerry, 1971; Terrace, 1966). This raises the question of whether learning-related shifts reflect a useful end state or an intermediate/transitional state that is a prerequisite for a preferable end state. It could be the case that as learning progresses the amount of peak shift changes quadratically rather than linearly. A novice would display no shift, an intermediate learner would show peak shift, and an expert discriminator would lack shift due to an enhanced ability to distinguish fine differences on the dimension. In this case the expert’s increased perceptual ability may lead to a decreased chance of being fooled by a novel stimulus.
Current explanations of learning-related shifts in generalization gradients often describe these phenomena as a side-effect of discrimination learning. An alternative possibility that deserves further consideration is that gradient shifts are a direct effect of representational reorganization. Shifts in gradients that maximize differential responding to the stimuli most similar to the one experienced during training may increase the distinctiveness of the experienced stimulus more than simply increasing the response to it. This account could potentially explain why in some cases discrimination training not only shifts the peak during generalization, but also raises it considerably for an unfamiliar stimulus (Hanson, 1959; Purtle, 1973).
In conclusion, future studies of learning-related shifts in generalization that control for: (1) differential reinforcement during training; (2) asymmetries in stimulus sets during testing; (3) the temporal dynamics of stimuli; and (4) the time course of training and testing, may provide useful data with which to further develop theories of stimulus generalization.
This research was made possible by support from National Institute of Mental Health Grant, MH67952 and a National Science Foundation Science of Learning Center Grant, SBE 0542013 to the Temporal Dynamics of Learning Center. We thank Estella H. Liu for her help in analysis and stimuli creation, and Laura Buckley, Loren Mallen, and Sara Rought for help carrying out the experiments. Thanks go out to David Smith and Peter Pfordresher for their guidance in developing the experimental design and data analyses, and for providing useful comments on earlier versions of this manuscript. We also thank the editor and three anonymous reviewers for their helpful feedback on a previous version.
2 Separate analyses were done to determine if more conservative post hoc tests and/or inclusion of dropped participants would produce different results. Two-tailed t tests with Bonferroni corrections yielded identical results. We also found that when we included the data of the participants who were dropped, the statistical conclusions were the same.
3 In Experiment 2, separate analyses were done to determine if more conservative post hoc tests would produce different results. Two-tailed t tests with Bonferroni corrections yielded identical significant results for all comparisons of conditions with a delay. However, the comparison of the discrimination without delay group to the identification without delay group and the one-sample t test comparing the discrimination without delay group’s gradient mean to the S+ dropped to marginal significance. These comparisons are significant one-tailed. It is important to remember that these conditions are a direct replication of Experiment 1.
4 Increasing variability by including the dropped participants in the analysis of Experiment 2 also dropped the comparison of the discrimination without delay group to the identification without delay group, and the one-sample t test comparing the discrimination without delay group’s gradient mean to the S+, below significance. All other main effects, interactions, and comparisons were identical to the analysis conducted with these participants dropped.
5 In a separate analysis, the S+ response proportions from Experiments 1 and 2 were pooled together by condition (discrimination or identification). We contrasted the S+ response proportions to 7.9 Hz vs. the response proportions to 4.5 Hz, 5.2 Hz, 6 Hz, and 6.9 Hz as a group. Values of d’ were calculated and were as follows: discrimination d’ = 1.88, identification d’ = 1.29. An independent samples t-test confirmed that these means were different, t(86)=2.47, p=.016, Cohen’s d=.525. This difference shows that participants who discriminated in training developed a finer ability to distinguish the left side of the distribution from the S+ stimulus than did those who received identification training. There were no significant differences between groups on measures of c’.
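For readers unfamiliar with the sensitivity measure used in this footnote, the sketch below shows the conventional equal-variance computation of d’ from a hit rate and a false-alarm rate, together with a conventional bias index c (which may differ from the c’ measure reported above). The mapping assumed here, with hits as S+ responses to the 7.9 Hz stimulus and false alarms as S+ responses to the lower-rate stimuli, is an interpretation of the footnote. The rates in the example are hypothetical and are not the study's data.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF (z transform)

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity: separation of the two distributions in standard deviation units."""
    return _z(hit_rate) - _z(false_alarm_rate)

def criterion_c(hit_rate, false_alarm_rate):
    """Response bias: zero when the criterion sits midway between the distributions."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))

# Hypothetical rates for one participant (rates of exactly 0 or 1 would need correction).
hits, false_alarms = 0.85, 0.30
print(f"d' = {d_prime(hits, false_alarms):.2f}, c = {criterion_c(hits, false_alarms):.2f}")
```

A larger d’ indicates that S+ responses were more selectively reserved for the trained stimulus, which is the sense in which the discrimination group showed a finer ability to separate the S+ from the stimuli on its left.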