A major concern when designing a psychophysical experiment is that participants may use a stimulus feature (“cue”) other than that intended by the experimenter. One way to avoid this involves applying random variations to the corresponding feature across stimulus presentations, to make the “unwanted” cue unreliable. An important question facing experimenters who use this randomization (“roving”) technique is: How large should the randomization range be to ensure that participants cannot achieve a certain proportion correct (PC) by using the unwanted cue, while at the same time avoiding unnecessary interference of the randomization with task performance? Previous publications have provided formulas for the selection of adequate randomization ranges in yes-no and multiple-alternative, forced-choice tasks. In this article, we provide figures and tables, which can be used to select randomization ranges that are better suited to experiments involving a same-different, dual-pair, or oddity task.
A common concern in designing a psychophysical experiment relates to the possibility that participants perform the task using a “cue” (stimulus feature, or dimension) other than that intended by the experimenter. For instance, in the field of auditory-perception research known as “profile analysis” (for a review, see: Green, 1988), the researcher is primarily interested in how well listeners can detect or discriminate features, such as peaks or troughs, in the spectral shape of sounds. However, unless special precautions are taken, listeners may be able to perform the task correctly without even extracting spectral shape. For example, listeners can identify which of two successively presented sounds contains a spectral peak based solely on differences in loudness, if the sound containing the spectral peak has a higher intensity overall. Consequently, there is a risk that thresholds or performance in this type of experiment reflect loudness perception, rather than spectral-shape perception. Another example is auditory frequency (subjectively, pitch) discrimination, where listeners can use differences in loudness between tones, due to variation in equal-loudness contours across frequency (Dai, Nguyen, & Green, 1995; Emmerich, Ellermeier, & Butensky, 1989; Henning, 1966; Moore & Glasberg, 1989; Moore, Glasberg, Low, Cope, & Cope, 2006). As a result, performance or thresholds in an experiment that originally sought to measure the perception of pitch may actually reflect, or be contaminated by, the perception of another sound attribute (loudness).
Two approaches have traditionally been used by experimenters to limit participants’ ability to take advantage of unwanted cues in discrimination tasks. The first approach involves equalizing the values of the stimuli along the unwanted dimension. For instance, in a frequency-discrimination experiment, the experimenter can try to adjust the relative intensities of tones in a frequency-dependent manner, in an attempt to ensure that loudness remains constant as frequency changes. Unfortunately, in general, equating precisely the perceived values of stimuli, which keep changing during the course of psychophysical measurements, is a challenging task, which often requires detailed and time-consuming measurements beforehand.1
A second approach to limiting participants’ use of unwanted cues involves applying random stimulus variation along the unwanted dimension. In the auditory psychophysics literature, this is commonly referred to as “randomization”, or “roving”. For instance, to prevent listeners in the above-mentioned spectral-shape discrimination experiment from taking advantage of overall loudness cues, experimenters “rove” (i.e., vary randomly) the overall level of the stimuli (see: Dai & Green, 1992; Drennan & Watson, 2001; Durlach, Braida, & Ito, 1986; Farrar et al., 1987; Green, 1988; Kidd & Dai, 1993; Kidd, Mason, Uchanski, Brantley, & Shah, 1991; Mason, Kidd, Hanna, & Green, 1984; Spiegel, Picardi, & Green, 1981; Versfeld & Houtsma, 1991). Similarly, to prevent listeners from taking advantage of loudness cues in an auditory frequency-discrimination experiment, experimenters can randomize the level of each tone, so that loudness differences no longer provide a reliable cue for task performance (Dai et al., 1995; Emmerich et al., 1989; Henning, 1966; Moore & Glasberg, 1989; Semal & Demany, 2006). The roving technique is very popular among auditory-perception researchers. It has been used in studies of intensity perception (Berliner & Durlach, 1973; Berliner, Durlach, & Braida, 1977; Oxenham & Buus, 2000), pitch discrimination with complex tones (see: Houtsma & Smurzynski, 1990; Moore, Glasberg, Flanagan, & Adams, 2006; Oxenham, Micheyl, & Keebler, 2009), tone-in-noise detection (Hall & Fernandes, 1983; Kidd, Mason, Brantley, & Owen, 1989), binaural hearing (Bernstein & Trahiotis, 1997; Bernstein & Trahiotis, 1994; Henning, Richards, & Lentz, 2005), speech perception (Macmillan, Goldberg, & Braida, 1988), temporal gap detection (Formby & Muir, 1989; Forrest & Green, 1987), and frequency- or amplitude-modulation perception (Furukawa & Moore, 1997; Moore & Sek, 1998; Stellmack, Viemeister, & Byrne, 2006), among others.
A practical question facing the experimenter who plans to use roving is: how large should the roving range be? If the range is too small, participants may still be able to achieve a relatively high proportion of correct responses based on the unwanted cue. On the other hand, if the range is too large, participants’ performance might be impacted unnecessarily by the random stimulus variations2. Therefore, experimenters must strive to find a good compromise between limiting contributions from the unwanted cue (which encourages the use of a wide roving range), and limiting potential side-effects of roving on performance (which calls for the use of as small a roving range as is safely possible). In order to select a suitable roving range, experimenters must know how the proportion of correct responses that can be achieved based on the unwanted cue, Pcunwanted, depends on the roving range, R. The latter is defined as the distance between the largest and smallest values that the stimulus can assume along the “unwanted” dimension, due to roving. For instance, in a frequency-discrimination experiment in which the level of tones can vary randomly between 45 and 55 dB SPL across presentations, the roving range is 10 dB.
In addition to depending on R, Pcunwanted also depends on the size of the unwanted cue, Δ. The latter corresponds to the change along the “unwanted” stimulus dimension, which accompanies (and is correlated with) the change applied by the experimenter along the primary dimension. For instance, if the loudness of a tone changes by an amount corresponding to 1 dB when the tone frequency changes by 1%, then the size of the unwanted loudness cue corresponding to a 1% change in frequency in a frequency-discrimination experiment is 1 dB. In the framework of signal detection theory (SDT, see: Green & Swets, 1966), Δ can be identified with the distance, along the “unwanted” physical dimension, between the two stimuli that must be discriminated in a yes-no paradigm. If the physical-to-sensory mapping is linear, and the internal noise that contaminates the sensory observations evoked by the stimuli is constant, Δ is directly proportional to the familiar index of sensitivity, d′. However, there are two important differences between Δ and d′. Firstly, whereas d′ usually denotes sensitivity to the primary cue, Δ refers to an unwanted cue. Secondly, whereas d′ is dimensionless, Δ has the dimension of the stimulus attribute being randomized.
In most applications, the size of the unwanted cue is either known to the experimenter, or it can be estimated based on data in the relevant literature, especially data from studies in which the corresponding cue, which is now the unwanted cue, was then the cue of primary interest. For instance, loudness cues associated with changes in the frequency of pure tones in a frequency-discrimination experiment can be estimated based on data on equal-loudness contours and intensity discrimination. When relevant data for estimating the size of the unwanted cue are not available in the existing literature, such data must be collected. In some applications, the size of the unwanted cue is directly available. For instance, in spectral profile-analysis experiments, the overall loudness difference between the sounds that the listener must discriminate (i.e., the size of the unwanted cue) is, to a first approximation, proportional to the increment or decrement in level that the listener must detect (i.e., the size of the primary cue). In general, when the primary and unwanted cues share the same dimension (as in the profile-analysis example), the size of the unwanted cue is known to the experimenter; in all other cases, the size must be estimated based on existing data, or measured.
While the relationship between Pcunwanted, R, and Δ can be studied empirically, measurements of this relationship are usually impractical in the context of experimental studies, the primary aim of which is not to characterize it. Therefore, experimenters do not usually choose R based on empirical measurements. Instead, they rely on predictions derived from ideal-observer models from signal detection theory (Green & Swets, 1966). Because these models assume a noiseless observer who uses the information conveyed by the unwanted cue optimally, they provide an upper bound on the performance that can be achieved based on that cue. In particular, Green (1988, pp. 19–21) provided a relatively simple formula relating Pcunwanted, R, and Δ, in the 2I-2AFC paradigm: Pcunwanted = 0.5 + Δ/R − 0.5 (Δ/R)^2. More recently, Dai and Kidd (2009) derived similar formulas for the yes-no and m-alternative forced-choice (mAFC) paradigms. Specifically, they showed that, for the yes-no paradigm, Pcunwanted = 0.5 + 0.5 (Δ/R), whereas for the mAFC paradigm, Pcunwanted = Δ/R + [1 − (Δ/R)^m]/m. With m = 2, the mAFC formula reduces to Green’s formula for the 2I-2AFC paradigm.
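The three closed-form expressions quoted above can be written as small functions. The following is a minimal sketch, not code from the original article: the function names are hypothetical, the exponents are written out explicitly (a square in the 2I-2AFC formula, a power of m in the mAFC formula), and clamping Δ/R to the interval [0, 1] is an added assumption reflecting the fact that performance cannot exceed 100% once the cue is as large as the roving range.

```python
def _clamp_ratio(ratio):
    """Clamp delta/R to [0, 1]; an assumption added for this sketch."""
    if ratio < 0:
        raise ValueError("delta/R must be non-negative")
    return min(ratio, 1.0)


def pc_yes_no(ratio):
    """Yes-no paradigm (Dai & Kidd, 2009): Pc = 0.5 + 0.5 * (delta/R)."""
    ratio = _clamp_ratio(ratio)
    return 0.5 + 0.5 * ratio


def pc_2afc(ratio):
    """2I-2AFC paradigm (Green, 1988): Pc = 0.5 + delta/R - 0.5 * (delta/R)^2."""
    ratio = _clamp_ratio(ratio)
    return 0.5 + ratio - 0.5 * ratio ** 2


def pc_mafc(ratio, m):
    """mAFC paradigm (Dai & Kidd, 2009): Pc = delta/R + [1 - (delta/R)^m] / m."""
    if m < 2:
        raise ValueError("m must be at least 2")
    ratio = _clamp_ratio(ratio)
    return ratio + (1.0 - ratio ** m) / m


if __name__ == "__main__":
    # A roving range five times the unwanted-cue size (delta/R = 0.2).
    for name, value in [("yes-no", pc_yes_no(0.2)),
                        ("2I-2AFC", pc_2afc(0.2)),
                        ("3AFC", pc_mafc(0.2, 3))]:
        print(f"{name}: {value:.3f}")
```

Setting m = 2 in `pc_mafc` gives the same value as `pc_2afc`, which is a quick consistency check on the two formulas.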
Although the yes-no and mAFC paradigms have been used in a large number of auditory- and visual-perception studies over the past fifty years, other paradigms exist, which are better suited for certain applications (overviews can be found in: Creelman & Macmillan, 1979; Macmillan & Creelman, 2005; Macmillan, Kaplan, & Creelman, 1977; Noreen, 1981). For instance, the same-different paradigm provides a measure of basic stimulus discriminability, which does not involve an ability to identify the direction of changes in, e.g., sound intensity or frequency (Dai, Versfeld, & Green, 1996). The “oddity” paradigm, wherein participants are on each trial presented with m stimuli, one of which differs from the other m-1, is better suited than its mAFC counterpart in some situations (Versfeld, Dai, & Green, 1996). The “dual-pair comparison” paradigm (Creelman & Macmillan, 1979) allows researchers to measure stimulus-change detection and change-direction identification using the same stimulus structure (two-pairs of stimuli, one containing a change, the other not), by simply changing the instructions given to the participant (Semal & Demany, 2006; Micheyl, Kaernbach, & Demany, 2008). In some contexts, it is necessary to use roving in a same-different (e.g., Jesteadt & Bilger, 1974), oddity (e.g., Lyzenga & Horst, 1995; Lyzenga & Horst, 1997, 1998)3, or dual-pair (e.g., Micheyl et al., 2006; Semal & Demany, 2006) paradigm. Unfortunately, the above-cited formulas, which give the relationship between Pcunwanted and roving range for the yes-no and mAFC paradigms, do not apply to these other paradigms. In fact, as the results presented in this article reveal, using these formulas to determine the roving range required to keep Pcunwanted under a target level in any of three paradigms mentioned above (same-different, dual-pair, and oddity) can lead to substantial errors in both experimental design, and data interpretation.
While no simple analytical formulas exist, which can be used as guidelines for selecting suitable roving ranges in same-different, dual-pair, or oddity experiments, in this article, we provide figures and tables, which experimenters can use to select an adequate roving range, R, given a target Pcunwanted, and a known (or estimated) unwanted-cue size, Δ, in the same-different paradigm, two versions of the dual-pair paradigm (i.e., 4IAX and AB-versus-BA), and two versions of the oddity paradigm (i.e., three- and four-interval oddity). The information in the tables and figures can also be used, conversely, to find the Pcunwanted that can (or could, in retrospect) be achieved in an experiment using one of these paradigms, given the roving range and unwanted-cue size.
In order to derive the results presented below, we assumed a maximum-likelihood (ML) observer, who makes optimal use of the information conveyed by the unwanted cue, which is being roved. Obviously, the information conveyed by the unwanted cue becomes less and less useful for correct task performance as the roving range increases. The general approach is similar to that described by Green (1988) for the 2I-2AFC task, and more recently extended to yes-no and the mAFC tasks by Dai and Kidd (2009). The basic idea of this approach is that the unwanted cue shifts the distribution of stimulus values, and the corresponding distribution of sensory observations, along the considered stimulus dimension. The distribution of stimulus values is produced by the application of stimulus roving. Here, as in Green (1988) and Dai and Kidd (2009), we assume a “rectangular”, i.e., continuous-uniform distribution. Near the end of the article, we show how the results can be corrected when the uniform distribution is discrete, instead of continuous. The uniform is the distribution most frequently used in studies of auditory perception. Of all continuous roving distributions having a fixed range, the uniform is the one that minimizes the maximal Pc that can be achieved (by an ideal, maximum-likelihood observer) based on the unwanted cue (Dai, 2008).
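The idea that the unwanted cue shifts a uniform roving distribution can be illustrated by simulation for the 2I-2AFC case, where the analytical answer is known. The sketch below is an illustration only and is not the procedure used in the article: it assumes a noiseless observer who picks the interval with the larger value along the roved dimension, a rule whose expected score coincides with Green’s (1988) formula. The function name, the number of trials, and the seed are arbitrary choices.

```python
import random


def simulate_2afc(delta, rove_range, n_trials=200_000, seed=0):
    """Monte Carlo estimate of Pc for a 'pick the larger value' observer.

    The standard is drawn uniformly from [0, R]; the signal is drawn from
    the same distribution and then shifted by the unwanted cue, delta.
    """
    rng = random.Random(seed)
    n_correct = 0
    for _ in range(n_trials):
        standard = rng.uniform(0.0, rove_range)
        signal = rng.uniform(0.0, rove_range) + delta
        if signal > standard:  # observer chooses the signal interval
            n_correct += 1
    return n_correct / n_trials


if __name__ == "__main__":
    delta, rove_range = 2.0, 10.0  # delta/R = 0.2
    ratio = delta / rove_range
    predicted = 0.5 + ratio - 0.5 * ratio ** 2  # Green's (1988) formula
    print(f"simulated: {simulate_2afc(delta, rove_range):.3f}")
    print(f"predicted: {predicted:.3f}")
```

With a large number of trials the simulated proportion should fall close to the predicted value; the remaining difference is sampling error.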
Under these assumptions, the maximal Pc that can be achieved based on the unwanted cue alone (hereafter referred to as Pcunwanted) can be computed as the integral, over the observation space, of the probability density corresponding to the most likely a-posteriori stimulus alternative—at the current point in the observation space. Using Bayes’ theorem, the latter probability can be determined based on the (uniform) probability density function of the observations, given the stimulus alternative. Since our calculations are for an ideal observer, and in most experimental applications the various stimulus alternatives are equally likely, the maximum a-posteriori (MAP) and maximum-likelihood (ML) solutions are equivalent.
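The verbal description above can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the original article: x denotes a point in the observation space, s_i the i-th of k stimulus alternatives, P(s_i) its prior probability, and f(x | s_i) the probability density of the observations given that alternative. The second equality holds when the k alternatives are equally likely, which is the case in which the MAP and ML solutions coincide.

```latex
\[
Pc_{\mathrm{unwanted}}
  = \int \max_{i} \bigl[ P(s_i)\, f(\mathbf{x} \mid s_i) \bigr] \, d\mathbf{x}
  = \frac{1}{k} \int \max_{i} f(\mathbf{x} \mid s_i) \, d\mathbf{x}
\]
```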
For the 2I-2AFC, yes-no, and mAFC paradigms, the integral has a relatively simple analytical solution—see the above-mentioned equations by Green (1988) and Dai and Kidd (2009). For other paradigms, analytical solutions are more difficult to obtain due to the greater dimensionality of the decision space, or to the presence of nonlinearities (e.g., an absolute-value, or maximum-of operation) in the decision rule. Here, rather than attempt to provide analytical solutions, we resorted to a numerical-evaluation approach. We evaluated the integral, over the relevant observation space, of the (uniform) probability density corresponding to the most likely stimulus alternative. Several publications have already described the relevant observation spaces and ML decision rules for the various paradigms considered here: same-different (Dai et al., 1996; Irwin & Hautus, 1997; Irwin, Hautus, & Butcher, 1999; Macmillan & Creelman, 2005), dual-pair 4IAX (Micheyl, Kaernbach, & Demany, 2008; Micheyl & Messing, 2006; Noreen, 1981; Rousseau & Ennis, 2001, 2002), dual-pair AB-versus-BA (Micheyl & Dai, 2008, 2009), and oddity (Frijters, 1979a, 1979b; Geelhoed, MacRae, & Ennis, 1994; Versfeld et al., 1996). Readers are referred to these earlier texts. In the remainder of this article, we present the results of our calculations relating Pcunwanted to Δ/R for these different paradigms, in both figure and table format.
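To make the numerical-evaluation idea concrete, the sketch below integrates the density of the most likely alternative for the yes-no paradigm, the simplest case, whose observation space is one-dimensional and whose analytical solution is known. It is an assumption-laden illustration rather than the authors’ code: the midpoint rule, the grid size, and the function names are all choices made here, and the higher-dimensional paradigms treated in this article require the decision spaces described in the cited references.

```python
def uniform_pdf(x, low, high):
    """Density of a continuous uniform distribution on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def pc_yes_no_numeric(delta, rove_range, n_steps=100_000):
    """Midpoint-rule integral of max_i [P(s_i) * f(x | s_i)] over x.

    Two equally likely alternatives: the standard, roved over [0, R],
    and the signal, roved over [delta, R + delta].
    """
    low, high = 0.0, rove_range + delta
    step = (high - low) / n_steps
    total = 0.0
    for i in range(n_steps):
        x = low + (i + 0.5) * step
        p_standard = 0.5 * uniform_pdf(x, 0.0, rove_range)
        p_signal = 0.5 * uniform_pdf(x, delta, rove_range + delta)
        total += max(p_standard, p_signal) * step
    return total


if __name__ == "__main__":
    delta, rove_range = 3.0, 10.0
    print(f"numerical:  {pc_yes_no_numeric(delta, rove_range):.4f}")
    print(f"analytical: {0.5 + 0.5 * delta / rove_range:.4f}")
```

The numerical value agrees with the yes-no formula of Dai and Kidd (2009), Pcunwanted = 0.5 + 0.5 (Δ/R).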
Figure 1 shows Pcunwanted as a function of Δ/R for the same-different paradigm, the 4IAX paradigm, and the 4IAX AB-versus-BA paradigm. Similar functions are also shown for the yes-no paradigm and the 2I-2AFC paradigm for comparison. As mentioned in the Introduction, for these two paradigms, analytical solutions for the relationship between Δ/R and Pcunwanted have been provided in other publications (Dai & Kidd, 2009; Green, 1988). The five paradigms illustrated in this figure all have the same chance-performance level, corresponding to Pc = 0.5. The tables in Appendices A, B, and C list Δ/R values corresponding to Pcunwanted between 0.5 and 1 (in steps of 0.01) for the same-different paradigm, and the two versions of the dual-pair paradigm (i.e., 4IAX and AB-versus-BA).
Figure 1 reveals that, of the five paradigms, the same-different paradigm generally yields the lowest Pcunwanted (given Δ/R). On the other hand, the 2I-2AFC paradigm generally yields higher Pcunwanted values than the other paradigms considered in this figure, with the exception of the 4IAX version of the dual-pair paradigm for relatively low Pc levels (below 0.65). The difference between the same-different and 2I-2AFC curves is considerable. For example, whereas a roving range five times the size of the unwanted cue (which corresponds to a Δ/R ratio of 0.2) is needed in order to limit Pcunwanted to just under 70% in a 2I-2AFC experiment, in a same-different experiment, a roving range having this relative size limits Pcunwanted to less than 55%. Another way of looking at the difference between the 2I-2AFC and same-different paradigms is that, for a given Δ, the smallest roving range required to ensure that Pcunwanted does not exceed 60% is about four times smaller for the same-different paradigm than for the 2I-2AFC paradigm. An experimenter who uses Green’s (1988) formula (reproduced in the Introduction) to determine the roving range needed in a same-different experiment to limit Pcunwanted to within 52 to 70% is likely to over-estimate the required roving range by a factor of as much as six.
These observations should not be interpreted as implying that stimulus roving always reduces the influence of an unwanted cue more effectively in the same-different paradigm than in the 2I-2AFC paradigm. As discussed in Dai (2008), the relative contributions of a primary cue and of an unwanted cue in determining the Pc measured in an experiment depend, among other things, on the relative salience of each cue, and on how these cues interact. However, the ideal-observer analysis described in this article provides an upper bound on the performance that can be achieved by a real observer based on the unwanted cue.
Figure 2 shows how Pcunwanted depends on Δ/R in the three- and four-interval oddity paradigms. For comparison, the functions relating Pcunwanted to Δ/R in the 3AFC and 4AFC paradigms are also shown (as gray solid and dashed lines, respectively). The latter were computed using the formula provided by Dai and Kidd (2009), which can be found in the Introduction of the current paper. It can be seen that, for a given value of Δ/R, the three- and four-interval oddity paradigms yield lower values of Pcunwanted than their 3AFC and 4AFC counterparts.
The results in Figures 1 and 2, and the tables in Appendices A to E, apply to continuous uniform roving distributions. However, in experimental studies, researchers sometimes use discrete uniform distributions with a relatively small number of levels (or “bins”) on the roving continuum. For instance, Henning (1966) used a 10-dB roving range with levels spaced 0.5 dB apart, yielding 21 possible stimulus intensities. In Jesteadt and Bilger (1974), the uniform discrete distributions used for frequency and level roving contained five bins. In pitch-discrimination experiments with complex tones in which the lowest-harmonic number has been roved, the roving distribution typically contained only two or three bins (see: Houtsma & Smurzynski, 1990; Moore, Glasberg, Flanagan et al., 2006; Oxenham et al., 2009). Therefore, it is of interest to determine how the number of bins in a uniform-discrete roving distribution influences the relationship between Pcunwanted and Δ/R. Provided that the bins of the roving distributions for the “standard” and “signal” stimuli coincide with each other within the region where the two distributions overlap, the results for discrete uniform roving distributions with n bins can be derived from the results obtained using continuous uniform distributions by replacing Δ/R with [(n−1)/n] Δ/R (Dai & Kidd, 2009). As an example, suppose that an experimenter desires to predict Pcunwanted in a same-different task for a uniform discrete distribution having a range of 2Δ, and containing three bins. The result can be obtained by, firstly, calculating [(n−1)/n] Δ/R, then, looking for the closest Δ/R value in Appendix A, and looking up the corresponding Pcunwanted. In our example, [(n−1)/n] Δ/R equals 1/3, and the Pcunwanted corresponding to the closest Δ/R value (0.3463) in Appendix A is 0.56.
With a continuous roving distribution, or approximately, a distribution containing a large number of bins, the same shift of Δ/R = 1/2 would yield a Pcunwanted of about 63%. Thus, for the same roving range, performance based on unwanted cues can be limited to a lower level by using a uniform discrete roving distribution with a relatively small number of bins, compared to a discrete distribution with a larger number of bins.
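The correction for discrete roving distributions amounts to a one-line rescaling of Δ/R before consulting the tables or formulas. The sketch below uses a hypothetical function name and simply applies the factor (n−1)/n attributed above to Dai and Kidd (2009); the appendix tables themselves are not reproduced here, so the lookup step is left to the reader.

```python
def effective_ratio(delta, rove_range, n_bins):
    """Return [(n - 1) / n] * delta / R for a discrete uniform rove with n bins."""
    if n_bins < 2:
        raise ValueError("a roving distribution needs at least two bins")
    if rove_range <= 0:
        raise ValueError("the roving range must be positive")
    return (n_bins - 1) / n_bins * delta / rove_range


if __name__ == "__main__":
    # Example from the text: a range of 2 * delta and three bins.
    delta = 1.0
    print(effective_ratio(delta, 2.0 * delta, 3))   # one third
    # With many bins the value approaches the continuous case, delta/R = 0.5.
    print(effective_ratio(delta, 2.0 * delta, 1000))
```

The returned value is the Δ/R entry to look for in the table for the paradigm of interest.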
Finally, another question of practical interest to experimenters concerns the smallest number of bins that a discrete roving distribution should have, in order for its effect on Pcunwanted to be essentially indistinguishable from that achieved with a continuous roving distribution. To answer this question, we computed the lower bound of the 95% confidence interval around the Pcunwanted values shown in Figures 1 and 2, which were derived using a continuous distribution. The confidence intervals were determined under the assumption of binomial variability (i.e., no over-dispersion), and measures based on 100 trials. Pcunwanted values corresponding to discrete distributions with different numbers of bins were then computed, using the approach described in the previous paragraph, and these values were compared to the lower bound of the 95% confidence interval. Figure 3 shows for each paradigm (in each panel) the lower bound of the 95% confidence interval (dotted line) for the Pcunwanted function derived from a continuous distribution (solid line, re-plotted from Fig. 1 or 2), and Pcunwanted values derived from a discrete distribution with eight bins (open circles). The results are similar across all paradigms, showing that for Δ/R values of less than 0.5 (i.e., a roving range at least twice as large as the assumed size of the unwanted cue, which is typically the case in experimental studies), the Pcunwanted values from the discrete distribution (open circles) fall within the 95% confidence interval, and are thus statistically indistinguishable from those obtained using a continuous roving distribution. Therefore, in limiting the effectiveness of an unwanted cue via random roving, a discrete uniform distribution is practically identical to a continuous uniform distribution, provided that the discrete distribution consists of eight or more bins. This provides a guideline for experimenters.
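The comparison described above can be sketched for one paradigm for which a closed-form expression is available. The example below makes several assumptions that are not taken from the article: it uses Green’s (1988) 2I-2AFC formula as a stand-in for the tabulated functions, it uses the normal-approximation (Wald) binomial interval because the exact interval method is not specified in the text, and all function names are hypothetical.

```python
import math


def pc_2afc(ratio):
    """Green's (1988) 2I-2AFC formula, valid for 0 <= delta/R <= 1."""
    return 0.5 + ratio - 0.5 * ratio ** 2


def wald_lower_bound(p, n_trials=100, z=1.96):
    """Lower bound of a normal-approximation 95% binomial interval (assumed method)."""
    return p - z * math.sqrt(p * (1.0 - p) / n_trials)


def discrete_within_interval(ratio, n_bins, n_trials=100):
    """True if the discrete-rove Pc is not below the lower bound for the continuous Pc."""
    continuous = pc_2afc(ratio)
    discrete = pc_2afc((n_bins - 1) / n_bins * ratio)
    return discrete >= wald_lower_bound(continuous, n_trials)


if __name__ == "__main__":
    ratios = [r / 100 for r in range(1, 51)]  # delta/R from 0.01 to 0.50
    for n_bins in (2, 4, 8):
        ok = all(discrete_within_interval(r, n_bins) for r in ratios)
        print(f"{n_bins} bins within interval for all delta/R <= 0.5: {ok}")
```

Under these assumptions, the eight-bin case stays inside the interval over the whole range of Δ/R up to 0.5, whereas a two-bin distribution does not.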
In this section, we provide two examples to illustrate how the information provided in this article can be used. The first example discusses whether the roving range in an experiment using the dual-pair paradigm was large enough to warrant ruling out the possibility that the measured discrimination thresholds were based on an unwanted cue. The second example illustrates how using Green’s (1988) formula (which was explicitly derived for the 2I-2AFC paradigm) when designing a same-different experiment can result in substantial over-estimation of the roving range needed.
The first example comes from a recent study of pitch perception by Semal and Demany (2006). In this study, the authors used both the 4IAX and the AB-versus-BA versions of the dual-pair paradigm in order to measure thresholds for the detection of frequency changes (4IAX), and thresholds for the identification of the direction of frequency changes (AB-versus-BA) between pure tones, in the same listeners. One of the experiments in this study sought to test the hypothesis that listeners’ performance in these two tasks was based on level changes at the output of a single auditory channel (for the details of this explanation, see pp. 3910–3911 in Semal & Demany, 2006; see also: Emmerich et al., 1989; Henning, 1966; Moore & Glasberg, 1989). The level of each tone was roved over a 10 dB range (± 5 dB) around the nominal level (65 dB SPL). The authors reasoned that such roving would lead to an increase in thresholds if listeners’ performance was based on changes in level at the output of a single auditory channel.
The question, which we ask here, is: Was the 10-dB roving range used by Semal and Demany sufficient to ensure that listeners could not reliably achieve 75% of correct responses—the percent-correct level targeted by the adaptive threshold-tracking procedure—in the pitch-change detection and pitch-change direction-identification tasks, based on level changes at the output of an auditory channel? Taking into account both the nominal level of the tones (65 dB SPL), and the mean thresholds that were measured without roving the level in this experiment (about 29 cents, slightly less than 2%), the average size of excitation-level differences at the output of auditory filters in Semal and Demany’s (2006) experiment can be estimated between 2 and 3 dB on average.4 To be on the safe side, we set Δ = 3 dB. First, we consider the change-detection task, which corresponds to the 4IAX dual-pair paradigm. The table in Appendix B indicates that, for this paradigm, the value of Pcunwanted corresponding to Δ/R = 0.3 (i.e., 3 dB/10 dB) is 60%. This is well below the targeted level of 75%. Therefore, for the dual-pair change-detection (4IAX) task, we can confidently rule out the possibility that listeners’ performance was based on loudness cues alone.
Next, we consider the direction-identification task, which corresponds to the dual-pair AB-versus-BA paradigm. The table in Appendix C reveals that, for this paradigm, the same value of Δ/R = 0.3 yields a substantially higher Pcunwanted: 74%, which is practically indistinguishable from the targeted percent-correct level of 75%. Since thresholds were measured using an adaptive procedure that visited different points (both below and above the targeted proportion-correct of 75%) on the psychometric function, one cannot rule out the possibility that level cues had some influence on the threshold measurements, even with roving.
This outcome illustrates the important point that, when the same roving range is used in different experiments, which involve superficially similar stimulus designs but different underlying psychophysical paradigms, the predicted influence of roving on the proportion of correct responses can be substantially different across experiments.
Green’s (1988) formula, which is reproduced in the Introduction of the current article, is frequently used by auditory psychophysicists to select appropriate roving ranges in their experiments. However, as mentioned above, this formula was designed specifically with the 2I-2AFC paradigm in mind. If the formula is applied in the context of experiments that use a different paradigm, it can lead to the selection of unnecessarily large roving ranges. For example, consider an experimenter who is designing a spectral-shape discrimination experiment with a same-different task. To prevent listeners from performing the task reliably on the basis of simple loudness cues, the experimenter will rove the overall level of each complex. Suppose that the experimenter determines using Green’s (1988) formula that a roving range of 30 dB is required to limit Pcunwanted to 55% correct at most. The information in Figure 1 and the table in Appendix A reveal that for the same-different paradigm, in fact, a roving range of merely 5 dB is sufficient, in principle, to limit Pcunwanted to 55% correct. Therefore, in this example, using Green’s (1988) formula would lead the experimenter to use a roving range about six times as large as that needed to achieve the objective. Outcomes such as this one should matter to experimenters. Unless the sensory dimensions involved are completely independent perceptually, random variations along the unwanted dimension might have detrimental effects on the processing of the primary cue, even if the unwanted cue does not provide any useful information for task performance (Ashby & Townsend, 1986; Garner, 1974). Therefore, experimenters should avoid using unnecessarily large roving ranges, while ensuring that performance based on the unwanted cue is safely below the level targeted in an experiment. The figures and tables in this article should help them achieve this objective.
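Selecting a roving range from Green’s (1988) formula means solving it for Δ/R given a target Pcunwanted. The sketch below is a worked inversion under stated assumptions, not a procedure given in the article: solving 0.5 + Δ/R − 0.5 (Δ/R)^2 = P for the root between 0 and 1 gives Δ/R = 1 − sqrt(2 (1 − P)), and the function names are hypothetical. It applies to the 2I-2AFC paradigm only; for the same-different, dual-pair, and oddity paradigms the tables in the appendices should be used instead.

```python
import math


def ratio_for_target_2afc(target_pc):
    """Largest delta/R that keeps the 2I-2AFC Pc at or below target_pc."""
    if not 0.5 < target_pc <= 1.0:
        raise ValueError("target_pc must be above 0.5 and at most 1")
    return 1.0 - math.sqrt(2.0 * (1.0 - target_pc))


def required_range_2afc(delta, target_pc):
    """Smallest roving range R (same units as delta) for a 2I-2AFC task."""
    return delta / ratio_for_target_2afc(target_pc)


if __name__ == "__main__":
    # Roving range, in units of the unwanted-cue size, for a 55% target.
    print(f"R / delta at 55%: {required_range_2afc(1.0, 0.55):.1f}")
    # Unwanted-cue size implied by a 30-dB range at a 55% target.
    print(f"delta for R = 30 dB: {30.0 * ratio_for_target_2afc(0.55):.2f} dB")
```

For a 55% target the formula calls for a roving range of roughly 19.5 times the unwanted-cue size, so the 30-dB range in the example above corresponds to an unwanted cue of about 1.5 dB.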
Stimulus randomization, or “roving”, is a technique commonly used to limit the use of unwanted cues by participants in psychophysical experiments. A practical question for experimenters who use the technique is how large the roving range should be. Previous publications have provided equations for selecting adequate roving ranges in the mAFC and yes-no paradigms (Dai & Kidd, 2009; Green, 1988). In the present article, these analyses were extended to several other psychophysical paradigms, including the same-different paradigm, two versions of the dual-pair paradigm (4IAX and AB-versus-BA), as well as the 3- and 4-interval oddity paradigms. Use of the information given in this article is subject to the same limitations as applications based on Green’s (1988) or Dai and Kidd’s (2009) formulas. In particular, it requires a valid estimate, or measure, of the unwanted-cue size. If the size of the unwanted cue is under-estimated, the minimum roving range required to limit proportion-correct based on the unwanted cue to a predefined level will be under-estimated as well. However, to the extent that the size of the unwanted cue can be measured, or correctly estimated, the predictions described in this article provide an upper bound on the performance that can be achieved based on the unwanted cue.
This work was supported by National Institutes of Health—National Institute on Deafness and Other Communication Disorders Grant R01 DC 05216, as well as by the University of Arizona. The authors are grateful to Dr. D. Creelman, Dr. S. Grondin, and two anonymous reviewers for constructive comments, which greatly helped improve the manuscript.
1For instance, in a frequency discrimination task, the experimenter can try to equalize loudness. Precise loudness equalization of tones that differ in frequency by a variable amount can be very difficult to achieve in practice, due to irregularities and individual differences in equal-loudness contours (Mauermann, Long, & Kollmeier, 2004)—especially at low sound levels, or in hearing-impaired listeners (see: McDermott, Lech, Kornblum, & Irvine, 1998; Thai-Van, Micheyl, Moore, & Collet, 2003).
2Numerous studies have demonstrated that random variation along an irrelevant stimulus dimension can adversely affect performance in various perceptual tasks if the irrelevant and relevant dimensions are not independent or “separable” (e.g., Ashby & Townsend, 1986; Garner, 1974). In addition, a few studies have demonstrated detrimental effects of increasing roving range on performance or thresholds in auditory intensity- and frequency-discrimination tasks (Jesteadt & Bilger, 1974), and spectral-shape discrimination (Mason et al., 1984). It is not entirely clear whether, and to what extent, these detrimental effects were due to roving actually limiting listeners’ ability to use unwanted cues, or to the random and irrelevant variations having a “distracting” influence.
3Although Lyzenga and Horst mentioned using a “3AFC” design, they instructed their listeners to select the odd stimulus. This suggests that, from the point of view of the listener, the task was essentially a form of three-interval oddity task.
4This estimate is based on the formulas provided in Glasberg and Moore (1990) for calculating the shapes of auditory filters, as defined by the “rounded exponential” (roexp) function with a p value of 25, which corresponds to normal auditory filters. For such filters, a 2% frequency change on the steepest-slope side yields a change in output level of about 2-3 dB.
Huanping Dai, University of Arizona.
Christophe Micheyl, University of Minnesota.