Preschoolers and adults were asked to detect a 1000-Hz signal, which was masked by a multitone complex. The frequencies and amplitudes of the components in the complex varied randomly and independently on each presentation. A staircase, cued two-interval, forced-choice procedure disguised as a “listening game” was used to obtain signal thresholds in quiet and in the presence of the multitone maskers. The number of components in the masker was fixed within an experimental condition and varied from 2 to 906 across experimental conditions. Thresholds were also measured with a broadband noise masker. Eight preschool children and eight adults were tested. Although individual differences were large, among both adults and children, there was little difference between the groups in the mean amount of masking produced by the maskers with large numbers of components (400 and 906). There was also a small but significant difference between adults and children in the mean amount of masking produced by the broadband noise. The difference between the groups was much larger with smaller numbers of components. Data obtained from the adults were basically similar to those previously reported [cf. Neff and Green, Percept. Psychophys. 41, 409–415 (1987); Oh and Lutfi, J. Acoust. Soc. Am. 104, 3489–3499 (1998)]: maskers composed of 10–40 components produced as much as 30 to 60 dB of masking in some, but not all, listeners. Those same maskers produced larger amounts of masking (70–83 dB) in many of the preschool children, although, as in the adult group, individual differences were large. The component-relative-entropy (CoRE) model [Lutfi, J. Acoust. Soc. Am. 94, 748–758 (1993)] was used to describe the differences in performance between the children and adults. According to this model the average child appears to integrate information over a larger number of auditory filters than the average adult.
A child’s ability to hear out relevant signals from a background of noise is essential in a classroom where the random or chaotic nature of noise can be very distracting. It has been well documented that young children perform poorly compared to adults in such detection tasks (Allen et al., 1989; Allen and Wightman, 1992, 1994, 1995; Elfenbein et al., 1993; Elliott and Katz, 1980; Irwin et al., 1985; Schneider et al., 1986; Trehub et al., 1995). The fact that large within- and between-listener variability is reported, particularly in young children, might suggest that the differences involve not only auditory sensitivity but also more central, attentional aspects of auditory processing (Allen et al., 1989; Jensen and Neff, 1993; Wightman and Allen, 1992).
Developmental changes in auditory detection threshold have been reported for conditions in which a pure tone or complex signal is presented either in quiet or in the presence of maskers such as wideband and narrowband noise. Most previous studies of developmental changes in threshold used maskers (e.g., broadband noise) that did not vary greatly from trial to trial. Allen and Wightman (1995) were among the first to examine the effect of masker uncertainty on signal detectability by preschool children. In their study, the signal to be detected was fixed in frequency. The masker consisted of a broadband noise plus a single pure tone, the frequency of which was randomly chosen (from a set of two frequencies) on each presentation. Since the frequency of the additional tone was far from the frequency of the signal, it seemed appropriate to refer to this tone as a “distracter” rather than a masker. The presence of the uncertain distracter, on average, increased the adults’ thresholds by 11 dB and the children’s thresholds by at least 24 dB. This suggests that when a signal is embedded in an uncertain acoustic background a child’s ability to detect a tone is much more severely impaired than that of an adult.
The effect of stimulus uncertainty on detection has been extensively investigated in adult listeners (cf. Lutfi, 1993; Neff and Green, 1987; Spiegel et al., 1981). A typical task requires the listener to detect a fixed-frequency signal in the presence of other tonal components (i.e., distracters) that vary in frequency and/or amplitude on each presentation. The frequencies of these distracter components are usually remote from the signal frequency. A key independent variable is the number of frequency components comprising the distracter. The usual results are that the total amount of masking depends in a nonmonotonic fashion on the number of distracter components and that there are large individual differences in the amount of masking and in the number of distracter components producing the maximum masking (Neff and Dethlefs, 1995; Oh and Lutfi, 1998). Relatively few adult listeners are only slightly affected by the presence of the random multitone distracters. These “low-threshold” listeners seem to be able to focus attention on the signal and to ignore irrelevant information in the uncertain distracters. For most adult listeners, however, the uncertain tonal background does interfere with signal detection, especially when distracters are comprised of 10–40 components. The total amount of masking is as much as 50–60 dB for these multitone distracters (Neff and Green, 1987; Oh and Lutfi, 1998).
Some of this masking effect is almost certainly due to masker energy falling in close proximity to the signal frequency. This type of masking, called energetic masking, can be estimated from the signal-to-noise ratio at the output of the auditory filter centered at the signal frequency (cf. Patterson, 1976). Any additional masking is referred to as informational masking (Pollack, 1975) and is attributed to the distraction effect of the random frequency masker. Oh and Lutfi (1998) have used the component-relative-entropy (CoRE) model (Lutfi, 1993) to assess the relative contributions of energetic and informational masking and to describe individual differences in terms of two free parameters of the model. Their analysis suggests that the amount of informational masking obtained from adult listeners, which can be as much as 30 dB in so-called “high threshold” listeners, is related to the number of auditory filters over which information is integrated.
In the study reported here, the effect of multitone maskers with uncertain frequency content was investigated in preschool children. The task was to detect a 1000-Hz tone simultaneously presented with a multitone masker comprised of components whose frequencies and amplitudes varied randomly on each presentation. This task was disguised as a “listening game” in order to engage the attention and participation of young listeners. The amount of masking as a function of the number of masker components (the masking function) was estimated for each individual listener so that individual differences as well as age group differences could be quantified. The specific purposes of this study were to quantify each child’s selective listening capability and to examine how children might differ from each other and from adults in uncertain listening conditions. We also attempt to describe individual differences in each age group and differences between adults and children by applying the CoRE model to the data obtained.
The signal was a 1000-Hz tone presented simultaneously with random distracters. Distracters were derived from 100 samples of Gaussian noise, bandpass filtered from 0.1 to 10 kHz. The magnitude and phase spectra of each noise sample were analyzed into individual spectral components with a fast Fourier transform (FFT). On each presentation, one of the noise samples was drawn at random, and a fixed number of its frequency components was selected, also randomly. The phases and amplitudes of the selected frequency components were then used to synthesize the multitone distracter. The number of distracter components (2, 10, 20, 40, 200, 400, or 906) was fixed within a given experimental condition and was varied across different experimental conditions. In a “broadband noise” condition, all available components of each noise sample (approximately 3700) were selected, thus producing bursts of true Gaussian noise. In all conditions (including the broadband noise condition) distracter components within a rectangular band centered at the signal frequency (920–1080 Hz) were excluded to reduce the contribution of energetic masking.
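The distracter synthesis described above can be illustrated with a minimal sketch. It is not the software used in the study: the function name `multitone_distracter`, the use of NumPy, and the choice to draw a fresh noise sample on every call (rather than from a stored set of 100) are assumptions made for illustration. The sampling rate and duration are taken from the stimulus description, the band-limiting is done directly in the frequency domain, and the onset/offset ramps are omitted.

```python
import numpy as np

FS = 44_100   # sampling rate in Hz (from the stimulus description)
DUR = 0.370   # stimulus duration in seconds (from the stimulus description)

def multitone_distracter(n_components, rng, fs=FS, dur=DUR,
                         band=(100.0, 10_000.0), notch=(920.0, 1080.0)):
    """Return (waveform, component_frequencies) for one random distracter.

    n_components=None keeps every eligible component (broadband noise).
    Hypothetical helper: the name and interface are illustrative only.
    """
    n = int(round(fs * dur))
    noise = rng.standard_normal(n)            # one Gaussian noise sample
    spectrum = np.fft.rfft(noise)             # magnitude and phase per bin
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Eligible bins: inside the 0.1-10 kHz band, outside the 920-1080 Hz notch.
    eligible = (freqs >= band[0]) & (freqs <= band[1])
    eligible &= ~((freqs >= notch[0]) & (freqs <= notch[1]))
    idx = np.flatnonzero(eligible)

    # Randomly keep a fixed number of components, without replacement.
    if n_components is not None:
        idx = rng.choice(idx, size=n_components, replace=False)

    kept = np.zeros_like(spectrum)
    kept[idx] = spectrum[idx]                 # reuse the noise's own amplitudes and phases
    wave = np.fft.irfft(kept, n=n)

    # Normalize to unit RMS so the overall level is independent of n_components.
    wave = wave / np.sqrt(np.mean(wave ** 2))
    return wave, np.sort(freqs[idx])

rng = np.random.default_rng(0)
wave, component_freqs = multitone_distracter(10, rng)
```

With a 370-ms sample the bin spacing is about 2.7 Hz, so the 0.1–10 kHz band holds on the order of 3700 bins before the notch is removed, which is in line with the count quoted for the broadband condition.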
Both signal and distracter were gated on and off together with 10-ms, cos² onset/offset ramps for a total duration of 370 ms. The RMS level of the distracter was 60 dB SPL regardless of the number of distracter components. The broadband noise distracter was also presented at 60 dB SPL overall (thus, its spectrum level was approximately 20 dB SPL). The dB levels of the individual distracter components were random, approximately normally distributed (component amplitudes were Rayleigh distributed) with a standard deviation of 5.6 dB. The maximum level of the signal was limited to 84 dB SPL. The signal and the distracter were computer generated and played over a 16-bit digital-to-analog converter (Tucker Davis Technologies DD1) at a sampling rate of 44.1 kHz. All stimuli were presented monaurally, to the listener’s right ear, through Sennheiser model HD-414 headphones. The sound levels produced by the Sennheiser headphones were estimated with a loudness balancing procedure using calibrated TDH-49 headphones and adult listeners.
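Two of the figures quoted above can be checked with a short calculation. The sketch below assumes a flat noise spectrum over the nominal 0.1–10 kHz band (ignoring the 160-Hz notch) and uses the standard result that the power of a Rayleigh-distributed amplitude is exponentially distributed, whose natural logarithm has a standard deviation of π/√6. The function names are illustrative.

```python
import math

def spectrum_level_db(overall_db, bandwidth_hz):
    """Level per 1-Hz band for a flat-spectrum noise of given overall level."""
    return overall_db - 10.0 * math.log10(bandwidth_hz)

def rayleigh_level_sd_db():
    """SD, in dB, of the level of a component with Rayleigh-distributed amplitude.

    Power is exponentially distributed; SD of ln(power) is pi / sqrt(6),
    and 10 / ln(10) converts natural-log units to dB.
    """
    return (10.0 / math.log(10.0)) * math.pi / math.sqrt(6.0)

noise_spectrum_level = spectrum_level_db(60.0, 10_000.0 - 100.0)
component_level_sd = rayleigh_level_sd_db()
print(round(noise_spectrum_level, 1), round(component_level_sd, 2))
```

Under these assumptions the spectrum level comes out at about 20 dB SPL and the component-level standard deviation at about 5.6 dB, matching the values reported.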
A staircase, cued two-interval, forced-choice procedure was used to estimate signal threshold in quiet and in the presence of both multitone and Gaussian noise distracters. Each trial was preceded by a cue, which consisted of the presentation of a bird picture on a computer screen and a simultaneous unmasked signal tone at 60 dB SPL. Two successive stimulus intervals were then presented with a 700-ms silent interval between them. Each stimulus interval was marked (on the computer screen) by a flashing square with the numeral “1” or “2” on it. One of the two intervals contained a distracter sample and the other contained a different distracter sample with the signal added to it. The signal occurred in the first or second interval with equal probability. The child’s task was to select the interval that contained the signal. The instructions were, “Listen to the two sounds presented with the two boxes and point to the box that has the bird sound.” Correct responses were reinforced by presenting a few pieces of a picture puzzle. The child was allowed to choose pictures for the puzzle (cartoon characters, animals, his/her own pictures, etc.). The child’s goal was to complete the puzzle within a block of trials.
The starting signal level was selected so as to make the signal clearly audible to the listener. On each of the next four trials the signal level was decreased in 8 dB steps and then was increased in 8 dB steps back to the starting level. This up–down pattern was continued for a total of 40 trials, producing 5 trials at the highest and lowest levels and 10 trials at each of the three intermediate levels. The signal level was varied by a programmable attenuator (Tucker Davis Technologies PA4). At least three blocks of trials were completed for an experimental condition. If performance levels near 100% and 50% (chance) were not obtained for the highest and lowest signal levels, an additional block of trials was obtained with the starting level adjusted either up or down so that desirable performance levels were observed at both extremes. A logistic function relating percentage of correct responses to signal level was fit to the data from each condition using a maximum likelihood criterion [see Allen and Wightman (1994) for details on the fitting procedure]. The signal level at which performance was expected to be 75% correct was taken to be the threshold value.
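The trial schedule and the threshold estimate can be sketched as follows. The level sequence follows the description above directly. The psychometric function is an assumption: a logistic rising from 50% (chance in a two-interval task) to 100%, so that its midpoint is the 75%-correct point. The exact parameterization used by Allen and Wightman (1994) is not reproduced here, and the coarse grid search stands in for whatever optimizer was actually used. All function names are hypothetical.

```python
import math

def level_schedule(start_db, step_db=8.0, n_steps=4, n_trials=40):
    """Signal level on each trial: down four steps, back up, repeated."""
    down = [start_db - i * step_db for i in range(n_steps + 1)]
    cycle = down[:-1] + down[:0:-1]   # start..second-lowest, then lowest..second-highest
    return [cycle[t % len(cycle)] for t in range(n_trials)]

def p_correct(level, midpoint, spread):
    """Assumed 2AFC logistic: 50% floor, 100% ceiling, 75% at the midpoint."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(level - midpoint) / spread))

def fit_threshold(levels, n_correct, n_total, midpoints, spreads):
    """Grid-search maximum-likelihood fit; returns (threshold_db, spread)."""
    best = None
    for m in midpoints:
        for s in spreads:
            log_lik = 0.0
            for x, k, n in zip(levels, n_correct, n_total):
                # Clamp to avoid log(0) when the model predicts 100% correct.
                p = min(max(p_correct(x, m, s), 1e-9), 1.0 - 1e-9)
                log_lik += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if best is None or log_lik > best[0]:
                best = (log_lik, m, s)
    return best[1], best[2]

# One 40-trial block starting at 60 dB SPL (illustrative starting level).
trial_levels = level_schedule(60.0)

# Illustrative response counts per level (not data from the study).
levels = [60.0, 52.0, 44.0, 36.0, 28.0]
n_total = [5, 10, 10, 10, 5]
n_correct = [5, 10, 8, 6, 3]
grid_m = [20.0 + 0.5 * i for i in range(81)]
grid_s = [1.0 + 0.5 * i for i in range(20)]
threshold_db, spread = fit_threshold(levels, n_correct, n_total, grid_m, grid_s)
```

Because the schedule is fixed in advance rather than adaptive, every block samples the full range from clearly audible to near chance, which is what makes a fit of the whole psychometric function possible.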
Each listener was tested in a double-walled, sound-attenuating chamber. Two experimenters accompanied a child listener. One experimenter set up an appropriate starting level, initiated stimulus presentation when the child was ready for the next trial, and typed in “1” or “2” when the child made a response by touching one of the boxes on the screen or by calling out the number. The other experimenter was present to satisfy local security rules. The experimenters interacted with the child in an attempt to hold his/her interest during the session.
Practice trials were given until the listener appeared familiar with the task. The children completed three or fewer blocks of 40 trials each day, depending on their willingness to continue and time availability. It took the children 8–10 minutes to complete a block of trials and they participated for no longer than 30 minutes on any single day. The children were tested several times until all experimental conditions were completed. Adults were tested in the same conditions as the children and required approximately 5 minutes to complete each block of 40 trials. The adult listeners participated for 1.5 hours on each day. All listeners first completed the condition in which the signal was presented alone (quiet threshold), and then the condition involving the broadband noise masker. Next they completed the experimental conditions in which the signal was presented with the multitone distracters. These latter experimental conditions were presented in random order.
Eight preschool children and eight adults participated in this study. The children were selected from the Waisman Center Early Childhood Program on the basis of their willingness to participate and both parent and teacher consent. Their age at the time of the first session ranged from 4 years 1 month to 5 years 7 months. The children appeared to have enjoyed the listening games, and they were rewarded with a toy or some other item (e.g., a Ty™ Beanie Baby) at the end of each session. The adults were students at the University of Wisconsin who were paid at an hourly rate for their participation, except for SNK (the first author of this article).
The listeners’ audiometrically determined pure tone thresholds were less than 15 dB HL (ANSI, 1989) at octave frequencies from 250 to 4000 Hz. Middle-ear problems, which may increase detection thresholds, are common in young children. Thus, tympanometry was performed on each child prior to each session using a screening tympanometer (Grason–Stadler, GSI-27A Auto-Tymp), calibrated to ANSI specifications (ANSI, 1987). Peak-compensated static acoustic admittance was measured, and the child was permitted to continue only if these results were normal.
Figure 1 shows psychometric functions fitted to the data of individual children and adults where the signal was presented in quiet and in broadband noise. Thresholds for individual children and adults in these conditions are given in Tables I and II along with the means and 95% confidence intervals. Quiet thresholds in both groups seem low, and this is most likely a result of slight errors in the headphone calibration procedure. [There are several features of the headphone calibration procedure that could lead to small errors in absolute SPL levels. First, it involved loudness balancing, a subjective procedure. Second, reference was made to a different headphone (TDH-49) that had been calibrated on a coupler, not on either an adult or child’s ear. At the target frequency of 1000 Hz, we feel these errors would be less than 5 dB.] Since most results from these experiments are reported in terms of differences between thresholds (amount of masking), these calibration issues are inconsequential. In terms of signal-to-noise ratio (sometimes expressed as E/No), the noise-masked thresholds are also lower than previously reported (e.g., Allen and Wightman, 1994), and this is probably a result of the fact that the noise spectrum excluded frequencies in a band around the signal frequency. As Tables I and II show, the children’s quiet thresholds were not significantly different from those of adults, but the children’s noise-masked thresholds were significantly higher (about 7 dB on average).
As reported in previous studies, individual differences in the slopes and thresholds of the psychometric functions were larger for the children than for the adults (Allen and Wightman, 1994; Allen and Wightman, 1995). Some children’s psychometric functions were adultlike. Most of the children produced shallower psychometric functions and higher thresholds. The psychometric functions of two children did not always asymptote at 100%, which implies that they may simply have been inattentive on some proportion of the trials (Schneider and Trehub, 1992; Wightman and Allen, 1992).
Thresholds (and 95% confidence limits) obtained in the presence of multitone distracters are plotted in Figs. 2 (Children) and 3 (Adults). The listener’s age at the time of the first session is indicated in each panel. The filled symbols indicate total masking (dB difference between signal threshold in quiet and in the presence of the distracter) for a fixed number of distracter components. Horizontal lines represent total masking when the signal was presented in broadband noise. The open symbols shown in Fig. 2 for two of the listeners represent data from a retest series of sessions conducted approximately one month after the first series of sessions had been completed. Only two of the children were available for such retests.
The data from both adults and children are characterized by large individual differences. Consider first the data from the children (Fig. 2). Note that the maximum amount of masking varies over at least a 20 dB range (roughly 65–85 dB), and although the maximum occurs with between 10 and 100 distracter components for each listener, there are dramatic differences in the shape of the function relating masking to number of distracter components. It seems clear that even eight listeners cannot adequately represent the range of possible performance in preschool children. Note also that the replications suggest improved performance (smaller amounts of masking) in both cases. This result is consistent with previous observations (Oh and Lutfi, 1998) and probably represents a practice effect. For the adults, the maximum total amount of masking ranges from 30 to 60 dB for those same distracters.
Next consider the results from the adult listeners (Fig. 3). Just as in the group of children, maximum masking varies over at least a 20 dB range and there are large individual differences in the shape of the function relating masking to number of distracter components. Most remarkable in the adult data is the fact that half of the listeners appear to be the kind of “low-threshold” listeners described by Neff and Dethlefs (1995) and by Oh and Lutfi (1998). These listeners do not demonstrate masking with random multitone complexes in excess of that produced by a broadband noise. In Fig. 3, these are the listeners (SSO, SNK, SQW, and SSH) who produced functions that remained at or below the horizontal line. In the previous studies the proportion of “low-threshold” listeners was relatively low, but in this study at least half of the listeners appeared to be in the “low-threshold” category. The reasons for this might relate to the combination of two procedural features in this study that could allow listeners to focus attention on the target stimulus. First, an unmasked cue was presented on every trial. Second, the stimulus level did not change adaptively, hovering near threshold, but instead followed staircase rules that guarantee a large proportion of suprathreshold trials. Of course, given the small number of listeners tested, it is also possible that the results reflect nothing more than sampling variability.
Finally, consider the differences between the data produced by the children and by the adults. The children demonstrated only slightly (about 3 dB on average) greater amounts of masking with the broadband noise (horizontal lines) than the adults. Since masking with a fixed, broadband noise masker is assumed to involve primarily energetic masking, this result suggests some similarity between adults and children in energetic masking. However, in most conditions involving a random, multitone distracter, children produced considerably more masking than the adults. One striking difference can be observed for distracters consisting of only two components. All of the adult listeners showed masking less than that produced by the broadband noise, but all but one of the children show masking more than 20 dB greater than that produced by broadband noise. A similar statement could be made about distracters consisting of 200 components. It is also important to note the potential impact of procedural factors on the results from the children. Children and adults were tested with the same procedure. Thus, it is possible that the cue and the nonadaptive trial structure might have helped both children and adults to focus attention on the target and in this way reduce informational masking. The implication is that without those procedural features both children and adults might have produced more informational masking. Unfortunately the data presented here do not allow resolution of this issue.
Figure 4 summarizes the data. Mean amounts of masking are shown (* symbols) for both adults (dashed line) and children (solid line), along with means from a related study (Oh and Lutfi, 1998, dotted line). Data from each individual listener in the present study are also shown (filled symbols for children and open symbols for adults). This figure highlights three features of the data from this study. First, the large individual differences are obvious. In many conditions the range of masking is more than 30 dB for both adults and children. Although there is little overlap of the data from children and adults, it is clear that conclusions based on mean data are not warranted. Second, the differences between the adult and child data are greatest with distracters consisting of small numbers of components. Third, the adult data from this study are different from the data obtained in previous studies, especially for distracters consisting of small numbers of components.
This study measured the ability of preschool children to detect a fixed frequency tone embedded in a multitone distracter with frequencies that varied randomly on each presentation. The fact that nearly all psychometric functions asymptoted at 100% correct attests to the attentiveness of the children. Nevertheless, large elevations in thresholds with random distracters were observed in all children. The distracters produced 65–83 dB of masking when they were comprised of 10–40 components. With adult listeners the threshold elevations were generally much lower, although individual differences were quite large.
As reported in previous studies, young children’s thresholds are higher than those from adult listeners in most auditory detection and discrimination tasks. The rather surprising result of this study is the magnitude of this difference when a signal is presented with a distracter that varies randomly on each presentation. When a pure tone or complex signal is presented in quiet or in the presence of stationary narrow-band or wideband noise, adult–child detection threshold differences are relatively small, ranging from 4 dB to 10 dB (Allen and Wightman, 1992; Allen et al., 1989; Elliott and Katz, 1980; Jensen and Neff, 1993). When a single, fixed level, random frequency distracter tone is added to a broadband noise, adult–child differences are at least 25 dB (Allen and Wightman, 1995). The results from this study show that adult–child differences can be more than 50 dB when a multitone distracter varies at random on each presentation.
In this section, we examine possible sources for the adult–child differences. Age-dependent changes in anatomy or physiology seem improbable. There is some evidence that the auditory system continues developing after birth. For instance, animal studies suggest that there are systematic changes in the tonotopic organization in the cochlea as a function of age (Lippe and Rubel, 1983). Also, there are age-dependent changes in the anatomy of the ear (e.g., ear canal size) that continue even after age five (Schneider et al., 1986). Nevertheless, these anatomical developments occur early and most auditory structures are adult-like at birth. It appears unlikely, therefore, that the large differences we obtained between preschool children and adults (and the large differences within groups) could derive from structural and functional immaturities in the auditory peripheral system. Also, given the fact that most estimates of the bandwidth of the auditory filter do not seem to change from preschool to adulthood (Hall and Grose, 1991; Schneider et al., 1990), we expect relatively small adult–child differences in energetic masking, those being caused by differences in the “efficiency” parameter of the common filter model of masking. Rather we suspect that the differences between adults and preschool children in performance in this study mainly result from differences in informational masking. In order to quantify these differences, we applied a model that has been used successfully to quantify differences in informational masking in adults.
Results from studies similar to that reported here have been shown to be well predicted by the CoRE model (Lutfi, 1993). In the model, listeners are assumed to adopt a maximum-likelihood decision rule. Informational masking is thought to result from an imperfect implementation of the decision rule, which can be described as a failure to ignore irrelevant information that varies on each presentation. The CoRE model was applied to the mean data in the present study. Figure 5 (top panels show mean data) shows the agreement between the predictions of the model and the total masking averaged over individual listeners in each age group. The total masking predicted by the CoRE model (solid line) is the dB sum of the estimates of energetic (dashed line) and informational masking (dotted line). The process of estimating energetic and informational masking is described in detail in the analysis section of Oh and Lutfi (1998). Briefly, a three-parameter ROEX filter (Patterson et al., 1982) was applied to estimate the amount of energetic masking produced by distracters in a given experimental condition. The same values of filter parameters are used for the children and the adults, since the bandwidth of the auditory filter does not seem to change from preschool to adulthood (Hall and Grose, 1991; Schneider et al., 1990). The differences in the estimates of energetic masking between the adults and children are assumed to arise from differences in the efficiency of the detection process (K). For our purposes here the value of K was chosen such that the signal threshold predicted by the model would converge near the mean threshold obtained from the listeners for the broadband noise condition (cf. Patterson et al., 1982). As shown in this figure, the estimates of energetic masking (dashed line) from the mean data in the two age groups are similar.
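The energetic-masking step can be illustrated with a sketch of the power-spectrum model of masking. It assumes that the three-parameter filter is the roex(p, w, t) form and that the predicted threshold is the efficiency term K plus the level of the distracter power passed by a filter centered on the signal. The parameter values below are placeholders chosen for illustration, not the values used in this study or in Oh and Lutfi (1998), and the function names are hypothetical.

```python
import math

def roex_pwt(g, p, w, t):
    """roex(p, w, t) weight at normalized frequency deviation g = |f - fc| / fc."""
    passband = (1.0 - w) * (1.0 + p * g) * math.exp(-p * g)
    tail = w * (1.0 + t * g) * math.exp(-t * g)
    return passband + tail

def energetic_threshold_db(freqs_hz, levels_db, fc=1000.0,
                           p=25.0, w=0.001, t=5.0, k_db=0.0):
    """Predicted signal threshold: K + 10*log10(filtered distracter power).

    p, w, t and k_db are placeholder values (assumptions), not fitted ones.
    """
    power = 0.0
    for f, level in zip(freqs_hz, levels_db):
        g = abs(f - fc) / fc
        power += roex_pwt(g, p, w, t) * 10.0 ** (level / 10.0)
    return k_db + 10.0 * math.log10(power)

# Two distracter components at equal level, one near and one far from the signal.
near_only = energetic_threshold_db([1100.0], [50.0])
far_only = energetic_threshold_db([3000.0], [50.0])
both = energetic_threshold_db([1100.0, 3000.0], [50.0, 50.0])
```

In this kind of model, a component remote from the signal contributes almost nothing to the filter output, which is why excluding the 920–1080 Hz band keeps the energetic contribution of sparse multitone distracters small.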
The estimates of informational masking (dotted line) depend on two free parameters of the model [see Eq. (8) in Oh and Lutfi (1998)]. They are the width of the “attentional band” (W), and the number of independent auditory filters (n) from which information is integrated within the attentional band. The attentional bandwidth determines the basic shape of the masking function (function relating total masking to number of distracter components). The number of auditory filters serves as a scaling factor on the masking function: larger amounts of informational masking are predicted as n increases. With best-fitting choices for the two parameter values, informational masking estimated by the CoRE model is 30–50 dB in the children and less than 5 dB in the adults. The best fitting parameter values are W = 7 kHz and n = 7 for the children, while they are W = 1 kHz and n = 1 for the adults. Since the basic shape of the masking function is the same for children and adults, the results suggest that children may monitor the outputs of many auditory filters that are irrelevant to the task. The masking functions for half of the adults (low-threshold listeners) show little informational masking. Thus, those results are consistent with the predictions of the traditional auditory filter model wherein masking is determined by the energy at the output of a single auditory filter centered at the signal frequency (see Patterson, 1976). Overall, the agreement between the obtained and the predicted amount of total masking is good in both groups. This suggests that the discrepancy between children and adults derives primarily from differences in attentional factors rather than differences in the functioning of the auditory periphery (e.g., Viemeister and Schlauch, 1992; Wightman and Allen, 1992). The extent to which those attentional factors may have interacted with procedural details (e.g., the presence of the cue and the nonadaptive trial structure) is unclear.
Care must be taken in interpreting the values of W reported here as an estimate of the listener’s true attentional bandwidth. For a number of reasons, large changes in the value of W have a relatively small effect on the predicted shape of the masking function; in effect, serving to shift the peak of the function slightly to the right or left. Because of the relative insensitivity of W as an estimate of listener attention, we refrain from drawing any inferences about child–adult differences in performance based on this parameter.
The model was also applied to the individual data. Fits of the model to the data from two adults and two children are shown in Fig. 5. These specific listeners were selected from each group to represent the most (“worst”) and least (“best”) amounts of informational masking obtained from the group. The model fits reveal that the difference between the children and the adults can be explained primarily by differences in the parameter n, the number of auditory filters over which the listener integrates information within the attentional band. The best fitting values of n and W for individual adult and child listeners are given in Tables I and II. Compared to the adults the children seem to integrate information over a larger number of auditory filters.
The modeling results are encouraging since they allow us to quantify the reduced auditory processing capabilities of children that are often attributed to immaturity of attentional mechanisms or poor attentional control (Allen and Wightman, 1994; Allen and Wightman, 1995; Stellmack et al., 1997; Wightman and Allen, 1992).
(1) A preschool child’s ability to listen selectively to a fixed-frequency signal in the presence of multitone distracters was severely degraded in conditions in which the distracters varied on each presentation. The total masking observed in children was as much as 83 dB for distracters with 10–40 components.
(2) Data from both children and adults were characterized by large individual differences, such that statements about overall adult–child differences in performance in any one condition seem unwarranted.
(3) Both the individual and the group masking functions were well described by the CoRE model with two free parameters, suggesting a means by which the effects of attentional factors in children and adults can be quantified.
The authors would like to thank Dr. Doris Kistler, Sara Conzemius, Jen Junion Dienger, and the teachers at the Waisman Center Early Childhood Program for their contributions to the research. This research was supported by grants from the National Institutes of Health (Grants Nos. R01 HD23333 and R01 CD01262-9).