J Assoc Res Otolaryngol. 2009 December; 10(4): 511–523.
Published online 2009 June 13. doi:  10.1007/s10162-009-0176-9
PMCID: PMC2774409

Otoacoustic Emission Theories and Behavioral Estimates of Human Basilar Membrane Motion Are Mutually Consistent


When two pure tones (or primaries) of slightly different frequencies (f1 and f2) are presented to the ear, new frequency components are generated by nonlinear interaction of the primaries within the cochlea. These new components can be recorded in the ear canal as otoacoustic emissions (OAE). The level of the 2f1−f2 OAE component is known as the distortion product otoacoustic emission (DPOAE) and is regarded as an indicator of the physiological state of the cochlea. The current view is that maximal level DPOAEs occur for primaries that produce equal excitation at the f2 cochlear region, but this notion cannot be directly tested in living humans because it is impossible to record their cochlear responses while monitoring their ear canal DPOAE levels. On the other hand, it has been claimed that the temporal masking curve (TMC) method of inferring human basilar membrane responses allows measurement of the levels of equally effective pure tones at any given cochlear site. The assumptions of this behavioral method, however, lack firm physiological support in humans. Here, the TMC method was applied to test the current notion on the conditions that maximize DPOAE levels in humans. DPOAE and TMC results were mutually consistent for frequencies of 1 and 4 kHz and for levels below around 65 dB sound pressure level. This match supports the current view on the generation of maximal level DPOAEs as well as the assumptions of the behavioral TMC method.

Keywords: cochlear nonlinearity, DPOAE, auditory masking, psychoacoustics, human physiology, basilar membrane


The human ear is not a high-fidelity system. It distorts acoustic signals within the cochlea (Ruggero 1993). The distortions can be perceived as audible sounds (Goldstein 1967) and are emitted from the cochlea back to the ear canal as otoacoustic emissions (Kemp 1978). Indeed, emitted distortions are a sign of a healthy ear: the weaker the emission, the greater the cochlear damage (Dorn et al. 2001; Lonsbury-Martin and Martin 1990). The level of these emissions also depends on the parameters of the sounds used to evoke them. Typically, two pure tones (or primaries) of slightly different frequencies (f1 and f2; f2/f1 ~ 1.2) are used and the level of the 2f1−f2 emitted distortion at the ear canal is regarded as an indicator of the physiological state of the cochlea (Gorga et al. 1997). We will refer to this indicator as the distortion product otoacoustic emission (DPOAE). The sensitivity of this measure is greatest when the primaries have levels that evoke maximal level DPOAEs (Mills and Rubel 1994; Whitehead et al. 1995). We will use the term DPOAE optimal rule to refer to the combination of primary levels that evokes the highest level of DPOAEs. Up to now, efforts have been directed to obtain optimal rules empirically (Kummer et al. 1998) but the form of the optimal rule is still controversial (Johnson et al. 2006; Kummer et al. 2000).

The controversy could be clarified by elucidating the cochlear mechanical conditions that maximize DPOAE levels. The overriding view is that maximal level DPOAEs occur when the primaries produce equal excitation at the cochlear region most sensitive to f2 (Kummer et al. 2000; Neely et al. 2005; Shera and Guinan 2007). Concurrent DPOAE and basilar membrane (BM) recordings have revealed that this view is approximately true for rodents (Rhode 2007), but a confirmation in living humans is not currently feasible because it is not possible to directly record the motion of their BM while monitoring their ear canal DPOAE levels.

On the other hand, it has been claimed that it is possible to infer the levels of two equally effective pure tones at a given cochlear site from behavioral forward masking thresholds. The technique is known as the temporal masking curve (TMC) method and is arguably the most powerful procedure to infer human BM input/output (I/O) curves (Lopez-Poveda et al. 2003; Nelson et al. 2001). The TMC method would seem an appropriate tool to verify the DPOAE generation conjecture of Kummer et al. (2000) in humans. Unfortunately, its assumptions (described below) have been validated only indirectly, using computer models or other psychoacoustical methods, and lack direct physiological support.

A high correlation between DPOAE optimal level rules and corresponding behavioral rules inferred using the TMC method would provide strong support to both the conjecture of Kummer et al. (2000) on the generation of maximal level DPOAEs and the assumptions of the TMC method of inferring human BM responses. The present study aimed at investigating such correlation. It will be shown that a high correspondence exists for frequencies of 1 and 4 kHz and for levels below around 65 dB sound pressure level (SPL).



A TMC is a plot of the levels of a pure tone (masker) required to just mask a brief following tone (probe) as a function of the time gap between the masker and the probe. The probe level is fixed just above the absolute threshold for the probe. The masker level at the masking threshold increases with increasing time gap and is thought to depend on two variables (Nelson et al. 2001). First, it depends on the time gap: the amount of masking decreases as the masker–probe time gap increases (Duifhuis 1973; Moore and Glasberg 1983; Nelson and Freyman 1987). Second, it depends on the relative excitation produced by the masker and the probe at the BM place tuned at or close to the probe frequency (Nelson et al. 2001; Oxenham et al. 1997; Oxenham and Moore 1995; Oxenham and Plack 1997). Because the probe level is fixed at all times, a TMC is assumed to represent the masker levels required to generate a fixed level of excitation after decaying during the masker–probe time gap. This is why the resulting functions are referred to as isoresponse temporal masking curves or TMCs (Nelson et al. 2001).

There is strong evidence that the rate of recovery from forward masking is approximately the same for different masker frequencies over a wide range of masker levels (Wojtczak and Oxenham 2009). Although this evidence is for a probe frequency of 4 kHz, indirect evidence suggests that the same applies to probe frequencies as low as 0.5 kHz (Lopez-Poveda and Alves-Pinto 2008). Therefore, it seems reasonable to assume that for any given masker–probe time gap, two maskers of slightly different frequencies (e.g., f and f/1.2) with levels at their masking thresholds produce identical degrees of excitation at a cochlear site tuned approximately to the probe frequency. This assumption is commonplace when inferring cochlear I/O curves and compression exponents from TMCs (Lopez-Poveda et al. 2003, 2005; Nelson et al. 2001; Plack et al. 2004; Wojtczak and Oxenham 2009).

Based on the above, our approach consisted of measuring two TMCs, both for a probe frequency equal to the DPOAE test frequency (f2) and for masker frequencies equal to the DPOAE primary tones (f1, f2; with f2/f1 = 1.2). We then plotted the resulting levels for the f1 masker (L1) against those for the f2 masker (L2), paired according to masker–probe time gap. Based on the previously explained interpretation of TMCs, the resulting plot should illustrate the combination of levels, L1–L2, for which two pure tones of frequencies f1 and f2 produce approximately comparable degrees of excitation at the f2 cochlear site. If the current DPOAE generation model (as described by Kummer et al. 2000) and the assumptions of the TMC method are both correct, then this behavioral rule should match a DPOAE optimal rule obtained empirically.

All human procedures were approved by the human experimentation ethical committee of the University of Salamanca.


A total of 14 subjects participated in the study. Their ages ranged from 20 to 39 years. Their hearing was audiometrically normal (i.e., absolute hearing thresholds <20 dB HL) at the three test frequencies considered in this study (0.5, 1, and 4 kHz). Table 1 details their behavioral absolute thresholds (in decibel sound pressure level) for pure tones of 0.5, 1, and 4 kHz and durations of 10, 110, and 300 ms.

Thresholds (in decibel sound pressure level) measured with Etymotic ER2 insert earphones for all subjects and for tone durations of 300 ms (absolute threshold), 110 ms (masker threshold), and 10 ms (probe threshold), respectively ...

Behavioral rules

TMCs were measured for probe frequencies (fP) of 0.5, 1, and 4 kHz and for masker frequencies equal to fP and fP/1.2. These masker frequencies were equal to those of the primary tones (f1 and f2, respectively) used to measure DPOAEs (see below). The masker–probe time gaps, defined as the 0-V period from masker offset to probe onset, ranged from 5 to 100 ms in 5-ms steps with an additional gap of 2 ms. The durations of the masker and the probe were 110 and 10 ms, respectively, including 5-ms cosine-squared onset and offset ramps. The probe had no steady-state portion. The level of the probe was fixed at 9 dB above the individual absolute threshold for the probe as shown in Table 1.
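
The stimulus timing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function name `ramped_tone`, the unit peak amplitude, and the use of a sin² rise as the cosine-squared ramp shape are choices made here, and the sampling rate is the approximate value reported for the workstation.

```python
import math

FS = 48_800.0  # Hz; approximate sampling rate reported in the text

def ramped_tone(freq_hz, dur_ms, ramp_ms=5.0, fs=FS):
    """Pure tone (unit peak amplitude) with raised-cosine onset and offset ramps."""
    n = round(dur_ms * 1e-3 * fs)
    n_ramp = round(ramp_ms * 1e-3 * fs)
    samples = []
    for i in range(n):
        k = min(i, n - 1 - i)  # distance (in samples) to the nearer edge
        # Envelope rises as sin^2 (i.e., 1 - cos^2) over the ramp, then stays at 1
        env = math.sin(0.5 * math.pi * k / n_ramp) ** 2 if k < n_ramp else 1.0
        samples.append(env * math.sin(2 * math.pi * freq_hz * i / fs))
    return samples

# Masker-probe gaps: an extra 2-ms gap plus 5 to 100 ms in 5-ms steps
gaps_ms = [2] + list(range(5, 101, 5))

f_probe = 4000.0
masker_on = ramped_tone(f_probe, 110)         # on-frequency masker (f2 = fP)
masker_off = ramped_tone(f_probe / 1.2, 110)  # off-frequency masker (f1 = fP/1.2)
probe = ramped_tone(f_probe, 10)              # two 5-ms ramps, no steady state
```

Because the 10-ms probe consists of two 5-ms ramps, its envelope never reaches a plateau, which matches the statement that the probe had no steady-state portion.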

Stimuli were generated with a Tucker-Davis Technologies Psychoacoustics Workstation (System 3) operating at a sampling rate of 48.8 kHz and with analog to digital conversion resolution of 24 bits. If needed, signals were attenuated with a programmable attenuator (PA-5) before being output through the headphone buffer (HB-7). Stimuli were presented to the listeners through Etymotic ER-2 insert earphones. TMC SPLs were calibrated by coupling the earphones to a sound level meter through a Zwislocki DB-100 coupler. Calibration was performed at 1 kHz only, and the obtained sensitivity was used at all other frequencies because the earphone manufacturer guarantees an approximately flat (±2 dB) response between 200 Hz and 10 kHz.

Masker levels at masking threshold were measured using a two-interval, two-alternative, forced-choice adaptive procedure with feedback. Two sound intervals were presented to the listener in each trial. One of them contained the masker only and the other contained the masker followed by the probe. The interval containing the probe was selected randomly. The subject was asked to indicate the interval containing the probe. The inter-stimulus interval was 500 ms. The initial masker level was set sufficiently low that the listener always could hear both the masker and the probe. The masker level was then changed according to a two-up, one-down adaptive procedure to estimate the 71% point on the psychometric function (Levitt 1971). An initial step size of 6 dB was applied, which was decreased to 2 dB after three reversals. A total of 15 reversals were measured. Threshold was calculated as the mean of the masker levels at the last 12 reversals. A measurement was discarded if the standard deviation (SD) of the last 12 reversals exceeded 6 dB. Three threshold estimates were obtained in this way and their mean was taken as the threshold. If the SD of these three measurements exceeded 6 dB, a fourth threshold estimate was obtained and included in the mean. Measurements were made in a double-wall sound attenuating booth. Listeners were given at least 2 h of training on the TMC task before data collection began.
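
The adaptive track described above can be illustrated with a short simulation. The sketch below is a minimal reading of the procedure, not the software used in the study: the simulated listener, its psychometric function, the starting level, and all function names are hypothetical. The masker level rises after two consecutive correct responses and falls after one incorrect response, the step drops from 6 to 2 dB at the third reversal, and the threshold is the mean of the last 12 of 15 reversals.

```python
import math
import random
import statistics

def simulated_listener(masker_db, true_threshold_db=60.0, slope_db=2.0, rng=random):
    """Hypothetical 2AFC listener: P(correct) falls from 1 toward 0.5 as the masker level rises."""
    p_correct = 0.5 + 0.5 / (1.0 + math.exp((masker_db - true_threshold_db) / slope_db))
    return rng.random() < p_correct

def run_track(respond, start_db=20.0, n_reversals=15, n_used=12, max_sd_db=6.0):
    """Return (threshold, sd); threshold is None when the track fails the SD criterion."""
    level, step = start_db, 6.0
    reversals = []
    n_correct = 0
    last_dir = 0
    while len(reversals) < n_reversals:
        if respond(level):
            n_correct += 1
            if n_correct < 2:
                continue  # two consecutive correct responses are needed to raise the masker
            n_correct, direction = 0, +1
        else:
            n_correct, direction = 0, -1
        if last_dir and direction != last_dir:
            reversals.append(level)  # the track changed direction at this level
            if len(reversals) == 3:
                step = 2.0           # smaller step after three reversals
        last_dir = direction
        level += direction * step
    used = reversals[-n_used:]
    sd = statistics.stdev(used)
    return (statistics.mean(used), sd) if sd <= max_sd_db else (None, sd)

threshold_db, sd_db = run_track(simulated_listener)
```

In the procedure described in the text, three such estimates would then be averaged, with a fourth added when their SD exceeded 6 dB.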

The resulting TMCs were least-squares fitted with Eq. (1) of Lopez-Poveda et al. (2005). Individual behavioral level rules were obtained by plotting the fitted levels for the f1 masker against those for the f2 masker, paired according to the masker–probe time gaps.
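
The pairing step can be sketched as follows. The fitting function itself is not reproduced in this text, so the sketch starts from already fitted levels; the dictionary layout, the function name `behavioral_rule`, and the numerical values are purely illustrative and are not data from the study.

```python
def behavioral_rule(tmc_f1, tmc_f2):
    """Pair masker levels (dB SPL) by masker-probe gap (ms).

    Returns (L2, L1) pairs ordered by gap, using only gaps present in both TMCs.
    """
    shared_gaps = sorted(set(tmc_f1) & set(tmc_f2))
    return [(tmc_f2[gap], tmc_f1[gap]) for gap in shared_gaps]

# Made-up fitted levels, keyed by gap in ms (illustration only)
tmc_f1 = {5: 58.0, 10: 61.0, 20: 66.0, 30: 70.0}  # off-frequency masker (f1)
tmc_f2 = {5: 40.0, 10: 47.0, 20: 60.0, 30: 71.0}  # on-frequency masker (f2)

rule = behavioral_rule(tmc_f1, tmc_f2)
```

Each pair gives one point of the behavioral rule: the L1 and L2 that, under the TMC assumptions, are equally effective at the f2 cochlear site for that gap.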

DPOAE optimal rules

The magnitude (in decibel sound pressure level) of the 2f1−f2 DPOAE was measured for f2 frequencies of 1 and 4 kHz and for a fixed primary frequency ratio of f2/f1 = 1.2. Individual DPOAE optimal rules were obtained by systematically varying the levels of the two primaries (L1 and L2 for f1 and f2, respectively) to find the L1–L2 combinations that produced the highest DPOAE response levels. L2 was varied in 5-dB steps within the range from 35 to 75 dB SPL. For each fixed L2, L1 was varied in 3-dB steps and the individual optimal value (i.e., the level that produced the highest DPOAE level) was noted.
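
The grid search described above amounts to picking, for each L2, the L1 that gives the largest DPOAE level. A minimal sketch follows. The measurement function `toy_dpoae` is an arbitrary stand-in surface rather than a model of real emissions, and the L1 grid limits are an assumption because the text gives only the step size.

```python
def optimal_rule(measure, l2_levels, l1_levels):
    """For each L2, return the L1 that yields the highest DPOAE level."""
    return {l2: max(l1_levels, key=lambda l1: measure(l1, l2)) for l2 in l2_levels}

def toy_dpoae(l1, l2):
    # Toy surface only: it peaks where L1 = 0.5*L2 + 30 (arbitrary, not from the paper)
    return 0.3 * l2 - 0.05 * (l1 - (0.5 * l2 + 30.0)) ** 2

l2_levels = range(35, 80, 5)  # 35 to 75 dB SPL in 5-dB steps
l1_levels = range(30, 82, 3)  # hypothetical 3-dB grid; the L1 range is not stated in the text

rule = optimal_rule(toy_dpoae, l2_levels, l1_levels)
```

With a 3-dB grid the selected L1 can differ from the true optimum by up to 1.5 dB, which is the precision limit of the optimal rule.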

The DPOAE magnitude can vary rapidly by changing the test frequency only slightly (Gaskill and Brown 1990). These variations are most clearly seen in a DP gram (i.e., the graphical representation of the DPOAE magnitude as a function of test frequency f2). They are known as “DPOAE fine structure” and can be as large as 20 dB for an f2 change of 1/32 octave (He and Schmiedt 1993). The fine structure is thought to occur by vector summation of two DPOAE contributions: one that originates at the BM region of maximum overlap between the cochlear excitation patterns evoked by the two primaries (i.e., the f2 region), and one that originates at the cochlear site with characteristic frequency (CF) ~ 2f1−f2, where the first contribution reflects back to the ear canal. The varying phases of these two contributions give rise to constructive and destructive interference, thus to peaks and valleys in the DP gram (Heitmann et al. 1998; Shera and Guinan 1999).

In an attempt to reduce the potential influence of the fine structure on the DPOAE optimal rules, three such rules were obtained for three f2 frequencies close to the frequency of interest and their mean was taken as the actual DPOAE optimal rule. The three test frequencies in question were equal to 0.99f, f, and 1.01f, where f denotes the frequency of interest. For instance, the final DPOAE optimal rule at 4 kHz was the mean of three optimal rules for f2 frequencies of 3,960, 4,000, and 4,040 Hz. This procedure was inspired by earlier studies that showed that a “clean” DP gram (i.e., a DP gram without the influence of the fine structure) resembled very closely a moving average of the original DP gram with fine structure (Kalluri and Shera 2001; Mauermann and Kollmeier 2004).
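
As a worked illustration of this averaging, the sketch below lists the three f2 frequencies around a frequency of interest, derives f1 and the 2f1−f2 distortion frequency for the fixed ratio of 1.2, and averages optimal L1 values point by point. The function names and the example rules are hypothetical; only the frequency scaling and the ratio come from the text.

```python
from statistics import mean

def nearby_f2(f_hz, scales=(0.99, 1.0, 1.01)):
    """f2 frequencies (Hz) around the frequency of interest."""
    return [round(s * f_hz) for s in scales]

def dp_frequencies(f2_hz, ratio=1.2):
    """Return (f1, 2*f1 - f2) in Hz for a primary ratio f2/f1."""
    f1 = f2_hz / ratio
    return f1, 2 * f1 - f2_hz

def mean_rule(rules):
    """Average optimal L1 across rules, at the L2 values present in every rule."""
    shared = set.intersection(*(set(r) for r in rules))
    return {l2: mean(r[l2] for r in rules) for l2 in sorted(shared)}

f2_set = nearby_f2(4000)         # 3960, 4000, and 4040 Hz
f1, f_dp = dp_frequencies(4000)  # about 3333 Hz and 2667 Hz

# Made-up optimal rules (L2 -> optimal L1, dB SPL) for the three f2 values
rules = [{40: 51, 45: 54}, {40: 54, 45: 54}, {40: 48, 45: 57}]
averaged = mean_rule(rules)
```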

DPOAE I/O curves

DPOAE I/O curves were measured for f2 frequencies of 1 and 4 kHz with individual behavioral and DPOAE optimal level rules, as well as with the rule of Kummer et al. (1998) (L1 = 0.4L2 + 39). I/O curves were also measured for an f2 of 500 Hz, but using only individual behavioral rules and the rule of Kummer et al. When the rule of Kummer et al. was applied, L2 ranged from 20 to 75 dB SPL in 5-dB steps, except for f2 = 0.5 kHz for which it ranged from 45 to 75 dB SPL. The primary frequency ratio was always fixed at f2/f1 = 1.2.
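
For reference, the rule of Kummer et al. (1998) quoted above can be tabulated over the L2 range used here. The function name is illustrative; the coefficients are those given in the text.

```python
def kummer_l1(l2_db):
    """L1 (dB SPL) from the rule L1 = 0.4*L2 + 39 quoted in the text."""
    return 0.4 * l2_db + 39.0

# L2 from 20 to 75 dB SPL in 5-dB steps
kummer_levels = {l2: round(kummer_l1(l2), 1) for l2 in range(20, 80, 5)}
```

Under this rule L1 exceeds L2 by 27 dB at L2 = 20 dB SPL, the two levels become equal at L2 = 65 dB SPL, and L1 is below L2 at higher levels.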

To reduce the potential influence of the fine structure on the I/O curves, five such curves were measured for five close f2 frequencies around the frequency of interest, and the resulting I/O curves were averaged (Johannesen and Lopez-Poveda 2008). For instance, the final DPOAE I/O curve at 4 kHz was the mean of five I/O functions for f2 frequencies of 3,920, 3,960, 4,000, 4,040, and 4,080 Hz. Three I/O curves were obtained in this way for the behavioral and Kummer rules per f2 frequency and subject, the mean of which was taken as the “true” I/O curve. For the individual DPOAE optimal rules, only one such I/O curve was measured.

DPOAE measurement procedure

DPOAE measurements were obtained with an Intelligent Hearing System’s Smart device (with SmartOAE software version 4.52) equipped with an Etymotic ER-10D probe. During the measurements, subjects sat comfortably in a double-wall sound attenuating chamber and were asked to remain as steady as possible. When seeking DPOAE optimal rules, a recording session consisted of measuring DPOAE responses for all possible primary level combinations (L1, L2) for one of the three adjacent f2 frequencies considered per frequency of interest (see above). When measuring DPOAE I/O curves, a recording session consisted of measuring I/O curves for the five adjacent frequencies considered for each frequency of interest (see above).

The probe fit was checked before and after each recording session. The probe remained in the subject’s ear throughout the whole measurement session to avoid measurement variance from probe fit. DPOAEs were measured for a preset measurement time, which ranged from 12 s for high L2 to 1 min for low L2. A DPOAE measurement was considered valid when it was 2 SD above the measurement noise floor (defined as the mean level over 10 frequency bins adjacent to the 2f1f2 component in the OAE spectrum). When a response did not meet this criterion, the measurement was repeated and the measurement time was increased if necessary. The probe remained in the same position during these re-measurements. If the required criterion was not met after successive tries, the measurement point was discarded.

DPOAE measurements were regarded as valid only when they were 6 dB above the system’s artifact response. The rationale behind this rather strict criterion and the details of the procedure for controlling for system’s artifacts can be found elsewhere (Johannesen and Lopez-Poveda 2008).
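
The two acceptance criteria can be combined in a small check such as the one below. This is a sketch under assumptions: the text does not say how the SD was computed, so here it is taken as the sample SD of the levels (in dB) in the 10 adjacent bins, and the noise floor is their mean in dB. The function and argument names are hypothetical.

```python
import statistics

def is_valid(dp_db, noise_bins_db, artifact_db, n_sd=2.0, artifact_margin_db=6.0):
    """Accept a DPOAE level that clears both the noise and the artifact criteria."""
    floor_db = statistics.mean(noise_bins_db)  # noise floor: mean level of adjacent bins
    sd_db = statistics.stdev(noise_bins_db)    # assumed definition of the SD
    above_noise = dp_db > floor_db + n_sd * sd_db
    above_artifact = dp_db >= artifact_db + artifact_margin_db
    return above_noise and above_artifact

# Made-up levels (dB SPL) in the 10 bins adjacent to the 2f1-f2 component
noise_bins = [-20, -18, -22, -19, -21, -20, -17, -23, -20, -20]
```

A point failing the noise criterion would be re-measured with a longer averaging time, as described above, before being discarded.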


Temporal masking curves

Figures 1, 2, and 3 illustrate TMCs for probe frequencies (fP) of 0.5, 1, and 4 kHz, respectively. Each panel illustrates the TMCs for one subject (as indicated in the top-left corner of the panel) and for two masker frequencies at f1 = fP/1.2 (filled symbols) and f2 = fP (open symbols).

FIG. 1.
TMCs for all listeners for a probe frequency (fP) of 500 Hz. Each panel shows data for one subject. Open symbols illustrate TMCs for a masker frequency equal to the probe frequency (f2 = fP); filled symbols illustrate TMCs for ...
FIG. 2.
As Figure 1 but for a probe frequency of 1 kHz.
FIG. 3.
As Figure 1 but for a probe frequency of 4 kHz.

The characteristics of the present TMCs were overall consistent with those reported elsewhere for similar stimuli (Lopez-Poveda et al. 2003; Nelson and Schroder 2004; Plack and Drga 2003). In broad terms, these are as follows: TMCs were overall steeper for the on-frequency masker (i.e., the masker whose frequency was equal to the probe frequency) than for the off-frequency masker (i.e., the masker whose frequency was below the probe frequency). This difference in slope is interpreted to reflect the different rates of growth of the corresponding cochlear responses for stimulus frequencies at or below the CF of the cochlear site tuned to the probe frequency, respectively. That is, the steeper portions of the TMCs are interpreted to reflect shallower growths of BM response with increasing masker level, hence greater degrees of BM compression.

At short masker–probe time gaps, higher levels are required for the off- than for the on-frequency masker to mask the fixed level probe. This is consistent with the fact that, at low levels, the level of a pure tone below the CF must be higher than that of an on-CF tone for both tones to produce equal responses at the cochlear site in question. For moderate-to-long gaps, however, the levels of the lower, off-frequency masker overlap or are even lower than those of the on-frequency masker. This is interpreted to reflect that at high levels, below-CF tones produce comparable or more cochlear excitation than on-CF tones, which is consistent with broader tuning at high levels and with the well-reported basalward shift of cochlear excitation with increasing level for CFs above approximately 1 kHz (Robles and Ruggero 2001; Ruggero et al. 1997). Interestingly, in a few instances (e.g., S10 and S15 in Fig. 1, or S11 in Fig. 2) the TMCs for both maskers crossed again at very long gaps, suggesting that the on-frequency masker became more effective than the off-frequency masker again at very high levels. A similar “rebound” effect can be observed in earlier reports (e.g., Fig. 2 of Lopez-Poveda et al. 2003). The explanation of this result is uncertain. It would be consistent with an apicalward shift of cochlear excitation at very high levels following the previously mentioned basalward shift at moderate levels. Direct BM responses suggest that this shift is possible but existing evidence only applies to apical cochlear regions [see the cross symbols (×) in Figs. 2.3 and 2.4 of Cooper (2004)]. Another possibility would be that the rate of decay of the post-cochlear masker effect becomes slower for the on-frequency masker than for the lower, off-frequency one at very high levels. Hence, for long gaps, the required masker level at threshold would be lower for the on- than for the off-frequency masker. To our knowledge, there is no evidence that this is the case. In fact, existing evidence suggests the opposite (Wojtczak and Oxenham 2009). In any case, the “rebound” effect was rare and occurred over a range of masker levels much higher than the maximum primary level for which DPOAEs could be measured reliably (80 dB SPL). Therefore, it had no effect on the conclusions of the present paper.

More detailed interpretations of TMC characteristics are provided elsewhere (Lopez-Poveda et al. 2003; Nelson et al. 2001).

The influence of the fine structure on DPOAE optimal rules

Figure 4 provides several illustrative examples of the influence of the DPOAE fine structure on the individual DPOAE optimal rules of several subjects for frequencies of 1 (left panels) and 4 kHz (right panels). Clearly, the level of f1 (L1) that evoked the maximal DPOAE response for any given level of f2 (L2) changed by as much as 8 dB with a change in f2 of only 1% for some conditions and subjects. The figure shows that the fine structure could have affected the DPOAE optimal rules, albeit only slightly, and thus justifies our approach to use the mean curve for three adjacent frequencies (illustrated with filled circles) as the DPOAE optimal rule.

FIG. 4.
Examples of the influence of the fine structure on individual DPOAE optimal rules at 1 and 4 kHz (left and right panels, respectively). The listener identifier is shown on the top-left corner of each panel. Open symbols illustrate DPOAE optimal ...

Behavioral vs. DPOAE optimal rules

Figures 5 and 6 illustrate plots of the mean levels of the f1 masker at threshold against those of the f2 masker, paired according to the masker–probe time gap. Based on the interpretation of TMCs explained in the “Methods” section, these illustrate level combinations of two equally effective maskers at the cochlear site tuned to the probe frequency (f2). These behavioral rules are compared with individual DPOAE optimal rules (i.e., with primary level combinations, L1–L2, that produced maximal DPOAE levels) for primary tone frequencies equal to the masker frequencies. The match between the two rules varied from subject to subject and across frequencies. It was extremely close at 1 kHz for several subjects (e.g., S1, S2, S4, or S14) over an L2 range typically below 65 dB SPL. The degree of correspondence in the individual data was generally less at 4 kHz, but for several subjects (e.g., S2, S8, or S10) the agreement was reasonably close even at this frequency. The reasons for the lower degree of correspondence above 65 dB SPL will be discussed later.

FIG. 5.
Comparison of behavioral and DPOAE optimal level rules for a test frequency of 1 kHz. Each panel illustrates results for a single subject. Gray squares illustrate behavioral rules based on mean TMCs. Small gray dots illustrate all possible L1 ...
FIG. 6.
As Figure 5 but for a test frequency of 4 kHz.

Individual behavioral rules were based on mean values of at least three independent estimates of L1 and L2, whereas the DPOAE optimal rules were based on a single L1 for every L2 (note that this L1 was the mean of three estimates, each for a slightly different f2 around the frequency of interest; see “Methods”). To test for the statistical significance of the difference between the two individual rules, we simply checked if the single DPOAE optimal rule estimate fell within the range of all possible combinations of L1 and L2 based on the available TMC data. The latter are illustrated as small gray dots in Figures 5 and 6. Except, perhaps, for S7 at 4 kHz, all DPOAE optimal rules always fell within the variability of the behavioral combinations for L2 ≤ 65 dB SPL. Subject S7 was peculiar in that he repeatedly reported that the behavioral task was extremely difficult.

The reason for the variability in the behavioral L1–L2 combinations (gray dots in Figs. 5 and 6) is uncertain. Such variability reflects, by definition, the variability across TMC estimates (Figs. 2 and 3). Consistent with many previous studies (e.g., Lopez-Poveda et al. 2003, 2005; Nelson et al. 2001; Plack et al. 2004), the variability in the present masker levels was noticeably larger over the steeper portions of the TMCs (Figs. 2 and 3). The steeper portion of a TMC is assumed to reflect the range of levels where the masker is subject to greater cochlear compression (Nelson et al. 2001). Therefore, even small changes in cochlear responses or in the listener’s sensitivity across measurement sessions would produce a large change in masker level at threshold and would explain such variability.

Figure 7 illustrates mean rules across listeners. Interestingly, the mean DPOAE optimal rules overlapped with behavioral rules at 1 kHz over the L2 range for which both sets could be measured reliably (35–65 dB SPL). At 4 kHz, the two rules did not overlap but they were within 1 SD of each other and their difference was not statistically significant for L2 ≤ 65 dB SPL (p < 0.05, point-by-point, two-tailed, paired t test). Indeed, the difference was accentuated by the results of a single subject (S7).

FIG. 7.
Average L1–L2 rules at different frequencies and as proposed by the present and earlier studies. A For f2 = 0.5 kHz. B For f2 = 1 kHz. C For f2 = 4 kHz. Circles and squares ...

The dependence of behavioral and DPOAE optimal rules on test frequency

Average behavioral rules (gray squares in Fig. 7) approached equal level for high L2 levels (~75 dB SPL). The difference between L1 and L2 was larger at lower than at higher L2s and increased gradually with increasing frequency. Straight lines were fitted to the data for L2 ≤ 65 dB SPL (thick continuous lines in Fig. 7). As indicated in Table 2, the lines for both behavioral and DPOAE optimal rules had similar slopes at 1 and 4 kHz, and these were shallower than the line fitted to the behavioral rule at 0.5 kHz. Behavioral rules, however, had slopes that were statistically indistinct across frequencies (p > 0.05, two-tailed, equal variance, t test).

Regression parameters for linear relationships, L1 = aL2 + b, based on the present behavioral and DPOAE optimal rules, for an L2 range between 30 and 65 dB SPL
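
A straight-line fit of this kind can be sketched with ordinary least squares restricted to the stated L2 range. The fitting method used for the published parameters is not specified in this text, so ordinary least squares is an assumption, and the function name and example data are illustrative only.

```python
def fit_rule(l2_db, l1_db, l2_min=30.0, l2_max=65.0):
    """Least-squares fit of L1 = a*L2 + b using only points with l2_min <= L2 <= l2_max."""
    pts = [(x, y) for x, y in zip(l2_db, l1_db) if l2_min <= x <= l2_max]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Made-up rule: linear up to 65 dB SPL, then approaching equal level
l2 = [35, 40, 45, 50, 55, 60, 65, 70, 75]
l1 = [47.5, 50.0, 52.5, 55.0, 57.5, 60.0, 62.5, 70.0, 75.0]

a, b = fit_rule(l2, l1)
```

Restricting the fit to L2 ≤ 65 dB SPL keeps the high-level points, where the two rules diverge, from influencing the slope.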

DPOAE I/O curves

The growth of the DPOAE magnitude as a function of L2 was measured for each listener using his/her individual DPOAE-optimal and behavioral rules to obtain individual DPOAE I/O curves. The mean I/O curves are shown in Figure 8. The figure also illustrates mean I/O curves measured with the rule of Kummer et al. (1998), which was identical across subjects and frequencies. For the three f2 frequencies, DPOAEs grew with increasing L2 at rates considerably lower than 1 dB/dB. DPOAE levels measured using DPOAE optimal rules (filled circles) were the highest and were consistently 3–5 dB higher than those measured with the behavioral rules (gray squares); that is, circles and squares run parallel to each other across the L2 level range.

FIG. 8.
Average I/O curves for the 2f1−f2 DPOAE for different primary level rules (as indicated in the inset of A). Different panels illustrate results for a different f2 frequency. A f2 = 0.5 kHz. B f2 = 1 kHz. ...

The DPOAE levels evoked by the rule of Kummer et al. were identical to or lower than those evoked by the behavioral rule at 1 and 4 kHz across levels, except for L2 < 50 dB SPL. The rule of Kummer et al. evoked slightly higher DPOAE levels than the behavioral rule at 500 Hz (Fig. 8A). The difference between the DPOAE levels measured with the optimal and the Kummer rules increased with increasing L2 at 1 and 4 kHz.


The general overlap between behavioral and empirical DPOAE optimal rules at 1 and 4 kHz (Figs. 5, 6, and 7) for L2 below approximately 65 dB SPL supports the view that DPOAE levels are highest when the two primaries produce similar responses at the f2 cochlear region (Kummer et al. 2000).

The present behavioral rules were derived on the assumption that the TMCs for the two maskers (f1 and f2) reflect only differences in the cochlear excitation evoked by the two tones at the cochlear site tuned to the probe frequency (f2 in this case). That is, on the assumption that the post-cochlear interaction between the masker and the probe is linear and identical for the two maskers, for all masker–probe time gaps and masker levels. This assumption is commonplace when inferring cochlear I/O functions from TMCs (Lopez-Poveda et al. 2003; Nelson et al. 2001) and is supported by modeling (Oxenham and Moore 1994) and experimental studies, at least for levels below approximately 83 dB SPL (Lopez-Poveda and Alves-Pinto 2008; Wojtczak and Oxenham 2009). The post-cochlear interaction may be (slightly) different for masker frequencies that are an octave apart at higher masker levels (Lopez-Poveda and Alves-Pinto 2008; Wojtczak and Oxenham 2009), but this is unlikely to undermine the validity of the present approach because the present maskers were closer in frequency and conclusions are claimed to be valid only for L2 ≤ ~65 dB SPL. Furthermore, given that the behavioral and DPOAE optimal rules were inferred using fundamentally different assumptions and methods, the correspondence between the two provides further support for the assumptions of the behavioral TMC method, at least below 65 dB SPL.

The level of the 2f1−f2 DPOAE measured in the ear canal is almost certainly the sum of contributions from several cochlear sources and generation mechanisms (Shaffer et al. 2003). These include distortion generated by nonlinear interaction of the primaries in the f2 cochlear region (Martin et al. 1998), reflection of this distortion at the 2f1−f2 cochlear site (Kalluri and Shera 2001), and nonlinear interaction between f2 and the first harmonic of f1 (2f1) at a more basal (2f1) cochlear site (Fahey et al. 2000). The relative weight of these contributions to the measured DPOAE is uncertain. The present behavioral rules were obtained from TMCs for probe frequencies equal to the DPOAE test frequencies (f2). Based on the current interpretation of TMCs (Lopez-Poveda et al. 2003; Lopez-Poveda and Alves-Pinto 2008; Nelson et al. 2001), the behavioral rules thus reflect L1–L2 combinations for which the two maskers (f1, f2) produce equal responses at a cochlear site with a CF ~ f2. Therefore, the match between behavioral and DPOAE optimal rules for levels below 65 dB SPL (Figs. 5, 6, and 7), together with the evidence that DPOAEs originate at multiple cochlear locations, suggests that the DPOAE contribution from the f2 region is dominant and/or that the DPOAE contributions originating at the other sites are proportional to the contribution generated at the f2 region.

Given the reasonable match between mean behavioral and DPOAE optimal rules (Fig. 7B, C), it is unclear why the DPOAE levels were on average 3–5 dB higher for the DPOAE optimal than for the behavioral rules (Fig. 8B, C). Recall that the DPOAE I/O curves of Figure 8 represent the mean of the curves obtained with individual DPOAE optimal and behavioral rules. One possibility is that mutual suppression between the primaries may have affected the results. One fundamental difference between the individual behavioral and DPOAE optimal rules is that the latter were obtained from DPOAE measurements that required the simultaneous presentation of the two primary tones, while behavioral rules were inferred from single-tone responses. In other words, DPOAE optimal rules implicitly take into account possible nonlinear interactions (e.g., suppression) between the primary tones that are disregarded by the behavioral rules. Concurrent recordings of DPOAE and basilar membrane responses in chinchilla suggest that DPOAE levels are submaximal for L1–L2 combinations that produce equal cochlear responses of simultaneously presented primaries, and this possibly occurs because the primary tone f1 suppresses the response of the basilar membrane to f2 [Fig. 1 of Rhode (2007)]. Given that the behavioral rule reflects an equal response criterion for nonsimultaneous primaries, this might explain why the mean DPOAE levels for the behavioral rules were consistently lower (on average) than those measured with the optimal rules. That said, the same suppression mechanism would have led to behavioral L1 values being consistently lower than DPOAE optimal L1 values and this was not the case (Figs. 5, 6, and 7). Therefore, mutual suppression is unlikely to account for the difference in DPOAE levels evoked by the two rules [see also the discussion of Kummer et al. (2000)].

An alternative, simpler explanation is that individual DPOAE optimal rules were, by definition, those that evoked the highest DPOAE levels (within the 3-dB precision considered for L1, see “Methods”). Therefore, any deviation, however small, of the individual behavioral rules from the individual DPOAE optimal rule (Figs. 5 and 6) would have always produced submaximal DPOAE levels for each subject and this would be reflected in the mean I/O curves (Fig. 8).
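
This averaging argument can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each subject's DPOAE level at a fixed L2 is an inverted parabola of L1 peaking at that subject's optimal L1; the function `dpoae_level`, the curvature, and all subject values are invented and are not taken from the study.

```python
def dpoae_level(l1, l1_opt, peak, curvature=0.5):
    """Hypothetical DPOAE level (dB SPL) versus L1 at a fixed L2:
    an inverted parabola that peaks at the subject's own optimal L1."""
    return peak - curvature * (l1 - l1_opt) ** 2


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# (optimal L1, peak DPOAE level, behavioral L1) per subject; invented values.
# Behavioral L1 deviates from the optimum by +3, -4, 0, and +1 dB.
SUBJECTS = [
    (60.0, 10.0, 63.0),
    (64.0, 8.0, 60.0),
    (58.0, 12.0, 58.0),
    (62.0, 6.0, 63.0),
]

mean_opt_l1 = mean(opt for opt, _, _ in SUBJECTS)
mean_beh_l1 = mean(beh for _, _, beh in SUBJECTS)
mean_opt_level = mean(dpoae_level(opt, opt, peak) for opt, peak, _ in SUBJECTS)
mean_beh_level = mean(dpoae_level(beh, opt, peak) for opt, peak, beh in SUBJECTS)

print(f"mean L1: optimal {mean_opt_l1:.2f}, behavioral {mean_beh_l1:.2f} dB SPL")
print(f"mean DPOAE level: optimal {mean_opt_level:.2f}, "
      f"behavioral {mean_beh_level:.2f} dB SPL")
```

In this toy case the individual deviations cancel in the mean rule, yet each one lowers that subject's DPOAE level, so the mean level for the behavioral rule is lower even though the two mean rules coincide. The size of the deficit depends entirely on the assumed curvature.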

The correspondence between the present behavioral and DPOAE optimal rules tended to be poorer in the individual data (Figs. 5 and 6) for L2 above around 65 dB SPL. In several cases (depicted by open circles in Figs. 5 and 6), the optimal L1 levels were higher than the maximum output of our system and most of these were higher than the corresponding behavioral L1 values. The reason for this result is uncertain, but it may reflect a shift of the DPOAE generation site towards the base of the cochlea with increasing L2. The peak of the cochlear traveling wave has been reported to shift basally with increasing sound level (Robles and Ruggero 2001). Therefore, the region of maximum interaction between the traveling waves evoked by the two primaries is likely to shift from that with CF ~ f2 to a more basal region as L2 increases. This shift is illustrated in Figure 9, where the regions of maximal interaction at low and high L2 levels are denoted x2 and equation M1, respectively. Figure 9 also illustrates how the increase in L1 for a given increase in L2 would be greater if the two primaries were to evoke equal responses at the equation M2 than at the x2 cochlear region. The present behavioral rules are unlikely to have been affected by the shift in question because they were based on TMCs for fixed, low-level probes and thus presumably reflected cochlear responses at a fixed cochlear site with CF ~ f2 at all L2 (Nelson et al. 2001). If the highest DPOAE levels occurred for equally effective primaries near the peak of the f2 traveling waves at each L2 level, then DPOAE optimal L1 values would be higher than behavioral values at high L2 levels (Fig. 9). This might explain why the behavioral L1 values were sometimes lower than the DPOAE optimal L1 values at high L2 levels. Furthermore, if the level-dependent shift in question were gradual, then this would also explain why some of the DPOAE optimal L1–L2 rules appeared generally steeper than their behavioral counterparts over the range of L2 levels where both of them could be measured (Figs. 5 and 6).

FIG. 9.
Schematic cochlear excitation patterns of the two DPOAE primary tones, f1 and f2, for low and high L2 levels (lower and higher figures, respectively). For low L2 levels, the maximum interaction between the two excitation patterns occurs at the x2 cochlear ...

The theory that the main DPOAE generation site shifts basally with increasing L2 might be tested by comparing the degree of correlation between individual optimal level rules with behavioral rules measured with the present method and with the growth-of-masking (GOM) method for a signal frequency equal to f2 and a masker frequency equal to f1 (Nelson et al. 2001; Oxenham and Plack 1997). Based on the current interpretation of GOM functions (Oxenham and Plack 1997; Rosengard et al. 2005), the prediction would be that GOM functions would provide a more accurate behavioral correlate of optimal DPOAE level rules because they implicitly encompass potential level-dependent shifts of cochlear excitation (Lopez-Poveda and Johannesen, in press).

On the controversy about DPOAE optimal primary level rules

Kummer and colleagues have argued that “optimizing the L1 level for any given L2 is not a trivial DPOAE level maximization but rather appropriate for maximizing the sensitivity of DPOAE measurements,” to discriminate between healthy and damaged cochleae [p. 54 of Kummer et al. (2000)]. It is uncertain that DPOAE optimal rules for normal-hearing subjects serve to maximize DPOAE levels of hearing-impaired subjects. It is also unlikely that average DPOAE optimal rules account for individual DPOAE level variations. Nevertheless, average DPOAE optimal rules for normal-hearing subjects may be regarded as the “best-guess” parameters for any given normal-hearing or hearing-impaired individual. Because of this and given its potential clinical implications, much effort has been spent on providing an accurate average DPOAE optimal rule. Unfortunately, different studies disagree in their conclusions. There exists consensus that the optimal L1 should increase with increasing L2 following a linear relationship (L1 = aL2 + b), but some studies have concluded that a and b should be constant across the f2 range from 1 to 8 kHz (Kummer et al. 2000) while others have concluded that they should vary rather systematically with f2 (Johnson et al. 2006; Neely et al. 2005).
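
As a concrete illustration of the linear form L1 = aL2 + b, the sketch below fits the slope a and intercept b to a set of optimal (L2, L1) pairs by ordinary least squares and then applies the fitted rule. The helper names (`fit_linear_rule`, `rule_l1`) and the level pairs are hypothetical; they are not values reported in any of the cited studies.

```python
def fit_linear_rule(l2_levels, l1_levels):
    """Ordinary least-squares fit of L1 = a * L2 + b; returns (a, b)."""
    n = len(l2_levels)
    mean_x = sum(l2_levels) / n
    mean_y = sum(l1_levels) / n
    sxx = sum((x - mean_x) ** 2 for x in l2_levels)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(l2_levels, l1_levels))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def rule_l1(l2, a, b):
    """L1 (dB SPL) prescribed by the linear rule for a given L2 (dB SPL)."""
    return a * l2 + b


# Hypothetical optimal level pairs in dB SPL; invented for illustration.
L2_LEVELS = [20.0, 30.0, 40.0, 50.0, 60.0]
L1_LEVELS = [50.0, 54.0, 61.0, 65.0, 70.0]

a, b = fit_linear_rule(L2_LEVELS, L1_LEVELS)
print(f"L1 = {a:.2f} * L2 + {b:.1f}")
print(f"L1 at L2 = 45 dB SPL: {rule_l1(45.0, a, b):.1f} dB SPL")
```

Fitting the same function separately at each f2 and comparing the resulting a and b would be one simple way to ask whether the rule is constant across frequency or varies with f2.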

Although the present study was not aimed at resolving this controversy, the behavioral and DPOAE data it produced incidentally support the view that optimal rules should be approximately similar at 1 and 4 kHz (Fig. 7B, C and Table 2). The behavioral data suggest, however, that the optimal rule at 500 Hz is likely to be significantly different from those at higher frequencies (Fig. 7 and Table 2). Unfortunately, this could not be corroborated with empirical DPOAE optimal rules at this frequency. That said, the I/O curves of Figure 8 suggest that the rule of Kummer et al. (2000), which was identical across test frequencies, evoked average DPOAE levels that were indistinguishable for most conditions from those evoked by the present behavioral rules, even though the two sets of rules differed, particularly at 500 Hz.

In summary, the agreement between the behavioral and DPOAE optimal rules supports the view that maximal DPOAE levels occur when the two primary tones produce approximately equal responses in the f2 cochlear region as well as the assumptions of the popular TMC method of inferring human BM responses.


We thank Almudena Eustaquio-Martin for technical support, and the associate editor, Barbara Shinn-Cunningham, and three anonymous reviewers for their thoughtful comments on an earlier version of this paper. This work was supported by the Spanish Ministry of Science and Education (refs. BFU2006-07536 and CIT-390000-2005-4) and by The William-Demant Oticon Foundation.


  • Cooper NP. Compression in the peripheral auditory system. In: Bacon SP, Fay RR, Popper AN (eds) Compression: From Cochlea to Cochlear Implants. New York, Springer, pp. 18–61, 2004.

  • Dorn PA, Konrad-Martin D, Neely ST, Keefe DH, Cyr E, Gorga MP. Distortion product otoacoustic emission input/output functions in normal-hearing and hearing-impaired human ears. J. Acoust. Soc. Am. 110:3119–3131, 2001.

  • Duifhuis H. Consequences of peripheral frequency selectivity for nonsimultaneous masking. J. Acoust. Soc. Am. 54:1471–1488, 1973.

  • Fahey PF, Stagner BB, Lonsbury-Martin BL, Martin GK. Nonlinear interactions that could explain distortion product interference response areas. J. Acoust. Soc. Am. 108:1786–1802, 2000.

  • Gaskill SA, Brown AM. The behavior of the acoustic distortion product, 2f1–f2, from the human ear and its relation to auditory sensitivity. J. Acoust. Soc. Am. 88:821–839, 1990.

  • Goldstein JL. Auditory nonlinearity. J. Acoust. Soc. Am. 41:676–689, 1967.

  • Gorga MP, Neely ST, Ohlrich B, Hoover B, Redner J, Peters J. From laboratory to clinic: a large scale study of distortion product otoacoustic emissions in ears with normal hearing and ears with hearing loss. Ear Hear. 18:440–455, 1997.

  • He NJ, Schmiedt RA. Fine structure of the 2f1–f2 acoustic distortion product: changes with primary level. J. Acoust. Soc. Am. 94:2659–2669, 1993.

  • Heitmann J, Waldmann B, Schnitzler H, Plinkert PK, Zenner H. Suppression of distortion product otoacoustic emissions (DPOAE) near 2f1–f2 removes DP-gram fine structure—evidence for a secondary generator. J. Acoust. Soc. Am. 103:1527–1531, 1998.

  • Johannesen PT, Lopez-Poveda EA. Cochlear nonlinearity in normal-hearing subjects as inferred psychophysically and from distortion-product otoacoustic emissions. J. Acoust. Soc. Am. 124:2149–2163, 2008.

  • Johnson TA, Neely ST, Garner CA, Gorga MP. Influence of primary-level and primary-frequency ratios on human distortion product otoacoustic emissions. J. Acoust. Soc. Am. 119:418–428, 2006.

  • Kalluri R, Shera CA. Distortion-product source unmixing: a test of the two-mechanism model for DPOAE generation. J. Acoust. Soc. Am. 109:622–637, 2001.

  • Kemp DT. Stimulated acoustic emissions from within the human auditory system. J. Acoust. Soc. Am. 64:1386–1391, 1978.

  • Kummer P, Janssen T, Arnold W. The level and growth behavior of the 2f1–f2 distortion product otoacoustic emission and its relationship to auditory sensitivity in normal hearing and cochlear hearing loss. J. Acoust. Soc. Am. 103:3431–3444, 1998.

  • Kummer P, Janssen T, Hulin P, Arnold W. Optimal L(1)–L(2) primary tone level separation remains independent of test frequency in humans. Hear Res. 146:47–56, 2000.

  • Levitt H. Transformed up-down methods in psychoacoustics. J. Acoust. Soc. Am. 49(Suppl 2), 1971.

  • Lonsbury-Martin BL, Martin GK. The clinical utility of distortion-product otoacoustic emissions. Ear Hear. 11:144–154, 1990.

  • Lopez-Poveda EA, Alves-Pinto A. A variant temporal-masking-curve method for inferring peripheral auditory compression. J. Acoust. Soc. Am. 123:1544–1554, 2008.

  • Lopez-Poveda EA, Johannesen PT. Otoacoustic emission theories can be tested with behavioral methods. In: Lopez-Poveda EA, Palmer AR, Meddis R (eds) Advances in Auditory Research: Physiology, Psychophysics, and Models. New York, Springer, in press.

  • Lopez-Poveda EA, Plack CJ, Meddis R. Cochlear nonlinearity between 500 and 8000 Hz in listeners with normal hearing. J. Acoust. Soc. Am. 113:951–960, 2003.

  • Lopez-Poveda EA, Plack CJ, Meddis R, Blanco JL. Cochlear compression in listeners with moderate sensorineural hearing loss. Hear Res. 205:172–183, 2005.

  • Martin GK, Jassir D, Stagner BB, Whitehead ML, Lonsbury-Martin BL. Locus of generation for the 2f1–f2 vs 2f2–f1 distortion-product otoacoustic emissions in normal-hearing humans revealed by suppression tuning, onset latencies, and amplitude correlations. J. Acoust. Soc. Am. 103:1957–1971, 1998.

  • Mauermann M, Kollmeier B. Distortion product otoacoustic emission (DPOAE) input/output functions and the influence of the second DPOAE source. J. Acoust. Soc. Am. 116:2199–2212, 2004.

  • Mills DM, Rubel EW. Variation of distortion product otoacoustic emissions with furosemide injection. Hear Res. 77:183–199, 1994.

  • Moore BC, Glasberg BR. Growth of forward masking for sinusoidal and noise maskers as a function of signal delay; implications for suppression in noise. J. Acoust. Soc. Am. 73:1249–1259, 1983.

  • Neely ST, Johnson TA, Gorga MP. Distortion-product otoacoustic emission measured with continuously varying stimulus level. J. Acoust. Soc. Am. 117:1248–1259, 2005.

  • Nelson DA, Freyman RL. Temporal resolution in sensorineural hearing-impaired listeners. J. Acoust. Soc. Am. 81:709–720, 1987.

  • Nelson DA, Schroder AC. Peripheral compression as a function of stimulus level and frequency region in normal-hearing listeners. J. Acoust. Soc. Am. 115:2221–2233, 2004.

  • Nelson DA, Schroder AC, Wojtczak M. A new procedure for measuring peripheral compression in normal-hearing and hearing-impaired listeners. J. Acoust. Soc. Am. 110:2045–2064, 2001.

  • Oxenham AJ, Moore BC. Modeling the additivity of nonsimultaneous masking. Hear Res. 80:105–118, 1994.

  • Oxenham AJ, Moore BC. Additivity of masking in normally hearing and hearing-impaired subjects. J. Acoust. Soc. Am. 98:1921–1934, 1995.

  • Oxenham AJ, Moore BC, Vickers DA. Short-term temporal integration: evidence for the influence of peripheral compression. J. Acoust. Soc. Am. 101:3676–3687, 1997.

  • Oxenham AJ, Plack CJ. A behavioral measure of basilar-membrane nonlinearity in listeners with normal and impaired hearing. J. Acoust. Soc. Am. 101:3666–3675, 1997.

  • Plack CJ, Drga V. Psychophysical evidence for auditory compression at low characteristic frequencies. J. Acoust. Soc. Am. 113:1574–1586, 2003.

  • Plack CJ, Drga V, Lopez-Poveda EA. Inferred basilar-membrane response functions for listeners with mild to moderate sensorineural hearing loss. J. Acoust. Soc. Am. 115:1684–1695, 2004.

  • Rhode WS. Distortion product otoacoustic emissions and basilar membrane vibration in the 6–9 kHz region of sensitive chinchilla cochleae. J. Acoust. Soc. Am. 122:2725–2737, 2007.

  • Robles L, Ruggero MA. Mechanics of the mammalian cochlea. Physiol. Rev. 81:1305–1352, 2001.

  • Rosengard PS, Oxenham AJ, Braida LD. Comparing different estimates of cochlear compression in listeners with normal and impaired hearing. J. Acoust. Soc. Am. 117:3028–3041, 2005.

  • Ruggero MA. Distortion in those good vibrations. Curr. Biol. 3:755–758, 1993.

  • Ruggero MA, Rich NC, Recio A, Narayan SS, Robles L. Basilar-membrane responses to tones at the base of the chinchilla cochlea. J. Acoust. Soc. Am. 101:2151–2163, 1997.

  • Shaffer LA, Withnell RH, Dhar S, Lilly DJ, Goodman SS, Harmon KM. Sources and mechanisms of DPOAE generation: implications for the prediction of auditory sensitivity. Ear Hear. 24:367–379, 2003.

  • Shera CA, Guinan JJ, Jr. Evoked otoacoustic emissions arise by two fundamentally different mechanisms: a taxonomy for mammalian OAEs. J. Acoust. Soc. Am. 105:782–798, 1999.

  • Shera CA, Guinan JJ, Jr. Cochlear traveling-wave amplification, suppression, and beamforming probed using noninvasive calibration of intracochlear distortion sources. J. Acoust. Soc. Am. 121:1003–1016, 2007.

  • Whitehead ML, Stagner BB, McCoy MJ, Lonsbury-Martin BL, Martin GK. Dependence of distortion-product otoacoustic emissions on primary levels in normal and impaired ears. II. Asymmetry in L1,L2 space. J. Acoust. Soc. Am. 97:2359–2377, 1995.

  • Wojtczak M, Oxenham AJ. Pitfalls in behavioral estimates of basilar-membrane compression in humans. J. Acoust. Soc. Am. 125:270–281, 2009.

Articles from JARO: Journal of the Association for Research in Otolaryngology are provided here courtesy of Association for Research in Otolaryngology