Although it is largely agreed that phonological processing deficits are a major cause of poor reading, the neural origins of phonological processing are not well understood. We now show, for the first time, that phonological decoding, measured with a test of single-nonword reading, is significantly correlated with the timing of subcortical auditory processing and also, to a lesser extent, with the robustness of subcortical representation of the harmonic content of speech, but not with pitch encoding. The relationships we observe between reading and subcortical processing fall along a continuum, with poor readers at one end and good readers at the other. These data suggest that reading skill may depend on the integrity of subcortical auditory mechanisms and are consistent with the idea that subcortical representation of the acoustic features of speech may play a role in normal reading as well as in the development of reading disorders. These data establish a significant link between subcortical auditory function and reading, thereby contributing to the understanding of the biological bases of reading. At a more general level, these findings are among the first to establish a direct relationship between subcortical sensory function and a specific cognitive skill (reading). We argue that this relationship between cortical and subcortical function could be shaped during development by the corticofugal pathway and that this cortical–subcortical link could contribute to the phonological processing deficits experienced by poor readers.
Although the neural basis of reading is still poorly understood, there is now agreement that the development of fluent reading relies on adequate phonological processing and that phonological processing deficits are a major cause of poor reading (Vellutino et al. 2004; Shaywitz et al. 2008). Indeed, the large majority of individuals with reading disability (dyslexia) often exhibit difficulties on an array of tasks measuring phonological skill, such as decomposing words into their constituent syllables and phonemes, deciding whether a pair of words rhymes, repeating a list of digits or made-up words, or quickly retrieving information from long-term memory. The causal role phonological processing plays in reading has been shown both developmentally (Bradley and Bryant 1983; Lyytinen et al. 2004; Torppa et al. 2006) and in intervention studies (Bradley and Bryant 1983; Torgesen et al. 2001; Moore et al. 2005). Adequate phonological skills require the explicit manipulation of speech sounds and therefore adequate representation of these sounds in the brain, as well as adequate online access to the representations during task performance. Whether the difficulty in manipulating speech sounds stems from abnormal neural encoding of some (speech) sounds in the auditory pathway resulting in impoverished phonological representations, as proposed by Tallal et al. (1993), from difficulty in efficiently using those representations (Ahissar 2007; Ramus and Szenkovits 2008), or whether a combination of these possibilities characterizes subgroups of poor readers is still unknown.
In typically developing children and adults, the scalp-recorded auditory brain stem response (ABR), presumably generated in the inferior colliculus and other brain stem nuclei, reflects the acoustic characteristics of speech with remarkable fidelity (Galbraith et al. 1995, 2000; Krishnan 2002; Galbraith et al. 2004; Russo et al. 2004; Kraus and Nicol 2005; Akhoun et al. 2008; Basu et al. 2009). When recorded in response to a consonant–vowel syllable, the timing of the speech ABR provides information about the onset, periodicity, and offset of the stimulus. Analysis of the spectral content of the response includes the fundamental frequency that conveys the pitch of the signal (prosodic cues conveying the affective intent of the message [e.g., question vs. statement] and speaker identification) as well as its harmonics, which are shaped by the articulators producing the speech formants (i.e., information about the message or verbal meaning of the utterance). Recent studies indicate that several aspects of the speech ABR are sensitive to language (Krishnan et al. 2005, 2008) and musical experience (Musacchia et al. 2007; Wong et al. 2007; Kraus et al. 2009; Strait et al. 2009; Lee et al. 2009), as well as to short-term training (Russo et al. 2004; Song, Skoe, et al. 2008), putatively via influences of the corticofugal auditory system (Perrot et al. 2006; Winer 2006; Suga 2008). Therefore, the speech ABR seems well suited to provide objective physiological information about speech encoding in the auditory pathway, particularly in populations with known deficits at perceptual and cognitive levels.
Indeed, the high prevalence of subcortical neural encoding deficits in the learning-impaired population (King et al. 2002; Wible et al. 2004; Banai et al. 2005; Johnson et al. 2007) leads us to hypothesize here that some aspects of phonological processing are related to subcortical encoding of sound. The aforementioned studies demonstrate that a sizeable subgroup of all individuals diagnosed with learning disability, mainly those exhibiting poor phonological abilities and below-average reading, show abnormal timing of their ABRs to speech sounds. The same pattern has not been found for click sounds (Song et al. 2006). In addition to abnormal timing, encoding in the range of speech formant frequencies also appears abnormal in some children with learning problems (Cunningham et al. 2002; Wible et al. 2004; Johnson et al. 2007). On the other hand, brain stem processing of pitch (i.e., fundamental frequency) in children with learning problems seems normal (Wible et al. 2004; Johnson et al. 2007). Finally, it should be noted that abnormal brain stem timing is rarely observed among average and above-average readers (Banai et al. 2005).
In the learning-impaired population, abnormal brain stem responses to speech sounds are thought to be part of a more general central auditory disorder involving interactive relationships between cortical and subcortical activity. In particular, the brain stem response to speech is related to 3 cortical measures that are known to be sensitive to the presence of reading problems. The timing of the response is related to cortical discrimination of fine acoustic differences (Banai et al. 2005), cortical representation of speech in background noise (Wible et al. 2005), and the degree of leftward cortical asymmetry to speech (Abrams et al. 2006).
Based on those previous findings, we now hypothesize that direct relationships will be observed between features of the speech ABR that are sensitive to learning problems and measures of literacy and phonological processing. Specifically, we hypothesize that reading is selectively related to subcortical encoding of timing and harmonic information, but not to encoding of pitch (see operational definitions in Materials and Methods). To test these hypotheses, we administered a battery of reading and phonological tests and measured the speech ABR in a group of children with a wide range of reading skills. The data we report are in line with the hypothesis that subcortical auditory processing may play a role in phonological processing and in reading.
Sixty-three children, 28 females, aged 7–15 years (M = 9.8, standard deviation [SD] = 1.6) participated in the study. Twenty of the children underwent a thorough audiological evaluation that included a full audiogram (see Results for a group comparison between good and poor readers). All children passed a hearing screening that required normal click-evoked brain stem responses, indicating that their auditory function at levels peripheral to the brain stem was normal. Sixty-two children had IQ scores higher than 80 as measured by either the Wechsler Abbreviated Scale of Intelligence (WASI, n = 58) or the Test of Nonverbal Intelligence III (n = 4). One child (with reading and spelling scores within the normal range) was not tested on either IQ test. Twenty-five children had an external diagnosis of a learning impairment; however, due to the controversy surrounding the diagnosis of learning disabilities (Fletcher et al. 1992; Shaywitz et al. 1995), and given that our focus here is on actual reading skill rather than on reading disability diagnosis, data analysis was based on the psychoeducational assessment described below rather than on formal diagnosis. Data were collected as part of distinct studies, so while the general procedure was the same for all children, not all had the full battery of psychoeducational assessments. N values for the separate tests are given in Table 1. All procedures were approved by the Internal Review Board of Northwestern University. Participants signed informed consents and assents with a parent/guardian present and were monetarily compensated for their time.
Phonological processing was assessed with the Comprehensive Test of Phonological Processing (Wagner et al. 1999). Subtests included elision (participants are required to create a new word by omitting a syllable or a phoneme from a given word presented aurally), blending words (participants are asked to blend a set of syllables to create a word), rapid letter naming, rapid number naming (participants are asked to read aloud a list of letters/digits presented in an array as fast and as accurately as they can), digit repetition, and nonword repetition (participants are required to repeat a list of digits or increasingly longer nonsense words). Three cluster scores, phonological awareness, phonological memory, and rapid naming, were derived from the subtests.
Measures of literacy included single-word reading (Wide Range Achievement Test, third edition [Wilkinson 1993] [WRAT-3], or Woodcock–Johnson, third edition [Woodcock et al. 2001] [WJ-III]), spelling (WRAT-3 or WJ-III), and single-nonword reading (word attack subtest of [WJ-III] or WJ, revised [Woodcock and Johnson 1989–1990]). Performance on single-word reading may be influenced by visual memory for the words, whereas nonword reading relies solely on phonological decoding, as phonology is the only cue available to the sound of these unfamiliar stimuli. Although different literacy tests, or different revisions of the same tests, were administered, standardized scores are highly correlated between the tests (Salvia et al. 2007), indicating that they measure the same underlying skill.
The stimulus was a 40-ms synthesized /da/ produced in KLATT (Klatt 1980) with a fundamental frequency (F0) that linearly rose from 103 to 125 Hz with voicing beginning at 5 ms and an onset noise burst during the first 10 ms. The first formant (F1) rose from 220 to 720 Hz while the second and third formants (F2 and F3) decreased from 1700 to 1240 Hz and 2580 to 2500 Hz, respectively, over the duration of the stimulus. The fourth and fifth formants (F4 and F5) were constant at 3600 and 4500 Hz, respectively. The stimulus comprised an initial noise burst and formant transition between the consonant and a steady-state vowel. Although the utterance was short and there was no steady-state vowel, the stimulus was voiced and was perceived as a consonant–vowel syllable.
Responses were recorded with the Bio-logic Navigator Pro System (Natus Medical Inc., Mundelein, IL). Alternating-polarity stimuli were presented monaurally to the right ear at a rate of 10.9 Hz through insert earphones at 80.3 dB sound pressure level (SPL), while subjects were watching a video of their choice with the soundtrack of the video presented in free field at 40 dB SPL. A vertical montage of 3 Ag–AgCl electrodes was used to record neurophysiological responses (central vertex [Cz] active, forehead ground, and ipsilateral earlobe reference). Online artifact rejection was employed with a criterion of ±23 μV. Three blocks of 2000 artifact-free sweeps were collected for each participant and averaged using a 74.67-ms time window that included a 15.8-ms prestimulus period. The responses were online band-pass-filtered from 100 to 2000 Hz (12 dB/octave) and digitally sampled at 6857 Hz.
Data analysis followed published reports using similar stimulus and recording parameters (Russo et al. 2004; Banai et al. 2005; Abrams et al. 2006; Johnson et al. 2008). All data analysis was automated using routines coded in Matlab 7 (The MathWorks, Inc., Natick, MA). The characteristic 7 peaks of the response to /da/ were manually identified and confirmed by a second experienced observer. Peaks that were deemed not replicable or not reliably above the noise floor (12 peaks out of a total of 441) were marked as missing data points and were excluded from analyses. Likewise, peaks delayed beyond 2.5 SDs of the mean (a total of 7 data points coming from the data of 4 children) were dropped to control for the influence of outlying data points.
The onset burst of the stimulus contains broad frequency information and elicited waves V and A. Peak C was thought to encode the transition from the aperiodic stop burst to the periodic (voiced), formant transition, and peak O corresponded to the cessation of the stimulus. The frequency-following response (FFR) to the voiced portion of the syllable included peaks D, E, and F, which occurred at the period of the F0. Higher frequency information, including formant structure, was encoded in the smaller voltage fluctuations between the 3 principal FFR waves. Based on these characteristics of the response, 3 dimensions were defined for further analysis—timing, harmonics, and pitch. Timing was defined as the latency of each peak and reflected the temporal precision of the synchronous neural activity with respect to the onset, periodicity, and offset of the stimulus. Pitch was defined for the purpose of the current study as the neural information that reflected the fundamental frequency of the stimulus. Although other aspects of speech are certainly important for the perception of pitch, we focused here on the fundamental frequency that has major contributions to the percept (Cruttenden 1997). Harmonics were defined as the neural activity that arose to the harmonics of the fundamental. The formant structure of the signal, determined by the filtering of the harmonics by the articulators, gives identity to the speech signal, independent of pitch. Thus, we think of the response to the harmonics as a metric of the processing of the verbal message.
To obtain measures of timing, the local minima (maximum, in the case of wave V) within 2 sampling points (±2; corresponding to ±0.29 ms) of the visually identified peak were chosen by the automated peak-picking routine. For wave V, a narrower range was used (+2) to avoid the accidental identification of wave IV. Pitch and harmonic encoding were analyzed using a 4096-point Fourier analysis over the 21.9- to 40.6-ms portion of the response. Average spectral amplitude was calculated for 4 frequency ranges—F0: 103–120 Hz, low harmonics: 180–410 Hz; middle harmonics: 410–755 Hz; and high harmonics: 755–1130 Hz. The F0 range encompasses the stimulus’ F0. Together the low and middle harmonics encapsulate the stimulus F1 range. The high harmonics range begins just above F1 and extends up to the maximal frequency that can be seen in the response—the effective limit of phase locking in the brain stem. In our analysis of the speech ABR, the F1 range was broken into 2 response regions representing the most prominent frequency peaks in the F1 range of the /da/ syllable (410–755 Hz) evoking the FFR and the less prominent frequencies (180–410 Hz) (see Johnson et al. 2005). Furthermore, a previous study (Johnson et al. 2007) showed that only the spectral amplitude of the middle harmonic range was associated with the presence of learning disability. The second formant was beyond the phase-locking capabilities of the brain stem response (Liu et al. 2006), and F2–F5 frequency ranges were, therefore, not included in the analysis.
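The peak-picking window and the band-averaged spectral amplitudes described above can be illustrated with a minimal sketch. It is not the authors' Matlab routine: the function names are hypothetical, no tapering window is applied, the amplitude scaling is arbitrary, and the "+2" window for wave V is read here as "the identified sample and the 2 samples after it". Only the sampling rate, transform length, and frequency ranges are taken from the text.

```python
import cmath
import math

FS = 6857.0    # sampling rate in Hz, from the recording parameters
N_FFT = 4096   # transform length; the segment is implicitly zero-padded

# Frequency ranges (Hz) as listed in the text.
BANDS = {
    "F0": (103.0, 120.0),
    "low": (180.0, 410.0),
    "middle": (410.0, 755.0),
    "high": (755.0, 1130.0),
}

def refine_peak(wave, idx, before=2, after=2, find_max=False):
    """Return the index of the extremum in [idx - before, idx + after].

    Troughs use the minimum; wave V would use find_max=True and,
    under the assumption stated above, before=0.
    """
    lo = max(0, idx - before)
    hi = min(len(wave), idx + after + 1)
    pick = max if find_max else min
    return pick(range(lo, hi), key=lambda i: wave[i])

def band_amplitude(segment, band, fs=FS, n_fft=N_FFT):
    """Mean DFT magnitude over the bins of an n_fft-point transform in band."""
    f_lo, f_hi = band
    k_lo = math.ceil(f_lo * n_fft / fs)
    k_hi = math.floor(f_hi * n_fft / fs)
    mags = []
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n_fft
        coeff = sum(x * cmath.exp(w * n) for n, x in enumerate(segment))
        mags.append(abs(coeff) / len(segment))  # arbitrary scaling
    return sum(mags) / len(mags)

# Synthetic demo, not real data: a 110 Hz tone lasting 128 samples,
# roughly the length of the 21.9- to 40.6-ms analysis window.
tone = [math.sin(2 * math.pi * 110.0 * n / FS) for n in range(128)]
amps = {name: band_amplitude(tone, rng) for name, rng in BANDS.items()}
```

At 6857 Hz one sample lasts about 0.146 ms, so a ±2-sample window corresponds to the ±0.29 ms stated in the text.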
In order to determine an overall relationship between reading and phonological processing and auditory brain stem function, 2 sets of analyses were conducted. First, Pearson's correlations were calculated on each measure in the data set. Second, the data set was broken into terciles based on word attack scores. Children without word attack scores were not included in the analysis. Group comparisons (t-tests, effect sizes) were made between the top third (good readers; range: 113–134, mean = 121.3, n = 19) and the bottom third (poor readers; range: 64–100, mean = 89.7, n = 19). All statistical analyses were conducted in SPSS (SPSS Inc., Chicago, IL).
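The tercile split can be sketched as follows. This is an illustration only (the analyses were run in SPSS); the function name and the demo scores are hypothetical, and how ties at the tercile boundaries were handled is not stated in the text, so the sketch simply ranks the scores and cuts at equal thirds.

```python
def tercile_groups(scores):
    """Split participant ids into bottom, middle, and top thirds by score.

    scores maps a participant id to a word attack standard score;
    participants without a score (None) are left out, as in the text.
    """
    ranked = sorted(
        (pid for pid, s in scores.items() if s is not None),
        key=lambda pid: scores[pid],
    )
    third = len(ranked) // 3
    poor = ranked[:third]
    middle = ranked[third:len(ranked) - third]
    good = ranked[len(ranked) - third:]
    return poor, middle, good

# Hypothetical scores, for illustration only.
demo = {"p1": 64, "p2": 121, "p3": 100, "p4": None,
        "p5": 113, "p6": 134, "p7": 95}
poor, middle, good = tercile_groups(demo)
```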
In order to test the hypothesis that reading and phonological awareness are related to subcortical timing and harmonic encoding, but not to pitch encoding, we measured reading and reading-related skills as well as the brain stem response to the speech sound /da/ in a group of children with a broad range of reading skills. Timing was defined as the latencies of the 7 prominent response peaks (denoted V, A, C, D, E, F, and O) taken from the waveform of each individual. Pitch and harmonic encoding were defined as the spectral amplitudes of response frequency ranges corresponding to the fundamental frequency (F0) of the /da/ syllable and its harmonics (low, middle, and high) extracted using a Fourier analysis.
Significant correlations were observed between measures of reading and timing and to a lesser extent between reading and harmonic encoding (see Table 1). On the other hand, reading and phonological measures were not significantly correlated with pitch. For the number of correlations shown in Table 1, ~4 values are expected to be significant at a level of P = 0.05 by chance; yet, we observed 27 correlation values that were significant (bolded values in Table 1). Most of these correlations were with measures of timing and fewer with measures of harmonic encoding.
Reading of single nonwords, measured with word attack, was found to correlate significantly with peak V, A, C, E, and F latencies. Single-word reading, spelling, phonological awareness, and rapid naming, but not phonological memory scores, were also significantly correlated with peak latencies (see Tables 1 and 2 and Fig. 1). All correlations with peak latencies were negative, indicating that earlier responses are associated with better reading and later responses with poorer reading.
Word attack and phonological awareness scores were also significantly correlated with amplitude of middle harmonics (410–755 Hz). In this case, the correlations were positive, indicating that larger spectral amplitude was related to higher reading score, although the correlations were not as high as those between reading and timing. Correlations with amplitude of the low and high harmonics were not significant (see Tables 1 and 2 and Fig. 1).
Correlations between reading measures and F0 amplitude were not significant (see Tables 1 and 2 and Fig. 1). This suggests that encoding of pitch was not as strongly related to measures of reading and phonological processing as were timing and encoding of harmonics in the speech signal. Furthermore, note that although some of the correlation values were marginally significant, the correlation with reading measures was in the opposite direction from that found for harmonics. Therefore, counter to what would have been predicted based on timing and harmonics, larger F0 amplitude was related to poorer reading.
As a way to control for multiple correlations, we calculated a composite timing score for each participant by transforming the latency values of each peak to a Z score and then averaging over the Z scores of all the peaks. We then calculated the Spearman’s correlation between word attack, our representative reading measure, and the composite timing score, as well as harmonic and pitch encoding (Table 2). The outcomes of these analyses support the outcomes of the primary analyses. Furthermore, as also shown in Table 2, the pattern of correlations between the subcortical response and phonological decoding was not sensitive to the effects of age and IQ when these were included together in a partial correlation analysis. Therefore, it is unlikely that this relationship is derived from general maturational processes or general cognitive capacities. Likewise, peripheral auditory function measured by click ABR was not a factor.
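The composite timing score can be sketched as below. The function name and the demo latencies are hypothetical, and averaging over whichever peaks are available for a participant is an assumption, since the text does not say how missing peaks entered the composite.

```python
import statistics

ALL_PEAKS = ("V", "A", "C", "D", "E", "F", "O")

def composite_timing(latencies, peaks=ALL_PEAKS):
    """Average per-peak Z scores for each participant.

    latencies maps participant id -> {peak name: latency in ms, or None
    when the peak was marked as missing}.
    """
    stats = {}
    for peak in peaks:
        vals = [row[peak] for row in latencies.values()
                if row.get(peak) is not None]
        # Sample mean and SD across participants for this peak.
        stats[peak] = (statistics.mean(vals), statistics.stdev(vals))
    composite = {}
    for pid, row in latencies.items():
        z = [(row[p] - stats[p][0]) / stats[p][1]
             for p in peaks if row.get(p) is not None]
        composite[pid] = sum(z) / len(z)
    return composite

# Hypothetical latencies (ms) for two peaks, for illustration only.
demo = {
    "a": {"V": 6.5, "A": 7.4},
    "b": {"V": 6.7, "A": 7.6},
    "c": {"V": 6.9, "A": None},
}
composite = composite_timing(demo, peaks=("V", "A"))
```

Because every peak is standardized before averaging, later peaks with larger absolute latencies do not dominate the composite; a higher score simply means later responses overall.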
The data set was broken into terciles based on word attack scores. This analysis allowed for the direct comparison of high-performing readers and low-performing readers to further corroborate the relationships found across the entire reading spectrum. Not surprisingly, the 2 groups were significantly different on word attack and also on all other measures of reading and phonological processing (P ≤ 0.04), with the exception of phonological memory, which was only marginally different between the groups (P = 0.066). The 2 groups did not differ significantly in age (good readers: 9.3 ± 1.2, poor readers: 10.0 ± 1.8, P = 0.15). As is often the case, better readers also had higher IQ scores compared with poor readers (performance IQ evaluated with the WASI: 118 ± 12 vs. 97 ± 18, P < 0.001); therefore, IQ scores were included as a covariate in the group comparisons to control for the possibility that group differences are driven mainly by general cognitive rather than by reading-related factors.
The peak latencies of responses in the good and poor reading groups were compared for all 7 response peaks. For all peaks, average peak latencies were shorter in the good readers than in the poor readers (P < 0.005 for peaks V through E, P = 0.077 for peak F, and P < 0.05 for peak O; see Table 3 and Fig. 2). The group difference for peak F just failed to reach significance. The effect sizes (corrected for IQ) of the group differences for peaks V, A, C, D, E, and O and the composite timing measure were large (all > 0.8), whereas the effect size for peak F was moderate. Despite the fact that the differences between groups on the timing measures are very small in absolute terms, the large effect sizes indicate that the overlap between groups for all peaks but F is quite small (e.g., with an effect size of 1.2, 37.8% of the scores are overlapping or, conversely, 62.2% are nonoverlapping).
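The overlap figures quoted for an effect size of 1.2 match Cohen's U1 nonoverlap statistic, which assumes two normal distributions with equal variance. The text does not name the statistic, so reading it as U1 is an inference from the numbers; the sketch below reproduces them.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cohen_u1(d):
    """Cohen's U1: share of the combined area of two equal-variance
    normal distributions, separated by d SDs, that does not overlap."""
    p = normal_cdf(abs(d) / 2.0)
    return (2.0 * p - 1.0) / p

nonoverlap = cohen_u1(1.2)   # about 0.622
overlap = 1.0 - nonoverlap   # about 0.378
```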
The 2 groups differed significantly in spectral amplitude in the middle (410–755 Hz, P < 0.001) and high harmonic ranges (P < 0.05). Effect sizes were high and moderate (respectively), indicating that harmonics of the speech signal were encoded more robustly in the good readers than the poor (see Table 3 and Fig. 3).
Spectral amplitude in the F0 frequency range did not differ between the 2 groups, and the corresponding effect size was small. The lack of group differences in the F0 range suggests that the 2 groups did not differ in their representation of the pitch of the stimulus.
Although a clinically normal brain stem response to click and normal hearing were part of the inclusion criteria for the current study, there is still a slight possibility that the relationships between speech ABR timing and reading could be an outcome of a minimal, undiagnosed hearing loss. To control for this possibility, we conducted 2 analyses. First, we compared the audiograms of good (n = 9) and poor (n = 11) readers for octave frequencies in the range of 250–8000 Hz for the 2 ears using a Mann–Whitney U test. No significant differences were observed between the groups for any frequency in either ear. As a second test, we calculated partial correlations between word attack and the speech ABR measures controlling for the latency of the click-evoked wave V. The partial correlation between reading and timing was still highly significant, indicating that it cannot be accounted for by delayed timing of the click response, which would be delayed in the case of a mild hearing loss or a more peripheral brain stem deficit (see Table 2). It should also be noted that among poor readers with delayed speech ABR timing defined based on peaks V and A, the timing of wave III of the response is known to be normal (Song, Banai, and Kraus 2008).
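A first-order partial correlation of the kind used in the second control analysis can be computed from the three pairwise Pearson correlations. The sketch below is illustrative only (the analyses were run in SPSS); the function names and the data values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y with z partialled out of both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values, for illustration only.
word_attack = [90, 100, 110, 120]        # standard scores
speech_latency = [6.8, 6.9, 6.6, 6.7]    # speech ABR peak latency, ms
click_latency = [5.6, 5.4, 5.4, 5.6]     # click-evoked wave V latency, ms

r_partial = partial_corr(word_attack, speech_latency, click_latency)
```

In this toy data set the click latency is uncorrelated with both other variables, so the partial correlation equals the raw correlation of -0.6. The same logic underlies the conclusion in the text: if partialling out the click response leaves the correlation essentially unchanged, the click response cannot account for it.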
Here we show, for the first time to our knowledge, direct relationships between subcortical auditory processing of speech, reading, and phonological skills. Specifically, we show that poor timing of subcortical auditory encoding and also, to some extent, impoverished representation of signal harmonics are characteristic of children who read poorly and perform below average on tasks of phonological awareness and rapid naming, whereas good readers are characterized by more temporally precise encoding and more robust representation of speech harmonics. The relationships we observe between reading and subcortical processing fall along a continuum, with poor readers at one end and good readers at the other.
The current data demonstrate that reading and phonological processing are related to subcortical auditory encoding. Specifically, we show that when phonological processing is hampered at the cognitive level, sensory encoding of acoustic features that represent phonological information (at a sublexical input level) is also impaired. Furthermore, the pattern of subcortical processing deficits parallels behavior. Although poor readers, unlike individuals with autism spectrum disorders, typically have no difficulty determining the intention of a speaker, they do have difficulty in decoding the verbal message, especially when it is brief or rapidly presented (Bruno et al. 2007). Consistent with this pattern, their subcortical representation of pitch is intact, whereas their representation of timing and harmonics, which correspond to the verbal message, is compromised. This is in contrast to the more pervasive encoding deficits in children with autism, which include pitch encoding (Russo et al. 2008). Though consistent with behavioral findings (Marshall et al. 2008), our conclusion that poor readers have intact linguistic pitch processing at the level of the brain stem is based upon a statistical null result and should therefore be treated with caution, particularly because the pitch trajectory of our stimulus was not ecological (i.e., does not occur in natural language). Given recent findings showing that the brain stem encoding advantage of native Mandarin speakers was highly specific to the use of an ecological (i.e., Mandarin, rather than artificial but similarly perceived) pitch contour, it is conceivable that different patterns would emerge with a natural, not linear, pitch contour (Xu et al. 2006; Chandrasekaran et al. 2007).
Current theories of dyslexia attribute poor reading and poor phonological processing to a difficulty in forming (Richardson et al. 2004; Boada and Pennington 2006) or accessing (Ramus and Szenkovits 2008) phonological representations of speech sounds pertinent to learning the mapping between sounds and letters. To account for the wider array of perceptual and motor symptoms often associated with dyslexia, it has been proposed that the core deficit in dyslexia relates to sluggish or slow attention mechanisms (Hari et al. 2001), poor implicit (Sperling et al. 2004; Vicari et al. 2005) or procedural (Nicolson and Fawcett 2007) learning, poor utilization of the context of recently presented stimuli (Ahissar et al. 2006, 2007), or generally slow neural processing across sensory and motor systems (Tallal et al. 1993; Stein and Walsh 1997). Consistent with those accounts, which assume that phonological processing and reading are mainly cortical processes, we now suggest that sensory processing in the brain stem may also be compromised because years of abnormal (phonological) processing trickle down (via the corticofugal system) to impoverish the neural encoding of sound, resulting in abnormal development of the normal experience-dependent sharpening of brain stem neuron receptive fields as has been observed in primary auditory cortex (Fritz et al. 2007; Schreiner and Winer 2007). Therefore, it appears that among poor readers, the abnormal representation of the acoustic elements of speech, which are critical for phonemic discrimination, would result in impoverished input into higher level areas dedicated to phonological processing and thus contribute to the phonological deficit. 
Although determining whether abnormal subcortical processing of speech at the brain stem is a cause or a consequence of higher level factors (as we suggested above) requires further studies, brain stem responses to nonspeech sounds mature at about 1.5 years of age (Salamy 1984), whereas the brain stem response to speech matures later, possibly in parallel to phonological awareness at the syllable level (Johnson et al. 2008), providing putative support to our perspective. Whether the prolonged developmental trajectory is specific to speech-like stimuli, as would be predicted by a top-down account, or whether prolonged development is observed to all stimuli sharing the spectrotemporal complexity of speech, as would be predicted by the bottom-up account, is a topic for future investigations.
Alternatively and consistent with the notion that reading disabilities arise due to an interaction between multiple risk and protective factors such that a single, confined deficit may not result in severe symptoms (Bishop 2006; Snowling 2008), abnormal subcortical neural encoding would add to phonological weaknesses resulting in more severe symptoms.
Either way, the current data show that just as poor reading represents the lower range of the normal reading continuum (Shaywitz et al. 1992), the relationships between reading and subcortical auditory processing also represent a continuum, with poor readers having delayed timing and good readers having early timing. Furthermore, in combination with previous studies, the present data show that poor reading is often accompanied by physiological deficits across multiple levels of the auditory pathway from the low brain stem (Veuillet et al. 2007) to the auditory cortex (e.g., Kujala et al. 2006; Bishop 2007).
Speech encoding in the subcortical auditory pathway can be disrupted locally, at the level of the response generator (putatively the midbrain), due to abnormal input from more peripheral auditory structures (bottom-up accounts) or due to abnormal modulation from more central ones via descending pathways (a top-down account). Most likely, the phonological deficits observed with dyslexia reflect a combination of bottom-up and top-down processes. Because the present data cannot directly differentiate between bottom-up and top-down processes, we will discuss each in turn. One interpretation of our findings is that a specific disruption at the level of the brain stem leads to abnormal cortical processing of speech sounds, which in turn leads to the development of difficulties in phonological processing and reading. This account is consistent with bottom-up accounts of reading disability such as the fast temporal processing deficit hypothesis proposed by Tallal and her colleagues (Tallal 1980; Tallal et al. 1993). By this account, a low-level deficit in processing brief and rapidly changing stimuli leads to difficulties in the perception of consonants, which are the brief and rapidly changing aspects of speech, and hence to phonological and reading deficits (Tallal 1980). Indeed, several studies (Kraus et al. 1996; Nagarajan et al. 1999; Temple et al. 2000; Gaab et al. 2007) show that the auditory cortex of individuals with dyslexia responds abnormally to both speech and nonspeech acoustic stimuli containing consonant-like temporospectral patterns. Furthermore, developmentally, deficits in those types of processing measured using both perceptual and neural measures were found to distinguish infants at risk of future language learning problems from those who are not and to predict future language outcomes (Benasich and Tallal 2002; Benasich et al. 2006; Choudhury et al. 2007).
Viewed in this light, the current findings suggest a subcortical component to the deficit in spectrotemporal processing.
Several reasons lead us to propose that top-down mechanisms are also operative, namely, that the auditory corticofugal system likely plays an important role in mediating the observed relationships between subcortical neural encoding and reading. First, the corticofugal system has been shown to fine-tune subcortical auditory signal processing in the time and frequency domains (Perrot et al. 2006; Luo et al. 2008; Suga 2008). Indeed, we found that only specific aspects of subcortical auditory processing are associated with reading, in particular those relating to timing and, to a lesser extent, those relating to harmonic encoding. This finding that components of the speech-evoked brain stem response are differentially related to reading is also in line with previous observations that subcortical auditory processing can be broken down into subcomponents, only some of which are compromised in poor readers. One such observation is that the encoding of speech can be disrupted even when the encoding of another stimulus (click) is intact (Song et al. 2006). Another is that in the general population, the developmental time course for speech is prolonged compared with the rapid maturation of the click response (Johnson et al. 2008), and that the brain stem representation of timing and harmonic information can be distinguished from the representation of pitch (as defined here; Russo et al. 2004; Kraus and Nicol 2005). Therefore, a pervasive structural deficit to the generator itself (i.e., brain stem) in poor readers seems unlikely. This finding is consistent with previous work on distinct cortical processing streams for different acoustic aspects (Rauschecker 1998; Romanski et al. 1999; Belin and Zatorre 2000; Rauschecker and Tian 2000; Hickok and Poeppel 2004, 2007). For a discussion of the possible relationships between the cortical streams and subcortical processing, see Kraus and Nicol (2005).
Second, and perhaps more compelling, the subcortical representation of speech is sensitive to lifelong language (Krishnan et al. 2005, 2008; Xu et al. 2006) and music (Musacchia et al. 2007; Wong et al. 2007; Kraus et al. 2009; Strait et al. 2009; Lee et al. 2009) experience, the effects of which are probably mediated through the corticofugal system. Moreover, just as subcortical function varies as a function of reading ability, subcortical enhancement of speech has also been shown to vary as a function of the extent and onset of music experience (Musacchia et al. 2007; Wong et al. 2007; Kraus et al. 2009; Strait et al. 2009; Lee et al. 2009). Because poor reading is often a lifelong impediment, it seems likely that over time, abnormal “interactions” with sounds (Renvall and Hari 2003; Ahissar et al. 2006) or deficient attentional mechanisms (Stevens et al. 2006) would lead, via corticofugal feedback, to abnormal shaping of the sensory processing of some aspects of these sounds. The subcortical deficits we observed in poor readers may, therefore, result from abnormal cortical processes trickling down to subcortical structures via the corticofugal pathway or from suboptimal engagement of corticofugal auditory activity. In the normal system, cortical activation has been found to modulate the latencies and amplitudes of subcortical responses (Luo et al. 2008), suggesting that faulty cortical or corticofugal processes might result in poor readers having delayed response latencies. The involvement of the corticofugal system in auditory processing in humans has recently been demonstrated by Perrot et al. (2006), who showed that stimulation of the auditory cortex resulted in suppressed contralateral cochlear emissions, and its putative role in shaping subcortical auditory processing in both clinical and nonclinical groups is inferred from outcomes of training studies (Veuillet et al. 2007; de Boer and Thornton 2008; Song, Skoe, et al. 2008).
Finally, there is evidence that language training can improve the subcortical encoding of speech in children with language learning problems (King et al. 2002; Russo et al. 2005). Because it seems unlikely that the training procedure used influenced low-level auditory processing (Moore et al. 2005), these findings also provide putative support for the involvement of the corticofugal pathway.
Abnormal neural encoding of speech seems to be one consequence of a more general disorder in the processing of rapidly changing information by the nervous system that is present from birth (for very early differences between infants with and without family risk for language learning and reading impairments, see Leppanen et al. 1999; Guttorm et al. 2001; Benasich et al. 2006; Choudhury et al. 2007). In turn, this abnormal encoding may contribute to abnormal phonological processing and reading (Lyytinen et al. 2004; Guttorm et al. 2005). Indeed, in individuals who read poorly, auditory processing can be compromised at multiple levels of the auditory pathway from the cochlea (Veuillet et al. 2007) to the cortex (Kraus et al. 1996; Nagarajan et al. 1999; Renvall and Hari 2003); yet, the pattern of their speech (and nonspeech) perception deficits does not support any single bottom-up account (Amitay et al. 2002; Banai and Ahissar 2006). Furthermore, previous data from our laboratory show that subcortical and cortical auditory processing are correlated, such that individuals with more robust cortical encoding also show more temporally precise brain stem timing (Banai et al. 2005; Wible et al. 2005; Abrams et al. 2006; Musacchia et al. 2008). In this context, it should be pointed out that bottom-up and top-down influences are not mutually exclusive but instead are likely inextricably linked, feeding into each other to exacerbate the pattern of neural and behavioral deficits associated with reading disability. This notion is supported by recent findings on the relationships between prefrontal activity and language achievements in infants, which are thought to result from recurrent corticothalamic and thalamocortical activity (Benasich et al. 2008).
Taken together, these data establish a significant link between reading (a cortical process) and subcortical auditory function, providing a conceptual advance in our understanding of the biological correlates of literacy. Moreover, the findings are consistent with prevailing theoretical accounts and experimental data on the subject. Determining whether this link represents a cause or a consequence of reading skill requires further developmental and intervention studies. Either way, these findings are among the first to establish a direct relationship between subcortical sensory function and a specific cognitive skill (reading).
National Institutes of Health; the National Institute on Deafness and Other Communication Disorders (RO1 DC01510, F32DC008052); Morasha Program of the Israel Science Foundation.
We wish to thank the members of the Auditory Neuroscience Laboratory, specifically Krista Johnson and Nicole Russo for help with data collection and other aspects of this work, as well as the children who participated in the studies and their families.
Conflict of Interest: None declared.