HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
 
Lang Var Change. Author manuscript; available in PMC 2010 July 1.
Published in final edited form as:
Lang Var Change. 2009 July 1; 21(2): 233–256.
doi:  10.1017/S0954394509990093
PMCID: PMC2790192
NIHMSID: NIHMS124778

Articulation rate across dialect, age, and gender

Abstract

The understanding of sociolinguistic variation is growing rapidly, but basic gaps still remain. Whether some languages or dialects are spoken faster or slower than others constitutes such a gap. Speech tempo is interconnected with social, physical and psychological markings of speech. This study examines regional variation in articulation rate and its manifestations across speaker age, gender and speaking situations (reading vs. free conversation). The results of an experimental investigation show that articulation rate differs significantly between two regional varieties of American English examined here. A group of Northern speakers (from Wisconsin) spoke significantly faster than a group of Southern speakers (from North Carolina). With regard to age and gender, young adults read faster than older adults in both regions; in free speech, only Northern young adults spoke faster than older adults. Effects of gender were smaller and less consistent; men generally spoke slightly faster than women. As the body of work on the sociophonetics of American English continues to grow in scope and depth, we argue that it is important to include fundamental phonetic information as part of our catalog of regional differences and patterns of change in American English.

INTRODUCTION

Humans vary in how they produce speech, and those differences can depend on a number of factors. For instance, speech tempo can be inherently speaker-specific, pertaining to the inherent speed of articulatory movements which define unique speaker characteristics along with other variables such as voice, use of prosody, or pausing. The within-speaker variation in speech tempo is systematically affected by the length of the utterance, discourse complexity, formality, affect, mood and communication style in noisy environments or over a longer distance, to name a few. Yet, in addition to the complex interaction of the within-speaker factors, there are other powerful sources of variation in speech tempo related to social variables, most notably speaker age, gender, geographic region of origin, place of residence, education, occupation and socio-economic status. The latter group of factors determines the between-speaker variation in speech tempo, which has been a topic of fruitful investigation (e.g., Byrd, 1994; Hewlett & Rendall, 1998; Smith, Wasowicz & Preston, 1987).

Researchers have long recognized that variation in speech tempo is a way of marking individual speaker characteristics. Abercrombie (1967:7–9) and Laver & Trudgill (1991: 237ff.) have proposed “a typology of markers of identity in speech,” including three types:

  • Social markers, that are associated with, among other things, regional affiliation, social status;
  • Physical markers, that correlate with age, sex, health status;
  • Psychological markers, that tell us something about “psychological characteristics of personality and affective state.”

This typology emphasizes that there are several quite divergent aspects of speaker identity about which speech tempo may convey information. In contrast, many of the most familiar sociolinguistic variables (negative concord, “g-dropping,” /ai/ monophthongization, /æ/ raising, quotative like, or “uptalk”) are perceived by speakers in terms of relatively focused social meanings, correlating primarily with social status or educational background (negative concord or g-dropping), regional speech (monophthongization or raising), or age (like or uptalk). This complexity alone makes speech tempo an important subject for language variation and change: It is necessarily interconnected with social, physical and psychological markings of speech. However, there is a basic lacuna in our understanding of speech tempo. Even the fundamental patterns of regional variation in how fast people speak have not been securely demonstrated.

This paper presents results of an experimental investigation of speech tempo which focuses primarily on regional variation in American English. In particular, the present experiments examine one pervasive stereotype about American dialects: the notion of “slow-talking Southerners” and “fast-talking Northerners” (Niedzielski & Preston, 1999; www.pbs.org/speak/speech/prejudice/attitudes).1 The examination of Southern and Northern speech is carried out in a controlled setting (reading task) as well as in an uncontrolled condition (free or spontaneous speech). To gain a fuller account of potential differences in speech tempo between the speakers from the South and the North, the study includes two additional between-speaker variables: age and gender. It is of interest whether the speech of “Southerners” (men and women, younger and older) is indeed slower than the speech of “Northerners” under two different levels of formality (reading versus speaking). On the other hand, some of the between-speaker effects are predicted to remain unaffected by regional variation, based on previous reports which are discussed below. This includes the effects of age (speech tempo to be faster for younger than older speakers) and gender (speech tempo to be faster for men than for women). We may also expect slower tempo in reading as compared to speaking. How well the present results conform to these predictions is assessed by measuring the articulation rate for each individual speaker in the study.

In this paper we operationally define speech tempo as the articulation rate. In the phonetics literature, the terms “speaking rate,” “speech rate,” or “articulation rate” tended to be used interchangeably to indicate speech tempo, i.e. the pace at which a stretch of connected discourse is delivered by the speaker. The consensus today is that while speaking/speech rate and articulation rate are both defined as “the number of output units per unit of time” (Tsao, Weismer & Iqbal, 2006:1156), speaking/speech rate includes pause intervals while articulation rate does not. Articulation rate determines the pace at which speech segments are actually produced and does not take into account speaker-specific ways of conveying information, such as hesitations, pausing, emotional expressions, and so on. Speech (or speaking) rate, on the other hand, captures more “global” speaker characteristics including frequency of pausing, use of laugh or fillers such as “you know” or “I mean” which, inserted in a stretch of discourse, cause an interruption of fluency and define a speaker-specific communication style. In this paper, measuring the articulation rate rather than speaking/speech rate will give us a better estimate of cross-dialectal differences in speech tempo as it will eliminate the use of pauses as an additional within-speaker variable. In this regard, the paper follows a more recent approach to study the regional variation in speech tempo in terms of measuring the articulation rate rather than speaking rate (e.g., Quené, 2008; Verhoeven, De Pauw & Kloots, 2004).
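The distinction between the two measures can be illustrated with a minimal sketch. The function name, argument names, and the example numbers are hypothetical and serve only to show how including or excluding pause time changes the result.

```python
def speech_and_articulation_rate(n_syllables, total_duration_s, pause_duration_s):
    """Return (speech rate, articulation rate) in syllables per second.

    Speech rate keeps pause time in the denominator;
    articulation rate removes it.
    """
    if pause_duration_s < 0 or pause_duration_s >= total_duration_s:
        raise ValueError("pause time must be non-negative and shorter than the total")
    speech_rate = n_syllables / total_duration_s
    articulation_rate = n_syllables / (total_duration_s - pause_duration_s)
    return speech_rate, articulation_rate


# Hypothetical stretch of speech: 20 syllables in 5 s, of which 1 s is pausing.
print(speech_and_articulation_rate(20, 5.0, 1.0))  # (4.0, 5.0)
```

With no pauses the two measures coincide, so any gap between them reflects pausing behavior only.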

THE EFFECTS OF SOCIAL AND PHYSICAL MARKERS ON VARIATION IN SPEECH TEMPO

Regional differences

Speech tempo has been shown to vary not only across individuals but to have roots in regional variants of the same language. Byrd (1994) provides the most compelling evidence on regional differences in speech rate (including the pauses) in American English, drawing on the TIMIT database. She finds a slower speech rate for Southerners and a faster rate for those from the North, broadly speaking, while “Army Brats” (who had lived in three or more areas) spoke the fastest. Southern and South Midland speakers produced the most pauses, and North Midland, Western, and “Army Brat” speakers produced fewer pauses than expected under a random distribution, as determined by a chi-square test. As discussed below, however, the TIMIT database has serious limitations which do not always allow for a conclusive assessment of regionally defined differences in speech rate in American English. Ultimately, few studies to date have provided direct evidence that overall speech rate differs regionally.

As an example, Ray & Zahn (1990) investigated speech rate with 93 speakers from Utah, Oregon, and Washington in the Pacific Northwest, Texas and Louisiana in the Southwest, and Wisconsin and Ohio in the Upper Midwest. The speech samples were taken from public speaking and group discussion, both in university settings. Context category (public speaking versus discussion) was the only significant factor in the variation in speaking rate; gender and region were not significant. The division of speakers was done entirely by the university they attended (i.e. without evidence on where they actually grew up). Even aside from that, the regions do not align with current views of American dialect areas. For instance, in Labov’s maps (e.g. Labov, Ash & Boberg, 2006:142, 148), Wisconsin typically includes large areas corresponding to North Central/“North” and Inland North; some other dialect maps also show a small patch of Midlands. In contrast, Ohio is mostly Midlands, with a band of the Inland North, along with various transition areas and small areas belonging to other dialect areas, including Southern (especially along the Ohio River).

Speaker age and gender

As pointed out in a number of studies, young adults tend to speak faster than older adults (e.g. Quené, 2008; Verhoeven et al., 2004). This has also been discussed by Smith, Wasowicz & Preston (1987) and treated more recently by Yuan, Cieri & Liberman (2006, with references to earlier work). Seeking an explanation of such patterns, Ramig (1983) cited physiological factors such as “visual acuity, processing time, general neuromuscular slowing, peripheral degeneration of the speech mechanism, and psychosocial variables” (p. 224) as possible reasons why the physical condition of an elderly person affected their speaking rate. The findings by Quené (2008) indicate that older speakers produce relatively shorter phrases than younger speakers, and the difference in phrase length may explain some of the age-related effects.

With regard to gender, most available evidence suggests that men actually speak somewhat faster than women, as found for example by Byrd (1994). Yuan et al. (2006) report a small but significant difference in the same direction. Outside of North America, Whiteside (1996) examined the characteristics of read speech in three women and three men speakers with a British General Northern accent. Whiteside noted that the small sample limits the conclusions that can be drawn, but results showed that women had longer mean sentence durations, and also had higher standard deviations. They also paused more frequently than the men. Other work on gender and speech rate has been reviewed recently by Heffernan (2007).

Urban and rural speech

"Rate of speaking" in the informal sense is involved in a cluster of other stereotypes, including the view that some languages are spoken faster than others (Roach, 1998) or that urban speech is faster than rural speech. Hewlett & Rendall (1998) examined the claim that urban dwellers speak more quickly than rural dwellers by comparing the speech rates of 24 speakers from the rural Orkney Islands, north of Scotland, and from urban Edinburgh. The speakers read a passage and were interviewed for several minutes about their lives. From these data, both speaking and articulation rates were calculated in syllables per second (counting only turns with ten or more syllables). They found little difference in speaking rate while the speakers were reading, but in conversation mode the Edinburgh group was slower than in their reading rate, although this difference was not significant. Articulation rate also showed no significant differences in reading rate of the two groups but the Orkney group had a significantly faster rate in conversation than the Edinburgh group. Thus, results failed to support the claim that urban speech is faster than rural speech.

Speaking versus reading

Speaking rate and reading rate were directly compared by Crystal & House (1982) who noted that speakers using “less formal production” (p. 706) – that is, informal conversations rather than reading – showed more temporal syllable reductions in their speech, which increased their speaking rates. In another study, Hirose & Kawanami (2002) examined dialogue versus read speech, and focused on the prosodic differences that exist between the two, noting that “dialogue speech generally shows wider dynamic ranges in its prosodic features” (p. 97), such as in tone and rhythm, as well as a higher speech rate than read speech. Howell & Kadi-Hanifi (1991) also examined prosodic differences in reading and speaking. Six speakers spoke spontaneously (describing a room of their choice), and then three months later, after transcribing their response, investigators had those speakers read what they had responded, as well as the response of two other speakers, in order to compare speaking styles. They found that speakers produced a “larger number of short tone units” (p. 166) while reading, indicating more fragmentation and a more formal style of speech. They concluded that this variation in stress placement did not produce significant differences in speaking rate, but did observe that speaking rate “changes with the mode of delivery” (p. 168).

As this brief review indicates, factors such as regional affiliation of the speaker, age, gender and speaking style have been found to produce differences in speech tempo in a systematic way. In this study we are primarily interested in regional differences in articulation rate and their manifestations across speaker age, gender and speaking situations (reading vs. speaking). At present, we only focus on the selected social and physical markers as potential predictors of differences in the articulation rate and do not address the contribution of psychological markers. The questions underlying our investigation center on the strength of a given predictor in making broader generalizations. Specifically, can we legitimately claim that certain regional varieties of American English are spoken faster (or slower) than others? If the differences do exist, are they manifested in both speaking and reading? Do older adults always speak more slowly than younger adults? Do men always speak faster than women?

The present paper has one additional aim. Studies by Crystal & House (1988a, b) and Jacewicz et al. (2006, 2007) indicate that Northern vowels are in fact shorter than Southern vowels. This may suggest that there is a relationship between vowel duration and variation in speech tempo so that Southern vowels reflect the overall slower tempo of the Southern speech whereas the shorter Northern vowels correspond to the faster tempo of Northern speech. However, this correspondence may not hold for all vowels as pointed out by Clopper, Pisoni & de Jong (2005:1665) who found that vowel duration differences are “selective in nature,” with Southern lax vowels significantly longer than those of other Americans. The present study seeks to find a more conclusive account of regional variation in speech tempo which may also be carried on to smaller temporal units such as phrasal or segmental durations.

METHODS

In the present paper, we report on a study that compares articulation rates of Northern speakers born and raised in south central/southeastern Wisconsin with those of Southern speakers born and raised in westernmost North Carolina. Subjects read sentences aloud and engaged in free conversation, providing two very different samples, and yielding clear differences between dialects. While our primary focus is on regional difference in speech tempo, our sample yields data on age — contrasting young adults (20–34 years old) with older adults (51–65 years old) — and gender — contrasting men’s and women’s speech.

Speakers

A total of 94 speakers participated in the study. All were born, raised, and spent most of their lives in either south-central Wisconsin or western North Carolina. The speakers fell into two age groups. There were 40 older adults aged 51–65 years, 20 from Wisconsin and 20 from North Carolina, who were evenly divided by gender (10 men and 10 women) in each region. There were also 54 young adults aged 20–34 years. In this group, 36 speakers were the original participants of the study (18 speakers from Wisconsin and 18 from North Carolina, 9 men and 9 women in each region) and 18 additional speakers (6 from Wisconsin and 12 from North Carolina) were recorded at a later time. All Wisconsin speakers came from the Madison area and areas east, well within the parts of the state defined as Labov et al.’s (2006) “Inland North” dialect, the area characterized by the Northern Cities Shift. The North Carolina speakers came from the Sylva, Cullowhee, and Waynesville areas (Jackson, Swain and Haywood counties). Geographically, these participants yielded a highly homogeneous sample of varieties of Northern and Southern speech, respectively.

The speakers were also comparable in terms of educational background and socioeconomic status. Most of the young adults were students at either University of Wisconsin-Madison or Western Carolina University. The older adults had mostly college education except for two Wisconsin speakers and six North Carolina speakers, three of whom worked as teacher’s assistants and a librarian. As Table 1 and Table 2 summarize, the occupations of the older adults in this study do not reflect a sharp regional difference in terms of socioeconomic status. To a limited extent, Wisconsin speakers are somewhat more metropolitan than North Carolina speakers. Still, these groups do not reflect a clear urban/rural split, something which has been argued to correlate with speech differences in the South (see Tillery & Bailey, 2008, and Thomas, 2008). Although some of the Wisconsin speakers grew up in Madison (which, according to U.S. Census data, reached a population of 203,704 in 2005), many of the younger and older subjects came from small towns such as Beaver Dam, Fond du Lac, Stoughton, West Bend and New Glarus. The small towns in North Carolina where our subjects grew up (Dillsboro, Waynesville, Sylva, Webster, Cullowhee and Whittier) are located close to the Great Smoky Mountains National Park, a popular tourist destination, and these small towns experience a steady flow of visitors during the summer and autumn months. Moreover, the entire region has seen increased in-migration in recent years and the picturesque area continues to attract new residents from other states. Overall then, the subject populations in this study are generally similar in basic demographic terms.

Table 1
Educational and professional background of older adults (51–65 years) from Wisconsin as reported in a background questionnaire. For years of higher education, the questionnaire displayed the following options: college (1–2), college (3–4), ...
Table 2
Educational and professional background of older adults (51–65 years) from North Carolina. See Table 1 legend for details.

Stimulus materials

Two types of recorded speech were obtained: read sentences and spontaneous talks. Each speaker read a set of 240 contextually and prosodically constrained sentences which were constructed to elicit variable emphasis in vowel production, which was a focus of a larger project in our lab. In these sentences, main sentence stress was systematically manipulated as in the examples below (see Appendix 1 for a complete set of the recorded sentence material):

  • JANE knows the small bits are sharp.
    No! JOHN knows the small bits are sharp.
  • John FEELS the small bits are sharp.
    No! John KNOWS the small bits are sharp.
  • John knows the small SCREWS are sharp.
    No! John knows the small BITS are sharp.

The complete sentence material was recorded by 76 speakers (40 older adults and 36 young adults). One advantage of collecting read sentences instead of a longer passage of read discourse was that it led to better control of the stress placement and fluency in reading. That is, the sentence material created a testing condition in which all speakers emphasized a particular word and read the phrase without a pause or hesitation. More within-speaker variability can be expected in reading a longer text which introduces considerable differences in reading style, tempo, repetitions, corrections, hesitations, etc. Since we were interested in articulation rate which is linked to the speed of movement of articulators in a unit of time, fluency in reading was essential to obtaining a representative sample of speech tempo.

The second type of recorded speech consisted of a short informal and unconstrained talk, whose duration ranged from 10 to 15 minutes. Most speakers recounted stories from their lives or spoke about their families, friends, hobbies, and their daily activities. They were instructed to speak at their typical tempo and mode, and that the topic of their talk was not of interest to the research. Rather, they were told that the focus of the study was to examine variation in pronunciation across different regions in the United States. Recordings of a talk were obtained from each of the older speakers. However, not all young speakers who read the sentence material were recorded producing an informal talk. For that reason, 6 new Wisconsin speakers and 12 new North Carolina speakers were brought to the study. These speakers recorded the informal talks only. Altogether, the recordings of the talks were obtained from 40 older speakers and 40 young speakers (20 from Wisconsin and 20 from North Carolina in each age group, evenly divided by gender: 10 men and 10 women) for a total of 80 speakers.

Recording procedure

Recording of sentences was controlled by a custom program written in Matlab. The sentence pairs appeared on a computer monitor in random order. The participant read the sentence pair speaking into a head-mounted microphone (Shure SM10A), placed at a 1-inch distance from the lips. The sentences were recorded directly onto a hard drive at a sampling rate of 44.1 kHz. Only fluently read sentences with a proper stress placement were accepted by the experimenter. The recordings were repeated as many times as needed to obtain satisfactory productions. The spontaneous talks were recorded in the same session using the Adobe Audition speech analysis program. Two female research assistants helped with data collection, one in Wisconsin, and one in North Carolina. All participants in either state were thus recorded by the same experimenter. In general, the spontaneous talks produced by North Carolina speakers were mostly uninterrupted by the experimenter. The speakers clearly enjoyed sharing stories from their lives and mostly did not require prompting. Wisconsin speakers ran out of topics more often and needed a leading question when they stopped talking.

Data analysis

Articulation rate in read sentences

Only 120 sentences from each participant were analyzed for the present study for a total of 9120 sentences. The second sentence in the pair was chosen because initial analyses indicated significant differences in the number of hesitations and pauses between the two sentences. The second sentence in the pair was also produced more fluently than the first by most of the participants. All acoustic waveform analyses were done using the Adobe Audition waveform editing program. The locations of sentence onsets and offsets were determined by hand and these values served as input to a Matlab program which calculated the overall sentence duration automatically, displaying the onset and offset markings in the waveform for the researcher to examine. A reliability check was performed by a second researcher on all measurements using the same Matlab program with graphical display of sentence onsets and offsets. Agreement between these two researchers was essentially 100% as all disagreements in measurements were noted and resolved. The average articulation rate for each sentence was measured in syllables per second, which was calculated by dividing the total number of syllables by the sentence duration. There were seven syllables in each sentence and the syllable count for each sentence was verified by a researcher who performed the reliability check on the whole data set. The word “No” was excluded from the analyses. Because the focus of the study was to obtain fluent productions from each speaker, the pauses between words were very sparse throughout the entire sample. Nevertheless, if found, they were edited out and subtracted from the duration of the sentence.
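The per-sentence measurement can be sketched as follows. This is a minimal illustration rather than the Matlab program used in the study: the function name, its arguments, and the example timings are hypothetical. It takes hand-marked onset and offset times, subtracts any pauses, and returns syllables per second.

```python
def articulation_rate(n_syllables, onset_s, offset_s, pause_durations_s=()):
    """Syllables per second for one sentence, with pause time removed.

    onset_s / offset_s: hand-marked sentence boundaries in seconds.
    pause_durations_s: durations of any within-sentence pauses in seconds.
    """
    net_duration = (offset_s - onset_s) - sum(pause_durations_s)
    if net_duration <= 0:
        raise ValueError("net sentence duration must be positive")
    return n_syllables / net_duration


# Hypothetical seven-syllable sentence spoken from 0.25 s to 2.35 s, no pauses.
print(round(articulation_rate(7, 0.25, 2.35), 2))  # 3.33

# Same sentence length with a hypothetical 0.4 s pause inside a 2.5 s span.
print(round(articulation_rate(7, 0.0, 2.5, [0.4]), 2))  # 3.33
```

Both calls give the same rate because the net articulated time is 2.1 s in each case.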

Articulation rate in spontaneous talks

To calculate articulation rate for spontaneous speech, the talks were first transcribed. Two types of orthographic transcripts were created. In the first transcript, all words and sounds (such as hesitations or laughing) coming from the speaker were written down. The second transcript focused on fluent phrases only and eliminated all non-fluent productions identified in the first transcript. For the present purposes, the phrase was defined as a string of words containing five or more syllables uttered without a pause. These phrases were numbered consecutively for each subject and the articulation rate was calculated for these fluent phrases only. An example of the second transcript is given in Appendix 2. Next, the onset and offset of each phrase were measured using a waveform editor (Adobe Audition) and the articulation rate was calculated in the same manner as for read sentences, i.e. by dividing the number of spoken syllables, as determined by the experimenter, by the duration of each phrase. After listening to each phrase and marking its temporal onsets and offsets, the experimenter counted the number of syllables it contained based on the spoken utterance (and not on its orthographic notation). A total of 4930 phrases were analyzed from all speakers and the number of syllables per phrase varied (mean = 12.7 syllables per phrase, s.d. = 6.9). As for the read sentences, a reliability check was performed by a second researcher using a dedicated custom Matlab program.
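A minimal sketch of the phrase-based procedure is given below. The five-syllable threshold comes from the definition of a fluent phrase above. Everything else is an assumption made for illustration: the function names, the example phrases, and in particular the choice to average per-phrase rates for each speaker, since the text does not say how phrase rates were aggregated.

```python
from statistics import mean

MIN_SYLLABLES = 5  # a fluent phrase contains five or more syllables


def phrase_rates(phrases, min_syllables=MIN_SYLLABLES):
    """Rates (syll/s) for pause-free phrases that meet the length criterion.

    phrases: iterable of (n_syllables, onset_s, offset_s) tuples.
    """
    rates = []
    for n_syllables, onset_s, offset_s in phrases:
        if n_syllables < min_syllables:
            continue  # too short to count as a fluent phrase
        rates.append(n_syllables / (offset_s - onset_s))
    return rates


def speaker_rate(phrases):
    """Assumed aggregation: mean of the per-phrase articulation rates."""
    return mean(phrase_rates(phrases))


# Hypothetical speaker: the three-syllable phrase is excluded.
example = [(12, 0.0, 2.5), (3, 3.0, 3.5), (10, 4.0, 6.0)]
print(phrase_rates(example))  # [4.8, 5.0]
```

An alternative aggregation, total syllables divided by total phrase duration, would weight long phrases more heavily and can give a slightly different speaker mean.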

RESULTS

The overall mean articulation rate in read sentences was 3.40 syll/s (s.d.=0.42) whereas the rate in spontaneous talks was 5.12 syll/s (s.d.=0.59). The reading rate was thus only 66.4% of the rate in talks, a large difference indicating that, in general, articulation rate in reading is much slower than in free speech. Of interest to the study, however, was whether there were significant differences within each production type as a function of speaker dialect, age, and gender. The results were initially assessed by two separate univariate analyses of variance (ANOVA). In the first ANOVA, the dependent variable was articulation rate in reading and the between-subject factors were dialect, age, and gender. In the second ANOVA, the dependent variable was articulation rate in talks and the between-subject factors remained the same. Additional analyses, if necessary, were conducted as detailed below.2

The effects of speaker dialect

In reading, the overall means were 3.54 syll/s (s.d.=0.34) for Wisconsin speakers and 3.27 syll/s (s.d.=0.44) for North Carolina. The main effect of dialect was significant [F(1, 68) = 12.7, p = 0.001, η2 = 0.157], indicating that Wisconsin speakers demonstrated a faster articulation rate in reading as compared to North Carolina speakers. Using the same reading material and the same measurement criteria, the Wisconsin speakers read at a rate 8% faster than North Carolina speakers. For spontaneous talks, the second ANOVA also revealed a significant main effect of dialect [F(1, 72) = 28.8, p < 0.001, η2 = 0.286]. The overall mean articulation rates were 5.41 syll/s (s.d.=0.48) for Wisconsin speakers and 4.81 syll/s (s.d.=0.54) for North Carolina. Thus the articulation rate for Wisconsin speakers was 12.5% faster than for North Carolina speakers. These findings clearly show that the articulation rate, whether in reading or speaking, is faster for the regional variety of American English spoken in Wisconsin as compared to North Carolina.
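The percentage differences quoted above can be checked from the group means with a short sketch. The function name is hypothetical, and expressing the difference relative to the slower group is an assumption, although it is the reading that reproduces the reported 8% and 12.5%.

```python
def percent_faster(faster_rate, slower_rate):
    """Relative difference, expressed as a percentage of the slower rate."""
    return 100.0 * (faster_rate - slower_rate) / slower_rate


# Group means (syll/s) as reported: Wisconsin vs. North Carolina.
print(round(percent_faster(3.54, 3.27), 1))  # reading: 8.3
print(round(percent_faster(5.41, 4.81), 1))  # talks: 12.5
```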

The effects of speaker age

Figure 1 shows the mean articulation rate for young and older adults in Wisconsin and North Carolina for the two types of productions. As can be seen, young adults tend to speak faster than older adults in both reading and spontaneous talks in Wisconsin. However, North Carolina young adults tend to speak faster only in reading but not in spontaneous talks.

Figure 1
The effects of age on articulation rate for Wisconsin and North Carolina speakers in read sentences denoted here as reading style (RD) and spontaneous talks – conversational style (CS).

The ANOVA results for read sentences showed a significant main effect of age [F(1, 68) = 20.9, p < 0.001, η2 = 0.235]. Young adults’ articulation rate in reading was 11% faster than that of older adults (3.58 syll/s (s.d.=0.44) vs. 3.23 syll/s (s.d.=0.32)). For spontaneous talks, however, the main effect of age was not significant, indicating no differences in articulation rate between the young and older adults (the overall means were 5.18 syll/s (s.d.=0.58) for young adults and 5.04 syll/s (s.d.=0.60) for the older). Since the results for Wisconsin suggested some differences between the two groups (see Figure 1), we conducted additional two-way ANOVAs with the between-factors age and gender separately for Wisconsin and North Carolina talks. The results for Wisconsin showed a significant main effect of age [F(1, 36) = 5.19, p = 0.029, η2 = 0.126] although the effect size was small. Young Wisconsin adults were shown to speak 6% faster than older adults (5.58 syll/s (s.d.=0.41) vs. 5.25 syll/s (s.d.=0.50)). For North Carolina, the main effect of age was not significant.
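The effect sizes reported with each F statistic are numerically consistent with partial eta squared, which can be recovered from F and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The text does not say which variant of η2 was computed, so treating it as partial eta squared is an assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Age effect in reading: F(1, 68) = 20.9
print(round(partial_eta_squared(20.9, 1, 68), 3))  # 0.235

# Age effect in Wisconsin talks: F(1, 36) = 5.19
print(round(partial_eta_squared(5.19, 1, 36), 3))  # 0.126
```

Both values match the η2 figures given in the text, which supports this reading without confirming it.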

Altogether, the results show that the articulation rate in reading is faster for young adults as compared to older adults in both Wisconsin and North Carolina. However, in free speech, the results are less consistent. Young adults tend to speak faster than older adults in Wisconsin but not in North Carolina, where young and older adults do not show differences in the articulation rate (4.79 syll/s (s.d.=0.44) and 4.83 syll/s (s.d.=0.64), respectively).

The effects of speaker gender

As Figure 2 shows, the differences in articulation rate as a function of speaker gender were very small. As a general tendency, men tended to speak slightly faster than women both in reading (3.48 syll/s (s.d.=0.43) vs. 3.33 syll/s (s.d.=0.40)) and in spontaneous talks (5.2 syll/s (s.d.=0.57) vs. 5.03 syll/s (s.d.=0.61)). For read sentences, the articulation rate for men was 4.5% faster than for women. Although the ANOVA results showed a significant effect of gender [F(1, 68) = 4.06, p = 0.048, η2 = 0.056], its small effect size indicates that gender contributed very little to the variance accounted for. A near-significant interaction between gender and age [F(1, 68) = 3.85, p = 0.054, η2 = 0.054] pointed to the difference between the rates of young men and young women (3.74 syll/s (s.d.=0.42) vs. 3.43 syll/s (s.d.=0.41)). However, although young men tended to read faster than young women, the size of this effect was again very small. For spontaneous talks, the effect of gender was not significant nor were there any significant interactions between gender and any other factor.

Figure 2
The effects of gender on articulation rate for Wisconsin and North Carolina speakers in read sentences denoted here as reading style (RD) and spontaneous talks – conversational style (CS).

We also conducted separate two-way ANOVAs for each dialect with the between-subject factors gender and age. The main effect of gender on articulation rate in either condition (reading or talks) was again not significant for either Wisconsin or North Carolina. However, there was one significant age by gender interaction for read sentences for North Carolina, although its effect size was small [F(1, 34) = 4.8, p = 0.035, η² = 0.124]. This interaction indicated a greater difference in articulation rate between young men and young women than between older men and older women. A subsequent two-tailed independent samples t-test showed that the difference for young adults was significant [t=2.7, p = 0.015]: young men in North Carolina read 17% faster than young women (3.67 syll/s (s.d.=0.48) vs. 3.14 syll/s (s.d.=0.34)). The difference in articulation rate between older men and older women was not significant (3.11 syll/s (s.d.=0.27) and 3.15 syll/s (s.d.=0.47), respectively).
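The percentage differences quoted in this section follow from the group means. A minimal sketch, assuming the percentages are computed relative to the slower group's mean (an assumption, though it reproduces the figures given); the function name is illustrative.

```python
def percent_faster(rate_fast: float, rate_slow: float) -> float:
    """Relative difference of rate_fast over rate_slow, in percent."""
    if rate_slow <= 0:
        raise ValueError("rates must be positive")
    return (rate_fast - rate_slow) / rate_slow * 100.0


# Mean articulation rates in syll/s for read sentences, copied from the text
comparisons = {
    "men vs. women (all speakers)": (3.48, 3.33),
    "young NC men vs. young NC women": (3.67, 3.14),
}

for label, (fast, slow) in comparisons.items():
    print(f"{label}: {percent_faster(fast, slow):.1f}% faster")
```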

Overall, the effects of speaker gender on articulation rate were small and were found mainly in reading, where men articulated slightly faster than women; the one exception was young North Carolina men, who read markedly faster than young North Carolina women. However, no differences as a function of speaker gender were found in spontaneous speech, where men and women, whether young or older, spoke equally fast.

The choice of univariate ANOVAs in the present study was motivated by the fact that not all speakers in the young adult group participated in both tasks, i.e., reading and spontaneous talks. Thus, the results for the articulation rate in read sentences and talks could not be compared directly. However, the older speakers did participate in both tasks. To address the question of whether there are in fact significant differences between the articulation rate in reading and in free speech, a within-subject ANOVA was conducted for the older adults only. In this analysis, speaking condition (reading and talk) was the within-subject factor, and dialect and gender were the between-subject factors. The results showed a strong significant effect of speaking condition [F(1, 36) = 495.26, p < 0.001, η² = 0.932]. For the older adults, the articulation rate in reading was significantly slower than that in spontaneous talks (3.23 syll/s (s.d.=0.32) vs. 5.04 syll/s (s.d.=0.60)). The effect of dialect was also significant [F(1, 36) = 6.69, p = 0.014, η² = 0.157], indicating that Wisconsin speakers demonstrate a faster articulation rate than North Carolina speakers (4.29 syll/s (s.d.=0.44) vs. 3.98 syll/s (s.d.=0.41)). Across the present speaking conditions, Wisconsin speakers spoke about 8% faster. The effect of gender was not significant. There were also no significant interactions.

GENERAL DISCUSSION

The present results provide a broad set of data to allow detailed comparison of articulation rate in two regional varieties of American English, one Northern and the other Southern. The study provides strong evidence that, in both reading and spontaneous speech, the articulation rate of the Northern speakers was higher than that of the Southern speakers. The difference was as much as 8% in reading and 12.5% in spontaneous talks. Considering other factors such as age and gender, it was found that young adults read faster than older adults in both the North and the South. However, in free speech, young adults tended to speak faster than older adults in the Northern dialect but not in the Southern dialect, where the articulation rates of young and older adults did not differ. The effects of gender were smaller and less consistent. In general, men tended to speak a little faster than women but the differences were negligible. There was one exception to this trend among young adults in North Carolina. In reading, the articulation rate of young Southern men was significantly higher than that of young Southern women.

Comparing the present results with those in previously published studies, we detect numerous important similarities as well as a few differences. In particular, Byrd (1994) reported notable differences in speech rate in the TIMIT database across eight broadly defined dialect regions in the United States. The corpus included read sentences only and pauses were included in the calculation of speech rate. The speakers were mostly young and mostly male. It was found that the speech rate among Southern speakers tended to be slower than among the Northern speakers. In terms of speaker gender, men spoke on average 6% faster than women (4.69 syll/s vs. 4.42 syll/s). It must be underscored that these general trends in the TIMIT database need to be interpreted with caution, given the unbalanced design in terms of speaker gender (69.5% men and 30.5% women) and broad regional distribution of speakers classified as Northern and Southern (some shortcomings of the TIMIT database are discussed in Keating, Byrd, Flemming & Todaka (1994)). Nonetheless, trends in that study and the present study are similar in terms of finding that speakers from the North spoke faster, on average, than speakers from the South. Although the effects of speaker gender were more variable in our study, the articulation rate for men was also slightly faster than for women.

The tendency for men to speak faster than women was also found in Whiteside (1996) for British Northern English. Results showed a higher articulation rate for men than for women (4.10 vs. 3.38 syll/s). This much larger difference of 21% in favor of men may be attributed to a small sample size and perhaps to differences in the ages of the speakers, which were not reported in that study. Our results from a much larger subject pool and an extensive corpus of data do not support a difference of this size. However, there was one exception in our data which corresponds to the finding in Whiteside (1996). In particular, the articulation rate in reading for young North Carolina men was 17% higher than for young North Carolina women, approximating the difference reported in Whiteside (1996). This suggests that the tendency for men to speak faster may vary across regional varieties, and that in some regions the articulation rate for men may be much higher than for women.

Turning to the articulation rate in spontaneous speech, Verhoeven et al. (2004) addressed the effects of dialect, gender, and age on the articulation rate (excluding silent pauses) and speaking rate (including the pauses) in free conversations by 160 Dutch-speaking teachers in the Netherlands and Belgium. The variables in that study were comparable with those in the present study although the number of speakers per region in the former was smaller (10 men and 10 women). Assessing the articulation rate, the study found significant effects of country (the Netherlands and Belgium), region, age, and gender. Men spoke 6% faster than women (4.79 vs. 4.50 syll/s) and young adults (aged below 40 years) spoke 5.4% faster than older adults (aged over 45 years). On average, articulation rate in the Netherlands was 16.2% faster than in Belgium (5.05 vs. 4.23 syll/s). Interestingly, there were no statistical differences in articulation rate between the regional varieties in Belgium. In the Netherlands, there was only one region which differed significantly from the remaining three. The significant effect of region disappeared in the assessment of the speaking rate, however, whereas men still spoke faster than women (5.2%) and younger adults also spoke 5.2% faster than older adults.

Comparing our present results with those in Verhoeven et al. (2004) for the national standard varieties of Dutch in the Netherlands and Belgium, we found a much stronger effect of region which was manifested in both read sentences and spontaneous talks of Wisconsin and North Carolina speakers. Our results for gender are generally consistent, although men in our study spoke only 3.3% faster than women and the main effect of gender was not significant. In terms of age, we found a consistency only in Wisconsin where the present young adults spoke about 6% faster than older adults in spontaneous talks. (Recall that there were no differences between the two age groups in North Carolina.)

Verhoeven et al. (2004) contrasts with work by Byrd and others just discussed in regard to areal/dialect distribution: previous studies have typically taken speakers from across broadly defined regions (Clopper et al. 2005; Byrd 1994), while Verhoeven et al. provide a picture of regional variation in two national standards and a comparison over those two broader groups. Our present study differs from both approaches in having large samples from two very different but dialectally very coherent areas, establishing systematic differences in articulation rate between one Southern and one Upper Midwestern variety of American English.

The question arises whether our reported variation in the articulation rate is of any perceptual relevance or whether the differences are too small to be detected by an ordinary listener. The results from Quené (2007) may shed some light on this issue. In his study, listeners detected a change in the articulation rate when the difference was about 5%. This Just Noticeable Difference (JND) indicates that a difference in rate of 5% or more between speech fragments could be perceived by the listener as faster (or slower). Most of the articulation rate differences reported here are clearly above Quené’s reported JND of 5%. For most of our measures, the differences in articulation rate ranged from 6% to 17%. Except for the effects of speaker gender, the perceptual cues in our data should be available to listeners in forming impressions of how fast others speak.
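The comparison against the JND can be laid out explicitly. The sketch below is illustrative only: it takes the 5% threshold attributed to Quené (2007) and the percentage differences reported in this paper, and treats a difference at or above the threshold as potentially perceptible. Whether a JND obtained under laboratory conditions carries over to everyday listening is an open assumption; the dictionary labels are ours.

```python
JND_PERCENT = 5.0  # threshold reported by Quené (2007), as cited in the text

# Percentage differences in articulation rate reported in this paper
reported_differences = {
    "dialect, read sentences": 8.0,
    "dialect, spontaneous talks": 12.5,
    "age, read sentences": 11.0,
    "age, Wisconsin spontaneous talks": 6.0,
    "gender, read sentences": 4.5,
    "gender, young North Carolina readers": 17.0,
}


def exceeds_jnd(diff_percent: float, jnd: float = JND_PERCENT) -> bool:
    """True if a rate difference is at or above the just noticeable difference."""
    return diff_percent >= jnd


for label, diff in reported_differences.items():
    status = "at or above JND" if exceeds_jnd(diff) else "below JND"
    print(f"{label}: {diff}% ({status})")
```

On these figures, only the overall gender difference in reading falls below the threshold, in line with the statement above that gender is the exception.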

For spontaneous talks, one issue has not been controlled for in the present study. As indicated in Quené’s recent statistical model (2008), speech tempo is strongly influenced by the length of the phrase. Because longer phrases contain more syllables than shorter phrases, they tend to be spoken faster which shortens syllable durations and increases the articulation rate. In that study, phrase length was found to vary with the speaker’s age: Older speakers produced shorter phrases compared to the younger speakers and they also tended to vary the length of their phrases more often. If so, speech tempo may be only weakly affected by speaker’s age if differences in phrase length are accounted for. This implies that the differences in the articulation rate between the Northern and Southern speakers (or differences due to speaker age and gender) may be significant but they will diminish if phrase length is included in assessing the results.

Although we did not consider phrase length in our analyses of spontaneous talks, the results for read sentences can be interpreted in the light of Quené’s findings. Those sentences contained a fixed number of syllables per phrase. Since each speaker uttered seven syllables in a phrase, it was clearly the differences in phrase duration that affected the articulation rate. The present Wisconsin speakers read faster than the North Carolina speakers (the mean phrase durations were 2.01s and 2.20s, respectively), young adults read faster than older adults (2.0s vs. 2.20s) and men read a little faster than women (2.06s vs. 2.15s). Given the fixed number of syllables, a longer phrase duration indicates a slower articulation rate and a shorter phrase duration indicates a faster rate. The present results for reading thus clearly show that articulation rate varies as a function of speaker dialect, age and, to some extent, gender.
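With a fixed syllable count, the relation between phrase duration and articulation rate is a simple division. A minimal sketch using the seven-syllable phrases and the mean durations given above; the function and variable names are illustrative. Note that a rate derived from a mean duration is only an approximation and need not equal the mean of the per-speaker rates reported in the results.

```python
SYLLABLES_PER_PHRASE = 7  # each read phrase contained seven syllables

# Mean phrase durations in seconds, copied from the text
mean_phrase_duration_s = {
    "Wisconsin": 2.01,
    "North Carolina": 2.20,
    "young adults": 2.0,
    "older adults": 2.20,
    "men": 2.06,
    "women": 2.15,
}


def articulation_rate(n_syllables: int, duration_s: float) -> float:
    """Articulation rate in syllables per second for one fluent phrase."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s


for group, duration in mean_phrase_duration_s.items():
    rate = articulation_rate(SYLLABLES_PER_PHRASE, duration)
    print(f"{group}: {duration:.2f} s per phrase, approx. {rate:.2f} syll/s")
```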

Can we assume that Wisconsin speakers as well as young adults and men are simply better readers than North Carolina speakers, older adults and women? Given the large sample and the significant statistical effects, it is unlikely that these results came about by chance. The reading data show a systematic effect of between-subject factors. Because all reading productions were fluent, it is unclear what would account for the notion of a “better reader” (or perhaps a “faster reader”) if no pauses or hesitations were present in a prosodically constrained read phrase. Although we can only speculate at present in the absence of articulatory data, these effects could be due to differences in the articulatory paths for older and younger adults as well as for women and men, the former exhibiting more careful productions than the latter. However, the dialectal differences cannot be easily explained by the same reasoning. It might be the case that some other dialect-specific segmental properties play a role here. So far, we have found significant differences in vowel duration between Northern and Southern speech (Jacewicz et al., 2007) and significant differences in stop closure durations, Wisconsin closures being longer than North Carolina closures (Jacewicz et al., 2008). However, these findings do not yet support conclusive statements about the dialect-specific features with regard to articulation rate. More work is needed to explain which features contribute most to the perception of Southern speech as slower than Northern speech.

A reanalysis of the present data for spontaneous talks would be necessary to determine whether phrase length is a predictor of differences in articulation rate as a function of speaker dialect, age and gender. As a general rule, do Wisconsin speakers produce longer phrases, which shorten syllable durations, as compared to North Carolina speakers, or does the dialectal difference originate in dialect-specific temporal differences between speech segments? Since the present results for reading and spontaneous talks do not overlap for the effects of age (see the differences in free speech for young and older adults in Wisconsin and North Carolina), we may infer that these differences are due to the lack of inclusion of phrase length as a predictor in the analysis of spontaneous talks. However, more work is necessary to test the validity of this interpretation.

In conclusion, this study yielded several robust findings with regard to the articulation rate in read and spontaneous speech in the productions of mostly the same participants. First, the articulation rate of Wisconsin speakers was distinctly faster than that of North Carolina speakers. Second, young adults in Wisconsin spoke and read faster than older Wisconsinites. For North Carolina, articulation rates in reading followed the same pattern. However, in free speech, no significant differences due to speaker age were found.3 Finally, while the effects of gender were present, they were weak. It can only be suggested that men may speak faster than women under some circumstances, namely reading. This limited finding is consistent with most previous research.

As the body of work on the sociophonetics of American English continues to grow in scope and depth, we argue that it is important to include fundamental phonetic information, like these results on articulation rate, as part of our catalog of regional differences and patterns of change in American English.

ACKNOWLEDGMENTS

This work was supported by research grant NIH/NIDCD R01 DC006871. We thank the following for help with data collection, transcription, and acoustic measurements: Mahnaz Ahmadi, Jason Fox, Janaye Houghton, Samantha Lyle, Leigh Smitley, Dilara Tepeli, and Lisa Wackler. We also thank Eric Raimy for discussion on the topic, as well as the audience at the American Dialect Society, Chicago, January 2008, where an earlier version of this paper was presented. Comments of three anonymous reviewers are greatly appreciated.

APPENDIX 1

The following sets of sentences were recorded by each speaker. All 2-set sentences were randomly presented to the subject in two stimulus lists. The sentences in which the main stress falls on the first and second word position, respectively, served as distracters and were not included in the final analyses.

Vowels before a voiceless consonant in a word

bits
  • JANE knows the small bits are sharp.
    No! JOHN knows the small bits are sharp.
  • John FEELS the small bits are sharp.
    No! John KNOWS the small bits are sharp.
  • John knows the SOFT bits are sharp.
    No! John knows the SMALL bits are sharp.
  • John knows the small SCREWS are sharp.
    No! John knows the small BITS are sharp.
  • John knows the small bits are DULL.
    No! John knows the small bits are SHARP.

baits
  • MOM said the dull baits are best.
    No! DAD said the dull baits are best.
  • Dad THINKS the dull baits are best.
    No! Dad SAID the dull baits are best.
  • Dad said the BRIGHT baits are best.
    No! Dad said the DULL baits are best.
  • Dad said the dull HOOKS are best.
    No! Dad said the dull BAITS are best.
  • Dad said the dull baits are WORST.
    No! Dad said the dull baits are BEST.

bets
  • FRANK said the small bets are low.
    No! JOHN said the small bets are low.
  • John THOUGHT the small bets are low.
    No! John SAID the small bets are low.
  • John said the BIG bets are low.
    No! John said the SMALL bets are low.
  • John said the small POTS are low.
    No! John said the small BETS are low.
  • John said the small bets are HIGH.
    No! John said the small bets are LOW.

bats
  • SAM said the small bats are fast.
    No! DOC said the small bats are fast.
  • Doc THINKS the small bats are fast.
    No! Doc SAID the small bats are fast.
  • Doc said the LARGE bats are fast.
    No! Doc said the SMALL bats are fast.
  • Doc said the small BIRDS are fast.
    No! Doc said the small BATS are fast.
  • Doc said the small bats are SLOW.
    No! Doc said the small bats are FAST.

bites
  • JANE thinks the small bites are deep.
    No! SUE thinks the small bites are deep.
  • Sue KNOWS the small bites are deep.
    No! Sue THINKS the small bites are deep.
  • Sue thinks the LARGE bites are deep.
    No! Sue thinks the SMALL bites are deep.
  • Sue thinks the small CUTS are deep.
    No! Sue thinks the small BITES are deep.
  • Sue thinks the small bites are WIDE.
    No! Sue thinks the small bites are DEEP.

Vowels before a voiced consonant in a word

bids
  • BOB thinks the fall bids are low.
    No! TED thinks the fall bids are low.
  • Ted KNOWS the fall bids are low.
    No! Ted THINKS the fall bids are low.
  • Ted thinks the SPRING bids are low.
    No! Ted thinks the FALL bids are low.
  • Ted thinks the fall SALES are low.
    No! Ted thinks the fall BIDS are low.
  • Ted thinks the fall bids are HIGH.
    No! Ted thinks the fall bids are LOW.

bades

(The nonsense word bade was explained to the speaker as indicating “a brand of knife, a brand name.”)

  • TOM says the dull bades are cheap.
    No! TED says the dull bades are cheap.
  • Ted THINKS the dull bades are cheap.
    No! Ted SAYS the dull bades are cheap.
  • Ted says the SHARP bades are cheap.
    No! Ted says the DULL bades are cheap.
  • Ted says the dull FORKS are cheap.
    No! Ted says the dull BADES are cheap.
  • Ted says the dull bades are WEAK.
    No! Ted says the dull bades are CHEAP.

beds
  • TOM said the tall beds are warm.
    No! ROB said the tall beds are warm.
  • Rob THINKS the tall beds are warm.
    No! Rob SAID the tall beds are warm.
  • Rob said the SHORT beds are warm.
    No! Rob said the TALL beds are warm.
  • Rob said the tall CHAIRS are warm.
    No! Rob said the tall BEDS are warm.
  • Rob said the tall beds are COLD.
    No! Rob said the tall beds are WARM.

bads

(The speaker was told that bad refers to “an error or mistake.” For example, if someone makes an error, he or she might say “my bad” instead of “my mistake.”)

  • NICK thinks the small bads are worse.
    No! MIKE thinks the small bads are worse.
  • Mike KNOWS the small bads are worse.
    No! Mike THINKS the small bads are worse.
  • Mike thinks the BIG bads are worse.
    No! Mike thinks the SMALL bads are worse.
  • Mike thinks the small GOODS are worse.
    No! Mike thinks the small BADS are worse.
  • Mike thinks the small bads are BEST.
    No! Mike thinks the small bads are WORSE.

bides

(The nonsense word bide was explained to the speaker as indicating “a small animal, a type of dog.”)

  • SUE thinks the small bides are cute.
    No! JANE thinks the small bides are cute.
  • Jane KNOWS the small bides are cute.
    No! Jane THINKS the small bides are cute.
  • Jane thinks the SHORT bides are cute.
    No! Jane thinks the TALL bides are cute.
  • Jane thinks the small CATS are cute.
    No! Jane thinks the small BIDES are cute.
  • Jane thinks the small bides are GROSS.
    No! Jane thinks the small bides are CUTE.

APPENDIX 2

Example of a transcript used in calculating the articulation rate in spontaneous talks. The fluent phrases were numbered consecutively and the articulation rate was calculated for these phrases only. All hesitations, pauses and fillers (italicized in the original; shown here as the unnumbered, less-indented lines) were excluded from analyses.

Older North Carolina female speaker:

          52) and to tell a funny story about,
     uh, my speech,
     we have a,
          53) a mountain pasture where we keep our cows, our cattle in the uh,
     spring and summer,
     and this,
     uh,
          54) man from Florida came up and bought the place next to it,
          55) and so, we were up there checking on the cattle one day and he,
          56) we stopped and started talking to him and,
          57) he got acquainted with us
     and uh,
          58) he was just delighted with my speech and I didn’t think
          there was anything wrong with it but,
          59) I did not get the feeling that he was making fun of me and,
          60) he kept saying that the next time they came up from Florida that
          he was going to,
     um,
     have,
          61) bring his wife and let her hear me talk,
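The selection of fluent phrases illustrated in this transcript can be sketched in code. This is an illustration under stated assumptions, not the procedure used in the study: the durations below are hypothetical placeholders (no phrase durations are given in this appendix), the syllable counts are our own counts of the transcript text, and averaging the per-phrase rates is one possible way to aggregate that the appendix does not specify. The `Phrase` class and function name are ours.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    syllables: int
    duration_s: float  # HYPOTHETICAL value, for illustration only
    fluent: bool       # False for hesitations, pauses and fillers


# Phrases 54, 57 and 61 from the transcript, plus one excluded filler
phrases = [
    Phrase("man from Florida came up and bought the place next to it", 14, 2.9, True),
    Phrase("he got acquainted with us", 7, 1.4, True),
    Phrase("and uh", 2, 0.8, False),
    Phrase("bring his wife and let her hear me talk", 9, 1.9, True),
]


def mean_articulation_rate(items: list[Phrase]) -> float:
    """Mean of per-phrase rates (syll/s) over fluent phrases only."""
    rates = [p.syllables / p.duration_s for p in items if p.fluent]
    if not rates:
        raise ValueError("no fluent phrases to measure")
    return sum(rates) / len(rates)


print(f"{mean_articulation_rate(phrases):.2f} syll/s over fluent phrases")
```

Dividing total syllables by total fluent duration would be an equally plausible aggregate and would give a slightly different number when phrase lengths vary.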

Footnotes

1This popular stereotype is confirmed by media attention. Coverage of the 2008 U.S. presidential campaign reinforced this stereotype. Simple Google searches (December 22, 2007) for ‘fast-talking’ plus the names of two Northern candidates (Rudy Giuliani of New York, Mitt Romney of Massachusetts) yielded a total of 9,260 hits and similar searches with ‘slow-talking’ plus two Southern candidates (Mike Huckabee of Arkansas, Fred Thompson of Tennessee) yielded 3,370.

2These analyses were carried out on articulation rate means for individual speakers in either the read sentence or the spontaneous talk conditions. We did not address potential within-speaker variation in articulation rate.

3Note that possible ‘age-grading’ of speech or articulation rate could raise questions about the apparent time construct.

Contributor Information

Ewa Jacewicz, Department of Speech and Hearing Science, The Ohio State University.

Robert A. Fox, Department of Speech and Hearing Science, The Ohio State University.

Caitlin O’Neill, Department of Speech and Hearing Science, The Ohio State University.

Joseph Salmons, Department of German, University of Wisconsin-Madison.

References

  • Abercrombie David. Elements of general phonetics. Edinburgh: Edinburgh University Press; 1967.
  • Byrd Dani. Relations of sex and dialect to reduction. Speech Communication. 1994;15:39–54.
  • Clopper Cynthia G, Pisoni David, de Jong Kenneth. Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of the Acoustical Society of America. 2005;118:1661–1676.
  • Crystal Thomas H, House Arthur S. Segmental durations in connected speech signals: Preliminary Results. Journal of the Acoustical Society of America. 1982;72:705–716.
  • Crystal Thomas H, House Arthur S. The duration of American-English vowels: An overview. Journal of Phonetics. 1988a;16:263–284.
  • Crystal Thomas H, House Arthur S. Segmental durations in connected-speech signals: Current results. Journal of the Acoustical Society of America. 1988b;83:1553–1573.
  • Heffernan Kevin M. Phonetic Distinctiveness as a Sociolinguistic Variable. Ph.D. dissertation. University of Toronto; 2007.
  • Hewlett Nigel, Rendall Monica. Rural versus urban accent as an influence on the rate of speech. Journal of the International Phonetic Association. 1998;28:63–71.
  • Hirose Keikichi, Kawanami Hiromichi. Temporal rate of change of dialogue speech in prosodic units as compared to read speech. Speech Communication. 2002;36:97–111.
  • Howell Peter, Kadi-Hanifi Karima. Comparison of prosodic properties between read and spontaneous speech material. Speech Communication. 1991;10:163–169.
  • Jacewicz Ewa, Fox Robert A, Lyle Samantha. Variation in stop consonant voicing in two regional varieties of American English. Journal of the Acoustical Society of America. 2008;124:2559.
  • Jacewicz Ewa, Salmons Joseph, Fox Robert A. Vowel duration in three American English dialects. American Speech. 2007;82:367–385.
  • Jacewicz Ewa, Salmons Joseph, Fox Robert A. Prosodic prominence effects on vowels in chain shifts. Language Variation & Change. 2006;18:285–316.
  • Keating Patricia, Byrd Dani, Flemming Edward, Todaka Y. Phonetic analyses of word and segment variation using the TIMIT corpus of American English. Speech Communication. 1994;14:131–142.
  • Labov William, Ash Sharon, Boberg Charles. Atlas of North American English: Phonetics, Phonology, and Sound Change. Berlin: Mouton de Gruyter; 2006.
  • Laver John, Trudgill Peter. The Gift of Speech: Readings in the analysis of speech and voice. Edinburgh: Edinburgh University Press; 1991. Phonetic and linguistic markers in speech; pp. 235–264.
  • Niedzielski Nancy, Preston Dennis R. Folk Linguistics. Berlin: de Gruyter; 1999.
  • Quené Hugo. Multilevel modeling of between-speaker and within-speaker variation in spontaneous speech tempo. Journal of the Acoustical Society of America. 2008;123:1104–1113.
  • Quené Hugo. On the Just Noticeable Difference for tempo in speech. Journal of Phonetics. 2007;35:353–362.
  • Ramig Lorraine. Effects of physiological aging on speaking and reading rates. Journal of Communication Disorders. 1983;16:217–226.
  • Ray George, Zahn Christopher. Regional speech rates in the United States: A preliminary analysis. Communication Research Reports. 1990;7:34–37.
  • Roach Peter. Myth 18: Some languages are spoken more quickly than others. In: Bauer Laurie, Trudgill Peter., editors. Language Myths. London: Penguin; 1998. pp. 150–158.
  • Smith Bruce L, Wasowicz Jan, Preston Judy. Temporal characteristics of the speech of normal elderly adults. Journal of Speech and Hearing Research. 1987;30:522–529.
  • Thomas Erik. Rural Southern white accents. In: Schneider Edgar W., editor. Varieties of English 2: The Americas and the Caribbean. Berlin: Mouton de Gruyter; 2008. pp. 87–114.
  • Tillery Jan, Bailey Guy. The urban South: Phonology. In: Schneider Edgar W., editor. Varieties of English 2: The Americas and the Caribbean. Berlin: Mouton de Gruyter; 2008. pp. 115–128.
  • Tsao Ying-Chiao, Weismer Gary, Iqbal Kamran. Interspeaker variation in habitual speaking rate: Additional evidence. Journal of Speech, Language, and Hearing Research. 2006;49:1156–1164.
  • Verhoeven Jo, De Pauw Guy, Kloots Hanne. Speech rate in a pluricentric language: A comparison between Dutch in Belgium and the Netherlands. Language and Speech. 2004;47:297–308.
  • Whiteside Sandra P. Temporal-based acoustic-phonetic patterns in read speech: Some evidence for speaker sex differences. Journal of the International Phonetic Association. 1996;26:23–40.
  • Yuan Jiahong, Cieri Chris, Liberman Mark. Towards an integrated understanding of speaking rate in conversation. Paper presented at the International Conference on Spoken Language Processing (Interspeech 2006); Pittsburgh. 2006. (Full paper available at http://ldc.upenn.edu/myl/llog/icslp06_final.pdf.)