Low mean fundamental frequency (F0) in men’s voices has been found to positively influence perceptions of dominance by men and attractiveness by women using standardized speech. Using natural speech obtained during an ecologically valid social interaction, we examined relationships between multiple vocal parameters and dominance and attractiveness judgments. Male voices from an unscripted dating game were judged by men for physical and social dominance and by women in fertile and non-fertile menstrual cycle phases for desirability in short-term and long-term relationships. Five vocal parameters were analyzed: mean F0 (an acoustic correlate of vocal fold size), F0 variation, intensity (loudness), utterance duration, and formant dispersion (Df, an acoustic correlate of vocal tract length). Parallel but separate ratings of speech transcripts served as controls for content. Multiple regression analyses were used to examine the independent contributions of each of the predictors. Physical dominance was predicted by low F0 variation and physically dominant word content. Social dominance was predicted only by socially dominant word content. Ratings of attractiveness by women were predicted by low mean F0, high intensity, and attractive word content across cycle phase and mating context, whereas low Df was perceived as attractive by fertile-phase women only. We hypothesize that competitors and potential mates may attend more strongly to different components of men’s voices because of the different types of information these vocal parameters provide.
Speech is complex, both semantically and acoustically. A number of distinct vocal parameters have been shown to affect interpersonal perceptions in humans (e.g., Banse and Scherer 1996; Feinberg et al. 2005; Puts et al. 2007; Scherer 1986; Zuckerman and Miyake 1993). Evolutionists have focused primarily on the effect of low-sounding vocalizations on the dynamics of male–male competition and female mate choice (via judgments of dominance and attractiveness); however, human speech varies along many dimensions. The purpose of the present paper is to examine relationships between these evolutionarily relevant judgments and multiple vocal parameters by using naturally varying, unscripted speech generated in an ecologically valid social interaction. Many studies of this kind use vocal stimuli that are unnaturally invariant in content and motivation, with all speakers uttering precisely the same, often socially irrelevant, phrase (e.g., Bruckert et al. 2006; Feinberg et al. 2006). Although this methodology permits experimental control, it filters out the actual contexts in which courtship and competition occur. The generalizability and external validity of such tightly controlled results depend on whether the effects they reveal persist in natural speech. Control for content is of course required, but our approach is to control semantic content statistically rather than experimentally.
In addition, the existing literature has focused primarily on the ways in which males and females converge in their response to acoustic parameters; however, there are adaptive reasons why males and females may not target the same types of acoustic information. In terms of the force of sexual selection, one might expect that men and women would have evolved to be responsive to different male attributes. Both sexes might form alliances with males, but beyond this, males are potential sexual competitors for men whereas they are potential mates for women. To the extent that competitive traits might be revealed by different aspects of voice than would traits revealing of mate quality, these different priorities may favor men and women attending to different attributes of men’s voices.
The goal of this paper is to examine the independent and joint effects of four acoustic parameters—formant dispersion, mean fundamental frequency, fundamental frequency variation, and intensity—plus duration and semantic content, in natural speech. Separate, parallel analyses are performed to assess the effects of each of these six variables on male perceptions of dominance and female perceptions of attractiveness.
Low-sounding vocalizations are the product of multiple acoustic parameters, including compact formant frequencies (low “formant dispersion” [Df]; Fitch 1997) and low fundamental frequency (F0). Df is a measure of the average spacing between the formants, which are energy peaks in the harmonic spectrum (Feinberg 2008; Fitch and Giedd 1999). Lower formants and formant dispersion give a more “resonant” quality to the voice. Df is a function of the length and shape of the vocal tract (or the depth of the larynx in the throat), which is the space through which sound waves must travel from the vocal folds through the oral cavity. Until puberty, vocal tract length scales with body size growth without any sexual dimorphism (Vorperian et al. 2005). At puberty, modulated by an increase in testosterone, males’ larynges begin to descend farther than females’ (Fitch and Giedd 1999). The eventual product of this laryngeal alteration is a longer vocal tract and the resultant “deeper,” more resonating voice among adult males. On average, the vocal tract is about 15% longer in adult males (Fant 1960) and is correlated with height (Fitch 1997). This results in sex differences in Df, with males exhibiting formants of lower frequency and dispersion (Collins 2000; Feinberg et al. 2005; Gonzalez 2004), and constitutes the basis for the indexical value of low Df.
Recent findings show that experimentally lowering Df elevates dominance ratings (Puts et al. 2007; Wolff and Puts 2010), and low Df is associated with assessments of larger size, muscularity, masculinity, and age (Collins 2000; Feinberg et al. 2005). However, studies investigating the relationship between men’s Df and women’s judgments of male attractiveness have produced mixed or inconclusive results (Collins 2000; Feinberg et al. 2005).
Although F0 was first hypothesized to indicate large body size (e.g., Morton 1977), the evidence for relationships between body size parameters and average F0 in humans is inconsistent (Bruckert et al. 2006; Collins 2000; Gonzalez 2004; Evans et al. 2006; Lass and Brown 1978; van Dommellen 1993; van Dommellen and Moxness 1995). On the other hand, F0’s dependence on testosterone (T) may constrain it as an honest signal. Working through androgen receptors in the vocal folds (Newman et al. 2000), testosterone lengthens and thickens the vocal folds, lowering mean F0 in pubertal males (Harries et al. 1997; Hollien et al. 1994; Titze 1989) and in adults undergoing T treatment (Need et al. 1993). Given that T has been linked to aggressive responses to challenges (Archer 2006), cues of T dosage such as low F0 (Dabbs and Mallinger 1999; Evans et al. 2008) may usefully predict a competitor’s aggressive potential.
Because only well-buffered individuals can tolerate the immunosuppressant effects of T (Folstad and Karter 1992; Grossman 1985), mean F0 may also indicate parasite resistance (Feinberg et al. 2005; Puts 2005). Coevolutionary contests with parasites can maintain heritability in parasite resistance and its indicators, and thus the basis for good-genes mate choice in hosts (Hamilton and Zuk 1982). To the degree that immunocompetence and F0 remained heritable over human evolution, women’s judgments of male vocal attractiveness may have evolved to recruit these qualities for their offspring. Indeed, mean F0 is heritable (Debruyne et al. 2002), and women’s preferences for male voices within a certain F0 range appear suited for the function of sire choice. Men with deeper voices are perceived as more attractive than those with higher voices (Feinberg et al. 2005, 2006; Jones et al. 2010; Puts 2005), and men with attractive voices have greater mating success than those with less-attractive voices (Hughes et al. 2004). Further, women are more attracted to low-F0 male voices when near ovulation (Puts 2005; Feinberg et al. 2006) and in short-term mating contexts (Puts 2005). These preferences may increase the reproductive success of men with low voice pitch; in a sample of Western undergraduates, men with lower voice pitch reported more sexual partners in the past year than those with higher voice pitch (Puts 2005), and in a study of hunter-gatherers, men with lower voice pitch reported more offspring (Apicella et al. 2007).
People also associate low F0 with physical and social characteristics that contribute to success in competitive encounters (i.e., relative dominance); males with lower F0 are rated as older, taller, and heavier (Collins 2000; Feinberg et al. 2005; Lass and Brown 1978). In addition, experimentally lowering F0 increases perceptions of dominance (Jones et al. 2010; Puts et al. 2007; Wolff and Puts 2010).
Many of the above-cited studies measure F0 with software that repeatedly samples a digital speech segment and averages across the samples. While this method effectively captures mean F0, it may obscure state-dependent patterns of variation. Vocal fold tension is under significant volitional control, often in service of linguistic expression (Ohala 1983; Johnstone and Scherer 2005); however, it is also subject to nonvolitional influence (Scherer 1986; Johnstone and Scherer 2005) via changes in respiratory patterns, diaphragmatic and abdominal tension, and hormonal status, all of which are influenced by affective state (Titze 1994). It is therefore important not only to understand vocal indicators of fixed or “inherent” quality, but also those that index fluctuating states between and within social contexts.
Because vocal signals reflect state-dependent changes (Scherer 1986), they may be particularly relevant in domains of social life in which relationships shift situationally, such as dominance interactions. Changes in mean F0 have been shown to vary across contexts depending on the characteristics of the signaler; men who perceive themselves to be dominant tend to lower their mean F0 when addressing a competitor, whereas men who perceive themselves to be subordinate tend to raise their mean F0 (Puts et al. 2006).
The extent to which F0 fluctuates over an utterance may also index socially relevant state-dependent changes (Scherer 1986). F0 variation quantifies one aspect of prosodic individual differences; it captures the extent of “highs” and “lows” in speech (this is in contrast to jitter, which measures shorter cycle-to-cycle variation in F0; Baken 1987). Perceptually, low F0 variation has a monotone quality, whereas high F0 variation can have a sing-song character. Adults are more likely to exaggerate F0 variation when speaking to infants than when speaking to other adults (Trainor et al. 2000), suggesting that high variation in F0 may be associated with friendly and appeasing contexts. However, high F0 variation is also characteristic of high-activation emotions (e.g., panic fear and “hot” anger) (Banse and Scherer 1996). Among cotton-top tamarins (Saguinus oedipus oedipus), high F0 variation is indicative of a proclivity to engage in playful wrestling (Goedeking 1988). If high F0 variation is used to affiliate, then lack of F0 modulation (i.e., monotonic vocalization) may be used to intimidate.
Riding et al. (2006) demonstrated a trend toward increasing attractiveness in men’s voices with decreasing F0 variation, although the effect was not significant. In contrast, other studies have shown that moderate to high F0 variation is associated with higher social attractiveness ratings (Ray et al. 1991; Zuckerman and Miyake 1993). Some of this discrepancy may be accounted for by examining attractiveness ratings across menstrual phases and imagined mating contexts, both of which affect women’s preferences for putative indicators of genetic quality (Penton-Voak et al. 1999; Puts 2005). Mean F0 has been argued to reflect genetic quality, and its attractiveness has been shown to vary with female fertility and mating context in a way that supports such a perspective (Puts 2005; Feinberg et al. 2006). If F0 variation also reflects genetic quality, then preferences for it might also change with cycle phase and mating context, but if F0 variation does not track heritable fitness differences in men, then there should be no difference in preferences for it across cycle phase and mating context.
Intensity is a measure of power per unit of area and roughly corresponds to the psychoacoustic term “loudness” (Baken 1987). Individuals can vary intensity by changing the amount of lung pressure and the size of the glottal (vocal fold) opening (Titze 1994). Changes in intensity are used to communicate semantic (i.e., to emphasize certain syllables and words) and emotional information (Banse and Scherer 1996). Like high F0 variation, intensity is a characteristic of high activation emotions—fear, anger, and joy (Banse and Scherer 1996). More confident individuals speak with greater intensity (Kimble and Seidel 1991), and high intensity is associated with perceptions of dominance (Aronovitch 1976; Scherer et al. 1973).
A central goal of the present research is to assess the extent to which these existing findings on the relationships between acoustic parameters, on the one hand, and dominance and attractiveness judgments, on the other, generalize to ecologically valid competitive and courtship contexts. It is possible that subtle acoustic variation is salient only against the background of fixed content. In addition, our research format allows a comparison of the relative influence of acoustic and semantic variables on dominance and attractiveness judgments. Finally, for theoretical reasons outlined above, male judgments of dominance and female judgments of attractiveness might be expected to vary in response to different acoustic parameters. Females might attend to invariant characteristics signaling genetic quality, whereas males might attend to situational factors that predict the outcome of competitive encounters. All of these possibilities are evaluated first in univariate and then in multivariate contexts.
Acoustic stimuli and audio ratings were collected by Puts and colleagues (Puts 2005; Puts et al. 2006). Content ratings of transcriptions of the same audio recordings were collected for the present study and provided a control for semantic variation among speakers.
Participating in this study were 264 self-reported heterosexual males and 177 self-reported heterosexual females, distributed as follows: 111 male “participants” (ages 18–24, M=18.9 years, SD=1.2), 86 male “audio raters” (ages 18–28, M=20, SD=2.1), 142 female “audio raters” (ages 18–30, M=19.1, SD=1.7), 67 male “content raters” (ages 18–26, M=19.2, SD=1.6), and 35 female “content raters” (ages 18–37, M=19.5, SD=2.7). All participants spoke native, minimally accented American English. Additionally, both participants and audio raters were sampled from the same community (University of Pittsburgh undergraduates). Content raters were similarly aged male and female undergraduates at the University of California, Santa Barbara.
In the first part of the study, male participants competed in a mock dating-game scenario modeled after Simpson et al. (1999; see Puts 2005 and Puts et al. 2006 for methodological details). Before learning the dating-game format, each participant was asked to read a control passage out loud (the Rainbow Passage). Participants were next instructed to describe themselves to their potential date (“courtship recording”). They were then asked to address their male competitor and describe why they might be respected or admired by other males (“competitive recording”). All participants’ responses were recorded using GoldWave digital audio software. After the courtship and competitive phases, participants were asked to rate their own and their competitor’s physical and social dominance on a Likert scale. In accordance with procedures described by Mazur et al. (1994), raters were told that “a [socially] dominant person tells other people what to do, is respected, influential, and often a leader; whereas submissive people are not influential or assertive and are usually directed by others.” Raters were then asked to assess social dominance by marking a visual analogue scale ranging from “extremely dominant” to “extremely submissive.” Regarding physical dominance, raters were asked to mark a visual analogue scale anywhere from “strongly disagree” to “strongly agree” beneath two statements: “If this man (I) got in a fistfight with an average male undergraduate student, this man (I) would probably win.” The difference between self- and target-rated dominance was used to calculate a “relative dominance” variable for each participant (see Puts et al. 2006).
Mean F0, F0 variation (operationalized as the within-subject standard deviation of F0, hereafter called F0-SD), utterance duration (seconds), intensity (decibels, dB), and formant dispersion (Df) were measured using Praat voice analysis software (version 4.4.11) for each of the participants’ three recordings (control, courtship, competitive). All parameters were set to the programmers’ recommendations, including a pitch floor of 75 Hz and a pitch ceiling of 300 Hz. For characteristics of these recordings see Table 1. Df was obtained by the following methods (Gonzalez 2004; Xue and Hao 2003): Spectral information was obtained using the long-term average spectrum (LTAS), a fast Fourier transform (FFT)-generated frequency power spectrum represented in a running speech acoustic signal. (Voices were first resampled at 11025 Hz; otherwise, all values were set to Praat’s authors’ recommendations.) An LPC smoothing analysis generated a smooth curve from the LTAS, with six peaks and a pre-emphasis coefficient of 0.5. Using edit and draw tools, we confirmed that the LPC curve matched well to the LTAS. When a six-peak model did not conform well to the LTAS, seven peaks were used. Formant values (F1–F4) were taken from the first four peaks of the LPC curve; Df was calculated by computing the average distance between each of the four peaks (Fitch 1997). All formants were analyzed a second time using the same procedures in order to obtain test-retest reliability. Correlations between first and second measurements were high owing to the semi-automated nature of the method; for all three recordings, r=0.99. Df values used in all analyses were the averages of the first and second measurements.
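The Df computation described above (average spacing between the first four formant peaks, averaged over two measurement passes) can be sketched in a few lines of Python. This is only an illustration: the formant values are invented, and the function name `formant_dispersion` is hypothetical rather than part of Praat or of the authors' procedure.

```python
def formant_dispersion(formants_hz):
    """Average spacing (Hz) between adjacent formants, as in Fitch (1997)."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    ordered = sorted(formants_hz)
    # Differences between neighbouring formants: F2-F1, F3-F2, F4-F3.
    gaps = [hi - lo for lo, hi in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical F1-F4 values (Hz) for one recording, measured twice.
first_pass = [500.0, 1500.0, 2500.0, 3500.0]
second_pass = [510.0, 1490.0, 2510.0, 3490.0]

df_first = formant_dispersion(first_pass)
df_second = formant_dispersion(second_pass)

# The value entered into analyses is the mean of the two passes.
df_final = (df_first + df_second) / 2
print(df_first, df_second, df_final)
```

Because the adjacent gaps telescope, Df for four formants reduces to (F4 − F1)/3, so the measure depends only on the lowest and highest of the four peaks.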
Competitive recordings were used as stimuli for male “audio raters.” Raters listened to a set of 30 or 31 recordings, which included a random selection of raised, lowered, and unmanipulated recordings (10 or 11 of each). Importantly, only unmanipulated recordings were included in analyses for the present study. No set contained two recordings of the same dating-game participant. Recordings were rated for perceived social and physical dominance using the procedures described above. Mean physical and social dominance scores were created for each of the voices (N=111) by averaging across the raters’ responses (mean raters/recording=8).
Transcribed competitive recordings were read by male “content raters.” Each rater read a random selection of 28 to 31 transcribed recordings and rated each transcript for perceived social and physical dominance, mirroring procedures for audio recordings.
Female “audio raters” listened to a selection of 30 or 31 courtship recordings (analogous to procedures described above) and rated each target’s attractiveness for a “short-term, purely sexual relationship, such as a one-night stand,” and then for a “long-term, committed relationship.” Female raters were told to place a mark on a visual analogue scale anchored at “extremely unattractive” and “extremely attractive.” Mean short-term and long-term attractiveness scores were created for each of the voices (N=111) by averaging across the fertile (mean raters/recording=3.5) and non-fertile (mean raters/recording=9) raters’ responses.
Transcribed courtship recordings were read by female “content raters.” Each rater read a random selection of 28 to 31 transcribed recordings and rated each transcript for short-term or long-term attractiveness, mirroring procedures described above.
Menstrual cycle information was collected from all female audio and content raters through self report. The participants were asked to circle the number of days in their typical menstrual cycle (<22 to >38; M=29.2 days) and the date of onset of their last menstrual bleeding on a calendar. Each rater was categorized as being in either the “fertile” (audio: N=38, content: N=17) or “non-fertile” (audio: N=98, content: N=18) phase of her cycle, according to a conservative eight-day window of fertility beginning six days before the presumptive day of ovulation and ending one day after (see Puts 2005 for further methodological details). Raters were neither pregnant nor using hormonal contraceptives at the time of the study. Menstrual-cycle effects are not a central focus of this study, but because effects have previously been observed in studies of voice judgments, we attempted to monitor such variation.
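As a rough illustration of the eight-day fertility window, the sketch below classifies a rater from her self-reported cycle length and the onset date of her last menses. It assumes that the presumptive day of ovulation falls 14 days before the expected onset of the next menses; that offset, the function name `cycle_phase`, and the example dates are assumptions made here for illustration, and the exact procedure is the one given in Puts (2005).

```python
from datetime import date

def cycle_phase(last_onset, test_date, cycle_length_days, luteal_days=14):
    """Classify a rater as 'fertile' or 'non-fertile' on the test date.

    Assumption (not from the source): ovulation is placed `luteal_days`
    before the expected start of the next cycle.
    """
    # Day 1 is the first day of the last menstrual bleeding.
    cycle_day = (test_date - last_onset).days + 1
    if cycle_day < 1 or cycle_day > cycle_length_days:
        return "unclassified"  # dates inconsistent with the reported cycle
    ovulation_day = cycle_length_days - luteal_days
    # Window: six days before ovulation through one day after (8 days total).
    if ovulation_day - 6 <= cycle_day <= ovulation_day + 1:
        return "fertile"
    return "non-fertile"

# Hypothetical rater with a 29-day cycle whose last menses began on 1 January.
onset = date(2024, 1, 1)
print(cycle_phase(onset, date(2024, 1, 10), 29))  # cycle day 10
print(cycle_phase(onset, date(2024, 1, 20), 29))  # cycle day 20
```

With a 29-day cycle this places ovulation on cycle day 15, so the fertile window spans cycle days 9 through 16.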
F0 and F0-SD were correlated in each of the three recording types: control (r=0.43, p<0.001), courtship (r=0.44, p<0.001), and competitive (r=0.35, p<0.001). Both F0 and F0-SD changed between the control and the two experimental conditions, and the change in F0 was significantly correlated with the change in F0-SD for both competitive (r=0.43, p<0.001) and courtship (r=0.43, p<0.001) recordings. Within participants, F0 and F0-SD were correlated with themselves across each of the three recordings (mean F0: average r=0.92; F0-SD: average r=0.50).
Df was correlated with F0-SD only in the control recording (r=0.26, p<0.01) and was not correlated with F0 for any of the three recording types. Df did not change significantly from control to either courtship (t(109)=0.20, n.s.) or competitive (t(109)=0.15, n.s.) recordings.
Intensity was positively correlated with mean F0 for both competitive (r=0.31, p<0.01) and courtship (r=0.26, p<0.01) recordings.
Vocal physical dominance was significantly correlated with F0-SD (r=−0.29, p<0.01) and content physical dominance (r=0.65, p<0.001). Vocal social dominance was significantly correlated with F0-SD (r=−0.21, p<0.05), intensity (r=0.19, p<0.05), and content social dominance (r=0.60, p<0.001). (See Table 2 for marginal and nonsignificant correlations.)
Vocal short-term attractiveness ratings by fertile-phase females were correlated with mean F0 (r=−0.31, p<0.001), F0-SD (r=−0.24, p<0.01), long-term vocal attractiveness ratings (r=0.77, p<0.001), and short-term content attractiveness ratings (r=0.27, p<0.01). Vocal short-term attractiveness ratings by non-fertile-phase females were correlated with mean F0 (r=−0.24, p<0.001), F0-SD (r=−0.25, p<0.01), long-term vocal attractiveness ratings (r=0.69, p<0.001), and short-term content attractiveness ratings (r=0.27, p<0.01).
Vocal long-term attractiveness ratings by fertile-phase females were correlated with mean F0 (r=−0.32, p<0.001), F0-SD (r=−0.29, p<0.01), intensity (r=0.22, p<0.05), and long-term content attractiveness ratings (r=0.26, p<0.01). Vocal long-term attractiveness ratings by non-fertile-phase females were correlated with mean F0 (r=−0.22, p<0.001), intensity (r=−0.22, p<0.05), and long-term content attractiveness ratings (r=0.34, p<0.001). (See Table 3 for marginal and nonsignificant correlations.)
There was a significant decrease in F0-SD from control (M=12.9) to courtship recording (M=11.6; two-tailed paired t-test: t(110)=3.76, p<0.001). Participants also reduced their F0-SD from control to competitive (M=12.3), but the difference was not statistically significant (t(110)=1.54, p=0.13). Additionally, changes in F0-SD from control to courtship varied significantly with participants’ perceptions of their relative physical dominance (r=−0.23, p=0.02): males who rated themselves as more physically dominant tended to decrease F0-SD to a greater degree. Changes in F0-SD from control to competitive did not vary significantly with relative dominance (r=−0.12, p=0.26). Puts et al. (2006) reported that changes in mean F0 from control to competitive varied with perceptions of relative physical dominance. Controlling for F0-SD did not change this result (partial r=−0.24, p=0.01).
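For readers unfamiliar with the paired comparison used here, the following minimal sketch computes a paired t statistic from within-speaker differences. The five pairs of F0-SD values are invented for illustration, and `paired_t` is a hypothetical helper; with the study's 111 speakers the same calculation has 110 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(before) != len(after):
        raise ValueError("samples must be paired one-to-one")
    # Each speaker contributes one difference score.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical F0-SD values for five speakers (not data from the study).
control = [13.0, 12.5, 14.0, 11.0, 13.5]
courtship = [11.5, 12.0, 12.0, 10.5, 12.0]

t_value, dof = paired_t(control, courtship)
print(round(t_value, 2), dof)
```

A positive t indicates that F0-SD was higher in the control recording than in the courtship recording, which is the direction of the effect reported above.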
Acoustic and content variables were entered into multiple regression models to predict attractiveness and dominance ratings. Given that the predictor variables were often correlated, multicollinearity was a concern. Consequently, variance inflation factors (VIFs) were assessed in order to test for multicollinearity in all multiple regression models. In all models, VIFs were <1.3 for each of the predictors, and therefore the results of these multiple regression analyses are unlikely to be confounded by multicollinearity.
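One way to see what the VIF check involves: for each predictor, the VIF equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The sketch below implements that identity in plain Python. The data are invented, and the helper names (`pearson`, `invert`, `vifs`) are hypothetical; the statistical package used for the reported analyses is not stated in the text.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small square matrices)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(predictors):
    """Map each predictor name to its variance inflation factor."""
    names = list(predictors)
    corr = [[pearson(predictors[a], predictors[b]) for b in names]
            for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Hypothetical values for three predictors across six speakers.
example = {
    "mean_f0": [100.0, 110.0, 120.0, 130.0, 140.0, 150.0],
    "f0_sd": [10.0, 12.0, 11.0, 14.0, 13.0, 16.0],
    "intensity": [60.0, 58.0, 63.0, 61.0, 65.0, 59.0],
}
result = vifs(example)
print({name: round(value, 2) for name, value in result.items()})
```

A VIF of 1 means a predictor is uncorrelated with the others; values below 1.3, as reported here, mean that less than about a quarter of each predictor's variance is shared with the remaining predictors.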
Because F0 and F0-SD are positively correlated (Tables 2 and 3), it is possible that previously reported correlations concerning the relationship between F0 and dominance (or attractiveness) are due instead to the influence of F0-SD. In order to assess the independent effects of mean F0 and F0-SD, both variables were entered simultaneously along with mean duration, intensity, Df, and mean content rating into separate multiple regression models for each of the two dominance variables (see Table 4). These models explained 49.7% (R=0.705, F(6,101)=16.63, p<0.001) of the variance in physical dominance ratings and 39% (R=0.625, F(6,101)=10.77, p<0.001) of the variance in social dominance ratings. F0-SD significantly predicted judgments of physical dominance (β=−0.16, p<0.05), and mean F0 approached significance as a predictor (β=−0.14, p=0.08). Neither F0-SD (β=−0.09, n.s.) nor mean F0 (β=−0.14, n.s.) explained a significant portion of the variance in judgments of social dominance. Df also did not capture any additional variance in physical (β=−0.06, n.s.) or social (β=−0.06, n.s.) dominance ratings; nor did intensity (physical: β=0.08, n.s.; social: β=0.11, n.s.). Dominant content ratings were a strong positive predictor of both physical (β=0.62, p<0.001) and social (β=0.56, p<0.001) dominance ratings.
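The model summaries above report the multiple correlation R, the variance explained, and an F statistic with 6 and 101 degrees of freedom. These quantities are linked by standard identities, and the short sketch below checks the two dominance models against them. The function names are hypothetical, and small discrepancies are expected because the published figures are rounded.

```python
def f_from_r(multiple_r, df_model, df_resid):
    """F statistic implied by a multiple correlation R."""
    r2 = multiple_r ** 2
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

def r_squared_from_f(f_value, df_model, df_resid):
    """Proportion of variance explained implied by an F statistic."""
    return (f_value * df_model) / (f_value * df_model + df_resid)

# Reported values for the two dominance models (6 predictors, 101 residual df).
physical_f = f_from_r(0.705, 6, 101)          # reported F = 16.63
social_f = f_from_r(0.625, 6, 101)            # reported F = 10.77
physical_r2 = r_squared_from_f(16.63, 6, 101)  # reported 49.7%
social_r2 = r_squared_from_f(10.77, 6, 101)    # reported 39%

print(round(physical_f, 2), round(social_f, 2))
print(round(physical_r2, 3), round(social_r2, 3))
```

Both models reproduce the reported variance explained to within rounding, which is a quick way to confirm that R, the percentage, and F in a summary line describe the same fit.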
In order to control for possible interactions between acoustic parameters and content, we conducted exploratory stepwise multiple regressions to examine which variables would explain significant proportions of the variation in ratings when all variables and their interactions were included in the analyses. Vocal parameters, content, and the interaction terms (the products of each of the vocal parameters with content) were entered into a stepwise multiple regression to predict physical and social dominance.
The relationships between vocal parameters and perceptions of physical dominance were relatively unchanged by these controls: among the main effects, only F0-SD and physically dominant content were significant predictors (β=−0.18, p=0.011 and β=0.62, p<0.001, respectively). The interaction between Df and content (β=−0.16, p=0.021) also significantly predicted physical dominance. To explore this interaction, we regressed physical dominance on Df at one standard deviation above and at one standard deviation below the mean of physically dominant linguistic content. There was a marginally significant inverse relationship between physical dominance and Df for those with highly dominant content only (β=−0.20, p=0.055). In other words, a low, masculine Df tended to predict higher ratings of physical dominance only for men who used physically dominant language.
For perceptions of social dominance, the effects were similar to the initial model: among the main effects, only socially dominant content was a significant predictor (β=0.52, p<0.001). The interaction between duration and content (β=−0.25, p<0.01) was also a significant predictor. An analysis of the simple slopes revealed a significant positive relationship between duration and social dominance for those with higher socially dominant content (β=0.35, p<0.05), but the inverse for those with lower socially dominant content (β=−0.28, p<0.05). That is, speaking for a longer time increased social dominance ratings for those who used socially dominant language but decreased them for those who did not. In addition, the interaction between intensity and content (β=−0.15, p=0.04) achieved significance. There was a significant positive relationship between social dominance and intensity for those with low socially dominant content only (β=0.25, p<0.05). In other words, for men who used language that was perceived as lacking in social dominance, speaking more loudly increased their social dominance ratings.
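The simple-slopes probing used for these interactions can be illustrated algebraically. In a model of the form rating = b0 + b1·voice + b2·content + b3·(voice × content), the slope of the vocal parameter at a given content level is b1 + b3·content. The sketch below evaluates that expression at one standard deviation above and below the mean of standardized content. The coefficients are hypothetical and are not taken from the reported models, which may instead have been probed by re-centering content and refitting (an equivalent procedure).

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Slope of the predictor when the moderator is held at a given value."""
    return b_predictor + b_interaction * moderator_value

# Hypothetical standardized coefficients (for illustration only).
b_voice = -0.05        # main effect of the vocal parameter
b_voice_x_content = -0.15  # voice-by-content interaction

# Content is assumed standardized, so +1 and -1 are one SD above and below.
slope_high_content = simple_slope(b_voice, b_voice_x_content, 1.0)
slope_low_content = simple_slope(b_voice, b_voice_x_content, -1.0)

print(round(slope_high_content, 2), round(slope_low_content, 2))
```

With a negative interaction term, the vocal parameter's slope becomes more negative as content ratings rise, which is the pattern described for Df and physically dominant content.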
For neither physical nor social dominance were there significant interactions between speech content and either mean F0 or F0-SD.
Multiple regression models were used to assess the impact of vocal parameters on women’s attractiveness judgments. These models explained 25.3% (R=0.503, F(6,101)=5.52, p<0.001) and 29% (R=0.538, F(6,98)=6.67, p<0.001) of the variance in fertile-phase women’s judgments of short- and long-term attractiveness, respectively, and 23.2% (R=0.482, F(6,101)=5.10, p<0.001) and 20.4% (R=0.451, F(6,101)=4.30, p<0.01) for non-fertile women’s judgments. For the case most revealing of good-genes mate choice, fertile-phase women judging short-term attractiveness, mean F0 achieved significance (β=−0.27, p<0.01) whereas F0-SD did not (β=−0.15, n.s.). More generally, mean F0 predicted judgments of short- and long-term attractiveness by fertile-phase women (β=−0.27, p<0.01 and β=−0.24, p<0.05, respectively) and of long-term attractiveness by non-fertile women (β=−0.21, p<0.05); its effect on non-fertile women’s short-term judgments was marginal (β=−0.17, p<0.10). Results were inconsistent for F0-SD, which predicted long-term attractiveness judgments by fertile (β=−0.20, p<0.05), but not non-fertile (β=−0.06, n.s.), women, and short-term attractiveness judgments by non-fertile (β=−0.21, p<0.05), but not fertile (β=−0.15, n.s.), women. Df was a significant predictor of short-term and long-term attractiveness judgments by fertile (β=−0.19, p<0.05 and β=−0.17, p<0.05), but not by non-fertile (β=−0.13, n.s. and β=−0.06, n.s.), women. Both intensity and content ratings also positively predicted attractiveness across menstrual phases and mating contexts (β-values≥0.20, p-values<0.05). (See Table 5.)
Stepwise multiple regression analyses were conducted, as described above, for short- and long-term attractiveness for both fertile and non-fertile women. Vocal parameters, attractive content, and the interaction terms (the product of each of the vocal parameters with attractive content) were entered into stepwise multiple regressions predicting attractiveness judgments across cycle phase and mating context.
Mean F0 (β=−0.33, p<0.01), intensity (β=0.22, p=0.02), and attractive content (β=0.21, p<0.02) significantly predicted short-term attractiveness ratings by fertile-phase women, whereas mean F0 (β=−0.26, p<0.01) and attractive content (β=0.26, p<0.01) significantly predicted short-term ratings by non-fertile women. Mean F0 (β=−0.36, p<0.001), intensity (β=0.27, p<0.01), and attractive content (β=0.21, p=0.02) significantly predicted long-term attractiveness ratings by fertile-phase women. Similarly, mean F0 (β=−0.22, p=0.02), intensity (β=0.23, p=0.01), and attractive content (β=0.31, p<0.01) significantly predicted long-term attractiveness ratings by non-fertile women. For non-fertile women, the interaction between mean F0 and long-term attractive content also achieved significance (β=−0.22, p=0.01): low mean F0 increased long-term attractiveness to non-fertile women only for those men with highly attractive content (β=−0.39, p<0.01). There were no significant interactions for either fertile-phase women or short-term mating contexts. Apart from this one interaction, speech content did not interact significantly with either mean F0 or F0-SD.
The present study indicates that multiple acoustic parameters vary in natural competitive- and courtship-generated men’s speech, and that variance in these parameters may be differentially attended to by men evaluating dominance and women evaluating attractiveness. Separate evolved priorities may account for the differences between men’s and women’s judgments.
We present several novel findings on the role of voice in male–male competition and female mate choice. First, lower F0 variation in men’s voices predicted increased perceptions of physical dominance in male listeners. Second, F0 variation captured some variation in attractiveness judgments by women, but results were inconsistent. Our results suggest that mean F0 and F0 variation have independent effects, and that they are attended to differently by men and women judging physical dominance and sexual attractiveness, respectively. The role of F0 variation is likely to be an important consideration in future research on the social ecology of the human voice. Third, women found speech intensity (i.e., volume) attractive across mating context and menstrual phase. Fourth, semantic content strongly predicted dominance and attractiveness assessments. To our knowledge, this is the first study to examine the associations between perceptions of naturally occurring competitive and courtship interactions, their semantic content, and acoustic parameters.
Although previous experimental research has shown that low Df is perceived as dominant by men (Feinberg et al. 2005; Puts et al. 2007; Wolff and Puts 2010), this was not the case in the present study. A possible explanation is that the variance in Df in this sample was smaller than the manipulations in experimental studies. However, in the control condition, one standard deviation of Df was 60 Hz, representing a 6% deviation from the mean, and the increment of manipulation in Puts et al. (2007) was also 6%. Another possible explanation is that the effects of Df may be small relative to the effects of other acoustic parameters and content that varied freely in the present study. Too little variance in dominance judgments may have remained unexplained by content and these other acoustic parameters for Df to explain a significant portion. Finally, although we used only ratings and measurements of unmanipulated recordings in all analyses, participants also heard recordings raised and lowered in both F0 and Df. This may have damped the effects of F0 and Df on ratings of unmanipulated recordings. While some caution in interpreting the relative effects of these acoustic variables within rating contexts (e.g., physical dominance) is needed, our main findings that acoustic variables differentially predicted ratings between contexts (e.g., attractiveness to women vs. dominance to men) are not undermined.
We found that women in the fertile phase of their cycle attended to low Df when rating short- and long-term attractiveness. Feinberg et al. (2005) found no effect of Df on attractiveness, although manipulations of Df did influence perceptions of age, size, and masculinity—qualities that influence attractiveness to females (e.g., Pawlowski and Jasienska 2005). Df is thought to be an index of vocal tract length, which goes through a secondary growth spurt in males during puberty (Fitch and Giedd 1999), putatively under the influence of circulating androgens. Therefore, like mean F0, low Df may indicate genetic quality through its association with immunocompetence—an idea that has not yet been explored in the literature.
Previous studies have shown that lower mean voice pitch increases assessments of some proxies of relative dominance (i.e., height, weight, and age; Collins 2000; Feinberg et al. 2005; Lass and Brown 1978), as well as subjective judgments of physical dominance (Puts et al. 2006, 2007; Wolff and Puts 2010). Although mean F0 and both measures of dominance were correlated, when entered into a multiple regression with other voice parameters, mean F0 failed to achieve statistical significance.
In contrast, mean F0 was a significant predictor of women’s assessments of attractiveness. In a previous study using the same voice recordings, Puts (2005) found a significant negative correlation between mean F0 and women’s attractiveness ratings. In contrast with men’s judgments of dominance (where inclusion of F0 variation, Df, intensity, and duration washes out the effects of mean F0), women’s judgments of attractiveness remain significantly related to F0 in a parallel multivariate analysis. Mean F0 predicted judgments of attractiveness across cycle phase and relationship context. However, the relationship between mean F0 and attractiveness was strongest for fertile-phase women rating short-term attractiveness, suggesting an elevated attention to signals of quality when conception risk is greatest (Gangestad and Thornhill 1998; Penton-Voak et al. 1999; Puts 2005).
Results showed that male voices low in F0 variation were perceived to be physically dominant. In many previous studies of mean F0, F0 variation was held constant or ignored. In this study we found that both mean F0 and F0-SD vary naturally in unmanipulated utterances, and that the two are moderately correlated. By controlling each of these parameters using multiple regression, we found evidence that men attend strongly to F0 variation when making physical dominance attributions.
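To illustrate why controlling for correlated predictors matters, the two-predictor case has a closed form: each standardized regression weight can be written in terms of the three zero-order correlations. The sketch below uses made-up correlations, not values from this study, to show how a predictor that correlates with the outcome on its own can contribute nothing once a correlated second predictor is in the model.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized regression weights for two predictors, from zero-order correlations.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2
    r_12: correlation between the two predictors
    """
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Hypothetical correlations (not the study's values):
# outcome = physical dominance, predictor 1 = mean F0, predictor 2 = F0-SD
beta_mean_f0, beta_f0_sd = standardized_betas(r_y1=-0.20, r_y2=-0.40, r_12=0.50)
print(round(beta_mean_f0, 2), round(beta_f0_sd, 2))
```

In this hypothetical case the weight for mean F0 drops to zero while the weight for F0-SD stays at about −0.4, even though both predictors correlate negatively with the outcome when considered alone. The actual models reported here include more than two predictors, so this is a simplified analogue rather than a reanalysis.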
One proximate mechanism that may clarify the relationship between low F0 variation and perceptions of high physical dominance is autonomic arousal. Both mean F0 and F0 variation are increased during states of heightened autonomic arousal. The vocal characteristics of high-activation emotions (e.g., panic fear and “hot” anger) include high mean F0 and high F0 variation (Banse and Scherer 1996). F0 increases as a result of greater tension on the vocal folds (Titze 2000), which may also increase F0 variation owing to the loss of fine motor control (Goedeking 1988).
On an ultimate level, indicators of arousal may be important signals in dominance contexts. The primary function of autonomic activation in antagonistic interactions is to prepare the body for fighting or fleeing. This physiological preparation affects the laryngeal muscles, which in turn affect voice characteristics (Scherer 1986). Perhaps only those who are assured of their own size, strength, or ability can afford not to prepare for a costly fight or a quick flight. In this way, lack of arousal may function as a costly signal (Scherer 1986; Zahavi 1982). Thus, vocal signals associated with lack of arousal (low F0 variation and low F0) may indicate the signaler’s self-perceptions of relative dominance when facing a challenge.
Constraints of the vocal apparatus may ensure that the signal remains honest and uncorrupted by cheaters. Because the vocal system is not a distinct system but rather is overlaid on more primary structures (e.g., those of respiration and digestion; Johnstone and Scherer 2005; Scherer 1986), and because the qualities of vocalizations are affected by states of the body via its connection with the lungs and diaphragm (Titze 1994), it may be difficult to alter particular vocal signals without altering more basic functioning (e.g., breathing), and vice versa. This functional linkage may account for men’s selective attention to individual differences in F0 variation.
Mean F0 is associated with both the signaler’s vocal fold size and state of arousal. However, F0 variation is not as confounded by individual differences in vocal fold morphology and may therefore provide a purer signal of arousal. These dynamics would apply not only to male dominance contests but to competitive interactions among females as well.
The hypothesized link between F0 variation and arousal in dominance interactions appears to mesh with what is known about F0 variation and friendly interactions. Trainor et al. (2000) proposed that infant-directed speech, with its characteristically exaggerated F0 contours, is highly “emotional” speech that is not categorically different from adult-directed emotional speech. In this sense, F0 variation might be the vocal equivalent of a smile, a friendly signal that may have phylogenetic roots in a submissive gesture (Darwin 1872; Ketelaar et al. 2005). The present findings suggest the converse of this relationship: lack of variation may communicate unfriendly or aggressive intentions.
In contrast to findings with male raters, the relationship between women’s attractiveness judgments and F0 variation was inconsistent; F0 variation predicted attractiveness for non-fertile females rating short-term attractiveness and fertile females rating long-term attractiveness. Several studies have found that moderate to high variation in F0 is most “attractive” to people in general (Ray et al. 1991; Zuckerman and Miyake 1993); however, these studies fail to take into account functional specificity in attractiveness criteria based on evolutionary considerations (i.e., what is physically attractive may depend on the sex of the target and on whether one is discriminating among potential mates or potential allies; Tooby and Cosmides 1992). Riding et al. (2006) found a decrease in social attractiveness in men’s voices with increasing F0 variation; however, the effect was not significant. Clearly, more research is needed to determine whether factors such as individual differences or contextual variation can account for inconsistent findings on the perceived attractiveness of F0 variation.
Previous research has found that preferences for traits theoretically associated with high genetic quality peak during the fertile phase of the ovulatory cycle and/or when evaluating short-term mates (e.g., Gangestad and Thornhill 1998; Penton-Voak et al. 1999; Puts 2005). However, unlike mean F0, F0 variation appears to exhibit low heritability (Debruyne et al. 2002) and thus may be a poor indicator of genetic quality. If F0 variation does not primarily signal genetic quality but rather indicates state-dependent changes in arousal, then increased attention to F0 variation during ovulation or in short-term mating contexts would not necessarily be a good design for sire recruitment. In accordance with this reasoning, we did not find consistent attention to F0 variation in these contexts.
Recent research suggests that the evolutionary effects of female choice in humans may be overestimated in part because a number of traits widely assumed to be attractive to females actually have a greater impact on male dominance (Puts 2010). In parallel with this perspective, we recently showed that, in a sample of college-age men, F0 variation is a better predictor of male mating success than is F0 (Hodges-Simeon et al. 2010).
The results reported here may indicate functional specificity in attention to features of voice (Searcy and Nowicki 2005). When judging attractiveness, women may respond to features of the voice that are indicative of heritable mate quality (i.e., “good genes”)—low mean F0 and Df. Among men, the functionally important signal components for intrasexual competitors may be twofold: previous research has shown that signals of static quality or formidability are important; however, competitors must also attend to signal features (i.e., F0 variation) that indicate more state-dependent qualities, such as arousal.
The present findings also highlight some differences between physical and social dominance. Although the two are overlapping constructs, physical dominance (as rated by men) was related to low F0 variation, but social dominance was not. In humans, there may be multiple ways to attain status (Henrich and Gil-White 2001). Physical dominance was defined for participants in the present study as the ability to win a physical fight. Social dominance, in contrast, may also require effective communication and leadership. Lack of sufficient pitch variation might impair one’s ability to achieve these goals, and therefore the ability to win friends, allies, and supporters. F0 variation in this sense quantifies the zero-sum trade-off between signaling affiliation and aggression. F0 variation across differing contexts may be used to assay the balance of these motivational states in specific social interactions.
Another interesting result was a set of positive relationships between voice intensity and all measures of vocal attractiveness by women. A possible explanation is that voice intensity in men reflects vigor and is analogous to roaring rate in red deer stags, which predicts actual fighting success (Clutton-Brock and Albon 1979) and appears to positively influence attractiveness to females (McComb 1991). However, this interpretation seems contradicted by the lack of statistically significant relationships between intensity and either dominance rating. The role of voice intensity in human female choice and male contests warrants further investigation. In addition, the correlation between intensity and mean F0 suggests the importance of measuring and controlling for variation in intensity when studying the effect of mean F0 on interpersonal perceptions.
State-dependent changes may be ascertained by examining changes in F0 variation from one condition to another. Results indicated that F0 variation decreased from control to courtship and that this change was associated with the participants’ perceptions of their own physical dominance; participants who rated themselves as more physically dominant relative to their competitor tended to lower F0 variation to a greater degree when speaking to the woman. Signaling low arousal via low F0 variation is potentially important to perceptions of both dominance and attractiveness, as our results show. Thus, our male participants might generally have been motivated to lower F0 variation during mating competition. However, physically dominant men were apparently more successful at this, perhaps because they were more confident of their success and therefore less nervous. Although participants also tended to decrease F0 variation from control to competitive condition, this difference was not significant. For a discussion of changes in mean F0 across conditions, see Puts et al. (2006).
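As a sketch of the comparison described above, mean F0 and F0-SD can be computed per condition and the change in F0-SD taken as a simple difference. The function names and the F0 values are hypothetical, and using the sample standard deviation of frame-level F0 estimates is an assumption of this example rather than a description of the measurement procedure used in the study.

```python
from statistics import mean, stdev


def f0_summary(f0_hz):
    """Return (mean F0, F0-SD) in Hz for a sequence of voiced-frame F0 estimates."""
    return mean(f0_hz), stdev(f0_hz)


def f0_sd_change(baseline_hz, target_hz):
    """Change in F0-SD from a baseline to a target condition (negative = less variable)."""
    return f0_summary(target_hz)[1] - f0_summary(baseline_hz)[1]


# Hypothetical F0 tracks in Hz for one speaker (not data from the study)
control = [100.0, 120.0, 110.0, 130.0, 90.0]
courtship = [105.0, 115.0, 110.0, 120.0, 100.0]
print(f0_summary(control), f0_summary(courtship), f0_sd_change(control, courtship))
```

The two made-up tracks share the same mean F0 (110 Hz) but differ in spread, which is the sense in which mean F0 and F0 variation can carry separate information; a negative change indicates that the speaker's pitch was less variable in the second condition.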
Verbal content was a strong predictor of both dominance judgments by men and attractiveness judgments by women. Although verbal content per se is not the focus of this paper, it should come as no surprise that men rated highly on physical dominance used such phrases as “I’m a lot bigger and stronger than most men.” Men rated highly on social dominance often referred to holding office in a fraternity. The male most attractive to female raters spoke of being an athlete; being tall and physically fit; liking to go out, watch movies, and hang out; as well as having a good sense of humor. We also examined content-by-acoustic-parameter interactions in order to determine whether content moderates perceptions of different vocal characteristics (Jones et al. 2008). Overall, there were very few interactions and no interactions between content and either F0 or F0-SD.
Acoustic parameters and verbal content explained 20–50% of the variance in perceptions of dominance and attractiveness. Although a variety of idiosyncratic preferences could account for the amount of variance unexplained, there may be other acoustic factors that systematically affect participants’ perceptions of dominance and attractiveness.
Dominance and attractiveness assessments depend on a variety of factors in multiple sensory modalities (Searcy and Nowicki 2005). We explored the potential effects of natural variation in multiple voice parameters on such assessments in unscripted speech samples from an ecologically valid competitive interaction, statistically controlling potential confounding variables via multiple regression. The present research suggests that multiple vocal parameters affect dominance and attractiveness judgments among humans, and that these acoustic variables may be weighted differently by men and women as a function of their different evolved priorities.
We thank Lisa Brevard, Jerome Lee, Rebecca Prosser, John Putz, Melinda Putz, and Linda Snyder for their conscientious assistance in study preparation and data collection; Julio Gonzalez and Drew Rendall for their advice on measuring formant frequencies; Katherine Verdolini Abbott for providing research support; and Jane Lancaster, Nathan Pipitone, and two anonymous reviewers for their helpful comments on a previous draft of this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Carolyn Hodges Simeon received a master’s degree in psychology from the College of William and Mary. She is currently pursuing a doctorate in Integrative Anthropological Sciences at the University of California, Santa Barbara. She is interested in sex differences, and her dissertation research will focus on factors affecting the development of sexually dimorphic voice characteristics during puberty.
Steven J. C. Gaulin earned his B.A. in anthropology and psychology from U.C. Berkeley and his Ph.D. in biological anthropology from Harvard University. He taught anthropology, animal behavior and psychology for 25 years at the University of Pittsburgh, and is now a professor of anthropology at U.C. Santa Barbara, as well as coeditor-in-chief of Evolution and Human Behavior. Sex differences—in anatomy, physiology, behavior and cognition—are his primary research focus. He has seen one-third of the world’s extant bird species in the wild, and he plays acoustic finger-style guitar.
David Puts (lab website: http://www.putslab.psu.edu/index.html) is an assistant professor in the Department of Anthropology at Penn State University and is co-funded by the Penn State Children, Youth, and Family Consortium. His research focuses on the neuroendocrine and evolutionary bases of human sexuality and sex differences. He was the 2004 recipient of the Human Behavior and Evolution Society New Investigator Award, was elected member of the International Academy of Sex Research in 2006, and is on the Editorial Board of the journal Archives of Sexual Behavior.