The goal of this section is to show how secondary articulation, coarticulation, and prosodic boundaries affect the number of contacts, the duration, and the F2-F1 difference in Russian trills. Measurements are presented separately for the number of contacts, the duration, and tongue fronting. Figures present averages across subjects, and tables present the statistical analyses.
3.1. Number of contacts
For each token in the data, the number of contacts was assigned to one of three categories: 0 contacts, 1 contact, or 2–3 contacts. No token had more than 3 contacts, and tokens with 3 contacts were few enough to warrant pooling them with the 2-contact tokens. The 0-contact tokens will be termed “approximant”, the 1-contact tokens “tap”, and the 2–3-contact tokens “full trill”. The percentage of each contact category was computed for each level of secondary articulation, word position, and vowel. Figure 2 presents these results averaged across the subjects, since the subjects are highly consistent in these percentages. Each bar in the figure is broken down into the percentages of each of the categories. Figure 2a shows that the percentage of approximants is approximately the same for /r/ and /rj/. For /rj/, nearly all of the remaining tokens (79% of /rj/ tokens) are taps, whereas for /r/ about 57% are taps and 26% are full trills. Full trills account for only 1% of the /rj/ tokens. It is therefore clearly not the case that /r/ and /rj/ are realized in the same way in terms of the number of contacts: the main asymmetry between the two trills is that full trilling (2–3 contacts) occurs almost exclusively for the non-palatalized trill. Figure 2b shows that full trilling occurs primarily for initial and final trills, and only rarely for intervocalic and preconsonantal trills, which have more approximant realizations. Figure 2c shows that the contiguous vowel does not affect the number of contacts, as can be seen from the similarity of the breakdown across vowels.
Figure 2. Distribution of 0, 1, and ≥2 contacts as a function of a) Secondary Articulation, b) Word Position: Initial (In), Intervocalic (IV), Preconsonantal (PC), and Final (Fn), c) Contiguous vowel. Each bar shows the percentage breakdown by contact category.
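The categorization and percentage breakdown described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the token layout (a list of dicts with a `contacts` count and a factor label) and the toy values are assumptions made for the example.

```python
from collections import Counter, defaultdict


def contact_category(n_contacts):
    """Map a raw contact count to the three categories used in the text."""
    if n_contacts == 0:
        return "approximant"
    if n_contacts == 1:
        return "tap"
    return "full trill"  # 2 and 3 contacts are pooled into one category


def category_percentages(tokens, factor):
    """Percentage of each contact category within each level of `factor`.

    `tokens` is a list of dicts holding a "contacts" count plus factor
    labels; this layout is hypothetical, chosen only for illustration.
    """
    counts = defaultdict(Counter)
    for tok in tokens:
        counts[tok[factor]][contact_category(tok["contacts"])] += 1
    return {
        level: {cat: 100.0 * n / sum(c.values()) for cat, n in c.items()}
        for level, c in counts.items()
    }


# Toy tokens, invented for illustration only (not the study's data)
tokens = [
    {"trill": "r", "contacts": 0},
    {"trill": "r", "contacts": 1},
    {"trill": "r", "contacts": 2},
    {"trill": "r", "contacts": 3},
    {"trill": "rj", "contacts": 0},
    {"trill": "rj", "contacts": 1},
    {"trill": "rj", "contacts": 1},
    {"trill": "rj", "contacts": 1},
]
percentages = category_percentages(tokens, "trill")
```

Passing a different factor key (for example a word-position or vowel label, if the tokens carry one) gives the breakdown for the other panels.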
To statistically investigate how Secondary Articulation, Vowel, and Word Position affect the number of contacts, we used the logistic regression instantiation of the Generalized Linear Model within a Linear Mixed-Effects framework (Baayen, 2007) to account for the fact that the data from each participant are correlated. In the model, the error distribution is taken to be binomial to allow for binary responses, with the logit transformation used to linearize probabilities. Several models have been used to analyze ordinal data (Agresti, 2002). The one we felt was most appropriate for the data analyzed here is the cumulative Logit Model, which estimates the probability that the dependent variable is less than or equal to a particular value. In this case, two models were tested: 1) the Logit_1 model tests whether the factors and their interactions increase or decrease the odds of a trill having 1 or more contacts vs. 0 contacts, and 2) the Logit_2 model tests whether the factors and their interactions increase or decrease the odds of a trill having 2 or more contacts vs. fewer than 2 contacts (0 or 1 contacts). The results are shown in the table below.
Generalized Linear Mixed Model Results for Contacts as a function of Secondary Articulation, Vowel, and Word Position.
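The two cumulative splits behind Logit_1 and Logit_2 can be illustrated with a short sketch. It only derives the two binary responses and their raw (empirical) log odds; it does not fit the mixed-effects logistic regression reported here. The function names and the toy contact counts are assumptions.

```python
import math


def cumulative_splits(n_contacts):
    """Return the two binary responses for one token.

    logit_1: 1 if the token has at least one contact (vs. 0 contacts)
    logit_2: 1 if the token has two or more contacts (vs. 0 or 1)
    """
    return {"logit_1": int(n_contacts >= 1), "logit_2": int(n_contacts >= 2)}


def empirical_log_odds(responses):
    """Log odds of response 1 vs. 0; both outcomes must occur."""
    ones = sum(responses)
    zeros = len(responses) - ones
    if ones == 0 or zeros == 0:
        raise ValueError("log odds undefined when one outcome is absent")
    return math.log(ones / zeros)


# Toy contact counts, invented for illustration only
contacts = [0, 1, 1, 2, 3, 1, 0, 2]
splits = [cumulative_splits(n) for n in contacts]
log_odds_1 = empirical_log_odds([s["logit_1"] for s in splits])  # 6 vs. 2 tokens
log_odds_2 = empirical_log_odds([s["logit_2"] for s in splits])  # 3 vs. 5 tokens
```

In a full analysis, each binary response would serve as the dependent variable of its own logistic model with the fixed factors and a participant random effect.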
For the Logit_1 model, Secondary Articulation and Vowel showed no effect on the odds of having zero vs. one or more contacts. For Word Position, Final, Intervocalic, and Preconsonantal trills all have a significantly higher chance of having 0 contacts, i.e., of being approximants.
For the Logit_2 model, /rj/ has a significantly lower chance of having 2 or 3 contacts than /r/. Also, Initial trills have a significantly higher chance of having 2 or 3 contacts than trills in all other positions. Final trills also have a significantly higher chance of having 2–3 contacts than Preconsonantal trills. There was no statistically significant interaction between Secondary Articulation and Vowel. In summary, palatalization all but prohibits full trilling (≥2 contacts), approximants (0 contacts) are more likely in intervocalic and preconsonantal position, and full trilling is more likely at the boundary, especially for initial trills.
3.2. Duration
Panel (a) of the duration figure below shows the normalized means of trill duration across subjects as a function of secondary articulation. The normalization was performed by subject-centering the data, that is, each subject’s mean trill duration was subtracted from the duration of each of that subject’s tokens. Subjects do not show a sizable difference in duration as a function of secondary articulation: the difference is only about 5 ms. Panel (b) shows the mean effect of Word Position on Duration. When the means are broken down by participant, we find that all participants show Final trills as longer than Intervocalic and Preconsonantal trills, except for Participant M5, whose Intervocalic and Final trills have comparable durations. Participants W2, W3, W4, M1, M2, and M4 all produce Final trills with longer durations than Initial ones, but Participant M3 has Initial trills longer than Final ones, while Participants M1 and M5 show comparable durations for Initial and Final trills. Furthermore, Initial and Final trills are consistently longer than word-medial trills. Panel (c) shows the relation between contiguous vowel and trill duration. Across participants, trills contiguous to /a/ are not consistently longer than trills contiguous to /u/, but all participants other than W4 show trills contiguous to /a/ as longer than those contiguous to /i/. In addition, unpalatalized trills contiguous to /u/ are consistently longer than unpalatalized trills contiguous to /ɨ/.
Mean normalized duration by a) Secondary Articulation, b) Word Position: Initial (In), Intervocalic (IV), Preconsonantal (PC), and Final (Fn), c) Contiguous vowel. Durations are normalized by centering each subject’s data on that subject’s mean duration.
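The subject-centering normalization is simple enough to state in a few lines. The sketch below assumes tokens are stored as (subject, duration) pairs; that layout and the toy durations are illustrative assumptions.

```python
from collections import defaultdict


def center_by_subject(tokens):
    """Subtract each subject's mean duration from that subject's tokens.

    `tokens` is a list of (subject, duration_ms) pairs; the result is a
    list of (subject, centered_duration_ms) pairs in the same order.
    """
    by_subject = defaultdict(list)
    for subject, duration in tokens:
        by_subject[subject].append(duration)
    means = {s: sum(d) / len(d) for s, d in by_subject.items()}
    return [(s, d - means[s]) for s, d in tokens]


# Toy durations in ms, invented for illustration only
tokens = [("W2", 60.0), ("W2", 80.0), ("M1", 40.0), ("M1", 50.0), ("M1", 60.0)]
centered = center_by_subject(tokens)
```

After centering, every subject's mean is zero, so group means computed over the pooled data are not dominated by speakers with generally longer or shorter trills.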
To examine how Secondary Articulation (SA), Contiguous Vowel (V), and Word Position (WP) affect the duration of Russian trills, a Linear Mixed-Effects General Linear Model test was performed with Dependent variable Duration, with fixed effects factors: SA (Levels: Palatalization, Velarization), V (Levels: /a/, /u/, /i/ with /rj/ and /a/, /u/, /ɨ/ with /r/) and WP (Initial, Intervocalic, Preconsonantal, Final), and Random Factor: Participant. Both main effects and interactions were estimated. Markov Chain Monte Carlo simulations with 10000 iterations were used for significance testing and estimating 95% Confidence Intervals (CI) for the effect coefficients, as described in Section 2.
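One way to write down a model of this kind is given below as a sketch. It assumes a random intercept for each participant and Gaussian errors; the text specifies Participant as a random factor but does not state the random-effects structure, so those details are assumptions, as is the notation.

```latex
% y_{ij}: duration of token i produced by participant j
% x_{ij}: dummy-coded levels of SA, V, WP and their interactions
% u_j:    participant-specific deviation (assumed random intercept)
\[
  y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij},
  \qquad
  u_j \sim \mathcal{N}(0, \sigma_u^2),
  \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

Under this formulation, each fixed-effect coefficient in $\boldsymbol{\beta}$ is a difference in duration (in ms) relative to the reference level, and the confidence intervals reported for the effect coefficients refer to these quantities.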
The table below shows the statistical results for duration. For the SA contrast between palatalized and non-palatalized trills, there was no significant difference in duration. Trills contiguous to the back vowels /u/ and /a/ are significantly longer than those contiguous to front /i/ and central /ɨ/, but are not significantly different from each other.
Generalized Linear Mixed Model Results for Duration as a function of Secondary Articulation, Vowel, and Word Position.
Word Position shows two main types of effects: trills at the boundary are longer than trills inside the domain, and boundary-final trills are longer than boundary-initial trills for most participants.
To investigate the intrinsic relation between the number of contacts and duration, we face the basic problem that number of contacts is a categorical variable, whereas duration and F2-F1 are continuous variables. To deal with this problem, we split the scale of Duration into ten parts and evaluated, in each part of the scale, the odds of having 0 vs. 1 or more contacts and the odds of having ≥2 vs. 0 or 1 contacts. The odds ratio for 0 vs. 1, 2, or 3 contacts indicates the likelihood of an approximant realization, whereas the odds ratio for ≥2 vs. 0 or 1 contacts indicates the likelihood of a fully trilled realization. Duration and the two odds ratios were correlated with one another, and R², the proportion of explained variation, was then calculated for each relation. The results are presented below. Relations with high R² are highlighted in gray.
R² for the relation between duration and number of contacts.
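The binning procedure can be sketched as follows. Several details are assumptions not stated in the text: equal-width bins, correlating the bin midpoint with the raw odds (rather than log odds), and skipping bins in which one of the two outcomes is absent. The toy data use two bins only to keep the example small.

```python
def odds_by_duration_bin(tokens, n_bins=10):
    """Split the duration range into equal-width bins and return, for each
    usable bin, (bin midpoint, odds of 0 contacts vs. 1 or more contacts).

    `tokens` is a list of (duration_ms, n_contacts) pairs. Bins in which
    either outcome is absent are skipped, since the odds would be zero
    or undefined there.
    """
    durations = [d for d, _ in tokens]
    lo, hi = min(durations), max(durations)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("all durations are identical; cannot form bins")
    zero = [0] * n_bins
    nonzero = [0] * n_bins
    for d, contacts in tokens:
        idx = min(int((d - lo) / width), n_bins - 1)  # top edge joins last bin
        if contacts == 0:
            zero[idx] += 1
        else:
            nonzero[idx] += 1
    return [
        (lo + (i + 0.5) * width, zero[i] / nonzero[i])
        for i in range(n_bins)
        if zero[i] > 0 and nonzero[i] > 0
    ]


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Toy (duration_ms, n_contacts) pairs, invented for illustration only
toy = [(10, 0), (10, 0), (12, 1), (30, 1), (28, 1), (30, 0)]
bins = odds_by_duration_bin(toy, n_bins=2)
```

With real data one would call `odds_by_duration_bin(tokens)` with the default ten bins and pass the midpoints and odds to `r_squared`; the same scheme applies to the ≥2 vs. 0 or 1 split by changing the counting condition.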
As can be seen, within /r/, within /rj/, and across all the data, Duration is a good predictor of whether the trill will be an approximant or will have one or more contacts. The longer a trill, the less likely it is to be an approximant, and the more likely it is to be fully trilled. For the likelihood of having 2 or 3 contacts, duration is a good predictor when the data are analyzed as a whole and within /r/, but not within /rj/, since there is little variability in duration in that category.
3.3. Tongue Fronting
To acoustically investigate how the position of the tongue body varies in a Russian trill as a function of Secondary Articulation, Contiguous Vowel, and Word Position, we used F2-F1. The distance between the first two formants increases as the tongue body assumes a palatal position and decreases as the tongue body assumes a back position. This difference has previously been associated with the sharp-plain distinction (Jakobson, Fant, and Halle, 1952), where sharp configurations have a large F2-F1 and plain configurations have a small F2-F1. We use this measure since the palatalization contrast is a contrast exactly in the front-back positioning of the tongue, and should therefore structure F2-F1 variability. Also, this measure is less speaker-dependent than F2 or F1 alone (Ladefoged, 2001). Sproat and Fujimura (1993) used this measure to investigate the phasing between front and back gestures in American English /l/ and showed that it correlates highly with articulatory measures made using X-ray Microbeam tracking of tongue position. In the current section, we investigate F2-F1 at the beginning and at the end of the trill as a function of Secondary Articulation, Vowel, and Word Position of the trill.
Measurement of F2-F1 “at the beginning of the trill” means F2-F1 in a 25 ms window at the beginning of the first open phase for a trill with 1 or ≥2 contacts, or in the first 25 ms of the approximant for a 0-contact trill. An F2-F1 measure “at the end of the trill” means F2-F1 in a 25 ms window at the end of the last open phase for a trill with 1 or ≥2 contacts, or in the last 25 ms of an approximant. Specifically, we attempt to answer two questions. 1) Does palatalization span the entire trill, i.e., is it parallel with the apical gesture (synchronous), or is it sequential to the apical gesture (asynchronous)? If the two gestures are synchronous, then the palatalization contrast as expressed in F2-F1 should be equally detectable at the beginning and end of the trill, whereas asynchronous gesture organization would show an F2-F1 contrast at one end of the trill, but not the other. The effect of word position and vowel on whether the gestures are synchronous or not is also investigated. 2) Does the syllable’s vowel span the entire trill in its influence, or only the beginning or end of the trill? We attempt to answer this question by measuring the difference between the effects of each vowel pair (e.g. /a/ vs. /u/) on the beginning and end of the non-palatalized trill by itself or the palatalized trill by itself at each position in the word. If the difference between vowel pairs can be detected equally at the beginning and end of the trill, then the contiguous vowel spans the entire trill, whereas if the difference can only be detected at one side of the trill, there is evidence that vowels overlap with only a part of the trill.
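The two window measurements and their difference can be sketched as below. The sketch assumes a formant track stored as (time, F1, F2) samples and takes the mean of F2-F1 over the samples in each 25 ms window; the text does not say how values within the window were summarized, so the averaging, the data layout, and the function names are assumptions.

```python
def mean_f2_minus_f1(track, start_ms, end_ms):
    """Mean F2-F1 (Hz) over samples with start_ms <= time < end_ms.

    `track` is a list of (time_ms, f1_hz, f2_hz) samples; this layout
    is hypothetical.
    """
    diffs = [f2 - f1 for t, f1, f2 in track if start_ms <= t < end_ms]
    if not diffs:
        raise ValueError("no formant samples in the requested window")
    return sum(diffs) / len(diffs)


def trill_f2_f1(track, onset_ms, offset_ms, window_ms=25.0):
    """F2-F1 at the beginning and end of the measured span, and the change.

    For trills with contacts, `onset_ms` is the start of the first open
    phase and `offset_ms` the end of the last open phase; for
    approximants they are the start and end of the approximant.
    """
    begin = mean_f2_minus_f1(track, onset_ms, onset_ms + window_ms)
    end = mean_f2_minus_f1(track, offset_ms - window_ms, offset_ms)
    return {"begin": begin, "end": end, "change": end - begin}


# Toy track sampled every 5 ms with a rising F2, invented for illustration
track = [(float(t), 500.0, 1500.0 + 10.0 * t) for t in range(0, 60, 5)]
result = trill_f2_f1(track, onset_ms=0.0, offset_ms=60.0)
```

A positive `change` corresponds to F2-F1 increasing through the trill; a contrast that is present in `begin` but absent in `end` (or the reverse) is the pattern the asynchronous account predicts.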
The first figure below shows the means across all participants of F2-F1 at the beginning and end of the trill for /r/ and /rj/, as well as the means for the change in F2-F1 from the beginning to the end of the trill. To allow comparison across participants with different formant ranges, F2-F1 was normalized for this figure by subtracting from each token the speaker’s mean F2-F1, calculated across all the data for that participant. As can be seen, all participants show a large difference between F2-F1 measured at the beginning/end of a palatalized trill and that measured at the beginning/end of a non-palatalized trill. When all the trills are combined, as they are in this figure, participants seem to consistently show little change through the trill from beginning to end in /r/, whereas for /rj/ there is a small amount of change, around 100–200 Hz. This figure combines data for all the trill categories. In the second figure below (Pal Init), the mean phonetic differentiation in F2-F1 between /r/ and /rj/ at the initiation of the trill is shown for approximant and tap trills. Full trills (≥2 contacts) were not included, since there are so few cases of them for /rj/. The data in this figure are not normalized by subject, to show that the effects are present even without normalization. It can be seen that at the beginning of the trill there is still high differentiation between the two trills. Crucially, the differentiation does not depend on word position. However, the differentiation is greater contiguous to the non-back vowel than in the back vowel context, which could be due to back vowels having a lower F2 before the palatalized segment than before the non-palatalized one. The third figure below (Pal Change) shows the mean change in F2-F1 across the trill (F2-F1 at the end of the trill minus F2-F1 at the initiation of the trill). It can be seen that onset (initial and intervocalic) palatalized trills, for both approximants and taps, show an increase in F2-F1, but that coda palatalized trills do not show a change in F2-F1 across the trill. When viewed by position, the non-palatalized trills show almost no change in F2-F1 across the trill; however, when the data are separated by vowel, there is a positive change in F2-F1 contiguous to /a/ and /u/ only.
F2-F1 at the a) beginning and b) end of the trill, and c) change in F2-F1 across the trill.
F2-F1 at the beginning of the trill (Pal Init) for approximants and taps, divided by position in the domain.
Change in F2-F1 across the trill (Pal Change) for approximants and taps, divided by position in the domain.
The tables below show the significant results of the General Linear Mixed Model tests for F2-F1 at the beginning of the trill, F2-F1 at the end of the trill, and the F2-F1 change across the trill. The Vowel and Word Position effects were investigated by separating the data into the two trill classes, since the data behave quite differently in the two classes. The significant results reflect the patterns seen for individual participants. Some Word Position effects do reach significance. Specifically, initial trills show significantly higher F2-F1 than final and preconsonantal trills at the end of palatalized trills, and initial trills also show higher F2-F1 than intervocalic and preconsonantal trills at the beginning of non-palatalized trills. However, these effects are relatively small, all less than 150 Hz, and their confidence intervals are quite wide. It can be seen from the individual participant data that some participants show these effects to a large extent and therefore bias the significance of the results. We will therefore not consider these effects to be robust. The table for F2-F1 change shows that palatalized trills show an increase of about 130 Hz through the trill.
Generalized Linear Mixed Model Results for F2-F1 at the end of the trill as a function of Secondary Articulation, Vowel, and Word Position.
Generalized Linear Mixed Model Results for F2-F1 at the beginning of the trill as a function of Secondary Articulation, Vowel, and Word Position.
Generalized Linear Mixed Model Results for F2-F1 Change in the trill as a function of Secondary Articulation, Vowel, and Word Position.
The three figures below show how vowel contrasts are exhibited at the beginning and end of the trill, as measured by Cohen’s d distance between the distributions of F2-F1 for each vowel pair, measured at the beginning and end of each trill. The first figure is for the /a/-/ɨ/ and /a/-/i/ distinctions, the second is for the /ɨ/-/u/ and /i/-/u/ distinctions, and the third is for the /a/-/u/ distinction. The upper row in each figure shows data for /r/, while the bottom row shows data for /rj/. The left column in each figure shows data for 1-contacts, while the right column shows data for 0-contacts. The major generalization arising from this investigation is as follows: the contrast between vowels is detectable in the portions of the trill close to vowels, but is poorly detectable or not detectable at all in regions of the trill farther from the vowel. For instance, in /r/, the /ɨ/-/a/ distinction is above two standard deviations in all positions in the trill, except for the beginning of the initial trill and the end of the preconsonantal and final trills. These positions differ from all the others in that they are not contiguous to the vowel. Examination of the various vowel distinctions reveals, however, that the distinction at the beginning of initial position is the only one that is always compromised. Several vowel distinctions are clearly detectable for all participants at the end of the preconsonantal and final positions. The /a/-/u/ distinction for /rj/ is the least detectable across contexts, whereas that same vowel distinction is highly detectable in /r/.
Distances between F2-F1 distributions in /a/ and /ɨ/ (top) and /a/ and /i/ (bottom) in standard deviations at the beginning and end of each trill as a function of Word Position. Top: 1-Contacts. Bottom: 0-Contacts.
Distances between F2-F1 distributions in /u/ and /ɨ/ (top) and /u/ and /i/ (bottom) in standard deviations at the beginning and end of each trill as a function of Word Position. Top: 1-Contacts. Bottom: 0-Contacts.
Distances between F2-F1 distributions in /a/ and /u/ in standard deviations at the beginning and end of each trill as a function of Word Position. Top: 1-Contacts. Bottom: 0-Contacts.
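For readers unfamiliar with the distance measure, Cohen's d expresses the difference between two means in units of standard deviation. The sketch below uses the common pooled-standard-deviation form; which variant was used in this study is not stated, so the pooling is an assumption, and the sample values are invented.

```python
import math


def cohens_d(sample_a, sample_b):
    """Cohen's d using the pooled standard deviation (n - 1 variances)."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    ma, mb = sum(sample_a) / na, sum(sample_b) / nb
    var_a = sum((x - ma) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mb) ** 2 for x in sample_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (ma - mb) / pooled_sd


# Toy F2-F1 values (Hz) for two vowel contexts, invented for illustration
f2_f1_vowel_1 = [1200.0, 1300.0, 1400.0]
f2_f1_vowel_2 = [900.0, 1000.0, 1100.0]
d = cohens_d(f2_f1_vowel_1, f2_f1_vowel_2)
```

Computing this value separately for the beginning and the end of the trill, per vowel pair and word position, yields the kind of quantity plotted in the figures; a distance above two standard deviations is the level the text treats as a clearly detectable contrast.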
To investigate statistically the extent of coproduction of the trill with its contiguous vowel, as a function of the trill’s position in the word, we tested for the interaction of Vowel and Word Position on F2-F1 at the start of the trill and at the end of the trill, within /rj/ and within /r/. The results for /rj/ are in the first table below. It can be seen that the vowel distinctions (/i/ vs. /u/) and (/i/ vs. /a/) have an effect at the beginning and end of the palatalized trill in Intervocalic, Preconsonantal, and Final positions. However, in Initial position, the vowel distinctions are significant at the end of the trill (immediately preceding the vowel), but not at the beginning of the trill. The /u/ vs. /a/ distinction has a significant effect only at the beginning of Final trills. The second table below shows the effect of vowel distinctions on the beginning and end of /r/. The same basic pattern can be seen for this trill. The beginning of Initial /r/ shows little (ɨ/u and a/u) to no (ɨ/a) effect of the contiguous vowel distinctions, but all other positions show sizable effects.
Generalized Linear Mixed Model Results for F2-F1 at the beginning and end of /rj/ as a function of Vowel Pair contrast and Word Position.
Generalized Linear Mixed Model Results for F2-F1 at the beginning and end of /r/ as a function of Vowel Pair contrast and Word Position.