HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
 
Clin Linguist Phon. Author manuscript; available in PMC 2011 February 23.
Published in final edited form as:
PMCID: PMC3043988
NIHMSID: NIHMS262157

Remote capture of human voice acoustical data by telephone: A methods study

Abstract

In this pilot study we sought to determine the reliability and validity of collecting speech and voice acoustical data via telephone transmission for possible future use in large clinical trials. Simultaneous recordings of each participant's speech and voice were made at the point of participation, the local recording (LR), and over a telephone line using a dedicated in-line computerized interactive voice recording system, the remote recording (RR). All voice recordings were made from our laboratory telephone located in Groton, Connecticut to the RR system located in Madison, Wisconsin. All data points were compared on a measure-by-measure basis between the LR and RR recordings. The results suggest that both measures of frequency excursion and of speech motor timing are reliably captured over the telephone. Results are discussed in terms of specific acoustic measures that may be useful and accurately measured via telephone transmission, for examining disease severity and pharmacological intervention for use in a large-scale clinical trial.

Keywords: Acoustics, telephone, assessment

Introduction

A variety of voice acoustical measures have been shown to be potentially sensitive markers of disease severity and therapeutic treatment response in both motor and affective central nervous system (CNS) disorders. These measures, related to either the motor timing aspects of speech or prosodic control of speech, have been employed in the study of diseases ranging from Parkinson's disease (Goberman, Coelho, & Robb, 2002; Goberman & Coelho, 2002a; Goberman & Coelho, 2002b) to Major Depressive Disorder (Ellgring, & Scherer, 1996; Nilsonne, 1987; Nilsonne, Sundberg, Ternstrom, & Askenfelt, 1988; Flint, Black, Campbell-Taylor, Gailey, & Levinton, 1992; Stassen, Kuny, & Hell, 1998; Teasdale, Fogarty, & Williams, 1980). Although the recording of human voice samples within the context of conducting multi-centre clinical drug trials may at times be desirable, it is a logistically difficult task. Creating ideal recording conditions (e.g., the use of a soundproof booth, on-location recording equipment, and well-trained staff) is typically permissible in terms of neither cost nor time for a study including large numbers of participants across multiple study sites. In an effort to move the field of communication sciences and disorders towards evidence-based practice, and participation in randomized clinical trials, it is first necessary to determine suitable avenues of large-scale data collection. Therefore, this study was conducted to determine the methodological challenges, sensitivity of measurement, and the overall appropriateness of using the telephone to collect potentially useful and disease/disorder-specific voice acoustical data.

The use of the telephone is not a new avenue for the collection of objective speech data. Telephone data collection has a diverse background in documenting lexical stress (van Kuijk, & Boves, 1999) and interview success and perceptual ratings of speech (Sharf, & Lehman, 1984), as well as speech timing in affective disorders such as schizophrenia and mania (Friedman, & Sanders, 1992). However, none of these authors sought to investigate and report on the integrity of the voice signal, and the possibility of degradation and/or alteration of the signal data as a result of telephone transmission.

Due to the wide range of measures that may be useful for the acoustic analysis of disordered speech (see for example Kent, Weismer, Kent, Vorperian, & Duffy, 1999), we have limited our focus to two types of variables (i.e., time dependent and frequency dependent) that have previously been shown to be clinically meaningful for understanding several dopamine-related CNS disorders in which we hold an interest. In this initial study, we focused our efforts on two time-dependent measures, speaking rate and voice onset time, and two frequency-related measures, pitch excursion as measured by fundamental frequency (F0) variation and vocal range (Hall, & Yairi, 1997; Kent et al., 1999; Nishio, & Niimi, 2001; Turner, Tjaden, & Weismer, 1995; Weismer, Laures, Jeng, & Kent, 2000).

Speaking rate is a measure of the overall integrity of the speech motor control system (Hall, & Yairi, 1997). This general measure of speech production ability holds promise as a clinical outcome measure for the study of CNS disorders associated with motoric output deficits. It has been documented as a sensitive and differentiating measure in a wide variety of CNS diseases accompanied by dysarthria. For example, persons with amyotrophic lateral sclerosis (ALS) have been shown to produce slower speaking rates whether they are speaking under habitual or fast speaking conditions (Weismer et al., 2000). Additionally, Kleinow, Smith, and Ramig (2001) have demonstrated increased variability of speech rate in persons with idiopathic Parkinson's disease. In a broad study of speaking rate in Japanese persons with dysarthria, Nishio and Niimi (2001) found overall speaking rate to be significantly reduced in all dysarthria types studied, including flaccid, spastic, ataxic, hypokinetic, mixed and unilateral upper motor neuron type. These authors concluded that the simple measure of speaking rate is a sensitive measure of disordered speech motor performance in all of the most common clinically recognized types of dysarthria.

Changes in speaking rates have also been associated with changes in emotional tone in individuals with affective disorders such as depression and negative symptom schizophrenia. In general, speaking rates in these patients have been shown to be slower overall and well-linked to depressive symptomatology (Flint et al., 1992; Teasdale et al., 1980) and negative symptom complex schizophrenia (Alpert, Rosenberg, Pouget, & Shaw, 2000; Alpert, Kotsaftis, & Pouget, 1997; Puschel, Stassen, Bomben, Scharfetter, & Hell, 1997; Shaw, Dong, Lim, Faustman, Pouget, & Alpert, 1999). Decreased pause durations and increased speech rates have been concurrently seen in patients undergoing pharmacological treatment for depression. Findings indicate that signs of mood improvement are associated with concurrent speech changes and that the speech changes may be more indicative of early therapeutic response than other clinical measures (Stassen et al., 1998). Generally, dynamic changes in speaking rate and speech pause time measurements, seen as increased speaking rate and shortened pauses, have mirrored clinical improvements in depression and are strongly correlated with symptomatology and improvement (Ellgring, & Scherer, 1996; Flint, Black, Campbell-Taylor, & Gailey, 1993; Hardy, Jouvent, & Widlocher, 1984; Nilsonne, 1987, 1988; Stassen et al., 1998; Alpert, Pouget, & Silva, 2001).

Voice onset time (VOT) also represents an acoustic measure related to motoric strategy and timing that is associated with changing articulation in the mechanisms of speech production (Borden, Harris, & Raphael, 1994, p. 131; Kent, Weismer et al., 1999, p. 159). Measurement of this dimension represents the inclusive time duration from the release of a stop consonant to the subsequent periodic vibration of a following vowel. Previous research regarding VOT changes in persons with Parkinson's disease (PD) is inconsistent. Forrest, Weismer, and Turner (1989) found increased mean VOT, which they attributed to deficits in the coordination and initiation of movement in the laryngeal musculature. Others have found decreased VOT, which was attributed to rigidity of the laryngeal musculature causing a reduction in vocal fold opening (Flint et al., 1992, 1993; Weismer, 1984). Voice onset time involves the initiation and coordination of voicing in speech and is therefore of interest because it has been suggested that early changes in hypokinetic dysarthria associated with PD are likely to initially involve laryngeal control (Duffy, 1995; Zwirner, & Barnes, 1992). VOT can be measured reliably. In our laboratory, we have found that measures of onset, offset, and total VOT duration, made by independent judges, produce very high inter-rater and intra-rater reliability (r>.95), and similar results have been observed in other laboratories (r=.96; Flint et al., 1993). To date, only one study has investigated VOT in depression and found shorter VOT in depressed participants relative to control subjects; however, these durations were not significantly different from those of persons with PD (Flint et al., 1993). It does not appear that this topic has been investigated in schizophrenia.

Fundamental frequency (F0) variability is a measure that tracks the dynamic fluctuations of vocal inflection over time. The use of automatic computerized software to analyse acoustic properties of speech, such as vocal range and F0 contour, has become increasingly commonplace (Duffy, 1995; Kent et al., 1999; Kent, Vorperian, & Duffy, 1999) and is thought to be reliable in persons with dysarthria and in normal controls when task content is well controlled (Kent, Vorperian et al., 1999). Previous research in persons with depression (cf. Alpert et al., 2001; Talavera, Saiz-Ruiz, & Garcia-Toro, 1994), negative symptom schizophrenia (cf. Alpert et al., 2000), and PD (cf. Flint et al., 1992; Metter, & Hanson, 1986) suggests that F0 variability in these disorders is reduced significantly when compared to typical healthy control populations. These consistent findings of decreased variability of F0 across these disorders have led some researchers to hypothesize that a common underlying mesolimbic-nigrostriatal dopamine depletion may be the root cause (Flint et al., 1993; Talavera et al., 1994).

Methods

Participants

Two of the authors (M. S. C. and N. R.) served as the subjects in this comparative study and each participant served as his or her own control for each recording session. These participants were a 34-year-old male and a 23-year-old female, both native speakers of American English, in good physical health with no history of neurological deficit, learning disability, or psychiatric disorder. They were assessed by a licensed speech-language pathologist (SLP) as being 100% intelligible in conversation with appropriate speech articulation. The participants were also subjectively judged by the SLP to be free of overt voice disorders or dysarthria.

Data collection

For each telephone call, the participants systematically executed a series of speech and vocal exercises that constitute a fundamental portion of our acoustics laboratory's generic speech and voice collection protocol (see Appendix A). This abridged protocol was specifically designed to extract the four variables of interest (i.e., speaking rate, pitch variability, pitch range, and VOT), and consisted of the following: automatic speech (e.g., counting from 1–40), vocal range with the vowel /a/, diadochokinesis (DDK; /pa pa pa pa pa/), and standard paragraph reading (e.g., The Grandfather Passage). During each recording, participants spoke naturally into the telephone (i.e., the remote recording; RR), with the local recording (LR) microphone placed approximately 10 cm from the speaker's mouth. The participants spoke at a comfortable volume in all recording situations.

Two telephone calls were made from our laboratory in Groton, CT to an interactive voice response (IVR) recording system located in Madison, WI. This system has been adapted, validated and used for the remote capture and collection of data for individuals with depression to study drug treatment response in clinical trials (Mundt, 1997). The IVR is programmed to prompt participants to answer questions and perform tasks specific to a data collection protocol. Responses and performances were then digitally recorded at a sampling rate of 8 kHz. The RR sound files, in the form of .wav files, were then sent via e-mail back to the laboratory in CT.

Local recordings were made in a quiet environment using a unidirectional high quality microphone designed for vocal recording (Shure SM-58). Analog to digital conversion was accomplished through the XLR front panel microphone input of the CSL-4400 audio capture device, sampled at 44100 Hz with 16-bit quantization and saved in .wav file format following the completion of each of the four tasks. Files of interest were then subjected to acoustical analyses.

Acoustical analyses

All acoustic analyses were performed using both the freely available Praat (Boersma, & Weenink, 2003) and the commercially available Computerized Speech Laboratory (CSL) main program (Kay Elemetrics, 2001) speech and voice analysis software programs. Because the speech and voice recordings were already in a digital format, the signals were opened and directly analysed regarding the measures of interest (see below).

Measures

Overall speaking rate

Overall speaking rate in this investigation refers to the total amount of time needed by the speaker to complete each of the following: the passage-reading task and the counting task. This time is calculated by first marking the onset of visible energy in the acoustic spectrogram of the digitized signal. The completion of the reading passage or counting task is then marked by the offset of visible spectrum energy. Overall speaking rate is then calculated as the absolute time between the two energy markers in the reading task or counting task.
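The onset and offset markers in this study were placed by hand on the spectrogram. As a rough illustration of the same idea, the sketch below approximates them automatically with a short-time energy threshold. The function names, the 10 ms frame length, and the threshold value are assumptions made for the example and are not part of the study protocol.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square energy of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def task_duration(samples, sample_rate, frame_ms=10.0, threshold=0.02):
    """Seconds between the first and last frame whose RMS exceeds threshold.

    Assumption: frames above the threshold stand in for "visible energy".
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    rms = frame_rms(samples, frame_len)
    active = [i for i, energy in enumerate(rms) if energy > threshold]
    if not active:
        return 0.0
    onset_s = active[0] * frame_len / sample_rate
    offset_s = (active[-1] + 1) * frame_len / sample_rate
    return offset_s - onset_s


# Synthetic check at the 8 kHz telephone sampling rate:
# 0.5 s of silence, 1.0 s of a 200 Hz tone, 0.5 s of silence.
fs = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
signal = [0.0] * (fs // 2) + tone + [0.0] * (fs // 2)
duration = task_duration(signal, fs)
print(duration)  # 1.0
```

A fixed threshold is only suitable for clean synthetic input; real recordings would need a threshold chosen relative to the noise floor of each channel.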

Pitch variability

Pitch variability was measured as a coefficient of variation of F0 within a running speech sample. This was calculated as the standard deviation of F0 divided by the mean of F0 for the segment of interest. The periodicity-to-pitch autocorrelation function of Praat, with a pitch floor of 75 Hz and an automatic time step, was used to derive the mean and standard deviation of F0 for both tasks. The tasks of interest for this measure were the standardized reading passage and the automatic speech task.
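The coefficient of variation described above can be sketched as follows. The F0 values are invented for illustration, and two details are assumptions because the text does not specify them: unvoiced frames are coded as 0 and dropped, and the sample (n - 1) standard deviation is used.

```python
from statistics import mean, stdev


def f0_cov(f0_hz):
    """Coefficient of variation of F0: standard deviation divided by mean."""
    # Assumption: unvoiced frames are coded as 0 and excluded.
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    # Assumption: sample standard deviation (n - 1 denominator).
    return stdev(voiced) / mean(voiced)


# Hypothetical F0 track in Hz, one value per analysis frame.
track = [110.0, 118.0, 0.0, 125.0, 121.0, 0.0, 112.0]
cov = f0_cov(track)
print(round(cov, 4))
```

Because the measure is a ratio of two quantities in Hertz, it is unitless, which makes it convenient for comparing speakers with different habitual pitch.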

Pitch range in Hertz

Vocal range is a measure of vocal control related to the ability to regulate laryngeal mechanisms regarding frequency, perceptually identified as vocal pitch. In this task participants were instructed to take a deep breath, say /a/ in their typical speaking voice, and gradually raise their pitch until they could not make it any higher. Then participants were instructed to take a deep breath, say /a/ in their typical speaking voice, and gradually lower their pitch until they could not make it any lower. A visible F0 contour was extracted from the Praat periodicity-to-pitch autocorrelation function for the aforementioned exercises. Vocal range was then calculated as the difference between the highest and lowest analyzable periodic frequency measured in Hertz.
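As a minimal sketch of the range calculation, assuming the F0 contours from the upward and downward glides are available as lists of frame values in Hertz (with unvoiced frames coded as 0, an assumption for this example), the vocal range is the highest analyzable value minus the lowest. The contour values are invented.

```python
def pitch_range_hz(upward_glide_hz, downward_glide_hz):
    """Difference between the highest and lowest voiced F0 across both glides."""
    # Assumption: unvoiced or unanalyzable frames are coded as 0.
    voiced = [f for f in list(upward_glide_hz) + list(downward_glide_hz) if f > 0]
    if not voiced:
        raise ValueError("no voiced frames found")
    return max(voiced) - min(voiced)


# Hypothetical contours in Hz.
up = [120.0, 180.0, 260.0, 0.0, 410.0]
down = [118.0, 101.0, 0.0, 88.0, 79.0]
vocal_range = pitch_range_hz(up, down)
print(vocal_range)  # 331.0
```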

Voice onset time (VOT)

VOT was measured from the stop consonant vowel combination production /pa/ from the DDK task. Measurement followed typical acoustical conventions with markers measuring from the point of initial spectrographic evidence of the plosive burst of the stop consonant, to the point of periodic vocal fold vibration signaling the onset of the vowel. The three middle productions of the /pa/ consonant vowel pair were used for each participant.
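The VOT calculation can be sketched from hand-placed markers, assuming each /pa/ production is stored as a pair of times in seconds: the plosive burst and the onset of periodic voicing. The marker values are invented, and the rule for picking the "three middle productions" (centre the window, rounding toward the start for even counts) is an assumption, since the text does not state one.

```python
def vot_ms(burst_s, voicing_onset_s):
    """Voice onset time in milliseconds from burst and voicing-onset markers."""
    if voicing_onset_s < burst_s:
        raise ValueError("voicing onset precedes burst")
    return (voicing_onset_s - burst_s) * 1000.0


def middle_three(productions):
    """Return the three central items of a sequence (assumed selection rule)."""
    if len(productions) < 3:
        raise ValueError("need at least three productions")
    start = (len(productions) - 3) // 2
    return productions[start:start + 3]


# Hypothetical (burst, voicing onset) markers in seconds for five /pa/ tokens.
markers = [
    (0.100, 0.162),
    (0.350, 0.409),
    (0.610, 0.671),
    (0.880, 0.938),
    (1.140, 1.203),
]
vots = [round(vot_ms(burst, onset), 1) for burst, onset in middle_three(markers)]
print(vots)
```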

Results

Overall speaking rate

Overall speaking rates, listed below in Table I, were measured quite consistently for both the automatic speech task and the standard reading passage. Measurement variability between the two recording conditions was, at its greatest, less than two tenths of a per cent for either of the experimental task conditions. Percentage differences between conditions, based on durations in seconds, ranged from 0% to .1% for all of the comparisons.
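The text does not state how the percentage difference between conditions was computed. One plausible reading, shown here purely as an assumption, takes the absolute LR minus RR difference as a percentage of the LR value. The durations are invented and are not taken from Table I.

```python
def percent_difference(local_value, remote_value):
    """Absolute LR-RR difference as a percentage of the LR value.

    Assumption: the local recording serves as the reference condition.
    """
    if local_value == 0:
        raise ValueError("local value must be non-zero")
    return abs(local_value - remote_value) / abs(local_value) * 100.0


# Hypothetical task durations in seconds for one speaker.
pct = percent_difference(24.80, 24.82)
print(round(pct, 3))
```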

Table I
Speaking rates in seconds for automatic speech and passage reading.

Pitch variability

Pitch variability as measured by F0 coefficient of variation (COV), listed in Table II below, was very consistently measured between the LR and RR recording conditions. In three out of the four comparisons, the measurement difference was less than two hundredths of a per cent, with the final measurement at less than five hundredths of a per cent different.

Table II
Pitch variability measured as a coefficient of variation (COV).

Pitch range in Hertz

Pitch range in Hertz was also reliably measured in both the LR and RR conditions. Absolute pitch range differences between recording conditions were .32 Hz for the male and 2.21 Hz for the female with an overall percent difference of one tenth of a per cent and four tenths of a per cent respectively (see Table III).

Table III
Pitch range in Hz.

Voice onset time

Measures of VOT that were compared across recording locations included the total VOT for matched productions of the consonant-vowel syllable /pa/. Three such paired measurements were analysed per subject between recording conditions. Difference scores, measured in milliseconds, ranged from a low of one ms to a high of six ms. These results demonstrate that VOT may be measured with a relatively high degree of precision between local and remote recording measurements.

Reliability

Twenty-five per cent of the measures, ten measures overall, were randomly selected from the LR and RR conditions and reanalysed for the purpose of determining both the inter-rater reliability and intra-rater reliability of the measurement procedures. Pearson product-moment correlations were used to determine the association between the original measurements and the repeated measurements. Correlation coefficients were r² = .95 for within-examiner reliability and r² = .92 for between-rater reliability.
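For readers who want to reproduce this kind of reliability check, the Pearson product-moment correlation between original and repeated measurements can be sketched as below. The ten paired values are invented for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical original and repeated measurements (e.g., VOT in ms).
original = [59, 61, 58, 64, 55, 62, 57, 60, 63, 56]
repeated = [60, 61, 57, 65, 55, 61, 58, 60, 62, 57]
r = pearson_r(original, repeated)
print(round(r, 3), round(r * r, 3))  # r and its square
```

A correlation of this kind indexes how consistently the two sets of measurements rank and scale together; it does not by itself show that the absolute values agree.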

Discussion

We sought to determine the feasibility of using human speech samples that were collected and recorded over the telephone for voice acoustical research. The results should be considered preliminary, due to the limitations of a two subject descriptive design, the small sample size of recordings, and the use of only a single healthy male and a single healthy female. Still, these initial findings demonstrate that quite reliable and accurate measurements can be made whether the acoustical signal is collected over the telephone line or under more ideal laboratory conditions. This was true for measures of speaking rate, pitch variability, pitch range, and VOT, all of which are considered to be useful metrics in the study of affective and dysarthric speech and voice profiles.

The current study is based on an earlier comparative study conducted by our laboratory, in which international telephone calls were remotely and locally recorded as the subject performed vocal and speech exercises similar to those used in this study (Cannizzaro, & Snyder, 2003). Recordings were made from three locations in Western Europe and yielded promising results. While this initial foray served as the inspiration for the current study, we sought to remove certain design flaws, such as inconsistencies in telephones and the modes of transmission (e.g., satellite, differing phone companies), by employing a single telephone with known transmission characteristics in a more controlled setting.

The measurement accuracy of the two time-dependent measures of interest, speaking rate and voice onset time, is reliant on accurate representation and visualization of the speech signal as a complex waveform and as a spectrogram. As long as visible landmarks, such as a plosive burst or the initiation of periodic voicing, are clearly apparent during the analysis, the measurement should be as accurate as the skill level of the person performing the analysis. In all of the comparisons made in this experiment, there were no difficulties in performing these analyses, as the signals of interest were, without exception, clearly discernible from the surrounding signal. This is readily apparent from the close agreement of the measurements we were able to make across a variety of tasks in both the RR and LR conditions. These findings are further supported by the high levels of inter-rater and intra-rater reliability found for both conditions across the tasks of interest.

Accurate quantification of the two frequency-related measures of interest, pitch variation and pitch range in Hertz, is a little more complicated given the known characteristics of the telephone. Telephone frequency response is generally reported to be within the range of 300 Hz to 3000 Hz, acting essentially as a band pass filter rejecting higher and lower frequency transmission (Kent, & Read, 2002, p. 78). In an initial test of the IVR system, we found an accurate frequency portrayal of the telephone transmission from 250 Hz to approximately 3200 Hz, with a steep roll off in frequency response outside of these parameters (Cannizzaro, & Snyder, 2003). Given these qualities, the measurement of frequencies beyond these boundaries (for the purposes of this study, those below 300 Hz) is dependent on the algorithms used in the analysis. For this reason, the periodicity-to-pitch autocorrelation function in Praat was used, as it is not dependent on the actual presence of a fundamental frequency to make this measurement. Autocorrelation uses the repeat length of the waveform to determine the fundamental period (P. Boersma, personal communication, November 12, 2003). That is, the lowest common frequency in a complex periodic wave with equally spaced harmonics will generate a repeat pattern at that frequency. Since the voice fundamental frequency is generated by the periodic vibration of the vocal folds, and each harmonic is a simple multiple of that fundamental, the lowest common repeat pattern is at the same rate as the fundamental frequency of the voice. As evidenced by our findings, even a wide range of frequency excursion can be accurately assessed in telephone recordings.
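The repeat-length argument can be illustrated with a small synthetic test. The sketch below builds a signal from harmonics 3 to 10 of a 100 Hz fundamental, so that no energy is present at 100 Hz or 200 Hz (roughly what a 300 Hz high-pass telephone channel would leave), and then recovers the fundamental from the autocorrelation peak. This is a bare-bones autocorrelation search written for the example; it is not Praat's algorithm, which adds windowing, interpolation, and candidate path finding. The function name and the 500 Hz search ceiling are assumptions; the 75 Hz floor matches the setting reported in the Measures section.

```python
import math


def autocorr_f0(samples, sample_rate, floor_hz=75.0, ceiling_hz=500.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation sum."""
    min_lag = int(sample_rate / ceiling_hz)
    max_lag = int(sample_rate / floor_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


fs = 8000            # telephone recording sampling rate reported in the Methods
f0 = 100.0           # hypothetical voice fundamental, below the telephone band
harmonics = range(3, 11)  # 300 Hz to 1000 Hz only; fundamental is absent

# 0.1 s of signal containing the harmonics but not the fundamental itself.
signal = [
    sum(math.sin(2 * math.pi * k * f0 * n / fs) for k in harmonics)
    for n in range(800)
]
estimate = autocorr_f0(signal, fs)
print(estimate)  # 100.0
```

The waveform repeats every 80 samples (10 ms) even though no component at 100 Hz is present, so the strongest autocorrelation peak in the search range falls at that lag.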

While our initial findings are quite promising regarding the use of the telephone to collect acoustic data, there are some cautions that should be addressed. We chose our measures carefully based on our initial testing of the IVR system with speech and noise signals. No measures of intensity were performed due to the difficulty in accurately calibrating and equating the decibel level of the mouth to telephone and mouth to recorder levels. At the present time, this leaves out a number of measures that may hold potential value for a future clinical trial using voice acoustic data. Also, the clarity of our voice recordings was consistent as only one telephone was used in this experiment. The use of lower quality telephones and other modes of transmission (e.g., cellular or satellite) would also need to be explored before they could be deemed suitable for use in large-scale data collection.

Overall, we were encouraged by these early findings. Future investigations of larger groups of subjects with suspected and diagnosed CNS disorders, as well as healthy controls, will ultimately establish the feasibility of telephone recording of acoustic data. Similar designs that utilize simultaneous LR and RR recordings in future studies can only serve to further our understanding of how to employ these techniques in large-scale randomized clinical trials.

Table IV
Voice onset time from /pa/.

Acknowledgments

Supported, in part, by NIH grant R43MH68950 to J. C. M. The authors would also like to acknowledge Ben Barth, without whose technical support the use of the IVR system would not have been possible.

Appendix A

Telephone speech collection module

  • Automatic Speech
    • 1
      Counting from 1–40: “In your typical speaking voice, please count from one to forty.”
  • Vocal Range
    • 2
      Pitch range—mid to low: “Now take a deep breath, and say the sound ‘aahhh’ in your normal speaking tone. Then, gradually lower your pitch until you cannot go any lower.”
    • 3
      Pitch range—mid to high: “Now take a deep breath, and say the sound ‘aahhh’ in your normal speaking tone. Then, gradually raise your pitch until you cannot go any higher.”
  • Repeated Syllables
    • 4
      Diadochokinesis (DDK) /pa/: “Take a deep breath, and repeat the following syllable /pa/ as quickly as possible for five seconds.”
  • Standard Reading Passage
    • 5
      Reading passage: “Please read the following paragraph in your typical speaking voice.”

The Grandfather Passage

You wished to know all about my grandfather. Well, he is nearly ninety-three years old. He dresses himself in an ancient black frock coat, usually minus several buttons; yet he still thinks as swiftly as ever. A long, flowing beard clings to his chin, giving those who observe him a pronounced feeling of the utmost respect. When he speaks his voice is just a bit cracked and quivers a trifle. Twice each day he plays skillfully and with zest upon our small organ. Except in the winter when the ooze or snow or ice prevents, he slowly takes a short walk in the open air each day. We have often urged him to walk more and smoke less, but he always answers, “Banana Oil!” Grandfather likes to be modern in his language.

References

  • Alpert M, Kotsaftis A, Pouget ER. At issue: Speech fluency and schizophrenic negative signs. Schizophrenia Bulletin. 1997;23:171–176. [PubMed]
  • Alpert M, Pouget ER, Silva RR. Reflections of depression in acoustic measures of a patient's speech. Journal of Affective Disorders. 2001;66:59–69. [PubMed]
  • Alpert M, Rosenberg SD, Pouget ER, Shaw RJ. Prosody and lexical accuracy in flat affect schizophrenia. Psychiatry Research. 2000;97:107–118. [PubMed]
  • Boersma P, Weenink D. Praat (Version 4.1.9) [computer software] Amsterdam: Institute of Phonetic Sciences; 2003.
  • Borden GJ, Harris KS, Raphael LJ. Speech science primer: Physiology, acoustics, and perception of speech. Philadelphia, PA: Lippincott, Williams, & Wilkins; 1994.
  • Cannizzaro MS, Snyder PJ. Telephone test signal transmission. 2003 Unpublished raw data.
  • Duffy JR. Motor speech disorders: Substrates, differential diagnosis, and management. New York: Mosby; 1995.
  • Ellgring H, Scherer KR. Vocal indicators of mood change in depression. Journal of Non-Verbal Behavior. 1996;20:83–110.
  • Flint AJ, Black SE, Campbell-Taylor I, Gailey GF, Levinton C. Acoustic analysis in the differentiation of Parkinson's disease and major depression. Journal of Psycholinguistic Research. 1992;21:383–399. [PubMed]
  • Flint AJ, Black SE, Campbell-Taylor I, Gailey GF, Levinton C. Abnormal speech articulation, psychomotor retardation, and subcortical dysfunction in major depression. Journal of Psycholinguistic Research. 1993;21:309–319. [PubMed]
  • Forrest K, Weismer G, Turner G. Kinematic, acoustic and perceptual analyses of connected speech produced by Parkinsonian and normal geriatric males. Journal of the Acoustical Society of America. 1989;85:2608–2622. [PubMed]
  • Friedman EH, Sanders GG. Speech timing of mood disorders. Computers in Human Services. 1992;8:121–142.
  • Goberman AM, Coelho C. Acoustic analysis of Parkinsonian speech I: Speech characteristics and L-Dopa therapy. Neurorehabilitation. 2002a;17:237–246. [PubMed]
  • Goberman AM, Coelho C. Acoustic analysis of Parkinsonian speech II: L-Dopa related fluctuations and methodological issues. Neurorehabilitation. 2002b;17:247–254. [PubMed]
  • Goberman AM, Coelho C, Robb MP. Phonatory characteristics of Parkinsonian speech before and after morning medication: The on and off states. Journal of Communication Disorders. 2002;17:247–254. [PubMed]
  • Hall KD, Yairi E. Speaking rate and speech motor control: Theoretical considerations and empirical data. In: Hulstijn W, Peters HFM, van Lieshout PHHM, editors. Speech production: Motor control, brain research and fluency disorders. Amsterdam: Elsevier; 1997. pp. 547–556.
  • Hardy P, Jouvent R, Widlocher D. Speech pause time and the Retardation Rating Scale for Depression (ERD): Towards a reciprocal validation. Journal of Affective Disorders. 1984;6:123–127. [PubMed]
  • Kay Elemetrics Corporation. CSL Model 4400 [computer software] Lincoln Park, NJ: 2001.
  • Kent RD, Read C. Acoustic analysis of speech. 2nd ed. Canada: Singular; 2002.
  • Kent RD, Vorperian HK, Duffy JR. Reliability of the Multi-Dimensional Voice Program for the analysis of voice samples for subjects with dysarthria. American Journal of Speech Language Pathology. 1999;8:129–136.
  • Kent RD, Weismer G, Kent JF, Vorperian HK, Duffy JR. Acoustic studies of dysarthric speech: Methods, progress, and potential. Journal of Communication Disorders. 1999;32:141–186. [PubMed]
  • Kleinow J, Smith A, Ramig LO. Speech motor stability in IPD: Effects of rate and loudness manipulations. Journal of Speech, Language and Hearing Research. 2001;44:1041–1051. [PubMed]
  • Metter J, Hanson W. Clinical and acoustical variability in hypokinetic dysarthria. Journal of Communication Disorders. 1986;19:347–366. [PubMed]
  • Mundt JC. Interactive voice response (IVR) systems in clinical research and treatment. Psychiatric Services. 1997;48:611–623. [PubMed]
  • Nilsonne A. Acoustic analysis of speech variables during depression and after improvement. Acta Psychiatrica Scandinavica. 1987;76:235–245. [PubMed]
  • Nilsonne A. Speech characteristics as indicators of depressive illness. Acta Psychiatrica Scandinavica. 1988;77:253–263. [PubMed]
  • Nilsonne A, Sundberg J, Ternstrom S, Askenfelt A. Measuring the rate of change of voice fundamental frequency in fluent speech during mental depression. Journal of the Acoustical Society of America. 1988;83:716–728. [PubMed]
  • Nishio M, Niimi S. Speaking rate and its components in dysarthric speakers. Clinical Linguistics and Phonetics. 2001;15:309–317.
  • Puschel J, Stassen HH, Bomben G, Scharfetter C, Hell D. Speaking behavior and speech sound characteristics in acute schizophrenia. Journal of Psychiatric Research. 1997;32:89–97. [PubMed]
  • Sharf D, Lehman ME. Relationship between speech characteristics and effectiveness of telephone interviewers. Journal of Phonetics. 1984;12:219–228.
  • Shaw RJ, Dong M, Lim KO, Faustman WO, Pouget ER, Alpert M. The relationship between affect expression and affect recognition in schizophrenia. Schizophrenia Research. 1999;37:245–250. [PubMed]
  • Stassen HH, Kuny S, Hell D. The speech analysis approach to determining onset of improvement under antidepressants. European Neuropsychopharmacology. 1998;8:303–310. [PubMed]
  • Talavera JA, Saiz-Ruiz J, Garcia-Toro M. Quantitative measurement of depression through speech analysis. European Psychiatry. 1994;9:185–193.
  • Teasdale JD, Fogarty SJ, Williams JMG. Speech rate as a measure of short term variation in depression. British Journal of Social and Clinical Psychology. 1980;19:271–278. [PubMed]
  • Turner GS, Tjaden K, Weismer G. The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis. Journal of Speech and Hearing Research. 1995;38:1001–1013. [PubMed]
  • Van Kuijk D, Boves L. Acoustic characteristics of lexical stress in continuous telephone speech. Speech Communication. 1999;27:95–111.
  • Weismer G. Articulatory characteristics of Parkinsonian dysarthria: segmental and phrase-level timing, spirantization, and glottal-supraglottal coordination. In: McNeil M, Rosenbek J, Aronson A, editors. The dysarthrias: Physiology, acoustics, perception, and management. San Diego, CA: College-Hill Press; 1984. pp. 101–130.
  • Weismer G, Laures JS, Jeng J, Kent RD. Effect of speaking rate manipulation on acoustic and perceptual aspects of dysarthria in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica. 2000;52:201–219. [PubMed]
  • Zwirner P, Barnes G. Vocal tract steadiness: A measure of phonatory stability and upper airway motor control during phonation in dysarthria. Journal of Speech and Hearing Research. 1992;35:761–768. [PubMed]