Vocal fold vibration is vital in voice production and the correct pitch of speech. We have developed a high speed functional optical coherence tomography (OCT) system with a center wavelength of 1050 nm and an imaging speed of 100,000 A-lines per second. We imaged the vibration of an ex-vivo swine vocal fold. At an imaging speed of 100 frames per second, we demonstrated high quality vocal fold images during vibration. Functional information, such as vibration frequency and vibration amplitude, was obtained by analyzing the tissue surface during vibration. The axial direction velocity distribution in the cross-sectional images of the vibrating vocal folds was obtained with the Doppler OCT. The quantitative transverse direction velocity distribution in the cross-sectional images was obtained with the Doppler variance images.
Optical coherence tomography (OCT) is a powerful interferometric technology used to obtain cross-sectional tissue images noninvasively with micrometer resolution, millimeter penetration depth, and video-rate imaging speed. Owing to its non-contact, high-resolution nature, OCT has become a valuable tool in a number of medical fields. Recently, OCT has also been increasingly used for functional imaging. Doppler optical coherence tomography (DOCT), also called optical Doppler tomography (ODT), is a functional extension of OCT that combines the Doppler principle with OCT. ODT has been widely used for in-vivo imaging of blood flow in live animals and human beings [2–7].
Clinically, vocal fold vibration has been widely imaged using laryngeal videostroboscopy and high speed video, as these methods provide clinically relevant information on vocal fold behavior in health and pathology. Lohscheller et al. obtained functional information regarding the vibrating vocal folds, such as vibration frequency, velocity, and acceleration [8–10]. Although videostroboscopy is an excellent method for dynamically assessing the vocal folds, it provides information only on their surface; the condition of the tissue underneath the surface remains unknown. A wide spectrum of diseases can occur in the vocal folds, including benign polyps and premalignant and malignant lesions. Differentiating these conditions by direct visualization alone can be difficult, and a biopsy is often required. The key to differentiating these lesions is visualizing the integrity of the basement membrane, because loss of basement membrane integrity is a hallmark of cancers of the vocal fold. Currently, there is no reliable noninvasive method to diagnose laryngeal cancer without a biopsy. However, a biopsy of the vocal folds carries its own risk of permanent damage; a noninvasive imaging method that can visualize below the surface of the vocal folds, such as ultrasound or OCT, is therefore of great practical importance. Ultrasound has been used to image the vibrating vocal folds [11–13]. Although functional information can be obtained from ultrasound, color Doppler ultrasound images suffer from low resolution and low frame rate. Hence, there is immense value in being able to image, dynamically and in real time, the structure and characteristics of the vibrating vocal folds, as much pathology lies just below the surface of this organ.
Recently, imaging of vibrating vocal folds using OCT has been demonstrated by several groups [14–16]. Lüerßen et al. demonstrated OCT imaging of vibrating vocal folds at an imaging speed of 10 frames per second. Our group has demonstrated in-vivo imaging of human vibrating vocal folds with a 1.3 µm, 20 kHz swept source OCT system and a hand-held probe. Functional information, such as vibration frequency, was obtained by analysis of the OCT structure images. Kobler et al. used a triggered 10 kHz swept source OCT system to image an excised calf hemilarynx. Using particle image analysis, the authors obtained the velocity vectors in the cross-sectional plane from the OCT structure images.
In humans, the fundamental frequency of the voice varies by sex: it is approximately 200 Hz in females and approximately 120 Hz in males. Imaging such high frequency movement requires a high speed imaging system that provides a high frame rate for the analysis. In addition, Doppler OCT requires much denser sampling between A-lines. Covering a large enough field of view while obtaining high quality Doppler images therefore demands a fast system.
In this paper, we demonstrate functional imaging of vibrating vocal folds ex-vivo with a high speed swept source OCT and ODT system. The system has a maximum imaging speed of 100,000 A-lines per second, a central wavelength of 1.05 µm, and a depth resolution of 7 µm. Functional information regarding the vibrating vocal folds, such as vibration frequency, vibration amplitude, and speed, was obtained by fitting the surface curve of the vibrating vocal fold. To the best of our knowledge, this is the first time that high quality cross-sectional velocity distribution images of the vibrating vocal fold have been obtained with an OCT and ODT system at a frame rate of 100 frames per second.
The schematic of the OCT and ODT system is shown in Fig. 1. The laser source is a high speed swept source with a central wavelength of 1050 nm and a sweep rate of 100 kHz (Axsun Technology, Billerica, MA). The MEMS-based compact laser source has a long coherence length (>10 mm), and MEMS-based swept source OCT systems of this kind have demonstrated a slow sensitivity roll-off in ophthalmology applications. The laser source output is split into the reference and sample arms by a 20:80 coupler, with 80% in the reference arm and 20% in the sample arm. In the reference arm, the light is further split by a 95:5 coupler. About 95% of the reference arm light is sent to interfere with the back-reflected or backscattered signal collected from the sample. This interference signal is detected by a balanced photodetector and digitized by a high speed digitizer (ATS 9350, Alazar Technologies Inc., Pointe-Claire, QC, Canada). The remaining 5% of the reference arm light passes through a cover glass, is detected by a non-balanced photodetector, and is digitized by a second channel of the digitizer. The system can work in two modes, OCT and ODT. In OCT mode, only the signal from the channel connected to the balanced photodetector is acquired and digitized. In ODT mode, signals from both channels are acquired. The measured system sensitivity is 101.2 dB near zero path difference.
The phase stability of the system is very important for a phase resolved Doppler OCT system. Because of the mechanical scanning components used in the tunable filters of swept source lasers, swept source based FD ODT systems have worse phase stability than spectrometer-based FD ODT systems. Usually, a static surface may be added as a reference to correct the phase error and improve the phase stability. Our group has used the top surface of the chick chorioallantoic membrane (CAM) as a reference to obtain the blood flow in a CAM. Vakoc et al. proposed tapping 1% of the sample arm light and directing it at a calibration mirror positioned near the maximum imaging range of the system. In this manuscript, a common-path method similar to that presented by Adler et al. was used to correct phase errors. We tapped 5% of the light from the reference arm and passed it through a cover glass with a thickness of 1 mm. The two surfaces of the cover glass generated an interference fringe, which produced a reference surface at a depth corresponding to the thickness of the cover glass. By subtracting a portion of the phase difference at the reference surface location from the phase difference of the sample signal, the phase error was corrected. Figures 2(a) and 2(b) show, for a mirror, the OCT structure image and the phase difference between successive A-lines before correction, respectively. Figure 2(c) shows the phase difference between successive A-lines after the correction. The improvement can be clearly seen by comparing Fig. 2(b) and Fig. 2(c). In the method proposed here, the reference surface calibration signal was obtained by taking a portion of the reference arm light and detecting it on another channel of the digitizer. The sample signals were not affected by the calibration procedure, so system parameters such as sensitivity and imaging range were the same as in operation without calibration.
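The correction step can be sketched in a few lines of Python. The sketch assumes that the phase error from sweep-to-sweep jitter grows linearly with depth, so the "portion" subtracted at a given depth is the reference phase difference scaled by the ratio of the sample depth to the reference depth. That scaling, the function names, and the synthetic numbers are illustrative assumptions and are not taken from the implementation used in this work.

```python
import math

def wrap_phase(phi):
    """Wrap a phase (radians) into the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

def correct_phase_difference(dphi_sample, depth, dphi_ref, ref_depth):
    """Remove depth-proportional jitter from one pixel's phase difference.

    Assumption: the phase error at `depth` equals the error measured at the
    static reference surface scaled by depth / ref_depth. The reference
    phase difference is assumed small enough not to be wrapped.
    """
    return wrap_phase(dphi_sample - (depth / ref_depth) * dphi_ref)

# Synthetic check: a true Doppler phase of 0.3 rad plus a jitter error of
# 0.002 rad per depth pixel; the reference surface sits at depth pixel 400.
jitter_per_pixel = 0.002
ref_depth = 400
dphi_ref = jitter_per_pixel * ref_depth   # static surface: jitter only
depth = 150
measured = wrap_phase(0.3 + jitter_per_pixel * depth)
corrected = correct_phase_difference(measured, depth, dphi_ref, ref_depth)
print(f"measured = {measured:.3f} rad, corrected = {corrected:.3f} rad")
```

Because the calibration signal is recorded on a separate digitizer channel, a routine of this kind would only post-process the phase data and would leave the sample signal untouched.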
Fresh porcine larynges with an intact trachea were obtained from a local biological tissue supply company. The larynges were dissected to remove the supraglottic tissue and expose the vocal folds, leaving key structures (arytenoid cartilages, anterior commissure, thyroid cartilage) intact. A nylon suture was placed to approximate the arytenoid cartilages and thus adduct the vocal folds. Once the vocal folds were exposed, the larynx was mounted on a custom-made holder with an air supply device. A cuffed endotracheal tube was placed into the trachea from below, and the cuff was inflated to avoid air leakage. Warm air at different flow velocities was then delivered through the endotracheal tube, through the trachea, and past the glottis to vibrate the vocal folds. Figures 3(a) and 3(b) show side view and top view photographs of a larynx mounted on the holder. A total of 6 larynges were prepared and investigated in this study.
The M-mode was first used to image the vibrating vocal fold. In M-mode imaging, the laser beam is not scanned, and the OCT images provide the depth profile of a single location at different times. The location of the incident beam is shown as the red dot in Fig. 3(b). Figure 4 shows the M-mode OCT structure image. The oscillation pattern of the vocal fold can be clearly seen in the image. Parameters such as oscillation period, amplitude, and speed are important for analyzing the imaged sample. To obtain these parameters, we first found the surface curve of the vocal fold with an intensity threshold method. The surface is overlaid on the OCT image as the solid red line in Fig. 4. Hsiao et al. used a vibrating string model with both ends fixed to simulate the vocal fold vibration. At a fixed location, the surface curve can be described by a sine function:

$$y(t) = y_0 + A \sin\left(\frac{\pi (t - t_c)}{w}\right) \tag{1}$$

where $y(t)$ is the surface location at time $t$, $A$ is the amplitude of the vibration, $w$ is the half-period of the vibration, and $t_c$ and $y_0$ are two parameters decided by the acquisition start time and the initial surface location of the sample at rest, respectively. By fitting the surface curve in Fig. 4 with Eq. (1), these parameters can be obtained. Figure 5 shows the curve fitting results together with the data. The sine function fits the curve of the tissue surface movement well, especially on the down slope of the vibration. The fit gave the amplitude $A$, and $w$ was 5.3 ms. The period, $2w$, was 10.6 ms and the vibration frequency was 94.3 Hz. The velocity of the tissue surface movement can also be obtained from Eq. (1) by differentiating with respect to time:

$$v(t) = \frac{dy}{dt} = \frac{\pi A}{w} \cos\left(\frac{\pi (t - t_c)}{w}\right)$$
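The relations between the fitted parameters can be checked numerically. The short Python sketch below assumes the sine model $y(t) = y_0 + A\sin(\pi (t - t_c)/w)$, in which $w$ is the half-period; the function names are hypothetical. It uses only the values stated in the text ($w$ = 5.3 ms and the peak surface velocity of 0.415 m/s quoted later) and derives the amplitude that these two numbers would imply under the model, which is an inference and not a reported measurement.

```python
import math

def surface_position(t, y0, amplitude, tc, w):
    """Sine model of the surface location at time t; w is the half-period."""
    return y0 + amplitude * math.sin(math.pi * (t - tc) / w)

def surface_velocity(t, amplitude, tc, w):
    """Time derivative of the sine model above."""
    return (math.pi * amplitude / w) * math.cos(math.pi * (t - tc) / w)

w = 5.3e-3                 # fitted half-period in seconds (from the text)
period = 2.0 * w           # 10.6 ms
frequency = 1.0 / period   # about 94.3 Hz

# Under the sine model the peak velocity is pi * A / w, so a peak of
# 0.415 m/s would imply the following amplitude (an inference, not a
# value reported in the text):
v_peak = 0.415
implied_amplitude = v_peak * w / math.pi

print(f"period = {period * 1e3:.1f} ms, frequency = {frequency:.1f} Hz")
print(f"implied amplitude = {implied_amplitude * 1e3:.2f} mm")
```

In practice the four parameters would come from a nonlinear least-squares fit of the thresholded surface curve; the sketch only evaluates the model once the parameters are known.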
We obtained the velocity of the tissue surface with the fitting method described above. In addition to the velocity of the tissue surface, the velocity distribution beneath the tissue surface is also valuable. Kobler et al. used particle image velocimetry to obtain the velocity distribution. In our case, we used phase resolved ODT to obtain the velocity distribution. Phase resolved ODT, which has been used to image blood vessels in tissue, uses the phase difference between adjacent A-lines to estimate the velocity along the incident light beam direction. The Doppler frequency caused by the sample movement in the axial direction can be obtained by the following equation:

$$f_D = \frac{2 v_z}{\lambda_0} = \frac{\Delta\varphi}{2\pi T} \tag{4}$$

where $f_D$ is the Doppler frequency, $v_z$ is the sample velocity along the light beam direction, $\lambda_0$ is the central wavelength of the incident beam, $\Delta\varphi$ is the phase difference between adjacent A-lines, and $T$ is the time difference between adjacent A-lines. Figure 6 shows the velocity distribution of a cross-sectional image obtained with the phase resolved Doppler method. The quasi-periodic pattern in Fig. 6 is caused by phase wrapping: the phase difference is wrapped between $-\pi$ and $\pi$. However, wrapped phase images also give qualitative information regarding the acceleration. The absolute value of the velocity can be obtained using the following simple method.
According to Eq. (4), the velocity may be expressed as:

$$v_z = \frac{\lambda_0 \, \Delta\varphi}{4\pi T} \tag{5}$$

In this experiment, $\lambda_0 = 1.05$ μm, $T = 10$ μs, and $\Delta\varphi = 2\pi$ corresponds to a velocity difference of 0.0525 m/s. In Fig. 6, the black striations, indicated by the white arrows, correspond to a velocity of $0.0525\,n$ m/s, where $n$ is an integer. The regions with $n = 0$ are identified from the peak and valley locations of the oscillation. The values of $n$ in the other regions can then be determined from their distance relative to the $n = 0$ region. In Fig. 6, the maximum $n$ is 7, so the maximum velocity lies between 0.3675 m/s and 0.42 m/s, which correspond to $n = 7$ and $n = 8$, respectively. This range is consistent with the maximum velocity of 0.415 m/s obtained with the fitting method above. Fig. 6 also shows that the velocity distribution in the down slope region differs from that in the up slope region. In the down slope, the velocity distribution pattern of the tissue surface is more like a sine function: the velocity changes quickly at the peak and valley regions and more slowly at the waist region. In the up slope, however, the velocity distribution pattern cannot be seen clearly.
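The striation-counting arithmetic can be reproduced with a few lines of Python. The sketch uses the phase-to-velocity relation $v = \lambda_0 \Delta\varphi / (4\pi T)$, takes the A-line interval as 10 μs (the reciprocal of the 100 kHz A-line rate), and assumes a refractive index of 1, which is what reproduces the 0.0525 m/s figure in the text. The function names are hypothetical.

```python
import math

WAVELENGTH = 1.05e-6      # central wavelength in metres (from the text)
A_LINE_INTERVAL = 10e-6   # seconds between adjacent A-lines at 100 kHz

def axial_velocity(dphi, wavelength=WAVELENGTH, dt=A_LINE_INTERVAL):
    """Axial velocity for a phase difference dphi (radians), with n = 1."""
    return wavelength * dphi / (4.0 * math.pi * dt)

# Velocity step represented by one full 2*pi phase wrap (one striation).
velocity_per_wrap = axial_velocity(2.0 * math.pi)

def velocity_bracket(n):
    """Velocity range between the n-th and (n + 1)-th striation."""
    return n * velocity_per_wrap, (n + 1) * velocity_per_wrap

low, high = velocity_bracket(7)
print(f"one wrap = {velocity_per_wrap:.4f} m/s")
print(f"n = 7 bracket: {low:.4f} to {high:.4f} m/s")
```

The peak velocity of 0.415 m/s from the sine fit falls inside the n = 7 bracket, which is the consistency check made in the text.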
The B-mode OCT and ODT images are shown in Figs. 7(a) and 7(b). The solid green line in Fig. 3(b) shows the location of the scanning trace. When analyzing B-mode images, it is important to note that they are not “snap-shot” images of the vibrating sample, because each A-line is acquired at a different time. Treating a B-mode image as a “snap-shot” may give misleading results, especially when the frame rate of the system is close to or lower than the vibration frequency of the sample. A sliding window covering 50–100 A-lines may be used to analyze these images; the information within this window can be considered an “instant” or “snap-shot” image. Using the analysis method described in the previous section, we can obtain the velocity distribution in the B-mode cross-sectional images. As in the M-mode case, the ODT image in Fig. 7(b) shows that the velocity distribution in the up slope differs from that in the down slope. As mentioned, this image is not a “snap-shot” image, and the slopes mentioned here differ from the actual “instant” slopes of the vibrating vocal folds. The velocity distribution pattern of the B-mode image is similar to that of the M-mode image. In the down slope, the velocity distribution is more like a sine function. In addition, the velocity changes faster at the peak and valley regions and more slowly at the waist region. Consequently, the acceleration is larger at the peak and valley regions and smaller at the waist region. However, the acceleration in the up slope is more uniform than that in the down slope, and the acceleration at the peak region is larger than that at the valley region. Fig. 5 likewise shows that the sine function fits the vibrating tissue surface better during the down slope than during the up slope. Clearly, the mechanics of vocal fold vibration are very complex: the vibration takes place not only in the vertical direction but also in the horizontal direction.
In the setup used here, phase resolved color Doppler can only detect the velocity in the vertical direction. Doppler variance is an extension of the Doppler imaging technique that uses the bandwidth of the Doppler spectrum to quantify the transverse speed of the imaged sample. The transverse velocity of the vibrating vocal fold, quantified from the Doppler variance image, is shown in Fig. 7(c). The horizontal velocity is high at the waist region and low at the peak and valley regions.
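To illustrate how a Doppler-bandwidth measure can be computed from two adjacent A-lines, the Python sketch below uses a normalized autocorrelation estimator that is common in the Doppler variance literature. The text does not state which estimator was used here, so the formula, the function name, and the synthetic data are assumptions. The sketch returns a dimensionless value between 0 and 1; the scale factor that converts it to a frequency variance (proportional to $1/T^2$) and the calibration to transverse velocity depend on convention and are left out.

```python
import cmath
import math

def normalized_doppler_variance(a_line_j, a_line_j1):
    """Dimensionless Doppler variance from two adjacent complex A-lines.

    Both inputs hold complex OCT samples over the same small depth window.
    A pure axial Doppler shift (constant phase step) gives 0; phase
    decorrelation between the A-lines pushes the value toward 1.
    """
    pairs = list(zip(a_line_j, a_line_j1))
    cross = sum(a * b.conjugate() for a, b in pairs)
    power = 0.5 * sum(abs(a) ** 2 + abs(b) ** 2 for a, b in pairs)
    if power == 0.0:
        return 0.0
    ratio = min(abs(cross) / power, 1.0)   # guard against rounding above 1
    return 1.0 - ratio

# Synthetic window of four depth samples with unit amplitude.
first = [1 + 0j] * 4
uniform_shift = [cmath.exp(0.2j)] * 4                         # same phase step
mixed_shift = [cmath.exp(s * 0.2j) for s in (1, -1, 1, -1)]   # spread of steps

print(normalized_doppler_variance(first, uniform_shift))
print(normalized_doppler_variance(first, mixed_shift))
```

The first call is insensitive to a uniform axial velocity, whereas the second responds to the spread of phase steps, which is the property that makes a variance image complementary to the color Doppler image.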
Because of the limited penetration depth of OCT, the velocity distribution at greater depths inside the tissue cannot be resolved with the current technique. Slower vibrations may ease the requirement for deeper penetration, and some mechanical properties may be found from the superficial layers of the vocal fold. By controlling the air flow rate, we are able to control the frequency and amplitude of the vocal fold vibration. Figure 8 shows B-mode OCT structure, color Doppler, and Doppler variance images of the vocal fold vibrating at low frequency and small amplitude. The images were acquired at 100 frames per second. Figure 9 is a movie of the B-mode images replayed at a slower 10 frames per second. The velocity distribution at different time points can be seen in the movie. Figure 10 shows four ODT images extracted from the movie, with a time difference of 12 ms between adjacent images. The velocity distribution in the cross-section of the vocal fold can be obtained from the color Doppler images. An interesting phenomenon is that the change in velocity is in the radial direction, as indicated by the blue arrows in Fig. 10. Although this phenomenon can also be found in the waist region in the fast vibration case, as shown in Fig. 6 and Fig. 7, it is more evident in the case of slower vibration shown in Fig. 10. Furthermore, the radially directed velocity in this area changes as the wave travels from right to left, as shown in the progression of Fig. 10(a)-(d). From these images, we can also quantify the wave traveling speed in the horizontal direction. Dividing the distance between the two yellow vertical lines in Fig. 10(a) and Fig. 10(c) by the time difference between these two images, 24 ms (2 × 12 ms), gives a transverse wave velocity of 9.3 mm/s, assuming the wave travels at constant velocity.
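The wave-speed estimate is a simple distance-over-time calculation, sketched below in Python with hypothetical function names. The displacement is back-calculated from the reported 9.3 mm/s and the 24 ms interval, so it is an implied value and not a measurement quoted from the figures. The playback slow-down of the movie follows from the two frame rates given in the text.

```python
def wave_speed(distance_m, elapsed_s):
    """Mean horizontal wave speed, assuming constant velocity."""
    return distance_m / elapsed_s

image_interval = 12e-3                 # seconds between adjacent Fig. 10 panels
elapsed = 2 * image_interval           # panel (a) to panel (c): 24 ms
reported_speed = 9.3e-3                # m/s (from the text)

# Displacement implied by the reported speed over 24 ms (an inference).
implied_distance = reported_speed * elapsed
print(f"implied displacement = {implied_distance * 1e3:.3f} mm")
print(f"speed check = {wave_speed(implied_distance, elapsed) * 1e3:.1f} mm/s")

# Playback slow-down of the movie: acquired at 100 fps, replayed at 10 fps.
slow_down = 100 / 10
print(f"movie is replayed {slow_down:.0f}x slower than real time")
```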
For phase resolved Doppler OCT, the minimum detectable velocity is determined by the phase stability of the system. For the current setup, the phase stability is measured to be better than 5 milliradians. According to Eq. (5), this phase stability corresponds to a speed of 42 μm/s. The maximum speed measurable with Doppler OCT is usually limited by phase wrapping, which corresponds to a phase difference of $2\pi$ and hence to a speed of 0.0525 m/s. With the method introduced in section 3.1 of this paper, or with phase unwrapping methods, we are able to obtain velocities larger than 0.0525 m/s. The accuracy of the results that the system can provide depends on the velocity, frequency, and amplitude of the vibrating sample; the velocity is in turn determined by the amplitude and frequency of the vibration. When the sample is vibrating too fast, the Doppler effect will induce erroneous depth information, and the axial resolution will be degraded [16, 22]. The duty cycle of the laser used in this study is around 50%, so the actual integration time for a single A-line is around 5 μs. If we take the axial resolution of the current system (7 μm) as the largest tolerable axial sample movement during one integration period [16, 22], then the largest axial velocity that does not introduce an intolerable depth error is 1.4 m/s in our setup. The largest velocity demonstrated in this experiment is 0.415 m/s, which corresponds to an axial movement of around 2 μm during the integration time of one A-line.
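The velocity limits quoted above follow from two short calculations, reproduced in the Python sketch below. It uses $v = \lambda_0 \Delta\varphi / (4\pi T)$ with a 10 μs A-line interval (the reciprocal of the 100 kHz A-line rate) and a refractive index of 1; the variable and function names are hypothetical.

```python
import math

WAVELENGTH = 1.05e-6      # central wavelength in metres
A_LINE_INTERVAL = 10e-6   # seconds between adjacent A-lines at 100 kHz

def axial_velocity(dphi, wavelength=WAVELENGTH, dt=A_LINE_INTERVAL):
    """Axial velocity for a phase difference dphi (radians), with n = 1."""
    return wavelength * dphi / (4.0 * math.pi * dt)

# Lower bound set by phase noise, upper bound set by one 2*pi wrap.
v_min = axial_velocity(5e-3)             # about 42 um/s
v_wrap = axial_velocity(2.0 * math.pi)   # 0.0525 m/s

# Motion-blur limit: sample may move one axial resolution per integration.
integration_time = 5e-6                  # about 50% duty cycle of 10 us
axial_resolution = 7e-6                  # metres
v_blur_limit = axial_resolution / integration_time   # 1.4 m/s

# Axial movement during one integration at the largest measured velocity.
movement = 0.415 * integration_time      # about 2 um

print(f"v_min = {v_min * 1e6:.1f} um/s, v_wrap = {v_wrap:.4f} m/s")
print(f"blur limit = {v_blur_limit:.1f} m/s, movement = {movement * 1e6:.2f} um")
```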
As mentioned in section 3.2, the B-mode images are not “snap-shot” images of the vibrating sample. When the frame rate of the system is lower than the vibration frequency of the sample, the information for the whole frame is not obtained at the same time, and treating the B-mode image as a “snap-shot” image may give misleading results. However, the velocity and acceleration information provided by the color Doppler and Doppler variance images remains correct, because the phase resolved methods use only 2 adjacent A-lines to obtain this information. A sliding window covering 50–100 A-lines was used to analyze the B-mode images in this study. The information within this window can be considered an “instant” or “snap-shot” image. To obtain a “snap-shot”-like B-scan image, the frame rate of the system must be much higher than the vibration frequency. This may be realized by decreasing the number of A-lines per frame or by increasing the A-line rate of the swept source laser.
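The timing argument can be made concrete with a short Python sketch. It assumes the 100 kHz A-line rate, 1000 A-lines per frame, and the 94.3 Hz vibration frequency reported elsewhere in this paper; the function names are hypothetical.

```python
A_LINE_RATE = 100_000   # A-lines per second

def frame_rate(a_lines_per_frame, a_line_rate=A_LINE_RATE):
    """Frames per second for a given number of A-lines per frame."""
    return a_line_rate / a_lines_per_frame

def window_duration(n_a_lines, a_line_rate=A_LINE_RATE):
    """Acquisition time in seconds spanned by n_a_lines consecutive A-lines."""
    return n_a_lines / a_line_rate

def fraction_of_cycle(duration_s, vibration_hz):
    """Fraction of one vibration cycle that elapses during duration_s."""
    return duration_s * vibration_hz

vibration_hz = 94.3
full_frame = fraction_of_cycle(window_duration(1000), vibration_hz)
window_50 = fraction_of_cycle(window_duration(50), vibration_hz)
window_100 = fraction_of_cycle(window_duration(100), vibration_hz)

print(f"frame rate with 1000 A-lines: {frame_rate(1000):.0f} fps")
print(f"full frame spans {full_frame:.2f} of a cycle")
print(f"50-100 A-line window spans {window_50:.3f} to {window_100:.3f} of a cycle")
```

Under these assumptions a full frame spans almost one complete vibration cycle, whereas a 50 to 100 A-line window spans less than a tenth of a cycle, which is why the window can be treated as approximately instantaneous. Halving the number of A-lines per frame would double the frame rate at the cost of field of view or sampling density.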
Although bulk motion is not a concern in our current ex-vivo experiment, bulk motion of the patient will affect Doppler OCT results when imaging awake patients. There is extensive experience with bulk motion from OCT ophthalmology applications [23,24]. Bulk motion introduces a bulk phase, and usually the structural images are not affected. The vibration frequency, amplitude, and period are extracted from the structure images, and we expect that bulk motion will not affect the extraction of these parameters. However, the velocity information is obtained with the phase-resolved method and will be affected by bulk motion. The bulk motion artifacts may not be removable with the histogram-based statistical method usually adopted in ophthalmology applications. One way to minimize this effect is to increase the speed of the system so that the imaging time is reduced. In the current setup for B-mode imaging, the imaging speed is 100 frames per second with 1000 A-lines per frame. We can increase the frame rate further by decreasing the number of A-lines per frame or by increasing the sweep rate of the laser with a buffering technique.
Functional imaging of vibrating ex-vivo porcine vocal folds was demonstrated with a high speed swept source OCT and ODT system. The tissue surface of the vibrating vocal folds was extracted to obtain functional information such as vibration amplitude, vibration frequency, velocity, and acceleration. Color Doppler and Doppler variance methods were used to obtain the velocity distribution characteristics in the cross sections. In otolaryngology, this system or one similar to it could help physicians see how laryngeal carcinomas and other conditions differ from baseline characteristics, because quantities such as velocity show how the inertial movement of the tissue is affected.
This work was supported by the National Institutes of Health (EB-00293, EB-10090, RR-01192, HL-103764, HL-105215), the Air Force Office of Scientific Research (FA9550-04-0101), and the Beckman Laser Institute Endowment.