2.1. System configuration
The experimental configuration of our system is shown in Fig. 1.
We designed our SD-OCT system to be capable of an ultrafast A-scan rate of up to 10 MHz and to
display real-time 4D OCT video at a video rate. We also intended to record 4D OCT images for a long
duration at a volume rate even faster than the standard video rate. The strategy to achieve
ultrahigh speed was to acquire an A-scan signal at all the wavenumbers k simultaneously,
with parallel photoreceivers and an analog-to-digital (A/D) converter array. The A-scan rate
is then fundamentally determined by the speed of the A/D converter array. The use of two ODs (OD+ and
OD−) enables such parallel detection. The concept of an ultrafast SD-OCT configuration was
already reported in our previous work in the wavelength region of 1550 nm [22].
In this work, we designed a system in the 1300 nm wavelength region, which is
more suitable for most biological tissue imaging, and added the capability of real-time 4D display.
We designed our ultrafast data processing system using FPGAs and a GPU board to attain real-time 4D
display and long-time 4D image recording.
Fig. 1 Experimental configuration of our system. In inset (a), spectral shape of light at output of FIL.
The light source of the system is a combination of a superluminescent diode (SLD) (Covega,
Jessup, USA), a semiconductor optical amplifier (SOA1) (Thorlabs Quantum Electronics, Jessup, USA),
and an optical filter (FIL) (a custom product of Alnair Labs, Tokyo, Japan), as shown in Fig. 1. A polarization controller (PC) (Thorlabs Japan, Tokyo, Japan)
and a polarizer (POL) (Optoquest, Ageo, Japan) were used to adjust the wavelength of the maximum
output power to be near the center wavelength of the ODs. The purpose of the FIL was to eliminate
light with wavelengths outside the principal free spectrum range (FSR) of the ODs, as explained
below. The output spectral shape from the source assembly, measured at the output of the FIL, is
shown in inset (a) of Fig. 1. The total output power was 27.5
mW. The vertical broken lines indicate the boundary of the principal FSR of the ODs. A few channels
near both sides of the FSR were affected by the roll-off characteristic of the FIL.
The OCT interferometer has a Mach-Zehnder configuration. The light out of the FIL was divided
using a coupler (C1) (Opneti Communications, Shenzhen, China) into a sample arm (SA) and reference
arm (RA). The splitting ratio of C1 was chosen between 50:50 and 90:10 depending on the experiment.
The light in the SA was directed to either one of two sample probes, which differed in the B-scan
rate, using a circulator (CRS) (Opneti Communications, Shenzhen, China). One probe was for a 4 kHz
B-scan rate and comprised a collimator (CLS) (FH10-IR-APC, Newport Corporation, Irvine, USA), a 4
kHz resonant scanner (RS) (General Scanning, Billerica, USA), a Galvano mirror (GM) (6220H,
Cambridge Technology, Lexington, USA), and an achromatic-doublet objective lens (OLS) with a focal
length of 70 mm. The other probe was for an 8 kHz B-scan rate and comprised a CLS (F280APC-C,
Thorlabs), an 8 kHz RS (General Scanning), a GM (6215H, Cambridge Technology), and an OLS with a focal
length of 30 mm. For the 4 kHz probe, the beam diameter at the output of the CLS was about 8 mm, the
transverse resolution at the focal point was 15 μm, the confocal parameter was 260 μm,
and the beam diameter 2 mm away from the focal point (half the depth range)
was 230 μm. For the 8 kHz probe, the respective values were 3.3 mm, 15 μm, 270
μm, and 220 μm.
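The probe parameters quoted above are close to what ideal Gaussian-beam optics predicts. The following is a minimal sketch of that check, not the authors' calculation: it assumes a 1/e^2 Gaussian beam that fills the collimator output and a centre wavelength of 1.31 μm (roughly the middle of the principal FSR); the function name and defaults are hypothetical.

```python
import math

# Assumption: centre wavelength near the middle of the 1293-1328 nm principal FSR.
WAVELENGTH_M = 1.31e-6


def probe_beam(collimated_diameter_m, focal_length_m, defocus_m=2e-3,
               wavelength_m=WAVELENGTH_M):
    """Return (spot diameter, confocal parameter, diameter at defocus) in metres."""
    # Focused 1/e^2 spot diameter of an ideal Gaussian beam: 2*w0 = 4*lambda*f/(pi*D).
    spot = 4.0 * wavelength_m * focal_length_m / (math.pi * collimated_diameter_m)
    waist = spot / 2.0
    rayleigh = math.pi * waist ** 2 / wavelength_m
    confocal = 2.0 * rayleigh
    # Beam diameter grows as sqrt(1 + (z/zR)^2) away from the focus.
    diameter_at_defocus = spot * math.sqrt(1.0 + (defocus_m / rayleigh) ** 2)
    return spot, confocal, diameter_at_defocus


for name, diameter, focal in (("4 kHz probe", 8e-3, 70e-3),
                              ("8 kHz probe", 3.3e-3, 30e-3)):
    spot, confocal, wide = probe_beam(diameter, focal)
    print(f"{name}: spot {spot * 1e6:.1f} um, confocal parameter "
          f"{confocal * 1e6:.0f} um, diameter at 2 mm {wide * 1e6:.0f} um")
```

Under these assumptions the estimates fall within a few micrometres of the quoted 15 μm, 260 to 270 μm, and 220 to 230 μm values.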
The back-scattered or back-reflected light from the sample was collected by the illuminating
optics and directed from the CRS to an SOA (SOA2) (Thorlabs Quantum Electronics, Jessup, USA). SOA2
had an amplified spontaneous emission (ASE) center wavelength of 1305.8 nm, an optical 3 dB
bandwidth of 86 nm, a small-signal gain of 34.6 dB, and a saturation output power of 17.8 dBm. Usually,
strong ASE light is emitted from an SOA. In fact, SOA2 emitted 70 mW of ASE light from its
input port. However, the CRS effectively works as an isolator, and this light was attenuated to
820 nW and 6.6 nW at the output/input port and the input port of the CRS, respectively, which had a
negligible effect on system performance. The output of SOA2 was directed to a coupler (C2) (Opneti
Communications, Shenzhen, China) with a 50:50 splitting ratio. We did not use an additional optical
filter to eliminate ASE light outside the principal FSR of the ODs. The noise due to ASE light made a
negligible contribution compared with the beat noise, which determined the noise floor of the present
experiment, as explained in section 3.2. A polarization controller (PCS) (Thorlabs Japan,
Tokyo, Japan) was used to adjust the signal polarization. In the RA, a circulator (CRR) (Opneti
Communications, Shenzhen, China), collimator (CLR) (FH10-IR-APC, Newport Corporation, Irvine, USA),
achromatic-doublet objective lens (OLR) and a reference mirror (RM) were used to balance the optical
path length difference between the SA and RA. The output of the CRR was directed to C2. An
adjustable aperture was placed between the CLR and OLR to regulate the power directed to C2. A
polarization controller (PCR) (Thorlabs Japan, Tokyo, Japan) was used to adjust the polarization of
the reference light.
The two outputs of C2 were directed to OD+ and OD−, respectively. Detailed specifications
of the ODs are explained below. Each OD outputs light in 320 channels of different optical
frequencies, with equal intervals between adjacent channels. An array of 320 balanced photoreceivers
(PDs) (2117, New Focus, San Jose, USA) detected the outputs from the ODs. The optical signals from
the same channel number of the two ODs were differentially detected by one photoreceiver. The PD
array output 320 electric signals, which were digitized and processed using our ultrafast data
processing system, the function of which is explained below.
2.2. Optical demultiplexer
Three main advantages of using ODs in an SD-OCT system are an ultrafast A-scan rate, reduced
sensitivity roll-off, and linear-k spectral detection. The disadvantages are limited spectral
coverage, high cost, and stronger attenuation of light compared with a diffraction
grating. Because the use of ODs in SD-OCT is not yet common, we describe the characteristics of our
ODs in somewhat more detail. AWGs were used as the ODs in this experiment. A schematic of an AWG is
shown in Fig. 2. Each output from C2 in Fig. 1 is directed to the
input port of an AWG. The input slab guide directs input light to a set of arrayed waveguides. The
arrayed waveguides consist of optical paths that mutually differ in path length. In the output slab,
light output from the set of the arrayed waveguides is collected. Due to interference of light
passing through optical paths of different lengths, light is dispersed into different wavelengths at
the output surface of the output slab. Each narrow wavelength range is directed to its own output
optical fiber. By suitably designing the waveguides, the frequency (wavenumber) intervals between
adjacent fiber outputs can be made equal, i.e., a linear-k interferometer is fabricated. The
multiple fiber outputs enable simultaneous detection of the whole set of outputs, leading to an
ultrafast A-scan rate. The spectral width of the nearly Gaussian band-pass characteristic at each
output is narrower than the frequency interval between adjacent channels, which gives the SD-OCT
system an effectively long coherence length. An explanation of the fundamental technology of AWGs
can be found in the book by Okamoto [36].
Fig. 2 Schematic of arrayed waveguide grating (AWG)-type optical demultiplexer.
In our former work, we used AWGs in the 1560 nm wavelength region [22], which is a standard in
telecommunication technology. In OCT, the 1300 nm region is more suitable for most tissues. We
asked NTT Electronics (NEL, Yokohama, Japan) to design and manufacture custom planar lightwave
circuit (PLC)-type AWGs to operate in this wavelength region and to achieve a depth range of 4 mm in
OCT applications. NEL provided us with the results of test measurements, some of which are shown
below. Application of an AWG to OCT in the 1300 nm wavelength region was reported by Nguyen et al.,
who imaged the output at the surface of the output slab of an AWG onto a line-scan camera using a
lens [37]. In their setup, the A-scan rate was limited by the 46 kHz speed of the line-scan camera,
and a 6 dB roll-off of the point spread function was observed at about 0.7 mm, while the depth range
was only 1 mm. Although they used the advantage of linear-k detection in their system, ultrafast
detection and reduced roll-off were not attained. Recently, Akca et al. improved the depth range to
4.6 mm and the 6 dB roll-off to about 3.2 mm using a polarization-independent AWG in their
reflectometry [38].
An AWG has an FSR, and the best performance is expected in the principal FSR. The principal FSR
of our AWG was from 1293.42 nm (231.782 THz) to 1327.58 nm (225.818 THz) and was divided into 320
channels. The spectral coverage of 34 nm (6.0 THz) determined the axial resolution limit of 22
μm for a rectangular apodization [39]. Dependencies of the optical frequency $\nu_i$ on the
channel number $i$, provided by NEL, are shown in Fig. 3. We used two AWGs. The outputs of these
OD+ and OD− were respectively connected to the + and − inputs of the photoreceivers, as shown in
Fig. 1. The agreement in the characteristics of the two ODs affects the quality of the SD-OCT
system. A linear least-squares fit gave $\nu_i = 231.8005 - \Delta\nu \, i$ (THz) with a coefficient
of determination of $R^2 = 0.99999998$ for both OD+ and OD−, where the slope $\Delta\nu$ corresponds
to the average channel interval of 18.7 GHz given below. Therefore, the two AWGs are
practically identical in their frequency characteristics, and the frequency (and therefore the
wavenumber) depends linearly on the channel number, which eliminated RSK in our data
processing. NEL’s AWGs are thermally tunable. To tune them to practically identical
characteristics, the temperatures of OD+ and OD− had to be controlled to 27.9°C and
44.4°C, respectively. The average frequency interval of 18.7 GHz between adjacent
channels determined the depth range of 4.0 mm.
Fig. 3 Dependence of optical frequency on channel number. (a) Optical demultiplexer (AWG) OD+, (b)
Optical demultiplexer (AWG) OD−.
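The link between the channel interval and the 4.0 mm depth range can be checked with a short calculation. This is a sketch under stated assumptions rather than the authors' derivation: the two quoted FSR limits are treated as the centre frequencies of channels 1 and 320, the depth range is taken as the usual Nyquist limit c/(4δν) for a fringe sampled at frequency interval δν, and the value is for air (refractive index 1). All names are illustrative.

```python
C_M_PER_S = 299_792_458.0

NU_CHANNEL_1_THZ = 231.782    # upper limit of the principal FSR
NU_CHANNEL_320_THZ = 225.818  # lower limit of the principal FSR
N_CHANNELS = 320


def channel_spacing_hz(nu_first_thz=NU_CHANNEL_1_THZ,
                       nu_last_thz=NU_CHANNEL_320_THZ,
                       n_channels=N_CHANNELS):
    """Average frequency interval between adjacent channels."""
    return (nu_first_thz - nu_last_thz) * 1e12 / (n_channels - 1)


def depth_range_m(spacing_hz):
    """Nyquist-limited one-sided depth range in air.

    A reflector at depth z gives a fringe of period c/(2z) in optical
    frequency; at least two samples per period are needed, so z_max = c/(4*dnu).
    """
    return C_M_PER_S / (4.0 * spacing_hz)


spacing = channel_spacing_hz()
print(f"channel spacing: {spacing / 1e9:.2f} GHz")
print(f"depth range:     {depth_range_m(spacing) * 1e3:.2f} mm")
```

With these inputs the spacing comes out near 18.7 GHz and the depth range near 4.0 mm, consistent with the values stated in the text.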
We now explain the role of the FIL in Fig. 1. We observed
the spectra of selected output channels from OD+ using an optical spectrum analyzer (AQ6370,
Yokogawa, Yokosuka, Japan). Without using the FIL, we attenuated the output light from SOA1 with a
coupler of a 90:10 splitting ratio and directed it to the optical spectrum analyzer. Superposed
spectra at 9 channels (1, 40, 80, 120, 160, 200, 240, 280, and 320) are depicted in
Fig. 4. The principal FSR region is indicated in the figure with an arrow. In the longer wavelength
region, signals from channels 1, 40, and 80 were observed. In the shorter wavelength region, the
spectrum from channel 320 was observed close to the spectral peak observed at channel 1. Without
the FIL in the experimental configuration of our system shown in Fig. 1, these signals with
wavelengths outside the principal FSR were detected at each channel simultaneously. As seen from
the different separations of the peaks of channels 1 and 320 on the two sides of the principal FSR,
the periodic structure of the spectrum outside the principal FSR was shifted in wavelength
from that inside the principal FSR. Therefore, without the FIL in Fig. 1, we observed unwanted side
lobes in the PSF signals.
Fig. 4 Superposed spectra observed at selected channels (1, 40, 80, 120, 160, 200, 240, 280, 320) of
optical demultiplexer (AWG) OD+.
To measure the agreement in wavelength at corresponding channels of the two ODs, spectra at
channel 160 were measured for both ODs. The FIL was used in the light source. The results are shown
in Fig. 5(a); the red and blue lines denote OD+ and OD−, respectively. The peak was observed at
practically the same position for both ODs. Similar agreement was observed for all the selected
channels shown in Fig. 4. To estimate the spectral width, the
OD+ signal is plotted with a linear vertical scale (solid line in Fig. 5(b)). The observed spectrum
is a convolution of the real spectrum with the resolution
function of the optical spectrum analyzer. We observed the spectrum at four different resolutions of
the analyzer and obtained a value of ~0.05 nm for the spectral width by extrapolation to zero
resolution. The spectral shape shown as the broken line in Fig. 5(b) is a rough approximation of the
real spectrum, obtained by shrinking the observed
spectral shape in the horizontal direction to ~0.05 nm width. The spectral shapes shown as the thin
black lines in Fig. 5(a) are drawn similarly. On the
logarithmic vertical scale, they almost overlap the other curves. The narrow width compared
with the wavelength interval between adjacent channels effectively makes our system a frequency
comb detection system. We previously demonstrated the resulting improvement in sensitivity roll-off
[22]. Bajraszewski et al. also demonstrated a noticeable improvement in sensitivity
roll-off in their SD-OCT system using an optical frequency comb source [40].
Fig. 5 (a) Spectra observed at channel 160 of the two optical demultiplexers; OD+: red, OD−: blue.
(b) Plot of spectrum observed at channel 160 of optical demultiplexer OD+ with linear vertical scale.
The attenuation of light by the AWGs was also provided by NEL for all the channels. Dependencies
on the channel number are shown in
Fig. 6. Figures 6(a) and 6(b) show OD+ and OD−, respectively. A light source of narrow spectral
width (81600B Tunable Laser Source, Agilent, USA) was used for the measurements. The minimum values
were 3.21 and 2.98 dB for OD+ and OD−, respectively. The maximum values were 6.56 and 6.50 dB for
OD+ and OD−, respectively. The attenuation increased as the channel approached either side of the
principal FSR. The variation is not monotonic and modulates the amplitude of the interference
fringe of OCT. The modulation must be corrected in data processing. Strong attenuation by an AWG
compared with that of a diffraction grating is a disadvantage in using an AWG for OCT. For example,
the attenuation of a grating from Wasatch Photonics (Logan, USA) can be less than 0.5 dB.
With a continuous light source, attenuation by an AWG also occurs due to the pass-band
characteristic shown in Fig. 5. The portion of light
indicated by the pink area in Fig. 5 is lost. This leads
to about 3 dB attenuation at all the channels.
Fig. 6 Dependences of attenuation on channel number are shown for (a) optical demultiplexer OD+ and (b)
optical demultiplexer OD−. Dependences of non-adjacent background crosstalk on channel number
are shown for (c) optical demultiplexer (AWG) OD+ and (d) optical demultiplexer (AWG) OD−.
The non-adjacent background crosstalk provided by NEL is shown in Fig. 6. Figures 6(c) and 6(d) show OD+ and OD−, respectively. The average values are −32.51
and −34.99 dB for OD+ and OD−, respectively. The dependence on channel is weak. The
crosstalk weakly contributes to the noise floor in an OCT image. Our measurement in Fig. 5(a) is consistent with NEL’s data. From the figure,
the crosstalk between adjacent channels was estimated. The vertical dashed green lines labeled 159
and 161 indicate the center wavelength of the adjacent channels. From the cross points of the green
lines and the spectra shown by the thin solid lines, the crosstalk between the adjacent channels was
estimated to be less than about −25 dB. It deteriorates spectral purity at a channel by a
2.3. Data processing
In the experimental configuration shown in Fig. 1, an
A-scan interference fringe signal was acquired simultaneously with 320 separate photoreceiver and
DAQ channels. To perform FFT in real time for real-time display, the A-scan data
distributed over the 320 DAQ channels must be gathered as one set of signals immediately after data
acquisition. At the time we planned this experiment four years ago, National Instruments (NI,
Austin, USA) was about to start selling DAQ-connected FPGA boards, which are inserted in a chassis
with a PXI Express ×4 bus (NI calls a PCI Express equivalent bus PXI Express). We estimated the
capability of a system composed of NI’s off-the-shelf boards and chassis to gather the data, as
mentioned above, and to conduct real-time FFT processing. We found this to be feasible and ordered a
custom-made ultrafast data processing system with the 320-channel A/D converter array shown on the
right-hand side of Fig. 1.
The block diagram of the A/D converter array and ultrafast data processing system is shown in
Fig. 7. Outputs from the photoreceivers were connected to digitizers (5751, NI). A digitizer is a
16-channel, 50 MHz, 14-bit adapter module for FPGA-module-D (PXIe-7962R, NI). With the
FPGA-module-Ds, we conducted BGS and APD processing. There were twenty digitizer and FPGA-module-D
pairs. The data processed using the FPGA-module-Ds were transferred to two FPGA-module-Fs
(PXIe-7965R, NI) via PXI Express switches. First-in, first-out (FIFO) buffers were built into the
FPGA boards to buffer and transfer data between boards without loss. Each A-scan of 320 channels was
zero-padded to 512 data points for FFT processing. Four FFT units were built into each of the two
FPGA-module-Fs, and the eight units performed FFT successively. The processing speed of an FFT unit
was 146,000 A-scans/second, and the total processing speed with the eight FFT units was 1.17 × 10^6
A-scans/second. NI’s PXIe boards were inserted into two chassis (PXIe-1075,
NI). Precise synchronization of the clocks of the two chassis was done with two timing boards
(PXIe-6674, NI). Fast data transfer between the chassis and the PC was done with two PXIe-interface
boards (PXIe-8375, NI) and a PCIe-interface board in the PC (PCIe-8371, NI). The sustainable
throughput of the interface was 838 MBytes/s. A multifunction DAQ (MF-DAQ) (PXIe-6363, NI) was
used to receive sampling trigger signals from the resonant scanner at a rate of 4 or 8 kHz and also
to output control signals for the Galvano scanner.
Fig. 7 Block diagram of A/D converter array and ultrafast data processing system.
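The per-A-scan processing chain described above (background handling, apodization, zero padding from 320 to 512 points, FFT) can be illustrated in software. The sketch below is only a rough model and not NI's FPGA firmware: it assumes that BGS and APD denote background subtraction and apodization, uses a Hann window as a stand-in apodization, uses a plain recursive radix-2 FFT, and feeds it a synthetic single-reflector fringe. All names and the test signal are hypothetical.

```python
import cmath
import math

C_M_PER_S = 299_792_458.0
N_CHANNELS = 320
N_FFT = 512
SPACING_HZ = 18.7e9  # average channel interval quoted in the text


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def process_a_scan(fringe, background):
    """Subtract background, apodize (Hann, an assumption), zero-pad, FFT.

    Returns the power profile for the positive-depth half of the FFT.
    """
    n = len(fringe)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    samples = [(f - b) * w for f, b, w in zip(fringe, background, window)]
    padded = samples + [0.0] * (N_FFT - n)
    spectrum = fft(padded)
    return [abs(x) ** 2 for x in spectrum[: N_FFT // 2]]


def bin_to_depth_m(bin_index):
    """Depth in air for an FFT bin, given the linear-in-frequency sampling."""
    return bin_index * C_M_PER_S / (2.0 * SPACING_HZ * N_FFT)


# Synthetic fringe from one reflector 1 mm from zero delay (hypothetical input).
# The sign of the channel-to-frequency slope does not matter for a real cosine.
Z_TRUE_M = 1.0e-3
background = [1.0] * N_CHANNELS
fringe = [1.0 + 0.2 * math.cos(4.0 * math.pi * SPACING_HZ * i * Z_TRUE_M / C_M_PER_S)
          for i in range(N_CHANNELS)]

profile = process_a_scan(fringe, background)
peak_bin = max(range(len(profile)), key=profile.__getitem__)
print(f"peak bin {peak_bin}, depth {bin_to_depth_m(peak_bin) * 1e3:.3f} mm")
```

Because the channels are equally spaced in frequency, the FFT can be applied directly to the 320 samples without rescaling; zero padding to 512 points gives a power-of-two length and a finer depth sampling of roughly 16 μm per bin in air.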
The PC (T7500, DELL, Austin, USA) included a GPU-board (Tesla C2050, Nvidia, Santa Clara, USA),
PCIe-interface board, and HD-interface board (8262 × 4, NI). We used RAID 0-type HDD memory
(HDD-8264, NI) of 3 TBytes for long recording. The sustainable writing and reading speed of the HDD
was 600 MBytes/s. The image was displayed on a display (U2410, DELL). The refresh rate of the
display was 59 Hz.
The firmware for the FIFO buffers and FFT units written to the FPGA boards was prepared by NI.
The software running on the PC was developed under the Microsoft Windows 7 Professional operating
system. The system control software and the graphical user interface were created in NI’s LabVIEW
2009, which was operated in 32-bit mode because the compiler for the FPGA boards was not
supported in 64-bit mode. After transferring volume data from the data processing system to the
PC, we copied them to the GPU, the software of which was written using compute unified device
architecture (CUDA) by NVIDIA and compiled using Microsoft Visual Studio 2008 and Intel C++ Compiler
Professional Edition. The CUDA program performed volume rendering. We modified the sample program
“volumeRender” in the “NVIDIA GPU Computing SDK” downloaded from
NVIDIA’s web site [41]. The OpenGL 3.0 library was
used for visualization of the processed images. The CUDA- and OpenGL-based algorithms were
implemented in LabVIEW through dynamic link libraries (DLLs) so that rendered images could be
displayed and manipulated in LabVIEW’s screen display. Although we designed the average
data traffic rate to be lower than the limit of PCI (PXI) Express ×4, partial loss of data occurred
when we tried to transfer 2-byte numerical data (14-bit data acquired with the DAQs and results of
FFT processing) in the form of signed 16-bit integers (I16). We believe
this was due to fluctuations in the data traffic rate, with overflow occurring at instants when
the traffic rate exceeded the sustainable throughput of PCI (PXI) Express ×4. To reduce this
fluctuation, we bunched four I16 values into one unsigned 64-bit value (U64) and empirically found
no loss of data. The FFT data were bunched using the FPGAs and transferred to the PC, and
un-bunching from U64 to I16 was done using the GPU in the PC.
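The bunching of four I16 values into one U64 word, and the reverse operation, can be sketched as follows. This is an illustration only: in the system described here the packing ran on the FPGAs and the unpacking on the GPU, and the lane order chosen below (first value in the least-significant 16 bits) is an assumption, since the text does not state it. The function names are hypothetical.

```python
import struct


def bunch_i16_to_u64(values):
    """Pack signed 16-bit integers, four at a time, into unsigned 64-bit words."""
    if len(values) % 4:
        raise ValueError("number of I16 values must be a multiple of four")
    words = []
    for start in range(0, len(values), 4):
        raw = struct.pack("<4h", *values[start:start + 4])  # four little-endian I16
        words.append(struct.unpack("<Q", raw)[0])           # same 8 bytes as one U64
    return words


def unbunch_u64_to_i16(words):
    """Recover the original signed 16-bit integers from the U64 words."""
    values = []
    for word in words:
        values.extend(struct.unpack("<4h", struct.pack("<Q", word)))
    return values


samples = [-8192, 8191, 0, -1, 1234, -1234, 32767, -32768]
packed = bunch_i16_to_u64(samples)
restored = unbunch_u64_to_i16(packed)
print(len(samples), "I16 values ->", len(packed), "U64 words; lossless:",
      restored == samples)
```

The payload size is unchanged (8 bytes per four samples); only the number of transferred elements drops by a factor of four, which matches the stated aim of smoothing the traffic rather than compressing the data.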
An example of the displayed screen image and the functions of the software are shown in
Fig. 8. Figure 8(a) shows an example of the screen display.
To show details clearly, partial images are enlarged and illustrated in the remaining panels of
Fig. 8. The 3D-rendered image (3D), a B-scan image (V), and the surface image (en face) are
displayed on the left-hand side. The display of images in this area can be selected with the buttons
shown in Fig. 8(d). By clicking button (d)-(1), the display image shown in Fig. 8(a) is selected. By
clicking (d)-(2), (d)-(3), or (d)-(4), only
the rendered 3D, B-scan, or en face image is displayed in the display space, respectively. Button
(d)-(5) toggles the directional dice display at the lower-right corner of the 3D image. The
en face image is calculated by integrating the power spectrum in the axial direction for each
A-scan. The direction of view of the rendered volume is selected with the buttons shown in Fig. 8.
The reset button restores the default direction.
We prepared seven color codes, as shown in Fig. 8(c). Of the
seven, the black-and-white (c)-(1) or tissue-like color gradation (c)-(5) was used for most tissue
imaging. The rainbow color codes were sometimes useful for local enhancement in an image. Rotation,
zooming, or translation can be done in real time with the mouse.
Fig. 8 Example of screen image of ultrafast real-time 4D OCT system.
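The en face calculation mentioned above, integrating the depth-resolved power along the axial direction for each A-scan, amounts to a simple projection of the volume. The sketch below uses nested Python lists and a tiny made-up volume purely for illustration; the real system did this on the GPU, and the dB conversion with a floor value is an assumption added for display.

```python
import math


def en_face_image(power_volume):
    """Sum the power along depth for every A-scan.

    power_volume[b][a] is the list of depth-resolved power values of the
    a-th A-scan in the b-th B-scan; the result is indexed as image[b][a].
    """
    return [[sum(a_scan) for a_scan in b_scan] for b_scan in power_volume]


def to_db(power, floor=1e-12):
    """Convert power to dB, clamping at a small floor to avoid log(0)."""
    return 10.0 * math.log10(max(power, floor))


# Hypothetical 2 x 2 x 3 volume (B-scans x A-scans x depth samples).
volume = [
    [[1.0, 2.0, 3.0], [0.0, 0.0, 1.0]],
    [[4.0, 0.0, 0.0], [2.0, 2.0, 2.0]],
]
image = en_face_image(volume)
image_db = [[to_db(p) for p in row] for row in image]
print(image)
```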
We could also cut and reveal an arbitrary surface perpendicular to one of the three axes in real
time by selecting the “Clipping” button in Fig. 8. This corresponds to a virtual surgery in real
time. Such cutting was performed with the
sliders shown in Fig. 8. Because the cutting can be done
simply by eliminating the cut portion of the volume data, more complicated real-time virtual surgery
is possible if we write a program for that purpose. By selecting the “WW/WL” button in
Fig. 8, we can open sliders to control the threshold and
window of image intensity expressed in dB scale for both the 3D image and en face image. By
selecting the “Opacity” button, we can modify the color code. The GPU status can be
displayed by clicking the “GPU” button, which is only used for debugging purposes.
We also implemented a long-duration data-recording capability. For recording, we did not perform
FFT. With the 3-TByte HDD, volume data could be recorded for about 100 minutes.
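The recording time can be related to the storage capacity and data rate with simple arithmetic. The sketch below assumes decimal units (1 TByte = 10^12 bytes, 1 MByte = 10^6 bytes) and uses the 600 MBytes/s sustainable HDD speed quoted earlier; the implied average rate for a 100-minute recording is an inference from these numbers, not a value reported by the authors.

```python
HDD_CAPACITY_BYTES = 3e12      # 3 TBytes, assuming decimal units
HDD_RATE_BYTES_PER_S = 600e6   # sustainable HDD speed quoted in the text
CHANNELS = 320
BYTES_PER_SAMPLE = 2


def recording_minutes(capacity_bytes, rate_bytes_per_s):
    """Time to fill the disk at a constant data rate."""
    return capacity_bytes / rate_bytes_per_s / 60.0


def implied_rate_bytes_per_s(capacity_bytes, minutes):
    """Average data rate that fills the disk in the given time."""
    return capacity_bytes / (minutes * 60.0)


bytes_per_a_scan = CHANNELS * BYTES_PER_SAMPLE
shortest = recording_minutes(HDD_CAPACITY_BYTES, HDD_RATE_BYTES_PER_S)
average_rate = implied_rate_bytes_per_s(HDD_CAPACITY_BYTES, 100.0)
print(f"raw A-scan size: {bytes_per_a_scan} bytes")
print(f"recording time at the HDD limit: {shortest:.1f} min")
print(f"average rate implied by 100 min: {average_rate / 1e6:.0f} MBytes/s "
      f"({average_rate / bytes_per_a_scan:.0f} A-scans/s)")
```

Under these assumptions the disk would fill in about 83 minutes at the full HDD speed, so a duration of about 100 minutes suggests an average recording rate somewhat below that limit.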
2.4. Imaging speed and real-time display
The B-scan rate was fixed at either 4 or 8 kHz by using the RS shown in Fig. 1. The number of
A-scans within a B-scan was limited by the FFT speed of 1.17 × 10^6
A-scans/second. We chose the number of A-scans per B-scan as 256 and 128 for
the 4 and 8 kHz B-scan rates, respectively. In both cases, the FFT processing rate was 1.02 × 10^6
A-scans/second. The maximum A-scan rate could be the DAQ speed of 50 MHz. However,
real-time processing is not possible at this sampling rate due to the limitation of the
processing system. Moreover, if the A-scan rate is too fast compared with the response time of the
photoreceiver, inter-A-scan blurring occurs in our system configuration [22].
The response frequency of our photoreceiver was 12 MHz at the gain of 10,
which we usually used. Therefore, we summed five 50-MHz data samples and set the fastest A-scan rate
to 10 MHz. It should be noted that the summation of fringes may distort the original signal if
the phase of the OCT signal keeps changing during the five 50 MHz samplings. The A-scan rate
was decreased by including additional 50 MHz samplings per A-scan. The lower limit of the A-scan
rate was set by the condition that a set of A-scans per B-scan must be finished within half the
period of the RS. As the A-scan rate increased, the duty ratio decreased. For the A-scan rates of
2.5, 5, and 10 MHz, the duty ratios were 39.8, 20.5, and 10.2%, respectively. If we consider the duty
ratio, the averaged A-scan rates are all 1 MHz. The off-duty time was required to complete a series
of FFT processing within a single B-scan time. Therefore, a faster A-scan rate reduced the motion
artifacts in a B-scan (in the direction of the fast axis), but it did not reduce the artifacts in
the direction of the slow axis. We call the number of lateral scans along the slow axis “B-scans per
volume”.
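The relation between the number of summed 50 MHz samples, the A-scan rate, the duty ratio, and the FFT budget can be written out as a short calculation. This is an idealised sketch: it assumes the duty ratio is simply the acquisition time of one B-scan divided by the B-scan period, which reproduces the quoted 10.2% and 20.5% but gives about 41% at 2.5 MHz, slightly above the reported 39.8%, so the actual timing at that setting presumably differed a little from this model. Function names are hypothetical.

```python
DAQ_RATE_HZ = 50e6
FFT_CAPACITY_A_SCANS_PER_S = 8 * 146_000  # eight FFT units


def a_scan_rate_hz(samples_summed):
    """A-scan rate when this many 50 MHz samples are summed per A-scan."""
    return DAQ_RATE_HZ / samples_summed


def duty_ratio(a_scans_per_b_scan, b_scan_rate_hz, samples_summed):
    """Fraction of the B-scan period spent acquiring A-scans (idealised)."""
    acquisition_time_s = a_scans_per_b_scan / a_scan_rate_hz(samples_summed)
    return acquisition_time_s * b_scan_rate_hz


def averaged_a_scan_rate(a_scans_per_b_scan, b_scan_rate_hz):
    """A-scans per second averaged over on-duty and off-duty time."""
    return a_scans_per_b_scan * b_scan_rate_hz


for samples in (20, 10, 5):
    rate_mhz = a_scan_rate_hz(samples) / 1e6
    duty = duty_ratio(256, 4000.0, samples)
    print(f"{rate_mhz:4.1f} MHz A-scan rate: duty ratio {duty * 100:.1f}%")

for a_scans, b_rate in ((256, 4000.0), (128, 8000.0)):
    average = averaged_a_scan_rate(a_scans, b_rate)
    within_budget = average <= FFT_CAPACITY_A_SCANS_PER_S
    print(f"{a_scans} A-scans at {b_rate / 1e3:.0f} kHz: "
          f"{average:.0f} A-scans/s, within FFT budget: {within_budget}")
```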
The number of B-scans per volume determined the volume rate. After a forward full
swing of the GM, fly-back time and restart processing time were required for a single volume scan.
We measured the actual volume rate of real-time processing by using the indicator shown in Fig. 8.
It indicated the time duration between successive volume
scans, which fluctuated. For example, motion of the mouse or a touch of the keyboard noticeably
reduced the volume rate. It also depended on the display format. The measured actual volume
processing rates, averaged over 300 volumes, are listed in
Table 1 for various choices of B-scan number per volume, B-scan rate, and display format.
Examples of a 3D-only display image and a 3-image display are shown in section 3.4.
Table 1 Volume rates (volumes/second) and voxel rates (MVoxels/second) of real-time processing. Voxel
rates are in parentheses.
The refresh rate of the LCD display shown in Fig. 7 was 59
Hz. In videos with volume rates faster than this, some frames were lost at the display. For a
volume image, more than 100 lateral scans are preferred for both axes. Therefore, we claim about 41
volumes/second as the fastest practical real-time 4D display rate, obtained with the 8-kHz RS,
3D-only display, and 128 B-scans per volume.
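The gap between the measured 41 volumes/second and the scanner-limited rate gives a rough feel for the per-volume overhead. The sketch below assumes one B-scan per RS period, so that the ideal volume rate is the B-scan rate divided by the number of B-scans per volume, and attributes the whole difference to fly-back, restart, and processing time; this overhead figure is an inference, not a value reported in the text.

```python
DISPLAY_REFRESH_HZ = 59.0


def ideal_volume_rate(b_scan_rate_hz, b_scans_per_volume):
    """Volume rate if every scanner period produced a useful B-scan."""
    return b_scan_rate_hz / b_scans_per_volume


def implied_overhead_s(measured_volume_rate, b_scan_rate_hz, b_scans_per_volume):
    """Extra time per volume beyond the pure scanning time."""
    scan_time_s = b_scans_per_volume / b_scan_rate_hz
    return 1.0 / measured_volume_rate - scan_time_s


ideal = ideal_volume_rate(8000.0, 128)
overhead = implied_overhead_s(41.0, 8000.0, 128)
print(f"ideal rate: {ideal:.1f} volumes/s")
print(f"implied overhead per volume: {overhead * 1e3:.1f} ms")
print(f"41 volumes/s exceeds the display refresh rate: {41.0 > DISPLAY_REFRESH_HZ}")
```

Under these assumptions the scanner alone would allow 62.5 volumes/second, so roughly 8 ms per volume goes to overhead, and the resulting 41 volumes/second stays below the 59 Hz refresh rate of the display.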
Continuous recording of volumetric 3D data is also useful. By rendering after recording, we can
manipulate each 3D image to reveal an interesting aspect. Image changes that are faster than the
display speed can be investigated frame by frame. In recording, 320 data samples (one per channel,
2 bytes each) were collected for each A-scan. They were transferred to the PC without performing
FFT. The recording speed was limited by the rate of 600 MBytes/second sustained by the HDD shown in
Fig. 7. Data traffic rates, under all the conditions listed in
Table 2, were slower than the limit. Actual recording
volume rates were measured and averaged over 300 volumes, as listed in
Table 2. In two cases, 32 and 64 B-scans per volume at an 8-kHz B-scan rate, the recording
rate was faster than the corresponding 3D-only display rates listed in Table 1. In the other cases,
the recording rate was practically the same as the
3D-only real-time display rate.
Table 2 Volume rates (volumes/second) and voxel rates (MVoxels/second) of recording. Voxel rates
are in parentheses.
The voxel rates listed in Tables 1 and 2 are significantly lower than the fastest voxel rate of 4.5
GVoxels/second demonstrated by Wieser et al. [16] with their
ultrafast SS-OCT system.