We have previously demonstrated SV in a low BTM scenario by imaging a mouse dorsal skinfold window chamber, where small capillaries could be detected at a relatively slow imaging speed of 36kHz by optimizing the SV parameters (i.e., the gate length, n, which is the number of frames used to calculate the SV, and the frame rate, F). In this study, the viability of the fast imaging system (108kHz) was evaluated in a high BTM scenario by in vivo imaging of the non-stabilized fingernail root of a healthy human volunteer. B-mode scanning and real-time display of the averaged structural OCT and SV images were examined first. Each B-scan consisted of 512 A-lines spanning 2mm. Each A-scan had 2096 samples before recalibration and ~364 samples after recalibration; the recalibrated A-scan was zero-padded to 1024 samples prior to the FFT. Subpixel image registration was performed on the top center region (256 × 256) of the structural image with an upsampling factor of 100. For calculating SV in a high BTM case, a small gate length of either 2 or 4 and a fast frame rate are desired to obtain a high SV signal-to-noise ratio (SNR) [6]. While the A-line density fixed the frame rate at 216fps, the software allowed real-time updates of the gate length between 2 and 16. A gate length of 4 was chosen to reconstruct the averaged structural OCT and SV images, as shown in Fig. 3, at an effective display rate of 54fps. The galvanometer sweepback is seen on the right of each image. Layers of tissue morphology could be delineated easily in the structural image, whereas the microvasculature appears as high-contrast regions in the SV images. Figs. 3(a) (Media 1) and 3(b) (Media 2) compare the SV image quality without and with subpixel image registration realignment, clearly showing less bulk tissue signal and higher vascular contrast in the realigned image.
Fig. 3 B-mode SV image (a) without (Media 1) and (b) with (Media 2) subpixel image registration realignment displayed at 54fps during human fingernail fold imaging. (c) Corresponding real-time structural OCT image. (d) Structural image overlaid with SV image ...
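The relationship between frame rate, gate length, and display rate quoted above can be checked with a short calculation. The sketch below assumes that one SV image is produced per non-overlapping group of n frames, which is consistent with 216fps and n = 4 giving 54fps; the function name `display_rate_fps` is illustrative and is not taken from the authors' software.

```python
def display_rate_fps(frame_rate_fps: float, gate_length: int) -> float:
    """SV display rate when each SV image consumes `gate_length` B-scans.

    Assumption: gates do not overlap (no sliding window over frames).
    """
    if gate_length < 2:
        raise ValueError("speckle variance needs at least 2 frames per gate")
    return frame_rate_fps / gate_length


FRAME_RATE_FPS = 216.0  # B-scan rate reported in the text

# A few gate lengths inside the 2 to 16 range selectable in the software.
for n in (2, 4, 8, 16):
    print(f"gate length {n:2d}: {display_rate_fps(FRAME_RATE_FPS, n):6.1f} fps")
```

Under this assumption, larger gate lengths trade display rate for more frames per variance estimate, which is why a short gate is attractive when bulk tissue motion is high.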
With the same B-mode imaging parameters as above, the software performance was analyzed using CUDA's built-in visual compute profiler 4.2, which provided detailed timing statistics of each kernel executed on the GPU, as shown in Fig. 4. The longest possible processing path was considered by including all data transfers to and from the GPU, subpixel image registration, and SV calculation on the nth frame. The GPU times were averaged over 100 kernel calls, and the total processing time per SV image was ~3.02ms. This processing speed can theoretically support a 169kHz SS-OCT system at 2096 samples per A-scan. The most time-consuming step was the image registration algorithm, which took 1.6ms to complete. Compared with the CPU implementation reported by Guizar-Sicairos et al. [16], which employed the same algorithm (image size of 256 × 256 and an upsampling factor κ = 25), our GPU implementation with κ = 100 is two orders of magnitude faster. SV calculation and data transfer to the GPU took 388μs and 372μs, respectively, making these tasks the second and third most time-consuming steps. The GPU time for the SV calculation is independent of the number of frames used. This is because the mean structural OCT image and the sum of squares are accumulated as each frame is acquired, so that on the nth frame the same number of arithmetic operations is performed to produce the SV image regardless of n. It is worthwhile to note that the structural image calculation took only 195μs; for normal structural OCT imaging, where image registration is not essential, the processing time becomes ~1.23ms, which can theoretically support a 416kHz system. Approximately 30MB of GPU memory was allocated to process each B-scan of 2.15MB in size. Such a low memory requirement allows this parallelized implementation to run on standard, commercially available GPUs. Finally, each recalibrated B-scan contained ~745kB of data and took on average 573μs to transfer to the solid state drive. This data transfer is relatively time-consuming and will require its own dedicated thread to hide the latency for larger data sizes.
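The frame-count independence described above follows from keeping a running sum and a running sum of squares per pixel. The following is a minimal CPU sketch of that idea in plain Python, not the authors' CUDA kernel; the class name `StreamingSpeckleVariance` and its methods are hypothetical, and the variance is taken as the population variance E[I²] − (E[I])² over the gate.

```python
class StreamingSpeckleVariance:
    """Accumulate per-pixel sum and sum of squares over one gate of frames."""

    def __init__(self, height: int, width: int) -> None:
        self.height = height
        self.width = width
        self.reset()

    def reset(self) -> None:
        self.count = 0
        self.total = [[0.0] * self.width for _ in range(self.height)]
        self.total_sq = [[0.0] * self.width for _ in range(self.height)]

    def add_frame(self, frame) -> None:
        """Fold one structural frame into the running sums (same cost per frame)."""
        for r in range(self.height):
            for c in range(self.width):
                v = frame[r][c]
                self.total[r][c] += v
                self.total_sq[r][c] += v * v
        self.count += 1

    def finish(self):
        """Return (mean image, SV image) for the gate, then reset the sums."""
        n = self.count
        if n < 2:
            raise ValueError("need at least 2 frames in a gate")
        mean = [[s / n for s in row] for row in self.total]
        # Clamp at zero to guard against tiny negative round-off values.
        sv = [
            [max(sq / n - m * m, 0.0) for sq, m in zip(sq_row, m_row)]
            for sq_row, m_row in zip(self.total_sq, mean)
        ]
        self.reset()
        return mean, sv


# Toy 2 x 2 frames: the left column fluctuates, the right column is static.
acc = StreamingSpeckleVariance(2, 2)
for frame in ([[1, 2], [3, 4]], [[3, 2], [5, 4]]):
    acc.add_frame(frame)
mean_img, sv_img = acc.finish()
print(mean_img, sv_img)
```

The final step touches each pixel once whatever the gate length, so only the per-frame accumulation scales with the number of frames, and that work is spread across acquisition.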
Fig. 4 Time expenditure of each processing step performed on the GPU, in the order of execution in Thread 2. The total average processing time was 3.02ms for each SV image (512 × 512 pixels, n = 4) with a registration image size of 256 ...
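The supported A-line rates quoted in the text follow from dividing the number of A-lines per B-scan by the processing time per image. The sketch below reproduces that arithmetic; it assumes processing is the only bottleneck and that one image is processed per acquired B-scan, and the function name is illustrative.

```python
A_LINES_PER_BSCAN = 512  # from the B-mode imaging parameters in the text


def max_a_line_rate_khz(processing_time_ms: float,
                        a_lines: int = A_LINES_PER_BSCAN) -> float:
    """A-line rate (kHz) at which one B-scan arrives per processing interval.

    A-lines per millisecond is numerically equal to kHz.
    """
    return a_lines / processing_time_ms


# Full SV path (transfers + registration + SV) and structural-only path.
print(f"SV path:         {max_a_line_rate_khz(3.02):.1f} kHz")
print(f"structural only: {max_a_line_rate_khz(1.23):.1f} kHz")
```

These evaluate to roughly 169.5kHz and 416.3kHz, matching the 169kHz and 416kHz figures reported above.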
To test the 3D SV imaging capability and the real-time data saving function, 10,000 B-scans (512 A-lines in each B-scan) of recalibrated data covering a 2mm-by-2mm region were saved for post-processing. Data transfer did not degrade the software performance, which provided the shortest time delay between adjacent frames and hence a higher SV SNR. In our previous study [6], a 50% decrease in variance signal was observed when the frame step size was the same as the beam spot size. Here, the large number of frames corresponded to a 0.2µm step size, which was 65× smaller than the beam spot size of 13µm.
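The step size and oversampling factor above can be reproduced directly from the scan geometry. This is a small worked calculation using only the numbers given in the text; the function name is illustrative.

```python
import math


def frame_step_um(scan_width_mm: float, num_bscans: int) -> float:
    """Distance between adjacent B-scans along the slow axis, in micrometres."""
    return scan_width_mm * 1000.0 / num_bscans


BEAM_SPOT_UM = 13.0  # beam spot size reported in the text

step_um = frame_step_um(2.0, 10_000)   # 2 mm covered by 10,000 B-scans
oversampling = BEAM_SPOT_UM / step_um  # how many steps fit in one beam spot

print(f"step size: {step_um} um, beam spot / step: {oversampling:.0f}x")
assert math.isclose(step_um, 0.2) and math.isclose(oversampling, 65.0)
```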
In post-processing, the same steps were followed as in real-time processing. Fig. 5(a) shows an en face 2D SV projection image in which the high-contrast regions closely resemble the capillary loops commonly observed in nailfold capillaroscopy [17]. The projection image is a summation over 860μm starting from 310μm below the tissue surface. Motion artifacts due to heartbeat and breathing can be seen as periodic horizontal striations (indicated by the white arrows), resulting from an increase in the noise floor of the entire SV image and of adjacent frames. Fig. 5(b) shows the projection image after the structural images were realigned using the image registration algorithm, which improved the image quality by removing the periodic striation noise.
Fig. 5 2D SV projection image (a) without and (b) with structural image realignment using the subpixel image registration algorithm. Motions during imaging appeared as periodic horizontal striations (indicated by the white arrows on the left-hand side) that increased ...
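The en face projection described above is a sum of SV values over a fixed depth window below the tissue surface. The sketch below illustrates one way to compute it in plain Python. The axial pixel size is not stated in this section, so `pixel_depth_um` is a hypothetical parameter, and the surface is assumed to lie at a single known depth index for the whole volume.

```python
def en_face_projection(volume, surface_index, pixel_depth_um,
                       start_um=310.0, thickness_um=860.0):
    """Sum SV over a depth window for every (frame, A-line) position.

    volume[y][z][x] holds the SV value at slow-axis frame y, depth index z,
    and A-line x. Assumes a flat surface at `surface_index` (an assumption
    made for this sketch, not a statement about the authors' processing).
    """
    z0 = surface_index + round(start_um / pixel_depth_um)
    z1 = z0 + round(thickness_um / pixel_depth_um)
    projection = []
    for frame in volume:
        window = frame[z0:z1]  # depth rows inside the projection window
        n_x = len(frame[0])
        projection.append([sum(row[x] for row in window) for x in range(n_x)])
    return projection


# Toy volume: 2 frames, 6 depth rows, 2 A-lines, with 100 um per depth pixel.
frame_a = [[z, z] for z in range(6)]   # SV equals the depth index
frame_b = [[1, 1] for _ in range(6)]   # uniform SV
image = en_face_projection([frame_a, frame_b], surface_index=1,
                           pixel_depth_um=100.0,
                           start_um=100.0, thickness_um=300.0)
print(image)
```

Each row of the result corresponds to one B-scan, which is why a frame with an elevated noise floor shows up as a horizontal striation in the projection.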