Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Opt Express. Author manuscript; available in PMC 2009 December 9.
Published in final edited form as:
PMCID: PMC2790868

Experimental Bioluminescence Tomography with Fully Parallel Radiative-transfer-based Reconstruction Framework


Bioluminescence imaging is a very sensitive imaging modality, used in preclinical molecular imaging. However, in its planar projection form, it is non-quantitative and has poor spatial resolution. In contrast, bioluminescence tomography (BLT) promises to provide three dimensional quantitative source information. Currently, nearly all BLT reconstruction algorithms in use employ the diffusion approximation theory to determine light propagation in tissues. In this process, several approximations and assumptions that are made severely affect the reconstruction quality of BLT. It is therefore necessary to develop novel reconstruction methods using high-order approximation models to the radiative transfer equation (RTE) as well as more complex geometries for the whole-body of small animals. However, these methodologies introduce significant challenges not only in terms of reconstruction speed but also for the overall reconstruction strategy. In this paper, a novel fully-parallel reconstruction framework is proposed which uses a simplified spherical harmonics approximation (SPN). Using this framework, a simple linear relationship between the unknown source distribution and the surface measured photon density can be established. The distributed storage and parallel operations of the finite element-based matrix make SPN-based spectrally resolved reconstruction feasible at the small animal whole body level. Performance optimization of the major steps of the framework remarkably improves reconstruction speed. Experimental reconstructions with mouse-shaped phantoms and real mice show the effectiveness and potential of this framework. This work constitutes an important advance towards developing more precise BLT reconstruction algorithms that utilize high-order approximations, particularly second-order self-adjoint forms to the RTE for in vivo small animal experiments.

1. Introduction

Bioluminescence Imaging (BLI) has become an increasingly important tool for in vivo preclinical research [1][2]. Currently, planar (two-dimensional) BLI is the conventional imaging modality used in optical imaging. Optical photons emitted from bioluminescence sources in the small animal body (usually mice) propagate in biological tissues and are detected once emitted from the subject surface. Since in vivo biological tissues have high absorption and scattering characteristics, planar BLI only indirectly reflects the activity of the targeted biological object via the surface photon distribution [3]. Optical signals are easily confounded by the tissue properties in in vivo mouse experiments, especially when monitoring the initiation and progression of tumors over time within the same subject. The aim of BLT is to quantitatively acquire 3D information of bioluminescence sources, significantly improving the information quality of bioluminescence imaging.

Monte Carlo (MC) methods used with statistical models can obtain very precise simulations by tracking photon propagation in biological tissues [4]. However, these methods are severely time-consuming. Solving the radiative transfer equation (RTE) (i.e. Boltzmann equation) can also obtain precise simulation results as there is no Poisson noise in the simulation data [5]. Currently, nearly all 3D BLT reconstruction algorithms are based on the diffusion equation, which is a simple approximation to the RTE [6][7] [8][9]. Simulation and experimental reconstructions have shown that the diffusion approximation (DA) introduces significant artifacts in BLT reconstruction [3][10]. Therefore, it is necessary to develop high-order approximation-based reconstruction methods to improve BLT reconstruction. The first- and second-order formulations of the RTE are usually used to directly solve the RTE [5]. Discrete ordinates (SN) and spherical harmonics (PN) methods, as two usual numerical approximations, can yield simulation solutions based on two types of formulations. Compared with first-order formulations, the operators acting on the second-order forms such as the even-odd parity (EOP) equations are self-adjoint [5]. This distinct advantage provides a straightforward application of the finite element methods (FEM) easily executed on complex heterogeneous geometries [11]. Furthermore, the acquired FEM matrix in the second-order equations is sparse positive-definite (SPD), which yields better numerical stability and efficiency and benefits the development of reconstruction algorithms [12]. In order to generate an accurate simulation model, and regardless of the first- and second-order equations, one has to set N as large as possible and then N(N + 2) and (N + 1)2 coupled equations corresponding to SN and PN methods need to be solved. This computational complexity, especially for the whole-body of small animals, creates a substantial challenge in the development of novel BLT reconstruction algorithms. Recently, a novel type of second-order approximation form, the simplified spherical harmonics (SPN) method, has been developed for optical imaging [13], improving computational efficiency. Furthermore, a fully parallel adaptive FEM method was proposed to improve the simulation speed [14]. However, to obtain more accurate BLT reconstructions, a novel BLT reconstruction framework needs to be developed with the radiative transfer-based approximations to the RTE.

BLT is an inverse source problem and in the general case, its solution is not unique [15]. A priori information plays an indispensable role in BLT reconstruction. Among the various types of a priori information, multispectral measurements are important for achieving BLT reconstructions [16][17][18][19]. However, spectrally-resolved data sets can significantly increase the computational burden, especially when non-contact measurements are made using highly sensitive CCD cameras which acquire detailed surface photon distribution. In addition, due to the curved surface topography of the mouse and the necessity of using the heterogeneous characteristics of mouse tissues, numerical reconstruction algorithms, such as those based on FEM, are more suitable compared to analytical and statistical modeling-based methods [4].

BLT reconstruction is, in principle, similar to that of single photon emission computed tomography (SPECT) and positron emission tomography (PET) imaging. Therefore, reconstruction algorithms appropriate for PET and SPECT can be introduced to realize the BLT reconstruction [17]. In this case, the system response P-matrix needs to be computed, which is a very time-consuming step, although it can be obtained prior to acquiring the measured data. The BLT reconstruction is sensitive to various factors. Precalculating the P-matrix can affect the reconstruction quality due to the use of different heterogeneous geometries between the calculation and the experiment. Diffuse optical tomography (DOT) has been investigated for several decades and its reconstruction algorithms are easily applied to BLT. In this case, the Jacobian matrix needs to be calculated for each iteration, which is time-consuming [7]. Since the BLT problem is linear, the least-square problem can be solved to realize the BLT reconstruction by establishing the linear relationship between the unknown source variable and the measured surface data [6]. Until now, this method has not been extended to the radiative-transfer-based model. In addition, matrix inversion needs to be performed when using this strategy. Additional investigations should be performed to improve the reconstruction speed.

In this paper, a radiative-transfer-based fully parallel BLT reconstruction framework is developed using simplified spherical harmonics (SPN) equations. This framework uses finite element methods to process complex reconstruction domain geometries. The linear relationship between the unknown source and the spectrally-resolved measured data using the SPN approximation is established to achieve BLT reconstruction. To improve the reconstruction speed and enable BLT reconstruction for the whole body of the mouse, the finite element-based matrices are stored and operated in a parallel distribution mode. Furthermore, for the time-consuming problems of the key steps in the reconstruction, corresponding improvements are also performed, which significantly accelerate the reconstruction. Timing analysis demonstrates the improved performance of the proposed framework. Experimental reconstructions using mouse-shaped phantoms and real mice show the potential of this framework for practical BLT applications. The next section introduces the proposed fully parallel framework using the SPN approximation. In the third section, the performance tests and analysis are described and experimental BLT reconstructions also are demonstrated. Finally, we discuss relevant issues.

2. Methods

2.1. Radiative transfer equation and SPN approximation

The radiative transfer equation (RTE) is an approximation to Maxwell's equations. In bioluminescence imaging, the source intensity is generally assumed to be time invariant during the data acquisition. In addition, the photons at different wavelengths are considered to be independent, therefore we get

s^[center dot][nabla]ψ(r,s^,λ)+(μS(r,λ)+μa(r,λ))ψ(r,s^,λ)=μS(r,λ)4πp(s^,s^)ψ(r,s^,λ)ds^+S(r,s^,λ)

where ψ(r,ŝ,λ) denotes photons in the unit volume traveling from point r in direction ŝ. Based on the principle of energy conservation, the RTE suggests that the radiance ψ(r,ŝ,,λ) is equal to the sum of all factors affecting it (including absorption μa(r,λ), scattering μs(r,λ), and source energy S(r,ŝ,λ) ) when light photons cross a unit volume [20]. p(ŝ,ŝ′) is the scattering phase function and gives the probability of a photon scattering anisotropically from direction ŝ′ to direction ŝ. Generally, the Henyey-Greenstein (HG) phase function is used to characterize this probability [21]:


where g is the anisotropy parameter; cosθ denotes the scattering angle and is equal to ŝ·ŝ′ when we assume that the scattering probability only depends on the angle between the incoming and scattering directions. When photons reach the body surface of a mouse, that is r [set membership] [partial differential]Ω, some of them are reflected and cannot escape from the mouse body Ω because of the mismatch between the refractive indices nb for Ω and nm for the external medium. When the incidence angle θb from the mouse body is not larger than the critical angle θc (θc = arcsin(nm/nb) based on Snell's law), the reflectivity R(cosθb) is given by [22]:


where θm is the transmission angle. Furthermore, we can get the exiting partial current J+(r) at each boundary point r [13]:

J+(r,λ)=s^[center dot]v>0[1R(s^[center dot]v)](s^[center dot]v)ψ(r,s^,λ)ds^

where v is the unit outer normal vector. After a series of deductions with the PN method, the SPN approximation is obtained [13]

(n+12n+1)[nabla][center dot]1μa,n+1(λ)[nabla]((n+22n+3)ϕn+2(λ)+(n+12n+3)ϕn(λ))(12n+1)[nabla][center dot]1μa,n1(λ)[nabla]((n2n1)ϕn(λ)+(n12n1)ϕn2)+μa,n(λ)ϕn(λ)=S(λ)

where μa,n(λ) = μa(λ) + μs(λ)(1 – gn); when ψ(λ) is expanded by the PN approximation, ϕn(λ) are the Legendre moments of ψ (λ) (2 ≤ nN, N is an odd positive integer). Although the SPN solution is asymptotic and cannot converge to an exact radiative transfer solution with the increase of N, the simulation results have shown good agreement between the SP7 approximation and Monte Carlo methods [14]. Through some further deductions, (N + 1)/2 boundary conditions can be obtained corresponding to (N + 1)/2 Eqs. (5). These boundary conditions are mixed and consist of linear combinations of the even-order ϕn and their first derivatives. With respect to the composite moments [var phi]n of ϕn, which are

[var phi]1=ϕ0+2ϕ2,[var phi]2=3ϕ2+4ϕ4,,[var phi]n=(2n1)ϕ2n2+(2n)ϕ2n,,[var phi](N+1)/2=NϕN1.

We can get the general equations of the SPN approximation and its boundary conditions when practical measurements are performed at the wavelength λk using bandpass filter:

{[nabla][center dot]Ci,[nabla][var phi]i(λk)[nabla][var phi]i(λk)+j=1(N+1)/2Ci,[var phi]j(λk)[var phi]j(λk)=Ci,S(λ)Si(λk)j=1(N+1)/2Ci,[nabla][var phi]jb(λk)v[center dot][var phi]j(λk)=Σj=1(N+1)/2Ci,[var phi]jb(λk)[var phi]j(λk)i[set membership][1,(N+1)/2]}
(7a) (7b)

where Ci,[nabla][var phi]i(λk), Ci,[var phi]j(λk), Ci,[nabla][var phi]jb(λk), and Ci,[nabla][var phi]jb(λk) can be calculated. The details and the above coefficients of the SP1 to SP7 approximations can be found in [13].

2.2. Fully parallel reconstruction algorithm

Based on the finite element analysis, we can get a general weak formulation for the SPN approximation [23]:

Ω{Ci,[nabla][var phi]i(λk)[nabla][var phi]i(λk)[center dot][nabla]υ+j=1(N+1)/2Ci,[var phi]j(λk)[var phi]j(λk)[center dot]υ}dΩ[partial differential]ΩCi,[nabla][var phi]ij=1(N+1)/2fv[center dot][var phi]i([var phi]j)[center dot]υd[partial differential]Ω=ΩCi,SSi(λk)[center dot]υdΩ

To avoid the processing of v·[var phi]i in boundary integration, we assume v·[var phi]i are unknown variables in the boundary equations (Eq. (7b)). We obtain fv·[var phi]i(·) by solving a set of first order equations. The boundary conditions can be easily processed using this method.

Figure 1 shows the flowchart of the proposed fully-parallel framework. Fully-parallel reconstruction means that almost all of the steps in the reconstruction framework should work in parallel mode. After the reconstruction domain Ω is discretized into the volumetric mesh J, the next step is to partition this mesh into Nc mesh subdomains Jc(1JcNc), where Nc is the number of the utilized CPUs. Regarding the finite element implementation, the space of linear finite elements V is introduced on J. [var phi]i(λk) and Si(λk) are approximated:

{[var phi]i(r,λk)p=1NP[var phi]i,p(λk)υp(r)Si(r,λk)p=1NPSi,p(λk)υp(r)}
(9a) (9b)

where [var phi]i,p(λk) and si,p(λk) are the discretized values at a discretized point p when using the basis function υp(r); NP is the total number of the discretized points over the entire domain. Considering Eqs. (8), (9a), and (9b), for a volumetric element τe, we have

[m1[var phi]1(λk)m1[var phi]2(λk)[center dot]m1[var phi](N+1)/2(λk)m2[var phi]1(λk)m2[var phi]2(λk)(...)m1[var phi](N+1)/2(λk)[vertical ellipsis][vertical ellipsis][vertical ellipsis][vertical ellipsis]m(N+1)/2[var phi]1(λk)m(N+1)/2[var phi]2(λk)(...)m(N+1)/2[var phi](N+1)/2(λk)][[var phi]1,τe(λk)[var phi]2,τe(λk)[vertical ellipsis][var phi](N+1)/2,τe(λk)]=[b1[var phi]1(λk)b2[var phi]2(λk)[ddots, three dots, descending]b(N+1)/2[var phi](N+1)/2(λk)][s1,τe(λk)s2,τe(λk)[vertical ellipsis]s(N+1)/2,τe(λk)]


mi[var phi]j(λk)={τe{Ci,[nabla][var phi]i(λk)[nabla]υp[center dot][nabla]υq+Ci,[var phi]i(λk)υpυq}dr[partial differential]τeCi,[nabla][var phi]i(λk)fv[center dot][var phi]i(υp)υqdrifi=jτeCi,[var phi]j(λk)υpυqdr[partial differential]τeCi,[nabla][var phi]i(λk)fv[center dot][var phi]i(υp)υqdrifij}


bi,[var phi]i(λk)=τeυpυqdr

[partial differential]τe is the boundary element if τe is on the boundary and belongs to the respective subdomain in the parallel implementation. After assembling the submatrices on all the elements, we get

[M1[var phi]1Jc(λk)M1[var phi]2Jc(λk)(...)M1[var phi](N+1)/2Jc(λk)M2[var phi]1Jc(λk)M2[var phi]2Jc(λk)(...)M2[var phi](N+1)/2Jc(λk)[vertical ellipsis][vertical ellipsis][vertical ellipsis][vertical ellipsis]M(N+1)/2[var phi]1Jc(λk)M(N+1)/2[var phi]2Jc(λk)(...)M(N+1)/2[var phi](N+1)/2Jc(λk)][[var phi]1Jc(λk)[var phi]2Jc(λk)[vertical ellipsis][var phi](N+1)/2Jc(λk)]=[BJcBJc[ddots, three dots, descending]BJc][S1Jc(λk)S2Jc(λk)[vertical ellipsis]S(N+1)/2Jc(λk)]

After inverting the entire matrix at the left side of Eq. (11), we have

[[var phi]1Jc(λk)[var phi]2Jc(λk)[vertical ellipsis][var phi](N+1)/2Jc(λk)]=[j=1(N+1)/2Cj,SIM1[var phi]jJc(λk)[center dot]BJc[center dot]SJc(λk)j=1(N+1)/2Cj,SIM2[var phi]jJc(λk)[center dot]BJc[center dot]SJc(λk)[vertical ellipsis]j=1(N+1)/2Cj,SIM(N+1)/2[var phi]jJc(λk)[center dot]BJc[center dot]SJc(λk)]

Where IMi[var phi]jJc(λk) are the submatrices of the entire inverse matrix IM(λk) corresponding to Mi[var phi]jJc(λk). Note that matrix inversion is calculated with respect to the entire matrix M(λk) although IMi[var phi]jJc(λk) are used to describe the subdomain submatrices. Because matrix inversion is always time-consuming, to accelerate the reconstruction, direct and iterative matrix inversion methods are compared to optimize the execution time. Details can be found in Section 2.3. After removal of the rows in the matrices j=1(N+1)/2Cj,SIMi[var phi]jJc(λk)[center dot]BJc[center dot]SJc(λk) corresponding to the non-boundary measurable discretized points, we further manipulate Eq. (4) to get [13]

JJc,+,b(λk)=j=1(N+1)/2βj(λk)[var phi]jJc,b(λk)=j=1(N+1)/2βj(λk)GjJc(λk)SJc(λk)=GJc(λk)SJc(λk)

where βj(λk) can be calculated with respect to Eq. (4) when ψ(r,ŝ,λ) is expanded; GjJc(λk) are the corresponding matrices on subdomain Jc after removing the rows in Eqs. (12). When the surface optical data are collected at K wavelengths, we get



JJc,+,b=[JJc,+,b(λ1)[vertical ellipsis]JJc,+,b(λk)[vertical ellipsis]JJc,+,b(λK)],AJc=[γ1GJc(λ1)[vertical ellipsis]γkGJc(λk)[vertical ellipsis]γKGJc(λK)]

AJc is the relationship matrix between JJc,+,b and SJc; γk is the percentage at the wavelength λk of the total energy. It is usually considered as an ill-conditioned matrix because of the ill-posedness of BLT. The surface measured data JJc,+,m corresponding to JJc,+,b will likely lead to reconstruction failure when solving Eq. (14) directly due to the noise factor. Through solving the bound-constrained least-square problem


We can generate the BLT reconstruction, where Ssup is the upper bound of the source density; δ the regularization parameter; and η(·) the penalty function. Since all the data in Eq. (16) are distributed on Nc CPUs, the optimization algorithms should work in parallel mode.

Fig. 1
The flowchart of the proposed fully parallel framework.

2.3. Further demonstration

Overall, BLT reconstruction can be obtained by solving Eq. (16) using parallel optimization algorithms. Although a fully parallel spectrally-resolved reconstruction framework can be realized, performance optimization is necessary to accelerate the reconstruction. With respect to the steps of the proposed framework, three aspects are further demonstrated:

Load balancing is critical for high performance parallel reconstructions. If there are large differences between the sizes of the mesh subdomains on different CPUs, the performance is adversely affected. At the beginning of the proposed framework, mesh partitioning is an important step. In this framework, a multilevel k-way method is used to perform this [24]. The method results in improved performance by reducing the dimensions of the mesh, partitioning it into smaller sizes, and refining it to the original one. Another consideration is the distribution of the relationship matrix A between the spectrally-resolved measured data and the unknown source variable. The distribution of the measurable boundary discretized points is not uniform on each CPU. In addition, AJc is formed by combining GJc. The redistribution of A is necessary in order to optimize the performance of parallel optimization algorithms.

Matrix inversion is the major component in the Relationship Forming step. Although it is very time-consuming, it is unavoidable in the proposed framework. Furthermore, with the increase of the number N in the expansion to the radiance, the dimensions of the matrix M(λk) become very large. It is essential to find a very efficient inversion matrix algorithm [25]. Since M(λk) is sparse positive-definite, LU factorization has good performance in various direct inversion algorithms. Another strategy is to use iterative methods. Preconditioning strategies in recent years have significantly improved the performance of iterative methods [26]. Since the performance of two strategies depends on the matrix properties, it is necessary to decide the type of strategy suitable for the specified problem.

Optimization methods Eq. (16) is a least squares problem, and it is easy to obtain the Hessian matrix in Newton-type optimization methods [27]. However, due to the large scale of the problem, a significant amount of memory is required during the optimization procedure. Even if the Hessian matrix can be calculated at each iteration, the process is extremely time-consuming. In addition, when computing the search direction, it is necessary to invert the Hessian matrix, which is also time-consuming. The speed of BLT reconstruction is therefore severely affected using Hessian matrix-based optimization algorithms. One solution to this problem is to use quasi-Newton methods. Generally, these methods build up an approximate Hessian matrix by using gradient and iteration algorithms. This approximate matrix is obtained by vector-vector multiplications in real time and is easy to inverse, saving memory and time costs. Here, the limited memory variable metric bound constrained quasi-Newton method (BLMVM) in parallel mode is used for BLT reconstruction [28].

3. Results

The bioluminescence imaging experiments were performed on a Maestro 2 in vivo imaging system (CRI, Woburn, Massachusetts). This system uses a cooled CCD camera with a custom lens as the detector. The distinct characteristic of this system is that a liquid crystal tunable filter (LCTF) is used to acquire multispectral data. Generally, the filter bandpass width was set to 20nm and the optical data was collected from a single view. The exposure time for each wavelength was adjusted to obtain high signal-to-noise ratio (SNR). After completing each optical signal acquisition, the phantom or mouse were scanned using an X-ray microCAT system (Siemens Preclinical Solutions, Knoxville, TN) to obtain CT images. The software Amira (Mercury Computer Systems, Inc. Chelmsford, MA) used the CT images to generate volumetric meshes for BLT reconstructions.

The framework was implemented in libMesh [29]. LibMesh is an open-source, high-quality software package and is developed to meet the needs of parallel FEM-based simulation. LibMesh provides almost all of the components used in parallel PDE-based simulation with unstructured discretization. Its design concept is to use existing software packages as far as possible. PETSc developed by Argonne National Laboratory (ANL) was used to solve the linear systems in parallel mode [30]. By default, an open-source serial graph partitioning package, METIS, realizing the multilevel k-way partitioning algorithm was used to partition the whole domain in libMesh [24]. In addition, in order to observe the effect of the model errors in the reconstruction quality, the regularization parameter δ was set to zero in the reconstructions. In order to test the proposed framework, we selected the SP1(DA), SP3 and SP7 approximations to perform BLT reconstructions. All the simulations were performed on a cluster of 27 nodes ( 2 CPUs of 3.2GHz and 4 GB RAM at each node).

3.1. Mouse-shaped phantom experiments

In the first case, a commercial mouse-shaped phantom (Caliper Life Sciences, Hopkinton, Massachusetts, USA) was used to acquire multispectral data. The phantom was fabricated from a polyester resin, TiO2 and Disperse Red. To imitate the bioluminescence source, an optical fiber coupled to a green LED was embedded within the phantom. The emission spectrum of the LED was similar to that of a bioluminescence source. Its wavelength range was from 500nm to 700nm with a peak at about 567nm. The photon distribution data at two wavelengths (580nm and 640nm) were used in the BLT reconstruction. Table 1 shows the optical properties (μa and μs) at two wavelengths measured with the inverse adding-doubling method [31]. For the high-order SPN approximations, we set the anisotropic parameter g to 0.9. More detailed information about this phantom can be obtained elsewhere [16]. Figure 2(a) shows a photograph of the phantom in the Maestro 2 system. To avoid the curved surface effect in the measured data, the bottom flat surface was used as the detection surface. Figures 2(b) and 2(c) show the photon distribution at 580nm and 640nm respectively. They were acquired using an exposure time of 5min. There are distinct differences between them because of the different optical properties at different wavelengths, which benefit the 3D source localization.

Fig. 2
(a) Photograph of Caliper mouse-shaped phantom in a Maestro 2 system; (b) and (c) are the photon distribution at 580nm and 640nm corresponding to (a).
Table 1
Optical properties of a mouse-shaped phantom and actual mouse muscle

With respect to the photon distribution, about two-thirds of the overall phantom was selected for mesh generation. The volumetric mesh, as shown in Fig. 3(a), contained 4,969 nodes and 21,348 tetrahedral elements. Figure 3(a) also shows that the photon distribution was mapped on the mesh surface using a manual co-registration method. Figure 3(b) shows the results after the mesh was partitioned when 10 CPUs were used for the BLT reconstruction. The number of discretized points in each subdomain is similar, avoiding a load imbalance. Figure 4 further shows the reconstructed results based on SP1, SP3 and SP7 approximations. Due to the absence of a regularization parameter, the SP1-based reconstruction was very sensitive to the measured noise and we could not obtain good source localization, as shown in Fig. 4(a). Figures 4(b) and 4(c) show the reconstructed results when SP3 and SP7 approximations were used. From the figures it is clear that the source positions are reconstructed well when using high-order approximations. The localization errors are less than 1mm in two directions, which can clearly be observed from Figs. 4(b) and 4(c). These reconstructed results show the importance and performance of high-order SPN approximations for BLT reconstruction.

Fig. 3
(a) shows the volumetric mesh and the mapped photon distribution at 640nm. (b) is the mesh partitioning results when 10 CPUs are used in BLT reconstruction.
Fig. 4
BLT reconstructions with SPN approximations. (a), (b) and (c) are the reconstructed results corresponding to SP1(DA), SP3 and SP7. Green dotted lines are used to align the boundaries of CT slices with those of reconstructed slices. Thin red lines pass ...

3.2. Reconstruction performance optimization

Although the high-order SPN-based BLT reconstructions yield good source localization, the reconstruction memory and time costs are significantly increased with respect to the first order approximation (SP1). In order to save a dense inverse matrix IM(λk), the SP1-based reconstruction requires only about 94 MB of space compared to about 1.5 GB in the SP7-based reconstructions. Although the proposed fully parallel framework has the ability to process matrices with large dimensions by distributing the storage, reconstruction time becomes impractical with the increase of the approximation order N and the number of used total wavelengths K. Performance optimization is indispensable to improve the efficiency of the proposed framework. The quasi-Newton optimization method (BLMVM) has been selected to significantly reduce reconstruction time when compared to general Newton-type methods. Additionally, matrix inversion and the number of utilized CPUs further optimizes the reconstruction framework.

3.2.1. Direct vs. iterative inversions

When the reconstruction domain is discretized into NP points, the SPN-based BLT reconstruction must process a N*NP×N*NP matrix. The computational complexity of the matrix inversion is O((N*NP)3) if direct inversion methods are used. The computational burden is significantly increased with the increase of N and NP. When 10 CPUs were used in the above BLT reconstructions, the SP1-based reconstruction required only 1,163.1sec, as opposed to 3,086.1sec for SP3 and 10,754.5sec for SP7 (Table 2). LU-factorization-based relationship forming (i.e. forming Eq. (14)) utilized most of the total reconstruction time. For the SP1 and SP7 approximations, the percentage increased from 58.0% to 96.8%, making it critically important to improve the performance of the matrix inversion. For iterative matrix inversion methods, the parallel incomplete LU (ILU) conjugate gradient (CG) method was used to accelerate the inversion. This preconditioner was provided by the Hypre open source package [32], developed by Lawrence Livermore National Laboratory (LLNL). For the SP1- and SP7-based reconstructions, the total reconstruction time sharply decreased from 644.5 to 3,367.5sec, as shown in Table 2. Although the percentage of the relationship forming part in the total time increased regardless of SP1 and SP7 approximations, the reconstruction speed was improved by a factor of 1.80 and 3.19 corresponding to the SP1 and SP7 approximations.

Table 2
Performance comparison between direct and iterative inversions when 10 CPUs are used in reconstructions. DI is Direct Inversion; II denotes Iterative Inversion; and DI/II is the ratio of total time between DI and II.

3.2.2. Parallel reconstruction performance

Iterative matrix inversion methods show the improved performance in the Relationship Forming step. Ideally, parallel execution on an increased number of CPUs should provide improved reconstruction performance. In order to evaluate the proposed fully-parallel reconstruction framework, BLT reconstructions with SP1, SP3 and SP7 approximations were performed using different number of CPUs. Iterative matrix inversion was used in these evaluations. Figure 5(a) shows the total reconstruction time depending on the CPU number. The SP1-based reconstruction time increased with an increased number of CPUs. For SP3- and SP7-based reconstructions, there were an optimal CPU number which will provide the shortest reconstruction time (3 and 23 CPUs corresponding to SP3 and SP7). In the proposed framework, the four main steps are 1) Mesh Partitioning, 2) Matrix Assembly, 3) Relationship Forming, and 4) Optimization. To further observe the above behavior, time analysis was performed while observing these steps, as shown in Figs. 5(b) and 5(c). Since Mesh Partitioning required the least time among the four steps, it was negligible with respect to the entire reconstruction (data is not shown in Figs. 5(b) and 5(c)). With the increase in CPU number, the time cost of Matrix Assembly was gradually reduced. Despite the fact that the matrix assembly time increased with the application of high-order approximation, Matrix Assembly required a small percentage of the overall reconstruction. Relationship Forming and Optimization comprised nearly all the reconstruction time. Furthermore, both of these steps had an optimal number of CPUs to obtain the minimal time cost. Since BLT reconstructions are performed on a cluster, the time cost of the communication between the CPUs becomes significant compared with the performance improvement from parallel execution. Higher speed communication methods, such as shared memory mode, could significantly improve the reconstruction speed. With the current hardware architecture and software settings, the number of CPUs must be preselected to obtain optimal reconstruction time.

Fig. 5
Performance comparison depending on CPU number in SPN-based BLT reconstruction. (a) is the total reconstruction time depending on CPU number; (b) and (c) are the reconstruction time of the major steps in the proposed framework and the percentages of the ...

3.3. Real mouse experiments

To further validate the proposed framework, experiments with a living mouse were performed in the Maestro 2 system. To simulate the bioluminescence source, a luminescent bead (Mb-Microtec, Bern, Switzerland) whose emission spectrum is similar to the in vivo spectrum of a firefly luciferase-based source was used. This bead uses tritium (the half life is about 12 years) to excite phosphor which generates photons, making it a very stable source. The bead dimensions are 0.9mm in diameter and 2.5mm in length. Prior to performing the experiments, the mouse was anesthetized and the bead was injected into the thigh using a syringe. The photon distribution data at 580nm and 660nm were collected from the ventral view. The exposure time was set to 1.5min. The volumetric mesh used in the reconstruction was generated using CT images of the mouse and contained 5,932 points and 24,120 tetrahedral elements. Figure 6(a) shows the mapped photon distribution after co-registration between the photograph of the mouse and the volumetric mesh. From the CT images, the tritium source can be clearly identified, as shown in Fig. 7. Since the photon propagation path consists almost totally of muscle, the reconstruction domain was considered to be homogeneous muscle tissue. The corresponding mouse muscle optical properties were then used in the reconstruction (Table 1).

Fig. 6
(a) shows the volumetric mesh and the mapped photon distribution at 660nm for real mouse experiments. (b) is the mesh partitioning results when 30 CPUs are used in BLT reconstruction.
Fig. 7
BLT reconstructions with SPN approximations for real mouse experiments. (a), (b) and (c) are the reconstructed results corresponding to SP1(DA), SP3 and SP7. Green dotted lines are used to align the boundaries of CT slices with those of reconstructed ...

Figure 6(b) shows the partitioned mesh when 30 CPUs were used in the BLT reconstruction. The reconstructed results corresponding to SP1, SP3, and SP7 approximations are shown in Fig. 7. The actual center position of the tritium source was at (51.8,–0.1). The reconstructed center position obtained from SP1, SP3, and SP7 approximations was at about (51.1,0.2). The SP7-based reconstruction was similar with the SP3-based one regarding the source center position. Although the SP1-based reconstruction was somewhat different compared to the SP3- and SP7-based results, the source localization errors were measured to be less than 1mm in two directions. This result was similar to the SP3- and SP7-based reconstruction in the phantom experiments. The difference between phantom- and real mouse-based BLT reconstructions was that the tritium source could be localized well in the SP1-based reconstruction. This was likely because the tritium source was superficial compared to the LED source in the mouse-shaped phantom. In addition, the mouse surface was more irregular than the mouse-shaped phantom surface, it should have the effect in the reconstruction.

4. Discussions and Conclusion

In this paper, a radiative-transfer-based fully-parallel reconstruction framework was developed for spectrally-resolved bioluminescence tomography. Although the BLT reconstruction was performed based on the simplified spherical harmonics approximation, the proposed framework was also suitable for other high-order self-adjoint approximations to the RTE. The application of the finite element methods made the framework suitable for processing complex geometries. Fully-parallel execution made the BLT reconstruction for the whole-body of a small animal feasible. The reconstruction performance optimization significantly improved the reconstruction speed. The experimental reconstruction using the mouse-shaped phantom and real mouse further demonstrated the effectiveness of the proposed framework.

Since bioluminescence tomography can provide more accurate bioluminescent source information, the importance of developing mature BLT technologies is critical, given the successful and extensive application of planar bioluminescence imaging in biological research. Diffusion theory approximation leads to inaccurate reconstructions and more accurate approximation models are necessary for BLT reconstruction. However, the corresponding computational burden prevents the realization of such reconstruction algorithms. The proposed framework addresses this problem. Although good reconstruction performance can be obtained using high-order approximation models, the memory and time cost cannot be neglected. The balance between reconstruction quality and cost should be further explored. From the reconstruction cases presented in this paper, we find that the SP3 approximation is suitable for obtaining good BLT reconstructions after comparing its results with those based on SP1 and SP7 approximations.

The performance optimization presented here is relevant not only to the improvement of the algorithms involved, but also to the development of the computational hardware. While the application of quasi-Newton optimization methods and iterative matrix inversion have effectively improved the reconstruction performance, further optimization strategies should be developed to obtain higher reconstruction speed. Since the matrix M(λk) is sparse, sparse approximate matrix inversion can be considered to accelerate the reconstruction. With respect to the hardware, it is necessary to improve the data communication speed between CPUs. Developing multi-core CPU technology and shared-memory high performance computer will significantly benefit the proposed algorithm.

In conclusion, we have developed a fully-parallel spectrally-resolved BLT reconstruction framework for radiative-transfer-based high-order approximations. A performance optimization was also performed and described. Preliminary experimental reconstruction verifications show the feasibility and effectiveness of the proposed framework. Further investigations will focus on real mouse experiments used as disease models.


We would like to thank Dr. Laurent Bentolila from Department of Chemistry & Biochemistry, University of California Los Angeles for providing us with access to the Maestro 2 system. We are grateful to Judy Edwards and Waldemar Ladno at the small-animal imaging facility of the Crump Institute for Molecular Imaging for their assistance with mouse experiments. This work is supported by the NIBIB R01-EB001458, a NIH/NCI 2U24 CA092865 cooperative agreement, the Department of Energy DE-FC02-02ER63520, and the NCI grant 5-R01 CA08572.


OCIS codes: (110.6960) Tomography; (170.3010) Image reconstruction techniques; (170.6280) Spectroscopy, fluorescence and luminescence.

References and links

1. Ntziachristos V, Ripoll J, Wang LV, Weisslder R. Looking and listening to light: the evolution of whole body photonic imaging. Nat. Biotechnol. 2005;23:313–320. [PubMed]
2. Weissleder R. Scaling down imaging: Molecular mapping of cancer in mice. Nat. Rev. Cancer. 2002;2:11–18. [PubMed]
3. Virostko J, Powers AC, Jansen ED. Validation of luminescent source reconstruction using single-view spectrally resolved bioluminescence images. Appl. Opt. 2007. pp. 2540–2547. [PubMed]
4. Gibson AP, Hebden JC, Arridge SR. Recent advances in diffuse optical imaging. Phys. Med. Biol. 2005;50:R1–R43. [PubMed]
5. Lewis EE, Miller WF., Jr. Computational Methods of Neutron Transport. John Wiley & Sons; New York: 1984.
6. Cong W, Wang G, Kumar D, Liu Y, Jiang M, Wang LV, Hoffman EA, McLennan G, McCray PB, Zabner J, Cong A. Practical reconstruction method for bioluminescence tomography. Opt. Express. 2005. pp. 6756–6771. [PubMed]
7. Gu X, Zhang Q, Larcom L, Jiang H. Three-dimensional bioluminescence tomography with model-based reconstruction. Opt. Express. 2004. pp. 3996–4000. [PubMed]
8. Lv Y, Tian J, Cong W, Wang G, Luo J, Yang W, Li H. A multilevel adaptive finite element algorithm for bioluminescence tomography. Opt. Express. 2006. pp. 8211–8223. [PubMed]
9. Dehghani H, Davis SC, Jiang S, Pogue BW, Paulsen KD, Patterson MS. Spectrally resolved bioluminescence optical tomography. Opt. Lett. 2006. pp. 365–367. [PubMed]
10. Klose AD. Transport-theory-based stochastic image reconstruction of bioluminescent sources. J. Opt. Soc. Am. A. 2007. pp. 1601–1608. [PubMed]
11. de Oliveira CRE. An arbitrary geometry finite element method for multigroup neutron transport with anisotropic scattering. Progr. Nucl. Energ. 1986;18:227–236.
12. Wright S, Schweiger M, Arridge SR. Reconstruction in optical tomography using the PN approximations. Meas. Sci. Technol. 2007;18:79–86.
13. Klose AD, Larsen EW. Light transport in biological tissue based on the simplified spherical harmonics equations. J. Comput. Phys. 2006;220:441–470.
14. Lu Y, Chatziioannou AF. A parallel adaptive finite element method for the simulation of photon migration with the radiative-transfer-based model. Commun. Numer. Methods Eng. 2009;25:751–770. [PMC free article] [PubMed]
15. Wang G, Li Y, Jiang M. Uniqueness theorems in bioluminescence tomography. Med. Phys. 2004;31:2289–2299. [PubMed]
16. Kuo C, Coquoz O, Troy TL, Xu H, Rice BW. Three-dimensional reconstruction of in vivo bioluminescent sources based on multispectral imaging. J. Biomed. Opt. 2007;12:024007. [PubMed]
17. Chaudhari AJ, Darvas F, Bading JR, Moats RA, Conti PS, Smith DJ, Cherry SR, Leahy RM. Hyperspectral and multispectral bioluminescence optical tomography for small animal imaging. Phys. Med. Biol. 2005;50:5421–5441. [PubMed]
18. Lv Y, Tian J, Cong W, Wang G, Yang W, Qin C, Xu M. Spectrally resolved bioluminescence tomography with adaptive finite element analysis: methodology and simulation. Phys. Med. Biol. 2007;52:4497–4512. [PubMed]
19. Alexandrakis G, Rannou FR, Chatziioannou AF. Tomographic bioluminescence imaging by use of a combined optical-PET (OPET) system: a computer simulation feasibility study. Phys. Med. Biol. 2005;50:4225–4241. [PMC free article] [PubMed]
20. Vo-Dinh T. Biomedical Photonics Handbook. CRC Press; 2002.
21. Ishimaru A. Wave propagation and scattering in random media. IEEE Press; 1997.
22. Haskell RC, Svaasand LO, Tsay T, Feng T, McAdams MS, Tromberg BJ. Boundary conditions for the diffusion equation in radiative transfer. J. Opt. Soc. Am. A. 1994. pp. 2727–2741. [PubMed]
23. Rao SS. The finite element method in engineering. Butterworth-Heinemann; Boston: 1999.
24. Karypis G, Kumar V. Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 1998;48:96–129.
25. Golub GH, Van Loan CF. Matrix computations (3rd ed.) Johns Hopkins University Press; 1996.
26. Benzi M. Preconditioning techniques for large linear systems: a survey. J. Comput. Phys. 2002;182:418–477.
27. Nocedal J, Wright SJ. Numerical Optimization. Springer; New York: 1999.
28. Benson SJ, Moré J. Technical Report ANL/MCS-P909-0901. Mathematics and Computer Science Division, Argonne National Laboratory; 2001. A limited-memory variable-metric algorithm for bound-constrained minimization.
29. Kirk B, Peterson JW, Stogner RH, Carey GF. libMesh: A C++ Library for Parallel Adaptive Mesh Refinement/Coarsening Simulations. Eng. Comput. 2006;22:237–254.
30. Balay S, Buschelman K, Gropp WD, Kaushik D, Knepley MG, McInnes LC, Smith BF, Zhang H. PETSc Web page. 2001.
31. Prahl SA, van Gemert MJC, Welch AJ. Determining the optical properties of turbid media by using the adding-doubling method. Appl. Opt. 1993. pp. 559–568. [PubMed]
32. Falgout RD, Yang UM. hypre: A library of high performance preconditioners. Proceedings of the International Conference on Computational Science-Part III. 2002:632–641.