All acknowledged Co-authors in alphabetic order on this paper:
Karsten Alfke1, Iris Asllani2, Karin M Bloch3, Ajna Borogovac4, Patrick Browaeys5, John A Butman6, Marc van Cauteren3, Jon M Chia3, Ivan Dimitrov3, Manus J Donahue7, Neville D Gai6, J Christopher Gatenby8, Stephen Goode9, Adam E Hansen10, Michael Helle1, Shuichi Higano11, Toshinori Hirai12, Hans Hoogduin13, Frank GC Hoogenraad6, Marko K Ivancevic3,14, Alan Jackson15, Geon-Ho Jahng16, Adun Kampaengtip17, EunJu Kim3, Jae Hyoung Kim18, Sung Tae Kim19, Mika Kitajima12, Linda Knutsson20, John Lackey21, Song Lai21, Jiraporn Laothamatas17, David J Larkman22, Henrik BW Larsson23, Jung Hee Lee19, Seung-Koo Lee24, Hanzhang Lu25, Alex L MacKay26, Reto A Meuli5, Kirsten Moffat27, Elizabeth A Moore3, Paul S Morgan9, Takaki Murata11, Burkhard Mädler26,3, Tomoyuki Noguchi28, Ron Peeters29, Nancy K Rollins30, Ronald Shnier27, Stefan Sunaert29, Pia C Sundgren14, E Brian Welch3,8, Dal Mo Yang16, Takashi Yoshiura28, Peter C van Zijl7, Ivan Zimine3.
1Institute of Neuroradiology, University Hospital of Schleswig-Holstein, Kiel, Germany; 2Department of Radiology, Columbia University, USA; 3Philips Healthcare MR Clinical Science; 4Department of BioMedical Engineering, Columbia University, USA; 5Department of Radiology, University Hospital, Lausanne, Switzerland; 6Diagnostic Radiology Department (DRD), Clinical Center, NIH, Bethesda, MD, USA; 7F.M. Kirby Center for Functional Brain Imaging, Kennedy Krieger Institute, Department of Radiology, Johns Hopkins University, USA; 8Vanderbilt University Institute of Imaging Science, Vanderbilt University, Nashville TN, USA; 9Academic Radiology, University of Nottingham, Nottingham University Hospital, Queen’s Medical Centre Nottingham, UK; 10Functional Imaging Unit, Department of Radiology, Glostrup Hospital, Denmark; 11Department of Diagnostic Radiology, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan; 12Department of Diagnostic Radiology, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan; 13BCN Neuroimaging Center (NIC), University Medical Center Groningen, Groningen, The Netherlands; 14Department of Radiology, University of Michigan, Ann Arbor, MI, USA; 15Wolfson Molecular Imaging Centre, Department of Radiology, The University of Manchester, UK; 16Kyung-Hee University, Department of Radiology, East-West Neo Medical Center, Seoul, Korea; 17Department of Radiology, Ramathibodi Hospital, Bangkok, Thailand; 18Seoul National University Bundang Hospital, Department of Radiology, Seoul, Korea; 19Department of Radiology, Samsung Medical Center, Sungkunkwan University School of Medicine, Seoul, Korea; 20Department of Medical Radiation Physics, Lund University, Lund & Center for Medical Imaging and Physiology, MR division, Lund University Hospital, Sweden; 21Department of Radiology, Thomas Jefferson University, Philadelphia, PA, USA; 22Imperial College London, UK; 23Functional Imaging Unit, Department of Clinical Physiology & Nuclear 
Medicine, Glostrup Hospital, Denmark; 24Severance Hospital Yonsei University, Department of Radiology, Yonsei University College of Medicine, Seoul, Korea; 25Advanced Imaging Research Center, UT Southwestern Medical Center, USA; 26Department of Physics and Astronomy, Department of Radiology, University of British Columbia, Vancouver, Canada; 27Symbion Clinical Research Imaging Centre, Sydney, Australia; 28Department of Clinical Radiology, Graduate School of Medical Sciences, Kyushu University, Japan; 29Department of Radiology, University Hospitals of the Catholic University of Leuven, Leuven, Belgium; 30Department of Radiology, Children’s Medical Center of Dallas, Dallas, TX, USA.
Arterial Spin Labeling (ASL) is a method to measure perfusion using magnetically labeled blood water as an endogenous tracer. Being fully non-invasive, this technique is attractive for longitudinal studies of cerebral blood flow in healthy and diseased individuals, or as a surrogate marker of metabolism. So far, ASL has been restricted mostly to specialist centers due to a generally low SNR of the method and potential issues with user-dependent analysis needed to obtain quantitative measurement of cerebral blood flow (CBF).
Here, we evaluated a particular implementation of ASL (called Quantitative STAR labeling of Arterial Regions or QUASAR), a method providing user-independent quantification of CBF, in a large test-retest study across sites from around the world, dubbed “The QUASAR reproducibility study”. Altogether, 28 sites located in Asia, Europe and North America participated and a total of 284 healthy volunteers were scanned. Minimal operator dependence was assured by using an automatic planning tool, and its accuracy and potential usefulness in multi-center trials were evaluated as well.
Accurate repositioning between sessions was achieved with the automatic planning tool, showing mean displacements of 1.87±0.95mm and rotations of 1.56±0.66°. Mean gray matter CBF was 47.4±7.5 [ml/100g/min] with a between-subject standard deviation SDb = 5.5 [ml/100g/min] and a within-subject standard deviation SDw = 4.7 [ml/100g/min]. The corresponding repeatability was 13.0 [ml/100g/min] and was found to be within the range of previous studies.
Cerebral blood flow (CBF) is an important physiological parameter for probing metabolic activity in the brain and therefore accurate CBF measurements are crucial for the evaluation of a wide range of diseases and their progression. CBF gives information about the delivery or availability of metabolites or nutrients, rather than the direct metabolic rate and knowledge about baseline CBF can therefore tell whether the minimum delivery required for ensuring homeostasis is fulfilled. Fast fluctuations in CBF around the baseline are usually linked to temporary variations in metabolic demand due to e.g. neuronal activity in the brain, and are the underlying basis behind the common blood oxygenation level dependent (BOLD) contrast used in fMRI (Ogawa et al., 1990). CBF measurements at high temporal resolution can therefore be used to develop a more comprehensive picture of the physiological events accompanying neuronal activation (Hoge et al., 1999).
Arterial Spin Labeling (ASL), a non-invasive perfusion modality, has the potential to open a unique window into the assessment and understanding of perfusion in vascular diseases in the clinic, as well as brain function within the neuroscience field (Petersen et al., 2006b). However, obtaining quantitative CBF using ASL techniques is challenging due to uncertainties in bolus arrival time, arterial input function, underlying kinetics and static tissue parameters like blood equilibrium-magnetization. The latter is of special importance in longitudinal ASL studies, because it is a direct scaling factor in CBF quantification and therefore any error in this parameter will propagate directly to the uncertainty of the perfusion estimate. In addition to the complexity of the flow quantification, ASL is a low signal-to-noise measurement technique and, as a result, is often portrayed as a perfusion tool that only works in dedicated and highly specialized settings. Nevertheless, the development efforts over the years, combined with the recent move towards high-field systems even in clinical settings, have solved many of these problems (Golay and Petersen, 2006).
In recent years, several studies have reported on the robustness of ASL with regards to reproducing CBF estimates in subjects from both the clinical as well as the more research minded settings (Asllani et al., 2008; Floyd et al., 2003; Hermes et al., 2007; Jahng et al., 2005; Parkes et al., 2004; Yen et al., 2002). Common to all of them is that they were performed within a single center and therefore any variability attributed to differences in hardware and subject handling procedures between several sites still remains to be evaluated.
In addition, a big challenge with regards to the success of any multi-center MRI study is to maintain consistent imaging protocols across sites. This includes everything from briefing of the subjects, subject positioning and fixation in the scanner to the subsequent operator-dependent planning of the image sections to be acquired. The former three are important for subject comfort, which again will directly influence the amount of motion artifacts present in the data sets. The latter is important with respect to the post-processing of the data where differences in angulations easily can affect the subjective reading by radiologists or change quantitative measures such as physical-anatomical parameters (Fazekas et al., 2002). While it is of less importance in true 3D isotropic acquisitions, repositioning errors can be significant in multi-slice acquisitions, especially when acquired with a gap between slices. In order to improve planning consistency in the clinics, the scanner vendors have recently provided automatic planning software (Itti et al., 2001; van der Kouwe et al., 2005; Young et al., 2006a). These tools have the potential to improve planning consistency between populations and centers in multi-center trials as well as for repeated scans within individuals.
In this work, we evaluated the Quantitative STAR labeling of Arterial Regions (QUASAR) implementation of ASL (Petersen et al., 2006a; Petersen et al., 2009), a method which allows user independent CBF estimation, in a worldwide test-retest study dubbed “The QUASAR reproducibility study”. The aims of the study were to show that ASL is a reliable option for perfusion measurements in the first place and secondly that it can easily be applied across centers without the need for special hardware or dedicated personnel. Altogether, twenty eight sites located in Asia, Europe and North America participated in the study, each scanning 10 healthy volunteers on average. Minimal operator dependence was assured by using automatic planning tools and their accuracy and usefulness in multi-center trials were evaluated as well.
The 28 participating sites were distributed with 10 in Asia, 9 in Europe and 9 in North America. Each site recruited and scanned 10 healthy volunteers on average with a total of 284 subjects: 164 males and 120 females with an age range of 18-65 (Mean: 33.7 +/− 8.9, Median: 32). The ethnicities were distributed with 112 of Asian origin, 165 Caucasians and 7 African Americans, giving a good representation of the Asian and Caucasian population in particular. All subjects gave written informed consent before participation and the studies were conducted in accordance with the declaration of Helsinki and local ethics regulations, and they were approved by the institutional review boards of the respective sites. Any personal information from subjects was removed in accordance with local patient protection regulation such as the Health Insurance Portability and Accountability Act (HIPAA) in the USA.
The sites were all equipped with 3T Philips Achieva whole body systems (Philips Healthcare, Best, The Netherlands). All images were acquired using the quadrature body coil as the transmit coil, and a dedicated 8-element phased-array head coil as the receiving coil. In addition all scanners were equipped with the manufacturer’s automatic planning software called SmartExam (Young et al., 2006a).
The study consisted of two scanning sessions per subject, separated by two weeks on average (13±10 days), and all subjects underwent 3 high resolution 3D anatomical scans as well as 4 ASL scans. Two perfusion measurements were obtained during each session. During the first session, the second scan was acquired after repositioning the volunteer in the scanner, whereas for session 2 the second scan was repeated without any repositioning of the subject. A full scanner calibration step was enforced before each ASL scan. This setup allowed the evaluation of different factors affecting the within-subject standard deviation (SDw) of CBF, such as physiological variations that are expected to be significant between sessions but less pronounced within a scan session of approximately 30 min. Efforts were made to keep the physiological perfusion fluctuations as natural as possible, that is, no medication, beverage or food with vaso-dilatory or vaso-constrictive effects, such as coffee, tea and licorice, was allowed to be consumed within 8 hours prior to the study. To assess the influence of repositioning on the within-subject standard deviation, SDw within session 1, where the subjects underwent repositioning, was compared to SDw within session 2, where no repositioning occurred. Finally, within session 2, where no repositioning occurred, SDw can be assumed to be attributed mainly to acquisition errors, subject motion and potential errors propagating in the subsequent post processing. Together, these sources of error increase the overall variability between repeated measurements, and are related to that particular technique. Both sessions were performed in random order (50-50%) across sites and volunteers.
Whereas the within-subject variation is mainly of importance when considering repeated measurements on the same subject, such as in longitudinal studies, the between-subject variance is valuable when a comparison of CBF between populations is planned, such as in patient vs. control studies. In addition, if studies are performed across centers, the between-center effects play a role too. These effects could also be evaluated in the current setup, with subjects of different gender and race participating from different sites.
The QUASAR experiment is based on a multi-slice acquisition (with a gap between slices) thus a correct repositioning is crucial with regards to the interpretation of the reproducibility of the perfusion estimates. Therefore, in order to reduce user interaction to a minimum, it was decided to use the automatic planning tool available on the scanners. SmartExam (Young et al., 2006a), the implementation available on the Philips scanners, uses image recognition on a 3D survey for computerized identification of 27 landmarks in the brain and is capable of automatically orienting the scanning geometry based on this information. The geometrical positioning of these landmarks is derived from previous manually planned scans, in this case based on 10 subjects with mixed gender and race (Asians and Caucasians) who were all positioned three times in the scanner in one of the participating sites (Singapore). The scan protocol and the “trained” geometry database were distributed to the participating sites and all scans were subsequently planned automatically. Such methods ensured a minimal set of instructions to follow, thereby reducing the risks of faulty protocols. Positioning in the scanner was done three times for each subject, twice during session 1 and a single positioning for session 2.
Automatic planning is a relatively new feature available on the scanners, and therefore the accuracy of this user-independent approach was evaluated by acquiring a high resolution anatomical image (MPRAGE) each time a subject was positioned in the scanner. Information about the precision was then obtained by co-registration of these MPRAGE images, which were acquired with the following parameters: TR/TE=6.7/3.1ms, TI=0.8s, FA=8°, resolution=0.9×0.9×0.9mm3, FOV=240×162×190, Reconstruction=288×288 and 180 slices, scan duration 5 min 26 s. Altogether three volumes were acquired, two during session 1 (MPRAGE 1 & 2) and one during session 2 (MPRAGE 3) where the subjects were positioned once only.
The subsequent image registration was done using ITK (The Insight Toolkit) (Ibanez et al., 2005) in a two step 3D rigid body registration where first a rough registration was performed based on mutual information and then followed by a standard mean square optimization. The centre-of-mass of the image intensity was used as a reference point during the registration and the output of the procedure is therefore translation and rotation (angle and rotation axis) around this reference point (See Fig.1d). Choosing a reference point at the center of the brain helps to interpret the values in an intuitive way. The reported values will include inaccuracies of the automatic planning, subject movement between acquisition of the 3D survey and the MPRAGE scan as well as possible registration errors.
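The registration output described above (a translation plus a rotation given as angle and axis about the reference point) can be reduced to the scalar summaries reported later. The following is a minimal sketch, not the ITK pipeline used in the study: it assumes the result is available as a translation vector in mm and a 3×3 rotation matrix, and the function names are hypothetical.

```python
import math

def modulus_displacement(translation_mm):
    """Euclidean length of a translation vector (L-R, A-P, F-H) in mm."""
    return math.sqrt(sum(c * c for c in translation_mm))

def rotation_angle_axis(R):
    """Angle (degrees) and unit axis of a 3x3 rotation matrix (nested lists).

    Uses trace(R) = 1 + 2*cos(angle); the axis is taken from the antisymmetric
    part of R, so this form is only valid for angles strictly between 0 and 180.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_a = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp against round-off
    angle = math.acos(cos_a)
    s = 2.0 * math.sin(angle)
    if abs(s) < 1e-12:
        raise ValueError("axis is undefined for an angle of 0 or 180 degrees")
    axis = ((R[2][1] - R[1][2]) / s,
            (R[0][2] - R[2][0]) / s,
            (R[1][0] - R[0][1]) / s)
    return math.degrees(angle), axis

# Illustrative input: a 1.5 degree rotation about the first axis
# (taken here, as an assumption, to be the L-R axis).
a = math.radians(1.5)
R_example = [[1.0, 0.0, 0.0],
             [0.0, math.cos(a), -math.sin(a)],
             [0.0, math.sin(a), math.cos(a)]]
angle_deg, axis = rotation_angle_axis(R_example)
```

Averaging the absolute values of the axis components over all registrations would give a per-direction summary of the kind used to describe which rotation axis dominates.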
The QUASAR implementation of ASL was used for this study and the sequence has been described in detail elsewhere (Petersen et al., 2006a; Petersen et al., 2009). For this study, it was modified to be capable of accurately measuring tissue equilibrium-magnetization Mt,0 and T1t by means of a dual flip-angle Look-Locker acquisition strategy described in Part I of this work “Method for user independent Arterial Spin Labeling” (Petersen et al., 2009). This ensures that all parameters needed for the subsequent CBF quantification are acquired and thereby no user interaction is required for the perfusion estimation.
The sequence is a multi-slice, multiple time-point ASL sequence based on pulsed arterial spin labeling principles. Both labeling and control experiments are preceded by a saturation pulse, and a QUIPSS II (Wong et al., 1998) type of bolus saturation is applied during a Look-Locker sampling (Gunther et al., 2001). In addition, both crushed and non-crushed control-label pairs are acquired in an interleaved manner for AIF and arterial blood volume (aBV) estimation (Petersen et al., 2006a). During the crushed experiments, homogeneous elimination of the signal in the fast moving vessels is ensured by applying bi-polar crusher gradients for the control-label pairs in the 4 diagonal directions [(+x,+y,+z);(−x,+y,+z);(+x,−y,+z);(−x,−y,+z)] (Petersen et al., 2009). General scan parameters were: TR/TE/ΔTI/TI1=4000/23/300/40ms, 13 inversion times (40-3640ms), 64×64 matrix, 7 slices, slice thickness=6mm, 2mm gap, FOV=240×240, flip-angle=35/11.7°, SENSE=2.5, 84 averages (48@Venc=4cm/s, 24@Venc=∞, 12 low flip angle), all implemented in a single sequence, scan duration 5min 52s. Altogether four perfusion scans were performed, two during session 1 and two during session 2.
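As a quick consistency check of the timing parameters listed above, the inversion times and the total number of averages can be regenerated from TI1, ΔTI and the stated counts. This is only an illustrative sketch; the variable names are not taken from the sequence implementation.

```python
TI1_MS = 40        # first inversion time (ms), from the protocol
DELTA_TI_MS = 300  # spacing between Look-Locker readouts (ms)
N_TI = 13          # number of inversion times

# Evenly spaced readouts: TI_k = TI1 + k * dTI for k = 0..12
inversion_times_ms = [TI1_MS + k * DELTA_TI_MS for k in range(N_TI)]

# Averages listed in the protocol: crushed, non-crushed and low flip angle
averages = {"crushed_venc_4cm_s": 48, "non_crushed": 24, "low_flip_angle": 12}
total_averages = sum(averages.values())

print(inversion_times_ms[0], inversion_times_ms[-1], total_averages)
```

The first and last values reproduce the stated 40-3640ms range, and the three groups of averages sum to the stated 84.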
Post processing was done using in-house software after the images were exported to a Windows PC running IDL 6.1 (ITT Visual Information Solutions, Boulder, CO). Pairs of images showing strong motion artifacts were automatically discarded prior to averaging and the raw images were then modulus-subtracted to produce ΔM images separately for crushed and non-crushed data. Estimation of blood equilibrium magnetization Ma,0 was performed in a user independent way, by assigning Ma,0[x,y] = Mt,0[x,y]/λ on a voxel by voxel basis (x,y), where Mt,0 is the equilibrium tissue magnetization and λ=0.9 is the full brain blood-brain partition coefficient (Method 3 in (Petersen et al., 2009)). Perfusion maps were subsequently calculated as described earlier (Petersen et al., 2006a) resulting in CBF, aBV as well as arterial transit time (ATT) maps. In short, information about the arterial blood volume and the shape of the AIF, is revealed by subtracting the crushed experiments from the non-crushed experiments. Appropriate scaling of the AIF and subsequent deconvolution of the tissue signal (crushed experiment) by this AIF, makes estimation of CBF possible from the peak of the resulting residue function, in a way similar to dynamic susceptibility contrast based perfusion techniques (Petersen et al., 2006a). Arterial transit times on the other hand, are estimated based on the rising edge of the ΔM images acquired at multiple inversion times.
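The voxel-wise assignment Ma,0[x,y] = Mt,0[x,y]/λ is simple enough to sketch directly. The snippet below is an illustration in plain Python rather than the in-house IDL software; the function name and the nested-list representation of a slice are assumptions made for the example.

```python
LAMBDA = 0.9  # blood-brain partition coefficient used in the text

def blood_m0_map(mt0_map, lam=LAMBDA):
    """Voxel-wise Ma,0[x,y] = Mt,0[x,y] / lambda for one 2D slice (nested lists)."""
    if lam <= 0:
        raise ValueError("partition coefficient must be positive")
    return [[mt0 / lam for mt0 in row] for row in mt0_map]

# Hypothetical tissue equilibrium magnetization values (arbitrary units)
mt0_slice = [[900.0, 450.0],
             [0.0, 1800.0]]
ma0_slice = blood_m0_map(mt0_slice)
```

Because Ma,0 scales the CBF estimate directly, any relative error in Mt,0 or λ carries over to the perfusion value by the same factor.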
In addition, SNR maps were calculated on a voxel-by-voxel basis, with SNR defined as Mt,0/σ, where σ is the standard deviation of the noise and Mt,0 is the tissue equilibrium magnetization. The calculated T1 maps from the Look-Locker saturation recovery data were used to register all perfusion scans to the MNI template from the International Consortium for Brain Mapping (ICBM). Based on the registered average CBF map from all subjects and scans, failed CBF maps were automatically detected and discarded. This was done according to the correlation with the average CBF map (r < 0.5) and whether more than 30% of the voxels had an SNR lower than 150. These criteria were subjectively chosen based on their ability to discard outliers in a consistent way across subjects and centers.
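The rejection rule above can be sketched as follows. This is a plain-Python illustration, not the in-house implementation: the function names are hypothetical, maps are assumed to be flattened to equal-length lists of voxel values, and because the text does not state whether one or both conditions must hold, the sketch rejects on either condition by default (an assumption) and exposes a flag for the stricter reading.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def is_failed_map(cbf, mean_cbf, snr, r_min=0.5, snr_min=150.0,
                  max_low_snr_fraction=0.30, require_both=False):
    """Flag a CBF map as failed using the two criteria from the text."""
    low_correlation = pearson_r(cbf, mean_cbf) < r_min
    low_snr_fraction = sum(1 for s in snr if s < snr_min) / len(snr)
    poor_snr = low_snr_fraction > max_low_snr_fraction
    if require_both:
        return low_correlation and poor_snr
    return low_correlation or poor_snr  # assumed reading: either criterion rejects

# Hypothetical voxel values for illustration only
cbf_map = [40.0, 50.0, 60.0, 45.0]
average_map = [42.0, 48.0, 58.0, 47.0]
good_snr = [200.0, 210.0, 190.0, 220.0]
poor_snr_values = [100.0, 100.0, 200.0, 200.0]
```

With these made-up inputs, the map is kept when all voxels exceed the SNR threshold and rejected under the default reading when half of them fall below it.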
Statistical analyses were performed using IDL 6.1 (ITT Visual Information Solutions, Boulder, CO) and R version 2.7.2 (The R Foundation for Statistical Computing).
At the level of single sites we model the subjects as a random effect (block factor) and scan time as a fixed effect. Since experiments were replicated 28 times at different sites, we have a replicated design, where the blocks (subjects) are nested in - rather than crossed with - replications (sites) as would have been the case if the same 10 subjects had been scanned at every site.
In this model we investigate the contribution to the total variance of three different variance components, σ2Site, σ2Subj(Site), σ2Scan,Subj(Site), representing the between-site, between-subject and within-subject variability respectively. This Linear Mixed Effect model was fitted by restricted maximum likelihood (REML), which in effect corrects the maximum-likelihood estimator for degrees of freedom (Pinheiro and Bates, 2000) when there are missing observations. Additional regressors for effects from gender, age as well as time of scan were also included and all three perfusion modalities i.e. CBF, aBV as well as ATT were analyzed.
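One way to write down a nested model of this kind is given below. The notation is an assumption made for illustration (indices i for site, j for subject within site, k for scan; x for the fixed-effect regressors such as gender, age and time of scan) and is not taken from the original analysis code.

```latex
y_{ijk} = \mu + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + a_i + b_{j(i)} + \varepsilon_{k,j(i)},
\qquad
a_i \sim \mathcal{N}\!\left(0,\ \sigma^2_{\mathrm{Site}}\right),\quad
b_{j(i)} \sim \mathcal{N}\!\left(0,\ \sigma^2_{\mathrm{Subj(Site)}}\right),\quad
\varepsilon_{k,j(i)} \sim \mathcal{N}\!\left(0,\ \sigma^2_{\mathrm{Scan,Subj(Site)}}\right)
```

Under this reading, the square roots of the three variance components correspond to the between-site, between-subject and within-subject standard deviations reported in the results.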
Repeatability was assessed using the Bland-Altman methods (repeatability = √2×1.96×SDw; comparing difference to mean) (Bland and Altman, 1986), the often used Coefficient of Variation (CV = SDw/μ×100) as well as by calculation of the intraclass correlation coefficient (ICC), which is based on two-way random effects analysis of variance (ANOVA):
$$ ICC = \frac{MSB - MSW}{MSB + (k-1)\,MSW} $$
where MSB is the mean square between subjects, MSW is the mean square within subjects and k is the number of repeated scans.
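The three repeatability measures can be computed in a few lines. The sketch below uses the repeatability and CV definitions given in the text; for the ICC it assumes the common form (MSB − MSW)/(MSB + (k − 1)·MSW), which matches the quantities defined here but should be treated as an assumption. Function names are hypothetical.

```python
import math

def repeatability(sd_w):
    """Bland-Altman repeatability: sqrt(2) * 1.96 * SDw."""
    return math.sqrt(2.0) * 1.96 * sd_w

def coefficient_of_variation(sd_w, mean):
    """CV in percent: SDw / mean * 100."""
    return sd_w / mean * 100.0

def icc_from_mean_squares(msb, msw, k):
    """ICC from between- and within-subject mean squares (assumed form)."""
    return (msb - msw) / (msb + (k - 1) * msw)

# Values reported for gray matter CBF in this study
rep_cbf = repeatability(4.7)                  # about 13.0 ml/100g/min
cv_cbf = coefficient_of_variation(4.7, 47.4)  # about 9.9 %
```

The first value reproduces the repeatability quoted for CBF; the CV follows from the same SDw and the overall mean.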
Separation of contribution to the repeatability was visualized with separate Bland-Altman plots from within session 1, within session 2 as well as between the sessions. In addition, the absolute differences between these scans were analyzed fitting the Linear Mixed Effect model.
A ballpark figure of the needed subject population size was estimated based on the width of the 95% confidence interval for the within-subject standard deviation of the population (ignoring external effects such as site, gender etc.), which is:
$$ 1.96 \times \frac{SD_w}{\sqrt{2n(m-1)}} $$
where SDw is the within-subject standard deviation, n is the number of subjects and m is the number of observations per subject. One way to deal with the sample size and its dependence on the standard error, and therefore on the quantity we wish to estimate, is to estimate it to within say 5% of the population value (Bland and Altman, 1996):
$$ 1.96 \times \frac{SD_w}{\sqrt{2n(m-1)}} = 0.05 \times SD_w $$
Assuming four tests per subject, this would require 256 subjects at the 95% confidence level, and we are within this limit in this study.
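The figure of 256 subjects can be checked numerically. The sketch below assumes the standard error of SDw is SDw/√(2n(m−1)), as given by Bland and Altman (1996), and solves 1.96 × SE = 0.05 × SDw for n; the function name is hypothetical.

```python
def subjects_needed(m, relative_precision=0.05, z=1.96):
    """Number of subjects n so that z * SDw / sqrt(2 n (m - 1)) equals
    relative_precision * SDw, for m observations per subject."""
    if m < 2:
        raise ValueError("at least two observations per subject are required")
    return z ** 2 / (2.0 * relative_precision ** 2 * (m - 1))

n_four_scans = subjects_needed(4)  # roughly 256
n_two_scans = subjects_needed(2)   # roughly 768
```

SDw cancels from both sides, so the required number of subjects depends only on the desired relative precision and the number of repeats; with only two scans per subject roughly three times as many subjects would be needed.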
The level of significance was set at α = 0.05 and 95% confidence intervals (CI) were used throughout.
Figure 1 summarizes the results from the registration process used for assessing inaccuracies of the automatic planning combined with subject motion. Of the 284 subjects, 221 had complete MPRAGE data with no known errors during planning. In Fig. 1a, the combined distributions for the translation errors between MPRAGE 1-2, 1-3 and 2-3 are shown for the image center (centre-of-mass of intensity) in left-right (L-R), anterior-posterior (A-P) and foot-head (F-H) directions. The error distributions can be seen to be wider in the F-H (0.13 ± 1.65mm) than in the A-P (0.09 ± 1.16mm) and L-R (0.02 ± 0.57mm) directions (mean ± std.). The resulting modulus displacement error from the reference point was 1.87 ± 0.95mm. The distribution of rotation errors around the reference point is shown in Fig. 1b (1.56 ± 0.66°) and the axes around which the individual rotations occurred are plotted in Fig. 1c. In the same figure, it can be seen that the L-R axis of rotation is more prevalent than the others, with an average modulus component of 0.72, 0.38 and 0.33 along L-R, A-P and F-H respectively.
Representative CBF maps from three different sites are shown in Fig. 2 for session 1 and session 2, whereas Fig. 3 shows the average data from all subjects for CBF, aBV as well as ATT after registration of data to MNI space.
Successful data, according to the criteria described in the method section, was acquired in 960 out of the 1119 available ASL data sets. This resulted in an overall success rate of 85.8% across sites or 275 subjects with partial ASL data and 191 with complete ASL data available. In general all available data was used in the analysis, except where effects from within and between sessions were analyzed. The success rate is seen to be highly dependent on the site, varying from as low as 47% all the way to 100%. Table 1 summarizes the average CBF, SDw, repeatability, ICC, CV as well as success rate and overall SNR for the individual sites. There was a weak but significant correlation between a site’s SNR of the successful data and the success rate (r = 0.45, p = 0.017), and an even stronger correlation between a site’s mean CBF of the successful data and the success rate (r = 0.66, p = 0.00015).
The mean gray matter CBF was 47.4±7.5 [ml/100g/min] with a standard deviation between the different sites SDs = 1.8 (CI: 1.0 - 3.3) [ml/100g/min], a between-subject standard deviation SDb = 5.5 (CI: 5.0 - 6.2) [ml/100g/min] and a within-subject standard deviation SDw = 4.7 (CI: 4.5 - 5.0) [ml/100g/min] with a corresponding repeatability of 13.0 [ml/100g/min]. GM CBF was significantly higher in females than in males, by 2.1 (p=0.008, CI: 0.5 - 3.6) [ml/100g/min], with flow values of 48.4 vs. 46.3 [ml/100g/min]. There were no significant effects on the mean CBF from age, scan/session or time of scan; however, the within-subject standard deviation becomes 4.3, 3.1 and 5.3 [ml/100g/min] if considered individually within session 1, within session 2 and in between sessions. Figure 4 shows the mean differences between sites as well as Bland-Altman plots where the difference in repeatability for scans within session 1 can be seen in blue, within session 2 in green and in between sessions in red. The corresponding absolute mean differences were 4.8, 3.3 and 5.6 [ml/100g/min], all significantly different from each other (p<0.001), and the variability was significantly larger in females than in males, by 1.8 (p= 0.002, CI: 0.7 - 2.9) [ml/100g/min], in between sessions but not within sessions.
The mean gray matter aBV was 0.67±0.16 [ml/100g] with a standard deviation between the different sites SDs = 0.03 (CI: 0.02 - 0.06) [ml/100g], a between-subject standard deviation SDb = 0.06 (CI: 0.05 - 0.08) [ml/100g] and a within-subject standard deviation SDw = 0.15 (CI: 0.14 - 0.16) [ml/100g] with a corresponding repeatability of 0.41 [ml/100g]. There were no significant effects from gender, scan/session or time of scan; however, a significant age effect appeared, with a change of −0.02 (p= 0.028, CI: −0.03 - −0.002) [ml/100g] per decade. The absolute mean differences between and within sessions were 0.16 [ml/100g], with no significant difference when considered individually within session 1, within session 2 or in between sessions.
The mean gray matter ATT was 0.82± 0.12 [s] with a standard deviation between the different sites SDs = 0.04 (CI: 0.02 - 0.06) [s], a between-subject standard deviation SDb = 0.09 (CI: 0.08 - 0.10) [s] and a within-subject standard deviation SDw = 0.046 (CI: 0.044 - 0.049) [s] with a corresponding repeatability of 0.13 [s]. GM ATT was significantly shorter in females than in males, by −0.08 (p<0.0001, CI: −0.10 - −0.05) [s]. There was also a significant age effect with an ATT increase of 0.03 (p<0.0001, CI: 0.02 - 0.04) [s] per decade. The absolute mean differences within session 1, within session 2 as well as in between sessions were 0.04, 0.03 and 0.05 [s], all significantly different from each other. Table 2 summarizes all effects and main results from the study.
The precision of the automatic planning tool for repositioning the subjects was good, having average translation errors of less than 2mm and rotations of less than 1.6°. Even though the effect of rotation depends on the distance from the center, a 1.6° rotation would only correspond to a 2mm displacement at the brain periphery (~70mm distance).
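The 2mm figure can be verified with elementary geometry: a point at distance r from the rotation center moves along a chord of length 2·r·sin(θ/2). The snippet below is a small illustrative check with a hypothetical function name.

```python
import math

def displacement_from_rotation_mm(angle_deg, radius_mm):
    """Chord length travelled by a point at radius_mm from the rotation
    center when rotated by angle_deg: 2 * r * sin(angle / 2)."""
    return 2.0 * radius_mm * math.sin(math.radians(angle_deg) / 2.0)

# 1.6 degree rotation evaluated at the brain periphery (~70 mm from the center)
periphery_shift_mm = displacement_from_rotation_mm(1.6, 70.0)
```

For such small angles the chord, arc length and tangent approximations agree to within a few micrometers, so the choice between them does not affect the conclusion.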
There is a preferred directionality of the rotation around the L-R direction (Fig. 1c), and it is likely to come from the fact that the more “natural” motion would be a rotation around this axis when the subject is placed supine in the scanner. Additionally, many motion limiting measures consist of padding and straps at the sides of the head, limiting the rotation to the sides but not the rotation around the L-R axis. This is also likely to explain the fact that the L-R translational distribution is narrower than the others (Fig. 1a). Finally, any rotation due to subject motion is likely to occur around a point in the back of the skull rather than the center of the brain, and this would appear as an increase in the displacement in the A-P and F-H directions at the reference point depicted in Fig. 1d.
In general it can be argued that parts of the rotation and displacement were contributions from subject movements rather than inaccuracy of SmartExam. Automatic planning inaccuracies could account for something similar to the L-R displacement (0.02 ± 0.57mm), while the angulation errors could be similar to the sidewise rotation (1.1 ± 0.33° where A-P & F-H components are larger than L-R). On the other hand, it is possible that the symmetrical properties of the coronal “view” make the automatic planning more accurate in the L-R direction and that parts of the increased variability in the A-P & F-H directions could be due to this particular reason.
The accuracy reported in this study is in good agreement with previous reports (Itti et al., 2001; Springorum et al., 2006; van der Kouwe et al., 2005; Young et al., 2006a; Young et al., 2006b), and this study includes many more subjects and repositions than these earlier studies. Surely, the use of automatic planning methods will improve the consistency of acquired data in future clinical trials, whether involving a single or multiple centers. Furthermore, standardized data acquisition could also ease the general clinical reading and make retrospective studies of clinical data easier to perform. Future improvements of these techniques would preferably include real-time tracking of subject movements during the scan session. Currently, these techniques rely on minimal subject motion during the scan session, which is not always the case, for instance with children, patients with severe illnesses or people suffering from dementia.
This study demonstrates that ASL in general is possible to perform at multiple sites with similar performance from most sites (Table 1). However, it is also evident from Table 1 that certain sites perform better than others, with success rates from as low as 47% all the way to 100% and an overall success rate of 85.8%. This is to some extent to be expected, as ASL is a low SNR perfusion measurement, which makes it a hardware-demanding modality. Even though this study used identical scanner setups, there are differences in homogeneity of the B0 field, SNR as well as other parameters which may not influence standard anatomical imaging but could make a difference when it comes to ASL, and in particular the QUASAR sequence used in this study. It is noticeable that there is a correlation between the mean CBF within the sites and their success rate (r = 0.66), possibly indicating that there could be differences in e.g. inversion efficiency. The influence of a site’s SNR on the success rate seems less important (r = 0.45), even though part of the success criteria is based on a minimum SNR. Reasons could be differences in the B0 and B1 field homogeneities at the level of the labeling region between sites, which could affect labeling efficiency but not necessarily SNR at the level of the imaging region. An overall success rate of 85.8% still leaves room for more development with regards to ensuring sufficient SNR and a stable inversion efficiency, among others, to ensure similar performance between sites but also to enhance stability within sites. In Table 1, both ICC and CV were included in addition to the repeatability, as they are widely used in the literature for repeated measurements of image modalities. However, care has to be taken when comparing values, as CV is dependent on the mean and ICC is dependent on the ratio of within and between subject variations (Eq.) and as a result it can vary between e.g. different populations.
Figure 2 shows representative CBF maps from three sites, where the accuracy of the automatic planning between sessions can be appreciated as well. In this study, multi-slice acquisition with a gap between slices was used, and therefore image registration cannot be performed accurately. Consequently, correct repositioning is of particular importance for comparing the data between sessions.
In Figure 3, the average CBF-maps from all subjects (N=275, 960 datasets) are shown in the upper row, with a mean gray matter CBF of 47.4 [ml/100g/min], in line with previously reported results from MRI (Donahue et al., 2006; Gunther et al., 2001; Shin et al., 2007), CT and PET studies (Kudo et al., 2003; Leenders et al., 1990). Larger values are often reported in ASL and Xenon based methods (Blauenstein et al., 1977; Hermes et al., 2007; Matsuda et al., 1996; Wintermark et al., 2001). This could be explained in part by the regularization technique used in the QUASAR method during quantification, which generally underestimates CBF (Petersen et al., 2006a). In addition, differences in the way the regions-of-interest (ROIs) are extracted can influence the values. Most studies use smaller hand drawn ROIs, whereas this study uses a full brain GM ROI segmented from the T1-map, which could potentially increase partial volume with white matter and cerebrospinal fluid.
Shin et al. (2007) and Parkes et al. (2004) previously reported an 11% and 13% higher GM CBF in females than in males; however, the difference was only 4.3% in this study (46.3 vs. 48.3 [ml/100g/min]). No significant age effect could be reported in this study, although previous studies report a 3.0-7.4% decrease per decade depending on the region observed (Leenders et al., 1990; Parkes et al., 2004; Shin et al., 2007). This discrepancy is most likely due to the fact that the majority of the subjects participating in the present study were in their twenties and thirties, making the age span rather narrow. Alternatively, some of these effects might be explained by the fact that a T1 dependent mask is used in this study, which automatically compensates for any atrophy and therefore for the partial volume effects seen with ageing (Parkes et al., 2004). This correction could potentially lower or eliminate the age-related effect.
The variance component analysis, which was based on a Linear Mixed Effect model, revealed a standard deviation between the different sites SDs of 1.8 [ml/100g/min], a between subject standard deviation SDb of 5.5 [ml/100g/min] and a within subject standard deviation SDw of 4.7 [ml/100g/min]. SDw is here based on all four scans, two of which were acquired without the subject leaving the scanner. If it is decomposed into within session as well as between session effects, it becomes 4.3, 3.1 and 5.3 [ml/100g/min] for within session 1, within session 2 and between sessions, respectively. If one were to plan a repeated study on the same subjects, e.g. before and after treatment, then one would use the larger SDw = 5.3 [ml/100g/min] for calculating the statistical power of the study. If, on the other hand, two groups were to be compared, then SDb = 5.5 [ml/100g/min] should be used instead. SDs describes the variability between sites, which can also be seen in Table 1 and Figure 4a. Finally, an SDw of 5.3 [ml/100g/min] between sessions will result in an overall repeatability of 14.7 [ml/100g/min], and together with CV from the different sites in the range of 6-15%, the reproducibility of the method is in good agreement with previously published results using both ASL (Hermes et al., 2007; Parkes et al., 2004; Yen et al., 2002) as well as other modalities (Blauenstein et al., 1977; Matthew et al., 1993; Shin et al., 2007).
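As an illustration, the repeatability value quoted above can be reproduced from the between session SDw. The sketch below assumes the commonly used Bland-Altman definition of the repeatability coefficient, 1.96 × √2 × SDw; this formula and the function name are assumptions made for illustration and are not taken from the methods of this study.

```python
import math


def repeatability(sd_within: float, z: float = 1.96) -> float:
    """Repeatability coefficient, assumed here to be z * sqrt(2) * SDw."""
    return z * math.sqrt(2.0) * sd_within


# Between session within subject SD quoted in the text, in [ml/100g/min].
sd_w_between_sessions = 5.3

rc = repeatability(sd_w_between_sessions)
print(f"repeatability = {rc:.1f} ml/100g/min")  # prints: repeatability = 14.7 ml/100g/min
```

Under this assumed definition, the absolute difference between two measurements on the same subject is expected to fall below this value in about 95% of cases.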
Although there is no effect on the mean of the CBF differences for within and between session scans, there are clear differences in the variability, as seen in the Bland-Altman plot in Figure 4c. As expected, the repeatability is largest between sessions, where one could expect variations of physiological and planning origins as well as from the acquisition and subsequent post-processing of the ASL data. The repeatability within session 2 is the smallest because no repositioning happened between scans and the variability is only due to possible subject motion, acquisition and post-processing errors. The corresponding absolute mean differences were 4.8, 3.3 and 5.6 [ml/100g/min] for within session 1, within session 2 and between sessions, respectively. This suggests that more than half (~59%) of the absolute mean difference between two sessions would originate from possible subject motion, acquisition and post-processing errors. Approximately one quarter (~27%) could be due to repositioning errors between scans, while the remaining ~14% would be due to physiological variations over time. An interesting observation is the fact that females have a significantly larger variation between sessions than males, but not within sessions. This could potentially be explained by the female hormonal or menstrual cycle, which is known to alter metabolism (Rasgon et al., 2001), hematocrit (Hirshoren et al., 2002) and sympathetic baroreflex sensitivity (Minson et al., 2000).
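The percentages above follow from simple arithmetic on the three absolute mean differences. The sketch below reproduces them under the assumption, implied by the quoted percentages, that the contributions add linearly and that the two scans of session 1 involved repositioning while those of session 2 did not. The function and key names are hypothetical.

```python
# Absolute mean CBF differences quoted in the text, in [ml/100g/min].
within_session_2 = 3.3  # no repositioning: motion, acquisition, post-processing
within_session_1 = 4.8  # assumed to add repositioning between scans
between_sessions = 5.6  # adds physiological variation over time


def variability_shares(no_repos: float, with_repos: float, between: float) -> dict:
    """Split the between session difference into additive shares, in percent."""
    return {
        "motion_acquisition_postprocessing": 100.0 * no_repos / between,
        "repositioning": 100.0 * (with_repos - no_repos) / between,
        "physiology": 100.0 * (between - with_repos) / between,
    }


shares = variability_shares(within_session_2, within_session_1, between_sessions)
for name, pct in shares.items():
    print(f"{name}: ~{pct:.0f}%")
```

With the values from the text, the three shares come out at roughly 59%, 27% and 14%.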
Finally, it should be mentioned that no significant effect was seen from the number of days between sessions or from the time of day when the scan was performed, contrary to previous results (Parkes et al., 2004).
Average arterial blood volume maps are shown in the middle row of Figure 3, where larger blood volumes can be seen corresponding to the location of feeding arteries. The mean gray matter aBV of 0.67 [ml/100g] would result in a gray matter cerebral blood volume (CBV) of 2.91-3.94 [ml/100g], assuming literature values of the venous to total CBV ratio of 0.77-0.83 (An and Lin, 2002). This is at the lower end, as it is generally agreed that total gray matter CBV is of the order of 4-5 [ml/100g] (Kuppusamy et al., 1996; Sakai et al., 1985), although no accepted gold standard exists for quantitative estimation of CBV. However, the arterial blood volume measured with the current technique is a pseudo measure and it depends on the fraction of arterial blood that flows faster than a certain velocity encoding threshold (Petersen et al., 2006a), here 4 cm/s. Although this threshold is believed to include blood at the arteriolar level down to approximately 100 μm diameter (Gilmore et al., 2005), it does not necessarily include the same arterial compartment as the above-mentioned studies (An and Lin, 2002). In addition, the velocity encoding or vascular crushers will not work equally well in all spatial directions, and as a result the arterial blood volume can appear smaller. Again, the definition of gray matter or the ROI used may also influence the resulting values.
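The CBV range quoted above can be checked with a short calculation. The sketch below assumes that the measured aBV represents the entire non-venous share of the total blood volume, so that CBV = aBV / (1 − venous to total CBV ratio); the function name is hypothetical.

```python
def total_cbv(abv: float, venous_to_total_ratio: float) -> float:
    """Total CBV, assuming aBV is the non-venous share of the blood volume."""
    return abv / (1.0 - venous_to_total_ratio)


abv_gm = 0.67  # mean gray matter aBV from the text, in [ml/100g]

for ratio in (0.77, 0.83):  # literature range cited in the text
    print(f"venous/total = {ratio}: CBV = {total_cbv(abv_gm, ratio):.2f} ml/100g")
```

For the two ratios this gives approximately 2.91 and 3.94 [ml/100g], matching the range stated in the text.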
Contrary to CBF, aBV did not show any gender effect, whereas a significant age effect with a decrease of 0.02 [ml/100g] per decade was observed. This corresponds to 0.07-0.12 [ml/100g] per decade for CBV, assuming the above venous to total CBV ratios. Again, this decline is somewhat smaller than the previously reported CBV decline of 0.46 [ml/100g] per decade (Leenders et al., 1990), which on the other hand seems to be on the high side. Here also, our rather narrow distribution of ages might have led to an underestimation of such effects.
The lower row of Figure 3 shows the average arterial transit time map. In good agreement with previously published results (Hendrikse et al., 2008), prolonged transit times can be seen in the regions between the perfusion territories of the anterior, middle and posterior cerebral arteries. These arrival time differences are important to consider with regard to accurate CBF quantification (Petersen et al., 2006b).
With a mean gray matter ATT of 0.82 [s], we are in good agreement with previously reported results (Gunther et al., 2001). With a typical bolus length in the order of 0.7 [s], the trailing edge of the bolus reaches the tissue roughly 0.82 + 0.7 ≈ 1.5 [s] after labeling, so ASL experiments using single inversion times and no vascular crushers need to acquire data well beyond 1.5 [s] after labeling for correct quantification without arterial contamination.
A significantly shorter GM ATT (by 0.08 [s]) was observed in females than in males, indicating differences in the blood velocity and possibly in the vascular structure. It should be remembered that the labeling slab was placed at the same physical distance from the image slices in all subjects and that the differences are most likely related to differences in resistance (size) and compliance of the vessels as well as in the perfusion pressure. There was also a significant age effect, with an ATT increase of 0.03 [s] per decade.
A limitation of the current study is the use of a single field strength and a single scanner brand. In this multi center study, 3 T scanners were used as they are becoming standard in the clinical setting. Also, 3 T is currently the most promising field strength for clinical use of ASL due to increased SNR and prolonged T1 of blood as compared to 1.5 T (Golay and Petersen, 2006). Whereas most clinical sites will have access to 3 T systems, different scanner brands are to be expected in a clinical multi center trial. Nevertheless, considering the striking similarities between vendors regarding gradient performance, coil design and other hardware parameters, we see no reason for the estimated variations in perfusion between sites and subjects to be in a different range for other brands, assuming a similar ASL implementation. In addition, the vendors continuously improve the stability and SNR of their systems, which is likely to reduce the variability in the future compared with that obtained in this study.
In the present work, we evaluated the QUASAR implementation of ASL across 28 centres in a test-retest multi center trial. The accuracy of the automatic slice-planning as well as the overall and within site reproducibility of ASL was tested.
Good slice repositioning was achieved and in agreement with previous reports, we found the automatic planning to be effective for precise and consistent planning regardless of the location and user in charge. This framework is likely to improve data consistency in future trials whether involving ASL or other image modalities.
Although the success rate of ASL varied from 47 to 100% between sites, the test-retest showed reasonable reproducibility across sites, suggesting that ASL is ready for use within and across centers in future clinical multi-centre studies. The reproducibility was found to be within the range of previous studies using ASL, PET and Xenon based methods.
However, there is still room for further improvements with regard to ensuring similar performance and SNR between sites and enhancing stability within sites. Prospective motion correction and the continuous improvement of hardware may be of particular help and will hopefully help ASL take its last step into clinical practice.
The authors would like to thank Philips Healthcare for establishing the connections between the participating sites and for worldwide support during this study. This work was supported by the following grants and institutions: NMRC/0919/2004 (Singapore), The Swedish Research Council (Sweden), NIH NS054916, P41 RR015241, NIH (USA), EPSRC (UK).