Here we address the critiques offered by Hasan and Pedraza of our recently published manuscript comparing the performance of two automated segmentation programs, FSL/FIRST and FreeSurfer (Morey, et al. 2009). An assessment of the specific critiques and discussion follows:
Lack of age and other demographic information
We agree that a clear description of sample characteristics is essential in any scientific report and that its absence was an important omission from our original report. Key demographic attributes of our sample of 20 participants are described in . While we do not believe that the particular group of subjects we selected has biased our results, it is possible that one technique may be poorer than another at segmenting a specific demographic subgroup (e.g., aged participants) owing to the range of subjects who contributed to the atlas. Although basic demographic information about the group is important and may affect absolute volume measures, we did not expect the expert human rater to trace the structures differentially with respect to age or other demographic attributes, and therefore did not expect these attributes to influence the comparison of methods. We acknowledge the age-related changes in volume for certain structures cited by Hasan and Pedraza; however, age-related effects would be difficult to assess in the age range represented in our sample. This would be an interesting area for further research.
Demographic attributes of sample
Lack of tabulation of individual volumes from manual tracing
It is unclear how tabulating the volumes of individual participants would help the interpretation of the results. We are comparing the outcome of two algorithms, and therefore the relative differences between methods are much more important than the absolute volumes for individual subjects. Individual volume measures might be useful if a publicly available dataset had been used; however, this was not the case. Furthermore, manual hippocampal and amygdala segmentations depend on the segmentation protocol, and thus it can be misleading to directly compare absolute volumetric measurements.
Introduction of a correlation by not considering left and right structures which ought to be tabulated separately
We agree that because the volumes of left and right structures are generally correlated, treating left and right as independent measures may tend to inflate any correlation. We checked the correlation between left and right hemispheres for the hippocampus and amygdala, and indeed they are correlated for each of the three methods employed (see ). When left and right hemisphere correlations are assessed separately, the right hemisphere generally had higher correlations with manual tracing than the left hemisphere. The correlations obtained from using left and right hemisphere data as independent measures (as reported in Morey, et al. 2009) led to values intermediate to the separately computed left and right correlations, as seen in . Regarding our power estimates, treating the hemispheres as independent, given the left-right correlations, results in an underestimation of variability. This leads to overly lenient inferences and overestimates of power, although it does not change the relative position of the methods. Additional data on laterality are conveyed in the shape analyses presented in Figures 9 and 10 of the original paper (Morey, et al. 2009). Therefore, in our estimation, this information does not lead to a significant revision of the main conclusions about either of the methods. Future validation studies are likely to provide more conclusive results.
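As a rough illustration of the point about pooled hemispheres, the following Python sketch computes Pearson correlations with manual tracing separately for each hemisphere and for the pooled data, and approximates the effective number of independent measures with the standard design-effect formula for clusters of two, n_eff = 2n / (1 + r). The function names and the choice of this approximation are assumptions made for illustration only; this is not the analysis code used in the original study, and no study data are reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hemisphere_correlations(manual_left, manual_right, auto_left, auto_right):
    """Correlate automated with manual volumes per hemisphere and pooled.

    The "pooled" value treats left and right volumes as if they were
    independent observations, which is the practice under discussion.
    """
    pooled_manual = list(manual_left) + list(manual_right)
    pooled_auto = list(auto_left) + list(auto_right)
    return {
        "left": pearson(manual_left, auto_left),
        "right": pearson(manual_right, auto_right),
        "pooled": pearson(pooled_manual, pooled_auto),
    }


def effective_sample_size(n_participants, left_right_r):
    """Approximate independent measures from 2 * n correlated measures.

    Assumes the design effect for clusters of size two, 1 + r, where r is
    the left-right correlation (an approximation, not an exact result).
    """
    return 2 * n_participants / (1 + left_right_r)
```

Under this approximation, 20 participants with uncorrelated hemispheres would contribute 40 effectively independent measures, whereas perfectly correlated hemispheres would contribute only 20, which is why pooling tends to understate variability.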
Left and right hemisphere correlations
Evidence of possible sources of difference related to the atlases used in FSL and FreeSurfer with a potentially missed opportunity for further analyses by Morey et al
We agree that including this point would have enhanced the completeness of the Discussion section. Atlas differences are one possible source of diverging results between FSL and FreeSurfer. The suggestion to compare the systematic bias of manually delineated volumes with the FreeSurfer atlas goes beyond the scope of our study, as this could be a manuscript unto itself. Indeed, Hasan and Pedraza cite the work of Shattuck and colleagues (Shattuck, et al. 2008) as a careful study of the influence of atlas selection.
In summary, Hasan and Pedraza raise some important points concerning our reporting of methods and results. While their commentary aids the reader in more critically assessing our study, it falls short of substantiating that our omissions and methodological shortcomings ought to lead any reader to significantly revise their interpretations. Further research should help disentangle the advantages and limitations of the various freely available automated segmentation software packages.