J Cereb Blood Flow Metab. 2010 September; 30(9): 1551–1557.
Published online 2010 June 23. doi:  10.1038/jcbfm.2010.86
PMCID: PMC2949251

Everything you never wanted to know about circular analysis, but were afraid to ask

Abstract

Over the past year, a heated discussion about ‘circular’ or ‘nonindependent’ analysis in brain imaging has emerged in the literature. An analysis is circular (or nonindependent) if it is based on data that were selected for showing the effect of interest or a related effect. The authors of this paper are researchers who have contributed to the discussion and span a range of viewpoints. To clarify points of agreement and disagreement in the community, we collaboratively assembled a series of questions on circularity herein, to which we provide our individual current answers in ≤100 words per question. Although divergent views remain on some of the questions, there is also a substantial convergence of opinion, which we have summarized in a consensus box. The box provides the best current answers that the five authors could agree upon.

Keywords: brain imaging, functional magnetic resonance imaging, imaging, neuroimaging, statistical methods

Introduction

Brain imaging produces very large data sets of brain activity measurements. However, the neuroscientific conclusions in papers are typically based on a small subset of data. The necessary selection—unless carefully accounted for in the analysis—can bias and invalidate statistical results (Vul et al, 2009a; Kriegeskorte et al, 2009).

The large number of brain locations measured in parallel allows us to discover brain regions with particular functional properties. However, the more we search a noisy data set for active locations, the more likely we are to find spurious effects by chance. This complicates statistical inference and decreases our sensitivity to true brain activation. In functional magnetic resonance imaging, the goal is typically twofold: (1) to identify voxels that contain a particular effect and (2) to estimate the size of the effect, typically within a region of interest. Whether widely used analyses meet the resulting statistical challenges has been hotly debated in the past year.

Let us consider the first goal: finding brain regions that contain a particular effect. For example, we may wish to answer questions such as: Which voxels respond more to faces than houses? Or, in which voxels does the face-house contrast correlate with IQ across subjects? The use of many null-hypothesis tests across brain locations presents a multiple testing problem: the more voxels that are tested, the greater the family-wise error rate (FWE), i.e., the probability that one or more voxels will pass the significance threshold by chance even when there are no true effects (false-alarm voxels). A number of statistical methods have been developed to control the FWE (for a review, see Nichols and Hayasaka, 2003).
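
The growth of the FWE with the number of tests can be made concrete under a simplifying assumption that the text above does not make: that the m tests are statistically independent, each performed at per-test level α, and that no true effects are present. The symbols α and m are introduced here for illustration only.

```latex
\mathrm{FWE} \;=\; 1 - (1 - \alpha)^{m},
\qquad \text{e.g. } \alpha = 0.05,\; m = 100:\quad 1 - 0.95^{100} \approx 0.994 .
```

Neighboring voxels in real images are correlated, so this expression is only a rough guide; the bound FWE ≤ mα (Boole's inequality) holds without any independence assumption.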

The Bonferroni method makes the significance threshold for each voxel more stringent (dividing the desired FWE level by the number of voxels tested) to ensure that the FWE does not exceed, say, 0.05. However, as Bonferroni's method does not account for image smoothness, it is overly conservative and not optimally sensitive. Random field theory methods (Worsley et al, 1992; Friston et al, 1994) adjust for spatial correlation between voxels to achieve greater sensitivity (i.e., power, the probability that a truly active voxel will be identified as such). Whereas voxel-wise methods detect individual voxels, cluster-wise methods (Poline and Mazoyer, 1993) report as significant those clusters (contiguous sets of voxels that all exceed a primary threshold) that are larger than a predetermined cluster-size threshold (chosen to ensure a 5% FWE for clusters).
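
As a rough illustration of the Bonferroni correction, the following sketch estimates the FWE by Monte Carlo simulation on pure-noise data. It rests on assumptions that are not part of the text above: the tests are independent (no image smoothness), null p-values are uniform, and the function name `estimate_fwe` and all parameter values are hypothetical.

```python
import random


def estimate_fwe(n_tests, per_test_alpha, n_experiments=2000, seed=0):
    """Fraction of pure-noise experiments with at least one false alarm."""
    rng = random.Random(seed)
    experiments_with_false_alarm = 0
    for _ in range(n_experiments):
        # Under the null hypothesis each p-value is uniform on [0, 1),
        # so a test is a false alarm with probability per_test_alpha.
        if any(rng.random() < per_test_alpha for _ in range(n_tests)):
            experiments_with_false_alarm += 1
    return experiments_with_false_alarm / n_experiments


n_voxels = 100      # hypothetical number of independent tests
target_fwe = 0.05   # desired family-wise error rate

uncorrected = estimate_fwe(n_voxels, target_fwe)
bonferroni = estimate_fwe(n_voxels, target_fwe / n_voxels)
print(f"uncorrected FWE estimate: {uncorrected:.3f}")
print(f"Bonferroni FWE estimate:  {bonferroni:.3f}")
```

With independent tests the Bonferroni threshold keeps the FWE close to the target level. In smooth images the effective number of independent tests is smaller than the number of voxels, which is why the same threshold becomes conservative there.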

Instead of limiting the probability of any false alarms (i.e., the FWE), false-discovery rate methods (Genovese et al, 2002) limit the average proportion of false alarms among the voxels identified as significant. This approach promises greater sensitivity when there are effects in many voxels. When used appropriately, these methods solve the multiple testing problem and ensure that we are unlikely to mistake an inactive region for an active region.
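
One widely used way to control the false-discovery rate is the Benjamini-Hochberg step-up procedure. The sketch below is a minimal version of that procedure, not necessarily the exact variant used in any cited paper; the function name, the level `q`, and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= q * k / m.
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= q * rank / m:
            largest_passing_rank = rank
    # Reject all hypotheses up to and including that rank.
    rejected = [False] * m
    for index in order[:largest_passing_rank]:
        rejected[index] = True
    return rejected


# Hypothetical p-values for ten tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
fdr_significant = benjamini_hochberg(example_p, q=0.05)
bonferroni_significant = [p <= 0.05 / len(example_p) for p in example_p]
print(sum(fdr_significant), "tests pass FDR control")
print(sum(bonferroni_significant), "tests pass Bonferroni control")
```

In this made-up example the FDR procedure declares two tests significant where the Bonferroni threshold admits only one, which reflects the greater sensitivity described above at the price of a weaker error guarantee.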

The second goal is estimating the size of the effect. For example, we may wish to answer questions such as: How strongly do these voxels respond to faces? Or, how highly does the activation contrast in this region correlate with IQ across subjects? Unfortunately, we cannot accurately address such questions by simply analyzing the selected voxels without accounting for the selection process. The effect-size statistics need to be independent of the selection criterion; otherwise the results will be affected by ‘selection bias.’ For intuition, imagine the data were pure noise. If we select voxels by some criterion, the selected voxels will conform to that criterion better than would be expected by chance for randomly selected voxels. Even if the selected voxels truly contain the effect of interest, the noise in the data will typically have pushed some voxels into the selected set and some others out of it, thus inflating the apparent effect in the selected set.
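
The pure-noise intuition can be checked with a small simulation. The sketch below assumes Gaussian noise with no true effect in any voxel and selects the voxels with the largest mean response; the function name and all parameter values are hypothetical choices for illustration.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def circular_estimate_on_noise(n_voxels=2000, n_trials=20,
                               top_fraction=0.05, seed=0):
    """Select the top voxels in pure noise and re-measure them on the same data."""
    rng = random.Random(seed)
    # Each voxel's response is the mean of n_trials noise samples (true effect = 0).
    voxel_means = [
        mean(rng.gauss(0.0, 1.0) for _ in range(n_trials))
        for _ in range(n_voxels)
    ]
    n_selected = max(1, round(n_voxels * top_fraction))
    # Selection criterion: largest mean response.
    selected = sorted(voxel_means, reverse=True)[:n_selected]
    # The "effect" in the selected set is estimated from the data used for selection.
    return mean(selected), mean(voxel_means)


selected_mean, overall_mean = circular_estimate_on_noise()
print(f"mean response, selected voxels: {selected_mean:.3f}")
print(f"mean response, all voxels:      {overall_mean:.3f}")
```

Although every voxel has a true effect of zero, the selected set shows a clearly positive mean response, whereas the mean over all voxels stays near zero.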

This problem has long been well understood in theory, but is not always handled correctly in practice. Variants of bias due to selection among noisy effect estimates affect many parts of science. Just as voxels are selected for inclusion in a region of interest (ROI), so studies are selected for publication in scientific journals (Ioannidis, 2005; 2008). In either case, the selection criterion is effect strength, and effect estimates are inflated as a result.

Vul et al (2009a) suggested that cross-subject correlation neuroimaging studies in social neuroscience are affected by ‘nonindependence’ (see also Vul and Kanwisher, 2010). Kriegeskorte et al (2009) discussed the problem of ‘circularity’ more generally as a challenge to systems neuroscience.

These authors argued that effect estimates and tests based on selected data need to be independent of the selection process, and that this can be ensured by using independent data for selection (e.g., using half of the data to select signal-carrying voxels, and the other half to estimate the signal) or by using inherently independent functional or anatomic selection criteria.
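
The split-data approach can be sketched in a simulation that compares a circular estimate (selection and estimation on the same half of the data) with an independent estimate (selection on one half, estimation on the other). Everything here is an illustrative assumption rather than a description of any published analysis: Gaussian noise, a hypothetical true effect of 0.5 in 100 of 1000 voxels, and the function name `split_half_demo`.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def split_half_demo(n_voxels=1000, n_active=100, effect=0.5,
                    trials_per_half=20, n_selected=100, seed=0):
    rng = random.Random(seed)
    # Hypothetical ground truth: the first n_active voxels carry the effect.
    true_effect = [effect if v < n_active else 0.0 for v in range(n_voxels)]

    def noisy_mean(mu):
        return mean(rng.gauss(mu, 1.0) for _ in range(trials_per_half))

    half_a = [noisy_mean(mu) for mu in true_effect]  # used for selection
    half_b = [noisy_mean(mu) for mu in true_effect]  # independent data

    # Select the voxels with the largest responses in half A.
    selected = sorted(range(n_voxels), key=lambda v: half_a[v],
                      reverse=True)[:n_selected]

    return {
        "circular": mean(half_a[v] for v in selected),     # same data as selection
        "independent": mean(half_b[v] for v in selected),  # held-out data
        "true": mean(true_effect[v] for v in selected),    # ground truth
    }


for name, value in split_half_demo().items():
    print(f"{name:>11}: {value:.3f}")
```

In this setting the circular estimate overshoots the true mean effect of the selected voxels, whereas the held-out estimate tracks it closely. The price is that only half of the data contributes to each step.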

Although there is little controversy about the basic mechanism of selection bias, the 2009 papers have sparked a debate about exactly which analysis practices are affected and to what degree (Diener, 2009; Nichols and Poline, 2009; Yarkoni, 2009; Lieberman et al, 2009; Lazar, 2009; Lindquist and Gelman, 2009; Barrett, 2009; Vul et al, 2009b; Poldrack and Mumford, 2009). Herein, we collaboratively assembled and then individually answered a series of questions on circular analysis to clarify points of agreement and disagreement. Each answer is ≤100 words. We hope to contribute to a convergence within the community toward statistical practices that ensure that systems and cognitive neuroscience remain solidly grounded in empirical truth.

Notes

The authors declare no conflict of interest.

References

  • Barrett LF. Understanding the mind by measuring the brain: lessons from measuring behavior (Commentary on Vul et al., 2009) Perspect Psychol Sci. 2009;4:314–318. [PMC free article] [PubMed]
  • Diener E. Editor's introduction to Vul et al. (2009) and comments. Perspect Psychol Sci. 2009;4:272–273.
  • Friston KJ, Worsley KJ, Frackowiak RSJ, Mazziotta JC, Evans AC. Assessing the significance of focal activations using their spatial extent. Hum Brain Mapp. 1994;1:214–220. [PubMed]
  • Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage. 2002;15:870–878. [PubMed]
  • Ioannidis JP. Why most discovered true associations are inflated. Epidemiology. 2008;19:640–648. [PubMed]
  • Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2:e124. doi: 10.1371/journal.pmed.0020124. [PMC free article] [PubMed] [Cross Ref]
  • Kriegeskorte N, Simmons WK, Bellgowan PSF, Baker CI. Circular analysis in systems neuroscience—the dangers of double dipping. Nat Neurosci. 2009;12:535–540. [PMC free article] [PubMed]
  • Lazar NA. Discussion of ‘puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition’ by Vul et al. (2009) Perspect Psychol Sci. 2009;4:308–309.
  • Lieberman MD, Berkman ET, Wager TD. Correlations in social neuroscience aren't voodoo: commentary on Vul et al. (2009) Perspect Psychol Sci. 2009;4:299–307.
  • Lindquist M, Gelman A. Correlations and multiple comparisons in functional imaging: a statistical perspective (Commentary on Vul et al., 2009) Perspect Psychol Sci. 2009;4:310–313.
  • Lindquist M, Spicer J, Leotti L, Asllani I, Wager T. Localizing areas with significant inter-individual variation: testing variance components in a multi-level GLM. Hum Brain Mapp Annu Meet. 2009.
  • Nichols TE, Holmes AP. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp. 2002;15:1–25. [PubMed]
  • Nichols TE, Hayasaka S. Controlling the familywise error rate in functional neuroimaging: a comparative review. Stat Methods Med Res. 2003;12:419–446. [PubMed]
  • Nichols TE, Poline J-B. Commentary on Vul et al.'s (2009) ‘puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition’. Perspect Psychol Sci. 2009;4:291–293.
  • Poldrack RA, Mumford JA. Independence in ROI analysis: where is the voodoo. Soc Cog Affect Neurosci. 2009;4:208–213. [PMC free article] [PubMed]
  • Poline JB, Mazoyer BM. Analysis of individual positron emission tomography activation maps by detection of high signal-to-noise-ratio pixel clusters. J Cereb Blood Flow Metab. 1993;13:425–437. [PubMed]
  • Vul E, Harris C, Winkielman P, Pashler H. Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspect Psychol Sci. 2009a;4:274–290.
  • Vul E, Harris C, Winkielman P, Pashler H. Reply to comments on ‘puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition’. Perspect Psychol Sci. 2009b;4:319–324.
  • Vul E, Kanwisher N. Begging the question: the non-independence error in fMRI data analysis. In: Hanson S, Bunzl M, editors. Foundational Issues for Human Brain Mapping. Cambridge, MA: MIT Press; 2010.
  • Worsley KJ, Evans AC, Marrett S, Neelin P. A three-dimensional statistical analysis for rCBF activation studies in human brain. J Cereb Blood Flow Metab. 1992;12:900–918. [PubMed]
  • Yarkoni T. Big correlations in little studies: inflated fMRI correlations reflect low statistical power. Commentary on Vul et al. (2009) Perspect Psychol Sci. 2009;4:294–298.

Articles from Journal of Cerebral Blood Flow & Metabolism are provided here courtesy of Nature Publishing Group