Brain-imaging literature of Irritable Bowel Syndrome (IBS) suggests an abnormal brain-gut communication. We analyzed the literature to evaluate and compare the aspects of brain activity in individuals with IBS and control subjects experiencing controlled rectal stimulation.
PubMed was searched until September 2010. Data from 16 articles reporting brain activity during rectal balloon distensions in IBS compared to control groups was analyzed. Prevalence rates and pairwise activations were assessed using binomial distributions for 11 selected regions of interest. The data was aggregated to adjust for center effect.
There was considerable variability in the literature regarding regions and their activity patterns in controls and individuals with IBS. No significant difference was found in the thalamus, ACC, PCC, and PFC; however, the results show limited evidence of consensus for the Anterior Insula (AI) (p = 0.22). Pairwise activity results suggest that pairs involving the AI tend to have more consistent activity together than pairs which do not involve the AI (Posterior Insula and AI, p = 0.08; Posterior Cingulate Cortex and AI, p = 0.16); however, no pairwise evaluation reached significance.
Our pooled analysis demonstrates that the literature reports are quite heterogeneous but there is some evidence that there may be patterns of higher activity more common in individuals with IBS than in controls. A consensus, though, regarding study designs, analysis approach and reporting could create a clearer understanding of brain involvement in IBS pathophysiology.
Irritable Bowel Syndrome (IBS) is a gastrointestinal disorder that is typified by persistent symptoms of cramping, bloating, abdominal pain, constipation or diarrhea. According to the National Institutes of Health (NIH), IBS may afflict up to 20% of the American adult population. While the direct cause of IBS is still unknown, some of the current theories hypothesize that there is hyper-vigilance in the central nervous system towards signals originating in the viscera. Other theories hypothesize that abnormalities in signal processing in the enteric nervous system are the cause. IBS is likely a complex set of interactions between the central nervous system (CNS) and the periphery that affects a patient’s tolerance of pain signals from the viscera.
The integral CNS involvement in sensory signal evaluation in IBS makes brain imaging an interesting tool for evaluating aspects of IBS pathophysiology. The CNS representation of various forms of pain and pain modulation has recently been investigated using brain-imaging methods such as Positron Emission Tomography (PET), functional Magnetic Resonance Imaging (fMRI) or Single Photon Emission Computed Tomography (SPECT). These modalities are widely used to define brain areas involved in multiple processes, including the brain activity related to visceral stimulation and pain.
Starting in 1997, a substantial number of imaging studies have researched the CNS visceral pain pathways of individuals with IBS and/or normal patients [2–48]. Even though all of these studies are important for understanding visceral pain, fewer than half performed a direct comparison of brain activity of individuals with IBS and controls [4, 6, 7, 19, 25, 26, 31, 35, 36, 38–40, 42, 43, 46, 47]. Studies with a direct comparison of controls and individuals with IBS are particularly helpful in that similar experimental conditions and analyses potentially allow for a robust interpretation of the results across different authors and study centers. These studies contain considerable amounts of data, which are valuable but variable, so a thorough and fair generalized assessment may be challenging. Such an assessment could lead to an increased understanding that would not only help to clarify the pathologic and normal cortical mechanisms of visceral pain, but also unveil the candidate regions which could serve as markers for visceral modulation by potential therapeutic agents for individuals with IBS.
By pooling the existing literature related to brain imaging and IBS, the aims of this analysis are 1) to evaluate and compare the aspects of brain activity in individuals with IBS and control subjects experiencing controlled rectal stimulation, and 2) to define pairwise activation patterns among the most commonly reported regions of interest (ROIs) within the articles.
The authors searched for published English language articles using PubMed, until September 2010. The search was performed using the following keywords: brain imaging, IBS, PET, fMRI, inflation, distention, balloon, rCBF, BOLD, barostat and activity. Forty-two relevant imaging articles corresponding to the search criteria were found [3–7, 9–19, 21–23, 25–33, 35–48]. These articles dated from December 1994 to December 2009 and included all of the studies that combined brain-imaging methods with rectal/sigmoid balloon distensions in controls, individuals with IBS, or both groups. From all of these studies, only those which approached a comparison between IBS and controls were further included in the analysis and consequently 16 studies were finally reviewed for this article [4, 6, 7, 19, 25, 26, 31, 35, 36, 38–40, 42, 43, 46, 47]. Studies that did not have a direct between-groups analysis but declared the within group results for each of the groups (IBS and controls) were also accepted for this analysis. Overall, 13 studies presenting a direct between-group comparison and 3 studies limited to within group results were included in our analysis.
Data was collected from each article including author, study center, stimulation type, scan modality, number of subjects (total and within-group), and region of interest (ROI) activity for each reported region. Due to the varied reporting and non-reporting of Brodmann Areas (BA), ROIs were chosen by most commonly reported structure rather than BA number. Of note, the most common BAs reported by the included studies were: 9, 10, 32, 24, 44. Activity reporting was coded so that a 1 indicates regions that were more activated in individuals with IBS than controls, a 2 indicates that there was no reported difference, and a 3 indicates that the region was more activated in controls than in individuals with IBS. In the instances of no reporting, a record of N/A was used. This coding method was preferred due to varied reporting of z-statistics and other statistical measures.
To adjust for center bias in multiple reports from the same center, the results from studies conducted at the same site were aggregated. There were three different centers that had published multiple articles to be included in our analysis. To apply weighting in the aggregated data, preference was given to the reporting of a 2 in cases when a 1 or 3 was reported in one study and a 2 was reported in the other. Although giving preference to a 1 or 3 may have provided more significant results, a preference was given to no significance as it is more conservative. In cases where there were three or more studies from one center reporting on a single ROI a mean rating was used to determine the center rating. Once the data were aggregated, a total of 11 centers (composed of 16 articles) were used for ROI analysis.
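The aggregation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually applied to the data: the function name `aggregate_center` is hypothetical, `None` stands in for an N/A record, any disagreement between exactly two studies is resolved to 2, and the mean over three or more studies is rounded to the nearest rating (the text does not say how the mean was mapped back to a rating).

```python
from statistics import mean

def aggregate_center(ratings):
    """Collapse one center's ratings for a single ROI into one rating.

    ratings: list of 1 (IBS > controls), 2 (no difference),
             3 (controls > IBS) or None (region not reported, "N/A").
    """
    reported = [r for r in ratings if r is not None]
    if not reported:
        return None                      # no study at this center reported the ROI
    if len(set(reported)) == 1:
        return reported[0]               # all reporting studies agree
    if len(reported) == 2:
        return 2                         # conservative: disagreement -> no difference
    # Three or more studies: mean rating, rounded to the nearest category
    # (the rounding step is an assumption made for this sketch).
    return int(mean(reported) + 0.5)

# Hypothetical center with three studies reporting on one ROI
print(aggregate_center([1, 1, 2]))
```

Resolving two-study disagreements to 2 mirrors the conservative preference for "no difference" described in the text.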
We used a binomial model for the calculation of the probability of a study giving a certain rating within a ROI, with a uniform prior distribution on the binomial parameter. Results were then summarized by the mean of the posterior distribution of the binomial parameter, along with a 95% probability interval.
An analysis was also performed to test for paired activation. Six regions (ACC, PCC, PFC, Thal, AI, and PI) which were reported by at least 6 of the 11 aggregated studies were included and the results were tabulated pairwise, yielding 15 pairs. The objective of this analysis was to determine which pairs of regions exhibit consistent activation across studies. We determined a p-value for this situation by simulation. In the appendix we describe the method for assessing the statistical significance of this test of consistency.
Table 1 displays a breakdown of the results for each of the 16 studies included in this analysis. Information included in this table includes the author, year, center, analytic approach, site of stimulation, stimulation type, scan modality, total number of subjects, total number of controls and individuals with IBS, number of males and females, ROI choice (a priori/posteriori), and 11 ROIs. The 11 ROIs include the Anterior Cingulate Cortex (ACC), Prefrontal Cortex (PFC), Thalamus (Th), Anterior/Posterior Insula (AI, PI), Hippocampus/Amygdala (Hp/Am, reported together), Somatosensory Area I & II (SI, SII), Posterior Cingulate Cortex (PCC), Temporal Lobule (T), and Periaqueductal grey matter (PAG). Of the 11 ROIs, the ACC and PFC had the most studies reporting activity (15 and 13 respectively), while the PAG and temporal regions had the least reporting (2 and 5 respectively).
Table 1.a is a supplementary table of collected data on experimental parameters. This table contains data for maximal balloon volume, distention rate, and pressure or volume for stimulation ratings of subliminal, liminal, and supraliminal sensations. The table is broken down by author and subdivided when individual group ratings were used. The studies that reported on the N of the sub-types of IBS (Alternating (A), Diarrhea (D), and Constipation (C)) are listed; however, all studies included all types together within the study results. Ten studies reported the rate of inflation [6, 19, 25, 26, 31, 39, 40, 42, 46, 47]. Of these studies, one used an inflation rate of 14.5 ml/s, one used a rate of 25 ml/s, two used a rate of 38 ml/s, two used a rate of 40 ml/s, one used a rate of 0.5 mmHg/s, one used a variable rate of 2–10 mmHg/s, and two used a rate of 870 ml/min. There were no consistent stimulation parameters amongst different centers. As well, most studies did not indicate if there had been previous exposure to rectal distention.
Eleven studies reported threshold levels for liminal, subliminal, or supraliminal distention [19, 25, 26, 31, 35, 36, 38–40, 46, 47]. The other five studies used individual pain thresholds; four gave averages [6, 7, 42, 43] and one did not. Of the thirteen studies reporting thresholds, three reported volume, while the other ten reported pressure. Five of the eight studies used a supraliminal pressure of 60 mmHg, while two used a pressure of 55 mmHg, two used 50 mmHg, one used 70 mmHg and one other used a pressure of 45 mmHg; all of these were indicated to cause pain. Four studies filled to a liminal pressure of 45 mmHg, one filled to a pressure of 35 mmHg, one to 30 mmHg and another to 25 mmHg, while three did not use a liminal pressure. There is considerable variability in the amount of stimulation needed to reach the various sensation levels at which brain activation was measured.
Table 2 shows how the data from same centers were aggregated and subsequently reported for use in our analysis. Ringel et al. published two articles that were included [35, 36]. Within these articles there were three total ROIs reported only one of which was reported in both articles. In this instance the most recent and larger of the two studies reported a 2 in the ROI, while the older study reported a 3. We qualitatively weighted the new and larger study and reported a 2. Four articles from the Center for Neurovisceral Sciences and Women’s Health at UCLA have been aggregated using an average across multiple reporting in the same ROI [6, 25, 31, 38]. In instances of only two of the four articles reporting activity within an ROI, the study with the largest size was given priority. Finally, two articles by Song and by Wilder were aggregated [39, 42]. These two studies were virtually identical in methodology and size; however they did use different subjects.
The p-values for single-region activation consistency are tabulated in Table 3 for both IBS greater than controls (I) and controls greater than IBS (C). There was no difference found or in some cases a balance between activations and deactivations resulting in no difference in the thalamus, ACC, PCC, and PFC. For example, of studies reporting on the ACC, 3 studies reported that individuals with IBS showed greater activation, 2 reported greater activation in the control group and 4 reported no difference resulting in an overall qualification of no difference. There were too few studies reporting for the Hp/A, SI, SII, PAG, and temporal lobule to confidently declare a result.
Results for the pairwise activation for IBS>Controls are presented in Table 4. The results suggest that pairs involving the AI tend to be more consistent (PI and AI, p = 0.08; PCC and AI, p = 0.16) than pairs that did not include the AI (Th and ACC, p = 0.52; Th and PFC, p = 0.86; PCC and ACC, p = 0.26), though the null hypothesis cannot be rejected for any of these pairs as none reach significance. The other p-values are substantially larger than these values.
This analysis was motivated by observing the complexity and heterogeneity that exists in brain imaging studies of the brain-gut axis. A cursory review of the literature about IBS and brain imaging challenges a reader to develop a clear idea about brain activity patterns in IBS and controls. Our analysis assesses the heterogeneity of the results in the brain-gut studies in IBS by outlining the diversity of regional patterns of activity.
The binomial distribution and repeated measures test of significance for single ROI activity did not conclusively show that any of the observed ROIs were consistently activated in IBS or controls during supraliminal distention. The regions which were reported by multiple studies within the pooled articles and were inconclusive included the ACC, PCC, PFC, Th, PI, SI, SII, Hp/A, T, and PAG. While it is important to note that these regions are commonly thought to be within the visceral pain network [49, 50], no statistically significant consensus can be identified among the IBS studies within our analysis. As well, we cannot say with statistical certainty that there is any consensus for paired interactions between regions. This can be due to one of two reasons: 1) there is no consensus, or 2) there are too few studies to provide adequate statistical power (see appendix).
The articles that were included in our pooled analysis were selected specifically because they each had very similar paradigms. Of the 41 articles that we found by our search criteria only 16 articles met all of the requirements to be included in the analysis. The intense exclusion process was purposeful to reduce the effects of inter-study variability. However as the results show, the variability was still likely large enough to cause a wide margin of error.
Center Bias is caused by the reporting of multiple studies from the same research group which likely have similar analyses and reporting approaches and can heavily influence the final prevalence outcome for a given ROI [51, 52]. In our study, there were three groups reporting more than one study: the group at UCLA with 4 studies, Ringel et al with 2 studies, and Wilder & Song et al with 2 studies. As a result, we strove to correct for this bias by aggregating the data so that each center was represented only once, which we believe is a fair assessment and representation of the data.
Due to notable variations in experimental paradigms and reporting, we relied on our expertise and a methodical approach to determine the appropriate rating for a group of aggregated studies. For example, there were 8 instances in total where there was a disagreement of the activity rating within a single region between two studies. In all of these instances, the activity ratings were summarized as having no difference (rating of 2). This conservative approach reduces the overall amount of confirmatory activity ratings within a ROI, but guarantees that there is not a false positive caused by the center effect or by our aggregation.
Since we are not inferring function to any region in this study, we feel that we are justified in collapsing regions that were more specifically delineated. For example, we collapsed the pregenual ACC (pgACC), rostral ACC, and dorsal ACC into a single ACC region. Even with the use of generalized regions, which enabled us to include more reported results per region and increased our statistical power, we found no consensus. For example, the AI, the only region which at first glance appears to perhaps show evidence of consensus with IBS greater than controls, had four of the seven centers reporting no significant difference between groups (p = 0.22). The remaining three centers all concluded IBS>Control.
In order to reach significance on the aggregated data there would have needed to be agreement between at least six of the 11 centers if they all reported on a single region. We did run the statistical analysis on the non-aggregated data as well and similarly did not have enough agreement between centers to reach significance. Again, with the non-aggregated data, the AI is the best-case scenario, with 12 centers reporting and 6 centers in agreement, giving p = 0.54. The discrepancy between what are thought to be IBS-related CNS changes from the literature and the conclusions of this pooled analysis may be caused by many factors such as conservative thresholding, result reporting differences, and site-specific differences in methodology.
Differences in experimental paradigms, such as differences in the rate of inflation and pain vs. discomfort ratings (liminal vs. subliminal vs. supraliminal), may cause some inter-study variability. For example, pressure ratings for supraliminal inflation ranged from 32.9 to 60 mmHg and ratings for subliminal pressure ranged from 5 to 20 mmHg. While one study may find that an average fill of 50 mmHg caused pain in their subjects, other studies may have found that this level only caused discomfort. However, we must recognize that even small changes in the response dynamic of induced pain will likely cause a reactionary difference in activity in the brain. For example, there is a possibility that a balloon distention causing only discomfort may not cause activity in the anterior insula, but that a balloon distention causing pain might. Although there are clear differences in the pressures used for discomfort ratings between studies, we try to limit the effects within our study by only using activity reported during supraliminal pressures.
Another source of variation may be differences in significance thresholds. For example, in comparing two articles from our analysis, Naliboff et al’s 2001 study used a threshold of p ≤ .01 for significance, while Andresen et al used a value of p ≤ .001 for significance in their 2005 article. Andresen et al reported nine regions with three regions being significantly different. Similarly, Naliboff et al reported nine regions with significance in six regions. We cannot assume that Andresen et al would have found significance in more regions at p ≤ .01, but with a less stringent threshold they may have found more activity across the brain, creating the opportunity for more ROIs to fall within the significance boundary. Assessing the level of variability among the studies and adjusting for it was not possible because of inconsistent reporting of the specific values used for different thresholds. Of the 16 studies used for this analysis, 6 did not report on thresholding and 5 studies did not use consistent whole-brain thresholds. Also, there were differences in the type of statistics reported. The most commonly reported statistic was the T-statistic (N=5), while 3 studies reported R-statistics, 2 reported Z-scores, 2 reported percent change and 3 had no reported statistical values. Using similar threshold values for the statistics mentioned above, as well as clear reporting of these values, would increase the homogeneity of the data analysis and consequently would facilitate pooling the data from multiple independent studies.
The authors of this study did not have access to the original study data. While there are other methods for analyzing such data, such as feature-space clustering and ALE, we found that other meta-analyses had filled these gaps and that our approach is complementary to them.
In a September 2010 article by Tillisch et al., a similar cohort of studies was examined using ALE. ALE is a technique in which all of the locations of peak activity from multiple studies are mapped in Talairach space and are blurred with a Gaussian distribution. A statistical model is fitted in which any overlap of the Gaussian blurs creates a higher probability of activation at that voxel location. The Tillisch study found differences between activity associated with distention protocols in controls and individuals with IBS. The differences in activity were found in the activation loci of the pgACC, midbrain, and insula. In our analysis, we do not find significance within the ACC (which includes results for the pgACC), because there are simply not enough studies that agree that IBS>Controls or Controls>IBS. The discrepancy between our two studies might be caused by some of the limitations of ALE. The ALE model is a fixed effects calculation that will find very significant results among small numbers of studies. As well, there is no account for inter-study variability or center bias, and no correction for multiple inputs to a single ALE location from the same study. ALE lacks the ability to distinguish between activity from distinct regions, for example the temporal lobe and insular cortex. This means that a single study reporting activation in the hippocampus and anterior insula may contribute to a significant ALE cluster found somewhere in between the two. This is an error that cannot happen within our pooled analysis because each study, and more importantly each center, only contributes one value per region. However, we do feel that ALE can offer some valuable contributions and find some interesting results where other methods may not, when used with careful observation and accountability on the part of the user.
In the end, generating a detailed quantitative assessment that includes coordinates of the activity loci, level of activity and volume of the clusters, and that accounts for inter-study variability, is desirable and would be ideal for understanding the central involvement of pain in the pathology of IBS.
Brain imaging has played an important role in elucidating central response to peripheral stimulation in patients with IBS. A greater uniformity in the analysis of research data (performing direct subtractions between groups, using similar threshold approaches and levels (p or z values, number of voxels), and applying similar corrections for multiple comparisons) could offer much needed consistency in the literature. Also, a clear and consistent reporting of results presenting 1) bilaterality/unilaterality aspects, 2) voxels contained in the activated/deactivated clusters, 3) the exact coordinates and z value of peak activity per cluster, and 4) specific p values of the activity differences between the groups would considerably help future quantitative meta-analyses. We could even go so far as suggesting collaborative efforts in standardizing IBS distention protocols and imaging techniques, similar to those in other fields of study such as the Alzheimer’s Disease Neuroimaging Initiative and those suggested in the Rome working team report.
To clarify the directions of research in IBS, we have analyzed the information published to date. Though our results are inconclusive, our analysis of the literature regarding pain processing in IBS suggests that there are patterns of activity that could explain the differences between pain sensitivity in individuals with IBS and controls. However, clearer statements can be made only when there is a greater uniformity in imaging analysis and the reporting of the results.
(Data acquisition; analysis and interpretation of data; drafting manuscript)
Alexander Gaman, MD
(Material support, data acquisition)
Mark Vangel, PhD
(Statistical analysis; appendix)
Braden Kuo, MD MSc.
(Analysis of data; drafting manuscript, study supervision)
Support: NIH DK 069614, International Foundation of Functional GI Disorders, NIH UL1 RR025758 and M01-RR-01066.
In these notes we consider separately the closely related issues of consensus and consistency among multiple neuroimaging studies comparing IBS patients with controls. By consensus we mean the extent to which the studies agree, and by consistency we mean the extent to which studies do not arrive at contradictory conclusions. The data consist of activation results for various regions from studies at eleven centers. We evaluate consensus by estimating the overall probability of a study being inconclusive, and of a study favoring IBS, using a simple Bayesian approach. We examine consistency using both χ2 -tests and an apparently new test statistic constructed for this purpose.
The activation results for each region and each of the eleven centers which performed IBS studies were summarized by “IBS>Control”, “NS”, and “Control>IBS,” corresponding to mean activation in the IBS group significantly greater than in the control group, no significant difference between the groups, and control group mean significantly greater than the IBS group mean, respectively. For centers with multiple studies, one of these categories was chosen for each region by expert judgment, as a subjective synthesis of the multiple published studies. Studies from the same center were combined because, unlike results from different centers, it was not plausible to assume that results from studies at the same center were independent.
We assessed the extent to which a consensus exists among the results for each region as follows. First, the overall probability of “NS” was determined. Then, conditioning on the number of “NS” studies, we evaluated the evidence against the null hypothesis that “IBS>Control” and “Control>IBS” are equally likely outcomes. We concluded that there was a consensus on the difference in activation between the two groups if this null hypothesis can be rejected. Otherwise, we concluded that results were inconclusive.
In both stages of the above analysis, we used a binomial model for the probabilities of study counts, with a uniform prior distribution on the binomial parameter. This is a special case of the beta-binomial Bayesian model. This prior is non-informative in the sense that it is the unique prior for which all possible outcomes are equally likely a priori. Results were then summarized by the mean of the posterior distribution of the binomial parameter, along with a 95% interval. To be more specific, denote the number of centers reporting “IBS>Control”, “NS”, and “Control>IBS” for a given region by nI, n0, and nC, respectively. Then we estimated the probability of NS, π0, by
$$\hat{\pi}_0 = \frac{n_0 + 1}{n_0 + n_I + n_C + 2}.$$
Note that, unlike the more typical estimate n0/(n0 + nI + nC), this estimate is not equal to zero when n0 = 0, nor does it equal one when nI + nC = 0. This is reasonable, in particular, since it is not unlikely that one could obtain n0 = 0 with π0 > 0 with as few as eleven studies. For example, if π0 = 0.075, then the probability that n0 = 0 is approximately 0.5. We obtain an interval which contains the unknown π0 with 95% probability (i.e., a 95% credible interval, CI) by determining the appropriate percentiles of a beta distribution with parameters n0 + 1 and nI + nC + 1. For the second part of this analysis, we consider only nI and nC (i.e., we condition on the value of n0). Assuming a uniform prior on πI, the posterior mean of the probability of “IBS>Control” is
$$\hat{\pi}_I = \frac{n_I + 1}{n_I + n_C + 2},$$
and the 95% credible interval is determined from the beta distribution with parameters nI +1 and nC +1.
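As a rough illustration of this two-stage summary, the posterior mean and equal-tailed 95% credible interval under a uniform prior can be computed with the Python standard library alone. The helper names (`beta_cdf`, `beta_quantile`, `posterior_summary`) are hypothetical, the counts in the usage lines are illustrative rather than taken from Table S1, and the beta quantiles are found by bisection using the identity between the beta CDF with integer parameters and a binomial tail sum.

```python
from math import comb

def beta_cdf(x, a, b):
    """CDF of Beta(a, b) for positive integers a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))

def beta_quantile(p, a, b, tol=1e-10):
    """Invert the beta CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def posterior_summary(successes, failures, level=0.95):
    """Posterior mean and equal-tailed interval under a uniform prior."""
    a, b = successes + 1, failures + 1      # posterior is Beta(a, b)
    tail = (1 - level) / 2
    return a / (a + b), beta_quantile(tail, a, b), beta_quantile(1 - tail, a, b)

# Illustrative counts for one region: n_I = 3, n_0 = 4, n_C = 0
n_I, n_0, n_C = 3, 4, 0
print(posterior_summary(n_0, n_I + n_C))    # stage 1: probability of "NS"
print(posterior_summary(n_I, n_C))          # stage 2: "IBS>Control" given n_0
```

The equal-tailed interval is one common choice; the text does not state whether an equal-tailed or highest-density interval was used.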
The results are given in Table S1. The regions are ordered by decreasing estimated πI. From this table it can be seen that it is not unlikely that an inconclusive result has high probability for all regions. Also, the only region which at first glance appears to perhaps show evidence of consensus is AI, but four of the nine centers reporting this region could not detect a significant difference between groups. The remaining five centers all concluded IBS>Control.
As above, we summarized the results for each region by the numbers (nI, n0, nC). We concluded that the studies are consistent with respect to a particular region if one of πI and πC is substantially greater than the other, or if both are nearly zero. This is because if πI and πC are comparable and nonzero, then there is a disagreement among the conclusions of the studies which should be addressed.
We condition on n0, and test the hypothesis H0: πI = πC. Given n0, nI is binomially distributed with parameters n – n0 and πI, and nC is binomially distributed with parameters n – n0 and πC. We use as a test statistic min(nI, nC), since for consistent studies one of (nI, nC) will tend to be less than the other (unless n – n0 is very small, in which case one has very little power to reject H0). The probability that min(nI, nC) is less than or equal to the observed value of this minimum for each region is easy to determine exactly from the binomial distribution, thus giving a p-value for this test for each region.
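One way to read this exact test is sketched below. The sketch assumes that, conditional on n0 and under H0, nI follows a Binomial(n – n0, 1/2) distribution, so that the p-value is the probability that min(X, m – X) does not exceed the observed minimum, with m = nI + nC. Both this reading and the function name `min_test_p_value` are assumptions, and the sketch is not guaranteed to reproduce the p-values in Table 3.

```python
from math import comb

def min_test_p_value(n_i, n_c):
    """P(min(X, m - X) <= min(n_i, n_c)) for X ~ Binomial(m, 1/2), m = n_i + n_c."""
    m = n_i + n_c
    if m == 0:
        return 1.0                      # no conclusive studies: no evidence either way
    k = min(n_i, n_c)
    favourable = sum(comb(m, x) for x in range(m + 1) if min(x, m - x) <= k)
    return favourable / 2**m

# Illustrative counts: five conclusive studies, all in the same direction
print(min_test_p_value(5, 0))
```

With so few conclusive studies per region the attainable p-values are coarse, which is consistent with the remark about low power when n – n0 is small.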
A commonly used alternative to the above test for this hypothesis is a χ2-test, which for this situation reduces to comparing
$$S_1 = \frac{(n_I - n_C)^2}{n_I + n_C}$$
with a χ2-distribution with one degree of freedom. (Note that if both nI and nC are zero, then S1 = 0.) The distribution of S1 will likely differ substantially from the χ2 distribution, since our counts are very small. Since the above binomial test is exact and since it appears to be closely related to S1, we use it in preference to the χ2 test for our analysis.
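We also looked at consistency jointly for pairs of regions. For any pair of regions, we observed the counts
$$N = \begin{pmatrix} n_{II} & n_{I0} & n_{IC} \\ n_{0I} & n_{00} & n_{0C} \\ n_{CI} & n_{C0} & n_{CC} \end{pmatrix}$$
with corresponding probabilities
$$P = \begin{pmatrix} \pi_{II} & \pi_{I0} & \pi_{IC} \\ \pi_{0I} & \pi_{00} & \pi_{0C} \\ \pi_{CI} & \pi_{C0} & \pi_{CC} \end{pmatrix}$$
where nCI corresponds to the number of studies reporting Control>IBS for region A and IBS>Control for region B, etc. Under the null hypothesis of no consistency, I and C can be exchanged in the subscripts without changing the probabilities, so that the matrix P is constrained to be
$$P_0 = \begin{pmatrix} \pi_1 & \pi_2 & \pi_3 \\ \pi_4 & \pi_5 & \pi_4 \\ \pi_3 & \pi_2 & \pi_1 \end{pmatrix}$$
where the πi are arbitrary non-negative values for which
$$2(\pi_1 + \pi_2 + \pi_3 + \pi_4) + \pi_5 = 1.$$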
To test this hypothesis, we note that there is no evidence of inconsistency if the sum of the counts in any one of the four L-shaped borders of N (e.g., for the upper right border, nII + nI0 + nIC + n0C + nCC) is equal to zero. If we denote the sums for these borders as BUL, BUR, BLL, BLR, for upper-left border, upper-right border, lower-left border and lower-right border, respectively, then we define the test statistic
$$U = \min(B_{UL}, B_{UR}, B_{LL}, B_{LR})$$
and we reject the null hypothesis of inconsistency if U is sufficiently small. The distribution of U was approximated for n = 4, 5, …, 10 by simulation, by first selecting random values for the πi, then by inserting n values into a 3×3 matrix N0 according to the probabilities P0, and then finally determining a simulated value for U under the null hypothesis. This was then repeated 100,000 times; the results are given in Table S2. For example, for n = 9, the probability that U ≤ 1 is seen to be 0.004 + 0.0396 < 0.05; hence we would reject the null hypothesis of inconsistency if U = 0 or U = 1.
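The simulation described above might be sketched as follows. Several details are assumptions made only for illustration: U is taken to be the smallest of the four border sums, rows and columns of the 3×3 matrix are ordered (I, 0, C), the null matrix has the symmetric form obtained by exchanging I and C in the subscripts, and the random probabilities are drawn by normalizing independent exponential weights (the text does not say how the πi were selected). The function names are hypothetical, and the output need not match Table S2.

```python
import random
from collections import Counter

def border_sums(N):
    """Sums over the four L-shaped borders (one outer row plus one outer column)."""
    top, bottom = sum(N[0]), sum(N[2])
    left = sum(row[0] for row in N)
    right = sum(row[2] for row in N)
    return {
        "UL": top + left - N[0][0],      # shared corner counted once
        "UR": top + right - N[0][2],
        "LL": bottom + left - N[2][0],
        "LR": bottom + right - N[2][2],
    }

def u_statistic(N):
    # Assumption: U is the minimum of the four border sums.
    return min(border_sums(N).values())

def random_null_matrix(rng):
    """Random 3x3 probabilities unchanged when I and C are swapped in both subscripts."""
    w = [rng.expovariate(1.0) for _ in range(5)]
    total = 2 * sum(w[:4]) + w[4]        # each of the first four weights fills two cells
    p1, p2, p3, p4, p5 = (x / total for x in w)
    return [[p1, p2, p3], [p4, p5, p4], [p3, p2, p1]]

def simulate_u(n, reps=100_000, seed=0):
    """Approximate the null distribution of U for n studies."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        flat = [p for row in random_null_matrix(rng) for p in row]
        N = [[0] * 3 for _ in range(3)]
        for cell in rng.choices(range(9), weights=flat, k=n):
            N[cell // 3][cell % 3] += 1
        counts[u_statistic(N)] += 1
    return {u: c / reps for u, c in sorted(counts.items())}

print(simulate_u(9, reps=20_000))
```

A p-value for an observed pair of regions would then be the simulated probability that U is less than or equal to the observed value.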
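We can also construct a χ2-test for the null hypothesis given by the matrix of probabilities P0 above. This statistic is:
$$S_2 = \frac{(n_{II} - n_{CC})^2}{n_{II} + n_{CC}} + \frac{(n_{I0} - n_{C0})^2}{n_{I0} + n_{C0}} + \frac{(n_{IC} - n_{CI})^2}{n_{IC} + n_{CI}} + \frac{(n_{0I} - n_{0C})^2}{n_{0I} + n_{0C}},$$
where, as for S1, any term whose denominator is zero is taken to be zero.
For large n, S2 is approximately distributed as a χ2 random variable, but for n as small as the values in our investigation, one should not rely on this approximation, but instead obtain the distribution of S2 by simulation as above. This statistic has not yet been investigated.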
The p-values for single-region consistency are tabulated in Table 3. There is only minimal evidence for consistency for AI (p =0.22), and essentially no evidence at all for the other regions.
The results of pairs of regions suggest that pairs involving AI tend to be more consistent than pairs which do not involve AI (PI and AI, p =0.08; PCC and AI, p =0.16; ACC and AI, p =0.20; PFC and AI, p =0.20; Th and AI, p =0.32), though the null hypothesis cannot be rejected for any of these pairs. It is also worth noting that for PI and PCC p=0.16. The other p-values are substantially larger than these values.
The statistical approaches used in this article assume that the centers are exchangeable. That is, no information is conveyed by knowledge of the identity of a center. In reality the centers performed their studies at different times, and with different study sizes, so this assumption is an approximation at best. It could be improved on, for example, by including study size in the statistical model.
Authors have no disclosures to report.