This analysis was motivated by the complexity and heterogeneity of brain imaging studies of the brain-gut axis. Even a cursory review of the literature on IBS and brain imaging leaves the reader struggling to form a clear picture of brain activity patterns in IBS and controls. Our analysis assesses the heterogeneity of results across brain-gut studies in IBS by outlining the diversity of regional patterns of activity.
The binomial distribution and repeated measures tests of significance for single-ROI activity did not conclusively show that any of the observed ROIs was consistently activated in IBS or controls during supraliminal distention. The regions that were reported by multiple studies within the pooled articles but remained inconclusive included the ACC, PCC, PFC, Th, PI, S1, S2, Hp/A, T, and PAG. Although these regions are commonly thought to be part of the visceral pain network [49], no statistically significant consensus emerged among the IBS studies in our analysis. Likewise, we cannot say with statistical certainty that there is any consensus for paired interactions between regions. There are two possible reasons: 1) there is no consensus, or 2) the number of studies is too small to provide sufficient statistical power (see appendix
The articles included in our pooled analysis were selected specifically because they used very similar paradigms. Of the 41 articles identified by our search criteria, only 16 met all of the requirements for inclusion in the analysis. The stringent exclusion process was intended to reduce the effects of inter-study variability. However, as the results show, the variability was still likely large enough to cause a wide margin of error.
Center bias arises when multiple studies are reported by the same research group. Such studies likely share similar analysis and reporting approaches and can heavily influence the final prevalence outcome for a given ROI [51]. In our study, three groups reported more than one study: the group at UCLA with 4 studies, Ringel et al with 2 studies, and Wilder & Song et al with 2 studies. We therefore sought to correct for this bias by aggregating the data so that each center was represented only once, which we believe is a fair assessment and representation of the data.
Due to notable variations in experimental paradigms and reporting, we relied on our expertise and a methodical approach to determine the appropriate rating for a group of aggregated studies. For example, there were 8 instances in total in which two studies from the same center disagreed on the activity rating within a single region. In all of these instances, the activity rating was summarized as no difference (rating of 2). This conservative approach reduces the overall number of confirmatory activity ratings within an ROI, but ensures that no false positive is introduced by the center effect or by our aggregation.
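The aggregation rule described above can be sketched in a few lines. This is a minimal illustration rather than the procedure actually used: only the rating of 2 (no difference) comes from the text, while the function name and the codes 1 and 3 for the two directional ratings are assumptions made for the example.

```python
# Rating of 2 = no difference between groups (from the text).
# Codes 1 and 3 for the two directional ratings are hypothetical.
NO_DIFFERENCE = 2


def aggregate_center_rating(ratings):
    """Collapse one center's ratings for a single region into one value.

    If every study from the center agrees, keep the shared rating;
    any disagreement is summarized conservatively as no difference.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    first = ratings[0]
    if all(rating == first for rating in ratings):
        return first
    return NO_DIFFERENCE
```

Under these assumed codes, a center with ratings [3, 3] keeps the rating 3, whereas a center with ratings [3, 2] or [1, 3] is recorded as 2.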
Since we are not attributing function to any region in this study, we feel that we are justified in collapsing regions that were more specifically delineated. For example, we collapsed the pregenual ACC (pgACC), rostral ACC, and dorsal ACC into a single blanket ACC region. Even with the use of generalized regions, which enabled us to include more reported results per region and increased our statistical power, we found no consensus. For example, the AI, the only region which at first glance appears to show evidence of consensus with IBS greater than controls, had four of the seven centers reporting no significant difference between groups (p=0.22). The remaining three centers all concluded IBS>Control.
To reach significance on the aggregated data, at least six of the 11 centers would have needed to agree if they had all reported on a single region. We also ran the statistical analysis on the non-aggregated data and similarly did not find enough agreement between centers to reach significance. Again, with the non-aggregated data, the AI is the best-case scenario, with 12 centers reporting and 6 centers in agreement, reaching a significance of p=0.54. The discrepancy between what are thought to be IBS-related CNS changes from the literature and the conclusions of this pooled analysis may be caused by many factors, such as conservative thresholding, differences in result reporting, and site-specific differences in methodology.
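The reasoning about how many centers must agree can be illustrated with a one-sided binomial tail probability. The sketch below is a simplified stand-in, not the test used in this analysis: the function names are illustrative, and the null probability p0 (the chance that a center reports a given direction when there is no true effect) is an assumption, so its output is not expected to reproduce the p-values reported above.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """Return P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


def min_agreement_for_significance(n, p0, alpha=0.05):
    """Smallest number of agreeing centers out of n whose upper-tail
    probability under the assumed null p0 is at most alpha.
    Returns None if even full agreement is not significant."""
    for k in range(n + 1):
        if binomial_upper_tail(k, n, p0) <= alpha:
            return k
    return None
```

Calling min_agreement_for_significance(n, p0) for different assumed values of p0 shows how strongly the required level of agreement depends on the null model, which is one reason a small number of centers leaves little room to reach significance.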
Differences in experimental paradigms, such as differences in the rate of inflation and in pain vs. discomfort ratings (liminal vs. subliminal vs. supraliminal), may cause some inter-study variability. For example, pressure ratings for supraliminal inflation ranged from 32.9 to 60 mmHg, and ratings for subliminal pressure ranged from 5 to 20 mmHg. While one study may find that an average fill of 50 mmHg caused pain in its subjects, other studies may have found that this level only caused discomfort. We must recognize that even small changes in the response dynamic of induced pain will likely cause a corresponding difference in activity in the brain. For example, a balloon distention causing only discomfort may not cause activity in the anterior insula, whereas a distention causing pain might. Although there are clear differences in the pressures used for discomfort ratings between studies, we tried to limit their effects within our study by using only activity reported during supraliminal pressures.
Another source of variation may be differences in significance thresholds. For example, in comparing two articles from our analysis, Naliboff et al's 2001 study used a threshold of p≤.01 for significance, while Andresen et al used a value of p≤.001 for significance in their 2005 article. Andresen et al reported nine regions, three of which were significantly different. Similarly, Naliboff et al reported nine regions, with significance in six. We cannot assume that Andresen et al would have found significance in more regions at p≤.01, but with a less stringent threshold they may have detected more activity across the brain, creating the opportunity for more ROIs to fall within the significance boundary. Assessing the level of variability among the studies and adjusting for it was not possible because of inconsistent reporting of the specific values used for different thresholds. Of the 16 studies used for this analysis, 6 did not report on thresholding and 5 did not use consistent whole-brain thresholds. There were also differences in the type of statistics reported. The most commonly reported statistic was the T-statistic (N=5), while 3 studies reported R-statistics, 2 reported Z-scores, 2 reported percent change, and 3 had no reported statistical values. Using similar threshold values for the statistics mentioned above, together with clear reporting of these values, would increase the homogeneity of the data analysis and consequently would facilitate pooling the data from multiple independent studies.
The authors of this study did not have access to the original study data. While there are other methods of analysis that draw on the original data, such as feature-space clustering and ALE, we found that other meta-analyses had filled these gaps and that our approach is complementary to them.
In a September 2010 article by Tillisch et al [53], a similar cohort of studies was examined using ALE. ALE is a technique in which the locations of peak activity from multiple studies are mapped in Talairach space and blurred with a Gaussian distribution. A statistical model is then fitted in which overlapping Gaussian blurs yield a higher probability of activation at that voxel location. The Tillisch study found differences between activity associated with distention protocols in controls and in individuals with IBS. The differences in activity were found in the activation loci of the pgACC, midbrain, and insula. In our analysis, we do not find significance within the ACC (which includes results for the pgACC), because there are simply not enough studies that agree that IBS>Controls or Controls>IBS. The discrepancy between our two studies might be caused by some of the limitations of ALE. The ALE model is a fixed effects calculation that will find very significant results among small numbers of studies. In addition, it does not account for inter-study variability or center bias, and it does not correct for multiple inputs to a single ALE location from the same study. ALE also lacks the ability to distinguish between activity from distinct regions, for example the temporal lobe and insular cortex. This means that a single study reporting activation in the hippocampus and anterior insula may contribute to a significant ALE cluster found somewhere in between the two. This error cannot occur within our pooled analysis because each study, and more importantly each center, contributes only one value per region. However, we do feel that ALE can offer valuable contributions and find interesting results where other approaches may not, when used with careful observation and accountability on the part of the user.
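The concern about a cluster appearing between two distinct foci can be illustrated with a one-dimensional toy model. This is a deliberately simplified sketch and not the ALE implementation used by Tillisch et al: it combines Gaussian-blurred foci with the union-of-probabilities form commonly used to describe ALE, and the grid, focus positions, blur width (sigma) and peak probability are all arbitrary values assumed for the illustration.

```python
from math import exp


def modeled_probability(x, focus, sigma, peak):
    """Gaussian-blurred activation probability contributed by one focus."""
    return peak * exp(-((x - focus) ** 2) / (2 * sigma**2))


def ale_profile(foci, grid, sigma=8.0, peak=0.5):
    """Union of per-focus probabilities at each grid point:
    1 - product over foci of (1 - p_focus)."""
    profile = []
    for x in grid:
        none_active = 1.0
        for focus in foci:
            none_active *= 1.0 - modeled_probability(x, focus, sigma, peak)
        profile.append(1.0 - none_active)
    return profile


# Two hypothetical foci 10 units apart on a 1-D axis (arbitrary units).
grid = list(range(-20, 31))
foci = [0, 10]
profile = ale_profile(foci, grid)
peak_location = grid[profile.index(max(profile))]
```

With these assumed settings the combined profile peaks midway between the two foci rather than at either one, which is the behavior described above. When the foci are far apart relative to the blur width, the two peaks separate again.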
In the end, a detailed quantitative assessment that accounts for inter-study variability, including the coordinates of the activity loci, the level of activity, and the volume of the clusters, is desirable and would be ideal for understanding the central involvement of pain in the pathology of IBS.
Brain imaging has played an important role in elucidating the central response to peripheral stimulation in patients with IBS. Greater uniformity in the analysis of research data, such as performing direct subtractions between groups, using similar threshold approaches and levels (p or z values, number of voxels), and applying similar corrections for multiple comparisons, could offer much-needed consistency in the literature. In addition, clear and consistent reporting of results, presenting 1) bilaterality/unilaterality, 2) the voxels contained in the activated/deactivated clusters, 3) the exact coordinates and z value of peak activity per cluster, and 4) specific p values of the activity differences between the groups, would considerably help future quantitative meta-analyses. We could even go so far as to suggest collaborative efforts to standardize IBS distention protocols and imaging techniques, similar to those undertaken in other fields of study, such as the Alzheimer's Disease Neuroimaging Initiative [54], and those suggested in the Rome working team report [55
To clarify the directions of research in IBS, we have analyzed the information published to date. Though our results are inconclusive, our analysis of the literature regarding pain processing in IBS suggests that there are patterns of activity that could explain the differences in pain sensitivity between individuals with IBS and controls. However, clearer statements can be made only when there is greater uniformity in imaging analysis and in the reporting of results.