4.1. Genital Ulcerative Disease versus Genital Discharge Syndrome
The comparisons of men diagnosed with GUD and GDS are consistent with findings that intact men are more prone to GUD and circumcised men are more prone to GDS. Consequently, there is no surprise here.
4.2. Genital Discharge Syndrome
The prevalence of GDS shows a moderate trend toward being less common in intact men (OR = 0.89 and 95% CI = 0.73–1.09). The finding in general populations is statistically significant (OR = 0.77 and 95% CI = 0.59–0.99). The only study of incidence found no significant difference [82]. Circumcision prevalence in the population studied had a significant association with the odds ratio measured for prevalence of GDS. The funnel graph indicates that the study by Warner et al. [87] is an outlier. This is confirmed when this study is excluded from the analysis: the summary odds ratio drops to 0.85, and the finding approaches statistical significance (95% CI = 0.70–1.03). This study may also explain why four of the measures of publication bias were positive. While this diagnosis is based on clinical findings, the lack of association with intact men and GDS is consistent with what is seen with NSU.
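As a reading aid, the relationship between a reported odds ratio, its 95% confidence interval, and statistical significance can be sketched as follows. This is a minimal sketch that assumes the intervals are symmetric on the log scale and uses a normal approximation; the function name `p_value_from_ci` is hypothetical and is not taken from the analysis itself.

```python
import math

def p_value_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided p-value from a reported OR and its 95% CI."""
    # Standard error of ln(OR), recovered from the CI width on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability under a normal approximation.
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported above for the prevalence of GDS.
all_populations = p_value_from_ci(0.89, 0.73, 1.09)
general_populations = p_value_from_ci(0.77, 0.59, 0.99)
print(f"all populations: p = {all_populations:.3f}")
print(f"general populations: p = {general_populations:.3f}")
```

Under these assumptions, an interval that excludes 1.0 corresponds to a p-value below .05, which is why only the general-population estimate is described as statistically significant.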
4.3. Nonspecific (Nongonococcal) Urethritis
The prevalence of NSU is significantly lower in intact males (OR = 0.76 and 95% CI = 0.63–0.92). Between-study heterogeneity is a concern, as five of the twelve studies contributed significantly to it, but exclusion of any of these studies did not change the significance of this finding. Three publication bias measures were positive, which is consistent with the paucity of studies in the lower left portion of the funnel graph. The “trim and fill” method, however, found that no studies were needed to adjust for publication bias.
Other than the problems with between-study heterogeneity, these analyses indicate a fairly robust, significant association between intact status and a lower prevalence of NSU.
4.4. Chlamydia
There was no significant difference in the prevalence of genital chlamydia infections, but there was a trend toward a lower prevalence in intact men. None of the studies of incidence found a significant difference (whether adjusted for lead-time bias or not). When studies of incidence are adjusted for lead-time bias and combined, there is no significant association.
Only two outliers were identified. When they are excluded from the analysis, the summary odds ratio is 0.93 (95% CI = 0.87–1.00) and the between-study heterogeneity resolves (chi-square = 7.75 (df = 11) and P = .7357). Meta-regression showed a trend toward a lower association between the prevalence of chlamydia and intact men in African studies (P = .0873). In African studies, the summary odds ratio was 0.63 (95% CI = 0.35–1.12).
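The heterogeneity P values quoted in this section are upper-tail probabilities of a chi-square distribution. The following is a minimal sketch of that calculation using only the Python standard library; the helper name `chi_square_p_value` is hypothetical, and the series used here is intended only for moderate values of the statistic such as those reported in this section.

```python
import math

def chi_square_p_value(q, df):
    """Upper-tail probability of a chi-square statistic q with df degrees of freedom."""
    if q <= 0:
        return 1.0
    a, x = df / 2.0, q / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, x).
    term = 1.0
    total = 1.0
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= x / (a + n)
        total += term
    lower = math.exp(-x + a * math.log(x) - math.lgamma(a + 1.0)) * total
    return 1.0 - lower

# Heterogeneity statistic reported above after excluding the two outliers.
p_chlamydia = chi_square_p_value(7.75, 11)
print(f"chi-square = 7.75, df = 11: P = {p_chlamydia:.4f}")
```

A large P value here means that the remaining variation between studies is no more than would be expected by chance, which is the sense in which the heterogeneity "resolves".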
The funnel graph indicates a clear outlier [89]. Two of the measures of publication bias were positive, and the “trim and fill” method added two studies to the left lower portion of the graph, giving a summary odds ratio, adjusted for publication bias, of 0.87 (95% CI = 0.69–1.11).
The analysis indicates a trend toward a lower prevalence of chlamydia in intact men, especially in Africa and in the general population. No difference was seen in the incidence studies.
4.5. Gonorrhea
No significant association between the incidence or the prevalence of gonorrhea and circumcision status of males was found. This was seen in both high-risk and general populations. There was significant between-study heterogeneity, and five potential outliers were identified. The prevalence of circumcision in the population studied was significantly associated with the odds ratio reported in the study (P = .0048). Extrapolating to the extremes, the summary odds ratio would be estimated at 0.68 (95% CI = 0.49–0.96) in a population with a 0% circumcision rate and at 1.72 (95% CI = 1.16–2.55) in a population with a 100% circumcision rate.
Figure 2: Natural logarithm of odds ratio as a function of the prevalence of circumcision in the population when estimating the prevalence of gonorrhea by circumcision status in adult men. Solid triangles represent individual populations. Circles represent estimates ...
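To illustrate what the two extrapolated estimates imply, the sketch below reconstructs a straight line on the log odds ratio scale from the two reported endpoints. This is an assumption made for illustration: the fitted regression coefficients are not given in the text, so the line here is derived only from the point estimates at 0% and 100%, and the names `odds_ratio_at` and `crossover` are hypothetical.

```python
import math

OR_AT_0 = 0.68    # reported estimate at 0% circumcision prevalence
OR_AT_100 = 1.72  # reported estimate at 100% circumcision prevalence

# Assumed straight line in ln(OR) between the two reported endpoints.
intercept = math.log(OR_AT_0)
slope = math.log(OR_AT_100) - math.log(OR_AT_0)

def odds_ratio_at(prevalence):
    """Predicted OR at a circumcision prevalence given as a fraction in [0, 1]."""
    return math.exp(intercept + slope * prevalence)

# Prevalence at which the assumed line crosses OR = 1 (no association).
crossover = -intercept / slope
print(f"OR = 1 at a circumcision prevalence of about {crossover:.0%}")
```

Under this assumption, the direction of the association reverses somewhere between the extremes, which fits the reading that the community prevalence of circumcision matters more than individual circumcision status.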
Only one measure of publication bias was positive, and the funnel graph looks symmetric. No studies were added using the “trim and fill” approach.
Funnel graph of precision (1/variance) by the natural logarithm of the odds ratio of studies estimating the prevalence of gonorrhea by circumcision status in adult men. Empty triangles represent published studies.
The data indicate that the incidence and the prevalence of gonorrhea are not affected by circumcision status as much as by the prevalence of circumcision within the community studied.
4.6. Genital Ulcerative Disease
Incidence and prevalence of GUD were consistently positively associated with intact men, even when subjected to sensitivity analysis and meta-regression. Between-study heterogeneity was significant even after adjusting for four “outlying” studies. Meta-regression found significant associations for population type, for whether studies were performed in Africa, and for circumcision prevalence in the populations studied. When combined in a multivariate analysis, only a study being performed in Africa was a significant factor.
Figure 3: Natural logarithm of odds ratio as a function of the prevalence of circumcision in the population when estimating the prevalence of genital ulcerative disease by circumcision status in adult men. Solid triangles represent individual populations. Circles ...
In the funnel graph, there is a study in the right lower portion that is not balanced in the left lower portion. None of the publication bias measures were positive, yet the “trim and fill” process added one study, giving a summary odds ratio, adjusted for publication bias, of 1.63 (95% CI = 1.34–2.01).
GUD, which is more commonly seen in developing countries, has a propensity for mucosal surfaces. Most of the studies of HSV have looked at seroconversion rates for herpes simplex virus type 2. This will not capture recurrences. Since GUD is a clinical measure that includes HSV recurrences and ulcers for which no causative agent can be identified, one would expect a higher rate in intact men because more than half of the mucosal surface of the penis is removed with circumcision. Herpes simplex viruses, including type 1 and type 2, also have a propensity for junctional tissues. This is why cold sores recur in the corner of the mouth and on the facial lips. If one were to amputate facial lips, one would see a lower recurrence rate of herpes simplex virus type 1. To follow this analogy, circumcision removes all of the junctional tissue of the prepuce [90], so this may impact HSV recurrences. While this is a consistent finding, it is difficult to know what the public health impact is in regions where the prevalence of GUD is low.
4.7. Syphilis
The data on syphilis present quite a farrago. On the one hand, there is a positive association between the prevalence of syphilis and intact genitalia, but, on the other hand, the incidence of syphilis, even before adjusting for lead-time bias, indicates a negative, albeit nonsignificant, association. The positive association is seen primarily in populations at high risk for acquiring STIs, while in general populations the statistical significance of the difference in prevalence depended on the calculation method used (general variance-based method: OR = 1.23 and 95% CI = 1.0064–1.49; meta-regression method: OR = 1.25 and 95% CI = 0.96–1.60). Seven prevalence studies had statistically significant contributions to the between-study heterogeneity. The between-study heterogeneity improves when only studies of general populations are considered but does not resolve completely. The prevalence of syphilis by circumcision status is also significantly associated with the prevalence of circumcision in the population studied.
Figure 4: Natural logarithm of odds ratio as a function of the prevalence of circumcision in the population when estimating the prevalence of syphilis by circumcision status in adult men. Solid triangles represent individual populations. Circles represent estimates ...
The funnel graph clearly looks asymmetric, but neither the measures of publication bias nor the “trim and fill” method identified this.
With the mixed results between incidence and prevalence, the lack of a significant association in general populations, the number of studies that could be considered outliers, the significant association with circumcision prevalence in the population studied, and the asymmetry of the funnel graph, one cannot accurately conclude that the risk of syphilis is significantly associated with circumcision status.
4.8. Genital Herpes/Herpes Simplex Virus Type 2
While there was a trend for the prevalence of HSV to be greater in intact men, the association was not statistically significant. When adjusted for lead-time bias, none of the studies that looked at the incidence of herpes simplex virus type 2 found a statistically significant association. When the studies are combined, there is no statistically significant association but a slight trend toward higher risk for intact men.
There was significant between-study heterogeneity for the prevalence studies. Six outliers were identified. Excluding these studies individually, or excluding the two largest contributors together, did not bring the between-study heterogeneity within an acceptable range and did not yield a summary effect that was statistically significant. In both high-risk and general populations, the summary effect was not statistically significant, and between-study heterogeneity remained significant. Using meta-regression, there was a trend (P = .1261) toward higher odds ratios in African studies.
The funnel graph indicates some asymmetry with a cluster of studies in the lower right portion that is not balanced on the left side. Two of the measures of publication bias were positive, but no adjustments were indicated using the “trim and fill” method.
While there is a trend toward higher incidence and prevalence of HSV in intact men, the finding is persistently not statistically significant despite a number of adjustments. The high level of between-study heterogeneity, which could not be shed despite several attempts, presents a problem in making any recommendation regarding circumcision's impact on HSV.
An earlier meta-analysis of HSV prevalence and circumcision had failed to include two of the populations included in this analysis [15]. This is strange considering that the same person was the lead author of both studies.
As an aside, there have been a number of systemic and fatal herpetic infections reported following ritual circumcision in which the person performing the circumcision puts his mouth around the penis after the foreskin has been amputated [92]. Instead of banning the practice, the New York City Health Department has asked parents to sign off on this practice. Orthodox Jews in New York City are currently fighting this ruling.
4.9. Chancroid
The paucity of studies, the reliance on clinical identification in all but one of these studies, and the high degree of between-study heterogeneity make it difficult to comment on the impact of circumcision on this illness, yet the lack of good evidence did not keep the 2012 AAP Task Force from including a discussion of circumcision's impact on the prevalence of chancroid [12], which is relatively uncommon in developing nations and extremely rare in developed nations. The degree of between-study heterogeneity is significant and can be almost completely attributed to one study [69]. Exclusion of this study brought the between-study heterogeneity within an acceptable range (P = .1128). When other outliers were excluded from analysis along with the study by Hart [69], the further reduction in the between-study heterogeneity chi-square, compared to excluding only Hart's study, was not statistically significant.
The data do not support the claim by Weiss et al. that “circumcised men are at lower risk of chancroid” [15]. There have been no new publications on the impact of circumcision on the prevalence of chancroid since 2006. The difference between the analyses is that Weiss et al. included several studies in their meta-analysis that were not strictly studies of chancroid. As I have noted previously [96], three of the studies included in their analysis of chancroid did not meet basic inclusion criteria because they lacked a direct comparison between intact and circumcised men for a specific diagnosis of chancroid [59]. In two of the studies, men with genital ulcers were presumed to have chancroid but never tested for it [97], while the third study tested the men presumed to have chancroid and found that 31.4% had herpes simplex virus type 2 and only 22.9% had a positive culture for Haemophilus ducreyi, the causative agent of chancroid [59]. When these studies are appropriately assigned to an analysis of the prevalence of GUD and excluded from an analysis of the prevalence of chancroid, any imagined association between circumcision status and prevalence of chancroid evaporates.
4.10. Genital Warts
The prevalence of genital warts has a strong trend towards being lower in intact males. In general populations, the association is statistically significant (OR = 0.78 and 95% CI = 0.63–0.96) and did not have evidence of between-study heterogeneity (chi-square = 8.61 (df = 6) and P = .1969). Three studies were identified as potential outliers; removal of the two studies with the greatest impact on between-study heterogeneity brought the between-study heterogeneity near the acceptable range (P = .0901). Using meta-regression, circumcision prevalence in the population studied was negatively associated with the reported odds ratios (P = .0324).
The funnel graph indicates some paucity of studies in the left lower region. None of the measures of publication bias were positive, yet the “trim and fill” calculations indicated that there were two studies missing in the left lower portion of the funnel graph. Adjusting for publication bias, the summary odds ratio was 0.76 (95% CI = 0.60–0.97).
The evidence in favor of a lower prevalence of genital warts in intact males is supported by the finding in studies of general populations, which were surprisingly free of between-study heterogeneity, and by the summary result after adjusting for publication bias. The odds ratios in studies were, however, impacted by the prevalence of circumcision in the population studied.
4.11. Human Papillomavirus
A systematic review of the incidence and prevalence of genital HPV infections as they relate to circumcision status in males is fraught with a variety of pitfalls. This may explain why several systematic reviews with meta-analysis have been published with inconsistent results [16]. HPV has many subtypes, some of which have been demonstrated to be oncogenic, while others are benign and self-limited infections. The oncogenic types have been strongly linked to cervical cancer in women and may be responsible for about half of the cases of penile cancer in men. Some studies reported their results for HPV infections without specifying the types of HPV identified, some reported only infections with oncogenic HPV, and some studies reported results on all HPV infections and also infections with oncogenic HPV. Consequently, two analyses were run (any HPV and high-risk HPV). Since oncogenic HPV is more concerning clinically, the second analysis may be the more relevant of the two. In the analysis that focused on high-risk HPV, there was no significant difference in the prevalence by circumcision status.
Previous analyses have found that sampling bias and patient report of circumcision status significantly affect the odds ratio reported in a study [16]. For this reason, a third analysis (selective HPV) was run on the studies of prevalence in the second analysis (high-risk HPV) in which studies with the potential for sampling bias and misclassification bias were excluded.
Finally, the two randomized clinical trials that reported their results on HPV infection both failed to adjust for sampling only the glans and to adjust for lead-time bias.
The incidence of HPV infections was barely statistically significantly different based on circumcision status before adjustment for sampling bias and lead-time bias (RR = 1.16 and 95% CI = 1.0097–1.34). After adjustment for these sources of bias, the relative risk is 0.96 (95% CI = 0.85–1.09).
Prevalence of HPV in the first analysis (any HPV) was higher in intact men (OR = 1.24 and 95% CI = 1.02–1.50), but the statistical significance of this finding is tenuous. When sensitivity analysis compares studies of high-risk populations with studies of general populations, the result in neither group is statistically significant. When two of the identified “outliers” are individually excluded from the analysis, the results are not statistically significant.
When meta-regression is used to adjust for sampling bias and misclassification bias, the summary odds ratio is 1.08 (95% CI = 0.93–1.24).
The funnel graph for the first analysis of HPV (any HPV) shows a clear paucity of studies in the left lower portion. Not surprisingly, three of the measures of publication bias were positive, and the “trim and fill” method added two studies. The summary odds ratio adjusting for publication bias was 1.19 (95% CI = 0.97–1.46).
Prevalence of HPV in the second analysis (high-risk HPV) was not significantly different on the basis of circumcision status (OR = 1.17 and 95% CI = 0.94–1.45). A significant difference was found in neither high-risk populations nor general populations.
Five outliers were identified. Excluding them individually from the analysis or excluding the two studies that contributed the most to between-study heterogeneity did not yield a statistically significant difference. Excluding the two studies did bring between-study heterogeneity to within an acceptable range (P = .1303). The summary odds ratio with these studies excluded was 1.16 (95% CI = 0.95–1.41).
Using meta-regression to adjust for sampling bias and misclassification bias, the summary odds ratio was 1.01 (95% CI = 0.84–1.22).
The funnel graph for the second analysis also shows a paucity in the left lower portion. Three measures of publication bias were positive, and “trim and fill” methods indicated the absence of two studies. The summary odds ratio, adjusting for publication bias, was 1.10 (95% CI = 0.88–1.39).
Prevalence of HPV in the third analysis (selective HPV) was nearly identical in intact and circumcised men (OR = 1.01 and 95% CI = 0.80–1.28). Three studies were identified as outliers. Exclusion of the study with the largest contribution to the between-study heterogeneity [86] resulted in the between-study heterogeneity coming within an acceptable range (P = .1689) and yielded a summary odds ratio of 0.96 (95% CI = 0.79–1.15). The funnel plot for the studies included in the third analysis was symmetrical, all measures of publication bias were negative, and no addition of studies was indicated by the “trim and fill” analysis.
There are several messages from the three analyses performed on the HPV prevalence studies. Sampling bias and misclassification bias have a significant differential effect on the odds ratios reported in studies where these forms of bias are suspected. There is no significant difference in the incidence or the prevalence of HPV (especially oncogenic HPV) on the basis of circumcision status. While circumcision proponents repeatedly laud circumcision as preventive for HPV infections, the data do not support this claim. When their own studies are adjusted for lead-time bias and sampling bias, their treatment effect disappears [26].
Several studies of HPV and circumcision status warrant additional comment because of their serious methodological flaws. One study compiled data collected from seven studies in five countries from three continents. A fatal flaw in the study was the small number of circumcised men in four of the countries and the small number of intact men in the fifth country. Of the twenty data cells that make up the two-by-two tables from the five countries, seven had five or fewer subjects. The authors used parametric statistical methods, which are notably unreliable in this situation, to report the statistics on the combined data [99]. Unfortunately, this study, which did not find a statistically significant association between circumcision status of male sexual partners and cervical cancer, has been quoted by circumcision proponents, including the authors of the study, as demonstrating that circumcision prevents cervical cancer. Given the problems with the small number of men in many of the data cells described above, it would be impossible to accurately perform the subset analyses they reported for cervical cancer.
The study published by Lajous et al. is problematic in that fourteen men were identified as circumcised on physical examination, while 95 men identified themselves as being circumcised. Although physical examination is considered the gold standard for assigning circumcision status, the study published the association between HPV infection and self-report of circumcision rather than using physical examination as the measure of circumcision status. Eighty-eight of the 95 men who reported themselves as circumcised were not circumcised on the basis of physical examination [100]. To defend their decision, the authors stated “we chose to report the findings of self-reported circumcision. The prevalence of circumcision in Mexico is very low, and the interviewers who did the physical examination may not be accustomed to it and may have been unable to identify its presence.” [100].
This inability of researchers in Mexico to accurately identify circumcision on physical examination may call into question other studies from Mexico. For example, the study by Vaccarella of Mexican men undergoing vasectomy reported a circumcision rate of 31.7% and was identified as an outlier [86]. This circumcision rate appears to be exaggerated in a country in which circumcision is rare. The studies by Giuliano et al. also recruited a third of their participants from Mexico [102].
Perhaps most concerning are the results reported by the group of researchers from Johns Hopkins, who have, after publication of their studies, become vociferous advocates of the benefits of circumcision [5]. At the beginning of their randomized clinical trial of circumcision of adult male “volunteers” in Rakai, Uganda, “two subpreputial and shaft swabs were also obtained for future testing of human papillomavirus infection.” [103] In 2011, Tobian et al. reported the results of the HPV cultures of the glans and penile shaft at the 12-month visit of participants in their randomized clinical trial [104]. So, it is not clear why, in 2009, Tobian et al. reported the difference in the incidence of HPV infections using only samples obtained from the preputial cavity of intact men and the coronal sulcus of circumcised men [82]. Why would Tobian and the research group from Johns Hopkins collect samples from the penile shaft and glans but only report the results from the glans?
Their randomized clinical trial ended in December 2006. In 2004, Weaver et al. published a study that demonstrated the clear differential between intact and circumcised males regarding the likelihood of HPV detection based on sampling the shaft or the glans of the penis [31]. There are only two reasons for the Johns Hopkins researchers to withhold the evidence they collected: either they were not current on the medical literature as it applied to the research they were conducting and reporting, or they purposely withheld results of the swabs taken from the penile shaft. Neither of the options, incompetence or willful academic misconduct, is appealing. Basically, when Tobian et al. and Auvert et al. reported only on sampling from the glans, they guaranteed a positive finding because the location of HPV on the penis differs according to circumcision status [82].
Research published in December 2008 had demonstrated that the HPV viral load varied significantly by anatomic site, with the penile shaft having the highest viral loads and being the preferred site for HPV-16 (the most prevalent oncogenic HPV type) replication [106]. There is also the question of whether the glans of the circumcised penis is too dry to allow accurate sampling [107].
The pertinent question as it relates to a systematic review of the medical literature and meta-analysis is whether studies that report only on cultures taken from the glans of the penis should be included in an analysis and adjusted for or be completely dismissed as invalid.
A couple of studies have indicated that the clearance of HPV takes longer from the intact penis [35]. If this is true, it is unclear what the clinical impact would be. HPV infections on the genitals are transitory. Consequently, if the clearance of the virus takes longer, it would be more likely to be detected in intact men. If sampling is infrequent, prolonged time to viral clearance would result in an overestimate of the incidence of infection, as infections of shorter duration could have come and gone and not been detected between scheduled samplings. This is an area that warrants further research.
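The sampling argument above can be made concrete with a simple model. The sketch below assumes that infections begin at a uniformly random time between two scheduled visits, that every infection present at a visit is detected, and that each infection lasts a fixed time; none of these assumptions comes from the studies cited, and the function name `detection_probability` is hypothetical.

```python
def detection_probability(duration_months, interval_months):
    """Chance that an infection is still present at the next scheduled sampling.

    Assumes onset is uniformly distributed over the sampling interval and
    that any infection present at a visit is detected.
    """
    return min(1.0, duration_months / interval_months)

# Hypothetical durations, for illustration only, with sampling every 12 months.
short_clearance = detection_probability(3, 12)
long_clearance = detection_probability(6, 12)
print(f"3-month infections detected: {short_clearance:.0%}")
print(f"6-month infections detected: {long_clearance:.0%}")
```

Under these assumptions, two groups with identical true incidence would show different observed incidence if one group clears the virus more slowly, which is the bias described above.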
Finally, the data from the randomized clinical trial of adult male circumcision in Kisumu, Kenya, were published in 2012. While swabs were taken from the penile shaft and the glans and the data on circumcision status were collected, the authors failed to report the overall rates of HPV infection by circumcision status [77]. If one back calculates using the rates of infections by the type of penile lesion and rates of the types of lesions by circumcision status and assumes there is no interaction between these factors, there is no statistically significant difference between HPV infection rates based on circumcision status. I wrote a letter to the editor asking that the authors provide the results of the incidence of HPV infection by circumcision status, but the editor refused to publish my letter.
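The back calculation described above is a weighted average over lesion types. The sketch below shows the arithmetic under the stated no-interaction assumption. All of the numbers are hypothetical placeholders chosen for illustration; they are not the values reported in the Kisumu trial, and the names used are not taken from that publication.

```python
def overall_rate(rate_by_lesion, lesion_share):
    """Overall infection rate as a weighted average over lesion types.

    Assumes the infection rate within a lesion type does not depend on
    circumcision status (no interaction).
    """
    return sum(rate_by_lesion[lesion] * share for lesion, share in lesion_share.items())

# Hypothetical inputs, for illustration only.
hpv_rate_by_lesion = {"no lesion": 0.20, "flat lesion": 0.50}
lesion_share_by_status = {
    "intact": {"no lesion": 0.80, "flat lesion": 0.20},
    "circumcised": {"no lesion": 0.90, "flat lesion": 0.10},
}

estimated = {
    status: overall_rate(hpv_rate_by_lesion, shares)
    for status, shares in lesion_share_by_status.items()
}
for status, rate in estimated.items():
    print(f"{status}: estimated HPV rate {rate:.2f}")
```

The two estimated rates would then be compared with a standard test for a difference in proportions, using the group sizes from the trial.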
4.12. Any Sexually Transmitted Infections
This is the first systematic review of the medical literature looking at the incidence and the prevalence of any STI, as opposed to not acquiring an STI, based on circumcision status. This analysis indicates that the prevalence of any STI is lower in intact men. Three of the four studies of incidence are consistent with the prevalence data, while one study from New Zealand indicated a significant protective effect. Overall, the incidence data indicate a trend that intact men have a lower incidence of any STI.
When looking at the funnel graph for any STI, the study by Langeni [85] is a clear outlier. When the Langeni study is excluded, the summary odds ratio drops from 0.86 (95% CI = 0.74–1.01) to 0.82 (95% CI = 0.74–0.92). While the odds ratio does not change drastically, the confidence interval is tightened by the 203.41 drop in the chi-square value for between-study heterogeneity. With Langeni included, four of the six measures of publication bias were positive. Once Langeni was excluded, one of the measures of publication bias was positive. Consequently, the analyses of any sexually transmitted disease were performed with Langeni included and with Langeni excluded.
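The effect of a single outlier on the heterogeneity chi-square can be illustrated with a small sensitivity analysis. The sketch below uses inverse-variance fixed-effect pooling and Cochran's Q as a simplified stand-in; it is not a reproduction of the method used in this analysis, the function name `pool` is hypothetical, and the four studies are made-up toy values rather than data from the review.

```python
import math

def pool(studies):
    """Inverse-variance fixed-effect pooling of (ln OR, variance) pairs.

    Returns the pooled ln(OR) and Cochran's Q, the heterogeneity chi-square.
    """
    weights = [1.0 / variance for _, variance in studies]
    pooled = sum(w * y for w, (y, _) in zip(weights, studies)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, studies))
    return pooled, q

# Toy data, for illustration only: three similar studies and one outlier.
toy_studies = [(0.0, 0.04), (0.1, 0.04), (-0.1, 0.04), (2.0, 0.04)]

pooled_all, q_all = pool(toy_studies)
pooled_trimmed, q_trimmed = pool(toy_studies[:-1])

print(f"all studies: OR = {math.exp(pooled_all):.2f}, Q = {q_all:.2f}")
print(f"outlier excluded: OR = {math.exp(pooled_trimmed):.2f}, Q = {q_trimmed:.2f}")
print(f"drop in Q: {q_all - q_trimmed:.2f}")
```

In a random-effects analysis, a large drop in Q also shrinks the estimated between-study variance, which is one way an exclusion can tighten the confidence interval without moving the point estimate much.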
Langeni may also be justifiably excluded because the study reported participant self-report of either GUD or GDS, which might exclude several types of STIs and relied on self-diagnosis in Botswana.
With Langeni excluded, the prevalence of any STI is significantly lower in intact men. When only high-risk populations are considered, the trend is in the same direction, but the difference is not statistically significant. The funnel graph, with the exclusion of Langeni, is fairly symmetric. “Trim and fill” analysis found that no studies needed to be added whether or not Langeni was included.
STIs with genital discharges are more common than genital ulcers, which may explain why the prevalence of any STI is lower in intact males. The ratio of the two general types of STIs within a community may also influence the impact of circumcision on overall risk of having any STI. Differences in these ratios in different populations may also contribute to the between-study heterogeneity.
Identifying and quantitating “any STI” may be problematic as the outcome of interest varied between studies. In some studies, collected data were the recollection of any STI in one's lifetime, while, in others, it was the recollection of any STI within the past 12 months. The range of infections tested for or queried about also varied between studies. Likewise, determinations needed to be made regarding what was an STI. Is a yeast infection a sexually transmitted infection or the result of an imbalance of normal flora? In this analysis, candidal infections as well as infections with T. vaginalis, mycoplasma, and ureaplasma were not included. How much this variation affected the summary effect is unknown.
It is clear that, despite these methodological concerns, the impact of circumcision is to increase the overall risk of contracting any STI. Because of the hodgepodge of data included in this analysis and disparate results on the incidence of infection, more studies specifically designed to answer this question are needed.
4.13. General Findings
Several consistencies in the analyses deserve comment. All of the prevalence analyses showed significant between-study heterogeneity. This reflects the variety of populations, settings, diagnostic methods, and ways of determining circumcision status. Some would argue that given this degree of between-study heterogeneity, any meta-analysis that follows is not worthy of publication. Because of the between-study heterogeneity, one cannot sufficiently emphasize a disclaimer of caveat emptor. I have erred on the side that information is good, especially when properly presented. Looking at the data from different perspectives and applying different techniques that might help identify the sources of between-study heterogeneity should guide the reader in how to interpret this information.
The summary effect for the prevalence of every disease was greater in studies of high-risk populations than in studies of general populations. This consistent finding, which was often statistically significant, has public policy implications. Calls for population-wide implementation of male circumcision on the grounds that it prevents STIs are not supported by the findings of these analyses. These analyses indicate that if male circumcision has any role (which these analyses also dispute) in reducing the incidence and prevalence of STIs, it should be implemented in easily identifiable high-risk populations. A major problem with infant circumcision is the lack of an accurate method of identifying which infants will find themselves in a high-risk population when they become sexually active. Similarly, meta-regression analysis of the studies of HIV incidence and prevalence has found that there is no significant association in general populations but only in high-risk populations [108].
In several analyses, the summary effect of the prevalence of a disease was significantly and positively associated with circumcision prevalence in the population studied. A similar finding has been identified in studies of HIV incidence and prevalence [108]. These findings are consistent with how sexual networks impact the spread of STIs [109]. Sexual partners are not found randomly but usually within one's cultural or ethnic group. Since circumcision status has a strong association with religious, tribal, and cultural factors, men with a particular circumcision status will likely have sexual partners from within a group that has a predominance of men with the same circumcision status. The smaller the group, the more quickly the rise and the higher the peak prevalence for a particular STI [109]. Consequently, when circumcision rates are high, intact men would be more likely to be in a smaller ethnic, religious, or cultural group and thus have a higher peak prevalence of a disease. As the circumcision prevalence drops, circumcised men would find themselves in the smaller groups that would be more likely to have a higher peak prevalence of infections.
The lack of a significant association between high-risk HPV infections and circumcision status undermines the argument made by the few who believe that circumcision reduces cancer risk [8
]. The lack of an association between HPV, HSV, and other STIs also undermines the analysis published by the same researchers at Johns Hopkins that selectively reported their HPV findings in Africa. They concluded that infant circumcision would save billions of dollars in public health expenditures, but these researchers relied almost exclusively on their own flawed data, which they failed to adjust for lead-time bias or sampling bias [10
]. If circumcision increases the overall incidence and prevalence of STIs, how will it save money?
The results of these analyses further undermine the argument that an increased risk of HIV infection in intact men is biologically plausible. The plausibility argument is based on several assumptions, all of which are purely speculative. The first is that the inner mucosa of the foreskin is thinner and more prone to abrasions. The second is that the subpreputial space is a breeding ground for sexually transmitted viruses. The third is that the Langerhans cells on the mucosal surface act like HIV magnets, pulling the virus into the body [111
]. The preputial mucosa is not thinner [112
], and circumcised men have a trend toward more penile abrasions (presumably from lack of adequate lubrication) [114
]. Langerhans cells are quite efficient at destroying HIV, which explains the low rate of transmission through sexual contact (approximately 1 in 1000 unprotected acts of coitus), and transmission requires activated T cells [115
]. Langerhans cells are the first line of mucosal defense. Their presence in the mucosal portion of the prepuce may explain why the overall incidence and prevalence of STIs is lower in intact men. Finally, there is no difference in the incidence and prevalence of HSV or HPV based on circumcision status. The claim that the subpreputial space is a preferential breeding ground for these viruses is also contradicted by the research that found the highest viral replication rates and viral load of HPV on the penile skin [106
]. Men with genital ulcers are at greater risk because of the disruption in epithelial integrity at the site of the ulcer and the activation of T cells by the inflammation accompanying the ulcer.
4.14. Missed Studies of Interest
There are several studies that reported results that could not be incorporated into the analyses. For example, Urassa et al. reported that they did not find a significant difference in GDS or GUD prevalence in males based on circumcision status but gave no further details [79
]. In 1949, Hand reported, without providing his data, no difference in the rate of HSV in soldiers on the basis of circumcision status [25
]. A study of 537 sailors examined for gonorrhea before and after shore leave in the Far East found that circumcision status did not significantly affect the susceptibility to gonorrhea but provided no specifics [78].
Because circumcision status based on country of origin is inexact, a Dutch study was excluded that found that men born in the Netherlands, where circumcision is an uncommon practice, had lower rates of STIs than men who immigrated from Turkey, where circumcision is nearly uniformly practiced (one or more STI: OR = 0.30 and 95% CI = 0.12–0.72; HSV: exact OR = 0.37 and 95% CI = 0.007–infinity; early syphilis: exact OR = 0.20 and 95% CI = 0.06–0.63; gonorrhea: OR = 0.20 and 95% CI = 0.06–0.63; chlamydia: OR = 0.42 and 95% CI = 0.14–1.37) [117
]. These results also support the theory that minority groups have a higher peak prevalence of STIs.
Of historical interest, a study of the cause of deaths in New York City in 1931 found that death from syphilis and related diagnoses was lower in Jews than non-Jews (Poisson regression RR = 0.66 and 95% CI = 0.51–0.86). When only males are considered, the results are similar (Poisson regression RR = 0.66 and 95% CI = 0.49–0.88). If circumcision was a contributing factor, beyond that seen for ethnicity alone, one would expect a significant interaction between ethnicity and gender in which Jewish men would have a lower rate of syphilis than Jewish women. Such an interaction could not be demonstrated (P = .6500) [118
]. Likewise, Jewish men and women were found less likely to have syphilis in 1882–1883, but, once again, the lack of interaction between ethnicity and gender (P = .9007) fails to support circumcision as a contributing factor [119
]. The differences in the rates of lues between ethnic groups can be explained by a lack of sexual mixing between the two populations. For example, Christian prostitutes were banned from consorting with Jews [120].
4.15. Methodological Choices
This paper did not review the literature for HIV infections for two reasons. First, such a review would be lengthy and best left to another article. Second, most of the study of HIV and circumcision status has taken place in Africa. In that setting, it is estimated that 20% or more of infections are not spread through sexual contact [121
]. Using the data from the three African randomized clinical trials in adult males that looked for an association between circumcision and incidence of HIV infection [103
], it appears that approximately half of the infections documented in these studies were transmitted through nonsexual means [131
]. None of these trials made any attempt to determine the source of HIV infection documented in the trials. Consequently, since it is not clear whether the HIV infections identified in African studies were sexually transmitted or iatrogenic infections, HIV infections were excluded from this paper.
A drawback seen in some observational studies is having a small number of patients with a specific outcome. When this occurs, the parametric assumptions that allow one to make accurate inferences may no longer be valid, resulting in inaccurate estimates for odds ratios and 95% confidence intervals. Since these inaccurate calculations of odds ratios and variance can bias summary effects and estimates of variance, including studies with small cell populations can result in inconsistent summary estimates depending on the calculation method used [16
]. To minimize any bias introduced by studies with cells with small populations, the odds ratios and confidence intervals were calculated using exact methods.
Some adjustments in the composition of control groups were necessary to provide consistency of methodology between studies. For example, Wilson compared seasoned soldiers to new recruits [23
], while Hand's controls were men without any exposure to STIs.
In Mallon et al., British men referred to a dermatology specialist for penile problems were compared to a control group of patients without penile problems cared for by the same dermatologists [24
]. This is a classic case of referral bias. If primary care providers are less comfortable identifying and treating problems with the complete penis, these men would be overrepresented in a referral dermatology practice. More difficult to explain is the high circumcision rate in the control group: 47.8%. Of the men with penile problems, only 23.0% were circumcised. Yet, in a representative population survey of British adults from the early 1990s, 21.9% of adult males reported being circumcised, with the highest circumcision rate (32.2%) being reported in men aged from 45 to 59 years [132
]. In a 2000 British survey, 15.8% of British men reported being circumcised [133
]. Clearly, a control group in which 47.8% are circumcised was not representative of the general population.
Using a control group of men without any STI is problematic. First, men without a detectable STI differ in several ways from men who have an STI and introduce a “Berksonian bias” [134
]. Some have the mistaken belief that contracting a different STI introduces unidirectional bias [135
]. The opposite is likely the case. Excluding men with a different STI is more likely to introduce bias. For example, if, while investigating for association between the prevalence of gonorrhea and circumcision status, all men with syphilis, whether or not they have gonorrhea, are excluded, the measure of association will be biased because intact men presumably have a higher prevalence of syphilis. By excluding a disproportionate number of intact men, the odds ratio for intact men having gonorrhea, after excluding those with syphilis, will be higher than if these men had been included. Similarly, if men with genital warts, which is more common among circumcised men, are excluded, then the odds ratio for intact men having gonorrhea will decrease. In order to justify excluding these men from the analysis, these other conditions would need to be shown to be confounding factors or effect modifiers for gonorrhea. This has not been demonstrated for the diseases in these analyses.
Second, using a disease-free control group discards data collected on men who had an STI other than the infection of interest. Those who participate in medical research allow their medical information to be used and their privacy to be violated. Violating a subject's privacy to collect data and then not using the information excludes useful information and is ethically suspect. Every participant's information should contribute to a study, and so serious deliberation needs to be undertaken before this information is arbitrarily excluded from analysis. If the aim of a study is to consider a specific infection, the data on all patients meeting the inclusion criteria should be incorporated into the analysis. For example, in a cross-sectional study, the characteristics of men with the disease of interest would be compared to the characteristics of men without the illness, regardless of whether they happen to have a different type of infection.
Finally, comparing men with the infection of interest to all men without it provides a method of comparison that is consistent with the other studies included in the meta-analysis.
Many prefer to use individual patient data in meta-analyses for a variety of reasons [136
]. First, not all studies adjust their results for confounding factors. In fact, most studies identified in this paper did not. Second, studies that provided adjusted odds ratios do not consistently adjust for the same factors, so adjusted results from different studies are not comparable. Third, studies that report adjusted results rarely perform evaluations for collinearity, which can destabilize multivariate models. Circumcision status has been noted in several studies to be a differential factor in the number of lifetime sexual partners, marriage rates, contact with prostitutes, and tobacco and alcohol consumption [137
]. If a study were to adjust for one of these factors, they might find that particular factor is significant, circumcision is significant, or both are significant, when the truth is that circumcision is linked to the other factor and the two variables in a multivariate model are describing the same thing. Fourth, when adjusted odds ratios are calculated, the uncertainty (variance) of the estimate increases. When calculating a summary effect, the weight assigned to data from an individual study is the inverse of the variance. An adjusted odds ratio will have a larger variance and give the study less weight when determining the summary effect than the unadjusted odds ratio would. For example, in the study by Laumann et al. [89
], the weight assigned to the raw data is from 3.6 to 6.6 times greater than the weight assigned to the adjusted odds ratios. Similarly, in a study by Urassa et al., going from raw data to an adjusted odds ratio increased the variance from 0.000685 to 0.0153 [79
]. As a result, a much smaller and less rigorous study that reported only raw data would have more impact on the summary effect than a large nationally representative probability sample using adjusted odds ratios. Fifth, adjusted odds ratios are open to manipulation using multivariate logistic regression. Consequently, using raw data will diminish the impact of researcher bias and avoid overfitting the data with multivariate analysis.
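The effect of adjustment on study weight can be checked with a few lines of arithmetic. The sketch below assumes the two variances quoted above for the study by Urassa et al. are variances of the log odds ratio and that weights are the plain inverse of the variance, as in a fixed-effect calculation; in a random-effects model the between-study variance would be added to each denominator, which would shrink the ratio. The function name is illustrative only.

```python
def inverse_variance_weight(variance):
    """Weight a study's estimate by the inverse of its variance."""
    return 1.0 / variance


# Variances quoted in the text for the study by Urassa et al.
# Assumption: both are variances of the log odds ratio.
raw_variance = 0.000685
adjusted_variance = 0.0153

raw_weight = inverse_variance_weight(raw_variance)
adjusted_weight = inverse_variance_weight(adjusted_variance)

# How many times more weight the raw data carry than the adjusted estimate.
weight_ratio = raw_weight / adjusted_weight

print(f"weight using raw data:        {raw_weight:.1f}")
print(f"weight using adjusted OR:     {adjusted_weight:.1f}")
print(f"ratio of raw to adjusted:     {weight_ratio:.1f}")
```

Under these assumptions, the raw data carry roughly 22 times the weight of the adjusted odds ratio from the same study.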
One of the most important tasks in performing the literature review is looking for forms of bias and making adjustments to minimize the impact of differential bias. Bias happens, and it is hard to identify and control. Most forms of bias are insidious and difficult to measure. Circumcision status, which is linked to socioeconomic status, may impact healthcare seeking behaviors. If, for example, circumcised men are more likely to visit an STD clinic for reassurance purposes, they would be more likely to be placed in a disease-free control group, thus increasing the odds ratio for intact men and the illness of interest [139].
Lead-time bias was present in all of the data coming from the randomized clinical trials of adult male circumcision in Africa. Because men randomized to immediate circumcision were not exposed to STIs for four to six weeks following their procedures, their exposure to disease was not the same as men who were assigned to later circumcision. While a six-week adjustment to trials scheduled to last from 21 to 24 months would not appear to be substantial, when the reduced exposure time is accounted for, several of the associations found in these trials were no longer statistically significant. If these findings were robust, adjusting for lead-time bias should not have influenced the interpretation of the results.
What is more concerning is that the potential for lead-time bias was overlooked in the planning, funding, analysis, and reporting phases of these projects. The potential for lead-time bias in any cohort study or clinical trial is taught and emphasized in the most basic classes on research design. How was this potential source of bias missed by the highly regarded researchers at Johns Hopkins, the reviewers who approved funding for these studies at the National Institutes of Health, and the editors and peer-reviewers at highly regarded medical journals such as The Lancet and The New England Journal of Medicine? To compensate for the deficiencies of these individuals, a post hoc adjustment of six weeks lead time was made. Six weeks were chosen to be on the conservative side.
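The mechanics of a lead-time adjustment can be illustrated with a short calculation. The sketch below is not the adjustment procedure used in this paper, whose details are not given here; it simply removes six weeks of time at risk per man from the circumcised arm and recomputes an incidence rate ratio with an approximate 95% confidence interval. All counts and person-years are hypothetical, chosen only to show how a borderline result can lose significance, and the function names are illustrative.

```python
import math

WEEKS_PER_YEAR = 52.0


def rate_ratio_with_ci(events_a, person_years_a, events_b, person_years_b, z=1.96):
    """Incidence rate ratio (arm A over arm B) with an approximate 95% CI.

    Uses the usual large-sample standard error of the log rate ratio,
    sqrt(1/events_a + 1/events_b).
    """
    ratio = (events_a / person_years_a) / (events_b / person_years_b)
    se = math.sqrt(1.0 / events_a + 1.0 / events_b)
    return ratio, ratio * math.exp(-z * se), ratio * math.exp(z * se)


def remove_lead_time(person_years, n_men, lead_time_weeks):
    """Subtract the post-operative period without exposure from time at risk."""
    return person_years - n_men * lead_time_weeks / WEEKS_PER_YEAR


# HYPOTHETICAL trial: 1000 men per arm followed for 2 years each.
n_per_arm = 1000
circumcised_events, circumcised_py = 150, 2000.0
control_events, control_py = 190, 2000.0

unadjusted = rate_ratio_with_ci(
    circumcised_events, circumcised_py, control_events, control_py
)

# Six weeks without exposure for every man in the immediate-circumcision arm.
adjusted_py = remove_lead_time(circumcised_py, n_per_arm, lead_time_weeks=6)
adjusted = rate_ratio_with_ci(
    circumcised_events, adjusted_py, control_events, control_py
)

for label, (ratio, low, high) in (("unadjusted", unadjusted), ("adjusted", adjusted)):
    print(f"{label:>10}: RR = {ratio:.2f}, 95% CI = {low:.2f}-{high:.2f}")
```

With these invented numbers the upper confidence limit moves from just below 1 to just above 1 once the six weeks are removed, which is the pattern described above for several of the trial findings.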
The need to adjust for sampling bias in the studies of HPV is quite apparent. Multiple studies have found that the location of HPV on the penis is differentiated by circumcision status [29
] and meta-regression has found that studies that sample only the glans have a significant difference in the odds ratio. The problem is that the entire treatment effect reported in studies that sampled only the glans [82
] can be attributed to sampling bias [26
]. Unfortunately, these studies are widely cited. While it could be argued that failure to sample the penile shaft is a fatal flaw, an adjustment for the number of infections missed is a straightforward solution. Doing so for the studies of disease incidence brought these studies in line with other studies that adequately sampled the genitals.
Nondifferential misclassification is a concern as the correlation between circumcision status based on patient report and physical examination can vary widely depending on the population studied [79]. Method of determining circumcision status was a significant factor in the meta-regression of studies of the prevalence of HPV. For some study designs, ascertaining circumcision status is not practical. For example, a number of studies using representative samples of the general population relied on the subject report for circumcision status.
Reliance on the patient report to document an STI introduces a potential for recall bias and may underestimate the incidence of STIs. This would only introduce bias if a differential ability to recall and report medically diagnosed sexually transmitted disease was linked to circumcision status [146
]. There is no reason to believe it is.
Searching for sources of bias also occurs in a meta-analysis, particularly for those involving observational studies, when looking at the impact of various factors on between-study heterogeneity. Some consider accounting for contributions to between-study heterogeneity to be an obligation of the investigator and the most important task in performing a meta-analysis [37
]. It is particularly important for observational studies, which, compared to randomized clinical and controlled trials, are, on average, likely to overestimate the true odds ratio by 30% [147
]. Other methods that look to reduce between-study heterogeneity include the search for and the exclusion of studies that contain appreciable outlier data [148
], sensitivity analysis, and meta-regression.
Most of the between-study heterogeneity can likely be attributed to methodological limitations in the source studies and the inherent biases in study design. Many of the studies included in these analyses reported information collected at STD clinics. While these clinics provided concentrated clinical material at one location, their clientele does not reflect the characteristics and risk factors for disease seen in the general population and may introduce a selection bias that unduly influences the results generated [109
]. Intact and circumcised men may not use these health facilities with equal frequency for similar indications. For example, in the United States and England, men with higher socioeconomic status are more likely to be circumcised and more likely to have an STI treated by a physician in private practice rather than at an STD clinic. Health-seeking behaviors may be different in circumcised men, who might be more likely to seek care for minor abrasions and thus be placed in a control group more frequently than their intact counterparts [134].
4.16. Shortcomings of Meta-Analysis
Meta-analysis is an inexact tool and best applied to randomized controlled trials. It has inherent weaknesses when applied to observational studies, so guidelines on how to undertake this process have been proposed [22
]. The validity of a meta-analysis of observational studies is related to study quality. The simple inclusion criteria allowed several studies of less than optimal quality to be included; however, more exclusive criteria can be subject to researcher bias and be manipulated to obtain specific results [15
]. The simple inclusion criteria may contribute to the between-study heterogeneity.
The analyses presented in this paper used a random-effects model to determine summary effects and confidence intervals. The alternative, a fixed-effects model, assumes a single true effect common to all studies, so any variation would be attributed only to sampling error. Random-effects models allow for a true random component as a source of variation in effect size between studies as well as sampling error [149
]. If between-study heterogeneity is low, the random-effects model will give an estimate and confidence intervals similar to a fixed-effects model. In general, random-effects models are preferred because the assumptions for a fixed-effects model to be accurate are rarely satisfied [150].
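The difference between the two models can be sketched in code. The example below pools log odds ratios under a fixed-effects model and under a random-effects model that estimates the between-study variance with the DerSimonian-Laird method. The choice of that estimator is an assumption made for illustration, since the text does not say which estimator was used, and the four studies are hypothetical.

```python
import math


def fixed_effect(log_ors, variances):
    """Inverse-variance pooled log odds ratio and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return pooled, 1.0 / sum(weights)


def dersimonian_laird_tau2(log_ors, variances):
    """Method-of-moments estimate of the between-study variance (tau squared)."""
    weights = [1.0 / v for v in variances]
    pooled, _ = fixed_effect(log_ors, variances)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    degrees_of_freedom = len(log_ors) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - degrees_of_freedom) / c)


def random_effects(log_ors, variances):
    """Pooled log odds ratio with tau squared added to every study variance."""
    tau2 = dersimonian_laird_tau2(log_ors, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return pooled, 1.0 / sum(weights), tau2


# HYPOTHETICAL studies: odds ratios and variances of their logarithms.
odds_ratios = [0.60, 0.90, 1.30, 0.75]
variances = [0.02, 0.05, 0.04, 0.01]
log_ors = [math.log(o) for o in odds_ratios]

fe_pooled, fe_var = fixed_effect(log_ors, variances)
re_pooled, re_var, tau2 = random_effects(log_ors, variances)

for label, pooled, var in (("fixed", fe_pooled, fe_var), ("random", re_pooled, re_var)):
    half_width = 1.96 * math.sqrt(var)
    print(
        f"{label:>6}: OR = {math.exp(pooled):.2f}, "
        f"95% CI = {math.exp(pooled - half_width):.2f}-{math.exp(pooled + half_width):.2f}"
    )
```

When the estimated between-study variance is zero, the two sets of weights are identical and the models agree; as it grows, the random-effects weights become more equal across studies and the confidence interval widens.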
One limitation of this systematic review, or any systematic review, is the inability to find all sources of data using any search strategy. All search strategies have an ascertainment bias: the goal is to diminish this bias by finding as many relevant studies as feasible. So, there may be published and unpublished studies that were not included.
The measures of publication bias are a mathematical attempt to quantify the gestalt of looking at a funnel graph and determining if it looks like an inverted funnel. Each measure of publication bias has its strengths and weaknesses [40
], but since there are no comparative analyses of the different methods of identifying publication bias, and the gold standard is our gestalt, all of the measures of publication bias should be used [151
]. They are often less than helpful. In the analyses published here, the results between the six different measures were often inconsistent, and funnel graphs that looked asymmetric in several instances did not have positive measures of publication bias and did not generate an intervention using “trim and fill” analysis. The trim portion of the “trim and fill” method is handicapped by being based solely on rank, without consideration of study size. Consequently, adjustments for publication bias should be viewed with caution as asymmetry of the funnel plot may be due to factors other than publication bias, and, likewise, results generated to correct for the asymmetry may not reflect a correction for publication bias [151