Background Childhood acute lymphoblastic leukaemia (ALL) may be the result of a rare response to common infection(s) acquired by personal contact with infected individuals. A meta-analysis was conducted to examine the relationship between day-care attendance and risk of childhood ALL, specifically to address whether early-life exposure to infection is protective against ALL.
Methods Searches of the PubMed database and bibliographies of publications on childhood leukaemia and infections were conducted. Eligibility was limited to observational studies of any size or location published in English, which resulted in the inclusion of 14 case–control studies.
Results The combined odds ratio (OR) based on the random effects model indicated that day-care attendance is associated with a reduced risk of ALL [OR = 0.76, 95% confidence interval (CI): 0.67, 0.87]. In subgroup analyses evaluating the influence of timing of exposure, a similarly reduced effect was observed for both day-care attendance occurring early in life (≤2 years of age) (OR = 0.79, 95% CI: 0.65, 0.95) and day-care attendance with unspecified timing (anytime prior to diagnosis) (OR = 0.81, 95% CI: 0.70, 0.94). Similar findings were observed with seven studies in which common ALL were analysed separately. The reduced risk estimates persisted in sensitivity analyses that examined the sources of study heterogeneity.
Conclusions This analysis provides strong support for an association between exposure to common infections in early childhood and a reduced risk of ALL. Implications of a ‘hygiene’-related aetiology suggest that some form of prophylactic intervention in infancy may be possible.
Evidence is growing in support of a role for infections in the aetiology of childhood leukaemia, particularly for the most common subtype, acute lymphoblastic leukaemia (ALL).1–3 Two infection-related hypotheses have gained popularity and are currently supported by substantial, yet inconsistent, epidemiologic findings. Kinlen first proposed the ‘population mixing’ hypothesis in response to the observed childhood leukaemia clusters occurring in the early 1980s in Seascale and Thurso, two remote and isolated communities in the UK that experienced a rapid influx of professional workers.4 He proposed that childhood leukaemia may result from an abnormal immune response to specific, although unidentified, infections commonly seen with the influx of infected persons into an area previously populated with non-immune and susceptible individuals. This hypothesis suggests a mechanism that involves a direct pathological role of specific infectious agents, presumably viruses, in the development of childhood leukaemia and that an immunizing effect may be acquired through previous exposure. Supportive data include several subsequent studies conducted by Kinlen and others examining similar examples of population mixing including rural new towns, situations of wartime population change and other circumstances contributing to unusual patterns of personal contact.4–11 Currently, there is no molecular evidence implicating cell transformation by a specific virus.12
The ‘delayed infection’ hypothesis proposed by Greaves emphasizes the critical nature of the timing of exposure and is intended to apply mostly to common B-cell precursor ALL (c-ALL), which largely accounts for the observed peak incidence of ALL between 2 and 5 years of age in developed countries.13,14 He described a role for infections in the context of a ‘two-hit’ model of the natural history of c-ALL,15 where the first ‘hit’ or initiating genetic event occurs in utero during fetal haematopoiesis producing a clinically covert pre-leukemic clone. The transition to overt disease occurs, in a small fraction (~1%) of pre-leukaemia carriers, after a sufficient postnatal secondary genetic event, which may be caused by a proliferative stress-induced effect of common infections on the developing immune system of the child.1,13 This adverse immune response to infections is thought to be the result of insufficient priming of the immune system usually influenced by a delay in exposure to common infectious agents during early childhood. With the assumption that improved socio-economic conditions may lead to delay in exposure to infections, the Greaves hypothesis provides one plausible explanation for the notably higher incidence rates of ALL with its characteristic peak age between 2 and 5 years observed only in more socio-economically developed countries.16,17 Although different in hypothesized mechanism, both the ‘population mixing’ and ‘delayed infection’ hypotheses propose childhood leukaemia to be caused by an abnormal immune response to infection(s) acquired by personal contacts, and are compatible with available evidence. In some populations, it is possible that both mechanisms may be operating.
Several previous epidemiological studies have used day-care attendance as an indicator of the increased likelihood of early exposure to infections,18 since it is well documented that in developed countries exposures to common infections, particularly those affecting the respiratory and gastrointestinal tracts, occur more frequently in this type of setting.19 The immaturity of children’s immune systems in combination with the lack of appropriate hygienic behaviour is believed to promote the transmission of infectious agents in this social setting.19–21 In the current analysis, we took a meta-analytic approach to summarize the findings to date on the relationship between day-care attendance and risk of childhood ALL.
Literature searches were conducted in PubMed to identify original research and review articles related to childhood leukaemia and day-care attendance and/or social contacts published between January 1966 and October 2008. The searches were conducted using the term ‘childhood leukaemia’ in combination with other terms including ‘infection’, ‘child care’, ‘day care’ and ‘social contact’. In addition, the bibliographies of epidemiology publications on childhood leukaemia and infections were searched to identify studies that may not have been captured through the initial database search. This included the review published in 2004 by McNally and Eden on the infectious aetiology of childhood acute leukaemia (AL).2
Among the studies identified, inclusion in the meta-analysis was limited to observational studies of case–control or cohort design of any size, geographic location and race/ethnicity of study participants. When more than one publication from an individual study was available, either the most recent publication or the publication that performed the analysis most applicable to evaluating the ‘delayed infection’ hypothesis was selected. Studies needed to have reported a relative risk (RR) or odds ratio (OR) and confidence intervals (CIs), or original data by disease status from which a measure of effect could be calculated. The outcome of interest was defined as clinically diagnosed leukaemia in children between the ages of 0 and 19 years. In the very few studies that did not distinguish between specific leukaemia subtypes,22–26 it was assumed that ALL was the primary subtype since it accounts for the majority (~80%) of leukaemia diagnoses in children.27
The exposure of interest was generally referred to as ‘day-care attendance’, which, in addition to formal day care, may have included preschool, nursery school, play groups, mother–toddler groups and other early social contacts. A strict criterion for the meaning of ‘regular attendance’ was not defined a priori since it was assumed that this would vary between studies. Of the primary studies identified, four were excluded for various reasons, including study emphasis on evaluating leukaemia prognosis and outcome,28 an earlier analysis of data from a study for which a more complete and recent publication is available,29 and not reporting a risk estimate for day-care attendance.30,31 After the exclusions, a total of 14 studies, all case–control in design, were retained for the meta-analysis.22–24,32–42
For most studies,23,24,34–40 the ORs and 95% CIs for leukaemia, AL, ALL or c-ALL among those who attended day care compared with those who did not attend day care were extracted. Among the few studies that did not provide this estimate, the OR for a similar measure was extracted, including those for no deficit in social contacts,42 regular contact outside the home,22 >36 months duration of day-care attendance,41 increasing index and family day-care measure,32 and social activity.33 In two instances, the reported OR was recalculated to reflect the risk associated with the highest level of day-care attendance and/or social activity measure compared with the lowest.41,42 Furthermore, several studies reported risk estimates for stratified analyses by specific subtype of leukaemia,22,32,33,35–38,40–42 age at diagnosis,34,35,38 specific age of day-care attendance,22,33,32–38,40 or race/ethnicity;37 multiple estimates were extracted from these studies for the purposes of subgroup and sensitivity evaluations in the meta-analysis, including specific leukaemia subtypes, particularly ALL and c-ALL and timing of day-care attendance. In general, studies referred to the common precursor B-cell ALL subtype (CD10 and CD19 positive ALL) as c-ALL. Four studies defined c-ALL with an added criterion that specified an age range between 2 and 5 years.33,37,38,43 Risk estimates by specific diagnosis age groups were not extracted since there were only a few studies that provided this information and the age cut-points varied. For the one study that stratified by race/ethnicity,37 two separate risk estimates were included in the meta-analysis since the reported estimates were based on independent populations.
The between-study heterogeneity was assessed using the Q statistic, which tests the null hypothesis that the estimated effect is homogeneous across all studies.44 Acknowledging that the eligible studies have been conducted independently and may represent only a random sample of the distribution of all possible effect sizes for this association, the random effects model was utilized, which incorporates an estimate of both between-study and within-study variation into the calculation of the summary effect measure.45 Compared with the fixed effects model,46 this method is more conservative and generally results in a wider CI. Finally, publication bias was evaluated visually using the funnel graph method that displays the distribution of all included studies by their point estimates and standard errors.47 In addition, the Begg and Mazumdar adjusted rank correlation test was used to test for correlation between the effect estimates and their variances which, if present, provides an indication of publication bias.48
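The pooling steps described above can be illustrated with a minimal Python sketch. It assumes the DerSimonian-Laird moment estimator for the between-study variance, which is one common way to fit a random effects model; the text does not state which estimator was used, and the original analysis was run in STATA. The function names and the input values are hypothetical and made up for illustration; they are not the studies included in this meta-analysis.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio, lower, upper):
    """Convert an OR and its 95% CI into a log-OR and its standard error."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(odds_ratio), se


def random_effects_pool(studies):
    """Pool (OR, lower, upper) tuples; assumes the DerSimonian-Laird estimator."""
    pairs = [log_or_and_se(*s) for s in studies]
    y = [p[0] for p in pairs]
    v = [p[1] ** 2 for p in pairs]            # within-study variances
    w = [1 / vi for vi in v]                  # fixed-effect (inverse-variance) weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]      # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    return {
        "q": q,
        "df": df,
        "tau2": tau2,
        "or": math.exp(y_re),
        "ci": (math.exp(y_re - Z95 * se_re), math.exp(y_re + Z95 * se_re)),
    }


# Made-up illustrative inputs, NOT data from the included studies.
example = [(0.60, 0.40, 0.90), (0.90, 0.70, 1.16),
           (0.75, 0.55, 1.02), (1.05, 0.80, 1.38)]
result = random_effects_pool(example)
print(result)
```

Because the between-study variance is added to every study's variance, the random-effects weights are more even than the fixed-effect weights and the resulting CI is at least as wide, which is the sense in which the method is described as more conservative.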
The association with c-ALL was evaluated with a meta-analysis of 7 of the 14 studies.32,33,35–38,42 If a study reported multiple ORs and 95% CIs by timing of day-care attendance, the risk estimate associated with the earliest timing (e.g. age ≤ 2 years) was used to be consistent with the ‘delayed infection’ hypothesis.23,32,34,37,38 The effect of the timing of exposure was evaluated in subgroup meta-analyses of studies reporting risk estimates for early day-care attendance (age ≤ 2 years)22,23,32–34,36–38,42 and studies reporting risk estimates for day-care attendance anytime before diagnosis.23,24,35,37–39,41 Finally, a series of sensitivity analyses were conducted to evaluate the sources of study heterogeneity, namely, the influences of potential selection bias, and heterogeneity in disease classification and exposure definition. The analyses were conducted using the statistical software, STATA Version 9.49
Table 1 presents selected characteristics of the 14 studies included in this meta-analysis. The studies, all case–control in design, were published between 1993 and 2008 and were conducted in many different geographic areas. Most studies achieved a population-based ascertainment of cases utilizing a national registry or a regional network of all major paediatric oncology centres. A population-based control selection strategy was most common with the exception of three studies that selected hospital-based controls.23,24,39 Only 1 of the 14 studies utilized a records-based day-care assessment protocol,36 whereas the remaining studies relied on standardized questionnaires administered either in person, by telephone or by mail. All studies have accounted for major confounding factors such as age, sex, race and socio-economic status through a matched study design and/or statistical adjustment in the analysis. Of the 14 studies identified, 11 studies have reported either a statistically significant reduced risk associated with day-care attendance and/or social contact measures23,33–37,39 or provide some evidence of a reduced risk.22,24,40,41
As shown in Table 2, the 14 studies included a total of 6108 cases and generated a combined OR estimate indicating that day-care attendance is associated with a reduced risk of childhood ALL (OR = 0.76, 95% CI: 0.67, 0.87). Figure 1 provides a visual portrayal of the relationship between day-care attendance and the risk of childhood ALL. Three large studies conducted in Germany,42 the USA38 and the UK33 appeared to carry a large proportion of the weight in the meta-analysis at ~13% each. The combined risk estimates excluding each of these studies individually remained similarly reduced indicating that no one large study was able to completely explain the protective effect observed (data not shown). No remarkable evidence of publication bias was apparent from the funnel plot since the data points for these 14 studies were, in general, randomly distributed around the combined OR estimate (plot not shown). This visual interpretation of the results was confirmed by the large P-value using the rank correlation method (P = 0.553).
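The check that no single large study explains the combined estimate can be sketched as a leave-one-out re-analysis. The Python sketch below again assumes DerSimonian-Laird pooling, which the text does not confirm. The study labels and numbers are hypothetical and made up for illustration; they are not the published estimates.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def pool(studies):
    """Return (OR, lower, upper, relative weights); assumes DerSimonian-Laird."""
    y = [math.log(o) for o, _, _ in studies]
    v = [((math.log(hi) - math.log(lo)) / (2 * Z95)) ** 2 for _, lo, hi in studies]
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    total = sum(w_re)
    est = sum(wi * yi for wi, yi in zip(w_re, y)) / total
    se = math.sqrt(1 / total)
    weights = [wi / total for wi in w_re]     # share of the total weight per study
    return math.exp(est), math.exp(est - Z95 * se), math.exp(est + Z95 * se), weights


def leave_one_out(studies):
    """Re-pool after dropping each study in turn; returns {label: (OR, lo, hi)}."""
    results = {}
    for label in studies:
        rest = [vals for key, vals in studies.items() if key != label]
        results[label] = pool(rest)[:3]
    return results


# Hypothetical labels and made-up numbers, NOT the published estimates.
studies = {
    "Study A": (0.60, 0.40, 0.90),
    "Study B": (0.90, 0.70, 1.16),
    "Study C": (0.75, 0.55, 1.02),
    "Study D": (1.05, 0.80, 1.38),
}
overall = pool(list(studies.values()))
loo = leave_one_out(studies)
for label, (or_, lo, hi) in loo.items():
    print(f"without {label}: OR = {or_:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

If every leave-one-out estimate stays on the same side of 1 as the overall estimate, no single study is driving the direction of the pooled effect.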
We attempted to maintain a reasonable balance between maximizing the inclusion of studies and minimizing sources of heterogeneity, by relaxing the eligibility criteria to include estimates for broader leukaemia subtypes, other social contact measures and unspecified timing of exposure. The influence of possible sources of heterogeneity on the combined risk estimate was then evaluated. In subgroup meta-analyses presented in Table 3 examining the influence of the timing of exposure, the combined OR for seven studies reporting estimates for day-care attendance or social contacts before diagnosis showed a reduced risk of childhood ALL (OR = 0.81, 95% CI: 0.70, 0.94). When the meta-analysis was limited to the nine studies that specifically evaluated day-care attendance at or before age 1 or 2 years, a similarly reduced risk of ALL (OR = 0.79, 95% CI: 0.65, 0.95) was observed.
A series of sensitivity analyses were conducted on the meta-analysis of the 14 studies to examine the influence of individual study characteristics on the combined OR, namely, potential biases in the selection of controls, the categorization of leukaemia and the assessment of day-care attendance. Figure 2 presents a summary of these analyses showing that none of these factors was able to completely account for the reduced risk of ALL observed in the main analysis of the 14 studies. For example, in the evaluation of potential control selection bias, reduced risks were observed for the analyses excluding three studies that used hospital-based controls (OR = 0.78, 95% CI: 0.68, 0.90) and excluding two studies that used random digit dialing (RDD) to select controls (OR = 0.72, 95% CI: 0.63, 0.81). Similarly reduced combined ORs were observed when excluding studies that included infants ( <1 year of age) in the study population (OR = 0.81, 95% CI: 0.70, 0.94), studies not specifically examining ALL (OR = 0.74, 95% CI: 0.63, 0.87), and studies that did not define the exposure strictly as attendance at a day care or a similar type of setting (OR = 0.74, 95% CI: 0.61, 0.88).
Table 4 presents the results of the meta-analyses evaluating the association between childhood c-ALL and day-care attendance. The analysis of c-ALL included fewer studies than the analysis of ALL. Similar to the result from the meta-analysis of ALL, the combined OR for the seven studies of c-ALL was also <1, although the CI was slightly wider (OR = 0.83, 95% CI: 0.70, 0.98). The subgroup analyses among studies of day-care attendance before age 1 or 2 years and c-ALL generated results similar to those for ALL (data not shown). No evidence of publication bias was observed for these analyses.
The evidence from a large and growing body of literature related to the exposure to infectious agents, as measured by day-care attendance, and the risk of childhood leukaemia was systematically evaluated using a meta-analytic approach. Heterogeneity between epidemiologic studies and their results is common and constitutes one of the major challenges in such a synthesis. Although the random effects model was used in this analysis to account for some of the between-study variation, we acknowledge the importance of interpreting results together with a thorough consideration of the potential sources of heterogeneity.
All the studies included in this analysis were conducted with the a priori objective of testing the biologically plausible, ‘delayed infection’ hypothesis, which specifies a predicted direction of risk, timing of the exposure and the most applicable subtype of leukaemia. Overall, the studies show consistency in support of a reduced risk associated with day-care attendance or social contacts during early childhood, with the vast majority of studies either reporting an effect in the hypothesized direction or no association. A quantitative assessment using meta-analysis indicates that day-care attendance is associated with a reduced risk of childhood ALL, as well as c-ALL. The reduction in risk persisted despite a thorough consideration of potential sources of study heterogeneity. We did not conduct a meta-analysis specifically in non-c-ALL or acute myeloid leukaemia due to the limited number of studies reporting results for these associations. Of the four studies that present data for non-c-ALL,35–38 three studies showed reduced ORs,35–37 but lacked precision. Based on currently available data, it is difficult to determine whether the association applies to a specific subtype of ALL only or ALL in general.
The subgroup meta-analysis by timing of day-care attendance did not suggest a stronger reduction in risk for day care specifically at or before age 1 or 2 years as might have been expected based on the hypothesis. However, a few individual studies have shown that the strongest reduction in risk occurs when day-care attendance is started <6 months of age.33,35,39 Although not formally evaluated in this meta-analysis, several individual studies that used detailed exposure assessment protocols demonstrated evidence of dose–response effects. Strong trends were observed for increasing levels of child-hours of day-care attendance,37 levels of social activity33 and age at start of day care.35
We were not able to conduct a comparable meta-analysis of studies pertaining to the related mechanism of rural ‘population mixing’ and the risk of childhood leukaemia. Although it was not possible to analyse the role of ‘population mixing’ in the same manner as was done for the ‘delayed infection’ hypothesis, it is recognized that these two processes may be interrelated or occur simultaneously and that both mechanisms may be operating in a given population. Thus, the results observed for the analyses of studies providing data relevant to the timing of infection in early life cannot be interpreted as ‘ruling out’ the possible role of ‘population mixing’, but rather lend further support to the role of immune related processes in the aetiology of ALL.
One major consideration in the evaluation of study validity is the possibility of selection bias, a type of systematic error that occurs when there is differential selection of either the cases or controls on the basis of characteristics which may affect exposure status. One way this may arise is if cases and controls do not originate from the same source population. A population-based ascertainment of cases is considered favourable since a defined source population, from which controls may be selected, is easily identifiable. Other strategies of case ascertainment may be appropriate as well, as long as the source population can be clearly defined. As implemented in three of the included studies, selection of controls among the inpatient cohort of the same hospital as the case diagnosis can fulfill this requirement, but can introduce bias if the illnesses/conditions of the control group are related to the exposure under study. Also, it has been suggested that the use of RDD, a population-based method of control recruitment, may result in a control group biased with respect to certain population characteristics that may be associated with exposures of interest.50 Analysis excluding the three studies that selected hospital-based controls23,24,39 or the two studies that used RDD to recruit population-based controls32,38 produced similar results to those for the full set of studies.
Similar types of systematic biases resulting in socio-economic differences between cases and controls have been implicated in other studies as well, including the large United Kingdom Childhood Cancer Study (UKCCS)33 and the Northern California Childhood Leukemia Study (NCCLS).37,51 Adjustments for these differences have been implemented in the analyses; however, the possibility of residual effects cannot be ruled out. To alleviate some of this concern, results of a subgroup analysis conducted in the NCCLS among matched cases and controls who had the same annual household income showed that the pattern of association with day-care attendance persisted.37
The potential for information bias in case–control studies is of particular importance due to the retrospective nature of data collection, and the recall of past exposures may be influenced by disease status. Most studies collected exposure data based on respondent recall using a standardized questionnaire administered either in person, by telephone or by mail. Recall bias in the evaluation of c-ALL is expected to be less likely, since diagnoses of c-ALL are usually made between ages 2 and 5 years, and recall of early exposure histories may be easier for the primary caregiver. Although the influence of recall bias could not be formally evaluated in these meta-analyses, one records-based day-care study conducted by Kamper-Jorgensen et al. in Denmark reported a reduced risk of childhood ALL associated with childcare attendance during the first 2 years of life.36 Several subtype specific analyses performed in this study showed the strongest association in B-cell precursor ALL and c-ALL.
In addition to potential biases associated with the ability of respondents to accurately recall past events, there was variation between studies in the extent of exposure assessment and categorization of individual exposures to infectious agents. For example, Schuz et al. reported results from a matched case–control study conducted in Germany that used a ‘deficit in social contacts’ variable based on the assumption that children were likely to have attended day care if during the first 2 years of their life both parents were in full-time work.42 The assumption made in the formulation of this social contact variable most likely contributed some non-differential misclassification, which tends to bias findings towards one of no effect. Their analysis did not indicate an association between deficit in social contact and AL or c-ALL.
In contrast, in the UKCCS, Gilham et al. created a hierarchical variable that reflected a child’s overall social activity based on interview data incorporating information on frequency of regular activity with children outside the home, frequency of attendance at a day nursery or nursery school, and number of other children in attendance.33 These analyses indicated that social activity/day-care attendance is associated with a reduced risk of childhood ALL. Ma et al., in the first publication on day-care attendance from the NCCLS, constructed a ‘child-hours of exposure’ variable incorporating information on the number of months attending a day care, mean hours per week at this day care and the number of other children at this day care. They reported that children who had more total child-hours of exposure had a reduced risk of ALL.29 These results were later confirmed in a follow-up analysis using a larger study population.37 Among non-Hispanic White children, those in the highest category of child-hours during infancy had a reduced risk of ALL and c-ALL compared with children who did not attend day care, with strong evidence of a dose–response effect. This association was not observed in Hispanic children, who, as noted by the authors, had different socio-economic and demographic characteristics, including larger family size and different day-care utilization patterns. Although these types of refined exposure assessment strategies that account for duration, frequency and size of the day-care facility serve as examples for future studies, results from these analyses may have contributed to study heterogeneity. In a meta-analysis of 10 studies that strictly defined the exposure as attendance at a day care or other similar types of settings,23,24,34–41 a reduced risk estimate was observed.
Current evidence suggests that different subtypes of leukaemia, defined by both immunophenotypic and molecular characteristics, may be associated with distinct aetiological mechanisms.52,53 To minimize the bias associated with misclassification of the phenotype, most studies specifically evaluating the infectious hypothesis have reported results by subtype-specific leukaemia such as c-ALL, and have excluded infants since there is evidence suggesting these leukaemias may be associated with a causal mechanism involving transplacental chemical carcinogenesis.54–56 This is not expected to be a major source of error, as observed in the sensitivity analysis, since infant leukaemias comprise only a very small proportion of all leukaemia diagnoses (<5%).57 It is believed that the hypothesis on infections, particularly the ‘delayed infection’ hypothesis, is most relevant to ALL and its most common subtype, c-ALL.1 Limiting the meta-analysis to only those studies providing risk estimates for specific subtypes resulted in a reduced risk associated with both ALL22,33–38,40,41 and c-ALL.32,33,35–38,42
The UKCCS recently published results from the first records-based study examining the relationship between clinically diagnosed infections in the first year of life and childhood ALL.43,58 Contrary to what is expected based on the ‘delayed infection’ hypothesis and what was observed in this meta-analysis of day-care attendance, the results of this well-designed records-based study showed evidence of an increased risk of childhood ALL and c-ALL associated with clinically diagnosed infections in the first year of life. It is possible that these contrasting results reflect one of many mechanisms involved in the aetiology of childhood ALL. The authors explain that their findings may indicate that a dysregulated immune response to infections during the first few months of life leads to an increased risk of ALL.43
Alternatively, from a methodological perspective, it has been suggested that these contrasting results may be an indication that previous studies using self-reported data on infections and social contacts, many of which have found a reduced risk of ALL, may be biased due to differential recall/reporting between cases and controls.58 Although more studies are needed to evaluate this apparent discrepancy, it is important to note at this juncture that infection based on clinical diagnosis may reflect a different infectious disease experience of the child compared with a self-reported infectious disease history, as mothers may not seek medical attention for all of the common infections experienced by the child.
Although still susceptible to recall bias, surrogate measures of exposure to infections such as day-care attendance and birth order, are recognized as strong alternative measures to testing the ‘delayed infection’ hypothesis, since they are highly associated with common childhood infectious diseases and have the added advantage of capturing a child’s asymptomatic infections.59 It is not known to what extent recall bias may have affected results of previous day-care studies, but there is evidence from a recent Denmark study also showing strong evidence of a reduced risk associated with a records-based assessment of day-care attendance.36
Overall, this meta-analysis of existing epidemiological data provides strong support for an association between exposure to common infections in early childhood and subsequent risk of ALL. As an indirect measure of exposure to infections, the ability of day-care attendance to serve as a surrogate measure may vary depending on characteristics of the facility attended and the child’s pattern of attendance. Epidemiologic studies have shown that the transmission and development of infectious diseases are highly influenced by the age of the child, frequency and duration of attendance, structure and size of the facility.19,21 Future epidemiologic studies of childhood leukaemia should attempt to obtain this type of detailed information on the facilities attended to refine the exposure classification.
Although inconsistent, there is evidence from studies of other surrogate measures of exposure to infections including birth order,2 parental social contacts in the workplace,60 and other immune-related factors (e.g. vaccination and breastfeeding history61,62), that support a role for infections and immune response in the aetiology of childhood leukaemia. The causal significance of the role of infections in childhood ALL would be strengthened by identification of a plausible biological mechanism for the conversion of pre-leukemic cells following infection1 and by incorporation of genetic biomarkers of susceptibility and immune response into further epidemiological studies.63,64 The protective effect of early infection on risk of subsequent childhood ALL parallels the similarly protective impact of parasitic infections on type I diabetes in both animal models and children.65 An important implication of these ‘hygiene’-related hypotheses and supportive data is that some form of prophylactic intervention in infancy may ultimately be possible.1,65
This work was supported by grants from the US National Institute of Environmental Health Sciences [grant numbers PS42 ES04705, R01 ES09137] and the Children with Leukaemia Foundation, UK. Funding to pay the Open Access publication charges for this article was provided by the US National Institute of Environmental Health Sciences [grant numbers PS42 ES04705, R01 ES09137].
The authors would like to thank Drs Mel Greaves and Tim Eden for their helpful contributions to a draft of the manuscript.
Conflict of interest: None declared.