We performed our review in accordance with current guidelines for conducting [11] and reporting [13] systematic reviews and established a scientific advisory board (see Acknowledgments for participating experts). A review protocol was developed in advance but was not published.
In line with our research question, we defined the following inclusion criteria:
Population: Infants and children from 6 months to 5 years of age. While our primary focus was on age groups up to 2 years, we set the upper age limit at 5 years so as not to miss suitable studies with mixed age groups.
Intervention: Micronutrient-fortified milk or cereal food.
Control intervention: Non-fortified food; additional other nutritional approaches, if such approaches were applied in both the intervention and the control group.
Outcome: At least one of the following health-related outcomes: surrogate measures (such as MN serum levels, hematological parameters), functional outcomes (e.g. motor development), measures of morbidity (such as disease rates) or mortality.
Study designs: Randomized controlled trials of any follow-up time.
We excluded studies with infants and toddlers younger than 6
] or applying infant formula [15], studies addressing adolescents or adult women, interventions based on supplementation, home fortification, purely food-based approaches, fortification with components other than micronutrients, and studies testing absorption of MN. A priori, we also excluded studies with fortification of staple food as provided for larger population groups, in order to isolate the effect of fortified milk and cereals.
We systematically searched for studies using electronic databases (Medline [see the table "Medline electronic search strategy"], Cochrane Library; from 1966 to February 2011; no language restriction). As this review was part of a larger project that also evaluates the economic effects of MN fortification, we included search terms such as "cost" and "economics" as well. We screened reference lists of included papers and contacted experts in the field for additional references. In addition, we screened homepages of relevant organizations (e.g. WHO, United Nations [World Food Programme, Unicef, Millennium Development Goals], The World Bank, Pakistan National Nutrition Survey; International Clinical Epidemiology Network [16]; Global Alliance for Improved Nutrition, GAIN [17]; The Micronutrient Initiative [18]; Bill & Melinda Gates Foundation [19]). We also contacted a manufacturer (Nestlé) for further material and performed hand searches in relevant journals covering developing-country issues (such as The Lancet). All references were stored in an EndNote X4 database (Thomson/ISI ResearchSoft Berkeley, CA, USA).
Medline electronic search strategy
Study selection and data extraction
Three reviewers screened titles and abstracts for relevance and assessed potentially relevant studies for inclusion by full text. Teaching sessions were held in advance to improve conceptual consistency between reviewers. Disagreements were resolved by consensus meetings. If data of a specific population were published in several papers or if follow-up data were presented, we included each population only once. Using a predefined form, data were extracted by one reviewer in an Excel database and checked independently by a second reviewer.
We extracted data on general study information (e.g. study region; length and completeness of follow up), study setting (e.g. level of population recruitment), population details, intervention (e.g. daily amount of fortified MN, determined as daily difference between intervention and control group; composition of MN; comparator food) and outcome (e.g. morbidity rates; hemoglobin levels [g/dl; conversion to g/L with factor 10]).
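The two arithmetic conventions mentioned above (conversion of hemoglobin from g/dl to g/L with factor 10, and the daily fortified amount as the difference between intervention and control group) can be sketched as follows. This is only an illustration: the function names and the mg/day unit are assumptions, not part of the extraction form used in the review.

```python
def hemoglobin_to_g_per_l(value, unit):
    """Return a hemoglobin value in g/L; g/dl values are multiplied by 10."""
    unit = unit.strip().lower()
    if unit == "g/dl":
        return value * 10.0
    if unit == "g/l":
        return float(value)
    raise ValueError(f"unsupported hemoglobin unit: {unit!r}")


def daily_fortified_amount(intervention_mg_per_day, control_mg_per_day):
    """Daily fortified MN amount: intervention intake minus control intake.

    The mg/day unit is an assumption made for this sketch.
    """
    return intervention_mg_per_day - control_mg_per_day


# Hypothetical values, for illustration only.
print(hemoglobin_to_g_per_l(12.5, "g/dl"))
print(daily_fortified_amount(9.0, 1.5))
```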
One reviewer assessed risk of bias in individual studies with a component approach exploring methodological quality on the study level (adequate generation of random sequence, concealment of allocation, blinding) as well as on the outcome level (incomplete outcome data due to attrition; selective outcome reporting) [12].
First, we calculated pooled estimates. For continuous variables we computed weighted mean differences (WMD) and 95% confidence intervals (CI). For example, for the analysis of hemoglobin change we used the mean change in the intervention and in the control group and their pooled standard deviation (SD). If the sample size decreased during the study, we used the lower sample size at the end of the study. If mean hemoglobin change per group and SD were not reported, we calculated change as the difference between baseline and final values for the intervention and control group and applied the SD of final values [20]. If 95% CI of mean values were reported, we converted them to SD assuming a normal distribution [21]. To check results for robustness, we also calculated WMD for final hemoglobin values of both study groups, as these data were reported more often. Due to considerable heterogeneity between trials, we applied a random effects model [22]. When authors reported only medians for continuous data (e.g. for ferritin levels), we did not include those data in the meta-analysis. For binary data, we calculated risk ratios and 95% CI. Heterogeneity between trials was quantified with I², that is, the percentage of the total variation in estimated effects that is due to heterogeneity rather than chance (where values of 25%, 50% and 75% are considered low, moderate and high, respectively) [23].
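The calculations described in this paragraph can be illustrated with a minimal Python sketch. Several points are assumptions rather than statements about the analysis actually run in Stata: the random effects model is taken to be the DerSimonian-Laird estimator, the variance of a mean difference is computed from the two group SDs, the CI-to-SD conversion uses the normal quantile 1.96, and all function names and input numbers are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def sd_from_ci(lower, upper, n):
    """SD of one group from the 95% CI of its mean (normal approximation)."""
    return math.sqrt(n) * (upper - lower) / (2 * Z_95)


def mean_difference(m_int, sd_int, n_int, m_ctl, sd_ctl, n_ctl):
    """Mean difference (intervention minus control) and its variance."""
    return m_int - m_ctl, sd_int**2 / n_int + sd_ctl**2 / n_ctl


def risk_ratio(events_int, n_int, events_ctl, n_ctl):
    """Risk ratio with 95% CI, computed on the log scale."""
    rr = (events_int / n_int) / (events_ctl / n_ctl)
    se = math.sqrt(1 / events_int - 1 / n_int + 1 / events_ctl - 1 / n_ctl)
    return rr, rr * math.exp(-Z_95 * se), rr * math.exp(Z_95 * se)


def random_effects_pool(effects, variances):
    """Pooled effect, 95% CI, tau^2 and I^2 (percent), DerSimonian-Laird (assumed)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"wmd": pooled, "ci": (pooled - Z_95 * se, pooled + Z_95 * se),
            "tau2": tau2, "i2": i2}


# Hypothetical trials (hemoglobin change in g/L); numbers are invented.
trials = [
    mean_difference(8.0, 10.0, 100, 3.0, 11.0, 95),
    mean_difference(5.0, 9.0, 60, 4.0, 9.5, 62),
    mean_difference(12.0, 12.0, 150, 2.0, 12.5, 148),
]
summary = random_effects_pool([t[0] for t in trials], [t[1] for t in trials])
print(summary)
```

Working on the log scale for the risk ratio keeps the interval asymmetric around the point estimate, which is the usual convention for ratio measures.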
Second, we divided our dataset into pre-specified subgroups to explore the influence of possible modifying factors on the outcome (fortified milk vs. cereal food; high vs. low/middle-income countries; single- vs. dual/multi-micronutrient fortification strategy).
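As an illustration of such a subgroup split, the sketch below groups studies by one pre-specified factor and pools each subgroup separately. For brevity it uses plain inverse-variance (fixed-effect) weights within each subgroup rather than a random effects model; the field names (`vehicle`, `effect`, `variance`) and all numbers are hypothetical.

```python
from collections import defaultdict


def pool_by_subgroup(studies, key):
    """Inverse-variance pooled effect per level of `key` (fixed-effect weights)."""
    groups = defaultdict(list)
    for study in studies:
        groups[study[key]].append(study)
    pooled = {}
    for level, members in groups.items():
        weights = [1 / s["variance"] for s in members]
        total = sum(weights)
        estimate = sum(w * s["effect"] for w, s in zip(weights, members)) / total
        pooled[level] = {"k": len(members), "effect": estimate,
                         "se": (1 / total) ** 0.5}
    return pooled


# Hypothetical studies; numbers are invented for illustration only.
studies = [
    {"vehicle": "milk", "effect": 4.0, "variance": 1.0},
    {"vehicle": "milk", "effect": 6.0, "variance": 1.0},
    {"vehicle": "cereal", "effect": 10.0, "variance": 4.0},
]
result = pool_by_subgroup(studies, "vehicle")
for level, stats in result.items():
    print(level, stats)
```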
Third, we performed a meta-regression analysis weighted for the inverse of the variance of the outcome [12]. With this approach we evaluated the unique contribution of other a priori chosen independent factors on the most often reported outcome (dependent variable: hemoglobin level; independent variables: hemoglobin levels before intervention; daily amount of fortified MN; length of follow-up; completeness of follow-up).
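The weighting principle can be sketched as a weighted least-squares fit in which each study contributes with weight 1/variance. The sketch is deliberately simplified to a single covariate (the analysis described above used several independent variables), and the function name and data are hypothetical.

```python
def weighted_meta_regression(x, effects, variances):
    """Intercept and slope of effect ~ x, each study weighted by 1/variance."""
    w = [1 / v for v in variances]
    total = sum(w)
    # Weighted means of covariate and effect.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("covariate has no variation across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, effects))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: baseline hemoglobin (g/L) vs. effect on hemoglobin (g/L).
baseline = [100.0, 110.0, 120.0]
effect = [10.0, 8.0, 6.0]
variance = [1.0, 2.0, 4.0]
intercept, slope = weighted_meta_regression(baseline, effect, variance)
print(intercept, slope)
```

A negative slope in such a fit would indicate smaller effects in studies with higher baseline values; with several covariates the same idea extends to a weighted multiple regression.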
For parametric and non-parametric tests P-values <0.05 were considered significant. Analyses were performed using the STATA SE 9 software package (StataCorp. 2007. Stata Statistical Software, College Station, Texas, USA).