Background The promotion of household water treatment and handwashing with soap has led to large reductions in child diarrhoea in randomized efficacy trials. Currently, we know little about the health effectiveness of behaviour-based water and hygiene interventions after the conclusion of intervention activities.
Methods We present an extension of previously published design (propensity score matching) and analysis (targeted maximum likelihood estimation) methods to evaluate the behavioural and health impacts of a pre-existing but non-randomized intervention (a 3-year, combined household water treatment and handwashing campaign in rural Guatemala). Six months after the intervention, we conducted a cross-sectional cohort study in 30 villages (15 intervention and 15 control) that included 600 households, and 929 children <5 years of age.
Results The study design created a sample of intervention and control villages that were comparable across more than 30 potentially confounding characteristics. The intervention led to modest gains in confirmed water treatment behaviour [risk difference = 0.05, 95% confidence interval (CI) 0.02–0.09]. We found, however, no difference between the intervention and control villages in self-reported handwashing behaviour, spot-check hygiene conditions, or the prevalence of child diarrhoea, clinical acute lower respiratory infections or child growth.
Conclusions To our knowledge this is the first post-intervention follow-up study of a combined household water treatment and handwashing behaviour change intervention, and the first post-intervention follow-up of either intervention type to include child health measurement. The lack of child health impacts is consistent with unsustained behaviour adoption. Our findings highlight the difficulty of implementing behaviour-based household water treatment and handwashing outside of intensive efficacy trials.
The prevalence of diarrhoea in developing countries has encouraged the development of low-cost, behaviour-based interventions to interrupt diarrhoea-causing pathogen transmission by improving water quality at the point-of-use and by washing hands using soap. Meta-analyses of efficacy studies indicate that household water treatment reduces diarrhoea in children <5 years of age by 30–40%,1–3 and that handwashing with soap reduces diarrhoea and acute respiratory infections by 31 and 24%, respectively.4,5 The child health improvements documented in efficacy studies of the interventions reflect treatment effects in the short term (maximum study duration 12 months), typically under weekly or bi-weekly behavioural reinforcement.
In this article, we evaluate the effectiveness of a 3-year, combined water treatment and handwashing intervention in rural Guatemala through a novel extension of previously published design (propensity score matching) and statistical methods (targeted maximum likelihood estimation).6,7
Between October 2003 and September 2006, two non-governmental organizations, Caritas and Catholic Relief Services (CRS), implemented a large household water treatment and handwashing campaign in approximately 90 villages across three municipalities in rural eastern Guatemala. The implementing organizations had oversight from the SODIS Foundation (http://www.fundacionsodis.org). The promoted water treatment methods were boiling, solar disinfection (SODIS) and chlorination using diluted bleach, all of which have demonstrated health benefits.2,8 All villages received the same intervention package, and all activities were initiated at the same time. In each intervention village, Caritas technicians introduced the programme to village residents and recruited community-based health promoters for training. The trained health promoters later visited households with children ≤3 years of age or with pregnant mothers to promote water treatment and handwashing using soap. Promoters also educated mothers about proper nutrition for their children, and at the end of each visit gave the family a small ration of rice, beans and oil. The visits were conducted monthly or bimonthly and lasted ~30 min each (see online appendix for additional details; available as supplementary data at IJE online). No formal record exists of the proportion of eligible households that participated, but technicians on the ground suggested that the majority of eligible households participated.
At the conclusion of the intervention, the implementing organization conducted a survey of participating households and recorded water treatment behaviour based on self-report. The survey estimated that 70% of participating households regularly used some method of household water treatment (village-level participation range: 29–100%). The SODIS Foundation provided these data at the start of our evaluation.
The primary objective of this study was to revisit households to assess water treatment behaviour, basic hygiene knowledge and practices, and child health 6 months after the conclusion of the intervention. We measured child health using self-reported symptoms of acute diarrhoeal and respiratory illness. We used anthropometric measurements that have demonstrated utility as outcome measures for water and sanitation interventions.9–11
This study was conducted in the Camotán municipality in the mountainous state of Chiquimula, Guatemala, near the eastern border with Honduras. We used a cross-sectional cohort design with a 48-h retrospective risk period.12 All data collection followed protocols approved by institutional review boards at the University of California, Berkeley and the Universidad del Valle de Guatemala, and all participants provided informed consent.
Since the intervention was non-randomized and villages were purposely selected by the implementing organizations, intervention villages were likely different, on average, from other villages in the study region. Baseline differences between intervention and control villages could lead to differences in water and hygiene practices and child health, independent of the intervention. To help increase comparability between intervention and control groups, we used restriction and propensity score matching13 based on pre-intervention characteristics to select intervention and control villages. All study villages—intervention and control—were selected in 2007, after the intervention ended. We adapted this selection approach from prospective, non-randomized, community-level intervention studies.6,14,15
We obtained village-level 2002 census data with detailed information about demographics, education, housing conditions, water sources and sanitation for 88 villages (30 intervention, 58 control) in the study region.16 We restricted our sample to villages with at least 50 children <5 years of age to guarantee a sufficient sample in each village. After a rapid assessment, we excluded two additional potential control villages that were qualitatively wealthier and had a large fraction of residents living in the USA sending remittances. This newly generated wealth was not reflected in the 2002 census. Our final sample for the match included 23 intervention villages and 26 potential controls.
We modelled the probability of participation in the behaviour change intervention using a logit model: logit [Pr(A = 1|W)] = α’W, where A is an indicator variable equal to 1 if a village participated in the intervention and 0 otherwise, and W is a vector of characteristics that included the percentages in each village of: males; children <5 years of age; literate females; individuals employed in agriculture; households with private water taps; households with private wells; households with private latrines; households with electricity; and households with soil floors. W also included the number of households, the number of people per household and the distance to the municipal centre. Importantly, the covariates in W were selected after detailed discussions with program technicians in the implementing organizations, and include information that the organizations used to select intervention villages.
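As a minimal sketch of how a fitted logit model of this form turns village covariates into a propensity score, the Python example below evaluates the inverse logit of the linear predictor. The coefficient values, the two-covariate subset of W and the explicit intercept term are hypothetical and chosen only for illustration; they are not the estimates from this study.

```python
import math

def linear_predictor(alpha, w):
    """Linear predictor for one village; alpha[0] is an assumed intercept."""
    return alpha[0] + sum(a * x for a, x in zip(alpha[1:], w))

def propensity_score(alpha, w):
    """Inverse logit of the linear predictor, i.e. Pr(A = 1 | W)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(alpha, w)))

# Hypothetical coefficients: intercept, % households with private taps,
# % households with private latrines (not the study's fitted values).
alpha = [-1.0, 0.02, 0.01]
village = [50.0, 0.0]

score = propensity_score(alpha, village)
print(round(score, 3))
```

The linear predictor, rather than the probability itself, is the quantity that the matching step operates on.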
In both the propensity score model used in our design and the targeted maximum likelihood estimation used in our adjusted analyses, one must estimate regressions that are not of direct interest (nuisance parameters), but are necessary to estimate the parameter of interest. The consistency of our estimates is contingent on the consistency of these nuisance parameter estimates. To estimate the nuisance parameters, we used the Deletion/Substitution/Addition (D/S/A) algorithm, which is a flexible model-selection approach that fits polynomial terms and their tensor products using cross-validation.17 Our D/S/A-selected propensity score model included the covariates as main effects with no interactions after allowing for up to two-way interactions, a maximum quadratic order for each term, and up to 15 total terms.17 We excluded intervention and control villages outside the region of common support (overlap) on the propensity score.18 Finally, we matched each intervention village to a control village without replacement, using the linear predictor of the model with nearest neighbour matching. This step resulted in 19 matched pairs. Due to time-in-field constraints, we included the 15 pairs with the closest match in our study.
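The restriction to common support and the one-to-one matching step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the text does not say whether matching was greedy or optimal, or in what order villages were processed, so the sketch uses a simple greedy pass in identifier order. Village identifiers and scores are made up.

```python
def restrict_to_common_support(treated, controls):
    """Keep villages whose score lies in the overlap of the two score ranges."""
    lower = max(min(treated.values()), min(controls.values()))
    upper = min(max(treated.values()), max(controls.values()))

    def inside(scores):
        return {k: v for k, v in scores.items() if lower <= v <= upper}

    return inside(treated), inside(controls)

def nearest_neighbour_pairs(treated, controls):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (treated_id, control_id, distance) tuples, closest pair first.
    """
    available = dict(controls)
    pairs = []
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id, abs(available[c_id] - t_score)))
        del available[c_id]  # without replacement: each control is used once
    return sorted(pairs, key=lambda p: p[2])

# Hypothetical linear-predictor values for a handful of villages.
treated = {"T1": -0.2, "T2": 0.45, "T3": 2.5}
controls = {"C1": -0.1, "C2": 0.5, "C3": 0.9, "C4": -3.0}

t_in, c_in = restrict_to_common_support(treated, controls)
pairs = nearest_neighbour_pairs(t_in, c_in)
closest = pairs[:1]  # analogous to keeping only the best-matched pairs
print([(t, c) for t, c, _ in pairs])
```

Sorting the pairs by distance makes it straightforward to keep only the closest matches when field time limits the number of villages that can be visited.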
We selected households within each village, using a stratified systematic sample. Our team used village sketch maps from the municipal planning department to split each village into two geographic strata with roughly equal numbers of houses. Within each stratum, the field supervisor chose a random start, and the interviewer teams visited every third house until 10 houses were sampled. The inclusion criteria for the study were: (i) at least one child <5 years of age living in the home and (ii) the family had lived in the village since 2003 or earlier (the time of intervention start). If a selected household met our inclusion criteria but the primary caretaker was away, the field team returned two additional times before choosing a replacement household.
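The household selection rule can be illustrated with a short sketch. It assumes that the random start falls within the first three houses of each stratum's walking order and that every stratum contains enough houses to yield 10; eligibility screening and replacement households are not modelled. House identifiers are hypothetical.

```python
import random

def systematic_sample(houses, step=3, n=10, rng=random):
    """Random start, then every `step`-th house until `n` houses are drawn."""
    start = rng.randrange(step)
    return houses[start::step][:n]

def sample_village(strata, seed=None):
    """Apply the systematic sample independently within each stratum."""
    rng = random.Random(seed)
    return [h for stratum in strata for h in systematic_sample(stratum, rng=rng)]

# Two hypothetical geographic strata of 40 houses each.
stratum_a = [f"A{i:02d}" for i in range(40)]
stratum_b = [f"B{i:02d}" for i in range(40)]

selected = sample_village([stratum_a, stratum_b], seed=1)
print(len(selected))  # 20
```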
A team of four trained fieldworkers and a field supervisor conducted household interviews during the dry season between April and June of 2007. The survey instrument was pre-tested and validated over a 2-week period in nearby, non-study villages. We collected household water samples in a random sample of 48 households from eight study villages (four intervention and four control, distributed across major water catchments). Water samples were collected in 100-ml Whirl-Pak™ bags in a fashion that mimicked each household's water retrieval practices, and transported in a cooler to the laboratory at the Universidad del Valle de Guatemala for culturing within 20 h of collection. Samples were processed using the Colilert Quantitray 1000 kit (IDEXX Laboratories), and we used a most probable number (MPN) table to quantify Escherichia coli.
The primary health outcomes of our study were diarrhoea, clinical acute lower respiratory-tract infections (ALRI), and child growth measured by height, weight and mid-upper-arm circumference. We collected gastrointestinal and respiratory symptoms over the previous 7 days using a health calendar modelled after Goldman et al.19 Gastrointestinal and respiratory outcomes were measured using daily longitudinal prevalence20,21 with 2-day recall after we identified under-reporting of symptoms for recall periods >2 days.22,23
We defined diarrhoea as three or more loose or watery stools in 24 h, or a single stool with blood or mucus.24 We recorded symptoms of highly credible gastrointestinal illness (HCGI), defined as any of the following four conditions: (i) vomiting, (ii) watery diarrhoea, (iii) soft diarrhoea and abdominal cramps or (iv) nausea and abdominal cramps.25 We defined clinical ALRI according to the World Health Organization (WHO) clinical case definition: cough or difficulty breathing with a raised respiratory rate measured with a wristwatch (more than 60 breaths/min for children <60 days old, more than 50 breaths/min for children aged 60–364 days and more than 40 breaths/min for children aged 1–5 years).26
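The case definitions above translate directly into simple decision rules. The sketch below encodes the diarrhoea definition and the age-specific respiratory rate thresholds as stated in the text; the function and argument names are hypothetical, and age is assumed to be recorded in days.

```python
def has_diarrhoea(loose_stools_24h, blood_or_mucus):
    """Three or more loose or watery stools in 24 h, or any stool with blood or mucus."""
    return loose_stools_24h >= 3 or blood_or_mucus

def raised_respiratory_rate(age_days, breaths_per_min):
    """Age-specific threshold from the case definition given in the text."""
    if age_days < 60:
        threshold = 60
    elif age_days < 365:
        threshold = 50
    else:
        threshold = 40
    return breaths_per_min > threshold

def clinical_alri(age_days, cough_or_difficulty_breathing, breaths_per_min):
    """Cough or difficulty breathing together with a raised respiratory rate."""
    return cough_or_difficulty_breathing and raised_respiratory_rate(
        age_days, breaths_per_min
    )

print(clinical_alri(200, True, 52))   # True: 60-364 days old, rate above 50
print(clinical_alri(400, True, 38))   # False: rate not above 40
```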
All fieldworkers were standardized on anthropometric measurement techniques over 2 days of training,27 and they collected measurements in teams of two. Fieldworkers measured the weight and recumbent length of children <2 years of age, and the weight and standing height of children aged 2–5 years, using infant scales (Tanita 1380, 0.1 kg accuracy) and stadiometers (420 Measure All, 0.1 cm accuracy). Upper-arm circumference was measured for children aged ≥6 months at the mid-point of the upper right arm using a non-elastic tape (0.1 cm accuracy).
We measured water-treatment practices using self-reported behaviour. Families that reported treating their water were classified as ‘confirmed’ if they met the three following criteria: (i) reported treating their water in the previous 7 days, (ii) had treated water in their home at the time of the interview and (iii) could produce the materials they used to treat water. Fieldworkers evaluated the presence of treated water based on self-reported information and a sample (not tested) provided by the family. Treatment materials included a designated pot and storage container for boiling water, plastic bottles for SODIS and liquid bleach or chlorine tablets and a designated storage container for chlorine treatment. Interviewers collected self-reported handwashing behaviour by asking an open question to mothers about when they washed their hands in the past 24 h and coding answers using five critical times: before cooking, eating, or feeding children and after defecation or changing the baby. Interviewers collected information about hygiene and water storage with discrete spot-check observations during the interview.
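The three-part rule for classifying water treatment can be written as a small function. This is only an illustrative sketch: the argument names and the labels for the unconfirmed categories are hypothetical and do not come from the survey instrument.

```python
def water_treatment_status(reported_last_7_days, treated_water_present, materials_shown):
    """Classify a household using the three criteria described in the text."""
    if not reported_last_7_days:
        return "not treating"
    if treated_water_present and materials_shown:
        return "confirmed"
    return "self-reported only"

print(water_treatment_status(True, True, True))    # confirmed
print(water_treatment_status(True, False, True))   # self-reported only
```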
Using daily symptoms reported at the time of the interview, we reconstructed the 48-h retrospective risk period for each child in the study.12 The parameter of interest for all outcomes is the marginal treatment effect conditional on selection into the study based on restriction and propensity score matching. We estimate the parameter as:
E_W*[E(Y | A = 1, W*) − E(Y | A = 0, W*)]
where Y is the outcome of interest, A is an indicator equal to 1 if a child lives in an intervention village and 0 otherwise, and W* is the set of characteristics among intervention villages in the study sample (W* = W | A = 1, at least 50 children <5 years of age). Thus, our inference is limited to the set of intervention villages for which there is a comparable control village based on the village selection method. For self-reported health outcomes, we calculated the difference in the daily longitudinal prevalence between the intervention and control groups. We converted the anthropometric measurements to age- and sex-specific Z-scores using a publicly available algorithm that references the 2006 WHO Growth Standards,28 and calculated the difference in Z-score means.
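For the self-reported outcomes, the unadjusted contrast is a difference in daily longitudinal prevalence, that is, the number of child-days with symptoms divided by the number of child-days observed in each group. The sketch below shows the arithmetic on made-up data with 2 days of observation per child; the function names are hypothetical.

```python
def longitudinal_prevalence(days_ill, days_observed):
    """Child-days with symptoms divided by child-days of observation."""
    return sum(days_ill) / sum(days_observed)

def longitudinal_prevalence_difference(intervention, control):
    """Each argument is a (days_ill, days_observed) pair of per-child lists."""
    return longitudinal_prevalence(*intervention) - longitudinal_prevalence(*control)

# Made-up data: four children per group, each observed for 2 days.
intervention = ([0, 1, 2, 0], [2, 2, 2, 2])
control = ([0, 0, 1, 0], [2, 2, 2, 2])

lpd = longitudinal_prevalence_difference(intervention, control)
print(lpd)  # 0.25
```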
For self-reported health and anthropometric outcomes, we attempted to improve the efficiency of the estimator and control for potential residual confounding using targeted maximum likelihood estimation (TMLE).7 Technical details of our estimation strategy with complete notation and specifications are included in the appendix (available as supplementary data at IJE online). We calculated 95% confidence intervals (CIs) for unadjusted and adjusted estimates using a bootstrap with matched village pairs as the sampling unit to reflect the design and account for correlation between children within villages.
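A bootstrap that resamples matched village pairs, rather than individual children, can be sketched as follows. The text does not state which type of bootstrap interval was used, so the percentile interval here is an assumption, as are the function names and the made-up pair-level totals.

```python
import random

def prevalence_difference(pairs):
    """Each pair holds (ill_days_int, child_days_int, ill_days_ctl, child_days_ctl)."""
    ill_i = sum(p[0] for p in pairs)
    days_i = sum(p[1] for p in pairs)
    ill_c = sum(p[2] for p in pairs)
    days_c = sum(p[3] for p in pairs)
    return ill_i / days_i - ill_c / days_c

def pair_bootstrap_ci(pairs, statistic, reps=2000, level=0.95, seed=0):
    """Percentile CI from resampling whole matched pairs with replacement."""
    rng = random.Random(seed)
    draws = sorted(
        statistic([rng.choice(pairs) for _ in pairs]) for _ in range(reps)
    )
    lower = draws[int(round((1 - level) / 2 * reps))]
    upper = draws[int(round((1 + level) / 2 * reps)) - 1]
    return lower, upper

# Made-up symptom-day and child-day totals for five matched village pairs.
pairs = [(8, 60, 7, 62), (5, 64, 9, 58), (10, 66, 6, 60), (7, 58, 8, 64), (6, 62, 7, 60)]

ci_low, ci_high = pair_bootstrap_ci(pairs, prevalence_difference)
print(prevalence_difference(pairs), ci_low, ci_high)
```

Resampling at the pair level keeps all children from a village together in every replicate, which is what lets the interval reflect within-village correlation.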
We estimated that a sample of 30 villages with 20 households per village would be sufficient to detect a difference of 5.5 percentage points in the longitudinal prevalence of diarrhoea, assuming a prevalence of 15% in the control group29 and 80% power.
The village selection process improved the comparability of intervention and control villages across a range of important, pre-intervention characteristics. Table 1 summarizes pre-intervention covariate means for control and intervention villages and their standardized difference (SD), which is the difference in means expressed in standard deviation units and is a useful measure of balance.14 Before restriction and matching, intervention villages had more households on average (91 vs 55; SD = 67) and a greater proportion of households with tap water (73 vs 53%; SD = 47), latrines (61 vs 53%; SD = 18) and electricity (61 vs 45%; SD = 32). After restriction and matching, balance improved for 7 of 12 covariates (Table 1).
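A standardized difference of this kind can be computed as in the sketch below. The exact formula behind Table 1 is not given in the text, so two choices here are assumptions: the denominator is the square root of the average of the two group variances, and the result is multiplied by 100. The village values are made up.

```python
import math
from statistics import mean, variance

def standardized_difference(intervention, control):
    """Difference in means in pooled standard deviation units, times 100.

    Assumes a pooled SD of sqrt((var_intervention + var_control) / 2).
    """
    pooled_sd = math.sqrt((variance(intervention) + variance(control)) / 2)
    return 100 * (mean(intervention) - mean(control)) / pooled_sd

# Made-up village-level percentages of households with tap water.
intervention = [60.0, 80.0]
control = [40.0, 60.0]

print(round(standardized_difference(intervention, control), 1))
```

Because the measure is scaled by the spread of the covariate rather than by sample size, it can be compared before and after matching even though the number of villages changes.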
Interviewers visited 30 villages (660 households). Of these households, 60 (9.1%) refused to participate in the study: 27/327 (8.3%) in intervention villages and 33/333 (9.9%) in control villages. The final sample included 600 households, 929 children <5 years of age and 1858 child-days of observation. Fieldworkers obtained complete anthropometric measurements for 872 (94%) children.
Intervention and control villages remained well balanced across a wide range of potentially confounding variables in 2007 (Table 2). Of 48 stored water samples, nearly all contained Escherichia coli: only two (4%) samples had MPN <1 per 100 ml, and the mean (standard deviation) log10 E. coli concentration per 100 ml was 1.975 (0.870) in the control and 2.292 (1.033) in the intervention group. In intervention villages, 147 (49%) of study households reported participating in monthly visits from CRS/Caritas promoters at any time since 2003 (only 5% of families could remember specific dates, by month and year, when asked).
Overall, 85% of study households were satisfied with their drinking-water quality, but only 65% of respondents believed their drinking water was clean. The proportion of participating families in intervention villages reporting water treatment dropped from 70% at the end of the intervention to 37% 6 months later. Households in intervention villages were more likely to treat their water than control households based on self-reported activity [33.3 vs 21.0%; Risk Difference (RD) = 0.12, 95% CI 0.01–0.24], and based on confirmed water treatment activity at the time of the visit (8.7 vs 3.3%; RD = 0.05, 0.02–0.09) (Table 3). The primary reason families gave for not treating their water was that it was already clean (48%), followed by bad taste (14%), not interested (11%) and no time (7%).
We did not observe differences between intervention and control groups in self-reported handwashing behaviour, or spot-check observations of hygienic conditions (Table 4). Soap was present in most homes (90%), and its use was similar in intervention and control villages (RD = 0.03, –0.05 to 0.11).
In children <5 years of age, the daily longitudinal prevalence of diarrhoea and HCGI during the measurement period was 11.9 and 12.6%, respectively. Intervention and control groups did not differ in diarrhoea [Longitudinal Prevalence Difference (LPD) = 0.004, 95% CI –0.051 to 0.058] or HCGI (LPD = 0.005, –0.054 to 0.065) (Table 5). Respiratory illness was common among children in the study: the daily longitudinal prevalence of cough or difficulty breathing was 30.0% and clinical ALRI was 6.9%. We observed no differences between the intervention and control groups in the longitudinal prevalence of cough or difficulty breathing (LPD = 0.012, –0.097 to 0.137) or ALRI (LPD = 0.019, –0.028 to 0.078).
Study children were generally well nourished but, consistent with our acute self-reported health outcomes, we found no differences in anthropometric measures between children living in intervention and control villages (Table 6). Adjustment for a large set of potential confounding variables using targeted maximum likelihood did not change the unadjusted results (see appendix, available as supplementary data at IJE online).
To our knowledge, this is the first post-intervention follow-up study of a combined household water treatment and handwashing behaviour change intervention, and the first to extend propensity score matching and targeted maximum likelihood estimation to the design and analysis of a pre-existing intervention. The absence of child health impacts is consistent with the modest improvement we observed in water treatment behaviour (Table 3), no detectable differences in handwashing behaviour, and highly contaminated living environments (Table 4). These findings are consistent with efficacy trials of household water treatment that have found that health impacts are contingent on compliance.2,3 The large difference between self-reported and confirmed water treatment (Table 3) suggests that self-reported water treatment behaviour overestimates actual practice. Schmidt and Cairncross recently outlined the problems of self-reported health outcomes in non-blinded studies of household water treatment.30 Our self-reported health outcomes likely suffer from less reporting bias because we did not make frequent, repeated visits, we used a health calendar to collect symptoms, and we limited recall to 48 h. Our objective anthropometric outcomes are an important complement to the self-reported outcomes, and the null treatment effect is consistent across all outcomes.
Our confirmed water treatment adoption in intervention households (9%) is lower than water treatment adoption reported after a CARE/Madagascar Safe Water System (SWS) campaign, which promoted chlorine treatment with safe storage. Ram et al. found that 54% (29/54) of households had detectable free chlorine in their stored water 18 months after the campaign.31 Parker et al. also reported higher sustained adoption after a clinic-based SWS and handwashing intervention: 71% (36/51) of households had detectable free chlorine 1 year after the intervention.32 Our water treatment behaviour results are consistent with those of Luby et al., who found 5% (22/462) of households regularly treating their water 6 months after the completion of a year-long household flocculant-disinfectant intervention trial in Guatemala.33 Our handwashing and hygiene findings suggest that the presence of soap is common even in the absence of heightened promotion, but that self-reported handwashing remains infrequent around all key activities except cooking (Table 4). This finding contrasts with two earlier studies that report sustained handwashing behaviour change many years after short-duration interventions, though neither study included an adequate control group.34,35
Our results demonstrate that with available pre-intervention secondary data, the careful selection of a study population in the design stage can greatly improve the baseline comparability of intervention and control groups in the evaluation of a pre-existing intervention. Prospective, randomized designs have implemented pair matching on one or two variables such as baseline illness or community size to help improve the comparability of treatment arms.36 The limitation of one- or two-variable matching in non-randomized designs is that implementing organizations usually rely on many (or ill-defined) characteristics to choose intervention recipients, and matching on one or two covariates is unlikely to balance a large set of potential confounders. Propensity score matching simplifies multivariate matching by accommodating continuous covariates and reducing a large set of matching characteristics to a single scalar. Restriction and matching limit inference to the population ultimately included in the study, but when interventions are targeted to a subset of the population, making inference to segments of the population that do not share characteristics with those treated must rely on extrapolation.
There are limitations to our study. Our design does not include baseline outcome measurement. It is possible that intervention villages were in worse health condition than controls before the intervention, and that their health improved to control levels by 2007. We think this scenario is unlikely given the limited behaviour change we observed and the comparability of intervention and control villages across a broad range of demographic, socio-economic and environmental characteristics in both 2002 and 2007. Secondly, we only measured outcomes at one point in time, and it is possible that we misclassified families with respect to behaviour and illness, since these characteristics likely vary over time. We attempted to reduce misclassification by using measures of water treatment and hygiene that did not change rapidly over time, and by supplementing self-reported health outcomes with anthropometric measurements. Thirdly, only 49% of intervention households reported participating in the intervention. This modest participation rate may have diluted the treatment effect sufficiently to lead to a null finding with respect to effectiveness but is itself an important finding with respect to future implementation. Comparing the subgroup of participating intervention households to non-participants in unadjusted and adjusted analyses did not change our conclusions (see appendix available as supplementary data at IJE online).
A final limitation is that our cross-sectional measurement does not ultimately resolve whether the intervention was sustainable. Two scenarios are consistent with our results: (i) the intervention successfully increased water treatment behaviour among participating families, but the new behaviours were not sustained after intervention completion, or (ii) the intervention never led to behaviour change and there was nothing to sustain. The only available reference point to evaluate these scenarios was an end-of-intervention survey conducted by the implementing organization in which 70% of participating households in our study villages reported consistent household water treatment. Whereas this estimate is likely biased upward, in our survey 6 months after the intervention, 33% of intervention village households self-reported that they treat their water, a measurement prone to similar upward bias (Table 3). Taken together, these measurements suggest that water treatment likely tapered off after activities ceased. Future studies could address sustainability more rigorously by collecting measurements at the end of the intervention period followed by identical measures later to capture changes over time.
Six months after a 3-year intervention in rural Guatemala we observed minimal sustained water treatment and handwashing behaviour, and no detectable impacts on acute gastrointestinal, respiratory or anthropometric measures. Our findings highlight the difficulty of achieving sustained new behaviour adoption in the context of non-research intervention campaigns. Future research in this sector should focus on identifying techniques to improve and sustain behaviour adoption that implementing organizations can use in development programs. Our study design provides a useful template for effectiveness evaluations of pre-existing intervention campaigns initiated outside of formal research activities.
Supplementary data are available at IJE online.
Institute for Public Health and Water Research (http://www.ipwr.org). Funding to pay the Open Access publication charges for this article was provided by The Berkeley Research Impact Initiative (BRII).
The authors gratefully acknowledge Nazario Lopez and the interviewer team for long hours in the field, and Maricruz Alvarez for all laboratory analyses. Rodrigo Gramajo Rodriguez and Andri Christen provided valuable input into the questionnaire design and its translation. Fundación SODIS personnel Matthias Saladin and Alvaro Solano, and former Caritas personnel Marina de Lantán, Edna Mendoza and Carlos Miguel Loyo provided important information about the intervention and input into the questionnaire. Jan Hattendorf provided valuable comments on a draft of this manuscript.
Conflict of interest: None declared.