Despite the success of extensive control measures that have been implemented in China for over 50 years, the number of individuals infected with Schistosoma japonicum remains high in the remaining endemic areas. A variance components analysis was undertaken to estimate the heritable and environmental components that contribute to S. japonicum infection in the Poyang Lake region of Jiangxi Province, PR China. The total target population was 3148 from four separate administrative villages. Two thousand seven hundred and five of these comprised 400 families ranging in size from 3 to 188. After adjustments were made for gender, water contact and past history of having had schistosomiasis, the heritable component was estimated to account for as much as 58% of the phenotype variation under the polygenic model. Household was not shown to be an important environmental factor. Incorporating village effects indicated that the results were valid for the total population. We conclude that genetic heritability in this region is high and plays an important role in determining risk of infection with S. japonicum.
Schistosomiasis, caused by Schistosoma japonicum, has existed in China for over 2000 years, infecting over 12 million Chinese before control measures were introduced in the 1950s. Despite significant reductions in prevalence, human infection remains high in remaining endemic areas. This reflects largely the lifestyle and economy of the areas, where local inhabitants rely on water sources often contaminated with infective schistosome cercariae, for washing, cooking, fishing and irrigation. Although treatment of human populations with praziquantel remains highly effective in curing infection, it does not prevent re-infection (Wu et al., 1993; Wu et al., 1994). This repeated infection and long term exposure results in chronic disease causing an even greater economic drain on the public health sector. Further, the reassessment of schistosomiasis-related disability, combined with recent information on the global prevalence of schistosome infection, indicates that the true public health burden of schistosomiasis is substantially greater than previously appreciated (King et al., 2005).
A number of studies carried out on human helminths have identified a genetic association with infection. In schistosomiasis mansoni, segregation and linkage studies identified a major gene (SM1) controlling infection intensity (Abel et al., 1991) that was subsequently mapped to the 5q31-q33 region (Marquet et al., 1996; Marquet et al., 1999; Muller-Myhsok et al., 1997). A second gene, controlling advanced disease caused by Schistosoma mansoni (SM2), has also been identified, mapping to 6p21-q21 (Dessein et al., 1999a). Both gene locations are closely linked to, or contain, known genes encoding a wide range of cytokines that have shown strong association with infection and disease. Although variance components analysis failed to detect any genetic effects controlling Schistosoma haematobium infection (King et al., 2004), a strong association was identified between infection intensity and the 5q31-q33 region of the genome (Kouriba et al., 2005).
In this study, we measured the variation in infection and infection intensity in a Chinese population with a high level of exposure to S. japonicum. Using a variance components analysis we estimated the variances attributable to shared common household effects and additive genetic effects and assessed the importance of these components in determining susceptibility to infection.
The study population comprised four administrative villages: Aiguo, Dingshan, Fuqian and Xindong, located in the Poyang Lake region, Jiangxi Province, P.R. China. The geo-coordinates of the four villages (and the shortest distance to the water contact sites) are E 116.37° N 28.73° (40 m), E 115.97° N 29.12° (100 m), E 116.42° N 28.89° (500 m) and E 116.69° N 28.85° (200 m), respectively. Each administrative village comprised five to eight smaller ‘natural’ villages.
The total population of the four administrative villages comprised 3123 individuals. All residents were initially listed and allocated personal identification codes (PID), which included information on administrative village, natural village, house identification (id) and household member id. They were interviewed by questionnaire (Ross et al., 1997) to obtain demographic information, schistosomiasis history and water contact.
A questionnaire was used to assess water contact frequency and mode during spring, summer and autumn. Frequency was measured on a basic scale as ‘not at all’, ‘once a month’, ‘twice a month’ and ‘at least once a week’ and given values of 0–3. The average water contact over all three seasons was used as a measure of exposure. Main modes of water contact were also noted (i.e. fishing, washing, bathing, farming). Children under the age of five were excluded from the study.
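The exposure measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `FREQUENCY_SCORE`, `SEASONS` and `exposure_score` are hypothetical, and returning `None` for incomplete answers is an assumption based on the rule, stated later in the paper, that individuals with missing water contact data for any season were given missing phenotypes.

```python
from statistics import mean

# Frequency categories and their 0-3 values, as described in the text.
FREQUENCY_SCORE = {
    "not at all": 0,
    "once a month": 1,
    "twice a month": 2,
    "at least once a week": 3,
}

SEASONS = ("spring", "summer", "autumn")


def exposure_score(responses):
    """Average the 0-3 frequency score over the three seasons.

    Returns None when any season is missing (assumed handling of
    incomplete questionnaires; hypothetical helper, not from the paper).
    """
    if any(responses.get(season) is None for season in SEASONS):
        return None
    return mean(FREQUENCY_SCORE[responses[season]] for season in SEASONS)


# Hypothetical respondent: scores 1, 3 and 2 across the three seasons.
example = {
    "spring": "once a month",
    "summer": "at least once a week",
    "autumn": "twice a month",
}
score = exposure_score(example)
```

Averaging over seasons treats the four categories as an equally spaced scale, which is a simplification inherent in the 0–3 coding rather than a property of the underlying exposure.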
Family data were collected using a questionnaire. All subjects in the study population were interviewed regarding their first degree (biological) relatives, i.e. parents, children and siblings. Name, sex, date of birth and PID were recorded for each family member to ensure correct identification. Questionnaires were checked for inconsistencies within families and extended pedigrees were identified using the questionnaires. In-laws who were not connected to the family by children (and thus not biologically related to anyone else in the family) were not included. Individuals belonging to a household but who did not belong to the family were also excluded and, therefore, were not used in the household analysis. The pedigree data were managed in Access. Any errors detected during the identification of the extended family were corrected, where possible, or the affected records were excluded from further analysis. All adjustments to the data were recorded. The final pedigree data consisted of 2705 individuals belonging to 400 extended pedigrees ranging in size from three to 188 and included as many as four generations.
Parasitological examination of the study population was carried out using the Kato-Katz thick smear technique (Katz et al., 1972). In order to maximise sensitivity of the technique, all village members were asked to provide two stool samples (Ross et al., 1998a; Berhe et al., 2004). Containers were given to each village member labelled with their personal identification code (PID) and stool number. The second stool was collected at least 2 days after the first. Three slides were each labelled with the PID and prepared from each stool at the village site within 24 h of receipt. Each slide for each stool had a separate code, which ensured blind readings and eliminated errors due to duplicate reading.
Ten percent of positive slides were re-read for quality control (Raso et al., 2004; Utzinger et al., 2000). An overall compliance of 89.77% was obtained for individuals providing two stools. Infection was diagnosed as positive if there was ≥1 egg in any of the slides. Intensity of infection was measured by egg count (eggs per gram of faeces; EPG) and was shown to be highly over-dispersed in the population, with only 1% of individuals accounting for over 70% of the total egg count of the population. A natural log (Ln(EPG+1)) transformation has been a popular method of normalising infection intensity data in helminth infections. However, after transformation, skewness and kurtosis remained significantly higher than zero. A Blom transformation was then applied (Blom, 1957), which requires ranking each measure and adjusting the scale distances between the ranks to achieve an approximately normal distribution. The transformation was unsuccessful in normalising the data (P <0.0001) and, due to non-normality of the measure of infection intensity, the phenotype of EPG was excluded from all further analysis.
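The two transformations mentioned above can be illustrated with a short Python sketch. It assumes the usual form of the Blom score, the standard normal quantile of (r − 3/8)/(n + 1/4) for rank r among n values, with tied values given their average rank; the function names and the toy egg counts are hypothetical and are not study data.

```python
import math
from statistics import NormalDist


def log_transform(epg):
    """Natural log transformation Ln(EPG + 1)."""
    return [math.log(x + 1) for x in epg]


def blom_transform(values):
    """Rank-based normal scores using Blom's (r - 3/8) / (n + 1/4)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


# Toy egg counts (hypothetical): mostly zeros with a few heavy infections.
toy_epg = [0, 0, 0, 0, 0, 0, 8, 24, 0, 400]
toy_scores = blom_transform(toy_epg)
```

In this toy example every zero count receives the same score, so the transformed values still pile up at a single point. A large share of egg-negative individuals would have a similar effect, which is one plausible reason, offered here as an assumption, why a rank-based transformation may fail to produce normality in such data.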
All data were double entered into an Access database. SPSS 13.0 was used for regression analysis. ‘FCOR’ of the S.A.G.E. package (Statistical Analysis for Genetic Epidemiology, Release 5.0) was used to calculate multivariate familial correlations and their asymptotic standard errors for all pair types available in the pedigree data. ‘FCOR’ does not allow the phenotype to be adjusted for covariates within the program; thus standardised residuals of the regression analysis were used.
Variance components analysis was carried out using the SOLAR software package (Almasy and Blangero, 1998; Duggirala et al., 1997) to assess the importance of common household and genetic factors for infection status. This attempts to decompose residual variance of a regression model into fixed and random components. This allows parameters to be estimated for household, polygenic, household/polygenic and sporadic models and allows inferences as to how much of the variation is attributable to the different effects. The simple variance components model can be denoted by:
\[ V_{tot} = V_a + V_c + V_e \]

where $V_{tot}$ is the total variance, which is partitioned into additive genetic ($V_a$), common shared environment ($V_c$) and residual environmental ($V_e$) effects. The algorithm obtains maximum likelihood estimates of all parameters in the model. All models are nested within the general (household/polygenic) model, in which all parameters are estimated. Household and polygenic models are assessed by constraining the necessary parameters at zero ($V_a$ and $V_c$, respectively). A sporadic model fixes both of these variance components at zero and allows inferences to be made on the significance of $V_a$ and $V_c$. Model comparison is achieved through chi square values, with one degree of freedom, which can be calculated as twice the difference in the log-likelihoods. Heritability is estimated as the standardised value of the additive effect ($V_a / V_{tot}$).
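The model comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the SOLAR implementation: the function names and argument names (`v_a` for the additive genetic variance, `v_c` for the shared household variance, `v_e` for the residual environmental variance) are hypothetical, and the P value is taken from a chi square distribution with one degree of freedom exactly as the text states, using the identity that its upper tail equals erfc(sqrt(x/2)).

```python
import math


def heritability(v_a, v_c, v_e):
    """Standardised additive effect: additive variance over total variance."""
    return v_a / (v_a + v_c + v_e)


def likelihood_ratio_test(loglik_general, loglik_restricted):
    """Compare two nested models by twice the log-likelihood difference.

    Returns the test statistic and its P value from a chi square
    distribution with one degree of freedom.
    """
    statistic = max(2.0 * (loglik_general - loglik_restricted), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for a general and a restricted model.
statistic, p_value = likelihood_ratio_test(-100.0, -104.5)
```

Because a variance component cannot be negative, the null value of zero lies on the boundary of the parameter space, and some software halves this P value to account for that. The sketch follows the one degree of freedom description given in the text and makes no such adjustment.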
Written ethical approval for this study was obtained at the national, provincial and village levels within China, and approval for the study was granted by the ethics committees of Jiangxi Provincial Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai, Hunan Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention and the Queensland Institute of Medical Research, prior to commencement. Study participants identified as stool egg-positive for schistosomiasis were treated with 40 mg/kg of praziquantel, the dosage currently recommended by the WHO. Oral informed consent was obtained from all adults and from parents or guardians of minors who were involved in the project.
The average schistosomiasis prevalence in all pedigrees was calculated to be 14.4%. The population numbers (and prevalence estimates) of individual villages were 697 (13.9%) in Aiguo village, 716 (19.5%) in Xindong village, 588 (12.2%) in Fuqian village and 704 (12.1%) in Dingshan village. The relationship between infection, sex, age (grouped in 10 year intervals) and water contact was investigated. Fig. 1a shows there were no significant differences in prevalence between the age groups.
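As a quick consistency check on the figures above, the village numbers can be combined into a size-weighted prevalence. The sketch below is illustrative only; the variable names are hypothetical and the case counts are back-calculated from rounded percentages, so they are approximate.

```python
# Village population numbers and prevalence estimates as reported in the text.
villages = {
    "Aiguo": (697, 0.139),
    "Xindong": (716, 0.195),
    "Fuqian": (588, 0.122),
    "Dingshan": (704, 0.121),
}

total_individuals = sum(n for n, _ in villages.values())

# Approximate number of infected individuals implied by the rounded percentages.
approx_cases = sum(n * prevalence for n, prevalence in villages.values())

pooled_prevalence = approx_cases / total_individuals
```

The village numbers sum to 2705, matching the size of the final pedigree data, and the weighted prevalence comes out at roughly 14.5%, in line with the reported 14.4% once rounding of the village percentages is allowed for.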
The relationship between water contact and age is depicted in Fig. 1b. Water contact was low in children and increased rapidly with age, peaking in the 35–44 years age group. There were no significant differences in water contact between the sexes. A general increase in prevalence was observed with increasing water contact, probably reflecting the seasonal transmission of S. japonicum (Fig. 1c).
A multivariate logistic regression was carried out and the covariates found to be significant are shown in Table 1. These included sex and water contact and the results correlated well with the data presented in Fig. 1b and c. Schistosomiasis history variables: acute disease, number of previous treatments and schistosomiasis infection in the preceding 2 years were also found to be significant. The number of previous treatments was then categorised and the relationship between number of treatments (category) and infection status, depicted in Fig. 1d, implies that the higher the number of previous treatments, the higher the risk of current infection.
The results (Table 1) also indicate that the chances of having a current infection were halved if treatment for schistosomiasis had been received in the previous 2 years. It is noteworthy that acute disease is an indicator of infection and the obtained odds ratios indicated that an individual who had had acute disease was at almost twice the risk of infection as someone who had not (odds ratio (OR)=1.9057). The analysis also showed a significant interaction between previous treatments and acute disease.
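An odds ratio from a logistic regression is the exponential of the fitted coefficient, and it acts on the odds rather than directly on the risk. The sketch below illustrates this with the reported OR of 1.9057; the function names are hypothetical, and the use of 14.4% (the overall prevalence) as the baseline risk is an assumption made purely for illustration, not the adjusted baseline of the fitted model.

```python
import math


def odds_ratio_from_coefficient(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def risk_after_odds_ratio(baseline_risk, odds_ratio):
    """Apply an odds ratio to a baseline risk and convert back to a risk."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


reported_or = 1.9057
implied_beta = math.log(reported_or)  # coefficient that would give this OR

# Illustrative baseline only: overall prevalence used as a stand-in.
risk_with_acute_history = risk_after_odds_ratio(0.144, reported_or)
```

With this illustrative baseline the risk rises from 14.4% to a little over 24%, which is somewhat less than double. The odds ratio approximates the risk ratio only when the outcome is rare.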
All individuals who had missing water contact data for any of the seasons and/or had not provided two stool samples were allocated missing phenotypes for the family correlations and the variance components analysis. Given the limitations of ‘FCOR’ and the inability to adjust for covariates within the program, correlations were calculated using standardised residuals of the logistic regression. The pair numbers and correlations for first degree family members and spouses are presented in Table 2.
The correlation between spouses for the adjusted phenotype was low and not significant, indicating that there were no shared environmental effects detectable between these individuals in this study. The parent-offspring correlation (0.0840) was significant, and the sibling correlation (0.1735) was approximately double the parent-offspring correlation and highly significant. This pattern is indicative of a genetic involvement with little or no environmental effect. Homogeneity tests of correlations among subtypes showed no significant differences between mother-offspring and father-offspring correlations or between sex-specific sibling correlations.
Four models were initially tested for the infection status phenotype: sporadic model, polygenic model, household model and household polygenic model. Standardised parameters were estimated as shown in Table 3. Maximum likelihood estimates were obtained for each model and were used for model comparison by likelihood ratio tests (LRT). Covariates found to be significant in the regression model were included in the model and screened for significance. From Table 3 it is evident that under the household model, the effect due to shared home accounted for 32% of the phenotype variation in the data, being highly significant when compared to the sporadic model seen in Table 4 (P <0.0001). The polygenic effect under the genetic model estimated 58% of the variation to be attributable to additive genetic effects and was also highly significant. The models were then compared with the polygenic and household model to assess whether they remained significant in the presence of each other (Table 4). The comparison results showed that the genetic effect was highly significant in the presence of the household effect (P = 0.0024). However, incorporation of a household effect into the polygenic model did not significantly improve the polygenic model (P=0.0774), implying that the data favour the polygenic model over the household model. The maximum likelihood of the polygenic and household model without test covariates was estimated at −660.01 indicating that the variables were significant in the analysis (P<0.0001). A Kullback-Leibler R-squared value of 0.0405 was obtained indicating the covariates account for 4% of the variation.
Given the potential for environmental differences outside the household at the natural village and administrative village level due to geography, new models were run to test for confounding effects (Table 3). The estimated effect of natural village on infection status was 8.5%. This effect did not significantly improve the sporadic model and was rejected. The administrative village effect estimate was also low (1.4%) and the model was strongly rejected. Both effects were shown to have a zero effect when incorporated into the polygenic model implying that homogeneity existed across all village groups, and that the genetic effect estimate held true for the total population.
This study presents a pedigree analysis that has investigated the host genetic component associated with human S. japonicum infection in China. Many regression and variance components analyses of helminth infections have focused on infection intensity as a dependent variable to maximise power of the analysis. The underlying assumption in such analyses is that the phenotype is normally distributed, which is required for the correct estimation of standard errors and likelihood estimates. Commonly, count data are transformed to a logarithmic scale which has been the case in previous genetic studies of schistosome infections but, in our data, normality was not achievable. This was due to the over-dispersion of egg counts seen in this population which is likely to be due to the high level of praziquantel treatment in this region, giving rise to fewer individuals with moderate to high infection intensity. When over-dispersion occurs, ignoring it will result in underestimating the standard errors of the parameter estimates, which may lead to incorrect conclusions. For these reasons, only infection status (infected vs. non-infected) was used in this analysis.
Analysis of the data demonstrated the importance of gender in the risk of infection possibly reflecting behavioural differences between the sexes in the study population. There were no significant differences between the age groups, however, which provided no evidence to support the concept of age-acquired immunity to schistosomiasis japonica (Ross et al., 1998b). Basic water contact was assessed only by questionnaire and relied solely on the individual recall, thus increasing the risk of bias. Despite this, however, a general increase in prevalence was observed with an increase in water contact that was shown to be clearly important as an indicator of infection, further substantiated by the regression analysis showing it to be highly significant.
Given the extensive control strategies implemented in China and the wide-scale use and availability of praziquantel, it is not surprising that schistosomiasis history was shown to be a relevant factor for infection in this study. Previous studies have targeted only families, where treatment was not common, using only those who had not received treatment in the preceding 2 years. However, 46% of this study population had previously been treated with praziquantel, of which 49% had received treatment in the preceding 2 years. This is mainly through government control programs in China and also due to increased health awareness, resulting in a high level of self-treatment. Our questionnaire analysis indicated that some individuals self-treated as often as three times a year which could explain, at least in part, the over-dispersion of egg counts in this study population.
It is arguable therefore that among those who regularly self-treat, the study is actually measuring susceptibility to reinfection. However, for the remaining 54% who have never been treated, the study is measuring susceptibility to first time infection. Given that treatment with praziquantel does not provide any immunity to re-infection, and there is no evidence to support age-acquired immunity in these communities, susceptibility to infection and to re-infection should not be viewed as different.
Whilst having had schistosomiasis in the previous 2 years indicated a lower risk of current infection, individuals who had experienced acute disease at some point in their life-time were twice as likely to be infected (OR=1.9057). Similarly, the data showed that the number of previous treatments correlated positively with infection. The high number of treatments (and previous history of acute disease) could be a result of high levels of exposure (due to occupation, proximity of living area to water, etc.) which, in turn, would account for the current high risk of infection in a particular individual, rather than a predisposition to infection or re-infection.
Familial correlations shown in Table 2 demonstrate a clear lack of environmental involvement in the aggregation of infections in families by the low and non-significant spousal correlation. This is further supported by the parent-offspring and sibling correlations which were significant (0.084 and 0.1735, respectively). The sibling correlation is almost exactly twice that of the parent-offspring suggesting either genetic dominance or a special sibship environment. There were no significant differences between the maternal and paternal correlations. It is important to note that the collection of pedigree data by questionnaire alone can lead to false identification of family members. Although this was minimised by interviewing each member individually and checking for inconsistent answers, paternity might always remain questionable. The divorce rate in these villages, however, is very low and, given the sociology of these communities and views towards monogamy, the numbers of wrongly identified father-offspring pairs would probably be minimal.
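The reasoning behind this interpretation can be made explicit with the textbook expectations for a quantitative trait: under a purely additive model the parent-offspring and full-sibling correlations are both half the heritability, while dominance variance or a shared sibship environment inflate only the sibling correlation. The sketch below applies these expectations to the reported correlations. The function and parameter names (`h2`, `d2`, `c2_sib`) are hypothetical, and because the correlations were calculated on standardised residuals of a binary phenotype, the resulting figures are rough illustrations that are not directly comparable with the 58% estimate from the variance components model.

```python
def expected_correlations(h2, d2=0.0, c2_sib=0.0):
    """Expected familial correlations for a quantitative trait.

    h2: proportion of variance due to additive genetic effects
    d2: proportion due to dominance effects
    c2_sib: proportion due to an environment shared only by siblings
    """
    return {
        "parent_offspring": 0.5 * h2,
        "sibling": 0.5 * h2 + 0.25 * d2 + c2_sib,
    }


# Correlations reported in the text.
r_parent_offspring = 0.0840
r_sibling = 0.1735

# Additive proportion implied by the parent-offspring correlation alone.
h2_implied = 2.0 * r_parent_offspring

# Part of the sibling correlation that an additive model leaves unexplained.
sibling_excess = r_sibling - r_parent_offspring
```

An additive model that reproduces the parent-offspring correlation predicts the same value for siblings, so the excess of about 0.09 has to come from the `d2` or `c2_sib` terms, which is the basis for attributing the pattern to dominance or a special sibship environment.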
Although the power was dramatically reduced in this study as a result of using a binary phenotype, a strong genetic component was still detectable in this population accounting for as much as 58% of the variation in infection. It is noteworthy, however, that although household was not a significant factor in the polygenic model, effect of household was reasonably high. This could imply a potential lack of power to detect the household component which may be a result of using the binary phenotype. The lack of power could also be a result of having ‘older’ families, a consequence of the one child policy in China, introduced in 1979. Large, informative sibships only exist, therefore, in those over the age of 25 who no longer cohabit. The implication of this on the analysis is that the genetic effect could in fact be lower, but under the polygenic/household model, it still remains reasonably high (42%) and significant.
To ensure that the results of the analysis were reflective of the total population, differences in environment relating to administrative village and natural village were also assessed. The resulting estimates indicated there were no significant (P <0.05) environmental differences between these groups. Furthermore, incorporation of the ‘village effects’ into the polygenic model reduced the effects to zero, thus indicating homogeneity across all natural and administrative villages. The estimated effect of the covariates in the model accounted for 4% of the variation. In general, the simplest, most parsimonious model is favoured which, in this study, is the polygenic model. This provides evidence of a strong genetic component controlling infection with S. japonicum that accounts for as much as 58% of the phenotype variation in all four locations of Poyang Lake.
Variance components analyses have been widely used to assess genetic heritability in human helminth infections. Chan et al. (1994) identified familial aggregation and familial predisposition to Ascaris lumbricoides and Trichuris trichiura infection in an urban community in Kuala Lumpur, Malaysia. A genetic analysis of hookworm infection in Zimbabwe (Williams-Blangero et al., 1997) estimated a heritability of 0.37 in the population indicating genetic factors to be responsible for 37% of the variation (after correcting for confounding environmental factors) seen in faecal EPG. A strong genetic effect (37–44%) was also detected in intestinal schistosomiasis caused by S. mansoni in a population in Brazil (Bethony et al., 2001; Bethony et al., 2002) and shared household environment was also shown to be important and accounted for 12–21% of the phenotypic variation seen in infection intensity. In contrast, a family based study in Kenya that investigated the aggregation of S. haematobium infection, obtained a low heritability score for the trait (King et al., 2004). This was reduced further when the data were adjusted to include common household effects indicating heritability of susceptibility was low and that, should a genetic component exist, its role in determining individual risk for infection or disease was minimal. No such family based analyses have yet been carried out on S. japonicum but putative resistance and susceptibility traits have been shown in small population studies in China (Ross et al., 1998b; Li et al., 1999) and the Philippines (Acosta et al., 2002).
Host genetic involvement has also been demonstrated through segregation analyses which have identified a co-dominant major gene (SM1) controlling infection intensity with S. mansoni (Abel et al., 1991). Combining linkage and segregation analysis mapped the gene to the 5q31-q33 region of the human genome (Marquet et al., 1996; Marquet et al., 1999; Muller-Myhsok et al., 1997). This was further supported by a genome-wide search which again identified the 5q31-q33 region to contain the SM1 gene (Marquet et al., 1999; Zinn-Justin et al., 2001). Although variance components analysis failed to detect any genetic involvement in S. haematobium infection (King et al., 2004), recent genetic studies have shown there to be an association between infection and the 5q31-q33 region (Kouriba et al., 2005).
This study represents the first step in understanding the genetics of infection with S. japonicum. Further identification of the nature of the genetic effect, its mode of transmission, and its location in the genome could reveal new insights into the mechanisms of resistance to schistosome infection, and indicate similarities and differences in host resistance to the different schistosome species infecting humans. The implications of such studies for future control strategies would include more targeted chemotherapy for susceptible individuals, reducing the cost of mass chemotherapy and reducing morbidity caused by high and frequent re-infection at a lower cost to the health care system in China.
Some of the results of this paper were obtained by using the program package S.A.G.E., which is supported by a U.S. Public Health Service Resource Grant (RR03655) from the National Center for Research Resources. This study was supported by the National Institute of Allergy and Infectious Diseases (NIAID) (Tropical Medicine Research Center grant 1 P 50AI-39461), Wellcome Trust and a National Health and Medical Research Council of Australia and Wellcome Trust (UK) International Collaborative Research Grants Scheme Award.