There is growing epidemiological and molecular evidence that ABO blood group affects host susceptibility to severe Plasmodium falciparum infection. The high frequency of common ABO alleles means that even modest differences in susceptibility could have a significant impact on the health of people living in malaria endemic regions. We performed an association study, the first to utilize key molecular genetic variation underlying the ABO system, genotyping >9000 individuals across 3 African populations. Using population- and family-based tests we demonstrated that alleles producing functional ABO enzymes are associated with greater risk of severe malaria phenotypes (particularly malarial anemia) in comparison with the frameshift deletion underlying blood group O: Case-control allelic odds ratio (OR) 1.2, 95% confidence interval (CI) 1.09 – 1.32, P=0.0003; Family-studies allelic OR 1.19, CI 1.08 – 1.32, P=0.001; Pooled across all studies allelic OR 1.18, CI 1.11 – 1.26, P=2×10^−7. Analyzing the family trios we found suggestive evidence of a parent-of-origin effect at the ABO locus. Non-O haplotypes inherited from mothers, but not fathers, are significantly associated with severe malaria (likelihood ratio test of Weinberg, P=0.046). Finally we used HapMap data to demonstrate a region of low FST (−0.001) between the three main HapMap population groups across the ABO locus, an outlier in the empirical distribution of FST across chromosome 9 (~99.5 – 99.9th centile). This low FST region may be a signal of longstanding balancing selection at the ABO locus, caused by multiple infectious pathogens including P. falciparum.
A link between the ABO blood group system and malaria susceptibility has long been suspected. Significant associations between blood group and P. falciparum malaria have been reported from cross sectional and case control studies in Brazil, Gabon, India, Sri Lanka and Zimbabwe (1-7) (see also recent reviews and references therein (8, 9)). However other studies in Colombia, India, Sudan and Nigeria could not find an association between malaria and blood group (10-15). The positive association studies have consistently suggested that blood group O individuals are relatively protected from severe malaria.
Human erythrocytes infected with mature forms of the Plasmodium falciparum parasite adhere to uninfected red blood cells, endothelia and other components of the vascular space (16). This adhesive behavior is mediated by P. falciparum erythrocyte membrane protein 1 (PfEMP1), which is encoded by a family of highly variant parasite genes and subject to switching during the course of an infection (17). A range of human host molecules binding to PfEMP1 have been identified including CD36 (18), Intercellular adhesion molecule (ICAM)-1 (19) and Complement receptor-1 (CR1) (20). The ABO A and B antigens have been implicated in the formation of ‘rosettes’ the process by which infected red blood cells (iRBCs) surround themselves with uninfected erythrocytes (21, 22). PfEMP1 has been identified as the rosetting ligand of the parasite (20) and the blood group A antigen has been shown to bind to the semiconserved head structure of PfEMP1 (23). Erythrocyte rosetting is linked to the pathogenesis of severe malaria phenotypes such as cerebral malaria (CM) and severe malarial anemia (SA) (24, 25). Indeed fresh isolates of P. falciparum from Kenyan children with severe malaria bind A antigen more frequently than strains from children with mild disease (26). Recent work on Kenyan and Malian isolates suggests a significant reduction in rosetting in blood group O individuals compared with non-O blood groups, furthermore parasites required rosetting activity for blood group O to offer protection from severe disease (27). Wild and laboratory parasite strains demonstrate blood group preferences, which appear to vary geographically. In general rosette formation has been shown to occur preferentially with blood group A, B or AB erythrocytes, and particularly groups A and AB in studies of African strains (21, 22, 28, 29).
Previous epidemiological studies have relied on serological phenotyping to infer host ABO genotype. Although convenient, serological studies of the ABO system are unable to discriminate all genotypes (e.g. AO heterozygotes from AA homozygotes) and cannot assess subtler coding or non-coding variation. We took the novel approach of investigating the molecular genetic variation underlying this system. The ABO glycosyltransferase performs the terminal step in the biosynthesis of the ABO macromolecule, adding sugar residues to the precursor H antigen. The enzyme adds either N-acetylgalactosamine to form the A antigen or galactose to form the B antigen, or is functionless, leaving the H (or O) antigen unmodified. A variety of polymorphisms have been reported in the ABO glycosyltransferase gene (30). The common key functional variants are: (i) a one nucleotide deletion in exon 6 (codon 87) leading to a reading frameshift and premature termination of the polypeptide before the N terminal catalytic domain, producing the functionless O allele (31). Alternative ‘non-deletional’ O alleles exist but are rare in African populations (32). (ii) Four non-synonymous single nucleotide polymorphisms (SNPs) (altering residues 176, 235, 266 and 268) which switch enzyme function from A transferase to B transferase activity. The third and fourth SNPs (codons 266 and 268) have the greatest effect on determining the nucleotide-sugar donor used by the transferase; the second (codon 235) also affects nucleotide-sugar specificity but to a lesser degree; finally, the most 5′ SNP (codon 176) has very little influence on donor specificity (33).
We designed genotyping assays for the O allele frameshift mutation (rs8176719) and the three key A/B nonsynonymous SNPs in codons 235 (rs8176743), 266 (rs8176746) and 268 (rs8176747) in the catalytic domain. We used a multi-centre study design employing samples from three African regions: Gambia (West Africa), Malawi (South Central Africa) and Kenya (East Africa). We used two study types: first, population-based association studies in each of the regions employing children with severe malarial disease and local cord blood controls; and second, family-based association studies at each site looking for transmission distortion of ABO alleles between parents and affected offspring, an approach which has the advantage of being robust to population stratification. Our aims were: (i) given the presence of substantial linkage disequilibrium (LD) between the functional SNPs in the ABO locus, to define the most efficient marker set for large-scale genotyping; (ii) to perform SNP- and haplotype-based tests for association with severe malarial phenotypes; (iii) to employ our family-based genetic data to check for parent-of-origin effects, an approach not possible with previous case-control and cross-sectional study designs; and (iv) to consider the population genetic implications of an association between ABO variation and severe falciparum malaria.
The four key functional polymorphisms that we tested in the ABO locus are reported to be in substantial LD (30). To confirm this we genotyped 1320 Gambian parent-offspring trios, and 30 Yoruba parent-offspring trios from Ibadan in Nigeria (cell line DNA used by International Haplotype Map (HapMap) Project) (34). This allowed us to look at the fine structure of the ABO locus, combining our functional SNPs with the high density HapMap marker dataset (Fig. 1).
Genotyping in both sample sets demonstrated near perfect LD (r2 0.9 – 1.0) between the three nonsynonymous SNPs that distinguish A and B haplotypes. The frameshift deletion underlying O alleles is in moderate LD with the three nonsynonymous SNPs (D′~0.9, r2~0.3). The majority of O alleles occur on an A haplotypic background, although the minority of recombinants with B haplotypes are still expected to produce truncated products. The implication of this haplotypic structure is that 2 markers: rs8176719 and one of the three nonsynonymous SNPs can generally distinguish the three ABO alleles.
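The combination of high D′ with modest r2 reported above for the frameshift deletion can be illustrated with a short calculation. The following is a minimal sketch, not the analysis pipeline used in this study (LD was visualized with HAPLOVIEW); the function name `ld_stats` and the haplotype counts are hypothetical and were chosen only to give values of roughly the reported magnitude.

```python
def ld_stats(n_ab, n_aB, n_Ab, n_AB):
    """Return (D, D', r^2) for two biallelic loci from four haplotype counts.

    Locus 1 has alleles a/A and locus 2 has alleles b/B; n_ab is the number
    of chromosomes carrying allele a at locus 1 and allele b at locus 2, etc.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / n
    p_a = (n_ab + n_aB) / n          # frequency of allele a at locus 1
    p_b = (n_ab + n_Ab) / n          # frequency of allele b at locus 2
    d = p_ab - p_a * p_b             # raw disequilibrium coefficient

    # D' rescales |D| by its maximum possible value given allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts (NOT study data): a = deletion (O), A = full-length,
# b = A-type allele at the nonsynonymous SNP, B = B-type allele.
d, d_prime, r2 = ld_stats(n_ab=650, n_aB=10, n_Ab=180, n_AB=160)
print(round(d_prime, 2), round(r2, 2))  # 0.91 0.33
```

Because most deletion-bearing chromosomes sit on one background, D′ is close to 1, while r2 stays low because the two markers have very different allele frequencies. This is why a single nonsynonymous SNP cannot stand in for the deletion, and two markers are needed.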
Two markers, rs8176719 and rs8176746, were genotyped in 3906 cases of severe malaria plus population and family controls from The Gambia, Kenya and Malawi. Case-control analyses were performed on 2127 cases of severe malaria and 1931 population controls, and family-based association tests were performed on a different set of 1779 cases of severe malaria and their parents. Further details of severe malaria phenotypes and sample sizes from each population are given in Material and Methods.
Our analysis took the statistical convention of using the commonest category (i.e. most frequent allele, haplotype or blood group) as reference, and comparing other alleles, haplotypes and inferred blood groups against this group. Single SNP analysis revealed that the minor allele of rs8176719, an insertion relative to the reference sequence (although almost certainly the ancestral allele), is associated with increased risk of severe malaria phenotypes. Consistent trends were found in both family and case-control studies, and across study sites (Table 1). However, some individual studies did not reach statistical significance, probably due to lack of power (see Materials and Methods for power calculations). Overall the case-control allelic odds ratio (OR) was 1.2, 95% confidence interval (CI) 1.09 – 1.32, P=0.0003; the family-study allelic OR was 1.19, CI 1.08 – 1.32, P=0.001. Pooling data across all our studies, both family- and population-based, suggested an allelic OR of 1.18, CI 1.11 – 1.26, P = 2×10^−7 for severe disease (Fig. 2). The results suggest that the full-length allele of rs8176719 may be associated with a particular risk of anemia during severe malaria infections. For the phenotype of severe malarial anemia the allelic OR was 1.30, CI 1.12 – 1.5, P = 0.0004 in the case-control study and 1.34, CI 1.1 – 1.64, P = 0.004 in the family-based study.
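For readers unfamiliar with allelic odds ratios, the sketch below shows how an unadjusted OR and its 95% CI can be obtained from a 2×2 table of allele counts using the standard log-OR (Woolf) approximation. This is an illustration only: the estimates reported here came from logistic regression with covariates, so they will not match an unadjusted calculation exactly, and the function name and counts are hypothetical.

```python
import math


def allelic_odds_ratio(case_risk, case_ref, ctrl_risk, ctrl_ref, z=1.96):
    """Unadjusted allelic OR with a Woolf (log-scale) confidence interval.

    Arguments are allele counts (two per genotyped individual):
    risk and reference alleles among cases and among controls.
    """
    odds_ratio = (case_risk * ctrl_ref) / (case_ref * ctrl_risk)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_ref + 1 / ctrl_risk + 1 / ctrl_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from Table 1.
or_, lo, hi = allelic_odds_ratio(case_risk=30, case_ref=70,
                                 ctrl_risk=20, ctrl_ref=80)
print(f"OR {or_:.2f}, 95% CI {lo:.2f} - {hi:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level under this approximation.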
The situation with rs8176746 is more complex. The major allele, although defining the putatively ‘high risk’ A haplotype, occurs with the ‘low risk’ frame-shift deletion upstream in about two thirds of chromosomes. Thus no simple genetic model is significantly associated with disease when this SNP is considered in isolation.
We inferred 2 SNP haplotypes using rs8176719 and rs8176746. In comparison with O haplotypes, both A (case-control OR 1.27, CI 1.12 – 1.44, P = 0.0003; family OR 1.16, CI 1.03 – 1.31, P = 0.018) and B (case-control OR 1.13, CI 0.99 – 1.28, P = 0.06; family OR 1.20, CI 1.05 – 1.37, P = 0.009) haplotypes are significantly associated with severe malaria (Table 1).
Using an individual's haplotypes we inferred their ABO blood group (A and B haplotypes codominant, O haplotype recessive). Blood group A, B and AB individuals appear to be at significantly greater risk of severe malaria in comparison with blood group O (e.g. blood group A individuals: case-control OR 1.33, CI 1.13 – 1.56, P = 0.00065; family OR 1.29, CI 1.09 – 1.53, P = 0.003) (Table 2). Trends in the data suggest blood group B individuals may be at subtly lower risk than blood group A, while blood group AB individuals are probably at the greatest risk of severe disease (case-control OR 1.59, CI 1.15 – 2.21 P = 0.006; family OR 1.46, CI 1.05 – 2.04, P = 0.025).
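The inference rule described above (deletion at rs8176719 gives O regardless of background; otherwise rs8176746 separates A from B; A and B codominant, O recessive) can be written down as a small sketch. The string labels `"del"`, `"ins"`, `"A-type"` and `"B-type"` are a hypothetical encoding chosen for readability; they are not the nucleotide calls produced by the genotyping platform.

```python
def haplotype_to_allele(rs8176719, rs8176746):
    """Map a two-SNP haplotype to an ABO allele (O, A or B)."""
    if rs8176719 not in ("del", "ins"):
        raise ValueError(f"unexpected rs8176719 call: {rs8176719!r}")
    if rs8176746 not in ("A-type", "B-type"):
        raise ValueError(f"unexpected rs8176746 call: {rs8176746!r}")
    # The frameshift deletion yields a truncated product on any background.
    if rs8176719 == "del":
        return "O"
    return "A" if rs8176746 == "A-type" else "B"


def blood_group(hap1, hap2):
    """Infer blood group from two haplotypes: A and B codominant, O recessive."""
    alleles = {haplotype_to_allele(*hap1), haplotype_to_allele(*hap2)}
    alleles.discard("O")
    if not alleles:
        return "O"
    return "".join(sorted(alleles))  # "A", "B" or "AB"


# An O allele on an A background paired with a B haplotype gives group B.
print(blood_group(("del", "A-type"), ("ins", "B-type")))
```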
Non-O blood groups demonstrate particularly high risk of severe malarial anemia (e.g. blood group A: case-control OR 1.54, CI 1.22 – 1.96, P = 0.00039; family OR 1.51, CI 1.09 – 2.09, P = 0.014). The higher risk of SA experienced by individuals with non-O blood groups may reflect a pathophysiological effect, for example accelerated clearance of erythrocytes bound to iRBC. Frequencies of genotypes, haplotypes and blood groups for all cases, controls and parents are documented in Supplementary Material, Table S1.
Previous association studies of the ABO system in severe malaria have examined serological data from unrelated individuals. In this study, having analyzed genetic data from pedigrees, we are in the unique position of being able to look for parent-of-origin effects. We were surprised to find a substantial difference in transmission of high risk alleles relating to their parent of origin. Full-length alleles of rs8176719 (marking A and B haplotypes) transmitted to offspring are associated with greater risk of severe disease if the allele was transmitted from a mother (OR 1.38, P = 0.0002) rather than a father (OR 1.05, P=0.6). Conditional logistic regression fitting separate effects for maternal and paternal alleles supports the existence of a parent-of-origin effect (χ2 = 3.96, P= 0.047) (Table 3).
Interactions between maternal genotype and child's genotype can masquerade as parent of origin effects. To investigate this scenario we performed a parent-of-origin likelihood ratio test of Weinberg (PO-LRT) (35), which can allow for maternal genotype effects. The likelihood ratio test demonstrates evidence of a parent-of-origin effect (P=0.046), but did not support a maternal genotype effect (P=0.21) (Table 4). Using 2-SNP haplotypes to identify the three ABO alleles, and repeating the conditional logistic regression we found that both A and B haplotypes are associated with greater risk when transmitted by a mother although only the difference between maternal and paternal A haplotypes is significant (χ2 = 5.19, P= 0.023).
There was no significant difference in risk estimate for severe malaria between AO and AA genotypes in either the case-control study (AO, OR 1.35, CI 1.13 - 1.60; AA OR 1.23, CI 0.84 - 1.79; Wald test P = 0.64) or family study (AO, OR 1.31, CI 1.10 - 1.55; AA OR 1.13, CI 0.76 - 1.67; Wald P = 0.45). The same was true for BO versus BB genotypes, and grouping both non-O haplotypes. The phase of AO genotypes was distinguishable in the family study and, as noted above, suggested different risks of severe disease (AmatOpat, OR 1.64, CI 1.27 - 2.12; OmatApat, OR 1.04, CI 0.81 - 1.33; Wald P = 0.01). Differences between these genotypes and the AA genotype were not significant (AmatOpat versus AA; Wald P = 0.07; OmatApat versus AA; Wald P = 0.92).
Given that one ABO allele is protective against a major selection pressure such as P. falciparum malaria, it is important to consider the reasons why the ABO system remains polymorphic in Africa. The answer may relate to the balance of protection offered by specific ABO alleles against other infectious diseases. ABO antigens have been implicated not only in malaria, but also in the pathogenesis of Escherichia coli (36), Helicobacter pylori (37), Norwalk virus (38), Hepatitis C (39), and respiratory infections (40). There has also been speculation about whether historical agents such as smallpox and plague may have shaped the global distribution of ABO alleles (41, 42).
Blood group A, B and O alleles occur in other primate species. Sequence analysis has suggested that the common ancestral enzyme had A transferase activity and that the presence of species-specific mutations underlying the non-human B and O alleles suggests functional polymorphism of the ABO gene has occurred through convergent evolution (43). High levels of nucleotide diversity at the ABO locus are considered to be significant evidence of non-neutral evolution in primate lineages.
Surveys for signatures of evolutionary selection in the Human genome have demonstrated evidence of balancing selection at the ABO locus (44, 45). Using our merged haplotype data from the HapMap Yoruba population and our additional genotyping of four functional polymorphisms (rs8176719, rs8176746, rs8176747, rs8176743), we investigated whether the long range haplotype (LRH) test (46), which detects signals of recent positive selection, was associated with the functional ABO variants (Fig. 3). Neither of our key SNPs appears to be associated with an extended haplotype homozygosity signal, which is consistent with a longstanding process of balancing selection. A recently reported high LRH score from the ABO locus (45) may reflect positive selection of nearby regulatory sequence controlling expression patterns rather than coding sequence itself. Interestingly we observed very low levels of population differentiation across the ABO locus in HapMap populations (Fig. 4A). The ABO locus including sequences 20-30 kb upstream, a region of 85 SNPs, has an FST (using a sample-size weighted metric of population differentiation between African, Asian and European populations) of −0.001, compared with a genome-wide average of around 0.1. Although more attention is generally given to high FST values, as signals of region-specific positive selection (e.g. the Duffy blood group locus under selection in Africa from P. vivax malaria (47)), here the low FST might reflect simultaneous balancing selection in all three populations, and would be consistent with a model of longstanding selection. The extension of the low FST region 20 – 30kb upstream of the ABO coding sequence (while stopping just 3kb downstream) may indicate that the balancing selection is also shaping the cis-acting regulatory sequences immediately upstream of the gene. 
Although not a formal test of deviation from neutrality we measured the empirical distribution of FST in windows across chromosome 9 and found the ABO region to be around the 99.5 – 99.9th centile in this distribution (Fig. 4B).
Our analysis strongly supports the hypothesis that blood group O individuals are relatively protected from severe malaria in comparison to other blood groups, particularly blood group A and AB. In addition to population-based studies we conducted the first family-based association studies of ABO variation, with severe malaria. Analysis of parent-offspring trios is robust to artifacts of population stratification and allowed us to examine the possibility of parent-of-origin effects. We found that full-length alleles at rs8176719, particularly A haplotypes, inherited from a mother lead to greater risk for offspring than similar alleles inherited from a father. The analysis suggests a phenomenon such as genomic imprinting rather than an effect of maternal genotype. The tissue- and cell-type specific expression of ABO is controlled by epigenetic signals, particularly at a 5′CpG island in the promoter of the gene (48). Loss of ABO expression mediated by promoter hypermethylation is common in certain forms of malignancy (49). There has also been a small case series documenting preferential loss of the maternally derived ABO allele in adult leukemia (50). Our results would be consistent with altered expression of the paternal ABO allele due to imprinting, leaving less target ligand for iRBCs to bind to, or shifting the pattern of sequestration to a ‘safer’ distribution. Paternal allele expression is not abolished, as demonstrated by the existence of blood group AB. However it is interesting to speculate that during gestation it could be beneficial to modulate or suppress the paternal ABO allele, so as to minimize maternal-fetal ABO incompatibility and the risk of hemolytic disease of the newborn. To our knowledge this is the first report suggestive of genomic imprinting in the genetics of malaria susceptibility. 
However, despite our sample size, we have limited power to detect the subtle difference in risk between maternally and paternally derived alleles, and it remains possible that this suggestive result is in fact a type I error. Further experiments are required to confirm this finding, and could include family-based studies of ABO variation in malaria and other complex diseases, or functional experiments with family-derived samples to determine if ABO gene expression is affected by parent of origin.
The sizes of our individual study datasets give us limited power to analyze the differences between regions. The differences in odds ratio between a family study and case-control study in a single population are of similar magnitude to the differences seen between regions, suggesting much of the variation observed is due to statistical fluctuation or methodological issues. Factors possibly contributing to the diversity of risk estimates reported between our populations, and in comparison with other reports, include variation in phenotype definition and geographic differences in parasite strain (e.g. differing affinities for ABO antigens). The use of cord blood controls instead of age-matched controls could affect association results. Cord blood samples represent the distribution of genotypes in the populations at birth; since a minority will go on to develop severe disease, this will tend to underestimate the true odds ratio of any genetic susceptibility or resistance allele. In contrast, if blood group O individuals are depleted from the population between birth and mid-childhood (the age of the cases) due to non-malarial pathogens, this would lead to an over-estimation of protection from blood group O. However, given the consistent results between the population- and family-based studies, the similar allele frequencies between cord blood samples and untransmitted parental alleles, and the limited effect size found, such biases are probably limited.
Our study employed samples from children with severe malaria phenotypes and it has been noted that the ABO effect is harder to detect in samples incorporating adults with malaria (8). The gradual accumulation of antibodies against a range of iRBC surface antigens is thought to explain a component of the immunity to severe life-threatening malaria which develops in late childhood in endemic regions (51). It is possible that a limited repertoire of parasite epitopes binding A and B antigens could lead to a high prevalence of effective antibodies against these variants in adults, reducing the significance of host ABO blood group in older age groups. It is interesting to speculate whether vaccination, leading to effective immunity to A and B antigen binding PfEMP1 variants, could reduce morbidity among blood group A, B and AB children.
The modest effect size found is typical of validated associations with complex disease in humans. The effect size is likely, in part, to explain why some previous studies have been unable to demonstrate significant results, despite trending towards protection from blood group O, particularly when sample size was low. However, the global health impact of the frame-shift deletion underlying blood group O needs to be put in context by considering its high frequency. Roughly half the peoples of Sub-Saharan Africa, and many other human populations, at risk of life-threatening disease caused by P. falciparum, are homozygotes for this null mutation and are protected by being blood group O (52).
Patient samples were collected as part of ongoing epidemiological studies of severe malaria at the Royal Victoria Hospital, Banjul, The Gambia; the Queen Elizabeth Central Hospital, Blantyre, Malawi; and Kilifi District Hospital, Kilifi, Kenya. Populations in these study sites are exposed to endemic malaria, with the burden of life-threatening disease being experienced by children (< 5 years old). Nuclear family trios comprised two parents and one affected child. All DNA samples were collected and genotyped following approval from the relevant research ethics committees and informed consent from participants. Controls were cord blood samples obtained from birth clinics in the same locality as the cases. See Supplementary Material, Table S2 for further demographics of patients and controls.
All cases were children admitted to hospital with evidence of P. falciparum on blood film and clinical features of severe malaria (53, 54). In our analyses of sub-phenotypes we use Blantyre coma score of ≤2 as a criterion of cerebral malaria (CM), and hemoglobin <5g/dl or packed cell volume <15% as a criterion of severe malaria anaemia (SA). Some individuals had both CM and SA. Of the severe malaria cases that were not CM or SA by these criteria, most had lesser degrees of coma (Blantyre coma score 3) or anemia (Hb 5-6g/dl), or other complications such as respiratory distress. Our samples comprised:
Power calculations were performed using the Genetic Power Calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/) (55). A single regional case-control study had ~61% power, based on a sample size of 700 cases, 700 unselected controls, allelic odds ratio of 1.2 (our risk estimate for non-O alleles compared with O alleles), high risk allele frequency of 0.3 and a type I error rate of 0.05. Across all case-control studies we would expect 97% power. The smaller family trio studies (N=230 trios) had ~25% power, while the larger Gambian study (N=1320 trios) had ~87% power.
Genomic DNA samples underwent whole genome amplification through either Primer Extension Pre-amplification (PEP) (56) or Multiple Displacement Amplification (MDA) (57), before genotyping on a Sequenom MassArray genotyping platform (58). All assays achieved high rates of genotyping success and no deviations from Hardy-Weinberg equilibrium were encountered (Supplementary Material, Table S2).
Analysis was performed using STATA (v9.2 for Windows) and the genassoc package (http://www-gene.cimr.cam.ac.uk/clayton/software/) written for STATA by David Clayton. In general, results for the multiplicative model are presented. Although full-length alleles are dominant with regard to blood group phenotype, we found a trend towards increased susceptibility for blood group AB individuals (homozygotes for the full-length allele at rs8176719), which makes this model valid. Case-control association analysis was undertaken by logistic regression and included the covariates of ethnic group, gender and Sickle status. DNA Sequenom genotyping for the Hemoglobin S (HbS) variant was performed for all samples. The HbS results for a proportion of the Gambian samples have previously been published (59). Gender was included in the regression analysis to control for sex-linked traits (e.g. G6PD). With regard to other variation thought to affect malaria susceptibility: Hemoglobin C was absent from our study populations and the range of deletions underlying the Thalassemias are not easily amenable to the high-throughput genotyping technology used.
Family-based association analysis was performed using a case-pseudo-control approach and conditional logistic regression (60). Trios were drawn from a larger pool of samples checked for relationship misspecification. All family samples were genotyped for 48 SNP markers and 15% of trios (indicated by their Mendelian error rates) were excluded from further analysis.
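The case-pseudo-control idea can be sketched for a single SNP as follows: the affected child is the case, and the three other genotypes the parents could have transmitted serve as matched pseudo-controls. This is a simplified illustration under stated assumptions, not the genassoc implementation; the function name, the allele labels and the allele-count coding of genotypes are hypothetical, and the matched sets produced would then be passed to a conditional logistic regression.

```python
from itertools import product


def case_pseudo_controls(mother, father, child, risk="ins"):
    """Build one matched set from a trio genotyped at a single SNP.

    Each genotype is a pair of allele labels. Returns the risk-allele count
    of the case and of the three pseudo-controls (the untransmitted
    parental allele combinations).
    """
    # The four offspring genotypes the parents could produce (maternal, paternal).
    possible = list(product(mother, father))
    target = sorted(child)
    for i, genotype in enumerate(possible):
        if sorted(genotype) == target:
            # Remove exactly one matching combination; the rest are pseudo-controls.
            pseudo = possible[:i] + possible[i + 1:]
            break
    else:
        raise ValueError("child genotype is incompatible with parents")

    def count_risk(genotype):
        return sum(allele == risk for allele in genotype)

    return count_risk(child), [count_risk(g) for g in pseudo]


# Heterozygous mother, homozygous-deletion father, heterozygous child.
print(case_pseudo_controls(("ins", "del"), ("del", "del"), ("ins", "del")))
```

Trios that raise the incompatibility error correspond to Mendelian errors of the kind used above to exclude families. Because comparisons are made within families, the resulting estimate is not confounded by population stratification.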
In the population-based studies, two-SNP haplotypes (and therefore the ABO alleles) were reconstructed using the snp2hap function, while in family studies phase can often be tracked from parent and child in the creation of the case-pseudo-control dataset. Data from the case-control studies were pooled in a single logistic regression analysis including covariates of ethnic group, gender and Sickle status, while family studies were combined in a single case-pseudo-control analysis. Pooling across all case-control and family studies is less straightforward. Here we used the UNPHASED application version 3.0 (http://www.mrc-bsu.cam.ac.uk/personal/frank/software/unphased/) (61, 62), which employs a retrospective likelihood framework for performing genetic association analysis and can be used to combine data from nuclear families and unrelated subjects. Ethnic origin was found to be a significant confounder and was retained as a covariate in the UNPHASED analysis.
Phased ABO locus haplotypes for the 30 Yoruba parent-offspring trios from Ibadan in Nigeria were downloaded from the HapMap website (www.hapmap.org) (34). Cell lines derived from the Yoruba individuals (obtained from the Coriell Cell Repositories, http://ccr.coriell.org/) were genotyped for the four key SNPs. The additional SNP data was combined with the known phased genotypes using PHASE version 2.1 (63, 64). LD patterns between our genotyped SNPs and between these genotypes and HapMap markers were visualized using HAPLOVIEW version 3.32 (http://www.broad.mit.edu/mpg/haploview/) (65). Extended haplotype homozygosity values (46) were calculated using SWEEP version 1.0 (http://www.broad.mit.edu/mpg/sweep/index.html).
SNP genotyping data for chromosome 9 were downloaded from the HapMap website (Release 21a/Phase 2, Jan 2007). The dataset was derived from three populations: the 30 Yoruba trios; 30 U.S. parent-offspring trios of northern and western European origin, collected by the Centre d'Etude du Polymorphisme Humain (CEPH); and a combined dataset of 45 unrelated individuals from Tokyo, Japan and 45 unrelated individuals from Beijing, China (34). 94,678 SNPs across chromosome 9 were polymorphic in all three populations. FST between the three populations was estimated, in windows across chromosome 9, using the formula for FST described in the supplementary information to the HapMap Consortium publication (34).
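To show how a windowed FST can come out slightly negative, as reported for the ABO region, the sketch below uses a simple heterozygosity-based estimator with a sample-size correction, combined across SNPs as a ratio of sums. This is an assumption made for illustration: it is not the HapMap Consortium formula used in our analysis, and the function name and input layout are hypothetical.

```python
from itertools import combinations


def fst_window(freqs, n_chrom):
    """Estimate FST for a window of SNPs across several populations.

    freqs:   list of per-SNP tuples of allele frequencies, one per population.
    n_chrom: tuple with the number of sampled chromosomes per population.
    """
    k = len(n_chrom)
    within_sum = 0.0
    between_sum = 0.0
    for snp in freqs:
        # Unbiased within-population heterozygosity, averaged over populations.
        within = sum(
            2 * p * (1 - p) * n / (n - 1) for p, n in zip(snp, n_chrom)
        ) / k
        # Probability that alleles drawn from two different populations differ,
        # averaged over all population pairs.
        pairs = list(combinations(snp, 2))
        between = sum(p * (1 - q) + q * (1 - p) for p, q in pairs) / len(pairs)
        within_sum += within
        between_sum += between
    if between_sum == 0:
        raise ValueError("no polymorphic SNPs in window")
    return 1 - within_sum / between_sum


# Identical frequencies in three samples of 120 chromosomes each:
# the sample-size correction pushes the estimate just below zero.
print(fst_window([(0.5, 0.5, 0.5), (0.3, 0.3, 0.3)], (120, 120, 120)))
```

Under an estimator of this kind, values at or below zero indicate that alleles sampled from different populations are no more different than alleles sampled within a population, which is the pattern described for the ABO locus.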
A.E.F. and M.J.G. were funded by Wellcome Trust Clinical Research Training Fellowships and T.W. by a Wellcome Trust Senior Fellowship. The sample collections and resources used in this study received funding from the UK Medical Research Council, the Wellcome Trust, the Bill and Melinda Gates Foundation, the National Institutes of Health and the European Union Network 6 BioMalPar Consortium. This manuscript is published with the permission of the director of the Kenya Medical Research Institute.
Conflict of Interest Statement. None declared.