In the publication of the P. falciparum genome, the var genes were divided into different types according to the domain structure of the encoded proteins [17]. Other groups have described semi-conserved regions upstream from the translation initiation sites, and grouped var genes on this basis [18]. We have synthesised the available information and suggest a somewhat different division of the var genes into three major groups (A-C) and two intermediate groups (B/A and B/C), which represent transitions between A, B, and C. The genes were grouped according to chromosomal location and transcription direction, domain structure of the encoded proteins, and sequence similarities in coding and non-coding regions.
Group A consists of ten genes consistently identified as a distinct group by sequence analysis. Interestingly, recombinant CIDR domains based on the group A sequences do not bind CD36, in contrast to CIDR domains produced on the basis of groups B and C [20]. Group A var genes mainly encode large PfEMP1s with complex multi-domain structure. Nine of the group A var genes are flanked by a rif gene, which is transcribed in the opposite direction. Thus, the 5' regions of the rif genes merge. The fact that this organisation has been maintained in the 3D7 genome indicates that the DNA between the coding regions constitutes a functional unit, possibly regulating either recombination or transcription. If the latter is the case, the genes could be co-regulated and there might be a functional relationship between the encoded PfEMP1s and RIFINs.
The largest var group in 3D7, group B, comprises 22 genes sharing the 5' upsB region. All genes but one are located in the telomeric region. The encoded proteins typically have the characteristic four-domain structure, DBLα-CIDRα-DBLδ-CIDR2. The 13 genes of group C are centromeric. The genes all share the 5' upsC region and 12 of them encode proteins with the common four-domain structure. Genes of the B/A and B/C groups have characteristics indicating that they constitute intermediate forms between groups A and B, and groups B and C, respectively. Two genes, which have previously been shown to be present in most P. falciparum genomes, did not fit into any of the groups. Compared to other var genes they appear to be unusually conserved [28], and it has been suggested that they belong to var gene subfamilies named var1, respectively [24
To investigate whether the proposed groupings of 3D7 var genes could be used as a general classification of var genes, the available database sequences from other parasite isolates were analysed. Sufficient sequence data were only available for 11 genes, and with regard to domain structure of the predicted proteins, they were not particularly representative of the PfEMP1 repertoire in 3D7. Analysis of the 5' regions allocated ten of the genes to the upsB 3D7 cluster, and they could therefore be classified as group B or group B/A var genes. Further analysis of sequence and predicted domain structure showed that all the genes shared characteristics with at least one group B 3D7 var gene, and none of them shared characteristics with the 3D7 var genes belonging to group A. The upstream region identified one gene as belonging to group C. This encoded a protein with a domain structure typical of 3D7 group C PfEMP1s. Thus, although the data are limited, analysis of non-3D7 var genes suggested that the proposed nomenclature could be used in a general classification of var genes.
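The upstream-region step of this allocation can be illustrated with a minimal sketch. It covers only the two rules stated in the text (upsB points to group B or B/A, upsC points to group C) and replays the counts reported for the 11 non-3D7 genes. The dictionary and function names are hypothetical, and the sketch deliberately ignores the coding-sequence and domain-structure evidence that the actual grouping also relied on.

```python
# Illustrative only: maps a 5' upstream (ups) type to the candidate var groups
# named in the text. All names here are hypothetical, not from any library.
UPS_TO_CANDIDATE_GROUPS = {
    "upsB": ("B", "B/A"),  # upsB cluster: group B or group B/A
    "upsC": ("C",),        # upsC: group C
}


def candidate_groups(ups_type):
    """Return candidate groups for an upstream type, or () if no rule is stated."""
    return UPS_TO_CANDIDATE_GROUPS.get(ups_type, ())


# The 11 non-3D7 genes as counted in the text: ten upsB, one upsC.
non_3d7_ups = ["upsB"] * 10 + ["upsC"]

# Count how many genes fall under each set of candidate groups.
tally = {}
for ups in non_3d7_ups:
    groups = candidate_groups(ups)
    tally[groups] = tally.get(groups, 0) + 1

for groups, count in tally.items():
    print("/".join(groups) if len(groups) == 1 else " or ".join(groups), count)
```

A lookup of this kind only narrows the candidates; choosing between group B and group B/A still requires the sequence and domain-structure comparison described above.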
The suggested grouping of var genes is operational and based on best judgement. It is likely that future work will change the classification and move genes between groups; nevertheless, we believe that this grouping is helpful as a starting point for understanding the evolution of the var gene repertoire and developing hypotheses about their function.
The fact that 5' regions predict var gene chromosomal organisation and domain structure, as well as sequence similarities in coding and non-coding regions several thousand bases downstream from the translation initiation site, implies that recombination, or other mechanisms of homogenizing exchange, is much more likely to occur between var genes within a group than between var genes of different groupings. It can be proposed that an original ancestral var gene was duplicated and diverged into the three main types, and that each of these has then diverged into the genes of each group. In this process information may also have been exchanged between genes of different groupings. The data suggest that some exchange has taken place between groups B and C and that some characteristics of group A have leaked into these groups, but that characteristics from groups B and C have not gained access to group A. It is tempting to speculate that distinct chromosomal organisation patterns restrict recombination and that the conserved flanking regions serve to align genes of a similar group for recombination. The fact that a putative boundary of the upstream sequence could be determined for most var genes may suggest that these sites also serve as splicing sites for insertion of larger gene fragments or whole genes.
Why then are var genes structured into different groups? By mediating parasite binding to endothelium, PfEMP1 enables the parasite to sequester and avoid filtering through the spleen. Thus, parasites expressing the PfEMP1s that are most effective in sequestering infected erythrocytes will obtain the highest growth rates. How effective a given PfEMP1 is in binding in a particular host will depend on the binding characteristics of the PfEMP1, on the ligands that are available in the host [38], and on the anti-PfEMP1 antibody repertoire in the infected individual [11]. Parasites causing severe malaria express phenotypes that are more often recognised by antibodies in children's plasma than the phenotypes expressed by parasites causing uncomplicated disease [41]. The phenotypes associated with severe disease also tend to be serologically cross-reactive (Nielsen et al., in preparation). Given that immunity to severe malaria is developed relatively early in life, it is possible to speculate that the most severe forms of malaria are caused by fast-growing parasites expressing PfEMP1s optimized to mediate very effective binding in non-immune hosts. To maintain effective binding, these PfEMP1 types are probably functionally constrained, and consequently have tight limits to the degree to which they can vary. The fact that recombination within var genes of group A appears to be the most constrained suggests that the PfEMP1s associated with severe malaria will be encoded by group A var genes. This hypothesis is in agreement with findings from China indicating that parasites from individuals suffering from cerebral malaria, compared with cases of non-severe malaria, expressed high molecular weight PfEMP1s [42], and with a study from Brazil where expression of DBLα domains lacking 1–2 cysteine residues in DBLα homology block G was mainly found among severe malaria cases [43]. In 3D7, this is a feature of all genes of DBLα-CIDR1 group A (var gene group A).
In most endemic settings transmission does not occur continuously, but is highly seasonal and in some areas restricted to a few months of every year [44]. In such a situation the ability to establish chronic infections is important for parasite survival and transmission. Chronic human malaria infections are associated with a 'shift' in PfEMP1 expression [45], and it has been proposed that such shifts are driven by antibodies forcing parasites to express PfEMP1 molecules which are less optimal for adhesion, but not recognised by cross-reactive antibodies. It is possible to speculate that PfEMP1s of groups B and C could serve this function.
In areas of high malaria endemicity, women who have acquired malaria immunity during childhood become susceptible to malaria during their first pregnancies [46] and are infected by parasites expressing antigens that mediate binding to CSA in the placenta [8]. Parasites of this phenotype can apparently only expand and establish infection in individuals carrying a placenta, and these parasites do not cross-react serologically with non-placental parasites [38]. It has recently been reported that PFL0030c is the dominant var gene transcribed in parasites selected for CSA binding and that most parasite genomes carry very similar genes, the var2 ]. Interestingly, the var2 upstream region (upsE) is markedly different from the 5' regions of the other var genes and appears to be conserved. The upstream upsE region of var2 is also the only such region containing an ORF. Upstream ORFs are uncommon in known genomes, and primarily described in association with genes that are under tight translational control, such as oncogenes and genes involved in cellular differentiation (reviewed by Kozak, 2002). The function of the uORF 5' of var2