VESA1. VESA1 is a large (>300 kD), heterodimeric protein composed of VESA1a and VESA1b that is synthesized by
B. bovis and subsequently exported to the surface of the host erythrocyte [
40]. VESA1 undergoes rapid antigenic variation and has been implicated in host immune evasion and cytoadhesion, both of which would be expected to play a vital role in persistence and pathogenesis [
41,
42]. VESA1 is thought to be the functional homolog of PfEMP1, encoded by the
var gene family, in
P. falciparum [
43]. The
ves1 genes comprise the largest family in the
B. bovis genome. While sequence identity and the presence of similar secondary amino acid structures make it clear that these genes belong to the same family, two distinct types exist (
ves1α and
ves1β, encoding VESA1a and VESA1b, respectively) that possess highly variable regions of sequence composition, length, and gene architecture. Genomic analysis predicts 119
ves1 genes in the available sequence (72α, 43β, and four unclassified;
Table S4). However, there is a gradient of increasing concentration of
ves1 genes in the sequence immediately adjacent to the physical gap on chromosome 1, and the contigs that appear to reside in the gap contain
ves1-like sequences, indicating that additional
ves1 genes reside in the gap. An estimated gap size of 150 kbp would limit the number of genes within the missing sequence to fewer than 40, resulting in a total of approximately 150
ves1 genes, far fewer than previously predicted [
37]. All but three members of the
ves1 family are found in clusters of two or more genes, with individual clusters separated by a few kilobases to nearly one megabase. Interestingly,
ves1 genes are distributed throughout all four chromosomes, in contrast to the observation that genes involved in antigenic variation, immune evasion, and sequestration, including
P. falciparum var genes, are only occasionally found internally and are predominantly located near telomeres [
11,
44]. While
ves1 genes are also found near telomeres and centromeres, 89 genes (75%) are located distal to these chromosomal structures.
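The arithmetic behind these counts can be checked with a short sketch. The per-gene footprint of 4 kbp is an assumption made here for illustration (loosely based on the >4 kbp size of LAT-associated genes described in the text); it is not a value the authors state for the gap estimate, and the variable names are hypothetical.

```python
# Counts reported in the text.
alpha, beta, unclassified = 72, 43, 4
annotated = alpha + beta + unclassified        # annotated ves1 genes

gap_kbp = 150          # estimated size of the chromosome 1 gap (from the text)
assumed_gene_kbp = 4   # ASSUMPTION: footprint of one ves1 gene, illustration only

# Upper bound on genes that could fit in the gap if it held nothing else.
max_gap_genes = gap_kbp // assumed_gene_kbp

# Upper bound on the genome-wide total under that assumption.
upper_total = annotated + max_gap_genes

# Share of annotated genes located away from telomeres and centromeres.
distal = 89
distal_pct = round(100 * distal / annotated)

print(annotated, max_gap_genes, upper_total, distal_pct)
```

Under this assumption the gap could hold at most a few dozen genes, which is in line with the stated bound of fewer than 40 and a total of roughly 150.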
Transcription of
ves1 genes has been hypothesized to occur at a “locus for active transcription” (LAT), described as a divergently oriented pairing of
ves1α and
ves1β genes [
37]. This large locus encompasses nearly 13 kbp and includes
ves1α and
ves1β (each >4 kbp), a short intergenic region (<500 bp), and short portions of each gene found as blocks of repeats and motifs downstream from each
ves1 coding region. The genome sequence contains 24 loci (
Figure S3) with paired
ves1α/
ves1β genes whose length, structure, and physical arrangement are similar to those of the published LAT. This head-to-head arrangement is also found for 18
ves1α genes of similar length, resulting in nine loci containing
ves1α/
ves1α paired genes. These two groups of paired genes account for more than half (66/119) of the annotated
ves1 genes (
Table S4), and exhibit the highest level of sequence identity and structural similarity among
ves1α and
ves1β genes.
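One way to search an annotation for such head-to-head (divergently oriented) pairs is sketched below. This is a minimal illustration, not the procedure used by the authors: the `Gene` record, the `divergent_pairs` and `pair_kind` functions, and the coordinate conventions are all hypothetical, and the 500 bp cutoff simply borrows the "<500 bp" intergenic size reported for the published LAT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    strand: str  # "+" or "-"
    kind: str    # "alpha" or "beta"


def divergent_pairs(genes, max_intergenic=500):
    """Return (left, right) pairs of adjacent genes arranged head-to-head."""
    pairs = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            continue
        gap = right.start - left.end - 1
        # Head-to-head: the left gene is on the minus strand and the right
        # gene on the plus strand, so their 5' ends face each other.
        if left.strand == "-" and right.strand == "+" and 0 <= gap < max_intergenic:
            pairs.append((left, right))
    return pairs


def pair_kind(left, right):
    """Label a pair as 'alpha/beta', 'alpha/alpha', or 'beta/beta'."""
    return "/".join(sorted([left.kind, right.kind]))
```

Applied to a real annotation, grouping the resulting pairs by `pair_kind` would give the kind of tally reported here (24 ves1α/ves1β loci and nine ves1α/ves1α loci, 66 genes in total), although the length and structure criteria mentioned in the text would need additional filters.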
The remaining
ves1 genes cannot be easily sorted according to the previously described head-to-head arrangements, and many of these genes are significantly truncated. All of them can be classified as either
ves1α or
ves1β, with the exception of four
ves1 genes located on chromosome 3. It is possible that the genes not arranged in putative LATs represent ancestral forms of
ves1 and now play the role of functional pseudogenes, providing material for segmental gene conversion into a functional LAT to create antigenic variation [
45].
ves1α and
ves1β exhibit sequence similarity, but have different gene architecture.
ves1α genes that are members of potential LATs tend to have three exons: two small exons followed by a large third exon, separated by two short introns. The
ves1β genes show considerably more diversity, as they have numerous introns that are not consistent in length or location [
37,
41]. Even among
ves1β genes in potential LATs, gene length varies from 987 to 3,642 bp, and the number of introns ranges between 2 and 11.
Although the
ves1α and
ves1β genes are structurally distinct, areas of sequence conservation and topological similarities exist among the predicted polypeptides. The corresponding conserved stretches of nucleotide identity may be exploited as recombination sites for the generation of antigenic diversity, in addition to encoding a functional motif. Because VESA1 is exported to the surface of infected erythrocytes [
40], it is notable that only seven of the 119 potential products are predicted to possess an N-terminal signal peptide (which again suggests that current signal prediction algorithms may not be suitable for
B. bovis). Most predicted VESA1 proteins have a large extracellular domain followed by a single transmembrane segment and a short cytoplasmic tail. This topology is conserved in VESA1a proteins encoded by genes in
ves1α/
ves1β and
ves1α/
ves1α pairings (35/42
ves1α genes), and to a lesser extent in VESA1b proteins encoded by genes in the
ves1α/
ves1β pairings (15/24
ves1β members). As with exon structure and gene length, however, considerably less conservation exists among the remaining proteins, as only 21/53 follow this pattern (
Table S4).
VESA1a is distinguished from VESA1b by the presence of a coiled-coil domain located near the center of the predicted protein, with 83% of all VESA1a subunits and 98% of VESA1a subunits from potential LATs containing this domain. Of the 11 VESA1a subunits that do not contain the coiled-coil domain, eight are products of truncated genes encoding fewer than 312 amino acids, and none are encoded by genes exhibiting the typical three-exon structure. In contrast to VESA1a, only 4/43 VESA1b subunits contain the coiled-coil domain. An additional characteristic found almost exclusively among the VESA1a subunits is the presence of two distinct motifs that are variable among the predicted protein sequences but contain invariant amino acids at specific positions. These motifs, referred to as the variant domain conserved sequences one and two (VDCS-1 and −2) [
37], are arranged in tandem and located near the coiled-coil domain. The T2Bo consensus sequence for VDCS-1 is K(N,D)x(L,I,V)(S,K)xxIxxxxxx(L,V) and for VDCS-2 is CxxCxxHxxKCGxxxxxxxCxxCx(Q,N)xxxxGXPS. VESA1b subunits are essentially devoid of these motifs. VDCS-1 and −2 also help to define the subsets into which the VESA1a sequences are organized. VESA1a subunits predicted from the
ves1α/
ves1β pairs all possess perfect matches to VDCS-1 and −2, and only four motifs (three VDCS-1 and one VDCS-2) are missing from those encoded by
ves1α/
ves1α pairs. In areas where these four missing motifs would normally be found, a similar amino acid pattern exists that does not match the motif perfectly. Of the remaining VESA1a subunits, 16/30 contain VDCS-1 and 17/30 possess VDCS-2.
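The consensus notation above maps naturally onto regular expressions, which gives one way to screen predicted proteins for perfect matches. The sketch below is an illustration only and is not the authors' method: the function name `consensus_to_regex` is hypothetical, whitespace in the consensus is ignored, and both "x" and "X" are assumed to mean "any residue".

```python
import re

# Consensus strings as given in the text.
VDCS1 = "K(N,D)x(L,I,V) (S,K)xxIxxxxxx(L,V)"
VDCS2 = "CxxCxxHxxKCGxxxxxxxCxxCx(Q,N)xxxxGXPS"


def consensus_to_regex(consensus):
    """Translate the consensus notation into a compiled regular expression.

    '(A,B)' becomes the character class '[AB]', 'x' or 'X' becomes '.'
    (ASSUMPTION: both mean any residue), and other letters are literal.
    """
    text = consensus.replace(" ", "")
    parts = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "(":
            close = text.index(")", i)
            options = text[i + 1:close].split(",")
            parts.append("[" + "".join(options) + "]")
            i = close + 1
        elif ch in "xX":
            parts.append(".")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def has_motif(protein, consensus):
    """True if the protein sequence contains a perfect match to the consensus."""
    return consensus_to_regex(consensus).search(protein) is not None
```

A strict pattern of this kind would report the four near-miss motifs in the ves1α/ves1α pairs as absent, consistent with the description of a similar but imperfect amino acid pattern at those positions.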
Due to their resemblance to the published LAT [
37], 33 pairs of
ves1 genes should be considered potential transcription sites. The potential LATs are not clustered, and are distributed throughout the chromosomes. To better understand whether one or more of these potential sites of transcription were active, we examined T2Bo
ves1 gene cDNA sequences. Primer sets for three different experiments were designed to target (1) specific genes, (2) sets of genes, or (3) the published LAT, and a total of 66
ves1α and 93
ves1β cDNA clones were analyzed. Unexpectedly, these cDNAs represented 50 unique
ves1α and 59 unique
ves1β sequences. Equally surprising, only one of the
ves1α and none of the
ves1β unique cDNA sequences matched a genomic
ves1α or
ves1β sequence. The
ves1 cDNAs displayed up to 50% sequence divergence in pairwise comparisons for transcripts within a given experiment. In experiment 1 (designed to target specific genes), 83% of the
ves1α cDNAs had >91% sequence identity in pairwise comparisons, while the
ves1β cDNAs displayed a bimodal distribution, with 46% having >91% sequence identity and 50% having only 56%–70% sequence identity. The RNA used for these experiments was obtained from
B. bovis T2Bo culture more than two years after isolation of the genomic DNA used to construct the sequencing libraries, so changes in the cultured population over that interval may account for some of the sequence diversity. However, although variation over time may account for some of the differences between the cDNA and genomic sequences, the number of unique sequences obtained from a single time point exceeds the number of predicted expression sites for
ves1 genes. Consistent with this finding, numerous
ves1 genes were also represented in EST data [
17].
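The pairwise comparisons described above can be illustrated with a small sketch. It assumes the cDNA sequences have already been aligned to equal length with "-" marking gaps, and it counts the fraction of sequence pairs above an identity threshold; both choices are assumptions for illustration, since the text does not specify how identity was computed or whether percentages refer to sequences or to pairs. The function names are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def fraction_above(aligned, threshold=91.0):
    """Fraction of all sequence pairs whose identity exceeds the threshold."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if percent_identity(a, b) > threshold)
    return hits / len(pairs)


def count_unique(sequences):
    """Number of distinct sequences among the clones."""
    return len(set(sequences))
```

With real data, `count_unique` would correspond to the tally of unique cDNA sequences among the analyzed clones, and `fraction_above` to statements such as the share of comparisons with >91% identity.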
Var gene expression, while “leaky” in the ring stages of
P. falciparum, appears to be restricted to a single, or very few, alleles in individual parasite populations in vivo [
46,
47]. However, multiple
var transcripts, although far fewer than for
B. bovis, have been detected when the organism is cultured in vitro [
48]. One possible explanation for the large number of different
ves1 cDNA transcripts may be that, similar to the
var genes,
ves1 genes are removed from in vivo transcriptional controls and/or phenotypic selection when the organism is grown in vitro. While in vivo analysis of
ves1 transcription remains to be performed, the number of diverse transcripts is notable and may suggest more widespread transcription, along with alternative post-transcriptional control mechanisms, than is observed in other hemoprotozoa.