Sequencing of the human and murine TRB loci has defined the repertoire of TRB genes in these species and provided insights into the organisation, evolution and regulation of this immunologically important locus [9
]. Although the bovine TRB locus sequence in the third bovine genome assembly is incomplete, the analysis conducted in the present study has provided insight into the nature of the bovine TRB gene repertoire and its genomic organisation and evolution.
The most striking result from the study was the large number of TRBV genes identified (134), which is over twice the number found in humans and four times that in mice [11
]. Although 11 of the 24 bovine subgroups identified in the genome contain multiple genes, the majority of the TRBV repertoire expansion is attributable to the extensive membership of just 3 subgroups, TRBV6 (40 members), 9 (35 members) and 21 (16 members). By comparison, the largest subgroups in humans are TRBV6 and TRBV7, with 9 members each, whilst in mice the only multi-membered subgroups are TRBV12 and 13 with 3 members each. As in humans the expansion of the TRBV repertoire has predominantly occurred through the tandem duplication of DNA blocks containing genes from more than 1 subgroup [9
]. Dot-plot analyses show that this duplication in the bovine is complex, leading to the generation of 6 homology units ranging in size from ~7 Kb to ~31 Kb and encompassing between 1 and 11 TRBV genes. Unequal cross-over (non-homologous meiotic recombination) between genome-wide repeats (e.g. SINEs, LINEs and LTRs) has been proposed to act as the substrate for such duplication events in TR loci [9
]. Although genome-wide repeats are found in the DNA surrounding the bovine TRBV genes (Additional file 3
), as in the human TRB locus they are only rarely found at the boundaries of duplicated homology units (data not shown), suggesting their contribution to mediating duplication is minimal [10].
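The dot-plot approach used to detect homology units can be illustrated with a minimal sketch. It assumes a simple exact k-mer comparison of a sequence against itself; the function names, the word size and the toy sequence are hypothetical and are not the software or parameters used in this study. A tandem duplication appears as a run of consecutive matches lying on a diagonal offset from the main diagonal.

```python
from collections import defaultdict


def kmer_dot_plot(seq_a, seq_b, k=4):
    """Return sorted (i, j) pairs where seq_a[i:i+k] equals seq_b[j:j+k]."""
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)
    dots = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], []):
            dots.append((i, j))
    return sorted(dots)


def off_diagonal_runs(dots, min_len=3):
    """For a self-comparison, map each diagonal offset (j - i > 0) to the
    length of its longest run of consecutive matching k-mers."""
    by_offset = defaultdict(list)
    for i, j in dots:
        if j > i:  # skip the main diagonal and the mirrored half
            by_offset[j - i].append(i)
    runs = {}
    for offset, starts in by_offset.items():
        starts.sort()
        best = current = 1
        for prev, nxt in zip(starts, starts[1:]):
            current = current + 1 if nxt == prev + 1 else 1
            best = max(best, current)
        if best >= min_len:
            runs[offset] = best
    return runs


# Toy sequence (hypothetical): the 8 bp unit ACGTTGCA occurs twice, 10 bp apart.
toy = "GG" + "ACGTTGCA" + "TT" + "ACGTTGCA" + "CC"
print(off_diagonal_runs(kmer_dot_plot(toy, toy, k=4)))
```

On genomic sequence a larger word size and some tolerance for mismatches would be needed, since duplicated homology units diverge after duplication.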
Although gene conversion of TRBV genes has been documented [56
], as with other multi-gene families in the immune system, TRBV genes predominantly follow a 'birth-and-death' model of evolution [13
], by which new genes are created by repeated gene duplication, some of which are maintained in the genome whilst others are deleted or become non-functional due to mutation. Genes maintained following duplication are subject to progressive divergence, providing the opportunity for diversification of the gene repertoire. Gene duplication within the TR loci has occurred sporadically over hundreds of millions of years with ancient duplications accounting for the generation of different subgroups and more recent duplications giving rise to the different members within subgroups [9
]. The continuous nature of duplication and divergence of bovine TRBV genes is evident in the multi-membered subgroups where nucleotide identity between members ranges between 75.5% and 100%. The complete identity observed between some TRBV genes suggests that some of the duplication events have occurred very recently. Similar features have been described for the murine TRA and human IGκ loci, within which recent duplications, < 8 million years ago (MYA), have created pairs of V genes exhibiting ~97% nucleotide identity [9
]. Southern blot data showing differences in RFLP banding patterns of TRBV9 and 27 genes in DNA from Bos indicus and Bos taurus animals (Figure ), which diverged only 0.25–2 million years ago [62], provide further evidence of recent evolutionary development of the TRBV repertoire in cattle.
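The nucleotide identity values quoted for subgroup members can be illustrated with a short sketch. It assumes that the two sequences have already been aligned to equal length and that columns containing a gap in either sequence are excluded; this convention, the function name and the example sequences are assumptions made for illustration, not a description of how identity was calculated in this study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 8 bp fragments differing at one position.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other conventions, such as counting gapped columns as mismatches, give lower values for the same alignment, so identity figures are only comparable when the convention is the same.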
The distribution of TRBV genes over 5 scaffolds and the presence of > 180 Kb of undetermined sequence within two of the scaffolds indicate that characterisation of the genomic TRBV repertoire remains incomplete. Comparison with cDNA sequence data indicates that the number of undefined genes is substantial – only 36/86 (42%) of TRBV genes identified from cDNA analysis have corresponding identical sequences in Btau_3.1. Most of the identified TRBV genes missing from the assembly are members of the large subgroups TRBV6, 9, 19, 20, 21 and 29, further enhancing their numerical dominance. Although it is anticipated that completion of the TRB locus sequence will incorporate significant numbers of additional TRBV genes, the possible existence of insertion-deletion related polymorphisms (IDRPs), which can lead to intra-species variation in genomic TRBV gene repertoires as described in human and murine TRB loci [65
], may result in some of the genes identified in cDNA being genuinely absent from the sequenced bovine genome.
The proportion of TRBV pseudogenes in Btau_3.1 is 41%, comparable to that seen in both humans (29%) and mice (40%), suggesting that the 'death rate' in TRBV gene evolution is generally high [58
]. Pseudogene formation has occurred sporadically throughout the evolution of TRBV genes, with genes that have lost function tending to subsequently accumulate further lesions [9
]. The majority of bovine TRBV pseudogenes (57%) contain a single lesion and thus appear to have arisen recently; the remaining 43% have multiple lesions of varying severity and complexity (Additional file 2
). In addition to pseudogenes we also identified 7 sequences showing limited local similarity to TRBV genes in Btau_3.1 (Figure – open boxes). Such severely mutated TRBV 'relics', 22 of which have been identified in the human TRB locus [10
], are considered to represent the remnants of ancient pseudogene formation.
In contradiction to a previous report [39
], the repertoire of functional TRBV genes in Btau_3.1 exhibits a level of phylogenetic diversity similar to that of humans and mice. Phylogenetic groups A and E are over-represented in all 3 species, which in humans and cattle is largely attributable to expansion of subgroups TRBV5, 6, 7 and 10 and TRBV6, 9 and 21 respectively; in mice the expansion of subgroups TRBV12 and 13 makes a more modest contribution to this over-representation. Much of the expansion of human subgroups TRBV5, 6 and 7 occurred 24–32 MYA [13
] and similarly, as described above, in bovines much of the expansion of subgroups TRBV6, 9 and 21 appears to be very recent. As these expansions have occurred subsequent to primate/artiodactyl divergence (~100 MYA) [69
], over-representation of phylogenetic groups A and E must have occurred as parallel but independent events in these lineages, raising interesting questions about the evolutionary pressures that shape the functional TRBV repertoire.
In contrast to the wide variation in the organisation of TRBD, TRBJ and TRBC genes in the TRB locus seen in non-mammalian vertebrates [70
], in mammals the arrangement of tandemly located DJC clusters is well conserved [10
]. Although most placental species studied have 2, variation in the number of DJC clusters has been observed, with unequal cross-over events between TRBC genes usually invoked as the most likely explanation for this variation [36
]. The results from this study provide the first description of the entire bovine DJC region and confirm that like sheep, cattle have 3 complete DJC clusters [36
]. Dot-plot and sequence analyses indicate that unequal crossover between the ancestral TRBC1 and TRBC3 genes led to duplication of a region incorporating TRBC1, TRBD3 and TRBJ3 genes, generating the DJC2 cluster. The similarity with the structure of the ovine DJC region suggests that this duplication event occurred prior to ovine/bovine divergence 35.7 MYA [69
]. As with duplication of TRBV genes, expansion of TRBD and TRBJ gene numbers has increased the number of genes available to partake in somatic recombination – the 3168 different VDJ permutations possible from the functional genes present in Btau_3.1 is considerably more than that for either humans (42 × 2 × 13 = 1092) or mice (21 × 2 × 11 = 462). Interestingly, the bovine TRBD1 gene is the first TRBD gene described that does not encode a glycine residue (considered integral to the structure of the CDR3β) in all 3 reading frames [79
]. However, analysis of cDNA reveals evidence of functional TRB chains that express TRBD1 in the reading frame that does not encode a glycine but have generated a glycine codon by nucleotide editing at the VJ junction (data not shown).
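The combinatorial comparison above can be reproduced with a few lines of arithmetic. The human and murine V, D and J gene counts and the bovine total of 3168 are taken from the text; the bovine total is used as quoted and is not factorised here, and the function and variable names are illustrative only.

```python
def vdj_permutations(n_v, n_d, n_j):
    """Number of distinct V-D-J gene combinations available for recombination."""
    return n_v * n_d * n_j


# Functional (V, D, J) gene counts as quoted in the text.
gene_counts = {"human": (42, 2, 13), "mouse": (21, 2, 11)}
BOVINE_PERMUTATIONS = 3168  # total quoted for Btau_3.1

for species, (n_v, n_d, n_j) in gene_counts.items():
    total = vdj_permutations(n_v, n_d, n_j)
    fold = BOVINE_PERMUTATIONS / total
    print(f"{species}: {total} permutations; bovine total is {fold:.1f}-fold higher")
```

On these figures the bovine germline combinatorial potential is roughly 2.9 times that of humans and 6.9 times that of mice. The count covers gene combinations only and ignores junctional diversity.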
In contrast to TRBV, TRBD and TRBJ genes which encode products that bind to a diverse array of peptide-MHC ligands, TRBC gene products interact with components of the CD3 complex which are non-polymorphic. Consequently, due to structural restrictions TRBC genes are subject to concerted evolutionary pressures with intra-species homogenization through gene conversion evident in both humans and mice [9
]. Similarly, the bovine TRBC genes were found to encode near identical products, most likely as a result of gene conversion, although in the case of TRBC1 and TRBC2 genes this more probably reflects minimal divergence following duplication.
Comparison with the human and murine sequences shows that non-coding elements that regulate TRB expression, such as the Eβ, promoters and RSs, are highly conserved in the bovine. This is consistent with work demonstrating that the critical role of RSs has enforced a high level of evolutionary conservation [70
] and that Eβ and PDβ1 sequences are well conserved in eutherian species [36
]. Although transcriptional factor binding sites are less well conserved in the putative PDβ1 than the Eβ sequence, the Ikaros/Lyf-1 and Ap-1 binding sites of the PDβ1, which are vital in enforcing stage-specific recombination (i.e. Dβ-Jβ prior to Vβ-DβJβ), are conserved [53
]. Our analysis of putative TRBV promoter elements was restricted to the well described CRE motif [9
]. However, TRBV promoters are complex and expression of TRBV genes whose promoters lack the CRE motif is maintained through the function of other transcriptional factor binding sites [83
]. A more detailed analysis of the bovine TRBV promoters would be interesting given the potential influence this may have on shaping the expressed TRBV repertoire [25
], but is beyond the scope of the current study.
The portion of the bovine TRB locus described in Btau_3.1 encompasses > 730 Kb of sequence (excluding the regions of undetermined sequence in Chr4.003.105 and Chr4.003.108). Thus, although incomplete, the bovine TRB locus is larger than that of either humans (620 Kb) or mice (700 Kb), mainly as a consequence of the duplications leading to the dramatic expansion of the V genes. In contrast to V genes, duplication of trypsinogen genes within the TRB locus is more limited in the bovine (Figure ), where only 5 trypsinogen genes were identified, compared with the human and murine loci, where more extensive duplication has led to the presence of 8 and 20 trypsinogen genes respectively. Despite the differences in duplication events, the organisation of both TR and non-TR genes within and adjacent to the TRB locus exhibits a striking conserved synteny between cattle, humans and mice [9
]. Indeed, the organisation of genes within the TRB locus and its position relative to adjacent loci is ancient, with marked conserved synteny also demonstrated between eutherian and marsupial mammalian species and, to a large extent, chickens [9
]. Given the evidence for conserved synteny of TRBV gene organisation despite dissimilar duplication/deletion events between mice, humans and cattle, the results of the analysis completed in this study suggest that several subgroups including TRBV1, 2, 17, 22 and 23, which were not identified in the genome assembly or from cDNA sequences, may have been deleted from the bovine genome (Figure ). Conservation of synteny would predict that the genomic location of the TRBV27 gene identified from cDNA analysis will be within the region of undetermined sequence in Chr4.003.108 between the TRBV26 and 28 genes (Figure ).