To better characterize the bovine WC1 co-receptor family, we annotated the WC1 genes in the bovine genome Btau_3.1 assembly and identified 13 WC1 genes distributed between two loci on chromosome 5. These included a novel WC1 gene that more closely resembles swine WC1 than it does previously identified ruminant WC1 genes. Work is ongoing to resolve the gaps in the genome assembly of chromosome 5. However, the number of WC1 genes in the multi-gene family agrees reasonably well with previous reports estimating the occurrence of many related WC1 genes based on Southern blots [11] and of 13 WC1 genes based on cDNA analysis of intracytoplasmic tail region transcripts [48], but is fewer than the 50 WC1 genes predicted for sheep by Southern blotting [46, 47]. The distribution of WC1 genes between two loci is reminiscent of the two TCR gamma loci of cattle and sheep (TRG1 and TRG2), where it is predicted that a number of duplication events and a subsequent translocation event resulted in the formation of the TRG2 locus [51]. It is possible that similar events contributed to the formation of the WC1 loci because the distribution of WC1 genes based on sequence similarity does not support the idea that a single WC1 locus underwent duplication. Likewise, the distribution of the three distinct exon-intron structures identified here for the WC1 genes (i.e. Type I, II or III) between the two loci does not support locus duplication alone, because the Type II and III genes are found only within a single locus.
We verified the gene annotations by amplifying and sequencing WC1 cDNA from bovine cells derived from a single animal. cDNA evidence confirmed the presence of at least one functional gene, in an individual animal, for each of the three types of genes differentiated by their exon-intron structure. It is important to note that the profile of WC1 transcripts obtained from this individual animal is not necessarily representative of all WC1 genes expressed. This is despite using an experimental design in which cDNA from several conditions was pooled prior to PCR, as well as using a variety of primers in an attempt to avoid biased amplification of particular WC1 transcripts. In this regard, it was notable that transcripts representative of archetypal WC1.1, the only full-length sequence previously published [11], were not abundant, to the extent that it was necessary to design a separate primer to preferentially amplify archetypal WC1.1 transcripts. Nevertheless, we confirmed the presence of 8 of the 13 WC1 genes based on cDNA sequences that corresponded to genomic sequences, while cDNA evidence for WC1-2, WC1-6, WC1-7, WC1-8 and WC1-12 was not found. (Because many of those gene sequences are partial, it cannot be ruled out that cDNA evidence does exist for those genes but could not be classified as such at this point.) Although all cDNA transcript sequences varied to some extent from the corresponding genomic sequences, these variations are most likely attributable to single nucleotide polymorphisms between animals. Indeed, even within a single animal there was preliminary evidence of allelic polymorphism (C.T.A. Herzig, unpublished data). Only two WC1 sequences derived from RT-PCR, CH525 and CH583, lacked any corresponding genomic sequence and were assigned the gene names WC1-nd1 and WC1-nd2, respectively. A cDNA sequence lacking a corresponding genomic sequence could be a consequence of a gap in the genome sequence, which would preclude annotation of the corresponding gene. There is also evidence for copy number variation of WC1 genes among animals (G. Liu and J. Keele, personal communication, December 7, 2007), and this would also account for the observed differences.
Prior to these studies, a WC1 sequence corresponding to swine WC1 had not been identified in ruminants. The bovine swine-like WC1-11 reported here is structurally similar to swine WC1, containing 6 SRCR domains, a transmembrane region and a long intracytoplasmic tail. It has been suggested that swine WC1 is the primitive version of its ruminant ortholog [29], so it could also be reasoned that bovine WC1-11 is the most primitive of the ruminant WC1 genes. Interestingly, however, bovine WC1-11 has a very long intracytoplasmic region, while swine WC1 genes have intracytoplasmic regions that are approximately the same length as those of the more classical bovine WC1 genes despite much dissimilarity in sequence [45]. Current evidence for the classical bovine WC1 intracytoplasmic tails indicates that tyrosine and serine phosphorylation are important for activation signals and endocytosis, respectively [49]. It is possible that the presence of a tyrosine kinase phosphorylation motif within the unique portion of the bovine WC1-11 intracytoplasmic region could result in a signaling and/or functional role that is distinct from that of other WC1 genes.
In this regard, Wijngaard and co-workers identified and designated three distinct WC1 gene products as WC1.1, WC1.2 and WC1.3 based on reactivity with specific mAbs using WC1-transfected cells [45]. Based on those studies, WC1-bearing γδT cells were subsequently defined by mAb reactivity as WC1.1+, WC1.2+, WC1.1+/WC1.2+ or WC1.1+/WC1.3+, wherein the WC1.3+ population is found only as a subpopulation of WC1.1+ cells. While the sequence for WC1.1 has been reported in its entirety, only limited putative amino acid sequence for Domains 1 and 2 and nucleotide sequence for segments of the intracytoplasmic tails have been reported for WC1.2 and WC1.3. WC1.3 was unique with regard to its long intracytoplasmic tail [45], and that sequence corresponds to the Type II WC1 tail sequences reported here. However, based on our annotations, no WC1 gene was identified that had the sequence reported by Wijngaard et al. [45] as corresponding to Domain 1 of WC1.3; in fact, similar sequence was instead found in Domain 6 of WC1-4 and WC1-9, and thus we suggest that part of the published WC1.3 sequence is erroneous. Despite this problem, it has already been shown that functionally distinct subpopulations of bovine γδT cells can be defined based on the presence of particular WC1 molecules that react with monoclonal antibodies recognizing WC1.1 or WC1.2 [2, 14], and we now know that WC1 intracytoplasmic tails corresponding to the archetypal WC1.1 sequence play a critical role in signal transduction in response to antigen [49]. Thus, it is important to further evaluate the role of the long intracytoplasmic tail regions contained in the Type II genes WC1-9, WC1-10 and WC1-12 and the swine-like Type III gene WC1-11. Because Domain 1 is the most diverse among WC1 SRCR domains, as shown here, it is possible that it serves as the pattern recognition portion of the WC1 molecule and could be a region where bacterial products are bound, as occurs for DMBT1 [52]. Therefore, pairings of particular WC1 Domain 1s with particular intracytoplasmic tail regions may be crucial to directing γδT cell responses and functions. Future studies will be targeted towards better understanding those relationships. For instance, transfection experiments with WC1-4 and WC1-9 would enable us to determine whether they bind the same ligands while their intracytoplasmic tails send different signals, resulting in differing functional responses.
Finally, the occurrence of a large variety of bovine WC1 molecules can be explained only in part by the number of WC1 genes, since here we report evidence for extensive alternative splicing of bovine WC1 transcripts. In fact, all but one of the expressed WC1 genes we identified had corresponding splice variants. Immunoprecipitation of γδT cell membranes with anti-WC1 mAb yields a variety of bands, including 144, 180, 200, 220, 240 and 300 kDa [10, 12, 53-57], lending support to the occurrence of multiple isoforms and/or swine-like WC1-11 on γδT cells. While the possibility remains that what appear to be alternative splice variants are instead products of genes that were not identified during the annotation process as a result of gaps in the genomic sequence, all but two alternative splice variants can be related to unspliced transcript sequences that are ≥ 98% identical. Moreover, previous reports indicate that swine and sheep WC1 orthologs [29, 46] as well as other SRCR family immune system molecules (i.e. CD5, CD6 and CD163) produce transcripts that are alternatively spliced [22-27]. Interestingly, however, unlike for CD6 and CD163 [23-26], WC1 intracytoplasmic tail length appears to be dictated by the particular gene encoding a transcript and not by alternative splicing.
It is notable that Domain 1, the putative ligand-binding portion, was never found to be missing as a result of alternative splicing. While this could be an artifact of primer design, the forward primer was designed to anneal in the leader sequence and thus that explanation is unlikely. Precedent for multiple isoforms of a T cell co-receptor is provided by the two CD4-like genes in fish, which differ from each other structurally [58, 59]. The function of these smaller WC1 molecules with apparently intact Domain 1s and intracytoplasmic tails is unknown but intriguing. Because WC1 serves as a co-receptor on γδT cells, smaller WC1 molecules may be better able to co-cluster in the immune synapse with the shorter TCR chains (each being about 30 kDa). It is also possible that WC1 isoforms differ in their flexibility, given that full-length WC1 molecules contain inter-domain or "hinge" regions following SRCR domains 3, 8, and 10, and this could affect interaction with the TCR. Differences in flexibility have been noted for functionally different immunoglobulin heavy chains, with IgE and IgM lacking hinge regions. It remains to be determined whether different splice variants of the same gene are expressed by an individual cell; perhaps WC1 splicing is initiated following interaction of WC1 with its ligand.