Histocompatibility, the ability of an organism to distinguish its own cells and tissues from those of another, is a universal phenomenon in the Metazoa. In vertebrates, histocompatibility is a function of the immune system controlled by a highly polymorphic Major Histocompatibility Complex (MHC) that encodes proteins that target foreign molecules for immune cell recognition. The association of the MHC and immune function suggests an evolutionary relationship between metazoan histocompatibility and the origins of vertebrate immunity. However, the MHC is the only functionally characterized histocompatibility system; mechanisms underlying this process in non-vertebrates are unknown. A primitive chordate, the ascidian Botryllus schlosseri, also undergoes a histocompatibility reaction controlled by a highly polymorphic locus. We have isolated a candidate gene encoding an immunoglobulin superfamily member that by itself predicts the outcome of histocompatibility reactions. This is the first non-vertebrate histocompatibility gene described, and may provide insights into the evolution of vertebrate adaptive immunity.
Peptide presentation by the MHC is the foundation of the vertebrate adaptive immune system. Characterized by the presence of immunoglobulins, clonal B and T cells, somatic recombination, memory, and a number of other highly specialized processes, adaptive immunity has an incredibly complex, interrelated organization that is similar in form and function in all vertebrates, from sharks to humans1. However, as is clear from its name, the MHC was discovered because of its role in histocompatibility, and based on this dual function, it has long been suggested that there is an evolutionary connection between histocompatibility and vertebrate immunity. Ironically, MHC-based histocompatibility is thought to be an artifact of the MHC molecules' role in the adaptive immune response coupled to their extraordinary polymorphism. Furthermore, there is no evidence of the MHC, or of direct homologs of any adaptive immune components, in more primitive extant organisms, even those as closely related as the jawless fish1,2. Thus the mechanisms underlying histocompatibility in the non-vertebrates, the origins of any components of the vertebrate adaptive immune system, and the ultimate relationship of immunity and histocompatibility are completely unknown.
We are studying histocompatibility in the ascidian Botryllus schlosseri. A member of the chordate subphylum Urochordata, Botryllus begins life as a tadpole larva with many chordate traits, including a notochord, a dorsal, hollow nerve tube, and gill slits. After a 24-hour motile phase, these chordate structures are lost when the tadpole settles and metamorphoses into a sessile, invertebrate body plan, called an oozooid. In addition, Botryllus schlosseri is a colonial organism, and this initial metamorphosis is followed by a recurring, highly coordinated budding process which eventually gives rise to a large colony of asexually derived, genetically identical individuals, called zooids, united by a common vascular network (Figure 1a, panel 1).
At the periphery of the colony, the vasculature stops in small protrusions called ampullae (Fig. 1a, panel 1). The ampullae are the site of a naturally occurring histocompatibility reaction initiated when two colonies asexually expand into close proximity. Once juxtaposed, ampullae of each colony will reach out and begin an interaction (Fig. 1a, panel 2). This will result in either fusion of the two ampullae (forming a single chimeric colony sharing a common blood supply, panel 3), or a rejection reaction during which the interacting ampullae are destroyed, thus preventing vascular fusion (panel 4). Fusion or rejection is governed by a single, highly polymorphic locus called the FuHC 3–5. When two colonies share one or both FuHC alleles, they will fuse, whereas rejection occurs if no alleles are in common. The FuHC is highly polymorphic, with most populations containing tens to hundreds of alleles3–5. Thus this primitive chordate undergoes a histocompatibility reaction analogous to the vertebrate MHC-based reaction, and we wished to uncover the molecular mechanisms underlying this process.
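The fusion rule stated above (fusion when at least one FuHC allele is shared, rejection otherwise) can be sketched in a few lines. This is only an illustration of the genetic rule, not a model of the underlying mechanism; the function name `predict_outcome` and the allele labels used in the usage lines are hypothetical.

```python
def predict_outcome(colony_a, colony_b):
    """Predict fusion or rejection for two diploid FuHC genotypes.

    Each genotype is a pair of allele labels, e.g. ("A", "B").
    Rule from the text: sharing one or both alleles leads to fusion;
    sharing none leads to rejection.
    """
    shared = set(colony_a) & set(colony_b)
    return "fusion" if shared else "rejection"


# Hypothetical genotypes, for illustration only.
print(predict_outcome(("A", "B"), ("B", "C")))  # one shared allele
print(predict_outcome(("A", "B"), ("C", "D")))  # no shared allele
```

Under this rule a homozygote is handled without special treatment, because the genotype is reduced to its set of distinct alleles before comparison.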
We have created conditions to raise and breed B. schlosseri in captivity and developed partially inbred lines homozygous for different FuHC alleles, allowing us to take a forward genetic approach to identify the FuHC locus6–8. As described previously8, mapping populations were created using FuHC-defined individuals as parents, the FuHC locus was mapped using a combination of AFLPs and bulk segregant analysis (Figure 1b), and a genomic walk was initiated from the linked markers using both bacterial artificial chromosome (BAC) and Fosmid genomic libraries. The physical map now consists of 3 contigs spanning 1.3 Mbp, with one chromosomal breakpoint crossed (not shown). Over 1 Mbp of the minimal tiling path has been sequenced.
The strategy for identifying candidate FuHC gene(s) is based on the extraordinary polymorphism of the locus. Predicted genes from genomic sequence are analyzed for expression using RT-PCR, then compared to those from the genomic clones. The genomic (BAC and Fosmid) and cDNA libraries are made from FuHC-defined, but non-histocompatible, individuals; thus we can concurrently survey for both expression and polymorphism of these cDNAs. The FuHC gene(s) should have polymorphisms with the following three characteristics: 1) absolute correlation with defined histocompatibility alleles in a fusion assay; 2) co-segregation with the same alleles in a defined cross; and 3) extraordinary polymorphism in colonies isolated from the wild. Finally, expression patterns should correlate spatially with the functional aspects of histocompatibility.
A Genscan9 analysis of a sequenced contig consisting of two overlapping fosmid clones (531d19, 557i23; both segregated without recombination with the FuHC, Figure 1c) predicted a gene model encoding a transmembrane protein with an extracellular immunoglobulin (Ig) domain. A full-length cDNA was isolated via RACE and was 3.2 kb in length, predicting an open reading frame of 1007 residues, and was highly polymorphic. The cFuHC is a type I transmembrane protein with the majority of the protein (852 residues) extracellular, followed by a transmembrane domain and an intracellular tail of 128 residues.
The domain structure of the cFuHC is shown in Figure 1d. The N-terminus begins with a signal sequence, followed by an extracellular EGF repeat, then two tandem Ig domains, followed by the transmembrane domain and an intracellular tail. BLAST searches show that the EGF repeat has homology to notch and tenascin at E values of 5e-05; the region encompassing the two Ig domains is homologous to Immunoglobulin Superfamily Member 4D/nectin-like 3 from a variety of vertebrate species (E = 7e-10), with the highest homology to chicken. No direct homolog was identified in other sequenced ascidian genomes. 3D modeling on the PSSM fold recognition server suggested that the Ig domains have the highest homology to the poliovirus receptor, CD155. Conserved domain searches suggest that the first Ig domain is potentially a variable or more ancient intermediate type domain, and the second is closest to a C2-type, but is divergent and may not be easily classifiable (outlined in red, Supplementary Figure 1). These analyses depend on the presence of conserved residues throughout the domain and the spacing between these residues10; however, without structural data the true configuration remains unknown. The genomic structure of each Ig domain is also different: the first domain is encoded in three exons, while two exons encode the second domain. The intracellular tail has several potential phosphorylation sites, but it is unknown whether these are functional.
The cFuHC gene spans 33 kb and consists of 31 exons (Supplementary Figure 1). In addition to the transmembrane form of the protein, which consists of 27 exons, there are two alternative splice variants of the cFuHC gene (Supplementary Fig. 1, bottom). First, a short, secreted form is created by an alternative splice which adds three new exons after exon 14 (exons 15–17). These exons, physically located between exons 14 and 18, encode another putative low-homology EGF domain, followed by stop codons in all three reading frames and ca. 300 base pairs of 3′ untranslated sequence (UTR). The predicted secreted form does not include the Ig domains and, as discussed below, is expressed in both the tadpole and adult colonies. Second, there is a rare splice variant in the cytoplasmic domain which has been observed in 5% of the sequences covering the region. This is due to the presence of an extra 113 bp exon which is physically located between exons 29 and 31 of the full-length clone (exon 30; Figure 2), but is normally spliced out. When present, this exon encodes a new, shorter cytoplasmic domain, and the exon ends in a stop codon. Exon 31 is still present in this alternatively spliced form, but is now 3′ UTR. The shorter cytoplasmic domain adds no known motifs, and removes the predicted phosphorylation sites encoded in exon 29. This form has also been identified in both the tadpole and adult. Finally, there are also three splice variants with combinations of exons 15 and 16 that contain the rest of the transmembrane form of the gene (exons 18–31). However, these variants do not form open reading frames due to frameshifts between exon boundaries when these extra exons are present in any combination. These may be non-functional transcripts made during splicing of the secreted form of the protein.
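The exon bookkeeping in this paragraph can be checked with a short sketch. It assumes exons are numbered 1–31 in genomic order and that each isoform is simply a subset of those exons, as the text describes; the variable names are hypothetical, and the secreted form is taken to be exons 1–17 (exons 1–14 plus the added exons 15–17).

```python
# All exon numbers in genomic order (31 exons, per the text).
ALL_EXONS = list(range(1, 32))

SECRETED_SPECIFIC = {15, 16, 17}   # added after exon 14 in the secreted form
RARE_CYTOPLASMIC = {30}            # extra 113 bp exon, normally spliced out

# Transmembrane form: everything except exons 15-17 and exon 30.
transmembrane = [e for e in ALL_EXONS
                 if e not in (SECRETED_SPECIFIC | RARE_CYTOPLASMIC)]

# Secreted form (assumption: exons 1-14 followed by exons 15-17).
secreted = [e for e in ALL_EXONS if e <= 17]

# Rare variant: transmembrane form plus exon 30; exon 31 becomes 3' UTR.
rare_variant = [e for e in ALL_EXONS if e not in SECRETED_SPECIFIC]

print(len(transmembrane), len(secreted), len(rare_variant))
```

The transmembrane list has 27 entries, which agrees with the exon count given in the text for that form.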
To analyze segregation of the cFuHC, we used a fragment amplified between the two Ig domains containing parts of exons 18 and 19 and a small (250bp) intron. As described in the methods, this fragment was highly polymorphic, and included multiple substitutions and insertion/deletions. These polymorphisms segregated absolutely with histocompatibility (as assayed by fusion/rejection or genetic mapping) for all individuals in our mapping crosses (Table 1). In addition, we correlated cFuHC polymorphisms with a number of individuals from multiple generations of our main FuHC AB x AB partially inbred lines from the last 16 years. In all cases, correlation of cFuHC polymorphisms with phenotypic and/or genetic FuHC typing was absolute.
To correlate cFuHC polymorphism with functional histocompatibility, wild-type colonies were grown in the lab and assayed for phenotypic fusion/rejection. We identified five robust pairs of fusing and five pairs of rejecting genotypes, including one pair of siblings which rejected each other, as well as two pairs of fusing colonies from different geographic locations. Full-length or extracellular polymorphic regions of the cFuHC were cloned from naïve subclones of each individual and compared among genotypes. In all cases, the alleles identified by sequencing correctly predicted the outcome of histocompatibility reactions (Table 2). Thus far, we have not identified fusing genotypes which do not share an identical allele, or rejecting genotypes which do share an allele. The cFuHC polymorphisms predict the outcome of histocompatibility reactions by themselves.
These experiments also confirmed that the cFuHC is a single locus. The segregation analysis in Table 1 revealed no aberrant banding patterns, and the experiments on wild-type individuals in Table 2 never revealed more than two alleles per genotype. In addition, screening of the entire 13X genomic library revealed only known alleles, which was confirmed by Southern blotting (not shown).
To analyze the overall polymorphism of wild-type colonies we sequenced full-length clones from 10 wild-type individuals collected from around the Monterey Bay. This resulted in the isolation of 18 cFuHC alleles; each individual was heterozygous. At the nucleotide level, any two alleles diverge by an average of 4% of the nucleotides. Amino acid polymorphisms of the cFuHC are illustrated in Figure 2. While alleles are quite divergent (Fig. 2A), polymorphism is spread throughout the protein, with no obvious highly variable regions (Fig. 2B). As shown, the majority of divergence between any two alleles is single amino acid changes located throughout the extracellular domain.
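The average pairwise divergence reported above is the kind of figure that can be computed from an alignment as a mean proportion of differing sites. The following is a minimal sketch under stated assumptions: the sequences are already aligned to equal length, gap columns are skipped, and the three short sequences are toy data, not cFuHC alleles. The function names are hypothetical, and this is not necessarily the procedure used in the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_divergence(alleles):
    """Average p-distance over all unordered pairs of alleles."""
    pairs = list(combinations(alleles, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences, for illustration only.
toy_alleles = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
print(mean_pairwise_divergence(toy_alleles))
```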
The final criterion for a candidate histocompatibility protein was the correct expression pattern. The FuHC gene should be expressed at sites of histocompatibility in the adult, and there must also be expression in the larva; histocompatibility is functional in the oozooid, thus the effector systems must be undergoing education processes prior to metamorphosis to be able to distinguish between FuHC alleles (discussed below). RT-PCR was done on cDNA isolated from different tissues (Figure 3a), and revealed that both membrane-bound and secreted forms of the FuHC are expressed in the adult in both ampullae and blood, and also in the tadpole. In addition, the cFuHC does not appear to be expressed in germline tissue using these or other techniques. This corroborates previous results suggesting that the FuHC does not appear to be involved in fertilization11,12.
Expression patterns were also visualized by in situ hybridization (Figure 3b). In the oozooid, strong cFuHC expression is seen in the epithelia of the ampullae, as well as in a subset of blood cells. Expression in the tadpole is restricted to a region at the anterior of the larva, outlining the nascent ampullae. Expression in isolated ripe eggs is not seen. In summary, the cFuHC is expressed strongly in tissues intimately associated with histocompatibility.
We have isolated and characterized a highly polymorphic candidate fusion/histocompatibility (cFuHC) gene from the primitive chordate Botryllus schlosseri which encodes an immunoglobulin superfamily member. The cFuHC segregates absolutely with histocompatibility in all mapping crosses, by itself predicts the outcome of interactions between wild-type colonies, and is expressed in tissues directly associated with the natural transplantation reaction: by every criterion this candidate is directly responsible for controlling histocompatibility. Other genes with these characteristics have not been identified in the FuHC locus, and together this is strong evidence that the cFuHC is the first metazoan histocompatibility ligand ever identified outside the vertebrates.
The ability to discriminate between multiple ligands is the universal principle shared by immunity and histocompatibility, leading F.M. Burnet to hypothesize that the origins of the most complex recognition system in the Metazoa, the vertebrate adaptive immune system, would be found in histocompatibility systems of more primitive organisms13.
Does identification of the cFuHC in a primitive chordate confirm this hypothesis? As a type I transmembrane protein with multiple extracellular Ig domains, the cFuHC is certainly structurally similar to the vertebrate MHC; however, it is clearly not a direct homolog: for example, the Ig domains do not correspond to the C1 type found in the MHC. Alternatively, it has been suggested that vertebrate genes with the more ancient Ig domains, like human IgSF4, with which the cFuHC has high homology, may be descendants of the first antigen receptors2,30. This is also based on observations of the organization of these genes in mammalian genomes2, which may also be shared with the FuHC (not shown). Further structural and genetic analysis of the cFuHC may help elucidate the evolutionary origins of the different Ig domains, and ultimately, the molecules in adaptive immunity.
From a functional standpoint, there is another possible relationship between histocompatibility in Botryllus and the origins of vertebrate immunity. In Botryllus, individuals sharing only a single allele are compatible. It seems unlikely that the hundreds of FuHC alleles could evolve to clearly distinguish themselves via only homotypic interactions, suggesting that effector systems in Botryllus work by missing-self recognition, a strategy first identified in vertebrate Natural Killer (NK) cells14. We hypothesize that a population of effector cells is educated to both FuHC alleles, and that recognition of just one allele blocks a default rejection reaction. Histocompatibility in Botryllus evolutionarily links an innate function of vertebrate NK cells, scanning for altered MHC expression, to a highly polymorphic, Ig superfamily ligand. Thus missing-self recognition and natural killing effector systems may have originated in histocompatibility systems of non-vertebrates. In fact, the genome of the related ascidian, Ciona intestinalis, contains a number of genes highly homologous to those involved in natural killing in the vertebrates33. A major goal of future research will be to identify the cells, molecules and mechanisms underlying the function and education of this effector system. Interestingly, we have already found what appears to be a putative polymorphic receptor encoded within 200 kb of the cFuHC. However, these polymorphisms play no role in histocompatibility outcomes (Nyholm et al., in preparation).
Vertebrate MHC-based histocompatibility is thought to be an unintended consequence of peptide presentation and polymorphism, but histocompatibility and, more importantly, extraordinary polymorphism, have a well-defined function in B. schlosseri. After compatible colonies fuse, blood-borne germline or totipotent stem cells transfer between colonies and have the ability to expand and differentiate in the newly arising, asexually derived individuals of the vascular partner. This can often result in a situation where only one of the genotypes is represented in the gametic output of the fused individuals, a situation which can remain constant for life15–17. FuHC polymorphism functions to restrict vascular fusion, and the possibility of this germline parasitism, to kin. This stem cell parasitism is thought to be widespread, and to be the selective force driving the evolution of polymorphic histocompatibility systems in other phyla18,19. Moreover, this movement of stem cells in Botryllus is reminiscent of the transfer of hematopoietic stem cells between fetal mammals sharing a common placenta, the basis of natural tolerance20. These same studies did not identify germline chimeras despite the fact that mammalian germline precursors are also migratory during development20,21. Thus it may be that histocompatibility is still one function of vertebrate immunity, providing both a historical origin and a contemporary selective force to explain extraordinary MHC polymorphism. Alternatively, the FuHC may represent an undiscovered Ig-based system of allorecognition which still exists in the vertebrates; for example, NK cells have recently been found to interact with a non-MHC ligand via missing-self recognition22,23.
Furthermore, CD155, a nectin family member with which the cFuHC shares structural homology, is a ligand for the NK activating receptor CD226 (ref. 31). In any case, the cFuHC is the first non-vertebrate histocompatibility ligand ever identified, and as an Ig superfamily member provides the first structural link between invertebrate histocompatibility and vertebrate adaptive immunity.
Conditions for raising and crossing Botryllus schlosseri in the laboratory, techniques for assaying phenotypic histocompatibility, genetic and physical mapping of the FuHC locus, sequencing, DNA and RNA isolation and cDNA synthesis have been comprehensively described elsewhere6,8. Enzymes were from NEB (Beverly, Mass., USA). All chemicals were purchased from Fisher (Pittsburgh, Pa., USA). Oligonucleotides were synthesized by Operon (Chatsworth, Calif., USA). Kits were used for specific processes as described below.
For fusing or rejecting pair analysis, pregnant colonies were collected from the wild, hatched and reared in the lab. One naïve subclone was removed and DNA and RNA isolated; another subclone was used in a fusion/rejection assay3–5. Animals are named according to their collection site. Those starting with a number are from the Monterey Marina; other collection sites are Half Moon Bay (HM), Santa Cruz (SC), and Moss Landing (ML), CA. This is followed by a lowercase letter which indicates an F1 individual; thus individuals 1960b and 1960c are progeny from the same wild-type colony.
BAC and Fosmid sequences were analyzed using BLAST24 and GENSCAN9. Putative expressed genes were analyzed for topology and functional domains using the HMMTOP transmembrane predictor25 and ELM. Sequences of interest were then isolated by RACE27 and RT-PCR using the SMART II RACE kit as described (Clontech). Initial primers for the cFuHC were in two regions. STS 1 covered the Ig domain: sense, 5′-gtatgggacaacacaggaaattctac-3′; antisense, 5′-gtgacgttttagtccataggatatcag-3′. STS 2 was 5′ of this region: sense, 5′-tactattgagtgtatgaacggtgatgt-3′; antisense, 5′-ttctatcgcccctatatagttttgtaa-3′. These same primers were used to isolate the entire gene by RACE. The entire gene was amplified with the following primers: at the N terminus, the sense primer 5′-aacgatgaatgggttcgcgattttc-3′, which encompasses the start codon, and one of two antisense primers located near the stop codons of each form of the mRNA. The secreted form antisense primer is 5′-tcgttatgctgtattcattcaa-3′. For the membrane-bound form, the antisense primer is 5′-aagcttctttcagagctactatcttca-3′. For the membrane-bound form, we usually amplified in two overlapping ca. 1.5 kb fragments from primers designed over the exon 14/exon 18 junction. The forward primer was 5′-tacgttgattggaactgtcgacttgaagta-3′ for amplifying the 3′ fragment, and the reverse primer was 5′-tacttcaagtcgacagttccaatcaacgta-3′ for amplifying the 5′ fragment, each used in conjunction with the primers described above. This second strategy was used because there are a number of alternative splice variants which encode non-functional transcripts in exons 15–17. None of these variants form ORFs, due to frameshifts in the exon splice boundaries (not shown). As these are near the secreted alternative splice site (after exon 16), we hypothesize that they are nonfunctional transcripts made during mRNA processing.
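The two junction primers listed above appear to be exact reverse complements of one another, which is what would be expected for a forward and a reverse primer placed on the same exon 14/exon 18 junction so that the two amplified fragments overlap. A small sketch to check this, using the primer sequences as printed; the helper name `reverse_complement` is hypothetical.

```python
# Complement table for lowercase DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Junction primers as printed in the methods (5' to 3').
JUNCTION_FORWARD = "tacgttgattggaactgtcgacttgaagta"
JUNCTION_REVERSE = "tacttcaagtcgacagttccaatcaacgta"

# True if the two primers cover the same 30 bp on opposite strands.
print(reverse_complement(JUNCTION_FORWARD) == JUNCTION_REVERSE)
```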
The majority of segregation analysis was done using STS 1, above, as the forward and reverse primers were each on different exons and split by a small (250 bp) intron. This intron was highly polymorphic, and the nature of these polymorphisms allowed a PCR-RFLP approach to distinguish between all the FuHC alleles in our mapping populations. The FuHC A, B, and Y alleles could be differentiated by EcoRI, HindIII and HincII polymorphisms; the techniques have been described previously8. Segregation was confirmed by sequencing of the entire clone from recombinant individuals in each cross (not shown). To detect expression via RT-PCR we used STS 1 and 2, as well as two other primer sets tailored for the membrane-bound or secreted form. The membrane-bound form-specific RT-PCR set was: sense, Ig STS1 F (shown above); antisense, 5′-gtacctcaagtaccacacgccccaat-3′. The secreted form set was: sense, 5′-tgcagcttggaagtttttg-3′; antisense, 5′-tcgttatgctgtattcattcaa-3′. For both wild-type fusion/rejection and polymorphism analysis, pregnant colonies were brought in from the field (from marinas in Monterey, Moss Landing and Santa Cruz, CA) and tadpoles hatched and reared in the lab. Fusion/rejection was visibly assayed, and the cFuHC was isolated as described above. Protein polymorphisms and phylograms were analyzed using ClustalW32.
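The PCR-RFLP idea can be illustrated with a short sketch that cuts an amplicon in silico at the recognition sites of the three enzymes named above and reports fragment lengths. The recognition sequences and top-strand cut positions are the standard ones for EcoRI (G^AATTC), HindIII (A^AGCTT) and HincII (GTY^RAC); the two amplicon sequences are toy data invented for illustration, not the STS 1 alleles, and the function name is hypothetical.

```python
import re

# Enzyme -> (recognition site as a regex, top-strand cut offset within the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "HincII": ("GT[CT][AG]AC", 3),  # Y = C or T, R = A or G
}


def fragment_lengths(seq, enzyme):
    """Lengths of the fragments produced by digesting a linear amplicon."""
    pattern, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = [m.start() + offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy amplicons (hypothetical): one carries an EcoRI site, the other a HincII site.
toy_allele_x = "ACGT" * 3 + "GAATTC" + "ACGT" * 5
toy_allele_y = "ACGT" * 3 + "GTCGAC" + "ACGT" * 5

for name, seq in (("x", toy_allele_x), ("y", toy_allele_y)):
    print(name, {enz: fragment_lengths(seq, enz) for enz in ENZYMES})
```

An uncut amplicon is returned as a single fragment of full length, so alleles that differ in the presence of a site give different banding patterns for that enzyme.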
In situ probes, amplified from primers described above, were subcloned into T-vector (Promega), sequenced to determine orientation, then re-amplified with T7- and SP6-specific primers. After PCR cleanup (Qiagen), in situ hybridization probes were made using T7 and SP6 RNA polymerase according to the manufacturer's instructions (Boehringer). The whole-mount in situ hybridization protocol in this study was carried out according to references 28 and 29, on lab-reared individuals. AP detection was done using either a BCIP/NBT or Vector® Blue alkaline phosphatase substrate kit (Vector Laboratories, Burlingame, CA) with levamisole. After color development, samples were post-fixed in 4% paraformaldehyde in PBS (20 min at 4 °C), washed in PBS, mounted on slides and visualized on an Olympus BH-2 microscope.
We thank Chris Amemiya, Dan Rokhsar, Ron Davis, Audrey Southwick, Molly Miranda, David Ransom, John Cannon, Gary Litman, Robert Haire, Ayelet Voskoboynik, Jason Wallace and Erin Johnson. This study was supported by grants from the NIH (AI04588; Trans-BAC Sequencing) and the Community Sequencing Program at the DOE/JGI.
Supplementary Information is linked to the online version of the paper at www.nature.com/nature
Author Information Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests.