The 3.6 Mb MHC region on human Chromosome 6 contains 140 genes between flanking markers MOG
]. We found that the opossum MHC region bounded by the same flanking markers spans 3.95 Mb and contains 114 genes, recognized by homology to known genes from other species and/or the presence of open reading frames (ORFs). Eighty-seven of these genes are shared with the human MHC. A list of putative opossum MHC gene transcripts and our opossum MHC genome browser containing annotation are located at http://bioinf.wehi.edu.au/opossum. The opossum MHC is located on Chromosome 2q near the centromere, oriented with MOG proximal [11]. Physical mapping of 19 bacterial artificial chromosome (BAC) clones, corresponding to loci spaced along the entire scaffold, confirmed the accuracy of the assembly.
Map of the Opossum MHC. Coordinates are relative to scaffold_42:14,700,000 from MonDom2.
Fluorescence In Situ Hybridization of Opossum MHC-Containing BAC Clones on Opossum Metaphase Chromosomes
The opossum MHC is similar in size and gene content to the MHC of eutherian mammals. However, the organization of the opossum Class I, II, and III regions is different from that of eutherians and shows more similarity to the organization seen in birds and amphibians. Ostensibly, the main difference between the opossum MHC and that of eutherians is the position of the Class I genes. The opossum MHC has (1) a Class I/II region that contains interspersed Class I and II genes, (2) a “framework region” that is composed of only the framework genes in the opossum, but which also includes Class I genes in eutherians, (3) a Class III region with gene content and order highly conserved with eutherians, and (4) two extended regions that flank the MHC, corresponding to the eutherian extended Class I and II regions, and containing a very similar gene content and order.
Comparative Map of the MHC Organization in Mammals
In the opossum, Class I and II regions are adjacent and interspersed rather than being separated by Class III. This unique arrangement of Class I and Class II loci has not been described in any other mammalian species. The proximity of Class I and II genes in the opossum as well as in non-mammalian vertebrates implies that Class I and II genes were originally located close together in the mammalian ancestor. This conclusion is further supported by the presence of Class I pseudogenes in the human Class II region, and the presence of both functional and non-functional Class I genes in the rodent Class II region [1
A Model of the Evolution of the Mammalian MHC
In humans and mice, the MHC contains a region referred to as the Class I framework region due to the presence of a set of non-Class I or II genes, amongst which the Class I loci are interspersed [7]. The content and gene order of these framework genes are conserved between mice and humans. Remarkably, the opossum MHC contains a homologous cluster of framework genes (including MOG, PPP1R11, TRIM26, TRIM39, GNL1, POU5F1,
) next to the Class III region opposite the Class I/II region. These genes are in the same order as in the eutherian Class I framework region, but they lack the interspersed Class I loci. This implies that a block of Class I framework genes was established near the MHC locus prior to the translocation of the Class I genes to this region in eutherians. Framework genes have not been reported in the MHC of non-mammals, but it is likely that the association of the framework region genes is ancient and that the framework region moved into the MHC en masse, given that five framework region genes appear on the same scaffold as Class III genes in Xenopus tropicalis (Ensembl scaffold_547) (unpublished data).
We present a model to explain how MHC organization evolved from a simple ancestral form to the complex forms seen today in therian mammals. We propose that in the MHC of a therian ancestor of marsupials and eutherians, Class I and Class II loci were located together at one end of the region, along with the antigen processing genes. A similar hypothesis was suggested previously based on studies of MHC organization in non-mammals [12]. Adjacent to the Class I and II regions was a gene-rich Class III region that already contained most of the genes present in human, mouse, and opossum MHC. The framework region, devoid of most or all Class I genes, assembled on the opposite side of Class III. The extended regions are present in the opossum as well as in eutherians, and therefore must have been present in the ancestral form. Studies have identified extended region genes in close proximity to MHC genes in teleost fish, despite the overall non-linkage of Class I, II, and III genes in these species [3
The eutherian MHC Class I–III–II structure exemplified by rodents and primates evolved relatively recently. Class I genes must have relocated across the Class III region and interspersed between the framework genes after the divergence of marsupials and eutherians, but prior to the divergence of primates and rodents (~60 million years ago) [15]. This process gave rise to the eutherian Class I region. It is unclear how or why the Class I genes relocated, but Class I loci appear to “migrate” in different species along with their frequent expansions and contractions. Specifically, Fukami-Kobayashi et al. [16] have suggested that long interspersed nuclear element (LINE) sequences can trigger genome fragment duplications, producing pairs of duplicated genome fragments. Perhaps a series of duplicated genome fragments inserted themselves between framework genes in ancient eutherian mammals and have since been evolving via expansions and contractions in their new location.
The opossum MHC is unique in that Class I and II genes are interspersed and closely linked to antigen processing genes. The Class I expansion has occurred within the Class II region. The opossum MHC Class I/II region contains 11 putative Class I and 10 Class II genes (predicted coding sequences available at http://bioinf.wehi.edu.au/opossum). Class II loci include the non-classical DMA
genes, whose homologs are found in birds and eutherians. Three marsupial-specific classical Class II gene families are present: DA, DB
], and a newly discovered family that we have designated DC
Phylogenetic Tree of the MHC Class II Genes
Of the 11 Class I loci in the opossum MHC, only one, UA1, is known to have all the characteristics expected of a classical Class Ia locus, being both ubiquitously expressed and highly polymorphic. UA1 transcripts have been detected in all tissues tested by RT-PCR and account for all previously described Class Ia cDNAs [18]. The level of UA1 polymorphism is also comparable to that of human HLA-A (N. Gouin, P. B. Samollow, M. L. Baker, and R. D. Miller, unpublished data). Expression of a single classical Class Ia gene in the opossum is unusual for a mammal, but not unprecedented in vertebrates. For example, both the chicken and Xenopus laevis have a single dominantly expressed functional Class Ia molecule [19]. Unlike X. laevis, the opossum Class Ia gene, UA1, does not appear to have allelic lineages.
RT-PCR Results Demonstrating Class I Expression
Two of the Class I loci (UA2 and UH) appear to be pseudogenes, because they lack a predicted ORF and have not been found expressed in any of the tissues examined (data not shown). Two other loci (UF and UL) have predicted ORFs, but their transcription has not been detected in any tissue so far and their functionality remains unknown. Five of the remaining Class I loci (UE, UK, UJ, UI, and UM) are all transcribed in the thymus; however, each has tissue-specific expression, suggesting that they are likely Class Ib in nature (S. D. Melman, M. L. Baker, and R. D. Miller, unpublished data). UG is transcribed in all tissues tested, including thymus, but its peptide binding sites are not polymorphic, suggesting that it is a Class Ib gene (data not shown; N. Gouin, M. L. Baker, P. B. Samollow, and R. D. Miller, unpublished data). Overall, the majority of the 11 opossum Class I loci are transcribed. Since transcription is detected in the thymus, these have the potential to participate in T-cell selection, although other functions in thymic differentiation and T-cell development and regulation cannot be ruled out.
The expressed Class I loci in the opossum Class I/II region are highly diverse, sharing between 49% and 83% nucleotide identity over exons 2, 3, and 4. A phylogenetic analysis of the Class I loci, including Class Ia and Ib loci from other species, is shown in the phylogenetic tree of the MHC Class I genes. Despite the sequence divergence of opossum Class I loci, they are phylogenetically related and probably evolved from common ancestral loci. This observation raises some questions about one of the current theories explaining the general absence of non-classical Class I genes within the MHC of non-mammals. It has been suggested that proximity of Class I genes to the antigen processing genes has constrained their divergence [12]. In eutherians, loss of this tight association by movement of the Class I genes away from the antigen processing genes may have resulted in increased plasticity that led to fluctuations in gene number and function and allowed Class Ib genes to reside in the MHC [12]. However, in the opossum, antigen processing genes have not constrained the diversification of the adjacent classical and non-classical Class I genes. It is unclear what selective advantage, if any, might have been gained by the separation of Class I from Class II or antigen processing genes in eutherians. Close linkage has been implicated in co-evolution of Class I genes and antigen processing genes [19]. Perhaps the Class I genes in M. domestica have evolved to be less constrained by their proximity to the antigen processing machinery, allowing them to duplicate and diversify in close linkage to the TAP genes. Alternatively, co-evolution with the antigen processing machinery may have severely restricted Class I evolution in this marsupial, perhaps resulting in only a single locus performing the classical role.
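The nucleotide identity range quoted above for the expressed Class I loci (49% to 83% over exons 2, 3, and 4) is a pairwise statistic. The text does not say how identity was calculated, so the following is only a minimal sketch of one common convention: matching positions divided by aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are illustrative assumptions, not the authors' method or data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the source text): columns where either sequence
    has a gap character '-' are excluded from the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
locus_1 = "ATGGCC-TTA"
locus_2 = "ATGACCGTTA"
identity = percent_identity(locus_1, locus_2)
```

Other conventions (for example, counting gapped columns as mismatches) give lower values for the same alignment, so the chosen denominator should be stated alongside any reported figure.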
Phylogenetic Tree of the MHC Class I Genes
Class I loci, UB
were previously assumed to be linked to the MHC due to their high levels of sequence similarity to UA1
], but surprisingly they are not found on the scaffold containing the MHC region and have been localized to the telomere of Chromosome 2p, distant from the MHC at the 2q centromere. Localization of Class I genes outside the MHC implies that these genes may have a non-classical role. In eutherians, the Class I loci lying outside the MHC are among the most divergent from Class Ia genes [2]. However, in non-mammals, genes closely related to Class Ia have been found outside the MHC, and in sharks, Class Ib genes found outside the MHC share very high levels of similarity to the Class Ia genes [8]. These non-mammalian genes have been designated Class Ib without elucidation of their functional roles, based on levels of expression and polymorphism. Currently, we do not have information about polymorphism levels of UB
but their relatively low expression levels [24] may indicate evolution towards non-classical Class Ib functions. UB
are both flanked by marsupial-specific retroelements of the CORE-SINE type [24], which would be consistent with the role of such elements in Class I gene mobility [16] and may explain the recent relocation of UB
outside of the MHC [24]. The high level of sequence similarity of UA, UB,
raises the possibility that Class Ia genes can maintain their function when unlinked to the MHC.
Comparisons between MHC sequences of distantly related mammals highlight the conservation of the most important regulatory sequences, namely the SXY DNA motifs. Transcription of most MHC Class I and II genes is largely regulated by the Class II transactivator (CIITA), which interacts with several transcription factors, particularly those that bind to this motif [25]. Conservation of promoter elements in opossum Class I genes has been reported previously [24]. Using computational methods, we were able to identify SXY motifs upstream of most opossum MHC Class I and II genes. Eight SXY motifs were identified within 273 base pairs (bp) of the coding start in the opossum Class II genes. Overall, these motifs were found to be conserved between eutherians and the opossum. Eight SXY motifs were also identified upstream of opossum Class I genes. We were not able to identify the SXY motif in genes UK, UF,
. Furthermore, the S motifs in the promoters of genes UH, UI,
appear to be weak with respect to the eutherian pattern. This suggests that the opossum Class I SXY regions have diverged from their corresponding eutherian motifs (particularly in the X motif) more than the Class II SXY regions have. This is not unexpected given that Class II genes (classical and non-classical) are typically co-expressed, whereas the non-classical Class I genes tend to evolve novel functions.
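The motif search above is restricted to a window upstream of each coding start. The computational method itself is not described here, so the sketch below covers only the preparatory step: extracting the upstream window in the gene's own orientation. The 273 bp default comes from the text; the function names, the 1-based coordinate convention, and the toy scaffold are assumptions made for illustration.

```python
# Translation table for complementing DNA, preserving case and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(scaffold: str, coding_start: int, strand: str,
                    width: int = 273) -> str:
    """Sequence of up to `width` bp immediately upstream of a coding start.

    Assumptions (hypothetical conventions, not from the source text):
    - `coding_start` is the 1-based scaffold position of the first coding base;
    - for '-' strand genes that base lies at the high-coordinate end of the
      gene, and the window is returned reverse-complemented so that it reads
      5' to 3' towards the coding start.
    """
    if strand == "+":
        lo = max(0, coding_start - 1 - width)
        return scaffold[lo:coding_start - 1]
    if strand == "-":
        hi = min(len(scaffold), coding_start + width)
        return reverse_complement(scaffold[coding_start:hi])
    raise ValueError("strand must be '+' or '-'")


# Toy scaffold (hypothetical): positions 1..10 are A A C C G G T T A C.
toy = "AACCGGTTAC"
plus_window = upstream_window(toy, coding_start=5, strand="+", width=2)
minus_window = upstream_window(toy, coding_start=5, strand="-", width=3)
```

A motif scanner, whether a consensus pattern or a position weight matrix, would then be run over each returned window. Windows are truncated at scaffold ends rather than padded.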
Coordinates of the Opossum Class I and Class II Loci Located within the MHC
Analysis of the SXY Promoter Regions of MHC Class I and II Genes
Perhaps most significantly, our data also suggest an ancient relationship between the MHC and the natural killer complex (NKC), which contains C-type lectin natural killer (NK) cell receptor loci [27]. This relationship is drawn from the presence of two genes within the opossum MHC, MIC
is the most distant homolog to the polymorphic human Class I genes MICA
found to date. The MIC genes are Class I–related genes that encode ligands for NKG2D, a C-type lectin NK receptor [28
genes are not found within the MHC of rodents. Instead, rodents have closely related genes, known as MILL
. In a phylogenetic analysis, the opossum MIC is basal to a clade containing human MICA/B and the mouse MILL1/2 genes. The function of rodent MILL genes is not yet known [29], but our results support a common evolutionary origin of MIC
in eutherians. The presence of MIC in the opossum MHC, and its apparent absence in non-mammals, implies that MIC-like genes appeared before marsupials and eutherians diverged, and uniquely evolved into MILL
The osteoclast-associated receptor (OSCAR) was first discovered as a receptor on mouse osteoclasts [30], but it has recently been shown to participate in antigen uptake and processing for Class II molecules in dendritic cells [31
(also known as polymeric immunoglobulin receptor 3) is located within the leukocyte receptor complex (LRC) of humans, chimps, mice, and rats [32]. The presence of an OSCAR homolog within the opossum MHC is surprising. Using Genscan, we confirmed that the opossum OSCAR homolog contains an intact ORF and a predicted promoter. Human and opossum OSCAR share 47% identity at the amino acid level and are reciprocal best hits in BLAST searches (opossum OSCAR against human RefSeq: best hit NP_573399.1 OSCAR isoform 4, e-value = 7e−66). Further, the presence of OSCAR in the opossum MHC suggests that involvement in antigen processing may be its original function.
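The reciprocal best hit criterion used above for OSCAR can be stated compactly: two sequences are reciprocal best hits when each is the top-scoring match of the other in searches run in both directions. The sketch below is a hedged illustration of that logic only. It assumes hits have already been parsed into `(query, subject, e-value)` tuples and ranks by lowest e-value. The forward e-value 7e−66 and the accession NP_573399.1 are taken from the text; every other identifier and number is hypothetical.

```python
def best_hits(hits):
    """Map each query to its best subject, ranked by lowest e-value.

    `hits` is an iterable of (query, subject, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in hits:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Forward search: opossum proteins against human RefSeq.
# Only the first tuple reflects values given in the text.
opossum_vs_human = [
    ("opossum_OSCAR", "NP_573399.1", 7e-66),
    ("opossum_OSCAR", "hypothetical_human_2", 1e-20),
    ("opossum_other", "NP_573399.1", 1e-10),
]
# Reverse search: human RefSeq against opossum proteins (all values hypothetical).
human_vs_opossum = [
    ("NP_573399.1", "opossum_OSCAR", 1e-60),
    ("NP_573399.1", "opossum_other", 1e-12),
]

pairs = reciprocal_best_hits(opossum_vs_human, human_vs_opossum)
```

Ranking by bit score instead of e-value is an equally common choice. The two usually agree within a single database search, but the ranking rule is worth stating because ties and near-ties can change which pairs qualify.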
MHC Class I molecules are ligands for NK cell receptors, so these two gene families must co-evolve. Keeping up with the rapid evolution of the MHC loci in response to pathogenic pressures is thought to have resulted in the independent evolution of two vertebrate NK receptor families, the C-type lectin and Ig superfamily types. In humans, the C-type lectin NK receptors are found on Chromosome 12 within the NKC [27]. The second NK receptor family contains the killer cell Ig-like receptors (KIR) and is encoded in the LRC on human Chromosome 19 [27]. The recent discovery of C-type lectin NK receptor genes in avian MHC [6] supports an ancestral association of the MHC and the C-type lectin genes of the eutherian NKC. Just as birds provide an ancestral link between the MHC and NKC [33], the two aforementioned opossum genes, OSCAR
provide links between the MHC and LRC. OSCAR is in the MHC in opossum but in the LRC in humans and rodents. MIC genes are in the MHC of opossums and humans, whereas the related MILL1/2 genes are in the LRC of rodents [34]. These observations support the existence of an ancestral genomic region in amniotes that probably contained MHC Class I loci and NK cell receptor genes of both the KIR and C-type lectin forms. This organization would have allowed both classes of NK receptors to co-evolve with their MHC ligands.
genes were linked to the MHC of chickens, and may have been part of the primordial MHC [35]. Although a clear evolutionary relationship is evident between eutherian MHC Class I genes and CD1, CD1 is not located within the eutherian MHC. The marsupial homolog of CD1 has been identified in the opossum genome (M. L. Baker, S. D. Melman, and R. D. Miller, unpublished data). It is located on a separate scaffold (scaffold_13) from that containing the MHC and maps to Chromosome 2p. CD1, like the NK receptors, probably moved out of the MHC after the separation of mammals and birds but prior to the separation of eutherians and marsupials.
Comparative analyses of the MHC region in opossum and other species support the idea that at one time in vertebrate evolution there was a single “immune supercomplex” of genes that contained MHC Class I and II, antigen processing genes (TAP
and C-type and Ig-type NK receptor genes [37]. This complex is no longer found in any living species analyzed so far, but clues of its existence remain in extant genomes.