Knowledge of an individual’s HLA genotype is essential for modern medical genetics, and is crucial for hematopoietic stem cell and solid-organ transplantation. However, the high levels of polymorphism known for the HLA genes make it difficult to generate an HLA genotype that unambiguously identifies the alleles that are present at a given HLA locus in an individual. For the last twenty years, the histocompatibility and immunogenetics community has recorded this HLA genotyping ambiguity using allele codes developed by the National Marrow Donor Program (NMDP). While these allele codes may have been effective for recording an HLA genotyping result when initially developed, their use today results in increased ambiguity in an HLA genotype, and they are no longer suitable in the era of rapid allele discovery and ultra-high allele polymorphism. Here, we present a text string format capable of fully representing HLA genotyping results. This Genotype List (GL) String format is an extension of a proposed standard for reporting KIR genotype data that can be applied to any genetic data that employs a standard nomenclature for identifying variants. The GL String format employs a hierarchical set of operators to describe the relationships between alleles, lists of possible alleles, phased alleles, genotypes, lists of possible genotypes, and multilocus unphased genotypes, without losing typing information or increasing typing ambiguity. When used in concert with appropriate tools to create, exchange, and parse these strings, we anticipate that GL Strings will replace NMDP allele codes for reporting HLA genotypes.
Genotype; GL String; HLA; KIR
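The hierarchy of operators described in the GL String abstract above can be illustrated with a minimal parsing sketch. It assumes the five delimiters commonly associated with the GL String format, listed from loosest to tightest binding: `^` (unphased loci), `|` (possible genotypes), `+` (gene copies in a genotype), `~` (phased alleles) and `/` (possible alleles). The function names `parse_gl` and `to_gl` and the example string are hypothetical and are not taken from the paper; the sketch does no validation of allele names.

```python
# Minimal sketch of GL String parsing, assuming the delimiter precedence
# "^" < "|" < "+" < "~" < "/" (loosest to tightest binding).
# parse_gl / to_gl are illustrative names, not part of any published tool.

OPERATORS = ["^", "|", "+", "~", "/"]


def parse_gl(gl, depth=0):
    """Split a GL String into nested lists, one nesting level per operator."""
    if depth == len(OPERATORS):
        return gl  # a single allele name, e.g. "HLA-A*01:01"
    return [parse_gl(part, depth + 1) for part in gl.split(OPERATORS[depth])]


def to_gl(tree, depth=0):
    """Rebuild the GL String from the nested lists produced by parse_gl."""
    if depth == len(OPERATORS):
        return tree
    return OPERATORS[depth].join(to_gl(sub, depth + 1) for sub in tree)


# Illustrative (made-up) typing result: an ambiguous HLA-A genotype and a
# phased HLA-B~HLA-C genotype, reported together without known phase.
example = (
    "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03"
    "^HLA-B*08:01~HLA-C*07:01+HLA-B*44:02~HLA-C*05:01"
)

tree = parse_gl(example)
first_allele_list = tree[0][0][0][0]  # first locus, genotype, copy, phased slot
print(first_allele_list)
print(to_gl(tree) == example)
```

Because every level is split and rejoined on its own delimiter, the round trip through `parse_gl` and `to_gl` preserves the string, which mirrors the stated goal of recording a result without losing typing information.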
We have updated the catalogue of common and well-documented (CWD) HLA alleles to reflect current understanding of the prevalence of specific allele sequences. The original CWD catalogue designated 721 alleles at the HLA-A, -B, -C, -DRB1, -DRB3/4/5, -DQA1, -DQB1, and -DPB1 loci in IMGT/HLA Database release 2.15.0 as being CWD. The updated CWD catalogue designates 1122 alleles at the HLA-A, -B, -C, -DRB1, -DRB3/4/5, -DQA1, -DQB1, -DPA1 and -DPB1 loci as being CWD, and represents 14.3% of the HLA alleles in IMGT/HLA Database release 3.9.0. In particular, we identified 415 of these alleles as being “common” (having known frequencies) and 707 as being “well-documented” on the basis of ~140,000 sequence-based typing observations and available HLA haplotype data. Using these allele prevalence data, we have also assigned CWD status to specific G and P designations. We identified 147/151 G groups and 290/415 P groups as being CWD. The CWD catalogue will be updated on a regular basis moving forward, and will incorporate changes to the IMGT/HLA Database as well as empirical data from the histocompatibility and immunogenetics community. This version 2.0.0 of the CWD catalogue is available online at cwd.immunogenomics.org, and will be integrated into the Allele Frequencies Net Database, the IMGT/HLA Database and the National Marrow Donor Program’s bioinformatics web pages.
allele prevalence; common allele; CWD; HLA; sequence based typing; well-documented allele
MHC; NHP; database; nomenclature; IPD
The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/, is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of sequences of the major histocompatibility complex of different species; IPD-HPA, which covers alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour Cell-Line Database, a cell bank of immunologically characterized melanoma cell lines. The data is currently available online from the website and FTP directory. This article describes the latest updates and additional tools added to the IPD project.
Haematopoietic stem cell transplantation (HSCT) is a valuable tool in the treatment of many haematological disorders. Advances in understanding HLA matching have improved prognoses. However, many recipients of well-matched HSCT develop post-transplant complications, and survival is far from assured. The pursuit of novel genetic factors that may impact on HSCT outcome has resulted in the publication of many articles on a multitude of genes. Three NOD2 polymorphisms, identified as disease-associated variants in Crohn's disease, have recently been suggested as important candidate gene markers in the outcome of HSCT. It was originally postulated that, because the clinical manifestations of the inflammatory responses characteristic of several post-transplant complications closely resemble those seen in Crohn's disease, the two might share a common cause. Since the publication of this first paper, numerous studies have attempted to replicate the results in different transplant settings. The data have varied considerably between studies, and as yet no consensus on the impact of NOD2 SNPs on HSCT outcome has been achieved. Here, we review the existing literature, summarise current theories as to why the data differ, and suggest possible mechanisms by which the SNPs affect HSCT outcome.
It is 14 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Of these, 21 genes encode proteins of the immune system that are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the World Health Organization Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute, we are able to provide public access to these data through the website http://www.ebi.ac.uk/imgt/hla/. Regular updates to the website ensure that new and confirmatory sequences are dispersed to the HLA community and the wider research and clinical communities. This article describes the latest updates and additional tools added to the IMGT/HLA project.
Variable interaction between the Bw4 epitope of HLA-B and the polymorphic KIR3DL1/S1 system of inhibitory and activating NK cell receptors diversifies the development, repertoire formation and response of human NK cells. KIR3DL1*004, a common KIR3DL1 allotype, in combination with Bw4+ HLA-B, slows progression of HIV infection to AIDS. Analysis here of KIR3DL1*004 membrane traffic in NK cells shows this allotype is largely misfolded but stably retained in the endoplasmic reticulum, where it binds to the chaperone calreticulin and does not induce the unfolded protein response. A small fraction of KIR3DL1*004 folds correctly and leaves the endoplasmic reticulum to be expressed on the surface of primary NK and transfected NKL cells, in a form that can be triggered to inhibit NK cell activation and secretion of interferon-γ. Consistent with this small proportion of correctly folded molecules, trace amounts of MHC Class I co-immunoprecipitated with KIR3DL1*004. There was no indication of any extensive intracellular interaction between unfolded KIR3DL1*004 and cognate Bw4+ HLA-B. A similarly limited interaction of Bw4 with KIR3DL1*002, when both were expressed by the same cell, was observed despite the efficient folding of KIR3DL1*002 and its abundance on the NK cell surface. Several positions of polymorphism modulate KIR3DL1 abundance at the cell surface, differences that do not necessarily correlate with the potency of allotype function. In this context our results suggest the possibility that the effect of Bw4+ HLA-B and KIR3DL1*004 in slowing progression to AIDS is mediated by interaction of Bw4+ HLA-B with the small fraction of cell surface KIR3DL1*004.
We describe a novel approach to genetic association analyses with proteins sub-divided into biologically relevant smaller sequence features (SFs), and their variant types (VTs). SFVT analyses are particularly informative for the study of highly polymorphic proteins such as the human leukocyte antigen (HLA), given the nature of its genetic variation: the high level of polymorphism, the pattern of amino acid variability, and the fact that most HLA variation occurs at functionally important sites, as well as its known role in organ transplant rejection, autoimmune disease development and response to infection. Further, combinations of variable amino acid sites shared by several HLA alleles (shared epitopes) are most likely better descriptors of the actual causative genetic variants. In a cohort of systemic sclerosis patients and controls, SFVT analysis shows that a combination of SFs implicating specific amino acid residues in peptide binding pockets 4 and 7 of HLA-DRB1 explains much of the molecular determinant of risk.
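The relationship between sequence features and variant types described in the abstract above can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: it assumes a sequence feature can be represented as a list of 1-based amino acid positions, and a variant type as the string of residues an allele carries at those positions. The function name `variant_types`, the allele labels and the five-residue sequences are all hypothetical toy values.

```python
# Simplified sketch of grouping alleles into variant types (VTs) for one
# sequence feature (SF). All names and sequences are made-up toy data.
from collections import defaultdict


def variant_types(allele_seqs, positions):
    """Group alleles by the residues they carry at the SF's positions.

    allele_seqs: dict mapping allele name -> protein sequence (str)
    positions:   1-based amino acid positions that make up the SF
    Returns a dict mapping each VT (residue string) -> list of alleles.
    """
    groups = defaultdict(list)
    for allele, seq in allele_seqs.items():
        vt = "".join(seq[p - 1] for p in positions)
        groups[vt].append(allele)
    return dict(groups)


toy_alleles = {
    "allele_1": "ARNDC",
    "allele_2": "ARNEC",
    "allele_3": "GRNDC",
}
toy_feature = [2, 4]  # hypothetical SF made of positions 2 and 4

vts = variant_types(toy_alleles, toy_feature)
print(vts)
```

In this toy case allele_1 and allele_3 differ at position 1 but share the same VT for the chosen feature, so an association test on the feature would pool them. That pooling is the sense in which combinations of sites shared by several alleles can serve as the unit of analysis.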
The fast-evolving human KIR gene family encodes variable lymphocyte receptors specific for polymorphic HLA class I determinants. Nucleotide sequences for 24 representative human KIR haplotypes were determined. With three previously defined haplotypes, this gave a set of 12 group A and 15 group B haplotypes for assessment of KIR variation. The seven gene-content haplotypes are all combinations of four centromeric and two telomeric motifs. 2DL5, 2DS5 and 2DS3 can be present in centromeric and telomeric locations. With one exception, haplotypes having identical gene content differed in their combinations of KIR alleles. Sequence diversity varied between haplotype groups and between centromeric and telomeric halves of the KIR locus. The most variable A haplotype genes are in the telomeric half, whereas the most variable genes characterizing B haplotypes are in the centromeric half. Of the highly polymorphic genes, only the 3DL3 framework gene exhibits a similar diversity when carried by A and B haplotypes. Phylogenetic analysis and divergence time estimates point to the centromeric gene-content motifs that distinguish A and B haplotypes having emerged ∼6 million years ago, contemporaneously with the separation of human and chimpanzee ancestors. In contrast, the telomeric motifs that distinguish A and B haplotypes emerged more recently, ∼1.7 million years ago, before the emergence of Homo sapiens. Thus the centromeric and telomeric motifs that typify A and B haplotypes have likely been present throughout human evolution. The results suggest the common ancestor of A and B haplotypes combined a B-like centromeric region with an A-like telomeric region.
It is 12 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute, we are able to provide public access to this data through the web site http://www.ebi.ac.uk/imgt/hla/. Regular updates to the web site ensure that new and confirmatory sequences are dispersed to the HLA community, and the wider research and clinical communities.
The immune response HLA class II DRB1 gene provides the major genetic contribution to Juvenile Idiopathic Arthritis (JIA), with a hierarchy of predisposing through intermediate to protective effects. With JIA, and the many other HLA associated diseases, it is difficult to identify the combinations of biologically relevant amino acid (AA) residues directly involved in disease due to the high level of HLA polymorphism, the pattern of AA variability, including varying degrees of linkage disequilibrium (LD), and the fact that most HLA variation occurs at functionally important sites. In a subset of JIA patients with the clinical phenotype oligoarticular-persistent (OP), we have applied a recently developed novel approach to genetic association analyses with genes/proteins sub-divided into biologically relevant smaller sequence features (SFs), and their “alleles” which are called variant types (VTs). With SFVT analysis, association tests are performed on variation at biologically relevant SFs based on structural (e.g., beta-strand 1) and functional (e.g., peptide binding site) features of the protein. We have extended the SFVT analysis pipeline to additionally include pairwise comparisons of DRB1 alleles within serogroup classes, our extension of the Salamon Unique Combinations algorithm, and LD patterns of AA variability to evaluate the SFVT results; all of which contributed additional complementary information. With JIA-OP, we identified a set of single AA SFs, and SFs in which they occur, particularly pockets of the peptide binding site, that account for the major disease risk attributable to HLA DRB1. These are (in numeric order): AAs 13 (pockets 4 and 6), 37 and 57 (both pocket 9), 67 (pocket 7), 74 (pocket 4), and 86 (pocket 1), and to a lesser extent 30 (pockets 6 and 7) and 71 (pockets 4, 5, and 7).
HLA disparity between hematopoietic stem cell donors and recipients is one of the most important factors influencing transplant outcomes, but there are no well accepted guidelines to aid in selecting the optimal donor amongst several HLA mismatched donors. In this report, HLA-A is used as a model to illustrate factors that are barriers to delineating the relationship between specific HLA mismatches and transplant outcomes in the United States. Patients in this investigation received transplants for hematological malignancies that were facilitated by the National Marrow Donor Program (NMDP) between 1990 and 2002 (n=4,226). High resolution HLA typing was performed for HLA-A, -B, -C, -DRB1, -DQA1, -DQB1, -DPA1 and -DPB1. HLA-A mismatches were observed in 745 donor-recipient pairs and 62% of these pairs also had disparities at HLA-B, -C and/or -DRB1. The HLA-A mismatches involved 190 different combinations of HLA-A alleles and 51% of these were observed in only one pair. Addition of a single HLA-A disparity when HLA-B, -C, and -DRB1 were matched (n=282) was associated with increased mortality (OR=1.32, CI 1.07-1.63). When HLA-B, -C, and -DRB1 were matched, the most frequent HLA-A mismatches were HLA-A*0201:0205 (n=28), HLA-A*0301:0302 (n=15), HLA-A*0201:0206 (n=15), HLA-A*0201:6801 (n=12), HLA-A*0101:1101 (n=11) and HLA-A*0101:0201 (n=10). There were no statistically significant relationships between any of these disparities and transplant outcomes (engraftment, acute and chronic GVHD, relapse, transplant-related mortality or overall survival) when adjustments for multiple comparisons were considered. Achieving 80% power to detect an effect of any one of these six HLA-A disparities on survival is estimated to require a total transplant population of 11,000 to more than one million U.S. donor-recipient pairs depending upon the HLA disparity. Thus, alternative approaches are required to develop a clinically relevant ranking system for specific HLA disparities in the U.S.
HLA; Histocompatibility; Bone Marrow Transplantation; Hematopoietic stem cell transplantation
The Immuno Polymorphism Database (IPD) (http://www.ebi.ac.uk/ipd/) is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of Killer-cell Immunoglobulin-like Receptors; IPD-MHC, a database of sequences of the Major Histocompatibility Complex of different species; IPD-human platelet antigens, which covers alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour cell-line database, a cell bank of immunologically characterised melanoma cell lines. The data is currently available online from the website and ftp directory.
It is 10 years since the IMGT/HLA database was released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and are highly polymorphic. The naming of these HLA genes and alleles, and their quality control, is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute, we are able to provide public access to this data through the website http://www.ebi.ac.uk/imgt/hla/. The first release contained 964 sequences and the most recent release contains 3300 sequences, with around 450 new sequences being added each year. The tools provided on the website have been updated to allow more complex alignments, which include genomic sequence data, and new tools for probe and primer design have been developed, along with the inclusion of data from the HLA Dictionary. Regular updates to the website ensure that new and confirmatory sequences are dispersed to the HLA community, and the wider research and clinical communities.
Studies have shown that KIR-ligand mismatching, used to predict NK cell alloreactivity, may result in less relapse and better survival in patients with AML. KIR-ligands are distinguished by single nucleotide polymorphisms (SNPs) in HLA-B and HLA-C sequences. We hypothesized that pyrosequencing, which determines KIR-ligand status by direct sequencing of the ligand epitope, can be used as an alternative to high resolution HLA-typing. Pyrosequencing is rapid and would be particularly useful in analysis of retrospective cohorts where high resolution HLA-typing is unavailable or too expensive. To validate this assay, RNA and DNA from 70 clinical samples were tested for KIR-ligand by pyrosequencing. Primer binding to invariant regions without known SNPs was critical for KIR-ligand assignment by pyrosequencing to be in full concordance with high resolution HLA-typing. Pyrosequencing is sensitive, specific, high-throughput, inexpensive, and can rapidly screen KIR-ligand status to evaluate potential alloreactive NK cell or transplant donors.
NK cells; human; killer immunoglobulin receptor; HLA-typing; pyrosequencing; alloreactivity
The IMGT/HLA database (http://www.ebi.ac.uk/imgt/hla) has provided a centralized repository for the sequences of the alleles named by the WHO Nomenclature Committee for Factors of the HLA System for the past four years. Since its initial release the database has grown and is the primary source of information for the study of sequences of the human major histocompatibility complex. The initial release of the database contained a limited number of tools. As a result of feedback from our users and developments in HLA we have been able to provide new tools and facilities. The HLA sequences have also been extended to include intron sequences and the 3′ and 5′ untranslated regions in the alignments, and new genes such as MICA have been included. The IMGT/MHC database (http://www.ebi.ac.uk/imgt/mhc) was released in March 2002 to provide a similar resource for other species. The first release of IMGT/MHC contains sequences from non-human primates (apes, new and old world monkeys), canines and felines. Further species will be added shortly and the database aims to become the primary source of MHC data for non-human sequences.
The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0, July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other
IMGT, the international ImMunoGeneTics database (http://imgt.cines.fr:8104), is a high-quality integrated database specialising in Immunoglobulins (Ig), T cell Receptors (TcR) and Major Histocompatibility Complex (MHC) molecules of all vertebrate species, created in 1989 by Marie-Paule Lefranc, Université Montpellier II, CNRS, Montpellier, France (lefranc@ligm.igh.cnrs.fr). At present, IMGT includes two databases: IMGT/LIGM-DB, a comprehensive database of Ig and TcR from human and other vertebrates, with translation for fully annotated sequences, and IMGT/HLA-DB, a database of the human MHC referred to as HLA (Human Leucocyte Antigens). The IMGT server provides common access to expertized genomic, proteomic, structural and polymorphic data of Ig and TcR molecules of all vertebrates. By its high quality and its easy data distribution, IMGT has important implications in medical research (repertoire in autoimmune diseases, AIDS, leukemias, lymphomas), therapeutic approaches (antibody engineering), genome diversity and genome evolution studies. IMGT is freely available at http://imgt.cines.fr:8104. The IMGT Index is provided at the IMGT Marie-Paule page (http://imgt.cines.fr:8104/textes/IMGTindex.html).