The IMGT/HLA database was established to provide a locus-specific database (LSDB) for the allelic sequences of the genes in the HLA system, also known as the human major histocompatibility complex (MHC). The MHC is one of the most complex and polymorphic regions of the human genome, with in excess of 220 genes (1). The core genes of interest in the HLA system are 21 highly polymorphic HLA genes, found within the 6p21.3 region of the short arm of human chromosome 6, whose protein products mediate human responses to infectious disease and influence the outcome of cell and organ transplants. Three distinct regions have been identified within the MHC. The class I region is located at the telomeric end of the MHC and contains the genes for the HLA class I molecules, HLA-A, -B and -C. These are co-dominantly expressed on the cell surface and are responsible for presenting intracellularly derived peptides to CD8-positive T cells. The class II region lies at the centromeric end of the MHC and contains the HLA class II genes HLA-DRA, -DRB1, -DRB3, -DRB4, -DRB5, -DQA1, -DQB1, -DPA1 and -DPB1. HLA class II expression is limited to cells involved in immune responses, where these molecules present extracellularly derived peptides to CD4-positive T cells. Between the class I and class II regions lies the class III region, which contains a number of non-HLA genes with immune function. With a nomenclature covering more than 50 genes and 8000 alleles, there is an obvious need for a curated LSDB to manage these highly polymorphic variants. The first public release of the IMGT/HLA database was made on 16 December 1998 (2). Since then the database has been updated every 3 months, in a total of 55 releases, to include all the publicly available sequences officially named by the World Health Organization (WHO) Nomenclature Committee at the time of release.
The naming of new HLA genes and allele sequences and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System, which first met in 1968. This committee meets regularly to discuss issues of nomenclature and has published 19 major reports (3–21), initially documenting the serologically defined HLA antigens and more recently the genes and alleles defined by nucleotide sequences. The IMGT/HLA database provides the nomenclature committee with the online tools necessary for its task. The dissemination of new allele names and sequences is of paramount importance in the clinical transplant setting, because the variation that distinguishes HLA alleles can have a critical impact on the outcome of a haematopoietic stem cell transplant (22). The identification, verification and publication of the sequences of these variants through a centralized resource are necessary for accurate identification of HLA alleles in a clinical setting. Sequencing of HLA alleles began in the late 1970s, predominantly using protein-based techniques to determine the sequences of HLA class I allotypes. The first complete HLA class I allotype sequence, B7.2, now known as B*07:02:01, was published in 1979 (24). The first HLA class II allele, DRA*01:01, was defined by protein sequencing and later, in 1982, by DNA sequencing (25–27). The first HLA DNA sequences or alleles were named by the WHO Nomenclature Committee for Factors of the HLA System (10) in 1987. At that time, 12 class I alleles and 9 class II alleles were named; by comparison, in the first 8 months of 2012 alone, the WHO Nomenclature Committee was able to assign names to 1163 alleles.
The number of HLA alleles named each year and included in the IMGT/HLA database. The recent surge in the number of submissions received by the database is clearly shown.