Dozens to thousands of alleles exist for each of the HLA genes. The extent of this polymorphism necessitates a much more complex and costly genotyping process than those commonly used for microsatellites or SNPs. HLA genotyping cost is driven in part by the level of resolution. Paradoxically, increased resolution of HLA genotyping can decrease the statistical power of HLA disease association analyses by increasing the number of categories and, therefore, reducing the number of observations in each allele category. However, low-resolution genotyping necessarily bins HLA alleles together and can mask the effects of individual alleles. For example, DRB1*04:xx alleles are, in general, highly predisposing for T1D. DRB1*04:03, however, is T1D protective. Low-resolution genotyping for DRB1*04, without allele-level resolution, could mask the protective effect of DRB1*04:03. In populations where DRB1*04:03 is at a high frequency, its protective effect could decrease the apparent predisposing effect of high-risk DRB1*04:xx alleles, such as DRB1*04:05. The accompanying table shows the risk for various haplotypes that include DRB1*04:xx alleles seen in the T1DGC. Genotyping methodologies and their capacity for resolution are discussed below.
DRB1-DQA1-DQB1 haplotypes that reached statistical significance in the published T1DGC data set
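The masking effect described above for DRB1*04 can be illustrated with a small calculation. The sketch below is only illustrative: the allele counts are invented for the example and are not T1DGC data, and the helper names (`odds_ratio`, `allele_or`) are hypothetical. It compares the odds ratio of each DRB1*04 allele with the odds ratio obtained when the same alleles are binned into a single low-resolution DRB1*04 category.

```python
def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds ratio for carrying an allele category, cases versus controls."""
    return (case_with * control_without) / (case_without * control_with)

# Hypothetical allele counts (NOT T1DGC data): allele -> (cases, controls)
counts = {
    "DRB1*04:01": (300, 100),
    "DRB1*04:03": (10, 60),
    "DRB1*04:05": (90, 40),
}
TOTAL_CASE_ALLELES = 1000      # assumed total chromosomes typed in cases
TOTAL_CONTROL_ALLELES = 1000   # assumed total chromosomes typed in controls

def allele_or(names):
    """Odds ratio for the category formed by pooling the listed alleles."""
    cases = sum(counts[n][0] for n in names)
    controls = sum(counts[n][1] for n in names)
    return odds_ratio(
        cases, TOTAL_CASE_ALLELES - cases,
        controls, TOTAL_CONTROL_ALLELES - controls,
    )

# Allele-level resolution: one odds ratio per allele
per_allele = {name: allele_or([name]) for name in counts}

# Low resolution: all three alleles reported as "DRB1*04"
binned = allele_or(list(counts))

for name, value in per_allele.items():
    print(f"{name}: OR = {value:.2f}")
print(f"DRB1*04 (binned): OR = {binned:.2f}")
```

With these invented counts, the binned category appears predisposing, the protective allele is no longer visible, and the pooled odds ratio is lower than that of the most strongly predisposing allele.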
For HLA genes, the intron-exon structure of the gene corresponds to the structure of the encoded protein (A). For both class I and class II genes, exon 1 encodes the signal sequence. For class II genes, exon 2 encodes the domain that is furthest from the cell membrane and participates in the formation of the peptide-binding groove (α1 for the α chain and β1 for the β chain). The domain encoded by exon 3 (α2 for the α chain and β2 for the β chain) does not make contact with the peptide, and exon 4 encodes the transmembrane region. For class I, exons 2 and 3 encode the two domains (α1 and α2) that form the peptide-binding groove, exon 4 encodes the α3 domain, which does not contact the peptide, and exon 5 encodes the transmembrane region. Minimally, complete genotyping to determine the primary sequence for all of the peptide-binding pockets for the classical HLA loci in a given individual requires genotyping the following: DRB1 exon 2 (and exon 2 for DRB3, DRB4, and DRB5 when present), DQA1 exon 2, DQB1 exon 2, DPA1 exon 2, DPB1 exon 2, A exons 2 and 3, B exons 2 and 3, and C exons 2 and 3. To further increase resolution, genotyping of exon 3 of DQB1 and of exon 4 of the class I loci may be included.
The earliest methods of determining HLA genotypes involved cell-based assays that used serum from multiparous women. The number of categories was small, reflecting the major antigenic epitopes on the HLA molecules. The extreme diversity of the HLA genes was only recognized with the advent of DNA-based genotyping technology, which led to a rapid increase in the number of recognized alleles. DNA-based genotyping methods fall into three general categories: sequence-specific priming (SSP), sequence-specific oligonucleotide (SSO) probe, and sequence-based typing (SBT).
SSP involves multiple polymerase chain reaction (PCR) amplifications of genomic DNA with primer pairs designed to produce a product only if a given polymorphism is present in the sample. Results are visualized as the presence or absence of a product by gel electrophoresis. Genotypes consistent with the data are deduced by comparing data to expected patterns for all genotype combinations possible from known alleles. In most cases, multiple genotypes are consistent with the data. The resolution of this technique is dependent on the number of initial amplifications; thus, the higher the desired resolution, the more template, amplification reagents, and gels are required. Polymorphisms that are not represented in the test kit, including novel polymorphisms, will be missed by this technique.
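The deduction step described above, in which the observed pattern of positive and negative amplifications is compared with the patterns expected for every pair of known alleles, can be sketched as follows. This is a minimal illustration under stated assumptions: the allele names (`X*01` and so on), the primer mixes, and their reactivities are all hypothetical and do not describe any real kit.

```python
from itertools import combinations_with_replacement

# Hypothetical kit: the primer mixes that yield a product for each known allele.
ALLELE_REACTIONS = {
    "X*01": {"mix1"},
    "X*02": {"mix2"},
    "X*03": {"mix1", "mix3"},
    "X*11": {"mix3"},
}

def expected_pattern(allele_a, allele_b):
    """Mixes expected to be positive for a genotype carrying both alleles."""
    return frozenset(ALLELE_REACTIONS[allele_a] | ALLELE_REACTIONS[allele_b])

def consistent_genotypes(positive_mixes):
    """All pairs of known alleles whose expected pattern equals the observed one."""
    observed = frozenset(positive_mixes)
    return [
        pair
        for pair in combinations_with_replacement(sorted(ALLELE_REACTIONS), 2)
        if expected_pattern(*pair) == observed
    ]

# A sample in which only mix1 and mix3 produced a band on the gel
candidates = consistent_genotypes({"mix1", "mix3"})
for genotype in candidates:
    print(genotype)
```

In this toy kit, several genotypes are consistent with the same gel result, which mirrors the ambiguity noted in the text. Separating them would require additional primer mixes, and an allele absent from `ALLELE_REACTIONS` can never be reported.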
SSO involves a single amplification of the locus to be tested, followed by hybridization of the PCR product with a set of oligonucleotide probes corresponding to known polymorphisms. Initial SSO technology involved immobilization of the PCR product, followed by multiple rounds of hybridization with individual labeled probes. Later versions of SSO technology use a single hybridization of the labeled PCR product to a set of oligonucleotide probes immobilized to a solid support, such as a nylon membrane or bead. Advantages of this technology include the single amplification and hybridization to test each locus, allowing genotyping to be performed with a smaller amount of template than what is required for SSP. The level of resolution of the assay is dependent on the number of oligonucleotide probes included. Like SSP, SSO only queries known polymorphisms, so novel polymorphisms will be missed.
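The limitation that probe-based typing only queries known polymorphisms can be made concrete with a toy model. In the sketch below, the probe set, the probe positions, and the sequences are all invented for illustration, and hybridization is simplified to an exact match of the probe motif at a fixed offset, which is an assumption rather than a description of real probe chemistry.

```python
# Hypothetical probes: name -> (start offset in the amplicon, motif detected)
PROBES = {
    "p1": (0, "ACG"),
    "p2": (0, "ATG"),
    "p3": (5, "TTA"),
    "p4": (5, "TCA"),
}

def hybridization_pattern(amplicon_sequences):
    """Names of probes that match at least one of the amplified sequences."""
    return {
        name
        for name, (start, motif) in PROBES.items()
        if any(seq[start:start + len(motif)] == motif for seq in amplicon_sequences)
    }

known_allele = "ACGGCTTA"   # invented sequence of a known allele
novel_allele = "ACGTCTTA"   # differs at offset 3, which no probe covers

known_pattern = hybridization_pattern([known_allele])
novel_pattern = hybridization_pattern([novel_allele])
print(sorted(known_pattern), sorted(novel_pattern))

# A heterozygous sample gives the union of the patterns of its two alleles
heterozygous_pattern = hybridization_pattern([known_allele, "ATGGCTCA"])
print(sorted(heterozygous_pattern))
```

Because the variant position lies outside every probe, the novel sequence gives the same pattern as the known allele and would be reported as that allele. Adding probes raises resolution only for the positions they cover.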
SBT produces higher resolution than either SSP or SSO, because every position of the target sequence is included, rather than just selected polymorphic motifs. SBT also has the advantage of requiring a small amount of template compared with SSP. Until recently, however, the cost of SBT was significantly higher than that of SSP or SSO. Even with SBT, the two alleles in a genotype cannot always be distinguished easily, because both alleles are sequenced together and the phase of the heterozygous positions is not observed directly.
The most recent development in HLA genotyping is the use of next-generation sequencing (NGS), in which hundreds to thousands of sequence reads of polymorphic exons are generated from single molecules. With sequence read lengths long enough to span an exon in an HLA locus, the two sequences originating from the two chromosomes in a sample can be determined individually. Thus, the only remaining ambiguity comes from assembling exons in loci with more than one polymorphic exon, or from polymorphisms that lie outside the tested exons. HLA genotyping data generated by NGS provide the highest resolution presently available.
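The difference between sequencing both alleles together and sequencing single molecules can be sketched with a toy example. Everything here is an assumption made for illustration: the four-base "exons", the allele names, and the read counts are invented, and reads are treated as error-free and full length, which real pipelines cannot assume.

```python
from collections import Counter

# Hypothetical alleles with invented four-base exon sequences
ALLELES = {"X*01": "ACGT", "X*02": "ATGA", "X*03": "ACGA", "X*04": "ATGT"}

def unphased_composite(seq_a, seq_b):
    """Set of bases seen at each position when two alleles are read together."""
    return tuple(frozenset(pair) for pair in zip(seq_a, seq_b))

def pairs_matching_composite(composite):
    """All pairs of known alleles that would give the same unphased result."""
    names = sorted(ALLELES)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i:]
        if unphased_composite(ALLELES[a], ALLELES[b]) == composite
    ]

def alleles_from_reads(reads):
    """Name the two most common full-length read sequences (simplified call)."""
    lookup = {seq: name for name, seq in ALLELES.items()}
    top = [seq for seq, _ in Counter(reads).most_common(2)]
    return sorted(lookup.get(seq, "unlisted:" + seq) for seq in top)

# Unphased view of a sample that truly carries X*01 and X*02
composite = unphased_composite(ALLELES["X*01"], ALLELES["X*02"])
ambiguous = pairs_matching_composite(composite)
print(ambiguous)

# Single-molecule reads from the same sample; each read keeps its own phase
reads = ["ACGT"] * 60 + ["ATGA"] * 55
called = alleles_from_reads(reads)
print(called)
```

In this example two different allele pairs are indistinguishable in the unphased composite, whereas reads that each span the whole exon identify the pair directly.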
Multiple technologies and varying resolution levels make combining data from different studies challenging. The same allele can be given two different designations, depending on the level of resolution used to genotype it. As resolution improves and costs of new technologies decrease, consistency among studies is improving; however, caution must be used when interpreting genotyping data from different studies or different laboratories.
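One simple way to make data sets typed at different resolutions comparable is to reduce every allele name to the lowest resolution present in any of them. The sketch below assumes allele names in the colon-delimited form used in this section (for example, DRB1*04:03); the function names are hypothetical, and the approach does not handle ambiguity codes or designations such as DRB1*04:xx.

```python
def field_count(allele):
    """Number of colon-separated fields after the locus name."""
    return len(allele.partition("*")[2].split(":"))

def truncate_fields(allele, n_fields):
    """Keep the locus and only the first n_fields fields of the allele name."""
    locus, _, fields = allele.partition("*")
    return locus + "*" + ":".join(fields.split(":")[:n_fields])

def harmonize(*datasets):
    """Reduce all allele names to the lowest resolution found in any data set."""
    common = min(field_count(a) for data in datasets for a in data)
    return [[truncate_fields(a, common) for a in data] for data in datasets]

# Hypothetical studies typed at different resolutions
study_a = ["DRB1*04:01", "DRB1*04:03"]
study_b = ["DRB1*04", "DRB1*03"]

harmonized_a, harmonized_b = harmonize(study_a, study_b)
print(harmonized_a)
print(harmonized_b)
```

Harmonizing in this direction discards information: after truncation, the predisposing and protective DRB1*04 alleles in the first study can no longer be told apart, which is the masking problem described earlier for low-resolution genotyping.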