gene is located in the major histocompatibility complex (MHC) region of the human genome. Variability in HLA peptide sequences can alter the ability of immune cells to present specific antigens. As a result, many autoimmune or infection-related diseases have been associated with specific HLA alleles. This research has been pivotal in discovering etiological determinants of numerous diseases, and continues today [2]. Sequencing-based typing is most often used to obtain allelotypes for association studies; however, the method is not always ideal. Typing HLA-DQB1 with a sequencing-based approach currently requires 300 ng of DNA [3], several sequencing reactions per subject, and proprietary software to interpret a heterozygous chromatogram from one of the most polymorphic regions of the genome. Taken together, these requirements can represent a significant expense for researchers.
Alternatives to direct HLA typing generally use a combination of SNP genotypes to infer individual allele information. Certain SNPs included on commercial genome-wide arrays are in strong linkage disequilibrium with, and can be used as tags for, common HLA alleles [4]. However, these common SNPs often fail to capture the entire complexity of the HLA region, which includes many rare alleles. Furthermore, tag SNPs are rarely located in the coding sequence; thus, they are merely proxies for HLA alleles. Our goal was to develop an intermediate, economical assay that provided low- to medium-resolution allelotype data via direct interaction with exons.
Multiplexed ligation-dependent probe amplification (MLPA) is a PCR-based methodology developed by MRC-Holland [5]. Here, we present a novel assay that uses the MLPA methodology to assess HLA alleles. Using our probes, this method is able to distinguish the following alleles: DQB1*02, DQB1*03, DQB1*04, DQB1*05 and DQB1*06. The DQB1*06 allele family is further split by this assay into five subgroups (*06A–E), each containing one of the five common *06 alleles in Caucasians [6] (…, DQB1*06:04, and DQB1*06:09) as well as rare *06 alleles.
HLA-DQB1 alleles inferred from the presence or absence of 9 possible probe-pairing ligation products.
A number of oligonucleotide probe pairs were designed using guidelines from MRC-Holland (www.mlpa.com) such that each pair would meet directly at a key SNP site within exon 2 of HLA-DQB1 (Figure 1, step 1), based on sequences from the IMGT/HLA database [7]. Probe sequences were checked for non-specific annealing sites using the UCSC BLAT program (http://genome.ucsc.edu/cgi-bin/hgBlat). The probe on the 3′ side of each SNP carried a 5′ phosphate to allow ligation to the free 3′-OH of its partner probe. Probes (Integrated DNA Technologies; Coralville, IA, USA) were diluted to a final concentration of 1–5 pM in TE (10 mM Tris-HCl, 0.1 mM EDTA, pH 8.0). Probe pairs prone to dimerizing with one another were separated into two probe mixes (A and B).
Sequences of oligonucleotide probes and primers used.
Figure 1. Methodology flow diagram. In step 1, unique probes, each designed with a tail that includes a unique spacer sequence (labeled X) and a common PCR primer sequence, competitively hybridize to genomic DNA. Taq DNA ligase is used in step 2 to join probe pairs that had …
Concentrations of probes within each mix used.
Genomic DNA obtained from heparinized blood (extracted with the FlexiGene DNA kit; Qiagen, Valencia, CA, USA), saliva (extracted with the Oragene DNA kit; Genotek, Inc., Ottawa, Ontario, Canada), lymphoblastoid cell lines (extracted with a standard phenol/chloroform protocol), and clotted blood and buccal cells (both extracted with the PureGene DNA isolation kit; Qiagen) was successfully used in this assay. Fifty ng of DNA in 5 μL of TE was heated to 98°C for 5 minutes and brought back to 25°C. Then 1.5 μL of hybridization buffer (1500 mM KCl, 300 mM Tris-HCl pH 8.5, 1.0 mM EDTA) and 1.5 μL of either probe mix A or probe mix B were added to the DNA. The mixture was brought up to 98°C for 1 minute, then incubated at 60°C for 4–12 hours to allow the probes to competitively hybridize to template DNA. Four hours of hybridization time is recommended by MRC-Holland; however, we observed no difference in results when hybridization was allowed to run overnight.
After lowering the temperature of the DNA/probe mixture to 54°C, 0.2 μL of Taq DNA Ligase at 40,000 U/mL (New England Biolabs, Ipswich, MA, USA) was added along with 4.0 μL of 10x ligase buffer and 27.8 μL of H2O to the hybridized probe/DNA solution, for a final volume of 40 μL. This addition was performed in the thermocycler in order to keep the reaction temperature as close to 54°C as possible. Any probe pairs that hybridized adjacent to one another on the template were then joined via ligation (Figure 1, step 2). After incubation of the DNA/probes/ligase for 15 minutes at 54°C, the reaction was put on ice or stored at −20°C as needed.
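As an illustrative check that is not part of the published protocol, the volumes stated for the hybridization and ligation steps can be summed to confirm the 40 μL final volume and to derive the amount of ligase and the final buffer strength. The dictionary keys below are labels chosen for this sketch only.

```python
# Volumes in microliters, copied from the hybridization and ligation steps.
# The key names are illustrative labels, not terms from the protocol.
hybridization_ul = {
    "genomic_dna_in_te": 5.0,
    "hybridization_buffer": 1.5,
    "probe_mix": 1.5,
}
ligation_ul = {
    "taq_dna_ligase": 0.2,
    "ligase_buffer_10x": 4.0,
    "water": 27.8,
}

LIGASE_STOCK_U_PER_ML = 40_000  # stock concentration stated in the text


def total_volume(*mixes):
    """Sum the volumes of one or more component dictionaries."""
    return round(sum(v for mix in mixes for v in mix.values()), 3)


hyb_total = total_volume(hybridization_ul)                  # volume before ligation
final_total = total_volume(hybridization_ul, ligation_ul)   # volume after ligation mix

# Units of ligase per reaction: volume (mL) times stock concentration (U/mL).
ligase_units = round(ligation_ul["taq_dna_ligase"] / 1000 * LIGASE_STOCK_U_PER_ML, 3)

# Final strength of the 10x ligase buffer once diluted into the full reaction.
buffer_strength = round(10 * ligation_ul["ligase_buffer_10x"] / final_total, 3)

print(hyb_total, final_total, ligase_units, buffer_strength)
```

Under these numbers the hybridization mix is 8 μL, the ligation reaction is 40 μL, each reaction receives 8 U of ligase, and the ligase buffer ends up at 1x.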
PCR was used to amplify successfully ligated probe pairs (Figure 1, step 3). Each probe was designed with a tail that included a uniquely sized spacer sequence followed by a uniform primer sequence, allowing amplification of all probe pairs with one PCR reaction. Three μL of the ligation reaction solution was used as the DNA source in a PCR reaction [11.2 μL H2O, 4 μL 5x Phire® enzyme buffer, 0.8 μL 10 mM dNTPs (Invitrogen, Carlsbad, CA, USA), 0.4 μL each of 5 μM left and right primers, 0.2 μL (1 reaction/μL) Phire® Hot Start DNA Polymerase (Finnzymes, Espoo, Finland)]. The PCR program was as follows: an initial 30 sec at 98°C; 37 cycles of 5 sec at 98°C and 15 sec at 68°C; a final extension of 1 min at 72°C; and a hold at 4°C. The “left” PCR primer contained a 5′ 6-carboxyfluorescein (6-FAM) modification to label the PCR products. This amplification step can be generalized as a PCR reaction with one carboxyfluorescein-labeled primer. Many of the specifications given here could be successfully modified to use any Taq polymerase-based PCR system.
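The per-reaction PCR recipe lends itself to a simple master-mix calculation. The following is a minimal sketch, not the authors' procedure: the component labels and the 10% pipetting overage are assumptions made for illustration, while the volumes and stock concentrations come from the text.

```python
# Per-reaction volumes in microliters, as listed in the PCR step.
# Key names are illustrative labels chosen for this sketch.
PER_REACTION_UL = {
    "water": 11.2,
    "phire_buffer_5x": 4.0,
    "dntps_10mM": 0.8,
    "left_primer_5uM": 0.4,
    "right_primer_5uM": 0.4,
    "phire_polymerase": 0.2,
}
TEMPLATE_UL = 3.0  # ligation product, added to each well separately


def master_mix(n_reactions, overage=0.10):
    """Scale the recipe to n reactions plus a pipetting overage.

    The default 10% overage is an assumed allowance, not a value from the paper.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Total reaction volume and final concentrations implied by the recipe.
reaction_volume = round(sum(PER_REACTION_UL.values()) + TEMPLATE_UL, 2)
final_primer_uM = round(5.0 * PER_REACTION_UL["left_primer_5uM"] / reaction_volume, 3)
final_dntp_mM = round(10.0 * PER_REACTION_UL["dntps_10mM"] / reaction_volume, 3)
final_buffer_x = round(5.0 * PER_REACTION_UL["phire_buffer_5x"] / reaction_volume, 3)

if __name__ == "__main__":
    print("Reaction volume (uL):", reaction_volume)
    print("Master mix for one 96-well plate:", master_mix(96))
```

The recipe works out to a 20 μL reaction with 1x buffer, 0.1 μM of each primer, and 0.4 mM dNTPs.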
PCR products were separated by size using capillary fragment analysis (Figure 1, step 4). The post-PCR solution (0.5 μL) was mixed with 9.8 μL of Hi-Di™ Formamide (ABI, Foster City, CA, USA) and 0.2 μL of a ROX Reference Dye-labeled DNA ladder (Invitrogen, Carlsbad, CA, USA) in a 96-well plate, which was then sealed with adhesive foil. PCR product separation was performed with an ABI 3730xl at the UC Berkeley DNA Sequencing Facility. The 96-well plate was heated to 95°C for 3 minutes, then flash cooled on wet ice for 2 minutes. The ABI 3730xl capillaries were dipped into the wells and a current was applied for 15 seconds to move a fraction of the oligonucleotides into the capillary with no loss of volume. The plate was then removed and replaced with a plate containing 3730 buffer with EDTA (ABI, Foster City, CA, USA).
Using the free fragment-analysis software PeakScanner™ (ABI, Foster City, CA, USA), HLA-DQB1 alleles were assessed based on the presence or absence of each peak (Figure 2). Two probe pairs (C1 and C2) that met on non-variable DNA sequences were included in every reaction as a positive control for human DNA. Each 96-well plate also included DNA extracted from a mouse pro-B cell line (63–12) [8] to serve as a negative control.
Figure 2. Example of allelotyping data. Peaks derived from ligated and amplified probe pairs can be clearly distinguished from the background (BLANK). The small peaks at ~18 nucleotides are the unincorporated fluorescently labeled left PCR primer. C1 and C2 are …
Peak presence is dependent on the efficiency of ligation of matching left and right probes. This, in turn, is dependent on the probes’ ability to hybridize to template DNA. Ligation is completely intolerant of mismatches on the 3′ end of the left probe, making this the ideal site for SNP detection. Other mismatches, on the right probe or away from the ligation site, are generally tolerated, with resultant smaller peak heights. For example, DQB1*06 alleles tend to amplify probe 5 less than DQB1*03 alleles due to a SNP in the 5R probe hybridization sequence, making ligation less efficient for *06 alleles. Probe concentrations were adjusted via trial and error in order to normalize final peak outputs for ease of analysis. However, peaks should be scored as ‘present’ or ‘absent,’ rather than attempting quantitative analysis based on height.
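The present/absent scoring described above can be sketched in a few lines. This is a hypothetical illustration rather than the procedure used in the study: the height threshold, the example peak heights, and the rule that a clean negative control shows no peaks at all are assumptions. Only the control probe names (C1, C2) and the use of present/absent calls come from the text.

```python
CONTROL_PROBES = ("C1", "C2")  # positive-control probe pairs named in the text


def score_peaks(peak_heights, threshold):
    """Convert peak heights into present/absent calls.

    `threshold` is a hypothetical cutoff that separates real peaks from
    background; the paper does not specify one.
    """
    return {probe: height >= threshold for probe, height in peak_heights.items()}


def passes_positive_control(calls):
    """A human sample is accepted only if both control peaks are present."""
    return all(calls.get(probe, False) for probe in CONTROL_PROBES)


def is_clean_negative(calls):
    """Assumption: a clean negative-control well shows no peaks at all."""
    return not any(calls.values())


# Made-up peak heights (arbitrary fluorescence units) for illustration only.
example_heights = {"C1": 5200, "C2": 4100, "5": 900, "8": 30}
calls = score_peaks(example_heights, threshold=200)

print(calls)
print("positive control OK:", passes_positive_control(calls))
```

Because heights are reduced to booleans before any interpretation, a weaker but real peak (such as probe 5 on a *06 allele) is still scored as present, in line with the recommendation against quantitative analysis.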
To verify the accuracy of the assay, a subset of 212 non-Hispanic Caucasian follicular lymphoma cases and 151 matched controls from a non-Hodgkin lymphoma case-control study based in Northern California were allelotyped at HLA-DQB1 using next-generation sequencing as previously described [9]. HLA allelotype data generated by MLPA were highly consistent with the next-generation sequencing data: 711 of 726 alleles tested (97.9%) had matching allelotypes, and no clear trends were observed among mistyped alleles. This verification pool contained the following DQB1 alleles: *02:01, *02:02, *03:01, *03:02, *03:03, *03:04, *03:05, *04:02, *04:04, *05:01, *05:02, *05:03, *05:04, *06:01, *06:02, *06:03, *06:04, *06:09, *06:11.
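The concordance figure follows directly from the sample counts: 212 cases plus 151 controls is 363 subjects, or 726 alleles. The sketch below reproduces that arithmetic and shows one way allele-level agreement between two typing methods could be counted. The helper functions are illustrative and assume both calls have already been reduced to the same resolution; they are not taken from the study's analysis.

```python
from collections import Counter


def matching_alleles(call_a, call_b):
    """Count alleles shared by two unordered genotype calls (0, 1, or 2).

    Counter intersection keeps the minimum count of each allele, so the
    order of alleles within a call does not matter.
    """
    return sum((Counter(call_a) & Counter(call_b)).values())


def concordance(pairs):
    """Return (matched, total, rate) over an iterable of (call_a, call_b) pairs."""
    pairs = list(pairs)
    matched = sum(matching_alleles(a, b) for a, b in pairs)
    total = sum(len(a) for a, _ in pairs)
    return matched, total, matched / total


# Figures reported in the text.
subjects = 212 + 151
alleles_tested = 2 * subjects
reported_rate = 711 / alleles_tested

print(alleles_tested, round(100 * reported_rate, 1))
```

Counting at the allele level rather than the subject level means that a heterozygote with one mistyped allele contributes one match and one mismatch.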
No HLA typing strategy is free of ambiguity, and the resolution offered by this assay includes several ambiguous allele combinations. For example, HLA-DQB1*06:05, an allele that is rare overall but common among Ethiopian Jews in Israel (www.allelefrequencies.net), would amplify probes 5 and 8, and would therefore be indistinguishable from HLA-DQB1*06:09. These ambiguities are mainly in rare alleles; thus, the predicted amplification patterns are theoretical, based on sequences accessed in the IMGT/HLA database on January 30, 2011 (release 3.3.0, 144 DQB1 alleles). Furthermore, the following allele combinations are theoretically indistinguishable by our assay: (*06A + *06A) vs. (*06A + *06B); (*06C + *06C) vs. (*06B + *06C); (*06D + *06D) vs. (*06D + *06E); and (*06B + *06D) vs. (*06C + *06D) vs. (*06C + *06E). Based on known sequences, five rare alleles (HLA-DQB1*03:06, *03:25, *06:06, *06:13, and *06:20) should go undetected by our assay. Because of these limitations, this assay is not currently appropriate for clinical use. However, additional probe pairs could theoretically resolve many of these ambiguities.
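The indistinguishable genotype combinations listed above can be encoded as ambiguity classes, so that any call can be reported together with its alternatives. This is a minimal sketch; the data structure and function names are illustrative and not part of the published assay.

```python
# Each inner list is one set of genotypes the assay cannot tell apart,
# transcribed from the combinations listed in the text.
AMBIGUITY_CLASSES = [
    [("*06A", "*06A"), ("*06A", "*06B")],
    [("*06C", "*06C"), ("*06B", "*06C")],
    [("*06D", "*06D"), ("*06D", "*06E")],
    [("*06B", "*06D"), ("*06C", "*06D"), ("*06C", "*06E")],
]


def _key(genotype):
    """Normalize a genotype so allele order does not matter."""
    return tuple(sorted(genotype))


# Map every ambiguous genotype to the full class it belongs to.
LOOKUP = {}
for cls in AMBIGUITY_CLASSES:
    members = frozenset(_key(g) for g in cls)
    for g in members:
        LOOKUP[g] = members


def indistinguishable_from(genotype):
    """Return the other genotypes that give the same pattern, sorted."""
    key = _key(genotype)
    return sorted(LOOKUP.get(key, frozenset([key])) - {key})


print(indistinguishable_from(("*06B", "*06A")))
print(indistinguishable_from(("*06C", "*06D")))
```

Genotypes outside the four classes return an empty list, meaning no ambiguity is listed for them in the text; this sketch does not model the rare alleles that share a probe pattern with a common allele, such as *06:05 and *06:09.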
This assay uses little DNA (100 ng in total, 50 ng for each of the two probe-mix reactions) and is amenable to standard 96-well plates (48 samples per plate), making it high-throughput. We have used the technique to successfully obtain typing data for 1,022 follicular lymphoma cases and controls at a rate of roughly 100 samples per day [10]. However, this technique could be improved to require less DNA and to distinguish more alleles. For example, reducing ligation reaction volumes could feasibly reduce genomic DNA input to <10 ng per subject. Expanding the number of probe pairs could also improve the allele resolution. Using nine probe pairs, our assay recognizes the HLA-DQB1*02, *03, *04, *05, and *06 alleles, with additional resolution for alleles in the *06 group. However, MRC-Holland has designed probe mixes with 50+ probe pairs (www.mlpa.com) for other applications. With further development and refinement of our technique, researchers could theoretically create an assay that yields highly specific allelotypes at any HLA locus with minimal DNA cost. Additional probe pairs are listed in Supplemental Table 1 as examples of theoretical probe sequences that could be added to improve the precision of this technique at the HLA-DQB1 locus.
To emphasize the low cost associated with this assay, we have created a cost analysis table, which provides step-by-step costs associated with obtaining HLA-DQB1 allele data on a per-subject basis. We estimate that our assay, when run with the equipment available at a core facility (such as the one at UC Berkeley), has a one-time cost of US $520, followed by US $5.59 per subject. More than 65% of this per-subject cost is in the PCR reaction, a generally unavoidable step when typing DNA.
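The cost estimate is a simple linear model: a fixed one-time cost plus a per-subject cost. The sketch below applies the two figures quoted in the text to the 1,022 subjects typed in the study. The function names are illustrative, and the result is an arithmetic projection rather than a figure reported by the authors.

```python
ONE_TIME_USD = 520.00     # one-time cost quoted in the text
PER_SUBJECT_USD = 5.59    # per-subject cost quoted in the text
PCR_MIN_FRACTION = 0.65   # the text states that more than 65% is PCR


def total_cost(n_subjects):
    """Projected total cost in US dollars for typing n subjects."""
    return round(ONE_TIME_USD + PER_SUBJECT_USD * n_subjects, 2)


def effective_cost_per_subject(n_subjects):
    """Total cost divided by subjects, i.e. including the one-time cost."""
    return round(total_cost(n_subjects) / n_subjects, 2)


# Lower bound on the PCR share of the per-subject cost.
pcr_min_usd = round(PCR_MIN_FRACTION * PER_SUBJECT_USD, 2)

print(total_cost(1022), effective_cost_per_subject(1022), pcr_min_usd)
```

For 1,022 subjects this projects to about US $6,233 in total, or roughly US $6.10 per subject once the one-time cost is spread across the cohort.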
Cost analysis of this method.
We present a new methodology for HLA-DQB1 typing that provides meaningful data with minimal expense. The method is high-throughput, relatively inexpensive, and uses minimal DNA. Furthermore, the data are simple to analyze and can be read using free software. This assay represents a useful tool for researchers, offering accurate, low-cost genotyping in the coding sequence of a biologically relevant HLA gene.