One of the major challenges in blood and marrow transplantation is selection of the most compatible donor when an HLA-matched donor is not available. Although several investigations have attempted to determine which HLA disparities are poorly tolerated, the results have often been contradictory. The investigation reported here uses one of the largest study populations to date (n=4,226) to illustrate that extensive HLA diversity is a major barrier to establishing the association between HLA disparity and transplant outcomes.
HLA-A was selected for this investigation because mismatches at this locus have been most consistently associated with adverse transplant outcomes [1]. In the 4,226 pairs examined in this investigation, there were 190 different HLA-A mismatch combinations, and 51% of these were observed in only one pair. Very few HLA-A disparities were observed frequently enough to study. Only four HLA-A disparities were observed in 30 or more donor-recipient pairs (A*0201:0205, A*0201:0206, A*6801:6802, A*0201:6801). Sixteen HLA disparities were observed in 10 or more donor-recipient pairs (). If pairs with additional HLA-B, -C, and -DRB1 disparity are excluded to diminish confounding HLA variables, only 6 mismatch combinations occur in 10 or more pairs (). Since the most frequent HLA-A mismatches in HLA-B, -C, and -DRB1 matched pairs occur only 2-6 times/1,000 U.S. patients, it is not feasible to directly determine associations between a particular HLA-A disparity and transplant outcomes for patients transplanted in the U.S.
One of the goals of this investigation was to identify frequent HLA disparities that are deleterious. However, there were only 6 mismatch combinations observed more than 10 times in HLA-B, -C, and -DRB1 matched pairs, and when survival of each of these small groups was compared with that of the HLA-matched control group, no statistically significant associations were detected. A*0201:0205, the most frequent HLA-A disparity observed when HLA-B, -C, and -DRB1 were matched, had a trend toward increased mortality at one year (p=0.07, OR=2.08, 95% CI 0.86-4.80). A study from the 14th International Histocompatibility Workshop examined 51 pairs with A*0201:0205 mismatches using additional HLA mismatches as a co-variable and was unable to detect any association between this disparity and survival [14]. Additional donor-recipient pairs are required to evaluate the impact of this disparity.
A trend for increased severe acute GvHD was detected for one of the most frequent mismatches, A*0201:0206 (OR=3.22, 95% CI 1.02-10.50, p=0.03). This is provocative because two recent publications reported associations between A*0201:0206 and adverse outcomes. Kawase et al., who studied transplants facilitated by the Japan Marrow Donor Program (JMDP), reported that a graft from an A*0206 donor to an A*0201 recipient was significantly associated with GvHD (OR=1.78, p<0.001) [12]. An analysis of data from the 14th International Histocompatibility Workshop suggested that A*0201:0206 mismatches were associated with mortality [14]. One limitation of these two studies is that the subjects are likely to overlap, because many of the A*0201:0206 mismatched pairs examined in the Workshop were obtained from the JMDP. Nevertheless, the A*0201:0206 mismatch combination has now been identified as a potential risk factor in several studies, suggesting that this particular mismatch may be associated with acute GvHD and perhaps mortality.
A high incidence of certain HLA-A disparities in the JMDP data set allowed Kawase et al. to examine directional differences in HLA disparities [12]. For example, in their analysis an A*0201 donor:A*0206 recipient was not detrimental (OR=1.23, p=0.22) while A*0201 recipient:A*0206 donor (OR=1.78, p<0.001) was significantly associated with GvHD. In their study, there were three HLA-A disparities that were associated with acute GvHD and all of these associations were directional. The possibility of directional effects has also been suggested by laboratory studies involving B*4402:4403 where (1) in vitro assays, (2) characteristics of the peptides bound by the two HLA molecules, and (3) crystal structures of the molecules have shown that the same peptide in the HLA binding groove can have different orientations in HLA-B*4402 and *4403. These differences can be recognized by T lymphocytes [15] and may be clinically important [31]. For studies involving heterogeneous patient populations such as those from the NMDP, there are too few subjects with particular HLA disparities to further subdivide the mismatches according to their presence in the donor or recipient, and this limitation may obscure significant effects.
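The effect of subdividing mismatches by direction on group size can be illustrated with a minimal sketch. The donor-recipient pairs below are hypothetical and are not drawn from the NMDP or JMDP data sets; the helper name `undirected_key` is likewise an assumption made for illustration. Counting each mismatch with and without regard to direction shows how an already small group is split further.

```python
from collections import Counter

# Hypothetical (donor allele, recipient allele) HLA-A mismatches.
# These are illustrative values only, not study data.
pairs = [
    ("A*0201", "A*0206"),
    ("A*0206", "A*0201"),
    ("A*0206", "A*0201"),
    ("A*0201", "A*0205"),
    ("A*0205", "A*0201"),
]

def undirected_key(donor, recipient):
    """Ignore direction: sort the alleles so A:B and B:A collapse together."""
    return tuple(sorted((donor, recipient)))

# Counts when direction is ignored (one group per mismatch combination).
undirected = Counter(undirected_key(d, r) for d, r in pairs)

# Counts when direction is kept (donor, recipient order preserved).
directed = Counter(pairs)

for combo, n in undirected.items():
    print("either direction", combo, n)
for (donor, recipient), n in directed.items():
    print(f"donor {donor} -> recipient {recipient}: {n}")
```

In this toy example the three A*0201:0206 pairs divide into groups of one and two once direction is considered, which is the same dilution that limits directional analysis in heterogeneous populations.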
One of the striking differences between reports from the JMDP and NMDP data sets is the incidence of particular HLA disparities. A study of 5,210 subjects whose transplants were facilitated by the JMDP reported very large frequencies for certain HLA-A mismatch combinations: A*0201:0206 (n=269), A*2402:2420 (n=90), and A*2601:2603 (n=69). The allelic lineages HLA-A2, A26 and A24 present rich diversity in the Japanese population [33]. In contrast, European populations present only a limited number of alleles in these groups because there is one predominant allele in each lineage. The NMDP data set is largely composed of patients and donors with European ancestry, where most of the pairs are HLA matched for common alleles and the mismatches often involve low frequency alleles.
Using the frequencies of the six most common HLA-A disparities observed in this study when HLA-B, -C, and -DRB1 are matched and the 100 day mortality difference observed between each disparity and an 8/8 match, the number of U.S. donor-recipient pairs needed to achieve 80% power to detect an association with survival ranges from 11,000 to more than 1.3 million, depending upon the HLA mismatch (). For example, it is estimated that 72 pairs with a single A*0201:0205 mismatch would be required to achieve 80% power to detect an association with survival versus an 8/8 match. Using the frequency of 27 pairs with this HLA disparity in 4,226 subjects, it is estimated that 11,269 pairs would be required to detect an association with survival. Attempts to overcome this limitation by including pairs with multiple HLA mismatches create another problem: how to adjust for the effects of particular HLA disparities is not yet well understood.
Number of Subjects Needed to Achieve 80% Power for Detecting an Association with 100 day Mortality for the Six Most Frequent HLA-A Disparities in HLA-B, -C, and -DRB1 Matched Pairs*
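The estimate of 11,269 pairs for A*0201:0205 follows from scaling the required number of mismatched pairs by the observed frequency of the disparity. The sketch below reproduces only that scaling step; the figure of 72 mismatched pairs is taken as given from the text, and the function name `total_pairs_needed` is a hypothetical helper, not part of the original analysis.

```python
def total_pairs_needed(required_mismatched, observed_mismatched, cohort_size):
    """Scale the required number of mismatched pairs to a total cohort size.

    Assumes the disparity occurs in future cohorts at the same frequency
    as observed (observed_mismatched / cohort_size).
    """
    return required_mismatched * cohort_size / observed_mismatched

# Values reported in the text for A*0201:0205 in HLA-B, -C, -DRB1 matched pairs.
required = 72      # mismatched pairs needed for 80% power
observed = 27      # pairs with this disparity in the study
cohort = 4226      # total donor-recipient pairs studied

total = total_pairs_needed(required, observed, cohort)
per_thousand = 1000 * observed / cohort

print(round(total))            # 11269
print(round(per_thousand, 1))  # 6.4
```

The same scaling applied to rarer disparities, or to disparities with smaller mortality differences, drives the required cohort toward the upper end of the reported range.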
One alternative which has been explored is to investigate the impact of mismatching particular amino acids within the HLA molecule. Kawase et al. used bootstrap analysis to examine each amino acid involved in an HLA mismatch and concluded that two specific amino acid substitutions in HLA-A were associated with acute GvHD: position 9 Tyr vs Phe and position 116 Asn vs Asp [12]. This approach, however, has some limitations, and the conclusions about some specific residues may be biased since the context of the amino acid differences is ignored. For example, in the NMDP data set residue 9 can be mismatched along with 0-28 additional amino acids; a large body of laboratory investigation suggests that the context of an amino acid difference is crucial for allorecognition. The analysis conducted by Kawase et al. may also be confounded by the occurrence of additional HLA mismatches in each donor-recipient pair. In this scenario, specific amino acid mismatches may occur in one or a few mismatch combinations that may be accompanied by an additional mismatch in another locus as the result of linkage disequilibrium. Therefore, the effect attributed to a particular amino acid mismatch at a given locus may in fact result from the occurrence of multiple mismatches. Another concern is that multiple comparisons were made in the study by Kawase et al. Multiple comparisons without support from functional data can detect relationships that are statistically significant but may not be clinically or biologically relevant [13]. In vitro assays may help to clarify the situation for certain HLA disparities [34], but this approach is not practical for studying the large number of combinations of HLA mismatches that are observed in clinical transplantation.
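The concern about multiple comparisons can be made concrete with a short sketch. It assumes independent tests at a nominal significance level of 0.05, which is a simplification; the numbers of comparisons shown are illustrative and do not correspond to the number of tests performed in any cited study. The Bonferroni threshold is shown as one common correction, not as the method used by the cited authors.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise rate <= alpha."""
    return alpha / n_tests

# Illustrative numbers of comparisons (assumed, not taken from any study).
for m in (1, 10, 50, 100):
    fwer = familywise_error_rate(m)
    threshold = bonferroni_threshold(m)
    print(f"{m:>3} tests: P(>=1 false positive) = {fwer:.3f}, "
          f"Bonferroni threshold = {threshold:.4f}")
```

Under these assumptions, the chance of at least one spurious association rises quickly with the number of amino acid positions or mismatch combinations tested, which is why supporting functional data are valuable.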
This study shows that extraordinary diversity of HLA creates complexity related to multiple HLA disparities between donor and recipient along with differences in the context in which each amino acid difference occurs within HLA molecules. Decades of research in transplant immunology have taught us that these are important factors influencing the ability to stimulate a clinically significant response against an alloantigen. Unfortunately, the relatively large study population examined here (n=4,226) provides only a few examples for investigation and none of these have sufficient subjects to apply robust multivariate statistical approaches. Given extensive HLA polymorphism and complex biology, it is unlikely that the approaches described above will be successful in ranking the relative risks of most of the HLA disparities that are encountered in clinical practice in the United States. Although it may be possible to study certain HLA disparities in specific populations such as the Japanese, these observations may not extend to other racial and ethnic groups because there may be other genetic differences that influence immune responses.
There have been attempts to identify permissive and detrimental HLA disparities using characteristics of HLA mismatches. Several studies have classified HLA disparities based upon typing resolution: low resolution (antigen level) and high resolution (allele level) mismatches. This approach assumes that HLA mismatches detected by serological HLA typing reagents (i.e., low resolution) have different clinical effects than those which are not. The reported studies have been conflicting, with the largest detecting no statistically significant difference for a single allele vs a single antigen mismatch [1]. One explanation is that this comparison does not adequately classify HLA mismatches according to their risk. The number and function of amino acid differences in each of these groups are diverse, and those with 3-6 amino acid differences can have either high or low resolution HLA disparities ().
Scoring systems which group HLA disparities based upon their similarity have also been explored. One approach which assigned scores to HLA disparities based upon the number, position and properties of the amino acid differences has not correlated well with clinical outcomes [18]. This may in part be caused by observations showing that the number of mismatched amino acids may not correlate with T lymphocyte responses [34]. Duquesnoy has created an algorithm which has shown some promise in solid organ transplantation, but ranking by this approach has not correlated with outcomes in a NMDP data set [20].
Although these attempts have not been successful, approaches that combine HLA disparities using similar functional characteristics have the highest likelihood for success in studying populations which have a large number of low frequency HLA disparities. One possibility would be to establish a group in which the amino acid differences are predicted to only alter the peptide binding groove (PBG group) versus all others or perhaps subsets thereof defined by differences in epitopes for T, B, and NK lymphocytes. In this scenario, many high resolution mismatches which have only one amino acid difference in the peptide binding groove (e.g., A*0201:0206) would be classified in the PBG group. Classification of HLA mismatches that have amino acid differences outside the peptide binding groove is more complicated. There is evidence suggesting that there are six amino acids that are frequently involved in docking of T cell receptors [35]. If these positions are the same between two HLA molecules and there are differences in the peptides bound by the HLA molecule, positive selection of T lymphocytes on a similar HLA surface may increase the frequency of alloreactive T lymphocytes. In this case, A*0201:0206, which several studies suggest may be associated with adverse transplant outcomes, may be detrimental because T cells selected by similar docking residues respond to subtle differences in peptides bound by the HLA molecule. HLA disparities with differences in T cell contact residues may be less problematic [34]. Others such as A*0202:0224 would be classified in another group because some of the amino acid differences are located in loops that are predicted to create epitopes for alloantibodies.
Extreme HLA diversity, genetic variation involving other factors influencing immune responses, and numerous factors influencing transplant outcomes (e.g., disease and stage of disease, recipient age) create a formidable challenge for achieving compelling data to develop well accepted guidelines for ranking the HLA mismatches in potential donors. Until this problem can be solved, it is tempting to use available reports with statistically significant comparisons to guide donor selection. The observations reported here illustrate some of the limitations of published investigations and the need for caution in using the available data to guide donor selection. Ultimately, sophisticated models will likely be required to address this important question.