Given recent advances in genetic association studies, the age of genetically based personalized medicine may finally arrive in the near future for many diseases. The field of hematopoietic cell transplantation (HCT) is well ahead of this curve. In the entire field of medicine, there is no better demonstration of the power of personalized medicine than the use of patient-donor human leukocyte antigen (HLA) matching to improve stem cell transplant outcomes. Nevertheless, even the HCT field will benefit significantly from the latest developments in genetic association studies, which have allowed expanded analyses of the human genome beyond the major histocompatibility complex (MHC).
Despite extensive efforts to improve transplant outcomes through HLA matching, the nonrelapse mortality (NRM) rate has remained relatively unchanged over the last decade. Recent data from our Center for 1990–2000 indicate that the cumulative incidence of NRM is 25% and 30% for matched sibling and unrelated transplants, respectively, among patients undergoing a myeloablative transplant for a hematological malignancy (Figure 1). Although many of these deaths are related to graft versus host disease (GVHD), many are also attributable to other complications of HCT, including opportunistic infection and organ specific syndromes, some of which may reflect GVHD and related immune deficiency. Because there is evidence suggesting that the risk for some of these complications may be influenced by genetic loci outside of the MHC, many investigators have already begun to evaluate variation in genes of candidate biologic pathways that play a role in the pathogenesis of these complications, trying to identify genetic variants that might serve not only as biomarkers for predicting the risk of developing these complications, but also as potential future therapeutic targets. However, the results of this approach have been less than impressive. The majority of studies conducted thus far have yielded results that still await rigorous replication in independent validation studies. This is likely due to shortcomings in addressing issues specific to genetic association studies, such as appropriate selection of cases and controls, use of biologically relevant and well defined phenotypes, adequate sample size to achieve statistical power, incorporation of population stratification analysis, and most importantly, validation of the genetic association in an independent cohort [1,2].
Our research group recently published three studies evaluating genes in candidate pathways that are involved in the development of infections and airway injury after allogeneic HCT. While these are not the only studies available, they are representative examples of genetic association studies using a two-phase discovery and validation approach. In the first study, we comprehensively screened innate immunity genes to discover and verify that single nucleotide polymorphism (SNP) haplotypes across the highly polymorphic bactericidal permeability increasing (BPI) gene influence the risk of developing rapid airflow decline after allogeneic HCT. We subsequently confirmed that BPI, an anti-inflammatory lipopolysaccharide binding protein, is produced by small airway epithelial cells, further supporting its potential role in the development of bronchiolitis obliterans syndrome, a form of GVHD in the lungs. In a second study, we discovered that a SNP haplotype containing a putative functional SNP in the promoter region of the lipopolysaccharide binding protein (LBP) gene is associated with a two-fold increase in the risk of developing Gram-negative bacteremia after allogeneic HCT. We then confirmed this association in a prospective cohort, where we demonstrated that the homozygous rare genotype is associated with a nearly three-fold increase in circulating LBP levels, which likely plays a significant role in the downregulation of the innate immune response to Gram-negative bacteria, potentially resulting in unchecked bacterial replication. Finally, we recently conducted a study that comprehensively evaluated toll-like receptors (TLR) 2, 4, 6, and 9 and discovered and validated the finding that a SNP haplotype in TLR4 containing previously known functional SNPs is associated with a two- to five-fold increase in the risk of developing invasive Aspergillus infection.
Despite these and other successes, the candidate gene approach remains extremely limited in scope. Furthermore, the quality of genetic association studies continues to improve. The genome wide genetic association studies published in 2007 and 2008 clearly indicate that the field of genetic epidemiology has entered a new era. These studies examined complex phenotypes such as diabetes [8,9], rheumatoid arthritis [10,11], and multiple sclerosis by scanning the genomes of tens of thousands of individuals using high throughput genotyping arrays containing over 500,000 SNPs. Recently, we also embarked upon a genome wide analysis of infectious and organ specific complications of allogeneic HCT. Although the stem cell transplant population is much smaller than those used for other more common diseases, our population benefits from phenotypes that occur with higher frequency and specificity, both of which can significantly decrease the false discovery rate and increase the likelihood that we will discover a genetic relationship that is biologically plausible. From 3,177 patients undergoing allogeneic HCT at our Center between 1992 and 2005 for whom cryopreserved peripheral blood mononuclear cells (PBMC) and/or B-lymphoblastoid cell lines (B-LCL) were available, we randomly selected a discovery cohort consisting of 1,560 donor-patient pairs (a total of 3,120 specimens), and reserved the remaining donor-patient pairs as the validation cohort. Each of these specimens was genotyped using the Affymetrix GeneChip® Genome-Wide Human Single Nucleotide Polymorphism (SNP) Array 5.0, which contains over 500,000 SNPs placed evenly across the genome (methods will be explained in detail later in this proposal). Population stratification was accounted for using a principal components approach to identify a subset of SNPs that capture genetic differences between ethnic populations.
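The random allocation of transplants to a discovery and a validation cohort can be illustrated with a minimal sketch. It assumes that each of the 3,177 patients corresponds to exactly one donor-patient pair, and it uses hypothetical integer identifiers and a hypothetical helper named `split_cohort`; the actual randomization procedure is not described in the text.

```python
import random


def split_cohort(pair_ids, n_discovery, seed=0):
    """Randomly split donor-patient pair IDs into discovery and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    discovery = set(rng.sample(pair_ids, n_discovery))
    # Every pair not drawn for discovery is held back for validation.
    validation = [pid for pid in pair_ids if pid not in discovery]
    return sorted(discovery), validation


# Hypothetical IDs standing in for the 3,177 transplants described above.
pair_ids = list(range(1, 3178))
discovery, validation = split_cohort(pair_ids, n_discovery=1560)
print(len(discovery), len(validation))
```

Under the one-pair-per-patient assumption, the validation cohort would contain the 1,617 pairs not drawn for discovery. Fixing the split before any association testing keeps the validation cohort independent of the discovery analysis.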
Our analyses thus far have focused on a number of infectious and organ specific endpoints. Some initial results from our analysis of Gram-negative bacteremia and bronchiolitis obliterans syndrome (BOS) are promising. To develop the Gram-negative (GN) bacteremia phenotype, an a priori list of GN bacterial organisms was assembled from a review of our laboratory database. Patients were considered a "case" if they had one or more positive blood cultures with one of these organisms prior to discharge from our Center. A time to event allelic and genotypic analysis was performed for each SNP to calculate hazard ratios and 95% confidence intervals for all alleles and genotypes, respectively. Death was treated as a competing event. Pretransplant neutropenia, acute GVHD, and time to engraftment, defined as three consecutive days with a neutrophil count >500 cells/µL, were included as covariates in the genetic model. Overall, there were 303 cases of GN bacteremia in the discovery cohort. After eliminating SNPs that were in Hardy-Weinberg disequilibrium at p<0.001 or had a genotype call rate of <80%, 17 patient SNP genotypes were significantly associated with the GN bacteremia phenotype, with p-values ranging from 1 × 10^-6.1 to 1 × 10^-8.1 (Figure 2 and Table 1). Using a similar approach for BOS, which was defined according to recent recommendations from the National Institutes of Health, we have identified four patient and three donor genetic loci that are associated with a higher risk of developing BOS, with p-values ranging from 1 × 10^-6.0 to 1 × 10^-8.1.
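The SNP quality-control step described above (dropping SNPs in Hardy-Weinberg disequilibrium at p<0.001 or with a call rate below 80%) can be sketched as follows. The text does not say which Hardy-Weinberg test was used; this sketch assumes a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts, and the function names and example counts are hypothetical.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_samples, min_call_rate=0.80, hwe_alpha=0.001):
    """Keep a SNP only if its call rate and Hardy-Weinberg p-value are acceptable."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False  # no genotype was called for this SNP
    call_rate = called / n_samples
    return call_rate >= min_call_rate and hwe_p_value(n_aa, n_ab, n_bb) >= hwe_alpha


# Hypothetical genotype counts for three SNPs.
print(passes_qc(360, 480, 160, n_samples=1000))  # counts match HWE expectations
print(passes_qc(500, 0, 500, n_samples=1000))    # no heterozygotes at all
print(passes_qc(360, 480, 160, n_samples=1560))  # only 1,000 of 1,560 called
```

For rare alleles an exact test is often preferred over the chi-square approximation; the thresholds are the only parts of this sketch taken from the text.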
Despite these promising genetic leads, it should be recognized that this approach is not without its own challenges. For instance, when over half a million analyses are conducted at the same time, the multiple comparison penalty restricts attention to the loci that meet the most stringent threshold for significance. While this conservative approach will reduce false positive discoveries, it also means that promising associations that do not quite meet the significance threshold will not receive the critical attention they deserve. Another major limitation of the agnostic genome wide approach is that these discoveries rarely lead to a candidate gene or pathway whose biologic role is immediately obvious. In fact, most genome wide studies thus far have resulted in associations with loci in gene deserts, leaving scientists at the beginning of the long and arduous task of elucidating the mechanisms that account for the observed associations. Finally, we should recognize that while the era of genomics is technologically attractive, environmental influences such as clinical factors still play a major role in the development of many of these phenotypes. Our task is to integrate these genetic discoveries with the critical clinical factors to develop a biologically plausible hypothesis that will one day help us improve the outcome of stem cell transplantation.
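The size of the multiple comparison penalty can be made concrete with a short calculation. The text does not state which correction or significance level was applied; the sketch below assumes a Bonferroni correction at a family-wise alpha of 0.05 across 500,000 SNPs, and the two p-values tested at the end are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def is_significant(p_value, threshold):
    """True if a single test clears the corrected threshold."""
    return p_value < threshold


n_snps = 500_000  # approximate number of SNPs on the array
threshold = bonferroni_threshold(n_snps)
print(threshold, -math.log10(threshold))

# Hypothetical p-values: one clears the corrected threshold, one does not.
print(is_significant(1e-8, threshold), is_significant(1e-6, threshold))
```

Under these assumptions the per-SNP threshold is about 1 × 10^-7, so an association with a p-value of 1 × 10^-6, which would be striking in a single-gene study, falls short in a genome wide scan.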
As we wait for these discoveries to be validated, we have already learned three valuable lessons regarding the future of genetic research in HCT. First, while candidate gene approaches can succeed, they are less likely to identify genetic variants that will have a clinically meaningful effect when the entire genome is considered. However, this does not mean that candidate gene approaches should be abandoned. Rather, we believe that the candidate gene approach should take advantage of the flood of genetic information available from genome scans, such that each candidate gene can be more comprehensively evaluated. Second, these results demonstrate that an agnostic genome wide approach toward transplant outcomes can reveal highly statistically significant associations. While these discoveries still require rigorous validation, we have almost certainly identified leads that will ultimately take us down pathways that have never previously been considered. Importantly, this unbiased genome wide approach will likely identify targets that are not only clinically meaningful as potential predictive markers, but will also serve as high yield therapeutic targets. Finally, as we analyze genome wide data, we are repeatedly reminded of the importance of collaboration. Success in genome wide studies requires expertise in multiple areas of research (i.e., clinical and genetic epidemiology, statistical genetics, bioinformatics, molecular biology, immunobiology, and multiple clinical subspecialties), and is highly dependent upon cohort size and replication. Our community of scientists and patients is much smaller than those associated with more common diseases. In order for us to fully capitalize upon the coming technological advances, we will need to pool our resources and work together in this exciting new era of genome sciences.