Both LMO2 mRNA and protein expression in diffuse large B-cell lymphoma (DLBCL) have been associated with superior survival; however, a role for germline genetic variation in LMO2 has not been previously reported. Immunohistochemistry (IHC) for LMO2 was conducted on tumor tissue from diagnostic biopsies, and 20 tag single nucleotide polymorphisms (SNPs) from LMO2 were genotyped from germline DNA. LMO2 IHC positivity was associated with superior survival (HR=0.55; 95% CI 0.31–0.97). Four LMO2 SNPs (rs10836127, rs941940, rs750781, rs1885524) were associated with survival after adjusting for LMO2 IHC and clinical factors (p<0.05), and one of these SNPs (rs941940) was also associated with IHC positivity (p=0.02). Compared to a model with clinical factors only (c-statistic=0.676), adding the 4 SNPs (c-statistic=0.751) or LMO2 IHC (c-statistic=0.691) increased the predictive ability of the model, while inclusion of all 3 factors (c-statistic=0.754) did not meaningfully add predictive ability above a model with clinical factors and the 4 SNPs. In conclusion, germline genetic variation in LMO2 was associated with DLBCL prognosis and provided slightly stronger predictive ability relative to LMO2 IHC status.
LMO2 (LIM domain only 2) is located on 11p13 and belongs to a family of four genes encoding LIM-only proteins, which are transcription regulators that control cell fate in normal hematopoiesis and endothelial cell remodeling. LMO2 encodes a 156 amino acid protein composed of two zinc-binding LIM domains, which function in protein-protein interactions in the transcription factor complex that includes E2A, TAL1, GATA1, and LDB1 in erythroid cells [3,4].
While LMO2 is perhaps best known for its role as a T-cell oncogenic protein [5,6], gene expression studies have found that LMO2 mRNA expression in diffuse large B-cell lymphoma (DLBCL) was part of the “germinal center” expression profile, and it emerged as the strongest predictor of overall survival in DLBCL in both the univariate and multivariate settings of a six-gene model. A monoclonal anti-LMO2 antibody was subsequently developed, and immunohistochemical (IHC) analysis showed that LMO2 protein was expressed as a nuclear marker in normal germinal center B-cells and hematopoietic lineages, as well as in leukemias and a subset of germinal center-derived B-cell lymphomas.
Approximately 50% of DLBCL patients express LMO2 protein by IHC analysis, and LMO2 protein expression has been associated with better progression-free and overall survival among DLBCL patients treated with anthracycline-based chemotherapy or rituximab plus anthracycline-based chemotherapy. Unlike its role in leukemias, LMO2 protein expression in DLBCL has not been associated with any somatic genetic alterations [9,10]. This raises the hypotheses that germline genetic variation could play a role in expression of LMO2 and, in the setting of DLBCL, could also be associated with prognosis. We tested these hypotheses in a prognostic cohort of DLBCL patients who participated in a population-based study conducted from 1998–2000 and had data on both germline genotyping for LMO2 and LMO2 expression as assessed by immunohistochemistry in formalin-fixed, paraffin-embedded tumor tissue.
Institutional Review Boards at the National Cancer Institute and each Surveillance, Epidemiology and End Results (SEER) center approved the study protocol. Participants provided written, informed consent prior to completing an in-person interview. Details on survival of the DLBCL patients have been previously described [12,13]. Briefly, 1,321 subjects aged 20–74 years with incident, histologically-confirmed NHL first diagnosed from July 1998 through June 2000 were enrolled in a population-based case-control study. All cases were rapidly reported from four SEER cancer registries in the Detroit metropolitan area, the state of Iowa, Los Angeles County, and Seattle. Any patients known to be HIV-positive were excluded. A total of 1,172 cases (89%) provided either a peripheral blood (N=773) or mouthwash buccal sample (N=399) as a source for DNA.
Date of diagnosis, histology, stage, presence of B-symptoms, first course of therapy, date of last follow-up, and vital status were derived from linkage to registry databases at each study site. Data on first course of therapy included use of single or multi-agent chemotherapy, radiation, other therapies exclusive of chemotherapy and/or radiation, and no therapy (presumed to be observation); information on individual agents and doses was not available. The SEER registries collect date and cause of death, but do not collect data on treatment response, disease recurrence or progression. In 2008, we conducted a second linkage to each SEER registry to update survival information through the end of 2007.
All cases in the study were initially histologically confirmed as NHL and coded according to the International Classification of Diseases for Oncology, 2nd Edition (ICD-O-2)  by the local diagnosing pathologist. Final ICD-O-3 codes based on local pathology review were received during the SEER record linkage. Pathology reports were obtained for 93% (1228 of 1321) of patients for review by an expert hematopathologist (MAV), who classified cases according to the World Health Organization classification for lymphoid neoplasms  and assigned a confidence score to the subtype diagnosis (≥90% versus <90%). For all cases with low confidence in the NHL subtype classification as well as a 20% random sample of cases with high confidence in the NHL subtype classification, additional immunostaining was conducted to establish the NHL subtype for those patients with available pathology material (N=472). All cases were then assigned a final diagnosis based on the pathology review, updated SEER record linkage if pathology review was not available, or original SEER data if updated data were not available. A total of 417 cases had a final diagnosis of DLBCL (ICD-O-2: 9680–84, 9688, 9712; ICD-O-3: 9678–80, 9684), and all but 36 of these were confirmed by pathology report or slide review.
Sufficient archived, unstained 5-micron slides from formalin-fixed, paraffin-embedded tumor biopsies were available for 239 of the 417 DLBCL cases (57%) for LMO2 IHC staining, which was performed at Stanford University under the direction of a hematopathologist (Y.N.) using an established protocol [9,10]. Two hematopathologists (Y.N. and A.D.) then independently scored the slides as no expression (0%), expression ≤30%, and expression >30%; staining greater than 30% was classified as LMO2 positive, while 0% and ≤30% staining was classified as LMO2 negative using the previously established threshold. LMO2 staining of endothelial cells served as an internal control. The concordance between the two hematopathologists was 89%; discordant cases were reviewed by the two hematopathologists together to assign a final consensus score. Of the 239 cases stained, 5 cases were excluded due to insufficient tissue quality to make a final LMO2 determination, leaving 234 cases for analysis.
Details on genotyping for these samples have been previously described . Briefly, genotyping of LMO2 SNPs was conducted at the NCI Core Genotyping Facility (Advanced Technology Center, Gaithersburg, MD; http://snp500cancer.nci.nih.gov)  as part of a larger, custom-designed GoldenGate assay (Illumina, www.illumina.com), supplemented by TaqMan genotyping. LMO2 tagging SNPs were chosen from the designable set of common SNPs at a minor allele frequency of >5% and a binning threshold of r2>0.8;  a total of 22 SNPs were selected, which represents 88% coverage based on the number of SNPs genotyped divided by the number of bins from the designable set of SNPs.
Quality control was implemented at the level of the entire 1536 SNP OPA for all 1172 (of 1321) NHL cases with a DNA sample. We excluded SNPs that failed to properly cluster in the genotyping calling algorithm (N=3) and SNPs with low completion rate (<90% of samples; N=5). Quality control duplicates with concordance <95% were excluded (N=1). We also excluded samples with a low completion rate (<90% of the full panel of 1536 SNPs, N=17). Hardy–Weinberg equilibrium was evaluated among non-Hispanic Caucasian controls of the parent study, and 3 SNPs showed evidence (p<0.001) for deviation from Hardy–Weinberg proportions, including one LMO2 SNP (rs7941248), but this SNP was retained in the analysis as there was no obvious genotyping error. Thus, all 22 LMO2 SNPs were successfully genotyped. However, we excluded rs2038602 due to lack of any variation (all wild type homozygotes) and rs941941 because it was available for only 50 cases (the remaining cases were not genotyped for this SNP because only a buccal DNA sample was available).
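As a rough illustration of the Hardy–Weinberg check described above, the following Python sketch computes a one-degree-of-freedom chi-square goodness-of-fit test from genotype counts. The function name and the example counts are hypothetical and are not study data, and the original analysis may have used a different (for example, exact) test.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg proportions at a biallelic SNP.

    Arguments are genotype counts (AA, AB, BB). Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts for illustration only (not study data).
chi2_eq, p_eq = hwe_chi_square(360, 480, 160)    # matches HWE proportions
chi2_dev, p_dev = hwe_chi_square(400, 400, 200)  # heterozygote deficit
```

Under the p<0.001 threshold used in the text, the second hypothetical SNP would be flagged for review while the first would not.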
Of the 234 cases with clinical, outcome and IHC data, 162 also had genotyping data and comprised the final analysis sample.
Overall survival was defined as the time from diagnosis to the date of death or last follow-up; patients alive at last follow-up were censored at that time point. We used Cox proportional hazards regression  to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). To efficiently adjust for clinical and demographic factors, we used a previously developed risk score (“clinical risk score”), analogous to a propensity score in logistic regression . The clinical risk score is a linear combination of age (<60 versus 60+ years), stage (local, regional, distant, or missing), B-symptoms (no, yes, missing), initial therapy (chemotherapy + radiation, chemotherapy + other therapy, radiation only, all other therapy, or missing), sex, race (white, other), years of education (<12, 12–15, 16+), and study center (Detroit, Iowa, Los Angeles, Seattle). To address multiple testing, we calculated q-values  for the primary test of each SNP with overall survival; a q-value<0.05 was considered to be noteworthy.
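The multiple-testing step can be sketched in Python using the Benjamini–Hochberg step-up adjustment as a simple stand-in; this is an assumption, since the q-value procedure actually used in the study may differ. The function name is hypothetical. The input uses the six p-trend values reported in the Results, padded with 14 placeholder values of 1.0 for the SNPs whose p-values are not given in the text (20 SNPs were analyzed).

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# p-trend values reported in the Results for six SNPs.
reported = [0.00007, 0.001, 0.061, 0.087, 0.13, 0.15]
# Placeholders (NOT study data) for the 14 SNPs without reported p-values.
placeholders = [1.0] * 14
adj = bh_adjust(reported + placeholders)
```

With these inputs the two smallest adjusted values are approximately 0.0014 and 0.010, which agrees with the q-values reported for rs10128650 and rs10836127; that agreement does not by itself establish which procedure the authors used.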
To assess the correlation of LMO2 IHC staining and host genotypes, we used chi-squared tests and logistic regression. To assess and compare the prognostic ability of the survival models, we used time-dependent ROC curves and the corresponding c-statistics at 8 years (the approximate median follow-up).
Several bioinformatics tools were used to assess the potential biological significance of the genetic variants. First, to identify evolutionarily conserved regions, we used phastCons, which is part of the PHAST (PHylogenetic Analysis with Space/Time models) package. phastCons is a hidden Markov model-based method that estimates the probability that each nucleotide belongs to a conserved element based on the multiple alignments. It considers not just each individual alignment column, but also its flanking columns. We also used ESPERR (Evolutionary and Sequence Pattern Extraction through Reduced Representations) to calculate a Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (approximately 94%) [22,23]. To identify SNPs that might be in regions that bind transcription factors, we used Transfac Matrix Database (v.7.0) Public 2005 (http://www.gene-regulation.com/pub/databases.html), which contains position-weight matrices for 398 transcription factor binding sites, as characterized through experimental results in the scientific literature. Finally, we used ENCODE (Encyclopedia of DNA Elements, http://www.genome.gov/10005107) Integrated Regulation tracks, which can be accessed through the UCSC genome browser, to identify functional elements of the human genome for transcription regulation.
The mean age at diagnosis of the 162 DLBCL patients in this analysis was 60 years (range 24–74), and 58% were male. A majority of patients were white (90%). Clinically, 28% of the patients had advanced stage disease, and 27% had B-symptoms. Based on cancer registry data, the most common treatment was a chemotherapy-based regimen (84%). During a median follow-up of 92 months (range, 27 to 110 months), there were 52 deaths (32%). Of the 52 deaths, 36 (69%) were due to lymphoma. Our clinical risk score (a linear combination of age, stage, B-symptoms, initial therapy, sex, race, education, and study center as described in the Methods section) ranged from −1.34 to 1.43, and as expected, it was strongly predictive of overall survival (HR=3.12 per unit increase in the score; 95% CI 1.86–5.25; p<0.0001). Comparing the 162 patients in this analysis to the 249 that did not have tissue or genotyping data, there were no statistically significant differences based on age, sex, education, stage or treatment status (data not shown).
Defining positivity based on the 30% cutpoint , 79 of the 162 patients (48%) were classified as LMO2 positive. Compared to LMO2 negative patients, LMO2 positive patients had superior overall survival (HR=0.55; 95% CI 0.31–0.97; p=0.04). Further adjustment for the clinical risk score did not alter this association (HR=0.56; 95% CI 0.32–0.98; p=0.04).
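The reported hazard ratio, confidence interval and p-value can be checked for internal consistency with a short Python sketch. It assumes the interval is a Wald interval that is symmetric on the log scale, and it works from the rounded published values, so only approximate agreement should be expected. The function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error, z statistic and two-sided
    Wald p-value implied by a hazard ratio and its 95% confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p

# Unadjusted and clinical-risk-score-adjusted estimates from the text.
se_u, z_u, p_u = wald_p_from_ci(0.55, 0.31, 0.97)
se_a, z_a, p_a = wald_p_from_ci(0.56, 0.32, 0.98)
```

Both implied p-values round to 0.04, in line with the values reported above.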
Table I summarizes the SNP-level results with overall survival after adjustment for the clinical risk score. Two SNPs (rs10128650 and rs10836127) were statistically significantly associated with overall survival at p<0.05. The SNP rs10128650 (MAF of 0.06) is located in a highly conserved domain in the promoter region (by PhastCons analysis), and carrying a variant allele was associated with inferior survival (HR=2.89 for each variant allele, 95% CI 1.72–4.87, p-trend=0.00007; q-value 0.0014). The conserved domain in which rs10128650 resides has a very high regulatory potential score of >0.4 . The intronic SNP rs10836127 had a MAF of 0.18, and carrying a variant allele was also associated with inferior survival (HR=2.18 per variant allele, 95% CI 1.39–3.43; p-trend=0.001; q-value=0.010). The SNP rs10836127 was not located in any known functional genomic domains, but was in linkage disequilibrium (LD) with rs10128650 (D′=0.79; r2=0.19).
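A high D′ alongside a low r2, as reported for this SNP pair, is expected when two linked SNPs have different allele frequencies. The Python sketch below computes both measures from haplotype frequencies. The minor allele frequencies (0.06 and 0.18) are taken from the text, but the joint haplotype frequency is a hypothetical value chosen for illustration, and the sketch assumes the two minor alleles tend to occur on the same haplotype.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D, D_prime, r2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# MAFs from the text; the haplotype frequency 0.0497 is hypothetical.
d, d_prime, r2 = ld_stats(0.06, 0.18, 0.0497)
```

With these inputs D′ is close to 0.79 while r2 stays below 0.2, which is roughly the pattern reported: the rarer allele is mostly found with the commoner one, but the commoner allele often occurs without it.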
Four other SNPs approached statistical significance (0.05≤p-trend≤0.15): rs4756077 (intronic SNP, p-trend=0.061), rs750781 (promoter SNP, p-trend=0.087), rs941940 (promoter SNP, p-trend=0.13), and rs1885524 (intronic SNP, p-trend=0.15). With the exception of rs1885524, all of the remaining SNPs were in weak LD with the top two SNPs (Figure 1).
Using a manual backwards selection strategy to evaluate these six SNPs in a multivariate model adjusted for the clinical risk score, four SNPs remained statistically significant at p<0.05: rs1885524 (HR=1.78, 95% CI 1.16–2.72; p-trend=0.008), rs750781 (HR=2.23, 95% CI 1.33–3.73; p-trend=0.0024), rs941940 (HR=0.54, 95% CI 0.32–0.90; p-trend=0.019), and rs10836127 (HR=1.85, 95% CI 1.14–3.01; p-trend=0.013). In a model that further included LMO2 IHC expression, the HRs for the four LMO2 SNPs were essentially unchanged (data not shown), while the HR for LMO2 IHC expression attenuated slightly (HR=0.63, 95% CI 0.36–1.13; p=0.12).
There were no changes in these associations when analyses were restricted to lymphoma deaths as the outcome (non-lymphoma deaths censored; data not shown). When we excluded patients who did not receive any chemotherapy or restricted to white patients only, all significant SNP associations strengthened (data not shown).
Of the four SNPs associated with survival from the multivariate model, only the LMO2 promoter SNP rs941940 was significantly associated with LMO2 IHC expression (p=0.02). Inspection of the data in Table II supported a recessive model, and patients who were variant homozygotes were 2.89 times more likely to be LMO2 IHC positive compared to wild type homozygotes and heterozygotes combined (95% CI 1.26–6.54). This finding parallels the survival results, which showed that carrying a variant allele was associated with better survival (see previous section). For the other 3 SNPs associated with survival, two of them showed no variability in LMO2 IHC expression (rs1885524 and rs750781), while rs10836127 variant homozygotes showed lower LMO2 IHC expression (20%) compared to heterozygotes (52%) and wild type homozygotes (48%), although the global test was not statistically significant (p=0.29).
The only other SNP to be significantly correlated with LMO2 IHC expression was the promoter SNP rs7941248 (p=0.009). Inspection of Table II supported a dominant model, and patients carrying 1 or 2 variant alleles were 2.50 times more likely to be LMO2 positive compared to wild type homozygotes (95% CI 1.33–4.71). However, this SNP itself was not significantly associated with survival (p-trend=0.33), although the per allele HR was <1.0 (HR=0.81; 95% CI 0.53–1.24; Table I), which (weakly) parallels the IHC results. In the HapMap Phase II Caucasian population, this SNP is in high LD with multiple conserved transcription factor binding sites (TRANSFAC 7.0 Public 2005), including the binding site of a transcription factor, GATA1, which has been reported to regulate the LMO2 expression , suggesting a potential function underlying this association.
To assess the prognostic ability of LMO2 expression and LMO2 SNPs, we used a time dependent area under the curve (AUC) model, and calculated the concordance index (c-statistic) at 8 years follow-up, which provides values from 0.500 (prediction no better than chance) to 1.000 (perfect prediction). The model with the clinical risk score alone had a c-statistic of 0.676, which is consistent with the prediction ability of the IPI for DLBCL . The c-statistic for the model with LMO2 IHC expression alone was 0.573. A model with both the clinical risk score and LMO2 IHC was 0.691. The c-statistic for each of the four SNPs that remained statistically significant in a multivariate model ranged from 0.525–0.579 (Table III). When all four SNPs were included in the same model with no other factors, the c-statistic was 0.672. Adding LMO2 IHC to individual SNPs increased the c-statistic, although adding LMO2 IHC to the four SNP model had minimal impact on the model (c-statistic=0.683). Adding the clinical risk score to either individual SNPs or to the four SNP model increased model prediction. A full model of four SNPs, LMO2 IHC, and clinical risk score was essentially identical (c-statistic=0.754) to the four SNP and clinical risk score model (c-statistic=0.751).
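To make the 0.500 to 1.000 scale concrete, the following Python sketch computes Harrell's concordance index for right-censored data. This is a simpler stand-in and not the time-dependent AUC at 8 years used in the analysis; the function name and the toy data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk_scores: a higher score means higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / usable if usable else float("nan")

# Toy data for illustration only (not study data).
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
c_perfect = harrell_c(times, events, [2.0, 1.5, 1.0, 0.5, 0.1])
c_reversed = harrell_c(times, events, [0.1, 0.5, 1.0, 1.5, 2.0])
c_constant = harrell_c(times, events, [1.0] * 5)
```

A score that ranks every usable pair correctly gives 1.0, a constant score gives 0.5, and a fully reversed ranking gives 0.0.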
For the seven LMO2 SNPs showing some association with either survival (N=6) or LMO2 IHC expression (N=2) (Figure 1), we further looked for additional SNPs in LD (r2 ≥ 0.8, HapMap CEU population) within 250kb of these seven SNPs to identify potential causal SNPs. There were two SNPs in LD with the seven candidate LMO2 SNPs. The SNP rs7119405 is in LD with rs10128650 (D′=1, and r2=0.94). However, SNP rs7119405 is not in a conserved region and does not overlap with any known functional genomic regions, suggesting that rs10128650 is more likely to be the biologically more relevant SNP. In addition, SNP rs1885523 is in LD with rs1885524 (D′=1, and r2=0.98). SNP rs1885523 is located in the intron of LMO2 and has been reported to overlap with a polymerase II binding site (ENCODE transcription binding project).
Using a population-based cohort of 162 DLBCL patients diagnosed from 1998–2000 (pre-rituximab era) and followed through 2007, we show for the first time that germline genetic variation in four LMO2 SNPs (rs1885524, rs10836127, rs750781, rs941940) was associated with overall survival after accounting for clinical factors. The correlation of LMO2 SNP genotype and LMO2 IHC status was more variable, although the SNP rs941940 was significantly correlated with both expression and survival, and several other SNPs trended towards a correlation with both expression and survival. Individually, the SNPs were weak predictors of outcome (c-statistics 0.525–0.579), similar to that of LMO2 IHC (c-statistic 0.573), and weaker than the clinical risk score (c-statistic=0.676). However, when the individual SNPs were combined with the clinical risk score, the c-statistics improved. Further addition of LMO2 IHC to models with SNPs and the clinical risk score only trivially improved the c-statistics. In total, these results suggest that germline genetic variation in LMO2 provides as good or slightly better prognostic information than tumor LMO2 IHC status.
Strengths of this study include the population-based ascertainment of newly diagnosed cases, which enhances generalizability. Our genotyping included extensive quality controls. The LMO2 IHC uses a relatively robust antibody, and scoring was highly reliable as assessed by two independent pathologists. Further, the prevalence of LMO2 expression and its association with survival was in the range of previous studies using this antibody [10,25].
There are also several important limitations. We used a retrospective study design and patients were not uniformly treated as in a clinical trial. While this is an important limitation, it needs to be balanced against the limitations of clinical trials, which are generally based on a highly selected subset of otherwise healthy patients. We do note that the SNP associations were strongest for patients receiving chemotherapy, which in this setting would presumably be given with curative intent. While we did not have data to calculate the IPI, we were able to adjust for clinical and demographic variables, and our clinical risk score predicted survival with robustness similar to that of the IPI in other datasets. We also only had IHC and genotyping data on 38% of the original cohort of patients, although we did not see systematic differences based on inclusion in this analysis. We did not include an analysis of germinal center B-cell like (GCB) phenotype in this report. However, in a prior report, we found that 61% of GCB DLBCLs (as assessed by IHC) were LMO2 IHC positive but that 26% of non-GCB cases were also LMO2 positive. Furthermore, there was no significant association of GCB-phenotype with overall survival, supporting a role for LMO2 independent of GCB phenotype that has been previously reported.
Our study was unable to enroll patients with rapidly fatal disease, consistent with enrollment of only living cases into the parent case-control study. The impact of this bias is that our observed survival is consistent with SEER estimates conditioned on surviving 12 months after diagnosis . Thus, the inferences from this study would not apply to early deaths. Nevertheless, LMO2 IHC positivity and the association with overall survival was similar to other published data , suggesting that the association of LMO2 IHC expression as a prognostic marker for overall survival is not particularly impacted by this bias. The number of events was modest (N=52 deaths), and so the models need to be considered in this context, and in the context that we did not have an independent validation sample. Finally, this study was based on patients initially treated in the pre-rituximab era, and thus our findings will need to be evaluated in rituximab era patients.
Our finding that LMO2 expression as measured by IHC in DLBCL is a strong prognostic factor after considering clinical factors replicates, in a population-based setting, the results of Natkunam and colleagues. They found that LMO2 IHC was positive in 53% of 263 patients treated with anthracycline-based chemotherapy (pre-rituximab) and that LMO2 positive patients had significantly better progression-free (median 12 versus 49 months, p=0.010) and overall (median 21 versus 80 months, p=0.018) survival. Our results are also consistent with studies that have found that higher LMO2 expression as measured by cDNA microarray and RT-PCR, but not qNPA, is associated with better survival in the pre-rituximab setting. While we did not have any data on R-CHOP treated patients, LMO2 IHC expression and mRNA expression as assessed by RT-PCR or qNPA have all been found to be associated with better survival in R-CHOP treated patients, although LMO2 IHC expression was not statistically significantly associated with survival (63% 5-year survival for LMO2+ and 61% 5-year survival for LMO2− patients, p=0.78) in a study of DLBCL patients age 60 and older. However, the proportion of LMO2 positive cases in this study was somewhat lower than in previous studies and only 82 cases were analyzed for LMO2 expression.
Our data also strongly suggest that common germline genetic variation in the LMO2 gene is associated with DLBCL prognosis. There were four SNPs in LMO2 that were associated with overall survival in a multivariate model, two were in the promoter (rs750781 and rs941940), one was intronic (rs10836127) but in weak LD with the promoter SNPs, and the fourth (rs1885524) was intronic and not in LD with the promoter SNPs. The latter intronic SNP was in high LD with rs1885523, which is located in a polymerase II binding site. Of these same four SNPs, only one SNP (rs941940) was significantly associated with LMO2 IHC expression, and the allele associated with expression was also associated with better survival. The only other SNP associated with LMO2 IHC expression, rs7941248, was in high LD with multiple conserved transcription factor binding sites, including GATA1, which has been shown to regulate LMO2 expression . However, rs7941248 was not significantly associated with survival (p-trend=0.33), although patients carrying one or more variant alleles (which was associated with LMO2 IHC expression) did have better survival. The lack of a strong correlation between SNPs predicting survival and SNPs predicting LMO2 IHC expression may be due to variability in LMO2 protein expression, technical issues in LMO2 IHC staining and scoring, or other mechanisms that regulate LMO2 expression (e.g., epigenetics, miRNAs). Nevertheless, taken together, our data support the hypothesis that genetic variation in LMO2, particularly in the promoter, may play a role in both LMO2 expression and survival in DLBCL patients.
The biologic relevance of LMO2 in DLBCL is not known, but it appears to be involved in several important physiologic and pathologic processes relevant to lymphomagenesis. LMO2 is an important transcription factor and its expression is required for hematopoiesis [1,28], angiogenesis early in development, and vascular endothelial remodeling in adults. Nuclear LMO2 is also widely expressed in the vasculature of native tissues, including lymphatic vasculature, and is detected in the secretory but not proliferative phase of the endometrial gland, suggesting tissue-specific regulation. From a pathologic perspective, chromosomal translocations with the T-cell receptor locus or insertional mutations have been associated with T-cell leukemias [5,30], although microarray analysis has found ectopic LMO2 expression in many cases without chromosomal changes. In a transgenic mouse model of LMO2, the mice developed T-cell lymphoblastic lymphomas and associated leukemias [32,33]. LMO2 is also uniformly expressed in benign vascular and lymphatic neoplasms and in most malignant vascular neoplasms. Of note, to date LMO2 expression in DLBCL has not been associated with any somatic genetic alterations [9,10]. Physiologic (and aberrant) control of LMO2 expression is only now being unraveled, and early reports support a role for tissue-specific regulatory elements as well as microRNAs (specifically miR-223) during erythroid differentiation.
Irrespective of LMO2 biologic functions, our study shows that genetic variation in LMO2 in combination with clinical factors is a robust prognostic factor for DLBCL, and that LMO2 IHC did not add additional predictive ability once these factors were considered. If replicated, future prognostic indices should consider germline genetic risk markers in LMO2 and perhaps other genes known to be prognostic in DLBCL.
This work was supported by the National Cancer Institute (grants R01 CA96704, R01 CA129539, P50 CA97274, P01 CA17054, and P30 CA014089; NCI Intramural Program; SEER contracts N01-PC35139, N01-PC67008, N01-PC67009, N01-PC65064, N01-PC71105).
We thank Drs. Scott Davis (Fred Hutchinson Cancer Research Center) and Richard K. Severson (Wayne State University) for contributing data. The authors thank Sondra Buehler for editorial assistance.
Potential conflict of interest: There are no conflicts of interest.