Classification of diffuse large B-cell lymphoma (DLBCL) into cell-of-origin (COO) subtypes based on gene expression profiles has well-established prognostic value. These subtypes, termed Germinal Center B cell (GCB) and Activated B cell (ABC), also have different genetic alterations and over-expression of different pathways that may serve as therapeutic targets. Thus, accurate classification is essential for analysis of clinical trial results and planning new trials using targeted agents. The current standard for COO classification uses gene expression profiling (GEP) of snap frozen tissues and a Bayesian predictor algorithm. However, this is generally not feasible. In this study, we investigated whether the quantitative nuclease protection assay (qNPA) technique could be used for accurate classification of COO using formalin fixed, paraffin embedded (FFPE) tissues. We analyzed expression levels of 14 genes in 121 cases of R-CHOP treated DLBCL that had previously undergone GEP using the Affymetrix U133 Plus 2.0 microarray and had matching FFPE blocks. Results were evaluated using the previously published algorithm with a leave-one-out cross-validation approach. These results were compared to COO classification based on frozen tissue GEP profiles. For each case, a probability statistic was generated indicating the likelihood that the classification using qNPA was accurate. When data were dichotomized into GCB or non-GCB, overall accuracy was 92%. The qNPA technique accurately categorized DLBCL into GCB and ABC subtypes, as defined by GEP. This approach is quantifiable, applicable to FFPE tissues with no technical failures, and has potential for significant impact on DLBCL research and clinical trial development.
Gene expression profiling can be used to classify DLBCL into two biologically based cell-of-origin (COO) subtypes with marked prognostic value that is retained even with immunochemotherapy. The Germinal Center B cell (GCB) subtype has over-expression of germinal center-related genes, a higher rate of BCL2 translocations, and more frequent gains of 12q12. In contrast, the Activated B cell (ABC) subtype has over-expression of the NF kappa B pathway, more frequent trisomy 3, gains of 3q and 18q21–q22, and losses of 6q21–q22.(1–3) These biological differences strongly imply that different strategies for therapeutic targeting should be used for each subtype.(2;4–7) At this time, there is a need for methods of accurate classification in order to interpret retrospective and prospective clinical trial results as well as to design new trials using targeted agents. The original standard for COO classification used RNA extraction, gene expression profiling (GEP) of snap frozen tissues, and a Bayesian predictor algorithm. Previously, a larger data set was reduced to 14 key genes in order to make this distinction.(8) An immunohistochemistry (IHC) classification scheme (9), based on 3 antibodies, is widely used as a substitute for GEP classification; however, IHC does not completely correlate with GEP classification or outcome.(10) More recently, an improved IHC scheme based on 5 antibodies has been proposed with improved specificity.(11) Both of these techniques rely on semi-quantitative assessment of protein levels rather than quantitative mRNA levels. Furthermore, the frequency of cases with failed or uninterpretable IHC results has not been reported in any of the previous papers.
We recently described a qNPA assay (ArrayPlateR) for quantification of mRNA in tissues with excellent correlation between frozen and formalin fixed paraffin embedded (FFPE) materials and further extended our pilot series to a large group of DLBCL cases.(12;13) Initially, this assay was developed to measure the mRNA levels of genes previously identified using gene expression profiling or RT-PCR as highly prognostic in DLBCL.(14–17) In this study, we developed and incorporated additional gene probes to measure genes that were part of the Bayesian predictor algorithm for classifying ABC and GCB subtypes.(8) Our hypothesis was that qNPA could be used for accurate classification of COO using FFPE tissues. The overall goal of the project was to develop a robust, reproducible method to classify DLBCL into COO using FFPE blocks with the same accuracy as gene expression profiling for application in future studies.
All work was performed under an IRB approved protocol from the University of Arizona, Tucson, Arizona, USA. We retrospectively collected 121 cases of R-CHOP treated, de novo DLBCL from adult patients that had undergone GEP using the Affymetrix U133 Plus 2.0 microarray and had matching FFPE blocks; these cases were analyzed with qNPA in duplicate. Of these, 106 cases were part of the originally published case series by the Lymphoma and Leukemia Molecular Profiling Project,(18) with the remaining 15 collected later under the same protocol and so not included in the initial publication. All available cases that had successful gene expression profiling and adequately thick FFPE blocks were included. Five unstained sections of 5 microns in thickness were cut from each FFPE block and sent to the University of Arizona, where they were anonymously coded and forwarded to High Throughput Genomics (HTG) for analysis. The HTG personnel were blinded to the COO status of the cases.
We expanded the previous gene probe repertoire of the DLBCL-ArrayPlateR assay to include the 14 genes most pertinent to COO classification. The genes of interest for this project were CD10, LRMP, CCND2, ITPKB, PIM1, IL16, IRF4, FUT8, BCL6, PTPN1, LMO2, CD39, MYBL1, and IGHM.(8)
The ArrayPlateR assay was performed as previously described.(12;13) Briefly, following lysis of the tissue sections, 50-mer probes specific for the genes of interest were incubated with the samples, forming specific probe-mRNA duplexes. Unhybridized probes were then digested by S1 nuclease, followed by alkaline hydrolysis to destroy the mRNA in the duplexes. This left intact probes at concentrations stoichiometrically proportional to the abundance of the specific mRNA in the original tumor sections. Samples were transferred for probe detection to ArrayPlates, which contained 16 unique, covalently bound, 25-mer “anchor” oligonucleotides spotted on the bottom of each well. This universal array was modified to bind 50-mer probes for the genes of interest by exposing the array to a mixture of 50-mer Programming Linker oligonucleotides, each containing a 25-mer sequence designed to bind one of the probes at one end and a 25-mer sequence to bind one of the anchor oligonucleotides at the other end. After hybridization, probes from the sample were bound to array elements by the Programming Linker oligonucleotides. A mixture of Detection Linker oligonucleotides was then added. Each 50-mer Detection Linker contained, at one end, a 25-mer sequence that bound the portion of the sample probe not bound by the Programming Linker and, at the other end, a common 25-mer sequence that bound a Detection Probe bound to horseradish peroxidase. Finally, a chemiluminescent peroxidase substrate was added to allow quantification (Lumigen PS-atto, Lumigen, Inc., Southfield, MI). Figure 1 shows a schematic of the assay. The signals were recorded simultaneously by imaging the plate from the bottom with a CCD-based Omix Imager (HTG). Images were analyzed using Vuescript software (HTG). Expression levels were normalized to the housekeeping gene TBP.(13;19)
Using the previously described predictor, samples were initially subtyped into ABC and GCB categories based on a subset of 187 probes from the U133 Plus array, resulting in 58 ABC samples, 49 GCB samples, and 14 samples with intermediate gene expression, which were left unclassified.(18)
The qNPA data were normalized to the housekeeping gene TATA Box Binding Protein (TBP), as in previous studies.(13;19) The data were then log2 transformed. Following a leave-one-out framework, Bayesian predictive models were generated based on those qNPA assays that were found to be significantly different between ABC and GCB with p<0.001.(8) Briefly, a test sample was removed from the data set and its subtype was predicted using a predictor trained on the remaining samples. To do this, a predictor score was calculated for the left-out sample as the weighted sum of its log-transformed, normalized qNPA values, with the weight for each qNPA probe equal to the t-statistic for ABC/GCB differential expression of that probe in the remaining training set. This score was then compared to the distributions of scores within the ABC and GCB subsets of the training set to arrive at a Bayesian probability for the sample to be ABC versus GCB.
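The leave-one-out scoring described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes a pooled-variance t-statistic, normal score distributions within each subtype, and equal prior probabilities for ABC and GCB. It also omits the p<0.001 probe selection step. All function names and the two-probe toy values are hypothetical.

```python
import math
from statistics import mean, stdev


def log2_normalize(signal, tbp_signal):
    """log2 of a probe signal relative to the TBP housekeeping signal."""
    return math.log2(signal / tbp_signal)


def t_statistic(group_a, group_b):
    """Pooled-variance two-sample t-statistic (assumed form of the t-test)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * stdev(group_a) ** 2
              + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution (assumed score distribution)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def weighted_score(sample, weights):
    """Predictor score: weighted sum of log-transformed, normalized values."""
    return sum(w * x for w, x in zip(weights, sample))


def prob_abc(train, labels, test_sample):
    """Probability that test_sample is ABC, using only the training samples."""
    abc = [s for s, lab in zip(train, labels) if lab == "ABC"]
    gcb = [s for s, lab in zip(train, labels) if lab == "GCB"]
    # One weight per probe: t-statistic for ABC versus GCB on the training set.
    weights = [t_statistic([s[g] for s in abc], [s[g] for s in gcb])
               for g in range(len(test_sample))]
    abc_scores = [weighted_score(s, weights) for s in abc]
    gcb_scores = [weighted_score(s, weights) for s in gcb]
    x = weighted_score(test_sample, weights)
    # Equal priors assumed, so the posterior depends only on the density ratio.
    diff = (normal_logpdf(x, mean(gcb_scores), stdev(gcb_scores))
            - normal_logpdf(x, mean(abc_scores), stdev(abc_scores)))
    if diff > 700:  # guard against overflow in math.exp
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


def leave_one_out(samples, labels):
    """Predict each sample from a predictor trained on all other samples."""
    return [prob_abc(samples[:i] + samples[i + 1:],
                     labels[:i] + labels[i + 1:],
                     samples[i])
            for i in range(len(samples))]


# Hypothetical, already log2-normalized values for two probes per sample.
samples = [
    [4.25, 2.25], [4.75, 1.75], [5.25, 1.25], [5.75, 0.75],  # labeled ABC
    [0.25, 5.75], [0.75, 5.25], [1.25, 4.75], [1.75, 4.25],  # labeled GCB
]
labels = ["ABC"] * 4 + ["GCB"] * 4
probabilities = leave_one_out(samples, labels)
```

Because the weights and the reference score distributions are recomputed for every left-out sample, the test sample never influences its own prediction.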
The classification results based on qNPA were then compared to the previous COO classification based on frozen tissue GEP profiles.(18) Since different cross-validation runs included different weights and reference distributions, the predictor scores for individual cross-validation runs were not directly comparable. To present comparable scores in Figure 2, the scores of an individual cross-validation run were linearly transformed so that the means of the ABC and GCB scores on the training set of the run were equal to the corresponding means from a model made from the entire set of 121 samples.
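The linear transformation described above is fixed by two conditions (the ABC mean and the GCB mean of the run must map onto the reference means), which determine a slope and an intercept. The following sketch shows that calculation; the function name and the example means are hypothetical.

```python
def make_rescaler(run_abc_mean, run_gcb_mean, ref_abc_mean, ref_gcb_mean):
    """Return a function mapping one run's scores onto the reference scale.

    The map is s -> slope * s + intercept, chosen so that the run's ABC and
    GCB training means land exactly on the reference ABC and GCB means.
    """
    slope = (ref_abc_mean - ref_gcb_mean) / (run_abc_mean - run_gcb_mean)
    intercept = ref_abc_mean - slope * run_abc_mean

    def rescale(score):
        return slope * score + intercept

    return rescale


# Hypothetical means: one cross-validation run versus the full-data model.
rescale = make_rescaler(run_abc_mean=10.0, run_gcb_mean=-10.0,
                        ref_abc_mean=50.0, ref_gcb_mean=-30.0)
rescaled_abc_mean = rescale(10.0)    # lands on the reference ABC mean
rescaled_gcb_mean = rescale(-10.0)   # lands on the reference GCB mean
```

Since the map is linear with a nonzero slope, the posterior probability assigned to a sample is not changed by the rescaling; only the plotted score axis is.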
All 14 genes in all 121 cases were successfully analyzed. Although gene selection was cross-validated, the data were sufficiently uniform that all cross-validated runs on qNPA identified the same 12 probe sets as the most significant (p<0.001). The two eliminated genes were IGHM (p=0.06) and PTPN1 (p=0.017). The p-values for the remaining 12 genes were: CD10 p=6.85×10^−13, LRMP p=2.28×10^−6, CCND2 p=6.87×10^−6, ITPKB p=1.6×10^−5, PIM1 p=5.62×10^−8, IL16 p=6.43×10^−7, IRF4 p=3.87×10^−12, FUT8 p=1.23×10^−11, BCL6 p=1.29×10^−7, LMO2 p=7.96×10^−9, CD39 p=1.11×10^−7, MYBL1 p=1.11×10^−16 (p-values presented here were calculated across the entire set of 121 samples). All further analysis was performed with the 12 most significant genes. Of the 121 cases, 49 were GCB, 58 were ABC and 14 were unclassifiable as categorized by GEP in the original publication.(18) For each case, a probability statistic was generated indicating the likelihood that the classification using qNPA was accurate. Of the GCB cases, 36/49 (73%) were classified correctly by qNPA with a confidence cut-off of >0.9 and 39/49 (80%) were classified correctly with a confidence cut-off of >0.8. Of the ABC cases, 46/58 (79%) were correctly classified as ABC using qNPA with a confidence cut-off of >0.9 and 48/58 (83%) were classified correctly with a confidence cut-off of >0.8. These results are tabulated in the top portion of Table 1. When the data were dichotomized into GCB and non-GCB with a confidence cut-off of above or below the 50th percentile of the probability statistic, and unclassifiable cases were excluded, 45/49 cases were correctly classified as GCB and 53/58 cases were correctly classified as ABC, for an accuracy rate of 92%. These data are summarized in the bottom portion of Table 1. The calculated sensitivity is 92% and specificity is 90% for the GCB subtype.
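As a check on the arithmetic, the percentages quoted above can be recomputed directly from the reported case counts. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
def percent(correct, total):
    """Percentage of correctly classified cases, rounded to the nearest integer."""
    return round(100 * correct / total)


# Case totals by frozen-tissue GEP subtype (unclassifiable cases excluded).
totals = {"GCB": 49, "ABC": 58}

# Correct qNPA calls at each confidence cut-off, as reported in the text.
correct_at_cutoff = {
    ("GCB", 0.9): 36, ("GCB", 0.8): 39,
    ("ABC", 0.9): 46, ("ABC", 0.8): 48,
}
table_top = {key: percent(n, totals[key[0]])
             for key, n in correct_at_cutoff.items()}

# Dichotomized calls: 45/49 GCB and 53/58 ABC correctly classified.
gcb_correct, abc_correct = 45, 53
accuracy = percent(gcb_correct + abc_correct, totals["GCB"] + totals["ABC"])
gcb_sensitivity = percent(gcb_correct, totals["GCB"])
```

Here `accuracy` is 98 correct calls out of 107 classifiable cases, and `gcb_sensitivity` treats GCB as the positive class.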
Observed model prediction differences can result from technology platform differences, different cases used in the training set, or model formulation (12 genes versus 187 genes). We therefore undertook a comparison of 3 different models for ABC and GCB distinction by examining the probability scores derived from (1) 12 genes measured with the qNPA platform, with the model derived using cross-validation as described above, (2) the same 12 genes measured with the Affymetrix platform, with the model retrained via cross-validation on the 121 test samples, and (3) the 187-gene predictor trained on a separate set of 181 CHOP cases as previously described.(18) A summary of these models is shown in Table 2. Two-way graphs comparing the scores generated by each model plotted against each other are shown in Figure 2. On the left, 2 sets of Affymetrix data are shown, one with 12 genes and the other with 187 genes. Inspection reveals good general agreement between the 2 models. Any scatter in the plot was considered to result primarily from the reduced number of genes in the model and possibly the different set of cases used for cross-validation, but not from the technical platform, which was the same. On the right is a comparison of the 12-gene model using qNPA versus the 187-gene model using Affymetrix, with an R-scale value of 0.89 and a p-value of < 0.0001. Of interest, the amount of scatter is similar to that in the left panel, which used different numbers of genes (12 vs. 187) but the same platform and had a linear regression R-scale value of 0.94 and a p-value of < 0.0001. Both graphs show excellent linear correlations. The similarity of the graphs indicates that the scatter in the right plot is also mainly due to the reduced number of genes or the different case validation set rather than any inaccuracy in the qNPA platform.
The goal of the project was to develop a technique for assessment of FFPE blocks with the same accuracy as gene expression profiling, so that it could be used to interrogate other collected case series. We reasoned that such a technique would have wide application in DLBCL research and clinical trials. In this study, we show that the qNPA technology can categorize DLBCL into GCB and ABC subtypes with high accuracy as compared to GEP. Importantly, both the laboratory work and the classification analysis were performed blinded to the COO category based on the previous GEP, and used cases from one of our previously published papers, which was used to define ABC and GCB in the literature. Also, the algorithm previously used for GEP classification was again employed with no modifications. Of note, there were no technical difficulties with any of the pathological materials, although they were collected retrospectively from a variety of institutions and countries with different fixation methods. This could be a significant advantage of the qNPA technique over IHC, which may be sensitive to fixation and antigen retrieval methods. Moreover, neither previous IHC nor molecular studies have reported the rate of unsuccessful or uninterpretable cases.
Other methods for comparing mRNA levels in paraffin embedded tissues rather than frozen tissues are also under development, as recently demonstrated using the Affymetrix system.(20) While accurate, the Affymetrix method still requires RNA extraction and performs best with a linear amplification technique in order to provide a sufficient amount of template. Additionally, the study excluded any cases previously assigned to the Unclassifiable category using complete GEP from previous studies, thus eliminating 11 of 59 samples and resulting in a 48-case series with a call rate of 93.8% using 100 of the most predictive genes.(21) In the current study, we used all available samples, including those with gene expression profiles that had previously fallen into the Unclassifiable category.(15;18)
As shown in the model comparison, the larger the number of genes that can be included in an assay, the more similar the results will be to the original complete GEP model with 187 genes. The assay tested in this study was based on the 14 most statistically powerful genes, of which 12 were finally used. However, additional genes could readily be added to the design, which would undoubtedly further improve the accuracy compared to complete gene expression profiling.
Limitations of the qNPA assay include the relatively complicated technology compared to IHC and use of a patented array that is not widely implemented at this time. Of interest, a new imaging platform has just been announced that will use a Luminex bead technology called the qBead Assay (Luminex Corporation, Austin, Texas) rather than the Omix imager used for this project. Advantages include the lack of RNA extraction, use of FFPE tissues, and more quantitative data than IHC. The qNPA method yielded data on all cases with a perfect “intent to categorize” rate, while the failure rate of other methods has not been reported. In summary, we report here a method that has the potential for significant impact on future DLBCL research and clinical trial development not only with regard to COO classification but also in the development of other applications of GEP data.