HHS Public Access; Author Manuscript; accepted for publication in a peer-reviewed journal.
Clin Cancer Res. Author manuscript; available in PMC 2012 June 1.
Published in final edited form as:
PMCID: PMC3107869

Accurate Classification of Diffuse Large B cell lymphoma into Germinal Center and Activated B cell Subtypes Using a Nuclease Protection Assay on Formalin Fixed Paraffin Embedded Tissues


Classification of diffuse large B cell lymphoma (DLBCL) into cell-of-origin (COO) subtypes based on gene expression profiles has well-established prognostic value. These subtypes, termed Germinal Center B cell (GCB) and Activated B cell (ABC), also have different genetic alterations and over-expression of different pathways that may serve as therapeutic targets. Thus, accurate classification is essential for analysis of clinical trial results and for planning new trials using targeted agents. The current standard for COO classification uses gene expression profiling (GEP) of snap frozen tissues and a Bayesian predictor algorithm. However, this approach is generally not feasible. In this study, we investigated whether the quantitative nuclease protection assay (qNPA) technique could be used for accurate classification of COO using formalin fixed, paraffin embedded (FFPE) tissues. We analyzed expression levels of 14 genes in 121 cases of R-CHOP treated DLBCL that had previously undergone GEP using the Affymetrix U133 Plus 2.0 microarray and had matching FFPE blocks. Results were evaluated using the previously published algorithm with a leave-one-out cross-validation approach. These results were compared to COO classification based on frozen tissue GEP profiles. For each case, a probability statistic was generated indicating the likelihood that the classification using qNPA was accurate. When data were dichotomized into GCB or non-GCB, overall accuracy was 92%. The qNPA technique accurately categorized DLBCL into GCB and ABC subtypes, as defined by GEP. This approach is quantifiable, applicable to FFPE tissues with no technical failures, and has potential for significant impact on DLBCL research and clinical trial development.


Gene expression profiling can be used to classify DLBCL into 2 biologically based cell-of-origin (COO) subtypes with marked prognostic value that is retained even with immunochemotherapy. The Germinal Center B cell (GCB) subtype has over-expression of germinal center-related genes, a higher rate of BCL2 translocations, and more frequent gains of 12q12. In contrast, the Activated B cell (ABC) subtype has over-expression of the NF kappa B pathway, more frequent trisomy 3, gains of 3q and 18q21–q22, and losses of 6q21–q22.(1–3) These biological differences strongly imply that different strategies for therapeutic targeting should be used for each subtype.(2;4–7) At this time, methods for accurate classification are needed in order to interpret retrospective and prospective clinical trial results and to design new trials using targeted agents. The original standard for COO classification used RNA extraction, gene expression profiling (GEP) of snap frozen tissues, and a Bayesian predictor algorithm. Previously, a larger data set was reduced to 14 key genes that make this distinction.(8) An immunohistochemistry (IHC) classification scheme (9), based on 3 antibodies, is widely used as a substitute for GEP classification; however, IHC does not completely correlate with GEP classification or outcome.(10) More recently, an IHC scheme based on 5 antibodies has been proposed with improved specificity.(11) Both of these techniques rely on semi-quantitative assessment of protein levels rather than quantitative mRNA levels. Furthermore, the frequency of cases with failed or uninterpretable IHC results has not been reported in any of the previous papers.
We recently described a qNPA assay (ArrayPlate®) for quantification of mRNA in tissues, with excellent correlation between frozen and formalin fixed paraffin embedded (FFPE) materials, and further extended our pilot series to a large group of DLBCL cases.(12;13) Initially, this assay was developed to measure the mRNA levels of genes previously identified using gene expression profiling or RT-PCR as highly prognostic in DLBCL.(14–17) In this study, we developed and incorporated additional gene probes to measure genes that were part of the Bayesian predictor algorithm for classifying ABC and GCB subtypes.(8) Our hypothesis was that qNPA could be used for accurate classification of COO using FFPE tissues. The overall goal of the project was to develop a robust, reproducible method to classify DLBCL into COO using FFPE blocks with the same accuracy as gene expression profiling, for application in future studies.



All work was performed under an IRB approved protocol from the University of Arizona, Tucson, Arizona, USA. We retrospectively collected 121 cases of R-CHOP treated, de novo DLBCL from adult patients that had undergone GEP using the Affymetrix U133 Plus 2.0 microarray and had matching FFPE blocks; these cases were analyzed with qNPA in duplicate. Of these, 106 cases were part of the originally published case series by the Lymphoma and Leukemia Molecular Profiling Project (18), with the remaining 15 collected later under the same protocol and so not included in the initial publication. All available cases with successful gene expression profiling and adequately thick FFPE blocks were included. Five unstained sections, each 5 microns thick, were cut from each FFPE block and sent to the University of Arizona, where they were anonymously coded and forwarded to High Throughput Genomics (HTG) for analysis. The HTG personnel were blinded to the COO status of the cases.

Assay design

We expanded the previous gene probe repertoire of the DLBCL-ArrayPlate® assay to include the 14 genes most pertinent to COO classification. The genes of interest for this project were CD10, LRMP, CCND2, ITPKB, PIM1, IL16, IRF4, FUT8, BCL6, PTPN1, LMO2, CD39, MYBL1, and IGHM.(8)

Assay performance

The ArrayPlate® assay was performed as previously described.(12;13) Briefly, following lysis of the tissue sections, 50-mer probes specific for the genes of interest were incubated with the samples, forming specific probe-mRNA duplexes. Unhybridized probes were then digested by S1 nuclease, followed by alkaline hydrolysis to destroy the mRNA in the duplexes. This left intact probes at concentrations proportional to the abundance of the specific mRNA in the original tumor sections. Samples were transferred for probe detection to ArrayPlates, which contained 16 unique, covalently bound, 25-mer “anchor” oligonucleotides spotted on the bottom of each well. This universal array was modified to bind the 50-mer probes for the genes of interest by exposing the array to a mixture of 50-mer Programming Linker oligonucleotides, each containing a 25-mer sequence designed to bind one of the probes at one end and a 25-mer sequence to bind one of the anchor oligonucleotides at the other end. After hybridization, probes from the sample were bound to array elements by the Programming Linker oligonucleotides. A mixture of Detection Linker oligonucleotides was then added. Each 50-mer Detection Linker contained, at one end, a 25-mer sequence that bound the half of the sample probe not bound by the Programming Linker and, at the other end, a common 25-mer sequence that bound a Detection Probe linked to horseradish peroxidase. Finally, a chemiluminescent peroxidase substrate was added to allow quantification (Lumigen PS-atto, Lumigen, Inc., Southfield, MI). Figure 1 shows a schematic of the assay. The signals were recorded simultaneously by imaging the plate from the bottom with a CCD-based Omix Imager (HTG). Images were analyzed using Vuescript software (HTG). Expression levels were normalized to the housekeeping gene TBP.(13;19)

Figure 1
Schematic showing qNPA technology. In the first part of the assay, the Lysis Buffer is added to patient sample to disrupt the sample. Next detection probes, specifically designed for the genes of interest, hybridize to all mRNA. An S1 nuclease destroys ...

Data analysis

Using the predictor previously described, samples were initially subtyped into ABC and GCB categories based on a subset of 187 probes from the U133 Plus array, resulting in 58 ABC samples, 49 GCB samples, and 14 samples with intermediate gene expression, which were left unclassified.(18)

The qNPA data were normalized to the housekeeping gene, TATA Box Binding Protein (TBP), as in previous studies.(13;19) The data were log2 transformed. Following a leave-one-out framework, Bayesian predictive models were generated based on those qNPA probes that were found to be significantly different between ABC and GCB with p<0.001.(8) Briefly, this was done by removing a test sample from the data set and predicting its subtype with a predictor trained on the remaining samples. For the left-out sample, a predictor score was calculated as the weighted sum of its log-transformed, normalized qNPA values, with the weight for each qNPA probe equal to the t-statistic for ABC/GCB differential expression of that probe in the remaining training set. This score was then compared to the distributions of scores within the ABC and GCB subsets of the training set to arrive at a Bayesian probability for the sample being ABC versus GCB.
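The leave-one-out procedure described above can be illustrated with a minimal Python sketch. It is not the published implementation: the function names, the pooled-variance t-statistic, the use of normal distributions with equal priors for the two score distributions, and the `t_cut` gene filter (a stand-in for the p<0.001 selection) are all assumptions made here for illustration.

```python
import numpy as np

def t_statistics(x_abc, x_gcb):
    """Per-probe two-sample t-statistic (pooled variance), ABC minus GCB."""
    n1, n2 = len(x_abc), len(x_gcb)
    m1, m2 = x_abc.mean(axis=0), x_gcb.mean(axis=0)
    v1, v2 = x_abc.var(axis=0, ddof=1), x_gcb.var(axis=0, ddof=1)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / np.sqrt(pooled * (1.0 / n1 + 1.0 / n2))

def normal_pdf(x, mean, sd):
    """Density of a normal distribution (assumed shape of the score distributions)."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def loo_abc_probability(log_expr, is_abc, t_cut=0.0):
    """Leave-one-out probability of ABC for every sample.

    log_expr: (samples, probes) array of log2, TBP-normalized values.
    is_abc:   boolean array, True for ABC and False for GCB (reference labels).
    t_cut:    hypothetical |t| threshold standing in for the p<0.001 filter.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    is_abc = np.asarray(is_abc, dtype=bool)
    n = len(is_abc)
    probs = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i                  # leave sample i out
        x, y = log_expr[train], is_abc[train]
        w = t_statistics(x[y], x[~y])              # weights from the training set only
        w = np.where(np.abs(w) >= t_cut, w, 0.0)   # drop weakly discriminating probes
        train_scores = x @ w                       # weighted sums for training samples
        score = log_expr[i] @ w                    # weighted sum for the left-out sample
        p_abc = normal_pdf(score, train_scores[y].mean(), train_scores[y].std(ddof=1))
        p_gcb = normal_pdf(score, train_scores[~y].mean(), train_scores[~y].std(ddof=1))
        probs[i] = p_abc / (p_abc + p_gcb)         # Bayes rule with equal priors
    return probs
```

Because the weights and the two score distributions are re-estimated on every run, the left-out sample never contributes to its own prediction.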

The classification results based on qNPA were then compared to the previous COO classification based on frozen tissue GEP profiles.(18) Since different cross-validation runs used different weights and reference distributions, the predictor scores from individual cross-validation runs were not directly comparable. To present comparable scores in Figure 2, the scores of each cross-validation run were linearly transformed so that the means of the ABC and GCB scores on the training set of that run were equal to the corresponding means in a model built from the entire set of 121 samples.
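The linear transformation described above is fixed by two conditions (the ABC mean and the GCB mean must each map onto the full-model value), which determine a slope and an intercept. A small Python sketch, with hypothetical function and parameter names, is:

```python
import numpy as np

def rescale_scores(scores, run_abc_mean, run_gcb_mean, ref_abc_mean, ref_gcb_mean):
    """Map scores from one cross-validation run onto the full-model scale.

    The slope and intercept are chosen so that the run's training-set ABC and
    GCB means land exactly on the reference (full 121-sample model) means.
    """
    slope = (ref_abc_mean - ref_gcb_mean) / (run_abc_mean - run_gcb_mean)
    intercept = ref_abc_mean - slope * run_abc_mean
    return slope * np.asarray(scores, dtype=float) + intercept
```

For example, with run means of 4.0 (ABC) and 2.0 (GCB) and reference means of 10.0 and -10.0, a run score of 3.0 maps to 0.0, midway between the two reference means.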

Figure 2
Two-way comparisons of probability scores derived from different models for ABC and GCB distinction. On the left, 2 sets of probability scores derived from Affymetrix data are shown, one with 12 genes and the other with 187 genes with a linear regression ...


All 14 genes in all 121 cases were successfully analyzed. Although gene selection was cross-validated, the data were sufficiently uniform that all cross-validated runs on qNPA identified the same 12 probe sets as the most significant (p<0.001). The two eliminated genes were IGHM (p=0.06) and PTPN1 (p=0.017). The p-values for the remaining 12 genes were: CD10 p=6.85×10^−13, LRMP p=2.28×10^−6, CCND2 p=6.87×10^−6, ITPKB p=1.6×10^−5, PIM1 p=5.62×10^−8, IL16 p=6.43×10^−7, IRF4 p=3.87×10^−12, FUT8 p=1.23×10^−11, BCL6 p=1.29×10^−7, LMO2 p=7.96×10^−9, CD39 p=1.11×10^−7, MYBL1 p=1.11×10^−16 (p-values presented here were calculated across the entire set of 121 samples). All further analysis was performed with the 12 most significant genes. Of the 121 cases, 49 were GCB, 58 were ABC, and 14 were unclassifiable as categorized by GEP in the original publication.(18) For each case, a probability statistic was generated indicating the likelihood that the classification using qNPA was accurate. Of the GCB cases, 36/49 (73%) were classified correctly by qNPA with a confidence cut-off of >0.9 and 39/49 (80%) with a confidence cut-off of >0.8. Of the ABC cases, 46/58 (79%) were correctly classified as ABC using qNPA with a confidence cut-off of >0.9 and 48/58 (83%) with a confidence cut-off of >0.8. These results are tabulated in the top portion of Table 1. When the data were dichotomized into GCB and non-GCB using a cut-off at the 50th percentile of the probability statistic, and unclassifiable cases were excluded, 45 of 49 cases were correctly classified as GCB and 53 of 58 cases were correctly classified as ABC, for an accuracy rate of 92%. These data are summarized in the bottom portion of Table 1. The calculated sensitivity is 92% and specificity is 90% for the GCB subtype.
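As a worked check of the dichotomized results, the summary statistics can be recomputed from the counts given above (45 of 49 GCB and 53 of 58 ABC cases correctly classified). The short Python sketch below uses a hypothetical helper name and treats GCB as the positive class; it is an arithmetic illustration only, not part of the original analysis.

```python
def gcb_metrics(gcb_correct, gcb_total, abc_correct, abc_total):
    """Accuracy, sensitivity and specificity with GCB as the positive class."""
    accuracy = (gcb_correct + abc_correct) / (gcb_total + abc_total)
    sensitivity = gcb_correct / gcb_total   # GCB cases called GCB
    specificity = abc_correct / abc_total   # ABC cases not called GCB
    return accuracy, sensitivity, specificity

acc, sens, spec = gcb_metrics(45, 49, 53, 58)
print(f"{acc:.3f} {sens:.3f} {spec:.3f}")  # prints "0.916 0.918 0.914"
```

The accuracy (98/107) and sensitivity (45/49) both round to the reported 92%. Computed this way, specificity (53/58) is about 91%; the counts given in the text do not by themselves account for the small difference from the reported 90%.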

Table 1
Molecular Sub-classification Using Quantitative Nuclease Protection Data

Observed differences in model predictions can result from technology platform differences, from different cases used in the training set, or from model formulation (12 genes versus 187 genes). We therefore undertook a comparison of 3 different models for ABC and GCB distinction by examining the probability scores derived from (1) 12 genes measured with the qNPA platform, with the model derived using cross-validation as described above, (2) the same 12 genes measured with the Affymetrix platform, retrained via cross-validation on the 121 test samples, and (3) the 187-gene predictor trained on a separate set of 181 CHOP cases as previously described.(18) A summary of these models is shown in Table 2. Two-way graphs comparing the scores generated by each model are shown in Figure 2. On the left, 2 sets of Affymetrix data are shown, one with 12 genes and the other with 187 genes. Inspection reveals good general agreement between the 2 models, with a linear regression R-scale value of 0.94 and p-value of < 0.0001. Any scatter in this plot was considered to result primarily from the reduced number of genes in the model, and possibly from the different set of cases used for cross-validation, but not from the technical platform, which was the same. On the right, the 12-gene model using qNPA is plotted against the 187-gene model using Affymetrix, with an R-scale value of 0.89 and p-value of < 0.0001. Of interest, the amount of scatter is similar to that in the left panel, which used different numbers of genes (12 vs. 187) but the same platform. Both graphs show excellent linear correlations. The similarity of the graphs indicates that the scatter in the right plot is also mainly due to the reduced number of genes or the different case validation set rather than to any inaccuracy in the qNPA platform.

Table 2
Models for Comparison


The goal of the project was to develop a technique for assessment of FFPE blocks with the same accuracy as gene expression profiling, so that it could be used to interrogate other collected case series. We reasoned that such a technique would have wide application in DLBCL research and clinical trials. In this study, we show that the qNPA technology can categorize DLBCL into GCB and ABC subtypes with high accuracy as compared to GEP. Importantly, both the laboratory work and the classification analysis were performed blinded to the COO category based on the previous GEP, and used cases from one of our previously published papers that was used to define ABC and GCB in the literature. Also, the algorithm previously used for GEP classification was again employed with no modifications. Of note, there were no technical difficulties with any of the pathological materials, although they were collected retrospectively from a variety of institutions and countries with different fixation methods. This could be a significant advantage of the qNPA technique over IHC, which may be sensitive to fixation and antigen retrieval methods. Neither previous IHC studies nor previous molecular studies have reported the rate of unsuccessful or uninterpretable cases.

Other methods for comparing mRNA levels in paraffin embedded tissues rather than frozen tissues are also under development, as recently demonstrated using the Affymetrix system.(20) While accurate, the Affymetrix method still requires RNA extraction and performs best with a linear amplification technique in order to provide a sufficient amount of template. Additionally, the study excluded any cases previously assigned to the Unclassifiable category using complete GEP from previous studies, thus eliminating 11 of 59 samples and resulting in a 48 case series with a call rate of 93.8% using 100 of the most predictive genes.(21) In the current study, we used all available samples, including those with gene expression profiles that had previously fallen into the Unclassifiable category.(15;18)

As shown in the model comparison, the larger the number of genes that can be included in an assay, the more similar the results will be to the original complete GEP model with 187 genes. The assay tested in this study was based on the most statistically powerful 14 genes, of which 12 were finally used. However, additional genes could readily be added to the design, which would be expected to further improve the accuracy compared to complete gene expression profiling.

Limitations of the qNPA assay include the relatively complicated technology compared to IHC and use of a patented array that is not widely implemented at this time. Of interest, a new imaging platform has just been announced that will use a Luminex bead technology called the qBead Assay (Luminex Corporation, Austin, Texas) rather than the Omix imager used for this project. Advantages include the lack of RNA extraction, use of FFPE tissues, and more quantitative data than IHC. The qNPA method yielded data on all cases with a perfect “intent to categorize” rate, while the failure rate of other methods has not been reported. In summary, we report here a method that has the potential for significant impact on future DLBCL research and clinical trial development not only with regard to COO classification but also in the development of other applications of GEP data.

Reference List

1. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000 Feb 3;403(6769):503–11. [PubMed]
2. Bea S, Zettl A, Wright G, Salaverria I, Jehn P, Moreno V, Burek C, Ott G, Puig X, Yang L, et al. Diffuse large B-cell lymphoma subgroups have distinct genetic profiles that influence tumor biology and improve gene-expression-based survival prediction. Blood. 2005 Nov 1;106(9):3183–90. [PubMed]
3. Iqbal J, Sanger WG, Horsman DE, Rosenwald A, Pickering DL, Dave S, Cao K, Zhu Q, Xiao L, Hans CP, et al. BCL2 translocation defines a subset of DLBCL with germinal center B-cell-like gene expression profiles and preferential expression of a set of genes. Blood. 2003 Nov 16;102(11):884A.
4. Bea S, Colomo L, Lopez-Guillermo AL, Salaverria I, Puig X, Pinyo IM, Rives S, Montserrat E, Campo E. Clinicopathologic significance and prognostic value of chromosomal imbalances in diffuse large B-Cell lymphomas. Journal of Clinical Oncology. 2004 Sep 1;22(17):3498–506. [PubMed]
5. Lam LT, Davis RE, Pierce J, Hepperle M, Xu Y, Hottelet M, Nong Y, Wen D, Adams J, Dang L, et al. Small molecule inhibitors of IkappaB kinase are selectively toxic for subgroups of diffuse large B-cell lymphoma defined by gene expression profiling. Clin Cancer Res. 2005 Jan 1;11(1):28–40. [PubMed]
6. Lenz G, Nagel I, Siebert R, Roschke AV, Sanger W, Wright GW, Dave SS, Tan B, Zhao H, Rosenwald A, et al. Aberrant immunoglobulin class switch recombination and switch translocations in activated B cell-like diffuse large B cell lymphoma. J Exp Med. 2007 Mar 19;204(3):633–43. [PMC free article] [PubMed]
7. Lenz G, Wright GW, Emre NC, Kohlhammer H, Dave SS, Davis RE, Carty S, Lam LT, Shaffer AL, Xiao W, et al. Molecular subtypes of diffuse large B-cell lymphoma arise by distinct genetic pathways. Proc Natl Acad Sci U S A. 2008 Sep 9;105(36):13520–5. [PubMed]
8. Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A. 2003 Aug 19;100(17):9991–6. [PubMed]
9. Hans CP, Weisenburger DD, Greiner TC, Gascoyne RD, Delabie J, Ott G, Muller-Hermelink HK, Campo E, Braziel RM, Jaffe ES, et al. Confirmation of the molecular classification of diffuse large B-cell lymphoma by immunohistochemistry using a tissue microarray. Blood. 2004 Jan 1;103(1):275–82. [PubMed]
10. Ott G, Ziepert M, Klapper W, Horn H, Szczepanowski M, Bernd HW, Thorns C, Feller AC, Lenze D, Hummel M, et al. Immunoblastic morphology but not the immunohistochemical GCB/non-GCB classifier predicts outcome in diffuse large B-cell lymphoma in the RICOVER-60 trial of the DSHNHL. Blood. 2010 Aug 24;116(23):4916–25. [PubMed]
11. Choi WWL, Weisenburger DD, Greiner TC, Piris MA, Banham AH, Jaye DL, Wade PA, Iqbal J, Hans CP, Fu K, et al. A new immunostain algorithm improves the classification of diffuse large B-cell lymphoma into prognostically significant subgroups. Modern Pathology. 2008 Jan;21:250A.
12. Rimsza LM, LeBlanc ML, Unger JM, Miller TP, Grogan TM, Persky DO, Martel RR, Sabalos CM, Seligmann B, Braziel RM, et al. Gene expression predicts overall survival in paraffin-embedded tissues of diffuse large B-cell lymphoma treated with R-CHOP. Blood. 2008 Oct 15;112(8):3425–33. [PubMed]
13. Roberts RA, Sabalos CM, LeBlanc ML, Martel RR, Frutiger YM, Unger JM, Botros IW, Rounseville MP, Seligmann BE, Miller TP, et al. Quantitative nuclease protection assay in paraffin-embedded tissue replicates prognostic microarray gene expression in diffuse large-B-cell lymphoma. Laboratory Investigation. 2007 Oct;87(10):979–97. [PubMed]
14. Lossos IS, Czerwinski DK, Alizadeh AA, Wechser MA, Tibshirani R, Botstein D, Levy R. Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N Engl J Med. 2004 Apr 29;350(18):1828–37. [PubMed]
15. Rosenwald A, Wright G, Chan WC, Connors JM, Campo E, Fisher RI, Gascoyne RD, Muller-Hermelink HK, Smeland EB, Giltnane JM, et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med. 2002 Jun 20;346(25):1937–47. [PubMed]
16. Shipp MA, Ross KN, Tamayo P, Weng AP, Kutok JL, Aguiar RC, Gaasenbeek M, Angelo M, Reich M, Pinkus GS, et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med. 2002 Jan;8(1):68–74. [PubMed]
17. Tome ME, Johnson DBF, Rimsza LM, Roberts RA, Grogan TM, Miller TP, Oberley LW, Briehl MM. A redox signature score identified diffuse large B-cell lymphoma patients with a poor prognosis. Blood. 2005 Nov 15;106(10):3594–601. [PubMed]
18. Lenz G, Wright G, Dave SS, Xiao W, Powell J, Zhao H, Xu W, Tan B, Goldschmidt N, Iqbal J, et al. Stromal Gene Signatures in Large-B-Cell Lymphomas. New England Journal of Medicine. 2008 Nov 27;359(22):2313–23. [PubMed]
19. Lossos IS, Czerwinski DK, Wechser MA, Levy R. Optimization of quantitative real-time RT-PCR parameters for the study of lymphoid malignancies. Leukemia. 2003 Apr;17(4):789–95. [PubMed]
20. Williams PM, Li R, Johnson NA, Wright G, Heath JD, Gascoyne RD. A Novel Method of Amplification of FFPET-Derived RNA Enables Accurate Disease Classification with Microarrays. J Mol Diagn. 2010 Aug 5;5:680–6. [PubMed]
21. Dave SS, Fu K, Wright GW, Lam LT, Kluin P, Boerma EJ, Greiner TC, Weisenburger DD, Rosenwald A, Ott G, et al. Molecular diagnosis of Burkitt’s lymphoma. N Engl J Med. 2006 Jun 8;354(23):2431–42. [PubMed]