Recently, a 23‐gene signature was developed to produce a melanoma diagnostic score capable of differentiating malignant and benign melanocytic lesions. The primary objective of this study was to independently assess the ability of the gene signature to differentiate melanoma from benign nevi in clinically relevant lesions.
A set of 1400 melanocytic lesions was selected from samples prospectively submitted for gene expression testing at a clinical laboratory. Each sample was tested and subjected to an independent histopathologic evaluation by 3 experienced dermatopathologists. A primary diagnosis (benign or malignant) was assigned to each sample, and diagnostic concordance among the 3 dermatopathologists was required for inclusion in analyses. The sensitivity and specificity of the score in differentiating benign and malignant melanocytic lesions were calculated to assess the association between the score and the pathologic diagnosis.
The gene expression signature differentiated benign nevi from malignant melanoma with a sensitivity of 91.5% and a specificity of 92.5%.
These results reflect the performance of the gene signature in a diverse array of samples encountered in routine clinical practice. Cancer 2017;123:617–628. © 2016 American Cancer Society.
The lifetime risk of developing melanoma in the United States is now 1 in 34 for men and 1 in 54 for women, with an estimated 73,870 new cases and 9940 deaths in 2015.1 Many melanomas are curable by excision if they are detected early, with a 10‐year survival rate for patients with stage I melanomas of 86% to 95%.2 However, the 10‐year survival rate is only 10% to 15% for patients with stage IV melanomas, and this makes the early and accurate diagnosis of melanocytic lesions vital to improved patient outcomes.
Histopathologic examination has long been the gold standard for melanoma diagnosis. Although this method is adequate for many cases, evidence suggests that approximately 15% of lesions may be diagnostically ambiguous by histopathology.3, 4, 5 As a result, even experienced dermatopathologists disagree in some cases, with diagnostic discordance ranging from 15% to 38% according to the type of lesion.3, 4, 5, 6, 7 Consequently, adjuncts to histopathology have been sought to facilitate the accurate diagnosis of melanoma.
Recently, a 23‐gene expression signature has been developed as an adjunctive method for differentiating melanoma and benign nevi.8 This signature measures the expression of 14 genes involved in melanoma pathogenesis and 9 housekeeper genes by quantitative reverse transcription–polymerase chain reaction, and it applies an algorithm that produces a numerical diagnostic score. This assay has been clinically validated in a retrospective study to differentiate malignant melanoma from benign nevi with a sensitivity of 90% and a specificity of 91%.8 Although that study included several diagnostically challenging subtypes, the overall distribution of lesions in these archival samples was somewhat limited.
The aim of the current study was to validate the use of this gene signature for clinical testing. This was done by assessing the performance of the gene signature in a cohort of prospectively collected clinical samples against the current diagnostic gold standard, which is histopathology. In light of the known limitations of histopathology, previous studies have suggested that the examination of cases by multiple pathologists improves the accuracy and reliability of histopathology.9, 10 To ensure an accurate assessment of the gene signature here, the performance was evaluated for a set of cases for which 3 experienced dermatopathologists, examining each lesion independently, arrived at the same diagnosis (triple‐concordant histopathologic diagnoses). Because of the difficulty in obtaining detailed clinical follow‐up, particularly because most melanomas are now excised before they develop metastatic capability, this approach has been deemed appropriate as a surrogate reference standard.9
Melanocytic lesions were submitted for gene expression testing (Myriad Genetic Laboratories, Inc) as part of normal health care operations. The full technical specifications of the test have been previously described.8, 11 Briefly, an anatomic pathologist identified representative areas of the lesion on a hematoxylin‐eosin–stained slide. The corresponding area was then macrodissected from unstained tissue and pooled into a single tube for RNA extraction. A quantitative reverse transcription–polymerase chain reaction assay measured the differential expression of 23 genes. This assay includes 14 tumor marker genes as well as 9 housekeeper genes for normalization. The cell differentiation gene PRAME is included in the signature and has been shown to exhibit significantly increased expression in more aggressive tumors of multiple lineages.12, 13, 14 The signature also includes S100A9 and 4 related genes (S100A7, S100A8, S100A12, and PI3), which participate in a cell signaling response to tissue damage.15 Eight immune group genes (CCL5, CD38, CXCL10, CXCL9, IRF1, LCP2, PTPRC, and SELL) are also included and appear to function in tumor immune response signaling.16, 17, 18, 19, 20 Both of these gene groups were selected on the basis of differential expression in benign nevi and malignant melanoma, with increased expression in melanoma. A weighting algorithm was applied to produce a score plotted on a scale ranging from –16.7 to +11.1 according to previously validated methods.11 Scores from –16.7 to –2.1 were reported as likely benign, scores from –2.0 to –0.1 were reported as indeterminate, and scores from 0.0 to +11.1 were reported as likely malignant.
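The reporting ranges above can be sketched as a small lookup. This is a minimal illustration only: the function name `report_category` is hypothetical, and the rounding to one decimal place is an assumption inferred from the way the cutoffs are stated, not a documented step of the assay.

```python
def report_category(score: float) -> str:
    """Map a diagnostic score to the reporting category quoted in the text."""
    # Assumption: scores are reported to one decimal place.
    s = round(score, 1)
    if not -16.7 <= s <= 11.1:
        raise ValueError(f"score {score} is outside the reported scale")
    if s <= -2.1:
        return "likely benign"      # -16.7 to -2.1
    if s <= -0.1:
        return "indeterminate"      # -2.0 to -0.1
    return "likely malignant"       # 0.0 to +11.1


if __name__ == "__main__":
    for value in (-16.7, -2.1, -2.0, -0.1, 0.0, 11.1):
        print(value, report_category(value))
```

The indeterminate band is deliberately narrow (2 score units out of roughly 28), which is why cases falling in it are handled separately in the performance analysis.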
This clinical validation study was approved by the Quorum Review institutional review board (Seattle, Wash) with a waiver for individual patient informed consent. A cohort of 1400 melanocytic lesions was obtained from samples prospectively submitted for gene expression testing. These lesions consisted of a wide range of subtypes representative of contemporary clinical samples. Cases that generated benign, indeterminate, and malignant scores were randomly selected, with the proportion of each score category approximating the overall distribution of results reported in the clinical setting. These samples consisted mostly of shave biopsies. Re‐excisions were excluded. The average overall volume of tumor tissue per sample was lower in this study than the first validation study, which used excisional (26.7%), punch (27.7%), and shave biopsies (45.7%) from archival samples.
The histopathologic review process is represented schematically in Figure 1. The dermatopathologists who participated in this study were selected for their experience and expertise as well as their diversity of training backgrounds and practice settings. All lesions were primary cutaneous melanocytic neoplasms as assessed by clinical history and histopathology. Each case was represented by 1 hematoxylin‐eosin–stained slide. The slides were anonymized and examined independently by 3 dermatopathologists (selected from a panel of 10 dermatopathologists in total) who were blinded to the score, the initial diagnosis of the submitting dermatopathologist, and the diagnoses made by the other 2 reviewing dermatopathologists. Available clinical information, including patient age, sex, and location of the lesion, accompanied each slide. Panel dermatopathologists were instructed to assign a diagnosis of benign or malignant to each case; specifying a histopathologic subtype or other information was optional. Cases were included only if all 3 panel dermatopathologists independently arrived at the same diagnosis (benign or malignant). Samples were excluded from further analysis if there was discordance between the reviewing dermatopathologists or if a diagnosis other than benign or malignant was assigned.
Because the reviewing dermatopathologists were not required to specify subtypes for each lesion, a fourth dermatopathologist reviewed all cases for which there was a triple‐concordant diagnosis of melanoma and assigned a subtype. For the statistical analysis, melanomas were initially assigned to 1 of 4 major cutaneous melanoma subtypes (acral melanoma, lentigo maligna/lentigo maligna melanoma, nodular melanoma, and superficial spreading melanoma).
A preliminary assessment of the test in triple‐concordant samples indicated that this cohort of mostly shave biopsies included some samples with an insufficient tumor volume. To ensure that only samples suitable for clinical testing were included, the test's limit of detection was quantitatively determined. This was done through the determination of the lowest concentration of melanoma cell RNA detectable by the assay in composite mixtures containing known quantities of malignant and nonmalignant RNA. Malignant RNA was acquired from 3 formalin‐fixed, paraffin‐embedded melanoma samples from the published validation cohort8 that contained a macrodissectible area composed of a relatively homogeneous population of malignant melanocytes (70%‐90% as determined by histopathology). Nonmalignant RNA was acquired from 5 previously tested nevi and solar lentigines.
A series of 2‐fold dilutions was used to create composite samples with known ratios of malignant RNA to nonmalignant RNA. In 2 of the composite samples, nonmalignant RNA was aggregated from 2 benign samples to acquire sufficient quantities. The percentage of malignant RNA was normalized for each dilution series on the basis of the housekeeper means. Score results were compared for these RNA mixtures corresponding to malignant cell contributions versus benign cell contributions at increasing dilutions.
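As a rough illustration of what a 2‐fold dilution series implies for the nominal malignant RNA fraction, the sketch below halves a starting fraction at each step. The function name and the starting value are hypothetical, and the housekeeper‐based normalization mentioned above is not modeled because its details are not given here.

```python
def twofold_dilution_fractions(start_fraction: float, steps: int) -> list[float]:
    """Nominal malignant RNA fraction after each successive 2-fold dilution."""
    if not 0.0 < start_fraction <= 1.0:
        raise ValueError("start_fraction must be in (0, 1]")
    # Step 0 is the starting mixture; each later step halves the fraction.
    return [start_fraction / (2 ** k) for k in range(steps + 1)]


if __name__ == "__main__":
    # Hypothetical starting mixture of 32% malignant RNA.
    for k, frac in enumerate(twofold_dilution_fractions(0.32, 5)):
        print(f"dilution {k}: {frac:.1%} malignant RNA")
```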
Although the percentage contribution of malignant cells was quantitatively estimated in these RNA mixing experiments, a visual inspection of the melanocytic volume by histopathology is inherently less precise. Therefore, a threshold for clinical testing based on a visual inspection of the melanocytic volume by histopathology was independently determined. All samples from the first published validation study8 were reviewed, and the melanocytic volume was assessed independently by 2 pathologists. All samples with a moderately low melanocytic volume that were candidates for exclusion (<20% as determined by histopathology) as well as a random sampling of 10% of all remaining cases were reviewed by a panel of pathologists.
The degree of discordance was calculated only for cases with a tumor volume above the threshold to which the reviewing dermatopathologists assigned discordant diagnoses of benign and malignant. The submitting dermatopathologist's diagnosis was not used to calculate discordance. Discordance calculations represent the average disagreement between any 2 reviewing dermatopathologists.
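One plausible reading of "average disagreement between any 2 reviewing dermatopathologists" is the fraction of reviewer pairs that disagree, pooled over all cases. The sketch below implements that reading; the exact formula used in the study is not stated, so the function name, the pooling choice, and the toy data are all assumptions.

```python
from itertools import combinations


def mean_pairwise_discordance(cases):
    """Fraction of reviewer pairs, pooled over all cases, whose diagnoses differ.

    `cases` is an iterable of per-case diagnosis tuples, one entry per reviewer,
    e.g. ("benign", "benign", "malignant").
    """
    disagreements = 0
    pairs = 0
    for diagnoses in cases:
        for first, second in combinations(diagnoses, 2):
            pairs += 1
            disagreements += first != second
    if pairs == 0:
        raise ValueError("no reviewer pairs to compare")
    return disagreements / pairs


if __name__ == "__main__":
    # Toy data, not study data: two unanimous cases and two 2-versus-1 splits.
    toy_cases = [
        ("benign", "benign", "benign"),
        ("malignant", "malignant", "benign"),
        ("malignant", "malignant", "malignant"),
        ("benign", "malignant", "benign"),
    ]
    print(f"{mean_pairwise_discordance(toy_cases):.1%}")
```

With 3 reviewers, a 2‐versus‐1 split contributes 2 disagreeing pairs out of 3, so the pairwise figure is always lower than the share of cases lacking triple concordance.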
The association between the score and the triple‐concordant histopathologic diagnosis was assessed by sensitivity and specificity. Exact 95% confidence intervals were computed for sensitivity (proportion of correctly identified positive cases/malignant cases) and specificity (proportion of correctly identified negative cases/benign cases) on the basis of the binomial distribution. The score was then used to assess the sensitivity of the gene expression signature within specific melanoma subtypes of lesions with triple‐concordant diagnoses of melanoma.
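The calculation described above can be sketched with the Clopper‐Pearson construction, which is the usual meaning of an exact binomial interval; whether the authors used precisely this construction is an assumption. The function names are illustrative, and the counts in the usage example are hypothetical, not the study's confusion matrix.

```python
import math


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1 (computed in log space)."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for k successes in n trials, found by bisection."""
    def solve(tail_prob, rising: bool) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(50):
            mid = (lo + hi) / 2
            # Move the bracket toward the p at which the tail equals alpha / 2.
            if (tail_prob(mid) < alpha / 2) == rising:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: 1.0 - _binom_cdf(k - 1, n, p), True)
    upper = 1.0 if k == n else solve(lambda p: _binom_cdf(k, n, p), False)
    return lower, upper


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Point estimates with exact 95% intervals for sensitivity and specificity."""
    return {
        "sensitivity": (tp / (tp + fn), exact_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), exact_ci(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, (estimate, (low, high)) in sensitivity_specificity(90, 10, 180, 20).items():
        print(f"{name}: {estimate:.1%} (95% CI, {low:.1%}-{high:.1%})")
```

The exact interval is asymmetric around the point estimate and is wider for the smaller group, which is consistent with the wider interval reported for sensitivity than for specificity.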
Within this cohort, 349 samples (24.9%) received a malignant score, 823 (58.8%) received a benign score, and 228 (16.3%) received an indeterminate score (Fig. 1). Lesions receiving indeterminate scores were slightly overrepresented here and represent approximately 10% of all samples tested in the clinical setting. The average diagnostic discordance among dermatopathologists for the samples in this cohort was 14.1%. Samples receiving a benign score had a lower incidence of discordance (9.9%) than samples receiving a malignant score (22.0%). The average discordance between reviewing pathologists was 19.4% in cases receiving an indeterminate score. The overall discordance of 14.1% observed in this study was higher than that for the cohort used in the previous validation study (4.7%) and similar to that reported by other investigations of disagreement in the histopathologic diagnosis of melanocytic lesions.3, 5
A triple‐concordant histopathologic diagnosis was assigned in 993 cases (70.9%; Fig. 1). As noted in the Materials and Methods section, triple concordance was defined as 3 of 3 dermatopathologists independently assigning a definitive diagnosis of either benign or malignant. Cases that did not receive a complete diagnosis or that received a diagnosis of indeterminate from 1 or more dermatopathologists were excluded. Cases with an indeterminate score that had a triple‐concordant diagnosis of benign (n=112) or malignant (n=21) were excluded from further analysis. Among the remaining 860 cases with triple concordance, 204 (23.7%) received a malignant diagnosis, and 656 (76.3%) received a benign diagnosis (Fig. 1).
The subtype information for melanomas that received a triple‐concordant diagnosis of malignant is provided in Table 1. The largest proportion of this cohort was composed of superficial spreading melanoma (43.5%). Included within the category of superficial spreading melanoma were several histopathologic variants, including melanoma in situ (other than lentigo maligna; 6.8%) and melanoma arising within a dysplastic nevus (1.7%).
The cohort also contained a large number of lesions of the lentigo maligna subtype (32.8%), including lentigo maligna (15.8%), lentigo maligna with a nested pattern (5.6%), and lentigo maligna melanoma (lentigo maligna with an invasive component; 11.3%). Nodular melanomas composed 19.8% of the overall cohort. Acral melanomas composed only 0.6% of the cohort. Desmoplastic melanomas were not specifically excluded from the cohort; however, none of the cases for which there was diagnostic agreement among all 3 reviewing dermatopathologists were desmoplastic melanomas.
This cohort included most major clinicohistopathologic melanoma and nevus subtypes as well as many histopathologic variants. In numerous instances, the score was discordant with the submitting dermatopathologist's favored pretest diagnosis. In the majority of these, the triple‐concordant diagnosis was in agreement with the score (Figs. 2, 3, 4, 5, 6). Apparent false‐positives occurred in several cases for which the triple‐concordant diagnosis was dysplastic nevus. However, in several of these cases, the reviewing dermatopathologists noted that the differential diagnosis included superficial melanoma or melanoma arising within a preexisting dysplastic nevus (Fig. 7). Apparent false‐negative results were most common in lentigo maligna (Fig. 8). In addition, 2 lesions for which the differential diagnosis included metastatic melanoma versus primary dermal melanoma also produced false‐negative results. The test has not been validated for metastatic melanomas.
Although the first validation did not subclassify lesions beyond the main melanoma subtypes, the 2 cohorts contained similar overall proportions of superficial spreading melanoma and nodular melanoma (Table 1). A major difference in this prospective cohort was the inclusion of a far greater number of lentigo maligna/lentigo maligna melanoma cases (32.8% in this study vs 14.9% in the first validation study).8 Not surprisingly, the majority of samples excluded because of a lack of a sufficient tumor volume were of the lentigo maligna subtype (n=17); thus, the proportion of lentigo maligna samples here (32.8%) includes only those with a greater than 10% lesional melanocytic volume.
An inspection of the lentigo maligna samples included in this study revealed that lesional melanocytes were distributed as single cells or widely scattered small clusters in many samples. For the gene signature assessed here, the small tumor volume may have resulted in the dilution of differentially expressed tumor markers by nonlesional cells and caused the resultant gene expression score to be in the benign reporting region. The threshold melanocytic volume required for clinical testing was quantitatively assessed with composite mixtures of RNA samples containing known amounts of RNA from malignant melanomas diluted with known amounts of nonmalignant RNA. The undiluted malignant samples produced scores at the upper end of the clinical range. Composite RNA samples containing 3% to 9% malignant RNA consistently produced a malignant score. Scores were transitioned to an indeterminate diagnosis when the malignant RNA proportion was 2% to 4% of the composite sample and to a benign diagnosis when the malignant RNA proportion was less than 2%. This suggests that the threshold malignant tumor volume is between 3% and 9%.
Although the percentage contribution of malignant cells was quantitatively estimated in these RNA mixing experiments, a visual inspection of the melanocytic volume by histopathology is inherently less precise. The histopathologic review of all samples from the first validation study revealed that the melanoma diagnostic score for samples with a 10% to 20% melanocytic volume correlated well with the pathologic review, whereas samples with less than a 10% melanocytic volume did not. Therefore, the appropriate threshold for the minimum melanocytic volume, as determined by histopathology, is estimated to be 10%. The majority of the cases tested in the first validation had a significant tumor volume, leaving very few samples with a low melanocytic tumor volume (n=5) available for this retrospective analysis. However, the agreement between the quantitative (limit of detection) and qualitative (histopathology review) assessments of the melanocytic volume threshold affirms that a cutoff of 10% is appropriate for clinical testing.
All samples with a triple‐concordant pathologic diagnosis in the current study were reviewed, and those with less than a 10% melanocytic volume were excluded (14.4% [n=124]). The excluded samples included 13.2% of all samples receiving a malignant score and 14.8% of the samples that received a benign score. Not surprisingly, the majority of the samples excluded because of a lack of a sufficient tumor volume were of the lentigo maligna subtype (n=17); thus, the proportion of lentigo maligna samples here (32.8%) includes only those with a greater than 10% lesional melanocytic volume. Overall, 54.5% of the false‐negatives were excluded on the basis of this threshold melanocytic volume (Fig. 1).
The performance of the signature was determined for all triple‐concordant samples with a greater than 10% tumor volume (n=736). This does not include 13.4% of the samples (133 of 993) with a triple‐concordant diagnosis that received an indeterminate test result. Although a triple‐concordant diagnosis was required for inclusion, 3.8% of the final cohort (28 of 736) were cases for which the triple‐concordant diagnosis differed from the submitting dermatopathologist's diagnosis. The sensitivity and specificity were determined to be 91.5% (confidence interval, 86.4%‐95.2%) and 92.5% (confidence interval, 90.0%‐94.5%), respectively. This is comparable to the results of the first validation, which reported a sensitivity of 94% and a specificity of 90% when samples with an indeterminate score were excluded.8 The sensitivity of the gene signature was also assessed within the specific melanoma subtypes (Table 2).
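The cohort flow reported in the Results can be cross‐checked with simple arithmetic. The sketch below uses only counts quoted in the text; the function name, dictionary keys, and variable names are illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


def cohort_flow() -> dict:
    """Recompute the cohort sizes and percentages from the quoted counts."""
    submitted = 1400
    triple_concordant = 993
    indeterminate_score = 112 + 21           # triple-concordant benign + malignant
    scored = triple_concordant - indeterminate_score
    low_volume = 124                         # less than 10% melanocytic volume
    final = scored - low_volume
    return {
        "malignant_score_pct": pct(349, submitted),
        "benign_score_pct": pct(823, submitted),
        "indeterminate_score_pct": pct(228, submitted),
        "triple_concordant_pct": pct(triple_concordant, submitted),
        "indeterminate_of_concordant_pct": pct(indeterminate_score, triple_concordant),
        "scored_concordant": scored,
        "malignant_dx_pct": pct(204, scored),
        "benign_dx_pct": pct(656, scored),
        "low_volume_pct": pct(low_volume, scored),
        "final_cohort": final,
        "differs_from_submitter_pct": pct(28, final),
    }


if __name__ == "__main__":
    for key, value in cohort_flow().items():
        print(f"{key}: {value}")
```

Each recomputed value agrees with the figure quoted in the text: 860 cases remain after the indeterminate scores are removed, and 736 remain after the low‐volume exclusion.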
The ability of a recently developed gene signature to differentiate benign nevi and malignant melanoma was assessed here in a prospectively collected cohort of melanocytic neoplasms. The performance of the signature was validated against triple‐concordant histopathologic diagnoses to ensure a comparison with the best possible representation of the current reference (gold) standard. Although this may eliminate some ambiguous lesions, the initial cohort was composed of cases submitted for clinical testing. As such, the resulting validation cohort included cases for which ancillary diagnostic information was sought. In this study, the score differentiated benign nevi from malignant melanoma with a sensitivity of 91.5% and a specificity of 92.5%.
Because melanocytic lesions and the biopsy specimens that contain them vary greatly in size, an important aspect of any ancillary test is the amount of tissue that it requires. The tissue requirements for the gene expression assay are 1 hematoxylin‐eosin–stained section followed by 5 to 7 sections cut at 5μm.8 However, some lesions submitted for clinical testing were characterized by an extremely small volume of melanocytic cells. As such, a threshold melanocytic volume was implemented here to exclude samples that lacked sufficient tumor for testing. This is an important consideration because lentigo maligna and other variants of melanoma in situ composed a substantial proportion of the cohort. In these subtypes, lesional melanocytes are often distributed as single cells or as small, widely scattered nests, and the inclusion of excessive quantities of benign tissue (either normal skin or background nevus cells) could result in the dilution of differentially expressed tumor markers and potentially produce false‐negative results.
Adjunctive testing methods to distinguish between melanomas and nevi include array‐based comparative genomic hybridization (aCGH) and fluorescence in situ hybridization (FISH). These methods vary substantially in their tissue requirements. aCGH requires 5 to 15 sections 25μm thick,21 and the tumor cell population must be relatively pure.22 The tissue requirements for FISH and the gene expression test described in this article are similar (5‐7 unstained slides). The homogeneity of the tumor cell population is less of an issue for FISH and gene expression tests than aCGH, but it can still be a limitation. For FISH, an accurate assessment requires that care be taken to perform signal enumeration in the neoplastic cells of interest. The exclusion of nonlesional cells can be an issue for melanomas that have background nevus cells or substantial inflammatory cell infiltrates. These same factors can also limit the sensitivity of gene expression assays. We found that lesions in which benign background nevus cells or other nonlesional cells exceeded melanoma cells by a ratio of approximately 20 to 1 could produce scores in the indeterminate zone or even within the benign range of the scale. A minimum tumor volume threshold of 10% was implemented to mitigate the risk of false‐negative results in this scenario.
In general, the analytical sensitivity of each test has a lower limit determined by tumor volume and tumor homogeneity. For aCGH, the larger studies indicate that for lesions satisfying the tissue requirement criteria outlined previously, the sensitivity is 92%22 to 95%.23 The sensitivity and specificity of the FISH method varies dramatically among existing reports with the lesion subtype, the probe set used, the number of observers, and the cutoff thresholds used. Various authors have reported a FISH sensitivity ranging from 43% to 94% and a specificity ranging from 60% to 98%.24, 25, 26, 27, 28 The original 4‐probe FISH assay targeting 6p25 (RREB1), 6q23 (MYB), Cep6 (centromere 6), and 11q13 (CCND1) was reported to discriminate between histologically unequivocal melanomas and benign nevi with a sensitivity of 86.7% and a specificity of 95.4%.24 However, the sensitivity was subsequently found to be only 70% for melanomas with a Spitzoid morphology.25, 26, 29, 30, 31, 32
The gene signature assessed here is intended to provide adjunctive information for the diagnosis of melanoma in ambiguous and difficult‐to‐diagnose lesions. The prospective cohort used in this study included numerous melanoma and nevus subtypes, including some types known to present significant diagnostic challenges in the clinical setting. However, the application of a triple‐concordant diagnostic reference standard did eliminate some of these cases. For example, desmoplastic and nevoid melanomas were not specifically excluded from the cohort, but none of the cases classified as either of these 2 particular melanoma subtypes received a triple‐concordant diagnosis by histopathology. The high frequency of discordance among reviewing dermatopathologists in this study was similar to that observed in other assessments of clinical cohorts3, 4, 5, 6, 7 and highlights the need for adjunctive diagnostic tools. An evaluation of the gene signature against clinical outcomes would minimize cohort bias toward straightforward cases, and studies assessing test performance by comparison with clinical outcomes are currently underway.
The requirement for a triple‐concordant histopathologic diagnosis among 3 experienced dermatopathologists who examined each case independently ensured the best possible representation of the current reference (gold) standard. The cohort's size (n=736) and diversity provide an evaluation of this adjunctive diagnostic tool for the types of lesions routinely encountered by dermatopathologists in the clinical setting. Additional studies with clinical follow‐up will likely provide additional insight into the performance of this test, as will studies focusing on particularly challenging subtypes such as desmoplastic melanoma, nevoid melanoma, and Spitzoid melanoma.
This work was supported by Myriad Genetic Laboratories, Inc.
Loren E. Clarke, Darl D. Flake II, Jonathan Nelson, Hillary Kimbrell, Kathryn A. Kolquist, Krystal L. Brown, M. Bryan Warf, Benjamin B. Roa, and Richard J. Wenstrup are employees of Myriad Genetics and receive salaries and stock options as compensation. Klaus Busam, Clay Cockerell, Klaus Helm, Jennifer McNiff, Jon Reed, Jaime Tschen, Jinah Kim, Raymond Barnhill, Rosalie Elenitsas, and Victor G. Prieto received consulting fees from Myriad Genetics. Darl D. Flake II, M. Bryan Warf, and Benjamin B. Roa also report 2 patents pending (14/205,965 and PCT/US15/038038).
Loren E. Clarke: Guarantor of overall content, project conceptualization, data interpretation, manuscript review, project supervision, and review and approval of final manuscript. Darl D. Flake II: Data analysis and curation and review and approval of final manuscript. Klaus Busam: Research investigation, manuscript review, and review and approval of final manuscript. Clay Cockerell: Research investigation, manuscript review, and review and approval of final manuscript. Klaus Helm: Research investigation, manuscript review, and review and approval of final manuscript. Jennifer McNiff: Research investigation, manuscript review, and review and approval of final manuscript. Jon Reed: Research investigation, manuscript review, and review and approval of final manuscript. Jaime Tschen: Research investigation, manuscript review, and review and approval of final manuscript. Jinah Kim: Research investigation, manuscript review, and review and approval of final manuscript. Raymond Barnhill: Research investigation, manuscript review, and review and approval of final manuscript. Rosalie Elenitsas: Research investigation, manuscript review, and review and approval of final manuscript. Victor G. Prieto: Research investigation, manuscript review, and review and approval of final manuscript. Jonathan Nelson: Data curation, study resources, and review and approval of final manuscript. Hillary Kimbrell: Data analysis, manuscript review, and review and approval of final manuscript. Kathryn A. Kolquist: Data analysis, manuscript review, and review and approval of final manuscript. Krystal L. Brown: Manuscript writing, review, editing, data visualization, and review and approval of final manuscript. M. Bryan Warf: Data analysis, manuscript review, and review and approval of final manuscript. Benjamin B. Roa: Project conceptualization, project supervision, manuscript review, and review and approval of final manuscript. Richard J. Wenstrup: Project conceptualization, project supervision, manuscript review, and review and approval of final manuscript.