We have identified 131 probe sets, corresponding to 108 known genes, that are highly effective in distinguishing invasive OSCC from normal oral tissue, as well as a list of genes that might be involved in the transformation of normal oral tissue to dysplasia, and of oral dysplasia to invasive OSCC. Although prior studies, including our own, have described global changes in gene transcription that distinguish normal oral epithelium from carcinoma, there is considerable heterogeneity among the lists of genes that have been reported. To our knowledge, few studies have produced a limited combination of genes, as in the current study, with high sensitivity and specificity in distinguishing OSCC from normal oral tissue through rigorous statistical testing and validation with independent datasets, and none has provided prediction models (14
). The current study provides prediction models that were generated using rigorous statistical analyses, and the differences in gene expression detected using microarray technology were validated by qRT-PCR and by testing against independent internal and external genome-wide gene expression datasets. The ultimate goal of our work has been to generate candidate markers that can be easily applied to the testing of biopsies or surgical margins to aid diagnosis and prognosis of OSCC. It is our hope that the signals we identify will be strong enough to use in a clinical test without resorting to the isolation of tumor cells and stromal cells, given that both cell populations play important roles in OSCC development and progression. Thus, we deliberately chose not to use laser capture microdissection to isolate tumor cells for this investigation. We believe that our current prediction models and the 131 probe sets that we identified warrant testing in subsequent studies for their utility in predicting local recurrence at surgical margins or the development of second primary cancers in OSCC patients, or for selective screening of individuals who are at high risk of OSCC. It is possible that histologically negative margins harbor microscopic residual disease from the original tumor. If so, the gene expression profile of such margins would more likely resemble that of the resected invasive OSCC, and measurement of one or more of the 131 probe sets we identified and application of one of our top models could potentially be of use for its detection. For individuals who are at high risk of OSCC, the oral epithelium could contain cells that are molecularly abnormal and primed for the development of cancer. As such, the molecular profile might be more similar to that of a pre-neoplastic oral lesion than to that of an invasive OSCC. 
The list of genes that we generated that distinguishes invasive OSCC from dysplasia and controls could potentially be used to gauge the malignant potential of these molecular changes. Recently, p53 and eIF4E have been evaluated to augment histologic assessment of surgical margins (4
). eIF4E expression, but not p53 mutation or overexpression, in histologically negative surgical margins was a significant predictor of recurrence and shorter disease-free survival in HNSCC patients (16
In the current study, we found that the expression levels of two pairs of genes (LAMC2
and COL4A1; COL1A1
) were particularly effective in distinguishing OSCC from normal oral tissue in independent testing sets. The sensitivity and specificity were close to 100%. Because of the stringent criteria we applied to select candidate markers, we expect that other probe sets among the 131 have similar predictive properties. We previously observed the differential expression of many of the 131 probe sets, including LAMC2, COL1A1 and COL4A1
). Overexpression of laminin gamma 2 in HNSCC, particularly in the invasive front of tumors, has been reported by others (20
). A study by Pyeon et al (13
) that used normal controls (n=14) and the same Affymetrix GeneChip arrays also found highly expressed LAMC2, COL4A1
in OSCC (n=42), compared to controls. A study by Ziober et al (22
), using Affymetrix U133 GeneChip arrays to compare gene expression of oral cavity tumors and paired adjacent clinically normal oral tissue from 13 patients, produced a list of 25 genes that showed 86-89% accuracy in distinguishing OSCC from controls in three small testing datasets that contained 13, 18 and 5 tumor samples and even fewer controls. Only seven of the 25 probe sets, encoding COL1A1, 4A1, 5A1, 5A2, microtubule, periostin and podoplanin, were among our list of 131 probe sets. Given the differences between their study and ours (sample size, tumor site, source of control samples, analytical methods and the sample size of the testing sets), the common observation of differential expression of collagen genes and genes involved in cell shape and movement underscores the potential importance of these genes in oral carcinogenesis. Another study of gene expression signatures (23
), involving comparison of oropharyngeal tumor samples from three patients with adjacent normal nonmalignant mucosa using a 9,350 EST cDNA array, reported differential expression of nine genes (23
). Only periostin in their list was among our 131 top candidate markers.
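The sensitivity and specificity figures quoted in this discussion follow the standard definitions for a binary classifier. As a minimal sketch only, and not the analysis code used in this study, the calculation can be written as follows; the function name `sensitivity_specificity` and the label vectors are hypothetical, and the values are made up for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be coded 1 = OSCC and 0 = control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # tumors called tumor
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # tumors called control
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called tumor
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one tumor and one control sample")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative labels only; these are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example one of four tumors is misclassified, so sensitivity falls while specificity is unaffected, which is why both quantities need to be reported when gene lists are compared across testing sets.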
Our results were adjusted for age and sex. Although exposures such as tobacco use and infection with human papillomavirus (HPV) play an important role in OSCC development, we did not observe any appreciable difference in gene expression at the genome-wide level according to smoking status (former/current vs. never) or HPV status (positive vs. negative). Only when we examined oropharyngeal cancers alone did we find differential gene expression between HPV-positive and HPV-negative tumors. The latter results have been submitted for review in a separate manuscript (Lohavanichbutr et al).
Laminin binds to Type IV collagen and to many cell types via cell surface laminin receptors (24
). Following attachment to laminin in the basement membrane, tumor cells secrete collagenase IV, which specifically breaks down type IV collagen, thus facilitating cell spreading and migration (25
). In addition, laminin fragments generated by post-translational proteolytic cleavage bind to cell surface integrins and other proteins to trigger and modulate cellular motility (26
). Increased levels of laminin have been associated with a number of carcinomas (27
). In some of these studies, laminin was associated with tumor aggressiveness, metastasis and poor prognosis. Results from mouse models showed that tumor cells with high levels of laminin and low levels of unoccupied laminin receptors are resistant to killing by natural cytotoxic T cells and are highly malignant (36
) and that treatment with low concentrations of laminin receptor-binding fragments of laminin blocked lung metastasis of hematogenously introduced tumor cells (37
). A large number of unoccupied laminin receptors have been observed for breast and colon cancer cells (25
); no similar reports have appeared on OSCC or HNSCC cells. Further studies of laminin and its receptors should be pursued to clarify their role in OSCC etiology and progression.
The gene products of COL4A1
are assembled into type IV collagen, which forms the scaffold of the basement membrane and integrates other extracellular molecules, including laminin, to produce a highly organized structural barrier. Collagen IV also plays an important role in the interaction of the basement membrane with cells (38
). Immune cells, migrating endothelial cells and metastatic tumor cells have been reported to produce and tightly regulate type IV collagen-specific collagenase (40
). Degradation of Type IV collagen could compromise basement membrane integrity and facilitate tumor cell spreading and migration. It is possible that the observed overexpression of COL4A1
by our study and by Pyeon et al is the net result of overproduction and degradation. Whether COL4A1 contributes to OSCC development is unknown and awaits investigation.
Peptidyl arginine deiminases (EC 3.5.3.15) catalyze post-translational modification of proteins through conversion of arginine residues to citrullines. Although their physiological functions are not well understood, they have been implicated in the genesis of multiple sclerosis, rheumatoid arthritis, and psoriasis (43
). The isoform peptidyl arginine deiminase type 1 (PADI1) is present in the keratinocytes of all layers of human epidermis. It has been reported that deimination of filaggrin by PADI1 is necessary for epidermal barrier function and that deimination of keratin K1 may lead to ultrastructural changes of the extracellular matrix (43
). We found the expression of PADI1
to be downregulated in both dysplasia and OSCC when compared to controls. If deimination of arginine residues of proteins in the keratinocytes of oral mucosa by PADI1 contributes to an epithelial barrier, as it does in the epidermis, downregulation of PADI1 may allow the growth, expansion and movement of tumor cells. Given the strength of our observation, it would be important to examine the function of PADI1 in cell lines and animal model systems.
Among the biological pathways we identified to be prominently involved in OSCC were the JAK/STAT and interferon gamma (IFN-γ) signaling pathways. A wide array of cytokines and growth factors, including EGFR, transmit signals through the JAK/STAT pathway (44
). EGFR overexpression has been reported in up to 90% of HNSCC tumors (46
). Single-modality therapeutics that target EGFR, such as small-molecule tyrosine kinase inhibitors, monoclonal antibodies, antisense therapy or immunotoxin conjugates, however, were effective in only 5-15% of patients with advanced HNSCC (47
). These observations suggest that there are other proteins and pathways driving the growth of some of these tumors. To our knowledge, this is the first study to show a strong association between the IFN-γ signaling pathway and OSCC. Interestingly, IFN-γ signaling also involves the JAK/STAT pathway (44
). It is unclear whether the upregulation of the IFN-γ pathway is intrinsic to the tumor cells or is due mainly to the immune cells present in the stroma. Further studies utilizing laser capture microdissection to address this question are warranted.
We identified a set of genes that are possibly involved in, and specific for, the malignant transformation of oral dysplasia into invasive OSCC. These include genes that encode proteins known to function in cell-matrix and cell-cell interaction, cellular migration or invasion, such as LAMC2
and SERPINE1 (PAI-1)
; for directed-cellular movement, such as CXCL2, 3
, and 9
, as well as for immune function, such as IL1β
. Due to the small number of dysplasia cases we studied, however, we were not able to separate the samples into a training set and a testing set. Another limitation is that the comparisons were made between dysplasia samples collected from the oral cavity and OSCC from both the oral cavity and oropharynx, and the controls from mucosa of oropharynx or tonsillar pillar. Thus, our results await confirmation or refutation by others. Kondoh et al (49
) reported the differential expression of 27 genes between 27 OSCC and 19 leukoplakia tissues based on their IntelliGene Human Expression cDNA array and qRT-PCR. Among those 27 genes, only LAMC2, IFIT3
were on our list. The observed discrepancy is not surprising, given the large number of differences between the two studies: 1) Kondoh et al compared OSCC with leukoplakias, while we compared OSCC with dysplastic lesions; and 2) that study used microdissected samples to remove stroma while we did not, and they assayed the samples with a 16,600 probe set cDNA array, as opposed to our ~50,000 probe set oligonucleotide array. Nonetheless, their study and ours show that LAMC2, IFIT3
are worthy of further investigation as predictors of the development of OSCC among patients with oral dysplasia. It is interesting to point out that, among our 131 probe sets, a large number of collagen genes were among the probe sets that may be associated with the conversion of oral tissue to dysplasia (Supplemental Table
) and were absent among the probe sets that may be involved in the conversion of dysplasia to invasive OSCC (). These observations suggest that collagen genes may play an important role early in the oral carcinogenesis process.
Although our sample size is substantially larger than that of other published microarray studies of HNSCC, it is nonetheless very small relative to the number of genome-wide comparisons we made. Furthermore, the internal and external testing sets that we used to test the predictive power of our proposed models were also small. Although we validated the differential expression of the four markers in the top two models, whether these four markers will continue to exhibit the greatest predictive power remains to be seen when they are tested further in independent studies with much larger sample sizes.
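The effect of a small testing set on an estimate such as sensitivity can be illustrated with a Wilson score confidence interval for a binomial proportion. The following is a minimal sketch under stated assumptions and is not part of the original analysis: the function name `wilson_interval` is hypothetical, the counts are illustrative rather than taken from our data, and `z = 1.96` assumes a two-sided 95% interval.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of correctly classified samples (e.g. tumors called tumor)
    n:         total number of samples in that class
    z:         normal quantile; 1.96 is assumed here for a 95% interval
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Illustrative counts only: perfect classification in a small and a larger set.
for n in (13, 130):
    low, high = wilson_interval(n, n)
    print(f"n={n}: 95% interval for observed sensitivity 1.00 is [{low:.3f}, {high:.3f}]")
```

Even when every tumor in a small testing set is classified correctly, the lower bound of the interval remains well below 100%, which is the statistical reason that the top models need to be retested in larger independent studies.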