In this study, we used a newly developed COGRIM statistical model to systematically define NF-κB regulons of genes differentially expressed by UM-SCC cells (Figures and ). These NF-κB regulons are connected to networks and signal pathways, for which there is evidence of significant involvement in tumorigenesis (Figures and , and Table ). Our experimental data confirmed and validated computational and bioinformatic predictions for NF-κB regulation and binding activity on the promoter sequences of a selection of these genes (Figures , , ), indicating that NF-κB family members function as important master controls of gene expression, coordinating action within networks and pathways that contribute to the malignant phenotype of UM-SCC. Our study revealed the power of a systems biology analysis using COGRIM modeling and IPA to identify molecular signatures at the global level that are modulated by functionally active TFs, interacting networks and signaling pathways.
This study is the first utilization of COGRIM to analyze a family of TFs in a human cancer system [
47,
53]. Previously, there have been limited genome-wide computational analyses of NF-
κB binding activity and regulated genes related to malignant phenotypes and genotypes, due to the complexity of NF-
κB regulatory mechanisms, heterogeneous cancer subtypes, and inherent limitations or biases in computational and experimental conditions. An important feature of the COGRIM model is the ability to computationally analyze complex transcriptional regulatory mechanisms by simultaneously integrating multiple large scaled data sources, in a principled and robust fashion without requiring
a priori knowledge of the relative accuracy of each data source. This model-based strategy greatly improved the efficiency and accuracy of the elucidation of the functional and physical relationships among the TFs, pathways and networks. Although the linear model of expression used as a basis for COGRIM is an approximation of transcriptional regulation, it has proven to be effective in other investigations [
54-
56]. One potential limitation of COGRIM is that the TF activity f
jt must be approximated by a proxy measure such as the expression level of the gene that codes for that TF. The predicted functions of TFs are confirmed with experimental results even when extensive ChIP binding data were not available [
47].
As described previously [
47], the COGRIM method includes a probabilistic model for each data source that addresses the inherent uncertainty within each data type. COGRIM is more than a simple extension of previous linear models in that it provides a principled mechanism for integrating sequence features with expression data for the prediction of target genes and can be further extended in several interesting directions in the presence of additional data sources. It should also be noted that although we have focused on TFs, the model would work equally well with regulatory factors that are not proteins but whose levels can be measured and whose binding sites can be identified (for example, microRNAs). COGRIM represents an initial step toward solving the problem of integrating available biological information in a principled fashion. Our belief is that this goal will best be accomplished by fitting large and flexible probability models that combine data from various experimental and compiled sources in a structured or multi-level framework. We anticipate that the model will become even more valuable as the accuracy and coverage of expression and sequence feature data improve.
Using COGRIM in this study, 748 putative NF-
κB target genes were identified, which consisted of 59% of 1,265 differentially expressed genes from microarray analysis in UM-SCC cells (Figure and Additional data file 1). This ratio is slightly higher than the frequency of all predicted NF-
κB binding motifs calculated in vertebrates (approximately 50%, including human, mouse and rat data from the Genomatix promoter database), but is slightly lower than the frequencies of NF-
κB binding motifs predicted in the up-regulated gene clusters enriched with known NF-
κB related genes published previously (approximately 65-70% in B-C gene clusters) [
14]. The prediction is consistent with the hypothesis and experimental data that NF-
κB regulated genes are over-represented in tumor associated gene signatures, especially in the up-regulated gene clusters [
14]. Interestingly, the overall ratio of approximately 60% of differentially expressed genes in human UM-SCC cells is remarkably consistent with the approximate percentage of genes in murine squamous cell carcinoma restored to expression levels seen in non-malignant cells of syngeneic origin by inhibition of NF-
κB using an inducible mutant I
κB
α [
13]. Inhibition of NF-
κB and target genes in this murine model was accompanied by decreased proliferation, migration, cell survival, angiogenesis and tumorigenesis [
13]. The murine NF-
κB modulated gene signature was independently associated with a gene signature associated with decreased prognosis in a large series of human HNSCC[
43]. Together, these experimental and
in silico analyses of expression profiling data in murine and human squamous cell carcinoma are consistent with involvement of NF-
κB as a key regulatory factor in global alterations in gene expression in squamous cell carcinoma.
The efficiency and accuracy of COGRIM prediction are also supported by cross validation with other experimental data from published literature, as well as with our experimental results from UM-SCC cells upon TNF-
α stimulation or siRNA knock down of NF-
κB (Figures , , ) [
14]. Among the 748 genes predicted as NF-
κB target genes, 75 of them (10%; in bold in Additional data file 1) overlapped with approximately 600 NF-
κB target genes published previously by the three websites described in the Materials and methods, indicating most of the predicted genes represent novel NF-
κB target genes. Additionally, only 16 genes of the list of 1,265 'known NF-
κB genes' based on these websites were excluded from our predicted gene list, due to low probability scores by COGRIM modeling (data not shown). Among the 16 genes, 3 were previously implicated in HNSCC and other cancers, namely
AREG (amphiregulin),
MMP14, and
MYC. After searching the original references, we found the reference for
AREG was incorrectly cited. For
MMP14, a NF-
κB binding motif was observed in the promoter; however, it is located at -1,165 bp from the transcriptional stating site, which is outside the proximal promoter sequence defined in this study. The reference for
MYC was published in 1990, for which the consensus sequence of the NF-
κB binding motif cited does not exist in the most updated human genome sequence (data not shown). However, there are several references suggesting other subunits of NF-
κB could be involved in the regulation of
MYC, including evidence that the RELB/p52 complex can directly bind to the
MYC gene promoter [
24]. However, the PWM of RELB/p52 binding motifs has not been well established, and the computation in this study did not include RELB/p52. There have been reports about possible involvement of cREL in
MYC gene expression but without discussing detailed mechanisms [
24,
57]. Thus, the COGRIM modeling in this study successfully predicted 82% (75/91) of known NF-
κB genes identified. The few cases of failed prediction could be either due to errors in literature citations, or because the location of the NF-
κB binding site is outside of the promoter sequence boundary selected for this study.
We experimentally validated a selected subset of predicted NF-
κB genes involved in signal pathways, using TNF-
α and siRNA as tools. We showed that TNF-
α significantly enhanced gene expression of 15 genes selected from the prediction, where 6 are novel NF-
κB targets, including
IL1R2,
ALDH1A3,
ITGA2,
ITGA5,
LAMA3 and
LAMB3 (Figure and Additional data file 1). To date, we have experimentally tested expression of a total of 47 genes in response to TNF-
α (Figure ) [
14], where 41 genes were identified as NF-
κB target genes by COGRIM, of which 23 are novel. Previously, there have been several experimental studies attempting to globally investigate NF-
κB binding activity and regulated gene expression, including RELA binding activity throughout human chromosome 22 [
58], and TNF-
α-induced NF-
κB target gene expression in HeLa cells [
59,
60], U937 monocytic cells [
61], lipopolysaccharide-stimulated human peripheral blood mononuclear cells [
62], and THP.1 cells transfected with IKK
γ [
63]. Under TNF-
α stimulation, 767 genes (
P < 0.05) or 343 genes (
P < 0.01) were differentially expressed by HeLa cells [
60], 348 genes exhibited NF-
κB binding activity in U937 cells [
61], and 79 or 72 genes were identified as NF-
κB regulated and responsive to lipopolysaccharide in macrophages [
62,
63]. Among these gene lists, a total of 88 genes were confirmed as NF-
κB-regulated genes and overlapped with our gene list (Additional data file 1), where 25 genes were identified as known NF-
κB genes listed by the three websites previously mentioned. These experimental data also cross-validate 63 putative NF-
κB target genes identified by our analysis.
We noticed different kinetics in the expression of gene subsets induced by TNF-
α in UM-SCC cell lines. The responsive kinetics of many of the novel NF-
κB target genes are either slowly induced, or induced and sustained without rapid decrease (Figure ), in contrast to the typical TNF-
α-induced early response gene, as observed for cytokines (
IL1A,
IL1B,
IL1RN,
CSF2 and
VEGFC) and a NF-
κB family member (
REL). Different kinetics of gene expression in response to TNF-
α treatment has been noticed previously [
59], where rapid oscillatory responses could be due to TNF-
α-mediated phosphorylation, degradation and re-synthesis of I
κB
α, in contrast to that of I
κB
β and I
κB
ε, which mediate prolonged stimulation [
64]. It has also been reported that the TNF-
α-induced early response pattern is seen often in genes with conserved promoter regions [
59]. This observation supports the hypothesis that the genes with typical early response inducible patterns are those with evolutionarily conserved functions involved in the first line of defense, where a quick reaction and termination mechanism is needed. The genes with the slower induction patterns are involved in functions such as adhesion and cell structure, where slower and persistent responses are necessary.
We also observed that the predicted NF-
κB regulons are not uniformly distributed in the subgroups of UM-SCC cells (Figure ), and more genes with predicted RELA-specific regulatory motifs were observed in the up-regulated gene groups with wild-type p53-deficient status (Figure ). This observation is in good agreement with the general consensus and experimental data regarding the positive regulatory role of RELA in controlling oncogenic gene expression [
2,
25,
65], and is consistent with our observation that in UM-SCC cells with wild-type p53-deficient status a cluster of NF-
κB regulated genes is over-expressed [
14,
35]. The experiments knocking down
RELA or
NFκB1 elucidated NF-
κB-mediated specific regulatory mechanisms (Figure ), and provided data consistent with the previous findings that the basal expression of most of the NF-
κB-regulated genes depends on both RELA and NF
κB1 (p65/p50 heterodimer). Negative regulation of NF
κB1 compared to the basal gene expression was consistent with a repressive function associated with p50 homodimers [
1,
2]. Furthermore, the binding activities of RELA and NF
κB1 were confirmed in the promoter regions of a typical NF-
κB target gene, namely
IL8 (which served as the positive control), in the promoters of atypical NF-
κB target genes, namely
IGFBP3 and
CDKN1A, and in the promoters of novel NF-
κB target genes, namely
ITGA5 and
LAMB3 (Figure ). Interestingly,
CDKN1A is also a p53 target gene with important function in the control of the cell cycle and apoptosis, and
IGFBP3 is a p63 target gene involved in the IGF signaling pathway [
66]. Our data provide computational and experimental evidence consistent with potential cross-talk between the two important pathways, namely NF-
κB and p53/p63, through which target genes could be transcriptionally regulated by either or both TFs.
The NF-
κB target genes identified were connected by networks and functioned as regulons under direct interaction or close regulation by RELA (Figure ), or NF
κB1 (Figure ). These gene groups included many known NF-
κB target genes with confirmed NF-
κB binding sites, such as
CCDN1,
CSF1,
CSF2,
ELF3,
ICAM1,
IL1A,
IL1B,
IL1RN,
IL2RA,
IL6,
IL8, and
VIM. Interestingly, most of these known NF-
κB target genes appear in the networks with RELA (Figure ), and are less significantly associated with NF
κB1 (Figure ). This observation is consistent with the fact that RELA contains the functional transactivation domain for mediating gene transcription [
2,
25,
65]. In addition, even fewer NF-
κB target genes were connected with NF
κB1 in cells with wild-type p53-deficient status (Figure ); this subgroup of cells over-expressed genes with a high prevalence of RELA regulation (Figure ). In this subgroup of cells (Figure ), more genes were connected to peroxisome proliferator-activated receptor gamma (PPARG), a member of the nuclear hormone receptor subfamily. PPARG is able to form heterodimers with retinoid X receptors, affect RELA cytoplasmic distribution and negatively regulate inflammatory responses [
67]. Interestingly, in this network, PPARG is also linked with PPAR binding protein, a PPAR co-activator with the ability to bind to DNA and p53 protein [
68,
69]. Our and other data provide computational and experimental evidence consistent with potential cross-talk between the two important pathways, namely NF-
κB and p53/p63, through which target genes could be transcriptionally regulated by either or both TFs.
The up-regulated NF-
κB target genes are enriched in important signal pathways implicated in most cancers, including leukocyte extravasation, inositol phosphate metabolism, and xenobiotic metabolism pathways (Table and Figure ). The pathways identified are consistent with previous evidence from studies by us and others that NF-
κB promotes proinflammation, pro-angiogenesis, cell adhesion and migration through up-regulation of genes involved in these pathways [
4,
6-
16,
70,
71]. The inositol phosphate metabolism pathway consists of molecular components of PI3K and protein kinase C pathways, both important signal pathways implicated in promoting tumorigenesis, especially in epidermis and epithelia [
72-
75]. The involvement of NF-
κB with the down-regulated genes has been less studied; in this study, genes in Wnt/
β-catenin and TGF-
β pathways were down-regulated in all tumor cells through regulation in association with all NF-
κB family members (Figure ). The Wnt/
β-catenin signaling pathway includes many negative regulators of cell growth and survival, and the down-regulation of these genes has been shown to be the critical step in tumorigenesis in epidermis and epithelia [
76]. Interestingly, the involvement of RELA and NF
κB1 in the Wnt/
β-catenin pathway was not significant, suggesting other NF-
κB family members or NF-
κB-independent intermediates could be involved. The TGF-
β signaling pathway is another negative regulatory pathway and the resulting deficiency has been demonstrated in HNSCC. Lu
et al. [
77] identified a defect of TGF-
β receptor 2 (
TGFβR2) and the related pathway that significantly contributes to HNSCC carcinogenesis and metastasis. Other signal pathways identified are more specific to the phenotypic and genotypic differences in UM-SCC cells resulting from different p53 statuses (Figure ). The pathways related to IGF, integrins, receptor and intermediate signals (Ephrin receptor, NF-
κB, p38 MAPK, PPAR and PTEN, and cytokines (VEGF and GM-CSF are dominant in cells with mutant p53 status, which is consistent with either the loss of the negative regulation (PTEN and PPAR), or the suppression of NF-
κB and other signal pathways and genes by gaining or retaining p53 functions in cells with mutant p53 status [
14,
35]. For cells with wild-type p53-deficient status, down-regulated genes were only involved in the cell cycle:G2/M DNA damage checkpoint pathway, where RELA showed dominant effects (Figure ).
This study provides a strong link between NF-
κB regulons and related pathways identified by the systems biology approaches, consistent with many conclusions previously drawn from individual and classic biological experiments. The data from both computational and experimental strategies support the hypothesis that the malignant progression of HNSCC is due to, or leads to, multiple genetic and phenotypic defects, such as p53 mutation or underexpression [
38,
78], and aberrant activation of several major growth factor and cytokine receptor pathways, including the TNF receptor [
16], IL1R [
9,
39], IL6R [
31], EGFR [
10], hepatocyte growth factor receptor/cMet [
41], TGF-
β receptor [
77], and platelet-derived growth factor receptor [
79] pathways. These receptors modulate multiple signal pathways, including aberrant activation of NF-
κB [
6,
7,
19], AP1 [
6,
9], JAK/STAT [
31], early growth response-1(EGR1) [
37], casein kinase 2 [
11], MAPK [
15], PI3K [
10,
41], and BCL-XL/IAP associated apoptosis pathways [
8]. Our previous report showed that the five major TFs - NF-
κB, STAT3, AP1, EGR1, and p53 - are specifically implicated in the unique gene signatures of UM-SCC cells [
14], adding supporting evidence to current work that multiple transcriptional mechanisms and signal pathways control specific gene and pathway signatures that determine the malignant phenotypes and the heterogeneous characteristics in UM-SCC cells.
In interpreting our current study, we recognize that there are differences between cell lines and human tissues. However, many of our previous studies using these cell lines have led to the demonstration and confirmation of important molecular findings made with them in tumor tissue and serum specimens. These include the demonstration of alterations and the biological and clinical significance of NF-
κB activation and of multiple NF-
κB-regulated genes and cytokines expressed in HNSCC tumor specimens and serum [
18,
32,
33,
42,
80,
81], and the demonstration of an inverse relationship between NF-
κB and p53 nuclear localization and associated protein expression in tumors [
35]. As a result, and to further examine the validity of the results of the bioinformatic analysis of the present study, we have recently undertaken a meta-analysis of 34 microarray datasets of HNSCC (approximately 80% from tissue specimens). Preliminary analyses are consistent with key observations from this study using UM-SCC cell lines, including that the molecules in, and/or regulated by, the NF-
κB and p53 signaling pathways are significantly enriched and related to HNSCC malignancy (B Yan
et al., manuscript in preparation). Since there are many important differences between the tumor cell lines in culture and human tumor specimens, where the paracrine effects from fibroblasts and other host cells are missing, it will be important in future studies to integrate stromal cell gene and protein expression data with functional studies of potential networks involving these interactions.