|Home | About | Journals | Submit | Contact Us | Français|
Despite their importance, the molecular circuits that control the differentiation of naïve T cells remain largely unknown. Recent studies that reconstructed regulatory networks in mammalian cells have focused on short-term responses and relied on perturbation-based approaches that cannot be readily applied to primary T cells. Here, we combine transcriptional profiling at high temporal resolution, novel computational algorithms, and innovative nanowire-based tools for performing perturbations in primary T cells to systematically derive and experimentally validate a model of the dynamic regulatory network that controls Th17 differentiation. The network consists of two self-reinforcing, but mutually antagonistic, modules, with 12 novel regulators, whose coupled action may be essential for maintaining the balance between Th17 and other CD4+ T cell subsets. Overall, our study identifies and validates 39 regulatory factors, embeds them within a comprehensive temporal network and reveals its organizational principles, and highlights novel drug targets for controlling Th17 differentiation.
Effective coordination of the immune system requires careful balancing of distinct pro-inflammatory and regulatory CD4+ helper T cell populations. Among those, pro-inflammatory IL-17 producing Th17 cells play a key role in the defense against extracellular pathogens and have also been implicated in the induction of several autoimmune diseases1. Th17 differentiation from naïve T-cells can be triggered in vitro by the cytokines TGF-β1 and IL-6. While TGF-β1 alone induces Foxp3+ regulatory T cells (iTreg)2, the presence of IL-6 inhibits iTreg and induces Th17 differentiation1.
Much remains unknown about the regulatory network that controls Th17 cells3,4. Developmentally, as TGF-β is required for both Th17 and iTreg differentiation, it is not understood how balance is achieved between them or how IL-6 biases toward Th17 differentiation1. Functionally, it is unclear how the pro-inflammatory status of Th17 cells is held in check by the immunosuppressive cytokine IL-103,4. Finally, many of the key regulators and interactions that drive development of Th17 remain unknown5.
Recent studies have demonstrated the power of coupling systematic profiling with perturbation for deciphering mammalian regulatory circuits6-9. Most of these studies have relied upon computational circuit-reconstruction algorithms that assume one ‘fixed’ network. Th17 differentiation, however, spans several days, during which the components and wiring of the regulatory network likely change. Furthermore, naïve T cells and Th17 cells cannot be transfected effectively in vitro by traditional methods without changing their phenotype or function, thus limiting the effectiveness of perturbation strategies for inhibiting gene expression.
Here, we address these limitations by combining transcriptional profiling, novel computational methods, and nanowire-based siRNA delivery10 (Fig. 1a) to construct and validate the transcriptional network of Th17 differentiation. The reconstructed model is organized into two coupled, antagonistic, and densely intra-connected modules, one promoting and the other suppressing the Th17 program. The model highlights 12 novel regulators, whose function we further characterized by their effects on global gene expression, DNA binding profiles, or Th17 differentiation in knockout mice.
We induced the differentiation of naïve CD4+ T-cells into Th17 cells using TGF-β1 and IL-6, and measured transcriptional profiles using microarrays at eighteen time points along a 72hr time course (Fig. 1, Supplementary Fig. 1a-c, Methods). As controls, we measured mRNA profiles for cells that were activated without the addition of differentiating cytokines (Th0). We identified 1,291 genes that were differentially expressed specifically during Th17 differentiation (Methods, Supplementary Table 1) and partitioned them into 20 co-expression clusters (k-means clustering, Methods, Fig. 1b and Supplementary Fig. 2) with distinct temporal profiles. We used these clusters to characterize the response and reconstruct a regulatory network model, as described below (Fig. 2a).
There are three transcriptional phases as the cells transition from a naïve-like state (t=0.5hr) to Th17 (t=72hr; Fig. 1c and Supplementary Fig. 1c): early (up to 4hr), intermediate (4-20hr), and late (20-72hr). Each corresponds, respectively, to a differentiation phase5: (1) induction, (2) onset of phenotype and amplification, and (3) stabilization and IL-23 signaling. The early phase is characterized by transient induction (e.g., Cluster C5, Fig. 1b) of immune response pathways (e.g., IL-6 and TGF-β signaling; Fig. 1d and Supplementary Table 2). Some early induced genes display sustained expression (e.g., Cluster C10, Fig. 1b); these are enriched for transcription factors (TFs), including the key Th17 factors Stat3, Irf4 and Batf, and the cytokine and cytokine receptors IL-21, Lif, and Il2ra (Supplementary Table 1). The transition to the intermediate phase (t=4hr) is marked by induction of ROR-γt (master TF; Supplementary Fig. 1d) and another 12 TFs (Cluster C20, Fig. 1b), both known (e.g., Ahr) and novel (e.g., Trps1) to Th17 differentiation. During the transition to the late phase (t=20hr), mRNAs of Th17 signature cytokines are induced (e.g., IL-17a, IL-9; cluster C19) whereas mRNAs of cytokines that signal other T cell lineages are repressed (e.g., IFN-γ and IL-4). Regulatory cytokines from the IL-10 family are also induced (IL-10, IL-24), possibly as a self-limiting mechanism related to the emergence of ‘pathogenic’ or ‘non-pathogenic’ Th17 cells11. Around 48hr, the cells induce IL23r (data not shown), which plays an important role in the late phase (Supplementary Fig. 3 Supplementary Table 1).
We hypothesized that each of the clusters (Fig. 1b, Supplementary Table 2) encompasses genes that share regulators active in the relevant time points. To predict these regulators, we assembled a general network of regulator-target associations from published genomics profiles12-19 (Fig. 2a, Methods). We then connected a regulator to a gene from its set of putative targets only if there was also a significant overlap between the regulator's putative targets and that gene's cluster (Methods). Since different regulators act at different times, the connection between a regulator and its target may be active only within a certain time window. To determine this window, we labeled each edge with a time stamp denoting when both the target gene is regulated (based on its expression profile) and the regulator node is expressed at sufficient levels (based on its mRNA levels and inferred protein levels20; Methods). In this way, we derived a network ‘snapshot’ for each of the 18 time points (Fig. 2b-d). Overall, 9,159 interactions between 71 regulators and 1,266 genes were inferred in at least one network.
The active factors and interactions change from one network to the next. The vast majority of interactions are active only at some time window (Fig. 2c), even for regulators (e.g., Batf) that participate in all networks. Based on similarity in active interactions, we identified three network classes (Fig. 2c), corresponding to the three differentiation phases (Fig. 2d). We collapsed all networks in each phase into one model, resulting in three consecutive network models (Supplementary Fig. 4; Supplementary Table 3). Among the regulators, 33 are active in all of the networks (e.g. many known master regulators such as Batf1, Irf4, and Stat3), whereas 18 are active in only one (e.g. Stat1 and Irf1 in the early network; ROR-γt in the late network). Indeed, while ROR-γt mRNA levels are induced at ~4h, ROR-γt protein levels increase at approximately 20h and further rise over time, consistent with our model (Supplementary Fig. 5).
In addition to known Th17 regulators, our network includes dozens of novel factors as predicted regulators (Fig. 2d), induced target genes, or both (Supplementary Fig. 4; Supplementary Table 3). It also contains receptor genes as induced targets, both previously known in Th17 cells (e.g., IL-1R1, IL-17RA) and novel (e.g., Fas, Itga3).
We ranked candidate regulators for perturbation (Fig. 2a, ,3a,3a, Methods), guided by features that reflect a regulatory role (Fig. 3a, “Network Information”) and a role as target (Fig. 3a, “Gene Expression Information”). We computationally ordered the genes to emphasize certain features (e.g., a predicted regulator of key Th17 genes) over others (e.g., differential expression in our time course data). We used a similar scheme to rank receptor proteins (Supplementary Table 4 and Methods). Supporting their quality, our top-ranked factors are enriched (p<10-3) for manually curated Th17 regulators (Supplementary Fig. 6), and correlate well (Spearman r>0.86) with a ranking learned by a supervised method (Methods). We chose 65 genes for perturbation: 52 regulators and 13 receptors (Supplementary Table 4). These included most of the top 44 regulators and top 9 receptors (excluding a few well-known Th17 genes and/or those for which knockout data already existed), as well as additional representative lower ranking factors.
In unstimulated primary mouse T cells, viral- or transfection-based siRNA delivery has been nearly impossible because it either alters differentiation or cell viability21,22. We therefore used a new delivery technology based on silicon nanowires (NWs)10,23, which we optimized to effectively (>95%) deliver siRNA into naïve T cells without activating them (Figs. 3b and c)23.
We attempted to perturb 60 genes with NW-mediated siRNA delivery and achieved efficient knockdown (<60% transcript remaining at 48hr post activation) for 34 genes (Fig. 3c and Supplementary Fig. 7). We obtained knockout mice for seven other genes, two of which (Irf8 and Il17ra) were also in the knockdown set (Supplementary Table 4). Altogether, we successfully perturbed 39 of the 65 selected genes – 29 regulators and 10 receptors – including 21 genes not previously associated with Th17 differentiation.
We measured the effects of perturbations at 48hr post-activation on the expression of 275 signature genes using the Nanostring nCounter system (Supplementary Tables 5, 6; Il17ra and Il21r knockouts were also measured at 60hr). The signature genes were computationally chosen to cover as many aspects of the differentiation process as possible (Methods): they include most differentially expressed cytokines, TFs, and cell surface molecules, as well as representatives from each cluster (Fig. 1b), enriched function (Supplementary Table 2), and predicted targets in each network (Supplementary Table 3). For validation, we profiled a signature of 86 genes using the Fluidigm BioMark system, obtaining highly reproducible results (Supplementary Fig. 8).
We scored the statistical significance of a perturbation's effect on a signature gene by comparing to non-targeting siRNAs and to 18 control genes that were not differentially expressed (Methods, Fig. 4a, all non-grey entries are significant). Supporting the original network model (Fig. 2), there is a significant overlap between the genes affected by a regulator's knockdown and its predicted targets (p ≤ 0.01, permutation test; Supplementary Methods).
To study the network's dynamics, we measured the effect of 28 of the perturbations at 10hr (shortly after the induction of ROR-γt; Supplementary Table 5), using the Fluidigm Biomark system. We found that 30% of the functional interactions are present with the same activation/repression logic at both 10hr and 48hr, whereas the rest are present only in one time point (Supplementary Fig. 9). This is consistent with the extent of rewiring in our original model (Fig. 2b).
Characterizing each regulator by its effect on Th17 signature genes (e.g. IL17A, IL17F, Fig. 4b, grey nodes, bottom), we find that at 48hr the network is organized into two antagonistic modules: a module of 22 ‘Th17 positive factors’ (Fig. 4b, blue nodes: 9 novel) whose perturbation decreased the expression of Th17 signature genes (Fig. 4b, grey nodes, bottom), and a module of 5 ‘Th17 negative factors’ (Fig. 4b, red nodes: 3 novel) whose perturbation did the opposite. Each of the modules is tightly intra-connected through positive, self-reinforcing interactions between its members (70% of the intra-module edges), whereas most (88%) inter-module interactions are negative. This organization, which is statistically significant (empirical p-value<10-3; Methods, Supplementary Fig. 10), is reminiscent to that observed previously in genetic circuits in yeast24,25. At 10hrs, the same regulators do not yield this clear pattern (p>0.5), suggesting that at that point, the network is still malleable.
The two antagonistic modules may play a key role in maintaining the balance between Th17 and other T cell subsets and in self-limiting the pro-inflammatory status of Th17 cells. Indeed, perturbing Th17 positive factors also induces signature genes of other T cell subsets (e.g., Gata3, Fig. 4b, grey nodes, top), whereas perturbing Th17 negative factors suppresses them (e.g., Foxp3, Gata3, Stat4, and Tbx21).
We now focused on the role of 12 of the positive or negative factors (including 11 of the 12 novel factors that have not been associated with Th17 cells; Fig. 4b, light grey halos). We used RNA-Seq after perturbing each factor to test whether its predicted targets (Fig. 2) were affected by perturbation (Fig. 4c, Venn diagram, top). We found highly significant overlaps (p<10-5) for three of the factors (Egr2, Irf8, and Sp4) that exist in both datasets, and a border-line significant overlap for the fourth (Smarca4), validating the quality of the edges in our network.
Next, we assessed the designation of each of the 12 factors as ‘Th17 positive’ or ‘Th17 negative’ by comparing the set of genes that respond to that factor's knockdown (in RNA-Seq) to each of the 20 clusters (Fig. 1b). Consistent with the original definitions, knockdown of a ‘Th17 positive’ regulator down-regulated genes in otherwise induced clusters, and up-regulated genes in otherwise repressed or un-induced clusters (and vice versa for ‘Th17 negative’ regulators; Fig. 4d and Supplementary Fig. 11a,b). The genes affected by either positive or negative regulators also significantly overlap with those bound by key CD4+ transcription factors (e.g., Foxp326,27, Batf, Irf4, and ROR-γt28,29, Xiao et al., unpublished data).
Knockdown of Mina, a chromatin regulator from the Jumonji C (JmjC) family, represses the expression of signature Th17 cytokines and TFs (e.g. ROR-γt, Batf, Irf4) and of late-induced genes (clusters C9, C19; p<10-5; Supplementary Tables 5 and 7), while increasing the expression of Foxp3, the master TF of Treg cells. Mina is strongly induced during Th17 differentiation (cluster C7), is down-regulated in IL23r-/- Th17 cells, and is a predicted target of Batf30, ROR-γt30, and Myc in our model (Fig. 5a). Mina was shown to suppress Th2 bias by interacting with the TF NFAT and repressing the IL-4 promoter31. However, in our cells, Mina knockdown did not induce Th2 genes, suggesting an alternative mode of action via positive feedback loops between Mina, Batf and ROR-γt (Fig. 5a, left). Consistent with this model, Mina expression is reduced in Th17 cells from ROR-γt-knockout mice, and the Mina promoter was found to be bound by ROR-γt by ChIP-Seq (data not shown). Finally, the genes induced by Mina knockdown significantly overlap with those bound by Foxp3 in Treg cells26,27 (P<10−25; Supplementary Table 7) and with a cluster previously linked to Foxp3 activity in Treg cells32 (Supplementary Fig. 11c and Supplementary Table 7).
To further analyze the role of Mina, we measured IL-17a and Foxp3 expression following differentiation of naïve T cells from Mina-/- mice. Mina-/- cells had decreased IL-17a and increased Foxp3 compared to wild-type (WT) cells, as detected by intracellular staining (Fig. 5a). Cytokine analysis of the corresponding supernatants confirmed a decrease in IL-17a production and an increase in IFN-γ (Fig. 5a) and TNFα (Supplementary Fig. 12a). This suggests a model where Mina, induced by ROR-γt and Batf, promotes transcription of ROR-γt, while suppressing induction of Foxp3, thus affecting the reciprocal Tregs/Th17 balance33 by favoring rapid Th17 differentiation.
Fas, the TNF receptor superfamily member 6, is another Th17 positive regulator (Fig. 5b). Fas is induced early, and is a target of Stat3 and Batf in our model. Fas knockdown represses the expression of key Th17 genes (e.g., IL-17a, IL-17f, Hif1a, Irf4, and Rbpj) and of the induced cluster C14, and promotes the expression of Th1-related genes, including IFN-γ receptor 1 and Klrd1 (Cd94; by RNA-Seq, Fig. 4, Fig. 5b, Supplementary Table 7 and Supplementary Fig. 11). Fas and Fas-ligand deficient mice are resistant to the induction of autoimmune encephalomyelitis (EAE)34, but have no defect in IFN-γ or Th1 responses. The mechanism underlying this phenomenon was never studied.
To explore this, we differentiated T cells from Fas-/- mice (Fig. 5b, Supplementary Fig. 12c). Consistent with our knockdown analysis, expression of IL-17a was strongly repressed and IFN-γ production was strongly increased under both Th17 and Th0 polarizing conditions (Fig. 5b). These results suggest that besides being a death receptor, Fas may play an important role in controlling the Th1/Th17 balance, and Fas-/- mice may be resistant to EAE due to lack of Th17 cells.
Knockdown of Pou2af1 (OBF1) strongly decreases the expression of Th17 signature genes (Fig. 5c) and of intermediate- and late-induced genes (clusters C19 and C20, p<10-7; Supplementary Tables 5 and 7), while increasing the expression of regulators of other CD4+ subsets (e.g., Foxp3, Stat4, Gata3) and of genes in non-induced clusters (clusters C2 and C16 p<10-9; Supplementary Table 5 and 7). Pou2af1's role in T cell differentiation has not been explored35.
To investigate its effects, we differentiated T cells from Pou2af1-/- mice (Fig. 5c, Supplementary Fig. 12b). Compared to WT cells, IL-17a production was strongly repressed. Interestingly, IL-2 production was strongly increased in Pou2af1-/- T cells under non-polarizing (Th0) conditions. Thus, Pou2af1 may promote Th17 differentiation by blocking production of IL-2, a known endogenous repressor of Th17 cells36. Pou2af1 acts as a transcriptional co-activator of the TFs OCT1 or OCT235. IL-17a production was also strongly repressed in Oct1-deficient cells (Supplementary Fig. 12d), suggesting that Pou2af1 may exert some of its effects through this co-factor.
Knockdown of the TSC22 domain family protein 3 (Tsc22d3) increases the expression of Th17 cytokines (IL-17a, IL-21) and TFs (ROR-γt, Rbpj, Batf), and reduces Foxp3 expression. Previous studies in macrophages have shown that Tsc22d3 expression is stimulated by glucocorticoids and IL-10, and it plays a key role in their anti-inflammatory and immunosuppressive effects37. Tsc22d3 knockdown in Th17 cells increased the expression of IL-10 and other key genes that enhance its production (Fig. 5d). Although IL-10 production has been shown33,38,39 to render Th17 cells less pathogenic in autoimmunity, co-production of IL-10 and IL-17a may be the indicated response for clearing certain infections like Staphylococcus aureus at mucosal sites40. This suggests a model where Tsc22d3 is part of a negative feedback loop for the induction of a Th17 cell subtype that coproduce IL-17 and IL-10 and limits their pro-inflammatory capacity. Tsc22d3 is induced in other cells in response to the steroid Dexamethasone41, which represses Th17 differentiation and ROR-γt expression42. Thus, Tsc22d3 may mediate this effect of steroids.
To further characterize Tsc22d3's role, we used ChIP-Seq to measure its DNA-binding profile in Th17 cells and RNA-Seq following its knockdown to measure its functional effects. There is a significant overlap between Tsc22d3's functional and physical targets (P<0.01, e.g., IL-21, Irf4; Methods, Supplementary Table 8). For example, Tsc22d3 binds in proximity to IL-21 and Irf4, which also become up regulated in the Tsc22d3 knockdown. Furthermore, the Tsc22d3 binding sites significantly overlap those of major Th17 factors, including Batf, Stat3, Irf4, and ROR-γt (>5 fold enrichment; Fig. 5d, Supplementary Table 8, and Methods). This suggests a model where Tsc22d3 exerts its Th17-negative function as a transcriptional repressor that competes with Th17 positive regulators over binding sites, analogous to previous findings in CD4+ regulation29,43.
We combined a high-resolution transcriptional time course, novel methods to reconstruct regulatory networks, and innovative nanotechnology to perturb T cells, to construct and validate a network model for Th17 differentiation. The model consists of three consecutive, densely intra-connected networks, implicates 71 regulators (46 novel), and suggests substantial rewiring in 3 phases. The 71 regulators significantly overlap with genes genetically associated with inflammatory bowel disease44 (11 of 71, p<10-9). Building on this model, we systematically ranked 127 putative regulators (80 novel; Supplementary Table 4), and tested top ranking ones experimentally.
We found that the Th17 regulators are organized into two tightly coupled, self-reinforcing but mutually antagonistic modules, whose coordinated action may explain how the balance between Th17, Treg, and other effector T cell subsets is maintained, and how progressive directional differentiation of Th17 cells is achieved. Within the two modules are 12 novel factors (Fig. 4 and and5),5), which we further characterized, highlighting four of the factors (others are in Supplementary Note and Supplementary Fig. 13).
In a recent work, Ciofani et al.29 systematically ranked Th17 regulators based on ChIP-Seq data for known key factors and transcriptional profiles in wild type and knockout cells. While their network centered on known core Th17 TFs, our complementary approach perturbed many genes in a physiologically meaningful setting. Reassuringly, their core Th17 network significantly overlaps with our computationally inferred model (Supplementary Fig. 14).
The wiring of the positive and negative modules (Fig. 4 and and5)5) uncovers some of the functional logic of the Th17 program, but likely involve both direct and indirect interactions. Our functional model provides an excellent starting point for deciphering the underlying physical interactions with DNA binding profiles30 or protein-protein interactions (Wu et al.45, in this issue of Nature). The regulators we identified are compelling new targets for regulating the Th17/Tregs balance and for switching pathogenic Th17 into non-pathogenic ones.
C57BL/6 wild-type (wt), Mt−/−, Irf1−/−, Fas-/-, Irf4fl/fl, and Cd4Cre mice were obtained from Jackson Laboratory (Bar Harbor, ME). Stat1−/− and 129/Sv control mice were purchased from Taconic (Hudson, NY). IL-12rβ1−/− mice were provided by Dr. Pahan Kalipada from Rush University Medical Center. IL-17Ra−/− mice were provided by Dr. Jay Kolls from Louisiana State University/University of Pittsburgh. Irf8fl/fl mice were provided by Dr. Keiko Ozato from the National Institute of Health. Both Irf4fl/fl and Irf8fl/fl mice were crossed to Cd4Cre mice to generate Cd4CrexIrf4fl/fl and Cd4CrexIrf8fl/fl mice. All animals were housed and maintained in a conventional pathogen-free facility at the Harvard Institute of Medicine in Boston, MA (IUCAC protocols: 0311-031-14 (VKK) and 0609-058015 (AR)). All experiments were performed in accordance to the guidelines outlined by the Harvard Medical Area Standing Committee on Animals at the Harvard Medical School (Boston, MA). In addition, spleens from Mina-/- mice were provided by Dr. Mark Bix from St. Jude Children's Reseach Hospital (IACUC Protocol: 453). Pou2af1-/- mice were obtained from the laboratory of Dr. Robert Roeder46. Wild-type and Oct1-/- fetal livers were obtained at day E12.5 and transplanted into sub-lethally irradiated Rag1-/- mice as previously described47 (IACUC Protocol: 11-09003).
Cd4+ T cells were purified from spleen and lymph nodes using anti-CD4 microbeads (Miltenyi Biotech) then stained in PBS with 1% FCS for 20 min at room temperature with anti-Cd4-PerCP, anti-Cd62l-APC, and anti-Cd44-PE antibodies (all Biolegend, CA). Naïve Cd4+ Cd62lhigh Cd44low T cells were sorted using the BD FACSAria cell sorter. Sorted cells were activated with plate bound anti-Cd3 (2μg/ml) and anti-Cd28 (2μg/ml) in the presence of cytokines. For Th17 differentiation: 2ng/mL rhTGF-β1 (Miltenyi Biotec), 25ng/mL rmIl-6 (Miltenyi Biotec), 20ng/ml rmIl-23 (Miltenyi Biotec), and 20ng/ml rmIL-β1 (Miltenyi Biotec). Cells were cultured for 0.5 – 72 hours and harvested for RNA, intracellular cytokine staining, and flow cytometry.
Sorted naïve T cells were stimulated with phorbol 12-myristate 13-aceate (PMA) (50ng/ml, Sigma-aldrich, MO), ionomycin (1μg/ml, Sigma-aldrich, MO) and a protein transport inhibitor containing monensin (Golgistop) (BD Biosciences) for four hours prior to detection by staining with antibodies. Surface markers were stained in PBS with 1% FCS for 20 min at room temperature, then subsequently the cells were fixed in Cytoperm/Cytofix (BD Biosciences), permeabilized with Perm/Wash Buffer (BD Biosciences) and stained with Biolegend conjugated antibodies, i.e. Brilliant Violet 650™ anti-mouse IFN-γ (XMG1.2) and allophycocyanin–anti-IL-17A (TC11-18H10.1), diluted in Perm/Wash buffer as described48 (Fig. 5, Supplementary Fig. 11). To measure the time-course of RORγt protein expression, a phycoerythrin-conjugated anti- Retinoid-Related Orphan Receptor gamma was used (B2D), also from eBioscience (Supplementary Fig 4). FOXP3 staining for cells from knockout mice was performed with the FOXP3 staining kit by eBioscience (00-5523-00) in accordance with their “One-step protocol for intracellular (nuclear) proteins”. Data was collected using either a FACS Calibur or LSR II (Both BD Biosciences), then analyzed using Flow Jo software (Treestar)49,50.
Naïve T cells from knockout mice and their wild type controls were cultured as described above, their supernatants were collected after 72 h, and cytokine concentrations were determined by ELISA (antibodies for IL-17 and IL-10 from BD Bioscience) or by cytometric bead array for the indicated cytokines (BD Bioscience), according to the manufacturers' instructions (Fig. 5, Supplementary Fig. 11).
Naïve T cells were isolated from WT mice, and treated with IL-6 and TGF-β1. Affymetrix microarrays HT_MG-430A were used to measure the resulting mRNA levels at 18 different time points (Fig. 1b). Cells treated with IL-6, TGF-β1, and IL-23 were profiled at last four time points (48 - 72hr). As control, we used time- and culture-matched WT naïve T cells stimulated under Th0 conditions. Biological replicates were measured in eight of the eighteen time points (1hr, 2hr, 10hr, 20hr, 30hr, 42hr, 52hr, 60hr) with high reproducibility (r2>0.98). For further validation we compared the differentiation time course to published microarray data of Th17 cells and naïve T cells51 (Supplementary Fig. 1c). In an additional dataset naïve T cells were isolated from WT and Il23r−/− mice, and treated with IL-6, TGF-β1 and IL-23 and profiled at four different time points (49hr, 54hr, 65hr, 72hr). Expression data was preprocessed using the RMA algorithm followed by quantile normalization52.
Differentially expressed genes (comparing to the Th0 control) were found using four methods: (1) Fold change. Requiring a 2-fold change (up or down) during at least two time points. (2) Polynomial fit. We used the EDGE software53,54, designed to identify differential expression in time course data, with a threshold of q-value≤0.01. (3) Sigmoidal fit. We used an algorithm similar to EDGE while replacing the polynomials with a sigmoid function, which is often more adequate for modeling time course gene expression data55. We used a threshold of q-value≤0.01. (4) ANOVA. Gene expression is modeled by: time (using only time points for which we have more than one replicate) and treatment (“TGF-β1+IL-6” or “Th0”). The model takes into account each variable independently, as well as their interaction. We report cases in which the p-value assigned with the treatment parameter or the interaction parameter passed an FDR threshold of 0.01.
Overall, we saw substantial overlap between the methods (average of 82% between any pair of methods). We define the differential expression score of a gene as the number of tests that detected it. As differentially expressed genes, we report cases with differential expression score >3.
For the Il23r−/−time course (compared to the WT T cells) we used methods 1–3 (above). Here we used a fold change cutoff of 1.5, and report genes detected by at least two tests.
We considered several ways for grouping the differentially expressed genes, based on their time course expression data: (1) For each time point, we defined two groups: (a) all the genes that are over-expressed and (b) all the genes that are under-expressed relative to Th0 cells (see below); (2) For each time point, we defined two groups: (a) all the genes that are induced and (b) all the genes that are repressed, comparing to the previous time point; (3) K-means clustering using only the Th17 polarizing conditions. We used the minimal k, such that the within-cluster similarity (average Pearson correlation with the cluster's centroid) was higher than 0.75 for all clusters; and, (4) K-means clustering using a concatenation of the Th0 and Th17 profiles.
For methods (1, 2), to decide whether to include a gene, we considered its original mRNA expression profiles (Th0, Th17), and their approximations as sigmoidal functions55 (thus filtering transient fluctuations). We require that the fold change levels (compared to Th0 (method 1) or to the previous time point (method 2)) pass a cutoff defined as the minimum of the following three values: (1) 1.7; (2) mean + std of the histogram of fold changes across all time points; or (3) the maximum fold change across all time points. The clusters presented in Fig. 1b were obtained with method 4. The groupings from methods (1, 2, and 4) are provided in Supplementary Table 2.
We identified potential regulators of Th17 differentiation by computing overlaps between their putative targets and sets of differentially expressed genes grouped according to methods 1-4 above. We assembled regulator-target associations from several sources: (1)in vivo DNA binding profiles (typically measured in other cells) of 298 transcriptional regulators12-17; (2) transcriptional responses to the knockout of 11 regulatory proteins 6,43,49,56-60; (3) additional potential interactions obtained by applying the Ontogenet algorithm (Jojic et al., under review; regulatory model available at: http://www.immgen.org/ModsRegs/modules.html) to data from the mouse ImmGen consortium (http://www.immgen.org; January 2010 release19), which includes 484 microarray samples from 159 cell subsets from the innate and adaptive immune system of mice; (4) a statistical analysis of cis-regulatory element enrichment in promoter regions 18,61; and, (5) the TF enrichment module of the IPA software (http://www.ingenuity.com/). For every TF in our database, we computed the statistical significance of the overlap between its putative targets and each of the groups defined above using a Fisher's exact test. We include cases where p< 5*10-5 and the fold enrichment > 1.5.
Each edge in the regulatory network was assigned a time stamp based on the expression profiles of its respective regulator and target nodes. For the target node, we considered the time points at which a gene was either differentially expressed or significantly induced or repressed with respect to the previous time point (similarly to grouping methods 1 and 2 above). We defined a regulator node as ‘absent’ at a given time point if: (i) it was under expressed compared to Th0; or (ii) the expression is low (<20% of the maximum value in time) and the gene was not over-expressed compared to Th0; or, (iii) up to this point in time the gene was not expressed above a minimal expression value of 100. As an additional constraint, we estimated protein expression levels using the model from Ref. 20 and using a sigmoidal fit55 for a continuous representation of the temporal expression profiles, and the ProtParam software62 for estimating protein half-lives. We require that, in a given time point, the predicted protein level be no less than 1.7 fold below the maximum value attained during the time course, and not be less than 1.7 fold below the Th0 levels. The timing assigned to edges inferred based on a time-point specific grouping (grouping methods 1 and 2 above) was limited to that specific time point. For instance, if an edge was inferred based on enrichment in the set of genes induced at 1hr (grouping method #2), it will be assigned a “1hr” time stamp. This same edge could then only have additional time stamps if it was revealed by additional tests.
The selection of the 275-gene signature (Supplementary Table 5, 6) combined several criteria to reflect as many aspect of the differentiation program as was possible. We defined the following requirements: (1) the signature must include all of the TFs that belong to a Th17 microarray signature (comparing to other CD4+ T cells51, see Supplementary Methods); that are included as regulators in the network and have a differential expression score>1; or that are strongly differentially expressed (differential expression score=4); (2) it must include at least 10 representatives from each cluster of genes that have similar expression profiles (using clustering method (4) above); (3) it must contain at least 5 representatives from the predicted targets of each TF in the different networks; (4) it must include a minimal number of representatives from each enriched Gene Ontology (GO) category (computed across all differentially expressed genes); and, (5) it must include a manually assembled list of ~100 genes that are related to the differentiation process, including the differentially expressed cytokines, cytokine receptors and other cell surface molecules. Since these different criteria might generate substantial overlaps, we used a set-cover algorithm to find the smallest subset of genes that satisfies all of five conditions. We added to this list 18 genes whose expression showed no change (in time or between treatments) in the microarray data.
The 86-gene signature (used for the Fluidigm BioMark qPCR assay) is a subset of the 275-gene signature, selected to include all the key regulators and cytokines discussed. We added to this list 10 control genes (Supplementary Table 5).
We used an unbiased approach to rank candidate regulators – transcription factor or chromatin modifier genes – of Th17 differentiation. Our ranking was based on the following features: (a) whether the gene encoding the regulator belonged to the Th17 microarray signature (comparing to other CD4+ T cells51, see Supplementary Methods); (b) whether the regulator was predicted to target key Th17 molecules (IL-17, lL-21, IL23r, and ROR-γt); (c) whether the regulator was detected based on both perturbation and physical binding data from the IPA software (http://www.ingenuity.com/); (d) whether the regulator was included in the network using a cutoff of at least 10 target genes; (e) whether the gene encoding for the regulator was significantly induced in the Th17 time course – we only consider cases where the induction happened after 4 hours to exclude non-specific hits; (f) whether the gene encoding the regulator was differentially expressed in response to Th17-related perturbations in previous studies. For this criterion, we assembled a database of transcriptional effects in perturbed Th17 cells, including: knockouts of Batf56, ROR-γt (Xiao et al., unpublished), Hif1a57, Stat3 and Stat543,63, Tbx21 (Awasthi et al., unpublished), IL23r (this study), and Ahr59. We also included data from the Th17 response to Digoxin64 and Halofuginone65, as well as information on direct binding by ROR-γt as inferred from ChIP-seq data (Xiao et al., unpublished). Our analysis of the published expression data sets is described in the Supplementary Methods. For each regulator, we counted the number of conditions in which it came up as a significant hit (up/down-regulated or bound); for regulators with 2 to 3 hits (quantiles 3 to 7 out of 10 bins), we then assign a score of 1; for regulators with more than 3 hits (quantiles 8-10), we assign a score of 2 (a score of 0 is assigned otherwise); and, (g) the differential expression score of the gene in the Th17 time course.
We ordered the regulators lexicographically by the above features according to the order: a, b, c, d, (sum of e and f), g - that is, first sort according to a then break ties according to b, and so on. We exclude genes that are not over-expressed during at least one time point. As an exception, we retained predicted regulators (feature d) that had additional external validation (feature f). To validate this ranking, we used a supervised test: we manually annotated 74 regulators that were previously associated with Th17 differentiation. All of the features are highly specific for these regulators (p<10-3). Moreover, using a supervised learning method (Naïve Bayes), the features provided good predictive ability for the annotated regulators (accuracy of 71%, using 5-fold cross validation), and the resulting ranking was highly correlated with our unsupervised lexicographic ordering (Spearman correlation > 0.86).
We adapted this strategy for ranking protein receptors. To this end, we excluded feature c and replaced the remaining “protein-level” features (b and d) with the following definitions: (b) whether the respective ligand is induced during the Th17 time course; and, (d) whether the receptor was included as a target in the network using a cutoff of at least 5 targeting transcriptional regulators.
4 × 4 mm silicon nanowire (NW) substrates were prepared and coated with 3 μL of a 50 μM pool of four siGENOME siRNAs (Dharmcon) in 96 well tissue culture plates, as previously described10. Briefly, 150,000 naïve T cells were seeded on siRNA-laced NWs in 10 μL of complete media and placed in a cell culture incubator (37°C, 5% CO2) to settle for 45 minutes before full media addition. These samples were left undisturbed for 24 hours to allow target transcript knockdown. Afterward, siRNA-transfected T cells were activated with αCd3/Cd28 dynabeads (Invitrogen), according to the manufacturer's recommendations, under Th17 polarization conditions (TGF-β1 & IL-6, as above). 10 or 48hr post-activation, culture media was removed from each well and samples were gently washed with 100 μL of PBS before being lysed in 20 μL of buffer TCL (Qiagen) supplemented with 2-mercaptoethanol (1:100 by volume). After mRNA was harvested in Turbocapture plates (Qiagen) and converted to cDNA using Sensiscript RT enzyme (Qiagen), qRT-PCR was used to validate both knockdown levels and phenotypic changes relative to 8-12 non-targeting siRNA control samples, as previously described66. A 60% reduction in target mRNA was used as the knockdown threshold. In each knockdown experiment, each individual siRNA pool was run in quadruplicate; each siRNA was tested in at least three separate experiments (Supplementary Fig. 9).
We used the nCounter system are presented in full in Geiss et al.67 to measure a a custom CodeSet constructed to detect a total of 293 genes, selected as described above. We also used the Fluidigm BioMark HD system to measure a smaller set of 96 genes. Finally, we used RNA-Seq to follow up and validate 12 of the perturbations. Details of the experimental and analytical procedures of these analyzes are provided in the Supplementary Methods.
ChIP-seq for Tsc22d3 was performed as previously described68 using an antibody from Abcam. The analysis of this data was performed as previously described 7 and is detailed in the Supplementary Methods.
The functional network in Fig. 4b consists of two modules: positive and negative. We compute two indices: (1)within-module index: the percentage of positive edges between members of the same module (i.e., down-regulation in knockdown/knockout); and, (2) between-module index: the percentage of negative edges between members of the same module that are negative. We shuffled the network 1,000 times, while maintaining the nodes' out degrees (i.e., number of outgoing edges) and edges' signs (positive/negative), and re-computed the two indices. The reported p-values were computed using a t-test.
We thank Leslie Gaffney and Lauren Solomon for artwork and the Broad's Genomics Platform for sequencing. Daneen Kozoriz for cell sorting. Work was supported by NHGRI (1P50HG006193-01, HP, AR), NIH Pioneer Awards (5DP1OD003893-03 to HP, DP1OD003958-01 to AR), NIH (NS 30843, NS045937, AI073748 and AI45757 to VKK), National MS Society (RG2571 to VKK), HHMI (AR), and the Klarman Cell Observatory (AR).
Authors Contribution: NY, AKS, JTG, HP, VKK, and AR conceived the study and designed experiments. NY developed computational methods. NY, AKS, JTG analyzed the data. AKS, JTG, HJ, YL, AA, CW. KK, SX, MJ, DG, RS, DYL, and JJT conducted the experiments. AS, MRP, PJR, MLC, MB, DT provided knockout mice. NY, AKS, JTG, VKK, HP and AR wrote the paper with input from all the authors.
Author Information: The microarray, RNA-seq and ChIP-seq data sets have been deposited in the Gene Expression Omnibus database under accession number GSE43955, GSE43969, GSE43948, GSE43949. Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of the paper.