|Home | About | Journals | Submit | Contact Us | Français|
Plant biomass degradation by fungal-derived enzymes is rapidly expanding in economic importance as a clean and efficient source for biofuels. The ability to rationally engineer filamentous fungi would facilitate biotechnological applications for degradation of plant cell wall polysaccharides. However, incomplete knowledge of biomolecular networks responsible for plant cell wall deconstruction impedes experimental efforts in this direction.
To expand this knowledge base, a detailed network of reactions important for deconstruction of plant cell wall polysaccharides into simple sugars was constructed for the filamentous fungus Neurospora crassa. To reconstruct this network, information was integrated from five heterogeneous data types: functional genomics, transcriptomics, proteomics, genetics, and biochemical characterizations. The combined information was encapsulated into a feature matrix and the evidence weighted to assign annotation confidence scores for each gene within the network. Comparative analyses of RNA-seq and ChIP-seq data shed light on the regulation of the plant cell wall degradation network, leading to a novel hypothesis for degradation of the hemicellulose mannan. The transcription factor CLR-2 was subsequently experimentally shown to play a key role in the mannan degradation pathway of N. crassa.
Here we built a network that serves as a scaffold for integration of diverse experimental datasets. This approach led to the elucidation of regulatory design principles for plant cell wall deconstruction by filamentous fungi and a novel function for the transcription factor CLR-2. This expanding network will aid in efforts to rationally engineer industrially relevant hyper-production strains.
The online version of this article (doi:10.1186/s13068-017-0901-2) contains supplementary material, which is available to authorized users.
Plant biomass, primarily composed of lignocellulose, is a renewable and environmentally clean energy source, and a promising feedstock for the production of next-generation biofuels and specialty chemicals [1–3]. A principal barrier to economical production of biofuels is the high production cost of biomass depolymerization enzymes . Filamentous fungi are among the most efficient degraders of lignocellulosic biomass in nature and play a key role in carbon recycling [5, 6]. Industrially relevant strains, such as Trichoderma reesei, were constructed through multiple rounds of random mutagenesis and can secrete over 100 g/L of protein [7, 8]. However, rationally engineering strains of filamentous fungi to further enhance the secretion of enzymes is a major challenge in bioenergy research . To meet this challenge and aid future experimental efforts, a system-level understanding of plant cell wall deconstruction by filamentous fungi is necessary [5, 10].
The model filamentous fungus Neurospora crassa has well-developed genetics, biochemistry, molecular biology, and a well-annotated genome [11–14]. In nature, N. crassa colonizes freshly burnt plant biomass and shows robust growth on lignocellulose [5, 15–19]. The suite of experimental resources available for N. crassa makes it an ideal model system for bioenergy-related research, particularly for the elucidation of plant cell wall deconstruction mechanisms and nutrient sensing [5, 16–20]. Research on N. crassa contributed to the discovery of a new class of enzymes called lytic polysaccharide monooxygenases (LPMOs), which greatly increase synergy in cellulose degradation. In addition, novel cellodextrin transporters from N. crassa were utilized to engineer improved yeast strains for sugar fermentation [21–24]. A network reconstruction encompassing the present knowledge of metabolic reactions, enzymes, and associated genes in N. crassa dedicated to the deconstruction of plant cell wall polysaccharides into simple fermentable sugars will further expedite experimental efforts.
The availability of fully sequenced genomes and accumulated wealth of biochemical evidence led to the reconstruction of genome-scale and manually curated metabolic networks for more than 50 organisms across the three domains of life [25, 26]. These genome-scale metabolic networks have been widely analyzed using constraint-based modeling methods to predict the response to environmental and genetic perturbations [27, 28]. Notably, only a few curated genome-scale metabolic reconstructions have been built for filamentous fungi [29–34]. While a manually curated genome-scale metabolic network for N. crassa exists , this reconstruction and those built for other ascomycete fungi [29–34] are limited by significant knowledge gaps, specifically pathways for the degradation and utilization of plant cell wall polysaccharides.
To overcome this limitation, we built a detailed network of biochemical reactions important for the degradation of plant cell wall polysaccharides into simple fermentable sugars in N. crassa (Fig. 1; Additional file 1). Plant cell walls are largely composed of complex polysaccharides that include cellulose, hemicellulose, and pectin [5, 35–40]. Cellulose is the most abundant plant cell wall polysaccharide and is an unbranched structure composed of linear chains of β-1,4-linked d-glucose residues. The second most abundant is the heterogeneous group of hemicelluloses, composed of several branched polymers, including xylan, xyloglucan, mannan, and mixed-linkage glucan. Pectin is a minor constituent of mature plant cell walls but the most complex heteropolysaccharide. Its main constituents are homogalacturonan, xylogalacturonan, and rhamnogalacturonan I.
The plant cell wall degradation network (PCWDN) reconstruction and annotation pipeline described here involved the integration of five heterogeneous data types: functional genomics, transcriptomics, proteomics, genetics, and biochemical information, along with extensive manual curation based on more than 130 research articles (Fig. 2; Additional file 1). The combined annotation information was encapsulated in a feature matrix, which was used to assign annotation confidence scores to PCWDN genes. Comparative analysis of RNA sequencing (RNA-seq)-based global transcriptome profiles underlined the importance of PCWDN genes for adaptations to different plant cell wall polysaccharides. Subsequent analyses of RNA-seq and ChIP-seq data within the context of the N. crassa PCWDN provided novel insights into the roles of key transcription factors (TFs) in the deconstruction of plant biomass, which were tested here. The N. crassa PCWDN will serve as a scaffold for the integration and systems analyses of diverse experimental data, helping elucidate the regulatory design principles underlying plant cell wall deconstruction by filamentous fungi.
We assembled an initial list of biochemical reactions and associated genes in the PCWDN v0.0 of N. crassa by combining information on predicted enzymes and transporters involved in the degradation of plant cell wall polysaccharides from the following sources: the Carbohydrate-Active enZYmes database (CAZY) , the N. crassa e-Compendium , the genome annotation for N. crassa OR74A [12, 13], and TransportDB  (“Methods”; Fig. 2). Specifically, 110 out of 231 CAZY genes predicted to encode carbohydrate-active enzymes in the genome were included in the PCWDN v0.0 (Additional file 2). The remaining 121 CAZY genes mainly belong to families of enzymes active on chitin and chitosan and thus are not likely to be involved in the PCWDN, but rather in remodeling the fungal cell wall (Additional file 2).
The annotation of PCWDN genes in the above databases has not been updated with data on plant cell wall deconstruction by N. crassa. For example, the current OR74A genome annotation is unable to differentiate between cellulolytic LPMOs, hemicellulose-active LPMOs, and starch-active LPMOs [44–50]. Thus, we performed extensive literature-based manual curation involving more than 130 research articles (Additional file 1) to fill the knowledge gaps in the initial PCWDN v0.0 and compiled multi-level supporting evidence as described below from five heterogeneous data types: functional genomics, transcriptomics, proteomics, genetics, and biochemical characterizations, to annotate genes in the final PCWDN v1.0 of 202 reactions and 168 genes (Fig. 2; Additional file 1). The 202 reactions in the final PCWDN of N. crassa were further subdivided into 101 extracellular reactions, 35 transport reactions, and 66 intracellular reactions (Additional file 1).
An important annotation feature of PCWDN enzymes is their predicted subcellular localization. For example, the hydrolysis of cellodextrins into d-glucose by β-glucosidases can occur in the extracellular space or in the intracellular space (Additional file 1). Of the 168 PCWDN genes, products of 103, 19, and 46 genes are associated with extracellular, transport, and intracellular reactions, respectively. We used SignalP  and Phobius  to predict the presence of signal peptides in PCWDN proteins to determine if they were destined toward the secretory pathway (“Methods”). We found that 89 out of the 103 gene products (~86%) associated with extracellular reactions were predicted to have a signal peptide by at least one of the two tools, while no gene products associated with transport or intracellular reactions were predicted to have a signal peptide by either of the two tools (Additional file 3). WoLF PSORT  and ProtComp were also used to predict subcellular localization of proteins (“Methods”). Predictions from at least one of the two tools matched the assigned localization for 90 out of the 103 gene products (~87%) associated with extracellular reactions, while the predictions from at least one of the two tools matched the assigned localization for all gene products associated with transport or intracellular reactions in the PCWDN (Additional file 3).
Using the compiled functional genomics-based information (Additional file 3), we assessed the annotation support for the 168 PCWDN members. Specifically, functional genomics information was considered to support the annotation of a PCWDN enzyme if the following three conditions were satisfied (Fig. 3): (i) gene annotation in CAZY database  or Broad OR74A genome [12, 13] or TransportDB  matches the assigned biochemical function in the network; (ii) SignalP or Phobius predicts the presence of a signal peptide in extracellular (secreted) enzymes and the absence of a signal peptide in intracellular enzymes; (iii) subcellular localization predictions from WoLF PSORT or ProtComp match the assigned localization in the network. Based on this definition, we obtained functional genomics-based annotation support for 145 of the 168 PCWDN genes (Additional file 3).
To augment the annotation of PCWDN genes, we used RNA-seq data and compared the transcriptional profiles of N. crassa WT strain (FGSC 2489 ; “Methods”) grown under different carbon source regimes corresponding to the different carbohydrates that make up the plant cell wall. Previous studies [16, 19] generated RNA-seq data from shift experiments, in which a 16-h-old culture of N. crassa WT was shifted for 4 h to minimal media with no carbon (NoC) source or one of five carbon sources: sucrose, cellobiose (CB), Avicel (microcrystalline cellulose), xylan, pectin, or orange peel powder (OPP, a pectin-rich substrate) (“Methods”; Additional file 4). We replicated this experimental design, generating RNA-seq data for four additional carbon sources: xyloglucan (XG), mannan, mixed-linkage glucan (MLG), and starch (“Methods”; Additional file 4). This approach ensured that a comparative analysis of transcriptional profiles could be performed between all tested plant cell wall polysaccharides. A pipeline consisting of standard software was used to analyze the RNA-seq data and identify differentially expressed genes between the nine different conditions corresponding to different polysaccharides and the two controls corresponding to NoC and sucrose conditions (“Methods”; Additional file 4).
Based on the compiled RNA-seq data, we generated transcriptomics-based annotation support for the PCWDN genes (Fig. 3; Additional file 4). We considered RNA-seq data to support the annotation of a gene involved in the degradation of a specific polysaccharide if its transcript was significantly differentially expressed and upregulated in the relevant carbon source as compared to the NoC control. For example, RNA-seq data supported the annotation of the gene gh6-3 (NCU07190), predicted to encode an exo-β-1,4-glucanase (cellobiohydrolase) involved in cellulose degradation, as this gene is upregulated on Avicel in comparison to the NoC control (Additional file 4). We obtained transcriptomics-based annotation support for 106 of the 168 PCWDN genes (Additional file 4).
To further enrich the PCWDN and provide an added layer of confidence in the annotation, we compiled proteomics data from previous N. crassa studies [15, 17–19, 54–57] (Additional file 5). In these studies, the secretomes from N. crassa grown on sucrose, Avicel, xylan, pectin, OPP, and NoC had been characterized using a shotgun proteomics approach [15, 17–19, 54, 55] or by quantitative proteomics  (Additional file 5). The compiled data were used to generate proteomics-based support for the annotation of PCWDN genes. Proteomics data supported the annotation of an enzyme if the protein was detected in the secretome or displayed an increased abundance in a carbon source as compared to the NoC control (Fig. 3). For example, proteomics data further supported the annotation of gh6-3, as the encoded protein was detected in the secretome and also increased in abundance on Avicel compared to the NoC control (Additional file 5). In this way, we obtained proteomics-based annotation support for 68 of the 168 PCWDN genes (Additional file 5). In cases where an enzyme in the PCWDN was detected in a secretome, but was not predicted to be secreted using functional genomics tools, the proteomics-based evidence was considered more reliable and given priority. This approach led to the re-annotation of 4 PCWDN genes: gla-1(NCU01517), gh6-3 (NCU07190), gh5-7 (NCU08412), and ce5-2 (NCU09663).
Next, we mined published literature to compile an experimentally verified dataset of N. crassa deletion strains for PCWDN genes with a growth-deficient phenotype as compared to the parental WT strain (FGSC 2489) (Additional file 6). This dataset was used to assign genetics-based annotation support to PCWDN enzymes. For example, the genetics-based dataset supported the annotation of the gene gh10-2 (NCU08189) as an endo-β-1,4-xylanase involved in xylan degradation, since the deletion strain for this gene exhibited a growth-deficient phenotype on xylan  (Additional file 6). Overall, genetics-based information supported the annotation of 19 out of the 168 PCWDN genes (Additional file 6).
Lastly, we mined the literature for biochemical data to support the annotation of PCWDN enzymes (Fig. 2). Gene products for 33 out of the 168 PCWDN genes have been biochemically characterized in N. crassa [20–24, 46–50, 58–68] (Additional file 7). An additional extensive literature search was performed to determine if ortholog/paralogs of N. crassa PCWDN genes had been biochemically characterized in other filamentous fungi [39, 69, 70] (Additional file 7). We used OrthoMCL [71, 72] to determine the orthology/paralogy of PCWDN genes in other filamentous fungi (“Methods”). In this way, biochemical-based annotation support was obtained for 113 out of the 168 PCWDN genes (Additional file 7).
The combined annotation information for PCWDN genes from the five different data types was captured in a feature matrix (Fig. 3; Additional file 1). We next devised a simple method based on the feature matrix to assign annotation confidence scores to PCWDN genes. A differential weighting system was used to account for the different levels of confidence associated with the annotation information for each of the five heterogeneous data types with each annotation level superseded by the next (Fig. 3). Annotation support from biochemical characterizations was given the highest level of confidence (factor of 16), followed by published mutant phenotypes (factor of 8), proteomic data (factor of 4), transcriptomic data (factor of 2), and functional genomics-based predictions (factor of 1). To obtain an overall annotation confidence score in the range of 0–1, the weighted sum of evidence support from the five data types was normalized by dividing by the maximal possible score of 31 (Fig. 3). Note that the chosen factors in the differential weighting system are the simplest possible that enable mathematical mapping from diverse evidence support values to a unique annotation confidence score. Moreover, the chosen factors are such that the contribution from a given data type to the confidence score is always more than the combined contribution from all other data types with less confidence. That is, a gene with annotation support from only biochemical characterization has a higher confidence score than a gene with combined annotation support from genetics, proteomics, transcriptomics, and functional genomics-based information. Additional file 1 lists the annotation confidence scores for the 168 PCWDN genes, and Fig. 3 presents the confidence scores for the genes belonging to CAZY class AA9 (Auxiliary Activity Family 9) encoding LPMOs involved in cellulose degradation [44, 47, 50]. For this particular case, the majority of genes encoding AA9 LPMOs within class 1 (3 out of 5 genes) and class 2 (2 out of 3 genes) are well characterized, while this is true for only 1 out of 6 genes encoding class 3 AA9 LPMOs.
To better define the genome-wide response of N. crassa to plant cell wall substrates, we analyzed combined RNA-seq data for possible correlations between the response patterns for various carbon sources. Figure 4a shows the level of correlation across the whole transcriptome for each pair of growth conditions. Hierarchical clustering of the pairwise correlation matrix across carbon sources revealed four main clusters. Avicel (cellulose) and MLG (a hemicellulose component) were in the first cluster with highly correlated gene expression patterns, likely due to the fact that both polysaccharides have a backbone that is rich in β-1,4-linked d-glucose residues. Similarly, sucrose and starch conditions fell into a second cluster, while xylan and pectin conditions were in a third cluster. This analysis also identified a fourth cluster where the gene expression in two hemicellulose components, mannan and XG, were very similar to the NoC starvation condition. In support of these observations, the N. crassa WT strain was observed to grow poorly on both mannan and XG. Principal component analysis of expression data under the different conditions gave a similar result (Fig. 4b), where the first two principal components together explained more than 79% of the total variance.
For a more detailed comparison of the N. crassa transcriptional response upon exposure to different plant cell wall polysaccharides, we identified the regulons for each polysaccharide using the NoC and sucrose conditions as controls. Following Benz et al. , the regulon (or up-regulon) for a given growth condition was defined as genes that were upregulated and differentially expressed in relation to both controls. Based on this definition, the regulons for Avicel, xylan, XG, mannan, MLG, pectin, and starch were determined to contain 153, 180, 138, 112, 188, 323, and 363 genes, respectively (horizontal bar plot in Fig. 4c; Additional file 4; for the corresponding down-regulons, see Additional files 4, 8). Although starch was determined to have the largest regulon with 363 genes, it was fairly distinct among the regulons with 294 unique genes (81%). By comparison, mannan was 47% unique, followed by pectin (39%), MLG (28%), xylan (16%), XG (15%), and Avicel (11%). The overlap between each of the seven regulons was determined, and the vertical plot in Fig. 4c shows all regulon comparisons with an overlap of 5 or more genes. Interestingly, XG and pectin regulons contained the highest overlap with 27 genes in common, followed by the Avicel regulon that overlapped most highly with MLG (24 genes), xylan with pectin (21 genes), and mannan with starch (14 genes). All 26 intersection sets from this analysis were subjected to functional category analysis based on gene annotations in FunCatDB  (Additional file 9). Most of the intersection sets involving either Avicel, xylan, XG, mannan, MLG, or pectin regulons were enriched in metabolic genes (“Metabolism and Energy”).
We next computed the relative abundance of the 168 PCWDN genes within each of the seven polysaccharide regulons and the probability of their enrichment. For Avicel, 66 PCWDN genes were represented in the regulon of 153 genes, a 43% enrichment (p < 10−78). The xylan regulon contained 68 PCWDN genes (37% enrichment; p < 10−76), the mannan regulon contained 13 PCWDN genes (29%; p < 10−7), the XG regulon contained 41 PCWDN genes (43%; p < 10−39), the MLG regulon contained 62 PCWDN genes (32%; p < 10−64), the pectin regulon contained 74 PCWDN genes (22%; p < 10−65), and the starch regulon contained 14 PCWDN genes (3.8%; p < 10−2) (Fig. 4c). The observed overlap between PCWDN genes and regulons was statistically highly significant, since the PCWDN genes account for only 1.7% of genes (168 out of 9758) within the N. crassa genome. Thus, our PCWDN genes capture a substantial (and relevant) part of the transcriptional response of N. crassa upon exposure to Avicel (cellulose), xylan, mannan, XG, MLG, and pectin.
Comparative global transcriptional analyses underlined the importance of PCWDN genes for adaptation to different plant cell wall polysaccharides. We next performed hierarchical clustering  of the RNA-seq data for the 168 PCWDN genes in all nine conditions (Avicel, xylan, XG, mannan, MLG, pectin, starch, sucrose, and NoC). The conditions (x-axis in Fig. 5) clustered similarly to what was obtained from correlation and principal component analysis of the genome-wide transcriptome profiles presented in Fig. 4a, b. The genes (y-axis in Fig. 5) were grouped into six major clusters (“Methods”; Additional file 10). The 33 genes in the largest cluster were highly expressed under Avicel and MLG as compared to other conditions. Notably, this cluster contains most of the endo-β-1,4-glucanases, exo-β-1,4-glucanases, β-glucosidases, AA9 LPMOs, and the identified cellodextrin transporters. The second cluster contained 15 genes, showed the highest expression on xylan, and consisted of genes encoding endo-xylanases, β-xylosidases, acetyl xylan esterases, xylose reductase, xylulokinase, and xylitol dehydrogenase. The 10 genes in the third cluster were most strongly expressed on pectin and included genes encoding pectin methyl esterases, a rhamnogalacturonan lyase, a rhamnogalacturonan acetyl esterase, an endo-β-1,4-galactanase, and a β-galactosidase. The fourth cluster composed of 27 genes showed transcriptional activity on both xylan and pectin and consisted mainly of genes encoding β-xylosidases, exo- and endo-arabinanases, α-arabinosidases, β-galactosidases, and genes involved in l-rhamnose and d-galacturonic acid metabolism. The fifth cluster composed of 6 genes with broader expression pattern contained four sugar transporters, and the final cluster composed of 29 genes contained those factors most strongly induced during starvation (excluding mannan and XG).
The AA9 LPMOs, previously annotated as CAZY class GH61, degrade cellulose by oxidative cleavage [44, 46, 47, 49]. Most AA9 LMPOs were highly expressed on Avicel and MLG and their expression was correlated across the two conditions (Additional file 11). In comparison to Avicel and MLG, the expression of 10 of the 14 LPMOs was much lower on XG and negligible on starch. Based on these observations, we concluded that these genes are specifically induced by non-substituted β-d-glucans and hypothesize that LPMOs in AA9 CAZY class also act on MLG, which is now included in the PCWDN of N. crassa (Additional file 1).
A recent study  suggested that the product of AA9 LPMO gene NCU02916 can also act on XG. However, NCU02916 was not induced when grown on XG, and the N. crassa WT strain (FGSC 2489) grew poorly on XG, with the transcriptome profile highly correlated with that of the NoC control (Fig. 4a). Thus, it is difficult to rule out the possibility that the AA9 LPMOs also act on XG, and we therefore included such reactions in the PCWDN of N. crassa (Additional file 1). However, another recent study  characterized a starch-specific LPMO, NCU08746, and this gene is classified in the new CAZY class AA13 . Consistent with this finding, NCU08746 was differentially upregulated under starch conditions (Additional file 4), and thus NCU08746 was assigned to starch-specific LPMO reaction in the PCWDN (Additional file 1).
Previous research [16, 17] on plant cell wall deconstruction by N. crassa led to the identification of two essential transcription factors (TFs), CLR-1 (NCU07705) and CLR-2 (NCU08042), for cellulose utilization, and one essential TF, XLR-1 (NCU06971), for utilization of xylan. All three TFs are conserved across ascomycete fungi [16, 75–80]. Using next-generation sequencing of chromatin-immunoprecipitated DNA (ChIP-seq), a recent study  identified binding regions for CLR-1, CLR-2, and XLR-1 across the N. crassa genome under sucrose, Avicel, xylan, and NoC conditions. CLR-1, CLR-2, and XLR-1 bound to the regulatory regions of 293, 164, and 84 genes, respectively, in the N. crassa genome . Integrating this information, we determined that CLR-1, CLR-2, and XLR-1 bound the promoter regions of 27 (p < 10−12), 37 (p < 10−30), and 20 (p < 10−17) genes, respectively, within the 168 PCWDN genes (Fig. 6). Of the 27 PCWDN genes regulated by CLR-1, 21 are involved in cellulose utilization, including endo-β-1,4-glucanases, AA9 LPMOs, exo-β-1,4-glucanases, β-glucosidases, or cellodextrin transporters, while 17 out of 20 PCWDN genes regulated by XLR-1 are involved in xylan utilization including endo-β-1,4-xylanases, β-xylosidases, β-arabinofuranosidases, or xylodextrin transporters  (Fig. 6). However, in the case of CLR-2, only 23 out of 37 PCWDN genes directly regulated by the TF are involved in cellulose utilization, while 12 other PCWDN genes are involved in xylan or mannan utilization  (Fig. 6). These observations suggest that CLR-2, unlike CLR-1 or XLR-1, has a broader regulatory role in plant cell wall deconstruction, and which is not limited to cellulose utilization.
CLR-1 is known to directly regulate clr-2 . Intriguingly, we found that 19 out of 27 PCWDN genes directly regulated by CLR-1 are also directly regulated by CLR-2 (Fig. 6). These observations indicate that CLR-1 and CLR-2 predominantly employ feed-forward loops (FFLs) [82–86] to regulate the cellulose utilization pathway within the PCWDN of N. crassa.
Our comparative transcriptome analysis revealed that the pattern of gene expression of N. crassa WT exposed to mannan was highly correlated to the NoC starvation condition. In contrast to the cellulose and xylan utilization pathways, the N. crassa genome has a mostly non-redundant mannan utilization pathway, with just one predicted endomannanase, gh5-7 (NCU08412), and one intracellular β-mannosidase, gh2-1 (NCU00890) (Fig. 7a; Additional file 1). Note that a novel β-1,4-mannanase belonging to the CAZY family GH134 recently found in Aspergillus nidulans  has no close homolog in N. crassa. Based on ChIP-seq data , CLR-2 directly regulates both genes, while CLR-1 directly regulates only gh5-7 and XLR-1 does not directly regulate any genes in the mannan pathway (Fig. 6). Additionally, gh5-7 and gh2-1 were induced on mannan, albeit to a much lower extent than on Avicel or MLG, a context where these enzymes should not be directly required (Fig. 7b). By contrast, the clr-1 and clr-2 genes were highly expressed in Avicel and MLG conditions, but not induced by mannan (Fig. 7b). These observations led us to hypothesize that CLR-1 and CLR-2 may be important for the induction of gh5-7, constituting the first step in the mannan utilization pathway.
To test this hypothesis, we first designed experiments to confirm that GH5-7, as the only predicted endomannanase in N. crassa, is critical for mannan utilization. As the deletion strain for gh5-7 was not available, we constructed a Δgh5-7 strain (ΔNCU08412) following standard procedures (; “Methods”). Since pure mannan acts as a poor inducer (Fig. 8d), we used Konjac glucomannan instead of pure mannan to test our hypothesis. As predicted, the Δgh5-7 mutant was found to have a strong growth phenotype, accumulating only 50–60% of WT biomass when grown on glucomannan as a sole carbon source (Fig. 8a, b). Additionally, both the Δclr-1 and Δclr-2 strains also grew poorly on glucomannan, accumulating only 10–20% of WT biomass, while the Δxlr-1 strain showed no significant growth phenotype (Fig. 8a, b).
The results presented above showed that clr-1, clr-2, and the endomannanase gene gh5-7 are important for mannan utilization by N. crassa. Together with the observation that CLR-2 directly regulates gh5-7 and gh2-1, and that the expression of clr-2 was negligible in the WT strain under pure mannan conditions (Fig. 7b), we hypothesized that an engineered strain mis-expressing clr-2 would show enhanced growth on mannan. A clr-2 mis-expression strain (Pc2) constructed in a previous study  results in a strain that constitutively expresses clr-2 and cellulases. We evaluated the growth of the Pc2 strain on glucomannan as compared to the WT strain and determined that the Pc2 strain accumulated 90–100% more fungal biomass than WT on glucomannan, while secreting 7 times as much protein (Fig. 8a–c). Similarly, a clr-1 mis-expression strain (Pc1) accumulated 90–100% more fungal biomass than WT on glucomannan, albeit exhibiting a similar protein secretion level (Fig. 8a–c). In contrast, a xlr-1 mis-expression strain (Px1) accumulated fungal biomass at a similar level to WT on glucomannan (Fig. 8a–c).
We next used Pc2, Pc1, and Px1 strains to test their growth on pure mannan as a sole carbon source. Remarkably, the Pc2 strain exhibited a robust growth phenotype on pure mannan (Fig. 8d) and secreted 14 times more protein than the WT strain. However, both Pc1 and Px1 strains showed a poor growth phenotype similar to WT in pure mannan (Fig. 8d). Thus, these experimental results validated our hypothesis that mis-expression of clr-2, and thus induction of gh5-7 and gh2-1, would restore growth on mannan. Furthermore, the Pc2 strain also exhibited robust growth on XG, while the Pc1 and Px1 strains showed growth characteristics that were similar to WT on XG as a sole carbon source (Fig. 8e). These novel insights into the regulation of the mannan and XG degradation pathway by CLR-2 will aid future efforts to engineer improved strains for degradation of lignocellulosic biomass.
A genome-scale metabolic model (iJDZ836) containing 836 metabolic genes that encode 1027 unique enzymatic activities was previously published for N. crassa  and captures biochemical reactions for catabolism of simple nutrients, central and energy metabolism, and biosynthesis of biomass precursors. By comparing reactions and genes in our PCWDN with those in iJDZ836, we found that 167 out of 202 PCWDN reactions (>82%) and 105 out of 168 PCWDN genes (>62%) were not contained in the model (Additional file 12). Additionally, among the 63 common genes, 23 PCWDN genes (>36%) had incorrect or outdated annotations (Additional file 12).
Apart from N. crassa, genome-scale metabolic models have also been reconstructed for a few other ascomycetes species, including A. nidulans , Aspergillus niger , Aspergillus oryzae , Aspergillus terreus , and Penicillium chrysogenum . Using OrthoMCL [71, 72], we searched the genomes of these species for ortholog/paralogs of the 168 N. crassa PCWDN genes, and found 178 (A. nidulans), 160 (A. niger), 197 (A. oryzae), 192 (A. terreus), and 164 (P. chrysogenum) orthologous/paralogous genes (“Methods”; Additional file 12). To assess the coverage of polysaccharide degradation pathways in the reconstructed metabolic models for these species [29–33], the overlap between genes in other fungal metabolic models and ortholog/paralogs of the PCWDN genes was determined. Of the PCWDN ortholog/paralogs, 40–70% were not accounted for in the other filamentous fungal metabolic models (Additional file 12). These analyses highlight the significant knowledge gaps specific to pathways for degradation and utilization of plant cell wall polysaccharides in published genome-scale metabolic models for other ascomycete species.
Here, we have taken a comprehensive approach, using diverse datasets, to define genes in filamentous fungi involved in the deconstruction of plant biomass, using N. crassa as a model (Fig. 9). From these analyses, we developed hypotheses regarding the regulation of mannan degradation in N. crassa and experimentally tested the hypothesis that the transcription factor CLR-2, previously characterized to regulate cellulose utilization [16, 17], also plays a role in mannan degradation. Interestingly, clr-2 is the ortholog of the transcriptional regulator ManR, which regulates mannan utilization in A. oryzae . These data support the view that the role of clr-2 orthologs in the regulation of genes involved in mannan utilization is conserved among filamentous fungi.
Current metabolic models for N. crassa  and other filamentous fungi [29–33] have significant knowledge gaps in the degradation pathways for most plant cell wall polysaccharides. For example, the N. crassa metabolic model iJDZ836  has significant knowledge gaps regarding the pathways for degradation and utilization of plant cell wall polysaccharides. The iJDZ836 model contains neither the pectin degradation nor the d-galacturonic acid utilization pathway (Additional file 12) and lacks degradation pathways for most hemicellulosic polysaccharides including mannan, XG, and MLG (Additional file 12). Additionally, degradation pathways for cellulose and xylan were not captured in detail. For example, the AA9 LPMOs were incorrectly annotated as endo-β-1,4-glucanases and therefore do not contain reactions describing the oxidative cleavage of cellulose by LPMOs (Additional file 12). Likewise, the metabolic models for other filamentous fungi do not capture the LPMO-associated reactions and pectin degradation pathway. These limitations render the available genome-scale metabolic models for filamentous fungi not well suited for the investigation of plant cell wall deconstruction, while our PCWDN will serve as a valuable resource to fill this gap. In the future, our reconstructed network is expected to play a central role in the systems analysis of complex experimental datasets and will yield many more novel insights into plant cell wall deconstruction by filamentous fungi.
A list of predicted carbohydrate-active enzymes in the N. crassa genome was compiled from two databases, the CAZY [41, 90] and the N. crassa e-Compendium [42, 91]. Following manual curation, we generated an updated list of predicted carbohydrate-active enzymes in N. crassa (Additional file 2). TransportDB [43, 92] was used to obtain a list of predicted transporters in the N. crassa genome. Proteins destined for the secretory pathway are likely to have a signal peptide sequence in their N terminus, and we used two prediction tools, SignalP [51, 93] and Phobius [52, 94], to predict the presence of signal peptides in amino acid sequences of PCWDN enzymes (Additional file 3). In order to assign the subcellular localization of enzymes, we used WoLF PSORT [53, 95] and ProtComp  (Additional file 3). The mycoCLAP database [70, 97] was used extensively to compile the list of biochemically characterized lignocellulose-active proteins of fungal origin. OrthoMCL [71, 72, 98] is a tool to identify orthologous gene pairs across eukaryotic genomes. We have used OrthoMCL to determine the ortholog/paralogs of N. crassa PCWDN genes in more than 40 fungal genomes.
The N. crassa WT reference strain was OR74A (FGSC 2489), which was the background for all mutant strains [12, 88]. The deletion strains for clr-1 (FGSC 11029), clr-2 (FGSC 15835), and xlr-1 (FGSC 11066 and 11067) were obtained from the Fungal Genetics Stock Center (FGSC) . The mis-expression strains for clr-1 (Pc1), clr-2 (Pc2), and xlr-1 (Px1) were obtained from a previous study . The deletion strain for gh5-7 (NCU08412) was not available in the FGSC collection and was constructed following standard procedures (Δgh5-7; ΔNCU08412) [12, 88]. Briefly, the 5′ upstream and 3′ downstream genomic regions surrounding NCU08412 were PCR amplified from WT genomic DNA and joined through fusion PCR with the hygromycin phosphotransferase (hph) knockout cassette . The resulting amplicon was transformed into FGSC 9718 (Δmus-51) and selected on hygromycin slants. A homokaryotic strain was obtained through microconidia selection on water agar plates yielding the strain (Δgh5-7::hyg R; Δmus-51).
All N. crassa strains were pre-grown for 24 h on 3 mL agar slants of Vogel’s minimal media (VMM)  with 2% sucrose at 30 °C under dark conditions. The slants were placed under constant light at 25 °C to stimulate conidia production. For flask cultures, conidia were collected and inoculated (106 conidia/mL) into 100 mL liquid VMM (2% sucrose) at 25 °C under constant light and shaking (200 rpm).
Media shift experiments were performed in triplicate and followed the procedure described earlier in Coradetti et al.  and Znameroski et al.  to ensure optimal comparability with the previously published RNA-seq datasets. First, using shake flasks (200 rpm), N. crassa cultures were pre-grown from conidia for 16 h in 100 mL of VMM  with 2% sucrose. Next, the mycelia were passed over a Whatman glass microfiber filter and washed three times with VMM without a carbon source (NoC). The mycelial mass was then transferred to new shake flasks with 100 mL of VMM containing a specific carbon source [2% XG (P-XYGLN, Megazyme) or 2% mannan (P-MANCB, Megazyme) or 2% MLG (P-BGBM, Megazyme) or 2% Starch]. After 4 h in the new carbon source, the mycelia were harvested over a filter, flash frozen in liquid nitrogen, and stored at −80 °C. Total RNA was extracted for library generation using the standard procedures as described in Tian et al. .
Single-end libraries were prepared for RNA sequencing (RNA-seq) using an Illumina kit (RS-100-0801) following standard protocols as described in Coradetti et al. . The cDNA libraries were sequenced on the Illumina HiSeq 2000 platform at the Vincent J. Coates Genomic Sequencing Laboratory, University of California, Berkeley. Three independent biological replicates of media shift cultures were sampled for N. crassa WT strains grown on VMM  with one of four carbon sources: XG, mannan, MLG, or starch. Profiling data for each of these carbon sources are contained in Additional file 4 and have also been deposited to the Gene Expression Omnibus (GEO) database  (Accession No. GSE90611). Previous studies [16, 19] generated profiling data for N. crassa WT strain grown in VMM with one of six carbon sources [sucrose, cellobiose (CB), Avicel, xylan, pectin, and OPP] or No Carbon (NoC) source, and these libraries were downloaded from the GEO database (Additional file 4).
Sequenced libraries were mapped to the current version of the N. crassa OR74A genome (v12) using Tophat v2.0.5 [102–104]. Transcript abundance was estimated in FPKM (fragments per kilobase of transcript per million mapped reads) using Cufflinks v2.0.2 [102, 103, 105] with options of upper quartile normalization and mapping against reference isoforms. Differential expression analysis was conducted using Cuffdiff v2.0.2 [102, 103]. Genes with a multiple-hypothesis adjusted p value below 0.05 and at least twofold induction were determined to be significantly differentially expressed between conditions.
Starting from average FPKM of genes across RNA-seq library replicates for a condition, hierarchical clustering was performed using Cluster 3.0 software suite [74, 106]. Before clustering, genes were filtered out that displayed consistently low expression (<10 FPKM) in all conditions. FPKM were log-transformed, normalized across conditions, and centered on the geometric mean across conditions on a per-gene basis. The average linkage method was used for cluster generation, with Pearson’s correlation as a similarity measure. Visualization of clusters was performed using GENE-E software suite .
Growth assays on cell wall substrates were performed in 3 mL liquid cultures in 24-well plate format (GE Healthcare Life Sciences 7701-5102) with breathable sealing tape (Fisher Scientific 1256705). 106 conidia/mL were inoculated into VMM with 0.5% wt/vol 1,4-β-D-mannan from carob (Megazyme P-MANCB), Konjac glucomannan (Megazyme P-GLCML), or xyloglucan from tamarind (Megazyme P-XYGLN) as the carbon source. Cultures were grown for 48 h at 25 °C with constant light and shaking at 250 rpm. At the end of the incubation, mycelia were concentrated by centrifugation at 3000 RCF for 10 min. Culture supernatants were then assayed for soluble protein with Bio-Rad protein assay dye reagent (Bio-Rad 500-0006), using bovine serum albumin (NEB 9001S) as the protein standard. Mycelia were washed twice in water and lyophilized before weighing for biomass determination.
AS, JPC, STC, JPB, NDP, and NLG designed the research; AS, JPC, STC, and JPB performed the research; AS, JPC, STC, JPB, and NLG analyzed the data; and AS, JPC, JPB, JAE, NDP, and NLG wrote the paper. All authors read and approved the final manuscript.
The authors thank Vincent Wu, Carly Grant, and Hillary Tunggal for technical assistance, and Xin Li and William Beeson for insightful discussions. The authors also thank the Fungal Genetics Stock Center and the Neurospora Genome Project for continuous support and acknowledge the use of materials generated by NIH Grant P01 GM068087 “Functional analysis of a model filamentous fungus.”
The authors declare that they have no competing interests.
The datasets supporting the conclusions of this article are included within the article and its additional files. RNA sequencing dataset generated for this article has been deposited to the Gene Expression Omnibus (GEO) database (Accession No. GSE90611).
This work was supported by Grants from the Energy Biosciences Institute to NLG and NDP, and National Institutes of Health NRSA Trainee Grant 2 TR32 GM 7127-36 A1 to STC.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Samuel T. Coradetti and J. Philipp Benz contributed equally to this work
Electronic supplementary material
The online version of this article (doi:10.1186/s13068-017-0901-2) contains supplementary material, which is available to authorized users.
Areejit Samal, Email: ni.ser.csmi@lamasa.
James P. Craig, Email: moc.liamg@431giarcmij.
Samuel T. Coradetti, Email: ude.yelekreb@darocs.
J. Philipp Benz, Email: ed.mut.mfh@zneb.
James A. Eddy, Email: firstname.lastname@example.org.
Nathan D. Price, Email: gro.ygoloibsmetsys@ecirpn.
N. Louise Glass, Email: ude.yelekreb@ssalgL.