Starch serves as a temporary store of carbohydrates in plant leaves during day/night cycles. To study the transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of a graphical Gaussian model (GGM).
Time-series significance analysis was applied to Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF). A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-AMY3: At4g17090), which participates in starch degradation in the chloroplast, is the most frequently connected gene (a hub gene). The robustness of the gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), were identified as positive regulators of starch synthase 4 (SS4: At4g18240). The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplasts and the starch granules they contain. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines.
In this study, we utilized a systematic approach of microarray analysis to discover the transcriptional regulatory network of starch metabolism in Arabidopsis leaves. With this inference method, the starch regulatory network of Arabidopsis was found to be strongly associated with clock genes and TFs, of which AtIDD5 and COL were evidenced to control SS4 gene expression and starch granule formation in chloroplasts.
Starch is an insoluble glucose polymer stored in seeds and storage organs of plants. The starch molecule is composed of two types of glucose polymers—amylose and amylopectin—and is organized to form a distinct structure called a starch granule. It is commonly accepted that biosynthesis of starch takes place during the day using an excess sugar residue from photosynthesis as a substrate. At night, starch granules in leaves are decomposed to sugars to be transported to seeds or storage organs and stored as reserved carbohydrates or used as precursors in other metabolic pathways [1-3]. Besides starch synthase, various enzymes and proteins have been identified to play unexpected roles in starch biosynthesis, metabolism and granule formation [4-11].
With regard to regulation of starch biosynthesis and metabolism, post-translational protein modifications have major impacts on controlling the enzyme activities [12-14]. Allosteric regulation of ADP-glucose pyrophosphorylase (AGPase) [15-17] and redox modulation of pullulanase-type debranching enzymes [18,19], glucan-water-dikinase (GWD)  and β-amylase  indicate the significance of post-translational regulatory mechanisms. Protein phosphorylation and formation of multi-protein complexes of starch synthase (SS), branching enzyme (BE), debranching enzyme (DBE), and starch phosphorylase (SP) suggest tight linkages of metabolic pathways through modification and physical interactions of the enzymes (reviewed in ). In addition to post-translational mechanisms, genes encoding starch metabolic enzymes are also known to be regulated under transcriptional control. In barley, a sugar-inducible transcription factor (TF) in the WRKY family, SUSIBA 2, is reported to act as an activator in endosperm starch biosynthesis . In rice, a complex of a MYC protein (OsBP-5) and an EREBP protein (OsEBP-89) is proposed to be a transcriptional regulator of the rice Wx gene, whose product, namely the granule-bound starch synthase (GBSS), is responsible for synthesis of amylose in mature seeds . Additional finding in Arabidopsis indicates that expression of the GBSS-I gene is controlled by 2 main clock TFs, circadian clock associated 1 (CCA1: At2g46830) and late elongated hypocotyl (LHY: At1g01060) . The roles of these TFs suggest the significance of transcriptional mechanisms, although gene regulatory networks of starch metabolism remain largely uncharacterized.
Inference methods for construction of gene regulatory networks have been extensively developed since genome-wide microarray repositories became publicly available [25-31]. Reverse-engineering approaches to Arabidopsis gene regulatory network reconstruction utilizing large-scale microarray experiments have been previously proposed and have revealed Arabidopsis regulatory network models from different viewpoints [28-32]. Carrera and colleagues applied a qualitative network model based on a probabilistic model and linear regression to 1,436 Arabidopsis microarrays, and analyzed topological parameters of the network. Their result showed that genes having cellular functions involved in responses and adaptation to environmental changes tended to have higher connectivity than genes not related to stress responses. Another approach to gene network reconstruction from large-scale Arabidopsis microarrays was proposed by Mao et al. They constructed a genome-wide co-expression network of Arabidopsis from 1,094 microarrays based on Pearson correlation and analyzed modular structures of the networks that are functionally related. Significantly enriched pathway terms were then analyzed for the predicted modules. One module was defined to be enriched in starch metabolism; 9 out of 10 genes contained in this module were related to starch metabolism. Since the genes in the same module were predicted from co-expression across various conditions, it is suggested that these starch metabolic genes are potentially co-regulated. Another method successfully utilized to construct a regulatory network is gene expression analysis using a modified graphical Gaussian model (GGM) [33,34]. The modified GGM is considered appropriate for analysis of microarray data, which usually has a high-dimensionality problem (i.e. the number of genes is much higher than the number of measurements).
This technique has been applied to 2,045 Arabidopsis microarrays to construct a gene network which can be subdivided to sub-network structures . A number of sub-networks identified through this approach are suggested to be related to metabolism and stress responses. One of them is considered as a starch catabolism sub-network, where 7 out of 15 genes present in the network are apparently relevant to starch degradation pathways .
Given these backgrounds, systematic analysis of microarray data appears to provide insights into gene regulatory networks of starch metabolism in Arabidopsis. In this work, an inference model was constructed from a diurnal cycle microarray dataset to identify candidate transcriptional regulators of plant starch metabolism. This approach is based on evidence that biosynthesis and degradation of leaf starch is completed within 24 hours [3,35-37], and hypothesizes that regulators of our interests are co-expressed and oscillated with genes involved in starch metabolic processes under a day-night cycle [24,36-38]. Firstly, we identified genes whose expression profiles changed over an observed 24-hr time period. Among these temporally regulated ‘significant’ genes, a set of starch metabolic genes, TFs, and clock genes was utilized for the construction of gene association networks using the small sample inference framework of GGM [33,34]. Subsequently, a few pairs of TFs and starch metabolic genes were selected based on their correlation coefficients from global co-expression profiles. Finally, validation of the relationships between the selected TFs and starch metabolic genes were carried out using TF loss-of-function mutant lines. The results obtained from this study have led us to identify the involvement of TFs, indeterminate domain5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), in transcriptional regulation of an Arabidopsis starch metabolic gene. The work presented here provides a model for systematic understanding of regulatory networks of starch metabolic pathway applicable for modification of starch synthesis and accumulation.
A time-series significance analysis was performed using the Extraction of Differential Gene Expression (EDGE) software package. This software can identify differentially expressed genes from both typical and time-course microarray experiments [39,40]. In this research, the software was applied to detect changes in Arabidopsis gene expression occurring within a 24-hour period. This data set is the time-course measurement of transcripts extracted from fully-expanded leaves of Arabidopsis grown under a 12-hour-light and 12-hour-dark (12L/12D) condition. The samples were taken after 1, 2, 4, 8 and 12 hours in darkness or light, starting from the end or beginning of the light period, respectively. EDGE is based on hypothesis testing of differential gene expression patterns fitted by natural cubic spline interpolation. Genes whose expression patterns deviate from a standard line within a 24-hour period are detected as differentially expressed genes under a diurnal cycle. From approximately 22,000 Arabidopsis genes on the ATH-1 Affymetrix genome arrays, 1,480 genes were detected as significant genes (Q<0.01). These significant genes were clustered by k-means clustering, then functionally classified in MapMan, a visualization tool for functional classification of Arabidopsis genes. Genes encoding TFs and those related to starch metabolism were used as inputs for GGM network construction.
Since the leaf starch content of Arabidopsis increases in the light period and decreases in the dark period, we hypothesized that high expression levels of genes in the starch biosynthetic pathway would be observed during the day, while those in the degradation pathway would occur at night. To test this hypothesis, the expression profiles of all the significant genes were examined by k-means clustering (k=30). All clusters were subjectively divided into 4 groups (Figure 1). Group A includes the genes whose mRNA levels increased in the dark and decreased in the light. In contrast, the group B members showed mRNA levels that increased in the light and decreased in the dark. Group C consists of genes whose expression remains relatively stable except that their mRNA levels increased at the dark-to-light transition phase. Group D is similar to group C, but the expression goes down at the dark-to-light transition phase. According to our hypothesis, the starch biosynthetic genes are expected to be clustered in groups B and C, since their expression increases when the light period starts. On the other hand, the genes in starch degradation, whose expression is expected to increase in the dark period, should be members of groups A and D.
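The clustering step described above can be illustrated with a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The source does not state the distance measure, initialization, or software used for k-means, so the squared Euclidean distance, the fixed starting centroids, and the four-point toy profiles below are assumptions made only for illustration; the study itself used k=30 on 1,480 genes.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, init_centroids, n_iter=20):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy profiles (hypothetical values): two dark-period samples followed by
# two light-period samples for each of four genes.
profiles = [
    [2.0, 2.0, 0.0, 0.0],  # high in the dark (group A-like)
    [1.8, 2.1, 0.1, 0.0],
    [0.0, 0.0, 2.0, 2.0],  # high in the light (group B-like)
    [0.2, 0.0, 1.9, 2.1],
]
labels, centroids = kmeans(profiles, init_centroids=[profiles[0], profiles[2]])
print(labels)
```

In this toy case the dark-high and light-high genes fall into separate clusters; grouping clusters into categories such as A to D remains a manual step, as in the text.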
According to Smith et al., there are 48 genes related to starch metabolism. In the group of 1,480 significant genes identified in this study, 21 out of 48 starch metabolic genes were present, 9 of which are related to starch synthesis and 12 of which are known to function in the starch degradation pathway (Table 1). By using k-means clustering, these starch metabolic genes were classified into 11 different clusters (Figure 1). Most of the starch synthesis genes (7 out of 9 genes) are observed in the clusters of groups B and C. This result supports the hypothesis that starch biosynthetic genes should be up-regulated during the day time. The two starch biosynthetic genes that do not follow this rule are those coding for granule-bound starch synthase (GBSS: At1g32900) and starch synthase 2 (SS2: At3g01180). The expression of both genes shows distinct diurnal patterns distinguishable from other starch synthase genes. Since their products are reported to be embedded in the starch granules that are degraded each night, their high expression levels after the onset of light are considered necessary to regenerate GBSS and SS2 proteins for starch biosynthesis during the light period. In contrast, only 2 of the 12 genes encoding enzymes in starch degradation pathways, α-amylase 2 (a-AMY2: At1g76130) and β-amylase 9 (b-AMY9: At5g18670), showed expression patterns correlated with the starch content profile. One explanation is that starch degrading enzymes might need a lag time in post-transcriptional or translational processes to become catalytically functional.
The 1,480 significant genes were categorized according to the functions defined in MapMan (Table 2). The largest functional group, containing 471 genes (~32% of the total significant genes), was classified as “Not assigned”. Out of the 471 genes in this group, 276 genes (~59%) were identified as unknown expressed proteins. The second largest group was the “RNA” group. This group contains 177 genes whose functions are related to RNA processing, transcription, and regulation of transcription; 106 of them are TF genes. Among 24 significant genes in the “Major carbohydrate metabolism” group (Table 2), 21 genes are starch-related, while the other 3 are related to sucrose metabolism. Additionally, our significant gene set also includes both clock and clock-regulated genes such as circadian clock associated 1 (CCA1: At2g46830), late elongated hypocotyl (LHY: At1g01060), early-flowering 3 (ELF3: At2g25930), phytochrome B (PHYB: At2g18790), timing of cab expression 1 (TOC1: At5g61380), and casein kinase II beta-chain (CKB3: At3g60250). These TF, clock, and starch metabolic genes, whose mRNA levels are significantly modulated during a diurnal cycle, were used for reconstruction of the transcriptional regulatory network of starch metabolism.
To further investigate the transcriptional regulation of the starch metabolic pathway, this research focused only on metabolic genes in starch metabolism and all possible regulators (i.e. TF and clock genes). The 133 significant genes (21 starch metabolic genes, 106 TF genes, and 6 clock genes) were subjected to GGM network construction. Since GGM generates a conditional network that depends on the input gene set, different sets of input genes influence the resulting network. Preservation of the TF-starch relationships in the networks constructed from expanded gene sets indicated the robustness of the network reconstructed from the focused set of starch metabolic and regulator genes (discussed in the following section).
The gene association network of the starch metabolic pathway was constructed using small sample inference of GGM implemented in the R package ‘GeneNet’. The transcript profiles of 133 significant genes were retrieved from the original microarray data and transformed to a log-base 2 scale before inferring the gene association network. In the resulting network, a node represents a gene and an interaction between 2 genes is called an edge. Each edge indicates a correlated expression of 2 genes after removing the effects of the other genes in the study set. Hypothesis testing of non-zero partial correlation with false discovery rate (FDR) multiple-testing correction was employed to obtain significant correlation coefficients (i.e. significant edges) in the final network. From a total of 133 genes, the final network derived from GGM analysis at Q<0.05 consisted of 117 nodes (or genes), corresponding to 16 starch (7 synthesis and 9 degradation), 97 TF, and 4 clock genes, and 266 edges (or associations between 2 genes) (Figure 2). These edges could be the associations between 1) regulator and regulator genes (i.e. TF-TF, clock-TF, and clock-clock), 2) regulator and target genes (i.e. TF-starch and clock-starch), and 3) target and target genes (i.e. starch-starch). There are 215, 42, and 9 edges for the first, second, and third edge types, respectively.
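The idea that an edge reflects correlation between 2 genes after removing the effects of other genes can be illustrated with a first-order partial correlation on three variables. This is only a sketch of the underlying concept, not the GeneNet procedure: GeneNet estimates the full partial correlation matrix with a shrinkage estimator, whereas the closed-form three-variable formula and the toy expression values below are assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy log2 expression values (hypothetical): gene_z drives both gene_x and gene_y.
gene_z = [1, 2, 3, 4, 5, 6, 7, 8]
gene_x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
gene_y = [1.5, 2.5, 2.5, 3.5, 4.5, 5.5, 7.5, 8.5]

print(pearson(gene_x, gene_y))               # strong marginal correlation
print(partial_corr(gene_x, gene_y, gene_z))  # close to zero once gene_z is accounted for
```

Here gene_x and gene_y are strongly correlated only because both follow gene_z, so a GGM-style network would link each of them to gene_z but would not draw a direct edge between them.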
Since 80% of the genes used in the network reconstruction are TF genes, the predominant interactions in the final network are those between TFs. To verify the significance of these TF-TF interactions and the robustness of the starch gene association network, especially the TF-starch relationships shown in Figure 2, another gene association network was reconstructed by expanding the set of metabolic genes from only genes in starch metabolism to genes in broader carbon-related metabolism. The same set of 106 TFs and 6 clock genes was used together with 171 metabolic genes selected from 11 carbon-related functional groups categorized by MapMan (Table 2): photosynthesis, major carbohydrate metabolism, minor carbohydrate metabolism, glycolysis, fermentation, gluconeogenesis/glyoxylate cycle, oxidative pentose phosphate pathway, TCA/organism transformation, mitochondrial electron transport/ATP synthesis, cell wall, and lipid metabolism. It should be noted that all 21 starch genes are present in the major carbohydrate metabolism category and are thus included in the metabolic gene set. The gene association network of carbon-related metabolism is shown in Additional file 1: Figure S1. In this expanded network, 151 out of 215 regulator-regulator edges (70%) and 26 out of 42 TF-starch or clock-starch relationships (62%) were identical to those in the starch network shown in Figure 2. All 6 TF candidates and their relationships to 5 starch genes that will be further experimentally validated (discussed in the following section) exist in the expanded network. The result indicates that most of the regulatory relationships of the starch gene association network are robust and preserved even when the input gene set is expanded to carbon-related metabolic genes.
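The robustness check above amounts to counting how many edges of the focused network reappear in the expanded one. A minimal sketch of that comparison follows, assuming each network is available as a list of gene pairs; the function names and the toy edge lists are hypothetical, and only the counts 151/215 and 26/42 come from the text.

```python
def undirected(edges):
    """Represent each edge as an unordered pair so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}

def preserved_fraction(focused_edges, expanded_edges):
    """Fraction of focused-network edges that also occur in the expanded network."""
    focused = undirected(focused_edges)
    expanded = undirected(expanded_edges)
    return len(focused & expanded) / len(focused)

# Toy edge lists (hypothetical pairings, for illustration only).
focused = [("TF_a", "SS4"), ("TF_b", "GBSS"), ("TF_a", "TF_b"), ("TF_c", "DPE1")]
expanded = [("SS4", "TF_a"), ("TF_b", "TF_a"), ("TF_c", "gene_x")]
print(preserved_fraction(focused, expanded))

# The percentages quoted in the text follow from the reported counts.
print(round(100 * 151 / 215), round(100 * 26 / 42))
```

Storing edges as unordered pairs matters because a GGM association has no direction, so the same edge may be listed in either gene order in the two networks.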
According to Figure 3A, the node degree of TF and clock genes ranged from 1 to 14 connections. Among the starch metabolic genes, β-amylase 3 (b-AMY3: At4g17090) was detected as the most-connected node, or hub gene, with 15 neighbors (Figure 3B). This result may indicate the importance of b-AMY3 in starch metabolism. b-AMY3 is a chloroplastic β-amylase [21,36,42] with a possible role in the leaf starch degradation process [3,43-45]. In the b-AMY3 sub-network (Figure 3B), there are 2 starch metabolic genes, starch synthase 4 (SS4: At4g18240) and β-amylase 6 (b-AMY6: At2g32290), that showed positive correlations with b-AMY3. There are reports indicating that SS4 is involved in the starch granule initiation process [6,11], and that b-AMY6 is repressed under heat shock stress. However, no functional correlation among these 3 starch metabolic genes has ever been reported.
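Node degree and hub detection, as used above, reduce to counting how many edges touch each gene. The sketch below assumes the network is given as a list of gene pairs; the four edges are a small excerpt consistent with associations named in the text (b-AMY3 with SS4, b-AMY6 and TOC1; SS4 with AtIDD5), not the full 266-edge network, and the function names are hypothetical.

```python
from collections import Counter

def node_degrees(edges):
    """Count the number of edges incident to each node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def hub_gene(edges):
    """Return (gene, degree) for the most-connected node."""
    return node_degrees(edges).most_common(1)[0]

# Small excerpt of associations mentioned in the text (not the full network).
edges = [
    ("b-AMY3", "SS4"),
    ("b-AMY3", "b-AMY6"),
    ("b-AMY3", "TOC1"),
    ("SS4", "AtIDD5"),
]
print(hub_gene(edges))
```

On the full network the same count would give the degree distribution summarized in Figure 3A.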
In addition to gene-to-gene association with two starch metabolic genes, b-AMY3 was also observed to have positive correlation with various TF genes and a clock gene, TOC1. TOC1 is an evening gene expressed during the night period with a major role in circadian rhythm [47-49]. From the k-means clustering, expression of b-AMY3 increased slightly during the night period, decreased at the dark-to-light transition phase, then increased again a few hours later. Our findings are not only in agreement with the results indicating regulation of b-AMY3 expression under diurnal and circadian rhythms [36,50,51], but they also suggest the circadian control of b-AMY3 via TOC1.
From the GGM network construction, we additionally found that the sub-networks of two starch metabolic genes encoding GBSS (At1g32900) and disproportionating enzyme (DPE1: At5g64860) and their TF neighbours are entirely separated from the rest of the network (Figure 2). These isolated modules indicate specific correlation between starch metabolic genes and their connected TFs. The LIM domain-containing protein (At2g39900) associated with DPE1 in the network is in fact WLIM2a, an actin-bundling protein that functions in cytoskeleton organization; its gene expression is regulated by pickle (PKL), a CHD3 chromatin-remodelling protein involved in seed germination of Arabidopsis. The other isolated module represents the association between GBSS and two zinc finger family proteins, constans-like (COL: At2g21320) and constans-like 7 (COL7: At1g73870). In rice, MYC and EREBP are known to act synergistically in transcriptional regulation of the rice Wx gene. Since these homologues were not included in the GBSS sub-network of Arabidopsis, the results possibly suggest a difference in the mechanisms controlling the biosynthesis of storage starch (i.e. in rice endosperm) and transitory starch (i.e. in Arabidopsis leaves).
To identify candidate TFs that may play regulatory roles in starch metabolism, we focused on the relationships between starch metabolic genes and their immediately connected TFs. The starch sub-network excerpted accordingly contained 49 nodes (16 starch, 1 clock, and 32 TF genes) and 79 edges (Figure 4A). Details of the genes in the starch sub-network are summarized in Additional file 2: Table S1. Within the starch sub-network, we observed 5 types of gene-to-gene interactions or edges: interactions between starch metabolic genes (8 edges), between TF genes (26 edges), between starch metabolic and TF genes (42 edges), between clock and starch metabolic genes (2 edges), and between clock and TF genes (1 edge).
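Extracting the starch sub-network as described above can be sketched as keeping the starch genes plus their immediate neighbours, then retaining every edge among the kept nodes (which is why TF-TF edges still appear in the sub-network). The function name and the toy edge list are hypothetical; in particular, the COL-AtIDD5 and TF_x/TF_y edges below are invented for illustration.

```python
def starch_subnetwork(edges, starch_genes):
    """Keep starch genes and their direct neighbours, with all edges among them."""
    starch = set(starch_genes)
    nodes = set()
    for a, b in edges:
        if a in starch or b in starch:
            nodes.update((a, b))
    sub_edges = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, sub_edges

# Toy network (hypothetical edges, for illustration only).
edges = [
    ("SS4", "AtIDD5"),   # starch-TF edge
    ("GBSS", "COL"),     # starch-TF edge
    ("COL", "AtIDD5"),   # TF-TF edge between two starch neighbours: kept
    ("AtIDD5", "TF_x"),  # TF_x has no starch neighbour: dropped
    ("TF_x", "TF_y"),    # dropped
]
nodes, sub_edges = starch_subnetwork(edges, starch_genes=["SS4", "GBSS"])
print(sorted(nodes))
print(sub_edges)
```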
Arabidopsis gene networks were previously reconstructed from global microarray conditions based on GGM and co-expression analysis. Interestingly, sub-networks related to starch metabolism were extracted in both studies, even though different algorithms were applied. The sub-networks derived from GGM and co-expression analysis contain a total of 15 and 10 genes, of which 10 and 9 were identified as starch metabolic genes, respectively. The two networks have in common 6 genes in starch degradation, i.e. disproportionating enzyme 1 and 2 (At5g64860 and At2g40840, respectively), glucan water dikinase 1 (starch excess1 - At1g10760), glucan phosphatase (starch excess 4 - At3g52180), starch phosphorylase 2 (At3g46970), and α-amylase 3 (At1g69830), and one gene in starch synthesis, branching enzyme 3 (At2g36390). Of the total of 15 starch metabolic genes identified in the sub-networks of previous studies, 7 genes, namely starch synthase 4 (At4g18240), branching enzyme 3 (At2g36390), disproportionating enzyme 2 (At2g40840), phosphoglucan water dikinase (At5g26570), isoamylase 2 (At1g03310), and cytosolic and plastidial starch phosphorylase (At3g46970 and At3g29320, respectively), were also observed in our starch sub-network (Additional file 2: Table S1). These results suggest coherent expression of starch metabolic genes, especially those in the starch degradation process. It is worth noting that the starch sub-networks derived from those studies contain mainly metabolic genes, except for one clock-regulated gene encoding pseudo-response regulator 3 (APRR3: At5g60100), which was identified in the starch sub-network derived from the GGM analysis. However, this clock gene was not present in our starch sub-network.
The 42 interactions between 12 starch and 32 TF genes were further analyzed by searching for the presence/absence of TF binding sites in the promoter region of starch metabolic genes (Table 3). The analysis conducted for prediction of physical binding of TF to target genes is TF family-based; thus, it stands on the assumption that different TFs of the same TF family most likely attach to the same TF-binding site on the promoter region of a target gene.
In the TF family-based comparative analysis, all possible plant TF binding sites in the 2-kb upstream region of the 12 target genes (i.e. starch metabolic genes) were first obtained using a web-based tool from the AthaMap database http://www.athamap.de/ [54-57]. The names of known TFs and their relative binding locations predicted within the 2-kb upstream region of 12 starch metabolic genes are summarized in Additional file 3: Table S2. From the list of all possible TF binding sites, physical binding was predicted between 10 TFs and 6 starch metabolic genes, shown as 11 starch-TF interactions in the starch sub-network (Table 3). For easier visualization, we included information on the presence of putative TF-binding sites and re-drew the regulatory model of genes in the starch sub-network (Figure 4B). The results indicate that 10 TFs show only a positive correlation with 6 starch metabolic genes and can be classified into 7 families.
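The notion of locating putative binding sites in an upstream region can be illustrated with a simple consensus-motif scan on both DNA strands. This is a simplified stand-in, not the AthaMap method: AthaMap predicts sites from its own curated binding-site data, while the sketch below only matches an IUPAC consensus string. The promoter sequence and the motif `GTCG` are hypothetical placeholders, not real binding sites of any TF discussed here.

```python
import re

# Regex character classes for a subset of IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(promoter, motif):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + "".join(IUPAC[c] for c in motif) + ")")
    n, k = len(promoter), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(promoter)]
    # Scan the reverse complement and map positions back to forward coordinates.
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(revcomp(promoter))]
    return sorted(hits)

promoter = "AAGTCGTTTACGACTT"  # hypothetical stand-in for a 2-kb upstream region
print(find_sites(promoter, "GTCG"))
```

A family-based prediction in the sense of the text would then apply the motif known for one family member to every TF of that family.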
The robustness of the predicted regulatory network model of starch metabolism was further verified using a diverse set of “condition-independent” microarray data. This analysis was implemented to find how the expression patterns of any 2 genes correlate under various conditions. The data representing the relationship of Arabidopsis co-regulated genes was obtained from the ATTED-II database http://atted.jp/ [58,59]. In ATTED-II, the pair-wise correlation coefficients of 22,263 Arabidopsis genes were calculated from 58 experiments of GeneChip microarray (1,388 arrays in total) using weighted Pearson correlation. The pair-wise correlation coefficients of 12 starch metabolic genes with all Arabidopsis TF genes (1,849 genes listed in this database) were ranked from highest to lowest value (Table 3).
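A weighted Pearson correlation, as mentioned above, generalizes the ordinary coefficient by letting each array contribute with its own weight. The sketch below shows the general formula only; how ATTED-II actually assigns weights is not described in this text, so the weights and expression values used here are hypothetical.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation in which sample i contributes with weight w[i]."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)

# Toy expression values across four arrays (hypothetical); the last array is an outlier.
tf_expr = [1.0, 2.0, 3.0, 10.0]
target_expr = [1.0, 2.0, 3.0, -5.0]

print(weighted_pearson(tf_expr, target_expr, [1, 1, 1, 1]))  # equal weights: ordinary Pearson
print(weighted_pearson(tf_expr, target_expr, [1, 1, 1, 0]))  # outlier array given zero weight
```

Down-weighting redundant or atypical arrays in this way keeps a few heavily sampled conditions from dominating the coefficient.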
The pair-wise correlation coefficients between given target genes and all TFs were observed to be normally distributed in this condition-independent analysis. A TF was thus considered significant when its correlation coefficient with its target starch metabolic gene was higher than the population mean with 97.5% confidence, as analysed by a one-tailed single-sample t-test. Of all the TFs in the starch sub-network, 8 TF genes passed this cut-off by being significantly and highly correlated with their target gene expression, and they were preliminarily chosen as candidates for experimental validation (discussed in the following section). However, among these 8 candidates, the expression patterns of 2 TF genes, constans-like 9 (COL9: At3g07650) and bHLH (At1g05805), did not adhere to the simple assumption that a TF should be expressed at the same time as, or earlier than, its target gene. The diurnal expression patterns of both COL9 and bHLH indicated that their induction took place after the expression of their potential target starch metabolic genes (Additional file 4: Figure S2). Therefore, after excluding these 2 TFs, the remaining 6 candidate TF genes were selected as final candidates and subsequently used in experimental validation: AtIDD5 (At2g02070), C2H2 (At3g50700), COL (At2g21320), COL7 (At1g73870), WLIM2a (At2g39900), and KH-CCCH (At5g06770), predicted as regulators of the expression of SS4, α-glucosidase-like 4 (AGLU-like 4: At5g11720), GBSS (co-regulated by COL and COL7), DPE1, and starch synthase 1 (SS1: At5g24300), respectively (Table 4). According to the TF family-based prediction of binding sequences (Tables 3 and 4) of these 6 candidates, putative binding sites for zinc finger C2H2 type TFs were located in the promoter regions of their target genes, SS4 and AGLU-like 4.
Accordingly, the gene-to-gene associations between AtIDD5 and SS4, and between C2H2 and AGLU-like 4, gain strong support from both the TF binding site prediction and the global co-expression analysis.
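The co-expression cut-off described above can be sketched as an upper-tail threshold on the distribution of correlation coefficients between one target gene and all TFs. The sketch swaps in a normal approximation (mean plus the 97.5% normal quantile times the standard deviation) in place of the t-test named in the text, which is an assumption; with about 1,849 TFs the two would be nearly identical. The TF names and coefficients are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def upper_tail_cutoff(coeffs, confidence=0.975):
    """Normal-approximation threshold above which a coefficient counts as significant."""
    z = NormalDist().inv_cdf(confidence)  # about 1.96 for 97.5%
    return mean(coeffs) + z * stdev(coeffs)

def significant_tfs(corr_by_tf, confidence=0.975):
    """Names of TFs whose correlation with the target exceeds the cut-off."""
    cutoff = upper_tail_cutoff(list(corr_by_tf.values()), confidence)
    return sorted(tf for tf, r in corr_by_tf.items() if r > cutoff)

# Toy correlations with one target gene (hypothetical values): a background
# of weakly correlated TFs plus one strongly co-expressed TF.
corr_by_tf = {f"TF_{i}": (0.1 if i % 2 else -0.1) for i in range(20)}
corr_by_tf["TF_high"] = 0.8

print(significant_tfs(corr_by_tf))
```

A candidate passing this filter would still need to satisfy the timing criterion in the text, namely being expressed no later than its putative target.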
To experimentally verify the regulatory role of the 6 candidate TFs in the proposed TF-target model (Figure 4B; Table 4), the accumulation of starch metabolic gene transcripts was determined using homozygous knockout lines. T-DNA or transposon insertion knockout lines (Atidd5, c2h2, col, col7, wlim2a, and kh-ccch) were obtained from The Arabidopsis Biological Resource Centre (ABRC) and The Nottingham Arabidopsis Stock Centre (NASC). The homozygous mutant plants were grown under the conditions described in  (see Methods). Rosette leaves at the 3.90 developmental stage were harvested 4 times within a 24-hour period, and used as materials for RNA extraction. The mRNA levels of TFs and target genes were quantified by quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) analysis.
AtIDD5, C2H2, COL7, and KH-CCCH mRNAs were absent in Atidd5, c2h2, col7, and kh-ccch mutant lines, respectively, whereas COL and WLIM2a mRNAs were partially detected in col and wlim2a mutant lines, respectively (data not shown). The mRNA accumulation of target starch metabolic genes was then monitored in these mutant lines to determine the effect of disrupting the TFs predicted to act as regulators. The results indicated that a starch metabolic gene, SS4, showed a decreased mRNA level in the Atidd5 mutant (Figure 5A). Significant down-regulation of SS4 was observed at the end of the light period of both short and long days (Figures 5A and 5B). It appears that AtIDD5 plays an important role in the regulation of SS4 gene expression. By contrast, the mRNA levels of AGLU-like 4, GBSS, DPE1, and SS1 genes did not differ between the wild type and the mutant lines (c2h2, col and col7, wlim2a, and kh-ccch, respectively) during the time course of the 12L/12D condition (Additional file 5: Figure S3).
In addition to the effect of direct TF-target relationships, the regulatory network can be affected by associated modules, particularly when the target genes are in the same metabolic process. Based on this assumption, the SS4 mRNA levels were determined in other TF mutant lines. Similar to the results obtained from the Atidd5 mutant, SS4 was down-regulated in the col mutant during the light period of both short and long day conditions (Figures 5A and 5B). Alteration of SS4 expression may have resulted from (i) the negative effect of disruption of COL on AtIDD5 gene expression or (ii) the direct control of SS4 by COL. Since the mRNA levels of AtIDD5 in the wild type and the col mutant differed only in the long day condition (Additional file 6: Figure S4), regulation of AtIDD5 expression by COL remains inconclusive. Although the exact underlying mechanism is unknown, COL appears to be part of the starch metabolic gene regulatory network, and its relevance is further evidenced by starch granule deformation (discussed in the following section).
According to in silico prediction of the Arabidopsis proteome, 176 proteins are classified in the C2H2 zinc finger family. AtIDD5 belongs to this C2H2 gene family and is further classified into the same sub-family as the maize indeterminate 1 gene (ZmID1), which is a key regulator of flowering transition [63,64]. In Arabidopsis, there are 16 homologues of AtIDD genes. Among them, the biological functions of magpie (MGP/AtIDD3; At1g03840), nutcracker (NUC/AtIDD8; At5g44160), and jackdaw (JKD/AtIDD10; At5g03150) genes have been characterized [65,66]. To date, only three AtIDD genes, NUC/AtIDD8, AtIDD14 (At1g68130), and shoot gravitropism 5 (SGR5/AtIDD15; At2g01940), have been reported to play roles, though indirectly, in sugar and starch metabolism [67-69]. Based on phylogenetic analysis, the AtIDD genes most closely related to AtIDD5 are AtIDD4 (At2g02080) and AtIDD6 (At1g14580). AtIDD4 is reported as a TF whose expression is affected by defects in chloroplast import machinery, and it is postulated to function as a transcriptional activator of nuclear-encoded photosynthetic gene expression. In addition, AtIDD4 and AtIDD6 are identified as gibberellin-regulated genes. Based on the evidence known to date, neither AtIDD4 nor AtIDD6 seems to have any function related to sugar and starch metabolism.
Global gene expression databases provide evidence that AtIDD5 is abundantly expressed in leaf tissues. The global view of AtIDD5 gene expression was roughly examined using the Genevestigator web-based software. The expression of AtIDD5 was observed ubiquitously in all stages, but its level was particularly high from the stage of developed rosette leaves to that of developed flowers. In flowers, the expression pattern of AtIDD5 was classified as ‘stamen-specific lack of expression’, suggesting that its expression disappears, especially in anthers of flowers from stage 7 to 11 [73,74].
Information on AtIDD5-interacting proteins further suggests that AtIDD5 is associated with other signalling components, such as radical-induced cell death 1 (RCD1: At1g32230). RCD1 is known not only as clone eighty-one (CEO1), which recovers the oxidative stress-sensitivity phenotype of the Yap1- mutant yeast, but also as a major regulator of hormonal signalling and stress-response processes [77,78]. According to the AGRIS database (http://arabidopsis.med.ohio-state.edu/REIN/), AtIDD5 is predicted to interact with a MADS-box domain TF, sepallata 3 (SEP3), which is a global moderator of multifunctional protein complexes controlling flowering and hormonal signalling processes, especially responses to auxin stimuli.
Total starch content was measured under both short and long day conditions. Starch was extracted from fully expanded leaves of all the mutants and the wild type, and analyzed in the form of glucose by capillary electrophoresis-diode array detector (CE-DAD) after enzymatic digestion (see Methods). Both the mutants and the wild type accumulated starch at relatively similar levels (Figure 6), even though SS4 was down-regulated in Atidd5 and col mutants (Figure 5). It has been reported that the size of starch granules can be significantly altered in the ss4 mutant, while only a 35% reduction of the starch content could be observed. Referring to these previous findings, we speculated that changes in starch granule morphology and number may occur in Atidd5 and col mutants, as they show reduced levels of SS4 mRNA accumulation during the light period (Figure 5).
Chloroplast and starch granule morphology of Atidd5 and col mutant lines were examined by transmission electron microscopy (TEM). Transmission electron micrographs of Atidd5 and col mutant lines and the wild type are shown in Figure 7. They were analyzed by the image processing software ImageJ (version 1.45) to obtain a group of data sets including (i) size measured by ‘Area’ and (ii) shape measured by ‘Width’, ‘Height’, and ‘Circularity’ of the chloroplast and starch granule cross-sections. Although these parameters might not represent the actual size and shape of chloroplasts and starch granules, we considered them suitable for a comparative purpose. Since the data were not normally distributed, a non-parametric statistic, the Mann–Whitney U test, was applied for testing significant differences between the wild type and the mutants. The relative mean ranks and P-values from the Mann–Whitney U test are given in Table 5. Descriptive statistics of chloroplast and starch granule morphology of Atidd5 and col mutants are summarized in Additional file 7: Table S3.
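As an illustration of the comparison described above, the following is a minimal sketch of a two-sided Mann–Whitney U test using only the Python standard library. It assumes the large-sample normal approximation with a tie correction and no continuity correction; the study does not state which implementation was used, so the function names and the example values are hypothetical and do not come from the measured data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample_a, two-sided P-value).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2

    # Tie correction term: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())

    n = n_a + n_b
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0  # all observations identical
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


# Hypothetical cross-section areas (in square micrometres), for illustration only.
wild_type_areas = [13.1, 12.4, 14.0, 13.7, 12.9, 13.5]
mutant_areas = [9.0, 9.8, 8.7, 10.1, 9.4, 9.2]
u_stat, p_val = mann_whitney_u(wild_type_areas, mutant_areas)
print(f"U = {u_stat}, P = {p_val:.4g}")
```

For samples of the size analyzed here (hundreds of chloroplasts per line), the normal approximation is generally considered adequate; for very small samples an exact test would be preferable.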
Area, width, height, and circularity of 259, 176, and 368 chloroplasts of Atidd5, col, and the wild type, respectively, were measured using the ImageJ software. According to the Mann–Whitney U analysis, the chloroplast area of the wild type was significantly larger than that of the Atidd5 and col mutants (P-value = 3.38E-26 and 2.99E-10, respectively) (Table 5), with mean chloroplast areas of 13.18, 9.26, and 10.50 μm² for the wild type, Atidd5, and col, respectively. In addition to the size, the shape of chloroplasts of both mutants also differed from the wild type. The width, height, and circularity of the chloroplasts were significantly smaller in Atidd5 than in the wild type. The small circularity values of Atidd5 chloroplasts indicate that they are more oblong than the wild type chloroplasts. In addition, the chloroplasts in the col mutant had greater width but smaller height than those in the wild type. The results, therefore, suggest that both mutants develop chloroplasts with altered morphology, which appear smaller or thinner than the chloroplasts in the wild type (Figure 7 and Table 5).
Since chloroplasts of both mutants were altered with respect to their size and shape, we examined their effects on the morphology of accumulated starch granules. Reduction of starch granule size, inferred from the cross-section area, was significant in Atidd5 (P-value = 1.38E-10), but not in the col mutant. The mean starch granule areas of the wild type, Atidd5, and col were 0.54, 0.42, and 0.50 μm², respectively. In contrast, granule shape deformity was observed in both col and Atidd5 mutants (Figure 7 and Table 5). The decreases in width, height, and circularity of Atidd5 starch granules suggest that the granules were small and oblong. Compared to the wild type, the col starch granules had greater circularity, suggesting that they were relatively round in shape.
According to the work of Roldán and coworkers, a chloroplast of the ss4 mutant mostly contains one large starch granule. When examined under TEM (283, 177, and 408 chloroplasts of Atidd5, col, and the wild type, respectively), none of the chloroplasts were observed to contain a single large starch granule like the ss4 mutant. Instead, the largest group of chloroplasts, corresponding to 25.4%, 27.1%, and 27.5% of observed chloroplasts in Atidd5, col, and the wild type, respectively, contained 3 starch granules. We further investigated the distribution of starch granule number per chloroplast to find the difference between the mutants and the wild type (Figure 8). In the Atidd5 mutant, 86.2% of the observed chloroplasts contained 2–5 starch granules, whereas 85.8% of chloroplasts from the wild type contained 1–4 starch granules. Interestingly, we observed that the number of chloroplasts containing 2 and 4 starch granules in the col mutant was lower than in Atidd5 and the wild type, whereas the number of chloroplasts containing more than 5 granules was higher than in the other lines (Figure 8). Moreover, the col mutant was the only line that was observed to contain up to 10 starch granules per chloroplast. The relative mean ranks and P-values from the Mann–Whitney U test of the mutants and the wild type shown in Table 5 indicated that both Atidd5 and col mutants had significantly higher numbers of starch granules per chloroplast than the wild type (P-value = 0.0067 and 0.0213, respectively). The results suggest that reduction of SS4 expression in the Atidd5 and col mutant lines leads to a significant increase in starch granule numbers, while the distributions of granule number per chloroplast are affected differently in the two mutants (Figure 8).
The relationship between the size of the chloroplast and the number of accumulated starch granules is shown in Figure 9. Our results indicate that the number of starch granules increased with chloroplast size. Larger chloroplasts tended to contain greater numbers of starch granules; however, the pattern of correlation was not uniform among the wild type and mutants. In the Atidd5 mutant, a positive correlation between chloroplast size and the number of starch granules was only observed in chloroplasts containing 1–4 granules. It appears that the chloroplasts in this mutant are unable to expand after reaching a critical size; however, they continued to store higher numbers of starch granules without increasing their size. In the col mutant, the positive correlation was observed in chloroplasts containing 1–6 granules, whereas the size of chloroplasts containing 6–8 starch granules tended to decrease as the number of starch granules increased. The average sizes of chloroplasts containing 7 and 8 granules were the same as those of chloroplasts containing 2 starch granules. In addition, the size of the col chloroplasts containing 10 starch granules was similar to the size of chloroplasts containing 6 granules, suggesting that this might be the critical size limit of the col chloroplast.
The results indicated that, in addition to having defects in their control of SS4 gene expression, both Atidd5 and col mutants are unable to increase the size of chloroplasts, although they may still retain the capability to expand their chloroplast to contain relatively small numbers of starch granules until the chloroplast reaches its critical size limit. Particularly in Atidd5, having relatively small starch granules can be another adaptive response caused by chloroplast deformity. The observed phenomena may suggest the alternative roles of AtIDD5 and COL in controlling chloroplast size limit, which may synchronize with transcriptional regulation of a starch biosynthetic enzyme, SS4. Our findings address a question of how starch biosynthesis and chloroplast development and/or functions are synergistically controlled in plant cells. The underlying mechanism of interaction awaits further investigation.
In this study, we proposed a transcriptional regulatory network of starch metabolism in Arabidopsis leaves, and examined the biological relevance of predicted network modules. The general workflow of data acquisition, refinement, and experimental validation provides a model case for reconstruction of a transcriptional regulatory network. The present work makes wide use of publicly available biological information and resource databases, demonstrating how they can be integrated to find the biological significance of predicted network modules. Construction of the gene-to-gene association network models is based on diurnal regulation of starch metabolism in leaves, where the transcriptomes oscillate during the day/night cycles. We first grouped the genes with significant time-series-dependent expression into four classes showing distinct patterns of co-regulation with starch biosynthesis or degradation. A particular focus was placed on relationships between TFs, clock genes, and starch metabolic genes, to obtain a transcriptional regulatory network model of starch metabolism. The network constructed by the small-sample inference of GGM suggests relationships between TFs and target starch metabolic genes. Gene-to-gene associations were further refined by prediction of TF binding sites in target genes and by global co-expression analysis. Through these approaches, we finally showed the involvement of AtIDD5 and COL in transcriptional regulation of SS4. These regulatory networks were considered attributable to daytime starch biosynthesis by SS4. In addition, AtIDD5 and COL were shown to control chloroplast development and starch granule formation. The present work on TF network modelling and examination provides new insights into the regulatory mechanisms of starch biosynthesis and granule formation in the chloroplast.
This study utilized Arabidopsis Affymetrix microarray data (CEL files) downloaded from the Nottingham Arabidopsis Stock Centre's microarray database (NASCArrays) [Experiment Reference Number: NASCARRAYS-60] (http://affymetrix.arabidopsis.info/). This microarray experiment contains a set of 22k Arabidopsis ATH-1 genome array transcriptome data of leaves at developmental stage 3.9 taken after 1, 2, 4, 8, and 12 hours in both darkness and light. A ‘qspline’ normalization and a model-based expression index were applied in the microarray pre-processing, which was performed using the Affy package in Bioconductor (http://www.bioconductor.org).
The significance test for the time-series data was performed using the EDGE program (version 1.1.175) (http://faculty.washington.edu/jstorey/edge/). Hypothesis testing on the time-series expression of each gene was performed to test whether its average expression constitutes a flat line. The gene expression profile was fitted under models corresponding to the null and alternative hypotheses. The null hypothesis states that there is no differential gene expression over the time period. The alternative hypothesis states that the gene is differentially expressed over the time period. The goodness of fit of the 2 models was compared by an F-statistic, using a significance cut-off based on a false discovery rate criterion [82,83].
The R package “GeneNet”, available at the R archive (CRAN) http://CRAN.R-project.org/, was used to construct the gene association network. GeneNet was developed from the small-sample inference framework of the graphical Gaussian model (GGM) to obtain partial correlation coefficients; a partial correlation is the correlation between 2 variables after the effects of the other variables have been eliminated. In the case of 3 variables x, y, and z, the partial correlation of x and y when eliminating the effect of z, $pr_{xy,z}$, can be calculated as follows:

$$pr_{xy,z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{(1 - r_{xz}^{2})(1 - r_{yz}^{2})}}$$

where r is the correlation coefficient between 2 variables. In the case of more than 3 variables, the partial correlation can be calculated from the following equation:

$$pr_{xy,g} = \frac{-s_{xy}}{\sqrt{s_{xx}\, s_{yy}}}$$

Here $pr_{xy,g}$ is the partial correlation between x and y given variables 3 to g, and $s_{xy}$ is the xy-th element of the inverse of the covariance matrix ($S = V^{-1}$). The element of matrix V is $v_{ij}$ (i, j = 1, …, n), corresponding to the covariance between variables i and j.
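The two routes to a partial correlation described above can be sketched in a few lines of standard-library Python. This is a minimal illustration for 3 variables only, assuming ordinary sample covariances and an explicit 3 × 3 matrix inverse; it is not the shrinkage estimator that GeneNet applies to small samples. All function names and data values are hypothetical.

```python
import math


def covariance(x, y):
    """Unbiased sample covariance between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson(x, y):
    """Pearson correlation coefficient r."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def partial_corr_three(x, y, z):
    """Partial correlation of x and y given z, from pairwise correlations."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def invert_3x3(m):
    """Invert a 3 x 3 matrix via the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular and cannot be inverted")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def partial_corr_from_inverse(x, y, z):
    """Partial correlation of x and y from S = V^-1: -s_xy / sqrt(s_xx * s_yy)."""
    variables = [x, y, z]
    v = [[covariance(p, q) for q in variables] for p in variables]
    s = invert_3x3(v)
    return -s[0][1] / math.sqrt(s[0][0] * s[1][1])


# Hypothetical expression profiles of three genes across five conditions.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_y = [2.1, 3.9, 6.2, 7.8, 10.1]
gene_z = [5.0, 3.0, 4.0, 1.0, 2.0]
print(partial_corr_three(gene_x, gene_y, gene_z))
print(partial_corr_from_inverse(gene_x, gene_y, gene_z))
```

Both functions should agree up to floating-point error. The explicit inversion fails when the covariance matrix is singular, which is the situation that arises when there are more genes than microarray conditions.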
For microarray data, the number of variables (i.e. genes) is much higher than the number of measurements (i.e. microarray conditions), which makes the covariance matrix V singular and its inversion invalid. In the small-sample framework of GGM, parameter estimation techniques are used to obtain partial correlations from a small sample size. In order to decide which edges are significant enough to be included in the resulting GGM network, statistical significance was further assigned to the edges by fitting a mixture model (as shown below) to the observed partial correlation coefficients.
The distribution of observed partial correlation coefficients is

$$f(\tilde{r}) = \eta_0\, f_0(\tilde{r}; \kappa) + (1 - \eta_0)\, f_A(\tilde{r})$$

where $\tilde{r}$ is the observed partial correlation, $\eta_0$ is the unknown proportion of null edges, $f_0$ is the distribution under the null hypothesis of zero partial correlation, κ is the degree of freedom, and $f_A$ is the distribution of observed partial correlations assigned to actually existing edges. The two-sided p-values for each edge corresponding to the null distribution were subsequently calculated, followed by false discovery rate multiple testing [82,83] to obtain q-values. Edges with q-values equal to or lower than 0.05 were included in the resulting GGM network in this study.
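The edge-selection step can be illustrated with a short sketch. Here the Benjamini–Hochberg step-up adjustment stands in for the false discovery rate procedure; this is an assumption made for illustration and is not necessarily the exact q-value computation performed by GeneNet. The edge labels and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_edges(edge_p_values, cutoff=0.05):
    """Keep edges whose adjusted p-value is at or below the cutoff."""
    edges = list(edge_p_values)
    adjusted = benjamini_hochberg([edge_p_values[e] for e in edges])
    return {e: q for e, q in zip(edges, adjusted) if q <= cutoff}


# Hypothetical edges and two-sided p-values, for illustration only.
example_edges = {
    ("TF_a", "enzyme_1"): 0.01,
    ("TF_b", "enzyme_1"): 0.04,
    ("TF_a", "enzyme_2"): 0.03,
    ("TF_c", "enzyme_3"): 0.20,
}
print(select_edges(example_edges))
```

With a cutoff of 0.05, only edges whose adjusted value does not exceed the threshold are retained, mirroring the q-value criterion used for the network in this study.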
Arabidopsis ecotype Columbia-0 was used in this study. Arabidopsis mutant lines were obtained from the T-DNA or transposon inserted mutant collection (Col-0 background) of The Arabidopsis Biological Resource Centre (ABRC) and The Nottingham Arabidopsis Stock Centre (NASC). Accession numbers of Atidd5, c2h2, col, col7, wlim2a, and kh-ccch are SALK_110990, SALK_070916, SALK_061956, SM_3_37788, SALK_067756, and SAIL_672_A10, respectively. Details of all mutant lines are shown in Additional file 8: Table S4. The seeds were vernalized in the dark for 3 days at 4°C before germination. Plants were grown on an equal mixture of sterile vermiculite and peat-based growing medium (PRO-MIX Bx/Microrise Pro, Premier) in a growth cabinet (SANYO) set at 60% humidity and 20-22°C with a light intensity of 100 μmol m⁻² s⁻¹ and under 12 hr light/12 hr dark (short day) or 16 hr light/8 hr dark (long day) cycles. The trays of plant pots were sub-irrigated with a half-strength Arabidopsis liquid nutrient culture. Leaves at a developmental stage of 3.90 were harvested 4 times a day: 1 hr before and after day break and night break. Leaves for starch analysis were harvested at the end of the light period.
Total RNA was extracted from 100–200 mg leaf material (3 biological replicates) using Plant RNeasy kit (Qiagen), treated with DNaseI (Invitrogen), and reverse transcribed by Omniscript Reverse Transcriptase (Qiagen). Subsequently, real-time PCR was carried out using SYBR® Premix Ex Taq™ II (Perfect Real Time) (Takara) with ubiquitin 2 (UBQ2) as a constitutive internal control. Details of the primer pairs used in qRT-PCR experiments are shown in Additional file 9: Table S5.
Starch from Arabidopsis leaves (3 biological replicates) was extracted using the method described by Smith and Zeeman. Gelatinized starch was hydrolyzed to glucose by incubation for 4 hr at 37°C with α-amylase and α-amyloglucosidase. After the enzymatic digestion of starch to glucose, the amount of glucose was quantified by the capillary electrophoresis photodiode array detection (CE-DAD) system according to the manufacturer’s protocol (Agilent). Leaf starch content was calculated from the amount of glucose measured in this enzymatically digested extract.
Fully expanded Arabidopsis leaves were collected at the end of the day, cut into 2 × 2 mm² pieces, and immediately fixed with a cold solution of glutaraldehyde. Various parameters describing starch granule morphology (i.e. area, perimeter, width, height, and circularity) and the number of starch granules per chloroplast were measured from TEM micrographs using ImageJ software (version 1.45). Circularity is calculated by the following formula:

$$\text{circularity} = \frac{4\pi \times \text{area}}{\text{perimeter}^{2}}$$

A value approaching 1.0 indicates a perfect circle, whereas a value approaching 0.0 indicates an increasingly elongated shape. The morphology data were tested for statistically significant differences using the non-parametric Mann–Whitney U test (P-value < 0.05).
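For reference, the ImageJ circularity descriptor (4π × area / perimeter²) can be reproduced with a short function. This is a minimal sketch; the function name and the example shapes are illustrative and are not taken from the measured data.

```python
import math


def circularity(area, perimeter):
    """Circularity = 4 * pi * area / perimeter^2 (1.0 for a perfect circle)."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4 * math.pi * area / perimeter ** 2


# A circle of radius 1 and a 4 x 1 rectangle, as illustrative shapes.
circle_value = circularity(math.pi * 1.0 ** 2, 2 * math.pi * 1.0)
rectangle_value = circularity(4.0 * 1.0, 2 * (4.0 + 1.0))
print(circle_value, rectangle_value)
```

The circle yields the maximum value, while the elongated rectangle yields a clearly lower value, which is the sense in which smaller circularity indicates a more oblong chloroplast or granule cross-section.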
The authors declare that they have no competing interests.
PI carried out the microarray data analysis, the model reconstruction, qRT-PCR experiments, and starch content and granule morphology analysis, and drafted the manuscript. PI, SC, and HT conceived and designed the research. SN and HT edited the manuscript. SP provided assistance in the statistical analysis of the research. JC provided assistance in R programming. HT provided assistance in growing Arabidopsis, qRT-PCR experiment, and starch measurement. AM, MT, and SB provided oversight of the work. All authors read and approved the final manuscript.
Figure S1. The gene association network of 11 carbon-related metabolisms inferred from GGM (Q<0.05).
Table S1. Genes in the starch sub-network.
Table S2. Prediction of TF binding sites in 12 starch metabolic genes (2kb-upstream).
Figure S2. Expression patterns of 2 TFs, At3g07650 (COL9) and At1g05805 (bHLH), and their target genes.
Figure S3. Expression patterns of starch genes in other regulatory modules.
Figure S4. Expression pattern of C2H2 gene in the wild type, Atidd5, and col mutants quantified by qRT-PCR.
Table S3. Descriptive statistics of chloroplast morphology, starch granule morphology, and starch granule number in the wild type, Atidd5, and col mutants.
Table S4. T-DNA insertion lines of 6 candidate TFs that were utilized in this experiment.
Table S5. Primer pairs for quantitative RT-PCR.
We are grateful to Dr. Akinori Suzuki and Ms. Yumiko Tsuchiya, RIKEN Plant Science Center, for technical support in growing Arabidopsis and in the qRT-PCR experiments. We are also grateful to Prof. Dr. Maleeya Kruatrachue, with the assistance of Ms. Sombat Singhakaew, Mahidol University, for providing assistance in transmission electron micrograph analysis. TEM sample preparation and TEM analysis were carried out by service laboratories at Kasetsart University Research and Development Institute (KURDI) and Central Laboratory and Greenhouse complex, Kasetsart University (Kamphaengsaen Campus), respectively. PI is financially supported by Thailand Graduate Institute of Science and Technology (TGIST), Contract No. TGIST 01-47-048, and National Center for Genetic Engineering and Biotechnology (BIOTEC), Grant No. BT-B-02-PG-B5-4813. This work is supported by BIOTEC, Thailand; RIKEN Plant Science Center, Japan; and in part by grants from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan, and Bio-oriented Technology Research Advancement Institution (BRAIN).