High-throughput biological technologies allow the simultaneous measurement of the expression of thousands of genes or proteins, which offers an unprecedented opportunity to fully characterize biological processes. Nevertheless, extracting a comprehensive overview from this huge amount of information is a significant challenge. During the last decade, high-throughput analysis has mainly focused on dissecting the individual genes responsible for specific phenotypes, and some biomarkers for human diseases have been successfully identified through analysis of genome-wide expression profiles. However, it is well accepted that genes or proteins within a cell do not function alone; they interact with each other to form networks or pathways that carry out biological functions. Therefore, it is crucial to reveal the essential biological mechanisms from a system perspective, and pathway-based analysis is becoming a popular method of analyzing high-throughput data. Several approaches have been proposed to score known pathways by the coherency of expression changes among their member genes. Generally, a known pathway is drawn from sources such as the Gene Ontology (GO) and KEGG databases. Compared with scoring documented pathways, however, identifying novel sub-networks or pathways responsive to phenotypes from biomolecular networks is a more difficult task. Recently, gene-set-based or pathway-based analysis has been extended to perform classification of microarray data by exploiting the phenotype difference, and a number of approaches have been demonstrated that, rather than scoring known pathways, extract relevant sub-networks based on coherent expression patterns of the corresponding genes in protein-protein interaction (PPI) networks. However, these approaches are mainly molecule-complex-based or individual-gene-based analyses, such as the approach in which candidate sub-networks are seeded with a single protein and iteratively expanded by adding other proteins. Note that, in biology, a complex is a cluster of genes or proteins so closely related that they intergrade, while a pathway is a group of genes or proteins that interact (or are related).
In contrast to existing works, in this paper we first developed a novel module-based method to identify phenotype-based responsive modules by integrating gene expression data and high-quality PPI networks; the identified modules can reveal the potential causal or dependent relations between network modules and biological phenotypes. Specifically, rather than identifying individual genes and gene sets, we formulated the identification of phenotype-based responsive modules as a multi-classification problem of modules on phenotypes, solved by a mathematical programming model, where the modules are derived from the topological structure of the PPI networks. Then, the proposed method was applied to the cell cycle process of the budding yeast Saccharomyces cerevisiae (S. cerevisiae) to identify phenotype-based responsive modules and to further examine their regulatory roles in the phase transition process, based on the microarray data from our biological experiments.
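To make the idea of classifying phenotypes with modules rather than individual genes concrete, the following is a minimal sketch, not the mathematical programming model of this paper. It assumes that modules are already given as gene lists, that a module's activity in a sample is the mean expression of its member genes, and that a simple nearest-centroid rule stands in for the classifier. The gene names, module names, expression values, and function names are all hypothetical.

```python
from statistics import mean

def module_activity(sample, modules):
    """Summarize one sample as {module: mean expression of its member genes}.

    Assumes every module has at least one gene measured in the sample.
    """
    return {
        name: mean(sample[g] for g in genes if g in sample)
        for name, genes in modules.items()
    }

def fit_centroids(samples, labels, modules):
    """Average the module activities of all samples sharing a phenotype label."""
    grouped = {}
    for sample, label in zip(samples, labels):
        grouped.setdefault(label, []).append(module_activity(sample, modules))
    return {
        label: {m: mean(a[m] for a in acts) for m in modules}
        for label, acts in grouped.items()
    }

def predict(sample, centroids, modules):
    """Assign the phenotype whose centroid is closest in module-activity space."""
    act = module_activity(sample, modules)

    def sq_dist(centroid):
        return sum((act[m] - centroid[m]) ** 2 for m in modules)

    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Toy data (hypothetical genes and values, for illustration only).
MODULES = {"M1": ["g1", "g2"], "M2": ["g3", "g4"]}
train = [
    {"g1": 2.0, "g2": 1.8, "g3": 0.1, "g4": 0.2},
    {"g1": 2.2, "g2": 2.0, "g3": 0.0, "g4": 0.3},
    {"g1": 0.1, "g2": 0.3, "g3": 1.9, "g4": 2.1},
    {"g1": 0.2, "g2": 0.0, "g3": 2.0, "g4": 2.2},
]
phenotypes = ["G1", "G1", "S", "S"]

centroids = fit_centroids(train, phenotypes, MODULES)
new_sample = {"g1": 0.3, "g2": 0.1, "g3": 1.7, "g4": 2.0}
predicted = predict(new_sample, centroids, MODULES)
print(predicted)
```

In this sketch the features are module-level summaries, so a module whose centroid activity differs strongly between phenotypes is the kind of unit that a module-based formulation would treat as responsive.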
The cell cycle process, by which one cell grows and divides into two daughter cells, is a vital biological process whose regulation is highly conserved among eukaryotes. Although extensive studies have been conducted on the cell cycle process, in particular by modeling the budding yeast, many regulatory details remain unclear from a network viewpoint. Generally, there are two main ways to perturb a biological system: external stimulus, such as exposure to DNA-damaging agents like methyl methanesulfonate (MMS), and internal stimulus, such as knocking out some genes. To functionally relate network modules to different phenotypes, we designed biological experiments that combine these two types of stimuli so as to create various phenotypes for the cell cycle process. In our biological experiments, when MMS is added at 15 min (G1 phase) or elg1 is knocked out at the beginning of the cell cycle process, the cell cycle continues; however, when MMS is added at 15 min (G1 phase) to elg1 mutant strains, the cell cycle arrests at S phase.
By applying the proposed module-based method to high-throughput data for the various phenotypes produced in our biological experiments, we identified phenotype-based responsive modules and dynamical transition modules of the budding yeast cell cycle. A responsive module is a module that potentially plays an important role in some phenotypes, while a transition module is a module that is potentially responsible for the transition from one phenotype to another from a dynamical perspective. After identifying the phenotype-based and transition-based responsive modules for the cell cycle phases and their transitions, we also validated the identified modules by classifying the cell cycle phases of two independent datasets on the budding yeast cell cycle.
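The distinction between responsive and transition modules can be illustrated with a small sketch. It is not the method of this paper: it assumes that each phenotype has already been summarized as a mean activity per module, and it simply ranks modules by the absolute change in that activity between two phenotypes. The module names, values, and function names are hypothetical.

```python
def transition_scores(activity_a, activity_b):
    """Absolute change in module activity from phenotype A to phenotype B."""
    return {
        m: abs(activity_b[m] - activity_a[m])
        for m in activity_a
        if m in activity_b
    }

def top_transition_modules(activity_a, activity_b, k=1):
    """Return the k modules whose activity changes most between the phenotypes."""
    scores = transition_scores(activity_a, activity_b)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy per-phenotype module activities (hypothetical values).
g1_phase = {"M1": 2.0, "M2": 0.15, "M3": 1.0}
s_phase = {"M1": 0.15, "M2": 2.05, "M3": 1.1}

candidates = top_transition_modules(g1_phase, s_phase, k=2)
print(candidates)
```

Under these assumptions, a module such as the hypothetical M3, whose activity barely changes, would not be a transition candidate even if it is active in both phenotypes.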
Based on the computational and experimental results, the main contributions of this work can be summarized as follows. First, our method is able to identify phenotype-based and transition-based responsive network modules, drawn from the topological structure of a biomolecular network, instead of dissecting complexes or individual-gene-based pathways. Second, the identified modules lead to new insights into the cell cycle process and provide biological interpretations of the functional roles of network modules. In particular, according to the validation on the two independent datasets and the functional validation, the phenotype-based responsive modules are potential signatures or network biomarkers of the cell cycle process. From a network viewpoint, we revealed why the cell cycle arrests at S phase under combined internal and external stimuli: we identified one well-known module “CLN1 CLN2 CLN3 BUD2” involved in the cell cycle process and two new modules, “PKC1 TOS2 KEL2 PPZ2 SKN7” and “SSD1 LST8 TOR1 KOG1 TOR2”, whose dysfunction results in cell cycle arrest. Third, our method is also a new theoretical model for multi-classification analysis, which we used to study the cell cycle process by relating network modules to different phenotypes and even to phase transitions from a dynamical perspective. In addition, we showed that the identified responsive modules can be directly used to annotate the functions of genes or proteins.