Home | About | Journals | Submit | Contact Us | Français |

**|**BMC Syst Biol**|**v.4(Suppl 2); 2010**|**PMC2982684

Formats

Article sections

- Abstract
- Background
- Materials and methods
- Results
- Discussion and conclusion
- Competing interests
- Authors' contributions
- References

Authors

Related links

BMC Syst Biol. 2010; 4(Suppl 2): S10.

Published online 2010 September 13. doi: 10.1186/1752-0509-4-S2-S10

PMCID: PMC2982684

Shihua Zhang: nc.ca.ssma@hsz; Xue-Mei Ning: nc.ca.ssma@mxn; Chris Ding: ude.atu@gnidqhc; Xiang-Sun Zhang: nc.ca.tma@sxz

Selected articles from the Third International Symposium on Optimization and Systems Biology

Luonan Chen, Xiang-Sun Zhang and Yong Wang

Optimization and Systems Biology

20 – 22 September 2009

Zhangjiajie, China

Copyright ©2010 Zhang et al; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article has been cited by other articles in PMC.

With ever increasing amount of available data on biological networks, modeling and understanding the structure of these large networks is an important problem with profound biological implications. Cellular functions and biochemical events are coordinately carried out by groups of proteins interacting each other in biological modules. Identifying of such modules in protein interaction networks is very important for understanding the structure and function of these fundamental cellular networks. Therefore, developing an effective computational method to uncover biological modules should be highly challenging and indispensable.

The purpose of this study is to introduce a new quantitative measure modularity density into the field of biomolecular networks and develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. Specifically, we adopt the simulated annealing (SA) to maximize the modularity density and evaluate its efficiency on simulated networks. In order to address the computational complexity of SA procedure, we devise a spectral method for optimizing the index and apply it to a yeast PPI network.

Our analysis of detected modules by the present method suggests that most of these modules have well biological significance in context of protein complexes. Comparison with the MCL and the modularity based methods shows the efficiency of our method.

Understanding the cell as a system of interacting components is a fundamental goal of current biology. Various types of biological networks are being constructed in cellular systems including PPI networks, gene regulatory networks and metabolic networks, etc. Exploring how molecules interact to form cellular machinery is a key task in systems biology. The well-understood graph-theoretical concepts has become a powerful tool to explore the topology, organization, function and evolution of biological networks. In this field, recent studies have made great progresses which considerably expanded our insights in the organizational principles and cellular mechanisms of cellular systems [1,2].

Modularity has been considered to be one of the main organization principles of biological networks in the past decade years. Biological modules as a critical level of biological hierarchy and relatively independent units play special roles in biological systems [1]. How to uncover modular structures in various biological networks is a basic step for understanding cellular functions and organizational mechanisms of biosystems. For example, by using the network partition, Zhao *et al*. (2006) investigated the functional and evolutionary modularity of human metabolic networks from a topological perspective [3].

In the past few years, a huge number of computational methods have been developed for detecting network modules and analyzing the network structure of biological networks. Hierarchical clustering has been proven to be useful tools for analyzing biological networks. Ravasz *et al*. (2002) studied the hierarchical modular organization of metabolic networks based on a topological linkage matrix. Researchers of three groups [4-6] employed hierarchical clustering based on three different clustering methods respectively to analyze the modular structure of yeast protein interaction networks. The diffusion kernel of graph was also suggested as a universal similarity metric to construct the clustering tree of networks [7]. However, this type of approaches may generate many identical distances (similarity) and leads to the ‘tie in proximity' problem during hierarchical clustering. This type of method imposes a stringent tree structure on the network which is highly sensitive to the metric used to assess (dis)similarity, and typically requires subjective evaluation to define modules. And the evaluation of these (dis)similarity measures for hierarchical clustering is not an easy problem.

Several studies of protein interaction networks have focused on detecting highly connected protein modules [8-10] which generally correspond to meaningful biological units such as protein complexes and functional modules. In general, these approaches only employ local connectivity among proteins and neglect many peripheral proteins that connect to the core protein clusters with few links. However, biological networks including PPI networks are generally very sparse. Most methods only identify strongly connected subgraphs as modules, so only a few modules were detected [9,11,12]. And biologically meaningful sparse protein modules are ignored by these approaches and those lost peripheral proteins may represent experimentally true interactions. Furthermore, because these approaches heavily rely on the local topological connectivity, they ignore the impact of global organization of networks. But biological networks are globally coordinated system, so the local connectivity based methods can not be employed to explore the relationship among modules. Another important factor is the noise of interaction data, other sources such as the function annotation data and gene expression data have been integrated into protein interaction networks to improve the effectiveness of module detection [13,14].

One popular class of methods for dissecting modular structure in the field of general complex networks is based on optimizing a global quality function called modularity [15,16] to partition the network into modules. And it has been comprehensively adopted to analyze biological networks [3,17-19]. However, it has recently been shown that the resolution of the modularity based methods is intrinsically limited. It fails to find small communities in large networks—instead, groups of small communities turn out merged as larger ones [20]. Li *et al*. (2008) proposed a novel quality function called modularity density (*D*) which aims to conquer the resolution limit problem in modularity [21]. They have tested it on many kinds of small networks for illustration but not on large real networks.

In this study, we aim to introduce the new quantitative measure modularity density into the modular analysis of biomolecular networks and develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. We first adopt the simulated annealing (SA) technique to maximize the modularity density and evaluate its advantages on a suit of simulated networks where the modules are known. In order to conquer the computational burden of SA procedure, we adopt a spectral fc-means method for optimizing the measure and apply it to a yeast PPI network. Our biological analysis of detected modules suggests that most of these modules carry distinguished biological significance. We also make a comparison of our method with other two methods including the popular MCL and modularity based methods to verify its effectiveness.

The popular modularity *Q* is defined by Newman and Girvan (2004). Briefly, when the nodes of a network are divided into modules, one can compute it as follows:

where *m* is the number of modules, *L* is the total number of edges in the network, *l _{i}* is the number of edges between nodes in module

In the following, we introduce the so-called modularity density *D* which was proposed as an alternative measure for describing the modular organization [21]. The characteristic of this measure is that it is related to the density of subgraphs. We first define the average modularity degree of subgraph *G _{i}*
(

where *aid* (*G _{i}*) is the average inner degree of the subgraph

This measure provides a way to determine if a certain mesoscopic description of the graph is accurate in terms of modules. The larger the value of *D*, the more accurate a partition is. So the community detection problem can be viewed as a problem of finding a partition of a network such that its modularity density *D* is maximized. The search for optimal modularity density *D* is a challenging problem due to the fact that the space of possible partitions grows faster than any power of system size.

Moreover, the phenomenon of multiple resolutions or/and hierarchy of modular structures have been observed in biological networks [22]. The modularity density *D* can be extended for this more general case using a tuning parameter λ as follows [21]:

where λ is a value ranging from 0 to 1, and when λ = 0.5, the *D*_{0.5} corresponds to modularity density *D*. By varying λ, we can detect detailed and hierarchical organization of biological systems. In other words, we can divide the network into large modules and small modules using a small λ and a large λ respectively.

In principle, the goal of a module detection is to find the ‘optimal’ partition with largest modularity *Q* or modularity density *D*. Several methods have been proposed for optimizing *Q*. Most of them rely on heuristic procedures or approximate strategies. Here, we employ the simulated annealing (SA) technique to maximize *Q* and *D* to obtain the ‘best’ determination of the modules of a network for evaluating. Simulated annealing is a kind of stochastic search technique for optimization problems. It enables one to find ‘low cost' configurations without getting trapped in ‘high cost' local minima and has many applications in combinatorial optimization problems. In the searching process, a global parameter *T* representing temperature is introduced. When *T* is high, the system can explore configurations of high cost while at low *T* the system only explores low cost regions. Along with the decrease of *T*, ‘low cost' configurations can be reached step by step by overcoming small cost barriers. When identifying modules, the objective is to maximize the quantitative indexes (i.e. *Q* or *D*), thus, the cost is *C* = −*Q* or −*D*. At each temperature, we perform a number of random updates and accept them with probability:

(1)

where *C _{i}*
(

Specific implementation detail can be seen in [17]. Note that we add a decision clause to ensure that each potential ‘module' is connected. The one that performs best consists in isolating the module from the rest of the network, and performing a nested' SA, entirely independent of the ‘global’ one. In using *Q* and *D* as fitness functions', the method is more direct than those relying on heuristic procedures. Moreover, SA enables us to carry out an exhaustive search and to minimize the problem of finding sub-optimal partitions. We should note that the SA method can't scale to very large networks, but it is an efficient evaluation method for its exhaustive characteristic. Several efficient methods for optimizing *Q* have been proposed, but designing efficient algorithms for optimizing the new measure (*D*) is still an essential and challenging problem.

Given a network *G* = (*V*, *E*), and denote its vertex set as *V*, edge set as *E* and adjacency matrix as *A*. Given a *m*-partition *P _{m}*, define a corresponding

where *B* is the degree matrix. Let note that or then, we can obtain

So the problem of maximizing *D* can then be expressed as:

(2)

From the standard result in linear algebra, the optimal of the above trace maximization has close relationship with the leading *k* eigenvectors of 2*A* − *B* by relaxing as an arbitrary orthonormal matrix [23]. We can adopt the corresponding spectral algorithms and use the leading *k* eigenvectors of 2*A* − *B* to optimize the modularity density *D*. To obtain the final network partition, we apply the *k*-means clustering method to cluster eigenvectors. Importantly, the same principle can be derived for *D*_{λ}.

Given an upper bound *K* of the number of modules and the adjacency matrix *A* = (*a _{ĳ}*)

• Spectral mapping:

1. Compute the diagonal matrix *B* = (*d _{ii}*), where .

2. Form the eigenvector matrix *U _{k}* = [

• *k*-means: for each value of *k*, 2 ≤ *k* ≤ *K*

1. Form the matrix *U _{k}* = [

2. Normalize the rows of *U _{k}* to unit length using Euclidean distance norm: .

3. Treat the rows of *U _{k}* as points in

• Maximizing modularity density *D* or *D*_{λ} with given λ: Pick the* k* and the corresponding partition *P _{k}*
that maximizes

We should note that this type of spectral clustering technique has been successfully applied to general clustering problems as well as graph clustering problems [24,25]. Here, we explore the characteristic of modularity density *D*, and derive a new spectral clustering based method for maximize *D* (*D*_{λ}) (SpeMD). And the SpeMD procedure described here can be seen as a particular manner of employing the standard *k*-means algorithm on the elements of the leading *k* eigenvectors to extract *k* clusters simultaneously. Convergence and computational complexity of the SpeMD procedure are key problems when this method is applied to large complex networks. Fortunately, several strategies can be employed to improve these problems. First, we can initialize the *k*-means such that the starting centroids are chosen to be as orthogonal as possible [26]. This strategy does not change the time complexity, but can improve the quality of convergence, thus at the same time reduce the need for restarting the random initialization process. Second, several fast techniques for solving eigen system have been developed and several methods of *k*-means acceleration can also be found in the literature. Based on this type of techniques, for large sparse networks with *m* ~ *n*, and *k* *n*, the SpeMD procedure will scale roughly linearly as a function of the number of nodes *n *[25]. Here we didn't consider these ameliorative techniques and only focus on the validity of the SpeMD method.

The biological significance of the numerically computed modules can be validated by comparing the experimentally determined complexes in the MIPS database [27] with the computed modules. We use a best-matching criteria which was first introduced in [8] to match these two type of protein sets. By minimizing the probability *P _{ol}* of a random overlap between a computational protein module and an experimental complex using hypergeometric distribution, we determine the best-matching experimental complex for a protein module. The

where *N* is the size of the PPI network, *k* is the number of their common proteins, and |*C*|, |*M*| are the sizes of an experimental complex and a computed protein module respectively.

Furthermore, the geometric accuracy and separation described in the study of Brohee and van Helden [28] are employed to evaluate the performance of the module-detection methods. We first build a contingency table *T*, where row *i* corresponds to the *i* th experimental complex and column *j* to the *j* th module and the value of a cell *T _{ĳ}* indicates the number of proteins found in common between complex

**Accuracy:** The complex-wise sensitivity is defined as the maximal fraction of proteins in complex *i* that could be found in one module by the formula:

where *N _{i}* is the number of proteins belonging to complex

Moreover, the cluster-wise positive predictive value
is calculated as the maximal fraction of proteins in module *j* found in the best-matching complex by the formula:

To determine the general PPV (positive predictive value) of a partitioning result as a whole, the clustering-wise PPV is computed as the weighted average of over all modules by:

Finally, the geometric accuracy (ACC) indicating the tradeoff between sensitivity and predictive value can be obtained by computing the geometric mean of the SN and the PPV as follows:

**Separation:** From the contingency table, the relative frequencies with respect to the marginal sums can be defined for each row and each column as:

Then the separation is computed as the product of column-wise and row-wise frequencies by:

The complex-wise separation (cluster-wise separation for module ) is calculated as the sum of separation values for a given complex *i* (module *j*) by:

The complex-wise *SEP _{co}* and clustering-wise

The the geometric separation (SEP) is defined as the geometric mean of *SEP _{co}* and

In this section, we apply the present method to a suit of simulated networks and a yeast PPI network to test its efficiency. We first present detailed numerical results to show the difference of network partition determined by maximizing the modularity density *D* and modularity *Q* with simulated annealing (SA) technique. In general, maximizing *D* (MD) can give more detailed and valid results, while maximizing *Q* (MQ) encounters serious resolution limit in simulated networks.

Then we apply the new spectral method for maximizing the generalized *D _{λ}* (SpeMD) to a yeast PPI network to identify functional modules which show significant biological relevance. Comparison with MQ and MCL, we show that the SpeMD can obtain competitive performance with the well-known MCL method and resolve much finer modular structure than MQ method. To extract appropriate modules, the SpeMD and MCL both rely on one parameter. Here, we perform the SpeMD and MCL with adjusted parameters to obtain the ‘best’ geometric accuracy and separation. For SpeMD, we tune

First we do the comprehensive tests on a group of simulated networks which take on significant modular characteristics. In the work of [21], *D*-based method has been showed to be able to obtain competitive performance with *Q*-based method. However, the size of artificial networks generated by using Newman's popular procedure as well as its variant are too small to show the serious resolution limit problem of *Q*. Therefore, we devise a new type of artificial networks. The network is composed of 2*m* cliques (*m**n*_{1}-clique and *m**n*_{2}-clique), and external edges are placed randomly with a fixed expectation values so as to keep the average edge connections *k _{out}* of each node to nodes of other cliques. So each network has

The computational results for this experiment are summarized in Figure Figure11 and Figure Figure2,2, where NC is the number of cliques, i.e., *NC* = 2*m*. Figure Figure11 shows the fraction of nodes that are correctly classified into the communities (Precision) with respect to *k _{out}* by MD and MQ respectively. We can see that MD method based on

The most interesting observation is that performance of MD is almost the same, while that of MQ is greatly decreasing with the increase of NC (also the size of networks). For example, for 50 random networks with *k _{out}* = 6, always on an average >99.9% nodes are classified correctly by MD on four different sizes of networks with

To test the performance of MD and MQ in selecting the number of communities, we calculate the number of modules. Figure Figure22 shows the averaged number of modules on four different sizes of networks (*NC* = 20, 40, 60, 80) with respect to *k _{out}* by MD and MQ respectively. We can see that MD performs much better than MQ. The MD can almost always identify the right number of modules in four different sizes of networks with

The budding yeast *S. cerevisiae* PPI network was obtained from the DIP database (http://dip.doe-mbi.ucla.edu/dip/), which contains human-curated high-throughput and small-scale binary interactions directly observed in experiments, as well as binary interactions inferred from high-confidence protein complex data. We only considered non-self physical interactions and built the PPI network. The giant component of the PPI network is composed of 2559 proteins linked by 7031 nonredundant interactions. In order to test the ability of SpeMD to extract complexes from the interaction network and compare it with other two methods, we compared the detected modules to known complexes in yeast as annotated by the Munich Information Center for Protein Sequences (MIPS) [27] using the *P _{ol}* formula. We apply the SpeMD method to the yeast PPI network to detect functional modules. Totally, we obtain 279 protein modules of sizes from 4 to 38 with λ = 0.6 (To extract statistically and biologically significant modules, we remove 48 modules with size ≤ 3). In Table Table1,1, we show detailed information of 15 detected modules and their matching with experimentally determined protein complexes that were catalogued in MIPS database. Figure Figure33 presents two such modules. For example, the second one is a nine-member module which matches with Golgi transport complex for stimulating intra-Golgi transport with

There has been many methods for detecting network modules. The comparison of all the methods is not an easy task. Here, we attempt to compare the MD (SpeMD) with two types of classical methods: MQ and MCL. Just as we have mentioned, the modularity (*Q*) maximization based module-detection method has been comprehensively applied in many fields including analysis of biological networks. Another method is the Markov Cluster algorithm (MCL) which was developed by van Dongen [29]. The method simulates a flow on the network by calculating successive powers of the network adjacency matrix. In each iteration, an inflation step is applied to enhance the contrast between regions of strong or weak flow in the network. The process converges towards a partition of the network, with a set of high-flow regions separated by boundaries with no flow. The value of the *inflation parameter* strongly influences the the size and number of the detected modules. In a recent evaluation study [28], the algorithm was found to be superior to several representative graph clustering algorithms including MCODE [9], RNSC [11] and SPC [8] for the prediction of protein complexes.

The module size distribution of detected modules for each method on the PPI network have been shown in Figure Figure4.4. The SpeMD and MCL identify 279 and 242 modules respectively without extremely large clusters (λ = 0.6 for SpeMD and inflation parameter = 1.7 for MCL). The major trend generated by MD and MCL are both similar to that of the complexes in MIPS database, which suggest the definition of modularity density is reasonable (Note that the MIPS complex is a combination of hand-curated and experimental complexes. They have some overlap, so complex curve is higher. But the trend is similar). Unfortunately, the module size distribution of MQ is very different from the previous ones. The MQ only detect 21 modules with relative large size ranging from 39 to 263. As tested on the simulated networks, the MQ method is highly limited by the resolution problem (Figure (Figure55).

As to biological significance, the accuracy and separation are used for evaluating the correspondence between complexes and modules from each method [28]. From Figure Figure6,6, we can easily see that the SpeMD and MCL have consistently better performance than MQ. This means the modularity density based partition method can produce more biologically significant modules than the modularity based method. And the new quality function may become an evaluation index of modularity organization of networks. While MCL has no such evaluation function.

PPI networks are typical examples of complex biological systems that are difficult to understand from raw experimental data alone. Algorithmic and modeling progresses in the area of biomolecular networks analysis could contribute to the understanding of biological processes and organization. Many methods have been developed to organize, display and extract significant patterns in these systems [2].

A number of network clustering algorithms have been proposed to find modular structures in PPI networks and other biological networks. Our work is a further development along this line for dissecting biological systems. Here we introduce the quantitative measure (Modularity density *D*) for exploring modular organization of networks to the field of biomolecular networks. We suggest the SA technique to maximize it for rigorous evaluation and we propose an efficient spectral -means method in the decomposition procedure. Our comparative experiments with MCL and MQ on a yeast PPI network show that the MD (SpeMD) method can effectively detect protein interaction modules from a complex interaction network. In the current research, we use known complexes to choose the optimal λ as well as the inflation parameter for MCL algorithm. Actually, we can also adopt an intrinsic measure which compares the resulting modules against the original network to choose the most appropriate parameter in an unsupervised manner. For example, van Dongen [29] suggested the so-called *efficiency* measure to test the performance of network clustering efficiency. Therefore, the present method can be easily adapted to a fully self-contained method that doesn't rely on any known data or given parameters. The current algorithm, as most clustering methods, uncovers only disjoint modules (clusters). However, in real biological systems, proteins can be contained in more than one functional module or complex. Zhang *et al*. (2007c) has suggested to apply fuzzy c-means clustering method to a spectral space for uncovering fuzzy modules [25]. It can also be addressed using an intrinsic measure based on the original network in the same way as suggested in [30] by post-processing the modules obtained from the present algorithm.

Modularity *Q* have been extensively employed for dissecting and evaluating the modular organization of biomolecular networks [3,17-19] as well as clustering the graphic representation of gene expression profile data [31]. However, the heavy resolution limit of modularity *Q* reminds researchers to use it cautiously. And the modularity density *D* may become an alterative measure to achieve these goals.

In summary, our method is very effective for uncovering modular organization in biomolecular networks. It provides an objective approach to explore the organization and interactions of biological processes. With the increasing amount of biological ‘interaction' data available, MD (SpeMD) can facilitate the construction of a more complete view of the composition and interconnection of functional modules and the understanding of the organization of the whole cell at system level. We plan to automate this algorithm to compute functional modules for a large number of biological networks. We hope that related studies will benefit from the present method coupled with the modularity density *D* (*D*_{λ}).

The authors declare that they have no competing interests.

SZ designed the study. SZ and XMN implemented the method, performed the experiments and analyzed the data. CD and XSZ contributed to discussions on the method. SZ, XMN, CD and XSZ wrote the manuscript.

This work is partially supported by the National Natural Science Foundation of China under grant No. 10631070, No.60873205, Innovation Project of Chinese Academy of Sciences, kjcs-yw-s7, the Ministry of Science and Technology, China, under Grant No.2006CB503905 and the ‘Special Presidential Prize and Excellent PhD Thesis Award' Scientific Research Foundation of CAS.

This article has been published as part of *BMC Systems Biology* Volume 4 Supplement 2, 2010: Selected articles from the Third International Symposium on Optimization and Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1752-0509/4?issue=S2

- Barabasi A, Oltvai Z. Network biology: understanding the cell's functional organization. Nature Rev. Gen. 2004;5:101–113. doi: 10.1038/nrg1272. [PubMed] [Cross Ref]
- Zhang S, Jin G, Zhang XS, Chen L. Discovering functions and revealing mechanisms at molecular level from biological networks. Proteomics. 2007;7:2856–2869. doi: 10.1002/pmic.200700095. [PubMed] [Cross Ref]
- Zhao J, Yu H, Luo JH, Cao ZW, Li YX. Hierarchical modularity of nested bow-ties in metabolic networks. BMC Bioinformatics. 2006;7:386. doi: 10.1186/1471-2105-7-386. [PMC free article] [PubMed] [Cross Ref]
- Brun C, Chevenet F, Martin D, Wojcik J, Guenoche A, Jacq B. Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network. Genome Biol. 2003;5:R6. doi: 10.1186/gb-2003-5-1-r6. [PMC free article] [PubMed] [Cross Ref]
- Lu H, Zhu X, Liu H, Skogerbo G, Zhang J, Zhang Y, Cai L, Zhao Y, Sun S, Xu J, Bu D, Chen R. The interactome as a tree–an attempt to visualize the protein-protein interaction network in yeast. Nucleic Acids Res. 2004;32:4804–4811. doi: 10.1093/nar/gkh814. [PMC free article] [PubMed] [Cross Ref]
- Rives AW, Galitski T. Modular organization of cellular networks. Proc. Natl Acad. Sci., USA. 2003;100:1128–1133. doi: 10.1073/pnas.0237338100. [PubMed] [Cross Ref]
- Zhang S, Ning XM, Zhang XS. Graph kernels, hierarchical clustering, network community structure: experiment and comparative analysis. Eur. Phys. J. B. 2007;57:67–74. doi: 10.1140/epjb/e2007-00146-y. [Cross Ref]
- Spirin V, Mirny LA. Protein complexes and functional modules in molecular networks. Proc. Natl Acad. Sci, USA. 2003;100:12123–12126. doi: 10.1073/pnas.2032324100. [PubMed] [Cross Ref]
- Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4:2. doi: 10.1186/1471-2105-4-2. [PMC free article] [PubMed] [Cross Ref]
- Bu D, Zhao Y, Cai L, Xue H, Zhu X, Lu H, Zhang J, Sun S, Ling L, Zhang N. et al. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucleic Acids Res. 2003;31:2443–2450. doi: 10.1093/nar/gkg340. [PMC free article] [PubMed] [Cross Ref]
- King AD, Przulj N, Jurisica I. Protein complex prediction via cost-based clustering. Bioinformatics. 2004;20:3013–3020. doi: 10.1093/bioinformatics/bth351. [PubMed] [Cross Ref]
- Palla G, Derényi I, Farkas I, Vicsek T. Uncovering the overlapping community structure of complex networks. Nature. 2005;435:814–818. doi: 10.1038/nature03607. [PubMed] [Cross Ref]
- Cho Y, Hwang W, Ramanathan M, Zhang A. Semantic integration to identify overlapping functional modules in protein interaction networks. BMC Bioinformatics. 2007;8:265. doi: 10.1186/1471-2105-8-265. [PMC free article] [PubMed] [Cross Ref]
- Segal E, Wang H, Koller D. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics. 2003;19(S1):264–272. doi: 10.1093/bioinformatics/btg1037. [PubMed] [Cross Ref]
- Newman ME, Girvan M. Finding and evaluating community structure in networks. Phys. Rev. E. 2004;69:026113. doi: 10.1103/PhysRevE.69.026113. [PubMed] [Cross Ref]
- Newman MEJ. Modularity and community structure in networks. Proc. Natl. Acad. Sci., USA. 2006;103:8577–582. doi: 10.1073/pnas.0601602103. [PubMed] [Cross Ref]
- Guimer R, Amaral LAN. Functional cartography of complex metabolic networks. Nature. 2005;438:895–900. doi: 10.1038/nature03288. [PMC free article] [PubMed] [Cross Ref]
- Caretta-Cartozo C, De Los Rios P, Piazza F. et al. Bottleneck Genes and Community Structure in the Cell Cycle Network of S. pombe. PLoS Comput. Biol. 2007;3:e103. doi: 10.1371/journal.pcbi.0030103. [PubMed] [Cross Ref]
- Wang Z, Zhang J. In search of the biological significance of modular structures in protein networks. PLoS Comput. Biol. 2007;3:e107. doi: 10.1371/journal.pcbi.0030107. [PubMed] [Cross Ref]
- Fortunato S, Barthélemy M. Resolution limit in community detection. Proc. Natl. Acad. Sci., USA. 2007;104:36–41. doi: 10.1073/pnas.0605965104. [PubMed] [Cross Ref]
- Li Z, Zhang S, Wang RS, Zhang XS, Chen L. Quantitative function for community detection. Physical Review E. 2008;77:036109. doi: 10.1103/PhysRevE.77.036109. [PubMed] [Cross Ref]
- Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL. Hierarchical organization of modularity in metabolic networks. Science. 2002;297:1551–1555. doi: 10.1126/science.1073374. [PubMed] [Cross Ref]
- Bach F, Jordan M. Learning spectral clustering. In Proceedings of 17th Advances in Neural Information Processing Systems. 2004.
- White S, Smyth P. A spectral clustering approach to finding communities in graphs. In Proceedings of SIAM International Conference on Data Mining. 2005.
- Zhang S, Wang RS, Zhang XS. Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A. 2007;374:483–490. doi: 10.1016/j.physa.2006.07.023. [Cross Ref]
- Ng A, Jordan M, Weiss Y. On spectral clustering: analysis and an algorithm. Adv. Neural Inf. Process. Systems. 2002;14:849–856.
- Mewes HW, Frishman D, Guldener U, Mannhaupt G, Mayer K, Mokrejs M, Morgenstern B, Munsterkotter M, Rudd S, Weil B. MIPS: a database for genomes and protein sequences. Nucleic Acids Res. 2002;30:31–34. doi: 10.1093/nar/30.1.31. [PMC free article] [PubMed] [Cross Ref]
- Broheé S, van Helden J. Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics. 2006;7:488. doi: 10.1186/1471-2105-7-488. [PMC free article] [PubMed] [Cross Ref]
- van Dongen S. Graph clustering by flow simulation. Ph‚D thesis, University of Utrecht, Centers for mathematics and computer science (CWI) 2000.
- Friedel CC, Krumsiek J, Zimmer R. Bootstrapping the interactome: unsupervised identification of protein complexes in yeast. RECOMB. 2008;4955:3–16. [PubMed]
- Stone EA, Ayroles JF. Modulated modularity clustering as an exploratory tool for functional genomic inference. PLoS Genetics. 2009;5:e1000479. doi: 10.1371/journal.pgen.1000479. [PMC free article] [PubMed] [Cross Ref]

Articles from BMC Systems Biology are provided here courtesy of **BioMed Central**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |