Home | About | Journals | Submit | Contact Us | Français |

**|**Scientific Reports**|**PMC5386204

Formats

Article sections

Authors

Related links

Sci Rep. 2015; 5: 9602.

Published online 2015 May 6. doi: 10.1038/srep09602

PMCID: PMC5386204

Received 2014 September 28; Accepted 2015 March 11.

Copyright © 2015, Macmillan Publishers Limited. All rights reserved

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Identifying the most influential spreaders is an important issue in understanding and controlling spreading processes on complex networks. Recent studies showed that nodes located in the core of a network as identified by the *k*-shell decomposition are the most influential spreaders. However, through a great deal of numerical simulations, we observe that not in all real networks do nodes in high shells are very influential: in some networks the core nodes are the most influential which we call true core, while in others nodes in high shells, even the innermost core, are not good spreaders which we call core-like group. By analyzing the *k*-core structure of the networks, we find that the true core of a network links diversely to the shells of the network, while the core-like group links very locally within the group. For nodes in the core-like group, the *k*-shell index cannot reflect their location importance in the network. We further introduce a measure based on the link diversity of shells to effectively distinguish the true core and core-like group, and identify core-like groups throughout the networks. Our findings help to better understand the structural features of real networks and influential nodes.

The most influential nodes can maximize the speed and scope of information spreading compared with other nodes in a network^{1}. Locating these influential nodes is important in improving the use of available resources^{2} and controlling the spread of information^{3}. A critical issue is how to determine and distinguish the spreading capability of a node. Centrality is usually used to measure the relative importance of nodes within the network, such as degree centrality^{4}, betweenness centrality^{5}, closeness centrality^{6}, eigenvector centrality^{7}, PageRank centrality^{8} and its variance^{9}. Nodes with high centrality are considered more influential in the spreading process^{10}^{,11}^{,12}^{,13}^{,14}. Among these measures, degree centrality is a simple and effective way, although it is based only on local link information^{15}^{,16}. The merit of degree is challenged by a recent study^{12}, in which the authors pointed out that the most influential spreaders do not correspond to the nodes with largest degree, but are those located in the core of the network as identified by the *k*-shell decomposition^{17}. This means the higher coreness of a node, the more influential it is in the spreading dynamics.

The *k*-shell decomposition decomposes a network into hierarchically ordered shells by recursively pruning the nodes with degree less than current shell index (see Methods for details). This procedure assigns each node with an index *k _{S}*, representing its coreness in the network. A large

In all these studies, the *k*-shell decomposition is used as a powerful tool to analyze the network structure and identify important nodes. Despite its effectiveness, researchers have noticed that the *k*-shell method has some defects^{28}^{,42}. For example, when there is a lack of the complete network structure, one can not apply the *k*-shell decomposition to the network. In tree structure and BA model network, the capability of finding influential spreaders is limited due to the low resolution of *k _{S}* index. Here, from the perspective of spreading efficiency of cores that have been identified by the

We first calculate the imprecision of coreness and degree in identifying influential spreaders and discover the true core and core-like group in real networks. We then analyze the structural features of the true core and core-like group and uncover their difference. Finally we successfully locate the core-like groups throughout the network by defining a measure of link entropy.

We use a classic Susceptible-Infected-Recovered (SIR) spreading model to simulate the spreading process^{43}^{,44}, and record the spreading capability (or spreading efficiency) of each node, which is defined as the average size of infected population *M* for each node as spreading origin (see Methods for details). To evaluate whether the structural centrality of coreness is an effective index to measure the spreading capability of nodes compared with degree, we calculate the imprecision function and proposed in Ref. ^{12}. The imprecision function is defined as

where *p* is the fraction of network size , and *M _{eff}*(

The imprecision functions of nine real networks are shown in Fig. 1. Contrary to common belief, the coreness *k _{s}* does not perform consistently well in all networks. We divide them into three groups. In Router, Emailcontact and AS networks, the

To find out the reason for the distinct performance of coreness in predicting spreading efficiency is the origin of our research interest in this paper. In the following, we first explore the structural characteristics of the first two groups of networks, and then explain the performance of coreness in the last three networks. As we know, the *k*-shell decomposition tends to assign many nodes with identical *k _{S}* value, although their spreading capabilities may be different. When we calculate the imprecision function at a certain

where is the average spreading efficiency of the nodes with coreness (nodes in *k _{S}*-core), and

where *M _{k}*(

In order to find out the reason for the poor performances of coreness in the spreading process, we first look into the structural properties of the studied real networks. The features of the studied real networks are listed in Table 1. From Table 1, we see that the degree heterogeneity *H _{k}* of the first group is sufficiently larger than that of the second group. The degree heterogeneity is defined as

To evaluate whether the difference of *H _{k}* and

We move to explore the complex connectivity between shells of the studied real networks. Specifically, we consider the link patterns from each shell to its upper shells (shells with greater *k _{S}* index), equal shell (the shell with equal

where is the number of links originating from node *i* to nodes in upper shells, *k _{i}* is the total number of links of node

where consists of nodes with coreness *k _{S}*, is the number of nodes in

From Fig. 3 (a)–(c) for the first group of networks, generally decreases with *k _{S}*, this is because the number of nodes in upper shells decreases monotonously with the increase of

Next, we focus on the link pattern of the innermost core. The link strength defines the ratio of links from a innermost core node *i* to the shell with index *k _{S}* to the degree of node

The above analysis suggests an obvious structural difference between the two groups of networks: in the first one, the link pattern of innermost core to other shells exhibits a strong diversity, while in the second group, the link of innermost core is very localized within the shell. To quantify the link diversity of a shell with index *k _{S}*, we define a link entropy as

where is the average link strength of *k _{S}*-shell to the -shell and

Uncovering locally connected core-like groups leads us to understand the imprecision of coreness centrality in the spreading process. We present the imprecision function of PGP, Netsci and Astro networks in Fig. 6 (a)–(c). The coreness performs very well at large *k _{S}* values, but then rises suddenly at certain shells. Specifically speaking, in PGP at the 22-shell and above, the imprecision of

According to Eq. (6), we calculate the link entropy of each shell in these networks. The shells outlined by hollow red circles in Fig. 6 (d)–(f) have relatively low entropy, which corresponds to locally connected core-like groups. This is reflected by the rise in *k _{S}* imprecision shown in Fig. 6 (a)–(c). The link patterns of the core-like groups, shown in Fig. S8, are similar to that of false cores in the second group of Email, CA-Hep and Hamster: a dense connection within the shell. The only difference is that these core-like groups locate in the outer shells of the network other than the innermost shell. These core-like groups have an obvious low spreading efficiency than their adjacent shells, which is also confirmed in Fig. S9 and S10. From the above, we see the link entropy provides a fast way to locate the position of core-like groups in the network without running a large amount of spreading simulations, which is very important in identifying key spreaders and controlling the spreading dynamics on networks.

Analyzing and profiling the structures of real networks is an important step in understanding and controlling dynamic behaviors on networks. The *k*-shell decomposition is a powerful tool to profile the hierarchical structures of networks. The inner core corresponds to the shells of large *k _{S}* and the network periphery corresponds to the shells of small

Uncovering these core-like groups is important in identifying key players and making control strategy for spreading dynamics. It is worth noticing that in the core-like groups, there may also exist some good spreaders. It implies that there should be new network analysis method which will effectively locate the nodes of different importance in core-like groups in the right hierarchical position. The new method should apply well in real networks with specific structures such as strong community structures.

The algorithm starts by removing all nodes with degree *k* = 1. After removing all nodes with *k* = 1, there may appear some nodes with only one link left. We should iteratively remove these nodes until there is no node left with *k* = 1. The removed nodes are assigned with an index *k _{S}* = 1 and are considered in the 1-shell. In a similar way, nodes with degree

The Susceptible-Infected-Recovered (SIR) model is widely used for simulating the spreading process on networks. In the model, a node has three possible states: *S* (susceptible), *I* (infected) and *R* (recovered). An individual in the susceptible state does not have the disease yet but could catch it if they come into contact with someone who does. An individual in the infected state has the disease and can pass it to susceptible individuals. An individual in recovered state neither spread disease nor be infected by others. In the start of a spreading process, a single node is infected, considered as seed, and all other nodes are in susceptible states. At each time step, there are two stages. In the first stage, susceptible individuals become infected with probability *λ* when they have contacted with an infected neighbor. In the second stage, infected nodes recover or die (change to *R* state) with probability *μ*. Here we set *μ* = 1 for generality. The spreading process stops when there is no infected node in the network. The proportion of recovered nodes defines the final infection population in a spreading process. We record the average infected population *M _{i}* originating at node

As we take the final infected population to quantify the spreading efficiency of each node, the infection probability should be carefully considered. If it is too large, the effect of node position is not obvious and all nodes show almost identical spreading capabilities. If it is too small, the infection is very localized in the neighborhood, which cannot reflect the overall spreading influence of the nodes. So we first calculate the epidemic threshold of a network using the heterogeneous mean-filed method in Ref. ^{47}. That is *λ _{c}* =

In the first rewiring scheme, we randomly choose two edges of the network, and label the ends of the first edge as A and B, and the ends of the second edge as C and D. Then we rewire the two edges, connecting end A and D as an edge, and connecting end B and C as another edge. We avoid multiple edge and self-edge in the rewiring process. This rewiring preserves the degree sequence of the original real network but destroys the degree correlations. In the second rewiring scheme, we randomly choose an edge and test the degree of one end, record as *k*. A second edge with an end having degree *k* is then chosen. We rewire the two edges as before and ensure that the end connecting to a node of degree *k* still connects to a node of degree *k* after rewiring. This scheme preserves both the degree sequence and the degree-degree correlations as the original real network.

The real networks studied in the paper are: (1) Router (the router level topology of the Internet, collected by the Rocketfuel Project)^{48};(2) Email-contact (Email contacts at Computer Science Department of University college London)^{12}; (3) AS (Internet at the autonomous system level)^{49}; (4) Email (e-mail network of University at Rovira i Virgili, URV)^{50};(5) CA-Hep (Giant connected component of collaboration network of arxiv in high-energy physics theory)^{51}; (6) Hamster (friendships and family links between users of the website hamsterster.com)^{52}; (7) PGP (an encrypted communication network)^{53}; (8) Netsci (collaboration network of network scientists)^{54}; (9) Astro physics (collaboration network of astrophysics scientists)^{55}.

This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 11105025, 91324002, 61433014), Scientific Research Starting Project of Southwest Petroleum University (No. 2014QHZ024) and Youth Foundation of Southwest Petroleum University (Grant No. 285). Y. Do was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2013R1A1A2010067).

The authors declare no competing financial interests.

**Author Contributions** Y.L., M.T. and T.Z. devised the research project. Y.L. performed numerical simulations. Y.L., M.T. and Y.H.D. analyzed the results. Y.L., M.T., T.Z. and Y.H.D. wrote the paper.

- Kempe D., Kleinberg J. & Tordos . Maximizing the spread of influence through a social network. in Proc of the 9th ACM SIGKDD Int. Conf. on knowledge discovery and data mining (ACM, Washington, DC, USA, 2003), KDD 03 137–146 (2003).
- Gallos L. K., Liljeros F., Argyrakis P., Bunde A. & Havlin S. Improving immunization strategies. Phys. Rev. E 75 045101(R) (2007).
- Lloyd-Smith J. O., Schreiber S. J., Kopp P. E. & Getz W. M. Superspreading and the effect of individual variation on disease emergence. Nature 438, 355–359 (2005). [PubMed]
- Freeman L. C. Centrality in social networks conceptual clarification. Social Networks 1, 215–239 (1978).
- Freeman L. C. A set of measures of centrality based upon betweenness. Sociometry 40, 35–41 (1977).
- Sabidussi G. The centrality index of a graph. Psychometrika 31, 581–603 (1966). [PubMed]
- Bonacich P. & Floyd P. Eigenvector-like measures of centrality for asymmetric relations. Social Networks 23, 191–201 (2001).
- Brin S. & Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks 30, 107–117 (1998).
- Kleinberg J. M. Authoritative sources in a hyperlinked environment. J. ACM 46, 604–632 (1999).
- Canright G. S., & Engø-Monsen K. Spreading on networks: a topographic view. Complexus 3, 131–146 (2006).
- Lü L. Y., Zhang Y. C., Yeung C. H. & Zhou T. Leaders in social networks, the delicious Case. PLoS ONE 6, e21202 (2011). [PMC free article] [PubMed]
- Kitsak M.
*et al.*Identification of influential spreaders in complex networks. Nat. Phys. 6, 888–893 (2010). - Chen D. B., Lü L. Y., Shang M. S., Zhang Y. C. & Zhou T. Identifying influential nodes in complex networks. Physica A 391, 1777–1787 (2012).
- Macdonald B., Shakarian P., Howard N. & Moores G. Spreaders in the Network SIR mode: an empirical study. arXiv e-print 1208.4269v1 (2012).
- Borge-Holthoefer J., Rivero A. & Moreno Y. Locating privileged spreaders on an online social networks. Phys. Rev. E 85, 066123 (2012). [PubMed]
- Tanaka G., Morino K. & Aihara K. Dynamical robustness in complex networks: the crucial role of low-degree nodes. Sci. Rep. 2, 232 (2012). [PMC free article] [PubMed]
- Bolobás B. Graph Theory and Combinatorics: Proc. Cambridge Combinatorial Conf. in honor of P. Erdös (Academic Press, NY, 1984).
- Batagelj V. & Zaversnik M. An O(m) algorithm for cores decomposition of networks. arXiv e-print cs/0310049 (2003).
- Alvarez-Hamelin J. I., Asta L. D., Barrat A. & Vespignani A. k-core decomposition: a tool for the visualization of large scale networks. arXiv e-print cs/0504107v2 (2005).
- Carmi S., Havlin S., Kirkpatrick S., Shavitt Y. & Shir E. A model of Internet topology using k-shell decomposition. Proc. Natl. Acad. Sci. USA 104, 11150–11154 (2007). [PubMed]
- Dorogovtsev S. N., Goltsev A. V. & Mendes J. F. F. k-core organization of complex networks. Phys. Rev. Lett. 96, 040601 (2006). [PubMed]
- Alvarez-Hamelin J. I., Dall'Asta L., Barrat A. & Vespignani A. K-core decomposition of Internet graphs: hierarchies, self-similarity and measurement biases. Networks and Hetero Media 3, 371–393 (2008).
- Colomer-de-Simón P., Serrano M. ., Beiró, M. G., Alvarez-Hamelin, J. I. & Boguñá, M. Deciphering the global organization of clustering in real complex networks. Sci. Rep. 3, 2517 (2013). [PMC free article] [PubMed]
- Holme P. Core-periphery organization of complex networks. Phys. Rev. E 72, 046111 (2005). [PubMed]
- Zhang G. Q., Zhang G. Q., Yang Q. F., Cheng S. Q. & Zhou T. Evolution of the Internet and its cores. New J. Phys. 10, 123027 (2008).
- Hébert-Dufresne L., Allard A., Young J. & Dubé L. J. Random networks with arbitrary k-core structure. arXiv e-print 1308.6537v1 (2013). [PubMed]
- Castellano C. & Pastor-Satorras R. Competing activation mechanisms in epidemics on networks. Sci. Rep. 2, 371 (2012). [PMC free article] [PubMed]
- Pei S. & Makse H. A. Spreading dynamics in complex networks. J. Stat. Mech. 12, 12002 (2013).
- Fu Y.-H., Huang C.-Y. & Sun C.-T. Identifying super-spreader nodes in complex networks. Mathematical Problems in Engineering 2014, 675713 (2014).
- Garas A., Argyrakis P., Rozenblat C., Tomassini M. & Havlin S. Worldwide spreading of economic crisis. New J. Phys. 12, 113043 (2010).
- Borge-Holthoefer J. & Moreno Y. Absence of influential spreaders in rumor dynamics. Phys. Rev. E 85, 026116 (2012). [PubMed]
- Reppas A. I. & Lawyer G. Low k-shells identify bridge elements critical to disease flow in small-world networks. AIP Conf. Proc. 1479, 1426–1429 (2012).
- Zeng A. & Zhang C. J. Ranking spreaders by decomposing complex networks. Phys. Lett. A 377, 1031–1035 (2013).
- Ren Z. M., Zeng A., Chen D. B., Liao H. & Liu J. G. Iterative resource allocation for ranking spreaders in complex networks. Europhys. Lett. 106, 48005 (2014).
- Corominas-Murtra B., Fuchs B. & Thurner S. Detection of the elite structure in a virtual multiplex social systems by means of a generalized k-core. arXiv e-print 1309.6740 (2013). [PMC free article] [PubMed]
- Liu J. G., Ren Z. M. & Guo Q. Ranking the spreading influence in complex networks. Physica A 392, 4154–4159 (2013).
- Bae J. & Kim S. Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Physica A 395, 549–559 (2014).
- Garas A., Schweitzer F. & Havlin S. A k-shell decomposition method for weighted networks. New J. of Phys. 14, 083030 (2012).
- Eidsaa M. & Almaas E. S-core network decomposition: A generalization of k-core analysis to weighted networks. Phys. Rev. E 88, 062819 (2013). [PubMed]
- Miorandi D. & Pellegrini F. D. K-shell decomposition for dynamic complex networks. Ad Hoc and Wireless Networks(WiOpt), 2010 Proc. of the 8th Int. Symp., IEEE 499–507 (2010).
- Azimi-Tafreshi N., Gómex-Gardeñes J. & Dorogovtsev S. N. k-core percolation on multiplex networks. arXiv e-print 1405.1336v1 (2014).
- Pei S., Muchnik L., Andrade J. S., Zheng Z. M. & Makse H. A. Searching for superspreaders of information in real-world social media. Sci. Rep. 4, 5547 (2014). [PMC free article] [PubMed]
- Anderson R. M. & May R. M. Infectious diseases in humans (Oxford University Press, Oxford, 1991).
- Moreno Y., Pastor-Satorras R. & Vespignani A. Epidemic outbreaks in complex heterogeneous networks. Eur. Phys. J. B 26, 521–529 (2002).
- Maslov S. & Sneppen K. Specificity and stability in topology of protein networks. Science 296, 910–913 (2002). [PubMed]
- Rombach M. P., Porter M. A., Fowler J. H. & Mucha P. J. Core-Periphery structure in networks. SIAM J. Appl. Math. 74, 167–190 (2014).
- Castellano C. & Pastor-Satorras R. Thresholds for epidemic spreading in networks. Phys. Rev. Lett. 105, 218701 (2010). [PubMed]
- Spring N., Mahajan R., Wetherall D. & Anderson T. Measuring ISP topologies with Rocketfuel. IEEE/ACM Trans. Networking 12, 2–16 (2004).
- Newman M. E. J. Network data. (2006) Available at: http://www-personal.umich.edu/%7Emejn/netdata. (Accessed: 12/12/2012)
- Guimera R., Danon L., Diaz-Guilera A., Giralt F. & Arenas A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103(R) (2003). [PubMed]
- Leskovec J., Kleinberg J. & Faloutsos C. Graph Evolution: Densification and Shrinking Diameters. ACM Trans. on Knowledge Discovery from Data (ACM TKDD) 1, 1 (2007).
- Kunegis J. Hamsterster full network dataset - KONECT. (2014) Available at: http://konect.uni-koblenz.de/networks/petster-hamster. (Accessed: 01/03/2014)
- Boguñá M., Pastor-Satorras R., Diaz-Guilera A. & Arenas A. Models of social networks based on social distance attachment. Phys. Rev. E 70, 056122 (2004). [PubMed]
- Newman M. E. J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006). [PubMed]
- Newman M. E. J. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. USA 98, 404–409 (2001). [PubMed]

Articles from Scientific Reports are provided here courtesy of **Nature Publishing Group**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |