HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Science. Author manuscript; available in PMC 2007 December 10.
Published in final edited form as:
PMCID: PMC2128751

Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance


Agents in creative enterprises are embedded in networks that inspire, support, and evaluate their work. Here, we investigate how the mechanisms by which creative teams self-assemble determine the structure of these collaboration networks. We propose a model for the self-assembly of creative teams that has its basis in three parameters: team size, the fraction of newcomers in new productions, and the tendency of incumbents to repeat previous collaborations. The model suggests that the emergence of a large connected community of practitioners can be described as a phase transition. We find that team assembly mechanisms determine both the structure of the collaboration network and team performance for teams derived from both artistic and scientific fields.

Teams are assembled because of the need to incorporate individuals with different ideas, skills, and resources. Creativity is spurred when proven innovations in one domain are introduced into a new domain, solving old problems and inspiring fresh thinking (1-4). However, research shows that the right balance of diversity on a team is elusive. Although diversity may potentially spur creativity, it typically promotes conflict and miscommunication (5-7). It also runs counter to the security most individuals experience in working and sharing ideas with past collaborators (8). Successful teams evolve toward a size that is large enough to enable specialization and effective division of labor among teammates but small enough to avoid overwhelming costs of group coordination (9). Here, we investigate empirically and theoretically the mechanisms by which teams of creative agents are assembled. We also investigate how these microscopic team assembly mechanisms determine both the macroscopic structure of a creative field and the success of certain teams in using the resources and knowledge available in the field. We develop a model for the assembly of teams of creative agents in which the selection of the members of a team is controlled by three parameters: (i) the number, m, of team members; (ii) the probability, p, of selecting incumbents, that is, agents already belonging to the network; and (iii) the propensity, q, of incumbents to select past collaborators. The model predicts the existence of two phases that are determined by the values of m, p, and q. In one phase, there is a large cluster connecting a substantial fraction of the agents, whereas in the other phase the agents form a large number of isolated clusters.

We analyzed data from both artistic and scientific fields where collaboration needs have experienced pressures such as differentiation and specialization, internationalization, and commercialization (4, 10, 11): (i) the Broadway musical industry (BMI) and (ii) the scientific disciplines of social psychology, economics, ecology, and astronomy (Table 1). For the BMI, we considered all 2258 productions in the period from 1877 to 1990 (12, 13). Productions are defined as musical shows that were performed at least once on Broadway. The team members comprise individuals responsible for composing the music, writing the libretto and the lyrics, designing the choreography, directing, and producing the show, but not the actors who performed in it. For each of the scientific disciplines, we considered all collaborations that resulted in publications in recognized journals within the fields studied (14): seven social psychology journals, nine economics journals, 10 ecology journals, and six astronomy journals (Table 2). Collaboration networks (15-19) were then built for each of the journals independently and for the whole discipline by merging the data from the journals within a discipline (Materials and Methods).

Table 1
Global network properties of the fields studied. The sources for the BMI are (12) and (13). The data analyzed exclude revivals and focus on the steady-state period from 1940 to 1985. The data for scientific publications were obtained from the Web of Science. ...
Table 2
Journal-specific network structure. We present the same information as in Table 1 for each of the journals studied. We ranked journals within each field according to their impact factor (IF). For some low-impact journals, the fR is too high to be reproducible ...

The evolution of team sizes in the BMI bears out the expectation that team size and composition depend on the intricacy of the creative task. In the period from 1877 to 1929, when the form of the Broadway musical show was still being worked out through trial and error (12), there was a steady increase in the number of artists per production, from an average of two to an average of seven (Fig. 1A). This increase in size suggests that teams evolved to manage the complexity of the new artistic form. By the late 1920s, the Broadway musical reached the form we know today, as did team composition (4). Since then, the typical set of artists creating a Broadway musical has been choreographer, composer, director, librettist, lyricist, and producer. For the following 55 years, a period that includes the Great Depression, World War II, and the postwar boom, the average size of teams remained around seven (20).

Fig. 1
Time evolution of the typical number of team members in (A) the BMI and scientific collaborations in the disciplines of (B) social psychology, (C) economics, (D) ecology, and (E) astronomy.

We find similar scenarios for the evolution of team size in scientific collaborations. The four fields experienced an increase in team size with time (Fig. 1, B to E). The increase has been roughly linear in social psychology and economics and faster than linear in ecology and astronomy. For social psychology, the growth rate of team size was greater for high-impact than for low-impact journals, suggesting not only that team size depends on the intricacy of the enterprise but also that successful teams might adapt faster to external pressures.

The analysis of team size cannot capture the fact that teams are embedded in a larger network (3). This complex network (21-26), which is the result of past collaborations and the medium in which future collaborations will develop, acts as a storehouse for the pool of knowledge created within the field. The way the members of a team are embedded in the larger network affects the manner in which they access the knowledge in the field. Therefore, teams formed by individuals with large but disparate sets of collaborators are more likely to draw from a more diverse reservoir of knowledge. At the same time and for the same reasons, the way teams are organized into a larger network affects the likelihood of breakthroughs occurring in a given field.

The agents composing a team may be classified according to their experience. Some agents are newcomers, that is, rookies, with little experience and unseasoned skills. Other agents are incumbents. They are established persons with a track record, a reputation, and identifiable talents. The differentiation of agents into newcomers and incumbents results in four possible types of links within a team: (i) newcomer-newcomer, (ii) newcomer-incumbent, (iii) incumbent-incumbent, and (iv) repeat incumbent-incumbent. The distribution of different types of links reflects the team's underlying diversity. For example, if teams have a preponderance of repeat incumbent-incumbent links, it is less likely that they will have innovative ideas because their shared experiences tend to homogenize their pool of knowledge. In contrast, teams with a variety of types of links are likely to have more diverse perspectives to draw from and therefore to contribute more innovative solutions.
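The four link types can be made concrete with a short sketch. The Python below is illustrative only: the function name classify_links and its arguments are hypothetical, and it assumes an agent counts as an incumbent if it appeared in any earlier team, and that a link is a repeat if the same pair has shared a team before.

```python
from itertools import combinations

def classify_links(team, seen_agents, past_links):
    """Count the four link types within one new team.

    team        -- iterable of agent ids in the new team
    seen_agents -- set of agents that appeared in any earlier team (incumbents)
    past_links  -- set of frozenset({a, b}) pairs that shared an earlier team
    """
    counts = {
        "newcomer-newcomer": 0,
        "newcomer-incumbent": 0,
        "incumbent-incumbent": 0,
        "repeat incumbent-incumbent": 0,
    }
    for a, b in combinations(sorted(set(team)), 2):
        a_inc, b_inc = a in seen_agents, b in seen_agents
        if a_inc and b_inc:
            # Two incumbents: repeat link only if they collaborated before.
            if frozenset((a, b)) in past_links:
                counts["repeat incumbent-incumbent"] += 1
            else:
                counts["incumbent-incumbent"] += 1
        elif a_inc or b_inc:
            counts["newcomer-incumbent"] += 1
        else:
            counts["newcomer-newcomer"] += 1
    return counts
```

For a team of four in which three members are incumbents and only one pair has worked together before, this yields one repeat link, two new incumbent-incumbent links, and three newcomer-incumbent links.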

Because quantifying the emergence and the effects of team diversity (2, 9, 27-29) is more difficult than measuring team size, we consider next a model for the assembly of teams. In our model, we assemble N teams in temporal sequence. The assembly of each team is controlled by three parameters: m, p, and q. The first parameter, m, is the number of agents in a team. In our investigations of the model, we considered three situations: (i) keep m constant, (ii) draw m from a distribution, or (iii) use a sequence of m values obtained from the data. For the theoretical analysis of the model, we kept m constant, whereas comparison with an empirical data set was done with the use of the sequence of m(t) values in the corresponding data set.

The second parameter, p, is the probability of a team member being an incumbent. Higher values of p indicate fewer opportunities for newcomers to enter a field. The third parameter, q, represents the inclination for incumbents to collaborate with prior collaborators rather than initiate a new collaboration with an incumbent they have not worked with in the past.

We start at time zero with an endless pool of newcomers. Newcomers become incumbents the first time step after being selected for a team. Each time step t, we assemble a new team and add it to the network (Fig. 2). We select sequentially m(t) different agents. Each agent in a team has a probability, p, of being drawn from the pool of incumbents and a probability, 1 – p, of being drawn from the pool of newcomers. If the agent is drawn from the incumbents' pool and there is already another incumbent in the team, then (i) with probability q the new agent is randomly selected from among the set of collaborators of a randomly selected incumbent already in the team; (ii) otherwise, he or she is selected at random among all incumbents in the network.
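The assembly rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the names assemble_team and simulate are hypothetical, m is kept constant, and two edge cases that the description above leaves open are resolved by assumption (if the chosen incumbent has no collaborators outside the team, the agent is drawn from all incumbents; if no incumbent is available, a newcomer is used). The removal of inactive agents is not included.

```python
import random

def assemble_team(m, p, q, incumbents, collaborators, next_id, rng):
    """Assemble one team of m distinct agents; return (team, next_id)."""
    team = []
    for _ in range(m):
        candidates = []
        if rng.random() < p:  # try to draw from the incumbents' pool
            team_incumbents = [a for a in team if a in incumbents]
            if team_incumbents and rng.random() < q:
                # Rule (i): a past collaborator of a random incumbent in the team.
                anchor = rng.choice(team_incumbents)
                candidates = sorted(collaborators[anchor] - set(team))
            if not candidates:
                # Rule (ii), also used as a fallback (assumption): any incumbent.
                candidates = sorted(incumbents - set(team))
        if candidates:
            team.append(rng.choice(candidates))
        else:
            # Newcomer (also an assumed fallback when no incumbent is available).
            team.append(next_id)
            next_id += 1
    return team, next_id

def simulate(n_teams, m, p, q, seed=0):
    """Assemble n_teams teams in sequence; return (teams, collaborators)."""
    rng = random.Random(seed)
    incumbents, collaborators, teams, next_id = set(), {}, [], 0
    for _ in range(n_teams):
        team, next_id = assemble_team(m, p, q, incumbents, collaborators, next_id, rng)
        for a in team:
            collaborators.setdefault(a, set()).update(b for b in team if b != a)
        incumbents.update(team)  # newcomers become incumbents after this step
        teams.append(team)
    return teams, collaborators
```

A call such as simulate(1000, 3, 0.5, 0.5) returns the temporal sequence of teams together with the projected collaboration network as a dictionary of neighbor sets.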

Fig. 2
Modeling the emergence of collaboration networks in creative enterprises. (A) Creation of a team with m = 3 agents. Consider, at time zero, a collaboration network comprising five agents, all incumbents (blue circles). Along with the incumbents, there ...

Lastly, agents that remain inactive for longer than τ time steps are removed from the network. This rule is motivated by the observation that agents do not remain in the network forever: agents age and retire, change careers, and so on. The removal process enables the network to reach a steady state after a transient time. Our results do not depend on the specific value of τ (Materials and Methods).
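One possible way to implement the removal rule is sketched below. The function remove_inactive and the bookkeeping dictionary last_active are hypothetical names, and the sketch assumes that removing an agent also removes all of its links from the projected network.

```python
def remove_inactive(t, tau, last_active, collaborators):
    """Remove agents inactive for longer than tau steps; return their ids.

    t             -- current time step
    tau           -- maximum allowed inactivity, in time steps
    last_active   -- dict: agent id -> last time step the agent joined a team
    collaborators -- dict: agent id -> set of current collaborators
    """
    stale = [a for a, last in last_active.items() if t - last > tau]
    for a in stale:
        # Drop the agent and every link that pointed to it (assumption).
        for b in collaborators.pop(a, set()):
            collaborators.get(b, set()).discard(a)
        del last_active[a]
    return stale
```

Calling this function once per time step, after the new team has been added, keeps the number of active agents bounded and lets the network settle into a steady state.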

Through participation in a team, agents become part of a large network (30). This fact prompted us to examine the topology of the network of collaborations among the practitioners of a given field. More specifically, we asked, “Is there a large connected cluster comprising most of the agents or is the network composed of numerous smaller clusters?” A large connected cluster would be supporting evidence for the so-called invisible college, the web of social and professional contacts linking scientists across universities proposed by de Solla Price (31) and Merton (32). A large number of small clusters would be indicative of a field made up of isolated schools of thought. For all five fields considered here, we find that the network contains a large connected cluster.

As is typically done in the study of percolation phase transitions (33), we use the fraction S of agents that belong to the largest cluster of the network to quantify the transition between these two regimes: invisible college or isolated schools. We explore systematically the (p,q) parameter space of the model. We find that the system undergoes a percolation transition (33) at a critical line, pc(m,q). That is, the system experiences a sharp transition from a multitude of small clusters to a situation in which one large cluster, comprising a substantial fraction S of the individuals, emerges: the so-called giant component (Fig. 3). The transition line pc(m,q) therefore determines the tipping point for the emergence of the invisible college (34). Our analysis shows that the existence of this transition is independent of the average number of agents ⟨m⟩ in a collaboration, although the precise value of pc(m,q) does depend on m.
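The order parameter S can be computed directly from a list of teams. The sketch below uses a union-find structure, which is a standard technique chosen here for illustration rather than a method stated in the text. The function name largest_cluster_fraction is hypothetical, and every agent that appears in some team is assumed to be a node of the network.

```python
from collections import Counter

def largest_cluster_fraction(teams):
    """Return S, the fraction of agents in the largest connected cluster."""
    parent = {}

    def find(x):
        # Find the cluster representative of x, halving paths along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for team in teams:
        members = list(team)
        if not members:
            continue
        root = find(members[0])
        for other in members[1:]:
            # All members of a team belong to the same cluster.
            r = find(other)
            if r != root:
                parent[r] = root

    if not parent:
        return 0.0
    sizes = Counter(find(a) for a in parent)
    return max(sizes.values()) / len(parent)
```

Evaluating this quantity for networks generated over a grid of p and q values is one way to locate the region in which the largest cluster begins to span a substantial fraction of the agents.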

Fig. 3
Predictions of the model. (A) Phase transition in the structure of the collaboration network. We plot only the largest cluster in the network. For small p, the network is formed by numerous small clusters (p = 0.10). At the critical point pc, the tipping ...

The proximity to the transition line, which depends on the distribution of the different types of links, determines the structure of the largest cluster (Fig. 3A). In the vicinity of the transition, the largest cluster has an almost linear or branched structure (Fig. 3A) (p = 0.30). As one moves toward larger p, the largest cluster starts to have more and more loops (Fig. 3A) (p = 0.35), and, eventually, it becomes a densely connected network (Fig. 3A) (p = 0.60).

Networks with the same fraction, S, of nodes in the largest cluster do not necessarily correspond to networks with identical properties. Each point in the (p,q) parameter space is characterized by both S and the fraction, fR, of repeat incumbent-incumbent links. For example, in Fig. 3C, the line fR = 0.32 corresponds to those values of p and q for which 32% of all links in new teams are between repeat collaborators (35). The fR has a notable impact on the dynamics of the network. When fR is large, collaborations are firmly established, and therefore the structure of the network changes very slowly. In contrast, low values of fR correspond to enterprises with high turnover and very fast dynamics. Intermediate values of fR are related to situations in which collaboration patterns with peers are fluid (Materials and Methods).

For each of the five fields for which we have empirical data, we measure the relative size of the giant component S (Materials and Methods). For all fields considered, S is larger than 50% (Table 1). This result provides quantitative evidence for the existence of an invisible college in all the fields. Intriguingly, the relative size of the giant component is similar for three of the five fields considered: S = 0.70, S = 0.68, and S = 0.75 for BMI, social psychology, and ecology, respectively. However, for astronomy S was significantly larger (0.92), whereas for economics it was significantly smaller (0.54).

To gain further insight into the structure of collaboration networks, we used our model to estimate the values of p and q for each field. Given the temporal sequence of teams producing the network of collaborations, one can calculate the fraction of incumbents and the fraction of repeat incumbent-incumbent links. These fractions and the model enable us to then estimate the values of p and q that are consistent with the data (36).
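The two measured fractions can be obtained in a single pass over the temporal sequence of teams. The sketch below is a simplified illustration: the function name estimate_p_and_fR is hypothetical, agents are treated as incumbents from their second appearance onward, and neither the removal of inactive agents nor the restriction to a steady-state period is modeled. Estimating q itself still requires simulating the model, as described in note (36).

```python
from itertools import combinations

def estimate_p_and_fR(teams):
    """Return (p, fR) measured from a temporal sequence of teams.

    p  -- fraction of team members who had appeared in an earlier team
    fR -- fraction of within-team links joining past collaborators
    """
    seen, past_links = set(), set()
    slots = incumbent_slots = links = repeat_links = 0
    for team in teams:
        members = sorted(set(team))
        slots += len(members)
        incumbent_slots += sum(a in seen for a in members)
        pairs = [frozenset(pair) for pair in combinations(members, 2)]
        links += len(pairs)
        repeat_links += sum(pair in past_links for pair in pairs)
        # Update the history only after the whole team has been counted.
        past_links.update(pairs)
        seen.update(members)
    p = incumbent_slots / slots if slots else 0.0
    f_r = repeat_links / links if links else 0.0
    return p, f_r
```

For example, the sequence [a, b], [a, b, c], [c, d] contains seven team slots, three of them filled by incumbents, and five links, one of which is a repeat, giving p = 3/7 and fR = 0.2.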

We estimated p and q for each field and then simulated the model to predict the key properties of the network of collaborations, including the degree distribution of the network and the fraction S of nodes in the largest cluster. By comparing predictions of the model with the empirical results, we are able to test and validate the model. We first compare the degree distribution of the collaboration networks with the predictions of the model (Fig. 4, A to E) and find that the model predicts the empirical degree distributions remarkably well. In Table 1, we compare the predictions of the model for S with the measured values. The model correctly predicts that an invisible college containing more than 50% of the nodes exists in all cases. Additionally, the values of S predicted by the model are in close agreement with the empirical results.

Fig. 4
Network structure of different creative fields. Degree distributions for (A) the BMI, (B) the field of social psychology, (C) the field of economics, (D) the field of ecology, and (E) the field of astronomy. We carried out with the use of the sequence ...

To investigate how changes of the team assembly mechanism affect the structure of the network, we used the model to generate networks with the same sequence of team sizes as the data but with different values of p and q. We show in Fig. 4, F to J, that four out of the five creative networks we consider are very close to the tipping line at which an invisible college emerges. The exception is astronomy. We also find that, for astronomy, the fR is significantly larger than for the other fields.

If diversity affects team performance and our model correctly captures how diversity is related to the way teams are assembled, then the parameters p and q must be related to team performance. To investigate this issue, we considered for the four scientific fields how teams publishing in different journals are assembled. We used each journal's impact factor as a proxy for the typical quality of teams' output. We then studied the different journals separately to quantify the relationship between team assembly mechanisms and performance.

In Fig. 5, we show the values of p, q, and S for the journals in each of the fields as a function of the impact factor of the journal. We found that p was positively correlated with impact factor for economics, ecology, and social psychology, whereas q was negatively correlated with impact factor for the same fields. The result for p implies that successful teams have a higher fraction of incumbents, who contribute expertise and know-how to the team, whereas the result for q implies that teams that are less diverse typically have lower levels of performance.

Fig. 5
Relation between team assembly mechanisms, network structure, and performance. We calculate the values of p, q, and S for several journals in each of the four scientific fields considered. In a few cases, q should be larger than one in order to reproduce ...

The relative size S of the giant component in a journal was also associated with performance for ecology and social psychology. Teams publishing in journals with a high impact factor typically give rise to a large giant component, whereas teams publishing in low-impact journals typically form small isolated clusters. This suggests that teams publishing in high-impact journals perform a better sampling of the knowledge within a field and thus are able to more efficiently use the resources of the invisible college. Surprisingly, none of p, q, and S was significantly correlated with impact factor in astronomy. This distinguishes astronomy from the other creative enterprises considered.

We have shown that team size evolves with time, probably up to an optimal size as in the case of the BMI. A similar process may be occurring for the parameters quantifying expertise, p, and diversity, q. Four of the five fields considered, all except astronomy, have very similar values of p and q, thus suggesting that a “universal” set of optimal values might exist. The fact that in astronomy there are no correlations between p, q, or S and the impact of journals also indicates that this field is different from the others. Whether these differences are caused by the needs imposed by the creative enterprise itself or by historical or other reasons is a question that we cannot answer conclusively.


Supporting Online Material

Materials and Methods

Figs. S1 and S2

References and Notes

1. Granovetter MS. Am. J. Sociol. 1973;78:1360.
2. Reagans R, Zuckerman EW. Organ. Sci. 2001;12:502.
3. Burt R. Am. J. Sociol. 2004;110:349.
4. Uzzi B, Spiro J. Am. J. Sociol. in press.
5. Larson JR, Christensen C, Abbott AS, Franz TM. J. Pers. Soc. Psychol. 1996;71:315. [PubMed]
6. Edmondson A. Adm. Sci. Q. 1999;44:350.
7. Jehn KA, Northcraft GB, Neale MA. Adm. Sci. Q. 1999;44:741.
8. Stasser G, Stewart DD, Wittenbaum GM. J. Exp. Soc. Psychol. 1995;31:244.
9. Katzenbach JR, Smith DK. The Wisdom of Teams. Harper Business; New York: 1993.
10. Ziman JM. Prometheus Bound. Cambridge Univ. Press; Cambridge: 1994.
11. Brown JR. Science. 2000;290:1701. [PubMed]
12. Green S, Green K. Broadway Musicals Show by Show. ed. 5 Hal Leonard; Milwaukee, WI: 1996.
13. Simas R. The Musicals No One Came to See: A Guidebook to Four Decades of Musical-Comedy Casualties on Broadway, Off-Broadway and in Out-Of-Town Try-Out, 1943–1983. Garland; New York: 1988.
14. We imposed several requirements on the journals we selected for analysis. First, the main subject category of the journal must be the desired one. For example, we consider only those ecology journals whose subject category is either ecology or ecology and bio-diversity and conservation according to the Journal Citation Reports. We disregarded more specialized journals, such as Microbial Biology, whose subject category is more specific. We also required that journals contain a sufficiently large number of papers, typically larger than 1000.
15. Newman MEJ. Proc. Natl. Acad. Sci. U.S.A. 2001;98:404. [PubMed]
16. Barabási A-L, et al. Physica A. 2002;311:590.
17. Newman MEJ. Proc. Natl. Acad. Sci. U.S.A. 2004;101:5200. [PubMed]
18. Börner K, Maru JT, Goldstone RL. Proc. Natl. Acad. Sci. U.S.A. 2004;101:5266. [PubMed]
19. Ramasco JJ, Dorogovtsev SN, Pastor-Satorras R. Phys. Rev. E. 2004;70:036106.
20. This stationary state remains until the mid-1980s, when size drops again precisely at the time when a rash of revivals and revues conceivably simplified production.
21. Barabási A-L, Albert R. Science. 1999;286:509. [PubMed]
22. Watts DJ, Strogatz SH. Nature. 1998;393:440. [PubMed]
23. Amaral LAN, Scala A, Barthélémy M, Stanley HE. Proc. Natl. Acad. Sci. U.S.A. 2000;97:11149. [PubMed]
24. Albert R, Barabási A-L. Rev. Mod. Phys. 2002;74:47.
25. Newman MEJ. SIAM Rev. 2003;45:167.
26. Amaral LAN, Ottino J. Eur. Phys. J. B. 2004;38:147.
27. Etzkowitz H, Kemelgor C, Neuschatz M, Uzzi B, Alonzo J. Science. 1994;266:51. [PubMed]
28. Harrison DA, Price KH, Bell MP. Acad. Manage. J. 1998;41:96.
29. Barsade SG, Ward AJ, Turner JDF, Sonnenfeld JA. Adm. Sci. Q. 2001;46:174.
30. The teams and the agents are the nodes in a bipartite network. Technically, agents are connected only to teams and vice versa. However, this bipartite network can be projected onto a network comprising only agents and in which there is an edge (connection) between two nodes (agents) if the agents have been connected to at least one common team.
31. de Solla Price DJ. Little Science, Big Science… and Beyond. Columbia Univ. Press; New York: 1963.
32. Merton RK. The Sociology of Science. Univ. of Chicago Press; Chicago: 1973.
33. Stauffer D, Aharony A. Introduction to Percolation Theory. ed. 2 Taylor and Francis; London: 1992.
34. Gladwell M. The Tipping Point: How Little Things Can Make a Big Difference. Little, Brown; Boston: 2000.
35. Figure 3C shows that large fR occurs when p and q are large and corresponds to a network in which collaborations among incumbents are firmly established and opportunities for newcomers are few. Conversely, small fR, which occurs when p and/or q are small, indicates plentiful opportunities for newcomers to join new projects. In this case, newcomers are the norm and collaborations are rarely repeated. Lastly, intermediate values of fR suggest intermediate values of both p and q, that is, a situation for which there is a balance between seasoned incumbents and newcomers with fresh ideas.
36. The value of p is directly given by the fraction of incumbents in new creations. The value of q must be obtained numerically by simulating the model with different tentative values of q until the fraction fR of repeat incumbent-incumbent links predicted by the model coincides with the value measured from the data.
37. We thank K. Börner, V. Hatzimanikatis, A. A. Moreira, J. M. Ottino, M. Sales-Pardo, and D. B. Stouffer for numerous suggestions and discussions. R.G. thanks the Fulbright Program and the Spanish Ministry of Education, Culture, and Sports. L.A.N.A. gratefully acknowledges the support of a Searle Leadership Fund Award and of a NIH/National Institute of General Medical Sciences K-25 award.