The collective secretion of extracellular compounds lends cell groups the ability to consume complex growth substrates and to cause disease. Extracellular digestive enzymes and nutrient-sequestering molecules are common within bacterial biofilms, but they present a difficulty for evolutionary theory. Because such enzymes are secreted into the extracellular space, non-secreting cells that do not pay the cost of contributing to the public good may reap their benefits. And because they pay no cost of production, such cells can outcompete their enzyme-secreting counterparts [31, 66, 68, 70–72].

A dominant factor allowing cooperation to evolve in many systems is the preferential interaction among cooperative individuals relative to their competitive neighbourhoods [30]. We might therefore expect the interaction between secreted enzyme transport and genetic lineage distribution to be critical for the evolution of cooperation within bacterial biofilms. In this section, we develop and analyse a general model for digestive enzyme secretion in biofilms and test it using a well-established individual-based simulation framework for biofilm growth. The analytical results derived below are similar to those of Driscoll & Pepper [73], who also study the evolution of diffusible public good secretion. Our approach differs in that we include more physiological detail by using parameters that can be measured in the laboratory; we implement cells as spheres in three-dimensional space; and we provide an explicit description of spatial clustering for multi-cell scenarios of competition between enzyme producers and non-producers. These modelling choices allow us to couple the analysis with computational simulations and to emphasize dimensional reduction. We aim to provide sufficient description to serve as a guide for other researchers who may wish to use similar approaches for their systems of interest.

We start by deriving the concentration profile of the digestive enzyme (*E*) around a single secreting cell that is stationary within a large body of still liquid. By Fick's Law, *E* obeys the diffusion equation in spherical coordinates,

$$\frac{\partial E}{\partial t} = \frac{D_E}{r^2}\,\frac{\partial}{\partial r}\left(r^2\,\frac{\partial E}{\partial r}\right), \tag{6.1}$$

where *D*_{E} is the diffusivity of the secreted enzyme, *t* represents time, and *r* the radial distance from the centre of the secreting cell. Diffusion of small molecules is typically fast enough that the concentration profile of *E* reaches steady state quickly relative to cell growth and division. The steady-state profile of *E* is obtained by integrating equation (6.1) for ∂*E*/∂*t* = 0:

$$\frac{D_E}{r^2}\,\frac{\mathrm{d}}{\mathrm{d}r}\left(r^2\,\frac{\mathrm{d}E}{\mathrm{d}r}\right) = 0. \tag{6.2}$$

This is a second-order differential equation and can be solved using two boundary conditions. The first boundary condition we use is that the enzyme concentration should vanish far away from the producing cell:

$$\lim_{r \to \infty} E(r) = 0. \tag{6.3}$$

The second boundary condition implements conservation of mass and states that the rate of enzyme passing through the surface of the cell matches the rate of enzyme production by the cell (*q*_{E}):

$$-\oint_{S} D_E\,\frac{\mathrm{d}E}{\mathrm{d}r}\,\mathrm{d}S = q_E, \tag{6.4}$$

where the left-hand side of the equation represents the integral of the diffusive flux out of the cell over its entire surface (*S*). Solving equation (6.2) with boundary conditions (6.3) and (6.4) yields the following solution:

$$E(r) = \frac{q_E}{4\pi D_E\, r}, \tag{6.5}$$

which states that the concentration of a secreted digestive enzyme increases in direct proportion to the rate of enzyme production and falls off as the inverse of the distance from the producing cell.
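As a numerical sanity check on the steady-state profile, the following minimal Python sketch evaluates *E*(*r*) = *q*_{E}/(4π*D*_{E}*r*), the form implied by the two boundary conditions described above, and confirms that the total diffusive flux through a spherical shell equals *q*_{E} at any radius. The function names and the parameter values are illustrative assumptions, not taken from the original study.

```python
import math


def enzyme_concentration(r, q_e, d_e):
    """Steady-state enzyme concentration at distance r from the cell centre.

    Assumes E(r) = q_E / (4 * pi * D_E * r), i.e. the profile that vanishes
    far from the cell and conserves mass at the cell surface.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return q_e / (4.0 * math.pi * d_e * r)


def total_flux(r, q_e, d_e, dr=1e-6):
    """Total diffusive flux through a sphere of radius r.

    The radial gradient is estimated with a central finite difference.
    """
    de_dr = (enzyme_concentration(r + dr, q_e, d_e)
             - enzyme_concentration(r - dr, q_e, d_e)) / (2.0 * dr)
    return -d_e * de_dr * 4.0 * math.pi * r ** 2


# Illustrative values only (hypothetical, in consistent arbitrary units).
Q_E = 2.0      # enzyme production rate per cell
D_E = 58.0     # enzyme diffusivity
R_CELL = 0.5   # cell radius

for radius in (R_CELL, 2.0, 10.0):
    print(radius,
          enzyme_concentration(radius, Q_E, D_E),
          total_flux(radius, Q_E, D_E))
```

Under these assumptions the printed flux should stay at *q*_{E} (up to finite-difference error) for every radius, while the concentration halves each time the distance doubles.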

(a) Conditions favouring public good secretion: two-cell scenario

We will now use the profile of the extracellular enzyme concentration, *E*, around a single cell to study competition between producing and non-producing cells (which we will term cheaters by convention) [74]. The rates of increase in mass per volume of a producing cell *P* and a cheating cell *C* are defined by

$$\frac{\mathrm{d}P}{\mathrm{d}t} = (\mu_P - c)\,P \tag{6.6}$$

and

$$\frac{\mathrm{d}C}{\mathrm{d}t} = \mu_C\,C, \tag{6.7}$$

where *μ* is the growth rate per unit mass, called the specific growth rate, and the subscripts denote the cell type. Equation (6.6) implements a metabolic cost of enzyme production (*c*), which is subtracted from the growth rate of producers and assumed to be an arbitrary function of the enzyme production rate (an explicit cost function will be defined for our simulations below). We assume that the specific growth rates are linear functions of the local concentration of the digestive enzyme:

$$\mu = \mu_0\,(1 + bE). \tag{6.8}$$

Here, *μ*_{0} is the basal specific growth rate and *b* a coefficient of growth increase per mass of enzyme, which implements the benefit of the secreted public good. In reality, cells benefit from nutrients released into the environment by enzymes as they break down complex substrates into smaller, importable nutrients; however, we only model diffusion of the secreted enzyme. We make this simplification for the sake of clarity and tractability, but note that this approach approximates the full description of any system in which the nutrients liberated by the enzyme diffuse much faster than the enzyme itself [75]. For example, extracellular chitinases of *Vibrio* spp. [76] have an approximate molecular weight of 90 300, which can be converted to a molecular diffusion constant of *D*_{chitinase} = 58 μm^{2} s^{−1} [77]. The product of chitinase activity, *N*-acetylglucosamine (GlcNAc), has a diffusion constant that is roughly an order of magnitude larger: *D*_{GlcNAc} = 500 μm^{2} s^{−1} [78]. We expect such a difference in the diffusion constants of extracellular enzymes and their digested products to hold in many systems, because digestive enzymes are typically much larger than the nutrient molecules they release into the environment.
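To give a feel for the separation of scales between the enzyme and its product, the short Python sketch below compares the two diffusion constants quoted in the text and the corresponding order-of-magnitude diffusion times, using the standard scaling *t* ~ *L*^{2}/*D*. The 10 μm length and the function name are hypothetical choices made for illustration, and the scaling ignores dimension-dependent prefactors.

```python
def diffusion_time(length_um, diffusivity_um2_per_s):
    """Order-of-magnitude time (s) to diffuse a distance L: t ~ L**2 / D."""
    return length_um ** 2 / diffusivity_um2_per_s


D_CHITINASE = 58.0   # um^2 s^-1, value quoted in the text
D_GLCNAC = 500.0     # um^2 s^-1, value quoted in the text
LENGTH = 10.0        # um, hypothetical distance chosen for illustration

ratio = D_GLCNAC / D_CHITINASE
t_enzyme = diffusion_time(LENGTH, D_CHITINASE)
t_product = diffusion_time(LENGTH, D_GLCNAC)

print(f"diffusivity ratio (product / enzyme): {ratio:.1f}")
print(f"enzyme: {t_enzyme:.2f} s, product: {t_product:.2f} s")
```

With the quoted values the product diffuses a little under nine times faster than the enzyme, which is the sense in which the enzyme profile is treated as the slower, rate-limiting field.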

We can use our model of growth rates together with the enzyme concentration profile from equation (6.5) to determine the conditions for which the producing cell outgrows a cheater cell in its vicinity. The producer has the advantage when its fitness (*w*_{P}) is higher than that of the cheater (*w*_{C}). Fitness is simply the net specific growth rate of each cell:

$$w_P = \mu_P - c \tag{6.9}$$

and

$$w_C = \mu_C. \tag{6.10}$$

The producer therefore has the advantage when *w*_{P} > *w*_{C}. Using equations (6.8)–(6.10), this condition for the fitness advantage of a producer can be expressed as

$$b\,(E_P - E_C) > \frac{c}{\mu_0}, \tag{6.11}$$

where *E*_{P} and *E*_{C} are the concentrations of secreted enzyme experienced by the producer cell and the cheater cell, respectively. From equation (6.5), the values of *E*_{P} and *E*_{C} are

$$E_P = \frac{q_E}{4\pi D_E\, r_{\mathrm{cell}}} \tag{6.12}$$

and

$$E_C = \frac{q_E}{4\pi D_E\, d}, \tag{6.13}$$

where *r*_{cell} is the radius of a producer cell and *d* the distance between the producer and the cheater cells.

The quantity *q*_{E} was defined above as the rate of enzyme production per producer cell. We now replace *q*_{E} by a term for the enzyme production rate per biomass of producer (*k*_{E}), which is more convenient for the analysis that follows. The conversion is *q*_{E} = *M*_{P}*k*_{E}, where *M*_{P} is the mass of a single producer cell. The mass of the cell is the product of its average density (*ρ*) and its volume, and we can therefore replace *M*_{P} by

$$M_P = \tfrac{4}{3}\pi r_{\mathrm{cell}}^{3}\,\rho.$$

After substituting equations (6.12) and (6.13) into equation (6.11) and factoring 1/*r*_{cell} out of the difference, we can rewrite the condition for producer advantage as

$$\frac{b\,k_E\,\rho\,r_{\mathrm{cell}}^{2}}{3 D_E}\left(1 - \frac{r_{\mathrm{cell}}}{d}\right) > \frac{c}{\mu_0}. \tag{6.14}$$

The first factor on the left-hand side of equation (6.14) is a dimensionless number, which we will call *B*_{L} (benefit localization):

$$B_L = \frac{b\,k_E\,\rho\,r_{\mathrm{cell}}^{2}}{3 D_E}. \tag{6.15}$$

*B*_{L} compares the fitness increase afforded by accumulation of secreted enzyme (numerator) to the diffusion of enzyme away from the producing cell (denominator). The expression that results from this substitution is

$$B_L\left(1 - \frac{r_{\mathrm{cell}}}{d}\right) > \frac{c}{\mu_0}. \tag{6.16}$$

The ratio *c*/*μ*_{0} quantifies the cost of enzyme production, scaled to the basal cell growth rate. The expression (1−*r*_{cell}/*d*) is equal to zero when the producer and the cheater cells are directly adjacent, and approaches unity as the cheater cell is moved far away from the producer cell. Finally, the dimensionless number *B*_{L} captures to what extent the fitness benefit of secreted enzyme is localized around the producer cell. Small values of *B*_{L} correspond to rapid diffusion of enzyme away from the producer relative to its rate of production and thus a more homogeneous distribution of enzyme-mediated benefit in the environment. Large values of *B*_{L} correspond to steeper gradients of decreasing enzyme concentration around the producing cell and a resulting fitness benefit that is more tightly localized around the producer.
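The quantities discussed above can be evaluated directly. The following Python sketch assumes *B*_{L} = *b* *k*_{E} *ρ* *r*_{cell}^{2}/(3*D*_{E}), which is the form obtained by inserting *q*_{E} = (4/3)π*r*_{cell}^{3}*ρ* *k*_{E} into the steady-state enzyme profile, and tests the two-cell condition *B*_{L}(1 − *r*_{cell}/*d*) > *c*/*μ*_{0}. The function names and all numerical values are hypothetical and chosen only to illustrate the behaviour.

```python
def benefit_localization(b, k_e, rho, r_cell, d_e):
    """Dimensionless B_L = b * k_E * rho * r_cell**2 / (3 * D_E) (assumed form)."""
    return b * k_e * rho * r_cell ** 2 / (3.0 * d_e)


def producer_favoured(b_l, r_cell, d, cost_ratio):
    """Two-cell condition: B_L * (1 - r_cell / d) > c / mu_0."""
    if d < r_cell:
        raise ValueError("d must be at least r_cell")
    return b_l * (1.0 - r_cell / d) > cost_ratio


# Hypothetical parameter values in consistent (arbitrary) units.
B_L = benefit_localization(b=2.0, k_e=1.5, rho=100.0, r_cell=0.5, d_e=58.0)
COST_RATIO = 0.2  # c / mu_0

for d in (0.6, 1.0, 5.0):
    print(d, producer_favoured(B_L, 0.5, d, COST_RATIO))
```

In this example the producer loses to a cheater sitting very close to it but is favoured once the cheater is farther away, because the distance term approaches unity.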

For a system containing one enzyme producer and one cheater, equation (6.16) describes whether the benefit of the secreted enzyme is sufficiently privatized by the producer for the secretion phenotype to be favoured [73, 79]. The left-hand side describes the extent to which the enzyme-producing cell preferentially benefits itself due to localization of the secreted enzyme (*B*_{L}), and its distance from the cheater cell, captured by (1−*r*_{cell}/*d*). When the product of these factors outweighs the cost of enzyme production, the secretion phenotype is favoured.
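The two-cell condition can also be rearranged to give the minimum producer–cheater separation at which secretion is favoured. The short derivation below assumes the condition has the form described above, *B*_{L}(1 − *r*_{cell}/*d*) > *c*/*μ*_{0}; the symbol *d*^{*} is introduced here purely for illustration.

```latex
\begin{align*}
B_L\left(1 - \frac{r_{\mathrm{cell}}}{d}\right) &> \frac{c}{\mu_0} \\
\frac{r_{\mathrm{cell}}}{d} &< 1 - \frac{c}{\mu_0 B_L} \\
d &> d^{*} = \frac{r_{\mathrm{cell}}}{1 - c/(\mu_0 B_L)}
  \qquad \text{(valid only if } B_L > c/\mu_0\text{)}
\end{align*}
```

If *B*_{L} ≤ *c*/*μ*_{0}, no separation satisfies the condition, because the distance term is always smaller than one. As a worked case, *B*_{L} = 2*c*/*μ*_{0} gives *d*^{*} = 2*r*_{cell}.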

For simplicity, we began with the two-cell scenario described above, which introduces the central importance of how enzyme distribution and cell–cell distance interact to control whether enzyme production is favoured. In the following section, we address the more general problem of social evolution within groups containing many cells.

(b) Extension to a system of many cells

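We will now extend the two-cell scenario to one with an arbitrary number of producer (*n*_{P}) and cheater (*n*_{C}) cells. The fitness values of the producer and the cheater cell types are defined by averaging the growth of the cells in the two subpopulations:

$$w_P = \frac{1}{n_P}\sum_{\alpha=1}^{n_P} \mu_0\,(1 + bE_\alpha) - c \tag{6.17}$$

and

$$w_C = \frac{1}{n_C}\sum_{\beta=1}^{n_C} \mu_0\,(1 + bE_\beta), \tag{6.18}$$

where *E*_{α} and *E*_{β} denote the enzyme concentrations experienced by producer cell *α* and cheater cell *β*, respectively. We are again interested in the condition *w*_{P} > *w*_{C}, which for the multi-cell scenario is

$$b\left(\frac{1}{n_P}\sum_{\alpha=1}^{n_P} E_\alpha - \frac{1}{n_C}\sum_{\beta=1}^{n_C} E_\beta\right) > \frac{c}{\mu_0}. \tag{6.19}$$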

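We will assume that all producer cells are the same size (this will be relaxed in our simulations below). The important distinction from the two-cell scenario is that the concentration of *E* now experienced by a focal cell, *α*, is the sum of the contributions from all producers in the system:

$$E_\alpha = \sum_{\gamma=1}^{n_P} \frac{q_E}{4\pi D_E\, d_{\alpha\gamma}}. \tag{6.20}$$

Here, *d*_{αγ} is the distance between the focal cell *α* and producer cell *γ*. If the focal cell is itself a producer, then we assume *d*_{αγ} = *r*_{cell} for *γ* = *α*. We can now determine the form of the sums in inequality (6.19). Summing over producer cells *α* gives

$$\sum_{\alpha=1}^{n_P} E_\alpha = \frac{q_E}{4\pi D_E}\sum_{\alpha=1}^{n_P}\sum_{\gamma=1}^{n_P} \frac{1}{d_{\alpha\gamma}}, \tag{6.21}$$

and summing over cheater cells *β* gives

$$\sum_{\beta=1}^{n_C} E_\beta = \frac{q_E}{4\pi D_E}\sum_{\beta=1}^{n_C}\sum_{\gamma=1}^{n_P} \frac{1}{d_{\beta\gamma}}. \tag{6.22}$$

If we substitute these expressions into inequality (6.19), use *q*_{E} = *M*_{P}*k*_{E} as before, and re-scale the distance between two cells by the cell radius (such that $\hat{d} = d/r_{\mathrm{cell}}$), the multi-cell version of inequality (6.16) becomes

$$B_L\left(\frac{1}{n_P}\sum_{\alpha=1}^{n_P}\sum_{\gamma=1}^{n_P} \frac{1}{\hat{d}_{\alpha\gamma}} - \frac{1}{n_C}\sum_{\beta=1}^{n_C}\sum_{\gamma=1}^{n_P} \frac{1}{\hat{d}_{\beta\gamma}}\right) > \frac{c}{\mu_0}. \tag{6.23}$$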

Equation (6.23) is closely analogous to Hamilton's rule, BR > C, the canonical condition of inclusive fitness theory under which cooperation is selectively favoured [80]. Here, B is the fitness benefit of cooperative behaviour, C the cost of cooperative behaviour and R relatedness, the regression coefficient of recipient genotype on donor genotype across all cooperative interactions. Relatedness is often interpreted to signify common descent, but more generally the relatedness coefficient is a statistical description of the extent to which cooperative actor genotype predicts recipient genotype [33, 81–86]. For social traits that influence neighbours in a distance-dependent manner, including the secretion of diffusible public goods, relatedness corresponds tightly to the spatial clustering of cooperative individuals with each other, relative to the clustering of cooperative individuals with cheaters [33, 73, 87]. In such scenarios, spatial segregation of different genotypes yields high relatedness coefficients, whereas even mixture of different genotypes yields relatedness coefficients near zero (assuming no discrimination mechanisms that allow cooperative individuals to preferentially target one another to receive benefits).

Equation (6.23) thus contains the same fundamental components as Hamilton's rule, expressed in terms of the parameters of this particular system. The left-hand term in parentheses,

$$\frac{1}{n_P}\sum_{\alpha=1}^{n_P}\sum_{\gamma=1}^{n_P} \frac{1}{\hat{d}_{\alpha\gamma}} - \frac{1}{n_C}\sum_{\beta=1}^{n_C}\sum_{\gamma=1}^{n_P} \frac{1}{\hat{d}_{\beta\gamma}},$$

represents the degree of clustering among producer cells (left-side compound summation), minus the degree of clustering between producer and cheater cells (right-side compound summation), with distances $\hat{d}$ measured in units of the cell radius. Together with *B*_{L}, this clustering differential captures the extent to which producer cells preferentially benefit their own kind via the secretion of the digestive enzyme. That is, the combined effects of the clustering differential and *B*_{L} determine the relatedness coefficient and total cooperative benefit associated with extracellular enzyme secretion for a given population structure and set of parameter values describing growth, enzyme production and enzyme transport. When the collective benefit provided by enzyme secretion is sufficiently biased towards enzyme-producing cells, such that the cost of enzyme production is offset, cooperation is selectively favoured. Importantly, equation (6.23) describes the instantaneous dynamics of a cell group; as a population grows and its structure changes, the balance of equation (6.23) may change as well.
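The clustering differential can be computed from cell coordinates alone. The Python sketch below is one possible implementation under stated assumptions: cells are equal-sized spheres given as 3D centre coordinates, a producer's contribution to itself uses *d* = *r*_{cell} as in the text, and the multi-cell condition is taken to be *B*_{L} × (clustering differential) > *c*/*μ*_{0}. The function names, layouts and numerical values are hypothetical.

```python
import math


def scaled_inverse_distance_sum(focal, producers, r_cell):
    """Sum over producers of r_cell / d for one focal cell.

    A producer's own term uses d = r_cell (contributing 1). Distinct,
    non-overlapping cells always have d >= 2 * r_cell, so the clamp
    below only affects the self term.
    """
    return sum(r_cell / max(math.dist(focal, p), r_cell) for p in producers)


def clustering_differential(producers, cheaters, r_cell):
    """Mean producer-producer clustering minus mean cheater-producer clustering."""
    among = sum(scaled_inverse_distance_sum(a, producers, r_cell)
                for a in producers) / len(producers)
    between = sum(scaled_inverse_distance_sum(c, producers, r_cell)
                  for c in cheaters) / len(cheaters)
    return among - between


def producers_favoured(b_l, producers, cheaters, r_cell, cost_ratio):
    """Multi-cell condition: B_L * clustering differential > c / mu_0."""
    return b_l * clustering_differential(producers, cheaters, r_cell) > cost_ratio


R_CELL = 0.5  # hypothetical cell radius; touching cells sit 1.0 apart


def line(xs):
    """Place cells along the x axis."""
    return [(float(x), 0.0, 0.0) for x in xs]


# Two hypothetical layouts of four producers and four cheaters.
segregated = clustering_differential(line([0, 1, 2, 3]), line([4, 5, 6, 7]), R_CELL)
mixed = clustering_differential(line([0, 2, 4, 6]), line([1, 3, 5, 7]), R_CELL)

print("segregated:", segregated)
print("mixed:", mixed)
```

In this toy comparison the segregated layout yields a markedly larger clustering differential than the alternating layout, mirroring the statement that spatial segregation raises relatedness. With a single producer and a single cheater the function reduces to 1 − *r*_{cell}/*d*, the two-cell distance term.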

(c) Simulations with an agent-based model

The derivations above illustrate the basic links between the abstract evolutionary theory of cooperation and the core parameters of cell group growth and public good production. They also imply that the outcome of competition between producers and cheaters may be predicted if the population spatial structure, the *B*_{L} number and the cost *c*/*μ*_{0} to producers are known. The *B*_{L} number provides the additional important insight that it is not the values of the parameters *b*, *k*_{E}, *r*_{cell}, *D*_{E} and *ρ* in isolation but rather their compounded value according to equation (6.15) that is critical for the evolution of diffusible public good production in spatially structured environments.

We now test the predictive role of *B*_{L} with simulations of biofilms in two-dimensional space. The computational framework used to run these simulations has previously been described in detail and tested experimentally [34, 46, 88]. Our model relaxes some of the assumptions of the analytical derivations above by adding realistic detail to the cells and their interactions with each other. The transport of solutes is still assumed to occur by diffusion, but diffusion now occurs only within a boundary layer that extends a distance *h* above the biofilm. Cells are allowed to vary in size across the population as they grow and divide. We carried out simulations in which the producer and the cheater cells were inoculated at an initial 1 : 1 ratio (*n*_{P} = *n*_{C}), allowed the simulations to run until the biofilm reached a pre-defined maximum thickness, and then quantified the outcome of competition by computing the ratio of producer fitness to cheater fitness (*w*_{P}/*w*_{C}).

We first assumed that cell growth rate is exclusively a function of public good concentration at the cell's location, and we conducted an array of simulations in which *B*_{L} was altered by independently varying the values of the parameters that compose it. As expected, our simulations showed that there is a threshold value of *B*_{L} above which enzyme-secreting cells are selectively favoured. Furthermore, the effect of varying *B*_{L} was identical regardless of which of its constituent parameters (*b*, *k*_{E}, *D*_{E} or *r*_{cell}) was altered.
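The observation that only the compound value of *B*_{L} matters can be illustrated with a small calculation. The Python sketch below assumes *B*_{L} = *b* *k*_{E} *ρ* *r*_{cell}^{2}/(3*D*_{E}) and shows four different single-parameter changes that all double *B*_{L}. The baseline values are hypothetical, and this is only an arithmetic illustration, not a reproduction of the biofilm simulations.

```python
import math


def benefit_localization(b, k_e, rho, r_cell, d_e):
    """Dimensionless B_L = b * k_E * rho * r_cell**2 / (3 * D_E) (assumed form)."""
    return b * k_e * rho * r_cell ** 2 / (3.0 * d_e)


# Hypothetical baseline parameter set in consistent (arbitrary) units.
BASE = {"b": 2.0, "k_e": 1.5, "rho": 100.0, "r_cell": 0.5, "d_e": 58.0}

# Four single-parameter changes that each double B_L.
VARIANTS = {
    "double b": {**BASE, "b": 2.0 * BASE["b"]},
    "double k_E": {**BASE, "k_e": 2.0 * BASE["k_e"]},
    "halve D_E": {**BASE, "d_e": BASE["d_e"] / 2.0},
    "scale r_cell by sqrt(2)": {**BASE, "r_cell": math.sqrt(2.0) * BASE["r_cell"]},
}

base_value = benefit_localization(**BASE)
ratios = {label: benefit_localization(**params) / base_value
          for label, params in VARIANTS.items()}

for label, ratio in ratios.items():
    print(f"{label}: B_L changes by a factor of {ratio:.3f}")
```

Because *r*_{cell} enters squared, a change of only √2 in cell radius has the same effect on *B*_{L} as doubling *b* or *k*_{E}, or halving *D*_{E}.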

We next extended the preliminary analysis by allowing growth rate to vary as a function both of local secreted enzyme concentration (*E*, as above) and of available nutrient (*N*). The scenario we implement is one in which bacteria can achieve a basal growth rate by consuming a readily accessible carbon source, the basic nutrient *N*, which diffuses into the biofilm from the bulk liquid. Bacterial growth may be augmented by the activity of the secreted enzyme, which liberates a different growth substrate [89]. The two nutrient sources are implemented separately to allow us to independently vary the effects of two distinct phenomena on the evolution of cooperation. The first is the extent of lineage segregation within growing biofilms, which is determined by the thickness of the actively growing layer along the advancing front. The active layer thickness is governed by a parameter group (*δ*, see §3) that includes bulk nutrient concentration. The second is the concentration profile of secreted enzyme, determined by *B*_{L}. As noted above, the benefit of a secreted enzyme becomes more privatized by cells producing it as the *B*_{L} number increases. A stoichiometric table detailing the exact growth dynamics of producer and cheater cells is provided in the electronic supplementary material, table S1.

For *δ* > 10^{3}, which results in well-mixed biofilms, cooperative cells have higher fitness when *B*_{L} exceeds a critical threshold of approximately 10^{−2} (*a*). The threshold *B*_{L} above which cooperators are favoured corresponds to the length scale on which cell lineages cluster due solely to their immobility and limited dispersal (population viscosity). Note that diffusible enzyme production can still be favoured in relatively mixed environments, so long as the spatial range at which it provides a fitness benefit matches the spatial range along which cells tend to be of the same genotype.

For *δ* < 10^{3}, the threshold *B*_{L} at which producers are selectively favoured increases sharply. This result was somewhat counterintuitive, as decreasing *δ* leads to increasing spatial segregation among cell lineages, which in principle could allow for cooperative cells to be favoured even if the benefit of their secreted enzyme is distributed farther away from them (decreasing *B*_{L}). However, we see that the conditions favouring cooperation become more stringent as *δ* decreases because nutrient limitation creates a strong advantage for cell lineages that accumulate even marginally greater biovolume at the earliest time points during biofilm growth [42, 62, 90]. Such cells are able to deny their neighbours access to nutrients and in so doing dramatically reduce their ability to grow. Under these conditions, cheater cells outgrow cooperative cells if the secreted enzyme is not strongly localized (*b*). However, if the secreted enzyme's effect is sufficiently localized around producing cells (high *B*_{L}), then producers outcompete cheater cells early during biofilm growth and remain dominant over the course of the competition (*c*). Indeed for low *δ* and high *B*_{L}, cooperative cells outcompete cheater cells 10 times more strongly than they do with the same *B*_{L} value at high *δ*. Low *δ* leads to more globalized competition for nutrients and increased segregation among cell lineages (i.e. increased relatedness) [42, 74], both of which increase the advantage of spatially localized cooperative behaviour [30].

Our simulation model makes a number of simplifying assumptions: inactive cells do not decay, biofilms are not subjected to shear stress, cells cannot disperse, biofilms are always initiated with a confluent monolayer of cells, and there is no plasticity in expression of the digestive enzyme. Relaxing some of these assumptions will certainly provide further insight into the evolution of enzyme production in biofilms. We expect that decreasing initial cell density will favour enzyme producers by allowing them to preferentially benefit themselves and their clonemates prior to experiencing competition with non-producers. Conditional enzyme secretion, for example in response to quorum sensing signals [62, 91] or to nutrient conditions [92, 93], may allow cooperative cells to avoid exploitation by non-producing cells early during biofilm formation, when growth deficits lead to a severe competitive disadvantage. The durability of secreted public good compounds [94], the shapes of their benefit and cost functions [95], and disturbance-dispersal dynamics [89] also play an important role in the evolution of cooperation, but for the sake of illustration and brevity we have omitted these molecular and ecological details from our study.

Our analytical model and simulations illustrate how scaling analysis can be applied to make predictions about the evolution of social behaviour in cell groups, and how one may relate the detailed parameters of cell growth, enzyme secretion and solute diffusion to the abstractions of evolutionary theory.