Mathematical models are widely used to describe and analyse complex systems and processes. Formulating a model to describe, e.g. a signalling pathway or host-parasite system, requires us to condense our assumptions and knowledge into a single coherent framework (May, 2004). Mathematical analysis and computer simulations of such models then allow us to compare model predictions with experimental observations in order to test, and ultimately improve, these models. The continuing success, e.g. of systems biology, relies on the judicious combination of experimental and theoretical lines of argument.

Because many of the mathematical models in biology (as in many other disciplines) are too complicated to be analysed in closed form, computer simulations have become the primary tool in the quantitative analysis of very large or complex biological systems. This, however, can complicate comparisons of different candidate models in light of (frequently sparse and noisy) observed data. Whenever probabilistic models exist, we can employ standard model selection approaches of either a frequentist, Bayesian or information-theoretic nature (Burnham and Anderson, 2002; Vyshemirsky and Girolami, 2008). But if suitable probability models do not exist, or if the evaluation of the likelihood is computationally intractable, then we have to base our assessment on the level of agreement between simulated and observed data. This is particularly challenging when the parameters of simulation models are not known but must be inferred from observed data as well. Bayesian model selection side-steps or overcomes this problem by marginalizing (i.e. integrating) over model parameters, thereby effectively treating all model parameters as nuisance parameters.

For the case of parameter estimation when likelihoods are intractable, approximate Bayesian computation (ABC) frameworks have been applied successfully (Beaumont *et al.*, 2002; Marjoram *et al.*, 2003; Ratmann *et al.*, 2007, 2009; Sisson *et al.*, 2007; Toni *et al.*, 2009). In ABC, the calculation of the likelihood is replaced by a comparison between the observed data and simulated data. Given the prior distribution *P*(θ) of parameter θ, the goal is to approximate the posterior distribution, *P*(θ|*D*_{0}) ∝ *f*(*D*_{0}|θ)*P*(θ), where *f*(*D*_{0}|θ) is the likelihood of θ given the data *D*_{0}. ABC methods have the following generic form:

- Sample a candidate parameter vector θ^{*} from the prior distribution *P*(θ).
- Simulate a dataset *D*^{*} from the model described by a conditional probability distribution *f*(*D*|θ^{*}).
- Compare the simulated dataset, *D*^{*}, to the experimental data, *D*_{0}, using a distance function, *d*, and tolerance ϵ; if *d*(*D*_{0}, *D*^{*})≤ϵ, accept θ^{*}. The tolerance ϵ≥0 is the desired level of agreement between *D*_{0} and *D*^{*}.
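The three steps above can be sketched as a plain rejection sampler. This is an illustrative toy, not the paper's implementation: the model is a hypothetical Gaussian with unknown mean θ, the prior is uniform on [−5, 5], and the distance compares sample means.

```python
import random

def abc_rejection(data, prior_sample, simulate, distance, eps, n_accept):
    """Generic ABC rejection sampler: collect parameters whose
    simulated datasets lie within eps of the observed data."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample()              # step 1: draw theta* from the prior P(theta)
        sim = simulate(theta)               # step 2: simulate D* from f(D | theta*)
        if distance(data, sim) <= eps:      # step 3: accept theta* if d(D0, D*) <= eps
            accepted.append(theta)
    return accepted

# Toy example (hypothetical model): infer the mean of a Gaussian with known sd = 1.
random.seed(1)
observed = [random.gauss(2.0, 1.0) for _ in range(100)]
mean = lambda xs: sum(xs) / len(xs)

posterior_sample = abc_rejection(
    data=observed,
    prior_sample=lambda: random.uniform(-5.0, 5.0),           # P(theta)
    simulate=lambda th: [random.gauss(th, 1.0) for _ in range(100)],
    distance=lambda d0, ds: abs(mean(d0) - mean(ds)),         # summary-statistic distance
    eps=0.1,
    n_accept=200,
)
```

Note that the likelihood is never evaluated: acceptance or rejection of each θ^{*} depends only on the simulated data, which is what makes the scheme applicable when *f*(*D*|θ) is intractable.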

The output of an ABC algorithm is a sample of parameters from the distribution *P*(θ|*d*(*D*_{0}, *D*^{*})≤ϵ). If ϵ is sufficiently small, then this distribution will be a good approximation of the ‘true’ posterior distribution, *P*(θ|*D*_{0}). A tutorial on ABC methods is available in the Supplementary Material.

Such a parameter estimation approach can be used whenever the model is known. However, when several plausible candidate models are available we face a model selection problem, where both the model structure and the parameters are unknown. In the Bayesian framework, model selection is closely related to parameter estimation, but the focus shifts onto the marginal posterior probability of model *m* given data *D*_{0},

*P*(*m*|*D*_{0}) = *P*(*D*_{0}|*m*)*P*(*m*)/*P*(*D*_{0}),

where *P*(*D*_{0}|*m*) is the marginal likelihood and *P*(*m*) the prior probability of the model (Gelman *et al.*, 2003). This framework has some conceptual advantages over classical hypothesis testing: for example, we can rank an arbitrary number of different non-nested models by their marginal probabilities; and rather than only considering evidence against a model, the Bayesian framework also weighs evidence in a model's favour (Jeffreys, 1939). In practical applications, however, a range of potential pitfalls needs to be considered: model probabilities can show strong dependence on model and parameter priors; and the computational effort needed to evaluate these posterior distributions can make these approaches cumbersome.
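As a minimal numerical illustration of ranking models by their posterior probabilities (the marginal likelihoods below are made-up values, not results from any model in this article):

```python
# Hypothetical marginal likelihoods P(D0 | m) for three candidate models,
# with a uniform model prior P(m). All numbers are illustrative only.
marginal_lik = {"m1": 1.2e-4, "m2": 3.0e-5, "m3": 8.0e-6}
prior = {m: 1.0 / 3.0 for m in marginal_lik}

# Bayes' theorem: P(m | D0) = P(D0 | m) P(m) / sum over m' of P(D0 | m') P(m')
evidence = sum(marginal_lik[m] * prior[m] for m in marginal_lik)
model_posterior = {m: marginal_lik[m] * prior[m] / evidence for m in marginal_lik}

# The models need not be nested: the posterior probabilities give a ranking directly.
ranking = sorted(model_posterior, key=model_posterior.get, reverse=True)
```

Because the denominator normalizes over all candidate models, the posterior probabilities sum to one, and any number of (possibly non-nested) models can be compared on the same footing.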

The computationally expensive step in Bayesian model selection is the evaluation of the marginal likelihood, which is obtained by marginalizing over model parameters; i.e. *P*(*D*_{0}|*m*)=∫*f*(*D*_{0}|*m*, θ)*P*(θ|*m*)*d*θ, where *P*(θ|*m*) is the parameter prior for model *m*. Here, we develop a computationally efficient ABC model selection formalism based on a sequential Monte Carlo (SMC) sampler. We show that our ABC SMC procedure allows us to employ the whole paraphernalia of the Bayesian model selection formalism, and illustrate the use and scope of our new approach in a range of models: chemical reaction dynamics, Gibbs random fields and real data describing influenza spread and JAK-STAT signal transduction.
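Why this step is expensive can be seen from the simplest (and least efficient) estimator of the integral: drawing parameters from the prior and averaging the likelihood. The sketch below uses a toy Gaussian model whose likelihood *is* available in closed form, purely to make the integral *P*(*D*_{0}|*m*)=∫*f*(*D*_{0}|*m*, θ)*P*(θ|*m*)*d*θ concrete; in the ABC setting of this article the likelihood itself is unavailable, which is precisely what the SMC sampler is designed to work around.

```python
import math
import random

def log_lik(data, theta, sd=1.0):
    """Gaussian log-likelihood log f(D0 | theta) for a toy model."""
    return sum(-0.5 * math.log(2 * math.pi * sd ** 2)
               - (x - theta) ** 2 / (2 * sd ** 2) for x in data)

def marginal_likelihood(data, prior_sample, n=5000):
    """Naive Monte Carlo estimate of P(D0 | m): the expectation of the
    likelihood f(D0 | theta) under the parameter prior P(theta | m)."""
    liks = [math.exp(log_lik(data, prior_sample())) for _ in range(n)]
    return sum(liks) / n

random.seed(0)
data = [random.gauss(0.5, 1.0) for _ in range(10)]
# Uniform prior on theta over [-3, 3]; its density is handled by sampling from it.
estimate = marginal_likelihood(data, lambda: random.uniform(-3.0, 3.0))
```

Even in this toy case, most prior draws contribute negligibly to the average, so the estimator needs many samples; with an intractable likelihood, each "evaluation" must additionally be replaced by a full model simulation, which motivates the sequential Monte Carlo scheme developed here.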