Int J Epidemiol. 2010 February; 39(1): 97–106.
Published online 2009 October 9. doi:  10.1093/ije/dyp296
PMCID: PMC2912489

Causal thinking and complex system approaches in epidemiology


Identifying biological and behavioural causes of diseases has been one of the central concerns of epidemiology for the past half century. This has led to the development of increasingly sophisticated conceptual and analytical approaches focused on the isolation of single causes of disease states. However, the growing recognition that (i) factors at multiple levels, including biological, behavioural and group levels may influence health and disease, and (ii) that the interrelation among these factors often includes dynamic feedback and changes over time challenges this dominant epidemiological paradigm. Using obesity as an example, we discuss how the adoption of complex systems dynamic models allows us to take into account the causes of disease at multiple levels, reciprocal relations and interrelation between causes that characterize the causation of obesity. We also discuss some of the key difficulties that the discipline faces in incorporating these methods into non-infectious disease epidemiology. We conclude with a discussion of a potential way forward.

Keywords: Agent-based modelling, dynamic systems modelling, epidemiology, regression

The hunt for causes in epidemiology

Epidemiology is a practical discipline, ultimately concerned with identifying modifiable causes of disease and dysfunction. As such, it cannot escape issues related to how one conceptualizes causality and finds causes. The understanding of causality is, of course, not simple, and we should be mindful that whether it is in the social or biological sciences, there may be no real ‘gold standard’ for doing so.1 As epidemiology has developed as a discipline in the last quarter century, the quest for isolating ‘causes’ has emerged as one of the central foci in the field.2 The sufficient-component causal model3 served as an early organizing heuristic, and more recently, the counterfactual paradigm4 has allowed epidemiologists to cast other concerns that are central to the field, particularly issues of confounding, within a causal framework. Indeed, important epidemiological developments, both conceptual and methodological, have grown directly out of this emerging clarity about the nature of causes, including, for example, directed acyclic graphs5 and marginal structural models.6

These developments have provided support for the focus on isolating factors, primarily behavioural or biological, that we may term causes of a particular disease state. In many respects, this has been a successful enterprise. For example, epidemiological studies demonstrating strong causal connections between smoking and lung cancer and asbestos exposure and mesothelioma have strengthened our resolve that we can discover causes of disease states. A greater sophistication in our ability to isolate these independent factors, and to identify which ones of them do indeed cause disease, increases the usefulness of epidemiology that ultimately has always conceived of itself as a ‘pragmatic science’.2

In this article we discuss both some of the conceptual challenges that are inherent in our current conceptualization of causes and how a shift in our methodological approach may allow us to better grapple with the complexity of causation within a multilevel understanding of disease etiology. Overall, from the point of view of a simple organizing definition, we think that Susser's timeless formulation that ‘a cause is something that makes a difference’ is apposite. Susser noted that ‘insofar as epidemiology is a science that aims to discover the causes of health states, the search includes all determinants of a health outcome’.2 We argue in this article that our growing sophistication in population health increasingly illustrates that for many diseases there are numerous factors at different levels of influence that meet this criterion. It is this recognition that compels us to consider new methodological approaches that can adequately deal with the full scope of causes ‘that make a difference’.

Complicating causes

Our hunt for causes is complicated by two emerging observations. First, factors at multiple levels, including biological, behavioural, group and macro-social levels, all have implications for the production and distribution of health.7 Secondly, these factors frequently influence one another and, in addition, are sometimes influenced by the health indicators of interest. We illustrate each of these observations by focusing on obesity, synthesizing a well-established body of work in the area.

The rapid increase in obesity in the USA and other countries during the past 15 years has been the subject of much academic discussion. Recent publications have even suggested that the increase in childhood obesity in the USA might result in the first reversal of life expectancy in this country within the past century (excepting a very brief reversal during the 1918 influenza pandemic).8 The academic focus on the etiology of the ‘obesity epidemic’ has been intense, and the evidence now suggests that a diverse range of factors influence obesity. These include but are not limited to: (i) endogenous factors such as genes and factors influencing their expression; (ii) individual-level factors such as behaviours (size of food portions, dietary habits, exercise, television-viewing patterns), education and income; (iii) neighbourhood-level factors such as availability of grocery stores, suitability of the walking environment and advertising of high-caloric foods; (iv) school-level factors such as availability of high-caloric foods and beverages and health education; (v) district or state-level policies that regulate marketing of high-caloric foods; (vi) national-level surplus food programmes, other food distribution programmes and support for various agricultural products; and (vii) from a lifecourse perspective, history of breastfeeding, maternal health and parental obesity.9,10

So, what is the cause of obesity?

This example highlights the problem for the dominant epidemiological causal paradigm, namely, how do we think about, and analytically grapple with, the potential contribution of factors at all these levels of influence when we are centrally concerned with isolating independent and actionable ‘causes’?

We may think of three different possible solutions to this problem.

The first is conceptual. One might argue that factors that are at higher levels of influence and exert their influence indirectly by way of other factors are not truly ‘causes’. However, this solution is unsatisfying. For example, if legislation eliminated the production of cigarettes then (eventually) most lung cancer would be eliminated. Would it not make sense then to argue that such legislation was causally related to the occurrence of lung cancer in the population? If we accept, as the evidence amply suggests, that these higher level influences do indeed matter and do, in some way, influence the likelihood of obesity, a pragmatic scientific approach1 would suggest that we should indeed consider these factors as part of a set of causes, or at the very least as causes of causes11 and hence worthy of our epidemiological interest.12

A counterfactual approach suggests that a better understanding of causation is an understanding of how interventions to change a variable would ultimately lead to changes in an outcome variable of interest, compared with a counterfactual scenario in which no intervention occurred. This would involve not only an estimate of the strength, but also the timing of the effects, as some causal variables might produce more immediate effects, whereas others might have a slower but more lasting impact. This approach is agnostic as to the level at which such interventions occur.

The second solution could be to extend the tools that we currently use within epidemiology to deal with these other factors. Largely in response to this question there has been in the past decade a dramatic increase in the use of multilevel, or hierarchical, regression models within epidemiology. These models allow epidemiologists to consider the contribution of factors at multiple levels while taking into account factors at other levels that may confound the relation between the central factor (cause) of interest and the key disease outcome. Unfortunately, although useful, multilevel methods do little to help deal with a fundamental limitation of all regression-based models, namely that these models are concerned with assessing the relation between ‘independent’ variables and ‘outcomes’ of interest. Therefore, this approach, as commonly used, does little to take into account the dynamic and reciprocal relations between some ‘exposures’ and ‘outcomes’, discontinuous relations or changes in the relations between ‘exposures’ and ‘outcomes’ over time.

Going back to our obesity example, even though individual exercise patterns are linked to the risk of obesity,13 obesity is also a determinant of individual exercise patterns.14 Similarly, although dietary habits are clearly linked to risk of obesity,14 individual dietary habits are in turn shaped both by social networks15 and by the availability of food in an individual's neighbourhood.16 Also, it is likely that the relation among all the key factors, or causes, of obesity is not easily parameterized. It would be a substantial assumptive stretch to argue that there is a linear relation, for example, between suitability of the walking environment and risk of obesity, and even more of a stretch to argue that any hypothesized parametric relation is consistent across all the relations of interest in shaping obesity. Therefore, in this one example we can see a reciprocal relation between putative ‘exposures’ and ‘outcomes’, clear interrelations between key independent variables of interest, and absence of clear predictable parametric relations. Regression models, although clearly helpful at identifying isolated relations between covariates while taking into account potential confounders, are poorly suited to deal with these complications. Interrelations among individuals can also lead to violations of the stable unit treatment value assumption (SUTVA), since a treatment that affects the obesity of one individual could also affect the obesity of his/her friends.

A third possible solution, and the focus of this article, is the adoption of complex systems dynamic computational models. Complex systems approaches in general allow us to take into account both causal influences at multiple levels and the interrelations among causal covariates that strain most widely used analytic methods. There have been previous calls for the adoption of complex systems dynamic methods in epidemiology,17 and the conceptual basis for these arguments can be traced back as far as Morris (1957).18,19 These calls have grown in recent years, and recent writing in the field has gone further in showing how these methods can substantially move us forward in our thinking.20–22

A methodological shift: complex systems and complex systems dynamic analytic approaches

Complex systems are systems that are characterized by feedbacks, interrelations among agents and discontinuous non-linear relations.23,24 Drawing on this definition, we refer to complex systems dynamic analytic approaches as computational approaches that make use of computer-based algorithms to model dynamic interactions between individual agents (e.g. persons, cells) or groups and their properties, within and across levels of influence. The most common approach is agent-based modelling. Implementing an agent-based model (ABM) involves specifying rules for individual agents and how they interact with each other, and running computer simulations to observe how these individual-level rules translate into patterns at the population level. Different variations of ABMs are possible. The agents may be adaptive, so that their behaviour changes over time to replicate more successful behaviours, or may use pre-set rules that determine how they respond to different stimuli. The agents may interact based on their spatial proximity, or based on a social network structure that specifies how interactions occur. Agents can also be clustered into groups at different levels (such as organs, households or neighbourhoods) and can be influenced by group-level variables.
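
The ingredients of an ABM described above (rules for individual agents, a structure for their interactions and repeated simulation to observe population-level patterns) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model taken from the literature cited here: the function name run_abm, the single behaviour score held by each agent, the ring-shaped social network and the peer_weight parameter are all hypothetical choices made for brevity.

```python
import random


def run_abm(n_agents=100, n_steps=50, peer_weight=0.3, seed=0):
    """Toy agent-based model (hypothetical): agents drift toward their peers."""
    rng = random.Random(seed)
    # Each agent holds one attribute: a behaviour score between 0 and 1.
    initial = [rng.random() for _ in range(n_agents)]
    # Assumed social network: a ring, each agent tied to its two neighbours.
    neighbours = [((i - 1) % n_agents, (i + 1) % n_agents)
                  for i in range(n_agents)]
    behaviour = list(initial)
    history = [sum(behaviour) / n_agents]  # population mean at each step
    for _ in range(n_steps):
        updated = []
        for i in range(n_agents):
            # Agent-level rule: move part of the way toward the peer average.
            peer_mean = sum(behaviour[j] for j in neighbours[i]) / 2
            updated.append((1 - peer_weight) * behaviour[i]
                           + peer_weight * peer_mean)
        behaviour = updated  # all agents update simultaneously
        history.append(sum(behaviour) / n_agents)
    return initial, behaviour, history


initial, final, history = run_abm()
print("spread before:", max(initial) - min(initial))
print("spread after: ", max(final) - min(final))
```

In this toy setting the population mean stays constant while the spread of behaviour across agents narrows, a population-level pattern that is not written into any single agent's rule. A more realistic model would replace the ring with an empirically informed network and add group-level variables.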

In many respects then, complex systems dynamic methods may provide solutions to the important challenges to extant epidemiological methods that are posed by the multilevel causal thinking that we suggest in this article. Some of these challenges can be captured by alternative techniques, such as systems dynamic modelling. Systems dynamic models share a number of features in common with complex systems dynamic models, in that they can also involve computer-based algorithms that capture dynamic interactions among different variables, with feedback loops. They can be used to simulate the results of interventions, and compare them with counterfactual outcomes. However, they are not well suited to capturing other key features of the real world, such as the role of social networks in influencing behaviour, the role of heterogeneous individuals who may adapt their behaviour in response to policy changes and the selective movement of individuals between neighbourhoods depending on the features of the individuals and neighbourhoods.
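
To make the idea of an aggregate model with a feedback loop concrete, the following is a minimal sketch of a systems dynamic model run once without and once with an intervention, so that the two trajectories can be compared as counterfactuals. Everything in it is an assumption made for illustration: the two stocks (population-average excess weight and physical activity), the functional forms, the parameter values and the function name simulate_feedback are hypothetical and are not estimated from data.

```python
def simulate_feedback(intake, steps=500, dt=0.1,
                      burn=0.3, loss=0.5, max_activity=1.0, adjust=1.0):
    """Euler integration of a toy two-stock model with a feedback loop.

    weight:   population-average excess weight (arbitrary units)
    activity: population-average physical activity (arbitrary units)
    All parameter values are hypothetical.
    """
    weight, activity = 1.0, 0.5
    for _ in range(steps):
        # Weight rises with intake and falls with activity and natural loss.
        d_weight = intake - burn * activity - loss * weight
        # Activity relaxes toward a level that declines as weight rises,
        # which closes the feedback loop between the two stocks.
        d_activity = adjust * (max_activity / (1.0 + weight) - activity)
        weight += dt * d_weight
        activity += dt * d_activity
    return weight, activity


baseline = simulate_feedback(intake=1.0)      # counterfactual: no intervention
intervention = simulate_feedback(intake=0.8)  # hypothetical policy lowering intake
effect_on_weight = intervention[0] - baseline[0]
print("difference in long-run weight:", effect_on_weight)
```

Because the model tracks only population averages, it cannot represent heterogeneous individuals or network ties, which is the limitation of this class of models noted above.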

If the appearance of obesity in populations and the secular trends in obesity that we observe truly reflect the combined effects of the interaction of multiple factors at the genetic, metabolic, behavioural, psychological, social network, built environment, institutional, food supply and food policy levels, then it would be surprising if it could be simply understood by reference to a single level of determination. As Rea and colleagues note25 ‘… living organisms are better understood as complex adaptive systems characterized by multiple participating agents, hierarchical organization, extensive interactions among genetic and environmental effects, nonlinear responses to perturbation, temporal dynamics of structure and function, distributed control, redundancy, compensatory mechanisms, and emergent properties’. If this is true of individual organisms, then moving beyond the individual into the realm of social and policy processes must surely increase the complexity of the causal processes exponentially.

In addition, there is increased understanding that in the case of such complex systems, the appearance of emergent properties is the rule and not the exception. Emergent properties are system-wide phenomena that cannot simply be understood as the combination of independent individual components. At the physical level, the properties of water are not easily derived from understanding the behaviour of hydrogen and oxygen molecules. At the biological level, serum concentration of high density lipoprotein-cholesterol (HDL-C) is related to so many high-dimensional and interacting factors that it appears to be an emergent phenotype.25 At the social network level, levels of organization such as communities emerge from the interaction of much simpler networks.26 Finally, at the political level, emergent phenomena have also been described.27 The study of emergent phenomena is relatively new and not without contention. Emergent phenomena may simply represent the as yet imperfectly understood result of high levels of non-linear interaction in dynamic, multilevel systems, or they may represent behaviour that is fundamentally not deducible from the individual forces in the system; hence, they are not easily incorporated into the standard regression-based approach that is primarily used in understanding population health. Complex systems dynamic methods have demonstrated the ability to produce emergent phenomena from simple rules, and in some cases have replicated system phenomena from the real world that have been difficult to replicate with alternative methods. The number of persons who die in wars follows a regular power–law distribution that political scientists have not found easy to explain, but that has been found to emerge from a model of competing states in the face of technological changes.28 The patterns followed by flocks of birds would be difficult to capture in equations by observing the system as a whole, but can be replicated nicely using simple rules to govern the behaviour of individual birds.29,30

Observations such as those we draw here, applied to other phenomena of interest, have led complex systems dynamic analytic approaches to be embraced and used extensively in many disciplines.31 Systems biologists are increasingly utilizing complex systems approaches.32 Ecologists have used complex systems to approach issues of multi-scale and dynamic interactions within ecosystems.33 Economists have long considered both the joint characteristics of individuals and of global societal dynamics that influence economic systems,34,35 and have adopted complex systems dynamic analytic approaches, including evolutionary complex models that show how different stock market trading strategies can emerge from simple evolutionary rules.36 Complex systems computational approaches also have been applied in organizational science using multi-agent approaches that can have direct applicability for policies aimed at improving organizational effectiveness.37 In political sciences, complex systems computational models have been applied to questions of state formation, power politics27,38 and the role of power sharing in encouraging secessionism.39 Other work that has modelled civil violence has helped inform understanding of how group behaviour may lead to communal violence.40

Public health, however, has lagged substantially behind other disciplines in the adoption of these approaches. Although there has been a call for a growing integration of complex systems methods into public health analysis,17,21,41–43 the bulk of the work has been limited to areas of infectious disease processes (e.g. see ref.44). We are aware of only a handful of applications of complex systems dynamic computational approaches to non-infectious disease epidemiology.45 Importantly, the field of infectious disease transmission is an exception, as complex systems methods are increasingly being used to model person-to-person transmission of disease in populations, a point to which we return later.

Challenges in the application of complex systems dynamic models to epidemiological questions

Given this rationale for the application of complex systems dynamic models to epidemiological questions, the question emerges—Why have these methods not found wider application within public health in general and within epidemiology in particular? Although there is, with all methods, a lag time in adoption of novel approaches, there have been clear calls in the peer-reviewed literature for the applications of these methods to epidemiology for at least a decade.17 Resources are now widely available for use in understanding complex systems thinking,31 agent-based modelling,29,46,47 software tools for developing ABMs48 and techniques for fitting models with data.47,49,50 There are readily available continuing education seminars and workshops that instruct and inform interested epidemiologists. We offer here suggestions about the challenges that face epidemiologists in the adoption of complex systems dynamic models to our work.

The determination of individual disease

Epidemiology is concerned with the determination of the distribution of health and disease in populations, a determination that ultimately rests on inter-individual variation in physiological processes that shape health within individuals. Although epidemiological methods are predicated on population-based methods that should, in a perfect world, be used only for group-level inference, epidemiologists are nonetheless accustomed to thinking of our methods as providing insight into individual health and disease formation. The epidemiological concern with individual health and disease poses a substantial challenge to methods, such as complex systems dynamic models, which are primarily and centrally concerned with the determination of population patterns and the modulation of those patterns by the interrelation among other features of these same populations. In many respects then, complex systems dynamic models push epidemiology even further from its early roots in clinical practice and ever closer to a concern with the health of populations. Although we might argue that such a shift would represent a positive development, it carries with it challenges that must be bridged if epidemiology is to embrace the potential use of complex systems dynamic models.

Parameterizing models

In the vast majority of the articles we cite here, and in the peer-reviewed literature at large, complex systems dynamic approaches have rested on modelling of parameters that are often either hypothesized or created as exemplars to test the interaction or behaviour of particular agents. That is, the estimates for parameters are more a matter of convenience or of intellectual interest to explore a model's behaviour, than solidly based in data. In fact, one of the criticisms of complex systems dynamic approaches in the broader literature is that complex systems modelling exercises are a ‘fact-free science’.51 In addition, complex systems dynamic techniques often rest on assumptions regarding interrelations between specific components that reflect the biases of the analysts.52 For example, many complex systems dynamic models embed assumptions about the distribution and magnitude of parameters, aggregate relationships between actors and the importance of including or excluding certain variables.53 The emphasis is sometimes on how the model ‘works’, and not on whether the parameters are tied to observable data or on the sensitivity of the specific empiric predictions of the models to realistic values of these parameters.

However, in many cases, assumptions about parameterization and parameter values are critical to the inference for the real world drawn from these models. In many respects, this historic reliance of complex systems dynamic modelling approaches on hypothesized or test parameters is inimical to epidemiology's heavy reliance on methods that produce the best possible estimates and parameters. Therefore, the adoption of complex systems dynamic models in epidemiology will necessitate the incorporation of tangible, evidence-based, parameterization into approaches that have not always been so focused on data.

Facility with simulated data

Epidemiological methods, frequently married with biostatistical techniques and approaches, continue to dwell, almost entirely, on the analysis of data that are collected through epidemiological studies and the application of various statistical techniques to document associations present in the data collected. Adoption of complex systems dynamic models in epidemiology will require a conceptual shift for epidemiology and public health: a shift away from statistical association models focused on effect estimates to simulations in which we can test scenarios under different conditions, rather than simply observing associations within finite and specific datasets. Although sensitivity analyses do test different scenarios, they are generally an exercise focused on asking how various assumptions about measurement error or other barriers to inference influence the observed findings. By contrast, the modelling exercise of sweeping through a broad range of possible scenarios is more focused on understanding how the relationships within a complex system behave in an unexpected fashion.

Although, as noted above, complex systems dynamic models will still need to be parameterized using observational (or experimental) epidemiological data, these data will need to be used creatively, quilting together data from disparate sources in order to create simulation models that best help us answer the key epidemiological questions of interest. The shift from a dominant paradigm where we search for association in available data to the use of modelled data (albeit informed by existing data sources) to test scenarios is not insubstantial. Although testing alternate scenarios is uncommon in epidemiology, such an approach lends itself well to the counterfactual paradigm which is, itself, predicated on the notion of alternate/hypothetical counterfactual universes that change only one factor and speculate on whether such a change has an effect on the disease outcome of interest.4

Identifying interactions and interrelations

Our central premise in this article is that complex systems dynamic models have much to offer epidemiology and it is time for epidemiology to consider adopting these methods as part of its toolkit. However, as we have noted in several places in this article, this recommendation is focused mainly on non-infectious disease epidemiology. Indeed, infectious disease epidemiologists have used complex systems dynamic methods effectively to model the transmission of diseases from person to person. Complex systems dynamic models rest on modelling interactions and interrelations, and on understanding how these interactions contribute to the emergence of patterns in populations, be they in the form of interrelations among individuals or of dynamic feedback between states of a particular individual within environments that are also dynamically changing. This approach is intuitively easier to understand when considering transmission of pathogenic organisms between individuals, providing clear links among persons within a model.

Epidemiologists are less accustomed to modelling inter-individual relations when concerned with pathology that is not predicated on person-to-person transmission. Epidemiological inquiry focused on the role of the social environment in shaping individual health, or social epidemiology, is one of the most rapidly growing fields of epidemiology.41 However, even social epidemiology often models factors that arise from group interaction as endogenous properties of individuals. For example, although there is evidence that social supports are protective of the risk of coronary death,54 these social supports are typically modelled as properties of individuals even though they are, by definition, relational and exogenous properties of a particular micro- or macro-population. Absent a clear conceptualization of the interrelations between individuals that shape health outcomes, the application of complex systems dynamic models to epidemiology will remain limited.

The role of time and lifecourse

Recent writing in epidemiology has drawn attention to a lifecourse perspective55 which recognizes that disease production in the individual is not a static product of individual circumstance at any given time, but rather a product of circumstances over the lifecourse, possibly starting in utero and proceeding through an individual's life. In many respects this more closely approximates the temporal nature of most disease processes that develop over considerable periods of time, including the complex interactions over time that lead to dynamic down- and up-regulation of regulatory systems. Although there is much that can be said on this topic that is beyond the scope of this article, complex systems dynamic models, by allowing the incorporation of changing, dynamic processes and their interrelations, provide a promising analytic approach to considering lifecourse perspectives in epidemiology. However, although lifecourse approaches have been well conceptualized, there is a substantial gulf between this conceptualization and our parameterization of the role of time in the determination of health and disease. Epidemiological studies remain largely short term, and even the few studies that have followed persons and populations over long periods of time seldom provide the richness of epidemiological detail that allows the reliable parameterization of changing temporal relations. Empiric studies have demonstrated the importance of certain early-life influences on health later in life, but it is rare to have exposure measures throughout the lifecourse in a cohort study that can be used to tease apart the relative impacts of different exposures at different life stages.56 Therefore, epidemiologists considering the implementation of these methods must contend with tools that allow them to model changing influence of determinants over time without much guidance from the empiric studies that we are accustomed to using in our work.

A way forward

We suggest that epidemiological thinking needs to broaden its conception of causes, and that such thinking may well be served by the adoption of complex systems dynamic models as part of our armamentarium. We have articulated a set of challenges that we argue has contributed to the slow diffusion of these methods within epidemiology. The potential of these methods seems vast, and the challenges that we need to bridge to successfully adopt them in epidemiology commensurately daunting. Is there then a way forward? We are all slow adopters of novel methods, even when the barriers to adoption of new methods are much lower than they are here. Unfamiliarity with methods and limited training in their implementation are probably reason enough to delay epidemiologists’ adoption of complex systems dynamic models. The additional challenges discussed here add to the inherent difficulties when a discipline faces new methodological approaches. We suggest, however, that in this case the potential offered by these methods is considerable, and that ultimately epidemiologists will identify ways to overcome these challenges and to adopt complex systems dynamic models in much the same way as regression models, once new and foreign to the field, quickly became the lingua franca in epidemiology.

In some respects, we are perhaps already ‘half-way there’ in dealing with many of these challenges. Although we remain concerned with looking for individual causes of individual diseases, the chorus of epidemiological voices noting the mismatch between our methods, our outlook and the hunt for individual cause–disease relations is rising.7,17,20,21,41–43 In addition, epidemiological methodologists have long made ready use of mathematical simulations and other approaches in their quest to understand and refine methods that add to the epidemiological armamentarium. It would be one small step to move methodologists’ thinking from one concerned with fine-tuning methods in the hunt for causes, to incorporating methods that study interrelations and provide explanations for populations as systems. Similarly, although some work in complex computational analytic approaches remains highly theoretical, other uses are solidly grounded in the use of real data.57

The experience of infectious disease epidemiologists

The spread of these models throughout infectious disease epidemiological practice suggests that, in the right context, epidemiologists have also been able to turn to these unfamiliar methods to push parts of our discipline forward. Complex system models have contributed to the theoretical understanding of the spread of communicable diseases, as well as practical applications for predicting the effectiveness of different intervention strategies. For example, one of the key theoretical results in infectious disease epidemiology is that there is a threshold for the density of susceptibles in a population that determines whether an epidemic will die off or spread through the population.58 One consequence of this theory is that if a sufficient portion of the population can be immunized to reduce the density of susceptibles below this threshold, the disease will die out.59 This result has great potential for practical application, though the effectiveness of efforts to apply it to real world diseases has been mixed.59
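
The threshold result can be illustrated with the classic deterministic SIR (susceptible, infectious, recovered) model under random mixing. In that model the number of infections grows only while the susceptible fraction exceeds 1/R0, where R0 is the basic reproduction number, so the critical immunization fraction is 1 - 1/R0. The sketch below is an illustration under stated assumptions rather than a calibrated model: the value R0 = 3, the recovery rate, the initial infectious fraction and the function names are hypothetical choices.

```python
def herd_immunity_threshold(r0):
    """Critical immunized fraction in the random-mixing SIR model."""
    return 1.0 - 1.0 / r0


def simulate_sir(r0, immunized_fraction, gamma=0.2, i0=1e-4,
                 dt=0.1, days=400):
    """Euler integration of the SIR model in population fractions.

    Returns the peak infectious fraction and the cumulative fraction
    newly infected. gamma (recovery rate per day) and i0 are assumed values.
    """
    beta = r0 * gamma                   # transmission rate implied by R0
    s_start = 1.0 - immunized_fraction - i0
    s, i = s_start, i0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak, s_start - s


r0 = 3.0
print("threshold:", herd_immunity_threshold(r0))
print("no immunization:  ", simulate_sir(r0, immunized_fraction=0.0))
print("80 percent immune:", simulate_sir(r0, immunized_fraction=0.8))
```

With immunization above the threshold the infectious fraction never rises above its starting value, whereas with no immunization the same parameters produce a large outbreak.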

Fox et al.60 helped to highlight a key reason for the failure of this model to accurately predict the threshold of herd immunity in real-world diseases. These threshold results were derived from the use of systems dynamic models that assumed random mixing of the population. If the contact rate is higher in some subsets of the population than others, then the required immunity level in those populations may need to be higher than in others. Complex systems dynamic models that explicitly model the social network structure have developed this idea further, showing how different network structures can lead to different results for the epidemiological threshold,61 or even the non-existence of an epidemiological threshold.62
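To illustrate how contact structure can shift or even remove the threshold, the following sketch uses a standard mean-field approximation for infections spreading on uncorrelated networks, in which the critical transmissibility is approximately ⟨k⟩/⟨k²⟩, where k is the number of contacts per person. This is a simplified illustration rather than the models of the cited studies; the degree sequences, the exponent and the function names are hypothetical choices made for the example.

```python
def epidemic_threshold(degrees):
    """Mean-field threshold <k>/<k^2> for an uncorrelated contact network."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2


def heavy_tailed_degrees(n, exponent=2.5, k_min=2, k_max=1000):
    """Deterministic degree sequence approximating P(k) ~ k**(-exponent).

    Uses evenly spaced quantiles of a Pareto distribution, rounded down
    and capped at k_max, so the result does not depend on a random seed.
    """
    degrees = []
    for j in range(n):
        u = (j + 0.5) / n
        k = int(k_min * (1.0 - u) ** (-1.0 / (exponent - 1.0)))
        degrees.append(min(k, k_max))
    return degrees


# Everyone has the same number of contacts (close to random mixing).
homogeneous = [6] * 10000
# A few highly connected individuals dominate the contact structure.
heterogeneous = heavy_tailed_degrees(10000)
```

Comparing `epidemic_threshold(homogeneous)` with `epidemic_threshold(heterogeneous)` shows a much lower threshold in the heavy-tailed network, and raising `k_max` (together with `n`) pushes it lower still. This is one way to see why, in this approximation, a sufficiently heterogeneous network may have no effective threshold at all.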

Recent modelling of the spread of infectious diseases has also drawn on extensive data sources and modern computing power to provide even more effective tools that can be practically used to predict the effectiveness of different real-world interventions. The ‘Large Scale Agent Model’ developed by the Center on Social and Economic Dynamics at the Brookings Institution57 is a good example of this type of model. It incorporates census data and travel patterns in the USA into a model that can handle several billion agents. This type of empirically based model can provide an important tool in efforts to prevent the spread of infections such as the H1N1 virus. Other empirically driven models have been used to model the spread of different types of infectious diseases, such as malaria and HIV.63,64

A role for complex systems models in evaluating interventions to improve population health

We close by presenting an example of how a complex systems dynamic model could provide practical information that could be used in evaluating potential interventions for improving population health. Congruent with the example we have followed throughout the article, we consider policy interventions that aim to reduce obesity.

Interventions to improve health outcomes can target causes at multiple levels: individual, neighbourhood, school district, city, state or national. They could target downstream causes that directly influence the target variable, or more upstream causes such as income or education levels, whose influence is felt indirectly.

Evaluating direct, individual-level interventions is relatively easy: we can perform randomized controlled trials to determine whether individuals who receive the intervention are less likely to develop the health outcome of interest than individuals who do not receive the intervention. In the case of upstream, group-level causal variables, this is more difficult. A randomized experiment would require identifying large numbers of groups (neighbourhoods, cities, states, etc.) that are willing to participate in what could be a costly policy intervention. In addition, upstream policies may be harder to evaluate because it may take longer for their health effects to be felt. An evaluation of health impacts over a limited time frame may not tell the full story if the immediate impacts differ from the long-run impacts. To judge the relative effectiveness of different policy options, it therefore becomes crucial to have a model that can predict how the short-term effects will translate into long-run outcomes. This requires a model that can capture the complexity of the situation, combined with careful data analysis to ensure that the relationships in the model are an accurate depiction of the real world.

An illustrative example

We have developed a preliminary model of the influence of social and behavioural factors on obesity and cardiovascular disease, which we can use to illustrate the role that complex systems modelling could play in helping to evaluate upstream, group-level policies. The time required for a policy to have an impact and the long-run persistence of these effects depend greatly on the pathways from the intervention to the outcome, and on the strength of feedback loops that occur along this pathway. We illustrate this by using simulations of the impact of investing in good food stores on body mass index (BMI), under different assumptions about the importance of friend networks in influencing diets. The model we use for these simulations cannot be described in detail here owing to space limitations, but its key features relevant to these simulations are as follows. We use an ABM with agents arranged in a grid divided into 100 neighbourhoods, with social ties formed primarily between agents who live nearby. An agent's diet is determined by a combination of the availability of good food stores in her neighbourhood, her education level, the diet of her parents and friends and genetic predispositions. This in turn influences BMI, which adjusts gradually to changes in diet.

The policy that we simulate increases the level of investment in attracting good food stores in all 100 neighbourhoods. We run two versions of the simulation, one with weak friend network ties, and one with strong friend network ties, where each agent's choice of diet depends 90% on the diets of their friends and 10% on other factors. These relatively extreme assumptions about the importance of friend networks are chosen to illustrate the point more clearly, but noticeable differences in the results can be seen with smaller variations in the importance of friend networks as well. With each set of assumptions, we then evaluate the difference between the average BMI at each point in time in the policy simulation against the average BMI in a counterfactual simulation where there is no such policy. In both cases the policy is funded for 25 time units (e.g. months), after which it is cancelled. The outputs presented are averages of 10 model runs using the program Recursive Porous Agent Simulation Toolkit (REPAST).

The results of these simulations are shown in Figure 1. In the case with weak social network effects, the impact of the policy is felt more quickly, and the maximum impact is stronger. However, the impacts are more persistent in the case with strong social network effects, and it takes longer for them to dissipate.

Figure 1. Agent-based modelling simulation of population changes in BMI subsequent to the implementation of a policy to attract better food stores to local neighbourhoods, stratified by populations characterized by strong and weak network ties

This illustrates the importance both of understanding the strength of social network effects, and of having a model that can help illustrate how the network effect translates into policy relevant conclusions. Some progress has been made recently in evaluating the importance of friend networks in influencing the evolution of BMI over time, using an analysis of network data from the Framingham Heart Study.65 This type of data analysis is important in informing our understanding, and these efforts should be replicated and extended. Just as crucial is the construction of complex models that can be used to identify key pieces of information that should be studied, and to translate what is learned from the data into conclusions about policy; in this case translating information about the strength of friend network effects into conclusions about the timing of the impacts of policy interventions.
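The logic of the illustrative simulation can be sketched in a few lines of code. What follows is a deliberately simplified toy, not the REPAST model described above: agents sit on a ring rather than a grid of neighbourhoods, diet is a single unbounded score, and every parameter value (including the 10% friend weight used for the weak-ties case, the baseline BMI of 27 and the adjustment rate) is a hypothetical assumption chosen for illustration. Only the 90% friend weight for the strong-ties case and the 25-time-unit policy duration come from the description in the text.

```python
import random


def mean_bmi_series(friend_weight, policy_on, n_agents=200, n_friends=4,
                    steps=100, policy_duration=25, store_effect=1.0,
                    bmi_per_diet_unit=2.0, bmi_adjust_rate=0.1, seed=0):
    """Return the population mean BMI at each time step of one toy run."""
    rng = random.Random(seed)
    # Hypothetical non-social drivers of diet (education, parents, genes).
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n_agents)]
    diet = baseline[:]                      # higher score = healthier diet
    bmi = [27.0 - bmi_per_diet_unit * d for d in diet]
    half = n_friends // 2
    offsets = [o for o in range(-half, half + 1) if o != 0]
    series = []
    for t in range(steps):
        # The food-store policy improves non-social drivers while funded.
        boost = store_effect if (policy_on and t < policy_duration) else 0.0
        new_diet = []
        for i in range(n_agents):
            friends = sum(diet[(i + o) % n_agents] for o in offsets) / len(offsets)
            own = baseline[i] + boost
            new_diet.append(friend_weight * friends + (1.0 - friend_weight) * own)
        diet = new_diet
        # BMI adjusts gradually towards the level implied by current diet.
        for i in range(n_agents):
            target = 27.0 - bmi_per_diet_unit * diet[i]
            bmi[i] += bmi_adjust_rate * (target - bmi[i])
        series.append(sum(bmi) / n_agents)
    return series


def policy_effect(friend_weight):
    """Reduction in mean BMI relative to a no-policy counterfactual run."""
    with_policy = mean_bmi_series(friend_weight, policy_on=True)
    counterfactual = mean_bmi_series(friend_weight, policy_on=False)
    return [c - p for p, c in zip(with_policy, counterfactual)]


weak_ties = policy_effect(0.1)    # assumed weight for the weak-ties case
strong_ties = policy_effect(0.9)  # 90% friend weight, as in the text
```

In this toy version, the reduction in mean BMI peaks earlier and higher in `weak_ties`, while in `strong_ties` it builds more slowly and remains larger long after the policy has ended. The sketch is meant only to show how the strength of a feedback loop changes the timing of a policy's impact; it makes no claim to reproduce the magnitudes in Figure 1.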


There is precedent for the use of complex systems dynamic models for the purpose of understanding system behaviour and outcomes, for parameterizing these relations using real data and for deriving from these models insight that has practicable and immediate implications for populations. We intend this article to serve as both a challenge and an encouragement. We suggest here that complex systems modelling approaches have the potential to integrate our growing knowledge about multilevel causes of health and their patterns of feedback and interaction, and to inform our knowledge about how specific policy interventions influence the health of populations. We do not think that these approaches will necessarily be a panacea or that they will solve all the challenges epidemiology faces as we grapple with causal thinking. As in all statistical and computational models, the utility of these models depends strongly on the quality of the input data and the assumptions that inform the modelling effort. We do think that complex systems dynamic models provide a promising approach that can augment our epidemiological armamentarium and push us forward both conceptually and methodologically. Time will tell whether widespread adoption of these methods will move the field substantially forward; the ultimate test will be whether their adoption helps us address important epidemiological questions and moves us closer to improving the health of populations. However, as more epidemiologists recognize that complexity is a compelling and essential aspect of population systems, we think that growth in the application of complexity approaches to epidemiology in coming years is nearly inevitable. In that light, clearly articulating the methods that can help us achieve that goal, and the challenges we face in adopting them, can identify the barriers that need to be overcome and suggest a way forward.66


Supported by a Health Policy Investigator Award from the Robert Wood Johnson Foundation (to S.G. and G.K.), by grants DA 022720, MH 07815, MH 082729 (to S.G.) and HD 047861 (to G.K.) from the National Institutes of Health, and by a grant from the Institute for Integrative Health (to G.V.).

Conflict of interest: None declared.


  • The growing recognition that the interrelation among factors at multiple levels that influence health and disease often involves dynamic feedback and changes over time challenges the dominant epidemiological approach to identifying causes.
  • Complex systems dynamic models may provide one approach for epidemiologists to account for the complexity of disease causation in populations.
  • There are several challenges facing the discipline in incorporating these methods into non-infectious disease epidemiology.


1. Marini MM, Singer B. Causality in the social sciences. Sociological Methodology. 1988;18:347–409.
2. Susser M. What is a cause and how do we know one? A grammar for pragmatic epidemiology. Am J Epidemiol. 1991;133:635–48. [PubMed]
3. Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005;95:S144–45. [PubMed]
4. Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol. 2002;31:422–29. [PubMed]
5. VanderWeele TJ, Robins JM. Directed acyclic graphs, sufficient causes and the properties of conditioning on a common effect. Am J Epidemiol. 2007;166:1096–104. [PubMed]
6. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–56. [PubMed]
7. Kaplan GA, Everson SA, Lynch JW. The contribution of social and behavioral research to an understanding of the distribution of disease: a multilevel approach. In: Smedley BD, Syme SL, editors. Promoting Health: Intervention Strategies from Social and Behavioral Research. Washington, DC: National Academy Press; 2000.
8. Olshansky SJ, Passaro DJ, Hershow RC, et al. A potential decline in life expectancy in the United States in the 21st century. N Engl J Med. 2005;352:1138–45. [PubMed]
9. Dietz WH, Gortmaker SL. Preventing obesity in children and adolescents. Annu Rev Public Health. 2001;22:337–53. [PubMed]
10. Finkelstein EA, Ruhm CJ, Kosa KM. Economic causes and consequences of obesity. Annu Rev Public Health. 2005;26:239–57. [PubMed]
11. Rose G. Sick individuals and sick populations. Int J Epidemiol. 1985;14:32–38. [PubMed]
12. Galea S, Ahern J. Considerations about specificity of association, causal pathways and heterogeneity of association in multilevel thinking. Am J Epidemiol. 2006;163:1079–82. [PubMed]
13. DiPietro L. Physical activity, body weight and adiposity: an epidemiologic perspective. Exerc Sport Sci Rev. 1995;23:275–303. [PubMed]
14. Trost S, Owen N, Bauman AE, Sallis JF, Brown W. Correlates of adults’ participation in physical activity: review and update. Med Sci Sports Exerc. 2002;34:1996–2001. [PubMed]
15. Feunekes GI, de Graaf C, Meyboom S, van Staveren WA. Food choice and fat intake of adolescents and adults: associations of intakes within social networks. Prev Med. 1998;27:645–56. [PubMed]
16. Morland K, Wing S, Diez Roux A. The contextual effect of the local food environment on residents' diets: the atherosclerosis risk in communities study. Am J Pub Health. 2002;92:1761. [PubMed]
17. Koopman JS, Lynch JW. Individual causal models and population system models in epidemiology. Am J Pub Health. 1999;89:1170–74. [PubMed]
18. March D, Susser E. The eco in eco-epidemiology. Int J Epidemiol. 2006;35:1379–83. [PubMed]
19. Morris JN. Uses of Epidemiology. Edinburgh: Livingstone; 1957.
20. Saracci R. Everything should be made as simple as possible but not simpler. Int J Epidemiol. 2006;35:513–14. [PubMed]
21. Diez-Roux AV. Integrating social and biologic factors in health research: a systems view. Ann Epidemiol. 2007;17:569–74. [PubMed]
22. Pepe MS, Janes H, Longton G, Leisenring W, Newcomb P. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. Am J Epidemiol. 2004;159:882–90. [PubMed]
23. Arthur WB. Why do things become more complex? Sci Am. 1993;268:92.
24. Gell-Mann M. What is complexity? Complexity. 1995;1:16–19.
25. Rea TJ, Brown CM, Sing CF. Complex adaptive system models and the genetic analysis of plasma HDL-cholesterol concentration. Perspect Biol Med. 2006;49:490–503. [PMC free article] [PubMed]
26. Palla G, Derényi I, Farkas I, Vicsek T. Uncovering the overlapping community structure of complex networks in nature and society. Nature. 2005;435:814–18. [PubMed]
27. Cederman LE. Endogenizing geopolitical boundaries with agent-based modeling. PNAS. 2002;99:7296–303. [PubMed]
28. Cederman L-E. Modeling the size of wars. Am Pol Sci Rev. 2003;97:135–50.
29. Macy MW, Willer R. From factors to actors: computational sociology and agent-based modeling. Ann Rev Soc. 2002;28:143–66.
30. Reynolds CW. Flocks, herds and schools: a distributed behavioral model. Computer Graphics. 1987;21:25–34.
31. Miller J, Page S. Complex Adaptive Systems: An Introduction to Computational Models of Social Life. Princeton, New Jersey: Princeton University Press; 2007.
32. Theise ND, d'Inverno M. Understanding cell lineages as complex adaptive systems. Blood Cells Mol Dis. 2004;32:17–20. [PubMed]
33. Levins R, Lopez C. Toward an ecosocial view of health. Int J Health Serv. 1999;29:261–93. [PubMed]
34. Lansing JS. Complex adaptive systems. Annu Rev Anthropol. 2003;32:183–204.
35. Tesfatsion L. Economic agents and markets as emergent phenomena. PNAS. 2002;99:7191–92. [PubMed]
36. Arthur WB, Holland JH, LeBaron B, Palmer R, Tayler P. Asset Pricing Under Endogenous Expectations in an Artificial Stock Market. Santa Fe Institute Working Paper #96-12-093; 1996.
37. Carley KM. Computational organization science: a new frontier. Proc Natl Acad Sci USA. 2002;99:7257–62. [PubMed]
38. Cederman LE. Emergent polarity: analyzing state-formation and power politics. Int Stud Quarterly. 1994;38:501–33.
39. Lustick IS, Miodownik D, Eidelson RJ. Secessionism in multicultural states: does sharing power prevent or encourage it? Am Pol Sci Rev. 2004;98:209–29.
40. Epstein JM. Modeling civil violence: an agent-based computational approach. PNAS. 2002;99:7243–50. [PubMed]
41. Kaplan GA. What's wrong with social epidemiology, and how can we make it better? Epidemiol Rev. 2004;26:124–35. [PubMed]
42. Galea S, Ahern J, Karpati A. A model of underlying socio-economic vulnerability in human populations: evidence from variability in population health and implications for public health. Soc Sci Med. 2005;60:2417–30. [PubMed]
43. Ness RB, Koopman JS, Roberts MS. Causal system modeling in chronic disease epidemiology: a proposal. Ann Epidemiol. 2007;17:564–68. [PubMed]
44. Halloran ME, Longini IM, Nizam A, Yang Y. Containing bioterrorist smallpox. Science. 2002;298:1428–32. [PubMed]
45. Levy DT, Nikolayev L, Mumford E. Recent trends in smoking and the role of public policies: results from the SimSmoke tobacco control policy simulation model. Addiction. 2005;100:1526–36. [PubMed]
46. Axelrod R, Tesfatsion L. On-Line Guide for Newcomers to Agent-Based Modeling in the Social Sciences.
47. Grimm V, Railsback SF. Individual-Based Modeling and Ecology. Princeton, New Jersey: Princeton University Press; 2005.
48. Tesfatsion L. General Software and Toolkits: Agent-Based Computational Economics (ACE) and Complex Adaptive Systems (CAS) [(7 September 2009, date last accessed)].
49. Janssen MA, Ostrom E. Empirically based, agent-based models. Ecol Soc. 2006;11:2,37.
50. Windrum P, Fagiolo G, Moneta A. Empirical validation of agent-based models: alternatives and prospects. J Art Soc Soc Simulation. 2007;10:8.
51. Smith JM. Review of ‘the origins of order’. New York Rev Books. 1995;30:30–33.
52. Helmreich S. Silicon Second Nature: Culturing Artificial Life in a Digital World. Berkeley, CA: University of California Press; 1998.
53. Benoit K. Simulation methodologies for political scientists. Pol Methodologist. 2001;10:12–16.
54. Strike PC, Steptoe A. Psychosocial factors in the development of coronary artery disease. Prog Cardiovasc Dis. 2004;46:337–47. [PubMed]
55. Kuh D, Ben-Shlomo Y. A Lifecourse Approach to Chronic Disease Epidemiology. Oxford: Oxford University Press; 2004.
56. Ben-Shlomo Y, Kuh D. A life course approach to chronic disease epidemiology: conceptual models, empirical challenges and interdisciplinary perspectives. Int J Epidemiol. 2002;31:285–93. [PubMed]
57. Brookings Center on Social and Economic Dynamics. Large Scale Agent Model. [(7 September 2009, date last accessed)].
58. Bailey NTJ. The Mathematical Theory of Infectious Diseases. London: Griffin; 1975.
59. Fine PEM. Herd immunity: history, theory, practice. Epidemiol Rev. 1993;15:265–302. [PubMed]
60. Fox JP, Elveback L, Scott W, Gatewood L, Ackerman E. Herd immunity: basic concept and relevance to public health immunization practices. Am J Epidemiol. 1971;94:179–89. [PubMed]
61. Newman M. Spread of epidemic disease on networks. Phys Rev E. 2002;66:016128. [PubMed]
62. Pastor-Satorras R, Vespignani A. Epidemic spreading in scale-free networks. Phys Rev Lett. 2001;86:3200–03. [PubMed]
63. Berrang-Ford L, MacLean JD, Gyorkos TW, Ford JD, Ogden NH. Climate change and malaria in Canada: a systems approach. Interdis Perspec Infect Dis. 2009:1–13. [PMC free article] [PubMed]
64. Volz E, Meyers LA. SIR epidemics in dynamic contact networks. Proc Royal Soc Biol Sci. 2007;274:2925–33. [PMC free article] [PubMed]
65. Christakis NA, Fowler JH. The spread of obesity in a large social network over 32 years. New Eng J Med. 2007;357:370–79. [PubMed]
66. Mitchell M. Can evolution explain how the mind works? A review of the evolutionary psychology debates. Complexity. 1999;4:17–24.

Articles from International Journal of Epidemiology are provided here courtesy of Oxford University Press