Computational ideas pervade many areas of science and have an integrative explanatory role in neuroscience and cognitive science. However, computational depictions of cognitive function have had surprisingly little impact on the way we assess mental illness because diseases of the mind have not been systematically conceptualized in computational terms. Here, we outline goals and nascent efforts in the new field of computational psychiatry, which seeks to characterize mental dysfunction in terms of aberrant computations over multiple scales. We highlight early efforts in this area that employ reinforcement learning and game theoretic frameworks to elucidate decision-making in health and disease. Looking forwards, we emphasize a need for theory development and large-scale computational phenotyping in human subjects.
The idea of biological psychiatry seems simple and compelling: the brain is the organ that generates, sustains and supports mental function, and modern psychiatry seeks the biological basis of mental illnesses. This approach has been a primary driver behind the development of generations of anti-psychotic, anti-depressant, and anti-anxiety drugs that enjoy widespread clinical use. Despite this progress, biological psychiatry and neuroscience face an enormous explanatory gap. This gap represents a lack of appropriate intermediate levels of description that bind ideas articulated at the molecular level to those expressed at the level of descriptive clinical entities, such as schizophrenia, depression and anxiety. In general, we lack a sufficient understanding of human cognition (and cognitive phenotypes) to provide a bridge between the molecular and the phenomenological. This is reflected in questions and concerns regarding the classification of psychiatric diseases themselves, notably, each time the Diagnostic and Statistical Manual of Mental Disorders (DSM) of the American Psychiatric Association is revised .
While multiple causes are likely to account for the current state of affairs, one contributor to this gap is the (almost) unreasonable effectiveness of psychotropic medication. These medications are of great benefit to a substantial number of patients; however, our understanding of why they work on mental function remains rudimentary. For example, receptors are understood as molecular motifs (encoded by genes) that shuttle information from one cellular site to another. Receptor ligands, whose blockade or activation relieves psychiatric symptoms, furnished a kind of conceptual leap that seemed to obviate the need to account for the numerous layers of representation intervening between receptor function and behavioral change. This, in turn, spawned explanations of mental phenomena in simplistic terms that invoked a direct mapping from receptor activation to complex changes in mental status. We are all participants in this state of affairs, since symptom relief in severe mental disease is sufficient from a clinical perspective, irrespective of whether there are models that connect underlying biological phenomena to the damaged mental function. A medication that relieves or removes symptoms in a large population of subjects is unquestionably of great utility, even if the explanation for why it works is lacking. However, significant gaps in the effectiveness of medications for different mental illnesses mean we should look to advances in modern neuroscience and cognitive science to deliver more.
We believe that advances in human neuroscience can bridge parts of the explanatory gap. One area where there has been substantial progress is in the field of decision-making. Aberrant decision-making is central to the majority of psychiatric conditions and this provides a unique opportunity for progress. It is the computational revolution in cognitive neuroscience that underpins this opportunity and argues strongly for the application of computational approaches to psychiatry. This is the basis of computational psychiatry [2–4] (Figure 1). In this article, we consider this emerging field and outline central challenges for the immediate future.
To define computational modeling, we must first distinguish it from its close cousin, mathematical or biophysical modeling. Mathematical modeling provides a quantitative expression for natural phenomena. This may involve building multi-level (unifying) reductive accounts of natural phenomena. The reductions involve explanatory models at one level of description that are based on models at finer levels, and are ubiquitous in everything from treatments of action potentials  (see also  for a broader view) to the dynamical activity of populations of recurrently connected neurons . Biophysical realism, however, is a harsh taskmaster, particularly in the face of incomplete or sparse data. For example, in humans, there seems to be little point in building a biophysically detailed model of the dendrite of single neurons if one can only measure synaptic responses averaged over millions of neurons and billions of synapses using functional magnetic resonance imaging (fMRI) or electroencephalography (EEG).
Biophysical modeling is important for elucidating key relationships in a hugely complex system  and thus predicting the possible effects of therapeutic interventions (see  for an example using dynamic causal modeling). For example, it is well known that critical mechanisms within neuromodulatory systems, such as dopamine, serotonin, norepinephrine and acetylcholine, are subject to intricate patterns of feedback and interactive control, with autoreceptors regulating the activity of the very neurons that release neuromodulators. Moreover, this feedback often includes the effects of one neuromodulator (e.g., serotonin) on the release and impact of others (e.g., dopamine) . These neuromodulators are implicated in many psychiatric and neurological conditions. The fact that they play key roles in so many critical functions may explain the fact, if not the nature, of this exquisite regulation. It is the complexity of these interactions that invites biophysical modeling and simulation, for instance, to predict the effect of medication with known effects on receptors or uptake mechanisms. Moreover, the capacity to perform fast biophysical simulations is essential for evidence-based model comparison using empirical data  and the exploration of emergent behaviors (e.g., ). Simulation has become vital to vast areas of science and it will be central in computational psychiatry. Mathematical predictions based on real neural and biophysical data are important; however, they are not equivalent to a computational account of mental or neural function.
Computational modeling seeks normative computational accounts of neural and cognitive function. Such accounts start from the premise that the brain solves computational problems and indeed has evolved to do so. One of the pioneers of computing theory, Alan Turing, conceived of mental function in exactly this fashion – the mind was cast as specific patterns of information processing supported by a particular kind of hardware (the brain) . This notion implies key constraints on mental phenomena – in particular constraints on computational complexity that limit the power of any device, neural or mechanical, to solve a wide range of problems . This idea is commonplace today but in the 1930s the idea of computation and its limits underwent a revolution [15–17] (see also ).
Currently, computational accounts of elements of mental and neural function exist, and in each case, typically some constraint is found that guides the discovery of the computational model. Some of the most important constraints come from optimality assumptions – the idea that the brain is organized to maximize or minimize quantities of external and internal importance (e.g., [6,19]). One set of optimality constraints emerges naturally from behaviors that support survival, such as foraging for food or responding appropriately to prospects of danger . A wide range of ideas, proofs, methods and algorithms for executing such behaviors can be found in many fields, including engineering, economics, operations research, control theory, statistics, artificial intelligence and computer science. In fact, these fields provide a formal foundation for the interpretation of many cognitive and neural phenomena . This foundation can span important levels of description, for instance, offering accounts of the representational semantics of the population activity of neurons  or of the firing of neuromodulatory neurons in the context of tasks involving predictions of reward [22–26]. This type of computational modeling can thus provide one explanatory framework for the reductive mathematical modeling discussed above.
The field of decision-making has been a particular target for computational modeling. Decision-making involves the accumulation of evidence associated with the utilities of possible options and then the choice of one of them, given the evidence. Decision-making problems in natural environments are extremely complex. One difficulty arises from the balance models must strike between built-in information acquired over the course of evolution about the nature of the decision-making environment (ultimate constraints) versus what can be learned over the course of moment-to-moment experience (proximate constraints). A second difficulty arises because of the inherent computational complexity of the problem: certain types of optimal decision-making appear intractable for any computational system. This fact motivates the search for approximations that underlie mechanisms actually used in animals. Reinforcement learning is one area where such approximations have been used to guide the discovery of neural and behavioral mechanisms. Box 1 provides a brief description of the modern view of neural reinforcement learning.
Reinforcement learning (RL) is a field, partly spawned by mathematical psychology, that spans artificial intelligence, operations research, statistics and control theory (for a good introductory account of RL, see ). RL addresses how systems of any sort, be they artificial or natural, can learn to gain rewards and avoid punishments in what might be very complicated environments, involving states (such as locations in a maze) and transitions between states. The field of neural RL maps RL concepts and algorithms onto aspects of the neural substrate of affective decision-making [90,91]. One important feature of this framework is that the majority of its models can be derived from a normative model of how an agent ‘should’ behave under some explicit notion of what that agent is trying to optimize .
Conventional and neural RL include two very broad classes of method: model-based and model-free. Model-based RL involves building a statistical model of the environment (a form of cognitive map; see ) and then using it to (i) choose actions based on predicted outcomes and (ii) improve predictions by optimizing the model. Acquiring such models from experience can be enhanced by sophisticated prior expectations (a facet that we relate to the phenomena of learned helplessness). In other words, an agent significantly enhances the models it can build based on experience if it already starts with a good characterization of its environment. In turn, these models enable moment-to-moment prediction and planning. Except in very simple environments, prediction and planning consume enormous memory and computational resources – a fact that has inspired much work on approximations and the search for biological work-arounds.
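As a rough illustration of the two steps named above (building a model, then using it to choose actions), the following Python sketch estimates transition probabilities and mean rewards from experienced transitions and then plans over that estimate with value iteration. The function names, the toy environment and the discount factor are assumptions made for illustration; they are not taken from any particular study.

```python
from collections import defaultdict


def learn_model(experience):
    """Estimate P(next_state | state, action) and mean reward from (s, a, r, s') tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s_next in experience:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, nexts in counts.items():
        total = sum(nexts.values())
        probs = {s_next: n / total for s_next, n in nexts.items()}
        model[sa] = (probs, reward_sum[sa] / total)
    return model


def plan(model, gamma=0.9, sweeps=100):
    """Value iteration over the learned model; returns state values and a greedy policy."""
    states = {s for s, _ in model} | {s2 for probs, _ in model.values() for s2 in probs}
    values = {s: 0.0 for s in states}

    def q(s, a):
        # Expected return of action a in state s under the learned model.
        probs, r = model[(s, a)]
        return r + gamma * sum(p * values[s2] for s2, p in probs.items())

    def actions_in(s):
        return [a for (s0, a) in model if s0 == s]

    for _ in range(sweeps):
        for s in states:
            if actions_in(s):
                values[s] = max(q(s, a) for a in actions_in(s))
    policy = {s: max(actions_in(s), key=lambda a: q(s, a)) for s in states if actions_in(s)}
    return values, policy


# Hypothetical three-step maze: only the right-hand path leads to reward.
experience = [
    ("start", "left", 0.0, "dead_end"),
    ("start", "right", 0.0, "hall"),
    ("hall", "forward", 1.0, "goal"),
]
values, policy = plan(learn_model(experience))
```

Even in this tiny example, every planning sweep visits every state and action, which hints at why prediction and planning become costly in richer environments.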
Model-free RL involves learning exactly the same predictions and preferences as model-based RL, but without building a model. Instead, model-free RL learns predictions about the environment by enforcing a strong consistency constraint: successive predictions about the same future outcomes should be the same. Actions are chosen based on the simple principle that actions which lead to better predicted outcomes are preferred. Model-free RL imposes much lower demands on computation and memory because it depends on past learning rather than present inference. However, this makes it less flexible to changes in the environment.
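The consistency constraint can be made concrete with temporal-difference learning, a standard model-free method. The sketch below is a minimal TD(0) example on a hypothetical deterministic chain of states with a single reward at the end; the chain, learning rate and discount factor are illustrative assumptions.

```python
def td0_chain(n_states=5, reward=1.0, alpha=0.1, gamma=0.9, episodes=500):
    """Learn state values on a left-to-right chain using TD(0).

    The agent always moves one state to the right; leaving the last
    state yields `reward` and ends the episode.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        for s in range(n_states):
            terminal = s == n_states - 1
            r = reward if terminal else 0.0
            next_value = 0.0 if terminal else values[s + 1]
            # Prediction error: mismatch between successive predictions.
            delta = r + gamma * next_value - values[s]
            values[s] += alpha * delta
    return values


values = td0_chain()
```

No transition model is stored here: each update uses only the current prediction, the next prediction and the reward just received, which is why the method is cheap but slow to adapt when the environment changes.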
The conceptual differences between model-based and model-free RL suggest that correlates can be sought in real-world neural and behavioral data. There are ample results from animal and human experiments to suggest that both model-free and model-based RL systems exist in partially distinct regions of the brain [67,93–97] and that there is a rich panoply of competitive and cooperative interactions between them . Model-free RL has a particularly close association with the activity of the dopamine neuromodulatory system, especially in the context of appetitive outcomes and predictions.
Finally, model-based and model-free RL are both instrumental in the sense that actions are chosen because of their consequences . Animals are also endowed with extremely sophisticated Pavlovian controllers (see main text), where outcomes and predictions of those outcomes directly elicit a set of species-typical choices apparently not under voluntary control. One important example related to predictions of future negative outcomes is behavioral inhibition (learning not to do something), which may be related to serotonin .
Many psychiatric conditions are associated not only with abnormal subjective states, such as moods, but also with aberrant decisions. Patients make choices: in depression, not to explore; in obsessive compulsive disorder, to repeat endlessly a behavior (such as hand-washing) that has no apparent basis in rational fact (such as having dirty hands); in addiction, to seek and take a drug, despite explicitly acknowledging the damage that follows. Key to the initial form of computational psychiatry is the premise that, if the psychology and neurobiology of normative decision-making can be characterized and parameterized via a multi-level computational framework, it will be possible to understand the many ways in which decision-making can go wrong. However, we should first consider an important earlier tradition of modeling in psychiatry.
There is an old idea in brain science, namely, that complex functions emerge from networked interactions of relatively simple parts [27,28]. In the brain, the most conspicuous physical substrates for this idea are the networks of neurons connected by synapses. This perspective has been termed ‘connectionism’. One modern expression of connectionism began with the work of Rumelhart, McClelland and the parallel distributed processing research group  (but now see ), which applied this approach to both brain and cognition in the early and mid-1980s, building upon the earlier pioneering work [27,28]. The basic concept underlying connectionism involves taking simple, neuron-like, units and connecting them in ways that are either biologically plausible based on brain data or capable of performing important cognitive or behavioral functions. At approximately the same time and parallel to this work, three key publications emerged from physicists John Hopfield and David Tank, which showed how a connectionist-like network can have properties equivalent to those pertaining to the dynamics of a physical system [31–33]. Inspired by Hopfield’s work and the seminal (and still classic) work of Stuart and Donald Geman on Gibbs sampling and Bayesian approaches to image analysis, Hinton and Sejnowski  showed that probabilistic activation in simple units could perform a sophisticated Bayesian style of inference. Collectively, this work addressed memory states, constraint satisfaction, pattern recognition and a host of other cognitive functions , thus suggesting that these models might aid in understanding mental disease.
Through the 1990s, connectionist models turned their sights on psychopathologies, such as schizophrenia [35–39]. These models primarily addressed issues related to cognitive control and neuromodulation [35–38], with a particular focus on neural systems that could support these functions [40–44]. These and other models offered plausible solutions for how networks of neurons could implement functions, such as cognitive control and memory, and offered new abstractions for how such functions go awry in specific pathologies. This work leans heavily on the neurally-plausible aspect of connectionist models, a feature that now finds more biological support, as neuroscience has produced enormous amounts of new data that can be fit into such frameworks [42–44].
In this section, we review recent efforts to develop and test computational models of mental dysfunction and to extract behavioral phenotypes relevant for building computationally-principled models of mental disease. The examples discussed are intended to provide insights into healthy mental function but in a fashion designed to inform the diagnosis and treatment of mental disease. Along with the pioneering earlier studies [35–40], there have been recent treatments and reports of work along these lines on schizophrenia [3,45,46], addiction , Parkinson’s disease, Tourette’s syndrome, and attention-deficit hyperactivity disorder . Here, we concentrate on two areas that have not been recently reviewed in this context, namely depression and autism.
The efforts discussed here are now collectively blossoming into programmatic efforts in computational psychiatry (for example, the joint initiative of the Max Planck Society and University College London: Computational Psychiatry and Aging Research). It is our opinion that such efforts must reach further and strive to extract normative computational accounts of healthy and pathological cognition useful for building predictive models of individuals. Consequently, we emphasize for computational psychiatry the goal of extracting computational principles around which human cognition and its supporting biological apparatus are organized. Achieving this goal will require new types of phenotyping approaches, in which computational parameters are estimated (neurally and behaviorally) from human subjects and used to inform the models. This type of large-scale computational phenotyping of human behavior does not yet exist.
Box 1 notes three different, albeit interacting, control systems within the context of RL: model-based, model-free, and Pavlovian. Model-based and model-free systems link the choice of actions directly to affective consequences. The Pavlovian system determines involuntary actions on the basis of predictions of outcomes, whether or not deployed actions are actually appropriate for gaining or avoiding those outcomes. Pavlovian control appears completely automated in this description. However, it is known that other brain systems can interact with Pavlovian control, hence, it is at this level that such control can be sensitive to ongoing valuations in other parts of the brain.
These types of controllers and their interactions have been the subject of computational modeling in the context of mood disorders, especially depression [4,48–50]. First, let us consider the role of serotonin in clinical depression. In many patients, one effective treatment involves the use of a selective serotonin reuptake inhibitor (SSRI), which prolongs the action of serotonin at target sites. Data from animals suggests that serotonin release is involved in (learned) behavioral inhibition [50–53], associated with the prediction of aversive outcomes [54,55]. Computational modeling inspired by these data suggests that serotonin’s role in behavioral inhibition may reflect a Pavlovian effect: subjects do not have to learn explicitly what (not) to do in the face of possible future trouble. This effect could be called the ‘serotonergic crutch’. Problems with the operation of this crutch can lead to behavior in which poor choices are made because they have not been learned to be inappropriate. In this framework, punishments are experienced or imagined even if the choices concern internal trains of thought rather than external events . Restoration of the crutch is considered to improve matters again. The logic here is that the more an individual’s behavior is determined in a Pavlovian manner, the more devastating is the likely consequence of any problem with the serotonergic crutch. This is an account of vulnerability (analogous to the incentive sensitization theory of drug addiction; see [56–58]).
Conversely, model-based RL has been used to capture another feature of some forms of anxiety and depression: learned helplessness [59–61]. Animals can be made helpless when provided with uncontrollable rewards as well as uncontrollable punishments  and thus learn that their actions do not consistently predict outcomes. In these experiments, one way to demonstrate the onset of learned helplessness is to show that the animals do not explore or try to escape when placed in new environments (e.g. ).
A natural computational account is to treat the helplessness training in the first part of learned helplessness as inducing a prior probability distribution over possible future environments, indicating that the animal can expect to have little influence over its fate, that is, little controllability. This hypothesis is based on the expectation that related environments have similar properties. Exploration in a new environment is worthwhile only if it is expected that good outcomes can be reliably achieved given appropriate actions. Thus, a prior belief implying that the environment is unlikely to afford substantial controllability will discourage exploration. Prior distributions are under active examination in Bayesian approaches to cognitive science and are offering substantial explanations for a broad range of developmental and adult behaviors . Only model-based RL is capable of incorporating such rich priors, even though model-free control can be induced to behave in similar ways by simpler mechanisms [65,66]. This computational interpretation of uncontrollability provides a new way to understand the role that environments can play in etiology. It also provides a way of formalizing the complex interaction between model-based, model-free and Pavlovian systems, in which one controller might directly influence not only the training signal of the other controllers  but also the very experience other controllers require in order to learn their own predictions or courses of action.
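One deliberately simple way to picture this account is to summarize controllability as a single probability that an action produces its intended outcome, and to track belief about that probability with a Beta distribution. The sketch below rests on that assumption, together with a hypothetical reward and exploration cost; it is an illustration of the idea, not a model drawn from the cited literature.

```python
def update_controllability(a, b, outcomes):
    """Update a Beta(a, b) belief about controllability.

    Each entry of `outcomes` is True when an action produced its
    intended result and False otherwise.
    """
    successes = sum(1 for o in outcomes if o)
    return a + successes, b + len(outcomes) - successes


def worth_exploring(a, b, reward=1.0, cost=0.3):
    """Explore only if the expected payoff of acting exceeds its cost (assumed values)."""
    expected_control = a / (a + b)  # mean of Beta(a, b)
    return expected_control * reward > cost


# A naive animal starts from a flat Beta(1, 1) belief.
naive = (1, 1)
# Helplessness training: actions rarely produce their intended outcome.
trained = update_controllability(1, 1, [True] * 2 + [False] * 18)

naive_explores = worth_exploring(*naive)
trained_explores = worth_exploring(*trained)
```

Under these assumptions the trained belief carries over to the new environment as a prior, so the animal declines to explore even before gathering any evidence there.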
A defining feature of human cognition is the capacity to model and understand the intentions (and emotions) of other humans. This extends to an ability to forecast into the near-term future, for example, how someone else will feel should they experience a consequence of an action that we might take. Sophisticated capacities such as these lie at the heart of our ability to cooperate, compete and communicate with others. One of the defining features of autism spectrum disorder (ASD) is a diminished capacity for socio-emotional reciprocity – the social back-and-forth engagement associated with all human interaction [1,68–70]. Recent modeling and neuroimaging work has used two-agent interactions, typically in the form of some game, to parameterize and probe this social give-and-take [71–81]. This work, along with other efforts [82–85], has collectively launched a computational neuroscience perspective on inter-personal exchange – a first step toward identifying computational phenotypes in human interactions that are underwritten by both behavioral and neural responses.
Game theory is the study of mathematical models of interacting rational agents. It is used in many domains and in recent years has been increasingly applied to common behavioral interactions in humans. Two game-theoretic approaches have recently been used to probe ASD and other psychopathological populations directly: the stag hunt game and the multi-round trust game (Figure 2). Although the behavioral probes are different, the two games share the feature that it is advantageous for a human player to make inferences about their partner’s likely mental state during the game. These inferences are recursive: my model of you incorporates my model of your model of me, and so on. Both approaches have built computational models around this central idea of recursion [78,79], thereby furnishing component computations required for healthy human exchange.
Yoshida et al. [79–81] used the stag hunt game (Figure 2a) to probe mental state inferences in ASD versus control subjects. The stag hunt game is a classic two-player game (in this case involving a human player and a computer agent), where players can cooperate to hunt and acquire high yield stags, or act alone and hunt low yield rabbits. The model developed by Yoshida et al.  used the human player’s observed behavior to estimate the sophistication level (depth of recursion) of their inference about the computer agent’s beliefs (theory of mind). This estimate is necessary if the human player is to cooperate successfully with the computer agent – the human must believe that the agent believes that the human will also cooperate, and so on.
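The notion of depth of recursion can be illustrated with a generic level-k scheme, in which a player at level k chooses as if their partner reasons at level k-1. The sketch below is not the model of Yoshida et al.; the payoffs, the level-0 behavior and the softmax temperature are all assumptions chosen to keep the example small.

```python
import math

# Hypothetical payoffs: a stag pays only if both players cooperate,
# whereas a rabbit can be caught alone.
STAG, RABBIT = 4.0, 1.0


def p_cooperate(level, p_level0=0.5, temperature=1.0):
    """Probability that a player reasoning at depth `level` hunts the stag."""
    if level == 0:
        return p_level0  # level 0 does not model the partner at all
    # A level-k player assumes the partner reasons at level k-1.
    p_partner = p_cooperate(level - 1, p_level0, temperature)
    value_stag = p_partner * STAG
    value_rabbit = RABBIT
    # Softmax (logistic) choice between the two options.
    return 1.0 / (1.0 + math.exp(-(value_stag - value_rabbit) / temperature))


probs = [p_cooperate(k) for k in range(4)]
```

With these particular numbers, deeper recursion makes cooperation more likely, because each additional level raises the player's confidence that the partner will also hunt the stag. Estimating which level best explains a subject's observed choices is the kind of inference a fitted model of this type would perform.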
Behavioral results that exploited this model pointed to a higher probability for a theory-of-mind model (versus fixed-strategy) for control subjects. The opposite was true for ASD subjects (~78% probability for fixed-strategy). However, as one might expect, there was heterogeneity in these estimates, with some ASD subjects (n=5) displaying higher probability for the theory-of-mind model compared to a fixed-strategy model (n=12). Intriguingly, ASD subjects with a higher probability for a fixed-strategy model showed higher ratings on two ASD rating scales (ADI-R, ASDI). These results are preliminary and the sample of ASD subjects small. However, the crucial point is that the model allows for a principled parameterization of important cognitive components (e.g., depth of recursion in modeling one’s partner). The use of such a model provides a way to formalize the cognitive components of ASD in computational terms. By collecting much more normative data, this type of approach could serve to differentiate ASD along these newly defined computational dimensions to improve diagnosis, guide other modes of investigation and help tailor treatments.
The multi-round trust game has also been used to probe a range of psychopathologic populations including ASD and borderline personality disorder [2,75,77]. The game is a sequential fairness game involving reciprocation, where performance is determined by whether players think through the impact of their actions on their partner. In the game, a proposer (called the investor) is endowed with $20 and chooses to send some fraction I to their partner. This fraction is tripled (to 3*I) on the way to the responder (called the trustee), who then chooses to send back some fraction of the tripled amount. Subjects play 10 rounds and know this beforehand. Cooperation earns both players the most money. Even when playing with an anonymous partner, investors do send money, a fact that challenges rational agent accounts of such exchanges. One way to conceptualize this willingness to send money was proposed by Fehr and Schmidt , who suggested that in such a social setting a player’s utility for money depends on the fairness of the split across the two players. Based on this model of fair exchange between humans, Ray and colleagues developed a Bayesian model of how one player ‘mentalizes’ the impact of their actions (money split with partner) on their partner . The key feature is for each player to observe monetary exchanges with their partner and estimate in a Bayesian manner the ‘fairness type’ of their partner, that is, the degree to which the partner is sensitive to an inequitable split.
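The arithmetic of a single round, and the inequity-averse utility proposed by Fehr and Schmidt, can be sketched as follows. The $20 endowment and the tripling come from the description above; the parameter names `envy` and `guilt` and their default values are illustrative assumptions.

```python
def trust_round(invest, repay_fraction, endowment=20.0, multiplier=3.0):
    """Return (investor_payoff, trustee_payoff) for one round of the trust game."""
    tripled = multiplier * invest
    repaid = repay_fraction * tripled
    investor = endowment - invest + repaid
    trustee = tripled - repaid
    return investor, trustee


def fehr_schmidt_utility(own, other, envy=0.5, guilt=0.25):
    """Inequity-averse utility: money, minus penalties for unequal splits.

    `envy` weights disadvantageous inequity (the other player has more);
    `guilt` weights advantageous inequity (this player has more).
    """
    return own - envy * max(other - own, 0.0) - guilt * max(own - other, 0.0)


# The investor sends 10 of 20; the trustee returns half of the tripled 30.
investor, trustee = trust_round(invest=10.0, repay_fraction=0.5)
investor_utility = fehr_schmidt_utility(investor, trustee)
trustee_utility = fehr_schmidt_utility(trustee, investor)
```

In this example both players end the round with more than they would have had without any exchange, yet the unequal split lowers each player's utility below their monetary payoff.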
This model was able to ‘type’ players reliably from 8 rounds of monetary exchange in the game. These types can be used to seek type-specific (fairness sensitivity) neural correlates. More importantly, the model can be used to phenotype individuals according to computational parameters important in this simple game-theoretic model of human exchange. This is an important new possibility. Using this same game, Koshelev and colleagues showed that healthy investors playing with a range of psychopathological groups in the trustee role can be clustered in a manner that reflects the type of psychopathology of the group acting as the trustee . This model used a Bayesian clustering approach to observations of the healthy investors’ behavior as induced by interactions with different psychopathology groups. These preliminary results suggest that parameters extracted from staged (normative) game-theoretic exchanges could be used profitably as a new phenotyping tool for humans, where the phenotypes are defined by computational parameters extracted using models.
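A stripped-down version of this kind of Bayesian typing is sketched below. It is not the published model: it assumes a small discrete set of candidate guilt parameters, a handful of allowed repayment fractions, a softmax choice rule and a uniform prior, all of which are hypothetical simplifications.

```python
import math

FRACTIONS = [0.0, 0.25, 0.5, 0.75]  # allowed repayment fractions (assumed)
GUILT_TYPES = [0.0, 0.4, 0.8]       # candidate fairness types (assumed)


def trustee_utility(invest, fraction, guilt, endowment=20.0):
    """Trustee's utility with a penalty for ending up ahead of the investor."""
    tripled = 3.0 * invest
    own = tripled * (1.0 - fraction)
    other = endowment - invest + tripled * fraction
    return own - guilt * max(own - other, 0.0)


def choice_probs(invest, guilt, temperature=2.0):
    """Softmax probability of each repayment fraction for a given type."""
    utils = [trustee_utility(invest, f, guilt) / temperature for f in FRACTIONS]
    top = max(utils)
    weights = [math.exp(u - top) for u in utils]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_over_types(rounds):
    """Posterior over GUILT_TYPES given (invest, chosen_fraction_index) pairs."""
    log_post = [0.0] * len(GUILT_TYPES)  # uniform prior
    for invest, idx in rounds:
        for k, guilt in enumerate(GUILT_TYPES):
            log_post[k] += math.log(choice_probs(invest, guilt)[idx])
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# A trustee who returns a quarter of the tripled amount on each of 8 rounds.
fair_posterior = posterior_over_types([(10.0, 1)] * 8)
# A trustee who keeps everything on each of 8 rounds.
selfish_posterior = posterior_over_types([(10.0, 0)] * 8)
```

The posterior over types is the computational phenotype in this toy setting: a short sequence of observed exchanges is reduced to a belief about one interpretable parameter.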
Computational models of human mental function present more general possibilities for producing new and useful human phenotypes. These phenotypes can then structure the search for genetic and neural contributions to healthy and diseased cognition. We do not expect such an approach to supplant current descriptive nosologies; instead, they will be an adjunct, where the nature of the computational characterization offers a new lexicon for understanding mental function in humans. Moreover, this approach can start with humans, define a computational phenotype, seek neural and genetic correlates of this phenotype and then turn to animal models for deeper biological study.
Under the restricted decision-making landscape that we have painted, RL models provide a natural example of a type of computational model that could be used in such phenotyping. Moreover, we sketched briefly how game-theoretic probes also allow for new forms of computational modeling and hence new ways to computationally phenotype humans. Through their built-in principles of operation and notions of optimal performance, RL models provide constraints that help bridge the aforementioned gap between molecular and behavioral levels of description. However, the behavioral underpinning of these models is extremely shallow at present, especially in human subjects. As suggested by the examples above, the estimation and use of computational variables, such as these, will require new kinds of behavioral probes, combined with an ever-evolving capacity to make neural measurements in healthy human brains. Not only is better phenotyping through the development of new probes needed, but also unprecedented levels of phenotyping of cognitive function. Many of the best ideas about mental performance and function derive primarily from studies in other species. While these animal models have been strikingly successful at uncovering the biology underlying learning, memory and behavioral choice, the human behavioral ‘software’ is likely to be significantly different in important ways that the probes will need to capture. Large-scale computational phenotyping will require radical levels of openness across scientific disciplines and successful models for data exchange and data sharing.
If the computational approaches we have outlined turn out to be effective in psychiatry, then what might one expect? The large-scale behavioral phenotyping project sketched above involves substantial aspects of data analysis and computational modeling. The aim of the data analysis will be to link precise elements of the models to measurable aspects of behavior and to molecular and neural substrates that can be independently measured. A strong likelihood here is that the models will offer a set of categories for dysfunction that are related to, but different from, existing notions of disease and this will lead to a need for translation.
Although we did not focus on them here, there are also implications for mathematical modeling. A simulation-based account of measurable brain dynamics, anatomical pathways and brain regions could be expected, equipped with visualization and analysis methods to help make sense of the output. The ultimate hope is for a detailed, multi-level, model that allows prediction of the effects of malfunctions and manipulations. However, making this sufficiently accurate at the scales that matter for cognition and behavior is a long way off. One critical, though as yet unproven, possibility is that a computational understanding will provide its own kind of short-circuit, with, for instance, rules of self-organization of neural elements based on achieving particular computational endpoints, thereby removing the requirement for detailed specification.
Finally, the most pressing requirement is for training. Broad and deep skills across cognitive neuroscience, computational neuroscience, cellular and molecular neuroscience, pharmacology, neurology, and psychiatry itself, in addition to computer science and engineering, are required for the emergence of the richly interdisciplinary field of computational psychiatry. Optimistically, how to achieve this may become clearer as thoughts mature about restructuring education to achieve breadth across the brain-related clinical disciplines of neurology and psychiatry .