Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Conf Proc IEEE Eng Med Biol Soc. Author manuscript; available in PMC 2012 August 1.
Published in final edited form as:
PMCID: PMC3282046

Bayesian Analysis of Trinomial Data in Behavioral Experiments and Its Application to Human Studies of General Anesthesia


Accurate quantification of loss of response to external stimuli is essential for understanding the mechanisms of loss of consciousness under general anesthesia. We present a new approach for quantifying three possible outcomes that are encountered in behavioral experiments during general anesthesia: correct responses, incorrect responses and no response. We use a state-space model with two state variables representing a probability of response and a conditional probability of correct response. We show applications of this approach to an example of responses to auditory stimuli at varying levels of propofol anesthesia ranging from light sedation to deep anesthesia in human subjects. The posterior probability densities of model parameters and the response probability are computed within a Bayesian framework using Markov Chain Monte Carlo methods.


A fundamental problem in neuroscience and medicine is understanding the mechanisms of general anesthesia. Loss of consciousness under general anesthesia is measured clinically by observing the loss of response to external stimuli. Characterizing the neurophysiological correlates of general anesthesia requires a precise quantification of when subjects lose or regain the ability to respond to these stimuli. By doing so, it may be possible to identify specific neural signatures that characterize the transitions in and out of consciousness. In our study, human subjects were asked to discriminate between two types of auditory stimuli by pressing one of two buttons. When subjects are conscious and responding to stimuli, the responses can be either correct or incorrect. However, as the anesthetic dose increases, we expect the responses to become less likely, with possible changes in the rate of correct or incorrect responses. A standard method is to estimate the response rate at a given point in time by grouping together correct and incorrect responses into one category, and treating the absence of response as a second category [1], [2], [3], [4]. However, if we treat incorrect responses as a separate category, we can quantify both the response rate and the probability of correct responses simultaneously. We assume that responding to a stimulus and responding correctly to a stimulus are two slightly different phenomena, because to anticipate the latter one, additional brain coordination including hearing, understanding and sending alternative signals to different muscles may be required. Performing such quantification requires a more sophisticated model of response dependencies and their dynamics. In this contribution we present a new approach for quantifying the probability of correct, incorrect, and no response. We do this by treating the probability of response and the conditional probability of a correct response given a response as hidden states governed by a dynamic model that is linked to the observations by a logit transform. Physiologically these hidden states indicate level of consciousness, based on both the ability to respond and the performance of the subject under general anesthesia. The posterior probability densities of model parameters and hidden states (response probabilities) are computed within a Bayesian framework using Markov Chain Monte Carlo methods. Once these probabilities are estimated, the probability of a correct response can be calculated using Bayes rule.

II. Experimental Protocol

We induced and allowed recovery from general anesthesia in a normal healthy human volunteer. We conducted these studies with the approval of the Human Research Committee at the Massachusetts General Hospital, and followed all hospital safety regulations for administration of general anesthesia. For the anesthetic induction, we increased the effect-site concentration in step-wise levels of 0, 1, 2, 3, 4, and 5µg/ml every 14 minutes using a computer controlled infusion [5], [6]. The subject listened to a series of pre-recorded auditory stimuli consisting of their name, affectively-neutral words, or a train of 40 and 84 Hz clicks, presented in a ratio of 1 word/name to 4 clicks. Subjects were instructed to press a button to differentiate between these sounds, and were asked to wait until the end of the click stimulus before responding to those stimuli. The duration of the names/words stimulus was approximately 0.5 seconds, the click stimulus duration was 2 seconds, and the inter-stimulus interval was 4 seconds. We used the loss of button responses as an indicator of loss of consciousness (LOC). We referred to the propofol concentration where the subject lost consciousness as CLOC. We then reduced the propofol concentration in a stepwise fashion to concentrations of CLOC − 0.5µg/ml, CLOC − 1.0µg/ml, and CLOC − 1.5µg/ml, and 0µg/ml, for 14 minutes each. Button press times were recorded throughout the experiment. The whole experiment spanned about 150 minutes.

III. Model

Let K be the number of trials in an experiment. For k = 1, …,K, let nk be 1 if the subject responds to the trial k and 0 otherwise. Let also mk be 1 if the subject responds correctly to the trial k and 0 otherwise.

equation image

Let pk be a 2 × 1 observation vector, with elements equation M1, representing the probability of a response and the conditional probability of a correct response given a response on trial k, respectively. Let also 2 × 1 vector xk, with elements equation M2 , be the cognitive state on trial k. Since our data are approximately equally spaced, we assume that xk depends on xk−1 through a random walk model (1), containing a dynamical noise term with a constant variance. We also assume that pk follows from xk through a logistic transformation (2), ensuring that pk is constrained to lie between zero and one. This yields the following model:

equation M3

equation M4

Equations (1) and (2) form a state-space model, where (1) is the system equation and (2) is the observation equation. Let equation M5 be Gaussian distributed dynamical noise terms for the system equation, i.e., equation M6. To simplify the problem, we assume that the observations in (2) are noiseless. Given initial values of cognitive states equation M7 and the system noise variances equation M8, the likelihood function of the free parameters is:

equation M9

In our experiment, every five consecutive stimuli contain four click train stimuli and one verbal stimuli, in a repeating pattern of click-click-verbal-click-click. Instead of analyzing the click train stimuli individually, we pooled every four click train stimuli, so that the number of data point are equal for the two stimulus types. By this, we can combine two uncoupled models into a combined model, for instance, models B and C in Fig. 3. The trinomial distribution on multiple trials was adopted:

equation M10

where Nk (Mk) is the number of trials in block k during which the subject responded (responded correctly), K is the number of blocks, and L = 4 is the number of trials in a block. Note that (3) is a particular case of (4) when L = 1. We can use either (3) or (4) with the state-space model ((1) and (2)) that can be modified by adding state variables to accommodate different types of stimuli or differences in response rate between different stimuli with the same model structure.

Fig. 3
Median estimates and 95% credible interval estimates of probability of response, conditional probability of correct response and probability of correct response of click train stimuli (green) and verbal stimuli (purple), with the assumption of (Model ...

The set of free parameters includes all the x0 and all the σ2. These are computed using the Bayesian approach, which assumes that prior information about the parameters improves the parameter estimates. For all the x0 we choose uniform prior distribution, Uniform(a, b), while for all the σ2, we choose the conjugate inverse gamma prior distribution, Inverse Gamma (α, λ). We assumed values of 0 and 100 for a and b respectively to reflect the fact that the subject responded perfectly at the beginning of the experiment. α and λ are chosen to be 5 and 1, making the inverse gamma prior distribution non-informative.

We use the Bayesian analysis implementation described in [3], which uses BUGS software and provides an efficient way of estimating parameters without having to develop a new algorithm for each model. The BUGS software is available for free in two implementations, WinBUGS [7] and OpenBUGS [8]. The median and credible interval estimates of probability are obtained from 100000 iterations after a 20000 iteration burn-in period. It takes less than 1 hour to perform the Gibbs sampling for one set of data.


In Fig. 1 we show the anesthetic propofol concentration level and the behavioral data over time. As we can see in Fig. 1, the subject started to respond incorrectly, and at the same time the response time became non-uniform. The subject stopped responding to auditory stimuli during the fourth segment when the propofol concentration level was 3µg/ml, which corresponds to CLOC. After four segments of not responding, the subject started to respond again after the anesthetic propofol level was lowered to 2µg/ml. In both Fig. 2 and Fig. 3 we indicate the median estimate with a solid line and the interval estimate with a shaded area. Using the Gibbs sampling algorithm we collected a set of estimates, one estimate from each iterative step. Then we sorted all the estimates in the ascending order, taking the middle value as the median estimate, and the 2.5% percentile and 97.5% percentile as a 95% credible interval.

Fig. 1
Propofol concentration level in µg/ml (blue) and behavioral responses of click train stimuli (green) and verbal stimuli (purple). Height of stems represent reaction time of responses in sec. A response is denoted by a stem. An incorrect response ...
Fig. 2
Model A. Median estimates and 95% credible interval estimates of (1st panel) probability of response, (2nd panel) conditional probability of correct response and (3rd panel) probability of correct response of click train stimuli (green) and verbal stimuli ...

In Fig. 2 we show the results of Model A, in which we assume that the verbal and click stimuli each have a distinct probability of response and conditional probability of correct response. This is done by augmenting the state and observation equations so that the verbal and click stimuli each have their own state equation (1) and observation equation (2). In the three upper panels of Fig. 2, we show the probability of response, the conditional probability of correct response given a response, and the probability of correct response. We are therefore able to compare simultaneously both the response rate and the correct response rate between the two types of auditory stimuli, which would not be possible with conventional binary models. We see that the subject responded more persistently to more salient verbal stimuli than less-salient click train stimuli. We find that the conditional probability (2nd panel) has a wider credible interval, covering almost all of the range 0 to 1 whenever the response rate was small, since responses were infrequent during that time. In the three lower panels of Fig. 2, we compare performance between verbal and click stimuli by computing the difference in probability between stimulus types. The 4th panel shows the difference in probability of response, the 5th panel shows the difference in conditional probability of correct response, and the 6th panel shows the difference in total probability of correct response. The solid line represents the median difference and the pale area represents a 95% credible interval. The credible interval above the horizontal zero means that the probability of verbal stimuli is significantly larger than that of click train stimuli. During the transition to loss of consciousness, when the response rate drops from 1 to 0, and the recovery of consciousness, when the response rate rises back to 1, we observe periods of significant differences between verbal and click stimuli in terms of response rate (4th panel) and correct responses (6th panel). During the unconscious period, the difference in conditional probability of correct response has a very wide credible interval (5th panel). This is because there are vanishingly few observations during this time, so the estimation power is very low.

In Fig. 3, we illustrate how these models can be used to represent different possible dependencies within the data, and how the model comparison method can be used to perform model selection. We consider two additional model structures. For one model, we assume that the two stimulus types have the same underlying probability of response, but different probabilities of correctness (Model B). For another model, we assume that the two stimulus types share the same underlying conditional probability of correct response, but have different probabilities of response, and different probabilities of correct response (Model C). In Model B, the common response rate (Fig. 3, 1st panel) apparently takes the shape from click train stimuli in Model A (Fig. 2, 1st panel). Its values lie in the intermediate range between the verbal and click response rates, and the credible interval is narrower. The correct response rate (Fig. 3, 3rd panel) of click train stimuli is consistent with that in Model A, but the correct response rate for the verbal stimuli is not because it is bounded by the overall response rate (Fig. 3, 1st panel), forcing it to assume a shape that is very similar to the click train correct response. For Model C, the response rate (Fig. 3, 4th panel) is similar to what we saw in Model A (Fig. 2, 1st panel). The probability of correct response (Fig. 3, 6th panel) is also similar to Model A (Fig. 2, 3rd panel), but with a reduced probability of correct response for the verbal stimuli. This happens because the combined conditional probability of correct response in Model C (Fig. 3, 5th panel) is less than that for the verbal stimuli in Model A (Fig. 2, 2nd panel). We performed model selection using Deviance Information Criterion (DIC), which is a measure given by the sum of deviance and twice the number of effective parameters. The model with the smallest DIC is estimated to be the model that would best predict a replicate dataset which has the same structure as the currently observed. The DIC of the models A, B, and C are 677.9, 863.7 and 675.7, respectively. This indicates that the model with two distinct response rates and one common conditional correct response rate is the most feasible, which implies that the subject had a consistent accuracy in responding to the two types of stimuli but also being more persistent in responding to verbal stimuli. Model A had a very comparable DIC value, and produced very similar results in terms of the performance differences between verbal and click stimuli.


We have presented a Bayesian dynamical model for quantifying probability of response and probability of correct response simultaneously. We have applied it to trinary behavioral data from ten human subjects undergoing general anesthesia. Due to space limit we presented here only behavioral results of one subject. Results with other indicators of the state of consciousness can be found in [9] and [10]. Our method can be extended to a more general case of data with more than three categories. This can be easily achieved with additional cognitive states of conditional probability. In additional to anesthesia studies, our method is applicable to other studies such as sleep studies or learning studies, in both animals and humans.


This work was supported by NIH Grants DP2-OD006454 (Purdon), K25-NS057580 (Purdon), T32-NS048005 (Harrell), DP1-OD003646 (Brown), R01-EB006385 (Brown, Purdon), and R01-MH071847 (Brown).


1. Suzuki WA, Brown EN. Behavioral and neurophysiological analysis of dynamic learning processes. Behavioral Cognitive Neuroscience Review. 2005;vol. 4:67–95. [PubMed]
2. Smith AC, Stefani MR, Moghaddam B, Brown EN. Analysis and design of behavioral experiments to characterize population learning. Journal of Neurophysiology. 2005;vol. 93:1776–1792. [PubMed]
3. Smith AC, Wirth S, Suzuki WA, Brown EN. Bayesian analysis of interleaved learning and response bias in behavioral experiments. Journal of Neurophysiology. 2007;vol. 97:2516–2524. [PubMed]
4. Smith AC, Shah SA, Hudson AE, Purpura KP, Victor JD, Brown EN, Schiff ND. A bayesian statistical analysis of behavioral facilitation associated with deep brain stimulation. Journal of Neuroscience Methods. 2009;vol. 183:267–276. [PMC free article] [PubMed]
5. Shafer SL, Gregg KM. Algorithms to rapidly achieve and maintain stable drug concentrations at the sit of drug effect with a computer-controlled infusion pump. Journal of Pharmacokinetics and Pharmacodynamics. 1992;vol. 20:147–169. [PubMed]
6. Schnider TW, Minto CF, Gambus PL, Andresen C, Goodale DB, Shafer SL, Youngs EJ. The influence of method of administration and covariates on the pharmacokinetics of propofol in adult volunteers. Anesthesiology. 1998;vol. 88:1170–1182. [PubMed]
7. Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS - a bayesian modelling framework: concepts, structure, extensibility. Statistics and Computing. 2000;vol. 10:325–337.
8. Thomas A, OHara B, Ligges U, Sturtz S. Making BUGS open. R News. 2006;vol. 1:12–17.
9. Mukamel EA, Wong KFK, Prerau MJ, Brown EN, Purdon PL. Phase-based measures of cross-frequency coupling in brain electrical dynamics under general anesthesia; Conference Proceedings IEEE EMBS; 2011. [PMC free article] [PubMed]
10. Wong KFK, Mukamel EA, Salazar AF, Pierce ET, Harrell PG, Walsh JL, Sampson A, Brown EN, Purdon PL. Robust time-varying multivariate coherence estimation: Application to electroencephalogram recordings during general anesthesia; Conference Proceedings IEEE EMBS; 2011. [PMC free article] [PubMed]