Determining when, if, and how information from separate sensory channels has been combined is a fundamental goal of research on multisensory processing in the brain. This can be a particular challenge in psychophysical data, as there is no direct recording of neural output. The most common way to characterize multisensory interactions in behavioral data is to compare responses to multisensory stimulation with the race model, a model of parallel, independent processing constructed from the probability of responses to the two unisensory stimuli which make up the multisensory stimulus. If observed multisensory reaction times are faster than those predicted by the model, it is inferred that information from the two channels is being combined rather than processed independently. Recently, behavioral research has been published employing capacity analyses, in which comparisons between two conditions are carried out at the level of the integrated hazard function. Capacity analyses seem to be a particularly appealing technique for evaluating multisensory functioning, as they describe relationships between conditions across the entire distribution curve and are relatively easy and intuitive to interpret. The current paper presents a capacity analysis of a behavioral data set previously analyzed using the race model. While applications of capacity analyses are still somewhat limited due to their novelty, it is hoped that this exploration of capacity and race model analyses will encourage the use of this promising new technique both in multisensory research and other applicable fields.
The brain interfaces with the environment through many different sources of information. Waves of light and sound, the physical energy of vibrations and pressure, and chemical odorants and tastants provide different information about one’s surroundings. In addition to clear benefits, this wealth of information provides the brain with distinct challenges of how, when, and if this information should be combined to best form a functional approximation of the surrounding world. Our experiences inform us that the brain has solved this problem to a useful enough degree, and research ranging from the moth to the human has demonstrated that interactions between the senses occur [1–5]. However, mathematical representations that provide the means to experimentally characterize the if and when of multisensory interactions remain a challenge. The present paper compares one of the most common methods of testing multisensory interactions in human psychophysical data, the race model, with the application of expanded survival analyses.
The race model is probably the most common method used for assessing human behavioral measures for evidence of multisensory integration. The race model, like all models, does have some limitations in its application and interpretation. First, the race model method is not always easy to briefly explain to those who are not familiar with it, as is often the case for neuroscientists or psychologists presenting findings to a broad audience. In addition, interpretation of the race model is limited by the fact that it is based on subtractions of cumulative distribution functions, which limits sensitivity and interpretation of results in the tails of the distribution. These concerns will be explained in more detail below. A method of comparing distributions based on survival analyses has been used recently to evaluate behavioral facilitation due to unisensory redundant targets and differences between older and younger adults in a same/different task. This capacity model takes advantage of hazard functions to not only address several of the limitations of the race model but also to provide an intuitive output. The purpose of this paper is to suggest the potential utility of integrated hazard functions and capacity analyses for evaluating multisensory processing, either by themselves or in combination with traditional race analyses. This paper is meant to facilitate practical application of capacity analyses rather than being an exhaustive treatment of the mathematical and theoretical concepts that underlie race model and capacity analyses, as these concepts have been addressed previously [7, 10–12]. It is hoped that this discussion will raise awareness of these promising analysis techniques in the multisensory community and contribute to the evolving dialogue about defining and assessing multisensory interactions.
Data from a previously published study were reanalyzed using capacity analyses. Basic information, including subject characteristics, stimulus characteristics, and study design, is included below and detailed in the original study.
The study was intended to investigate the effects of normative aging on multisensory integration. Subjects underwent a thorough screening to evaluate their health, sensory acuity, and cognitive status. Data were collected from 31 healthy young adults (mean age = 28 ± 5.6 years, female = 16) and 27 healthy older adults (mean age = 71 ± 5.0 years, female = 16). All participants provided written informed consent and were compensated for their time. All subject recruitment, informed consent, and data collection procedures were completed in accordance with the requirements of the Wake Forest University School of Medicine Institutional Review Board and the Declaration of Helsinki.
All experiments were completed in a sound- and light-attenuated booth (Whisper Room, Morristown, TN, USA). Stimulus timing and presentation, and collection of reaction time and accuracy data, were accomplished using E-Prime software (Psychology Software Tools, Pittsburgh, PA, USA) and a serial response box. Visual stimuli were presented on a computer monitor and auditory stimuli through speakers flanking it. Volume was adjusted for each participant to a comfortable and easily discriminable level, typically around 75 dB.
Participants completed a two-alternative forced-choice task in which they were asked to discriminate between the colors red and blue with a button press. Visual stimuli were red or blue filled circles subtending 7.7° of visual angle, presented in the center of a computer monitor for 250ms. Auditory stimuli were the words “red” or “blue” spoken by a male voice and were 350ms in duration. During each trial, participants could be presented with a visual target alone, an auditory target alone, or a multisensory target (simultaneous presentation of visual and auditory stimuli). Multisensory targets were always congruent; that is, a red circle was never presented with the word “blue.”
Each trial began with a 1s fixation period during which a grey cross was presented in the center of a black computer screen. After the target was presented, the screen was cleared during the response period. The next trial did not begin until the participant responded or 8 seconds had elapsed. Subjects were instructed to respond “as rapidly and accurately as possible.” Stimulus conditions were presented in pseudo-random order to limit stimulus order effects. Each condition (visual alone, auditory alone, and multisensory) was presented 44 times over the course of the experiment. Participants were highly accurate on this task (mean correct responses out of 44: younger = 42.7 ± 1, older = 43.0 ± 0.9). Inaccurate responses were not included in analyses. Response times were effectively cut off at 8s, as noted above, and responses faster than 250ms were excluded from further analysis. Results from redundant multisensory targets are presented in this manuscript. Cumulative distribution functions from visual and auditory trials are not illustrated, but were used to calculate race models and capacity curves.
Data from extracellular neural recordings in animals and reaction time and accuracy experiments in humans show that the presence of multisensory stimulation results in gains in the form of increased neuronal firing, faster reaction times or improved accuracy under certain circumstances [14–21]. Such gains are examples of a positive interaction or dependency between the sensory channels, where more information results in better performance of the system. Sensory inputs can also be dependent on one another in a negative way, where the presence of information from additional sensory channels actually interferes with behavioral functioning or depresses the firing of neurons [22–24]. Of course, it is possible that in some situations the senses do not interact at all, but are processed fully in parallel, independent streams. Parallel, independent models are referred to as race models, because under these conditions, responses to the environment are determined by whichever input is processed the fastest [6, 11, 25, 26].
The race model distribution in the multisensory literature is the predicted response time distribution to multisensory stimulation that would be observed if information from different sensory channels were processed separately. That is, it illustrates the distribution that would result if two channels of information were processed simultaneously, but there were no interaction or convergence between the channels. Under these conditions, multisensory processing is a “horse race” where the signal that reaches threshold first is the one that determines behavior. The race model distribution is generally calculated by summing the observed responses to individual sensory channels. Because the race distribution is a minimum distribution made of the fastest responses, it is typically faster than either unisensory distribution. The speeding of responses due simply to the fact that two sources of information are present is termed statistical facilitation. Very generally, the race model posits that if observed responses are faster than the responses predicted by parallel processing (e.g., speeding due to statistical facilitation), it can be inferred that interaction between the sensory channels has occurred. More thorough treatments of the influence of stochastic and context invariance and different models of processing architecture and decision rules have been previously published [7, 11, 12, 27, 28].
A typical multisensory experiment using the race model has at least three conditions: presentation of a target in one sensory modality, presentation of a target in another modality, and presentation of both modalities simultaneously. The race model distribution is calculated by summing the cumulative distribution functions (CDFs) of observed responses to the two unisensory conditions to create a predicted multisensory distribution. Each value in the CDF reflects the cumulative probability of a response occurring at a given range of reaction times, e.g., 240–250ms. The probability for response can then be compared between the race model and observed multisensory responses at each reaction time bin to assess for multisensory facilitation.
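The construction of a binned CDF from raw reaction times can be sketched as follows. This is an illustrative Python sketch (the original analyses were carried out in E-Prime and MATLAB); the function name, bin width, and range are ours, chosen to match the 10ms bins described later in the paper.

```python
# Sketch: converting a list of reaction times (in ms) into a binned CDF.
# Each value is the cumulative probability of a response by the end of a bin.

def binned_cdf(rts, bin_ms=10, t_max=1000):
    """Cumulative probability of response by the right edge of each bin."""
    edges = range(bin_ms, t_max + bin_ms, bin_ms)
    n = len(rts)
    return [sum(1 for rt in rts if rt <= edge) / n for edge in edges]

# Hypothetical reaction times for one condition:
visual = [320, 340, 360, 410, 450]
cdf_v = binned_cdf(visual)
# cdf_v[31] is the cumulative probability of responding by 320 ms
```

The same function applied to the auditory and multisensory conditions yields the per-bin probabilities that the race model comparison operates on.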
There are two methods most commonly used to calculate the race model. As originally proposed by Miller, the race model is calculated by summing the two unisensory probability distributions

Race model(t) = PV(t) + PA(t)
where PV is the probability of a response occurring to the visual stimulus in a given time bin t and PA is the probability of a response occurring to the auditory stimulus at time t. The race model inequality proposed by Miller has long been used as the standard analysis in the multisensory literature. However, since it directly sums the two unisensory probability distributions, the Miller inequality sums to two. Of course, an empirically observed multisensory distribution will sum to one, meaning the responses modeled in the second part of the curve are not informative. The fact that the distribution from the Miller inequality sums to two also changes the slope of the race model curve, making it steeper. As a result, using the Miller inequality to calculate the race model distribution restricts the time points at which comparisons can be made and sets a stringent threshold for detecting non-parallel processing.
In order for the predicted race model to sum to one, it is necessary either to cap the race model distribution at one (as in [7, 12]) or, as is more commonly done in the empirical literature, to assume stochastic independence between the two unisensory probability distributions so that the equation for summation becomes

Race model(t) = PV(t) + PA(t) − PV(t) × PA(t)
[6, 29]. This independent race model is often used because it allows comparisons to be made across the entire distribution, and typically shows a less steep slope than the bound created by the Miller inequality when using observed data. There is some debate about whether or not independence should be assumed in the race model (e.g., [30, 31]), and it will be seen that in spite of the fact that the independent race model sums to one, the latter portions of the distribution remain uninterpretable.
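The two race model predictions described above can be sketched per time bin from the unisensory CDF values. This is a minimal illustration, assuming the CDFs are already binned on a common time axis; the cap at one follows the capped-distribution approach mentioned above, and the function names are ours.

```python
# Sketch of the two race model bounds, computed per time bin from
# unisensory CDF values (probabilities in [0, 1]).

def miller_bound(p_v, p_a):
    # Raw Miller sum, capped at 1 since an observed CDF cannot exceed 100%.
    return [min(v + a, 1.0) for v, a in zip(p_v, p_a)]

def independent_race(p_v, p_a):
    # Assumes stochastic independence: P(V or A) = PV + PA - PV*PA.
    return [v + a - v * a for v, a in zip(p_v, p_a)]

# Hypothetical unisensory CDFs over five time bins:
p_v = [0.0, 0.1, 0.4, 0.8, 1.0]
p_a = [0.0, 0.2, 0.5, 0.7, 1.0]
race_miller = miller_bound(p_v, p_a)
race_indep = independent_race(p_v, p_a)
```

Note that the independent prediction lies at or below the raw Miller sum in every bin, which is why the Miller inequality forms the steeper, more conservative bound.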
If the observed multisensory responses are faster than predicted by the race model, the CDF of observed responses will be shifted to the left, so that the probability of observing a response in specific time bins is greater than predicted. Observed responses to multisensory stimuli faster than predicted by the race model (positively dependent) indicate that multisensory gains are greater than can be accounted for by statistical facilitation. Therefore, faster than predicted multisensory responses suggest information from sensory channels is being integrated (termed “coactivation” by Miller) rather than processed in a strictly parallel fashion. “Beating” the race model provides good evidence that multisensory facilitation has occurred, but it is important to note that the reverse does not hold; not beating the race model does not prove that purely parallel processing has occurred. In other words, nonlinear interactions can occur even when the race model is not violated. However, this is difficult to demonstrate empirically, because multisensory responses that are faster than unisensory responses but slower than the race model could result from either statistical or multisensory facilitation. While positive dependency is the most common interpretation of violation of the Miller bound, it is also possible for perfect negative dependencies to beat the race model if the amount of time given to process a target within one sensory channel is dependent on what is occurring in the other channel. That is, the assumption of context invariance is not met [7, 30]. For the remainder of the discussion, we will refer to violation of the Miller bound as representing positive dependency. Thorough review of the theoretical assumptions underlying the Miller inequality, other race models, and other bounds can be found elsewhere [7, 11, 12, 27].
From the CDFs of observed visual, auditory, and multisensory distributions for a representative young subject (Fig. 1A), it can be seen that the multisensory CDF for this subject was shifted to the left of the visual and auditory distributions. The visual and multisensory distributions overlap for several slower reaction time bins, and all distributions begin to converge near 100%. This leftward shift indicates that multisensory response times were faster than the unisensory response times. The fact that responses to multisensory stimuli were faster than unisensory stimuli does not mean that coactivation or integration occurred. Figure 1B illustrates the relationship between observed responses to multisensory stimulation and response distributions predicted by the Miller and independent race model inequalities. Here it is clear that the multisensory distribution for this participant is faster than predicted by either race distribution for much of the distribution. Around the 520ms reaction time bin, the observed distribution crosses the Miller inequality as it ascends toward 20%. Observed responses continue to exceed those predicted by the independent race model until around 600ms for this subject, when the distribution for observed responses crosses the independent race model. As is commonly observed in such comparisons, the distributions cross once; the first part of the multisensory distribution is shifted to the left of the race model, and this relationship is flipped at later time bins. This suggests that at early time bins, inputs from the two channels were interacting with a positive dependency rather than remaining independent and parallel.
The relationship between observed multisensory and predicted race model responses is perhaps easier to visualize through the difference curve (Fig. 1C), which represents a subtraction of the race model from the multisensory responses. The positive deflection represents time bins where the observed responses are faster than predicted by the race model, and the negative deflection shows time bins when observed responses were slower than predicted. The peaks in both positive and negative deflections represent peaks in the difference between the two curves. For this subject, both the Miller and independent race model show robust positive deflections. Here, the difference curve between the Miller and multisensory responses peaks around 20%, begins to decrease at the 520ms time bin, and continues its downward deflection until the end of the distribution. In contrast, observed multisensory responses exceed those predicted by the independent race model over a wider range of time bins and reach a slightly higher peak around 23%. After the difference curve crosses 0, it does not continue with a negative slope, but rather reaches a negative peak and then returns to 0 at the end of the distribution.
Two concerns with interpreting the race model are illustrated in this figure: the limitations imposed by the fact that race comparisons are carried out at the level of the CDF, and the fact that the difference is calculated through a subtraction. Expressing the reaction time distribution as a CDF is necessary to calculate the race model distribution. Modeling the entire distribution is also useful because the relationship between the race model and observed responses may only differ in certain parts of the curve, and sometimes even small differences that might be obscured in a global measure like the mean are nevertheless behaviorally meaningful. However, interpretation of differences between the race model and multisensory responses is complicated by the fact that the CDF is a probability distribution, and therefore must begin with 0% and end with 100%. This means that when two curves are compared, they will not only have the same starting point (0% of responses completed), but if they are followed for the entire curve, will be forced to the same ending point as well. Of course, this is not true for the Miller inequality, which sums to two; this bound is even more difficult to interpret once the difference curve crosses zero.
The difference curve of the independent race model in Figure 1C shows a shape characteristic of difference curves between multisensory responses and the race model. The curve looks almost like a sine wave, starting at the 0 line and going up with a positive deflection, crossing zero at the same time bin where the CDFs cross (610ms), proceeding to a negative deflection, and then returning to zero. On this particular distribution, the negative deflection reaches its negative peak at 790ms. At this point, the slope of the curve changes from negative to positive and the gap between the two distributions begins to close until it again equals 0 as the CDFs reach 100%. How should values be interpreted as the slope of the difference curve begins rising to approach 0 at 790ms? This curve suggests that the slowest reaction times actually exhibit less interference than faster responses, a tantalizingly counterintuitive finding. However, it is possible these results are not a true reflection of neural output, but rather are an artifact of the constraints of the probability distribution, which must end at 100%. Therefore, the fact that race model comparisons are carried out on a probability distribution inherently limits the interpretability of a significant portion of the curves.
Secondly, the fact that the difference between predicted and observed responses is computed as a subtraction may limit the interpretability of differences, especially in the tails of the distribution. In time bins at the beginning of the distribution, the overall probability of response is very small for all distributions. This means that when the race model is subtracted from observed responses, the absolute difference between the two is going to be very small. The subtraction employed to evaluate significance does not reflect the relationship between the curves very well. For instance, if the predicted probability of response is 1% and a 2% chance of responding is observed, it might be more meaningful to know that the presence of two stimuli increased the probability of response 100% than to know there was a 1% difference between conditions. Because the absolute value of the difference is small, it can be difficult to compare the amount of gain at these small time bins with the amount of gain when there are larger differences at later time bins. It is also possible to interpret such difference curves by calculating the area under the curve (an elegant method and summary have been published previously). However, even such a global measure of differences between the distributions is still affected by the concerns raised above. The extent of violation of the race model may be underestimated by the steeper slope generated by the Miller inequality or the fact that the difference curve returns to 0 at the tails.
The Grice inequality is most commonly used with the race model to test for slowing due to an interaction between the two channels, or a negative dependency. The Grice inequality states that because parallel processing draws on the fastest response time from either distribution, processing cannot be parallel if observed responses to a multisensory stimulus are slower than the fastest unisensory distribution. The Grice inequality has good specificity; violation is incontrovertible evidence that negative dependency exists. However, if the fastest reaction time determines the behavioral outcome, it is unclear why responses would ever be consistently slower than the race model under conditions of parallel processing. In addressing this question, it becomes clear that the Grice model lacks sensitivity, as negative interactions between stimulus channels could conceivably occur without violation of the Grice inequality. The stringent nature of the Grice bound adds to the difficulty in interpreting differences on the right hand side of the distribution. Figure 1A illustrates that while there are a few late time bins where the visual and multisensory distributions are quite close, there is no time period when visual responses are significantly faster than multisensory.
The race model and Grice inequality are standard means of assessing whether systems function in a parallel fashion because they are useful and provide meaningful results. However, some of the limitations inherent to these methods may be addressed, at least in part, by capacity analyses.
In a general way, capacity is conceived of as the amount of work a system is able to perform in a given amount of time [9, 10]. The concept of capacity is relevant in different contexts, for instance to describe the amount of energy a system is capable of generating in a given amount of time in physics, or the amount of cognitive resources that can be brought to bear on a task in cognitive psychology. The fundamental concept of capacity captured in physics is related mathematically and conceptually to the psychological application [7, 10]. In the context of psychophysical measurements, cognitive variables can be manipulated to influence how much work a system does, and the influence of these manipulations can be quantified through reaction times. When a system, in this case a human subject, has high capacity, it will have great potential to do work, resulting in fast reaction times. It will be able to accomplish the designated work in a short period of time. If the manipulation makes greater demands on cognitive capacity, for instance imposing a dual task situation, reaction times will slow down, reflecting a decrease in the capacity to perform work under those conditions.
In order to quantify capacity from a reaction time distribution, an expression of the distribution is needed that will be equal to 0 before any work is done, will be able to increase freely to reflect the maximum capacity of the observed system, and will not have a value once all the observations in the distribution have been made. The hazard distribution embodies all these principles, and its integrated form can be easily derived from the cumulative distribution function.
The hazard function expresses a conditional probability, that is, the probability that a response will occur at a given time given that it has not yet occurred. Mathematically, the hazard function can be simply expressed as

h(t) = f(t) / S(t)
where h(t) is the hazard function, f(t) is the probability density function (PDF) and S(t) is the survival function.
Therefore, the numerator of the hazard function expresses the probability that the event will occur at time t, while the denominator expresses the probability that it will not have occurred by time t. This expression of the hazard function is often termed the instantaneous hazard, as it reflects the conditional probability per unit time specifically at time t. Because the expression is not a probability, but rather a probability per unit time, the instantaneous hazard is a rate. The hazard function can also be integrated to provide a measure of the summed probability of a response occurring at time t given that it has not yet occurred. The integrated form of the hazard function can be easily calculated as

H(t) = -ln(S(t))
[7, 8, 33]. This integrated or cumulative hazard function is interpreted as the summed work a system possesses at a specific time, or by extension, the amount of potential energy or capacity for work the system has at that time. The integrated hazard functions for the same representative subject shown in Figure 1 are shown in Figure 2A. Note that while the CDFs for this subject reach 100% and plateau, once the end of the integrated hazard function is reached, the distribution ends.
Townsend and colleagues [7, 11] have proposed a method of computing a capacity coefficient by taking the ratio of cumulative hazard functions where the condition of interest is in the numerator and the contrasting condition(s) are in the denominator. This metric has potential utility in assessing multisensory integration, as the value of interest is not the absolute capacity of a system during multisensory stimulation, but rather how the capacity of a system during multisensory stimulation relates to capacity predicted from a combination of the unisensory conditions. The capacity coefficient to evaluate multisensory interactions is

C(t) = HMS(t) / (HV(t) + HA(t))
where H denotes the cumulative hazard function and the subscripted letters denote the reaction time distribution (MS is multisensory, V is visual, A is auditory). A capacity coefficient of one means that the observed multisensory capacity is exactly what is predicted by independent visual and auditory distributions. Townsend and colleagues termed this unlimited capacity. If a system exhibits unlimited capacity, the addition of one bit of information to the ongoing processing of another bit does not influence the system. Thus, the two incoming channels are being processed independently. A system can also exhibit super capacity output where the observed multisensory capacity is greater than predicted by the combined unisensory outputs, resulting in a capacity coefficient that is larger than one. In a super capacity situation, the addition of an extra channel of information results in a positive dependency that facilitates behavioral performance. Conversely, the addition of another channel of information can cause the subject to do less work than predicted by parallel processing. This situation is represented by a capacity coefficient that is less than one and indicates the system is operating at sub-capacity. Such a negative dependency would be reflected in attenuation of behavioral and neural output.
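The capacity coefficient can be sketched by combining the hazard transform with the ratio just described. This is an illustrative Python sketch; names are ours, and bins where any hazard is undefined (CDF at 1) or the denominator is zero are marked uninterpretable.

```python
import math

# Sketch: capacity coefficient C(t) = H_MS(t) / (H_V(t) + H_A(t)),
# computed per bin from the three binned CDFs.

def hazard(cdf):
    return [-math.log(1.0 - f) if f < 1.0 else None for f in cdf]

def capacity_coefficient(cdf_ms, cdf_v, cdf_a):
    out = []
    for h_ms, h_v, h_a in zip(hazard(cdf_ms), hazard(cdf_v), hazard(cdf_a)):
        if None in (h_ms, h_v, h_a) or h_v + h_a == 0.0:
            out.append(None)          # bin not interpretable
        else:
            out.append(h_ms / (h_v + h_a))
    return out
```

With hypothetical per-bin values of 0.5 for the multisensory CDF and 0.2 for each unisensory CDF, the coefficient exceeds one (super capacity); reversing the relationship yields a value below one (sub-capacity).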
The difference curve calculated from the independent race model is shown again in Figure 2B above the capacity coefficient for the representative subject (Fig. 2C). The capacity coefficient curve begins with a value of 8.5 at time bin 370ms in this subject, and ends at time bin 1030ms with a value of 0.6. The capacity coefficient is super-capacity over the same time bins where the multisensory distribution exceeds the distribution predicted by the independent race model and crosses one at exactly the same time bin where the race model difference curve crosses zero. This occurs because calculation of the integrated hazard, like the independent race model, assumes stochastic independence when it models parallel processing of unisensory stimuli. Note however that unlike the negative deflection of the race model difference curve, the capacity coefficient does not converge at the right hand tail of the distribution. Since the capacity coefficient is not bounded by 100%, it remains sub-capacity until the distribution ends. Thus, the capacity coefficient is interpretable over the entire distribution.
To calculate group average race models, each participant’s reaction time distribution was converted to a cumulative distribution function by ordering the reaction times from fastest to slowest, binning them in 10ms time bins to give a frequency distribution of responses at each time bin, and calculating the cumulative percentage in each bin. Group reaction time and race model CDFs were computed as individual visual, auditory, multisensory, and race model CDFs averaged across subjects at each time bin to yield an average distribution. Significant differences between the predicted multisensory distribution (race model) and the observed multisensory distribution were assessed by subtracting the race model probability from the multisensory probability at each time bin. A one-sample t-test was performed at each time bin to test if the difference was significantly (p < 0.05) different from zero.
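The per-bin group comparison described above can be sketched as follows. This sketch averages per-subject difference curves (multisensory minus race model) and computes a one-sample t statistic per bin against zero; the lookup of the t value against the critical value for n − 1 degrees of freedom (the p < 0.05 test) is omitted, as the Python standard library does not provide the t distribution. Function and variable names are ours.

```python
import math
import statistics

# Sketch: one-sample t statistic per time bin, testing whether the mean
# difference (observed multisensory minus race model) differs from zero.

def per_bin_t(diffs_by_subject):
    """diffs_by_subject: per-subject difference curves of equal length."""
    n = len(diffs_by_subject)
    n_bins = len(diffs_by_subject[0])
    t_vals = []
    for b in range(n_bins):
        column = [subj[b] for subj in diffs_by_subject]
        m = statistics.mean(column)
        sd = statistics.stdev(column)
        # t = mean / standard error; sd of 0 is degenerate in this sketch
        t_vals.append(m / (sd / math.sqrt(n)) if sd > 0 else float("inf"))
    return t_vals
```

Each resulting t value would then be compared against the two-tailed critical value for the group's degrees of freedom to flag significant bins.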
Vincent transformation, as described by Ratcliff, is commonly used for averaging reaction time distributions, as it is an effective way to preserve the shape of the distribution during averaging when there are small numbers of observations. It accomplishes this by binning the data by percentage of the reaction time distribution, effectively transforming the data onto the y-axis. Vincent transformation was not used in averaging in this study for several reasons. Survival analyses inherently depend on the data being represented with common data points in the time domain on the x-axis. However, the process of transforming the data onto the y-axis means there is now one data point for each percentage bin of the distribution on the y-axis, but not for every unit of time on the x-axis. In addition, in order to apply the Vincent transformation, the data must be transformed, averaged, retransformed to the x-axis, fit with a function, and then needed time points must be interpolated from this fit function. The residual estimates must also be transformed, adding another potential source of error. Given that this study had 31 young and 27 older subjects who averaged over 42 correct responses per condition, it was preferable to use the original data rather than introducing additional variance through the process of multiple transformations, parametric curve fitting, and interpolation.
As mentioned above and illustrated in Figure 2, the capacity coefficient curve is restricted to time bins that have hazard values for each of the conditions. Unlike the CDF, which is zero before responses start and 100% after responses are completed, the capacity coefficient curve does not have numeric values before responses begin or after they end. This means that an average capacity coefficient cannot be calculated in the same way that an average CDF can, by simply averaging values in each time bin. Therefore, group capacity coefficient curves were generated by averaging the visual, auditory, and multisensory CDFs, converting these into survival functions using equation 4, and taking their negative natural log as in equation 5 in order to create integrated hazards. The capacity coefficient for each group was then generated by creating a ratio of these average integrated hazards at each time bin using equation 6.
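The group-level pipeline just described (average the CDFs bin by bin, convert to survival functions, take the negative natural log, then form the ratio) can be sketched compactly. This is an illustrative sketch with our own function names, not the original analysis code.

```python
import math

# Sketch of the group capacity pipeline: average subjects' CDFs per bin,
# transform to integrated hazards via H = -ln(1 - F), then take the ratio.

def mean_cdf(cdfs):
    """Average a list of equal-length CDFs bin by bin."""
    return [sum(col) / len(col) for col in zip(*cdfs)]

def group_capacity(cdfs_ms, cdfs_v, cdfs_a):
    hazard = lambda cdf: [-math.log(1 - f) if f < 1 else None for f in cdf]
    h_ms = hazard(mean_cdf(cdfs_ms))
    h_v = hazard(mean_cdf(cdfs_v))
    h_a = hazard(mean_cdf(cdfs_a))
    return [m / (v + a) if None not in (m, v, a) and (v + a) > 0 else None
            for m, v, a in zip(h_ms, h_v, h_a)]
```

Averaging at the CDF level before the hazard transform, rather than averaging individual capacity curves, sidesteps the problem that individual coefficient curves lack values outside each subject's response range.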
Assessing the significance of integrated hazard analyses can be difficult and often involves making parametric assumptions . In addition, because curves were averaged at the level of the CDF, it is difficult to estimate the variance at the level of the cumulative hazard. To avoid these concerns, significance was determined by constructing confidence intervals with the nonparametric bootstrapping technique , implemented in MATLAB version 6.5 (MathWorks, Natick, MA, USA). Briefly, an approximation of the underlying sampling distribution is created by resampling the observed responses randomly with replacement, and information about the variance of the observed sample is derived from these resamples. Ten thousand bootstrapped CDFs each were created for the multisensory, visual, and auditory conditions for the young and older groups. Integrated hazard curves were generated for each CDF, and capacity coefficients were calculated, creating 10,000 capacity coefficient curves each for the young and older groups. Confidence intervals for the capacity coefficient curves were derived from these bootstrapped distributions. The high number of zeros and the low number of responses in the first four time bins meant that the capacity coefficient could not be reliably estimated there using the bootstrapping method, so confidence intervals were not generated for these bins.
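The bootstrap procedure can be sketched as follows. This is a minimal Python illustration of the resampling logic, not the authors' MATLAB implementation; the grid, sample sizes, and function names are assumptions, and the reduced `n_boot` in the example is for brevity only.

```python
import numpy as np

rng = np.random.default_rng(0)

def ecdf(rts, grid):
    """Empirical CDF of reaction times evaluated on a common time grid."""
    return np.searchsorted(np.sort(rts), grid, side="right") / len(rts)

def bootstrap_capacity_ci(rt_av, rt_a, rt_v, grid, n_boot=10_000, alpha=0.05):
    """Pointwise percentile-bootstrap confidence band for the capacity coefficient.

    Each iteration resamples every condition's RTs with replacement,
    rebuilds the CDFs, converts them to integrated hazards, and recomputes
    the capacity ratio; quantiles across iterations give the band.
    """
    boots = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        F = [ecdf(rng.choice(r, size=len(r), replace=True), grid)
             for r in (rt_av, rt_a, rt_v)]
        with np.errstate(divide="ignore", invalid="ignore"):
            H_av, H_a, H_v = (-np.log(1.0 - f) for f in F)
            boots[b] = H_av / (H_a + H_v)
    lo = np.nanpercentile(boots, 100 * alpha / 2, axis=0)
    hi = np.nanpercentile(boots, 100 * (1 - alpha / 2), axis=0)
    return lo, hi
```

Time bins where most resamples yield zero hazards (very early bins) produce mostly-NaN columns, mirroring the text's observation that the earliest bins could not be reliably estimated.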
In addition to individual race model analyses, group averages were computed as described above to compare multisensory processing in younger and older adults. Results of this analysis have been previously presented  and are shown in Figure 3. Difference curves for older and younger adults reveal that older adults show more gains due to multisensory interactions than do younger adults. The width of the positive deflection of the difference curve spans 410ms in the older adults as compared to 200ms in younger adults. Additionally, the 12.8% peak percent difference at 520ms between predicted and observed multisensory responses was significantly (p<0.05) higher for older adults than for younger adults, who peaked at 8.3% at 420ms. Neither younger nor older adults were slower than the fastest unisensory distribution (visual), although both were significantly slower than predicted by the race model at later time bins.
Results of capacity analyses performed on the data from Laurienti and colleagues  are shown in the right hand panels of Figure 3. Error bars represent the 95% confidence interval about each time point. For time points where the confidence band includes one, the coefficient does not differ significantly from one and the data are indicative of unlimited capacity. Capacity coefficients significantly greater than one represent super capacity processing, and those significantly less than one represent sub-capacity processing.
The earliest time bins for young adults show values greater than, but not significantly different from, one. Super capacity processing is exhibited over roughly a 200ms range of time bins (240–450ms), with the capacity coefficient peaking at 1.5 at the 340ms time bin, before processing returns to unlimited capacity (time bins 460–600ms). Older adults also show capacity coefficients that exceed, but are not significantly different from, one at the earliest time bins. Responses over the next 250ms were all significantly super capacity (320–570ms), with a peak value of 2.5 at the 320ms time bin, before dropping to unlimited capacity for responses between 580 and 840ms. These results suggest that during multisensory stimulation, all subjects were able to do more work than predicted from their unisensory responses. This difference was greater for older adults, who were able to do up to two and a half times more work than predicted and showed increased capacity over more time bins than younger adults.
Unlike the race model analysis, the capacity analysis supports statements about the remainder of the distribution. The bulk of the young adult capacity curve was actually significantly sub-capacity (time bins 610–1600ms), whereas older adults were significantly sub-capacity between 850 and 1600ms. This means that while both younger and older adults responded faster to multisensory stimuli than to either unisensory stimulus, the combination of two sensory channels resulted in slowing relative to the predicted parallel processing model at longer reaction times.
Close examination of Figure 3 reveals that, unlike the distributions calculated for a single subject, the capacity coefficient curve and the race model difference curve cross one and zero, respectively, at different time bins. The reason for this difference is rooted in the way the average race model and average capacity coefficients were calculated. Both race models were calculated for each subject, resulting in a value in each time bin, and the race model CDFs were then averaged. As explained in section 4.3, the capacity coefficient does not have values in all time bins for all subjects, so there are not values to average in every time bin. Therefore, capacity coefficients were generated by creating average visual, auditory, and multisensory CDFs, expressing these as integrated hazard functions, and then taking the capacity coefficient ratio.
The impact of these different group averaging methods is illustrated in Figure 4A with three difference curves. The Miller inequality difference curve is the smallest of the three: as discussed above, it has the lowest peak, spans the fewest time bins, and its differences are meaningless once the curve crosses zero, because the bound itself sums to two rather than to one. In contrast, the independent race model has the highest peak of the three curves and spans the most time bins. The middle curve in this graph is an independent race model calculated in the same way as the capacity coefficient: rather than calculating a race model for each individual and averaging them, average visual and auditory distributions were created and the race model predictions were generated from these average distributions. This change in how the distribution was averaged resulted in a difference curve intermediate between the independent race model and Miller’s model. As can be observed in Figure 4B, the resulting difference curve crosses the zero line at the same time bin where the group capacity coefficient crosses one.
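The two race model predictions being compared can be written directly in terms of the unisensory CDFs. A brief illustrative Python sketch (the function name is an assumption; difference curves are then the observed multisensory CDF minus each prediction):

```python
import numpy as np

def race_model_bounds(F_a, F_v):
    """Predicted multisensory CDFs under the two race models discussed.

    Miller's inequality bound: F_a + F_v, capped at 1. Because the raw
    sum ranges up to two, differences from it are meaningless once the
    difference curve crosses zero.
    Independent race model: F_a + F_v - F_a*F_v, assuming the two
    channels race independently.
    """
    F_a, F_v = np.asarray(F_a), np.asarray(F_v)
    miller = np.minimum(F_a + F_v, 1.0)
    independent = F_a + F_v - F_a * F_v
    return miller, independent
```

Since the Miller bound is always at least as large as the independent-race prediction, subtracting it from the observed CDF necessarily yields the smaller difference curve, consistent with Figure 4A.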
The primary incentive to incorporate capacity analyses into the repertoire of multisensory analyses is their potential advantages for interpretation. Because it represents the amount of work done and is expressed in the form of a ratio, the concept of capacity is relatively intuitive in a way that beating the race model is not. Rather than saying that older subjects had a peak difference in the probability of response of 12.8% while younger subjects had a peak of only 8.3%, we can conclude that younger adults were able to do up to 1.5 times, and older adults up to 2.5 times, as much work during multisensory stimulation as predicted from the unisensory distributions. Capacity is also grounded in fundamental concepts of physics and cognitive psychology, providing a useful conceptual reference point and mathematical underpinning. In spite of these potential advantages, capacity analyses are not commonly used, and this novelty is probably their single biggest weakness. The benefits and caveats associated with both methods of computing the race model, by contrast, are relatively well studied and have been demonstrated by different labs using different experimental paradigms.
Capacity analyses also have certain advantageous statistical properties. The values of the capacity coefficient curve are not bounded by 0 and 1 like the CDF, but are free to vary because they are generated from the cumulative hazard function. Like race model calculations, capacity coefficients are also nonparametric, and in this study were assessed for significance using a nonparametric technique. Not making assumptions about the shape of the distribution is both a strength and weakness, however. While nonparametric statistics have conceptual advantages, the bootstrapping technique used in the current analysis was computationally intensive, and is not as widely available or understood as a t-test.
Finally, it is essential to understand that all three analyses illustrated - Miller’s race model, the independent race model, and capacity analysis - give different information about the relationship between processing of unisensory and multisensory stimuli. Although the outcomes of the group capacity analysis differed somewhat from the group analysis of the independent race model due to differences in averaging, the two analyses agreed in important ways. Both showed significant performance gains due to multisensory stimulation, and the relationship between multisensory gains in older and younger adults was preserved. Older adults showed a higher peak difference between observed and predicted responses, and demonstrated facilitation due to multisensory stimulation in about twice as many reaction time bins as younger adults in both analyses. The outcomes of the analyses convey different information about the distribution, though. The race model analysis indicates that the difference between observed and predicted probabilities of response at fast reaction times is greater in older adults than in younger adults. Capacity analysis represents this difference as an increase in the capacity to do work, revealing that older adults are able to do up to 2.5 times as much during multisensory stimulation as predicted, while younger adults increase their capacity up to 1.5 times. Depending on the question being asked, it may be most useful to use the race model, capacity analysis, or both to maximize the conclusions that can be drawn.
Future research will hopefully continue to refine the assessment and interpretation of capacity analysis. The use of bootstrapping here is, to our knowledge, the first attempt to assess the significance of the capacity coefficient; future studies may refine this technique or suggest more sensitive alternatives. Future work may also devise a better method for averaging the capacity coefficient curve to achieve a representation of the entire distribution. One method for representing overall capacity trends that avoids the complication of averaging the distribution is creating mean capacity coefficient values . Although mean capacity values allow for global interpretations of the data, such methods are not sensitive to the fact that behavioral enhancement is often observed for fast response times while behavioral decrements are observed for slow response times.
Another interesting facet of human psychophysical tasks not explored in this paper is the potential for speed/accuracy tradeoffs: subjects may respond faster at the cost of accuracy, or slow down to improve it. Models have been constructed that take such speed-accuracy tradeoffs into account. For an introduction to these issues in the context of capacity analyses, please see .
In the words of George Box, the eminent statistician, “Essentially, all models are wrong, but some are useful.” We believe that capacity analyses are useful for evaluating behavioral gains associated with multisensory integration; they provide a ratio measure that conveys meaningful information, allows inferences to be made about virtually all of the distribution, and supports an interpretation that is intuitive, yet grounded in fundamental principles of physics. It will be particularly interesting to continue to develop the relationship of capacity analyses to the physics principles that in fact underlie them.
The authors would like to thank Ms. Debra Hege and Ms. Jennifer Mozolic for their invaluable assistance. This research was supported by NIH grant #NS042568 and the Roena Kulynych Memory and Cognitive Research Center of Wake Forest University.