During reach planning, we integrate multiple senses to estimate the location of the hand and the target, which is used to generate a movement. Visual and proprioceptive information are combined to determine the location of the hand. The goal of this study was to investigate whether multi-sensory integration is affected by extraretinal signals, such as head roll. It is believed that a coordinate matching transformation is required before vision and proprioception can be combined because proprioceptive and visual sensory reference frames do not generally align. This transformation uses extraretinal signals about the current head roll position to rotate proprioceptive signals into visual coordinates. Since head roll is an estimated sensory signal with noise, this head roll dependency of the reference frame transformation should introduce additional noise to the transformed signal, reducing its reliability and thus its weight in the multi-sensory integration. To investigate the role of noisy reference frame transformations on multi-sensory weighting, we developed a novel probabilistic (Bayesian) multi-sensory integration model (based on Sober and Sabes, 2003) that included explicit (noisy) reference frame transformations. We then performed a reaching experiment to test the model's predictions. To test for head roll dependent multi-sensory integration, we introduced conflicts between viewed and actual hand position and measured reach errors. Reach analysis revealed that eccentric head roll orientations led to an increase of movement variability, consistent with our model. We further found that the weighting of vision and proprioception depended on head roll, which we interpret as being a result of signal-dependent noise. Thus, the brain has online knowledge of the statistics of its internal sensory representations. In summary, we show that sensory reliability is used in a context-dependent way to adjust multi-sensory integration weights for reaching.
We are constantly presented with a multitude of sensory information about ourselves and our environment. Using multi-sensory integration, our brains combine all available information from each sensory modality (e.g., vision, audition, somato-sensation, etc.) (Landy et al., 1995; Landy and Kojima, 2001; Ernst and Bulthoff, 2004; Kersten et al., 2004; Stein and Stanford, 2008; Burr et al., 2009; Green and Angelaki, 2010). Although this tactic seems redundant, considering that the senses often provide similar information, having more than one sensory modality contributing to the representation of ourselves in the environment reduces the chance of processing error (Ghahramani et al., 1997). It becomes especially important when the incoming sensory representations we receive are conflicting. When this occurs, the reliability assigned to each modality determines how much we can trust the information provided. Here we explore how context-dependent sensory-motor transformations affect the modality-specific reliability.
Multi-sensory integration is a process that incorporates sensory information to create the best possible representation of ourselves in the environment. Our brain uses knowledge of how reliable each sensory modality is, and weights the incoming information accordingly (Stein and Meredith, 1993; Landy et al., 1995; Atkins et al., 2001; Landy and Kojima, 2001; Kersten et al., 2004; Stein and Stanford, 2008). Bayesian integration is an approach that assigns these specific weights in a statistically optimal fashion based on how reliable the cues are (Mon-Williams et al., 1997; Ernst and Banks, 2002; Knill and Pouget, 2004). For example, when trying to figure out where our hand is, we can use both visual and proprioceptive (i.e., sensed) information to determine its location (Van Beers et al., 2002; Ren et al., 2006, 2007). When visual information is available it is generally weighted more heavily than proprioceptive information due to the higher spatial accuracy that is associated with it (Hagura et al., 2007).
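The reliability-based weighting described above can be sketched for the simplest case of two independent, one-dimensional Gaussian cues. This is only an illustration of the general principle, not the model used in this study; the function name and the example numbers are hypothetical.

```python
def combine_cues(x_vis, var_vis, x_prop, var_prop):
    """Minimum-variance combination of two independent Gaussian cues.

    Each cue is weighted by its relative reliability (inverse variance).
    """
    # Weight on vision grows as proprioception becomes noisier.
    w_vis = var_prop / (var_vis + var_prop)
    x_hat = w_vis * x_vis + (1.0 - w_vis) * x_prop
    # The combined estimate is more reliable than either cue alone.
    var_hat = (var_vis * var_prop) / (var_vis + var_prop)
    return x_hat, var_hat, w_vis


# Hypothetical example: vision reports 0 mm (variance 1),
# proprioception reports 25 mm (variance 4).
x_hat, var_hat, w_vis = combine_cues(0.0, 1.0, 25.0, 4.0)
```

In this example the visual weight is 0.8, so the combined estimate lies much closer to the visual cue, and its variance (0.8) is lower than that of either cue.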
Previous studies have used reaching tasks to specifically examine how proprioceptive and visual information is weighted and integrated (Van Beers et al., 1999; Sober and Sabes, 2003, 2005). When planning a reaching movement, knowledge about target position relative to the starting hand location is required to create a movement vector. This movement vector is then used to calculate how joint angles have to change for the hand to move from the starting location to the target position using inverse kinematics and dynamics (Jordan and Rumelhart, 1992; Jordan et al., 1994). The assessment of target position is generally obtained through vision, whereas initial hand position (IHP) can be calculated using both vision and proprioception (Rossetti et al., 1995). Although it is easy to recognize what different sources of information are used to calculate IHP, knowing how this information is weighted and integrated is not.
The problem we are addressing in this manuscript is that visual and proprioceptive information are encoded separately in different coordinate frames. If both of these cues are believed to have the same cause, then they can be integrated into a single estimate. However, if causality is not certain, then the nervous system might treat both signals separately; the degree of causal belief can thus affect multi-sensory integration (Körding and Tenenbaum, 2007). An important aspect that has never been considered explicitly is that in order for vision and proprioception to be combined, they must be represented in the same coordinate frame (Buneo et al., 2002). In other words, one set of information will have to be transformed into a representation that matches the other. Such a coordinate transformation between proprioceptive and visual coordinates depends on the orientation of the eyes and head and is potentially quite complex (Blohm and Crawford, 2007). The question then becomes: which set of information will be transformed into the coordinates of the other? In reaching, it is thought that this transformation depends on the stage of reach planning. Sober and Sabes (2003, 2005) proposed a dual-comparison hypothesis describing how information from vision and proprioception could be combined during a reaching task. They suggested that visual and proprioceptive signals are combined at two different stages: first, when the movement plan is being determined in visual coordinates, and second, when the visual movement plan is transformed into a motor command (proprioceptive coordinates). The latter requires knowledge of IHP in joint coordinates. They showed that estimating the position of the arm for movement planning relied mostly on visual information, whereas proprioceptive information was more heavily weighted when determining current joint angle configuration to compute the inverse kinematics.
Two separate estimates (one in visual and one in proprioceptive coordinates) are needed because the maximum likelihood estimate differs between the two coordinate systems (Körding and Tenenbaum, 2007; McGuire and Sabes, 2009). Therefore, having two distinct estimates reduces the overall estimation uncertainty because no additional transformations that might introduce noise are required.
The main hypothesis of this previous work was that the difference in sensory weighting between reference frames arises from the cost of transformation between reference frames. This idea is based on the assumption that any transformation induces noise to the transformed signal. In general, noise can arise from at least two distinct sources, i.e., from variability in the sensory readings and from the stochastic behavior of spike-mediated signal processing in the brain. Adding noise in the reference frame transformation thus increases uncertainty in coordinate alignment (Körding and Tenenbaum, 2007) resulting in lower reliability of the transformed signal and therefore lower weighting (Sober and Sabes, 2003; McGuire and Sabes, 2009). While it seems unlikely that neuronal noise from the stochastic behavior of spike-mediated signal processing changes across experimental conditions (this is believed to be a constant in a given brain area), uncertainty of coordinate alignment should increase with head roll. This is based on the hypothesis that the internal estimates of the head orientation signals themselves would be more variable (noisy) for head orientations away from primary (up-right) positions (Wade and Curthoys, 1997; Van Beuzekom and Van Gisbergen, 2000; Blohm and Crawford, 2007). This variability could be caused by signal-dependent noise in muscle spindle firing rates, or in vestibular neurons signaling head orientation (Lechner-Steinleitner, 1978; Scott and Loeb, 1994; Cordo et al., 2002; Sadeghi et al., 2007; Faisal et al., 2008).
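The hypothesized effect of transformation noise on cue weights can be illustrated with a one-dimensional sketch. It assumes, purely for illustration, that the transformation adds independent noise whose variance simply sums with that of the transformed cue; the function names and numbers are hypothetical.

```python
def visual_weight_in_visual_coords(var_vis, var_prop, var_transform):
    """Proprioception is the transformed (and therefore noisier) cue."""
    var_prop_transformed = var_prop + var_transform  # assumed additive noise
    return var_prop_transformed / (var_vis + var_prop_transformed)


def visual_weight_in_prop_coords(var_vis, var_prop, var_transform):
    """Vision is the transformed (and therefore noisier) cue."""
    var_vis_transformed = var_vis + var_transform  # assumed additive noise
    return var_prop / (var_vis_transformed + var_prop)


# Hypothetical variances: low vs. high transformation noise.
low_vis = visual_weight_in_visual_coords(1.0, 4.0, 0.0)
high_vis = visual_weight_in_visual_coords(1.0, 4.0, 5.0)
low_prop = visual_weight_in_prop_coords(1.0, 4.0, 0.0)
high_prop = visual_weight_in_prop_coords(1.0, 4.0, 5.0)
```

Under these assumptions, more transformation noise raises the visual weight in visual coordinates (0.8 to 0.9) and lowers it in proprioceptive coordinates (0.8 to 0.4), because in each frame the transformed cue loses reliability.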
To evaluate the notion that multi-sensory integration occurs, subjects performed a reaching task where visual and proprioceptive information about hand position differed. We expanded Sober and Sabes' (2003, 2005) model into a fully Bayesian model to test how reference frame transformation noise affects multi-sensory integration. To test this behaviorally, we introduced context changes by altering the subject's head roll angle. Again, the rationale was that head roll would affect the reference frame transformations that have to occur during reach planning (Blohm and Crawford, 2007) but would not affect the reliability of primary sensory information (i.e., vision or proprioception). Importantly, we hypothesized that larger head roll noise would lead to noisier reference frame transformations, which in turn would render any transformed signal less reliable. Our main goal was to determine the effect of head roll on sensory transformations and its consequences for multi-sensory integration weights.
Experiments were performed on seven participants between 20 and 24 years of age, all of whom had normal or corrected-to-normal vision. Participants performed the reaching task with their dominant right hand. All participants gave written informed consent to the experimental conditions, which were approved by the Queen's University General Board of Ethics.
While seated, participants performed a reaching task in an augmented reality setup (Figure 1A) using a Phantom Haptic Interface 3.0L (Sensable Technologies; Woburn, MA, USA). Their heads were securely positioned using a mounted bite bar that could be adjusted vertically (up/down), tilted forward and backward (head pitch), and rotated left/right (head roll to either shoulder). Subjects viewed stimuli that were projected onto an overhead screen through a semi-mirrored surface (Figure 1A). Underneath this mirrored surface was an opaque board that prevented the subjects from viewing their hand. In order to track reaching movements, subjects grasped a vertical handle (attached to the Phantom Robot) mounted on an air sled that slid across a horizontal glass surface at elbow height.
Eye movements were recorded using electrooculography (EOG; 16-channel Bagnoli EMG system; DELSYS; Boston, MA, USA). Two pairs of electrodes were placed on the face (Blue Sensor M; Ambu; Ballerup, Denmark). The first pair was located on the outer edges of the left and right eyes to measure horizontal eye movements. The second pair was placed above and below one of the subject's eyes to measure vertical eye movements. An additional ground electrode was placed on the first lumbar vertebra to record external electrical noise (Dermatrode; American Imex; Irvine, CA, USA).
Subjects began each trial by aligning a blue dot (0.5 cm) on the display that represented their unseen hand position with a start position (cross) that was positioned in the center of the display field. A perturbation was introduced such that the visual position of the IHP was constant but the actual IHP of the reach varied among three positions (−25, 0, and 25 mm horizontally with respect to the visual start position, VSP). The blue dot representing the hand was only visible when hand position was within 3 cm of the central cross. Once the hand was in this position, one of six peripheral targets (1.0 cm white dots) would randomly appear 250 ms later. The appearance of a target was accompanied by an audio cue. At the same time, the center cross turned yellow. Once the subject's hand began to move, the hand cursor disappeared. Subjects were instructed to perform rapid reaching movements toward the visual targets while keeping gaze fixated on the center position (cross). Targets were positioned at 10 cm distance from the start position cross at 60, 90, 120, 240, 270, and 300° (see Figure 1B).
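The target layout can be reproduced with a short sketch. It assumes that target angles are measured counterclockwise from the rightward horizontal axis, with the start cross at the origin; the text does not state this convention explicitly, and the names below are hypothetical.

```python
import math

TARGET_ANGLES_DEG = (60, 90, 120, 240, 270, 300)  # from the task description
TARGET_DISTANCE_CM = 10.0


def target_positions(angles_deg=TARGET_ANGLES_DEG, radius_cm=TARGET_DISTANCE_CM):
    """Return (x, y) target coordinates in cm relative to the start cross.

    Assumption: 0 deg points right and angles increase counterclockwise.
    """
    return [
        (radius_cm * math.cos(math.radians(a)), radius_cm * math.sin(math.radians(a)))
        for a in angles_deg
    ]


positions = target_positions()
```

With this convention, three targets lie in front of the start position (60, 90, 120°) and three behind it (240, 270, 300°).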
Once the subject's reach crossed the 10 cm target circle, an audio cue would indicate that they successfully completed the reach, and the center cross would disappear. If subjects were too slow at reaching this distance threshold (more than 750 ms after target onset), a different audio cue was played, indicating that the trial was aborted and would have to be repeated. At the end of each reach, subjects had to wait 500 ms before returning to the start position; an audio cue then indicated the end of the trial, and the center cross reappeared. This was to ensure subjects received no visual feedback of their performance. Subjects were instructed to fixate the central start position cross (VSP cross) throughout the trial.
Subjects completed the task at three different head roll positions: −30 (left), 0, and 30° (right) head roll toward the shoulders (mathematical angle convention, viewed from behind the subject). Throughout each head roll condition, the proprioceptive information about hand position was altered on randomly selected trials by shifting the actual hand 2.5 cm to the left or right of the visual hand marker. For example, subjects would align the visual circle representing their hand with the start cross, but their actual hand position may have been shifted 2.5 cm to the right or left. Subjects were not aware of the IHP shift when asked after the experiment. We introduced this discrepancy between visual and actual hand position to gain insight into the relative weighting of both signals in the multi-sensory integration process. For each hand offset, subjects reached to each target twenty times, and they did this for each head roll. Subjects completed 360 trials at each head position, for a total of 1080 reaches. Head roll was constant within a block of trials.
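The trial counts follow directly from the factorial design, as the following sketch shows. The variable names are hypothetical; the factor levels are those given in the text.

```python
from itertools import product

HEAD_ROLLS_DEG = (-30, 0, 30)
HAND_OFFSETS_MM = (-25, 0, 25)
TARGET_ANGLES_DEG = (60, 90, 120, 240, 270, 300)
REPEATS_PER_CONDITION = 20
N_SUBJECTS = 7

# Every combination of hand offset and target within one head roll block.
conditions_per_block = list(product(HAND_OFFSETS_MM, TARGET_ANGLES_DEG))

trials_per_head_roll = len(conditions_per_block) * REPEATS_PER_CONDITION
trials_per_subject = trials_per_head_roll * len(HEAD_ROLLS_DEG)
trials_total = trials_per_subject * N_SUBJECTS
```

This gives 3 × 6 × 20 = 360 trials per head roll, 1080 per subject, and 7560 across the seven subjects.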
Eye and hand movements were monitored online at a sampling rate of 1000 Hz (16-channel Bagnoli EMG system, Delsys; Boston, MA, USA; Phantom Haptic Interface 3.0L; Sensable Technologies; Woburn, MA, USA). Offline analyses were performed using Matlab (The Mathworks, Natick, MA, USA). Arm position data were low-pass filtered (autoregressive forward–backward filter, cutoff frequency = 50 Hz) and differentiated twice (central difference algorithm) to obtain hand velocity and acceleration (Figure 2). Each trial was visually inspected to ensure that eye movements did not occur while the target was presented (Figure 2C). If they did occur, the trial was removed from the analysis. Approximately 5% of trials (384 of 7560 trials) were removed due to eye movements. Hand movement onset and offset were identified based on a hand acceleration criterion (500 mm/s²), and could be adjusted after visual inspection (Figure 2E). The movement angle was calculated through regression of the data points from the initial hand movement until the hand crossed the 10 cm circle around the IHP cross. Directional movement error was calculated as the difference between overall movement angle and visual target angle.
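The differentiation and directional error steps can be sketched as follows. This is not the original Matlab analysis: the low-pass filter is omitted, and because the text does not specify the type of regression, the movement angle is estimated here with a total least squares (principal axis) line fit, which is an assumption. All function names are hypothetical.

```python
import math


def central_difference(samples, dt):
    """Numerical derivative; interior points use central differences,
    the two endpoints fall back to one-sided differences."""
    n = len(samples)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((samples[hi] - samples[lo]) / ((hi - lo) * dt))
    return out


def movement_angle_deg(xs, ys):
    """Direction (0-360 deg) of a straight line fitted to the hand path."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    # Principal axis of the path; its orientation is ambiguous by 180 deg.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Resolve the ambiguity using the net start-to-end displacement.
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    if math.cos(theta) * dx + math.sin(theta) * dy < 0:
        theta += math.pi
    return math.degrees(theta) % 360.0


def directional_error_deg(movement_deg, target_deg):
    """Signed difference between movement and target angle, in [-180, 180)."""
    return (movement_deg - target_deg + 180.0) % 360.0 - 180.0
```

For example, a straight path along the positive y axis gives a movement angle of 90°, and a movement at 95° toward a 90° target gives a directional error of +5°.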
The data were fitted to two models: one previously published velocity command model (Sober and Sabes, 2003) and a second, fully Bayesian model that had processing steps similar to Sober and Sabes (2003). In addition, the new model includes explicit reference frame transformations and, more importantly, explicit transformations of the sensory noise throughout the model. Explicit noise has previously been used to determine multi-sensory integration weights (McGuire and Sabes, 2009); however, that work only considered one-dimensional cases (we model the problem in 2D). Furthermore, it did not model reference frame transformations explicitly, nor did it model movement variability in the output or analyze movement variability in the data. Below, we outline the general working principle of the model; please refer to Appendix 1 for model details.
The purpose of these models was to determine the relative weighting of both vision and proprioception during reach planning, separately for each head roll angle. Sober and Sabes (2003, 2005) proposed that IHP is computed twice, once in visual and once in proprioceptive coordinates (Figure 3A). In order to determine the IHP in visual coordinates (motor planning stage, left dotted box in Figure 3A), proprioceptive information about the hand must be transformed into visual coordinates (Figure 3A, red “T” box) using head orientation information. Both the visual and the transformed proprioceptive information are then weighted based on reliability, and IHP is calculated. This IHP can then be subtracted from the target position to create a desired movement vector (Δx). If the hand position is misestimated (due to IHP offset), then there will be an error associated with the desired movement vector.
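The planning stage alone can be illustrated with a minimal sketch: a weighted IHP estimate is subtracted from the target to give the desired movement vector, and its direction is compared with the visual hand–target direction. The sketch assumes a scalar visual weight and a proprioceptive estimate that equals the actual hand position already expressed in visual coordinates; it ignores the second (inverse kinematics) stage. Names and numbers are hypothetical.

```python
import math


def planned_direction_error_deg(visual_hand, actual_hand, target, w_vis):
    """Angle (deg) between the planned vector and the visual hand-target vector."""
    # Weighted initial hand position estimate in visual coordinates.
    est = [w_vis * v + (1.0 - w_vis) * a for v, a in zip(visual_hand, actual_hand)]
    planned = [t - e for t, e in zip(target, est)]            # desired movement vector
    reference = [t - v for t, v in zip(target, visual_hand)]  # visual hand-target vector
    diff = math.atan2(planned[1], planned[0]) - math.atan2(reference[1], reference[0])
    # Wrap to [-180, 180) degrees.
    return math.degrees((diff + math.pi) % (2.0 * math.pi) - math.pi)


# Hypothetical example: hand shifted 25 mm to the right, target 100 mm straight ahead.
err = planned_direction_error_deg((0.0, 0.0), (25.0, 0.0), (0.0, 100.0), w_vis=0.8)
```

With a visual weight of 0.8, the estimated IHP is shifted 5 mm toward the actual hand, and the planned direction deviates by about 2.9° from the visual hand–target direction. With a visual weight of 1, the deviation is zero.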
As a final processing step, this movement vector will undergo a transformation to be represented in a shoulder-based reference frame (Figure 3A, TV→P box). Initial joint angles are calculated by transforming visual information about hand location into proprioceptive coordinates (Figure 3A, rightward arrows through red “T” box). This information is weighted along with the proprioceptive information to calculate IHP in proprioceptive coordinates (right dotted box in Figure 3A) and is used to create an estimate of initial elbow and shoulder joint angles (θ initial). Using inverse kinematics, a change in joint angles (Δθ) from the initial starting position to the target is calculated based on the desired movement vector. Since the estimate of initial joint angles (θ initial) is needed to compute the inverse kinematics, misestimation of initial joint angles will lead to errors associated with the inverse kinematics, and therefore error in the movement. We wanted to see how changing head roll would affect the weighting of visual and proprioceptive information. As can be seen from Figure 3A, our model reflects the idea that head orientation affects this transformation. This is because we hypothesize (and hope to demonstrate through our data) that transformations add noise to the transformed signal and that the amount of this noise depends on the amplitude of the head roll angle. Therefore, we predict that head roll has a significant effect on the estimations of IHP, thus changing the multi-sensory integration weights and in turn affecting the accuracy of the movement plan.
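How a misestimated initial joint configuration leads to a movement error can be sketched with a planar two-joint arm. This is an illustrative simplification, not the fitted model: the link lengths, the joint angle convention (shoulder angle plus relative elbow angle), and the use of a single small step via the inverse Jacobian are all assumptions, and the names are hypothetical.

```python
import math

UPPER_ARM_M, FOREARM_M = 0.30, 0.35  # hypothetical link lengths


def jacobian(t1, t2):
    """Jacobian of hand position with respect to shoulder (t1) and elbow (t2) angles."""
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-UPPER_ARM_M * s1 - FOREARM_M * s12, -FOREARM_M * s12],
            [UPPER_ARM_M * c1 + FOREARM_M * c12, FOREARM_M * c12]]


def solve2(m, v):
    """Solve the 2x2 linear system m @ x = v."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (-m[1][0] * v[0] + m[0][0] * v[1]) / det]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def executed_direction_deg(theta_est, theta_true, dx):
    """Plan joint changes at the estimated posture, execute them at the true posture."""
    dtheta = solve2(jacobian(*theta_est), dx)       # inverse kinematics at the estimate
    moved = matvec(jacobian(*theta_true), dtheta)   # resulting hand displacement
    return math.degrees(math.atan2(moved[1], moved[0]))


desired = (0.0, 0.1)  # desired movement vector: straight ahead
accurate = executed_direction_deg((0.5, 1.5), (0.5, 1.5), desired)
biased = executed_direction_deg((0.6, 1.5), (0.5, 1.5), desired)
```

When the estimated and true postures agree, the executed direction matches the desired 90°. Overestimating the shoulder angle by 0.1 rad rotates the executed movement by the same amount in the opposite direction (about 5.7°), because in this arm geometry a shoulder offset rotates the whole Jacobian.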
To test the model's predictions, we asked participants to perform reaching movements while we varied head roll and dissociated visual and proprioceptive IHPs.
A total of 7560 trials were collected, with 384 trials being excluded due to eye movements. Subjects were unaware of the shifts in IHP. We used reaching errors to determine how subjects weighted visual and proprioceptive information. Reach error (in angular degrees) was computed as the angle between the movement and the visual hand–target vector, where 0° error would mean no deviation from the visual hand–target direction. As a result of the shifts in the actual starting hand locations, a situation was created where the subject received conflicting visual and proprioceptive information (Figure 2). Based on how the subject responded to this discrepancy, we could determine how information was weighted and integrated.
Figure 4 displays nine sets of raw reaches from a typical subject, depicting 10 reaches to each target. Every tenth data point is plotted for each reach, i.e., data points are 10 ms apart in time, so that changes in speed are visible. The targets are symbolized by black circles, with the visual start position marked by a cross. Each set of reaches is representative of a particular head roll angle (rows) and IHP (columns). One can already observe from these raw traces that this subject weighted visual IHP more than proprioceptive information, resulting in a movement path that is approximately parallel to a virtual line between the visual cross and target locations.
To further analyze this behavior, we compared the reach error (in degrees) for each hand offset condition (Figure 5A). This graph also displays a breakdown of the data for each target angle and shows a shift in reach errors between the different IHPs. The difference in reach errors between the hand offsets indicates that both visual and proprioceptive information were used during reach planning. Figure 5B shows a fit from Sober and Sabes’ (2003, 2005) previously proposed model to the normalized data from Figure 5A (see also Appendix 1 for model details). The data from Figure 5A were normalized to 0 by subtracting the reach errors for the 0 mm hand offset from those for the 25 and −25 mm IHPs. Sober and Sabes’ (2003) previously proposed velocity command model fit our data well. In Figure 5B, the normalized data points for each hand position follow the same pattern as the model-predicted error, represented by the dotted lines. Based on this close fit of our data to the model, we can now use this model in a first step to investigate how head roll affects the weighting of vision and proprioceptive information about the hand.
As mentioned before, subjects performed the experiment described above for each head roll condition, i.e., −30, 0, and 30° head roll (to the left shoulder, upright, and to the right shoulder, respectively). We assumed that if head roll was not taken into account, there would be no difference in the reach errors between the head roll conditions. Alternatively, if head roll was accounted for, then we would expect at least two distinct influences of head roll. First, head roll estimation might not be accurate, which would lead to an erroneous rotation of the visual information into proprioceptive coordinates. This would be reflected in an overall shift of the reach error curve up/downward for eccentric head roll angles compared to the head straight-ahead. Second, head roll estimation might not be very precise, i.e., not very reliable. In that case, variability in the estimation should affect motor planning and thus increase movement variability overall and multi-sensory integration weights in particular. We will test these predictions below. Figure 6 shows differences in reach errors between the different head roll conditions, indicating that head roll was a factor influencing reach performance. This is a novel finding that has never been considered in any previous model.
From our model (Figure 3A), we predicted that as head roll moves away from 0, more noise would be associated with the signal (Scott and Loeb, 1994; Blohm and Crawford, 2007; Tarnutzer et al., 2009). This increase in noise should affect the overall movement variability (i.e., standard deviation, SD) because more noise in the head roll signal should result in more noise added during the reference frame transformation process. Figure 7 plots movement variability for trials where the head was upright compared to rolled to the left or right combined.
We performed a paired t-test between head roll and no head roll conditions across all seven subjects and all hand positions (21 standard deviation values per head roll condition). Across all three IHPs, movement variability was significantly greater when the head was rolled compared to when the head was straight (t(20)=−3.512, p<0.01). This was a first indicator that head roll introduced signal-dependent noise to motor planning, likely through noisier reference frame transformations (Sober and Sabes, 2003, 2005; Blohm and Crawford, 2007).
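The paired comparison can be sketched as follows. The function computes the standard paired t statistic from matched values (here, one variability value per subject and hand position in each condition, giving 21 pairs and 20 degrees of freedom in the study). The numbers in the example are made up for illustration and are not data from the experiment.

```python
import math
import statistics


def paired_t(straight, rolled):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [s - r for s, r in zip(straight, rolled)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical variability values (deg) for three matched pairs.
t_stat, dof = paired_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A negative t value with this ordering (head straight minus head rolled) indicates greater variability in the head rolled condition, which matches the sign of the statistic reported above.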
If changing the head roll angle ultimately affects reach variability, then we would expect that the information associated with the increased noise would be weighted less at the multi-sensory integration step. To test this, we fitted Sober and Sabes’ (2003) model to our data independently for each head orientation. The visual weights of IHPs represented in visual (dark blue, αvis) and proprioceptive (light blue, αprop) coordinates are displayed in Figure 8A. The visual weights of IHP in visual and proprioceptive coordinates were significantly different when the head was rolled compared to the head straight condition (t(20)=−4.217, p<0.01). Visual information was weighted more heavily when IHP was calculated in visual coordinates compared to proprioceptive coordinates. Furthermore, visual information was weighted significantly more for IHP in visual coordinates for head rolled conditions, compared to head straight. In contrast, visual information was weighted significantly less when the IHP was calculated in proprioceptive coordinates for head rolled conditions compared to head straight. This finding reflects the fact that information that undergoes a noisy transformation is weighted less due to the noise added by this transformation, e.g., vision is weighted less in proprioceptive as opposed to visual coordinates (Sober and Sabes, 2003, 2005). An even further reduction of weighting of the transformed signal will occur if head roll is introduced, presumably due to signal-dependent noise (see Discussion section).
In addition to accounting for head roll noise, the reference frame transformation also has to estimate the amplitude of the head roll angle. Any misestimation in head roll angle will lead to a rotational movement error. Figure 8B plots the rotation biases (i.e., the overall rotation in movement direction relative to the visual hand–target vector) for each head roll position. The graph shows that there is a rotational bias for reaching movements even for 0° head roll angle. This bias changes depending on head roll. There were significant differences between the rotational biases for head roll conditions compared to head straight (t(20)>6.891, p<0.01).
We developed a full Bayesian model of multi-sensory integration for reach planning. This model uses proprioceptive and visual IHP estimates and combines them in a statistically optimal way, separately in two different representations (Sober and Sabes, 2003, 2005): proprioceptive coordinates and visual coordinates (Figure 3B). The IHP estimate in visual coordinates is compared to target position to compute the desired movement vector, while the IHP estimate in proprioceptive coordinates is needed to translate (through inverse kinematics) this desired movement vector into a change of joint angles using a velocity command model. For optimal movement planning, not only are the point estimates in these two reference frames required, but the expected noise in those estimates is also needed (see Appendix 1).
Compared to previous models (Sober and Sabes, 2003, 2005), our model includes two crucial additional features. First, we explicitly include the required reference frame transformations (Figure 3A, “T”) from proprioceptive to visual coordinates (and vice versa), including the forward/inverse kinematics for transformation between Euclidean space and joint angles as well as for movement generation. The reference frame transformation T depends on an estimate of body geometry, i.e., head roll angle (Figure 3A, “H”) in our experiment. Second, in addition to modeling the mean behavior, we also include a full description of variability. All sensory signals, i.e., proprioceptive and visual IHP as well as head roll angle, have associated noise. As a consequence, covariance matrices of all variables also have to undergo the above-mentioned transformations. In addition, these transformations themselves are noisy, i.e., they depend on noisy sensory estimates.
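One way to picture how a covariance matrix passes through a noisy rotation is a first-order (linearized) error propagation, sketched below for a 2D position rotated by a noisy head roll angle. This is an assumed simplification for illustration and may differ from the formulation in Appendix 1; the function names and numbers are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]


def rotate_with_noise(x, cov_x, head_roll_rad, var_head_roll):
    """Rotate a 2D position and propagate its covariance to first order.

    cov_y ~= R cov_x R^T + var_head_roll * g g^T, where g = dy/d(head roll).
    """
    c, s = math.cos(head_roll_rad), math.sin(head_roll_rad)
    rot = [[c, -s], [s, c]]
    y = [c * x[0] - s * x[1], s * x[0] + c * x[1]]
    cov_rot = matmul(matmul(rot, cov_x), transpose(rot))
    # Sensitivity of the rotated position to the angle: perpendicular to y.
    g = [-y[1], y[0]]
    cov_y = [[cov_rot[i][j] + var_head_roll * g[i] * g[j] for j in range(2)]
             for i in range(2)]
    return y, cov_y


# Hypothetical example: point 10 units to the right, unit covariance,
# upright head with angle variance 0.01 rad^2.
y, cov_y = rotate_with_noise([10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.0, 0.01)
```

In this example the angle noise doubles the variance along the direction perpendicular to the position vector (from 1 to 2) and leaves the radial variance unchanged. A head roll dependent increase in `var_head_roll` would therefore make the transformed signal less reliable.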
To illustrate how changes in transformation noise, visual noise, and joint angle variability affect predicted reach error, we used the model to simulate these conditions. We did this first to demonstrate that our model can reproduce the general movement error pattern produced by previous models (Sober and Sabes, 2003) and second to show how different noise amplitudes in the sensory variables change this error pattern. Figure 9A displays the differences in predicted error between high, medium, and low noise in the reference frame transformation. As the amount of transformation noise increases, the reach error decreases. The transformed signal in both visual and proprioceptive coordinates is weighted less in the presence of higher transformation noise. However, the misestimation of IHP in visual coordinates has a bigger impact on movement error than the IHP estimation in proprioceptive coordinates (Sober and Sabes, 2003, 2005). As a consequence, the gross effect of higher transformation noise is a decrease in movement error because the proprioceptive information will be weighted relatively less after it is converted into visual coordinates.
Figure 9B illustrates the effect of visual sensory noise (e.g., in situations such as seen versus remembered stimuli) on predicted error in a high transformation noise condition. When the amount of visual noise increases (visual reliability decreases), proprioceptive information will be weighted more, and predicted error will increase. Conversely, as visual noise decreases (reliability increases), predicted error will decrease as well. Differences seen between different movement directions (forward and backward) are due to an interaction effect of transformations for vector planning (visual coordinates) and movement execution (proprioceptive coordinates).
Not only does visual noise impact the predicted error, but proprioceptive noise does as well. Noise associated with different joint angles will result in proprioceptive information being weighted less than visual information, and as a result there will be a decrease in predicted error (Figure 9C). Figure 9C displays how changing the amount of noise associated with one joint angle over another can change the predicted error. For example, with θ1>θ2, the signals indicating the arm deviations from the straight-ahead position are noisier than the signals indicating upper arm elevation. In this situation, visual error will be smaller when the targets are straight ahead or behind because the proprioceptive signals for the straight-ahead position are noisier and thus will be weighted less.
Figure 10 displays the model fits to the data for both error (top panels) and variability (lower panels) graphs for each IHP (−25, 0, 25 mm), comparing the different head roll effects. The solid lines represent the model fit for each IHP, with the square markers representing the behavioral data for each target. The model fits are different for each head roll position, with 0° head roll falling in between the tilted head positions. The model predicts that −30° head roll and 30° head roll would have reach errors in opposite directions; this is consistent with the data. Furthermore, the model presents 0° head roll as having the least variability when reaching towards the visual targets, with the behavioral data following the same trend.
In addition to modeling the effect of head roll on error and variability, we plotted the differences for IHP as well. Figure 11 displays both error and variability graphs for each head roll condition (same plots as in Figure 10, but re-arranged according to head roll conditions). The reach errors for different IHPs changed in a systematic way; however, differences in variability between the IHPs are small and show a similar pattern of variability across movement directions.
Determining how head roll affects multi-sensory weights was the main goal of this experiment. Previously in this section we fitted Sober and Sabes’ (2003) original model to the data, and displayed the visual weights for IHP estimates for both visual and proprioceptive coordinate frames (Figure 8A). In our model, we did not explicitly fit those weights to the data; however, from the covariance matrices of the sensory signals, we could easily recover the multi-sensory weights (see Appendix 1). Since our model uses two-dimensional covariance matrices (a 2D environment allows a visual coordinate frame to be represented in x and y, and proprioceptive coordinates to be displayed by two joint angles), the recovered multi-sensory weights were also 2D matrices. We used the diagonal elements of those weight matrices as visual weights in visual (x and y) and proprioceptive (joint angles) coordinates. Figure 12 displays significant differences (t(299)<−10, p<0.001) for all visual weights between head straight and head rolled conditions, except for θ2. Visual weights were higher for visual coordinates when the head was rolled. In contrast, visual weights decreased in proprioceptive coordinates when the head was rolled compared to the head straight condition. These results were very similar to the original model fits performed in Figure 8A. Thus, our model was able to simulate head roll dependent noise in reference frame transformations underlying reach planning and multi-sensory integration. More importantly, our data show that head roll dependent noise can influence multi-sensory integration in a way that is explained through context-dependent changes in added reference frame transformation noise.
In this study, we analyzed the effect of context-dependent head roll on multi-sensory integration during a reaching task. We found that head roll influenced reach error and variability in a way that could be explained by signal-dependent noise in the coordinate matching transformation between visual and proprioceptive coordinates. To demonstrate this quantitatively, we developed the first integrated model of multi-sensory integration and reference frame transformations in a probabilistic framework. This shows that the brain has online knowledge of the reliability associated with each sensory variable and uses this information to plan motor actions in a statistically optimal fashion (in the Bayesian sense).
When we changed the hand offset, we found reach errors that were similar to previously published data in multi-sensory integration tasks (Sober and Sabes, 2003, 2005; McGuire and Sabes, 2009) and were well described by Sober and Sabes’ (2003) model. In addition we also found changes in the pattern of reach errors across different head orientations. This was a new finding that previous models did not explore. There were multiple effects of head roll on reach errors. First, there was a slight rotational offset for the head straight condition, which could be a result of biomechanical biases, e.g., related to the posture of the arm. In addition, our model-based analysis showed that reach errors shifted with head roll. Our model accounted for this shift by assuming that head roll was over-estimated in the reference frame transformation during the motor planning process. The over-estimation of head roll could be explained by ocular counter-roll. Indeed, when the head is held in a stationary head roll position, ocular counter-roll compensates for a portion of the total head rotation (Collewijn et al., 1985; Haslwanter et al., 1992; Bockisch and Haslwanter, 2001). This means that the reference frame transformation has to rotate the retinal image less than the head roll angle. Not taking ocular counter-roll into account (or only partially accounting for it) would thus result in an over-rotation of retinal image, similar to what we observed in our data. However, since we did not measure ocular torsion, we cannot evaluate this hypothesis.
Alternatively, an over-estimation of head roll could in theory be related to the effect of priors in head roll estimation. If for some reason the prior belief of the head angle is that head roll is large, then Bayesian estimation would predict a posterior in head roll estimation that is biased toward larger than actual angles. However, a rationale for such a bias is unclear and would be contrary to priors expecting no head tilt such as reported in the subjective visual verticality perception literature (Dyde et al., 2006).
The second effect of head roll was a change in movement variability. Non-zero head roll angles produced reaches with higher variability compared to reaches during upright head position. This occurred despite the fact that the quality of the sensory input from the eyes and arm did not change. We took this as evidence for head roll influencing the sensory-motor reference frame transformation. Since we assume head roll to have signal-dependent noise (see below), different head roll angles will result in different amounts of noise in the transformation.
Third and most importantly, head roll changed the multi-sensory weights both at the visual and proprioceptive processing stages. This finding was validated independently by fitting Sober and Sabes’ (2003) original model and our new full Bayesian reference frame transformation model to the data. This is evidence that head roll variability changes for different head roll angles and that this signal-dependent noise enters the reference frame transformation and adds to the transformed signal, thus making it less reliable. Therefore, the context of body geometry influences multi-sensory integration through stochastic processes in the involved reference frame transformations.
Signal-dependent head roll noise could arise from multiple sources. Indeed, head orientation can be derived from vestibular signals as well as muscle spindles in the neck. The vestibular system is an essential component for determining head position sense; specifically the otolith organs (utricle and saccule) respond to static head positions in relation to gravitational axes (Fernandez et al., 1972; Sadeghi et al., 2007). We suggest that the noise from the otoliths varies for different head roll orientations; such signal-dependent noise has previously been found in the eye movement system for extraretinal eye position signals (Gellman and Fletcher, 1992; Li and Matin, 1992). In addition, muscle spindles are found to be the most important component in determining joint position sense (Goodwin et al., 1972; Scott and Loeb, 1994), with additional input from cutaneous and joint receptors (Clark and Burgess, 1975; Gandevia and McCloskey, 1976; Armstrong et al., 2008). Muscles found in the cervical section of the spine contain high densities of muscle spindles, enabling a relatively accurate representation of head position (Armstrong et al., 2008). In essence, as the head moves away from an upright position, more noise should be associated with the signal due to an increase in muscle spindle firing (Burke et al., 1976; Edin and Vallbo, 1990; Scott and Loeb, 1994; Cordo et al., 2002). However, due to the complex neck muscle arrangement, a detailed biomechanical model of the neck (Lee and Terzopoulos, 2006) would be needed to corroborate this claim.
We have shown that noise affects the way reference frame transformations are performed in that transformed signals have increased variability. A similar observation has previously been made for eye movements (Li and Matin, 1992; Gellman and Fletcher, 1992) and visually guided reaching (Blohm and Crawford, 2007). This validates a previous suggestion that any transformation of signals in the brain has a cost of added noise (Sober and Sabes, 2003, 2005). Therefore, the optimal way for the brain to process information would be to minimize the number of serial computational (or transformational) stages. The latter point might be the reason why multi-sensory comparisons could occur fewer times but in parallel at different stages in the processing hierarchy and in different coordinate systems (Körding and Tenenbaum, 2007).
It has been suggested that in cases of virtual reality experiments, the visual cursor used to represent the hand could be considered as a tool attached to the hand (Körding and Tenenbaum, 2007). As a consequence, there is additional uncertainty as to the tool length. This uncertainty adds to the overall uncertainty of the visual signals. We have not modeled this separately, as tool-specific uncertainty would simply add to the actual visual uncertainty (the variances add up). However, the estimated location of the cursor tool itself could be biased toward the hand; an effect that would influence the multi-sensory integration weights but that we cannot discriminate from our data.
In our model, multi-sensory integration occurred in specific reference frames, i.e. in visual and proprioceptive coordinates. Underlying this multiple comparison hypothesis is the belief that signals can only be combined if they are represented in the same reference frame (Lacquaniti and Caminiti, 1998; Cohen and Andersen, 2002; Engel et al., 2002; Buneo and Andersen, 2006; McGuire and Sabes, 2009). However, this claim has never been explicitly verified and this may not be the way neurons in the brain actually carry out multi-sensory integration. The brain could directly combine different signals across reference frames in largely parallel neural ensembles (Denève et al., 2001; Blohm et al., 2009), for example using gain modulation mechanisms (Andersen and Mountcastle, 1983; Chang et al., 2009). Regardless of the way the brain integrates information, the behavioral output would likely look very similar. A combination of computational and electro-physiological studies would be required to distinguish these alternatives.
Our model is far from being complete. In transforming the statistical properties of the sensory signals through the different processing steps of movement planning, we only computed first-order approximations and hypothesized that all distributions remained Gaussian. This is of course a gross over-simplification; however, no statistical framework for arbitrary transformations of probability density functions exists. In addition, we only included relevant 2D motor planning computations. In the real world, this model would need to be expanded into 3D with all the added complexity (Blohm and Crawford, 2007), i.e., non-commutative rotations, offset between rotation axes, non-linear sensory mappings and 3D behavioral constraints (such as Listing's law).
Our findings have implications for behavioral, perceptual, electrophysiological and brain imaging experiments. First, we have shown behaviorally that body geometry signals can change the multi-sensory weightings in reach planning. Therefore, we also expect other contextual variables to have potential influences, such as gaze orientation, task/object value, or attention (Sober and Sabes, 2005). Second, we have shown contextual influences on multi-sensory integration for action planning, but the question remains whether this is a generalized principle in the brain that would also influence perception.
Finally, our findings have implications for electrophysiological and brain imaging studies. Indeed, when identifying the function of brain areas, gain-like modulations in brain activity are often taken as an indicator for reference frame transformations. However, as previously noted (Denève et al., 2001), such modulations could also theoretically perform all kinds of other different functions involving the processing of different signals, such as attention, target selection or multi-sensory integration. Since all sensory and extra-sensory signals involved in these processes can be characterized by statistical distributions, computations involving these variables will evidently look like probabilistic population codes (Ma et al., 2006) – the suggested computational neuronal substrate of multi-sensory integration. Therefore, the only way to determine if a brain area is involved in multi-sensory integration is to generate sensory conflict and analyze the brain activity resulting from this situation in conjunction with behavioral performance (Nadler et al., 2008).
In examining the effects of head roll on multi-sensory integration, we found that the brain incorporates contextual information about head position during a reaching task. We developed a new statistical model of reach planning combining reference frame transformations and multi-sensory integration to show that noisy reference frame transformations can alter the sensory reliability. This is evidence that the brain has online knowledge about the reliability of sensory and extra-sensory signals and includes this information into signal weighting, to ensure statistically optimal behavior.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by NSERC (Canada), CFI (Canada), the Botterell Fund (Queen's University, Kingston, ON, Canada) and ORF (Canada).
In the following sections, we describe the mathematical details of our model. We assume that every sensory variable has an estimate μ with associated Gaussian noise of variance σ². Joint angles are denoted by θ, whereas Euclidean variables are denoted by x. Vectors x are bold, and matrices A are capitalized.
Figure 3B shows the arrangement of the body in the experimental setup with the hand at the IHP location. Since in our case the forearm was approximately parallel to the work surface (right panel of Figure 3B), we can fully characterize the spatial arm position x as a function of two joint angles θ, i.e., deviation from straight-ahead (θ1) and upper arm elevation (θ2):
where L1 and L2 are the upper arm and forearm lengths, respectively. In order to compute the inverse kinematic transformation of the noise covariance matrix, we used a first-order Taylor expansion of x(θ) around the current joint angles θ0, so that x can then be written as a linear function of θ, i.e., x = Aθ + b, with
This allows us to write the covariance matrix Σ of x as a function of the covariance matrix of θ, as:
The same approach can be used to compute the forward kinematics with
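The linearization step can be illustrated numerically. The following is a minimal sketch rather than the authors' implementation: it relies only on the standard result that a linear map x = Aθ + b turns a covariance Σθ into A Σθ Aᵀ. The planar two-link `arm_position` function, the joint angles and the noise values are hypothetical stand-ins and do not reproduce the arm geometry used in the model.

```python
import math

def arm_position(theta, l1=30.0, l2=45.0):
    """Hypothetical planar two-link arm (cm); a stand-in, not the paper's geometry."""
    t1, t2 = theta
    return (l1 * math.cos(t1) + l2 * math.cos(t1 + t2),
            l1 * math.sin(t1) + l2 * math.sin(t1 + t2))

def numerical_jacobian(f, theta0, eps=1e-6):
    """Central-difference estimate of A[i][j] = d x_i / d theta_j at theta0."""
    a = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        plus, minus = list(theta0), list(theta0)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = f(plus), f(minus)
        for i in range(2):
            a[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return a

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def propagate_covariance(a, cov_theta):
    """First-order propagation: cov_x = A cov_theta A^T."""
    return matmul(matmul(a, cov_theta), transpose(a))

# Assumed operating point and joint-angle noise (rad^2), for illustration only.
theta0 = (math.radians(40.0), math.radians(30.0))
cov_theta = [[1e-4, 0.0], [0.0, 4e-4]]
A = numerical_jacobian(arm_position, theta0)
cov_x = propagate_covariance(A, cov_theta)
```

The resulting `cov_x` is the hand-position covariance in cm²; running the same propagation with the inverse of `A` maps a Euclidean covariance back into joint-angle coordinates.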
In our case of head roll movements, the required shoulder-centered-to-retinal coordinate transformation (T) simply consists of a rotation of the angle θH=βH, where β is a gain factor and H is the estimate of the head roll angle. Euclidean position in visual coordinates (xV) can thus be obtained from Euclidean position in proprioceptive coordinates (xP) using xV=TxP with
Since head roll (H) and thus θH are noisy variables, the transformation T introduces new noise on top of rotating the proprioceptive (P) covariance matrix into visual coordinates (V). We designed this new noise to be composed of a constant component simulating the fact that all transformations have a cost (Sober and Sabes, 2003) and a head orientation signal-dependent component ΣH.
From random matrix theory we know that any matrix can be decomposed into a constant and a variable component, such that A=A0+E, where A0 has zero variance and E has zero mean. Perturbation theory then tells us that any linear transformation of a noisy variable x=x0+e can be written as y=Ax=(A0+E)(x0+e)=A0x0+A0e+Ex0+Ee. The covariance of y can then be approximated by the covariance of A0e+Ex0, since the covariance of Ee is negligible and A0x0 has zero covariance. Thus, the covariance of y is approximately the sum of the covariance of A0e and a term ΣE, the covariance of Ex0. In our case, the matrix ΣE represents the variability resulting from the fact that the angle of the reference frame transformation is variable. This results in variability added in the direction orthogonal to y. Representing y in polar coordinates results in:
Note that, as expected, this term introduces errors perpendicular to the rotated vector. The reason for this is that variability in the rotation leads to noise only in the rotational direction around the transformed vector y.
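The effect of a noisy rotation angle can be sketched as follows. This is a hedged illustration, not the model code: it assumes a counter-clockwise 2D rotation convention, small angular noise, and independence between the position noise and the angle noise. Under those assumptions the added covariance is the angle variance times the outer product of the vector perpendicular to the rotated position, so it has no component along that position. The function name and all numerical values are hypothetical.

```python
import math

def rotation(angle):
    """Counter-clockwise 2D rotation matrix (sign convention is an assumption)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def transform_with_noisy_rotation(x0, cov_x, angle, var_angle):
    """Rotate mean and covariance, then add first-order noise from the angle."""
    r = rotation(angle)
    y0 = [r[0][0] * x0[0] + r[0][1] * x0[1],
          r[1][0] * x0[0] + r[1][1] * x0[1]]
    rotated = matmul(matmul(r, cov_x), transpose(r))
    # d y / d angle is y0 turned by 90 degrees, i.e. orthogonal to y0.
    perp = [-y0[1], y0[0]]
    cov_e = [[var_angle * perp[i] * perp[j] for j in range(2)] for i in range(2)]
    cov_y = [[rotated[i][j] + cov_e[i][j] for j in range(2)] for i in range(2)]
    return y0, cov_y, cov_e

# Illustrative values: 100 mm vector, unit position noise, 30 deg roll, 2 deg SD.
y0, cov_y, cov_e = transform_with_noisy_rotation(
    x0=[100.0, 0.0], cov_x=[[1.0, 0.0], [0.0, 1.0]],
    angle=math.radians(30.0), var_angle=math.radians(2.0) ** 2)
```

Because the added term scales with the squared length of the rotated vector, positions far from the rotation center pick up more transformation noise than positions close to it.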
The inverse transformation and associated covariance matrix can simply be computed by replacing the head roll angle H by –H.
At the heart of the model is the multi-sensory integration step that combines proprioceptive and visual sensory information. In our model (Figure 3A), this integration occurs twice, once in visual coordinates as part of computing the visual desired movement vector and once in proprioceptive coordinates, which is required to transform the desired movement vector into a change in joint angles when determining the motor command. From basic multivariate Gaussian statistics, the means μ and covariances Σ of the combined IHP estimates from vision (V) and proprioception (P) can be written as:
As mentioned above, this calculation is carried out twice, once in proprioceptive and once in visual coordinates. In visual coordinates, the sensory Euclidean visual information is combined with the transformed (forward kinematics and reference frame transformation) proprioceptive information (Euclidean). In proprioceptive coordinates, the sensory proprioceptive joint angles are combined with the visual information transformed into joint coordinates (inverse reference frame transformation and inverse kinematics).
To recover the weight matrix of the visual IHP estimate, we used and to follow (from subtraction of one from the other) that , where I is the identity matrix.
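The integration and weight-recovery steps can be sketched with the standard inverse-covariance-weighted combination of two Gaussian estimates. This is one common way to write the computation and is offered as an assumption about its form, not as a reproduction of the model's equations. In this form the visual and proprioceptive weight matrices are the combined covariance multiplied by each cue's inverse covariance, and they sum to the identity matrix. All names and values below are illustrative.

```python
def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def madd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def integrate(mu_v, cov_v, mu_p, cov_p):
    """Combine visual (v) and proprioceptive (p) Gaussian estimates."""
    prec_v, prec_p = inv2(cov_v), inv2(cov_p)
    cov = inv2(madd(prec_v, prec_p))      # combined covariance
    w_v = matmul(cov, prec_v)             # visual weight matrix
    w_p = matmul(cov, prec_p)             # proprioceptive weight matrix
    from_v, from_p = matvec(w_v, mu_v), matvec(w_p, mu_p)
    mu = [from_v[0] + from_p[0], from_v[1] + from_p[1]]
    return mu, cov, w_v, w_p

# Made-up example: vision is more reliable than proprioception along x only.
mu, cov, w_v, w_p = integrate(
    mu_v=[0.0, 0.0], cov_v=[[1.0, 0.0], [0.0, 4.0]],
    mu_p=[4.0, 2.0], cov_p=[[3.0, 0.0], [0.0, 4.0]])
visual_weights = (w_v[0][0], w_v[1][1])   # diagonal elements, as used in the text
```

In this example the visual weight is 0.75 along x and 0.5 along y, which shows how adding noise to one cue (for instance through a noisy reference frame transformation) lowers that cue's weight.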
Once the IHP estimate from the previous step has been subtracted from the target location (), the resulting desired movement vector Δx needs to be transformed into a motor command . Here, we used a previously described velocity command model (Sober and Sabes, 2003; 2005) to perform this step as follows:
where J is the Jacobian of the system, θ is the actual joint configuration and is the estimated joint configuration from the multi-sensory integration step in proprioceptive coordinates. The Jacobian matrix is defined as . In our case, the Jacobian and its inverse write as:
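A rough sketch of a velocity command of this kind is given below. It assumes that the joint command is the inverse Jacobian, evaluated at the estimated joint configuration, applied to the desired movement vector, and that the executed hand movement is the Jacobian at the actual configuration applied to that command. This form is an assumption consistent with the description above. The planar two-link Jacobian, link lengths and angles are hypothetical and differ from the arm geometry of the model.

```python
import math

def jacobian(theta, l1=30.0, l2=45.0):
    """Jacobian of a hypothetical planar two-link arm (illustration only)."""
    t1, t2 = theta
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matvec(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def executed_movement(desired_dx, theta_actual, theta_estimated):
    """Plan with the estimated posture, execute with the actual posture."""
    dtheta = matvec(inv2(jacobian(theta_estimated)), desired_dx)
    return matvec(jacobian(theta_actual), dtheta)

def direction_deg(v):
    return math.degrees(math.atan2(v[1], v[0]))

theta = (math.radians(40.0), math.radians(60.0))
desired = [0.0, 10.0]
accurate = executed_movement(desired, theta, theta)
# Misestimate the first joint angle by 5 degrees (assumed value).
biased = executed_movement(
    desired, theta, (theta[0] + math.radians(5.0), theta[1]))
error_deg = direction_deg(biased) - direction_deg(desired)
```

With an accurate posture estimate the executed movement equals the desired one, whereas a misestimated posture produces a directional error, which is the kind of reach error measured in the experiment.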
To compute the covariance of the motor command, we need to propagate the variances through Eq. 15. To do so, we first re-write Eq. 15 as with . Since J(θ) is a constant transformation matrix, the covariance matrix of the final motor command can be written as:
It remains to calculate the covariance matrix ΣΔθ of the motor command expressed in joint angles. Since depends on a noisy estimate of the joint angles in proprioceptive coordinates, we again have to apply random matrix theory to approximate the noise induced by , as follows:
The covariance matrix associated with the noisy inverse Jacobian is computed similar to Eq. 12 as follows (using multivariate Taylor expansion):
We were only interested in the initial movement direction, as the model does not capture movement execution dynamics. Therefore, we transformed the final motor command from Cartesian into Polar coordinates. To transform both the means and covariance matrix into polar coordinates, we used the following formula:
To obtain the variance of movement direction towards different targets, we rotated the covariance matrix by the angle of movement direction. For the maximum likelihood estimation (MLE) procedure described below, we then only used Σ(r,),22, i.e., the variance orthogonal to the movement angle, and transformed it into angular units.
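One way to carry a Cartesian mean and covariance into polar coordinates is a first-order (Jacobian) approximation, sketched below. This is an assumption about the form of the transformation, not the formula used in the model, and the function name and numbers are illustrative. The second diagonal element of the result is the variance of movement direction in rad².

```python
import math

def cartesian_to_polar(mean_xy, cov_xy):
    """First-order transform of a 2D mean and covariance into (r, phi)."""
    x, y = mean_xy
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    # Jacobian of (r, phi) with respect to (x, y).
    p = [[x / r, y / r],
         [-y / r ** 2, x / r ** 2]]
    # cov_polar = P cov_xy P^T, written out for the 2x2 case.
    tmp = [[sum(p[i][k] * cov_xy[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    cov_polar = [[sum(tmp[i][k] * p[j][k] for k in range(2)) for j in range(2)]
                 for i in range(2)]
    return (r, phi), cov_polar

# Made-up motor command: 10 units along x, more noise across than along it.
(r, phi), cov_polar = cartesian_to_polar([10.0, 0.0], [[1.0, 0.0], [0.0, 4.0]])
direction_sd_deg = math.degrees(math.sqrt(cov_polar[1][1]))
```

The approximation is reasonable when the noise is small relative to the movement amplitude; for short movements with large noise the directional distribution is no longer close to Gaussian.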
To estimate the model parameters from the data, we used a standard maximum likelihood estimation procedure. To do so, we calculated the negative log-likelihood (L) for our data to fit the above model given the set of fitting parameters ρ:
where (μ, σ2) are the mean and variance resulting from the model given the parameter set ρ, n is the number of data points and y contains the data measured from the experiment. We can then search for the maximum likelihood estimate by minimizing Lρ over the parameter space, as:
These computations were carried out in Matlab R2007a (The MathWorks, Natick, MA, USA) using the fmincon function (for Eq. 24).
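The fitting procedure can be sketched in a few lines. The example below uses the standard Gaussian negative log-likelihood and replaces the constrained optimizer with a simple golden-section search over a single parameter, so it is a simplified stand-in and not the procedure that was actually run. The toy `model` function (constant predicted mean, unit variance) and the data values are hypothetical.

```python
import math

def negative_log_likelihood(y, mu, var):
    """Gaussian negative log-likelihood of data y given per-point mean and variance."""
    return 0.5 * sum(math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v
                     for yi, m, v in zip(y, mu, var))

def model(rho, n):
    """Toy one-parameter model: predicts mean rho and unit variance for every point."""
    return [rho] * n, [1.0] * n

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function of one variable on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

data = [1.2, 0.8, 1.5, 0.9, 1.1]   # made-up measurements

def objective(rho):
    mu, var = model(rho, len(data))
    return negative_log_likelihood(data, mu, var)

rho_hat = golden_section_minimize(objective, -10.0, 10.0)
```

For this toy model the estimate converges to the sample mean of the data, which is a convenient check that the likelihood and the search are wired together correctly before moving to a multi-parameter model.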
To fit Sober and Sabes’ (2003) original model to our data, we used a standard non-linear least-squares regression method. The model equations were the same as for the full model, but without considering variances or reference frame transformations. Visual and proprioceptive information were simply combined using scalar weights, as in Sober and Sabes (2003, 2005).
Upper arm and lower arm (including fist) lengths were held constant at L1=30 cm and L2=45 cm, respectively. The shoulder was assumed to be located 30 cm behind and 25 cm to the right of the target. Forward kinematics (Eqs. 5 and 6) for the center target location directly lead to IHP joint angles of θ1=42.5° and θ2=−8.3° for the deviation from straight-ahead and the upper arm elevation, respectively. IHPs and target positions were taken from the experimental data.
There were five parameters in the model that were identified from the data, i.e. the variances of both proprioceptive () joint angles (same for both) and horizontal visual () IHP, the variance associated to the head roll angle (), a fixed reference frame transformation cost ( ΣH) and the head rotation gain for the reference frame transformation (β). The variance of target position () was fixed. To account for the fact that visual distance estimation is less reliable than visual angular position estimation, we set the distance variability to (evaluated from McIntyre et al., 1997; Ren et al., 2006, 2007).
The best-fit model parameters are represented in Table 1. They were obtained through bootstrapping analysis (N=100). We used a minimum number of model parameters to describe our data. In particular, we did not have two independent joint angles, as our data were not compelling enough to distinguish the effect of both.
|Visual variance||0.347 ± 0.019 (mm²)|
|ΣH||Constant transformation noise||0.297 ± 0.031 (mm²)|
|β||Head roll compensation gain||1.041 ± 0.009 (unitless)|
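A bootstrap of the kind used to obtain the parameter estimates can be sketched as follows. This is a generic illustration under stated assumptions: the data are resampled with replacement 100 times, a fit is repeated on each resample, and the spread of the resulting estimates is summarized by its standard deviation. Whether the reported ± values are standard deviations or another measure is not stated here, and the `fit` function (a plain sample mean) and the data values are placeholders for the full model fit.

```python
import random
import statistics

def bootstrap(data, fit, n_boot=100, seed=0):
    """Resample data with replacement and refit n_boot times."""
    rng = random.Random(seed)             # fixed seed for reproducibility
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=len(data))
        estimates.append(fit(resample))
    return statistics.mean(estimates), statistics.stdev(estimates), estimates

# Made-up reach errors; statistics.mean stands in for the model fit.
reach_errors = [1.8, 2.4, 0.9, 3.1, 2.2, 1.5, 2.9, 2.0]
center, spread, estimates = bootstrap(reach_errors, statistics.mean)
```

Replacing `statistics.mean` with a function that refits all model parameters to the resampled trials would give one estimate per parameter and resample, from which a mean and spread per parameter follow in the same way.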