Animal nervous systems resolve sensory conflict for the control of movement. For example, the glass knifefish, Eigenmannia virescens, relies on visual and electrosensory feedback as it swims to maintain position within a moving refuge. To study how signals from these two parallel sensory streams are used in refuge tracking, we constructed a novel augmented reality apparatus that enables the independent manipulation of visual and electrosensory cues to freely swimming fish (n = 5). We evaluated the linearity of multisensory integration, the change to the relative perceptual weights given to vision and electrosense in relation to sensory salience, and the effect of the magnitude of sensory conflict on sensorimotor gain. First, we found that tracking behaviour obeys superposition of the sensory inputs, suggesting linear sensorimotor integration. In addition, fish rely more on vision when electrosensory salience is reduced, suggesting that fish dynamically alter sensorimotor gains in a manner consistent with Bayesian integration. However, the magnitude of sensory conflict did not significantly affect sensorimotor gain. These studies lay the theoretical and experimental groundwork for future work investigating multisensory control of locomotion.
How multimodal information is integrated for the moment-to-moment control of movement is not well understood, in part because different tasks, environments and physiologies necessitate different strategies. In lobsters, motor control shifts between modalities in a context-dependent manner; tethered lobsters used vision to track the movement of a low-frequency stimulus and proprioception to track a high-frequency stimulus . This strategy has also been observed in freely swimming sharks. Sharks switch between sensory modalities during hunting and substitute alternate modalities when necessitated by environmental changes or their own sensory limitations . Rather than a switch or substitution, flies apparently integrate information contemporaneously across many sensory modalities for behavioural control. For example, a tethered fly does not locate the source of an attractive odour without a richly textured visual panorama . Furthermore, the odour has a context-dependent influence over the gain of the optomotor response . The fly's motor responses to simultaneous visual and olfactory cues are a linear sum of the responses to these stimuli when presented alone .
Weakly electric fishes appear to re-weight multimodal information in relation to behavioural context. During prey capture, the relative contributions of vision, electrosense and mechanosense change as a function of environmental factors such as water conductivity [6,7]. Similarly, these fish dramatically change their locomotor behaviour based on ambient illumination. While they track a refuge smoothly in the light, the fish produce fore–aft movements in the dark that are believed to enhance electrosensory feedback .
Each of these studies used a similar approach in which the animal's performance was compared as either the sensory modalities themselves were altered or the availability of sensory stimuli was altered (i.e. the animal did not have simultaneous access to more than one sensory modality). The application of control theory, however, requires the dynamic perturbation of sensory feedback. Here, we developed an augmented reality infrastructure that enables simultaneous and independent manipulation of the two sensory modalities, vision and electrosense, relied on by weakly electric fishes to perform refuge tracking [9–12]. In this robust and natural behaviour, untethered fish swim to maintain position within a moving refuge. Our novel system enables us to apply small perturbations to sensory feedback in each modality, which permits control theoretic analyses of multimodal integration during free behaviour.
We evaluated the linearity of the multisensory interaction by simultaneously presenting either conflicting or coherent visual and electrosensory cues. We also quantified the effects of saliency of electrosensory cues on the relative weights given to electrosense and vision. Finally, we examined whether fish re-weight sensory information based on the magnitude of conflict between visual and electrosensory cues.
The neural computations involved in sensorimotor control are fundamentally closed-loop: sensing governs action, action changes the state of the animal in its environment, and these changes are sensed. Control theory provides a common framework to quantify and interpret the behaviour of the whole animal through perturbations to exogenous reference signals and measurements of corresponding behavioural responses (for reviews, see [13,14]). Closed-loop neuromechanical modelling has been used to investigate the feedback control of diverse biological systems and behaviours, including flight control in moths [15,16] and flies [17–19], flower tracking in moths , postural balance in humans [21,22] and refuge-tracking in fish [8,23,24].
Building on this tradition of using control theory in the study of biological systems, we apply system identification techniques to analyse how the fish performs the complex sensorimotor task of refuge tracking. Refuge tracking is a closed-loop behaviour; the fish continuously modulates its motor commands to stabilize itself with respect to the moving refuge. The behaviour is enabled by the nervous system's ability to filter parallel visual and electrosensory streams in a modality-specific way and then fuse them into a unified percept of the refuge. Because our apparatus (figure 1a) enables us to provide independent cues to each sensory modality, we can apply feedback control theory to elucidate the rules governing that multisensory interaction. The topology of our experiment is represented by the block diagram in figure 1c, where all signals and subsystems are modelled in the frequency domain. In a recent study, Roth et al. used a similar topology and analysis to show the linearity of vision and mechanosense in moths performing flower tracking.
When the visual and electrosensory stimuli are congruent, V(s), E(s) and C(s) can be collected into a single sensorimotor transform. Under this assumption, Cowan & Fortune  showed that this lumped multisensory controller depends on a precise model of the plant. Subsequently, Sefati et al. published a model of the plant, P(s), based on a quasi-steady analysis of the fluid dynamics . Critically, we do not yet understand how visual and electrosensory cues are integrated by the brain to control refuge tracking. Here, we are interested in the relative open-loop sensory gains to vision, V(s), and electrosense, E(s). These transfer functions represent the frequency-dependent perceptual ‘weight’ given by the central nervous system to vision and electrosense, respectively, as a function of stimulus frequency.
To characterize V(s) and E(s), we measure the fish motion as it resolves the conflict between independent electrosensory and visual inputs, rather than the congruent stimuli used in previous studies. By examining the frequency content of the fish's tracking motion in response to independent perturbations to vision and electrosense, we quantify the performance of individual components of the closed-loop system in terms of a behaviour-level model. Once the system is broken into its constituent subsystems, the equation predicting its response, Y(s), to given reference signals can be derived from the block diagram. To make the transfer function algebra more intuitive, we rearrange the block diagram (figure 1c) such that the feedback loop is consolidated into a closed-loop transfer function, G(s) (figure 2a). Then, as shown in figure 2b, we simplify the closed-loop block diagram to an open-loop cascade of visual and electrosensory motion processing, V(s) and E(s), with the closed-loop transfer function, G(s):

$$Y(s) = G(s)\left[V(s)\,R_V(s) + E(s)\,R_E(s)\right], \tag{2.1}$$

where $R_V(s)$ and $R_E(s)$ are the visual and electrosensory reference signals, respectively.
In the above equation, G(s) encapsulates the closed-loop dynamics, including the animal's reafferent stimulation of its own visual and electrosensory cues:

$$G(s) = \frac{C(s)\,P(s)}{1 + C(s)\,P(s)\left[V(s) + E(s)\right]}.$$

That is, the presence of V(s) and E(s) in the denominator of G(s) reflects the fact that vision and electrosense still contribute to the feedback loop regardless of which modality is perturbed. Crucially, G(s) multiplies both V(s) and E(s) in equation (2.1) (figure 2). Therefore, the open-loop gains, V(s) and E(s), are proportional to the experimentally measured closed-loop gains $H_V(s)$ and $H_E(s)$, respectively:

$$H_V(s) = G(s)\,V(s), \qquad H_E(s) = G(s)\,E(s).$$

The closed-loop gain to vision, $H_V(s)$, is the proportion of the fish response attributable to the visual reference motion, while the closed-loop gain to electrosense, $H_E(s)$, is the proportion attributable to the electrosensory reference motion. In terms of $H_V(s)$ and $H_E(s)$, the fish response is

$$Y(s) = H_V(s)\,R_V(s) + H_E(s)\,R_E(s).$$
In this manner, the input–output frequency response of the whole system enables us to empirically observe the relative contributions of vision and electrosense in the sensorimotor transform.
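The relationship between open-loop and closed-loop gains can be checked numerically at a single frequency. The following is a minimal sketch, assuming a unity negative-feedback loop in which each modality senses the error between its own reference and the fish position; the function names, the symbols R_V and R_E for the reference signals, and all numeric values are illustrative assumptions rather than quantities from the study.

```python
import cmath

def closed_loop_gains(V, E, C, P):
    """Return (G, H_V, H_E) at one frequency.

    Assumed loop equation: Y = C*P*(V*(R_V - Y) + E*(R_E - Y)),
    which rearranges to Y = G*(V*R_V + E*R_E).
    """
    G = C * P / (1 + C * P * (V + E))
    return G, G * V, G * E

def response(V, E, C, P, R_V, R_E):
    """Fish response Y for given visual and electrosensory references."""
    _, H_V, H_E = closed_loop_gains(V, E, C, P)
    return H_V * R_V + H_E * R_E

# Hypothetical complex gains at the probe frequency (not fitted to data).
V = 0.4 * cmath.exp(-0.3j)   # open-loop visual weight
E = 0.9 * cmath.exp(-0.1j)   # open-loop electrosensory weight
C = 2.0 + 0.5j               # controller
P = 1.0 / (0.2 + 1.5j)       # plant

G, H_V, H_E = closed_loop_gains(V, E, C, P)

# G cancels in the ratio, so the measured closed-loop ratio equals V/E.
print(abs(H_V / H_E), abs(V / E))
```

Because G(s) is common to both closed-loop gains, their ratio recovers the relative open-loop weighting without requiring a model of the controller or plant.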
Our multisensory stimulation method exploits the fish's natural tendency to seek refuge in narrow cavities. The experimental apparatus is similar to that reported in previous studies [8,23,25,27] and was equipped with an actuated refuge, a projector and a high-speed video camera (figure 1a). The test environment is a 17-gallon rectangular tank made from non-tempered clear glass. We constructed a 12 × 5 × 4 cm triangular refuge of 0.05 cm white polytetrafluoroethylene (PTFE) held in place by a clear and colourless acrylic frame. The frame was designed to give as little electrosensory information as possible beyond that of the PTFE refuge. The frame connects the refuge to the linear stepper motor (STS_0620-R, H2W Technologies, Inc., Valencia, CA, USA), which actuates the refuge along the longitudinal centreline of the tank with up to 1 µm resolution. Uniquely, this apparatus includes a projector (Pocket Projector Pro, Brookstone, Merrimack, NH, USA) mounted on the stepper motor and aligned with the centre of the refuge. It back-projects the visual stimulus, a pattern of 15 vertical stripes, onto the refuge (figure 1b). The trajectory of the stripes is controlled independently from that of the refuge. Crucially, the PTFE refuge is sufficiently translucent that the projected light pattern can be seen by the fish from inside the refuge. As the fish maintains position under the refuge, it gathers electrosensory information from the physical refuge structure and visual information from the light pattern. The stripes are set to the dimmest level that still elicits a tracking response when the refuge is stationary; brighter stripes partially illuminate the tank and allow the fish to see the refuge itself.
A high-speed camera (pco.1200 camera link, PCO AG, Kelheim, Germany) records the fish's position inside the refuge. A mirror placed at an angle below the tank provides direct viewing access to the fish for videography. Two infrared LED illuminators (CMVision-IR200, C&M Vision Technologies, Inc., Houston, TX, USA) are mounted under the tank to facilitate recordings in the dark. No markers are required.
Five adult Eigenmannia virescens (length 12–15 cm) were obtained from a commercial vendor and housed according to published guidelines . Fish were drawn from communal mixed-sex tanks at 27°C and conductivity 150–250 µS cm−1.
An individual fish was transferred to the testing environment at least 12 h prior to a data collection session. Each fish received 10 replicates of six stimulus profiles (table 1) at two conductivities, 150 and 500 µS cm−1, all in the dark. The profile order was randomized with the constraint that the fish complete every profile once before repeating any profile. Fish 3, Fish 4 and Fish 5 performed the trials at low conductivity first; Fish 1 and Fish 2 performed the high-conductivity trials first.
For a given profile, both the electrosensory and visual stimuli include a high-amplitude (2.4 cm s−1), low-frequency (0.05 Hz) base component which the fish has been shown to track accurately [8,23,25]. In addition, one or both of the sensory inputs contained a higher frequency (0.25 Hz) ‘probe’ component at one of two amplitudes (0.36 or 0.18 cm s−1) with randomized phase. Both probe component amplitudes were deliberately chosen to be much lower than that of the base component, because we expected that the small amplitude probe signal would act as a cross-modal illusion and trigger an unconscious sensory re-weighting rather than an attentional switch. For example, the input trajectory for the refuge in Profile 3 (high-amplitude electrosensory probe) was given as follows:

$$r_E(t) = 2.4\sin(2\pi \cdot 0.05\,t) + 0.36\sin(2\pi \cdot 0.25\,t + \phi),$$

where $\phi$ is the randomized probe phase and amplitudes are in cm s−1.
Since there was no visual probe for that trial, the light pattern was given as follows:

$$r_V(t) = 2.4\sin(2\pi \cdot 0.05\,t).$$

Here, $r_E(t)$ and $r_V(t)$ indicate time-domain representations of the electrosensory and visual reference signals, $R_E(s)$ and $R_V(s)$, respectively.
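The stimulus construction described above can be sketched as a sum of two sinusoids sampled at the camera frame rate. This is a minimal illustration, not the authors' stimulus code: the function names are hypothetical, the use of sine (rather than cosine) for both components and a zero-phase base are assumptions, and the 10 s onset and offset ramps are not modelled. The amplitudes, frequencies, frame rate and 80 s analysis window are taken from the text.

```python
import math
import random

BASE_AMP = 2.4       # cm/s, base component amplitude (from the text)
BASE_HZ = 0.05       # Hz, base component frequency (from the text)
PROBE_HZ = 0.25      # Hz, probe component frequency (from the text)
FRAME_HZ = 20        # camera frame rate in Hz (from the text)
ANALYSED_S = 80      # 100 s trial minus two 10 s ramps

def stimulus(t, probe_amp=0.0, probe_phase=0.0):
    """Stimulus velocity (cm/s) at time t: base sinusoid plus optional probe."""
    base = BASE_AMP * math.sin(2 * math.pi * BASE_HZ * t)
    probe = probe_amp * math.sin(2 * math.pi * PROBE_HZ * t + probe_phase)
    return base + probe

def profile3(phase):
    """Profile 3 sketch: high-amplitude electrosensory probe, no visual probe."""
    times = [k / FRAME_HZ for k in range(ANALYSED_S * FRAME_HZ)]
    refuge = [stimulus(t, probe_amp=0.36, probe_phase=phase) for t in times]
    stripes = [stimulus(t) for t in times]
    return times, refuge, stripes

# Randomized probe phase; the seed is fixed here only for reproducibility.
rng = random.Random(0)
times, refuge, stripes = profile3(rng.uniform(0.0, 2.0 * math.pi))
print(len(times), "frames per analysed trial")
```

The two trajectories differ only by the probe term, so the sensory conflict between refuge and stripes is bounded by the probe amplitude at every frame.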
The fish completed 10 ‘training’ trials of the high-amplitude coherent stimuli (Profile 5) before a data collection session began. The fish performed approximately 36 trials in each session with an approximate inter-trial interval of 2 min in which the refuge and light pattern were stationary. To mitigate transient effects, each 100 s trial had 10 s ramps at the beginning and end, which were excluded from further analysis. The base frequency of 0.05 Hz corresponds to a period of 20 s, so the remaining 80 s of each trial contained exactly four periods of the input. The camera frame rate was 20 Hz, meaning that 1600 frames of data were collected for analysis for each trial. Video clips of a fish performing a profile at both conductivity conditions are included in the electronic supplementary material.
Because the fish were unconstrained, they occasionally performed movements unrelated to the tracking task. Experiments in which the fish left the refuge or reversed orientation within the refuge were excluded from data analysis. All other volitional movement was included. Fish 1 only completed three successful trials of the high-amplitude electrosensory stimulus (Profile 3) at high conductivity, and those trials were also excluded.
The absolute positions of the fish and refuge for each trial (n = 558) were digitized from the video in Matlab using custom code (MathWorks, Natick, MA, USA), and the velocity trajectories of the refuge, visual stimulus and fish were calculated (figure 3a). The remainder of the analysis is in terms of velocity, not position, because the fish were free to maintain an arbitrary position and initial orientation with respect to the refuge, as in previous studies of refuge tracking [8,23].
The time-domain mean of a single fish's velocity for each profile for 10 replicates was taken at each frame of that profile, a technique recommended to reduce the bias and variance of the frequency response function measurement . For instance, the fish occasionally uses whole-body bending to extract additional electrosensory information from its surroundings  and rapid shifts in position to correct accumulated tracking error (drift with respect to the refuge), and time-domain averaging reduces the effects of these nonlinear behaviours (figure 4).
A discrete Fourier transform (DFT) was applied to the averaged velocity data using the fast Fourier transform algorithm. The DFT represents the time-domain signals as complex-valued functions of frequency (figure 4). From the frequency domain data, we extracted the gain at the base frequency in response to the coherent stimulus and the gain at the probe frequency due to the modality of interest. Together, these terms compose the closed-loop gain to vision or to electrosense, and we calculated magnitude and angle from the resulting complex function.
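The averaging and frequency-domain steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab pipeline: the helper names are hypothetical, a single-bin DFT is used in place of a full FFT (the two agree at frequencies that complete a whole number of cycles in the record, as 0.05 Hz and 0.25 Hz do over 80 s), and the synthetic "fish" response is invented for the demonstration. It computes only the gain at one frequency and does not reproduce how the base and probe terms were combined in the study.

```python
import cmath
import math

def mean_across_trials(trials):
    """Frame-by-frame mean of equally long velocity records."""
    n = len(trials)
    return [sum(frame) / n for frame in zip(*trials)]

def dft_at(signal, freq_hz, frame_hz):
    """Single-bin DFT, scaled so a unit-amplitude sinusoid has magnitude 1."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / frame_hz)
              for k, x in enumerate(signal))
    return 2 * acc / n

def gain(output, reference, freq_hz, frame_hz):
    """Complex gain: output over reference at one frequency."""
    return dft_at(output, freq_hz, frame_hz) / dft_at(reference, freq_hz, frame_hz)

# Synthetic check: a made-up response with half the probe amplitude and a
# 45 degree phase lag relative to the reference.
fs, n = 20, 1600
t = [k / fs for k in range(n)]
ref = [2.4 * math.sin(2 * math.pi * 0.05 * x)
       + 0.36 * math.sin(2 * math.pi * 0.25 * x) for x in t]
out = [2.0 * math.sin(2 * math.pi * 0.05 * x - 0.2)
       + 0.18 * math.sin(2 * math.pi * 0.25 * x - math.pi / 4) for x in t]

h = gain(mean_across_trials([out, out]), ref, 0.25, fs)
magnitude, angle = cmath.polar(h)
print(magnitude, angle)
```

Because the analysed window holds an integer number of base and probe cycles, the two components fall in separate DFT bins and the probe gain is not contaminated by the much larger base component.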
There was assumed to be no measurement error on the input or the output, a reasonable assumption given the high precision of the measurement equipment and synchronization built into the data collection system.
Unless otherwise noted, a full factorial two-way analysis of variance tested the effect of conductivity and/or stimulus amplitude (depending on the hypothesis) on the fish response. All statistical analyses were performed using Matlab's anova1 and anovan functions (MathWorks).
The highest peaks in output power occurred at the input frequencies at low and high conductivity (see the electronic supplementary material for figures). From this result, we conclude that the fish tracked the stimuli, and its response was not the result of other behaviours such as exploratory movements.
The magnitude of a response to a stimulus with coherent visual and electrosensory components was compared to that for the visual and the electrosensory stimuli alone. We expected multisensory enhancement: the sum of the gain to vision from a trial with a visual probe and gain to electrosense from a trial with an electrosensory probe would be less than the gain in a trial where the stimuli are coherent . The fish displayed multisensory enhancement, exhibiting significantly higher gain for coherent cross-modal stimuli (Profiles 5 and 6) compared to single stimuli trials (Profiles 1–4; figure 5). That result is consistent with the literature in fish [8,9] and mammals [31,32] and indicates that the fish uses visual-electrosensory integration during refuge tracking.
From our first hypothesis, we expected that for trials with the high-amplitude probe, the sum of the gain to vision from a trial with a visual probe and the gain to electrosense from a trial with an electrosensory probe would be approximately equal to the gain in a trial in which the stimulus contains coherent visual and electrosensory components at high amplitude. Specifically, $H_{V,1} + H_{E,3} \approx H_{E,5}$, where $H_{V,1}$ is the closed-loop gain to vision in trials with profile 1, $H_{E,3}$ is the closed-loop gain to electrosense in trials with profile 3 and $H_{E,5}$ is the closed-loop gain to electrosense in trials with profile 5.
Since the frequency response is characterized by both a phase shift and magnitude, we consider its position on the complex plane, where gain magnitude is the distance from the origin and phase shift is the counterclockwise angle from the positive real axis. The multisensory integration appears to be approximately linear (figure 6a). At low conductivity, when the stimulus contained coherent visual and electrosensory components, the gain magnitude was slightly higher than the sum of the gains from the incoherent stimuli, but the difference was not significant (figure 6b). At high conductivity, the response was indistinguishable from linear.
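The superposition test described above amounts to adding two complex numbers and comparing the sum with a third. A minimal sketch follows; the function name and the three gain values are hypothetical placeholders chosen for illustration, not measurements from the study.

```python
import cmath

def superposition_check(h_vision, h_electro, h_coherent):
    """Compare the coherent-stimulus gain with the sum of unimodal gains.

    Returns the predicted (summed) gain and the residual distance on the
    complex plane, expressed as a fraction of the coherent gain magnitude.
    """
    predicted = h_vision + h_electro
    residual = abs(h_coherent - predicted)
    return predicted, residual / abs(h_coherent)

# Hypothetical gains given as (magnitude, phase in radians).
h_v = cmath.rect(0.25, -0.60)   # visual probe alone
h_e = cmath.rect(0.55, -0.40)   # electrosensory probe alone
h_c = cmath.rect(0.80, -0.45)   # coherent probes

predicted, relative_residual = superposition_check(h_v, h_e, h_c)
print(cmath.polar(predicted), relative_residual)
```

Adding the gains as complex numbers, rather than adding their magnitudes, accounts for any phase difference between the visual and electrosensory responses.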
In low conductivity, not all fish exhibit a robust response to the unimodal visual probe (see the electronic supplementary material). However, when the visual probe is coherent with the electrosensory probe, there is a strong enhancement over the unimodal electrosensory response, demonstrating that the visual stimulus is salient. This indicates a supralinear integration of vision and electrosense at low conductivity. This supralinear relationship is borne out on the complex plane (figure 6a).
When the conductivity of the water was increased, the fish experienced decreased contrast in the perceived electrosensory image of the refuge . We found that the gains for trials with coherent stimuli (Profiles 5 and 6) were unchanged between conductivity conditions, suggesting that the fish accurately tracked the refuge despite the categorical change in electrosensory saliency (figure 6). While unintuitive, this result has been described once before . What remains unknown, and what we investigated here, is the extent to which the fish re-weights electrosensory and visual information in adverse environmental conditions.
We anticipated that the fish would re-weight the electrosensory and visual signals to favour vision when electrosensory saliency was reduced. The gain ratio, defined as the magnitude of the closed-loop gain to vision divided by that of the closed-loop gain to electrosense, is useful to evaluate the fish's sensory re-weighting. A gain ratio equal to 1 would indicate that the fish weights visual and electrosensory stimuli equally, and a gain ratio greater than 1 indicates a higher weight to vision than electrosense. At low conductivity, only Fish 2 weighted vision higher than electrosense, suggesting that given a salient electrosensory stimulus, fish relied more heavily on electrosense than vision (figure 7a). We observed significantly higher gain ratios for high-conductivity trials compared with low-conductivity trials for four out of five fish (figure 7a).
The fish up-weighted vision when the electrosensory signal was degraded (figure 7b). Specifically, in profiles with a visual probe (Profiles 1 and 2), gain to vision was significantly higher at high conductivity than low conductivity across fish and amplitudes, implying that the re-weighting was mediated by modulating the gain to vision rather than a change in electrosensory gain.
The gain is a complex number containing both magnitude and phase information about the fish's response. The tracking error, or Bode error, captures both phase and magnitude at the probe frequency, so it is a useful measure to compare tracking performance between profiles . On the complex plane, the tracking error is the distance from the frequency response point to the point representing perfect tracking (unity gain and zero phase shift; figure 6a). In terms of tracking error, the fish displayed more accurate tracking of the visual stimulus at high conductivity than the same stimulus at low conductivity (figure 8). Again, the electrosensory response was unchanged by conductivity.
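Both summary measures used in this section follow directly from the complex gains. The sketch below shows the gain ratio (visual over electrosensory magnitude) and the tracking error (distance on the complex plane from the point 1 + 0j, which represents unity gain and zero phase shift); the function names are hypothetical and the example values are illustrative only.

```python
import cmath
import math

def gain_ratio(h_vision, h_electro):
    """Ratio of gain magnitudes; values above 1 favour vision."""
    return abs(h_vision) / abs(h_electro)

def tracking_error(h):
    """Distance from the complex gain to perfect tracking at 1 + 0j."""
    return abs(1 - h)

# Two hypothetical responses with the same magnitude but different phase lag.
in_phase = cmath.rect(0.8, 0.0)
lagging = cmath.rect(0.8, -math.pi / 2)

print(tracking_error(in_phase), tracking_error(lagging))
print(gain_ratio(cmath.rect(0.3, -0.5), cmath.rect(0.6, -0.2)))
```

The two example responses have identical gain magnitude, yet the lagging one has the larger tracking error, which is why the error is a more complete performance measure than magnitude alone.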
Based on the literature , we expected the fish to interpret a lower amplitude probe as more reliable and increase the gain to the probed modality. However, when the visual signal had low amplitude, two of the five fish decreased the magnitude of the gain to vision. Similarly, one fish decreased the gain to electrosense when the electrosensory stimulus amplitude was low. Gain to the visual stimulus was not affected by stimulus amplitude at either conductivity (p = 0.340). Similarly, gain to electrosense was not significantly affected by amplitude at either conductivity (p = 0.419). The tracking error was unaffected by the stimulus amplitude for visual (p = 0.211) and electrosensory stimuli (p = 0.749) for both conductivities. One might expect a significant result from testing a large number of fish, but the failure to detect significance with five fish suggests that any amplitude-dependent nonlinearities are small relative to the variability between conditions.
Robustly interpreting sensory input is central to successful interaction with the environment in general and this tracking task in particular. Our experimental apparatus enabled us to quantify the change in relative weights given to vision and electrosense during a complex locomotor task. We found that Eigenmannia virescens employed flexible, saliency-based locomotor control. Specifically, the animals up-weighted visual information when electrosensory salience was compromised (high conductivity).
Fish routinely showed greater response to electrosensory stimuli than visual stimuli; three fish favoured electrosense even at high conductivity for some amplitude conditions. Since Eigenmannia virescens is nocturnal, electrosense might be its more biologically relevant sensory modality in the dark, causing it to up-weight electrosense over vision. Analogously, humans show a strong bias towards vision over audition in a spatial tracking task. In both cases, the multisensory interaction seems to obey ‘modality appropriateness’, in which the modality most appropriate to a given task dominates. Indeed, we observed oscillatory swimming patterns like those previously associated with no-light (electrosensory-only) conditions, indicating that in our experiments the fish may have been relying more heavily on electrosensory information than vision. A visual stimulus better matched to the electric fish eye physiology may induce greater reliance on vision, but research into such physiology remains sparse.
In a previous tracking study with coherent visual and electrosensory inputs (i.e. a visible physical refuge), the fish's response approximately conformed to the scaling property of linearity , so we expected that the response would also obey superposition across modalities. Indeed, we found that the fish's multisensory integration approximately obeyed superposition. Specifically, the sum of the gains to vision and electrosense in profiles with unimodal probes was approximately equal to the gains measured for coherent, cross-modal stimuli at both low and high conductivity. This result is consistent with previous research on multisensory interaction in insects: in a similar task, flower tracking, freely flying moths obeyed scaling  and superposition  of vision and mechanosense, and tethered flies showed a linear superposition of visual and olfactory motor responses during odour plume tracking .
As hypothesized, the ratio of visual gain to electrosensory gain increased at high conductivity, suggesting that the fish re-weighted the open-loop gains to vision and electrosense according to the relative saliency of the sensory inputs. These results are similar to those in a recent study of multisensory integration in sharks in which sharks dynamically substituted alternate modalities during hunting based on sensory and environmental conditions . Analogously, when faced with adverse electrosensory conditions, the fish up-weighted vision to maintain accurate refuge tracking. This up-weighting of vision resulted in significantly lower visual tracking error in high conductivity.
The fish's saliency-dependent response agrees with the human multisensory interaction literature. In a spatial tracking task, when visual uncertainty was low, auditory signals exerted little or no influence on perceived target location, but with increasing visual uncertainty, the participants demonstrated increased auditory influence . The fish were biased towards electrosense when the electrosensory uncertainty was low, but the weight to vision increased with electrosensory uncertainty. This finding supports the view that multisensory integration is mediated by the relative saliency in individual sensory domains. Strong intramodality dependence enables the nervous system to dynamically adapt to changing environmental conditions.
In a task similar to the one presented here, freely flying moths displayed a small decrease in gain at high frequencies to the motion of the flower during flower tracking in dim conditions compared with bright conditions . Based on this result, we expected to find a decrease in open-loop electrosensory gain at high conductivity, but the closed-loop gain to electrosense did not diminish in this condition. Perhaps the fish produced more active sensing movements to increase the electrosensory contrast; indeed, the fish has been shown to increase the amplitude of its forward–backward oscillations in response to increased conductivity and achieve similar tracking performance across conductivities . These active ‘wiggles’ in the fish's fore–aft movement would have contributed to variability in our estimates of tracking gain at high conductivity but enabled the fish to maintain tracking performance. In this way, active sensing may explain why the electrosensory gain did not decrease at high conductivity even though the magnitude of the gain to vision and the relative gain to vision over electrosense increased.
Contrary to our third hypothesis, the fish did not increase gain to a given modality based on a low-amplitude probe stimulus to that modality. In cases where the gain magnitude was higher for high-amplitude stimuli than low-amplitude stimuli, the fish actually moved farther (did more mechanical work) to follow the unreliable signal. This finding differs from results of research in human multisensory integration. For instance, humans were found to down-weight visual information in favour of auditory information in the presence of decreased visual signal reliability (spatial offset) during spatial localization. In another study, humans down-weighted the higher amplitude stimuli of either touch or vision during postural stabilization. The ability to down-weight unreliable signals is crucial: fall-prone older adults are hypothesized to be more visually dependent, failing to shift reliance towards somatosensory cues in environments where visual inputs are unstable. The discrepancy between our results and these human studies may come from the different task goals: self-orientation versus refuge tracking.
Our results could also be explained by an attentional switch to following the high-amplitude, high-frequency unimodal probe stimulus. At low amplitude, perhaps the sensory illusion was strong; the fish could not distinguish between the probe and base components of the cross-modal input and unconsciously re-weighted the sensory signals (as expected). In the spatial localization tasks, human participants reported being unaware of a spatial discrepancy between the auditory and visual signals ; but we do not have enough information about the fish's visual processing to determine whether the fish detected the sensory conflict in our task, and the fish were unable to complete post-trial surveys.
The single probe frequency used in this study was the minimum stimulus necessary to investigate the phenomenon of multisensory integration and control. A limitation of this approach is that it is impossible to predict the response to stimuli other than sinusoids at the probe frequency. In other words, a model fitted to these data would be underconstrained. Furnishing a predictive dynamical model requires broadband stimuli such as sums of sines, band-limited noise, chirps and step functions.
Determining how the dynamics depend on sensory salience also requires an independent set of perturbations to the quality of the sensory cues themselves. For fish refuge-tracking behaviour, this could be achieved by degrading the visual signal—blurred stripes, incoherent pattern movement, etc.—in a way analogous to degrading electrosense through increased conductivity. Ultimately, one could use the responses from these richer stimuli, together with a model of the locomotor mechanics , to produce a predictive closed-loop model of control [13,14]. Such a model might include Bayesian inference for multisensory integration and state estimation, i.e. a Kalman filter .
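To make the idea of Bayesian cue integration concrete, the sketch below fuses two noisy position estimates by inverse-variance weighting, which is the static, single-step form of the measurement update in a Kalman filter. This is a textbook illustration and not a model proposed or fitted in this study: the function name, the mapping of conductivity onto electrosensory variance, and all numbers are assumptions made for the example.

```python
def fuse(x_vision, var_vision, x_electro, var_electro):
    """Fuse two independent Gaussian cues by inverse-variance weighting.

    Returns the fused estimate, its variance and the weight on vision.
    """
    precision_v = 1.0 / var_vision
    precision_e = 1.0 / var_electro
    w_vision = precision_v / (precision_v + precision_e)
    estimate = w_vision * x_vision + (1.0 - w_vision) * x_electro
    variance = 1.0 / (precision_v + precision_e)
    return estimate, variance, w_vision

# Hypothetical cue values (cm) and variances (cm^2).
# "Low conductivity": electrosense assumed to be the more reliable cue.
_, _, w_low = fuse(1.0, 4.0, 3.0, 1.0)
# "High conductivity": electrosensory variance assumed to increase.
_, _, w_high = fuse(1.0, 4.0, 3.0, 8.0)

print(w_low, w_high)
```

Under these assumptions the weight on vision rises as the electrosensory cue becomes noisier, which matches the direction of the re-weighting reported here; whether the fish's weights are quantitatively Bayes-optimal would require estimates of the sensory variances themselves.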
There are two challenges to developing a Bayesian model of sensory integration in the context of closed-loop control. First, one must understand how the animal interprets the change in saliency and uses it to inform its estimate of the variance of the sensory signal. Second, active sensing behaviour violates the separation of sensing and action, an implicit assumption in most engineering approaches to state estimation. An exciting frontier lies in integrating Bayesian inference and active sensing [39,40].
The authors thank E. Roth for insightful discussion on stimulus design and closed-loop modelling, A. H. Griffith for assistance with data collection, and T. R. Mitchell for early contributions to the data analysis. The authors would also like to thank the reviewers for constructive feedback and suggestions.
All experimental procedures were approved by the Johns Hopkins University Animal Care and Use Committee and followed guidelines established by the National Research Council and the Society for Neuroscience.
The datasets and analysis code supporting this article are available at http://dx.doi.org/10.7281/T1D798BQ.
E.E.S. collected the data, analysed the data and prepared the majority of the manuscript. A.D. was instrumental in the design and construction of the experimental apparatus and the preparation of the Material and methods section of the manuscript. S.A.S. and E.S.F. contributed to the conception of the experiment and interpretation of the results. N.J.C. led the experimental design and interpretation of results and contributed to the drafting of the manuscript. All authors revised the manuscript and approved the final version.
No competing interest declared.
This material is based upon work supported by a Complex Systems Scholar Award to N.J.C. from the James McDonnell Foundation under grant no. 112836, a National Science Foundation Graduate Research Fellowship to E.E.S. under grant no. 112379 and an Achievement Rewards for College Scientists scholarship to E.E.S.