Our experience of the world is based largely on multi-sensory information. For instance, when we manipulate objects we typically see and touch them simultaneously. Also, the sight of a person and the sound of his or her voice are co-located in space, something that also applies to sound-emitting objects in general. Further, when someone speaks, speech sounds correlate in an orderly way with facial movements. When objects or people move, they typically produce a sound accompanying their movement; sound is produced when, for instance, a ball bounces on a surface or rolls across a hard floor. The ability to detect the matches and correlations existing in information from separate senses is thus a necessary condition for an integrated multisensory awareness of the world, and vital questions arise regarding the developmental origins of this ability.
Auditory-visual spatial co-location refers to the cross-modal association between the location of a sound and a visual event and, of course, depends on infants being able to localise sounds. Here there is some disagreement in the literature, with some work indicating a discontinuity or U-shaped developmental function, with auditory localisation hard to elicit at around 2 months (Clifton, Morrongiello, Kulig, & Dowd, 1981; Field, Muir, Pilon, Sinclair, & Dodwell, 1980; Muir, Clifton, & Clarkson, 1989), and other work suggesting a more or less linear increase in localisation ability with age (Morrongiello, 1988; Morrongiello, Fenwick, & Chance, 1990). It appears likely that this disagreement relates to differences in the dependent measure. When visual orienting to sound is measured there is a dip in responsiveness at 2 months that may reflect a changeover in the neural system mediating the orienting response (Muir et al., 1989). However, a linear increase in localisation ability emerges when the response to sound position change does not involve orienting to it. This conclusion is in line with evidence from the spatial co-location literature, which suggests this ability is present at birth (Morrongiello, Fenwick, & Chance, 1998) and in older infants, including 2-month-olds (Morrongiello, Fenwick, & Nutley, 1998). We should note, however, that even at 6 to 7 months, accuracy of auditory localisation discrimination is approximately one tenth of adult resolution (Ashmead, Clifton, & Perris, 1987).
Spatial co-location is fundamental to everyday perception and so extending our knowledge of the conditions under which infants reveal sensitivity to auditory-visual spatial co-location should be a priority. A current view is that temporal synchrony between sound and sight is initially more salient than spatial co-location (Morrongiello et al., 1998). However, spatial co-location is also a ubiquitous feature of intersensory information from the world; there are frequent cases in which objects produce sounds that are consistently co-located with their visual manifestation. Examples include people, who frequently talk while stationary or in motion and generally produce some sound of footfall when in motion, mechanical mobile toys within the home, and many forms of mechanised transport in the wider environment.
Many studies of spatial co-location (e.g. Fenwick & Morrongiello, 1998; Morrongiello et al., 1998) involve events at two fixed places: although the events themselves may be dynamic (for instance, the visual stimulus may move up and down to gain and maintain the infant’s attention), co-location of sound and sight occurs at static locations. However, there is some work that investigates detection of dynamic auditory-visual correspondences for movements in the near-far plane (Pickens, 1994; Walker-Andrews & Lennon, 1985). These studies indicate an ability to form such correspondences at 4 to 5 months. In these cases, however, the correspondence is not a direct spatial one, because auditory ‘distance’ is specified by sound intensity. Also, Pickens demonstrated that infants showed an association between changing sound amplitude and changing size of the visual stimulus in the absence of cues for movement in depth. This could mean that synaesthetic correspondence between visual size and sound amplitude explains part of the effect in these cases, a possibility made more plausible by evidence for synaesthetic correspondences at 4 months (Walker, Bremner, Mattock, Mason, Spring, Slater, & Johnson, 2010) as well as in toddlers (Maurer, Pathman, & Mondloch, 2006; Mondloch & Maurer, 2004).
It thus appears important to investigate dynamic auditory-visual co-location where visual and auditory locations are more directly specified. In this respect, lateral movement is a good candidate for investigation, because it is possible to provide veridical auditory information for changing location. Also, there is evidence that infants are sensitive to a bounce illusion in which two objects that move smoothly through each other appear to bounce when a sound co-occurs with their fusion (Scheier, Lewkowicz, & Shimojo, 2003), which suggests that intermodal information is likely to be processed in the case of lateral movements. Surprisingly, however, to our knowledge there is no work that investigates dynamic co-location in lateral movements, though this should be relatively easy to investigate. Suppose, for example, infants are habituated to an event sequence in which a sounding object moves back and forth on a horizontal path. It is then possible to test for dishabituation when the object moves as usual but the sound is dislocated, so that, for instance, as the object moves left the sound moves right. If infants show recovery of looking, we can conclude that they have detected the invariant dynamic relation between locus of sight and sound, and note when this is violated. We can also investigate whether any such effect is limited to the case of co-location by habituating infants to a dislocation relation and testing for recovery of looking to the co-location relation. In addition to filling an important gap in the literature, studies of this sort carry the further advantage of tapping into dynamic events that typically occur in the world (moving objects typically make a sound due to their movement).
It is not clear what one would predict regarding emergence of this dynamic form of spatial co-location. On the one hand, we might expect quite young infants to reveal this ability. We know that newborns detect spatial co-location in the case of static positions (Morrongiello et al., 1998), and presentation of dynamic information might, if anything, enhance this ability. On the other hand, adults are quite poor at detecting departure from dynamic spatio-temporal co-location of a moving object and a moving sound, there being a tendency to perceive a sound as moving with the visual object even when it is not (Soto-Faraco, Kingstone, Lyons, Gazzaniga, & Spence, 2002). If this tendency exists in infants it could act as a barrier to detection of dislocation between sight and sound.
The work reported here is a systematic investigation of circumstances under which infants detect violation of amodal auditory-visual relations in dynamic events involving sounding objects. We employ well-tested techniques used successfully to investigate object unity (Johnson, Bremner, Slater, & Mason, 2000) and trajectory perception (Bremner et al., 2005) in infancy, and report four experiments that investigate the conditions under which infants detect changes in spatio-temporal co-location and dislocation between moving visual and auditory stimuli. In all four experiments, as the visual stimulus we use an image of a ball, moving on a horizontal trajectory, and as the auditory stimulus we use an attractive sound, stereophonically produced so as to create the impression (to adults) that it moved with or in the opposite direction to the object. Thus in co-located displays there was redundant dynamic co-location of sound and sight, but the relation between the nature of the sound and the nature of the object was arbitrary. The latter choice was made partly because piloting indicated that it was important to ensure that both the auditory and visual stimulus were salient, and more ‘realistic’ sounds, such as that of a ball rolling, did not appear to recruit attention. Additionally, however, many of the sound-emitting physical objects that infants encounter produce sounds that are arbitrarily related to their visual appearance; this is particularly true of infant toys. Thus, for both methodological and theoretical reasons we chose a sound that was completely arbitrary relative to the visual object.
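Purely as an illustration, the distinction between co-located and dislocated displays can be sketched in code. The sketch below is not a description of the stimulus software used in these experiments: it assumes a triangle-wave trajectory, a normalised horizontal scale running from -1 (far left) to +1 (far right), a 4-second cycle, and a constant-power stereo panning law, none of which is specified in the text. All function names and parameters are hypothetical.

```python
import math

def object_position(t, period=4.0):
    """Horizontal position of the ball at time t (seconds).

    Assumed triangle wave: -1 is far left, +1 is far right, and one
    full back-and-forth cycle takes `period` seconds (assumed value).
    """
    phase = (t % period) / period
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def sound_position(t, relation="co-located", period=4.0):
    """Apparent sound position under the two display types.

    'co-located': the sound tracks the object.
    'dislocated': the sound mirrors the object, so it moves right
    as the object moves left.
    """
    pos = object_position(t, period)
    return pos if relation == "co-located" else -pos

def stereo_gains(pos):
    """Constant-power pan (an assumed panning law).

    Maps a position in [-1, 1] to (left_gain, right_gain) so that
    left_gain**2 + right_gain**2 == 1 at every position.
    """
    angle = (pos + 1) * math.pi / 4
    return math.cos(angle), math.sin(angle)
```

Under these assumptions, at the start of a cycle the ball is at the far left; in a co-located display the sound is carried entirely by the left channel, whereas in a dislocated display it is carried entirely by the right channel, and the two positions cross at the midpoint of the path.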
Dynamic auditory-visual spatial co-location is an amodal intersensory relation involving redundant presentation of information across the senses, and there is good evidence that redundant presentation of this sort recruits infants’ attention and enhances learning (Bahrick, Flom, & Lickliter, 2002; Bahrick & Lickliter, 2000; Bahrick, Lickliter, & Flom, 2004). It should be noted that in all displays used in this series, there was also temporal redundancy consisting of common onset and offset, and hence duration of visual and auditory events. However, although stimulus onset generally happened when infants were fixating the screen, offset generally occurred when infants were looking away. Also, there were no discontinuities in auditory and visual stimuli during trials. Thus, only common onset information was liable to be consistently perceived by infants. However, given the argument that temporal synchrony is initially more salient than spatial co-location (Morrongiello et al., 1998), it is possible that common onset is salient enough to cue that sound and sight are linked. Thus, this information in itself may be sufficient to link auditory and visual information that is spatially dislocated.
In the experiments reported here we investigate dynamic auditory-visual spatial relations in three ways. In Experiment 1 we investigate whether infants between 2 and 8 months of age demonstrate a spontaneous sensitivity to co-location and detect departures from this relation. In Experiment 2 we investigate infants’ response to departure from co-location following exposure to co-location during habituation trials. In Experiments 3 and 4 we expose infants to a dislocated relation between sight and sound, in which the sound appears to move in the opposite direction to the visual object, and measure their post-habituation response to a co-located event. Thus, in addition to investigating spontaneous sensitivity to dynamic co-location, this series investigates whether asymmetries exist in the degree to which infants can detect departures from co-location and dislocation relations following repeated exposure. Given that co-location is a good inter-modal cue to ‘objecthood’, we might expect greater sensitivity to co-location than to dislocation.