|Home | About | Journals | Submit | Contact Us | Français|
Research has shown that infants are able to track a moving target efficiently – even if it is transiently occluded from sight. This basic ability allows prediction of when and where events happen in everyday life. Yet, it is unclear whether, and how, infants internally represent the time course of ongoing movements to derive predictions. In this study, 10-month-old crawlers observed the video of a same-aged crawling baby that was transiently occluded and reappeared in either a temporally continuous or non-continuous manner (i.e., delayed by 500 ms vs. forwarded by 500 ms relative to the real-time movement). Eye movement and rhythmic neural brain activity (EEG) were measured simultaneously. Eye movement analyses showed that infants were sensitive to slight temporal shifts in movement continuation after occlusion. Furthermore, brain activity associated with sensorimotor processing differed between observation of continuous and non-continuous movements. Early sensitivity to an action’s timing may hence be explained within the internal real-time simulation account of action observation. Overall, the results support the hypothesis that 10-month-old infants are well prepared for internal representation of the time course of observed movements that are within the infants’ current motor repertoire.
Infants possess a remarkable ability to predict future events. This has been demonstrated in various domains such as visual expectation (Canfield and Haith, 1991; Adler et al., 2008), social interaction (Adamson and Frick, 2003; Striano et al., 2006), action perception (Hunnius and Bekkering, 2010; Rosander and von Hofsten, 2011), and object tracking (Rosander and von Hofsten, 2004). Predicting when and where events occur is indispensable to understand and smoothly coordinate one’s behavior with others’ actions in everyday life (cf. Hommel et al., 2001). However, it is unclear whether infants actually rely on real-time processing of observed actions when predicting their future trajectory. As a consequence, the cognitive and neural processes of such real-time representations remain poorly understood.
Transient occlusion of ongoing movement is a frequently used paradigm to investigate predictive abilities and their neural implementations. According to this research, both mnemonic processes (Wilcox and Schweinle, 2003; Keane and Pylyshyn, 2006; Bosco et al., 2012; Springer et al., 2013) and sensorimotor processes (e.g., Graf et al., 2007; Southgate et al., 2009; Elsner et al., 2013) have been advocated to assist movement observation. Studies on object motion suggest that infants linearly extrapolate the ongoing trajectory of observed movement (e.g., von Hofsten et al., 1998). Linear extrapolation corresponds to working memory operations (e.g., Baddeley and Hitch, 1974; Pelphrey and Reznick, 2002) maintaining an internal representation of the target movement during occlusion that can be matched following the reappearance to generate predictions. In line with this assumption, infants need to plan and control their eye movements based on previously collected information in order to match pre- and post-occlusion input (Bennett and Barnes, 2003; Rosander and von Hofsten, 2004; Springer et al., 2013; Kwon et al., 2014; Bache et al., 2015).
While object motion usually follows linear trajectories with continuous velocity human movement is non-linear with changes in velocity and path. Linear extrapolation may hence not be an optimal approximation of human trajectories. Infants have been shown to render precise predictions about observed human actions, such as transporting a ball into a basket (Green et al., 2014). Here, predictions may be derived from internally simulating the observed action in sensorimotor areas of the brain as if performing the action oneself (Flanagan and Johansson, 2003; Falck-Ytter et al., 2006; Rosander and von Hofsten, 2011). In line with this assumption, initial evidence in infants suggests that sensorimotor processes support the internal representation of spatiotemporal aspects of human action, including predictive functions (Southgate et al., 2009, 2010; Stapel et al., 2010, 2016).
It remains unclear whether infants’ processing of human movement recruits real-time representations employing simulation, memory, or both. Here, we consider representations as a neural pattern of stimulus coding that maintains stimulus properties as a close analog to the original sensory input in order to integrate previous and newly incoming stimulation (Hebb, 1949/2009).
Transient occlusion allows manipulating the temporal structure of on-going movement so that the post-occlusion trajectory does not reflect a time-matching continuation of the pre-occlusion movement. Applying such a paradigm, behavioral studies in adults pointed out that the processing of observed actions is running parallel to the actions’ time course (e.g., Graf et al., 2007). However, previous studies also suggested that delayed and forwarded manipulations may not be processed similarly. More precisely, adults judged the continuation of a human action after a transient occlusion to be continuous when it was in fact slightly delayed, while they judged the continuation to be on time when it was in fact slightly forwarded (e.g., Sparenberg et al., 2012). Infants could recognize temporal shifts only if extreme jumps forward in time were presented (Wilcox and Schweinle, 2003; Bremner et al., 2005), while they could readily detect an one-second delay in their mothers’ interaction (Striano et al., 2006). To further explore how infants process the time course of human action, delayed and forwarded movements need to be contrasted with continuous movement.
The present study aimed to investigate infants’ sensitivity to the time course of human action. Specifically, 10-month-old crawlers watched a same-aged crawling baby that was transiently covered from sight. Following the occlusion, the movement was either continued in a time-matching manner (i.e., no time shift, resulting in continuous movement continuation) or in a non-matching manner (i.e., time shift, resulting in delayed or forwarded movement continuation) relative to the pre-occlusion movement stream (Graf et al., 2007). Due to limits in attention span, infants were randomly assigned to one of two experimental groups watching either continuous and delayed (i.e., Delay group) or continuous and forwarded movements (i.e., Forward group) within a single experimental session.
To capture mnemonic and sensorimotor contributions to movement processing, eye movements (via eye-tracking) and rhythmic neural activity (via electroencephalography, EEG) were measured simultaneously. Eye movements have been associated with both mnemonic (e.g., Keane and Pylyshyn, 2006) and sensorimotor processing (e.g., Elsner et al., 2013) and therefore provide a rather indirect measure of cognitive processes. Rhythmic neural activity may provide a complementary view. Specifically, mnemonic functions are assumed to be reflected in frontal theta1 modulations (Jacobs and Kahana, 2010; Saby and Marshall, 2012; Lisman and Jensen, 2013; Bache et al., 2015), and sensorimotor simulation is assumed to be reflected in central alpha modulations (also labeled sensorimotor, rolandic or mu rhythm; Cochin et al., 1999; Muthukumaraswamy et al., 2004; Marshall et al., 2011; Bache et al., 2015).
Only if the ongoing movement was processed in real-time while it was hidden during occlusion, could a time-matching continuation be distinguished from a non-matching one following occlusion (cf. Graf et al., 2007). Hence, infants’ sensitivity to the time course of movements would be reflected in differences in tracking and neural patterns following occlusion, whereas there should be no differences prior to and during the occlusion. With regard to eye-tracking, we hypothesized that the tracking of the target’s reappearance position would be more accurate (i.e., landing on mid to front parts of the target) and more consistent (i.e., less variable across infants) in time-matching continuations (Bennett and Barnes, 2003). In contrast, the reappearance position would be overshot (i.e., landing in front of the target) in delayed continuations, and undershot (i.e., landing behind the target) in forwarded continuations, and tracking would be overall less consistent in both non-continuous continuations. With regard to EEG, we hypothesized that frontal theta activity would be elevated more when processing non-matching than when processing time-matching continuations because temporarily stored representations during occlusion would not match the reappearance position following occlusion (Orekhova et al., 1999; Kwon et al., 2014). Secondly, central alpha activity was expected to decrease more (i.e., indicating increased processing, Hanslmayr et al., 2012; Werkle-Bergner et al., 2014) in non-matching than in time-matching continuations because real-time simulation during occlusion should result in a prediction error relative to the actual reappearance position following occlusion (Kilner et al., 2007; Stapel et al., 2010).
Participants were recruited from a database of parents interested in participating in infant studies at the Max Planck Institute for Human Development, Berlin. Infants were invited at 10 months of age (±10 days) according to the following criteria: (a) the infant was born at term (week of gestation ≥ 37, birth weight ≥ 2500 g), (b) to the parents’ knowledge, the infant had no visual impairments nor current health issues, and (c) according to the parents, the infant was capable of crawling on hands and knees with her/his stomach lifted but not yet able to walk. Parents were encouraged to bring their own notes about their children’s motor development to fill in a short checklist in the lab. The experiment was approved by the Institute’s Ethics Committee.
A total of 99 10-month-old infants were tested. Twelve infants were not considered for further preprocessing as they did not crawl a distance of 1.5 m in the lab at least once (n = 4) or were too fussy to be properly tested following preparation for EEG and eye-tracking (n = 8). For eye-tracking analysis, 14 further infants were excluded because (a) the calibration failed (n = 3), (b) the trigger information was missing in the recorded data (n = 6), (c) the measurement failed due to technical issues (n = 4), or (d) fewer than 10% of the actually watched trials were free of artifacts (n = 1). Furthermore, for the EEG analysis, 37 further infants were excluded because they did not produce enough artifact-free EEG data (at least 10 trials per condition; n = 30) or the measurement failed due to technical issues (n = 7).
Thus, the final eye-tracking sample consisted of 32 infants in the Delay group and 31 infants in the Forward group, and the final EEG sample comprised 24 infants in the Delay group and 25 infants in the Forward group. Tables Tables1,1, ,22 provide descriptive information on the final samples for eye-tracking and EEG analysis, respectively. Figure Figure11 illustrates which trials of both eye-tracking and EEG data were contributed to the analysis within the final samples. Note that not all infants provided data in both measures, and artifact-free trials were contributed randomly throughout the test session. As a result, eye-tracking and EEG data were analyzed separately (cf. Stapel et al., 2010).
Participants repeatedly watched a video of a same-aged baby crawling in front of a light gray background (2480 ms; pre-occlusion phase). The baby’s movement was transiently occluded by a full-screen black occlusion (500 ms; occlusion phase) and then immediately continued (1000 ms; post-occlusion phase). Hence, each trial lasted for 4000 ms. The video, however, was 4500 ms long, allowing to manipulate the movements’ timing. We choose to present an intransitive movement, that is a movement not directed at an apparent object or goal, in order to avoid confounds with object knowledge or object saliency. To avoid lateralization of brain activity, each video was presented from both left to right and right to left (i.e., flipped versions of the original video). On the x-axis of the monitor, the stimulus (i.e., crawling baby) was on average 279 pixel (ranging from 207 to 315 pixel) wide and moved with an average speed of 3° visual angle per second (see Figure Figure22 for an illustration of the stimulus material).
In a between-subjects design, participants were randomly assigned to one of two experimental groups: In the Delay group, continuous and delayed movements were shown, while in the Forward group, continuous and forwarded movements were presented. To achieve continuous and non-continuous (i.e., delayed or forwarded) movements, the starting time in the video footage was varied. More precisely, during pre-occlusion, non-continuous trials started either 500 ms earlier (i.e., at 0 ms in forwarded conditions) or 500 ms later (i.e., at 1000 ms in delayed conditions) as compared to the continuous trials (i.e., at 500 ms). However, following the occlusion (i.e., 500 ms), the movement was always continued at 3000 ms in the video footage. In other words, during occlusion, the video footage was paused in delayed trials (i.e., 0 ms elapsed), fast-forwarded in forwarded trials (i.e., 1000 ms elapsed), and continued in real-time in continuous trials (500 ms elapsed). Therefore, in non-continuous trials, the post-occlusion movement did not match a natural continuation of the pre-occlusion movement, but resulted in a forwarded (i.e., 500 ms too early) or a delayed (i.e., 500 ms too late) time course of the movement. Notably, the visual input slightly varied during pre-occlusion phases, while it was identical during occlusion and post-occlusion phases. Within each trial, time manipulation could only be detected following occlusion. This design ensured that differences between conditions during occlusion and post-occlusion could not be attributed to visual differences but reflect the manipulation of the movements’ time course.
Stimuli were presented using a customized program written in Microsoft Visual C++ (Microsoft Corporation, Redmond, WA, United States). Each trial was preceded by a centered fixation object (i.e., colored pictures of toys; duration of 800 – 1300 ms) on gray background. Conditions were presented in blocks of six trials, because rapid learning over trials has been reported (see Henrichs et al., 2014). The order of blocks was quasi-randomized such that blocks with the same condition and movement direction were never repeated successively. Participants were randomly assigned to one of two predefined block orders per experimental group. The stimulus presentation was controlled by an experimenter; depending on infants’ attention and compliance up to 24 blocks (i.e., 144 trials) were presented. The experiment was conducted in an acoustically and electromagnetically shielded room. Experimental sessions were video-recorded in time-synchronized split-screen images including a frontal and lateral view of the infant as well as a running and a condition trigger for coding infants’ behavior post hoc (Interact; Mangold International GmbH, Arnstorf, Germany). The lighting conditions were kept comparable across participants. The infant sat on the parent’s lap in a BabyBjörn® baby carrier facing a 20.1” monitor (dimensions: 40.8 cm × 30.6 cm, visual angle ≈ 29° × 22°) at a distance of approximately 80 cm (for more detailed information on the experimental procedure, see Bache et al., 2015). Despite restricting infant’s position, sitting distance could range from 60 to 90 cm when infants leaned forward or backward. In our set-up, the size of one pixel (0.051 cm) equals 0.037° visual angle for an ideal sitting distance.
Eye movements were recorded continuously using an EyeLink 1000 remote system eye-tracker (SR Research, Ottawa, ON, Canada), which allows for free head movements. The eye-tracking camera including the infra-red source was permanently positioned centrally below the presentation monitor. Participants were seated 55 cm from the recording eye-tracking camera. The camera recorded the corneal relative to the pupil reflection of the left eye at a frequency of 250 Hz in terms of raw gaze positions in pixel.
The infants’ head position was tracked using a small sticker on their forehead that allowed accounting for head movement of up to 100 cm/s. Infants’ position relative to the head box of the eye-tracker was checked using the camera image before the experimental procedure started. The data were filtered online using the second stage of the built-in heuristic filter (Stampe, 1993) which reduces noise in the data by a factor of 4 to 6 (according to the EyeLink manual). The average accuracy of the eye-tracking system is 0.5° visual angle for an ideal participant (i.e., sitting still with minimal head movements and generating a perfect calibration), as reported by the providing company, which would approximate to a 0.07 cm area at the viewing distance of 80 cm in the present experiments.
Following EEG preparation and prior to stimulus presentation, a five-point calibration procedure on a gray background was performed in the following order: center, upper center, lower center, left center, right center. The calibration target was a dancing rabbit in a square shape (96 × 96 pixel, approximately 4.9 cm2 on the monitor and 3.5° visual angle from the sitting position) accompanied by an attractive sound. An experimenter pushed a button to accept the gaze position if it was on the target position. The central position was repeated at the end as an estimate of accuracy. Calibration was only accepted if it was reported to be ‘good’ by the recording software (i.e., average error < 1° visual angle) and if the overall pattern of gaze positions matched the target’s positions according to the experimenter’s evaluation. If the calibration was not accepted, it was repeated until it was satisfying. If calibration could not be obtained, the experimental procedure was continued, but the participants’ eye-tracking data were discarded from analysis.
Ideal preprocessing of eye-tracking data should yield data that represent artifact-free and task-relevant eye movement. Yet, in infant studies, raw eye-movement data are typically only preprocessed in terms of detecting saccades or fixations by applying built-in algorithms of the eye-tracking system at hand (e.g., Gredebäck and Melinder, 2010). Recently, Wass et al. (2014) demonstrated that data quality affects fixation detection to such an extent that the interpretation of the results is put into question – even when a satisfactory calibration outcome is achieved. Moreover, comparing common categorization algorithms, it has been shown that results for fixations and saccades vary to such an extent that automated categorization may not always return meaningful results (Komogortsev et al., 2010; see Wass et al., 2013, for calculation of data quality post hoc).
In order to avoid classification artifacts and to account for data quality, raw gaze positions (i.e., x- and y-value in pixel per measurement unit) were visually inspected using a custom-made graphical user interface (GUI, see Supplementary Material) in MATLAB 7.10.0 (MathWorks Inc., Natick, MA, United States) to detect trials with measurement errors (i.e., noisy or no data, e.g., following gross movement, substantial changes in body/head position, or changes in the eyes’ lubrication) and compliance failure (e.g., gazing away from or staring blankly at the monitor; see Haith, 2004; Schneider et al., 2008; Wass et al., 2014). More precisely, raw data were segmented into 3400 ms long epochs from -2200 ms to 1200 ms relative to the onset of occlusion. The first and last 300 ms of each trial were discarded from analysis because (a) following stimulus onset, infants reoriented from the centered fixation object to the stimulus movement starting on either the left or right side of the monitor, and (b) approaching stimulus offset, infants’ attention frequently terminated. The extracted segments were displayed neutral with respect to condition, movement direction, and test session to avoid confounding influence. The stimulus dimensions (i.e., x- and y-values in pixel) for each video-frame were derived using OpenCV2 by defining the color contrast separating colored stimulus and grayish background. Stimulus dimensions were included in the GUI to map gaze positions to actual stimulus position. Only trials with less than 50% missing data (incl. data points beyond the monitor) were considered for inspection.
Each trial was visually scanned by a trained rater (CB) according to the persistent or repeated presence of the following exclusion criteria: (a) missing gaze positions, gaze positions outside and/or on the borders of the monitor shortly before, during, and/or following the occlusion in order to make sure that transitions were actually perceived, (b) noisy and/or broken data resulting from technical error, (c) prolonged stationary data points reflecting blank stares without following of the stimulus movement. In principle, trials could be associated with more than one criterion. Missing or outlying data points at the beginning and end of the trial were not regarded as an exclusion criterion. Trials that were identified as being of poor quality were discarded from further analyses (see Supplementary Material). In ambiguous trials, video-recordings of the experimental session were used to inform the decision.
Following visual inspection, the percentage of trials available for eye-tracking analysis was calculated relative to the number of trials that the infant had actually watched during stimulus presentation, based on behavioral coding of video-recordings. Only data from infants providing at least 10% artifact-free trials were considered for further analyses.
As the movement was mainly evolving on the horizontal axis across time, only raw gaze positions (in pixel) on the x-dimension (Gx) were used. Within subjects, gaze positions were averaged per condition for each measurement point (i.e., every 4 ms). Data for movement from right to left were flipped, so all trials were available in the left-to-right direction. Data on either the y- and/or x-axis that were outside of the monitor’s dimensions were considered missing, and this was also applied to the corresponding gaze position on the other axis. Missing values were discarded before averaging.
The analysis focused on infants’ gaze behavior in reaction to the moving stimulus. However, it is difficult to quantitatively determine gaze relative to moving objects based on raw gaze positions. To relate gaze and stimulus position, the midpoint of the minimal and maximal x-value of the stimulus dimension per video frame (see Preprocessing) was determined as mean stimulus position (in pixel). Due to the biological characteristics of crawling (i.e., stretching and flexing of extremities), the stimulus dimensions vary from frame to frame and thus the mean stimulus position over time does not represent a linear movement (see black dotted line in Figure Figure3A3A). Following, at each measurement point, the respective mean stimulus position was subtracted from the raw gaze position, resulting in a difference score that reflects the distance between gaze position and stimulus position. Thus, if infants were looking at the front parts of the stimulus target (i.e., baby’s hands and head), the resulting scores would be positive (and vice versa). Resulting difference scores were averaged for each measurement point per condition within each participant.
For statistical analysis, within subjects, the mean distance as well as the variance in distance between gaze and stimulus position were calculated for each trial across predefined 500 ms time windows for each phase of the trial (i.e., the last 500 ms of the pre-occlusion, the 500 ms of the occlusion, and the first 500 ms of the post-occlusion phase), and resulting means and variances, respectively, were averaged per condition. The two measures reveal different aspects of viewing behavior: Mean distance represents the average gaze position relative to the target position, and was thus taken to reflect tracking accuracy. Variance in distance represents the average fluctuation in tracking behavior, and was thus taken to reflect tracking consistency (i.e., whether tracking was rather consistent or random across infants).
Electroencephalography was recorded continuously with a BrainAmp DC amplifier (BrainProducts GmbH, Gilching, Germany) from 32 active electrodes (actiCap by BrainProducts) inserted into a soft elastic cap according to the 10–20-system (EASYCAP GmbH, Herrsching, Germany). During recording, the right mastoid electrode served as reference and the left mastoid was recorded as an additional channel. Ground was placed at location AFz. Impedances were kept below 20 kΩ during preparation. The EEG was recorded with an analog pass-band of 0.1 to 250 Hz and digitized with a sampling rate of 1000 Hz.
Prior to EEG-preprocessing, based on behavioral coding of video-recordings, trials were discarded if infants (a) did not attend to the total duration of stimulus presentation and (b) produced limb movement that could be seen as part of imitative crawling (i.e., limb movements involving movement of the joints wrist, elbow, shoulder as well as knee and hip). The latter criterion was chosen because we were interested in brain activity related to action observation but not to imitation. Furthermore, using Vision Analyzer 2 (Brain Products) for visual inspection, EEG trials were discarded which comprised broken channels or extreme/untypical artifacts (i.e., extensive movements). To this end, remaining EEG data were segmented into 4700 ms long epochs (from -2700 to 2000 ms relative to the onset of occlusion). Subsequent preprocessing and analyses were conducted using the FieldTrip (developed at the F.C. Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands3; Oostenveld et al., 2011) and custom-made routines operated in MATLAB 7.10.0 (MathWorks Inc., Natick, MA, United States).
Data were cleared of stereotypic artifacts using Independent Component Analysis (ICA; Jung et al., 2000). Specifically, ICs representing eye blinks, saccades, muscle activity, or instrumental noise were visually identified and discarded from further analysis by a trained rater (CB). To this end, all selected segments across all conditions were concatenated within subjects, filtered (high pass 1 Hz, low pass 100 Hz, 6th-order Butterworth-filter), and subjected to an extended infomax ICA (Bell and Sejnowski, 1995). A DFT-filter as implemented in FieldTrip was used to suppress line-noise. Decisions for rejection were based on integrated information from the ICs topography, power spectrum, event-related potentials (ERPs) as well as individual trials and the distribution of the IC over trials. Rejected ICs were in accordance with previous reports on typical artifacts in EEG data when stimulus presentation elicited eye-movements in a passive viewing paradigm (e.g., Plöchl et al., 2012).
All subsequent analyses were carried out in sensor space, based on the back-projection of the non-artifact ICs. Previously identified broken channels were interpolated after ICA-cleaning; a maximum of three non-neighboring channels were allowed. Cleaned data was re-referenced to the mathematically linked mastoids, filtered (high pass 1 Hz, low pass 30 Hz, 6th-order Butterworth-filter), and segmented into 4000 ms epochs according to the onset of occlusion (-2480 to 1520 ms). For each single trial, the offset was removed by subtracting the average of the total epoch.
Rhythmic neural activity was analyzed by means of fast Fourier transformation (FFT) using an individualized data approach taking idiosyncrasies into account (Nesselroade et al., 2007). That is, we identified the individual peak frequency (IPF) at the individual peak electrode (IPE) in a given electrode cluster and frequency range (Doppelmayr et al., 1998; Werkle-Bergner et al., 2009). In line with the literature, frontal theta activity, considered as reflecting mnemonic processing (see Saby and Marshall, 2012 for a review), was defined as oscillatory activity within 4–6 Hz at frontal electrodes F3, Fz, F4, FC1, and FC2 (Orekhova et al., 1999, 2006). Central alpha activity, assumed to indicate sensorimotor simulation (for reviews, see Marshall and Meltzoff, 2011; Cuevas et al., 2014), was defined as oscillatory activity within 6–9 Hz at central electrodes FC1, FC2, C3, Cz, C4, CP1, and CP2 (Stroganova et al., 1999; Marshall et al., 2002).
To detect individual peak frequencies, the spectral power distribution between 1 and 20 Hz at each electrode was estimated by means of FFT across all trials and phases (i.e., from -2480 ms to 1520 ms with regard to occlusion onset). Each trial was zero-padded to 10 s and tapered with a Hanning window to achieve a frequency resolution of 0.1 Hz. The power spectra were corrected for the 1/f trend inherent in scalp EEG data to facilitate the detection of spectral peaks (Demanuele et al., 2007; He et al., 2010). When no IPF was detected, the missing values were interpolated with the mean of all detected peaks to preserve comparable samples for the EEG measures. There was one missing value for frontal theta and central alpha each. These missings were not detected in the same participants across EEG measures.
For analyses of modulations in rhythmic neural activity, FFT was performed separately for each phase of the trial (i.e., pre-occlusion, occlusion, post-occlusion). As the phases (i.e., pre-occlusion, occlusion, post-occlusion) of each trial varied in length, the data were again zero-padded to 10 sec prior to FFT calculation, resulting in a common frequency resolution of 0.1 Hz. Power values for each phase of the trial and experimental condition were extracted for each participant at the respective IPF and electrode after averaging across trials within participants. For each condition, data were collapsed across movement directions (i.e., left to right and right to left) to obtain enough trials for statistical comparison. As the distribution of power values was skewed, data were log-transformed prior to the analysis4.
To provide rich information on infants’ tracking behavior over the course of the stimulus movement, mean horizontal gaze positions as well as mean horizontal distance in gaze and stimulus positions over time were described qualitatively. In addition, statistical analyses were done using SPSS 15.0 (SPSS Inc., 1989–2006, United States). Specifically, mixed effects repeated-measures ANOVAs with a between-subject factor Group (Delay group vs. Forward group) and the within-subjects factors Phase (pre-occlusion vs. occlusion vs. post-occlusion phase) and Time (continuous vs. non-continuous) were carried out separately for each measure of eye movement (i.e., mean distance, variance in distance) and rhythmic neural activity (i.e., frontal theta, central alpha). Including Phase makes it possible to check that differences in dependent variables occur only after the time-course manipulation was introduced, namely during the post-occlusion phase. Partial eta squared, , is reported as an estimate of the effect size. Greenhouse-Geisser corrections were applied if the assumption of sphericity was violated. As group sizes were equal, ANOVA was assumed to be robust toward violation of the assumption of homogeneity. Significant effects were followed up by separate Bonferroni-corrected ANOVAs or t-tests.
Mean horizontal gaze positions over time are shown in Figure Figure33.
Notably, the grand averages reflected the individual data (Figure Figure3B3B) suggesting that tracking was rather consistent across infants. In sum, average raw gaze positions over time indicate that infants were sensitive to manipulations in the timing of observed movements.
The average horizontal distance in gaze and stimulus position over time is shown in Figure Figure44.
Like mean horizontal gaze positions, grand averages of mean horizontal distance in gaze and stimulus positions were representative of individual data, which were actually highly systematic across conditions and individuals (Figure Figure4B4B) highlighting that tracking behavior was rather consistent across participants. Overall, these results indicate that infants were able to detect slight temporal shifts in the continuation of transiently occluded movements.
To analyze the mean distance as a marker for tracking accuracy in 500 ms time windows before, during, and following occlusion, a mixed effects repeated-measures ANOVA was performed. The results showed a significant main effect of the within-subjects factor (a) Phase [F(1.6,97.9)= 130.25, p = 0.000, = 0.68]. Furthermore, there were significant interaction effects for (b) Phase and Time [F(1.6,97.1)= 4.59, p = 0.012, = 0.07], (c) Time and Group [F(1,61)= 10.37, p = 0.002, = 0.15], and (d) Phase, Time, and Group [F(1.6,97.1)= 17.1, p = 0.000, = 0.22]. No further effects were observed (F < 3.06, p > 0.085). Figure Figure55 provides an overview of the results for mean distance and variance in distance.
To evaluate the (d) three-way interaction effect, a total of six paired-sample t-tests were performed, separately per levels of Group and Phase. The results showed that, during post-occlusion, the Delay group tracked continuous movements (M = -47.88, SE = 9.98) at more rear parts than non-continuous movements [M = 0.9, SE = 13.8; t(31) = -3.25, p = 0.003; pre-occlusion: t(31) = 0.54, p = 0.595; occlusion: t(31) = 1.51, p = 0.142], whereas the Forward group tracked continuous movements (M = -12.91, SE = 10.34) more frontally than non-continuous movements [M = -58.18, SE = 9.92; t(31) = 3.69, p = 0.001; pre-occlusion: t(30) = 2.1, p = 0.03; occlusion: t(30) = 2.0, p = 0.05].
In sum, these results indicate that infants differentiated continuous from non-continuous movements following occlusion. However, as already indicated in the qualitative description of average distance over time (see Qualitative Description of Distance in Gaze and Stimulus Position over Time), continuous movement was apparently not tracked similarly across experimental groups: Corresponding to our hypotheses infants in the Forward group tracked continuous movements more accurately but undershot forwarded continuations. Counter to expectations, infants in the Delay group did not overshoot delayed, but undershot continuous movements.
To analyze the variance in distance as a marker of tracking consistency in 500 ms time windows before, during, and following occlusion, a mixed-effects repeated-measures ANOVA was calculated. This revealed significant main effects of the within-subjects factor (a) Phase [F(1.7,104.2) = 24.72, p = 0.000, = 0.29] and the between-subjects factor (b) Group [F(1,61) = 4.69, p = 0.034, = 0.07]. No further effects were found (all F < 2.25, all p > 0.110).
Using paired-sample t-tests to follow up on the main effect of (a) Phase indicated that variance in distance was highest during post-occlusion [M = 4369.1, SE = 369.07; all t(62) > 4.31, all p = 0.000]. Variance in distance was also higher during occlusion (M = 2958.11, SE = 295.13) compared to pre-occlusion [M = 1842.46, SE = 189.36, t(62) = 3.54, p = 0.001].
To follow-up on the main effect of (b) Group, an unpaired t-test showed that variance in distance was higher in the Delay group (M = 3487.29, SE = 314.94) than in the Forward group [M = 2611.92, SE = 250.87; t(61) > 2.16, p = 0.034].
In sum, variance in distance increased due to transient occlusions. In addition, tracking was less consistent overall when infants watched continuous and delayed crawling versus continuous and forwarded crawling.
Taken together, both qualitative and statistical analyses of gazing behavior combine to provide a consistent picture: Results indicate that infants detected slight manipulations of the time course of an observed movement. Specifically, infants watching continuous and forwarded movements produced a tracking pattern consistent with the hypothesis of internal real-time simulation of observed movements during a transient occlusion (Graf et al., 2007). In contrast, infants watching continuous and delayed movements, albeit discriminating both conditions, produced a tracking pattern suggesting that real-time representations were not always precise (enough) or possibly altered by further processing (e.g., learned expectations across repeated presentations).
To analyze mnemonic contributions to time-course representations, a mixed effects repeated-measures ANOVA was calculated for frontal theta activity. Results showed a significant main effect of Phase [F(1.55,2.06) = 5.72, p = 0.009, = 0.57] without evidence for further effects (all F < 1.41; all p > 0.250). Figure Figure66 and Table Table33 provide an overview of the EEG results. Hence, counter to expectations, no differential activation of frontal theta activity was found, indicating that the manipulation of the time course of ongoing movement did not elicit differential demands on memory processes.
To analyze contributions from sensorimotor simulation to time-course representations, a mixed effects repeated-measures ANOVA was performed for central alpha activity. A significant interaction effect of Phase and Time occurred [F(1.9,91.5) = 3.61, p = 0.031, = 0.07]. No further effects were observed (all F < 2.14, all p > 0.123).
As also implied by the small effect size, follow-up repeated measures ANOVAs separately per level of Phase, did not yield significant effects (all F < 2.64, all p > 0.110). From the inspection of results as displayed in Figure Figure66 it may be concluded that, during post-occlusion, central alpha activity was lower for non-continuous than for continuous movements. Hence, in line with our hypothesis, our findings suggest that the cortical sensorimotor system is involved when infants render real-time simulations of transiently occluded movements that are within their motor repertoire.
This study explored the internal representation of the time course of observed movement. To this end, 10-months-old crawling infants watched videos of a same-aged crawling baby that was transiently occluded and reappeared in a time-matching (i.e., continuous) or non-matching (i.e., delayed vs. forwarded) manner. To tap mnemonic and sensorimotor contributions to time-course representations, eye movement and rhythmic neural activity were simultaneously measured. First, the results suggest that sensorimotor functions were recruited more during the perception of non-matching continuations following occlusion. In contrast, there was no evidence for a differential role of mnemonic functions for time-course representations. Secondly, eye movements differentiated between time-matching and non-matching continuations following occlusion indicating a high sensitivity to the movements’ time course. In sum, we conclude that 10-month-old infants generate internal movement representations that reflect the timing of observed movements. This corresponds to the internal real-time simulation account of action observation (Graf et al., 2007).
To investigate infants’ sensitivity to the time course of observed movements, we assessed eye-tracking patterns in response to a transiently occluded human movement. Our findings showed that 10-month-old infants distinguished between temporally matching and temporally shifted (i.e., delayed vs. forwarded) continuations following occlusion as demonstrated by differences in the mean distance in gaze and stimulus position.
Previous studies have indicated that 4- to 7-month-old infants are largely insensitive to a manipulation in the timing of an object’s motion during occlusion, in that temporal violations were only detected in extreme cases (i.e., instantaneous reappearance on the other side of an occluding board; Wilcox and Schweinle, 2003; Bremner et al., 2005). Only at the age of 2 years did toddlers’ searching behavior demonstrate an understanding for the relation between time, velocity, and distance when a train went through a tunnel (Möhring et al., 2012). Adults were more accurate in identifying one of multiple moving objects when the objects instantaneously disappeared and reappeared at the position they had vanished or even before that position but not when the objects reappeared at a linearly extrapolated position along their movement trajectory (Keane and Pylyshyn, 2006). Nevertheless, the present study illustrates 10-month-old crawling infants’ sensitivity to slight temporal shifts when observing videos of a crawling baby.
We can think of at least three possible reasons why infants in the present study were able to detect temporal changes. First, manipulation in the timing of an object’s motion, as carried out in previous infant studies (Wilcox and Schweinle, 2003; Bremner et al., 2005), might be processed differently than manipulation in the timing of a human action because body form and dynamics offer rich information on, for instance, changes in velocity or direction (Hernik et al., 2014; Wronski and Daum, 2014). This notion corresponds to studies in adults showing that occluded human actions are internally simulated in real-time (Graf et al., 2007; Parkinson et al., 2012; Springer et al., 2013). Moreover, actions with natural human kinematics have been found to be more accurately predicted than those with artificial ones (Stadler et al., 2012). Similarly, proficient motor experience has been shown to enhance prediction of reappearance positions (Stapel et al., 2016).
Second, previous studies predominantly investigated object motion during the 1st months of life only (e.g., von Hofsten et al., 1998; Wilcox and Schweinle, 2003; Bremner et al., 2005), whereas the present study investigated human motion in 10-month-olds. Though the developmental trajectory of time-course representation is poorly understood to date, one may assume that older infants are better at solving temporal shifts in movement (cf Kopp, 2014), irrespective of the observed target.
Third, in most studies, data on infants’ gazing behavior are reduced to a selection of putatively relevant aspects, for example, to overall looking time following habituation (e.g., Bremner et al., 2005) or to predictive looking at the end of an observed action (e.g., Henrichs et al., 2014). While the data reduction approach has doubtlessly provided interesting information, it may also have prevented researchers from discovering further early capabilities (see also Roberts, 2004). Here, rich data on the gaze progression over time were analyzed, demonstrating 10-month-old infants’ spatiotemporal sensitivity while observing continuous and time-manipulated human movement that was within their own motor repertoire.
To explore the neural basis of internal real-time processing, we assessed rhythmic neural oscillations related to mnemonic (i.e., frontal theta) and sensorimotor processing (i.e., central alpha) while infants were observing movements that were either time-matching or non-matching following a transient occlusion.
Frontal theta activity did not differ between time-matching and non-matching continuations. Thus, we found no evidence that slight time-course manipulations in ongoing movement pose differential mnemonic demands on 10-month-old infants. Frontal theta, as measured here, is thought to implement a neural accumulator (Bland and Oddie, 2001; van Vugt et al., 2012) assisting in maintaining and integrating extracted information across time and space (e.g., Miller and Cohen, 2001; Simons and Spiers, 2003). Correspondingly, it has been shown that, in 10-month-old infants, mnemonic functions support the binding of pre- and post-occlusion movement input into a coherent and unified percept (Bache et al., 2015). The present finding, however, modifies the notion of mnemonic contributions, suggesting that precise temporal representations for movement integration may not be provided by mnemonic functions alone (Wilson, 2001; Coppe et al., 2010).
For central alpha activity, we found a significant interaction effect between the timing of movement (i.e., continuous vs. non-continuous) across the phases of the trial (i.e., pre-occlusion, occlusion, post-occlusion). Although it was not possible to discern the direction of the effect in follow-up analyses, inspection of Figure Figure66 suggests differences between time-matching and non-matching continuations following occlusion. Central alpha, as observed here, has been associated with sensorimotor simulation during movement observation (Cochin et al., 1999; Muthukumaraswamy et al., 2004; Marshall et al., 2011). Therefore, the present findings indicate sensorimotor involvement in the internal simulation of the timing of human movement. This interpretation is also supported by concurrent findings on eye movements (as described above), suggesting that the non-reliable differences in neural activity may not be due to infants’ lacking capabilities to detect differences in movements’ time courses.
Behavioral and neuroimaging studies in adults and infants suggest a crucial role of sensorimotor brain areas in timed internal simulation (e.g., Schubotz and von Cramon, 2002; Graf et al., 2007; Southgate et al., 2009; Stadler et al., 2011; Cross et al., 2012; Elsner et al., 2013; Springer et al., 2013; Stapel et al., 2016). Such a predictive function of the motor system may allow reduction of the processing delay in sensory–motor loops, which pose a fundamental challenge to proactive control of perception and behavior (e.g., Blakemore and Frith, 2005; Schubotz, 2007). However, simulating sensorimotor consequences in real-time may not (yet) be fast, stable, or precise enough in 10-month-old crawlers observing a crawling movement (see Wolpert and Flanagan, 2001).
Effects of either delayed or forwarded continuations were most obvious when comparing time-matching continuations between the two groups (Delayed and Forwarded). We assumed that, if occluded movement was internally simulated in real-time, infants would undershoot reappearance positions in forwarded continuations and overshoot them in delayed continuations, whereas infants would accurately track reappearance positions in continuous movements. Results showed that infants alternately watching continuous and forwarded movements produced a tracking pattern consistent with this hypothesis. However, infants alternately watching continuous and delayed movements undershot time-matching continuations and overshot delayed continuations. In fact, the tracking patterns of both experimental groups were found to be unexpectedly overlapping (see Figure Figure44): Infants watching continuous and delayed movements tracked the continuous movement in a similar way as infants watching continuous and forwarded movements tracked the forwarded movement. Vice versa, infants watching continuous and forwarded movements pursued the continuous movement in a similar way as infants watching continuous and delayed movements pursued the delayed movement. Moreover, tracking was less consistent across infants, when infants watched continuous and delayed continuations in contrast to continuous and forwarded continuations. Note, however, that the variation between conditions is a between subject comparison, i.e., two different groups of subjects performed delayed and forwarded conditions.
Though illustrating infants’ remarkable sensitivity to an action’s time course, these findings cannot solely be explained in terms of internal real-time processing. We can, however, only speculate as to which processes may have contributed to the pattern of results.
First, the present findings suggest that delayed and forwarded time-shifts in observed human action are not processed similarly (Bremner et al., 2005; Striano et al., 2006). This corresponds to adult studies showing that adults judged the continuation of actions following an occlusion to be continuous when it was in fact slightly delayed while slightly forwarded continuations were judged correctly as forwarded (e.g., Sparenberg et al., 2012). Switching from tracking external motion to internally representing motion may be costly and may thus lead to misaligned internal processing (Sparenberg et al., 2012; see also Mitrani and Dimitrov, 1978). In line with this notion, it is not obvious whether infants in the present study detected delayed continuations as manipulated in time. Future studies are needed to pinpoint the threshold at which time-matching and non-matching continuations are experienced as equal to determine potential switching costs early in life.
Second, the present findings may indicate that continuous movements are not always perceived as such (see also Adler et al., 2008). An influence of the stimulus context on action perception may be explained in accordance with priming effects (e.g., Pavlova and Sokolov, 2000). For example, when adults first performed a seemingly unrelated motor task (e.g., arm movement) and later observed movements corresponding to the motor task (i.e., arm movement) and non-corresponding (i.e., leg movement), the evaluation of the timing of movement continuations following occlusion was facilitated in corresponding conditions (Springer et al., 2013). Priming during action observation has also been reported in infant populations (e.g., Daum and Gredeback, 2011). From this perspective, non-matching conditions here may have served as the prime altering the processing of the time-matching condition. Future studies may disentangle whether and how time-shifted movements can change the perception of alternately presented continuous movements.
Third, it is possible that expectations based on learning across the repeated/blocked presentation of conditions may have contributed to the present results. This may be assumed because infants seem to have adapted their gaze position according to the expected reappearance position when approaching the occlusion offset (see Figure Figure44). Specifically, they looked slightly further back in delayed and slightly further forward in forwarded movements. In addition, following occlusion, there was a tendency to undershoot movements irrespective of the actual condition, which may be interpreted as an overall conservative strategy to stay on the target following a transient full-screen occlusions (cf. Stapel et al., 2016). At the same time, differences in tracking following occlusion suggest that infants did not learn that the stimulus’ reappearance position was kept identical in all conditions (see Stimulus Material and Procedure). Future studies should clarify whether and how learning may contribute to internal time-course representations when infants observe repetitive human movements.
There was a considerable drop-out on the level of trials and participants in both eye and brain measures. High attrition rates of 25–75% are commonly observed in EEG studies with mobile infant populations (see de Haan, 2007; for a meta-analysis see Stets et al., 2012). In eye-tracking studies with infants, drop-out on the level of trials and participants has not been documented consistently. Concurrent preparation of both EEG and eye-tracking reduces potential testing time and challenges infants’ compliance (e.g., see number of infants who could not be properly tested in Participants). Furthermore, both methods are sensitive to gross body and head movements that may result in a critical loss of data. In addition, eye-tracking is sensitive to repeated, persistent, and substantial changes in the position of the eyes (due to changes of head and/or body position), and measurement quality decreases over time in head-free recording (Holmqvist et al., 2011). At the same time, multiple repetition of the stimulus material is required for EEG to reduce noise in the signal. Therefore, it seems reasonable to assume comparable drop-out rates for eye-tracking and EEG data, and, potentially, overall higher attrition in simultaneous measurement in comparison to single measurement of either brain or eye data. Furthermore, not all participants can be expected to contribute (enough) data to both measures.
As a consequence of high attrition, it was not possible here to directly relate EEG and eye-tracking measures (see also Stapel et al., 2010). Furthermore, it cannot be excluded that attrition was selective for infants who complied better with testing requirements (e.g., Marshall et al., 2009) restricting the generalizability of effects. Moreover, due to infrequent and random contribution of data (see Figure Figure11), a systematic analysis of tracking over time (i.e., within and across blocks) was not conducted, because it would have required reducing the number of available trials and participants substantially.
From a methodological perspective, eye movements elicited during action perception add a source of artifacts to the EEG measurement potentially distorting the results. In adults, it has been shown that eye tracking data measured simultaneously with EEG can be used to identify and correct for those artifacts (e.g., Dimigen et al., 2011; Plöchl et al., 2012). In contrast, in infants, automated approaches to clean EEG of stereotypic artifacts are lacking. Here, we visually identified ICs representing eye movement related artifacts. Even though the ICA produced meaningful results in accordance with the adult literature, we cannot be certain whether artifacts were sufficiently removed in all data because eye and brain data could not directly be related as discussed above.
In this study, an experimental paradigm previously used to investigate internal real-time processing during action perception in adults (e.g., Graf et al., 2007) was successfully adapted and applied to an infant population. We found that 10-month-old crawlers are able to detect slight manipulations of the timing of observed crawling movements as reflected in infants’ tracking and neural patterns. This suggests a remarkable sensitivity to spatiotemporal information about external events early in life.
The study was conducted in the Baby Laboratory at the Max Planck Institute for Human Development, Berlin, Germany. The study was approved by the Institute’s Ethics Committee in accordance with the Declaration of Helsinki. Participants were recruited from a database of parents interested in participating in infant studies at the Max Planck Institute for Human Development. Parents gave their written informed consent upon arrival.
CB, AS, WS, FK, and UL conceived and designed the study, CB collected the data, CB, HN, and MW-B analyzed and interpreted the data, CB drafted the manuscript, all authors revised the work and approved the final version for publication.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer SH declared a shared affiliation, though no other collaboration, with several of the authors AS, WS to the handling Editor, who ensured that the process nevertheless met the standards of a fair and objective review.
This research was supported by the Max Planck Research Network for the Cognitive and Neurosciences (Maxnet Cognition) and funded by the Max Planck Society. MW-B’s work is supported by a grant from the German Research Foundation (DFG, WE 4269/3-1) as well as an Early Career Research Fellowship 2017 – 2019 awarded by the Jacobs Foundation. The study was conducted in partial fulfillment of the doctoral dissertation of CB. CB received training and financial support from the International Max Planck Research School on the Life Course (LIFE, http://www.imprs-life.mpg.de). We cordially thank the infants and their parents for participating in this study and our student assistants for their support in data collection, coding, and preprocessing. We further wish to thank Berndt Wischnewski for technical assistance and Julia Delius for editorial help.
1Please note that beside mnemonic functions frontal theta rhythms have also been associated with cognitive control, action monitoring, and error processing (e.g., Cavanagh and Frank, 2014; Cohen, 2014 for recent overviews). But given the current task design, those functions are beyond the scope of the present study.
4Comparable results were obtained in non-log-transformed data after exclusion of outliers (>mean ± 3SD).
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.01170/full#supplementary-material