The experiments were performed on five macaque monkeys (Macaca mulatta, 3–6 kg, 4–7 years of age), including four males (M3, M5, M6 and M15) and one female (M13). All animal care and experimental procedures met the national and European guidelines and were approved by the ethical committee of the K.U. Leuven medical school. The details of the surgical procedures, training of monkeys, image acquisition, eye monitoring and statistical analysis of monkey scans have been described previously (Vanduffel et al., 2001; Fize et al., 2003; Nelissen et al., 2005), and will be described here only briefly.
During the experiments the monkeys sat in a sphinx position in a plastic monkey chair directly facing the screen. In the training and scanning sessions they were required to maintain fixation within a 2 × 2° window centered on a red dot (0.35 × 0.35°) in the middle of the screen. Eye position was monitored at 50 Hz or 120 Hz (in later experiments) via the pupil position and corneal reflection (Iscan). During scanning the fixation window was slightly elongated in the vertical direction to 3°, to accommodate an occasional artifact on the vertical eye trace induced by the scanning sequence. The monkeys were rewarded (with apple juice) for fixating the small red dot within the fixation window for long periods (several minutes), while stimuli were projected in the background. Before each scanning session, a contrast agent, monocrystalline iron oxide nanoparticle (MION or Sinerem®), was injected into the femoral/saphenous vein (6–11 mg/kg).
Visual stimuli were projected from a Barco 6300 liquid crystal display projector (1024 × 768 pixels; 60 Hz) onto a screen 54 cm from the monkeys' eyes. Unless otherwise mentioned, all tests included a simple fixation condition, in which the fixation target was shown on an empty gray screen, as baseline.
In Experiment 1, three monkeys (M3, M5 and M6) were scanned in a 1.5 T scanner and one animal (M15) in a 3 T scanner with custom-made 8-channel monkey Rx coils to achieve higher signal-to-noise ratios in the anterior STS. Stimuli consisted of video-clips showing a hand (and forearm) grasping and picking up an object (‘isolated hand’ action, Supp. Video S1, 13 by 16° size) and video-clips showing a full view of a person performing the same actions (‘acting person’, Supp. Video S2, 18 by 20° size). Four different ‘isolated hand’ action video sequences were used: a male or female hand grasping and picking up a candy (precision grip) or a ball (whole hand grasp). These video sequences lasted 3.3 s and 11 randomly selected sequences were presented in a 36 s block. Six different ‘acting person’ video sequences, lasting 6 s, were presented in random order in a block: a man or woman grasping and picking up an apple (whole hand grasp), a piece of carrot or a peanut (precision grip). The longer duration of the ‘acting person’ videos was due to the larger number of static frames at the beginning and end of the videos. The duration of the actual hand movement period was similar in the two movies: 2.1 s on average in the ‘hand action’ videos and 2.7 s in the ‘acting person’ videos. Two types of control stimuli were used: a) static single frames of the action videos, one from the middle of the video sequence when the hand is about to grasp the object and one from the end of the video sequence when the object has been picked up; b) scrambled videos produced by phase scrambling each frame of the video sequences (Supp. Video S3). Static stimuli were refreshed every 3.3 s or every 6 s by showing a frame selected from one of the four ‘isolated hand action’ videos or from one of the six ‘acting person’ videos respectively. The acting person videos differed from the static controls by the 2.7 s dynamic period, as shown by the strong differential activity in MT/V5 (see results).
The fixation condition used a white or green background, matched to those in the ‘acting person’ and ‘hand action’ videos, respectively. In each case, 5 different runs with different orders of conditions were used. Within a run, the order remained constant, and conditions were repeated once. The stimulus conditions are the same as those used in the Nelissen et al. (2005) study. For two monkeys (M5 and M6) we used the parietal and STS data obtained in this earlier study, while two additional monkeys (M3, M15) were scanned.
In Experiment 2 (M6, M13), we performed an additional action control test that contrasted the responses to goal-directed (object present) and mimicked (no object present) ‘isolated hand’ actions with those to either a translating hand (Supp. Video S4) or a static single frame. The goal-directed hand action video-clips were the same as those used in Experiment 1. The mimicked hand action clips showed an isolated hand mimicking a grasping action (Supp. Video S5). The translation controls were introduced because many regions in the middle and posterior STS are involved in visual motion analysis (Zeki, 1974; Maunsell and Van Essen, 1983; Desimone and Ungerleider, 1986; Vanduffel et al., 2001; Nelissen et al., 2006). The action and static stimuli conditions were the same as those in Experiment 3 of Nelissen et al. (2005). The monkeys scanned in that experiment were different animals from those employed in the present study.
In the analyses of both experiments, regions of interest (ROIs) were defined in the anterior portion of STS and in the IPS using the hand action test of Experiment 2 of Nelissen et al. (2005) as an independent action localizer test (monkeys: M1, M3, M5). This action localizer, also used in Nelissen et al. (2006), consisted of videos showing ‘isolated hand’ actions (3.3 s for 1 cycle; 4 different video sequences presented in random order) and a static (single frames from the middle of the videos shown for 36 s) and scrambled control.
Functional time series (runs) in Experiments 1 and 2 and in the action localizer test consisted of gradient-echo echoplanar whole-brain images acquired on a 1.5 T Siemens Sonata with a surface coil positioned over the head (repetition time (TR) 2400 ms; echo time (TE) 27 ms; 32 sagittal slices, 2 mm isotropic voxels). For one animal (M15) functional time series in Experiment 1 consisted of gradient-echo echoplanar whole-brain images acquired on a 3 T Siemens TIM Trio with a custom-built 8-channel receive coil (TR 2000 ms, TE 17 ms, 30 horizontal slices, 1.5 mm isotropic voxels).
In the 1.5 T scan sessions of Experiment 1, the ‘isolated hand action’ videos (and controls) and the ‘acting person’ videos (and controls) were presented in alternate runs. In addition, after six such runs, in which the monkey was passive, an active run was introduced in which the monkey had to detect the change in orientation of a small bar shown in the center of the screen while the action stimuli were presented in the background. The aim of this run was merely to enhance the alertness of the subject and these data were not analyzed. After this active run, another set of six passive runs was collected, with the cycle of six plus one run being repeated once or twice in a session. In the 3 T scan sessions of Experiment 1 and in Experiment 2, only passive runs were collected, with little difference in the results.
For each monkey an anatomical (three-dimensional magnetization prepared rapid acquisition gradient echo, MPRAGE) volume (1×1×1 mm voxels) was acquired under anesthesia in a separate session.
Volume-based data analysis
Data were analyzed using SPM5 and Match software. Only runs in which the monkeys held fixation within the window for >85% of the time were analyzed. In these analyses, realignment parameters, as well as eye movement traces, were included as covariates of no interest to remove eye movements and brain motion artifacts. Spatial preprocessing consisted of realignment and rigid co-registration with a template anatomy (M12, corresponding to M1 in Ekstrom et al., 2008). To compensate for echo-planar distortions in the images and for inter-individual anatomical differences, the functional images were warped to the template anatomy using the non-rigid matching software, BrainMatch (Chef d'Hotel et al., 2002). The functional volumes were then re-sliced to 1 mm³ isotropic and smoothed with an isotropic Gaussian kernel (FWHM = 1.5 mm). Group analyses (fixed effects) were performed with an equal number of volumes per monkey, supplemented with single-subject analyses, and the level of significance was set at p < 0.05 corrected (family-wise error) for multiple comparisons, unless stated otherwise.
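The run-inclusion rule stated above (fixation within the window for more than 85% of the time) can be illustrated with a minimal sketch. The function names, the run labels and the representation of the eye trace as a list of in-window booleans are hypothetical choices made for illustration; they are not taken from the analysis software used in the study.

```python
def fixation_fraction(in_window):
    """Fraction of eye-position samples that fall inside the fixation window."""
    return sum(in_window) / len(in_window)


def select_runs(runs, threshold=0.85):
    """Return the names of runs whose fixation fraction strictly exceeds the threshold."""
    return [name for name, samples in runs.items()
            if fixation_fraction(samples) > threshold]


# Hypothetical eye traces: True = sample inside the fixation window.
runs = {
    "run01": [True] * 90 + [False] * 10,  # 90% in window: kept
    "run02": [True] * 85 + [False] * 15,  # exactly 85%: rejected, criterion is ">"
    "run03": [True] * 70 + [False] * 30,  # 70% in window: rejected
}
kept = select_runs(runs)
```

The strict inequality mirrors the ">85%" wording of the criterion.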
Region of Interest (ROI) based analysis
Sixteen different ROIs, 10 within the STS and 6 in the parietal cortex, were defined on the anatomical template (M12).
Localization of the IPL, IPS and STS ROIs
The 10 STS ROIs consisted of 6 motion-sensitive regions defined in Nelissen et al. (2006), plus 4 additional regions defined by the action localizer test. The motion-sensitive regions are located in the caudal (MT/V5, MTp, dorsal part of MST [MSTd], FST) and middle portions of the STS (LST and STPm). These latter two regions are considered here part of the anterior STS (aSTS). The MTp region corresponds to the MSTv region of Nelissen et al. (2006). Here the terminology of Kolster et al. (2010) is used, as a way of reconciling the motion-sensitive regions defined by Nelissen et al. (2006) with the retinotopic results of Kolster et al. (2009). Indeed the region we refer to as MTp appears to be located more posteriorly than the MSTv described by Kolster et al. (2009), which shared a border with FST. The action localizer test yielded 2 separate activation sites in the anterior portion of the lower bank of the STS, tentatively designated lower bank 1 (LB1, 6 to 10 mm anterior to the interaural plane) and lower bank 2 (LB2, 11 to 15 mm anterior to the interaural plane). These regions responded significantly to videos showing hand grasping actions (compared to static and scrambled controls). The action localizer test did not yield such activation sites in the corresponding portion of the upper bank. In order to have an equally sensitive analysis of the two banks of the STS, we tentatively defined two upper bank regions, upper bank 1 (UB1) and upper bank 2 (UB2), at the same anterior-posterior level as LB1 and LB2. Both UB regions are part of STPa (Bruce et al., 1981; Cusick et al., 1997; Oram and Perrett, 1994; Jellema et al., 2000).
Six ROIs were defined in the parietal lobe, four in the inferior parietal lobule convexity (areas PF, PFG, PG and Opt), following Gregoriou et al. (2006), and two in the lateral bank of the intraparietal sulcus (AIP and the anterior portion of LIP). Area AIP was delineated on the basis of previous single cell (Murata et al., 2000) and fMRI studies (Durand et al., 2007). The action localizer test also yielded a significant activation site in the lateral bank of the IPS, in the anterior portion of the IPS region active during the execution of visual saccades (Wardak et al., 2010), which was termed LIPa (anterior portion of LIP), following Durand et al. (2007).
Since the grasping actions in the videos were performed near the fovea and mostly within the right visual field, the ROI analysis was restricted to the left hemisphere. For the number of voxels and the proportion of visually responsive voxels in each ROI, see Supplemental Material.
Region-of-interest analysis was done using MarsBaR (version 0.41.1, http://marsbar.sourceforge.net). The significance threshold for the t-tests was set at p < 0.05 one-tailed, Bonferroni corrected for the number of ROIs. An ROI was considered to be significantly activated only if the action observation condition activated the ROI more than any of the control conditions (including the fixation baseline) at p < 0.05 corrected, both in the group and at least 2/3 of the monkeys (i.e. 2/3 in experiment 1 and 2/2 in experiment 2).
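The decision rule for ROI activation can be sketched as follows. This is an illustrative reading only: it assumes that the Bonferroni correction amounts to dividing 0.05 by the 16 ROIs and that each control contrast yields one one-tailed p-value per ROI; the function names and the example p-values are hypothetical.

```python
N_ROIS = 16   # 10 STS ROIs plus 6 parietal ROIs
ALPHA = 0.05  # one-tailed significance level before correction


def bonferroni_threshold(alpha=ALPHA, n_tests=N_ROIS):
    """Per-ROI p-value threshold after Bonferroni correction (assumed form)."""
    return alpha / n_tests


def exceeds_all_controls(p_values, threshold):
    """True if action observation beats every control contrast at the threshold."""
    return all(p < threshold for p in p_values)


def roi_is_activated(group_p, per_monkey_p, threshold):
    """Group must pass, and at least two thirds of the monkeys must pass."""
    group_ok = exceeds_all_controls(group_p, threshold)
    n_ok = sum(exceeds_all_controls(p, threshold) for p in per_monkey_p)
    # Integer comparison avoids floating-point issues with the 2/3 fraction.
    return group_ok and 3 * n_ok >= 2 * len(per_monkey_p)


thr = bonferroni_threshold()
# Hypothetical p-values for (static, scrambled, fixation) contrasts.
group = [0.0001, 0.0004, 0.00001]
three_monkeys = [[0.001, 0.002, 0.0001],
                 [0.0005, 0.001, 0.0002],
                 [0.02, 0.001, 0.0001]]   # third monkey fails one control
two_monkeys = three_monkeys[1:]           # only one of two monkeys passes
```

With three monkeys the rule requires two of them to pass; with two monkeys it requires both, which matches the "2/3" and "2/2" cases given in the text.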
The correlation analysis for heterogeneity of ROIs uses the method developed by Peelen et al. (2006). For each voxel of the ROI, we plot the differential activity in one contrast (e.g. isolated hand action – fixation only) as a function of the differential activity in the second contrast (e.g. acting person – fixation only). These correlations were calculated using only the voxels that were activated in the subtraction defining the abscissa, e.g. acting person – fixation. Including both the deactivated and the activated voxels would have inflated these correlations.
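The voxel selection described above can be illustrated with a small sketch. It assumes that "activated" means a positive differential activity in the abscissa contrast, which is a simplification, and it uses made-up voxel values; the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def roi_correlation(abscissa, ordinate):
    """Correlate two contrasts over voxels activated in the abscissa contrast.

    Assumption: 'activated' is taken as differential activity > 0.
    """
    pairs = [(a, o) for a, o in zip(abscissa, ordinate) if a > 0]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical differential activity for five voxels of one ROI.
acting_person_vs_fix = [-2, -1, 1, 2, 3]   # abscissa contrast
isolated_hand_vs_fix = [-2, -1, 2, 1, 2]   # ordinate contrast

r_activated = roi_correlation(acting_person_vs_fix, isolated_hand_vs_fix)
r_all = pearson(acting_person_vs_fix, isolated_hand_vs_fix)
```

In this toy example the correlation over all voxels is high only because deactivated and activated voxels form two separate clusters, whereas the correlation within the activated voxels is zero; this is the inflation the restriction is meant to avoid.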
To minimize distortions due to whole hemisphere flattening, the procedure developed by Durand et al. (2007) was used to flatten the STS, IPS and adjoining IPL convexity and finally the inferior ramus of the arcuate sulcus (IAS). The trajectory in the flatmaps of the iso-AP levels corresponding to coronal sections depends on the overall shape of the sulcus. In the IPS, iso-AP levels display a V shape, reflecting the increase in the depth of the IPS at more caudal levels. In the anterior STS, iso-AP lines run almost perpendicular to the elongated flattened shape of the STS. Since the STS widens caudally, the posterior iso-AP lines there are strongly bent. Hence, MTp, for instance, might appear to be located completely posterior to MT/V5 on the STS flatmap, while it is actually located at a similar anterior-posterior level as MT/V5.
Correlation with connectional data
To identify the possible pathways conveying action-related visual information from the STS to frontal lobe areas, functional MRI data were combined with data from nine representative macaque monkeys (five Macaca nemestrina, Cases 14, 20, 26, 27, 30; three Macaca fascicularis, Cases 13, 29, 36; and one Macaca fuscata, Case 17) in which neural tracers were injected into areas PFG (Cases 13l, 27r, 29r), AIP (Cases 14r, 17r, 20l), 45B (Cases 26l, 30r, 36l) and F5a (Case 30r). All the cases of injections in areas PFG, AIP and 45B have already been presented in previous connectional studies, to which the reader is also referred for details on surgical, histological and data analysis procedures (Rozzi et al., 2006; Borra et al., 2008; Gerbella et al., 2010). Data from the tracer injection in F5a have been presented only in abstract form.
In Cases 13l, 27r and 29r, peroxidase-conjugated wheat germ agglutinin (WGA-HRP, 4%, one injection, 0.1 μl), cholera toxin B subunit conjugated with Alexa 488 (CTBgreen, 1% in distilled water, two injections, 1 μl each) or Fast Blue (FB, 3%, one injection, 0.2 μl) were injected into PFG. In Cases 14r, 17r and 20l, microruby (MR, 10% in phosphate buffer 0.1 M, pH 7.4, two injections, 1 μl each), WGA-HRP (seven injections, 0.1 μl each) and Diamidino Yellow (DY, 2%, one injection, 0.2 μl) were injected into AIP using similar procedures. In Cases 26l, 30r and 36l, DY (one injection, 0.2 μl), FB (one injection, 0.2 μl) and FB (one injection, 0.2 μl) were injected into area 45B. Finally, in Case 30r, in addition to the FB injection into area 45B, DY (one injection, 0.2 μl) was injected into F5a.
In all cases, the distribution of retrograde labeling was plotted in sections every 600 μm and analyzed qualitatively and quantitatively, as the number of labeled neurons found in a given cortical subdivision outside the injected area, in percent of the overall labeling observed in the injected hemisphere. The distribution of the labeling observed after tracer injections in the same area was remarkably consistent across different cases and the quantitative analysis showed quite similar percent distributions of the labeling in the different animals (Rozzi et al., 2006; Borra et al., 2008; Gerbella et al., 2010).
For the purposes of the present study, the distributions of the retrograde labeling observed in the ipsilateral STS in all PFG, AIP and 45B injection cases and in the ipsilateral IPS in Case 30r (FB and DY) were re-analyzed as follows. The distributions were first visualized in two-dimensional (2D) reconstructions of the STS and IPS as described in Matelli et al. (1998). For purposes of comparison all reconstructions of injected hemispheres were shown as left hemispheres. These 2D reconstructions (flatmaps) were then warped to the flattened left STS and IPS isomaps. For the STS warping, reference points were placed every 2 mm along the lower lip, fundus and upper lip of the STS of both the input flatmap (tracer injection) and the reference flatmap (fMRI isomap). In each STS flatmap, AP 0 was set at the level of the posterior commissure. For the IPS warping, reference points were placed every 2 mm along the lower lip, the middle of the lateral bank and the fundus of the IPS of both the input flatmap (tracer injection) and the reference flatmap (fMRI isomap). The deformation of the input flatmap to conform to the reference flatmap was based upon a linear interpolation of a triangular mesh, formed by the reference points and the four corner points of the image, using the Matlab ‘griddata’ function (Watson, 1992), which is based on a Delaunay triangulation of the data.
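The per-triangle step of such a piecewise-linear warp can be sketched as below. This is not the Matlab ‘griddata’ call used in the study: the Delaunay triangulation itself is omitted, a single triangle with made-up coordinates stands in for the mesh, and the function names are hypothetical. A point inside a triangle of the input flatmap is expressed in barycentric coordinates and re-expressed with the corresponding vertices of the reference flatmap.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def warp_point(p, src_tri, dst_tri):
    """Map p from the input triangle to the matching reference triangle."""
    weights = barycentric(p, *src_tri)
    x = sum(w * v[0] for w, v in zip(weights, dst_tri))
    y = sum(w * v[1] for w, v in zip(weights, dst_tri))
    return x, y


# Hypothetical reference points (mm): the reference map is stretched along x.
input_triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
reference_triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0)]

warped = warp_point((1.0, 0.5), input_triangle, reference_triangle)
vertex = warp_point((2.0, 0.0), input_triangle, reference_triangle)
```

Reference points map exactly onto their counterparts and points in between are interpolated linearly, so the warp is continuous across triangle edges.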
To analyze the consistency of the labeling in a cortical region, the following procedure, inspired by that of Borra et al. (2008), was used. The procedure introduces a slight smoothing of the labeling to overcome individual differences. Each dot of label was replaced by a disc of 100% luminance with a 10-pixel radius (roughly 600 μm in the flatmap, approximately the distance between sampled sections) and these luminance discs were added for the STS flatmap of a given individual. The composite luminance distribution was thresholded just above 100%, to reject isolated labeled neurons in the connectivity map of a single animal. The overlap between maps from different subjects was calculated and color coded. The results did not depend much on the exact size of the disc, nor on the threshold as long as it exceeded 100%. The fMRI data were also smoothed in the flatmaps, for the same reason, by replacing each local maximum with a disc of 20-pixel radius in the flatmap. A union of the discs was computed for each animal and the overlap between individual activation patterns was calculated and color coded for each point of the flatmap.
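The disc-summation and thresholding steps for the tracer labeling can be illustrated with a minimal sketch. The grid size, the dot coordinates and the 3-pixel radius are toy values chosen to keep the example small (the study used a 10-pixel radius), and the function names are hypothetical.

```python
def disc_map(dots, shape, radius):
    """Sum unit-luminance discs centred on each labelled dot (row, col)."""
    rows, cols = shape
    lum = [[0] * cols for _ in range(rows)]
    for r0, c0 in dots:
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
                if (r - r0) ** 2 + (c - c0) ** 2 <= radius ** 2:
                    lum[r][c] += 1
    return lum


def threshold_map(lum, level=1):
    """Keep pixels whose summed luminance exceeds one disc (above 100%)."""
    return [[value > level for value in row] for row in lum]


def overlap_count(masks):
    """Number of animals whose thresholded map covers each pixel."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) for c in range(cols)]
            for r in range(rows)]


SHAPE, RADIUS = (20, 40), 3  # toy flatmap size and disc radius, in pixels

# Hypothetical labelled neurons: a small cluster plus one isolated cell each.
animal_a = [(10, 10), (10, 12), (10, 30)]
animal_b = [(10, 11), (10, 13), (3, 3)]

mask_a = threshold_map(disc_map(animal_a, SHAPE, RADIUS))
mask_b = threshold_map(disc_map(animal_b, SHAPE, RADIUS))
overlap = overlap_count([mask_a, mask_b])
```

Isolated cells contribute a single disc and so never exceed the threshold, while pixels covered by discs from at least two nearby cells survive; the overlap map then counts, per pixel, how many animals show such clustered labeling.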