Conceived and designed the experiments: NT. Performed the experiments: CMK CRC VGB. Analyzed the data: CMK MLW. Wrote the paper: CMK NT.
Neuroimaging studies suggest that category-selective regions in higher-order visual cortex are topologically organized around specific anatomical landmarks: the mid-fusiform sulcus (MFS) in the ventral temporal cortex (VTC) and lateral occipital sulcus (LOS) in the lateral occipital cortex (LOC). To derive precise structure-function maps from direct neural signals, we collected intracranial EEG (icEEG) recordings in a large human cohort (n = 26) undergoing implantation of subdural electrodes. A surface-based approach to grouped icEEG analysis was used to overcome challenges from sparse electrode coverage within subjects and variable cortical anatomy across subjects. The topology of category-selectivity in bilateral VTC and LOC was assessed for five classes of visual stimuli—faces, animate non-face (animals/body-parts), places, tools, and words—using correlational and linear mixed effects analyses. In the LOC, selectivity for living (faces and animate non-face) and non-living (places and tools) classes was arranged in a ventral-to-dorsal axis along the LOS. In the VTC, selectivity for living and non-living stimuli was arranged in a latero-medial axis along the MFS. Written word-selectivity was reliably localized to the intersection of the left MFS and the occipito-temporal sulcus. These findings provide direct electrophysiological evidence for topological information structuring of functional representations within higher-order visual cortex.
Visual object recognition is a ubiquitous feature in our day-to-day lives, enabling us to recognize the faces of our loved ones, find a favorite snack in the grocery aisle, and even read the words on this page. Achieved with rapidity and accuracy, object recognition appears nearly effortless. Yet the apparent automaticity with which we perform this feat belies its underlying neural complexity, and damage to any part of the network of cortical regions involved may produce debilitating deficits—such as visual agnosias (e.g. face-blindness)—that can seriously affect social or vocational life [1, 2].
Extensive human and non-human primate research has identified putative higher-order visual areas in the ventral temporal and lateral occipital cortical complexes (VTC and LOC, respectively), which are believed to mediate object recognition via the activity of distinct neuronal clusters that differentially and selectively activate to specific categories of visual stimuli (e.g. faces/places/animals/tools/words) [3–18]. However, the functional and organizational principles of the VTC and LOC remain a topic of debate, largely due to the considerable variability in the anatomical location and spatial relations of category-specific regions reported across subjects, both within and across studies [19–22].
Recently, advances in functional, structural, and anatomical neuroimaging have begun to yield new insights into structure-function relationships of the VTC and LOC. Specifically, in the VTC, the mid-fusiform sulcus (MFS) has been shown to predict lateral-to-medial transitions in receptor- and cyto-architectonics, white-matter connectivity, and large-scale functional maps (e.g. animacy maps, eccentricity bias); while in the LOC, dorso-ventral transitions in large-scale functional maps appear to be arranged around the lateral occipital sulcus (LOS). Further comparisons between the MFS/LOS and the relative organization of category-selective regions have revealed that these smaller-scale functional representations also align with the same sulcal landmarks [21, 22, 24–41].
Taken together, these findings suggest that these anatomical landmarks—the MFS and LOS—may provide a structural framework for the organization of higher-order visual representations, in which opposing sides of these sulci contain neural hardware for processing distinct classes of visual information (foveal vs. peripheral, animate vs. inanimate, face vs. place). Importantly, smaller-scale functional representations appear to be nested within larger-scale representations, such that visual information processing is organized in a way that mirrors the hierarchical organization of human conceptual knowledge. Concrete (i.e. basic-level) categorical information is embodied at smaller spatial scales, via category-selective regions, while abstract (i.e. superordinate-level) categorical information is reflected at larger spatial scales [22, 23, 42]. For example, lateral to the MFS, face and body-part selective regions (basic-level information) are localized adjacent to each other, and converge within animate representations (superordinate-level) of large-scale animacy maps [21, 27, 31, 33, 37]. Similarly, medial to the MFS, tool and place-selective regions converge within large-scale inanimate representation [14, 31, 39]. This hierarchical structuring of visual information might explain how the VTC and LOC may be biologically optimized to achieve rapid object recognition and categorization.
Notably, given the spatial constraints of the VTC and LOC (i.e. a 2D cortical sheet), different functional maps (e.g. animacy and eccentricity bias) appear to be organized on the same spatial gradients around the MFS and LOS, respectively. However, the correspondence between different functional maps is not necessarily one-to-one. For instance, in addition to animacy distinctions, the MFS also predicts medio-lateral transitions in eccentricity bias maps (i.e. peripheral vs. foveal representations, respectively) [33, 43]. While place (inanimate and peripheral) and face (animate and foveal) stimuli engage medial and lateral regions of the MFS, respectively, word stimuli (inanimate and foveal) selectively engage regions lateral to face-selective regions, in the vicinity of the occipitotemporal sulcus (OTS) [44–46].
While fMRI studies have made great strides towards understanding the organization of these visual areas, the spatio-temporal resolution and indirect nature of hemodynamic measures prevents a definitive assessment of their functional topography [47, 48]. Although newer analytic approaches have been developed to address the limitations of traditional localization-based techniques (e.g. multivariate pattern analysis) [49–53], their relationship to the underlying neural population activity has not been validated in humans [54, 55]. Human intracranial EEG (icEEG) recordings provide high spatiotemporal resolution neural recordings and offer a unique opportunity to validate hypotheses of VTC and LOC organization [56–58].
Despite recent work, a comprehensive icEEG investigation into the topology of VTC and LOC category-selectivity remains lacking. This is due largely to challenges arising from spatially variable and sparse electrode coverage within subjects. The discrete and clinically directed implantation of electrodes precludes evaluation of both small- and large-scale functional organization in any single individual, requiring the combination of data across a large number of subjects to achieve adequate cortical coverage. However, current approaches for the spatial co-registration of datasets across individuals (e.g. affine/volumetric normalizations) are unable to preserve the topological alignment of homologous functional regions, due to the highly folded (nonlinear) cortical geometry. As a result, prior icEEG studies have focused on evaluating the functional properties of category-selective regions rather than their topological organization within the VTC and LOC [3, 17, 61–70].
Recently, new methodological advances have introduced surface-based normalization strategies for grouping icEEG data [60, 71, 72], which provide computationally efficient methods to correct for inter-subject anatomical variability and sparse sampling. In the current study, we utilized one such surface-based grouped icEEG approach to investigate VTC and LOC category tuning across a large patient cohort (n = 26), using data collected during the visual naming of living (faces, animals, and body-parts) and non-living (tools and places) stimuli, as well as during a word-stem completion task.
If models of visual information structuring are accurate, we expect electrodes with selectivity for living (face and other animate) stimuli to be localized lateral to the MFS in the VTC, and ventral to the LOS in the LOC. Also, lateral to the MFS (in the left VTC), we expect word-selective electrodes to co-localize with face-selective electrodes in regions biased toward foveal representation. In contrast, electrodes with selectivity for non-living stimuli (tools and places) should localize medial to the MFS in the VTC and superior to the LOS in the LOC. Importantly, we expect the relative spatial arrangements of category-selective electrodes (within individuals) to be preserved within larger-scale functional representations at the group level (across individuals).
To test this hypothesis, we generated topologically precise population-level maps of icEEG data, and directly evaluated whether: 1) large-scale functional maps (e.g. animacy: living vs. non-living) emerge from the relative arrangements of distinct category-selective regions in the VTC and LOC; and 2) transitions in multi-scale functional maps are preserved around specific sulcal landmarks (e.g. the MFS and LOS, respectively). We found that, in the LOC, selectivity for living and non-living stimuli is arranged along the LOS about a ventral-to-dorsal axis. In the VTC, living and non-living stimuli are arranged along the MFS about a lateral-to-medial axis. Furthermore, in the left VTC, word-selectivity is reliably predicted by the intersection of the anterior MFS and the occipitotemporal sulcus, and is interspersed with other foveally represented categories. These results were consistent at both the individual and population-level, and provide direct evidence for structure-function coupling in the VTC and LOC from electrophysiological data in humans.
Data were collected from 26 subjects (16 females, mean age 33 ± 11 years, mean IQ 100 ± 11) undergoing left (LH, n = 16) or right hemispheric (RH, n = 10) subdural electrode (SDE) implantation (Table 1). All experimental procedures were reviewed and approved by the Committee for the Protection of Human Subjects (CPHS) of the University of Texas Health Science Center at Houston (protocol number HSC-MS-06-0385), and written informed consent was obtained from all subjects.
Subjects participated in a visual confrontation-naming task using 5 categories: famous faces, animate non-face (animals and body-parts; hereafter referred to as “animate”), famous places, tools, and word stimuli (Fig 1a; ~80 to 120 stimuli per category).
Pictorial stimuli (face, animate, place, tool) were displayed at eye-level on a 15” LCD screen placed 2 feet from the patient (2000 ms on screen, jittered 3000 ms inter-stimulus interval; 500 x 500 pixel image size, ~10.8° x 10.8° of visual angle, with a grid overlay on 1300 x 800 pixel white background, ~28.1° x 17.3° of visual angle). Subjects were instructed to overtly name the stimuli during the experiment. Face stimuli consisted of gray-scale, real images of famous individuals shown in frontal view (celebrities, politicians, and historical figures). Place stimuli consisted of color, real images of famous landmarks (e.g. Eiffel tower, Grand Canyon). Animate and tool stimuli were from the Snodgrass and Vanderwart object pictorial set. Word stimuli were presented as partial word stems (e.g. “kne_”) to which subjects were instructed to respond with the first action word that came to mind (e.g. “kneeling”). Words consisted of black, lower-case text (2000 ms on screen, jittered 3000 ms inter-stimulus interval; font height of 100 pixels, Calibri font type, ~2.1° of visual angle) centered on a 1300 x 800 pixel white background.
For each category, images were randomly selected from our database and never repeated, so each subject saw a unique sequence of images. All subjects in both right and left hemispheric cohorts participated in the visual naming tasks with pictorial stimuli. However, given the strong hemispheric bias associated with word reading [13, 76–78], the word-naming task was only performed in the left hemispheric cohort. Due to clinical time constraints, 12 of 16 subjects in the left hemisphere cohort completed the word-naming task. A transistor-transistor logic pulse triggered by the stimulus presentation software (Python v2.7) at stimulus onset was recorded as a separate input during the experiments to time-lock all trials during all tasks.
Pre-implantation anatomical MRI scans were collected using a 3T whole-body MR scanner (Philips Medical Systems, Bothell WA) equipped with a 16-channel SENSE head coil. Anatomical images were collected using magnetization-prepared 180-degree radio-frequency pulses and rapid gradient-echo (MP-RAGE) sequence, optimized for gray-white matter contrast, with 1 mm thick sagittal slices and an in-plane resolution of 0.938 x 0.938 mm. Cortical surface models (Fig 1b) were reconstructed using FreeSurfer software (v5.1), and imported to SUMA for visualization.
A total of 3506 SDEs (LH n = 2101; RH n = 1386) were implanted (PMT Corporation; top-hat design; 3 mm diameter contact with cortex) using previously published techniques. Of these, 933 SDEs (LH n = 482; RH n = 451) were excluded due to proximity to seizure onset sites, inter-ictal spikes, or 60 Hz noise. The remaining 2573 SDEs (LH n = 1619, RH n = 935) were localized to cortical surface models using intra-operative photographs and an in-house recursive grid partitioning technique.
Using anatomical criteria, we identified all SDEs localized to the VTC and LOC for each individual in native anatomical space. The VTC includes the fusiform gyrus—bounded laterally by the occipitotemporal sulcus, medially by the collateral sulcus and anterior lingual gyri, posteriorly by the posterior transverse collateral sulcus, and anteriorly by the anterior tip of the mid-fusiform sulcus (MFS). The LOC includes the middle and inferior occipital gyri—bounded dorsally by the transverse occipital sulcus, ventrally by the occipitotemporal sulcus, posteriorly by the occipital pole, and anteriorly by the posterior superior temporal sulcus, as well as the posterior aspects of the inferior and middle temporal gyri (Fig 2) [21, 38, 39, 46].
To enable a population-level evaluation of category-selective topology, individual subject SDE coordinates were mapped to a standardized cortical surface (MNI N27 template brain aligned to Talairach coordinate space) using a surface-based normalization strategy (rather than affine or non-linear volumetric transformations) [60, 73, 83–85], to maximize the overlap between topologically and functionally homologous regions across subjects [86–88]. This surface-based normalization approach is detailed in an earlier publication from our group [60, 82]. Briefly, surface-based representations of electrode coverage are generated with respect to each subject’s cortical model, using geodesic metrics to correct for local gyral and sulcal folding patterns. Individual electrode datasets are subsequently normalized to a standardized cortical surface mesh, using SUMA’s surface-based normalization strategy, to enable a one-to-one correspondence between anatomical locations across subjects. A total of 159 SDEs (LH n = 94, RH n = 64) were localized to the VTC and 83 SDEs (LH n = 48, RH n = 35) to the LOC (Fig 2, Table 1).
In 14 subjects, ECoG data were collected at 1000 Hz using NeuroFax software (Nihon Kohden, Tokyo, Japan) (bandwidth 0.15–300 Hz). The other 12 subjects underwent ECoG data collection at 2000 Hz (bandwidth 0.1–750 Hz) using the NeuroPort recording system (Blackrock Microsystems, Salt Lake City, UT). Electrodes were referenced to a common average of all electrodes in a given subject, except for those with 60 Hz noise or epileptiform activity, when initially referenced to an artificial 0 V. All electrodes with greater than 10 dB of noise in the 60 Hz band, inter-ictal epileptiform discharges, or localized to sites of seizure onset were excluded.
To focus only on perceptual processes, analyses were restricted to a period 100–400 ms after stimulus presentation [59, 66, 90, 91]. For all ECoG data, analyses were performed by first bandpass filtering raw ECoG data into the broadband gamma frequency range (60–120 Hz, following removal of 60 Hz line noise and its harmonics; IIR Elliptical Filter, 30 dB sidelobe attenuation). A Hilbert transform was applied and the analytic amplitude was smoothed (Savitzky-Golay FIR, 5th order, frame length of 155 samples; Matlab 2013b, Mathworks, Natick, MA) to estimate the time course of broadband gamma activity (BGA). BGA was utilized in our analyses as it has been demonstrated to provide precise estimates of task-specific cortical activity [56, 90, 92–96] as well as the strongest correlation with the BOLD fMRI signal used in non-invasive neuroimaging studies [59, 68, 97–102].
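The filtering pipeline described above can be sketched as follows. The original analysis was performed in Matlab, so this Python/SciPy version is only an approximation; the elliptical filter order and ripple settings are illustrative assumptions (the text specifies only 30 dB sidelobe attenuation), and the synthetic signal is for demonstration.

```python
import numpy as np
from scipy.signal import ellip, filtfilt, hilbert, savgol_filter

def broadband_gamma(raw, fs, band=(60.0, 120.0)):
    """Estimate broadband gamma activity (BGA) from one icEEG trace.

    Filter order (4) and passband ripple (0.5 dB) are assumptions;
    the paper specifies only an IIR elliptical filter with 30 dB
    sidelobe attenuation.
    """
    nyq = fs / 2.0
    # Band-pass into the broadband gamma range (60-120 Hz).
    b, a = ellip(4, 0.5, 30.0, [band[0] / nyq, band[1] / nyq], btype="bandpass")
    filtered = filtfilt(b, a, raw)
    # Analytic amplitude via the Hilbert transform.
    amplitude = np.abs(hilbert(filtered))
    # Smooth with a 5th-order Savitzky-Golay FIR, 155-sample frames.
    return savgol_filter(amplitude, window_length=155, polyorder=5)

# Example: 2 s of synthetic data at 1 kHz with a 90 Hz burst at 1.0-1.5 s.
fs = 1000
t = np.arange(0.0, 2.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 10.0 * t)  # low-frequency background activity
signal[1000:1500] += 2.0 * np.sin(2 * np.pi * 90.0 * t[1000:1500])  # gamma burst
bga = broadband_gamma(signal, fs)
```

The resulting BGA trace is large during the simulated gamma burst and near zero elsewhere, since the 10 Hz background falls in the filter's stopband.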
We note here that different components of the recorded icEEG signal, specifically the raw-field or event-related potential (ERPs), contain additional information that could be used for the purposes of this analysis. However, ERP frequency components are more heavily influenced by lower frequency ranges, reflecting synchronized activity across larger cortical distances, which results in a poorer temporal and spatial resolution than BGA. Furthermore, prior icEEG studies that have directly compared ERPs and BGA have demonstrated that a) fMRI BOLD activity tracks BGA, not ERPs; b) the presence of BGA accurately predicts the presence of an ERP, and BGA magnitude is positively correlated with the size of the ERP, although the converse does not hold; and c) BGA is more sensitive to task-dependent modulations in local cortical activity than ERPs, and BGA can distinguish between increases and decreases in neural activity (which ERPs cannot do) [63, 69, 104–106].
An important feature of ECoG recordings is the time resolution that these data provide. Time series representations of mean percent change in BGA (across trials) were calculated by comparing post-stimulus BGA power to mean pre-stimulus baseline activity (-700 to -200 ms) (Fig 1c) [60, 79]. For each category, trials with noise or artifacts during either the baseline or post-stimulus window were discarded, resulting in a mean (+/- sd) of 46 (18) face trials; 31 (9) animate trials; 29 (8) tool trials; 49 (6) place trials; and 38 (11) word trials used in the analyses.
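The baseline-normalization step can be sketched as follows, assuming a trials-by-timepoints BGA matrix; the function and variable names are illustrative, not from the original analysis code.

```python
import numpy as np

def percent_change(bga, times, baseline=(-0.700, -0.200)):
    """Mean percent change in BGA relative to pre-stimulus baseline.

    `bga` is a trials x timepoints array; `times` gives each sample's
    time (s) relative to stimulus onset. Names are illustrative.
    """
    times = np.asarray(times)
    in_base = (times >= baseline[0]) & (times < baseline[1])
    base = bga[:, in_base].mean()        # mean baseline power across trials
    return 100.0 * (bga - base) / base   # percent change, trials x timepoints

# Example: 10 trials sampled at 1 kHz, from -1.0 s to +1.0 s.
times = np.arange(-1.0, 1.0, 0.001)
rng = np.random.default_rng(0)
bga = rng.normal(10.0, 0.1, size=(10, times.size))
bga[:, times > 0.1] += 5.0               # ~50% post-stimulus increase
pct = percent_change(bga, times)
```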
We also sought to characterize category-selectivity onsets per SDE per individual. We note here that the millisecond temporal resolution afforded by ECoG allows for precise latency estimates. Using the BGA time-series to perform paired two-tailed t-tests at each time point, selectivity onset latencies were determined as the first time point at which a significant contrast (p < 0.05; corrected for multiple comparisons using the false discovery rate—FDR—procedure) for a single category (against all other categories) was observed, which then remained significant for successive data points (>100 ms).
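The latency criterion above can be illustrated with the following sketch. It substitutes independent-samples Welch t-tests and an explicit Benjamini-Hochberg correction for the paper's exact paired-test procedure, so it is an approximation rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import ttest_ind

def onset_latency(target, others, times, fs=1000, min_dur=0.100, alpha=0.05):
    """First time point at which the target category differs from the
    other categories and stays significant for at least `min_dur` s.

    Pointwise Welch t-tests with Benjamini-Hochberg FDR correction
    across time points; `target` and `others` are trials x timepoints.
    Illustrative approximation of the paired-test procedure.
    """
    m = target.shape[1]
    pvals = np.array([ttest_ind(target[:, i], others[:, i], equal_var=False).pvalue
                      for i in range(m)])
    # Benjamini-Hochberg: largest rank k with p_(k) <= alpha * k / m.
    order = np.argsort(pvals)
    thresh = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresh
    sig = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.where(passed)[0])
        sig[order[:k + 1]] = True
    # Return the start of the first run of significance >= min_dur.
    run = int(min_dur * fs)
    for i in range(m - run):
        if sig[i:i + run].all():
            return times[i]
    return None

# Example: the target category diverges from the others at 150 ms.
fs = 1000
times = np.arange(0.0, 0.5, 1.0 / fs)
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(30, times.size))
others = rng.normal(0.0, 1.0, size=(60, times.size))
target[:, times >= 0.150] += 3.0
latency = onset_latency(target, others, times, fs=fs)
```

With a large simulated effect the detected onset falls at or just before the true 150 ms divergence point.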
To quantify category selective responses in each SDE, the d’ (d-prime) sensitivity index was computed for each category per electrode (a total of 5 d’ indices per electrode). The d’ index is an established metric in signal detection used to determine how well a target can be discriminated from competing stimuli [59, 106, 109–114]. For each category at each electrode, the mean BGA in the 100–400 ms interval after stimulus onset was standardized by across trial standard deviation [59, 113]. The d’ index was calculated as the difference between the standardized BGA for each category against all other categories:
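The displayed equation appears to have been lost in reproduction. One plausible reconstruction, consistent with the variable definitions in the following sentence and with the standard pairwise form of the d' sensitivity index (with μj and σj the mean and across-trial standard deviation for category j), is:

```latex
d'_j = \frac{1}{N} \sum_{i \neq j} \frac{\mu_j - \mu_i}{\sqrt{\left(\sigma_j^2 + \sigma_i^2\right)/2}}
```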
where μj is the mean response to the current category j; σj is the across-trial standard deviation of BGA activity to category j; and μi and σi denote the same for the other categories. Because 5 categories were evaluated in total, N equals 4 for each category j. In this fashion, each electrode could be judged selective for multiple categories.
Significance thresholds were determined through permutation testing. For each electrode per subject, a null distribution was generated by randomly shuffling category labels across all trials and recomputing the d’ index 10,000 times. The p-value for each category per electrode was determined as the fraction of shuffled d’ indices that were greater than the actual d’ index. At the group-level, individual p-values were corrected for multiple comparisons (across categories and SDEs, per region and hemisphere) to an adjusted alpha (q) level of 0.01. Corrections for multiple comparisons were performed using the false discovery rate (FDR) procedure.
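The permutation scheme can be sketched as follows, with a simplified one-vs-rest d' standing in for the paper's pairwise index; function and variable names are illustrative assumptions.

```python
import numpy as np

def dprime(responses, labels, target):
    """Simplified one-vs-rest d': standardized mean difference between
    target-category trials and all other trials (illustrative)."""
    a = responses[labels == target]
    b = responses[labels != target]
    return (a.mean() - b.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)

def permutation_pvalue(dprime_fn, responses, labels, target, n_perm=10000, seed=0):
    """Permutation p-value for one category at one electrode: shuffle
    category labels across trials, recompute d', and report the fraction
    of shuffled d' values exceeding the observed one."""
    rng = np.random.default_rng(seed)
    observed = dprime_fn(responses, labels, target)
    null = np.empty(n_perm)
    for k in range(n_perm):
        null[k] = dprime_fn(responses, rng.permutation(labels), target)
    return float(np.mean(null > observed))

# Example: 40 trials per category, with elevated responses to faces.
labels = np.array(["face"] * 40 + ["place"] * 40 + ["tool"] * 40)
rng = np.random.default_rng(1)
responses = rng.normal(0.0, 1.0, labels.size)
responses[:40] += 2.0
p_face = permutation_pvalue(dprime, responses, labels, "face", n_perm=2000)
```

Here the observed face d' lies far in the tail of the shuffled null distribution, so the permutation p-value is effectively zero.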
To test for lateral-to-medial and ventral-to-dorsal functional gradients in the VTC and LOC respectively, linear mixed effects (LME) models were generated to quantify the relationship between category-selectivity (determined by the d’ index) and the cortical topology while controlling for individual subject effects. For each category, SDE coordinates (in group, i.e. Talairach, space following surface-based normalization) were modeled as a fixed effect, and patient ID modeled as a random effect to control for inter-subject variability as well as non-independence (e.g. one subject contributing multiple SDEs) [lme4 and lmerTest packages in R] [115–119]. To control for spatial multicollinearity, SDE coordinates per hemisphere in each region (VTC and LOC) were mean-centered prior to inclusion in the LME models. LME models were then fitted per category for each hemisphere in each region.
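An analogous model can be sketched with statsmodels' `mixedlm`, a Python counterpart of the lme4 formulation d' ~ x*y + (1|subject) used in the paper; the data here are synthetic stand-ins, not the study's electrode table.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the electrode table: one row per SDE with
# mean-centered coordinates, a d' index, and a subject ID.
rng = np.random.default_rng(0)
n = 200
subject = rng.integers(0, 20, n)      # 20 hypothetical subjects
x = rng.normal(0.0, 10.0, n)          # mean-centered lateral-medial coordinate
y = rng.normal(0.0, 10.0, n)          # mean-centered anterior-posterior coordinate
# d' increases laterally (with x), plus per-subject offsets and noise.
d = 0.05 * x + rng.normal(0.0, 0.2, 20)[subject] + rng.normal(0.0, 0.3, n)
df = pd.DataFrame({"d": d, "x": x, "y": y, "subject": subject})

# Coordinates and their interaction as fixed effects, with a random
# intercept per subject to absorb inter-subject variability.
model = smf.mixedlm("d ~ x * y", df, groups=df["subject"]).fit()
fixed_x = model.params["x"]           # recovered lateral gradient
```

The fixed-effect estimate for x recovers the simulated 0.05 gradient while the random intercepts absorb the per-subject offsets.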
Finally, to visually evaluate the spatial organization of SDE category-selectivity relative to anatomical landmarks (the MFS and LOS), SDEs with significant d’ indices (FDR corrected q ≤ 0.01) for each category were visualized on the MNI N27 cortical surface (aligned to Talairach space), and color-coded by category-preference.
ECoG recordings of broadband gamma activity (BGA; 60–120 Hz) from 26 subjects (LH n = 16; RH n = 10) were analyzed to evaluate the relationship between category-selectivity and cortical topology in higher-level visual cortex. In total, 242 SDEs were evaluated (Fig 2): 159 SDEs were localized to ventral temporal cortex (VTC: LH n = 94, median = 5 SDEs/subject, interquartile range, IQR = 3–8.25; RH n = 64, median = 4.5 SDEs/subject, IQR = 4–5), and 83 SDEs were localized to lateral occipital cortex (LOC: LH n = 48, median = 3.5 SDEs/subject, IQR = 1.5–7; RH n = 35, median = 7 SDEs/subject, IQR = 3–10).
At the individual level, task-dependent increases in BGA peaked at ~350–400 ms after stimulus onset (Fig 3). Category-selective BGA responses (significant d’ index at an FDR corrected q ≤ 0.01), organized with respect to the cortical topology, were consistently seen at the single subject level. However, the sparse sampling in each individual case precluded a comprehensive evaluation of these relationships at the single subject level, and surface-based normalization was performed to transform all SDE coordinates across subjects to a common brain space (Fig 4).
Of the 242 SDEs used in the analysis (VTC and LOC bilaterally), a total of 142 SDEs (~59%) had a significant d’ index for at least one category (FDR corrected q ≤ 0.01). In the VTC, a total of 69/94 SDEs (~73%) in the left hemisphere and 34/64 SDEs (~53%) in the right hemisphere had a significant d’ index (FDR corrected q ≤ 0.01) for at least one category (Fig 4, left). In the LOC, a total of 26/48 SDEs (~54%) in the left hemisphere and 13/35 SDEs (~37%) in the right hemisphere had a significant d’ index for at least one category (Fig 4, right). Notably, only 7 SDEs (VTC n = 6; LOC n = 1) had a significant d’ index for a second category (both faces and places), all of which were localized in the left hemisphere to the respective sulci of interest (MFS or LOS).
To robustly quantify the relationships between d’ index and SDE coordinates (mm, in Talairach space), while controlling for non-independence of data within individuals, linear mixed effects (LME) models were generated for each stimulus category. We note that in the VTC, x and z coordinates were highly correlated (RH: rs,62 = .97, p < 2.2e-16; LH: rs,92 = -.83, p < 2.2e-16). Therefore only the x and y coordinates were evaluated for the VTC (z coordinate was removed). Similarly, in the LOC, the x and y coordinates were highly correlated (LH: rs,46 = -.94, p < 2.2e-16; RH: rs,33 = .865, p = 1.8e-14). Therefore only the y and z coordinates were evaluated in the LOC (x coordinate was removed). The exclusion of the z and x coordinates as predictors for VTC and LOC category selectivity, respectively, remains consistent with the anatomical principles governing the structure-function hypotheses being tested (e.g. animacy maps in the VTC are a function of a lateral-to-medial axis).
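The collinearity screen can be illustrated with Spearman rank correlations on synthetic coordinates; the 0.8 cutoff and all data here are assumptions for illustration, not the paper's criterion.

```python
import numpy as np
from scipy.stats import spearmanr

# When two coordinate axes are highly rank-correlated across electrodes,
# only one is retained as an LME predictor (illustrative screen).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 10.0, 60)             # lateral-medial coordinate
z = 0.9 * x + rng.normal(0.0, 1.0, 60)    # nearly collinear with x, as in the VTC
y = rng.normal(0.0, 10.0, 60)             # anterior-posterior coordinate

rho_xz, p_xz = spearmanr(x, z)
rho_xy, p_xy = spearmanr(x, y)
keep_z = abs(rho_xz) < 0.8                # drop z when |rho| exceeds the cutoff
```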
In the VTC, the x and y coordinates, and the interaction term (x*y), were entered as fixed effects into the models. In the LOC, the y and z coordinates, and the interaction term (y*z), were entered as fixed effects. Given that multiple SDEs could be contributed from each individual, all models included random-effect variable intercepts for subjects to control for inter-subject variability. Complete model results for the VTC and LOC are provided in Fig 5. For brevity, only significant LME results are discussed in the following section. Scatterplots depicting univariate relationships between grouped d’ indices and the spatially normalized SDE coordinates of interest (Talairach space) are available in the supporting information (S1 Fig) [ggplot2 and stats packages in R] [120, 121].
In the right VTC, LME analysis was performed for 4 stimulus categories (faces, animate, places, and tools) using 64 SDEs (Fig 5). For face stimuli, a negative relationship was found with increasing d’ index in the x-axis (B = -0.0586, S.E. = 0.0080, p = 6.5e-10; indicating selectivity increases laterally), a significant positive relationship with increasing selectivity in the y-axis (B = 0.0171, S.E. = 0.0072, p = .021; posteriorly), and a significant negative relationship between face-selectivity and the x*y interaction term (B = -0.0023, S.E. = 0.0008, p = 4.3e-03). For place stimuli, we found a significant positive relationship with increasing selectivity in the x-axis (B = 0.0648, S.E. = 0.0083, p = 1.2e-10; medially), and a significant positive relationship between selectivity and the x*y interaction term (B = 0.0022, S.E. = 0.0008, p = 9.2e-03). No significant associations were noted for tool- or animate-selectivity.
In the left hemisphere VTC, LME analysis was performed for 4 stimulus categories (faces, animate, places, tools) using 94 SDEs, and for 1 stimulus category (words) using 64 SDEs. For face stimuli, we found a significant positive relationship with an increasing d’ index in the x-axis (B = 0.0704, S.E. = 0.0117, p = 3.32e-08; selectivity increases laterally). For animate stimuli, a negative relationship was observed for increasing selectivity in the y-axis (B = -0.0128, S.E. = 0.0040, p = 2.15e-03; anteriorly). For places, we found a negative relationship with increasing place-selectivity in the x-axis (B = -0.0547, S.E. = 0.0120, p = 1.53e-05; medially), and a positive relationship with increasing selectivity in the y-axis (B = 0.0301, S.E. = 0.0071, p = 5.91e-05; posteriorly). For tools, we found a negative relationship with increasing selectivity in the x-axis (B = -0.0363, S.E. = 0.0088, p = 9.00e-05; medially), and a negative relationship with the y-axis (B = -0.0176, S.E. = 0.0051, p = 9.28e-04; anteriorly). For words, a negative relationship was observed with increasing selectivity in the y-axis (B = -0.0369, S.E. = 0.0088, p = 9.67e-05; anteriorly).
In the left LOC, LME analysis was performed for 4 stimulus categories (faces, animate, places, and tools) using 48 SDEs and for 1 stimulus category (words) using 26 SDEs (Fig 5). For both face and animate stimuli, we found significant negative relationships with increasing d’ indices in the z-axis (face B = -0.0175, S.E. = 0.0084, p = 0.043; animate B = -0.0176, S.E. = 0.0060, p = 5.6e-03; selectivity increases ventrally for both). For places, we found a significant positive relationship with increasing selectivity in the z-axis (B = 0.0398, S.E. = 0.0075, p = 3.8e-06; dorsally), a significant positive relationship with the y-axis (B = 0.0435, S.E. = 0.0130, p = 1.7e-03; anteriorly), as well as a significant positive relationship with the y*z interaction term (B = 0.0030, S.E. = 0.0011, p = 9.8e-03). No significant associations were noted for tool or word-selectivity.
Finally, in the right LOC, LME analysis was performed for 4 stimulus categories (faces, places, tools, and animate) using 35 SDEs. For faces, we found a significant negative relationship with increasing selectivity in the z-axis (B = -0.0306, S.E. = 0.0134, p = .029; selectivity increases ventrally), and for places we found a significant positive relationship with increasing selectivity in the z-axis (B = 0.0366, S.E. = 0.0095, p = 6.0e-04; dorsally). No significant associations were noted for tool- or animate-selectivity.
To evaluate the spatial relationship of category-selective SDEs with respect to cortical folding patterns, all SDEs with significant d’ indices were visualized on the MNI N27 brain surface (in Talairach space), and color-coded by category preference (Fig 6). Notably, all animate-selective (LH n = 3/3) and nearly all face-selective (LH n = 27/28; RH n = 15/17) SDEs were localized to or lateral to the mid-fusiform sulcus (MFS) in the VTC bilaterally. Similarly, all place-selective (LH n = 29/29; RH n = 14/14) and tool-selective SDEs (LH n = 7/7; RH n = 2/2) were localized to or medial to the MFS bilaterally. Additionally, both tool-selective and word-selective (LH n = 6/6) SDEs were clustered along the anterior boundary of the mid-fusiform sulcus in the left VTC. In addition to tools, word-selective SDEs were also interspersed with anteriorly localized face- and animate-selective SDEs.
In the LOC, bilaterally, a similar arrangement of category-selectivity with respect to the lateral occipital sulcus (LOS) was observed. All face-selective (LH n = 8/8; RH n = 5/5) and animate-selective (LH 6/6; RH n = 3/3) SDEs were uniformly localized at or inferior to the LOS, while all place-selective (LH n = 9/9; RH n = 3/3) and tool-selective (LH n = 1/1; RH n = 2/2) SDEs were localized at or superior to the LOS. However, no discernible spatial arrangement of word-selective (LH n = 3) SDEs could be observed.
We also examined the spatial organization of the 105 remaining SDEs without significant d’ indices (S3 Fig). Of these, 34 SDEs (LH n = 17; RH n = 17) demonstrated little to no change in BGA (i.e. were non-responsive), and all were localized at the boundaries of the VTC/LOC: at-or-anterior to the MFS in the VTC (LH n = 14; RH n = 12) and at-or-superior to the transverse occipital sulcus in the LOC (LH n = 3; RH n = 5). The remaining 71 SDEs were largely interspersed amongst category-selective SDEs, with one notable exception: in the VTC bilaterally, no non-selective SDEs were localized postero-lateral to the MFS, in regions of face-selectivity. In the remaining regions, d’ index values of non-selective SDEs were largely congruent with d’ indices of the surrounding category-selective SDEs, consistent with the gradients of category-selectivity observed in the LME analyses and scatterplots (S1 Fig).
Finally, for VTC and LOC category-selective SDEs, we investigated the timing of selectivity emergence for each category (S2 Fig). Bilaterally, median selectivity onset latencies were between ~150 and ~250 ms overall. In the left hemisphere LOC, median (sd) onset latencies were: 133 ms for (n = 1) tool-selective SDEs; 138 (39) ms for (n = 6) animate-selective SDEs; 155 (35) ms for (n = 8) face-selective SDEs; 194 (54) ms for (n = 3) word-selective SDEs; and 219 (43) ms for (n = 9) place-selective SDEs. In the left VTC, median (sd) onset latencies were: 153 (55) ms for (n = 6) word-selective SDEs; 162 (50) ms for (n = 28) face-selective SDEs; 172 (61) ms for (n = 29) place-selective SDEs; 193 (36) ms for (n = 7) tool-selective SDEs; and 250 (52) ms for (n = 3) animate-selective SDEs.
In the right LOC, median (sd) onset latencies were: 158 (42) ms for (n = 5) face-selective SDEs; 191 (84) ms for (n = 2) tool-selective SDEs; 238 (81) ms for (n = 3) place-selective SDEs; and 305 (101) ms for (n = 3) animate-selective SDEs. In the right VTC, median (sd) onset latencies were: 163 (88) ms for (n = 2) tool-selective SDEs; 173 (35) ms for (n = 14) place-selective SDEs; and 203 (74) ms for (n = 17) face-selective SDEs.
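As a rough sketch of how a selectivity-onset latency like those above can be extracted from broadband gamma activity (BGA) time-series, the function below takes the first sustained run of samples at which the target category's trial-averaged BGA exceeds every other category's mean. This is a simplified, hypothetical stand-in for the pairwise statistical comparisons actually used in the study; the function name, the `min_run` persistence criterion, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def selectivity_onset(bga, target, t, min_run=10):
    """First time (ms) at which the target category's trial-averaged BGA
    exceeds the trial-averaged BGA of every other category for at least
    `min_run` consecutive samples. `bga` maps category -> (trials x time)."""
    target_mean = bga[target].mean(axis=0)
    other_max = np.max([v.mean(axis=0) for k, v in bga.items() if k != target],
                       axis=0)
    above = target_mean > other_max
    run = 0
    for i, flag in enumerate(above):
        run = run + 1 if flag else 0
        if run >= min_run:                  # sustained selectivity found
            return t[i - min_run + 1]       # start of the sustained run
    return None                             # no sustained selectivity

# synthetic example: face BGA steps above all other categories at 150 ms
t = np.arange(0, 400)                       # time axis in ms, 1 ms steps
bga = {
    "faces":  np.tile(2.0 * (t >= 150), (5, 1)),   # 5 trials x time
    "places": np.full((5, t.size), 0.5),
    "tools":  np.zeros((5, t.size)),
}
onset = selectivity_onset(bga, "faces", t)  # -> 150
```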
Systematic comparisons of onset latencies of category-selectivity (S3 Fig), within regions per hemisphere (Wilcoxon signed-rank test, FDR corrected for multiple comparisons), revealed no significant differences across categories in any of the regions (q > 0.05). We note that in the LOC, bilaterally, face-selectivity onset appeared to trend earlier than place-selectivity onset. However, the variable sample sizes of these category-selective SDEs, both within and across the VTC/LOC bilaterally, limit the power of these comparisons.
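The FDR correction named above is commonly implemented with the Benjamini-Hochberg step-up procedure; a minimal sketch follows, using hypothetical p-values in place of the actual Wilcoxon outputs.

```python
import numpy as np

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted q-values for a vector of p-values."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)        # p_(i) * m / i
    # enforce monotone non-decreasing q-values across the ranked p-values
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(q_sorted, 0.0, 1.0)
    return q

# hypothetical p-values from pairwise onset-latency comparisons;
# a comparison survives correction when its q-value is <= 0.05
qvals = fdr_bh([0.004, 0.030, 0.020, 0.300, 0.950])
```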
We utilized a surface-based grouped icEEG analysis, combining data across a large cohort (n = 26; LH n = 16, RH n = 10), to provide a population-level electrophysiological evaluation of the topology of category-selectivity in higher-order visual cortex. We demonstrate a consistent spatial organization of category-selective regions with respect to specific anatomical landmarks in the ventral temporal and lateral occipital cortical complexes (VTC and LOC). Importantly, our findings advance prior work by demonstrating that the use of surface-based normalization strategies in grouped icEEG analyses preserves structure-function coupling in a common brain space. In doing so, we provide a method to circumvent the sparse-sampling problem that has constrained the broader application of icEEG to the study of cognitive function at the single subject level [60, 122].
Our data reveal significant associations between category-selectivity and both lateral-to-medial and posterior-to-anterior axes in the VTC, as well as a dorsal-to-ventral axis in the LOC, bilaterally.
In the LOC, the lateral occipital sulcus (LOS) provides a consistent boundary for transitions in selectivity between living (face and animate) and non-living (place and tool) stimuli: face- and animate-selective regions are constrained at or ventral to the LOS, while place- and tool-selective regions are constrained dorsally. Notably, face- and animate-selective SDEs are interspersed on the ventral aspects of the LOC in a fashion consistent with prior fMRI studies that demonstrate alternating regions of face- and limb-selectivity [21, 123, 124]. Additionally, in the left LOC, tool stimuli elicit strong, but non-selective, activations in SDEs localized ventral to the LOS. Although the ventral LOC exhibits an overall greater selectivity for living stimulus categories, the role of the LOC in more general visual form processing is well documented [18, 21, 37, 125, 126].
In the VTC, the mid-fusiform sulcus (MFS) provides a consistent boundary for transitions in selectivity between living (face and animate) and non-living (place and tool) stimuli: face- and animate-selective areas are constrained at or lateral to the MFS, while place- and tool-selective regions are constrained at or medial to the MFS. Furthermore, in the left VTC, the anterior aspect of the MFS predicts the location of word-, animate-, and tool-selective responses, suggesting that the VTC may possess additional functional gradients along the postero-anterior anatomical axis. Notably, regions demonstrating word-selectivity are clustered around the intersection of the occipito-temporal sulcus (OTS) and the anterior MFS (Fig 6), consistent with prior studies of word selectivity that have localized cortical regions sensitive to orthographic stimuli to the general vicinity of the OTS (i.e. the visual word-form area) [13, 17, 45, 76–78, 112, 127]. Given that word, tool, animate, and face stimuli are typically dependent on central vision [23, 43, 44, 46], the co-localization of word-selectivity with these other categories remains consistent with our original prediction that word-selectivity should be observed in regions with foveal bias. Taken together, these findings support the hypothesis that distinct functional maps (e.g. animacy and eccentricity bias) are arranged along similar organizational principles within the same expanse of cortical tissue.
While the locations of VTC and LOC category-selectivity reported here are consistent with an extensive body of invasive and non-invasive neuroimaging studies [9, 14, 19, 20, 37, 38, 50, 61, 74, 105, 128–138], our findings are novel in that they provide direct electrophysiological support for hypotheses of hierarchical information structuring using icEEG data combined across many individuals. Such hypotheses propose that small-scale functional representations are nested together within larger-scale functional maps in higher-level visual cortex, facilitating object categorization by the visual system (and possibly other higher-order cognitive systems) by enabling the extraction of different levels of categorical detail at different spatial scales (i.e. small scale for face information, larger scale for animacy information) [22, 23]. This hierarchical information structure is believed to arise from the distinct anatomical organization of these regions, as the MFS and LOS also predict transitions in cortical micro- and macro-architecture (e.g. cyto- and receptor architectonics and white-matter structural networks, respectively) [25, 33, 34, 36]. Such organization may speed visual categorization by directing unrelated visual information to distinct neural networks operating in parallel (e.g. details pertaining to scenes vs. faces), while related visual information (e.g. faces and body-parts) converge onto shared neural substrates [23, 139].
Notably, such a parallel network organization is independently supported by the results of our BGA time-series analyses, which revealed no significant regional differences in the emergence of category-selectivity across conditions (between ~150 and ~250 ms). The relative similarities in selectivity onset latencies both across categories within the VTC, as well as between the VTC and LOC, support the hypothesis that visual information is received and processed in these regions in a largely independent fashion [21, 25, 113, 140, 141]. Although the small and unequal numbers of category-selective SDEs (within and across regions) make a definitive assessment of our time-series results impossible, we note that the latencies of selectivity reported here are consistent with prior intracranial work in both non-human primates and humans [1, 3, 61, 63, 64, 66, 68, 128, 130, 142, 143].
To date, evidence for hierarchical coding models has come almost exclusively from non-invasive neuroimaging studies. Although a recent electrophysiological study has also reported large-scale animacy distinctions along the MFS, the analysis in this study was restricted to a small sample size (n = 6; LH n = 3, RH n = 3) and constrained to the individual level. Our work here validates their findings in a larger population, extends the investigation to the LOC, and broadens the stimulus classification to include tools and words. The high spatial resolution of icEEG enabled us to confirm the boundaries of these higher-level visual regions via the consistent localization of non-responsive SDEs anterior to the MFS in the VTC and superior to the transverse occipital sulcus in the LOC. The finding that no non-selective SDEs are localized within postero-lateral VTC, bilaterally, supports prior reports on the highly selective nature of this region (specifically with respect to faces) [141, 144]. Notably, our observation that SDEs with dual-selectivity were localized within the MFS or LOS indicates either that our recordings average across multiple modules arranged in proximity to each other within the sulcus, or that the transitions between neuronal clusters tuned to specific categories may be gradual. While the recording scale of the SDEs used clinically does not allow us to distinguish between these two possibilities, our results nevertheless provide novel support that these sulci—the MFS and LOS—are critical to the functional topology of higher-level visual cortex.
The sparse-sampling problem has been a long-standing limitation of icEEG, to which the recent development of surface-based grouped techniques provides a viable and much-needed solution [56, 87, 98, 122]. In the current study, we combined data across 26 different subjects, each introducing a unique source of topological and pathological variability. The nonlinear transformation utilized here to map 242 SDEs into a common brain space preserved structure-function coupling across this heterogeneous population, thus validating surface-based approaches to grouped icEEG. Furthermore, our findings also demonstrate a consistency of functional representation in our patient population—both amongst themselves and with respect to healthy subjects—thereby validating the use of patients with focal epilepsy for the study of cognitive function. In doing so, our work advances the field of icEEG by broadening its potential to contribute to the study of human cognition beyond the single subject level.
Three main limitations of this work are apparent to us. The first is that we include only subjects implanted with SDEs, which record from the gyral crowns and may be biased against activity arising from sulcal sources. Notably, prior literature focusing on word-, limb-, and body-selectivity in the VTC has reported regions localized in or near the OTS [13, 21, 27, 45, 123, 145, 146]. The paucity of VTC animate selectivity reported in the current study, as well as the clustering of word-selectivity on gyral crowns in the anterior MFS, may have resulted from this gyral bias. To investigate this possibility, future icEEG work will integrate SDE data with data obtained from penetrating depth electrodes or stereotactic EEG.
A second limitation is the inconsistency in the low-level visual features of our stimuli (e.g. colored images for places vs. gray-scale face stimuli vs. line-drawings of tools/animate stimuli), which provides a potential confound in our analysis. However, higher-level visual regions are known to be invariant to changes in low-level visual features, and to maintain visual selectivity across a large spectrum of visual information, including color [23, 147–154]. Such invariance is reflected here by the co-localization of SDEs exhibiting category-selectivity for visually disparate stimuli along sulcal boundaries in the VTC and LOC. More specifically, while place and tool stimuli were the least similar in terms of low-level features (e.g. real color images of large, naturalistic stimuli vs. line-drawings of small, handheld objects), both were clustered together medial to the MFS. Similarly, in the LOC, face and animate stimuli (gray-scale vs. line-drawings, respectively) were clustered together ventrally with respect to the LOS. Future icEEG studies incorporating a more diverse selection of category classes will be needed to more fully substantiate our findings, as our results are limited by the use of only two animate and inanimate classes.
The third limitation is that our stimulus set does not allow us to unequivocally claim that the abstract semantic concept of “animacy” is the driving force behind the topological organization we observe. Notably, prior studies have argued that animacy distinctions in higher-order visual areas may simply be a by-product of shape similarities between stimuli of related categories (but see ) [156–161]. Nevertheless, category-specific functional gradients along abstract semantic boundaries (e.g. animacy) have been previously demonstrated in the congenitally blind. Additionally, in a recent study describing the topographic representation of body parts in the VTC and LOC, shape similarities were found to be insufficient to explain the architecture of the body-maps observed. Specifically, the authors demonstrated that regions preferential to a specific class of body-parts (e.g. upper limbs) were more responsive to within-class images, despite their greater dissimilarities in shape (e.g. hands and elbows), than to more similarly shaped images from distinct classes (e.g. feet and knees—lower-limbs). Finally, a recent computational study has suggested how functional representations along abstract semantic boundaries (specifically animacy) could be achieved via top-down influences (reflected in supervised learning models), with their most successful models incorporating both visual and semantic information.
Thus, a final account of the functional topology within higher-order visual regions will likely need to account for both low-level visual features (e.g. shape) as well as influences from semantic or categorical dimensions [23, 37, 40, 51, 155, 161, 163–165]. This interpretation is in line with evidence from recent monkey electrophysiological studies suggesting that core (i.e. invariant) object recognition in primates (both human and non-human) relies on non-semantic representations of visual features, upon which semantic knowledge (in humans) can subsequently be learned [155, 159, 161, 163].
We provide a grouped icEEG investigation of VTC and LOC category-selectivity, and demonstrate unequivocal structure-function coupling through direct electrophysiological recordings in a large human cohort. Our findings support hypotheses of hierarchical information structuring in higher-level visual cortex, via the generation of large-scale functional maps (e.g. animacy) from nested functional representations consequent to this structure-function coupling.
Surface-based strategies for icEEG analysis provide novel opportunities for researchers to pool ECoG datasets across centers. Given the relative rarity of icEEG data in many cortical regions of interest, the adoption of such collaborative strategies could provide an invaluable tool to greatly expand the relevant application of high spatiotemporal resolution icEEG to the study of higher-level cognitive function.
S1 Dataset comprises a Matlab file (.mat) containing 6 variables. For the 94 VTC SDEs across the 16 subjects with left hemispheric coverage, there exists a distinct matrix for the BGA mean, BGA standard deviation, and d’ indices for each of the 5 categories tested. The order of the categories in these variables is identical to the order listed in the “Categories” cell variable. Additionally, for each SDE, a “Pt_Coords_LH_VTC” matrix contains the SDE coordinates in group space (MNI N27 brain, aligned to Talairach space). Finally, there is also a cell labeled “pt_dprime_final_shuf_LH_VTC” which contains all 10,000 shuffled d’ indices for each subject’s SDEs. The cell format makes visible the number of SDEs each subject contributed: the number of SDEs contained in each cell corresponds to the number of VTC SDEs for the LH cohort listed in Table 1. This information can be used to identify which SDE belongs to which subject in the other matrices contained in this S1 Dataset.
The S2 Dataset contains matrices and cells identical to those described for the S1 Dataset. In this case, the data are for the 64 VTC SDEs across the 10 subjects with right hemispheric coverage.
The S3 Dataset contains matrices and cells identical to those described for the S1 Dataset. In this case, the data are for the 48 LH LOC and 35 RH LOC SDEs across the entire bilateral cohort of 26 subjects.
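For readers working in Python rather than Matlab, these supplementary datasets can be read with `scipy.io.loadmat`. The sketch below is self-contained: it first writes a toy file reusing two of the variable names described above (`Pt_Coords_LH_VTC` and `pt_dprime_final_shuf_LH_VTC`) but with synthetic values and reduced sizes, then recovers the SDE-to-subject mapping from the per-cell SDE counts, as the dataset descriptions explain.

```python
import numpy as np
from scipy.io import savemat, loadmat

# Toy stand-in for S1 Dataset: 2 subjects contributing 3 and 2 SDEs,
# with 10 shuffled d' indices per SDE instead of 10,000 (synthetic values).
rng = np.random.default_rng(0)
cells = np.empty(2, dtype=object)       # object array -> Matlab cell array
cells[0] = rng.normal(size=(3, 10))     # subject 1: 3 SDEs x 10 shuffles
cells[1] = rng.normal(size=(2, 10))     # subject 2: 2 SDEs x 10 shuffles
savemat("toy_S1.mat", {
    "Pt_Coords_LH_VTC": rng.normal(size=(5, 3)),
    "pt_dprime_final_shuf_LH_VTC": cells,
})

data = loadmat("toy_S1.mat", squeeze_me=True)
coords = data["Pt_Coords_LH_VTC"]                 # SDE coordinates (n x 3)
shuffled = data["pt_dprime_final_shuf_LH_VTC"]    # one cell per subject

# The number of SDEs in each cell identifies each SDE's subject
n_per_subject = [cell.shape[0] for cell in shuffled]
subject_of_sde = np.repeat(np.arange(len(shuffled)), n_per_subject)
```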
Scatterplots depict grouped d’ indices for each category plotted vs. subdural electrode (SDE) coordinates (in Talairach space) per hemisphere in each region. In the ventral temporal cortex (VTC; LH n = 94, RH n = 64), comparisons were made against the x and y coordinates. In the lateral occipital cortex (LOC; LH n = 48, RH n = 35), comparisons were made with the z and y coordinates. For each plot, regression lines were fitted (color-coded by category), and the strengths of association were estimated using Spearman correlations (bottom right, bold text denotes FDR corrected q ≤ 0.05, for multiple comparisons across categories and SDEs per region and hemisphere). Spearman correlations were selected (over Pearson’s) for their robustness to outlier influence and smaller sample sizes. Furthermore, Spearman’s correlation tests for monotonic relationships, and the relationships between d’ indices and SDE coordinates are not known a priori to be linear.
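A minimal illustration of this choice, using `scipy.stats.spearmanr` on synthetic data (the coordinate range, trend slope, and noise level are assumptions for demonstration only): a monotonic decrease of d’ with the x-coordinate is recovered as a negative rho without assuming linearity.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# 94 synthetic left-VTC SDEs: d' decreases monotonically with x (lateral
# to medial), plus noise; values are illustrative, not from the study.
x_coord = rng.uniform(-60.0, -20.0, size=94)        # Talairach x (mm)
d_prime = -0.05 * x_coord + rng.normal(scale=0.5, size=94)

# rank-based association: sensitive to any monotonic trend, robust to
# outliers, no linearity assumption
rho, p = spearmanr(x_coord, d_prime)
```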
Box plots depict median onset latency of category-selectivity in the left (LH) and right (RH) hemisphere lateral occipital cortex (LOC) and ventral temporal cortex (VTC), for the five categories of interest: words (cyan), tools (green), places (blue), faces (red) and non-face animate (body-parts and animals) stimuli. Timing of the onset of selectivity was evaluated for each category-selective SDE per subject, and determined using pairwise comparisons of broadband gamma activity time-series, for each category against all others. No significant differences between categories (following corrections for multiple comparisons) were noted, although low sample sizes likely underpowered these contrasts. Word stimuli were not tested in the right hemisphere, and in the RH VTC, no significant animate SDEs were observed.
Non-category-selective subdural electrodes (SDEs) are visualized on the MNI N27 template brain (aligned to Talairach coordinate space) after surface-based normalization. SDEs are color-coded by the category with the largest d’ index for that electrode (matched to image legends). SDE diameter reflects the magnitude of the d’ value for that category, scaled by the largest d’ value across categories per region (regions per hemisphere are scaled differently). Compass points denote SDE coordinates (Talairach space) and direction. Notably, in the postero-lateral aspects of bilateral ventral temporal cortex, no non-selective SDEs are observed.
We thank Dr. Alice Chuang and Dr. Stephen Mills for their statistical expertise and advice, as well as Matthew Rollo and Drs. Eleonora Bartoli, Suganya Karunakaran, Kamin Kim, and Bartlett Moore IV for their comments on earlier drafts of the manuscript. We are especially grateful to all the patients who volunteered for participation in this study, the neurologists at the Texas Comprehensive Epilepsy Program (Jeremy Slater, Giridhar Kalamangalam, Omotola Hope and Melissa Thomas) who participated in the care of these patients, Vips Patel, and all of the nurses and technicians in the Epilepsy Monitoring Unit at Memorial Hermann Hospital who helped make this research possible.
This work was supported by the National Center for Advancing Translational Sciences of the National Institutes of Health (5TL1TR000369-08 to CMK), the National Institute of Biomedical Imaging and Bioengineering (NIBIB; T32EB006350 to CRC), the National Institute on Deafness and Other Communication Disorders (NIDCD; R01DC014589 to NT) and partially by the National Science Foundation (NSF-FO 1533664 to NT). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
All relevant data are within the paper and its Supporting Information files.