Studies have indicated that temporal and prefrontal brain regions process face and vocal information. Face-selective and vocalization-responsive neurons have been demonstrated in the ventrolateral prefrontal cortex (VLPFC), and some prefrontal cells respond preferentially to combinations of faces and corresponding vocalizations. These studies suggest that VLPFC in non-human primates may play a role in communication similar to that of inferior frontal regions in human language processing. If VLPFC is involved in communication, information about a speaker's face, including identity, face-view, gaze, and emotional expression, might be encoded by prefrontal neurons. In the following study, we examined the effect of face-view on ventrolateral prefrontal neurons by testing cells with auditory stimuli, visual stimuli, and a set of human and monkey faces rotated through 0°, 30°, 60°, 90°, and −30°. Prefrontal neurons responded selectively to the identity of the face presented (human or monkey), to the specific view of the face/head, or to both identity and face-view. Neurons that were affected by the identity of the face most often showed an increase in firing in the second half of the stimulus period. Neurons that were selective for face-view typically preferred forward face-view stimuli (0° and 30° rotation). Moreover, the neurons that were selective for forward face-views were also auditory responsive, whereas neurons that preferred other views or were unselective were not. Our analysis showed that the human forward face (0°) was decoded best and also contained the most information relative to other face-views. Our findings confirm a role for VLPFC in the processing and integration of face and vocalization information and add to the growing body of evidence that the primate ventrolateral prefrontal cortex plays a prominent role in social communication and is an important model for understanding the cellular mechanisms of communication.
In social interactions, information from the face is of essential importance. Facial expression and identity are critical pieces of information that guide our communication exchanges. Within a face, the eyes and mouth receive the most attention when we view faces (Haith et al., 1977; Klin et al., 2002; Vinette et al., 2004). This bias toward examining the eyes and mouth is also present in non-human primates when looking at pictures or videos of conspecifics (Wilson and Goldman-Rakic, 1994; Nahm et al., 1997; Ghazanfar et al., 2006; Gothard et al., 2009). Of all of the interesting and informative features of a face, it is the eyes and mouth that provide clues to the emotional state of the viewed person or conspecific, and the angle of gaze that directs our attention within the environment. In non-human primates, the angle of gaze can indicate submission or dominance in social rank, an important facet of social interactions, and direct gaze is considered a threat in certain social contexts. In contrast, most human vocal exchanges occur with direct gaze when we speak to one another, and avoiding direct gaze is informative as well. Thus, areas of the brain involved in communication and language may receive information about the gaze or view of the person with whom we are communicating.
The neural circuitry involved in the processing of facial information includes cortical areas within the parietal, temporal, and frontal lobes. In the human brain, perception of faces consistently activates an area in the lateral fusiform gyrus known as the fusiform face area (FFA) (Kanwisher et al., 1997). Facial expression, identity, and gaze-direction have been shown to activate specific brain regions within the temporal lobe. Facial expression and gaze direction have been preferentially linked with the superior temporal sulcus (STS) and the amygdala, while the featural processing that supports identity recognition has been more strongly linked with the inferotemporal cortex and fusiform gyrus (Haxby et al., 2002; Engell and Haxby, 2007; Breiter et al., 1996; Morris et al., 1996; Kawashima et al., 1999). Studies in the non-human primate have also revealed selective face-processing areas of the temporal lobe (Tsao et al., 2003, 2006), and single-unit responses to faces have been recorded in a variety of brain regions. Work by Perrett and colleagues (1985) has shown that different views of the head and eyes activate different populations of neurons within the STS. Facial expression and identity have been shown to activate neurons in the inferotemporal cortex, the superior temporal sulcus, and the amygdala (Hasselmo et al., 1989; Young and Yamane, 1992; Eifuku et al., 2004; Kuraoka and Nakamura, 2007; Freiwald and Tsao, 2010), sometimes within the same cells (Gothard et al., 2007), while other studies have shown that amygdala neurons are sensitive to direct gaze (Hoffman et al., 2007).
The face processing network, however, extends beyond the temporal lobe and includes areas within the frontal lobe. Several studies have demonstrated activation of human orbitofrontal or prefrontal cortex in the processing of facial expression or gaze-direction (Dolan et al., 1996; Kesler-West et al., 2001; Vuilleumier et al., 2001; Nomura et al., 2004; Ishai et al., 2004, 2005; Sergerie et al., 2005; Engell and Haxby, 2007; LoPresti et al., 2008). Orbital and ventral prefrontal cortex have been shown to be activated during tasks of facial memory (Dolan et al., 1996) or perception of emotional faces (Kesler-West et al., 2001; Iidaka et al., 2001; Ishai et al., 2005; Pourtois et al., 2006). Importantly, face cells, similar to those in the temporal lobe that are selectively responsive to pictures of faces, have been recorded in the non-human primate ventrolateral prefrontal cortex (VLPFC) (O'Scalaidhe et al., 1997, 1999) and orbitofrontal cortex (Thorpe et al., 1983; Rolls, 1996). These single-unit studies have recently been confirmed with fMRI, in which activation of VLPFC and orbitofrontal cortex by faces was shown in macaque monkeys (Tsao et al., 2008). Finally, single cells in VLPFC have been found to respond to both vocalizations and the corresponding facial gesture (Sugihara et al., 2006). Thus, VLPFC is part of the face processing network, although its precise role remains uncertain.
In the present study, we asked whether prefrontal neurons would be differentially responsive to different views of rotated faces. Since previous studies have shown that some VLPFC neurons respond to both faces and vocalizations, we predicted that neurons might also respond differentially to face stimuli that vary in face-view/head orientation, where facial features vary in their visibility. Moreover, we hypothesized that VLPFC neurons which are responsive to auditory stimuli, including vocalizations (Romanski et al., 2005), are likely to be involved in communication and therefore may be responsive to forward face stimuli, since this face-view is most commonly utilized during communication. Our results support our hypothesis and indicate that VLPFC neurons respond to face-view. Neurons that were selective for a particular face-view were most often responsive to forward face-views/head orientations, and all of these neurons were responsive to complex auditory stimuli.
We recorded auditory and visual responsive cells in the prefrontal cortex of 3 naïve rhesus monkeys (Macaca mulatta) that had not yet been tested with combined face and vocalization stimuli. All methods were in accordance with NIH Guidelines for the Care and Use of Laboratory Animals and with the Yale Animal Care and Use Committee Guidelines or the University of Rochester Committee on Animal Care and Use. The recording methods have been previously described (Romanski et al., 2005; Sugihara et al., 2006) and are briefly described here. A stainless steel recording cylinder was chronically implanted overlying the inferior convexity of the prefrontal cortex, including areas 12 and 45 as defined anatomically (Preuss and Goldman-Rakic, 1991) and physiologically (O'Scalaidhe et al., 1993). Animals were trained in a fixation task for juice reward. Each trial consisted of a 500 ms pre-stimulus fixation period, a 1000 ms stimulus (auditory or visual) period, and a 500 ms post-stimulus period. Animals initiated the trial by fixating the central fixation point. A juice reward was delivered at the termination of the post-stimulus fixation period, after which the fixation requirement was released. There was a 2–3 s inter-trial interval. Breaking fixation at any time during the trial caused the trial to abort, and the data for that trial were discarded. Stimuli were presented in blocks of 10, with each stimulus presented in random order 8–12 times, resulting in 80–120 total trials per stimulus set. The timing of the behavioral contingencies, presentation of all stimuli, delivery of reward, and monitoring of eye position were controlled by a PC running CORTEX (NIH-derived software) or other custom software.
Because of the heterogeneity of ventral prefrontal cortex neurons, which include auditory, visual, somatosensory, saccade, and reach neurons, our standard testing procedure involved the presentation of auditory and visual stimuli that covered a wide range of potential visual and auditory features in order to find responsive cells. In this study, we did not test overall auditory or visual selectivity, except within the face-view stimulus list. The auditory stimuli were used to determine general auditory responsivity, not selectivity, and were drawn from a large library of sounds used previously (Romanski and Goldman-Rakic, 2002). The vocalizations, which were unfamiliar to the recorded subjects, included exemplars from a larger colony housed separately from the recorded subjects and included coos, screams, grunts, pant threats, chirps, and barks, which are common macaque vocalizations. In addition, we used exemplars from a library of macaque vocalizations provided by Marc Hauser from his recordings on the island of Cayo Santiago, which we have used previously in our recordings (Romanski et al., 2005). The non-vocalization auditory stimuli included FM sweeps, noise bursts, clicks, environmental sounds, tones, and chords. Auditory stimuli were presented in 10-item lists (n=12 lists), with one stimulus presented per trial. Each 10-item list contained 2 monkey vocalizations, 2 human vocalizations, 2 band-passed noise stimuli, 2 FM sweeps, and 2 environmental sounds (door slamming, keys jangling, car honking, whistle, etc.). Because of the heterogeneity in our auditory lists, we were not able to test categorical responses to auditory stimuli; this would require a much larger list with more exemplars per category, as has been done in other studies (Romanski et al., 2005).
Auditory stimuli were presented via a PC connected to a Yamaha MSP5 monitor speaker (overall frequency response 50 Hz–40 kHz) located just below the video monitor and placed 30 inches directly in front of the monkey. The auditory stimuli varied from 65–75 dB SPL measured at the level of the monkey's ear with a B&K sound level meter.
Visual stimuli were also presented in 10-item lists. There were 40 lists, each containing 1–3 monkey faces, 1–3 human faces, 2 familiar objects, 2 clipart objects, 1 solid color square, and 1 pattern or fractal square. Visual stimuli were presented on a computer monitor (30 inches in front of the monkey) so that they spanned ~7 degrees. These studies were performed on a CRT monitor with a refresh rate of 72 Hz. Neurons which demonstrated a response to any of the auditory or visual stimuli from the presented list, or which showed any task-related activity, were tested further with additional stimulus lists, including the face-view/head rotation list shown in Figure 1. The human face in this list is taken from the Tarrlab Object Databank. The monkey face is a digitized photo of an unfamiliar 4-year-old male rhesus monkey. The frames for each face-view/head orientation were separate digital pictures taken as the monkey looked to a cued location.
The recordings took place in a sound-attenuated room. Animals were acclimated to the laboratory and testing conditions and then trained on a fixation task. For these animals, eye position was measured with an ISCAN® infrared pupil monitoring system (1 animal) or with a magnetic search coil apparatus (CNC Engineering, Seattle, WA; 2 animals). The fixation spot subtended 0.5 degrees, and the fixation window during the pre-stimulus and post-stimulus periods was 2–4 degrees. The fixation window was enlarged during stimulus presentation to 5–7 degrees.
Each day, the subjects were brought to the experimental chamber and prepared for extracellular recording. The head was fixed in place by means of a chronically implanted head-post, and a stereotaxic adaptor was placed on the recording cylinder. A parylene-coated tungsten electrode (0.8–2.0 MΩ at 1 kHz; FHC, Inc.) was lowered into the target region by a hydraulic microdrive (Narishige MO-95C), which fit over the recording cylinder. The neuronal activity was amplified (BAK MD-4 amplifier), filtered (Krohn-Hite 3700, Avon, MA), discriminated (BAK window discriminator), and displayed on an oscilloscope. Discriminated spikes were digitized and saved on-line. Simultaneous isolation of multiple units was possible with dual time/amplitude window discriminators. During the recordings, the electrode was advanced into the brain until a stable unit was found. The cell was tested with one or more sets of auditory and/or visual stimuli. If a cell was not responsive to any stimuli or to any aspect of the task (i.e., fixation, reward, etc.), the electrode was advanced 150–200 microns and a new cell was isolated and tested. Each isolated unit (n=301) was tested with at least one auditory list and one visual list, followed by the rotated face-view/head stimulus set (stimuli as in Figure 1). The order of the auditory and visual testing lists was varied randomly across the population of isolated single units. The face-view/head orientation stimulus set was tested either 2nd or 3rd in order, and this was also varied across the population. Although we did not map the receptive fields of visually responsive neurons in the VLPFC, previous studies have shown that the receptive fields of visually responsive neurons in VLPFC are large and include the fovea (Suzuki and Azuma, 1983). In Wilson et al. (1993), receptive fields were tested in stimulus-selective neurons, and neurons responded most strongly when optimal stimuli were presented foveally.
The area targeted for recording in this study was the ventrolateral prefrontal cortex (areas 12 and 45) as shown in Figure 2. Recordings in this area have shown that responses to complex sounds and to visual stimuli can be elicited. The organization of the auditory and visual responsive region and the recording tracks are shown in Figure 2.
A unit was considered to be responsive if the analysis revealed a significant difference in activity between the baseline firing rate and the stimulus period using a repeated-measures ANOVA (p < 0.05). For the baseline firing rate, the inter-trial interval (ITI) is traditionally used, but the fixation period can also be used if units are not fix-spot responsive or responsive to task onset during the fixation period. Comparison of neuronal firing in the stimulus period with firing in the ITI (54/96) or in the fixation period (55/96) yielded a similar number of responsive units. Units with a change in firing rate during the stimulus period were further analyzed for selectivity with post-hoc analysis (Tukey HSD) or with the analyses defined below.
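As a schematic illustration (not the authors' code), the responsiveness criterion can be sketched in Python, assuming per-trial firing rates for the baseline and stimulus periods. With only two repeated measures (baseline vs. stimulus), a repeated-measures ANOVA is equivalent to a paired t-test (F = t²), so a paired test gives the same p-value:

```python
import numpy as np
from scipy.stats import ttest_rel

def is_responsive(baseline_rates, stimulus_rates, alpha=0.05):
    """Compare per-trial baseline vs. stimulus firing rates.

    With two repeated measures (baseline, stimulus), the repeated-measures
    ANOVA reduces to a paired t-test (F = t**2), so the p-values agree."""
    _, p = ttest_rel(stimulus_rates, baseline_rates)
    return bool(p < alpha)

# hypothetical unit whose rate roughly doubles during the stimulus period
baseline = np.array([4, 6, 5, 5, 7, 4, 6, 5, 5, 6], dtype=float)   # spikes/s
stimulus = np.array([9, 12, 10, 11, 13, 9, 12, 10, 11, 12], dtype=float)
print(is_responsive(baseline, stimulus))  # -> True
```

The example data are invented for illustration; in practice the rates would come from counting spikes in the 500 ms fixation and 1000 ms stimulus windows of each trial.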
We performed a 2-way MANOVA on neurons which were tested with the face-view/head orientation list (Figure 1), where the factors were Identity of the face (monkey, human) and Face-view of the rotated face from center (0°, 30°, 60°, 90°, −30°), examined in the early (0–500 ms) and late (501–1001 ms) parts of the stimulus period. Neurons that were judged to be responsive to either a single factor or an interaction of the two factors were further examined. A selectivity index (SI) was calculated using the absolute value of the averaged responses to each stimulus minus the baseline firing rate. The SI is a measure of the depth of selectivity across all the face-view and identity stimuli presented and is defined as:

SI = (n − Σi (λi / λmax)) / (n − 1)
where n is the total number of stimuli, λi is the firing rate of the neuron to the ith stimulus, and λmax is the neuron's maximum firing rate to one of the stimuli (Wirth et al., 2009). Thus, if a neuron responds to only one stimulus and not to any other stimuli, the SI would be 1. If the neuron responded identically to all stimuli in the list, the SI would be 0.
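The index can be computed in a few lines of numpy; this is a minimal sketch (not the authors' code), with the two limiting cases from the text checked directly:

```python
import numpy as np

def selectivity_index(rates, baseline):
    """Depth-of-selectivity index SI = (n - sum(lam_i / lam_max)) / (n - 1).

    rates    : mean firing rate to each of the n stimuli
    baseline : baseline firing rate (subtracted, absolute value taken)
    Returns 1 when only one stimulus drives the cell, 0 when all stimuli
    evoke identical responses."""
    lam = np.abs(np.asarray(rates, dtype=float) - baseline)
    n = lam.size
    return (n - lam.sum() / lam.max()) / (n - 1)

# limiting cases described in the text (hypothetical rates)
print(selectivity_index([10, 0, 0, 0, 0], baseline=0.0))  # -> 1.0
print(selectivity_index([7, 7, 7, 7, 7], baseline=0.0))   # -> 0.0
```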
Linear discriminant analysis (Johnson and Wichern, 1998) was used to classify single-trial responses of individual neurons with respect to the stimuli which generated them, using the MATLAB classify function. Classification performance was estimated using 2-fold cross validation. This analysis resulted in a stimulus-response matrix, where the stimulus was the face-view presented on an individual trial, and the response was the face-view to which each single-trial neural response was classified. Each cell of the matrix contained the count of the number of times that a face-view was classified as a particular response by the algorithm. Percent correct performance for each stimulus class was calculated by dividing the number of correctly classified trials for a particular stimulus (the diagonal element of a particular row) by the total number of times that stimulus was presented (usually 8–12, the sum of all elements in that row).
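The decoding pipeline can be sketched as follows. This is a simplified pure-numpy stand-in (a nearest-class-mean decoder rather than MATLAB's `classify` linear discriminant), and the trial counts, bin counts, and firing-rate statistics are all hypothetical:

```python
import numpy as np

def decode_confusion(X, labels, seed=0):
    """2-fold cross-validated nearest-class-mean decoding.

    X      : (n_trials, n_bins) single-trial response vectors
    labels : (n_trials,) integer stimulus label for each trial
    Returns the stimulus-response confusion matrix (rows = presented
    stimulus, columns = decoded stimulus)."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    # stratified split: half of each stimulus's trials in each fold
    fold_a, fold_b = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        fold_a.extend(idx[: len(idx) // 2])
        fold_b.extend(idx[len(idx) // 2:])
    cm = np.zeros((classes.size, classes.size), dtype=int)
    for test, train in [(fold_a, fold_b), (fold_b, fold_a)]:
        train, test = np.array(train), np.array(test)
        # class means estimated on the training fold only
        means = np.stack([X[train][labels[train] == c].mean(axis=0)
                          for c in classes])
        for i in test:
            pred = ((X[i] - means) ** 2).sum(axis=1).argmin()
            cm[labels[i], pred] += 1
    return cm

# hypothetical data: 10 face-view stimuli x 10 trials, 4 response bins each
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(10), 10)
X = rng.poisson(5 + 3 * labels[:, None], size=(100, 4)).astype(float)

cm = decode_confusion(X, labels)
pct_correct = cm.diagonal() / cm.sum(axis=1)   # per-stimulus percent correct
```

Each trial is classified exactly once across the two folds, so each row of the confusion matrix sums to the number of presentations of that stimulus, as described in the text.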
We also calculated the partial information (Romanski et al., 2005) contained in the neural responses of a single cell about each stimulus as:

I(s) = Σr P(r|s) log2 [ P(r|s) / P(r) ]
where the sum is taken over all possible responses. The stimulus-response probability distributions were estimated using the classification matrix. It has been noted that the use of small sample sizes leads to a bias, or overestimate, of the information (Optican et al., 1991; Treves and Panzeri, 1995; Panzeri and Treves, 1996; Golomb et al., 1997; Borst and Theunissen, 1999; Panzeri et al., 2007; Sakaguchi et al., 2010). The bias term can be corrected in several ways, including calculating the Bayesian estimate of the bias term according to Panzeri and Treves (1996) and subtracting it from all information measures (Averbeck et al., 2003). Another bias-correction method relies on using the decoded response (e.g., Rolls et al., 1998; Victor and Purpura, 1996), which is the method we have used in the current study. Computation of information from the decoding matrix ensures that the decoded information I(S; D) is equal to, or less than, I(S; R); that is, I(S; D) gives a lower bound on I(S; R), so the information is not overestimated. The average of the partial information across stimuli gives the total or mutual information in the neural response about all of the stimuli. The partial information about a particular stimulus is a measure of how well the response can be predicted when a given stimulus is shown, and is related to the percent correct classification for that stimulus. However, it is possible for a given stimulus to elicit a reliable, consistent response, which yields high partial information even when the decoded response is incorrect; in that case the percent correct classification will be low, as when several stimuli from a list elicit identical responses (i.e., a cell which responds in a similar manner to all of the stimuli presented). Since there may be more information in the raw neural responses than in the decoded neural responses, due to mismatches between the model and the real response distribution, these information estimates are lower bounds.
This regularization is necessary with the small number of trials conducted for each stimulus.
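The partial-information calculation from a decoding confusion matrix can be sketched as follows (a minimal numpy illustration, not the authors' code), with rows indexing presented stimuli and columns indexing decoded responses:

```python
import numpy as np

def partial_information(cm):
    """Per-stimulus partial information I(s) = sum_r P(r|s) log2[P(r|s)/P(r)],
    estimated from a decoding confusion matrix (rows = stimuli, columns =
    decoded responses). Returns one value per stimulus, in bits."""
    cm = np.asarray(cm, dtype=float)
    p_rs = cm / cm.sum(axis=1, keepdims=True)   # P(r|s), row-normalized
    p_s = cm.sum(axis=1) / cm.sum()             # P(s), stimulus probabilities
    p_r = p_s @ p_rs                            # P(r) = sum_s P(s) P(r|s)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p_rs > 0, p_rs * np.log2(p_rs / p_r), 0.0)
    # mutual information would be the stimulus-weighted average: (p_s * I).sum()
    return terms.sum(axis=1)

# perfect decoding of 4 equiprobable stimuli -> 2 bits per stimulus
print(partial_information(np.eye(4, dtype=int) * 10))  # -> [2. 2. 2. 2.]
```

A uniform confusion matrix (all stimuli decoded at chance) gives zero partial information for every stimulus, matching the intuition in the text.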
We examined the responses of ventrolateral prefrontal cortex neurons in rhesus macaques as they performed a fixation task for juice reward while auditory or visual stimuli were presented during fixation. During recordings, each neuron was first tested with an array of 10 visual stimuli (including objects, faces, and patterns), 10 auditory stimuli (including vocalizations, noise bursts, and other sounds), and the face-view stimulus array (Figure 1), as described above. Of 301 isolated cells in the ventrolateral prefrontal cortex (Figure 2), 95 cells completed testing with a visual, an auditory, and the face-view/head orientation stimulus set. We compared the neuronal response during the stimulus period with baseline firing using a 1-way repeated-measures ANOVA and found that 66/95 neurons (69%) responded to the auditory, visual, or face-view stimulus arrays. 48/95 cells were unimodal visual, in that they were significantly responsive to the visual and/or face-view stimulus sets but not to the auditory list with which they were tested. These neurons responded to a wide range of stimuli including faces, objects, and patterns (Figure 3A, B). In contrast, only 4/95 cells were unimodal auditory and responded only to stimuli from the auditory stimulus array (Figure 3C), while 14/95 cells were multisensory and responded to both auditory and visual/face-view lists. Overall, a total of 39 cells had a significant response relative to baseline for one or more stimuli in the face-view/head orientation stimulus set.
We analyzed the neurons tested with the face-view stimulus set (n=95) with a 2-way MANOVA to assess the effect of Face-view (0°, 30°, 60°, 90°, −30°) and Identity (monkey or human) in the early (0–500 msec after stimulus onset) and late (501–1001 msec after stimulus onset) stimulus periods. 14 neurons had a main effect of Face-view, while 20 neurons showed a main effect of Identity. A total of 9 neurons had a significant interaction of Face-view and Identity, like the cell shown in Figure 1. Furthermore, analysis of the early and late bins of the response to the face-view stimuli indicated a significant interaction of Time bin and Identity: more neurons showed an effect of Identity in the late part of the stimulus period (n=13) than in the early part (n=7). For example, the neuron in Figure 1 had a main effect of Face-view (p < 0.001), a main effect of Identity (p < 0.001), and an interaction of Face-view x Identity (p < 0.001). For this cell, both human and monkey faces at 0° and 30° evoked an increase in firing, but the response to human faces was greater due to sustained firing in the second half of the stimulus period. The effect of Identity was significant for the late bin of the stimulus period (p < 0.001) but not for the early bin (p = 0.093) for this cell. Post-hoc analysis with the Tukey test indicated that the human face stimuli at 0° and 30° were significantly different (p < 0.05) from all other stimuli but not from each other. The forward monkey faces (0° and 30°) were different from all other monkey faces (p < 0.05).
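The factorial structure of this analysis can be illustrated with a hand-rolled balanced two-way ANOVA on simulated per-trial firing rates (a sketch only; the actual analysis was a MANOVA over early and late response bins, and all rates, trial counts, and effect sizes below are invented):

```python
import numpy as np
from scipy.stats import f as f_dist

def two_way_anova(y, a, b):
    """Balanced fixed-effects two-way ANOVA on per-trial firing rates y,
    with factor labels a (e.g., face-view) and b (e.g., identity).
    Returns p-values for the two main effects and their interaction."""
    y, a, b = (np.asarray(v) for v in (y, a, b))
    grand = y.mean()
    A, B = np.unique(a), np.unique(b)
    n_rep = y.size // (A.size * B.size)              # trials per design cell
    ss_a = n_rep * B.size * sum((y[a == i].mean() - grand) ** 2 for i in A)
    ss_b = n_rep * A.size * sum((y[b == j].mean() - grand) ** 2 for j in B)
    ss_ab = n_rep * sum((y[(a == i) & (b == j)].mean() - y[a == i].mean()
                         - y[b == j].mean() + grand) ** 2
                        for i in A for j in B)
    ss_err = sum(((y[(a == i) & (b == j)]
                   - y[(a == i) & (b == j)].mean()) ** 2).sum()
                 for i in A for j in B)
    df_a, df_b = A.size - 1, B.size - 1
    df_err = y.size - A.size * B.size
    ms_err = ss_err / df_err
    pval = lambda ss, df: float(f_dist.sf((ss / df) / ms_err, df, df_err))
    return pval(ss_a, df_a), pval(ss_b, df_b), pval(ss_ab, df_a * df_b)

# hypothetical cell: 5 views x 2 identities x 8 trials, with a strong
# elevation of firing for the forward (0 degree) view
rng = np.random.default_rng(0)
views = np.repeat(np.arange(5), 16)          # 0, 30, 60, 90, -30 coded 0..4
ident = np.tile(np.repeat([0, 1], 8), 5)     # human / monkey
rates = rng.normal(5, 1, size=80) + np.where(views == 0, 10, 0)
p_view, p_ident, p_inter = two_way_anova(rates, views, ident)
print(p_view < 0.001)  # -> True (the view effect is large by construction)
```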
We examined the response of neurons to the face-view set of stimuli by considering their selectivity for particular face-views. To this end, we calculated a selectivity index for neurons which had a main effect of Face-view or an interaction of Face-view x Identity (n = 17 neurons), using the absolute value of the averaged response to each condition minus baseline (see Methods). The SI provides a measure of the depth of selectivity: a highly selective neuron which responds to only one face-view would have an SI of 1, while an SI of 0 would indicate an identical response to all 10 face-view stimuli. 13/17 neurons had an SI greater than 0.50, as indicated in the histogram in Figure 4. The cell depicted in Figure 1, which was responsive to Face-view, Identity, and their interaction, responded to the 0° and 30° face-view stimuli for both human and monkey faces and had a selectivity index of 0.568.
Figure 5 is a bar graph of the mean firing rate for four different face-view responsive neurons. The first three cells (Figure 5A–C) all had a main effect of Face-view, and the stimuli which evoked the greatest responses were at 0° and/or 30°. The cell in Figure 5A had a main effect of Face-view (p < 0.001). Post-hoc analysis with the Tukey test indicated that the human face at 0° and the monkey face at 0° were significantly different from all other stimuli for this cell (Figure 5A); the SI was 0.802. The cell in Figure 5B was significantly responsive to 3 face-views and had an SI of 0.659. This cell had a main effect of Face-view (p < 0.001) and a main effect of Identity (p < 0.004). The monkey faces at 0° and 30° were significantly different (p < 0.05) from all other stimuli but not from each other, and the human face at 0° was significantly different from the other human face-views. The cell depicted in Figure 5C also had effects of Face-view (p < 0.001) and Identity (p < 0.003) and their interaction (p < 0.038) and was most responsive to the 0° and −30° human faces (p < 0.05) and to the 0° monkey face (SI = 0.740). Some neurons responded significantly above baseline to many of the face stimuli in our set without particular selectivity. The cell in Figure 5D did not show an effect of Face-view but had a significant effect of Identity in the late part of the stimulus period (p = 0.0473) and thus had a lower SI of 0.264. Of the face-view responsive neurons (n=17), most (n=13) showed their greatest response to the forward-facing stimuli at 0°, or at 0° and ±30° rotation. Only 2 neurons had a significant response to the 90° face stimulus (HF 90; Figure 6).
We further evaluated the cells which had an effect of Face-view and found that the cells which were selective for forward views (0°, ±30°) were multisensory, in that all of these neurons also had significant responses to one or more stimuli from the auditory list with which they were tested (n=13 cells). In contrast, the cells which responded to the face-view set of stimuli but were not selective for 0° and 30° (n=3) were not auditory responsive. Fisher's exact test confirmed that forward face-view-selective neurons were more likely to be auditory responsive than neurons which preferred other views (p < 0.025; Table 1). Thus, neurons which were responsive to forward face-views were more likely to be auditory responsive than neurons tuned to other views. Figure 7 depicts two multisensory neurons responsive to both auditory stimuli and the face-view stimuli. For both cells, there is a response to the forward face-view stimuli at 0° and 30° (Figure 7A) and at 0° (Figure 7B). For the neurons depicted here, there was a clear response to the macaque and human vocalizations, although selectivity for vocalizations over other stimuli was not tested in this study.
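A contingency analysis of this kind can be run with scipy. The 2x2 table below uses counts inferred from the text (13 forward-view-selective cells, all auditory responsive; 3 other-view cells, none auditory responsive) and should be treated as our reading rather than the published Table 1:

```python
from scipy.stats import fisher_exact

# assumed 2x2 table (rows: forward-view-selective vs. other-view cells;
# columns: auditory responsive vs. not) -- counts inferred from the text
table = [[13, 0],
         [0, 3]]
odds_ratio, p = fisher_exact(table)
print(p < 0.05)  # -> True: face-view preference and auditory
                 #    responsiveness are associated
```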
We examined the classification of face-view stimuli by prefrontal neurons using a linear discriminant analysis on cells which had been shown to be responsive to face-view in the 2-way MANOVA. The percent correct is plotted for 3 cells in Figure 8; the mean firing rate for each of these same cells was plotted in Figure 5 (A, B, D). For the cell in Figure 5A, the human face at 0° evoked the highest response and was significantly different from all other stimuli. In the decoding analysis (Figure 8A), its percent correct was 63%, while the monkey face at 0°, which elicited a lower response, was decoded with a lower percent correct but still above chance, and all other stimuli were near or below chance (dotted gray line). For other cells, although the mean firing rate was greatest for the 0° and 30° stimuli, these responses were similar enough to each other that the stimulus which elicited the weakest response was sometimes decoded with a higher percent correct. In Figure 5B, the human profile face elicited the lowest mean firing rate (below baseline), while the monkey 0° and 30° faces evoked the highest mean firing rates; this resulted in a high percent correct for this "weak" stimulus in the decoding analysis (Figure 8B). Cells which were not selective, although they responded above baseline to many or all of the face stimuli (Figure 5D), gave neuronal responses that were similar across stimuli, which resulted in a low percent correct at or below chance (Figure 8C). We averaged the results of the decoding analysis across the population of face-view responsive cells (n=17) and found that the human forward face (0°) was most reliably discriminated (Figure 9, black squares); the percent correct was 35% for the human 0° stimulus, while the monkey face at 0° was decoded at 28% correct and the human face at 90° at 26.4% correct. The other face-view stimuli were not decoded as reliably but were still above chance level (gray dashed line).
We also calculated the partial information contained in the neural responses of a single cell about each stimulus for the face-view responsive neurons. The partial information about a particular stimulus is a measure of how well the response can be predicted when a given stimulus is shown, and is related to the percent correct classification for each stimulus (Averbeck et al., 2003). The average of the partial information across all stimuli is the total or mutual information. We estimated the stimulus-response probability distributions using the classification matrix generated with the linear discriminant analysis and calculated the partial information for the face-view responsive cells across all 10 face stimuli. Just as the direct human face at 0° (HF0) was decoded best, it also elicited the highest partial information (Figure 9) across the population of face-view cells (n=17). Significance testing of the partial information with a 1-way ANOVA indicated a significant difference across the face-view stimuli (F=2.47, p=0.012), and post-hoc testing indicated that the human face at 0° had higher partial information than all other face-view stimuli. The human face in profile (90°) had the second highest partial information, just as it was elevated in the decoding analysis.
In the present study, we have shown that single neurons in the macaque ventrolateral prefrontal cortex (VLPFC) respond differentially to changes in face-view/head orientation. Neurons in VLPFC responded selectively to the identity of the face presented (human or macaque), to the view of the face/head, or to both identity and face-view. Neurons which were affected by the identity of the face most often showed an increase in firing in the second half of the stimulus period. Neurons that were selective for face-view were typically most responsive to the forward face stimuli (0° and 30° rotation). Our data indicated that the human forward face (0°) was decoded better than any other face stimulus and also contained the most information relative to other face-views (Figure 9). Furthermore, neurons that preferred the forward views (0° and 30°) were auditory responsive.
The preference for forward face-views by prefrontal neurons is interesting since the forward face-view conveys a variety of social signals which prefrontal neurons might encode. First, direct gaze is considered a social threat for monkeys, especially from a human (Kalin and Shelton, 1989). Thus, the neurons which responded best to the forward-view faces, especially the human face, may have done so because of the threat perceived by the viewer, whereas other face-views may have been less likely to evoke a response because they were less threatening. In addition to the perceived threat of direct gaze, forward face-view stimuli also provide the most complete view of a number of communication-relevant facial features, including the eyes and mouth, which are the most frequently viewed elements of a face, even in macaque monkeys (Haith et al., 1977; Gothard et al., 2009).
In fact, the appearance or disappearance of a salient feature is thought to underlie face-view preferences (Liu and Chaudhuri, 2002; Stephan and Caine, 2007). In Perrett et al. (1985), the forward face-view evoked responses most frequently, even when salient features such as the eyes or the mouth were obscured. In the current study, the 0° and off-center 30° face views were the preferred view most frequently (Figures 1, 5). The 30° or ¾ face-views have the advantage of conveying a 3-D image of the nose and face-shape, which are additional pieces of information that are valuable in recognition (Logie et al., 1987). Some studies which have examined the neural response to face-view have found an advantage in recognition for the ¾ face-view (Logie et al., 1987; Van der Linde and Watson, 2010) and have shown activation by this view compared to others in a number of cortical regions, including the inferior frontal gyrus (Kowatari et al., 2004). The 30° view is also less threatening. Thus, prefrontal neurons which are integrating many details and features of face information, including angle of gaze, position of the mouth, and position of the ears, for the purpose of evaluating emotional expression or processing identity, would receive the greatest amount of information from the 0° and 30° face-view stimuli, where these features are most visible.
Face-view and gaze are important elements in communication since attention to the face is a prerequisite for most verbal and non-verbal communication. An area of the brain that is involved in the process of social communication would be likely to receive information about both auditory and visual cues. In the current study, we found an association between face-view and auditory responsiveness. The neurons which preferred forward views (0° and 30°) were responsive to auditory stimuli, whereas the neurons which preferred other views or which had no preference were not auditory responsive. We have previously shown that most auditory neurons in VLPFC are multisensory and that these multisensory neurons prefer face-vocalization combinations compared to non-face/non-vocalization stimuli (Sugihara et al., 2006). The findings here, that such multisensory neurons are specialized for forward face-view, affirm a role for the non-human primate VLPFC in communication rather than simply generalized audiovisual integration, since forward face-view coupled with acoustic stimuli is most common during communication.
Prefrontal neurons were also affected by the identity of the faces in our study. Most identity responsive cells responded to the human face stimuli (Figure 1), although some cells preferred the monkey faces (Figure 5 B, D). Across the population, the human forward face evoked a response that was significantly different from all other stimuli and was decoded best among the face-view responsive cells (Figure 9). Nonetheless, it is difficult to completely separate the factors of face-view and identity since we used faces of only two identities. The response to the 0° human face could, in fact, be due to either factor. However, previous studies have noted clear differences in how monkeys view monkey versus human faces, which could support the results shown here. For example, monkeys show differences in their ability to recognize monkey versus human faces and use different perceptual strategies (Gothard et al., 2009; Parr et al., 1999). Scanpath analysis shows that monkeys examine the eye region of monkey faces more than that of human faces and show a novelty preference for monkey but not human faces (Gothard et al., 2009).
Importantly, the effect of identity was temporally specific in VLPFC neurons, as more neurons showed an effect of identity on the late part of the stimulus period compared with the earlier part. One explanation is that the effect of identity may develop more slowly than face-view since it involves the integration and processing of information about many features, as well as memories, before enough information is accumulated for recognition. This longer processing time may appear as the late response in our data. In contrast, the earlier effect of face-view could be due to a rapid response to a salient feature such as the eyes. Differential temporal processing of different aspects of faces was demonstrated for inferotemporal neurons in the influential study by Sugase et al. (1999). In their study, global information about the general visual category (object, human face, monkey face) was conveyed in the earliest part of the response, while fine information about identity or expression was conveyed in the late part of the neuronal response, similar to the results presented here.
In other studies, identity and facial expression or face-view have been localized to separate temporal lobe regions. Identity, and the physical features that define it, have most often been found to evoke responses in neurons within the inferotemporal cortex (Hasselmo et al., 1989; Young and Yamane, 1992; Eifuku et al., 2004). In contrast, facial expression and face-view sensitive neurons have been localized to the cortex in the superior temporal sulcus, including the superior temporal polysensory area (Hasselmo et al., 1989; Eifuku et al., 2004). In their study of face-view in the STS, Perrett and colleagues (1985), using stimuli similar to those shown here in which gaze and head orientation changed together, showed that different views of the face (frontal, profile, tilted upwards or downwards) maximally activated different populations of neurons in the STS; the integrated processing by the entire population can then account for all gaze directions. Freiwald and Tsao (2010) used a combined single-unit neurophysiology and fMRI approach to localize face-responsive cells and patches in the temporal lobe, which were then tested with 8 different face-views. Their results indicate an increase in invariance to head orientation as information proceeds from ML/MF to AL and further to AM. This view invariance suggests a role for these face-patches in identity processing (Freiwald and Tsao, 2010). In contrast to the population processing of gaze and the view-invariant responses in face-cell patches of the temporal lobe, the neurons in the VLPFC described here respond best to forward gaze (at 0° or 30°), which suggests a different role in face processing for this region. Furthermore, the fact that forward face-view selectivity occurs in auditory responsive neurons points to a role in communication or integrative processes.
This is also suggested by the convergence of anatomical projections from face and vocalization processing areas in VLPFC.
Face processing areas in inferotemporal cortex and the STS have reciprocal connections with the frontal lobe, so that identity, facial expression, and face-view information can easily converge on prefrontal neurons. Anatomical studies in non-human primates have noted connections between area 45 in ventral prefrontal cortex and inferotemporal cortex areas TE and TEO (Webster et al., 1994; Barbas, 1992). There are also dense reciprocal projections between the dorsal bank of the STS and orbital and lateral prefrontal cortex (Romanski et al., 1999a; Petrides and Pandya, 2002; Saleem et al., 2008; Diehl et al., 2008). Ventral prefrontal cortex also receives projections from the amygdala, which contains face (Leonard et al., 1985; Gothard et al., 2007), gaze (Tazumi et al., 2010; Hoffman et al., 2007), as well as face-vocalization (Kuraoka and Nakamura, 2007) neurons. Direct and averted gaze in non-human primates are clear indicators of threat and submission, respectively, and it is not surprising that a region such as the amygdala, which is involved in the signaling of fear or vigilance (Davis and Whalen, 2001; LeDoux, 2007), should show a change in neural activity with direct or averted gaze. Since there are connections between VLPFC and the amygdala (Carmichael and Price, 1995; Barbas, 2000, 2007), information concerning the emotional aspects of gaze likely reaches VLPFC neurons and could explain some of the selectivity shown in the current study for direct face-view.
A network of brain regions devoted to the processing of identity and face-view information would thus include the amygdala, the STS, and inferotemporal cortex. These areas can send specific information about various aspects of facial identity, facial expression, and face-view to VLPFC. While some of these areas may be specialized for the processing of faces, others are multisensory and have also been shown to respond to auditory information. These areas, together with vocalization processing regions in the auditory association cortex which send projections to VLPFC (Romanski et al., 1999b), would provide a rich array of face and vocalization information to the frontal lobe and would allow VLPFC to integrate identity, expression, gaze, and vocalization information during social communication and other executive processes.
The author would like to thank the following individuals for their assistance: M. Pappy, J. Coburn, D. Shannon and C. Louie for histology; M. Diltz for assistance with figures and comments on the manuscript; and C. Constantinidis for comments on the manuscript.