Cognitive control mechanisms allow individuals to behave adaptively in the face of complex and sometimes conflicting information. While the neural bases of these control mechanisms have been examined in many contexts, almost no attention has been paid to their role in resolving conflicts between competing social cues, which is surprising, given that cognitive conflicts are part of many social interactions. Evidence about the neural processing of social information suggests that two systems—the mirror neuron system (MNS) and mental state attribution system (MSAS)—are specialized for processing nonverbal and contextual social cues, respectively. This could support a model of social cognitive conflict resolution in which competition between social cues would recruit domain-general cognitive control mechanisms, which in turn would bias processing towards the MNS or MSAS. Such biasing could also alter social behaviors, such as inferences made about the internal states of others. We tested this model by scanning participants using fMRI while they drew inferences about social targets' emotional states based on congruent or incongruent nonverbal and contextual social cues. Conflicts between social cues recruited the anterior cingulate and lateral prefrontal cortex, brain areas associated with domain-general control processes. This activation was accompanied by biasing of neural activity towards areas in the MNS or MSAS, which tracked, respectively, with perceivers' behavioral reliance on nonverbal or contextual cues when drawing inferences about targets' emotions. Together, these data provide evidence about both domain general and domain specific mechanisms involved in resolving social cognitive conflicts.
Everyday life continually requires the selection of, and responses to, goal relevant stimuli in the face of complex and sometimes conflicting information. This ability is reliably associated with engagement of posterior medial and lateral prefrontal regions implicated in processing response conflicts and task-related uncertainty (Botvinick et al., 2001; Critchley et al., 2001), and flexibly determining which information is relevant to current goals (Thompson-Schill et al., 1997; Konishi et al., 1998). These regions constitute a domain-general cognitive control system that resolves conflict by amplifying or attenuating activity in “downstream” neural structures responsible for processing relevant information (Desimone and Duncan, 1995; Miller and Cohen, 2001) ranging from spatial locations (Corbetta et al., 1991) to affective cues (Bishop et al., 2004; Ochsner et al., 2009).
Although cognitive control is a central topic in cognitive neuroscience, extant research has largely ignored the role of control processes in social cognition. This is surprising, as much of human cognition is aimed at guiding social behavior (Tomasello, 2000), often on the basis of complex and conflicting cues. For example, during social interactions, a perceiver may have to decide how a social target feels based on incongruent cues (e.g., a target smiling while describing an upsetting event). Such situations require the resolution of conflicts between the mental states implied by competing cues. To date, little is known about the neural systems underlying this ability.
To investigate this issue we used fMRI to scan perceivers while they judged social targets' emotions based on nonverbal (silent videos of targets describing emotional events) and contextual (verbal descriptions of emotional events) cues that suggested emotions of the same or different valence. We tested two hypotheses. First, we predicted that conflicts between cues would recruit domain general cognitive control centers in medial and lateral prefrontal cortex, a hypothesis consistent with prior work examining the processing of incongruent social cues (Decety and Chaminade, 2003; Mitchell, 2006; Wittfoth et al., 2009).
Second, we predicted that these control systems would help resolve conflict by “biasing” processing towards domain specific neural systems involved in responding to social cues deemed to be task-relevant, as reflected in perceivers' behavioral reliance on a given cue-type when rating target affect. On the one hand, to the extent that perceivers behaviorally rely on nonverbal cues, biasing could increase activity in regions responsible for processing such cues, including premotor and parietal regions comprising the mirror neuron system (MNS; see Rizzolatti and Craighero, 2004). On the other hand, to the extent that perceivers deem contextual cues more relevant, processing could be biased towards systems implicated in drawing inferences about non-observable mental states such as beliefs, including the medial prefrontal, posterior cingulate, temporopolar, and temporoparietal regions comprising the mental state attribution system (MSAS; see Ochsner et al., 2004; Amodio and Frith, 2006; Mitchell, 2009a). Since these systems are functionally dissociable (Van Overwalle and Baetens, 2009), and may in some cases inhibit each other (Brass et al., 2009), they are strong candidate targets for the effects of social cognitive conflict resolution.
Eighteen healthy, right-handed volunteers (9 female; mean age 24.5 years) were recruited in compliance with the human subjects regulations of Columbia University and provided informed consent. All were screened, using an fMRI safety screening form, for neurological or psychological conditions and for medications that could influence the measurement of cerebral blood flow. Two participants were excluded from analysis due to low response rates (over 25% of trials missed), leaving 16 participants (8 female; mean age 23.9 years).
Stimuli for this study were silent video clips and sentences describing moderately emotional events.
In a previous study, Zaki et al. (2008) collected a library of stimulus videos. Fourteen targets (7 female, mean age 26.5) were videotaped while describing positive and negative autobiographical emotional events. After recording, the videos were played back to the targets as they used a sliding 9-point Likert scale from 1 (very negative) to 9 (very positive) to continuously rate the valence of the affective response they had felt while discussing these events.
From these recordings, ten-second video clips were selected during which targets' self-ratings were positive (mean rating > 5 throughout the ten-second clip; z-score > 0.25 based on all of an individual target's self-ratings) as well as negative (mean rating < 5 throughout the ten-second clip; z-score < -0.25 based on all of an individual target's self-ratings). 36 positive clips (mean target rating = 7.55 ± 0.85) from 7 targets and 36 negative clips (mean target rating = 2.46 ± 1.04) from 8 targets were used as experimental stimuli, presented silently. In order to independently assess the emotional valence of these clips, 18 pilot subjects (10 female) rated each clip using the same Likert scale described above. These pilot ratings demonstrated that perceivers' ratings of nonverbal social cues were less affectively valenced than targets' own, but that perceivers nonetheless reliably rated clips in a valence-congruent manner based on nonverbal information alone (mean rating of positive stimuli = 6.03 ± .97; mean rating of negative stimuli = 4.05 ± .78).
Contextual stimuli were 128 sentences describing moderately affective events relevant to college students (examples: “My friends surprised me on my birthday.” “My parents don't approve of my career choice.”) that could describe plausible contexts for the nonverbal videos; sentences were generated by the first and second authors. These stimuli were normed in a manner similar to the nonverbal stimuli. Fourteen volunteers (6 female) provided pilot valence ratings of these sentences using a 9-point Likert scale ranging from 1 (very negative) to 9 (very positive). Based on these ratings, 36 positive and 36 negative sentences were chosen; these contextual cues alone reliably produced affect-congruent ratings from perceivers (mean rating of positive stimuli = 7.42 ± 0.44; mean rating of negative stimuli = 2.69 ± 0.51). Positive and negative sentences were controlled with respect to word length (with mean word counts of 9.81 ± 2.38 and 9.31 ± 2.91, respectively; t(70) < 1, p > .25).
The current task was designed to produce a type of response conflict often encountered in the social world, in which multiple social cues simultaneously suggest opposing, but equally plausible, inferences concerning the internal states of a social target. Perceivers were presented with two main trial types, both of which presented simultaneous nonverbal and contextual social cues. During congruent trials, nonverbal and contextual cues were shown that matched with respect to the valence of emotion they suggested targets were experiencing (e.g. a positive video and a positive sentence). Conversely, during incongruent trials, nonverbal and contextual cues suggested emotions of opposing valence (e.g., a positive video and a negative sentence); videos and sentences in incongruent trials always differed—with respect to their normed affect ratings—by a minimum of 2 points on the 9-point Likert Scale (mean difference = 3.41 points). During both congruent and incongruent trials, perceivers were asked to make a judgment about the target's affective state. Perceivers were not instructed to pay attention to one modality or the other, and thus were required to determine, on a trial-by-trial basis, which cue-type seemed most relevant to the social inference task.
Unimodal trials (on which nonverbal or contextual cues were presented in isolation) also were included to localize brain regions preferentially engaged by nonverbal and contextual cues, so that activity in these regions during multimodal trials could be interrogated (see Figure 1a for schematic of trial types). Nonverbal and contextual cues were pseudorandomly assigned to trial types such that each participant was presented with unique pairings of cues during congruent and incongruent trials and that each nonverbal and contextual cue was presented close to an equal number of times in each condition across subjects (each cue was presented in each condition 6±1 times across 18 subjects). Trials in all conditions were evenly split between positive and negative cues.
Prior to entering the scanner, participants were given instructions and completed one block of trials as described below, using cues not included in the main stimulus set. They were told that social targets in the silent video clips were discussing emotional autobiographical events. Importantly, they were told that, when nonverbal and contextual cues were paired, contextual cues were a one-sentence summary of what targets in the silent video were describing. In actuality, the contextual sentences did not describe what targets in the videos were truly discussing and instead were paired with videos in the counterbalanced manner described above. Perceivers were asked to use all the information provided (nonverbal only, contextual only, or multimodal) to assess how the target felt. Participants were also instructed to “try to make the ratings as best you can” when a description of what the target was talking about was not present (nonverbal only trials) or when targets could not be seen (context only trials). Thus, participants were instructed to decide how a social target would feel even in instances when a sentence was presented without a visible target.
Upon entering the scanner, perceivers completed 6 runs of 16 trials each. Each run contained 4 trials from each of the 4 conditions, split evenly across valences and valence combinations. Each trial began with a fixation cross, presented for a jittered duration ranging from 2-6 seconds, followed by presentation of nonverbal and/or contextual cues for 10 seconds, depending on the trial type. After the stimulus terminated, the question “How do you think this person felt?” appeared on the presentation screen along with a 9-point Likert scale identical to the one described above. Participants were given 4 seconds to respond. See Figure 1a for a schematic of the trial timeline.
Images were acquired using a 1.5 Tesla GE Twin Speed MRI scanner equipped to acquire gradient-echo, echoplanar T2*-weighted images (EPI) with blood oxygenation level dependent (BOLD) contrast. Each volume comprised 25 axial slices of 4.5mm thickness and 3.5 × 3.5mm in-plane resolution, aligned along the AC-PC axis. Volumes were acquired continuously every 2 seconds. 6 functional runs were acquired from each subject; each run began with 5 ‘dummy’ volumes, which were discarded from further analyses, followed by 190 volumes of functional data. At the end of the scanning session, a T1-weighted structural image was acquired for each subject.
Our main behavioral analysis of interest was an “information usage index” (IUI), which served as a measure of perceivers' relative reliance on nonverbal or contextual information. For each incongruent trial, we computed the IUI as the distance along the Likert scale between the rating a perceiver made about a context sentence / nonverbal video clip pair and the average of the ratings given to the constituent video and context sentence in that pair during stimulus norming (see Figure 1b). For example, imagine that the normative affect rating made about a given nonverbal video seen in isolation was 7 on the Likert scale, and the normative rating of a given context sentence seen in isolation was 3. If a perceiver sees a combination of that video and sentence and rates a target's affect as 4, then their rating would be 1 point away from the mean (i.e., 5) of the normed video and sentence ratings. Such a case would result in an IUI of “1c,” indicating that the perceiver's rating was 1 scale-point closer to normative ratings of the contextual cue alone than it was to normative ratings of the nonverbal cue alone. The opposite case, in which ratings would be 1 point closer to normative ratings of the nonverbal cue alone, would be given a score of “1n.”
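The IUI arithmetic can be sketched as follows (a minimal Python illustration of the computation described above; the function name and the signed-value convention, standing in for the “1c”/“1n” notation, are ours, not the authors'):

```python
def information_usage_index(rating, norm_nonverbal, norm_context):
    """Compute the information usage index (IUI) for one incongruent trial.

    rating         : perceiver's 1-9 affect rating of the video/sentence pair
    norm_nonverbal : normative rating of the video presented in isolation
    norm_context   : normative rating of the sentence presented in isolation

    Returns a signed scale-point distance: positive values indicate the
    rating fell closer to the contextual norm ("c"), negative values closer
    to the nonverbal norm ("n"), zero when it was equidistant.
    """
    midpoint = (norm_nonverbal + norm_context) / 2.0
    distance = abs(rating - midpoint)
    if abs(rating - norm_context) < abs(rating - norm_nonverbal):
        return distance       # closer to the contextual cue's norm ("c")
    elif abs(rating - norm_context) > abs(rating - norm_nonverbal):
        return -distance      # closer to the nonverbal cue's norm ("n")
    return 0.0                # equidistant from both norms
```

For the worked example in the text (video normed at 7, sentence at 3, perceiver rating of 4), this returns +1.0, i.e. the “1c” case.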
We also assessed the overall affective intensity participants ascribed to each video / sentence pair by comparing perceiver ratings to the scale's neutral point: 5. Thus, if a given rating had a value of “r,” the intensity rating of canonically positive stimuli was computed as r – 5, and intensity of ratings for negative stimuli was computed as 5 – r. Note that this could result in negative intensity ratings, for example if a positive stimulus was rated below 5.
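The intensity measure is a simple distance from the scale's neutral point; sketched in Python (function and argument names are illustrative):

```python
def affective_intensity(rating, stimulus_valence):
    """Affective intensity of a rating relative to the neutral point (5).

    stimulus_valence is the stimulus's canonical valence from norming,
    "positive" or "negative". The result can be negative, e.g. when a
    canonically positive stimulus is rated below 5.
    """
    return rating - 5 if stimulus_valence == "positive" else 5 - rating
```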
Images were preprocessed and analyzed using SPM2 (Wellcome Department of Imaging Neuroscience, London, UK) and custom code in Matlab 7.1 (The Mathworks, Natick, MA). All functional volumes from each run were realigned to the first volume of that run, spatially normalized to the standard MNI-152 template, and smoothed using a Gaussian kernel with a full width at half maximum (FWHM) of 6mm. Volumes from each run were centered at a mean intensity of 100, trimmed to remove volumes with intensity levels more than 3 standard deviations from the run mean, and detrended by removing the line of best fit. After this processing, all six runs were concatenated into one consecutive timeseries for the regression analysis.
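The run-level intensity normalization amounts to three simple timeseries operations; the sketch below is an illustrative re-implementation in Python, not the original SPM2/Matlab pipeline:

```python
from statistics import mean, stdev

def preprocess_run(volume_intensities):
    """Sketch of per-run intensity normalization: (1) center the run at a
    mean intensity of 100, (2) trim volumes more than 3 SD from the run
    mean, (3) remove the line of best fit (linear detrend), restoring the
    mean of 100 afterwards."""
    # (1) Center at 100.
    m = mean(volume_intensities)
    centered = [v - m + 100.0 for v in volume_intensities]

    # (2) Trim outlier volumes (> 3 SD from the run mean).
    sd = stdev(centered)
    kept = [v for v in centered if abs(v - 100.0) <= 3 * sd]

    # (3) Least-squares linear detrend: subtract slope*t + intercept.
    t = list(range(len(kept)))
    t_mean, y_mean = mean(t), mean(kept)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, kept))
             / sum((ti - t_mean) ** 2 for ti in t))
    intercept = y_mean - slope * t_mean
    return [yi - (slope * ti + intercept) + 100.0 for ti, yi in zip(t, kept)]
```

Applied to a run with a pure linear drift, this returns a flat timeseries centered at 100.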
After preprocessing, analyses were performed using the general linear model, and random effects models for 2nd level analyses. Three analytic approaches were employed.
First, main effects contrasts were used to identify brain regions whose activity was greater in the nonverbal only or the context only conditions, and also to examine activity that was greater for incongruent, as opposed to congruent trials. The aim of comparing nonverbal only and contextual only conditions was to isolate brain regions differentially involved in processing each cue modality. Based on prior research, we believed that this comparison would reveal regions responsive to the specific perceptual characteristics and task demands associated with each cue-type (e.g., extrastriate visual cortex regions associated with face perception during nonverbal cue presentation and left lateralized temporal cortex regions associated with processing semantic information during contextual cue presentation). Further, we believed that nonverbal and contextual cues would preferentially engage the MNS (Iacoboni, 2009; Wolf et al., 2010) and MSAS (Saxe, 2006; Mitchell, 2009a), respectively.
Second, we used parametric analyses to identify neural activity tracking with perceivers' reliance on nonverbal or contextual cues when drawing inferences based on conflicting social cues on incongruent trials. For this analysis normalized IUI values were used as parametric modulators providing regression weights for each incongruent trial. Using this method, we identified clusters of brain activity that tracked—within perceivers—with the relative amount perceivers had relied on nonverbal or contextual cues on a given trial. Based on our hypothesis that biased processing favoring the MNS and MSAS should also drive perceivers' reliance on that cue-type in situations of response conflict, we were most interested in overlap between regions identified by this parametric analysis and those identified by main effects contrasts comparing processing of nonverbal and contextual cues. As such, we computed a conjunction between the parametric analysis and this main effect contrast, using the minimum statistic approach advocated by Nichols et al. (2005).
Third, we used functional connectivity analyses to test the hypothesis that on incongruent trials domain-general cognitive control centers would be involved in biasing processing towards brain regions responsible for processing domain-specific social cues. This analysis was conducted using in-house Matlab scripts analogous to the psychophysiologic interactions described by Friston et al. (1997). We first defined seed regions of interest (ROIs) associated with domain general cognitive conflict detection and resolution, defined by the contrast of incongruent vs. congruent multimodal social cues. Three such conflict-related peaks were defined: in the dorsal anterior cingulate cortex (ACC; MNI coordinates: -4, 40, 34), the posterior dorsomedial prefrontal cortex (pDMPFC; coordinates: -12, 24, 58), and the right ventrolateral prefrontal cortex (VLPFC; coordinates: 48, 24, -8). ROIs were defined as spheres with a radius of 6mm about these peaks, and the average time course of all voxels in each was extracted.
We then searched for areas whose functional covariance with these seed regions corresponded with perceivers' increasing reliance on nonverbal or contextual cues. This analysis employed three regressors: (1) the raw timecourse—averaging across all voxels—of the seed ROI, (2) behavioral reliance on the relevant type of social cue, operationalized using the IUI measure described above, and (3) the interaction between these first two factors. Contrasts were then assessed for regressor 3, while regressors 1 & 2 were included as covariates of no interest. Thus, this analysis specifically isolated regions whose functional covariance with seed regions increased as perceivers relied on one type of social cue or the other, and not regions simply covarying with the seed region regardless of behavioral reliance, or demonstrating activity related to behavioral reliance per se.
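The three-regressor design can be sketched as follows (an illustrative Python schematic of the PPI-style logic, not the authors' Matlab code; a full analysis would also convolve regressors with a hemodynamic response function and fit them in a GLM, which we omit here):

```python
def ppi_regressors(seed, reliance):
    """Build the three regressors described above.

    seed     : seed-ROI signal (one value per trial/timepoint), averaged
               across all voxels in the ROI
    reliance : behavioral reliance on the relevant cue type (IUI) for the
               corresponding trials

    Returns (physiological, psychological, interaction). The interaction
    term — the product of the mean-centered seed and reliance series —
    isolates regions whose covariance with the seed scales with behavioral
    reliance; the first two regressors serve as covariates of no interest.
    """
    n = len(seed)
    seed_mean = sum(seed) / n
    rel_mean = sum(reliance) / n
    physio = [s - seed_mean for s in seed]
    psych = [r - rel_mean for r in reliance]
    interaction = [p * q for p, q in zip(physio, psych)]
    return physio, psych, interaction
```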
The results of this analysis were constrained in a theory-driven way, yielding a 3-way conjunction, by masking these contrasts with functional maps derived from the main effects and parametric analyses described above. Thus, in total, regions identified here were required to meet three conditions: (1) they must be more engaged by one type of social cue viewed in isolation as compared to the other (e.g., more engaged by nonverbal than by contextual cues), (2) they must demonstrate activity tracking with behavioral reliance on that cue type (e.g., increasing activity as perceivers behaviorally rely more on nonverbal cues), and (3) they must demonstrate connectivity with a seed region that increases as reliance on a given (e.g., nonverbal) cue-type also increases.
Main effect maps (nonverbal vs. contextual and incongruent vs. congruent) were thresholded at p < .005, with a spatial extent threshold of k = 55, corresponding to a threshold of p < .05, corrected for multiple comparisons, as assessed through Monte Carlo simulations implemented in Matlab (S. Slotnick, Boston College). To compute appropriate thresholds for maps of the two-way conjunction between main effects and parametric analyses, and the three-way conjunction between main effects, parametric, and connectivity analyses, we employed Fisher's method (Fisher, 1925), which combines probabilities of multiple hypothesis tests using the formula:

χ² = −2 Σᵢ ln(pᵢ)

where pᵢ is the p-value for the ith test being combined, k is the number of tests being combined, and the resulting statistic has a chi-square distribution with 2k degrees of freedom. Thus, thresholding each test at a p value of .0100 for a 2-way conjunction and of .0235 for a 3-way conjunction corresponded to a combined threshold p value of .001, uncorrected. We combined these values with an extent threshold of k = 35, again corresponding to a corrected threshold of p < .05 as assessed using simulations.
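As a check on the per-test thresholds reported above, Fisher's combined statistic can be computed directly. The sketch below (ours, in Python) uses the closed-form chi-square survival function available for even degrees of freedom:

```python
from math import exp, factorial, log

def fisher_combined_p(p_values):
    """Combine k independent p-values with Fisher's method:
    X^2 = -2 * sum(ln p_i), distributed as chi-square with 2k degrees
    of freedom under the null."""
    k = len(p_values)
    x = -2.0 * sum(log(p) for p in p_values)
    # Chi-square survival function for even df = 2k:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = x / 2.0
    return exp(-half) * sum(half ** j / factorial(j) for j in range(k))

two_way = fisher_combined_p([0.0100, 0.0100])  # ~.001
three_way = fisher_combined_p([0.0235] * 3)    # ~.001
```

Both per-test thresholds indeed yield a combined p of approximately .001, matching the values reported in the text.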
Consistent with domain general studies of cognitive control, incongruent, as opposed to congruent, social cues were associated with slower responses (t(15) = 2.53, p < .05, see Figure S1 for bar plots of mean RT across conditions). Response times did not vary with the valence of either nonverbal or contextual cues (all ps > .30). As there was no “correct” answer to the affect inferences perceivers drew, error rates could not be calculated.
The information usage index (IUI, described above) revealed that, during incongruent trials, contextual information predominantly influenced perceivers' ratings of target affect. That is, the ratings perceivers made about incongruent sentence / video pairs were closer to normed ratings of context sentences alone than to ratings of nonverbal videos alone (mean information usage index = 0.54c, t(15) = 5.58, p < .001, see Figure 1c for a distribution of IUI values). Nonetheless, reliance on nonverbal or contextual cues also varied greatly (mean within-subject IUI SD = .99 units), with all perceivers showing a wide range of reliance on nonverbal or contextual cues across trials. This allowed us to correct for each perceiver's mean IUI and search for brain activity tracking with behavioral biasing towards each type of cue relative to that mean.
The presence of social cognitive response conflict (i.e. the comparison of incongruent vs. congruent trials) recruited activity in several regions associated with domain-general conflict monitoring and control, including the anterior cingulate cortex, right ventrolateral prefrontal cortex, right middle frontal gyrus, and posterior dorsomedial prefrontal cortex (Figure 2, Table S1; for data on the comparison of congruent > incongruent trials, see Supplementary Results).
We used main effect comparisons between nonverbal only and contextual only conditions to identify activity related to processing each type of cue, with the expectation that areas associated with the MNS and MSAS would be involved in processing nonverbal and contextual cues, respectively. Consistent with this prediction and with extant research, nonverbal, as compared to contextual, cues more strongly engaged much of the right MNS, including the frontal operculum, inferior parietal lobe, and intraparietal sulcus. Nonverbal cues also engaged the bilateral amygdala, superior temporal sulcus and fusiform face area (FFA) more than contextual cues (Figure 3a, Table S1). Contextual, as compared to nonverbal, cues engaged regions in the MSAS, notably rostral portions of the medial prefrontal cortex, the medial parietal lobe, and bilateral temporal pole. Contextual cues also preferentially engaged a swath of left-lateralized cortex stretching from the inferior frontal gyrus (including Broca's area) through the superior temporal and angular gyri (encompassing Wernicke's area, see Figure 3b, Table S2).
Our second analytic strategy tested the hypotheses that 1) during social cognitive conflict, processing would be biased towards either regions associated with the MNS or MSAS, and that 2) this biased processing would track with reliance on nonverbal or contextual cues during affective inference. This analysis searched for brain regions whose activity tracked with the IUI described above (in the Methods section). That is, we searched for brain regions whose activity tracked parametrically with the extent to which perceivers relied on nonverbal or contextual information on a trial-by-trial basis when judging target affect. Our specific hypothesis was that behavioral reliance on nonverbal and contextual cues would track with enhanced activity in regions associated with the MNS and MSAS, respectively. As such, we masked the parametric analyses with main effects contrasts used to localize responses to nonverbal and contextual cues, respectively, and identified elements of the MNS and MSAS as noted above.
Results were consistent with our predictions. On one hand, as a perceiver relied more on contextual cues in making judgments about targets, they increasingly recruited areas within the MSAS (rostral medial prefrontal cortex and bilateral temporal poles), as well as the left angular gyrus. On the other hand, as perceivers relied more on nonverbal information, they increasingly recruited right-lateralized areas within the MNS (dorsal premotor cortex, inferior parietal lobule, and intraparietal sulcus) as well as the right fusiform gyrus (Figure 3c & 3d, Table S3).
Our third analysis explored a more specific hypothesis: that biased processing in areas responsible for processing behaviorally relevant nonverbal and contextual cues would be instantiated through functional coupling between these regions and brain areas involved in detection and control of domain general conflict. In order to test this idea, we selected seed ROIs related to domain general conflict detection and resolution (ACC, pDMPFC, and right VLPFC regions in the incongruent > congruent contrast), and searched for regions whose functional connectivity with these seed regions increased as perceivers relied more on nonverbal or contextual social cues. These analyses were also masked by main effects and parametric contrasts described above, ensuring that connectivity analyses would be focused on regions of theoretical interest that had been identified in these prior analyses.
Two key results were observed. On the one hand, as participants relied more on nonverbal cues, the ACC and VLPFC (and, at a more lenient extent threshold of k = 10, the pDMPFC) became more functionally coupled with the right premotor cortex, but no other regions. On the other hand, as participants relied more on contextual cues, the pDMPFC and ACC became increasingly coupled with the left angular gyrus, but not other regions. At a more lenient extent threshold of k = 10, the VLPFC also demonstrated increasing connectivity with the left temporal pole (Figure 4, Table S4). These results add converging support for the hypothesis that control-related regions actually do bias processing in certain “downstream” areas responsible for processing motoric and semantic information as perceivers' judgments increasingly rely on nonverbal and contextual social cues, respectively.
Every day, perceivers integrate complex and sometimes-conflicting social cues into a coherent sense of what others are experiencing. While the neural bases of other types of conflict resolution have been extensively examined, the mechanisms underlying the resolution of social cognitive conflict have remained unclear. The present study addressed this issue by asking perceivers to draw inferences about social targets' affective states based on incongruent social cues. We capitalized on prior work demonstrating that nonverbal and contextual social cues preferentially engage elements of the MNS and MSAS (Fletcher et al., 1995; Carr et al., 2003; Ochsner et al., 2004; Iacoboni et al., 2005; Gobbini et al., 2007) to generate the hypothesis that social cognitive conflict would engage brain areas associated with domain-general cognitive control, and that this engagement would be accompanied by biased processing in regions associated with the MNS and MSAS.
Findings were consistent with this hypothesis. First, perceivers encountering incongruent, as opposed to congruent, social cues engaged dorsal anterior cingulate, posterior medial prefrontal, and ventrolateral prefrontal regions broadly implicated in monitoring response conflict across cognitive and affective domains (Blair et al., 2007; Luo et al., 2007; Ochsner et al., 2009; Wittfoth et al., 2010) and in selecting goal-relevant information (Miller and Cohen, 2001; Aron et al., 2004; Wager et al., 2004; Badre et al., 2005; Ochsner and Gross, 2005). Second, during social cognitive conflict, processing of social cues was “biased” through increased engagement of areas within the MNS or MSAS that tracked, respectively, with perceivers' behavioral reliance on nonverbal or contextual cues during social cognitive inference. Other regions related to processing modality-specific information—for example, left temporal cortex and angular gyrus, which are known to be involved in processing semantic information present in contextual cues (Binder et al., 1997; Price, 2000)—also demonstrated such biasing. Third, some of the regions exhibiting biased processing (the premotor cortex for nonverbal cues, and the angular gyrus and temporal pole for contextual cues) also demonstrated increasing functional coupling with domain general control regions, an inter-regional relationship that increased in strength as perceivers behaviorally relied more on the relevant type of social cue.
The current results have implications relevant to the neural bases of cognitive control and social cognition. With respect to cognitive control, this study joins work on the resolution of affective conflict (Bishop et al., 2004; Etkin et al., 2006; Egner et al., 2008; Ochsner and Gross, 2008; Ochsner et al., 2009) in suggesting that control over specific types of information may recruit unique neural mechanisms. Affective and social information have both been labeled “special,” in that they may warrant specialized information processing mechanisms (Ostrom, 1984; LeDoux, 2000), instantiated in functionally specialized neural systems (Olsson and Ochsner, 2008; Mitchell, 2009b). It is therefore sensible that exerting control over conflict in emotional and social domains should affect processing in these specialized neural systems.
With respect to social cognition, the current experiment joins a small but growing number of neuroimaging studies capitalizing on the characterization of the MNS and MSAS by relating engagement of these systems to social behavior (de Lange et al., 2008; Schiller et al., 2009; Zaki et al., 2009; Wolf et al., 2010; Spunt et al., in press). Data from these studies complement work identifying neural systems involved in processing isolated “pieces” of social information by demonstrating that these systems are flexibly engaged when perceivers put these pieces together in the service of complex social inferences. Such data can also help researchers bypass an often-unproductive debate about whether perceivers understand targets primarily using the MSAS or MNS (Gallese et al., 2004; Saxe, 2005; Apperly, 2008). An emerging view suggests that, while the MNS and MSAS may be functionally dissociable (van Overwalle and Baetens, in press), interactions between these systems and domain-general cognitive control mechanisms best characterize everyday social cognition (Zaki and Ochsner, 2009).
Together, these data support a novel model of social cognitive conflict resolution: responding to incongruent social cues recruits domain-general cognitive control mechanisms, which direct behavior by biasing engagement of domain-specific neural systems specialized for processing specific types of social cues. As such, this study suggests new directions for the study of both typical and abnormal social cognition.
In typical contexts, future studies could further characterize the basic mechanisms underlying social cognitive conflict resolution in a number of ways. For example, future work could manipulate judgment and stimulus factors to explain why some, but not all, of the behaviorally related biasing in the MNS and MSAS was accompanied by increased connectivity with domain-general control centers. Along these lines, future studies could more closely examine the directionality and temporal characteristics of these effects. Although we have theorized that control systems act on systems that represent specific types of social cues, because the connectivity analyses we employed cannot establish the directionality of these effects, it is possible (though less plausible) that areas related to processing different types of social cues drive activity in domain-general control centers, instead of the reverse. Studies designed to test directional connectivity could help to resolve this issue.
Studies of social cognitive conflict also could make additional contact with the broader literature on cognitive control. For example, previous studies have generated affective conflict through “Stroop-like” tasks that instruct participants to concentrate on one cue while ignoring irrelevant emotional distracters. While the current task also presents cues whose valences conflict, the social cognitive judgment participants perform here—like social inferences more generally—does not involve simply ignoring one cue in favor of another. Instead, participants must decide how to integrate multiple cues to assess target emotion on a scale with many response options. Conflict in this situation arises because responses are underdetermined (Botvinick et al., 2001). In other cases of underdetermined responding (e.g., word stem completion tasks), the relative strength of association between each cue and a response can increase or reduce conflict and relevant brain activity (Barch et al., 2000). Social inferences may follow this pattern: for example, in the face of an extremely intense nonverbal cue (e.g., a social target grimacing in pain), contextual cues may be discounted, and conflict may be low, whereas in the face of two relatively ambiguous cues, conflict may be high. Systematically varying the ambiguity of each cue-type could allow for more formal parameterization of social cognitive conflict signals.
Finally, future research could examine how social cognitive conflict resolution adapts to perceivers' learning histories. Data increasingly suggest that medial frontal regions involved in cognitive control also integrate information about reinforcement history when selecting relevant information (Behrens et al., 2007; Rushworth et al., 2007). Control over social behavior likely involves similar mechanisms, especially since perceivers typically encounter certain targets (e.g., friends) repeatedly. When these targets display conflicting social cues, perceivers' conflict resolution may be informed by their learning history. For example, if relying on nonverbal cues tended to produce adaptive decisions about a given target's affective states, perceivers may learn over time to weigh nonverbal cues more strongly when subsequently encountering that target. Further, perceivers' sense of the “diagnostic value” of behaviors in predicting targets' traits (e.g., Trope and Burnstein, 1975) could contribute to the weighting of those behaviors in future social inferences. For example, if conflicting cues are seen by a perceiver as actually being consistent (e.g., someone crying after winning an award who is viewed as especially happy), perceivers may not learn to weigh one type of behavior or the other during later inferences. Future studies should employ feedback, and manipulate the diagnostic values of behaviors, to model perceivers' integration of learning histories during social cognitive decision-making.
The current approach can also inform research on disorders of social cognition, such as autism spectrum disorders (ASD). Individuals with ASD demonstrate deficits in responding to both nonverbal and contextual social cues (Baron-Cohen et al., 1997; Baron-Cohen et al., 2003; Rogers et al., 2003), as well as abnormal recruitment of both the MSAS and the MNS (Hadjikhani et al., 2005; Dapretto et al., 2006; Wang et al., 2007). Although these data have motivated theories of ASD as a failure of either one of these systems or the other (e.g., Oberman and Ramachandran, 2007), a more integrative account would suggest that both of these systems—and their interaction when processing complex information—may underlie social symptoms in ASD. An intriguing hypothesis emerging from the current study is that individuals with ASD may have difficulty biasing their processing of social information based on current goals. Such deficits could be caused by functional decoupling of frontal control centers and domain-specific neural systems involved in processing social cues, a prediction consistent with abnormalities in long-distance connectivity between frontal and posterior cortical regions that characterize ASD (Courchesne and Pierce, 2005).
Ultimately, models of social cognitive conflict resolution will have to account for both biased competition in processing social cues, and the breakdown of control over conflict in populations with social cognitive abnormalities. The current data take initial steps towards these goals by demonstrating one way in which domain-general control mechanisms interact with domain-specific systems to resolve social cognitive conflict.
The authors thank Marget Thomas and Annie Knickman for assistance in data collection. This work was supported by Autism Speaks Grant 4787 (to J.Z.), NIDA Grant DA022541 (to K.N.O.) and NIH grant MH076137 (to K.N.O.).