Cognitive control refers to processes that flexibly and adaptively allocate mental resources to permit the selection of thoughts and actions directed by our intentions and goals in a given context (Posner and Snyder, 1975; Miller, 2000; Badre, 2008; Kouneiher et al., 2009; Solomon et al., 2009), and has been implicated in a range of cognitive tasks involving attention, learning, and decision-making. Although the relationship between the activity of the frontoparietocingulate system and cognitive control has been consistently demonstrated in functional neuroimaging studies, the underlying computational mechanisms and dynamics of how these brain regions work together to implement cognitive control remain unclear. The present study investigates the instantiation of cognitive control by developing biologically realistic neural network models that perform a simple MFT, with the intention that the results can be extended to explain the computational underpinnings of cognitive control in other, more complex tasks.
Searching for the majority in a given item set is a common task, and it is surprising that few studies have examined how people perform it. One reason may be that the computation is easily done algorithmically, often via designed circuits or built-in functions. In statistics, the majority function is associated with the mode, the value that occurs most frequently in a data set, which is often readily shown by histograms. Fan et al. (2008) developed a task to study how humans perform the majority function in a well-controlled environment. A careful analysis of the computational load required by different algorithms suggests that, instead of using intuitive search strategies such as exhaustive search or self-terminating search, humans may adopt a grouping search algorithm, which involves sampling and re-sampling the item set with a group of majority-determining size.
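The sampling and re-sampling steps of a grouping search can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of Fan et al. (2008) or the network model: the function name `grouping_search`, the uniform random choice of each group, and the group size of floor(n/2) + 1 are assumptions made here for illustration.

```python
import random


def grouping_search(arrows, rng=None, max_samples=10_000):
    """Return (majority_direction, number_of_groups_sampled).

    arrows: list of direction labels (e.g., "L"/"R") of odd length,
            so that a strict majority always exists.
    """
    rng = rng or random.Random()
    # Assumed majority-determining size: any congruent group this large
    # must consist of majority items.
    group_size = len(arrows) // 2 + 1
    for n_samples in range(1, max_samples + 1):
        group = rng.sample(arrows, group_size)  # draw one group of items
        if len(set(group)) == 1:                # congruent group: respond
            return group[0], n_samples
        # incongruent group: discard it and re-sample
    raise RuntimeError("no congruent group found within max_samples")


if __name__ == "__main__":
    print(grouping_search(["L", "L", "L", "L", "L"]))  # always one sample
    print(grouping_search(["L", "R", "L", "R", "L"]))  # usually several
```

Under these assumptions a fully congruent set is resolved by the first group, whereas a 3:2 set typically requires several re-samples before a congruent group is drawn.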
It is important to note that the majority search, even in the context of the MFT, is clearly relevant to ordinary visual search, on which a large body of research has been conducted (Treisman, 1982; Wolfe et al., 1989; Grossberg et al., 1994; Pylyshyn, 1994; Najemnik and Geisler, 2005). One might speculate that a pop-out mechanism exists for same-directional arrows in a stimulus set. However, given the close spatial proximity and perceptual similarity of the arrows, the perceptual basis of such a pop-out is weak. In addition, the memory requirement (i.e., how many items in a given category have already been found) is greatly magnified in the MFT. As a result, to perform the task efficiently, decisions about where to search next (which may not involve overt eye movements) and when to respond are critical, making a guided search a possibility (Niwa and Ditterich, 2008). However, to what degree such decisions depend on cognitive control has been unclear (Gray et al., 2006).
The grouping search algorithm makes different claims regarding the involvement of cognitive control in the task from those of more straightforward algorithms such as the self-terminating search. Under the grouping search algorithm, a judgment of group congruence has to be made for every selected group, and in the case of incongruence a different group has to be selected and examined. The algorithm therefore implies heavy and continuous involvement of cognitive control for conflict detection and re-sampling. This differs from algorithms based on sequential scanning, where one can identify and count arrows one by one until a final decision can be made; no congruence judgment is explicitly necessary in these algorithms (Schall, 2001). On the other hand, the grouping search algorithm also implies that task performance will be sensitive to the configuration of the stimulus set. For highly incongruent stimulus sets (e.g., three left arrows and two right arrows, compared to five left arrows), the probability of selecting an incongruent sample is high, so the likelihood of re-sampling is high, leading to longer reaction times.
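The dependence of re-sampling on stimulus configuration can be made concrete with a short calculation. The sketch below rests on simplifying assumptions introduced here for illustration, not on the network model: each group is drawn uniformly at random, successive groups are independent, and the group size is floor(n/2) + 1. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def p_congruent(n_major, n_minor):
    """Probability that one uniformly drawn group is congruent."""
    n = n_major + n_minor
    g = n // 2 + 1  # assumed majority-determining group size
    # comb(n_minor, g) is 0 whenever the minority has fewer than g items.
    return Fraction(comb(n_major, g) + comb(n_minor, g), comb(n, g))


def expected_samples(n_major, n_minor):
    """Mean number of independent draws until a congruent group appears."""
    return 1 / p_congruent(n_major, n_minor)


if __name__ == "__main__":
    for major, minor in [(5, 0), (4, 1), (3, 2)]:
        print(f"{major}:{minor}",
              p_congruent(major, minor),
              expected_samples(major, minor))
```

Under these assumptions the probability of a congruent group is 1 for a 5:0 set, 2/5 for a 4:1 set, and 1/10 for a 3:2 set, so the expected number of groups examined rises from 1 to 2.5 to 10 as incongruence increases.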
A straightforward parallel search model, in which all arrows in a presented stimulus set are simultaneously selected and processed (e.g., by setting k for the V4 and IPS layers to 5 in the five-arrow conditions), would presumably predict that all conditions of equal stimulus set size (e.g., 3:2, 4:1, and 5:0) have roughly the same response time (i.e., the Output unit representing the majority will win out easily in all conditions). However, it is possible to augment this simple parallel search model with a mechanism that quantifies the incongruence in a parallel fashion. For example, a mutual competition between the two units in the Output layer (by setting its k to 1 in the current model) allows increased RTs in response to incongruent stimuli without engaging the whole V4–ACC–LPFC–IPS–V4 loop (see Gilbert and Shallice, 2002, for an example of incongruent competition in the context of the Stroop effect). Neurally, such a mutual competition leads to activity normalization between the two decision units, and neither unit would quickly reach a high decision threshold for a response (i.e., a slow RT in this case) when they are comparably activated (e.g., driven by equally salient incongruent stimuli), leading to a pattern of RTs as a function of signal-to-noise ratio in evidence-based decision-making (e.g., Wong et al., 2007; Grossberg and Pilly, 2008). Note that in such a model the ACC–LPFC networks can still detect incongruence, if any, but the effect of such detection could come too late to delay the response (i.e., an Output unit may have already fired). To a certain extent the grouping model can be regarded as an enhanced version of this augmented parallel search model, in the sense that in the grouping model (1) a subset of stimuli can be simultaneously processed; and (2) the sensitivity to signal-to-noise ratio in different conditions is magnified by conflict detection through the V4–ACC–LPFC–IPS–V4 loop.
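The effect of mutual competition on response time can be illustrated with a toy pair of mutually inhibiting accumulators. This is a deliberately simplified sketch and not Leabra's kWTA mechanism or the model reported here: the update rule, the function name `competition_rt`, and the parameter values (`threshold`, `rate`, `inhibition`) are all assumptions chosen for illustration.

```python
def competition_rt(n_left, n_right, threshold=1.0, rate=0.1,
                   inhibition=1.0, max_steps=10_000):
    """Return (winning_unit, steps_to_threshold) for two competing units.

    Each unit is driven by the proportion of arrows in its direction and
    inhibited by the current activity of the other unit.
    """
    total = n_left + n_right
    drive = (n_left / total, n_right / total)  # normalized input
    left, right = 0.0, 0.0
    for step in range(1, max_steps + 1):
        # Synchronous update; activities are clipped at zero.
        new_left = max(0.0, left + rate * (drive[0] - inhibition * right))
        new_right = max(0.0, right + rate * (drive[1] - inhibition * left))
        left, right = new_left, new_right
        if max(left, right) >= threshold:
            return ("left" if left >= right else "right"), step
    return None, max_steps  # no unit reached threshold


if __name__ == "__main__":
    for counts in [(5, 0), (4, 1), (3, 2)]:
        print(counts, competition_rt(*counts))
```

With these illustrative parameters the majority unit wins in every condition, but it takes more steps to reach threshold as the two inputs become more comparable, which is the qualitative RT pattern described above.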
With advances in functional brain imaging, these claims lead to further hypotheses regarding the brain activity and connectivity that support task performance. While functional brain imaging studies are necessary to examine the involvement of these brain areas, such studies alone can hardly reveal the dynamics of the brain in instantiating the computation. In the current study we show that the dynamics of majority function computation in the brain can be studied by developing biologically realistic computational models of the task. In general, a biologically plausible computational model that performs the task under conditions similar to those faced by humans and produces results that fit the human data provides not only an existence proof of the underlying algorithm but also a detailed process-based explanation of how the algorithm might be implemented in the brain (Marr, 1982; Anderson and Lebiere, 1998; O'Reilly and Munakata, 2000; O'Reilly, 2006; McClelland, 2009; Sun, 2009). Specifically, we developed two models of MFT performance, one simulating a grouping search algorithm and the other simulating a self-terminating search algorithm. The two models share the same network structure and both are able to perform the task. Nevertheless, they involve the function of cognitive control differently.
The grouping search model demonstrates how modules simulating different brain functions work together to instantiate cognitive control for the majority function computation. Two critical components of the algorithm, sampling and re-sampling, are implemented through Leabra's built-in kWTA mechanism and the joint work of the V4–ACC–LPFC–IPS loop. With k in V4 and IPS set to be the respective threshold in each condition, sampling is naturally implemented in V4. Because the network weights are randomly set, the initial sampling can be random as well. When a congruent sample is selected, a response can be quickly generated. When an incongruent sample is selected, the incongruence is detected and re-sampling occurs. More importantly, the model shows that cognitive control, important for the detection of incongruence in the selected sample and subserved by a set of neural modules, is recruited to modulate re-sampling. The longer RT in the 2:1 condition than that in the 5:0 condition, for example, is vividly explained by the frequent activations of the ACC and LPFC layers and the subsequent extra re-sampling processes in the 2:1 condition but the lack of those in the 5:0 condition. These results highlight the particular involvement of ACC and LPFC in implementing the function of cognitive control in the MFT and similar tasks.
It is interesting to note that the grouping search model can be revised to implement the self-terminating search, but the resulting model fails to fit the human data. The essential change concerns the k setting in the V4 and IPS layers, which is gradually increased to simulate sequential scanning and counting, a necessary component of the self-terminating search. The failure of the self-terminating model to fit the human data provides, to a certain extent, further support for the claim that the grouping search model captures some essential constraints of cognitive control in the task. However, it is important to note that other models are certainly possible and that many claims of the grouping search model are open to further experimental investigation. For example, it is possible to implement the self-terminating search more literally by fixing the k setting in the V4 and IPS layers at 1 (rather than gradually increasing it) and adding recurrent connections to both units in the model Output layer to achieve evidence accumulation over time as a way of counting. In that case, although the model V4 explicitly samples one item at a time, a correct decision can still be made based on the sampling history maintained in the model Output layer.
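The more literal self-terminating search described above, in which one item is sampled at a time and a running tally stands in for the evidence accumulated in the Output layer, can be sketched algorithmically as follows. This is an abstract illustration under stated assumptions (a random scan order without revisiting items, and a hypothetical function name), not a description of the network implementation.

```python
import random


def self_terminating_search(arrows, rng=None):
    """Return (majority_direction, number_of_items_scanned).

    Items are scanned one at a time in random order; the search stops as
    soon as one direction has been counted often enough to be the majority.
    """
    rng = rng or random.Random()
    needed = len(arrows) // 2 + 1  # count that guarantees a majority
    order = list(arrows)
    rng.shuffle(order)             # assumed random scan order
    counts = {}                    # running tally per direction
    for n_scanned, arrow in enumerate(order, start=1):
        counts[arrow] = counts.get(arrow, 0) + 1
        if counts[arrow] >= needed:
            return arrow, n_scanned
    raise ValueError("no strict majority in the item set")


if __name__ == "__main__":
    print(self_terminating_search(["L", "L", "L", "L", "L"]))
    print(self_terminating_search(["L", "R", "L", "R", "L"]))
```

In this sketch a five-item set is always resolved after three to five items, so the number of scanned items varies far less across the 5:0, 4:1, and 3:2 configurations than the number of groups examined by a grouping search.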