Making accurate predictions about what may happen in the environment requires analogies between perceptual input and associations in memory. These elements of predictions are based on cortical representations, but little is known about how these processes can be enhanced by experience and training. On the other hand, studies on perceptual expertise have revealed that the acquisition of expertise leads to strengthened associative processing among features or objects, suggesting that predictions and expertise may be tightly connected. Here we review the behavioral and neural findings regarding the mechanisms involving prediction and expert processing, and highlight important possible overlaps between them. Future investigation should examine the relations among perception, memory and prediction skills as a function of expertise. The knowledge gained by this line of research will have implications for visual cognition research, and will advance our understanding of how the human brain can improve its ability to predict by learning from experience.
When walking on the street, we are not surprised to see cars, parking meters, or traffic lights. If we catch a glimpse of something that appears on the sidewalk and quickly disappears into the bushes, we may think that it could be a bird, a squirrel, or a cat, depending on its size and shape. However, we would be baffled if we instead saw something unpredicted, such as a goat or an anchor, on the street, because it would be completely out of context. While external information from the world is continuously extracted and processed by various sensory modalities, the human brain readily generates top–down predictions¹ based on associations in memory formed from previous experience to make sense of and interact with the environment (Bar, 2007). Various predictions may be formed continuously. For instance, when we see a parking meter, we predict that a car is likely parked next to it. Or we predict that a blurry impression is a harmless squirrel. Recent work on visual prediction has suggested that predictions are formed rapidly and draw on associative connections stored in long-term memory (e.g., Bar, 2004, 2009; Gilbert and Wilson, 2007; Schacter et al., 2007, 2008).
Strong associative activations and fast processing speed are also characteristics of expert processing (e.g., Chase and Ericsson, 1981; Freyhof et al., 1992; Richler et al., 2009). For instance, while most people may recognize a fast approaching car merely as a ‘silver car’, a car expert may recognize it instantaneously as the newest model of Jaguar XF, know what engine it may have, and can distinguish between this and other comparable models. In this review, we highlight the possible relations between the processes responsible for prediction and the processes involved in expert processing. We focus our discussion on recent behavioral and imaging findings on visual prediction and on visual expertise, as theories in these two areas have been elaborated and studied especially in the last decade (e.g., Bar, 2003, 2004; Gauthier et al., 2000a; Wong et al., 2009a). Merging the findings from these two literatures offers new insights on the role of associative processing in a variety of cognitive processes that are central to our mental lives, such as recognition, learning, memory and prediction.
Making rapid and accurate predictions is beneficial in many situations and can facilitate perception and action. To do so, one needs to acquire knowledge about various attributes and relations of objects, people, and events in the world. Such knowledge, stored in memory, constitutes the basis of recognition and prediction for both familiar and unfamiliar instances (e.g., recognizing your own cat vs. a stray cat). While interacting with the environment, the human brain not only makes use of incoming perceptual information, but also compares this input with representations in memory, and generates specific and testable predictions (e.g., ‘is this my phone?’). To understand how the brain generates proactive predictions to guide cognition and behavior, Bar (2007, 2009) proposed a unified theory that links the study of analogies, associations and predictions, which have previously been studied independently (e.g., Bar, 2003, 2004; Gentner, 1983; Holyoak and Thagard, 1997; Minsky, 1975; Schank, 1975). The general idea of the proactive brain framework is that predictive processes involve finding an analogy between an input (e.g., a sofa) and a similar representation in memory (e.g., a general representation of similar sofas you saw before), which activates associated representations (e.g., a coffee table, pillows) related to the particular analogy. The co-activation of associated representations provides specific, on-line predictions on what other instances may be of high relevance in the particular context (see Fig. 1).
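The analogy, association, and prediction steps described above can be illustrated with a toy sketch. Everything in it is a hypothetical simplification introduced for illustration: the three-number feature vectors, the `MEMORY` and `ASSOCIATIONS` tables, and the nearest-neighbor matching rule are assumptions, not a model proposed in the literature reviewed here.

```python
# Toy sketch of the analogy -> association -> prediction chain.
# All feature vectors and tables below are hypothetical, for illustration only.

# Stored "general representations" in memory, as made-up feature vectors.
MEMORY = {
    "sofa": (0.9, 0.2, 0.1),
    "parking meter": (0.1, 0.9, 0.3),
}

# Associated representations linked to each stored representation.
ASSOCIATIONS = {
    "sofa": ["coffee table", "pillows"],
    "parking meter": ["car"],
}


def find_analogy(features):
    """Return the label of the stored representation most similar to the input."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(MEMORY, key=lambda label: squared_distance(features, MEMORY[label]))


def predict(features):
    """Map an input to its analogy and to the items that analogy co-activates."""
    analogy = find_analogy(features)
    return analogy, ASSOCIATIONS[analogy]


# An input that resembles, but does not exactly match, the stored sofa.
analogy, predictions = predict((0.8, 0.3, 0.2))
print(analogy, predictions)
```

The point of the sketch is only that an imperfect match to memory is enough to retrieve a set of associated items, which then serve as predictions about what else is likely to be present.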
Since objects in the world rarely appear in isolation but rather appear in typical configurations with other objects that share the same context, knowledge about associations among objects becomes highly useful to understand what to expect in various situations. Objects can be related to each other or to the environment in numerous ways: for instance, a microwave and an oven are kitchen appliances; a cell phone or a laptop may be carried by a businessperson in an office, a train station, or a coffee shop. It has been proposed that stored memory representations of objects are clustered and linked depending on the relatedness of the objects. These clusters of related representations can be referred to as ‘context frames’ (e.g., Bar, 2004; Bar and Ullman, 1996; Barsalou, 1992; Friedman, 1979; Mandler and Johnson, 1976; Palmer, 1975; Schank, 1975). In such representations of context, certain elements are generally expected to appear (e.g., a sofa and a TV set in a living room, or a ball and a hoop in a basketball game). Context frames can be formed from real-world experience via implicit observations or explicit learning. For instance, implicit learning can occur for covariance between shapes, syllables, tones, or even more abstract, conceptual categories that appear in a predictable arrangement (Behrmann et al., 2005; Brady and Oliva, 2008; Chun and Jiang, 1998; Fiser and Aslin, 2001; Saffran et al., 1996; Saffran et al., 1999; Turk-Browne et al., 2005). In addition, meaningful relations about objects, people, or events can also be learned explicitly (e.g., a bottle is for holding water; Superman and Clark Kent are the same person).
Associative processing is quickly triggered merely by looking at an everyday object (e.g., a chair; Aminoff et al., 2007; Bar and Aminoff, 2003; Bar et al., 2007a), and such associative processing is critical for visual recognition and prediction. The efficiency of predicting the occurrence of an item depends on the consistency between bottom–up sensory input and stored associative representations in memory. When seeing a salient item in a picture (e.g., a football player), associative processing may lead an observer to expect a particular context (e.g., a football field) and other objects in the scene (e.g., banners, cheering fans), as all these are predictable within the same context frame. But if an unexpected item occurs in a given context instead (e.g., a clergyman in a football field, see Fig. 2), recognition of either the item or context becomes hindered, presumably because the incongruent associations do not match our predictions or expectations (Davenport, 2007; Davenport and Potter, 2004; Joubert et al., 2007; Mack and Palmeri, 2010; Palmer, 1975; see also Biederman et al., 1982). These findings suggest that context frames are activated to generate predictions by seeing either a familiar object or context (Bar and Ullman, 1996), and that observers are unable to selectively attend to an item while ignoring the context (Davenport and Potter, 2004; Joubert et al., 2007; see also Mack and Palmeri, 2010). Notably, the ultra-rapid detection of inconsistency between objects and scenes suggests that associative predictions may be generated instinctively (e.g., with presentation times as brief as 26 ms; Joubert et al., 2007; Mack and Palmeri, 2010).
Predictions propagate from top–down mechanisms to influence bottom–up processes. The information extracted from bottom–up inputs to support the top–down processes may be minimal, such that the information can be analyzed and interpreted promptly. Bar (2003, 2004) proposed that only limited sensory information is necessary to trigger predictive processes (see Fig. 3A). Namely, partial information about objects in a visual context conveyed by low spatial frequencies (LSF) is extracted and processed rapidly relative to fine details carried by high spatial frequencies (HSF). The LSF information is then matched to representations in memory that may be ‘averaged’ from similar instances previously encountered. The general representations of objects or scenes, or ‘gist’ (Oliva and Torralba, 2007; Torralba and Oliva, 2003), reduce the number of possible identities for attended objects (e.g., umbrellas, lamps), and are often adequate for matching objects at the basic level for everyday recognition (Rosch et al., 1976) since visual features of objects in the same basic-level categories (e.g., dogs) are often similar to each other (e.g., compared to cats, Fig. 3B). In sum, by linking the general impression of a new input with the most similar representation in memory, top–down mechanisms can quickly generate predictions that facilitate bottom–up processes in object and context recognition (Aminoff et al., 2008; Bar et al., 2006a; Kveraga et al., 2007).
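To make the LSF idea concrete, the following is a minimal computational sketch, not a model of cortical processing. It assumes that LSF extraction can be approximated by an ideal low-pass filter in the Fourier domain and that matching to memory can be approximated by Euclidean distance to stored prototypes. The function names, the cutoff of 4 cycles per image, and the synthetic ‘tall’ and ‘wide’ blob prototypes are all hypothetical choices made for illustration.

```python
import numpy as np


def low_pass(image, cutoff):
    """Keep only spatial frequencies at or below `cutoff` (in cycles per image)."""
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows) * rows  # vertical frequencies, cycles per image
    fx = np.fft.fftfreq(cols) * cols  # horizontal frequencies, cycles per image
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = radius <= cutoff  # ideal (hard-edged) low-pass filter
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))


def best_match(image, prototypes, cutoff=4):
    """Return the label of the prototype closest to the LSF version of the input."""
    gist = low_pass(image, cutoff)
    distances = {
        label: np.linalg.norm(gist - low_pass(proto, cutoff))
        for label, proto in prototypes.items()
    }
    return min(distances, key=distances.get)


# Hypothetical stored prototypes: a tall blob and a wide blob on a 32 x 32 grid.
size = 32
yy, xx = np.mgrid[0:size, 0:size]
tall = np.exp(-(((xx - 16) / 3) ** 2 + ((yy - 16) / 10) ** 2))
wide = np.exp(-(((xx - 16) / 10) ** 2 + ((yy - 16) / 3) ** 2))
prototypes = {"tall": tall, "wide": wide}

# A new input: the tall blob corrupted by pixel noise, which is mostly high-frequency.
rng = np.random.default_rng(0)
noisy = tall + 0.3 * rng.standard_normal((size, size))

print(best_match(noisy, prototypes))
```

In this sketch the coarse, low-pass version of the input is enough to select the correct prototype because most of the noise energy lies above the cutoff and is discarded before matching.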
Observers appear to be able to generate associations and predictions reliably and possibly automatically, and this ability is likely acquired through extensive experience while interacting with the world. Just how much our ability to produce helpful predictions is enhanced by experience and further training is an important open question. Every person possesses some level of expertise in many domains, but enthusiasts of various domains (e.g., birdwatchers, chess players, musicians, stamp collectors) possess much broader and deeper visual and non-visual knowledge that is associated with items in their domains of expertise, compared with average people or novices (e.g., Tanaka and Taylor, 1991). Here we synthesize evidence from the existing literature that indicates that experts elicit stronger associative processing due to enhanced visual and nonvisual knowledge and are faster and more accurate in making use of analogies and generating predictions about their specific domains of expertise. Note that associative or predictive processes can be observed in most people. For instance, color information (e.g., yellow) may be strongly associated with certain everyday objects (e.g., banana). Such color–shape associations can influence perception of the actual color of the objects (Hansen et al., 2006; Witzel et al., 2011). Nonetheless, a key question is whether such processes can be enhanced through training. Our focus here is on expertise in visual perception, although the influence of enhanced associative knowledge on predictions is likely general to various areas of non-visual expertise (e.g., athletes; see Ericsson and Lehmann, 1996; Ericsson and Smith, 1991).
Not surprisingly, perceptual expertise² leads to improvements over novices in many visual perception or memory tasks. At a glance, advanced birdwatchers or car experts excel at identifying individual objects in their respective areas of expertise, regardless of whether the task involves classification at the subordinate-level (e.g., ‘black-winged snowfinch’ or ‘Honda Civic 2004’) or the basic-level (e.g., ‘bird’ or ‘car’) (Mack et al., 2009; Tanaka, 2001; Tanaka and Taylor, 1991; Wong et al., 2009a); grandmasters of chess have larger visual and memory capacity for and greater search efficiency with meaningful chess configurations compared with amateur or novice players (e.g., Brockmole et al., 2008; Chase and Simon, 1973). In recent years, the majority of studies on perceptual expertise have focused on the role of visual or shape properties in expert recognition and memory (e.g., Gauthier and Tarr, 1997; Gauthier et al., 2000a,b, 2003; Grill-Spector et al., 2004; Harel et al., 2010; Herzmann and Curran, 2011; Op de Beeck et al., 2006, 2008; Rhodes et al., 2004; Rossion et al., 2004; Rossion et al., 2007; Scott et al., 2006, 2008; Wong et al., 2009a,b). We suggest that both visual and non-visual knowledge can play an important role in associative processing. Here we will first describe the studies that have revealed strong and rigid perceptual associations for objects that are developed during the acquisition of perceptual expertise. These findings indicate that some aspects of perceptual expertise effects may resemble the object–context associative effect in studies on prediction discussed above.
Rigid associative relations among different features and items can be learned and expected as a result of expertise training. Since associations among features or items are particularly strongly established in experts and are retrieved automatically (cf. Schneider and Shiffrin, 1977; Shiffrin and Schneider, 1977), experts may find it impossible to ignore associated visual information, even when such information is task-irrelevant. For instance, expert readers appear to automatically extract contextual information (e.g., font, size) when reading text, such that a sequence of letters presented in the same font is recognized faster than the same sequence presented in different fonts (e.g., ‘prediction’ set in a single font vs. in mixed fonts; Gauthier et al., 2006; Mayall et al., 1997; Sanocki, 1987, 1988). Such contextual information may be completely irrelevant to the task at hand (i.e., to identify letters and words rather than the fonts). Although readers who are fluent in a language may take such contextual regularity for granted, novice readers (e.g., non-Chinese readers viewing Chinese characters) are not affected in the same manner (Gauthier et al., 2006). Likewise, impoverished or incomplete visual context may also be sufficient for experts to activate relevant associations to facilitate recognition. For instance, a briefly presented (50 ms) prime word, which consists of letters and digits (e.g., M4T3R14L) or letters and symbols (e.g., MΔT€R1ΔL) that are similar to an actual word (e.g., MATERIAL), facilitates recognition of a target word almost as much as when the prime and target are identical words (Perea et al., 2008), indicating powerful efficiency and expectancy in expert visual word recognition and prediction.
Another example of rigid visual associative processing can be found in face perception. Most adults are experts in face recognition. Although all faces are homogeneous in their configuration and features (i.e., all faces have two eyes, a nose and a mouth), we are able to discriminate and identify faces at the subordinate level (e.g., Brad Pitt) as quickly as at the basic level (e.g., male), indicating perceptual expertise (Tanaka, 2001). A characteristic of face perception is holistic processing, which reveals that all features in a face are processed as a whole (e.g., Farah et al., 1998; Tanaka and Farah, 1993; Young et al., 1987). Since facial features always co-occur and co-vary in a meaningful way (e.g., Brad Pitt’s eyes always appear with his nose and mouth), it appears natural that strong associations among facial features are developed and strengthened for face recognition expertise. For instance, seeing a big smile on someone’s face leads to the prediction or expectation of seeing dimples or scrunched eyes on the same face. In other words, it is almost impossible to selectively process one feature of a face without taking in other associated facial information. This inability for selective attention has been shown in the composite paradigm (see Fig. 4, Cheung et al., 2008; Farah et al., 1998; Hole, 1994; Richler et al., 2011; Young et al., 1987), where the top half of a face (e.g., Brad Pitt) is combined with the bottom half of another face (e.g., Matt Damon) to form a composite. Observers have great difficulty in identifying the target half of the composite (e.g., top) while ignoring the task-irrelevant half (e.g., bottom), because the representations of the facial features ‘fuse’ together within a face context.
This holistic effect arises from rigid associative processing of features and happens at a glance (≤ 50 ms presentation time, Richler et al., 2009). Intriguingly, the holistic effect strikingly resembles the object–scene consistency effect (e.g., Davenport and Potter, 2004, cf. Figs. 2 and 4), indicating failures of selective attention to a subset of information in a context. Note that the holistic effect appears to be experience-based and is associated with specific computations (e.g., subordinate-level recognition) with objects in a homogeneous category (e.g., faces, cars, birds), as the holistic effect is reduced for objects with which we have less experience (e.g., faces from an unfamiliar race; Michel et al., 2006; Tanaka et al., 2004). Furthermore, holistic processing is not unique to faces, and has also been observed in experts of non-face categories such as cars (Bukach et al., 2010; Gauthier and Tarr, 2002), novel objects (Gauthier et al., 2003; Wong et al., 2009a), English words (Wong et al., 2011), musical notations (Wong and Gauthier, 2010a), fingerprints (Busey and Vanderkolk, 2005) and chess game boards (Boggan et al., in press). This raises the possibility that the object–context association effect (e.g., Davenport and Potter, 2004) can also be strengthened by additional yet specific practice.
Although associative processing appears critical in both visual prediction and perceptual expertise, past studies from these two lines of research have asked distinct sets of questions. For instance, most studies in visual prediction are concerned with recognition of everyday objects and scenes, while most perceptual expertise research has emphasized rapid subordinate-level processing of objects in only one or a few categories. Investigation of the underlying neural mechanisms for these processes has also revealed different emphases: associative prediction research has focused on a large-scale brain network that coordinates top–down processes, whereas perceptual expertise studies have mainly concentrated on local regions in the ventral visual stream. Here we describe the main findings on the neural correlates of predictive and expert processing separately, and highlight the possible overlaps between them that likely merit further attention from both fields.
Recent studies using human functional neuroimaging have revealed a cortical network that mediates context-based associative predictions, which includes structures in three main regions: the medial temporal lobe (MTL), the medial parietal cortex (MPC) and the medial prefrontal cortex (MPFC). Note that this network largely overlaps with the ‘default mode’ network, which is active when observers are not engaged in goal-directed behavior (Raichle et al., 2001; Buckner et al., 2008). This overlap implies that associative processing comprises a large part of the functions mediated by the brain’s default mode (Bar et al., 2007b; see Fig. 5). The associative prediction network has been defined by contrasting everyday objects that are strongly associated with a specific context (e.g., a beach ball) with objects that are weakly tied to any specific context (e.g., a camera) (Aminoff et al., 2007, 2008; Bar and Aminoff, 2003; Bar, 2004). Robust context association effects are observed in the parahippocampal cortex (PHC) in the MTL and the retrosplenial complex (RSC) in the MPC. The PHC and RSC have previously been thought to be engaged in place processing (e.g., Aguirre et al., 1996; Epstein and Kanwisher, 1998) and in episodic and autobiographical memory (e.g., Ranganath et al., 2004; Svoboda et al., 2006; Wagner et al., 1998), which may be reasonable due to the associative nature of these processes (e.g., Bar and Aminoff, 2003; Bar et al., 2008). While the MTL, MPC, and MPFC may be involved in different kinds of associations, additional research is required to distinguish these relations. It is possible that the MTL is responsible for simple or unique associations (e.g., Schacter, 1987; Eichenbaum, 2000; Ranganath et al., 2004). Specifically, the PHC appears to represent stimulus-specific context and associations, which are sensitive to specific appearance (e.g., my office, Aminoff et al., 2008).
In contrast, the RSC in the MPC represents prototypical, generic information about associative context frames (e.g., an office, Aminoff et al., 2008). The representations and processes from PHC and RSC presumably interact with and provide the basis for the predictive processes in the MPFC, while the MPFC may be involved in processing deliberative or conditional associations (Bar et al., 2007b). In particular, the orbitofrontal cortex (OFC), which is a multimodal association region in the MPFC (Barbas, 2000; Kringelbach and Rolls, 2004) that receives fast projection of visual input via the magnocellular pathway (Kveraga et al., 2007), may become increasingly important in generating top–down influences to predict possible object identities when the visual input is relatively coarse (Bar et al., 2006a). These different kinds of top–down processes then facilitate modality-specific cortex such as the fusiform gyrus for object recognition (Bar et al., 2006a; Kveraga et al., 2011).
Rather than examining large-scale cortical networks, many neuroimaging studies on perceptual expertise have focused on local regions in the ventral visual pathway, such as in the face-, word- or letter-selective regions (e.g., face-selective: Kanwisher et al., 1997; McCarthy et al., 1997; word-selective: Baker et al., 2007; Cohen et al., 2000; letter-selective: Gauthier et al., 2000; James et al., 2005). Various types of perceptual expertise (e.g., with faces, words, letters, or non-face objects) differentially alter neural representations in these perceptual regions (e.g., Gauthier et al., 1999; Gauthier et al., 2000a; James et al., 2005; Xu, 2005; Wong et al., 2009b). For instance, the word- and letter-selective areas are found to be selective to expert processing of printed words or letters in various languages (e.g., Roman letters, Chinese characters, Hebrew words; Baker et al., 2007; James et al., 2005; Wong et al., 2009c). In spite of the drastically different linguistic and visual properties in these writing systems, comparable expert training in reading appears to be critical in recruiting these areas. Likewise, the ‘fusiform face area’ (FFA) was originally proposed to be selective for faces (Kanwisher et al., 1997), which are processed holistically (i.e., facial features are associated in a face context; Farah et al., 1998; Tanaka and Farah, 1993; Young et al., 1987). As mentioned above, experts of non-face objects (e.g., cars, chess game boards) also exhibit enhanced holistic processing (e.g., Bukach et al., 2010; Boggan et al., in press). The increase in holistic processing is found to be correlated with higher activity in the FFA, suggesting the possibility that the FFA can be modulated by perceptual expertise with cars, birds (Gauthier et al., 2000a), chess game boards (Bilalić et al., 2011; see also Krawczyk et al., 2011), or novel 3D objects (Gauthier and Tarr, 2002; Wong et al., 2009b).
Since experts also possess superior non-visual associative knowledge about objects of expertise (e.g., the habitat of a warbler or that of a belted kingfisher, and what sounds they make), it is conceivable that enhanced activity should also be observed in the context associative regions. Interestingly, several studies indeed reported enhanced neural activity for objects of expertise in the lateral PHC for expert birdwatchers (Gauthier et al., 2000a), car experts (Gauthier et al., 2000a; Harel et al., 2010) and advanced chess players (Campitelli et al., 2007; see also Amidzic et al., 2001). Although it is possible that the enhanced PHC activity in chess experts is related to the spatial processing of the position of chess pieces on a chessboard, it is unlikely that the similar effect in bird and car experts can be explained by spatial processing, supporting the notion that the best characterization of the role of the PHC is related to associations in general rather than space in particular (Bar et al., 2008). Indeed, this may instead be a neural indicator of associative processing in experts across domains.
In the current framework of associative prediction (e.g., Bar, 2003, 2004), the PHC, RSC and OFC influence processing in the perceptual system to guide object recognition (Bar et al., 2006a; Kveraga et al., 2011). Additionally, perceptual and semantic associations are strengthened with expertise (Gauthier et al., 2003; Tanaka and Taylor, 1991; Herzmann and Curran, 2011). Critically, how does the associative prediction network interact with the perceptual system in experts? Let’s take face expertise as an example to demonstrate the potential interactions between perceptual and predictive processes that are generated by perception of an object of expertise. When seeing a famous face (e.g., Barack Obama), all features of the face are processed holistically and the visual representation may instantly activate associated visual or non-visual details about the person (e.g., he lives in the White House; he was a Senator for Illinois). These associations can lead to predictions about what items or people may be around him (e.g., Michelle Obama or Joe Biden). Indeed, famous faces not only activate the FFA, a perceptual locus for faces (Grill-Spector et al., 2004; Kanwisher et al., 1997; Ishai et al., 1999), but also the PHC, a key area of the associative network (Bar et al., 2007a; Leveroni et al., 2000; Pourtois et al., 2005; Sergent et al., 1992; Trautner et al., 2004).³ Conversely, when meeting a new friend, all visual features of the face are also likely processed holistically.
While this representation may not be linked to specific visual or non-visual details associated with that person, you likely generate associations related to the face immediately (e.g., he looks like a friend from high school) and make predictions about different attributes of the person (e.g., whether he is friendly or aggressive; what kind of job he may have) based on similarity of the person to ‘prototypes’ or general representations of many other people you already know (Ambady et al., 2000; Bar et al., 2006b; Bar et al., 2007a; Willis and Todorov, 2006). We suggest that these processes are supported by the interactions between the context associative network and the perceptual system, and may be triggered by various kinds of perceptual cues (e.g., biological motion, Kramer et al., 2010). It is likely that these interactive processes are not only found with face or person perception but with perceptual expertise for other object categories. However, further empirical work is necessary to understand how exactly these interactions operate.
The result and purpose of associative predictions may be to activate other brain areas to be readily engaged in anticipation of what is coming. To reiterate, experts’ superior visual and non-visual knowledge would be beneficial in generating such predictions. For instance, with a glance at a single, visually presented musical notation, a multimodal brain network (including the motor and auditory cortices) is activated in proficient music readers, indicating the rich representations related to music reading and performance (Wong and Gauthier, 2010b). More importantly, experts not only have more resources for generating predictions and planning appropriate actions, but are also likely to make more accurate and elaborate predictions in a given context. For example, chess grandmasters make better moves than amateur chess players because they can ‘think ahead’ further (Charness, 1981; Holding, 1992). Specifically, strong players are more accurate in predicting the endpoint positions of the pieces than weaker players (Holding, 1989). Therefore, strong associations in long-term memory that are formed during the acquisition of expertise in a domain may lead to increased strength, depth and specificity of predictions.
Apart from the potential links between visual prediction and perceptual expertise discussed above, many questions remain to be further explored. For instance, which elements of training are critical for efficiently promoting top–down associative processing across different object categories? As mentioned earlier, it is important to distinguish the nature of the visual training tasks, as not all training requirements would lead to top–down effects that are generalizable across tasks and across exemplars within an object category. Moreover, since various object categories (e.g., faces and letters) place different computational demands on the visual system, do the different categories recruit identical top–down associative mechanisms, or are there possibly different sub-systems supporting associative processing?
Moreover, is there a qualitative or quantitative difference in the ability to predict between experts and novices? Important questions to address include what the differences are in the time course and mechanisms for accessing visual and non-visual associations between experts and novices, and whether there are any differences in the neural representations that support associations, depending on the degree of expertise. Currently, it remains unknown whether the visual and semantic associations are primarily stored in the context associative network for experts, or whether the associations may also be represented in the visual system as perceptual expertise strengthens visual performance.
It is also interesting to ask whether experts are more adaptive or inflexible when facing inconsistent or unpredictable situations. Some theoretical accounts on the flexibility of expertise skills (e.g., Ericsson and Lehmann, 1996) have suggested that experts automatically retrieve reasoning or associations linked to particular tasks or stimuli and thus cannot ignore such rigid associations even when the associations are not optimal for the task at hand. In contrast, experts might also have access to more probable associations and analogies that may contribute to more flexible and creative processing.
To answer these questions, future investigation of expertise should broaden the window of examination to include the behavioral and neural mechanisms of associative prediction in experts and novices in various domains. In sum, while most people are proficient in the skill of everyday recognition and prediction, little work in cognitive neuroscience has been done to understand the acquisition and development of this skill and how it may interact with expertise training in the perceptual system. We suggest that studying visual prediction jointly with perceptual expertise will provide a more complete picture of visual cognition.
This work was supported by NEI-NIH grant 1R01EY019477-01, NSF grant BCS-0842947, and DARPA grant N10AP20036. We thank Daryl Fougnie, Eiran Vadim Harel and Tomer Livne for helpful comments on the manuscript.
¹ The terms ‘prediction’ and ‘expectation’ have been used interchangeably in the literature to describe top–down processes that are involved in visual recognition. Here, we primarily use ‘prediction’ to describe top–down visual facilitation that is based on activations of appropriate visual or non-visual associations and analogies.
² Note that not all types of visual training would enhance top–down processing due to associations. For instance, perceptual learning studies often found task-specific improvement on visual discrimination for trained visual features or patterns (e.g., Zhang et al., 2010; for reviews, see Gilbert and Sigman, 2007; Sagi, 2011; but see Wong et al., 2011); contextual cueing studies showed implicit or explicit learning of specific contextual associations (e.g., Chun and Jiang, 1998; Brockmole and Henderson, 2006). In contrast, the training effects for perceptual expertise can be generalized across unfamiliar exemplars and tasks (e.g., identity or location tasks; e.g., Gauthier et al., 2000a, b; Wong et al., 2009a). More importantly, experts often acquire both visual and nonvisual object knowledge (e.g., Tanaka and Taylor, 1991), which we propose plays a critical role in the activation and progression of associations to facilitate object processing.
³ Moreover, attractive faces preferentially activate the OFC, which may be related to associative processing and reward or esthetic assessment (Aharon et al., 2001; Ishai, 2007; O’Doherty et al., 2003).