The ability to create and enjoy music is a universal human trait and plays an important role in the daily life of most cultures. Music has a unique ability to trigger memories, awaken emotions and intensify our social experiences. We do not need to be trained in music performance or appreciation to reap its benefits; even as infants, we relate to it spontaneously and effortlessly. There has been a recent surge in neuroimaging investigations of the neural basis of musical experience, but the way in which the abstract shapes and patterns of musical sound can have such profound meaning to us remains elusive. Here we review recent neuroimaging evidence and suggest that music, like language, involves an intimate coupling between the perception and production of hierarchically organized sequential information, the structure of which has the ability to communicate meaning and emotion. We propose that these aspects of musical experience may be mediated by the human mirror neuron system.
It is a common experience to be transported back through time when we hear one of our favorite songs: to the summer vacation with our best friend; to the school disco; to that cross-country drive; to our first teenage crush. Music is a universal human trait, offering the mind a unique mode of communication, a means of evoking or stimulating different emotions and, most importantly, the kind of experience that can unite and define social groups, generations and cultures. People around the world use song and dance to tell stories, to conduct rituals, to teach children about their history and culture, to entertain and to relax. We relate to music spontaneously and effortlessly, and often with an emotional response. Yet the nature of such musical experiences is extremely complex; the processing and representation of music involves a multitude of perceptual and cognitive mechanisms that have yet to be fully described. It has recently been proposed that music is best understood as a form of communication in which acoustic patterns and their auditory representations elicit a variety of conscious experiences (Bharucha et al., 2006). Here we review some recent evidence on the neural basis of musical processing in relation to two other modes of communication, language and action, both of which have been described as supported by the human mirror neuron system. We hypothesize that the powerful affective responses that can be provoked by apparently abstract musical sounds are supported by this human mirror neuron system, which may subserve similar computations during the processing of music, action and linguistic information.
The mirror neuron system has been proposed as a mechanism allowing an individual to understand the meaning and intention of a communicative signal by evoking a representation of that signal in the perceiver's own brain. Neurons with these ‘mirror’ properties have been described both in area F5 of the premotor cortex and in parietal area PF of the macaque brain (Rizzolatti and Craighero, 2004). These visuomotor neurons discharge both when the monkey performs an action and when it observes another individual perform a similar action (di Pellegrino et al., 1992; Gallese et al., 1996; Fogassi et al., 2005). It has been suggested that parietal mirror neurons have the special property of coding motor acts as belonging to an action sequence, predicting the intended goal of a complex action (Fogassi et al., 2005). In addition, subsets of premotor mirror neurons have been shown to have audiovisual properties, and are able to represent actions independently of whether they are performed, heard or seen (Kohler et al., 2002). Thus, area F5 of the ventral premotor cortex and area PF of the inferior parietal lobule in the monkey are considered to form a fronto-parietal mirror neuron system critical to action understanding and intention attribution (Rizzolatti et al., 2001; Rizzolatti and Craighero, 2004; Fogassi et al., 2005).
A similar fronto-parietal network, including the posterior inferior frontal gyrus (BA 44), adjacent ventral premotor cortex and the inferior parietal lobule (BA 40), appears to subserve related functions in the human brain (Rizzolatti and Craighero, 2004). A large number of neuroimaging studies have now shown that such a human fronto-parietal mirror neuron system is engaged during action observation and imitation (Fadiga et al., 1995; Hari et al., 1998; Iacoboni et al., 1999; Johnson-Frey et al., 2003; Molnar-Szakacs et al., 2005, 2006; Aziz-Zadeh et al., 2006). The proposed frontal mirror neuron region has also been shown to be involved in understanding the intentions behind the actions of others (Iacoboni et al., 2005). In addition, there is some recent evidence to suggest that, as in the monkey, the human mirror neuron system shows sensitivity to auditory stimuli related to actions (Aziz-Zadeh et al., 2004; Buccino et al., 2005). Thus, a range of current evidence suggests that a human fronto-parietal mirror neuron system shows properties consistent with the ability to represent the actions and intentions of others, across modalities, by recruiting one's own motor system. For the purposes of this review, we will refer to this action observation/execution matching system as the ‘human mirror neuron system’.
To date, parallels between the activity of mirror neurons recorded in the monkey and human neuroimaging findings have been established primarily in the domain of action observation/execution. However, functions of this neural system have recently been linked to several high-level human cognitive functions such as empathy (Carr et al., 2003; Gallese, 2003b; Dapretto et al., 2006), theory-of-mind (Williams et al., 2001, 2006) and self-other discrimination (Uddin et al., 2005, 2006). A notion of shared representations for production and perception of speech has been previously proposed and formalized in the Motor Theory of Speech Perception (Studdert-Kennedy et al., 1970; Liberman and Mattingly, 1985; Liberman and Whalen, 2000). This theory holds that two-way communication is based on shared representation and occurs when sender and perceiver co-activate this representation. Mirror neurons may provide the neural basis of this shared representation between sender and perceiver, which Liberman postulated as the necessary prerequisite for any type of communication (Liberman and Mattingly, 1985; Rizzolatti and Arbib, 1998). Gallese (2003a) has also proposed that this perception-action link is supported by an automatic and non-conscious simulation mechanism, whereby one uses the same neural resources to represent and understand the actions of others as to perform one's own actions. Such a neural system allows one, in essence, to experience the mind of the other, or as the expression would have it, to ‘walk in another's shoes’.
How might this system for action representation be involved in the experience of music? Until the recent advent of recorded music and synthesized sounds (recent relative to human evolution), music has always been associated with motor activity. From drumming to singing to virtuosic sitar playing, the production of music involves well-coordinated motor actions that produce the physical vibrations of sound. The experience of music thus involves the perception of purposeful, intentional and organized sequences of motor acts as the cause of temporally synchronous auditory information. According to the simulation mechanism implemented by the human mirror neuron system, someone listening to singing or drumming therefore engages a motor network similar or equivalent to that engaged by the actual singer or drummer, from the large-scale movements of different notes to the tiny, subtle movements of different timbres. This allows for co-representation of the musical experience, emerging out of the shared and temporally synchronous recruitment of similar neural mechanisms in the sender and the perceiver of the musical message. This shared musical representation has a similar potential for communication as shared language or action.
The connection between music and motor function is evident in all aspects of musical activity—we dance to music, we move our bodies to play musical instruments, we move our mouths and larynx to sing. A number of recent neuroimaging studies have shown that specific musical experience or expertise can modulate the activity within the fronto-parietal mirror neuron system (Haslinger et al., 2005; Bangert et al., 2006), as can dancing experience (Cross et al., 2006) and music-related motor learning (Buccino et al., 2004; Calvo-Merino et al., 2004). Other recent studies have relied on the coupling of perception and action in musical experience to investigate the neural organization of such complex behaviors as sequence learning and temporal production (Janata and Grafton, 2003).
Music, of course, is a communicative signal comprised of patterns whose performance and perception are governed by combinatorial rules, or a sort of musical grammar (Sloboda, 1985); the auditory signal is not simply organized in consecutive sequential elements, but involves hierarchical relationships. Hierarchical organization is the process of integrating lower-level units to form more complex higher-level units and in the case of music this involves combinations of both sequential and simultaneous elements such as notes, rhythms, phrases, chords, chord progressions and keys to form an overall musical structure (Lerdahl and Jackendoff, 1983). Human language is a communicative signal with a similar hierarchical structure, in which phonemes are combined to form words, phrases and sentences up to the discourse level of speech structure (Hockett, 1960). Such principles of hierarchical organization also underlie other complex abilities such as problem-solving (Newell and Simon, 1972) and tool-use (Greenfield, 1991; Greenfield et al., 2000).
Hierarchical processing in action and in linguistic grammar has been found to show both behavioral and neural similarities in developmental investigations, psycholinguistic research, cross-species comparison and neuroscientific studies (1978, 1991, 2005). Grossman (1980) used evidence from aphasic patients to suggest that Broca's area is the common neural substrate for processing hierarchy in both language and action. He found that Broca's aphasics, who lack hierarchical organization in their syntactic production, were also impaired in recreating hierarchically organized tree structures used by Greenfield and Schneider (1977). In contrast, fluent aphasics, who have hierarchically organized (but semantically empty) speech, were able to reproduce the hierarchical structure of the models (Grossman, 1980). In the musical domain, damage to this area of the posterior inferior frontal gyrus can lead to the conjoint impairments of aphasia and amusia, a selective problem with perceiving and interpreting music (Alajouanine, 1948). Recent evidence has also shown that aphasic patients with syntactic comprehension difficulties in language exhibit similar syntactic difficulties in the domain of musical harmony (Patel, 2005).
Neuroimaging studies of language function and studies of sensory-motor integration have shown evidence of an overlap between the brain regions involved in linguistic processing and regions comprising the human mirror neuron system (Rizzolatti and Arbib, 1998; Arbib, 2005). Recent neuroimaging studies have also implicated Broca's area and its right-hemisphere homologue in the perception and representation of hierarchically organized human behavior (Koechlin and Jubault, 2006; Molnar-Szakacs et al., 2006). Furthermore, it has been proposed that parallel functional segregation within Broca's area during language and motor tasks may reflect similar computations used in both language and motor control (Molnar-Szakacs et al., 2005). In accordance with these findings, neuroimaging studies have shown that Broca's area and its right-hemisphere homologue support the processing of syntax in both language (Dapretto and Bookheimer, 1999; Friederici et al., 2000a, 2000b) and music (Patel et al., 1998; Maess et al., 2001; Koelsch et al., 2002; Patel, 2003; Tillmann et al., 2003; Koelsch and Siebel, 2005); see Figure 1. In parallel with the developmental literature on action and language, infants also seem to show implicit knowledge of principles of hierarchical organization for music; for example, they are able to distinguish different scales and show preferences for consonant over dissonant tonal combinations (Trehub, 2003).
The proposal of a common neural substrate for music, language and motor functions is supported by evidence from studies of language disorders. For example, it has been shown that children with dyslexia exhibit specific timing difficulties in the domain of music (Overy et al., 2003), motor control (Fawcett and Nicolson, 1995; Wolff, 2002) and language (Tallal et al., 1993; Goswami et al., 2002) and that music lessons with dyslexic children can lead to improvements in language skills (Overy, 2003). It has also been found that patients with severe non-fluent aphasia can benefit from Melodic Intonation Therapy (MIT), a highly imitative speech therapy technique based on singing. The technique has been shown to lead to speech improvements (Sparks et al., 1974), coupled with changes in the neural resources recruited during speech (Belin et al., 1996; Overy et al., 2005).
The role of the human mirror neuron system, Broca's area in particular, in mediating the sensory-motor transformations underlying imitation is already well established (Iacoboni et al., 1999; Koski et al., 2002, 2003; Heiser et al., 2003; Molnar-Szakacs et al., 2005). The success of music/speech therapy methods such as MIT might thus be due, at least in part, to the fact that their imitative elements involve a direct transfer of sensory information to a motor plan, leading to a strong recruitment and co-activation of brain regions involved in the perception and production of both music and language.
The range of research findings discussed so far lends support to the hypothesis that the perception of action, language and music recruit shared neural resources, which appear to be located in brain regions comprising the human mirror neuron system. Based on this evidence, we propose that humans may comprehend all communicative signals, whether visual or auditory, linguistic or musical, in terms of their understanding of the motor action behind that signal, and furthermore, in terms of the intention behind that motor action. The expressive nature of any human action or vocalisation sends a signal of the intentional and emotional state of the executor, such that even footsteps can be correctly interpreted as conveying simple emotions (such as sad, happy, angry or stressed) (de Gelder, 2006). Thus, as a sentence or a musical phrase can be used to express an individual's semantic intention or emotional state, a listener can understand the intended expression of the sentence or melody, via the perceived ‘motion’ of the signal. Since the acoustic nature of music can convey pure, non-referential ‘motion’ in pitch-space and time, it can thereby convey complex and subtle qualities of human ‘e’motion, using varying complexities of structural hierarchy.
Indeed, one of the defining features of music is its ability to induce an emotional response in listeners (Gabrielsson, 2001) and one of the main reasons people give for listening to music is to experience or modulate their emotional state (Sloboda and O'Neill, 2001). Emotional responses to music are present in early life and across cultures (Balkwill and Thompson, 1999), indicating that the ability to perceive emotions in music may be innate (Zentner and Kagan, 1996; Trevarthen, 1999). Numerous measures of autonomic arousal have been used to investigate the emotion-inducing effects of music. Skin conductance responses appear to be useful indicators of musically induced emotional arousal (VanderArk and Ely, 1992, 1993), and ‘chills’ (goosebumps) can be elicited when participants are allowed to select music they find arousing (Panksepp, 1995; Gabrielsson, 2001). Neuroimaging studies of affective responses to music have revealed the involvement of a network of paralimbic and neocortical regions, including frontal pole, orbitofrontal cortex, parahippocampal gyrus, superior temporal gyrus/sulcus, cingulate and the precuneus (Blood et al., 1999; Blood and Zatorre, 2001; Koelsch et al., 2005; Menon and Levitin, 2005; Koelsch et al., 2006). These regions correspond well to brain regions previously associated with processing emotional states and evaluating reward, particularly in socially relevant cognitions (Adolphs, 1999, 2001, 2003; Adolphs et al., 2000).
Emotion, especially as communicated by the face, the body and the voice, is an active motor process. Emotion and action are intertwined on several levels, and this motor-affective coupling may provide the neural basis of empathy (Carr et al., 2003; Leslie et al., 2004), especially the aspect of empathy that requires no intermediary cognitive process but is instead our automatic and immediate ‘motor identification’ or inner imitation of the actions of others (Lipps, 1903; Gallese, 2003a). The chameleon effect, whereby empathic individuals exhibit non-conscious mimicry of the postures, mannerisms and facial expressions of others, provides strong support in favor of this theory (Chartrand and Bargh, 1999). There is also recent neuroimaging evidence that the anterior insula, the right amygdala and mirror neuron areas in the posterior inferior frontal gyrus show enhanced activity during imitation of emotional facial expressions vs simple observation, providing additional support for the role of sensorimotor-affective coupling in understanding the emotions of others (Carr et al., 2003). In addition, Adolphs and colleagues have shown that in a group of lesion patients, those with a lesion of the amygdala and the sensorimotor cortex performed the worst when asked to name or rate facial expressions of emotion (Adolphs et al., 2000). Thus, it appears that the mirror neuron system is involved not only in the intersubjective representations of actions but also in emotion, representations that allow us to feel connected with other agents.
It has been suggested that the perception of emotion in music may arise in part from its relation to physical posture and gesture (Davies, 1994; Jackendoff and Lerdahl, 2006). It has also been shown that expressive music can induce subliminal facial expressions in listeners (Witvliet and Vrana, 1996), and these in turn may induce subjective and physiological emotional expressions (Ekman et al., 1983). As posture, gesture and facial expressions are important implicit cues in social communication, one can easily imagine that ‘musical gesture’ can have similar effects in communicating emotions. Two regions, the posterior inferior frontal gyrus and the anterior insula, are commonly activated during musically evoked emotional states (Koelsch et al., 2005, 2006; Menon and Levitin, 2005). These two structures may hold the key to understanding how the brain uses a simulation mechanism to represent emotional states evoked by musical experience. As described above, the posterior inferior frontal gyrus (BA 44) is the frontal component of the human fronto-parietal mirror neuron system. With its ability to link perceptual and behavioral representations of a stimulus during the perception of emotionally arousing music, the mirror neuron system may simulate an emotional state in the listener (Gridley and Hoff, 2006).
The anterior insula may also support neural representations for subjective autonomic states, including bodily states such as pain and hunger, as well as more subtle states such as perception of heart rate and emotional awareness (Craig, 2002, 2003, 2004; Critchley et al., 2004). Anatomical data show that the insular lobe has reciprocal connections with the limbic system as well as with posterior parietal, inferior frontal and superior temporal cortex (Augustine, 1996). Through its connection to regions of motor significance, the anterior insula has been proposed to serve as the neural relay station between the human mirror neuron system, which links perception and action, and the limbic system, which is involved in processing emotions (Carr et al., 2003). Thus, in the case of music, the human mirror neuron system and the limbic system may communicate through the insula to provide an automatic representation of the musical stimulus (Figure 1).
In conclusion, we propose here that in its ability to integrate and represent cross-modal information, the mirror neuron system may provide a domain-general neural mechanism for processing combinatorial rules common to language, action and music, which in turn can communicate meaning and human affect. Although it is yet to be determined which specific aspects of processing linguistic, musical or motor syntax may recruit frontal mirror neuron regions, the emerging picture from the literature suggests that the mirror neuron system provides a neural substrate for representing infinite combinations of hierarchical structures, a computation that may underlie more general cognitive abilities. There is also evidence that this region may be the source of predictive models of upcoming events in sequential processing, a feature also common to language, music and action (Molnar-Szakacs et al., 2005; Zatorre and McGill, 2005).
While the evolutionary advantage of musical ability is still under debate (Hauser and McDermott, 2003), there is growing evidence that music plays a role in cognitive development, emotion regulation and social interaction (Trevarthen, 1999; Juslin and Sloboda, 2001). We propose that the human mirror neuron system may subserve some of these effects, linking music perception, cognition and emotion via an experiential rather than a representational mechanism. A review of the literature on musically induced emotions provides support for our proposal that music can invoke motor representations of emotions by recruiting the insula, a neural relay between the limbic and motor systems. Action, language and music appear to share neural resources, and we have proposed that common features governing the use and function of these means of communication may be represented within the fronto-parietal mirror neuron system. Given that music, language and action (i) show specific and relatively fixed developmental time courses (Trehub, 2001; Greenfield, 2005); (ii) are ubiquitous means of social communication in all human societies (Brown, 1991; Fiske, 2004); and (iii) share overlapping neural resources, it follows that these human abilities may be related evolutionarily. Ultimately, as further research tests some of the hypotheses presented here, we will perhaps gain valuable insight into a most fascinating aspect of humanity: the ability to express oneself creatively.
“You are the music, while the music lasts.”
- T.S. Eliot.
We thank Ralph Adolphs and an anonymous reviewer for comments on the manuscript. We also express our appreciation to the organizers and participants of The Cold Spring Harbor Workshop on the Biology of Social Cognition, July 14-20, 2006. Financial support to attend this conference was provided to KO by the Medical Research Council, UK and to IMS by the Laboratory of Cognitive Neuroscience and the Brain Mind Institute at the Ecole Polytechnique Federale de Lausanne (EPFL).
Conflict of Interest