Common coding theory states that perception and action may reciprocally induce each other. Consequently, motor expertise should map onto perceptual consistency in specific tasks such as predicting the exact timing of a musical entry. To test this hypothesis, ten string musicians (motor experts), ten non-string musicians (visual experts), and ten non-musicians were asked to watch progressively occluded video recordings of a first violinist indicating entries to fellow members of a string quartet. Participants synchronised with the perceived timing of the musical entries. Results revealed significant effects of motor expertise on perception. Compared to visual experts and non-musicians, string players not only responded more accurately, but also with less timing variability. These findings provide evidence that motor experts’ consistency in movement execution—a key characteristic of expert motor performance—is mirrored in lower variability in perceptual judgements, indicating close links between action competence and perception.
Recent studies suggest that the perception and execution of actions are tightly linked (see Casile & Giese, 2006; Knoblich & Flach, 2001; for further neurophysiological evidence, see, e.g., Calvo-Merino, Glaser, Grèzes, Passingham & Haggard, 2005; Calvo-Merino, Grèzes, Glaser, Passingham & Haggard, 2006; Chaminade, Meary, Orliaguet & Decety, 2001). These findings can be interpreted in light of the common coding theory (Prinz, 1997; see also Hommel, Müsseler, Aschersleben & Prinz, 2001; Schütz-Bosbach & Prinz, 2007). Common coding theory states that the perception and production of actions share common representations. In particular, it is argued that sensory and motor representations overlap since actions are controlled by the sensory effects they produce (Greenwald, 1970; Prinz, 1997). Both cognitive neuroscience and behavioural studies provide evidence in support of the common coding hypothesis. The identification of mirror neurons that respond to both the observation and production of an action (Di Pellegrino, Fadiga, Fogassi, Gallese & Rizzolatti, 1992; Rizzolatti & Craighero, 2004) has been interpreted as confirming common coding on a neurophysiological level. Furthermore, neural activations in motor areas have not only been found during ongoing action observation (Gallese, Fadiga, Fogassi & Rizzolatti, 1996; Iacoboni, Woods, Brass, Bekkering, Mazziotta & Rizzolatti, 1999; Rizzolatti et al., 1996), but also during the anticipation of observed action effects (Kilner, Vargas, Duval, Blakemore & Sirigu, 2004). A mechanism assumed to underlie the anticipation of observed actions is so-called motor or action simulation (e.g., Jeannerod, 1999, 2001). That is, when individuals anticipate action-related effects, they internally simulate, either covertly or explicitly, the execution of the action and thereby access their own action repertoire (Jeannerod, 2003; Knoblich & Flach, 2003).
More specifically, motor simulations involve imagining—in advance—the actions and effects that specify the event, and they also occur automatically when an action is observed (Keller, Knoblich & Repp, 2007).
In keeping with the neuroscientific evidence, behavioural studies showed similarly that during action observation a shared representation is activated in the motor system (Brass, Bekkering & Prinz, 2001; Brass, Bekkering, Wohlschläger & Prinz, 2000; Liepelt, Ullsperger, Obst, Spengler, von Cramon & Brass, 2009). Together, these findings provide support for one of the main tenets of common coding theory: that the perception and production of action are intrinsically linked by common codes. Consequently, perception and action should also be able to reciprocally induce each other (see Schütz-Bosbach & Prinz, 2007).
While there is abundant evidence that how we perceive the environment influences the way we act on it (e.g., Hayhoe, 2000), that is, how perception influences action, less is known about how actions modulate the way we perceive the environment. Recently, Aglioti, Cesari, Romani and Urgesi (2008) provided evidence that motor expertise modulates action anticipation. Aglioti et al. (2008) asked expert basketball players, expert watchers (no motor but comparable visual experience), and a control group (no basketball experience) to predict the success of basketball free-throw shots. Participants were presented with video clips of free-throw shots which were progressively occluded. Athletes who were motor experts were able to predict shot outcome (i.e., ‘in’ or ‘out’) more accurately and earlier than the other two groups. Aglioti et al. (2008) concluded that motor experience seems to be a crucial factor for anticipating the effects of others’ actions. In other words, observers are perceptually better attuned to those actions that are part of their own action repertoire (cf. Schütz-Bosbach & Prinz, 2007). This is in accordance with a corollary hypothesis derived from common coding theory, in that a high degree of overlap between perceptual and action representations facilitates action perception (Schütz-Bosbach & Prinz, 2007). Thus, the more experienced an observer is in executing an action, the more accurate is the anticipation of the same actions and their effects performed by another person (Aglioti et al., 2008; Cañal-Bruland, van der Kamp & van Kesteren, 2010).
From a motor control perspective, one of the key elements of motor expertise is the high precision with which movements are executed repeatedly (Ericsson & Lehmann, 1996; Magill, 2004). In the example of Aglioti et al. (2008), elite basketball players are expected to be significantly better at reproducing successful free throws than their less skilled counterparts. Thus, if consistency in motor execution is a crucial component of motor expertise and, in keeping with common coding theory, motor proficiency should modulate perception, then a tentative hypothesis emerges that consistency in motor execution should map onto perceptual consistency. Besides sports and other domains of expertise, music performance provides an important window into testing this hypothesis. Given the extensive training musicians accumulate before reaching expert level (Ericsson, Krampe & Tesch-Römer, 1993; Ericsson & Lehmann, 1996), musicians acquire both highly developed perceptual and motor skills. In order to meet the demands of coordination in ensemble performance, for example, musicians need to respond to each other’s gestures and sound output in the shortest possible time to reach synchronisation (see Keller, 2008). Several studies highlighted the importance of visual information in musical ensemble communication (see reviews by Goodman, 2002; Davidson & King, 2004) and between conductors and musicians (Luck & Toiviainen, 2006; Wöllner & Auhagen, 2008). There is evidence for the development of domain-specific superior perceptual processes in conductors (Nager, Kohlmetz, Altenmüller, Rodriguez-Fornells & Münte, 2003) and pianists (Repp & Knoblich, 2009).
Musicians’ motor expertise is manifest in movement parameters. Chen, Woollacott and Pologe (2006) reported that cellists at intermediate and expert level show relatively low motor variability across different shifting tasks with various movement distances and velocities. Similarly, Konczak, van der Velden and Jaeger (2009) compared novice and expert violinists who were trained with the same method (Suzuki), and found evidence for a higher degree of motor consistency and precision in experts when executing particular movements. More specifically, motor proficiency and experience were related to a suppression of sagittal motion of the shoulder, which in turn reduced variability in the motion of the violin bow. Reduced motor variability thus increases consistency and precision, and is characteristic of expert violinists.
The aim of the current study is to examine whether motor expertise maps onto perceptual consistency in a music-specific task. To test this hypothesis, we invited two groups of musicians with very high skills and comparable experience in ensemble performance, and a control group without formal musical training (non-musicians). The two groups of musicians consisted of ten string musicians with motor and visual expertise, and ten musicians playing other instruments with visual but no task-related motor expertise. Participants watched videos on a screen showing a first violinist who indicated entries to fellow musicians. The task was to indicate as exactly as possible the perceived entries of the music by pressing a corresponding button. Based on common coding theory, first, we predicted that musicians would be more accurate than non-musicians in estimating the entries. To test this hypothesis, five temporal occlusion points (cf. Aglioti et al., 2008) were created, reflecting differences in preparation time in musical ensemble performance. Given their visual and motor experience, string musicians were predicted to await the correct entry in sequences with relatively long preparation times, while being able to respond quickly in short sequences. Non-musicians, in contrast, were expected to be less accurate in the anticipation of musical entries.
Second, if a high degree of overlap between perceptual and action representations facilitates action perception, and motor simulations contribute to the anticipation of actions and their effects (cf. Jeannerod, 1999, 2001; Keller et al., 2007), then experts highly experienced in producing an action are predicted to better anticipate the same actions and their effects when observing these actions. Therefore, we further hypothesised that motor experts (i.e., string musicians) would be more consistent in timing the perceived entry, or in other words, show reduced timing variability when compared to the other two groups.
A total of thirty volunteers (twenty female) took part in the experiment. Ten advanced students of string instruments at a major conservatoire (mean age: 21.40 years, SD = 1.34), ten advanced music students of other instruments at the same conservatoire (mean age: 22.60 years, SD = 3.24) and ten participants who had not received any formal musical training and thus had no expertise in playing or observing other instrumentalists (mean age: 25.40 years, SD = 3.81) participated in the experiment. The music students had played their respective instruments for a mean of 12.45 years (SD = 3.58) and performed in musical ensembles for a mean of 5.85 years (SD = 3.28). There were no significant differences between the music student groups regarding these variables. The study was approved by the RNCM’s Ethics Committee, and participants gave their informed consent prior to taking part.
A first violinist was video-recorded (Panasonic NV-GS280 digital video camera, at a distance of approximately 2.5 m) from a frontal perspective while indicating entries to fellow musicians in a string quartet. The recordings took place in a regular rehearsal of the quartet, and no specific instructions were given to the first violinist (see footnote 1). Two of the entries were at a forte and two at a piano dynamic level. Using acoustical analysis software, entries of the playing were identified in accordance with the detectable tone onset in the waveforms. The four videos were edited in a standardised way such that the defined tone onsets occurred 2,000 ms after the beginning of each video, with a total video duration of 3,000 ms. In the experiment only visual information was shown. For each of the four videos, five temporal occlusion conditions were produced, in which the preparation time before the actual entry varied systematically. Condition 1 showed the full length of the video (tone onset at 2,000 ms after the start); in condition 2 the first 400 ms were omitted (tone onset at 1,600 ms), in condition 3 the first 800 ms were omitted (tone onset at 1,200 ms), in condition 4 the first 1,200 ms were omitted (tone onset at 800 ms), and in condition 5 the first 1,600 ms were omitted (tone onset at 400 ms). Each of the resulting 20 videos was presented twice, leading to 40 trials.
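The stimulus design described above can be summarised in a short sketch. This is only an illustration of the arithmetic behind the occlusion conditions and the trial count; the constant and function names are hypothetical and do not come from the software used in the study.

```python
# Illustrative sketch of the stimulus set; all names are hypothetical.
TONE_ONSET_MS = 2000                      # tone onset in the unoccluded video
OMITTED_MS = [0, 400, 800, 1200, 1600]    # lead-in removed in conditions 1-5
N_VIDEOS = 4                              # two forte and two piano entries
N_REPETITIONS = 2                         # each video was presented twice


def occlusion_conditions():
    """Return the omitted lead-in and the resulting entry time per condition."""
    conditions = []
    for number, omitted in enumerate(OMITTED_MS, start=1):
        conditions.append({
            "condition": number,
            "omitted_ms": omitted,
            # Entry time measured from the start of the occluded clip.
            "entry_ms": TONE_ONSET_MS - omitted,
        })
    return conditions


conditions = occlusion_conditions()
total_trials = N_VIDEOS * len(conditions) * N_REPETITIONS

for c in conditions:
    print(c)
print("total trials:", total_trials)
```

The shorter the remaining lead-in, the less preparatory movement is visible before the entry, which is the manipulation the occlusion conditions are meant to provide.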
Participants were tested individually watching the forty videos without sound on a 19-in. (1,280 × 1,024, 75 Hz) computer screen. They were asked to press a computer key when they perceived the entries of the music as indicated by the first violinist. Since the actual playing time after the defined onsets lasted 1,000 ms and no sound was provided, no information was given about the musical excerpts being performed by the quartet. In addition, none of the participants had previously played with the first violinist in a chamber ensemble, and none of them was familiar with the composition. Purpose-written computer software played back the videos and simultaneously recorded participants’ keystrokes. After five practice trials with a different violinist from the one in the experimental videos, participants were presented with the videos in random order. In addition, they provided demographic information about their musical background on a questionnaire and took part in a simple visual reaction time test (cf. Hughes & Franz, 2007), which consisted of nine short videos. Following a fixation cross in the centre of the screen at 1,000 ms, participants were required to press a computer key when a blue circle-stimulus appeared (at 1,400, 1,800 or 2,000 ms).
Responses were recorded from the beginning of each video. Response times for the two videos with similar dynamic level and the two repetitions were averaged per participant. First, a 2 (dynamic level) × 5 (temporal occlusion) × 3 (expertise groups) repeated measures ANOVA was calculated on the response times. Second, the standard deviations in the response times for all videos (2 dynamic levels × 2 videos × 5 occlusion points) were used as an index of timing variability and subjected to a one-way ANOVA on the factor expertise group. Finally, timing variability was further investigated with absolute deviations from the pre-defined entry of the music (e.g., at 2,000 ms in temporal occlusion condition 1), which were subjected to a repeated measures ANOVA. The alpha level for significance was set at 0.05, and the effect sizes were calculated using partial eta squared values (ηp²). If the sphericity assumption was violated, the Greenhouse-Geisser correction was used. In addition, if error variances were not equal across groups, appropriate post-hoc procedures (Dunnett T3) were employed.
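As a rough illustration of how the Greenhouse-Geisser correction leads to fractional degrees of freedom in a mixed design like this one, the sketch below scales both degrees of freedom of a within-participants effect by a correction factor epsilon. The function name and the epsilon value of 0.65 are assumptions chosen for illustration; the epsilon values actually estimated in the study are not reported.

```python
def greenhouse_geisser_df(n_levels, n_participants, n_groups, epsilon):
    """Scale the degrees of freedom of a within-participants effect by epsilon.

    Assumes a mixed design with one between-participants grouping factor.
    Epsilon lies between 1 / (n_levels - 1) and 1 (1 = sphericity holds).
    """
    df_effect = n_levels - 1
    if not (1 / df_effect <= epsilon <= 1):
        raise ValueError("epsilon must lie between 1/(k-1) and 1")
    df_error = df_effect * (n_participants - n_groups)
    return epsilon * df_effect, epsilon * df_error


# Five occlusion conditions, thirty participants, three groups.
print(greenhouse_geisser_df(5, 30, 3, 1.0))    # uncorrected: (4.0, 108.0)
print(greenhouse_geisser_df(5, 30, 3, 0.65))   # hypothetical epsilon
```

The test statistic itself is unchanged by the correction; only the degrees of freedom used to obtain the p value shrink.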
Response time lags were calculated for the reaction time test. There were no significant differences between the three groups of participants, F(2, 27) = 0.72, ns. The mean time lag between signal and recorded response was M = 295.14 ms (SD = 75.05).
A repeated measures ANOVA resulted in a significant main effect for the within-participants factor Temporal occlusion, F(2.61, 70.44) = 442.56, p < 0.001, ηp² = 0.94. The longer the videos were presented (i.e., the more information prior to the entry was provided), the later participants indicated the perceived entry. There was also a significant main effect for the within-participants factor Dynamic level, F(1, 27) = 27.46, p < 0.001, ηp² = 0.50, indicating that participants responded differently to the quality of the violinist’s movements (piano vs. forte). The mean response time for the forte videos was M = 1,460.22 ms (SE = 25.87) from the start of the videos, and for the piano videos M = 1,657.23 ms (SE = 42.00), suggesting that piano entries caused longer timing delays and were perceived less accurately.
While there was no significant main effect for the between-participants factor Group, F(2, 27) = 0.16, ns, the interaction between Group and Temporal occlusion was significant, F(8, 108) = 5.69, p < 0.001, ηp² = 0.30. As illustrated in Fig. 1, string musicians (motor and perceptual experts) responded later when they synchronised with the entries in longer videos, but also reacted faster in response to short videos, where the entries occurred only 400 ms after the beginning of the video. Given the time lags present in the reaction time task, some participants in the group of non-musicians responded prematurely in long duration videos and were also less accurate in short videos, where quick responses were required. The latter effect is also present in non-string musicians. In contrast, motor experts not only used the information more accurately in longer videos, but also showed higher efficiency in responding promptly to short videos. There were no further interactions between variables.
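The effect sizes reported above can be cross-checked from the F values and their degrees of freedom, because partial eta squared equals F·df_effect / (F·df_effect + df_error). The sketch below applies this standard identity to the three significant effects; the function name is a hypothetical choice for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


reported = {
    "Temporal occlusion": (442.56, 2.61, 70.44),
    "Dynamic level": (27.46, 1, 27),
    "Group x Temporal occlusion": (5.69, 8, 108),
}

for effect, (f_value, df1, df2) in reported.items():
    print(f"{effect}: {partial_eta_squared(f_value, df1, df2):.2f}")
# Temporal occlusion: 0.94
# Dynamic level: 0.50
# Group x Temporal occlusion: 0.30
```

All three recomputed values agree with the reported effect sizes to two decimal places.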
Timing consistency as indicated by reduced timing variability was tested by calculating the standard deviations of the response times for each video across groups. A one-way ANOVA yielded a significant difference between groups, F(2, 57) = 15.28, p < 0.001, ηp² = 0.35. The mean timing variability of the string musicians was 137.80 ms (SD = 77.76), of the non-string musicians 272.89 ms (SD = 135.84), and of the non-musicians 338.62 ms (SD = 129.11). Pairwise post-hoc comparisons (Bonferroni) revealed that string musicians were significantly less variable in their estimations of the entry than non-string musicians (p < 0.01) and non-musicians (p < 0.001). Differences between non-string musicians and non-musicians did not reach significance. Thus, the group with the highest motor expertise outperformed the other groups in the consistency of timing their responses (Fig. 2).
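One reading of this analysis that is consistent with the reported degrees of freedom, F(2, 57), is that a standard deviation was computed for each of the 20 videos within each group, giving 60 variability scores for the one-way ANOVA. The sketch below illustrates that reading. The response times are randomly generated placeholders, not the study's data, and the spread values and function names are hypothetical.

```python
import random
import statistics


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [v for sample in samples for v in sample]
    grand_mean = statistics.fmean(all_values)
    ss_between = sum(
        len(s) * (statistics.fmean(s) - grand_mean) ** 2 for s in samples
    )
    ss_within = sum(
        (v - statistics.fmean(s)) ** 2 for s in samples for v in s
    )
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


random.seed(0)
N_VIDEOS, N_PARTICIPANTS = 20, 10
# Hypothetical response-time spread (ms) per group; NOT the study's data.
spread_ms = {"string": 140, "non_string": 270, "non_musician": 340}

variability = {}
for group, spread in spread_ms.items():
    sds = []
    for _ in range(N_VIDEOS):
        # Simulated responses of all participants in one group to one video.
        responses = [random.gauss(1500, spread) for _ in range(N_PARTICIPANTS)]
        sds.append(statistics.stdev(responses))
    variability[group] = sds

f_value, df1, df2 = one_way_anova(list(variability.values()))
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With three groups and 20 variability scores per group, the degrees of freedom come out as 2 and 57, matching the reported test.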
In order to further investigate timing variability in relation to temporal occlusion, indicated dynamic level and group of participants, absolute difference values between each participant’s individual response and the pre-defined entries of the music were calculated. These absolute deviations capture the magnitude of the asynchronies irrespective of their sign (negative or positive). For instance, the entry of videos in occlusion condition 5 was at 400 ms; this value was subtracted from each participant’s response time, and only absolute values were entered into subsequent analyses. An ANOVA resulted in significant main effects for the within-participants factors Temporal occlusion: F(2.55, 68.90) = 42.45, p < 0.001, ηp² = 0.61, and Dynamic level: F(1, 27) = 62.10, p < 0.001, ηp² = 0.70. The mean absolute timing deviation for the forte videos was 302.46 ms (SE = 16.42), and for the piano videos 509.41 ms (SE = 31.80). Given that there were no significant interactions with the factor Group of participants, forte videos caused smaller timing deviations across all groups of participants compared to piano videos. Similarly, for all groups of participants, timing deviations were higher when videos had shorter durations before the entry of the music (Fig. 3).
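The computation of the absolute deviations described above can be sketched as follows. The entry times follow from the occlusion conditions given in the method section; the example response times and all names are hypothetical.

```python
TONE_ONSET_MS = 2000
# Lead-in omitted in each temporal occlusion condition (ms).
OMITTED_MS = {1: 0, 2: 400, 3: 800, 4: 1200, 5: 1600}


def entry_time_ms(condition):
    """Pre-defined entry of the music, measured from the start of the clip."""
    return TONE_ONSET_MS - OMITTED_MS[condition]


def absolute_deviation_ms(response_ms, condition):
    """Unsigned asynchrony between a response and the pre-defined entry."""
    return abs(response_ms - entry_time_ms(condition))


# Hypothetical responses as (response time in ms, occlusion condition).
responses = [(650, 5), (310, 5), (1850, 1)]
deviations = [absolute_deviation_ms(rt, cond) for rt, cond in responses]
print(deviations)
```

Because the sign is discarded, a response 150 ms too early and one 150 ms too late contribute equally, so early and late responses cannot cancel each other out in the mean.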
There was a significant effect for the between-participants factor Group, F(2, 27) = 3.57, p < 0.05, ηp² = 0.21. Post-hoc comparisons (Bonferroni) indicated that absolute timing deviations of string musicians were smaller than those of non-musicians (p < 0.05). Thus, participants with the highest motor expertise responded with the smallest absolute timing deviations. Although further differences between groups were not statistically significant, participants with visual expertise showed a tendency for smaller deviations as compared to non-musicians (cf. Fig. 3). Mean timing variability across groups was calculated by entering the standard deviations of the asynchronies (absolute deviations) into a one-way ANOVA, which resulted in a significant main effect, F(2, 57) = 7.62, p < 0.01, ηp² = 0.21. Post-hoc comparisons (Dunnett T3) revealed that timing variability was smaller for string musicians (M = 130.94 ms, SD = 72.41) as compared to non-string musicians (M = 240.89 ms, SD = 142.35; p < 0.05) and non-musicians (M = 240.03 ms, SD = 77.21; p < 0.001).
This study investigated whether motor expertise is reflected in enhanced perceptual consistency. In accordance with the common coding hypothesis suggesting bi-directional links between perception and action (Prinz, 1997; see also Hommel et al., 2001; Schütz-Bosbach & Prinz, 2007), we predicted that string musicians with motor expertise would show superior timing in a domain-specific perceptual task. Both motor and perceptual skills are crucial for ensemble coordination (Ericsson & Lehmann, 1996; Keller, 2008). To this end, we invited string musicians (motor experts), musicians of other instruments with comparable ensemble experience (visual experts), and non-musicians to watch video sequences of a first violinist performing entries to fellow musicians in a string quartet, and to synchronise with the perceived musical entries. Results revealed significant effects of motor expertise on perceptual accuracy. String musicians timed their responses more accurately and, more importantly, showed less perceptual variability than the other two groups of participants.
A large body of previous research established characteristics of expert movements (Bernstein, 1967; Magill, 2004; Chen et al., 2006; Konczak et al., 2009), one of them being reduced motor variability in repeatedly performed and highly trained tasks (for an overview, see Davids, Bennett & Newell, 2006). To our knowledge, the current study demonstrated for the first time that experts’ reduced motor variability is also reflected in a perceptual timing task. Expert ensemble musicians are constantly required to perform with minimal timing variation between them in order to reach synchronised performances (cf. Keller, 2008). This is particularly important for the entries of the music, where differences in the start times between ensemble musicians become immediately audible. The preparation time before a musical entry differs in real-life situations, as simulated with the progressively occluded videos. Our results show that motor experts (and to some extent visual experts) not only awaited the entry in videos with longer preparatory durations, but were also able to respond quickly if only short periods of 400 ms were shown before the entry of the music. In contrast, some non-musicians indicated the entry too early in videos with long preparation periods and showed larger time lags in short videos. Consequently, non-musicians’ absolute deviations (asynchronies) from the pre-defined entries were significantly larger than those of the string musicians, who could draw on their perceptual and motor experience.
Strikingly, motor experts also outperformed visual experts in the perceptual timing task, indicating that action competence influences the perception of others’ actions (Schütz-Bosbach & Prinz, 2007). That is, musicians with active experience of playing a string instrument showed significantly less timing variability when compared to musicians who had studied other instruments. This finding extends previous work that reported evidence for decreased timing variability in musicians compared to non-musicians (Hove, Keller & Krumhansl, 2007) by highlighting specifically the contribution of motor expertise and motor consistency within musicians’ general sensorimotor skills. This result accords well with studies in other domains such as dancing (Calvo-Merino et al., 2005, 2006), sports (Aglioti et al., 2008) and typewriting (Rieger, 2004), which showed perceptual advantages of motor experts over visual experts. As in the work by Aglioti et al., a potential limitation of the current study may lie in the fact that string musicians may also have more visual experience of watching violinists when compared to their fellow ensemble musicians who play other instruments. However, string musicians did not differ in their general ensemble experience from the group of musicians with other instruments. Since participants with visual expertise had played in musical ensembles for a comparable amount of time, it can be argued that they had also acquired sufficient experience in watching violinists, who often lead the music in small ensembles without a conductor.
Musicians and non-musicians in the present study did not differ in their responses in a simple visual reaction time test, contrary to a study into transfer effects of musical skills (Hughes & Franz, 2007). Hughes and Franz’s results indicate that musical training provides benefits for uni- or bimanual tapping tasks, suggesting that early training enhances general perceptual-motor processes. Since our data were based on a comparable reaction time test but did not result in faster responses for musically trained participants, we conclude that the response differences that occurred in the experimental task were due to the specificity of the more complex visual information provided by the violinist in the videos. Rather than generally responding more quickly, string musicians were more accurate in their responses, supporting the notion that sensorimotor expertise is domain-specific and not a general ability (Ericsson, Krampe & Tesch-Römer, 1993; Ericsson & Lehmann, 1996).
Finally, all groups of participants were significantly influenced by the quality of the violinist’s movements. Timing deviations were larger for piano entries as compared to forte entries. A descriptive inspection of the videos suggests that the violinist indicated forte entries with quicker movements of the bow. Since there was no significant interaction between the dynamic level of the indicated entries and the between-participants factor, general differences in motor-dependent perceptual accuracy were apparent across these two specifically musical tasks.
To conclude, in accordance with the common coding hypothesis, expert string musicians’ motor expertise contributes to the perception of musical entries performed by a violinist. More specifically, this is the first study to show that the ability to execute domain-specific movements with low variability, a key characteristic of expert motor performance, induces precise perceptual judgements with low variability when observing these movements. To spark further theoretical developments, future research needs to examine to what extent the processes underlying motor learning are reflected in perceptual learning, and vice versa. Another possible route for future research is to investigate whether relations between motor variability in movement kinematics and perceptual timing consistency in expert performers also hold in other sensory modes such as auditory perception. Moreover, future studies could investigate the combined impact of visual and auditory information on timing consistency. An interesting question would be to examine in what ways dissociations of the expected entry based on visual information (i.e., the violinist’s gesture) and auditory information affect perception and timing consistency of experts and less experienced performers.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
1 The quartet performed Vaughan Williams’s String Quartet No. 1 in G minor, out of which excerpts were taken for the present study.
Clemens Wöllner, Phone: +44-161-9075263, Email: firstname.lastname@example.org.
Rouwen Cañal-Bruland, Email: email@example.com.