Auditory processing, by its nature, requires considerable investment of executive resources. Auditory information is evanescent and can vary rapidly over short time intervals. Thus, both attending to the signal and buffering of information in the auditory stream are vital to comprehending everything from speech to birdsong to music. In this investigation, we examined the attentional and working memory (WM) capacities, and the relation between them, in a profession that should demand the highest level of auditory executive capacity: musical conductors.

Several kinds of attention are essential to parse the auditory world. Considering first selective attention, Fritz, Elhilali, David, and Shamma (2007) observed that in everyday situations we constantly must extract features such as harmonicity, intensity, duration, and rhythm in order to group, identify, and locate auditory objects, even above and beyond the challenge of following one conversation among many, as in the “cocktail party problem” (Cherry, 1953). In other words, we need proficiency at auditory scene analysis (Bregman, 1990; Fritz et al., 2007). We normally accomplish this analysis quite well with both top-down and bottom-up control processes—by, for instance, increasing activity in relevant and decreasing activation in irrelevant sensory cortices in cross-modal selective attention (Johnson & Zatorre, 2006).

Divided attention, when one has to monitor several streams of information at once, is even more challenging. However, this is essential for monitoring information that may be presented at one of several spatial locations (e.g., listening for one bird call in a busy forest) or for integrating stereophonic sound from multiple speakers. Divided attention taxes central resources, given limited pools of attentional capacity, and is even more vulnerable to age-related decline than is selective attention, even when hearing acuity is accounted for (Humes, Lee, & Coughlin, 2006).

WM is required for one to maintain this transient information after attentional processes have selected the to-be-processed information. Although spans are typically smaller for tones than for words (Williamson, Baddeley, & Hitch, 2010), which are likely processed by separable subsystems (Berz, 1995), WM shows some similarities across auditory domains. As one example, both types of acoustic WM are vulnerable to acoustic similarity (phonetic or pitch, for word and tone WM, respectively; Williamson et al., 2010). Musicians likely need to use multimodal memory as they integrate sound with movement with visual notation (cf. Chaffin, Lisboa, Logan, & Begosh, 2010; Wöllner & Williamon, 2007). However, outside of some studies on the interactions of verbal and musical WM (Schendel & Palmer, 2007), not much is known about musicians’ use of multimodal memory (cf. Palmer, 2006).

The two executive resources of working memory and attention are positively related: In general, the higher the WM span, the better the attentional capacity. This is shown within individuals, because people with larger WM spans have better selective attention. For instance, they are less likely to detect their own name on the irrelevant channel in dichotic listening (Conway, Cowan, & Bunting, 2001). Higher WM span also predicts better divided attention: If asked to monitor the two channels simultaneously, high-WM people are better at hearing their own name in one of the channels (Colflesh & Conway, 2007). The on-average lower WM span in older people partly explains the age-related declines in both kinds of tasks, particularly in divided attention (Humes et al., 2006). WM may benefit attention in at least two ways: by allowing more flexibility in divided-attentional strategy and by enabling more effective inhibition when that is required in selective attention.

Another way to examine individual differences in executive function is to look at experts versus novices in a given domain. If experts exceed novices in attentional capacity, we may be able to propose a cognitive mechanism underlying expertise. As one example, Beilock, Carr, MacMahon, and Starkes (2002) showed that experts in several sports domains suffered in their skilled tasks when asked to pay conscious attention to each step, whereas novices benefited from that instruction. This implies that left to their own devices, the experts were successfully dividing attention at a preconscious level. With respect to WM, Hambrick and Engle (2002) showed that WM predicted memory performance for domain-specific knowledge, even among experts (in fact, those with higher WM showed a larger positive effect of prior knowledge in the domain). The possibility that expertise benefits “basic” executive capacities has implications for training regimens, although it should be noted that specific training cannot be separated from preexisting skills/genetic predisposition in many studies, because the experts have self-selected into their domain and researchers do not normally have pretraining baseline data.

Music provides an interesting and naturalistic domain for capturing executive functions, even for ordinary listeners. For instance, listeners must be able to use selective attention to focus on a soloist playing with an ensemble. They must also use WM to extract the tonal, harmonic, and rhythmic relationships that make any musical passage comprehensible. Divided attention enables these elements to be extracted simultaneously. Divided attention is challenged even more in listening to polyphonic music, in which separable melody lines are presented simultaneously.

Music experts, not surprisingly, show superior performance in musical executive tasks. For instance, novices resort to switching between the different lines when they are asked to detect errors in two familiar melodies played at once, whereas trained musicians use a more integrative strategy (Bigand, McAdams, & Forêt, 2000; Poudrier & Repp, 2013). In general, musicians show superior online temporal monitoring (Rammsayer & Altenmüller, 2006), even when the presentation is visual (Rammsayer, Buttkus, & Altenmüller, 2012). They also have higher WM spans than novices for auditory, and especially for tonal, material (Benassi-Werke, Queiroz, Araújo, Bueno, & Oliveira, 2012; Schulze, Zysset, Mueller, Friederici, & Koelsch, 2011; Williamson et al., 2010), although not necessarily for nontemporal tasks such as spatial WM (Hansen, Wallentin, & Vuust, 2013). As we mentioned above, although we cannot exclude preexisting propensity as a factor in these superior skills, early-trained individuals show larger expertise effects than do late-trained individuals, suggesting at least some direct influence of years of training (Bailey & Penhune, 2010).

We propose that musical conductors are unusually well suited to allow researchers access to expertise-related skills in executive functioning. Most studies with musical experts involve performing instrumentalists, who of course need to engage in the executive functions described above. Most conductors begin their professional life as instrumentalists, but then they begin specialized training at the conservatory level or at later stages in their professional lives. The attentional and WM demands in orchestral conducting are considerable: During rehearsals, the conductor must monitor as many as 40 parts simultaneously, an extraordinary divided-attention task in itself. And within that task environment, he or she must monitor many aspects, such as rhythmic or tonal errors, early or late entrances, dynamics, and other aspects of expression. Selective attention is challenged if the conductor wishes to focus on a soloist, a tricky passage, or a particular music aspect. In addition to purely auditory tasks, the conductor must also integrate visual input, including monitoring the score, scanning and making eye contact with performers for cues and other direction, and monitoring their own movements—all the while keeping a beat that may also be changing continuously for expressive purposes.

A number of skills exemplifying conductor expertise have been documented. At the level of motor control, expert conductors’ gestures are easier to synchronize with and more consistent in their temporal–spatial features as compared to novices’ gestures (Wöllner, Deconinck, Parkinson, Hove, & Keller, 2012). Conductors also typically have a strong sense of the agency of their own gestures, as well as clear concepts of quality for conducting movements (Wöllner, 2012). Although musical conductors’ primary tasks comprise coordinating the music by bodily movements and facial expressions, they simultaneously need to concentrate on the musical composition and monitor the musicians’ performances. Regarding cognition, although attentional demands are the most obvious executive function for successful conducting, we also assume that a large WM capacity would benefit these experts. WM is directly taxed, for instance, if during a passage the conductor wishes to remember something to mention to the orchestra, but needs to retain that thought until such time as the music pauses. And as was noted above, WM and attention appear to be related constructs (Cowan, 1995), suggesting an indirect route by which we might find WM superiority among musical conductors.

Curiously, we could locate very few studies that have examined the executive skills of this interesting group of experts. In one study, conductors scored higher than a control group in a series of multisensory tasks, including pitch discrimination, judgment of temporal order, and localization of targets in space (Hodges, Hairston, & Burdette, 2006). In an event-related potential (ERP) study of auditory spatial attention, Nager, Kohlmetz, Altenmüller, Rodriguez-Fornells, and Münte (2003) found that conductors, as compared to pianists, had more precise spatial maps for sound location, shown both behaviorally and in their brain responses. At the same time, the conductors were preattentively more sensitive to deviant sounds outside the attended part of auditory space, thus showing both enhanced selective and divided attention.

In the present study, we presented a range of attentional and WM tasks to groups of musical conductors and pianists, matched for years of experience. In agreement with Nager et al. (2003), we thought that pianists were an appropriate control group, in that the overall motoric demands are to some extent comparable (e.g., bimanual requirements), and both require auditory and visual attention to more than one line of music at a time when reading musical notation and listening to the sound outcome. However, we also added the variable of experience, in that we tested both students and professionals from both groups. Because student conductors have made the choice to enter that profession, personality and other factors related to career choice are presumably somewhat controlled. Thus, we could at least partially test whether years of professional experience per se would enhance any group differences we might find among people with fewer years in the profession.

Of course, professional pianists and conductors will on average be older than students. However, on the one hand, we could mitigate this built-in confound by using age as a covariate wherever appropriate. On the other hand, this difference also allowed us to examine whether older age (and the expected concomitant reduction in WM span) would modulate any differences we expected to see in pianists and conductors (Halpern & Bartlett, 2002). Furthermore, although this was not the main purpose of our study, this analysis could also indicate whether the lifelong cognitive training of the older participants in the present study inoculated them against the age-related declines in WM and attention skills that have been documented for same-age contemporaries without such extensive training (Parbery-Clark, Anderson, Hittner, & Kraus, 2012).

Specifically, we devised attentional tasks that required monitoring passages of music for small deviations in timing or pitch. Participants had to monitor a single line of music (baseline) or one line presented along with another one simultaneously (selective attention), or monitor two lines of music simultaneously (divided attention). We also presented WM span tasks in the auditory and visual domains, including cross-modal memorization and recall conditions (monitoring notation and sounded music). Finally, we included a long-term memory task of remembering the tempi of three pieces presented at the beginning of the session, given that experts typically have better long-term memory for domain-specific information than do nonexperts (Herzmann & Curran, 2011).

We hypothesized that conductors would have better overall task performance than equally trained pianists in all the attentional tasks, but particularly in divided attention, given the task demands of conducting. We expected WM and attentional performance to be correlated over individuals, and that conductor–pianist differences would be enhanced among the more trained professionals. We were open to the possibility that the older age of the professionals would reduce the advantage of experience, perhaps resulting in no differences among the students and professionals, or potentially resulting even in a larger task superiority among student conductors than in pianists.

Method

Participants

A total of 30 highly trained musicians drawn from two professions (19 male, 11 female; 18–73 years of age) took part in the study, each group comprising less experienced and highly experienced participants. The first group included 15 musical conductors (age: M = 35.20 years, SD = 16.79, 11 male, four female), of whom eight participants were students and seven professional conductors (henceforth, “experts”). The second group included similar numbers of eight student and seven expert pianists (age: M = 32.47 years, SD = 11.81, eight male, seven female). The majority of the students studied at North German music academies, and some in Berlin. The conductors were trained in the classical repertoire and conducted student as well as professional ensembles. The pianists had studied their instrument for a longer time (M = 18.00 years, SD = 8.32) than the conductors had taken conducting lessons (M = 4.53 years, SD = 2.29), reflecting the general situation in professional training, according to which conductors typically start their career as an instrumentalist. The conductors had experience in playing the piano (number of years of lessons: M = 11.87, SD = 4.84), albeit less than the pianists (p < .02). The pianists, on the other hand, had taken only 0.80 years (SD = 1.32) of conducting lessons (i.e., fewer than the conductors, p < .001).

More importantly, the pianists and conductors (both students and experts) did not differ significantly in total numbers of years of professional musical training, including other instruments (for all participants, M = 19.57, SD = 9.28), the numbers of years they had worked in their domain (M = 11.17, SD = 12.45), or age. As intended, the students and experts across the groups of conductors and pianists differed in age and number of years of professional experience (all ps < .005).

Material: Memory and attention tests

In order to test for domain-specific capacities in selective and divided attention, WM, and long-term memory (LTM), a number of musical memory and attention tasks were devised that were based on existing tasks.

Absolute pitch

Since the WM tests involved remembering musical pitches, a short version of an absolute pitch test was used that presented ten sine waves of different frequencies, followed by silence and distraction sounds (cf. Schlemmer, Kulke, Kuchinke, & Van der Meer, 2005). The task was to name the pitches of the sine waves. A threshold of 80 % correct indicated possession of absolute pitch, and the number of correctly identified pitches was entered as a covariate in the subsequent analysis.

Verbal WM span

As a baseline assessment of WM, independently from domain-specific skills, participants’ verbal span was tested with an operations span task, with each trial increasing the number of sentences from two to eight (cf. Robert, Borella, Fagot, Lecerf, & de Ribaupierre, 2009; Daneman & Carpenter, 1980; Levitt, Fugelsang, & Crossley, 2006). Participants had to indicate whether the sentences were either true (“Paris is in France”) or false (“He ate the moon”), which served as a distractor task controlling for cognitive processing in WM (cf. Saults & Cowan, 2007). Maximum recall of the final words of these sentences indicated participants’ verbal span. The presentation modality was either visual (written sentences on a computer screen) or auditory (presented via headphones) in the different conditions.

WM span for timing

On the basis of the verbal span test, participants were presented with a succession of short rhythms (beat: 90 bpm, 4/4) and had to remember the final rhythmic timing value (e.g., eighth, quarter, half, or dotted half note). As a distractor task, they had to indicate whether the rhythm before the final note included syncopations (cf. Bailey & Penhune, 2010). The presentation mode was either visual (scores on a computer screen) or auditory (piano sounds created with the Logic Pro X software). The recall condition included visual–written tests (writing down the timing value in musical notation) or motor components (playing the duration of the timing value on a piano).

WM span for pitch

Musical triads were presented in either minor or major mode with a piano sound, and participants had to indicate the mode (distractor task). After each sequence of triads, they were asked to recall the last pitch of the triads (limited to five pitches from G4 to D5, as was indicated to them before). Again, triads were presented in increasing numbers from two up to eight. The recall condition included writing the last pitches down in musical notation (reference tones were given in this condition) or playing the pitches on the piano.

LTM for timing

Experienced musicians are relatively consistent in their timing (Rammsayer et al., 2012; Repp, 2010). Therefore, we tested LTM for timing based on internal consistency by presenting the scores of two compositions (unfamiliar to participants and with any references to tempi removed from the scores) and asking participants to tap the beat at a tempo that they believed to be adequate. In addition, participants tapped the beat to a composition of their own choice that they had recently performed in public. Approximately 1.5 h later, at the end of the experimental session, participants performed the same tasks again.

Attention

In a baseline task, the participants focused on separate melodic streams that changed in melodic contour but not in the duration of the individual notes (72 bpm, interonset interval = 833 ms). The melodic streams contained either (a) small timing deviations (notes shortened by 50 ms) or (b) pitch deviations (25 cents lower than the correct pitch in equal temperament; 100 cents = one semitone) and were presented in one ear. The melodies had either the timbre of an English horn (left ear) or a flute (right ear). All melodies were constructed with the Logic Pro X software. Participants were required to tap as soon as they perceived a deviation (cf. Colflesh & Conway, 2007). In the selective-attention condition, a different melodic stream was presented in each ear, segregated by the two timbres. Melodic streams were played with a temporal offset of eighth notes (half of the notes’ values) between the streams. Participants focused on one stream and indicated any deviations. In the divided-attention task, they had to focus on deviations in both streams simultaneously. The melodies were written by the first author and contained melodic fragments from Bach’s Goldberg Variations (Fig. 1).
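
The stimulus parameters above follow from simple arithmetic. The following is a minimal sketch, not the authors' stimulus code: the function names are hypothetical, and A4 = 440 Hz is used only as an illustrative reference pitch, since the exact frequencies of the melodies are not reported.

```python
def interonset_interval_ms(bpm: float) -> float:
    """Duration of one beat in milliseconds at the given tempo."""
    return 60_000.0 / bpm


def detune(frequency_hz: float, cents: float) -> float:
    """Shift a frequency by a number of cents (100 cents = one semitone)."""
    return frequency_hz * 2.0 ** (cents / 1200.0)


# Tempo of the melodic streams: 72 bpm gives an interval of about 833 ms.
ioi = interonset_interval_ms(72)

# Timing target: a note shortened by 50 ms.
shortened = ioi - 50.0

# Pitch target: 25 cents below the reference pitch (A4 is an assumption).
flat_a4 = detune(440.0, -25.0)

print(round(ioi), round(shortened), round(flat_a4, 2))
```

A deviation of 25 cents corresponds to a frequency change of roughly 1.4 %, which illustrates how small the pitch targets were relative to a semitone.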

Fig. 1 Sample score of the attention task. The targets (T) in this example were slightly shortened note values (by 50 ms). Participants tapped as soon as they perceived the deviations in timing duration

Design and procedure

In a mixed 2 × 2 design, all participants (student and expert pianists and conductors) completed all tasks in individual experimental sessions. Following the first part of the LTM test, the absolute pitch and verbal WM tests were completed. Participants then performed the pitch and timing WM tasks. Presentation modality (auditory or visual) was counterbalanced across the participants, and practice trials were given before each task. Subsequently, the baseline, selective-attention, and divided-attention tasks were completed. At the end, Part 2 of the LTM test was carried out. The total duration of the experiment ranged from 75 to 110 min.

Data analysis

Six of the participants were identified as possessing absolute pitch (80 % threshold). An analysis of covariance (ANCOVA) with numbers of correctly identified notes, however, did not show influences on any of the experimental findings.

WM span was calculated for each task. Only items remembered in the correct order were considered correct. Spans were compiled for memorization modality, musical task, and recall condition. Regarding LTM for timing, mean intertap intervals were calculated, and individual differences between the pre- and posttests were assessed. An overall average of the three compositions for the pre–post tempi differences was calculated.
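
The LTM timing measure described above can be sketched as follows. This is a hedged illustration only: the function names and the tap data are hypothetical, and taking the absolute value of each pre/post difference before averaging is an assumption about the scoring.

```python
from statistics import mean


def mean_intertap_interval(tap_times_ms):
    """Mean interval between successive taps (timestamps in ms)."""
    intervals = [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]
    return mean(intervals)


def tempo_consistency(pre_sessions, post_sessions):
    """Average absolute pre/post difference in mean intertap interval.

    Each argument holds one list of tap timestamps per composition.
    Smaller values indicate more consistent tempo memory.
    """
    diffs = [
        abs(mean_intertap_interval(pre) - mean_intertap_interval(post))
        for pre, post in zip(pre_sessions, post_sessions)
    ]
    return mean(diffs)


# Hypothetical taps for three compositions, before and after the session.
pre = [[0, 600, 1200, 1800], [0, 500, 1000], [0, 800, 1600]]
post = [[0, 650, 1300, 1950], [0, 480, 960], [0, 800, 1600]]

score = tempo_consistency(pre, post)
print(round(score, 1))
```

In this made-up example the three compositions differ by 50, 20, and 0 ms between sessions, so the overall score is their average.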

Attentional accuracy was assessed by counting correct responses to the target deviations that fell within a time window of 1,000 ms after the onset of the targets. Signal detection analysis was used to investigate the hit rate (number of correct responses to targets) and false alarm rate (erroneous responses to no targets). For the targets, in each condition a maximum of three correct responses was possible (e.g., three timing deviations for flute timbre, three timing deviations for English horn, etc.); correspondingly, a maximum number of three false alarms were counted, in case participants tapped more often and thus did not respond adequately to the target.
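
The sensitivity index used in such signal detection analyses is d' = z(hit rate) - z(false-alarm rate). The sketch below is illustrative and makes several assumptions that are not stated in the text: the function name is hypothetical, the maximum of three false alarms is treated as the denominator of the false-alarm rate, and a log-linear correction is applied so that rates of 0 or 1 do not produce infinite z scores. It will therefore not necessarily reproduce the values reported in the Results.

```python
from statistics import NormalDist


def d_prime(hits, false_alarms, n_targets=3, n_noise=3):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    The log-linear correction (add 0.5 to each count and 1 to each total)
    is an assumption of this sketch, not a detail taken from the study.
    """
    hit_rate = (hits + 0.5) / (n_targets + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical participants: perfect detection, partial detection, guessing.
print(round(d_prime(3, 0), 2))
print(round(d_prime(2, 1), 2))
print(round(d_prime(2, 2), 2))
```

Equal hit and false-alarm counts yield d' = 0, which is the pattern expected from a participant who taps without discriminating targets from nontargets.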

For the WM, LTM, and attention tasks, ANCOVAs were calculated to assess differences based on profession (conductors vs. pianists) and experience (students vs. experts), controlling for age as a covariate. The relations between WM, LTM, and attentional flexibility (combining scores of the timing and pitch deviation tasks for baseline and for selective and divided attention) were calculated by means of Spearman rank correlations.
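
The correlational step can be illustrated with a small standard-library implementation of Spearman's rho: both variables are converted to ranks (ties receive the mean of their ranks), and a Pearson correlation is computed on the ranks. This is a sketch with hypothetical function names and made-up scores, not the analysis script used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical WM spans and attention scores for five participants.
wm_span = [3, 4, 4, 5, 6]
attention = [0.4, 1.1, 0.9, 2.0, 2.5]
print(round(spearman_rho(wm_span, attention), 2))
```

Because only rank order enters the computation, the coefficient is insensitive to the skewed or bounded score distributions that small samples often produce.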

Results

We first investigated group differences in attentional flexibility by employing signal detection analysis. Second, results are reported for multimodal WM (including the auditory and visual presentation modalities and verbal, written, or piano recall) and LTM. Finally, we analyzed the relationships between attention and memory capacity.

Attention

We first tested whether conductors and pianists differed in the baseline, selective-attention, and divided-attention tasks. We also analyzed differences between students and experts, and examined potential age effects. A 2 × 2 ANCOVA for each of the three attention tasks, with musical Profession and Experience as factors and Age as a covariate, resulted in significant effects for both selective and divided attention in the timing deviation tasks (Fig. 2). No interaction effects emerged. In contrast, the baseline and pitch tasks showed no significant effects or interaction for any factors. Regarding the Profession factor, conductors (mean d’ scores = 2.22, SD = 1.74) detected significantly more targets than pianists (d’: M = 0.61, SD = 1.92) in the divided-attention timing task, F(1, 25) = 10.48, p < .005, ηₚ² = .30.

Fig. 2 Selective- and divided-attention timing tasks (d’ values) for the groups of student and expert pianists and conductors. Error bars indicate SEMs

Regarding the Experience factor, expert pianists and conductors alike detected more targets than students in both the selective- and divided-attention timing tasks. For selective attention, experts’ d’ scores averaged 2.67 (SD = 1.72), and students had a mean d’ of 1.80 (SD = 2.02), F(1, 25) = 5.80, p < .05, ηₚ² = .19. For divided attention, experts’ d’ scores were 2.16 (SD = 1.85), and students’ were 0.76 (SD = 1.90), F(1, 25) = 13.39, p < .005, ηₚ² = .35.

It should be noted that age as a covariate accounted for some of the variance in the divided-attention timing task, F(1, 25) = 7.46, p < .05, ηₚ² = .23. However, when age was removed from the model, there were still significant differences between conductors and pianists, as well as between students and experts (both ps < .05).
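
The partial eta squared values reported with each F test in this section can be recovered from the F value and its degrees of freedom through the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). A short sketch (the function name is hypothetical) using the values given above:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for the attention tasks.
reported = [
    ("Profession, divided attention", 10.48, 1, 25),
    ("Experience, selective attention", 5.80, 1, 25),
    ("Experience, divided attention", 13.39, 1, 25),
    ("Age covariate, divided attention", 7.46, 1, 25),
]

for label, f_value, df1, df2 in reported:
    print(label, format(partial_eta_squared(f_value, df1, df2), ".2f"))
```

The error degrees of freedom of 25 are what one would expect for 30 participants in four cells with one covariate.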

WM and LTM

Participants were relatively consistent in mastering the WM tasks, independently of the material to memorize or the recall condition. In other words, those who succeeded in certain tasks were also successful in others (correlations between tasks ranged from rₛ = .31 to .88). Regarding the modality in which the material was presented, within-participants analyses revealed higher WM spans for the visual musical (M = 5.63, SD = 0.93) than for the auditory musical (M = 3.58, SD = 1.23) material, t(29) = 9.55, p < .001, d = 1.88 (two-tailed; see Fig. 3). Scores for both modalities were correlated over participants, rₛ(30) = .43, p < .05. In the verbal baseline test, in contrast, auditory and visual stimuli were memorized equally well.
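
The reported effect size for the modality difference is consistent with a Cohen's d that divides the mean difference by the square root of the average of the two variances. Whether the authors used exactly this variant is an assumption; the sketch below (hypothetical function name) only shows that the formula reproduces the reported value from the reported means and standard deviations.

```python
from math import sqrt


def cohens_d_av(mean_1, sd_1, mean_2, sd_2):
    """Mean difference scaled by the root of the averaged variances."""
    return (mean_1 - mean_2) / sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)


# Visual vs. auditory musical WM span, as reported in the text.
d = cohens_d_av(5.63, 0.93, 3.58, 1.23)
print(round(d, 2))
```

Other conventions for paired designs, such as dividing by the standard deviation of the difference scores, would give a different value and cannot be checked from the summary statistics alone.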

Fig. 3 Musical working memory span for auditory and visual tasks (M, SEM)

A 2 × 2 ANCOVA (factors: Profession, Experience, and Age as a covariate) did not reveal significant differences between conductors and pianists (Fig. 3). Regarding the factor Experience, experts had slightly higher visual spans than students, F(1, 25) = 4.65, p < .05, ηₚ² = .18. These results were also related to the covariate Age, F(1, 25) = 7.42, p < .05, ηₚ² = .23. Further inspection of the results indicated that age accounted for most of the variance in this model, since removing age resulted in no further differences between students and experts in visual tasks. However, age was not simply correlated to task performance, and higher age did not reduce visual WM span (rₛ = .22, n.s.). There were no group differences in the verbal WM baseline task, and no significant interactions in any tasks.

Differences in LTM were calculated with a univariate ANCOVA (Fig. 4). Two outliers that were more than two standard deviations beyond the mean of the sample (in absolute values more than 185 ms) were removed prior to analysis, so that the values of 14 pianists and 14 conductors were entered into the analysis. Conductors had more precise LTM for timing (average differences: M = 57 ms, SD = 30) than did pianists (M = 85 ms, SD = 52), F(1, 23) = 4.30, p < .05, ηₚ² = .16. Experts (M = 54 ms, SD = 36) were also more precise than students (M = 86 ms, SD = 46), F(1, 23) = 4.39, p < .05, ηₚ² = .16. Age did not influence the results, and there were no significant interactions.
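
The outlier criterion described above (values more than two standard deviations from the sample mean) can be sketched as follows. The function name and the data are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def remove_outliers(values, n_sd=2.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    centre = mean(values)
    spread = stdev(values)  # sample SD; an assumption of this sketch
    return [v for v in values if abs(v - centre) <= n_sd * spread]


# Hypothetical pre/post tempo differences in ms, with one extreme value.
differences = [50, 60, 55, 65, 70, 58, 62, 400]
kept = remove_outliers(differences)
print(len(differences) - len(kept), "value(s) removed")
```

Because the mean and standard deviation are themselves inflated by extreme values, this rule is conservative in small samples and removes only clear outliers.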

Fig. 4 Long-term memory timing consistency (in milliseconds), according to the groups of student and expert pianists and conductors (M, SEM). Smaller values indicate higher consistency

Relations between WM capacity and attention

Several significant positive correlations were apparent between musical WM scores and selective and divided attention (Table 1). Performance in the verbal WM test, on the other hand, was not related to attention, and we also found no significant correlations with the baseline attention tasks or the LTM performance. These results indicate that higher span in domain-specific WM was related to participants’ greater attentional resources.

Table 1 Correlations (Spearman’s rho) for overall scores in the attention and working memory (WM) tasks (N = 30)

In addition, participants’ overall reaction time for correct responses in the selective-attention condition was related to WM span for all musical tasks that included auditory presentation modalities (rₛ = –.395, p < .05) or presented pitches (rₛ = –.415, p < .05). These results indicate that participants with high WM for these tasks perceived and processed targets more quickly in the selective-attention condition.

Discussion

Some specialist professions demand extraordinary cognitive processing of multiple streams of information at the same time. Placing the research summarized above within an expertise framework, we suggested that domain-specific skills exercised over an extended period of time are reflected in attention and memory capacities. Thus, we investigated individuals more and less experienced in musical conducting and piano performance. Whereas conductors did not differ from pianists in the baseline tasks and in selective attention, they successfully detected more targets in the divided-attention timing task. In other words, conductors showed higher flexibility in switching their focus of attention either to a single or to two different streams. Conductors were also more consistent in LTM for timing, a skill particularly important for their profession. We also found evidence across professional domains for expertise effects, such that experts outperformed students in selective and divided attention. Individuals with good attentional capacities had higher WM spans for a variety of multimodal tests, providing further evidence for a relationship between WM and attentional flexibility in domain-specific tasks for highly trained individuals.

Given the amounts of musical training among all our participants, all groups had acquired forms of expertise that might qualify them as experts in their fields when comparing them to individuals without such musical training. We did not contrast participants with groups of nonmusicians, since a high degree of formal knowledge, such as being able to read and write musical notation, was required for mastering the experimental tasks of the present study. Whereas previous research on more general skills such as speech-over-noise detection or verbal memory had demonstrated beneficial transfer effects of musical training (Parbery-Clark, Skoe, Lam, & Kraus, 2009; cf. Williamson et al., 2010), for the present study of domain-specific differences between highly trained musicians, the cognitive demands on task performance were designed to be very high in order to avoid ceiling effects. On the other hand, the demands were also designed to be domain-specific and meaningful for conductors and pianists in their musical professions.

Even though expertise-related differences between students and experts were found, it remains an open question whether (a) conductors’ higher consistency in LTM and their attentional flexibility are a precondition for their profession or (b) these cognitive capacities are acquired during extended conducting training. According to the former view, it may be necessary for conductors to adjust their focus of attention rapidly, so that they are able to focus either on the whole orchestra or on individual instruments during live performances, before they enter the podium or start a conducting career. Those aspiring conductors with higher attentional flexibility will then likely be more successful in their profession. Some of the tasks we have described here might be useful in assessing potential success at the beginning of conductor training regimens. Although research has typically examined general attentional and memory skills that may benefit from musical training—for instance, in verbal domains (Ruggles, Freyman, & Oxenham, 2014; for a review, see White, Hutka, Williams, & Moreno, 2013)—less is known about the specific cognitive skills involved in expert performance. Even less research has been done on the extraordinary cognitive capacities of the small number of outstanding experts that may distinguish them from others in their fields, and long-term studies may indeed offer insights into necessary predispositions.

According to the second view, training and experience may fundamentally enhance these cognitive skills, and adept conductors could simply adjust to the demands of their field. Given the differences between students and experts that were obtained from all three types of tasks, in WM, LTM, and attention, we see tentative evidence for this interpretation and assume that accumulated training and professional experience may shape the cognitive capacities required. A recent study of selective attention provided cross-sectional evidence for more consistent processing of auditory stimuli in school children and young adults with musical training than in nonmusicians, whereas no such effects were found for preschool children with or without music lessons (Strait, Slater, O’Connell, & Kraus, 2015), suggesting levels of development according to training and age groups. Regarding WM for nonverbal auditory information, nonmusicians typically recall fewer items than musically trained participants (Benassi-Werke et al., 2012; Li, Cowan, & Saults, 2013). It would be interesting to test the student conductors of the present study again at later points in their careers in order to see potential developments in attentional flexibility.

The analyses used in the present design were aimed at testing for expertise-related differences across domains. In absolute scores, expert conductors showed the best results of all four groups for the divided-attention timing and LTM tasks. Both tasks assess temporal skills, which are clearly related to one of the core duties in the domain of conducting, since the organization of time in a musical ensemble with high control and consistency is one of their primary responsibilities (Wöllner et al., 2012).

It seems worthwhile to investigate the time course for changes of attentional foci: for instance, how quickly experts may switch from one channel to both channels in dichotic listening tasks. Contrary to traditional views of multitasking, according to which attention is constantly divided, current opinion assumes rapid shifts between various sources of information (see Alzahabi & Becker, 2013, for a current study on attention in multimedia contexts; see also Ralph, Thomson, Seli, Carriere, & Smilek, 2015). Bigand et al. (2000) provided several models for the processing of two melodic streams at the same time; their experimental results suggest, however, that participants did not rapidly switch between the melodies, but rather combined both streams into one perceived musical entity. Therefore, attending constantly to multiple lines in music is certainly different from speech-to-noise detection, as in verbal selective- and divided-attention tasks. Nevertheless, conductors need to be able to focus on a particular line whenever they detect a deviation in one or more of the instruments, be it in terms of timing, tuning, dynamics, style, or timbre. This skill of not only detecting such deviations, but also displaying an appropriate behavioral reaction in such situations (from eye contact to alterations of the overall tempo, if necessary) characterizes experienced conductors. Again, this aspect could be studied further with modifications of an ongoing task.

In the selective- and divided-attention tasks used in the present study, a temporal offset was used to sever harmonic integration and enable rapid switching between the two streams. Although harmonic integration of various streams is clearly paramount for the perception of Western music, professional conductors are still required to identify which instruments are slightly out of tune, for instance, or are playing wrong notes. They may thus either fully concentrate on one source or, rather than “divide” their attention, be in a state of overall attentional awareness to allow for rapid switching between the streams of information, and then again focus on a stream that needs particular attention. This process may be called “attentional flexibility,” since it describes the zooming in and out of the foci of attention and thus exemplifies an extraordinary adaptation to the demands of complex and rapidly changing tasks. Keller (2001) underlined the importance of attentional flexibility in musical ensemble performance, in which musicians need to monitor their own as well as other musicians’ parts. Nevertheless, attentional flexibility should not be taken as a synonym for divided attention. Further research may assess the speed with which experts are able to switch their attentional foci. In the present study, conductors’ rapid adaptation to the task demands in the selective or divided experimental conditions points to their cognitive flexibility when facing a particular challenge. Since deviations in one stream could appear at any time, even if conductors may have integrated several streams, they still need to be able to focus on the stream in question when the target occurs.

Neither the verbal WM baseline test nor the baseline attentional one-channel target detection yielded any differences between pianists and conductors. In order to investigate the specific cognitive skills of highly trained individuals, tests of general cognitive functioning may therefore not adequately assess the subtle and refined skills necessary in the profession, so that naturalistic, domain-specific tasks may be necessary to capture these cognitive processes. Although we cannot exclude the possibility of cognitive “G-factors” that may influence task performance in various ways, we assume that expertise-related differences among highly trained individuals will primarily occur in their domains (Hambrick & Engle, 2002; Herzmann & Curran, 2011). Other research on musical “transfer effects” has typically investigated individuals with some musical training, but not highly trained experts (White et al., 2013). On the other hand, consistency in performance across WM tasks was relatively high, especially with regard to recall condition. For example, those individuals with high musical WM spans for auditory material were typically also successful at the visual tasks in remembering notation. Furthermore, in the different recall conditions, participants could equally well write the items down or reproduce them on the piano. For this skill, pianists did not differ from conductors, which may be explained by most conductors’ high proficiency on the piano. Although these findings provide hints to a general consistency within musical tasks, no transfer or relationship to tasks outside the musical domain was observed here, because performance in verbal WM was not related to any other of the WM tasks. Future research may further test verbal and other nonmusical tasks in highly trained experts (see Parbery-Clark et al., 2009; Ruggles et al., 2014; Swaminathan et al., 2015). However, our results suggest that musical training did not enhance all cognitive skills, nor was it related to self-selection into the chosen profession.

There was a large range in participants’ ages (from 18 to 73 years), which could be viewed as a limitation of the study. However, we controlled for age-related effects in ANCOVAs and found that age did not limit participants’ cognitive skills in any of the tests. On the contrary, whereas LTM was not significantly influenced by age, differences between experts and students in the attentional and WM tests were actually positively related to age. Experts’ higher age thus partially accounted for some of the variance in the data, and showed relationships with higher visual WM spans and better target detection in the divided-attention tests. A confound between years of training and age is usually built into studies of expertise. Older musicians’ lifelong experiences will undoubtedly always differ from those of younger musicians, since expert musicians typically start their training and career at similar times in their biographies. Only such longitudinal studies as we suggested above may fully assess the influence of age on experts’ superior cognitive skills. On the other hand, since we did not observe indications of any decline in these skills with older participants (for a summary of research into age-related effects on attention and WM, see Levitt et al., 2006; cf. Hambrick & Engle, 2002), our results are in line with previous studies (for a review, see Halpern & Bartlett, 2002), suggesting that experts may retain the cognitive functioning in their respective domains to a high degree. In other words, musical training may even function as a protective factor or cognitive reserve in older age (Parbery-Clark et al., 2012).

Using the design of the present study, a number of related research questions may be addressed. For instance, it would be possible to further differentiate between the subskills of trained musicians within a given domain of expertise. Pianists who are primarily experienced as accompanists may show higher divided attention than piano soloists, and conductors’ skills may to some degree be shaped by the musical genres and the repertoire they perform. Furthermore, our results point to higher visual spans as compared to auditory spans, which is in line with a previous study by Lehnert and Zimmer (2006). On the other hand, the score-based approach of classical Western musicians may particularly strengthen visual modalities of information processing. Other musical traditions have developed stronger auditory, oral, and interactive means of knowledge transfer. Jazz musicians, for example, could potentially possess higher auditory WM, since a great deal of their repertoire is conveyed over this modality, and only partially exists as musical notation. Finally, applications of these findings in a conservatory environment are conceivable, and young musicians may thus benefit on their road to excellence by deliberate exercises of attentional flexibility.