Introduction

Research on mental workload has, to date, concentrated on physiological measures of brain, eye, and heart activity. This study focuses on behavioral measures of handwriting. Handwriting is a complex activity composed of cognitive, kinesthetic, perceptual, and motor components (Bonny, 1992; Reisman, 1993). It is considered an “overlearned” skill involving very rapid sequencing of movements. As discussed by Weintraub (1997), several theoretical models indicate that handwriting performance involves retrieving the form, size, and direction of letters, relating them to their sounds (phonemes), memorizing all the required parameters, and transferring them by motor execution to the paper.

With time, handwriting performance becomes automatic. Studies have shown that young children need to think more about the size, form, and direction of letters, tending to write more slowly and in larger letters (Berninger, 1991; Wann, 1986). It was also found that they write with a lower flow (Meulenbroek & Van Galen, 1986; Smits-Engelsman, Van Galen, & Portier, 1994), resulting in separate movements rather than a sequential pattern (Meulenbroek & Van Galen, 1986; Smits-Engelsman et al., 1994; Wann, 1986). Adults have automatic sequential performance, which begins to change due to physiological changes in elderly people (Dixon, Kurzman, & Friesen, 1993; Rosenblum & Werner, 2006).

Thus, adults 20 years of age and above are expected to write in an automatic manner unless suffering from some pathology, physical or mental, that affects their handwriting performance (Longstaff & Heath, 1999). Automatic handwriting movements increase efficacy and reduce redundancy (Latash, 1998). The more skilled and automatic the handwriting act, the less variability there will be in temporal (performance time), spatial (length, height, width), and pressure (applied on or toward a surface) measures, and greater consistency will be evident (Smits-Engelsman & Van Galen, 1997). This means fewer pauses, less variation in letter formation, more spatial accuracy, and better control of pen pressure (Meulenbroek & Van Gemmert, 2003; Schoemaker, Ketelaars, Van Zonneveld, Minderaa, & Mulder, 2005; Wann, 1986).

When handwriting performance is automatic, it releases cognitive resources to deal with other tasks that can be performed simultaneously, that is, dual-task processing (e.g., Schneider, Domais, & Shiffrin, 1984). According to dual-task studies, human processing resources are shareable (Kahnemann, 1973; Navon & Gopher, 1979), but the difficulty of the tasks at hand limits the ability for dual-task performance (Fisk & Schneider, 1983). For example, when driving automatically, people can converse and think about the content of a conversation in a manner that will result in sequential and logical discussion. However, when they drive an unfamiliar vehicle or on unfamiliar routes, conversation is interrupted, becoming less logical and sequential. Thus, when a cognitive task such as arithmetical calculation or driving is more complex and demands additional resources, other tasks, such as writing or communicating on the phone, may suffer from a loss of resources that influences performance. This has been conceptualized as mental workload: the processing costs incurred in task performance (Kramer, 1991; Wickens, 1992).

Previous studies have indicated that a computerized system of objective measures of the handwriting process may be sensitive to dis-automatization (e.g., Teulings, 2001; Tucha, Laufkotter, Mecklinger, Klein, & Lange, 2001; Werner, Rosenblum, Bar-On, Heinik, & Korczyn, 2006). Specifically, it has been found that such computerized measures are sensitive to cognitive deterioration during the various stages of Alzheimer’s disease (Werner et al., 2006). A recent study, in the field of applied psychology, also found differences between the writing of truthful and untruthful sentences in a sample of healthy participants, suggesting that deception is cognitively taxing and, therefore, damages handwriting performance (Luria & Rosenblum, 2010).

The focus of this study is on mental workload in a healthy population when writing numbers. This is important because there is evidence in the literature that cognitive mechanisms differ between writing words and writing numbers (e.g., Gruber, Indefrey, Steinmetz, & Kleinschmidt, 2001) and because there is a paucity of information about the influence of mental load on young healthy adults while writing numbers. There are several examples, mostly in the clinical field, that demonstrate effects of mental workload on handwriting behavior. Findings by Van Gemmert and Van Galen (1997) reinforce evidence about the influence of cognitive stress on handwriting. In that study, participants were required to perform a secondary arithmetical task together with a number-writing task under two levels of cognitive stress. It was found that performing the arithmetic task in parallel did indeed affect number writing, with increased reaction and movement time and elevated axial pen pressure. In a study by Van Gemmert, Teulings, and Stelmach (1998; Van Gemmert, Teulings, Contreras-Visal, & Stelmach, 1999) among people with Parkinson’s disease, researchers expected to find smaller writing sizes, but there was no decrease in writing size when mental load was increased.

The aim of the present study was to check whether significant differences would be found in handwriting measures between high and low mental workload conditions, on the premise that high mental workload decreases automatization levels in handwriting and manifests itself in specific handwriting segments. Although previous studies had described handwriting features under cognitive stress, the present study differed from them in three main respects. First, the methodology was different in that the focus was on mental, not motor, load while a single writing task concerning graded numbers was performed. Second, the sample consisted of healthy participants. Third, the analysis covered several aspects of handwriting, namely, angular velocity, tempo, pressure, and spatial measures.

We suggest that handwriting measures may serve as indicators for levels of mental workload in healthy populations. Such measures are important because the need for objective measures of mental workload increases with time (Nachreiner, 1999). Kramer and Weber (2000) reviewed existing measures of mental workload, such as heart rate and heart rate variability; eye-scan patterns, blink rate, and duration; and brain activity (event-related potentials [ERPs] and electroencephalography). Iani, Gopher, and Lavie (2004) suggested that peripheral arterial tone can also be used as a mental workload measure. This demonstrates that mental load is related to a variety of physiological reactions.

There are two main categories of existing mental workload measures. One is of brain activity evoked by various cognitive processes. This measures cerebral blood flow and constructs a neural activity image of the brain (Reiman, Lane, Van Petten, & Bandettini, 2000). The second category focuses on arousal, on the basis of the assumption that changes in workload will be indicated by changes in the autonomic nervous system, resulting in peripheral reactions that can be measured (Gopher & Donchin, 1986; Kramer & Weber, 2000).

We include handwriting behavior as a behavioral category that is indicative of mental workload and can be measured during actual performance. Kramer (1991) defined intrusiveness as the degree to which a measure interferes with task performance. Most of the existing methods require an intrusive laboratory environment, special instrumentation (which, in the case of brain imaging, is very expensive), and further expenses for data analysis. Furthermore, the experiments do not always imitate real-life performance, raising questions regarding the functionality and veracity of the obtained results (Majnemer, 2009). Conversely, handwriting measures (such as the one presented in this study and in previous studies; Rosenblum, Parush, & Weiss, 2003a,b) are inexpensive and simple extensions of standard handwriting behavior.

This study was designed to provide evidence for handwriting as a mental workload measure with respect to two further criteria discussed by Kramer (1991), namely, sensitivity and reliability. We tested whether handwriting measures are sensitive to variations in mental workload and provide evidence of reliability by presenting replications of the patterns of results for other paradigms and participants. We chose numerical calculations, rather than words, in order to test the reliability of these handwriting measures in different types of writing.

We used a within-subjects design to test the effect of mental workload on writing three numerical progressions of varying difficulty with a sample of healthy individuals. Our research hypotheses were that differences would be found between writing under high and under low mental workload, with pressure, temporal, angular velocity, and spatial measures obtained by a computerized system. On the basis of previous results concerning detection of clinical pathologies and deception (Luria & Rosenblum, 2010; Werner et al., 2006), we predicted that writing under high mental workload would affect both handwriting measures (the mean values of each measure) and their variability (the SD values of each measure). However, because there was very little information about the effects of mental workload on the handwriting behaviors measured in this study (when numbers rather than words and sentences were written), we hypothesized differences between handwriting measures with no specific direction.

  • H1: Differences will be found in segment duration on paper and in air (and variability in duration) between high and low mental workload conditions.

  • H2: Differences will be found in angular velocity (and angular velocity variability) between high and low mental workload conditions.

  • H3: Differences will be found in spatial levels (and spatial variability) between high and low mental workload conditions.

  • H4: Differences will be found in pressure levels (and pressure variability) between high and low mental workload conditions.

We also assumed that a more complex process of statistical analysis would reveal finer structures and, possibly, more variability than had been found with the basic analysis made in our previous studies. We tried to identify a statistical profile of automatic handwriting (a function of the measured parameters that is constant over time during automatic writing and changes when writing stops being automatic).

  • H5: A profile of handwriting measures will discriminate between writing under low and high mental workloads.

Method

Participants

Participants included 56 healthy students, 34 females and 22 males, 20–63 years of age (mean age = 25.46, SD = 5.81); 52 of them were under 30 years of age, and only 1 participant was above 35 (63 years old). This older participant was ultimately excluded from our analysis in order to obtain a homogeneous age group. All participants had completed high school, and they averaged 1.9 years of higher education (SD = 0.9). They were recruited at the University of Haifa, Israel. Eighty-eight percent of the participants were born in Israel, 10% in the former Soviet Union, and 2% in Europe (and had immigrated to Israel before they were 7 years old and, therefore, had learned to read and write in Hebrew). On the basis of the participants’ report and observation of their writing hand in the present study, 79% of the participants had right-hand dominance, and 21% were left-handed. They came from various disciplines: 80% from the humanities and 20% from the exact sciences.

Criteria for participation included residence in Israel for at least 20 years, normal or corrected-to-normal vision and hearing, at least 12 years of education in Israeli educational frameworks, and at least three sentences written in Hebrew at least three times a week. Anyone suffering from any form of neurological/emotional or physical disability was not eligible for participation.

Instruments

The socio-demographic questionnaire included gender, age, and number of years of education.

Mental workload manipulation

Participants were asked to write three numerical progressions. We controlled for mental load by differentiating the gap in the progressions. In the easiest, the gap was one (1, 2, 3, 4, . . .); for the medium mental load, the gap was three (1, 4, 7, 10); and for the most difficult, the gap was four (1, 5, 9, 13). Participants were asked to add ten more items to each of the three progressions, on the basis of the given four first numbers as presented above. This was in line with other studies using arithmetic in dual-task paradigm studies (Cho, Gilchrist, & White, 2008) to manipulate mental workload (Seibt, Scheuch, & Hinz, 2001) and for comparison with physiological reactions such as heart rate (Lehrer et al., 1996).
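As an illustration of the manipulation, the following minimal sketch generates the three progressions (four given terms plus the ten to be added) and counts departures from the expected series. The function names and the position-by-position scoring rule are assumptions made for illustration, not the scoring procedure used in the study; under this rule, a single slip that a participant then carries forward would be counted at every later position.

```python
def progression(start, gap, n_items):
    """Return the first n_items terms of an arithmetic progression."""
    return [start + gap * i for i in range(n_items)]

def count_mistakes(written, start, gap):
    """Count positions where the written series departs from the expected one.

    Hypothetical scoring rule: compares each position with the exact series.
    """
    expected = progression(start, gap, len(written))
    return sum(1 for w, e in zip(written, expected) if w != e)

# Gaps of 1, 3, and 4 for the low, medium, and high workload conditions.
GAPS = {"low": 1, "medium": 3, "high": 4}

for condition, gap in GAPS.items():
    # Four given terms plus the ten that participants were asked to add.
    print(condition, progression(1, gap, 14))
```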

The digitizing tablet, online data collection, and analysis software for objective spatial, temporal, and pressure measures were provided by the Computerized Penmanship Evaluation Tool (ComPET), developed by Rosenblum et al. (2003a). ComPET software includes (1) data collection and (2) data analysis, which is programmed via MATLAB software toolkits (for more details, see Rosenblum, Chevion, & Weiss, 2006; Rosenblum, Dvorkin, & Weiss, 2006). The system enables collection and analysis of spatial, temporal, angular velocity, and pressure handwriting data while the participant writes on a paper affixed to a digitizer (an electronic tablet).

All writing was on A4 lined paper affixed to the surface of a WACOM Intuos 2 (Model GD 0912-12X18) xy digitizing tablet, using a wireless electronic pen with a pressure-sensitive tip (Model GP-110). The x and y location and angle of the pen tip were sampled on the digitizer at 100 Hz by means of a 1300-MHz Pentium (R) M laptop computer. The digitizer provided accurate temporal measures throughout the writing, both when the pen was touching the tablet (on-paper time) and when it was raised (in-air time). It also provided accurate spatial measures when the pen was touching the tablet and/or when it was lifted above the digitizer (up to 6 mm). Beyond 6 mm, spatial measurement was not reliable.

The handwriting evaluation system does not recognize letters, words, or sentences. It analyzes only segments, that is, the curves created by the movement of the pen tip on the paper, which are represented on an x-, y-coordinate system (Mergl, Tigges, Schröter, Möller, & Hegerl, 1999). That is, the computerized analysis recognizes points when the pen is in contact with and/or leaves the paper. Segments were measured from when pen pressure rose above 50 (nonscaled units) at the beginning of a segment to when the pressure returned to 50 at the end of the segment and the pen was raised from the paper. It is important to note that there is variability between and within writers; therefore, we aggregated measures of the entire task. The mean and the standard deviation of each measure were examined for each participant in order to follow intraindividual variability across different measures. On the basis of previous handwriting analyses (Lacquaniti, Ferrigno, Pedotti, Soechting, & Terzuolo, 1987), standard deviations (SDs) for segment duration, path length, height, and width were analyzed as measures of handwriting performance consistency. That is, we calculated the SD between segments in each condition. For example, we calculated the mean length of all segments and then calculated the deviation of each segment from this mean. In order to measure, as accurately as possible, variability of pressure and angular velocity, we measured SD within a segment; that is, we calculated the variability between all data points for each segment. Because the tablet reports the value of these measures 100 times per second, it is possible to capture changes in pressure and angular velocity even within segments.
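The segmentation rule and the two kinds of variability described above can be sketched as follows. This is a minimal illustration, not the ComPET implementation: the function names, the made-up pressure trace, and the use of the population SD (`pstdev`) are assumptions, and a fixed 10-ms sampling interval is inferred from the 100-Hz rate.

```python
from statistics import mean, pstdev

SAMPLE_MS = 10   # 100 Hz sampling gives one sample every 10 ms
THRESHOLD = 50   # pressure threshold in nonscaled tablet units

def split_segments(pressure):
    """Return (start, end) index pairs, end exclusive, where pressure > THRESHOLD."""
    segments, start = [], None
    for i, p in enumerate(pressure):
        if p > THRESHOLD and start is None:
            start = i                      # pressure rose above the threshold
        elif p <= THRESHOLD and start is not None:
            segments.append((start, i))    # pressure fell back to the threshold
            start = None
    if start is not None:                  # recording ended in mid-segment
        segments.append((start, len(pressure)))
    return segments

def on_paper_ms(segments):
    """Duration of each segment on the page, in milliseconds."""
    return [(end - start) * SAMPLE_MS for start, end in segments]

def in_air_ms(segments):
    """Duration of each pause between consecutive segments, in milliseconds."""
    return [(nxt[0] - cur[1]) * SAMPLE_MS for cur, nxt in zip(segments, segments[1:])]

def between_segment_sd(per_segment_values):
    """SD across segments (used for duration, length, height, width)."""
    return pstdev(per_segment_values)

def within_segment_sd(samples, segments):
    """Mean over segments of the SD across data points inside each segment."""
    return mean(pstdev(samples[s:e]) for s, e in segments)

pressure = [0, 0, 120, 300, 310, 40, 0, 0, 200, 400, 0]  # made-up trace
segs = split_segments(pressure)
print(segs, on_paper_ms(segs), in_air_ms(segs))
```

Keeping the between-segment and within-segment computations in separate functions mirrors the distinction drawn in the text: duration and spatial measures yield one value per segment, whereas pressure and angular velocity yield one value per sample.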

We used several measures to analyze handwriting behavior:

  1. Pressure: mean pressure on the writing surface for the entire task, measured in nonscaled units from 0 to 1,024 with a linear curve (the default curve of the WACOM Intuos 2 digitizer).

  2. Temporal: segment duration in air (pen is not in contact with writing surface) and on paper, both reported in milliseconds.

  3. Spatial:

    3.1. Segment length in millimeters: total path length from starting point to finishing point for each written segment.

    3.2. Segment height (y-axis): direct distance from the lowest to the highest point of the segment in millimeters.

    3.3. Segment width (x-axis): direct distance from the left side of the segment to the right side in millimeters.

  4. Angular velocity of a segment indicates how many degrees the pen travels when writing a segment. This is measured in degrees per second, to allow for comparison between short and long segments. Angular velocity was also measured in other studies using radians per second (see, e.g., Lacquaniti, Terzuolo, & Viviani, 1983; Viviani & Terzuolo, 1982).
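The spatial measures and angular velocity listed above can be illustrated with a short sketch that operates on the sampled x and y coordinates of one segment. The text does not give the exact formula for angular velocity, so the version below is an assumption: it sums the absolute changes in pen heading between consecutive samples and divides by the segment duration. All function names are hypothetical.

```python
import math

SAMPLE_S = 0.01  # 100 Hz sampling, assumed constant

def path_length(xs, ys):
    """Total distance traveled along the segment (same units as the input)."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for x1, y1, x2, y2 in zip(xs, ys, xs[1:], ys[1:]))

def height(ys):
    """Direct distance from the lowest to the highest point."""
    return max(ys) - min(ys)

def width(xs):
    """Direct distance from the leftmost to the rightmost point."""
    return max(xs) - min(xs)

def angular_velocity(xs, ys):
    """Assumed definition: total absolute heading change in degrees per second."""
    headings = [math.degrees(math.atan2(y2 - y1, x2 - x1))
                for x1, y1, x2, y2 in zip(xs, ys, xs[1:], ys[1:])
                if (x1, y1) != (x2, y2)]          # skip samples with no movement
    # Wrap each heading difference into the range [-180, 180) before summing.
    turned = sum(abs((b - a + 180) % 360 - 180)
                 for a, b in zip(headings, headings[1:]))
    duration = (len(xs) - 1) * SAMPLE_S
    return turned / duration if duration > 0 else 0.0

# Made-up L-shaped stroke: two steps right, then two steps up.
xs = [0, 1, 2, 2, 2]
ys = [0, 0, 0, 1, 2]
print(path_length(xs, ys), height(ys), width(xs), angular_velocity(xs, ys))
```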

Procedure

Signed informed consent was obtained from the participants following approval by the Ethics Committee of the University of Haifa. Advertisements posted at the University invited students to participate in the study. The participants were asked to write three numerical progressions with different gaps (1, 2, 3, 4; 1, 4, 7, 10; 1, 5, 9, 13). All tasks were performed on paper affixed to a digitizing tablet, and each participant conducted one trial (numerical progression) per condition.

Data analysis

Statistics of the dependent variables were tabulated and examined. It is important to note that data collection is performed automatically by the computerized system in real time—that is, while the participant is writing. The data, obtained as a text file, are objective and exact physical data—length, time, direction, and pressure. The raw data are then aggregated to the final measures using averages and SDs.

In the analysis, handwriting measures for the three conditions of mental workload (in a single factor) were first compared by GLM MANOVAs with repeated measures. MANOVAs were done for each type of measure to achieve mean values and standard deviations according to the following:

  1. pressure on the writing surface;

  2. time: pen duration in air and on paper;

  3. space: segment-path length, width, height;

  4. angular velocity of writing.

The second and third stages were intended to integrate the existing measures into profiles that would be better and simpler handwriting indicators.

In the second stage, in order to reduce the number of measures, we used principal component analysis (PCA) with varimax rotation for all measures that significantly discriminated between the three conditions of mental load. PCA is a data-reduction statistical technique that scores large sets of measured variables and reduces them to smaller sets of composite variables, retaining as much information as possible from the original variables (Fabrigar, Wegener, MacCallum, & Strahan, 1999). PCA was conducted on all the data, regardless of mental complexity condition or participant, in order to capture variability in the data.

Lastly, we conducted cluster analysis with the central measures, using the measure from each factor that was found to differentiate best between the three mental load conditions in stage 1. These measures were analyzed by iterative partitioning (K-means), which minimizes within-cluster variability and maximizes between-cluster variability (Tinsley & Brown, 2000). Cluster analysis was conducted on all the data regardless of mental workload condition. On the basis of the clusters that emerged empirically from the data, we compared the three cognitive complexity conditions and tested the frequency of clusters in each of them. This analysis identified which of the handwriting clusters differentiated better between the conditions of mental load.
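To make the clustering step concrete, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to standardized segment measures. It is not the statistical package routine used in the study: the random initialization, the seed, the function names, and the six made-up segments are all assumptions for illustration.

```python
import random
from statistics import mean, pstdev

def zscores(column):
    """Standardize one measure to mean 0 and SD 1."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate point assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # assumed initialization
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:               # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                        # keep old centroid if cluster is empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids

# Made-up segments: duration (ms), angular velocity (deg/s), length (mm).
duration = [150, 160, 170, 420, 430, 440]
ang_vel = [900, 950, 920, 1300, 1350, 1320]
length = [5.0, 5.2, 5.1, 11.0, 11.4, 11.2]
points = list(zip(zscores(duration), zscores(ang_vel), zscores(length)))
labels, centroids = kmeans(points, k=2)
print(labels)
```

Standardizing each measure first keeps a measure with large numeric values, such as angular velocity, from dominating the distance computation; the resulting centroids are then directly readable as Z-score profiles.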

Results

Prior to the analysis, we screened participants’ answers for mistakes in the mathematical sequence. We found that 18 participants had a mistake in their calculated sequence (4 of whom had 2 mistakes). Participants’ mistakes were in the high and medium mental workload conditions (11 mistakes per condition). We therefore added a “mistakes” variable to the data and controlled for mistakes in our analysis. We found no significant effects of the mistakes variable on the handwriting variables in the different workload conditions. We also filtered out some of the outlier segments according to their duration. After careful screening, we ruled out very long or very short segments that seemed extreme and/or rare, as compared with other segments in the data. We assumed that such segments resulted from measurement errors—for example, when participants stopped for questions or did not understand the task. Segments of less than 50-ms or more than 850-ms duration were deleted. In total, we filtered out 67 segments—that is, 1.8%. We also ran the analysis with a first-order filter with a 12-Hz cutoff frequency but found no significant differences between results with and without the filter. Because we used aggregated measures of each condition, we decided to analyze the data as completely as possible and, therefore, report the analysis without the first-order filter.
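The duration-based screening can be sketched as a simple filter that keeps segments lasting between 50 and 850 ms and reports the share removed. The function name and the example durations are hypothetical.

```python
MIN_MS, MAX_MS = 50, 850  # cutoffs reported in the text

def filter_segments(durations_ms):
    """Drop segments shorter than 50 ms or longer than 850 ms.

    Returns the kept durations and the proportion of segments removed.
    """
    kept = [d for d in durations_ms if MIN_MS <= d <= MAX_MS]
    removed_share = 1 - len(kept) / len(durations_ms)
    return kept, removed_share

kept, removed_share = filter_segments([40, 50, 300, 850, 900])  # made-up values
print(kept, round(removed_share, 3))
```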

We tested the reliability of our measures using a split-half reliability procedure. This was important because, for some participants, segments might represent whole numbers. We split each condition randomly in order to compare the different frequencies of numbers and to make sure that the handwriting measurement was not affected by the frequency differences of certain numbers in each condition. We found sufficient reliability (average Spearman–Brown stepped-up reliability = .71), which indicates that, regardless of frequency of numbers the participants wrote in each split half, the handwriting behavioral measures were stable for each condition and an overall pattern of individual handwriting emerged. Furthermore, using mixed models with repeated measures, we tested the variability between the participants in each condition and measure (see Table 1). We found high and significant variability between participants in all the conditions of the study and for all of the handwriting measures. This provides additional support that the measures reflect overall individual handwriting style that is not only reliable, but also highly distinctive between individuals.
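For readers unfamiliar with the procedure, the Spearman–Brown step-up converts the correlation r between two half-tests into an estimate for the full-length measure, 2r / (1 + r). The sketch below is a minimal illustration under assumptions: each participant's segments are split at random into two halves, the half means are correlated across participants, and the result is stepped up. The function names and the seed are hypothetical, and the exact splitting routine used in the study is not specified in the text. By this formula, a stepped-up value of .71 would correspond to a half-to-half correlation of roughly .55.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)

def random_halves(values, rng):
    """Shuffle one participant's segment values and return the two half means."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return mean(shuffled[:mid]), mean(shuffled[mid:])

def split_half_reliability(per_participant_values, seed=0):
    """Correlate half means across participants, then apply Spearman-Brown."""
    rng = random.Random(seed)
    halves = [random_halves(v, rng) for v in per_participant_values]
    first, second = zip(*halves)
    return spearman_brown(pearson(first, second))

print(round(spearman_brown(0.55), 2))
```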

Table 1 Mixed models repeated measures test of between-subjects variability of temporal, spatial, angular velocity, and pressure handwriting measures within each mental workload condition

Stage 1

The first stage of the analysis tested differences between three mental workload conditions by SPSS GLM with a repeated measures procedure, including workload as one factor with three levels of difficulty, in order to test hypotheses 1–4. The results are presented in Table 2.

Table 2 Comparison of means, standard deviations, and F values of temporal, spatial, angular velocity, and pressure handwriting measures for three mental workload conditions

The MANOVA analyses indicated significant differences between mental workload conditions in temporal, spatial, and angular measures.

Hypothesis 1 predicted differences in mean and variability in the duration of the pen in the air and on the page under high and low mental load conditions. The results supported the hypothesis, and significant differences were found between the three mental workload conditions. In the post hoc analysis, we found that duration of the pen on the page per segment was longer in higher mental workload conditions (M = 288.2 ms, SD = 128.5) than in medium conditions (M = 283.0 ms, SD = 119.8). Duration on the page per segment in the medium mental workload task was significantly higher than in the easy mental workload condition (M = 229.3 ms, SD = 95.2). The analysis also revealed significant differences between the three conditions in regard to duration of the pen in the air between segments. A post hoc analysis revealed a significant difference between the higher mental workload condition, which had the longest duration (M = 517.5 ms, SD = 278.0), and the medium condition (M = 438.1 ms, SD = 259.1). The medium mental workload condition also differed significantly from the low condition, which was the shortest (M = 402.4 ms, SD = 248.7). The distance traveled by the pen in the air is illustrated in Fig. 1.

Fig. 1 Three mathematical progressions written by the same writer, the first under low mental workload, the second under medium mental workload, and the third under high mental workload (visualization of on-page and in-air data)

Not only did mean duration vary significantly between mental workload conditions, but also variability of duration varied significantly, both on the page and in the air. The highest variability in the high mental workload condition (on page, mean SD = 167.1, SD = 121.6; in air, mean SD = 632.1, SD = 460.5) differed significantly from the medium mental workload condition (which had medium variability; on page, mean SD = 123.83, SD = 56.0; in air, mean SD = 498.2, SD = 438.5). The medium mental workload condition differed significantly from the low condition (on page, mean SD = 104.3, SD = 54.2; in air, mean SD = 437.4, SD = 416.7). In sum, higher mental workload is related to longer duration and greater variability both on the page and in the air.

Hypothesis 2 predicted differences in writing between mental workload conditions in regard to angular velocity and variability of angular velocity. The hypothesis was supported, and the post hoc analysis demonstrated that angularity discriminated significantly between high mental load (M = 1,114.4, SD = 349.3) and medium mental load (M = 1,285.6, SD = 3110.9), but without distinction between easy and medium mental load conditions (M = 1,141.2, SD = 408.8). Variability in angular velocity also discriminated significantly between high (mean SD = 2,408.9, SD = 824.9) and medium (mean SD = 2,737.5, SD = 718.3) mental load conditions, but not between medium and low (mean SD = 2,746.8, SD = 898.2) mental load conditions. Thus, handwriting had the highest angular velocity in the medium mental load condition, with smaller variations in angular velocity as the mental load increased.

Hypothesis 3 predicted differences between mental workload conditions in spatial writing measures and their variability. This hypothesis was partially supported. Significant differences were found between the three mental workload conditions for mean and variability of height and length and variability of width. Mean height of segments under high mental load (M = 3.6, SD = 0.90) was significantly lower than under medium load (M = 3.9, SD = 0.90), but no differences were found between medium and low mental load conditions (low load: M = 3.9, SD = 0.96). No significant differences were found for mean segment width. Significant differences were found between all three conditions in length of segment, the longest being in the medium condition (M = 9.012, SD = 1.96), which differed significantly according to the post hoc test from the high mental workload condition (M = 8.13, SD = 2.25). Segments under the low mental workload condition (M = 8.64, SD = 2.08) were significantly shorter than under medium mental workload (see example in Fig. 2). No significant differences were found in length variability, but there were significant differences in width and height variability. Variability in segment width was significantly higher under the low mental workload condition (mean SD = 1.02, SD = 0.32) than under the other two conditions (medium workload [mean SD = 1, SD = 0.30]; high workload [mean SD = 0.98, SD = 0.30], where variability was significantly lower than under medium mental workload). Conversely, variability in segment height was significantly higher under the high mental load condition (mean SD = 1.36, SD = 0.35) than under medium workload (mean SD = 1.22, SD = 0.39). No significant differences were found between medium and low workload conditions (low workload, mean SD = 1.29, SD = 0.33). In sum, segment length and height become smaller when mental workload increases; variability in segment height also increases, but variability in segment width decreases.

Fig. 2 Three mathematical progressions written by the same writer, the first under low mental workload, the second under medium mental workload, and the third under high mental workload (visualization of on-page data)

Hypothesis 4 predicted differences between high and low mental workload conditions in handwriting pressure and pressure variability. This hypothesis was not supported, because no significant differences were found between the three conditions.

Stage 2: Data reduction

Hypothesis 5 predicted that a profile of handwriting measures would distinguish between writing under high and low mental workloads. In order to test this hypothesis and reduce the number of parameters to those that best captured the data, we conducted a PCA and included six measures that were found to differentiate between mental load conditions in Stage 1 (see Table 2). The PCA explained 75.7% of the variance in the data and converged after only three iterations. It also revealed two components (see Table 3). In the first component, spatial measures (length, height, and width) and temporal measures (duration) were loaded together, unlike other handwriting measures. In the second component, angular velocity measures (mean and SD for angular velocity) were loaded together.

Table 3 Factor loadings according to exploratory factor analysis (principal component analysis with varimax rotation) of handwriting measures

Stage 3: Handwriting profile

In order to create a profile for stage 3, we selected one measure for each group of measures—that is, the measure that was the most significant indicator of mental load in the MANOVA analysis (the selected measures were mean segment duration, mean angular velocity, and segment length). Cluster analysis with K-means was used to create a handwriting profile. Cluster analysis revealed three profiles (see Table 4), which present the normalized Z-score of each measure. As can be seen, the first cluster presents a profile of small handwriting with low levels of angular velocity, written quickly. The second cluster captured a profile of (mainly) long segments that take longer to write, and the third cluster consisted of segments that seem to be medium in duration, angular velocity, and length.

Table 4 Results of K-means cluster analysis

In order to compare mental load conditions, we analyzed the frequency of writing profiles for each condition (see Table 5). The analysis indicated that the first cluster is the best indicator; that is, the higher the mental load, the more frequent was the following combination of three measures: smaller segments, lower angular velocity, and less time taken to write the segments (cluster 1). Cluster 1 was also the most frequently observed in the data (47% of the segments), demonstrating discriminant validity, especially between low/medium and high mental workloads (i.e., almost 50% more frequent in high mental workload). Cluster 2 discriminated well only between low and medium/high mental workload, but not between medium and high mental load. It appears that the combination of long segments written over a long duration occurs twice as frequently in medium or high mental workload conditions as in low mental workload conditions. Cluster 2 captured about a third (32%) of the segments in the data. Cluster 3 did not discriminate well between mental workload conditions and captured the smallest percentage of segments in the data (21%). A chi-square analysis revealed highly significant differences between frequencies of handwriting clusters in each mental load condition, χ²(4) = 55.7, p < .001, thus supporting hypothesis 5.
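The chi-square test on a 3 × 3 table of cluster frequencies by workload condition has (3 − 1)(3 − 1) = 4 degrees of freedom, which matches the reported χ²(4). The sketch below shows the computation on a table of invented counts; the counts are placeholders for illustration only and are not the values in Table 5. The tail probability uses a closed form that holds only for even degrees of freedom, which is sufficient here.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Hypothetical counts: rows are clusters 1-3, columns are low, medium, high workload.
counts = [[120, 130, 180],
          [60, 110, 115],
          [80, 70, 75]]
stat, df = chi_square(counts)
print(round(stat, 2), df, chi2_sf_even_df(stat, df))
```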

Table 5 Frequencies of each cluster in each of the mental workload conditions
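For readers who wish to reproduce this kind of test, the chi-square test of independence on a condition-by-cluster frequency table can be sketched as below. With three mental load conditions and three clusters, the degrees of freedom are (3 − 1)(3 − 1) = 4, as in the reported statistic. The counts in `hypothetical_counts` are invented for illustration and are not the values of Table 5; the closed-form tail probability used here holds only for even degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Rows: low, medium, high load; columns: clusters 1-3 (hypothetical counts).
hypothetical_counts = [
    [40, 20, 40],
    [40, 40, 20],
    [60, 40, 20],
]
stat, df = chi_square_independence(hypothetical_counts)
p_value = chi2_sf_even_df(stat, df)

# Tail probability for the statistic reported in the text, chi-square(4) = 55.7.
p_reported = chi2_sf_even_df(55.7, 4)
```

Under these assumptions, `p_reported` falls well below .001, consistent with the significance level reported above.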

Discussion

For this study, we used a computerized handwriting digitizer tablet to detect the effects of mental workload on handwriting. It was assumed that greater mental workload would be apparent in the handwriting of participants writing numerical progression series of different cognitive complexities. Temporal and spatial measures were derived for each mental load condition, as well as pressure and angular velocity measures. Results showed support for hypotheses 1–3, demonstrating that, under high mental load, mean durations of writing, both on the page and in the air, were higher, as were SDs of durations. Mean segment height, mean segment length, and the SD for width were lower, the SD for height was larger, and the mean and SD for angular velocity were smaller than under lower mental load. Results did not support hypothesis 4, and no significant differences were found for pressure measures, mean segment width, or SD for segment height.
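To make the temporal measures concrete, the following sketch shows one way on-paper and in-air durations, with their means and SDs, might be derived from raw tablet samples. It assumes that a segment is a run of consecutive samples with nonzero pressure and that in-air time is the gap between such runs; the study's software may define segments differently, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def stroke_and_air_durations(samples):
    """Split (time_ms, pressure) samples into on-paper and in-air run durations.

    Assumes a non-empty, time-ordered list; pressure > 0 means the pen touches the surface.
    """
    on_paper, in_air = [], []
    run_start = samples[0][0]
    pen_down = samples[0][1] > 0
    for t, pressure in samples[1:]:
        now_down = pressure > 0
        if now_down != pen_down:
            # The current run ends at the first sample of the next run.
            (on_paper if pen_down else in_air).append(t - run_start)
            run_start, pen_down = t, now_down
    # Close the final run only if it has a nonzero length.
    if samples[-1][0] > run_start:
        (on_paper if pen_down else in_air).append(samples[-1][0] - run_start)
    return on_paper, in_air


# Hypothetical samples at 10-ms intervals: (time in ms, pressure in arbitrary units).
samples = [
    (0, 300), (10, 320), (20, 310), (30, 0), (40, 0), (50, 280),
    (60, 290), (70, 0), (80, 305), (90, 300), (100, 310), (110, 0),
]
on_paper, in_air = stroke_and_air_durations(samples)
summary = {
    "on_paper_mean": mean(on_paper), "on_paper_sd": pstdev(on_paper),
    "in_air_mean": mean(in_air), "in_air_sd": pstdev(in_air),
}
```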

Regarding hypothesis 5, this study is the first to create profiles of handwriting behaviors by means of data reduction and cluster analysis. The PCA results provide evidence that some handwriting measures are intercorrelated and that two principal components capture most of the variability in the data, suggesting that it is possible to reduce the number of parameters for measuring handwriting. Cluster analysis provided empirical evidence that these measures represent a meaningfully integrated profile that differs significantly under three mental load conditions. The handwriting profile that is apparently most sensitive to mental load (especially to high levels of mental workload) is of small segments with low angular velocity that take less time to write. This profile became more frequent in participants' handwriting as mental load rose to the highest condition. The other profile that seems to be sensitive, specifically, to increased mental workload above the lowest condition is that of longer segments that take longer to write.

It is important to note that while there were significant differences between some of the handwriting measures according to mental load conditions, changing in a rather linear and synchronized way with cognitive manipulation, others had significant differences only in a few conditions and were not always synchronized with cognitive manipulation. For example, there were significant differences in the mean and SD for duration measures in all the cognitive load conditions, and these were synchronized with cognitive manipulation; as cognitive load increased, the duration increased. On the other hand, some measures differentiated only between two mental load conditions. For example, angular velocity differentiated between medium and high load, but not between low and medium load. Finally, some measures showed significant differences but were not synchronized with manipulation. For example, the longest segment lengths were in the medium mental workload condition. It seems that not all measures relate linearly to mental load, so that careful consideration must be made about whether and how to use these specific measures as cognitive load indicators.

There are few studies that focus on mental workload while numbers are written. On the basis of previous literature (e.g., McCloskey, Caramazza, & Basili, 1985), it can be assumed that the mechanisms involved in number processing differ from those for words/lexical processing as previously evaluated in relation to lie detection and clinical pathologies. Transforming a multidimensional knowledge structure (domain knowledge) into a linear sequence of words (the text) requires the following processes:

  1. Generating and organizing text content by retrieving information from long-term memory or from the environment (e.g., documentary sources).

  2. Translating semantic representation into linguistic structures.

  3. Revision to allow evaluation and modification of conceptual and linguistic characteristics of a text.

  4. Creating a written graphomotor plan.

However, in the case of numbers, different processes are required:

  1. Lexical processing involves comprehension or production of the individual elements as a number (e.g., the digit 3 or the word three).

  2. Syntactic processing involves relations among elements in order to comprehend or produce a whole number.

  3. Calculation requires cognitive mechanisms for (1) processing of optional symbols (e.g., +, -, :) that identify the operation to be performed; (2) basic arithmetic (e.g., multiplication, such as 6 × 7 = 42); and (3) calculation (McCloskey et al., 1985).

We chose calculation because it demands mental activity (Tucha, Mecklinger, Walitza, & Lange, 2006), thereby serving our interest in studying writing under mental workload. Future research should test the effect of cognitive load on writing words, which demands different processes and resources and may be manifested differently in handwriting variables.

Our results are partly in line with those in studies by Van Gemmert and Van Galen (1994, 1996, 1997, 1998) on the effects of physical and mental stress on fast and accurate spatial control while handwriting tasks are performed. They assumed that dis-automatization as a result of mental stress causes increased variation in handwriting velocity, longer movement duration, and smaller writing size among patients with Parkinson’s disease (Van Gemmert et al., 1998). They found that, among adults, auditory stress did indeed cause longer reaction times and higher axial pen pressure (Van Gemmert & Van Galen, 1998). Similarly, Bailey (1988) found that higher pressure is an indication of mental stress. Our results present a systematic description of handwriting measures of a healthy population (with many measures) and support the automatic and controlled information-processing model (Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977). It seems that in a task such as writing a more complex numerical progression, the automatic process in normal handwriting is replaced with a more controlled process that is sensitive to task difficulty, thereby limiting dual-task performance (Fisk & Schneider, 1983; Kahneman, 1973; Navon & Gopher, 1979; Vrij, Fisher, Mann, & Leal, 2006, 2008; Wickens, 1991). These results are similar to those of previous studies regarding various pathologies (Rosenblum et al., 2003a, 2003b; Rosenblum & Livneh-Zirinski, 2008; Rosenblum et al., 2006b; Rosenblum & Werner, 2006) and detection of deception (Luria & Rosenblum, 2010). In sum, these results provide evidence for the assumption that measures of handwriting processes can capture dis-automatization in handwriting.

We also investigated angular velocity (the degrees through which the pen travels per second). Our results demonstrated that the mean and SD for angular velocity differentiate significantly between high and other cognitive load conditions. Furthermore, PCA showed that this discrete measure is not cross-loaded with other measures. Although significant differences were found only between high and low/medium, but not between medium and low cognitive load conditions, it seems that under high cognitive load, angular velocity tended to decrease, so that participants wrote with less variability in angular velocity. This is in line with Luria and Rosenblum’s (2010) contention that in deceptive writing, which is more cognitively taxing than writing the truth, participants’ movements are more limited in order to conserve cognitive resources. Mavrogiorgou et al. (2001) suggested that limitations of writing movements indicate less regularity, as manifested in segment length, height, and standard deviation, which also differ in our study. We believe that these results support studying angular velocity as a handwriting measure, since it will improve detection of cognitive aspects, such as deception or clinical pathology.
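Angular velocity is defined here only as the degrees through which the pen travels per second; the exact computation performed by the digitizer software is not described. One plausible reading, offered purely as an assumption, is the absolute change in pen heading between successive displacement vectors divided by the elapsed time. The function name and the sample trajectory below are hypothetical.

```python
import math


def angular_velocities(samples):
    """Absolute change in pen heading per second (deg/s) at each interior sample.

    `samples` is a time-ordered list of (x, y, t_seconds) tuples.
    """
    out = []
    for (x0, y0, t0), (x1, y1, t1), (x2, y2, t2) in zip(samples, samples[1:], samples[2:]):
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading change into [-pi, pi] so a turn from 179 to -179 degrees counts as 2 degrees.
        diff = heading_out - heading_in
        turn = math.atan2(math.sin(diff), math.cos(diff))
        # Time between the midpoints of the two displacement vectors.
        dt = (t2 - t0) / 2.0
        out.append(abs(math.degrees(turn)) / dt)
    return out


# Hypothetical pen path: straight along x, then a right-angle turn upward.
path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (2.0, 0.0, 0.2), (2.0, 1.0, 0.3)]
velocities = angular_velocities(path)
mean_angular_velocity = sum(velocities) / len(velocities)
```

The mean and SD of such per-sample values within a segment would then give the two angular velocity measures discussed above.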

Similar to other studies (e.g., Van Gemmert & Van Galen, 1998), we measured segment duration and found that the SD for duration is a better indicator of cognitive load than is mean duration on the page, which does not discriminate between medium and high cognitive load conditions.

These results suggest that handwriting measures documented with a computerized digitizer and focusing on automatization/regularity can provide sensitive measures of mental workload. An advantage over other mental workload detection methods is that this approach is nonintrusive and user-friendly (see Kramer, 1991, for problems with other measures). Future studies on a variety of samples and cognitive tasks will test the reliability of this measure.

Computerized handwriting digitizers automatically generate objective data that cannot be obtained by merely observing handwriting behavior or by analyzing written texts (see, e.g., Guinet & Kandel, 2010). Training our research assistant to collect data using the tablet and the software took less than 1 h. Measures such as the SD for segment height or applied pressure are unique measures that are easily and objectively obtained. The writer is not aware of the kind of data being measured. Even if s/he were aware, measures such as pressure, segment height, width, angular velocity, or SDs cannot be actively controlled consistently. Analysis of segments and not of letters enables use of this technique for writing in different languages, and overall, this technique will be useful for researchers and practitioners studying cognitive load.

Limitations

The results of the present study call for the use of computerized handwriting digitizers for future cognitive studies. The present study does, however, have limitations, one being that our research included comparison of conditions that were not identical. That is, in order to measure mental workload and to prevent the effect of learning due to repetition of the same task, we decided on a manipulation that, in turn, did not control for similarity of the segments in each condition. Participants were asked to write three numerical progressions with different gaps between the numbers, resulting in unequal frequency of numbers in each condition. Although most of the numbers existed in all conditions, some of the numbers were used more frequently in one condition than in the other. We note that this manipulation may have added noise to the measurement, due to differences in sizes and width between numbers. Nevertheless, we suggest that because handwriting behavior patterns (spatial, angular velocity, pressure, etc.) become consistent and automatic among adults, the overall measured characteristics of each individual’s segments should be similar even when different letters or numbers are written and should also be different from the writing characteristics of other individuals. That is, we suggest that for each individual, there is a general handwriting pattern that is consistent over different segments (numbers).

In support of this, we found split-half reliability when we split each condition randomly; that is, we found similarities in the behavioral measures (within the same condition and participant), although we did not use identical numbers in each half test that was compared with the other half. We found further support in our analysis for variability between participants: Significant differences were found between individuals within the same condition, indicating high distinctiveness between individuals (see Table 1). In sum, we chose an approach that examines an overall writing style of individuals in order to observe handwriting behavior in a valid mental workload task. We acknowledge that due to our manipulation, we could not control for the differences in writing of specific segments, which may have reduced the reliability of our measurement. We suggest that future studies should test the effect of mental workload on writing while controlling for the segments being written. We believe that such scientific control will further strengthen our results and that the study of handwriting can benefit from a combination of these two methodological approaches (the study of an overall writing style and the controlled laboratory examination of each writing segment).
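The random split-half check described above can be illustrated with the following sketch. The coefficient computed in the study is not specified here, so the choices below are assumptions: each participant's segment values within one condition are randomly halved, the half means are correlated across participants with Pearson's r, and the Spearman-Brown correction is applied. All data values are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(values_by_participant, seed=0):
    """Spearman-Brown corrected correlation between random half means.

    Each inner list holds one participant's values for a single measure
    within a single condition and needs at least two values.
    """
    rng = random.Random(seed)
    first, second = [], []
    for values in values_by_participant:
        shuffled = list(values)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        first.append(sum(shuffled[:mid]) / mid)
        second.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    r = pearson_r(first, second)
    return 2 * r / (1 + r)


# Hypothetical segment durations (s) for four participants in one condition.
durations = [
    [0.10, 0.12, 0.11, 0.13],
    [0.30, 0.31, 0.29, 0.32],
    [0.50, 0.52, 0.51, 0.49],
    [0.20, 0.22, 0.21, 0.19],
]
reliability = split_half_reliability(durations)
```

A coefficient close to 1 under such a split would indicate that a participant's measure is stable across different written numbers, which is the pattern the argument above relies on.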

An additional limitation of this study is that the sample consisted only of students. Future studies with randomly sampled participants from different populations should improve the generalizability of the results. Furthermore, instead of one specific cognitive task, as was used in this study (i.e., numerical progression), future studies should employ a variety of tasks, such as writing from memory versus copying text or writing a paragraph versus writing while thinking of a number, in order to increase memory load. Such tasks require different cognitive functions and, thus, may influence handwriting differently.