Introduction

Motor assessment is relevant both to identify atypical development that might benefit from adapted forms of activity or remediation and to measure progress obtained with intervention. The assessment of motor abilities is also relevant for research so that we can learn how children gain control over their movements and how different factors, including aging, diseases, and training, affect motor performance.

It is well known that various factors affect motor performance [1•]. From an ecological perspective, motor performance is considered a function of the interaction between the person, the task, and the context [2]. Body-related factors, such as neuromuscular control and morphology, are parameters well studied, often under controlled conditions, but less so are task and contextual factors that can vary enormously between cultures. The assessment of motor abilities usually includes the use of standardized tests, in which children are expected to cooperate and perform specific tasks that are more or less common across various tests. Standardized tests are geared toward measurement of universals in performance [3••]; however, since the most well-known motor development tests were created by researchers from North America and Europe, concepts that prevail in these cultures are embedded in the tasks, materials, and format of the instruments. This raises the issue of whether these assessments would be valid in other countries. Is it valid to assume that motor skill development is the same across different cultures? Is it necessary to make assessment of motor ability more relevant cross-culturally?

To answer these questions, we first need to evaluate the evidence that motor development is indeed influenced by culture.

Culture and Motor Development

In a provocative article, Henrich, Heine, and Norenzayan [4••] claimed that although the authors of most research articles published in psychology and behavior tend to argue their findings can be generalized to humankind, this is not correct because of the limited samples on which most studies are based. The authors criticize the fact that most published articles rely on convenience samples, recruited around universities from western, educated, industrialized, rich, and democratic (WEIRD) countries. They present evidence that domains such as psychology, motivation, and behavior are influenced by culture and argue that more well-designed cross-cultural studies are needed before generalizations can be made.

The same concern applies to motor development. Descriptions of children’s motor development appeared in the literature more than 100 years ago [5••]. Developmental norms as a function of age and stages, as popularized by Gesell [6], soon became the standard by which children’s development across the world was compared. Inspired by Gesell’s work, current developmental scales (e.g., Bayley, Denver) [7, 8] further expand the idea of universals, adopting a normative approach that focuses on the mean performance or on idealized typical children, without considering individual differences in motor proficiency [9]. Moreover, since most developmental scales were created in Western European and North American countries, population norms are limited to these regions.

The literature on motor development, however, presents several examples of cross-cultural variations in motor development. In the late 1980s, based on an extensive review of anthropological investigations, Cintas [10••] already presented evidence of cultural variations in the sequencing and timing of motor development. As pointed out by Adolph and colleagues [5••], a cascade of interacting factors such as climate, housing, availability of food, man-made artifacts, parental expectations, and childrearing practices, all immersed in cultural practices, can affect motor trajectories and movement forms. For example, children in Africa have been shown to present better head control and to sit earlier, as mothers provide more vigorous handling; newborn babies are placed seated on their mothers’ laps, and by 3 months, some are even propped to sit in a hole on the floor. On the other hand, children from Japan, China, and Korea, required to use chopsticks from young ages, present earlier development of fine motor skills [5••]. This literature has shown that formal training, immersed in cultural practices, can accelerate the development of particular skills such as reaching, sitting, crawling, and walking. Enriched environments and stimulation can accelerate the developmental timing of specific, culturally required skills, while impoverished environments can restrict practice, resulting in delays. However, irrespective of the culture and other factors, all children acquire basic motor functions (i.e., reaching, sitting, walking), even if in different sequences and at different rates, showing that there is more than one way to acquire skills that are vital for survival [5••].

Since the 1980s, when Cintas [10••] warned developmental therapists to recognize that western motor assessment scales were not universal, the use of standardized assessment tools has become routine, and motor scales have been used across different countries. Studies on the international use of motor assessment tools provide further support for cultural influences on motor performance.

Evidence of Cross-Cultural Variation in Motor Performance Using Common Tests of Motor Development

A simple search in scientific databases (i.e., Medline, SciELO), using a combination of terms (i.e., motor assessment, motor development, cross-cultural translation, transcultural adaptation, psychometrics), revealed several articles, published in the last 20 years, reporting the use of motor development tests in countries other than those in which they were created and normed. This search was conducted mainly to find examples of international use of the most commonly used motor assessments, and articles were considered for this review only if the results section reported any information regarding cross-cultural comparison.

Analysis of the articles, considering research aims, whether and how the motor test was translated, the nationality of the children in the sample, and the results of the cross-cultural comparisons, revealed that most authors, as expected, are mainly interested in verifying whether a specific motor test can be used in their countries. To assess whether the test can be used in their culture, most studies describe results of extensive psychometric analysis of local data, some authors make comparisons between their sample and information obtained from the tests’ manuals, and a few include group comparisons with a data set derived from the original normative sample of the target test. Indeed, of over 100 articles on the cross-cultural use of motor tests obtained in the search, 27 compared the performance of local children with the means of the original normative sample, as reported in the test’s manual, and among those, only two reported access to the motor test’s original normative sample and were thus able to conduct a direct comparison with the study’s sample [11, 12].

Although most studies do not focus specifically on cultural factors, most report differences on specific items or on the overall test score between local and normative samples. In spite of a general tendency to conclude that the differences identified are not likely to have practical impact (e.g., low effect size), differences may be observed even when the samples are from the same country and speak the same language as the normative sample of the original test. For example, Crowe and colleagues [13] and Cohen and colleagues [14] compared the motor performance of Native American and African American children with the normative sample (i.e., z score) of the first edition of the Peabody Developmental Motor Scales (PDMS). They noted that 2-year-old Pueblo children had lower scores on the fine motor scale [13], while young African Americans scored higher on gross motor skills [14]. More recent studies with the PDMS-2 also suggest the need to adjust cut-off scores for Dutch children [15] and to develop new norms for Indian [16, 17] and Portuguese children [18].
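The kind of comparison described above, in which a local sample is set against the mean and standard deviation printed in a test manual, can be sketched as follows. This is a minimal illustration and not the procedure of any cited study: the function name, the scores, and the normative values (mean 10, SD 3) are invented for the example, and the normative mean is treated as a fixed value.

```python
from math import sqrt
from statistics import mean, stdev


def compare_with_norm(local_scores, norm_mean, norm_sd):
    """Compare a local sample with the normative mean and SD from a test manual."""
    n = len(local_scores)
    local_mean = mean(local_scores)
    # Mean z score: distance of the local mean from the norm, in norm SD units.
    # It can also be read as a standardized effect size.
    mean_z = (local_mean - norm_mean) / norm_sd
    # One-sample t statistic, treating the normative mean as a fixed value.
    t_stat = (local_mean - norm_mean) / (stdev(local_scores) / sqrt(n))
    return mean_z, t_stat


# Invented scores on a hypothetical scale whose manual reports mean 10 and SD 3.
mean_z, t_stat = compare_with_norm([7, 8, 9, 10, 11], norm_mean=10, norm_sd=3)
print(f"mean z = {mean_z:.2f}, t = {t_stat:.2f}")
```

A comparison of this type uses only summary values from the manual; the direct comparisons with the original normative data set reported in [11, 12] would instead use a two-sample analysis.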

Cross-cultural application of other tests commonly used for motor performance evaluation yields similar results; however, there are inconsistencies. For example, two studies conducted in Brazil using the Alberta Infant Motor Scale (AIMS) [19, 20] suggest that Brazilian infants present lower gross motor performance when compared to the Canadian norms [21]. In one of them [19], 70 babies were assessed longitudinally from zero to 6 months, while the other [20] included 795 infants aged zero to 18 months. Gontijo and colleagues [22], on the other hand, assessing 630 infants aged zero to 18 months, showed that Brazilian and Canadian infants present a very similar course of gross motor development on the AIMS; however, the cut-off points at the 5th and 10th percentiles are not the same for all ages, suggesting the need to use local norms for the identification of motor delay.

Concerning the Bayley Scales of Infant Development, 3rd edition (BSID-III), of the four cross-cultural studies [23–26] identified, three demonstrated significant differences in performance. Australian 2-year-old children presented significantly higher means on all scales of the BSID-III [23], especially on the motor scale, with high-risk extremely preterm infants performing within the mean, suggesting that the US norms underestimate the development of Australian children. Similarly, Cromwell and colleagues [24] concluded that reliance on US-based norms for the BSID-III in Malawian children resulted in misclassification of developmental delay in the language and cognitive domains and, to a lesser degree, in the motor domain. Yu and colleagues [25], applying the US norms, also reported higher motor composite scores for Taiwanese 6- to 24-month-old infants, recommending an upward adjustment of the cut-off score for better identification of delays.

Differences in normative standards of performance are also shown for children on other tests. Kambas and colleagues [27], using the Motor Proficiency Test for 4- to 6-year-old children (MOT 4–6), found significantly lower levels of motor performance in a large sample of Greek children compared with German norms. In a study with the Körperkoordinationstest für Kinder (KTK), Smits-Engelsman and colleagues [28] observed that the percentages of Dutch children falling below the 15th and 50th percentile points were substantially higher than those for German children, especially at the 15th percentile, suggesting that the norms for the KTK are likely to overestimate the number of Dutch children with difficulties.
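The percentile argument above can be made concrete with a short sketch that counts how many children in a local sample fall below a cut-off taken from foreign norms. It is illustrative only: it assumes, for simplicity, that normative scores are normally distributed (published tests use their own normative tables), and the function name, the scores, and the normative mean of 100 with SD 15 are invented.

```python
from statistics import NormalDist


def share_below_cutoff(local_scores, norm_mean, norm_sd, percentile):
    """Return the norm-based cut-off and the share of local scores below it."""
    # Score at the requested percentile of the (assumed normal) normative distribution.
    cutoff = NormalDist(norm_mean, norm_sd).inv_cdf(percentile / 100)
    below = sum(1 for score in local_scores if score < cutoff)
    return cutoff, below / len(local_scores)


# Invented local scores, compared against hypothetical norms (mean 100, SD 15).
local = [80, 90, 95, 105, 110]
for pct in (15, 50):
    cutoff, share = share_below_cutoff(local, 100, 15, pct)
    print(f"{pct}th percentile cut-off = {cutoff:.1f}, local share below = {share:.0%}")
```

If the local share below a percentile is clearly larger than the percentile itself, applying the foreign norms would flag more local children than expected, which is the pattern reported for the KTK.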

On the Test of Gross Motor Development (TGMD), Cepicka [29] reported that US norms for 7-year-olds cannot be generalized to the Czech population due to their lower scores on the locomotor and object control subscales. On the other hand, Aponte, French, and Sherrill [30] suggested that the US norms can be used with Puerto Rican children; however, they did find that 7-year-old girls had lower gross motor skills.

In relation to the Bruininks-Oseretsky Test of Motor Proficiency (BOTMP), Tsiotra and colleagues [31] compared the performance of Greek and Canadian children, finding that Greek children demonstrated significantly lower BOTMP-SF scores than their Canadian counterparts. Lam [32] observed that 5-year-old Hong Kong children were significantly better than the BOTMP normative sample in the balance, bilateral coordination, strength, and upper limb coordination subtests. By contrast, the running speed and agility performance of Hong Kong children was inferior. In both studies, the authors suggest that cultural lifestyles might have influenced the results, recommending further studies to investigate the validity of the US norms when applied to children from Greece and Hong Kong. Tsiotra and colleagues [31] attribute the lower motor skills and higher prevalence of motor coordination problems among Greek children to a more inactive lifestyle. According to Lam [32], because Hong Kong children tend to live in smaller spaces, they often have to avoid bumping into things, which demands balance; in addition, balance benches, beams, and folding tunnels are standard kindergarten equipment, and high demands are placed on manual control in order to manipulate chopsticks and to write from age two.

Finally, studies with the Movement Assessment Battery for Children (MABC) provide further evidence of cross-cultural differences in motor development. Among the studies that used the first version of the instrument, direct comparison of the US normative data with samples of Japanese [11] and Chinese [12] children provides evidence of differences in motor skills, with recommendations being made for further validity studies in Japan and for adjustment of cut-off scores in China. In Australia, 4-year-old children performed better than the American sample, but as the difference disappeared at age five, the authors did not recommend adjustments [33]. Conversely, Van Waelvelde and colleagues [34] concluded that US norms were appropriate for 4-year-old Flemish children but required adjustment to identify 5-year-old children with mild motor impairment. The US cut-off scores lacked sensitivity for the 5-year-old Flemish children, suggesting the need for a separate cut-off for children of this age. Rosblad and Gard [35], examining 6-year-old children, found better performance for Swedish children on only one item (“Rolling a Ball”), concluding that US norms might be used in their country.

As discussed by Venetsanou and colleagues [36], differences in motor performance between Asian and US children as measured by the MABC seem to be larger than those between children from Europe and the USA. Studies on the cross-cultural use of the MABC-2 tend to focus on psychometric properties [37, 38], yielding less information on the reasons for cross-cultural differences in motor performance.

Taken together, there is a wealth of information concerning the cross-cultural use of motor development tests and numerous instances of disagreement between test norms and local levels of performance. In some studies, the differences were not considered large enough to threaten the validity of the instrument; in others, re-norming is advisable. Even though there is evidence of differences in motor performance as measured by different instruments, it is important to note that differences may not be due only to motor skills, as other factors affect test performance. For example, as children in Brazil are not accustomed to one-on-one testing situations, they tend to become restless during longer assessments, which affects their performance. Recently, we have also observed that in a school with more permissive attitudes to instruction, it was difficult to retest children, as they cooperated in the first assessment but not in the second one, despite all efforts of the examiner.

While there is evidence of culture-related motor differences, as measured by different motor tests, an aspect not well addressed in recent studies is the relevance of test items as well as the behavioral expectations concerning testing in different cultures. The issue of task relevance becomes critical when we consider that, although scores on motor development tests provide a picture of the level of motor performance under constrained situations, they do not always predict actual participation in daily life tasks [39], which is usually the final goal of intervention programs. In our experience, we have seen both children who perform well on the MABC-2 but cannot tie a shoelace and children who can play soccer beautifully but do not conform to rules and follow instructions, performing poorly on the standardized test.

Given evidence of cultural differences in performance on current standardized motor tests, another point to consider is the assessment of participation in activities that require motor skills.

Innovative Standardized Assessments of Daily Living Skills

Professionals routinely combine standardized motor tests with home or school observations in order to have a better sense of the relationship between test scores and functional skills. There are several assessment tools and structured guidelines for natural observation of functional skills, and one of particular interest is the Assessment of Motor and Process Skills (AMPS) [40•] and its school version, the School-AMPS [41]. The AMPS is a standardized measure of the quality of the performance of personal and instrumental activities of daily living (ADL). The assessment is conducted in natural environments (e.g., home, school) where, while the client performs regular tasks, the therapist observes and scores the performance on a taxonomy of action verbs that describes the motor skills and the strategies or process skills used to complete the task. The 16 motor and 20 process skill items can be used to describe performance in any task. With the use of Rasch analysis [42], 120 standardized ADL tasks used in different countries, along with 27 classroom activities in the School version, were calibrated into the system, providing a wide selection of choices for individuals from 2 years of age to old age and helping to ensure cultural relevance for assessment purposes [43, 44•].

A recent update of the Pediatric Evaluation of Disability Inventory, the PEDI-CAT [44•, 45], an activities of daily living (ADL) questionnaire for children that has been extended to young adults, used similar Rasch measurement principles and applied computer-assisted technology to calibrate difficulty for ADL tasks corresponding to different age levels. It was designed so that parents, after responding to questions that are relevant to the child’s condition, can be guided by algorithms in the program to respond to a set of easier and harder questions to obtain an accurate ability measure. The PEDI-CAT offers a bank of 276 items, from which parents choose a minimal set of tasks relevant to the child’s needs. Since all items are calibrated by difficulty level, a computer program can calculate the corresponding ability measure based on the responses to a small set of items. The possibility of choosing from a wide selection of tasks makes it possible to use the instrument in different settings and cultures. Both instruments (i.e., AMPS and PEDI-CAT) illustrate innovative use of measurement principles to create flexible standardized assessment tools that can be used in different cultures. There is, however, a language barrier; both tests are written in English, which raises the issue of translation.
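The measurement idea behind calibrated item banks can be sketched with a simplified dichotomous Rasch model. This is an illustration under stated assumptions, not the algorithm of the AMPS or the PEDI-CAT, which use rating scales and their own software: the function names, item labels, and difficulty values are all hypothetical, and ability is estimated by a plain Newton-Raphson maximum likelihood search.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success on an item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, difficulties, iterations=50):
    """Maximum likelihood ability estimate from 0/1 responses to calibrated items."""
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("no finite estimate for all-0 or all-1 response patterns")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = raw_score - sum(probs)              # observed minus expected score
        information = sum(p * (1 - p) for p in probs)  # test information at theta
        step = gradient / information
        theta += step
        if abs(step) < 1e-8:
            break
    return theta


def select_next_item(theta, item_difficulties, administered):
    """Pick the unused item whose difficulty is closest to the current ability."""
    candidates = {k: b for k, b in item_difficulties.items() if k not in administered}
    return min(candidates, key=lambda k: abs(candidates[k] - theta))


# Hypothetical item bank: task label -> calibrated difficulty (logits).
bank = {"eats with spoon": -1.0, "buttons shirt": 0.2, "ties shoelace": 1.8}
theta = estimate_ability([1, 1, 0], list(bank.values()))
print(f"estimated ability = {theta:.2f} logits")
print("next item:", select_next_item(theta, bank, administered={"buttons shirt"}))
```

Because every item sits on the same difficulty scale, two children can answer different, culturally relevant subsets of items and still receive comparable ability estimates, which is the property that makes such banks attractive for cross-cultural use.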

With globalization, increased migration, and a trend toward more population diversity, assessment tools often cannot be generalized [46]. More than ever, standardized assessment tools that have been translated and validated through research are required, ensuring relevance to the individuals who are being assisted [47, 48]. Expansion of multinational and multicultural research projects [49••] also underscores the need for cross-cultural adaptation of measuring instruments [46–48, 49••].

Cross-Cultural Adaptation of Assessment Tools

The decision to use any standardized assessment cross-culturally should be preceded by a deep understanding of its objectives, strengths, and limitations, as well as an analysis of the relevance of the content and the adequacy of the test’s format for that specific population. Only a thorough process of translation and cultural adaptation can guarantee more reliable and culturally valid measures. The term adaptation rather than translation is recommended [50••] because it is broader, encompassing all aspects of the preparation of a test to be used in another language or culture.

There are many guidelines for cross-cultural adaptation [51••], but Beaton and colleagues [49••] provide a step-by-step description of the adaptation process for health-related questionnaires that has been widely used. According to these authors, the adaptation process should maximize semantic, idiomatic, experiential, and conceptual equivalence between the translated and the original protocol. The process involves the adaptation of the individual items, the instructions, and the response options and, finally, psychometric analysis of the adapted instrument and development of normative data [49••]. The aim is to obtain maximum accuracy in language translation while maintaining item difficulty, and the reading of the items should flow naturally as in the original test, maintaining relevance to the population being measured [47].

The same principles for cross-cultural adaptation of health questionnaires should be applied to performance and motor assessments. As an example, stacking blocks is a very common task in fine motor assessments that is not that common in Brazil; we have observed infants who play with the beautiful blocks but refuse to stack them up, which should not be taken as a sign of poor motor coordination. So, motor performance tests, like any other assessment tool, should not only be well translated linguistically but must also be culturally adapted to maintain the validity of their content at the conceptual level, seeking equivalence between the original and the translated version [49••]. For example, when translating a motor coordination questionnaire, the item “Your child will never be described as a ‘bull in a china shop’”, which uses an expression well understood in Canada, was initially translated to an “elephant in a china shop” and finally eliminated, as this expression was not understood and was even considered offensive by Brazilian parents [52•]. In a recent adaptation of the same questionnaire to German [53••], the numbers 1 (worst performance) to 5 (best performance) representing the scoring criteria had to be replaced by words because parents had difficulty relating to them, as school grades in Germany range from 1 to 6, with 1 being the most positive and 6 the most negative.

When cultural issues seem to be affecting test performance, we have to consider whether to construct a new instrument or adapt an existing test. This decision should be based on several factors: the nature of the study in which the test will be used; the cost of making a new instrument versus using an existing one; the time frame, as creating a test is a longer process; the expertise of the research team in measurement and test design; the reliability and validity of the target instrument; the comparability of scores of the new instrument as well as evidence of successful use in other cultures and equality of conditions [47, 48, 49••, 50••]. Among motor assessment tools, the Developmental Coordination Disorder Questionnaire (DCDQ), a parent questionnaire, has been submitted to step-by-step transcultural adaptation to Brazilian Portuguese [52•], German [53••], Japanese [54], and Italian [55••]. Concerning motor performance tests, the translation procedures were not described in most articles reviewed, but the MABC-2 [37, 38] and the TGMD-2 [56] have been submitted to transcultural adaptation according to internationally recommended procedures [49••].

The adaptation of existing assessment tools has much to commend it [57, 58]: it provides a common measure for the investigation of a phenomenon; provides a standard measure for international studies; allows comparison between national/cultural groups, offering a standard assessment tool designed and adapted for the measurement of cross-cultural phenomena; and is cheaper and less time consuming than creating a new instrument.

Also important in the context of intervention research is the point that when the assessment tool is used as an outcome measure, the metric itself becomes an indicator of the effectiveness of the intervention. Therefore, the cross-cultural adaptation of outcome measures raises questions about cultural equivalence in service provision. We need to consider that a concept may appear differently in a given profession, in a specific level of care, or in a particular population. While the linguistic translation may be the same, the meaning of the same expression can vary greatly [47]. Concepts such as family centered and client centered, although easily translated, may not represent the same practices in different cultures. In the same way, words such as “treatment” and “intervention”, frequently used in health care, do not fit in the educational field, and different professionals with their own jargon may be in charge of specific services in different countries. For example, while in Denmark the assessment and treatment of swallowing problems is done by occupational therapists, in the United States it is done by speech therapists. This difference clearly shows that when adapting outcome measures, we must be aware not only of the cultural equivalence of the assessment tool but also of the cultural equivalence of the intervention [47].

Some drawbacks of test adaptation must also be considered, such as cases in which the adaptation cannot be justified because the process will not result in a valid test, or in which there is a risk of generating conclusions based on concepts from one culture that are not relevant, or are only partially relevant, in another culture [50••]. Besides culture, economic constraints must also be considered. Motor performance tests, such as the ones mentioned previously, require standardized materials provided in test kits that might be affordable in the countries in which they are published but can be very expensive when imported. Researchers are likely to have access to grants and tax-free importation, conditions that do not apply to professionals. Finally, even though some motor assessment tools have been fully adapted and validated through research, adapted test protocols cannot be shared due to copyright restrictions and test materials are not available for general use, limiting their widespread use.

Conclusion

Returning to our initial question of how we can make our assessment of motor ability relevant cross-culturally, it is clear that motor assessment should always be contextualized. Even when we are using standardized motor development tests within their original country, the test should always be associated with information about performance in the real world, either by observation at home, at school, or in the community, or by means of interviews with the child, the caregivers, or teachers.

If the assessment tool was not created for that specific population, we should be even more careful. It is important to consider whether the instrument is appropriate for that culture and whether it is worth adapting; if that is the case, the process should be done step by step, according to current guidelines for cross-cultural adaptation. The adaptation process is time consuming, but the final product will be more equivalent to the original. It is also necessary to conduct full reliability and validity studies, as measurement properties may not be retained. All these steps should be recorded and reported in publications, to give users and researchers more confidence in the quality of the adapted instrument and its measures.

In large and diverse countries like Brazil and the US, an adapted test cannot be recommended for widespread use when validity data has been collected in restricted samples. Collaboration is needed in order to conduct multi-center studies to gather information from different regions as well as from minorities.

One strategy to improve cultural sensitivity of motor assessments is to develop instruments in collaboration with researchers from different countries. While this takes time, the end product will be more generalizable in terms of administration and norm-referenced interpretation.

The shared use of existing databanks for cross-cultural studies would be another useful resource. Some authors not only share the data set but also consult with the group interested in test adaptation, sharing expertise with less-experienced researchers in other countries. When conducting these international projects, it is important to encourage data collection across all ages covered by the target instrument in order to be able to compare full scales. Test publishers should also be invited to collaborate in order to find ways to break down barriers that limit access to test kits and protocols.

The use of innovative approaches, such as computer adaptive testing, with the inclusion of items relevant to different countries, has not been fully explored in the field of motor testing. Such approaches could increase the cultural validity of motor assessments, as culturally relevant tasks of similar difficulty level could be calibrated, allowing for culturally sensitive assessment.

Last but not least, the cross-cultural adaptation of measures raises questions about cultural equivalence in service provision, which must be carefully analyzed before deciding to adapt as well as after the instrument is put to use.