Abstract
Translating or adapting psychological and educational tests from one language and culture to others has been common practice for almost a hundred years, beginning with Binet's test of intelligence. Despite this long history and the many good reasons for adapting tests, proper methods for conducting test adaptations and establishing score equivalence are not well known among psychologists. This paper draws attention to judgmental and statistical methods and procedures for adapting tests, with particular emphasis on procedures for identifying poorly adapted items. When these methods are correctly applied, the validity of cross-cultural uses of the adapted test should increase.