Original Article

An Empirical Power Analysis of Quasi-Exact Tests for the Rasch Model

Measurement Invariance in Small Samples

Ingrid Koller

Leibniz Institute for Educational Trajectories, University of Bamberg, Germany

Marco Johannes Maier

Vienna University of Economics and Business, Institute for Statistics and Mathematics, Vienna, Austria

Reinhold Hatzinger

Vienna University of Economics and Business, Institute for Statistics and Mathematics, Vienna, Austria

Published Online: January 01, 2015

https://doi.org/10.1027/1614-2241/a000090

Abstract

Measurement invariance is an important requirement of psychological tests and a central assumption examined when testing the Rasch model. Ponocny (2001) proposed quasi-exact tests for small samples, in which test statistics are evaluated on matrices generated by Monte Carlo methods. The present study analyzed the Type I error rates and the empirical power of two such test statistics for the assumption of measurement invariance and compared them with Andersen’s (1973) likelihood ratio test. Each simulation was based on 10,000 replications and varied sample size (n = 30, 50, 100, 200), test length (k = 5, 9, 17), the number of items exhibiting model violation, the magnitude of the violation, and the ability distribution. The results indicate that large model violations at the item level can be detected with samples of n = 50 or n = 100, and even weak violations with n = 200. The results also show that very small samples, for which a parametric approach is not possible, can still be investigated, which is one of the most important advantages of quasi-exact tests.
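The general idea behind a quasi-exact test can be illustrated with a minimal sketch. Under the Rasch model, all binary response matrices with the same row and column sums are equally likely, so a test statistic computed on the observed matrix can be compared with its distribution over sampled matrices that share those margins. The sketch below is an illustration under stated assumptions, not the procedure used in the study: it samples matrices with simple checkerboard swaps rather than the samplers cited in the references, the function names (swap_step, item_group_count, quasi_exact_p) are hypothetical, and the one-sided item-level statistic (number of positive responses to one item in one group) is chosen only for simplicity.

```python
import random


def swap_step(matrix, rng):
    """Attempt one checkerboard swap; row and column sums never change."""
    n, k = len(matrix), len(matrix[0])
    r1, r2 = rng.sample(range(n), 2)
    c1, c2 = rng.sample(range(k), 2)
    a, b = matrix[r1][c1], matrix[r1][c2]
    c, d = matrix[r2][c1], matrix[r2][c2]
    # Only the 2x2 patterns [[1,0],[0,1]] and [[0,1],[1,0]] can be flipped.
    if a == d and b == c and a != b:
        matrix[r1][c1], matrix[r1][c2] = b, a
        matrix[r2][c1], matrix[r2][c2] = d, c


def item_group_count(matrix, item, group):
    """Illustrative statistic: positive responses to `item` in group 1."""
    return sum(row[item] for row, g in zip(matrix, group) if g == 1)


def quasi_exact_p(matrix, group, item, n_samples=1000, thin=200,
                  burn_in=2000, seed=1):
    """One-sided Monte Carlo p-value: share of sampled matrices whose
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    current = [row[:] for row in matrix]  # leave the observed data untouched
    t_obs = item_group_count(matrix, item, group)
    for _ in range(burn_in):
        swap_step(current, rng)
    hits = 0
    for _ in range(n_samples):
        for _ in range(thin):  # thinning reduces dependence between draws
            swap_step(current, rng)
        if item_group_count(current, item, group) >= t_obs:
            hits += 1
    return hits / n_samples
```

A call such as quasi_exact_p(data, group, item=0), with data given as a list of 0/1 rows (persons by items) and group as a list of 0/1 labels, returns the proportion of sampled matrices in which group 1 answers the item positively at least as often as observed. The default chain settings are arbitrary placeholders and would need checking for any real data set.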

References

Alexandrowicz, R. (2002). Die Teststärke des Likelihood-Quotienten-Tests nach Andersen bei der Überprüfung der Modellgültigkeit des dichotomen logistischen Modells nach Rasch [The power of Andersen’s likelihood ratio test for the examination of the dichotomous logistic model according to Rasch]. (Unpublished doctoral thesis). University of Vienna, Austria.
Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.
Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152.
Chen, Y., & Small, D. (2005). Exact tests for the Rasch model via sequential importance sampling. Psychometrika, 70, 11–30.
de Ayala, R. J. (2009). The theory and practice of item response theory. New York, NY: The Guilford Press.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
Ferne, T., & Rupp, A. A. (2007). A synthesis of 15 years of research on DIF in language testing: Methodological advances, challenges, and recommendations. Language Assessment Quarterly, 4, 113–148.
Fischer, G. H. (1974). Einführung in die Theorie psychologischer Tests: Grundlagen und Anwendungen [Introduction to the theory of psychological tests: Basic principles and applications]. Bern, Switzerland: Huber.
Fischer, G. H. (1981). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika, 46, 59–77.
Fischer, G. H. (1995a). Derivations of the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 15–38). New York, NY: Springer.
Fischer, G. H. (1995b). Some neglected problems in IRT. Psychometrika, 60, 459–487.
Fischer, G. H., & Molenaar, I. W. (1995). Rasch models: Foundations, recent developments, and applications. New York, NY: Springer.
Glas, C. A. W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8, 647–667.
Glas, C. A. W., & Verhelst, N. D. (1995). Testing the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 4–14). New York, NY: Springer.
González-Betanzos, F., & Abad, F. J. (2012). The effects of purification and the evaluation of differential item functioning with the likelihood ratio test. Methodology, 8, 134–145.
Hambleton, R. K., & Rogers, H. J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. Applied Measurement in Education, 2, 313–334.
Holland, P. W., & Thayer, D. T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.
Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.
Kim, E. S., Yoon, M., & Lee, T. (2012). Testing measurement invariance using MIMIC: Likelihood ratio test with a critical value adjustment. Educational and Psychological Measurement, 72, 469–492.
Koller, I., Alexandrowicz, R., & Hatzinger, R. (2012). Das Rasch Modell in der Praxis: Eine Einführung mit eRm [The Rasch model in practical applications: An introduction with eRm]. Vienna, Austria: facultas.wuv, UTB.
Koller, I., & Hatzinger, R. (2013). Nonparametric tests for the Rasch model: Explanation, development, and application of quasi-exact tests for small samples. Interstat, 11, 1–16.
Kubinger, K. D., & Draxler, C. (2007). Probleme bei der Testkonstruktion nach dem Rasch-Modell [Some problems in calibrating an item pool according to the Rasch model]. Diagnostica, 53, 131–143.
Lehmann, E. L. (1986). Testing statistical hypotheses. New York, NY: Springer.
Magis, D., Béland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847–862.
Maier, M. J., & Koller, I. (2014). Supplement to Koller, Maier, & Hatzinger: “An Empirical Power Analysis of Quasi-Exact Tests for the Rasch Model: Measurement Invariance in Small Samples”. Research Report Series/Department of Statistics and Mathematics, 127. WU Vienna University of Economics and Business, Vienna. Retrieved from epub.wu.ac.at/4340/
Mair, P., & Hatzinger, R. (2007a). CML based estimation of extended Rasch models with the eRm package in R. Psychology Science, 49, 26–43.
Mair, P., & Hatzinger, R. (2007b). Extended Rasch modeling: The eRm package for the application of IRT models in R. Journal of Statistical Software, 20, 1–20. Retrieved from www.jstatsoft.org
Mair, P., Hatzinger, R., & Maier, M. J. (2013). eRm: Extended Rasch Modeling [Computer software]. R package version 0.15-3. Vienna, Austria: R Foundation. Retrieved from CRAN.R-project.org/package=eRm
Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–118.
Molenaar, I. W. (1995). Some background for item response theory and the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 3–14). New York, NY: Springer.
Penfield, R. D. (2010). Modeling DIF effects using distractor-level invariance effects: Implications for understanding the causes of DIF. Applied Psychological Measurement, 34, 151–165.
Penfield, R. D., Myers, N. D., & Wolfe, E. W. (2008). Methods for assessing item, step, and threshold invariance in polytomous items following the Partial Credit Model. Educational and Psychological Measurement, 68, 717–733.
Ponocny, I. (1996). Kombinatorische Modelltests für das Rasch-Modell [Combinatorial goodness-of-fit tests for the Rasch model]. (Unpublished doctoral thesis). University of Vienna, Austria.
Ponocny, I. (2001). Nonparametric goodness-of-fit tests for the Rasch model. Psychometrika, 66, 437–460.
R Core Team. (2013). R: A language and environment for statistical computing [Computer software]. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from www.R-project.org/
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Danish Institute for Educational Research.
Raju, N. S., van der Linden, W. J., & Fleer, P. F. (1995). IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19, 353–368.
Snijders, T. (1991). Enumeration and simulation for 0–1 matrices with given marginals. Psychometrika, 56, 397–417.
Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91, 1292–1306.
Verhelst, N. D. (2008). An efficient MCMC algorithm to sample binary matrices with fixed marginals. Psychometrika, 73, 705–728.
Verhelst, N. D., Hatzinger, R., & Mair, P. (2007). The Rasch sampler. Journal of Statistical Software, 20. Retrieved from www.jstatsoft.org

Volume 11, Issue 2, June 2015

ISSN: 1614-1881, eISSN: 1614-2241

History

Accepted: October 24, 2014
