Abstract
Purpose. Data from international educational assessments are mostly analyzed using item response theory. The assumption that all items behave the same in all participating countries is often not tenable. The variability of item parameters across countries can be taken into account by treating the item parameters as random effects (De Jong et al. in J. Consum. Res. 34:260–278, 2007; De Jong and Steenkamp in Psychometrika 75:3–32, 2010). However, the complex latent structure of such a model, with latent variables at both the item and the person level, renders maximum likelihood estimation computationally challenging. We describe a variational estimation technique that approximates the likelihood function by a computationally tractable lower bound.
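The lower bound mentioned above can be written out in generic form. The following is a sketch in standard variational notation, not the article's own: y (observed responses), z (all discrete latent variables), \theta (model parameters), q (an arbitrary distribution over z), and F (the bound) are symbols assumed here for illustration.

```latex
\begin{align}
\log p(y;\theta)
  &= \log \sum_{z} p(y, z;\theta)
   \;\ge\; \sum_{z} q(z) \log \frac{p(y, z;\theta)}{q(z)}
   \;=:\; F(q,\theta), \\
\log p(y;\theta) - F(q,\theta)
  &= \mathrm{KL}\bigl(q(z) \,\|\, p(z \mid y;\theta)\bigr) \;\ge\; 0 .
\end{align}
```

The inequality is Jensen's inequality applied to the logarithm. The bound is tight when q equals the exact posterior of the latent variables, and it becomes tractable when q is restricted to a simpler family than that posterior.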
Methods. A mean field approximation to the posterior distribution of the latent variables was used. The update equations were derived for the specific case of discrete random effects and implemented in a Maximization Maximization algorithm (Neal and Hinton in M.I. Jordan (Ed.) Learning in Graphical Models, Kluwer Academic, Dordrecht, pp. 355–368, 1998). Parameter recovery was investigated in a simulation study. The method was also applied to the Progress in International Reading Literacy Study (PIRLS) of 2006.
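As a rough illustration of a mean field approximation with discrete latent variables, the toy sketch below runs coordinate-ascent updates for a factorized distribution q1(z1) q2(z2). It is not the article's model or its update equations: the joint table, the two-variable setup, and all function names are hypothetical, and the model parameters are held fixed, so only the maximization over q is shown.

```python
import numpy as np


def softmax(a):
    """Normalize exp(a) in a numerically stable way."""
    e = np.exp(a - a.max())
    return e / e.sum()


def entropy(q):
    """Shannon entropy of a discrete distribution, skipping zero entries."""
    q = q[q > 0]
    return -(q * np.log(q)).sum()


def lower_bound(log_joint, q1, q2):
    """E_q[log p(y, z1, z2)] plus the entropy of q, for q = q1 * q2."""
    return q1 @ log_joint @ q2 + entropy(q1) + entropy(q2)


def mean_field(log_joint, n_iter=50):
    """Coordinate-ascent mean field for two discrete latent variables.

    log_joint[i, j] = log p(y, z1=i, z2=j) for the observed y
    (a hypothetical toy table, assumed for illustration).
    """
    k1, k2 = log_joint.shape
    q1 = np.full(k1, 1.0 / k1)
    q2 = np.full(k2, 1.0 / k2)
    bounds = []
    for _ in range(n_iter):
        # q1(i) is proportional to exp(E_q2[log p(y, z1=i, z2)])
        q1 = softmax(log_joint @ q2)
        # q2(j) is proportional to exp(E_q1[log p(y, z1, z2=j)])
        q2 = softmax(q1 @ log_joint)
        bounds.append(lower_bound(log_joint, q1, q2))
    return q1, q2, np.array(bounds)


# Toy joint table scaled so that the evidence p(y) equals 0.5.
rng = np.random.default_rng(0)
joint = rng.random((3, 4))
joint = 0.5 * joint / joint.sum()
log_joint = np.log(joint)
log_evidence = np.log(joint.sum())

q1, q2, bounds = mean_field(log_joint)
print("final lower bound:", bounds[-1])
print("log evidence:     ", log_evidence)
```

Each update is the exact maximizer of the bound with the other factor fixed, so the bound never decreases across sweeps and never exceeds the log evidence. In a full Maximization Maximization scheme, these updates would alternate with a step that maximizes the same bound over the model parameters.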
Results. The model parameters were recovered well under all conditions of the simulation study. In the application, the estimated variances of the random item effects showed a high positive correlation with traditional measures for the lack of item invariance across groups.
Conclusions. The mean field approximation and variational methods in general offer a computationally tractable alternative to exact maximum likelihood estimation.
References
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: an application of the EM algorithm. Psychometrika, 46, 443–459.
De Jong, M. G., & Steenkamp, J.-B. E. M. (2010). Finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Psychometrika, 75, 3–32.
De Jong, M. G., Steenkamp, J.-B. E. M., & Fox, J.-P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34, 260–278.
De Jong, M. G., Steenkamp, J.-B. E. M., Fox, J.-P., & Baumgartner, H. (2008). Using item response theory to measure extreme response style in marketing research: a global investigation. Journal of Marketing Research, 45, 104–115.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B, 39, 1–38.
Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modelling based on generalized linear models (2nd ed.). New York: Springer.
Glas, C. A. W., & van der Linden, W. J. (2001). Modelling variability in item parameters in item response models (Tech. Rep.). Enschede: University of Twente.
Humphreys, K., & Titterington, D. M. (2003). Variational approximations for categorical causal models with latent variables. Psychometrika, 68, 391–412.
Janssen, R., Tuerlinckx, F., Meulders, M., & De Boeck, P. (2000). A hierarchical IRT model for criterion-referenced measurement. Journal of Educational and Behavioral Statistics, 25, 285–306.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.
Neal, R. M., & Hinton, G. E. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan (Ed.), Learning in graphical models (pp. 355–368). Dordrecht: Kluwer Academic.
Peterson, C., & Anderson, J. R. (1987). A mean-field theory learning algorithm for neural networks. Complex Systems, 1, 995–1019.
Pinheiro, J. C., & Bates, D. M. (1995). Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12–35.
Rijmen, F. (2009). An efficient EM algorithm for multidimensional IRT models: full information maximum likelihood estimation in limited time (ETS Research Report, RR-09-03).
Rijmen, F. (2010). Formal relations and an empirical comparison between the bi-factor, the testlet, and a second-order multidimensional IRT model. Journal of Educational Measurement, 47, 361–372.
Rijmen, F., Tuerlinckx, F., De Boeck, P., & Kuppens, P. (2003). A nonlinear mixed model framework for item response theory. Psychological Methods, 8, 185–205.
Rijmen, F., Vansteelandt, K., & De Boeck, P. (2008). Latent class models for diary method data: parameter estimation by local computations. Psychometrika, 73, 167–182.
Rose, N., von Davier, M., & Xu, X. (2010). Modeling nonignorable missing data with item response theory (IRT) (ETS Research Report, RR-10-11).
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
Cite this article
Rijmen, F., Jeon, M. Fitting an item response theory model with random item effects across groups by a variational approximation method. Ann Oper Res 206, 647–662 (2013). https://doi.org/10.1007/s10479-012-1181-7
DOI: https://doi.org/10.1007/s10479-012-1181-7