Modification indices for the 2-PL and the nominal response model

Psychometrika

Abstract

In this paper, it is shown that various violations of the 2-PL model and the nominal response model can be evaluated using the Lagrange multiplier test or the equivalent efficient score test. The tests presented here focus on violation of local stochastic independence and insufficient capture of the form of the item characteristic curves. Primarily, the tests are item-oriented diagnostic tools, but taken together, they also serve the purpose of evaluation of global model fit. A useful feature of Lagrange multiplier statistics is that they are evaluated using maximum likelihood estimates of the null-model only, that is, the parameters of alternative models need not be estimated. As numerical examples, an application to real data and some power studies are presented.
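The property that only the null model needs to be fitted can be illustrated with a deliberately simple case. The sketch below is not one of the 2-PL or nominal response statistics from the paper. It is a generic Rao score (Lagrange multiplier) test for adding one covariate `z` to an intercept-only logistic model, and it assumes no latent ability is conditioned on. The function name `score_test_added_covariate` and the variable names are hypothetical, chosen only for this illustration.

```python
import math


def score_test_added_covariate(y, z):
    """Rao score (LM) test of H0: delta = 0 in logit P(y=1) = beta + delta * z.

    Only the null model (intercept only) is estimated; delta is never fitted.
    Illustrative sketch, not the statistic derived in the paper.
    """
    n = len(y)
    p_hat = sum(y) / n                 # ML estimate under the null model
    w = p_hat * (1.0 - p_hat)          # Bernoulli variance at the null estimate

    # Score for delta evaluated at (beta_hat, delta = 0).
    score = sum(zi * (yi - p_hat) for yi, zi in zip(y, z))

    # Information for delta after partialling out the intercept
    # (Schur complement of the 2x2 information matrix).
    sum_z = sum(z)
    sum_zz = sum(zi * zi for zi in z)
    info = w * (sum_zz - sum_z * sum_z / n)

    lm = score * score / info
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


if __name__ == "__main__":
    # Hypothetical data: y = response to one item, z = response to another.
    z = [1] * 40 + [0] * 60
    y = [1] * 30 + [0] * 10 + [1] * 20 + [0] * 40
    lm, p = score_test_added_covariate(y, z)
    print(f"LM = {lm:.3f}, p = {p:.2g}")
```

With a binary `z`, this statistic coincides with the Pearson chi-square for the 2x2 table of `y` by `z`. A test of local independence in an IRT model would additionally condition on ability, which this sketch omits.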

Cite this article

Glas, C.A.W. Modification indices for the 2-PL and the nominal response model. Psychometrika 64, 273–294 (1999). https://doi.org/10.1007/BF02294296

  • DOI: https://doi.org/10.1007/BF02294296
