An IRT Model for Multiple Raters

Essays on Item Response Theory

Part of the book series: Lecture Notes in Statistics ((LNS,volume 157))

Abstract

An IRT model for multiple ratings is presented. If the quality of a student performance is assumed to have a stochastic relationship with the latent variable of interest, it is shown that the ratings of several raters are not conditionally independent given the latent variable. The model gives a full account of this dependence. The model is related to several others: it is a special case of a nonlinear multilevel model with three levels, but it can also be seen as a linear logistic model with relaxed assumptions (LLRA). Moreover, a linearized version of the model turns out to be a special case of a generalizability model with two crossed measurement facets (items and raters) and a single first-order interaction term (persons and items). Using this linearized model, it is shown how the estimated standard errors of the parameters are affected if the dependence between the ratings is ignored.
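The dependence claim in the abstract can be illustrated with a small numerical sketch. The abstract does not state the model's equations, so everything below is an assumption made for illustration only and is not the chapter's own specification: the quality of a performance is taken to be q = theta + e with e normally distributed with standard deviation sigma, and each of two raters gives a positive rating with probability logistic(q - beta_r). The function name and parameters are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_covariance(theta, beta1, beta2, sigma, n_grid=2001, width=8.0):
    """Cov(X1, X2 | theta) for two raters judging the same performance.

    Assumed setup (illustrative, not taken from the chapter):
      quality q = theta + e, with e ~ N(0, sigma^2);
      rater r rates positively with probability logistic(q - beta_r);
      given q, the two ratings are independent.
    The expectation over e is approximated on an evenly spaced grid.
    """
    if sigma == 0.0:
        # Quality is a deterministic function of theta, so the
        # ratings are conditionally independent given theta.
        return 0.0
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    total_w = p1 = p2 = p12 = 0.0
    for i in range(n_grid):
        e = lo + i * step
        w = math.exp(-0.5 * (e / sigma) ** 2)  # unnormalized normal density
        a = logistic(theta + e - beta1)
        b = logistic(theta + e - beta2)
        total_w += w
        p1 += w * a
        p2 += w * b
        p12 += w * a * b  # independence of the ratings given q
    p1, p2, p12 = p1 / total_w, p2 / total_w, p12 / total_w
    return p12 - p1 * p2


if __name__ == "__main__":
    print(conditional_covariance(0.0, 0.0, 0.0, sigma=1.0))
    print(conditional_covariance(0.0, 0.0, 0.0, sigma=0.0))
```

Under these assumptions the covariance is positive whenever sigma is greater than zero, because both ratings respond to the same random quality, and it vanishes when quality is a deterministic function of the latent variable.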

Copyright information

© 2001 Springer Science+Business Media New York

About this chapter

Cite this chapter

Verhelst, N.D., Verstralen, H.H.F.M. (2001). An IRT Model for Multiple Raters. In: Boomsma, A., van Duijn, M.A.J., Snijders, T.A.B. (eds) Essays on Item Response Theory. Lecture Notes in Statistics, vol 157. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0169-1_5

  • DOI: https://doi.org/10.1007/978-1-4613-0169-1_5

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95147-8

  • Online ISBN: 978-1-4613-0169-1

  • eBook Packages: Springer Book Archive
