Abstract
An IRT model for multiple ratings is presented. If it is assumed that the quality of a student performance has a stochastic relationship with the latent variable of interest, it is shown that the ratings of several raters are not conditionally independent given the latent variable. The model gives a full account of this dependence. Several relationships with other models appear to exist. The proposed model is a special case of a nonlinear multilevel model with three levels, but it can also be seen as a linear logistic model with relaxed assumptions (LLRA). Moreover, a linearized version of the model turns out to be a special case of a generalizability model with two crossed measurement facets (items and raters) with a single first-order interaction term (persons and items). Using this linearized model, it is shown how the estimated standard errors of the parameters are affected if the dependence between the ratings is ignored.
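The dependence claim can be illustrated with a small exact calculation. The sketch below is a toy two-stage setup and not the model of the chapter: it assumes a binary performance quality whose probability follows a logistic function of the latent variable, and two raters who each score the performance independently given that quality. The names `joint_and_product`, `hit`, and `false_alarm`, and all numeric values, are hypothetical choices made for illustration.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def joint_and_product(theta, beta, hit, false_alarm):
    """Return P(X1=1, X2=1 | theta) and P(X1=1 | theta) * P(X2=1 | theta).

    Toy assumptions (not taken from the chapter):
      quality Q is binary with P(Q=1 | theta) = logistic(theta - beta);
      hit[r]         = P(rater r scores 1 | Q = 1);
      false_alarm[r] = P(rater r scores 1 | Q = 0);
      the two ratings are independent given Q.
    """
    p_q1 = logistic(theta - beta)
    p_q0 = 1.0 - p_q1
    # Marginalize over the unobserved quality to get the joint probability.
    joint = p_q1 * hit[0] * hit[1] + p_q0 * false_alarm[0] * false_alarm[1]
    # Marginal probability of a positive rating for each rater, given theta.
    marginals = [p_q1 * hit[r] + p_q0 * false_alarm[r] for r in range(2)]
    return joint, marginals[0] * marginals[1]


# Two raters who both respond to quality: ratings are dependent given theta.
joint, product = joint_and_product(
    theta=0.0, beta=0.0, hit=(0.9, 0.8), false_alarm=(0.2, 0.1)
)
cov = joint - product

# Raters who ignore quality: the conditional covariance vanishes.
joint_flat, product_flat = joint_and_product(
    theta=0.0, beta=0.0, hit=(0.5, 0.5), false_alarm=(0.5, 0.5)
)
cov_flat = joint_flat - product_flat
```

Under these assumptions the conditional covariance equals Var(Q | theta) times the product of each rater's `hit - false_alarm` difference, so it is nonzero whenever quality is random given theta and both raters respond to it. Treating the ratings as conditionally independent given theta would ignore this term.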
Copyright information
© 2001 Springer Science+Business Media New York
About this chapter
Cite this chapter
Verhelst, N.D., Verstralen, H.H.F.M. (2001). An IRT Model for Multiple Raters. In: Boomsma, A., van Duijn, M.A.J., Snijders, T.A.B. (eds) Essays on Item Response Theory. Lecture Notes in Statistics, vol 157. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0169-1_5
DOI: https://doi.org/10.1007/978-1-4613-0169-1_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-95147-8
Online ISBN: 978-1-4613-0169-1