Classical Test Theory as a first-order Item Response Theory: Application to true-score prediction from a possibly nonparallel test

Holland, Paul W.; Hoskens, Machteld

doi:10.1007/BF02296657

Published: March 2003

Psychometrika, Volume 68, pages 123–149 (2003)

Abstract

We give an account of Classical Test Theory (CTT) in terms of the more fundamental ideas of Item Response Theory (IRT). This approach views classical test theory as a very general version of IRT, and the commonly used IRT models as detailed elaborations of CTT for special purposes. We then use this approach to CTT to derive some general results regarding the prediction of the true score of a test from an observed score on that test as well as from an observed score on a different test. This leads us to a new view of linking tests that were not developed to be linked to each other. In addition, we propose true-score prediction analogues of the Dorans and Holland measures of the population sensitivity of test linking functions. We illustrate the accuracy of the first-order theory using simulated data from the Rasch model, and illustrate the effect of population differences using a set of real data.
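
To make the idea of true-score prediction from an observed score concrete, the following is a minimal sketch, not the authors' first-order theory. It simulates Rasch data and applies the classical linear (Kelley, 1923) predictor, which shrinks each observed score toward the group mean by the test reliability. The standard normal ability distribution, the item difficulties, the sample size, and all function names are illustrative assumptions. The reliability is computed from the simulated true scores, which is possible only in a simulation.

```python
import math
import random
import statistics


def simulate_rasch(n_persons, difficulties, seed=0):
    """Simulate number-correct scores under the Rasch model.

    Returns (observed, true), where the true score is the expected
    number-correct score given ability. Abilities are drawn from N(0, 1),
    which is an assumption made only for this illustration.
    """
    rng = random.Random(seed)
    observed, true = [], []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        # Rasch item response function: P(correct) = logistic(theta - b)
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        true.append(sum(probs))
        observed.append(sum(1 for p in probs if rng.random() < p))
    return observed, true


def kelley_predict(observed, reliability):
    """Classical linear predictor: mean + reliability * (x - mean)."""
    mean_x = statistics.fmean(observed)
    return [mean_x + reliability * (x - mean_x) for x in observed]


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return statistics.fmean([(p - t) ** 2 for p, t in zip(pred, target)])


if __name__ == "__main__":
    # Hypothetical 21-item test with difficulties evenly spaced on [-2, 2].
    difficulties = [-2.0 + 0.2 * j for j in range(21)]
    obs, tru = simulate_rasch(5000, difficulties, seed=0)

    # Reliability as the ratio of true-score to observed-score variance.
    rho = statistics.pvariance(tru) / statistics.pvariance(obs)
    predicted = kelley_predict(obs, rho)

    print(f"reliability estimate: {rho:.3f}")
    print(f"MSE using raw observed score: {mse(obs, tru):.3f}")
    print(f"MSE using Kelley prediction:  {mse(predicted, tru):.3f}")
```

In operational settings the true scores are unobserved, so the reliability would have to be estimated from item-level data rather than computed as above.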

References

Bock, R.D., & Mislevy, R.J. (1982). Adaptive EAP estimation in a microcomputer environment. Applied Psychological Measurement, 6, 431–444.
Dorans, N., & Holland, P.W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281–306.
Feuer, M.J., Holland, P.W., Green, B.F., Bertenthal, M.W., & Hemphill, F.C. (1999). Uncommon measures. Washington, DC: National Academy Press.
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman and Hall.
Holland, P.W. (1990). On the sampling theory foundations of item response theory models. Psychometrika, 55, 577–601.
Kelley, T.L. (1923). Statistical methods. New York, NY: Macmillan.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Mislevy, R.J., Beaton, A.E., Kaplan, B., & Sheehan, K.M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133–161.
Pashley, P.J., & Phillips, G.W. (1993). Toward world-class standards: A research study linking national and international assessments. Center for Educational Progress. Princeton, NJ: Educational Testing Service.
Wainer, H. et al. (2001). Augmented scores—“Borrowing strength” to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds.), Test Scoring (pp. 343–387). Mahwah, NJ: Erlbaum.
Williams, V. et al. (1995). Projecting to the NAEP scale: Results from the North Carolina End-of-Grade testing program (Tech. Rep. #34). Chapel Hill, NC: National Institute of Statistical Science, University of North Carolina, Chapel Hill.
Wu, M., Adams, R., & Wilson, M. (1997). ConQuest [Computer program]. Melbourne, Australia: Australian Council for Educational Research.

Author information

Authors and Affiliations

Educational Testing Service, Rosedale Road 12-T, Princeton, NJ 08541
Paul W. Holland
CTB/McGraw-Hill, USA
Machteld Hoskens

Corresponding author

Correspondence to Paul W. Holland.

Additional information

This research is collaborative in every respect and the order of authorship is alphabetical. It was begun when both authors were on the faculty of the Graduate School of Education at the University of California, Berkeley.

We would like to thank Neil Dorans, Skip Livingston, and two anonymous referees for many suggestions that have greatly improved this paper.

About this article

Cite this article

Holland, P.W., Hoskens, M. Classical Test Theory as a first-order Item Response Theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika 68, 123–149 (2003). https://doi.org/10.1007/BF02296657

Received: 24 July 2001
Revised: 03 June 2002
Issue Date: March 2003
DOI: https://doi.org/10.1007/BF02296657
