Abstract
Item response theory (IRT) models are now in common use for the analysis of dichotomous item responses. This paper examines the sampling theory foundations for statistical inference in these models. The discussion includes: some history on the “stochastic subject” versus the random sampling interpretations of the probability in IRT models; the relationship between three versions of maximum likelihood estimation for IRT models; estimating θ versus estimating θ-predictors; IRT models and loglinear models; the identifiability of IRT models; and the role of robustness and Bayesian statistics from the sampling theory perspective.
Additional information
A presidential address can serve many different functions. This one is a report of investigations I started at least ten years ago to understand what IRT was all about. It is a decidedly one-sided view, but I hope it stimulates controversy and further research. I have profited from discussions of this material with many people including: Brian Junker, Charles Lewis, Nicholas Longford, Robert Mislevy, Ivo Molenaar, Donald Rock, Donald Rubin, Lynne Steinberg, Martha Stocking, William Stout, Dorothy Thayer, David Thissen, Wim van der Linden, Howard Wainer, and Marilyn Wingersky. Of course, none of them is responsible for any errors or misstatements in this paper. The research was supported in part by the Cognitive Science Program, Office of Naval Research under Contract No. N00014-87-K-0730 and by the Program Statistics Research Project of Educational Testing Service.
Cite this article
Holland, P.W. On the sampling theory foundations of item response theory models. Psychometrika 55, 577–601 (1990). https://doi.org/10.1007/BF02294609
DOI: https://doi.org/10.1007/BF02294609