Markov chain estimation for test theory without an answer key

Psychometrika

Abstract

This study develops Markov Chain Monte Carlo (MCMC) estimation theory for the General Condorcet Model (GCM), an item response model for dichotomous response data that does not presume the analyst knows the correct answers to the test (the answer key) a priori. In addition to the answer key, respondent ability, guessing bias, and item difficulty parameters are estimated. With respect to data fit, the study compares the possible GCM formulations using MCMC-based methods for model assessment and model selection. Real data applications and a simulation study show that the GCM can accurately reconstruct the answer key from a small number of respondents.
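
To make the idea of reconstructing an answer key concrete, here is a minimal Python sketch. It is not the authors' estimation procedure. It assumes a simple form of the model in which respondent i knows an item's answer with probability D_i (ability) and otherwise guesses "true" with probability g_i (guessing bias); item difficulty is left out. The function names and the treatment of D_i and g_i as known values are illustrative assumptions. Under them, the posterior probability that an item's correct answer is "true" follows from Bayes' rule, which is the kind of conditional update a Gibbs-style MCMC sampler could apply to each item in turn.

```python
import math


def hit_and_false_alarm(competence, bias):
    """Return P(answer 'true' | key is true) and P(answer 'true' | key is false).

    Assumed model: the respondent knows the answer with probability
    `competence`, otherwise guesses 'true' with probability `bias`.
    """
    hit = competence + (1.0 - competence) * bias
    false_alarm = (1.0 - competence) * bias
    return hit, false_alarm


def key_posterior(responses, competences, biases, prior_true=0.5):
    """Posterior probability that one item's correct answer is 'true' (coded 1).

    `responses` holds each respondent's 0/1 answer to the item. All
    competences and biases must lie strictly between 0 and 1.
    """
    log_true = math.log(prior_true)
    log_false = math.log(1.0 - prior_true)
    for x, d, g in zip(responses, competences, biases):
        hit, false_alarm = hit_and_false_alarm(d, g)
        log_true += math.log(hit if x == 1 else 1.0 - hit)
        log_false += math.log(false_alarm if x == 1 else 1.0 - false_alarm)
    # Normalize on the log scale to avoid underflow with many respondents.
    top = max(log_true, log_false)
    w_true = math.exp(log_true - top)
    w_false = math.exp(log_false - top)
    return w_true / (w_true + w_false)


# Three respondents with ability 0.6 and unbiased guessing; two answer 'true'.
posterior = key_posterior([1, 1, 0], [0.6, 0.6, 0.6], [0.5, 0.5, 0.5])
print(round(posterior, 3))
```

With these illustrative values, two "true" responses out of three give a posterior of 0.8 that the key is "true", which suggests how a handful of reasonably able respondents can pin down an answer. A full sampler would also update abilities, biases, and item difficulties rather than treating them as known.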

Author information

Correspondence to George Karabatsos.

Additional information

This study was supported in part by Spencer Foundation grant SG2001000020, George Karabatsos, Principal Investigator, and also in part by NSF Renewal Grant SES-0001550 to A.K. Romney and W.H. Batchelder, Co-Principal Investigators. The second author acknowledges the kind support of the Santa Fe Institute, where he worked on aspects of this paper as a Visiting Professor in the fall of 2001. Both authors appreciate the detailed comments offered by the Editor and two referees on an earlier version of the manuscript.

About this article

Cite this article

Karabatsos, G., Batchelder, W.H. Markov chain estimation for test theory without an answer key. Psychometrika 68, 373–389 (2003). https://doi.org/10.1007/BF02294733

  • DOI: https://doi.org/10.1007/BF02294733
