Abstract
Researchers studying item response models are often interested in how local dependency affects the validity of conclusions drawn from statistical inference. This paper focuses on detecting local dependency. We provide a framework for viewing local dependency among dichotomous and polytomous items that are clustered by design, and present a testing procedure that lets researchers identify the individual item pairs that exhibit local dependency while controlling the false positive rate. Simulation results indicate that the proposed method is effective. A discussion of its relation to existing methods is also provided.
Additional information
The research was supported under the National Assessment of Educational Progress (Grant No. R902B990007) administered by the National Center for Education Statistics, U.S. Department of Education. This work was started when the author was at the Division of Statistics and Psychometrics at the Educational Testing Service. I thank Juliet Shaffer for her comments on the multiple testing procedure. I also thank three anonymous referees and the Associate Editor for suggestions that greatly improved the presentation of the manuscript.
About this article
Cite this article
Ip, E.Hs. Testing for local dependency in dichotomous and polytomous item response models. Psychometrika 66, 109–132 (2001). https://doi.org/10.1007/BF02295736
DOI: https://doi.org/10.1007/BF02295736