On the misuse of manifest variables in the detection of measurement bias

Psychometrika

Abstract

Measurement invariance (lack of bias) of a manifest variable Y with respect to a latent variable W is defined as invariance of the conditional distribution of Y given W over selected subpopulations. Invariance is commonly assessed by studying subpopulation differences in the conditional distribution of Y given a manifest variable Z, chosen to substitute for W. A unified treatment of conditions that may allow the detection of measurement bias using statistical procedures involving only observed or manifest variables is presented. Theorems are provided that give conditions for measurement invariance, and for invariance of the conditional distribution of Y given Z. Additional theorems and examples explore the Bayes sufficiency of Z, stochastic ordering in W, local independence of Y and Z, exponential families, and the reliability of Z. It is shown that when Bayes sufficiency of Z fails, the two forms of invariance will often not be equivalent in practice. Bayes sufficiency holds under Rasch model assumptions, and in long tests under certain conditions. It is concluded that bias detection procedures that rely strictly on observed variables are not in general diagnostic of measurement bias, or the lack of bias.
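
The gap between the two forms of invariance can be illustrated with a small exact calculation. The sketch below is not taken from the article: the binary latent variable W, the two groups, every probability value, and the function name p_y1_given_z are illustrative assumptions. Y is constructed to be measurement invariant (the same P(Y | W) in both groups), Z is an imperfect manifest proxy for W, and Y and Z are assumed locally independent given W.

```python
def p_y1_given_z(p_w1, p_y1_given_w, p_z1_given_w, z=1):
    """Exact P(Y=1 | Z=z) within one group.

    Assumes a binary latent W and local independence of Y and Z given W.
    All inputs are hypothetical values chosen for illustration.
    """
    # Group-specific distribution of the latent variable W.
    prior = {0: 1.0 - p_w1, 1: p_w1}
    # Probability of the observed Z value at each latent level.
    like = {
        w: p_z1_given_w[w] if z == 1 else 1.0 - p_z1_given_w[w]
        for w in (0, 1)
    }
    # Posterior P(W=w | Z=z) by Bayes' rule.
    norm = sum(prior[w] * like[w] for w in (0, 1))
    post = {w: prior[w] * like[w] / norm for w in (0, 1)}
    # Mix the invariant P(Y=1 | W=w) over the group-specific posterior.
    return sum(p_y1_given_w[w] * post[w] for w in (0, 1))


# Identical in both groups: Y is measurement invariant with respect to W,
# and Z relates to W in the same way in both groups.
P_Y1_GIVEN_W = {0: 0.2, 1: 0.8}
P_Z1_GIVEN_W = {0: 0.25, 1: 0.75}

# The groups differ only in the distribution of W.
p_a = p_y1_given_z(0.3, P_Y1_GIVEN_W, P_Z1_GIVEN_W, z=1)
p_b = p_y1_given_z(0.7, P_Y1_GIVEN_W, P_Z1_GIVEN_W, z=1)

print(f"Group A: P(Y=1 | Z=1) = {p_a:.4f}")
print(f"Group B: P(Y=1 | Z=1) = {p_b:.4f}")
```

In this toy setup the conditional distribution of Y given Z differs between the groups even though the conditional distribution of Y given W does not, so a procedure that conditions only on Z would signal bias where none exists by the latent-variable definition. It is a single constructed case, not a demonstration of the article's theorems.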

Additional information

Preparation of this article was supported in part by PSC-CUNY grant #661282 to Roger E. Millsap.

Cite this article

Meredith, W., Millsap, R.E. On the misuse of manifest variables in the detection of measurement bias. Psychometrika 57, 289–311 (1992). https://doi.org/10.1007/BF02294510

  • DOI: https://doi.org/10.1007/BF02294510
