Sample Size and Chi-Squared Test of Fit—A Comparison Between a Random Sample Approach and a Chi-Square Value Adjustment Method Using Swedish Adolescent Data

  • Conference paper
  • Pacific Rim Objective Measurement Symposium (PROMS) 2014 Conference Proceedings

Abstract

Background: Significance tests are sensitive to sample size, and Chi-Squared statistics are no exception. Nevertheless, Chi-Squared statistics are commonly used to test the fit of measurement models, so analysts working with very large (or very small) samples need to pay particular attention to this sensitivity. Several approaches to handling a large sample size in test-of-fit analysis have been developed. One strategy is to adjust the fit statistic to correspond to an equivalent sample of a different size; this strategy has been implemented in the RUMM2030 software. Another strategy is to adopt a random sample approach.

Aims: The RUMM2030 Chi-Square value adjustment facility has been available for a long time, but few studies describe the empirical consequences of adjusting the sample to correspond to a smaller effective sample size in the statistical analysis of fit. Alternatively, a random sample approach could be adopted to handle the large sample size problem. The purpose of this study was to analyze and compare these two strategies as test-of-fit approximations, using Swedish adolescent data.

Sample: The analysis is based on the survey Young in Värmland, a paper-and-pencil survey conducted recurrently since 1988 and targeting all adolescents in school year 9 residing in the county of Värmland, Sweden. So far, more than 20,000 individuals have participated in the survey. In the analysis presented here, seven items on the adolescents' experiences of the school environment were analyzed, covering 21,088 individuals in total.

Methods: For the purposes of this study, the original sample size was adjusted to several different effective sample sizes using the RUMM2030 adjustment function in the test-of-fit analysis. In addition, 10 random samples of each sample size were drawn from the original sample, and averaged Chi-Square values were calculated. The Chi-Square values obtained using the two strategies were compared.

Results: Given the original sample of 21,000, when adjusting to samples of 5,000 or larger, the RUMM2030 adjustment facility works as well as a random sample approach. In contrast, when adjusting to smaller samples, the adjustment function is less effective in approximating the Chi-Square value for an actual random sample of the relevant size. Hence, fit is exaggerated and misfit underestimated when using the adjustment function; this holds in particular for fitting items but not for misfitting items.

Conclusion: Although the inferences based on p values may be the same despite large Chi-Square value differences between the two approaches, the danger of using fit statistics mechanically cannot be stressed enough. Neither the adjustment function nor the random sample approach is sufficient for evaluating model fit; instead, several complementary methods should be used.
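The two strategies compared in the abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure or the RUMM2030 implementation. The function names are hypothetical, a plain Pearson Chi-Square against expected category proportions stands in for the Rasch item-fit statistic, and the adjustment is assumed to be a linear rescaling of the Chi-Square value by the ratio of target to original sample size, which the abstract does not confirm.

```python
import random
from collections import Counter
from statistics import mean


def pearson_chi_square(responses, expected_props):
    """Pearson Chi-Square of observed category counts against expected proportions.

    Stand-in statistic for illustration only; the paper uses Rasch item-fit
    Chi-Square values computed by RUMM2030.
    """
    n = len(responses)
    counts = Counter(responses)
    return sum(
        (counts.get(category, 0) - n * p) ** 2 / (n * p)
        for category, p in expected_props.items()
    )


def adjusted_chi_square(chi_square, n_original, n_target):
    """Rescale a Chi-Square value to a different effective sample size.

    ASSUMPTION: a linear rescaling by n_target / n_original. The abstract does
    not state the formula RUMM2030 applies.
    """
    return chi_square * n_target / n_original


def mean_chi_square_random_samples(responses, n_target, expected_props,
                                   n_draws=10, seed=0):
    """Average the Chi-Square value over n_draws random samples of size n_target.

    Samples are drawn without replacement from the full data set, mirroring the
    "10 random samples for each sample size" described in the Methods.
    """
    if n_target > len(responses):
        raise ValueError("n_target cannot exceed the original sample size")
    rng = random.Random(seed)
    values = [
        pearson_chi_square(rng.sample(responses, n_target), expected_props)
        for _ in range(n_draws)
    ]
    return mean(values)
```

Under the assumed linear rescaling, a Chi-Square value close to its degrees of freedom in the full sample would be scaled well below them for a small target size, whereas a random sample of that size would still be expected to give a value near the degrees of freedom. This would be consistent with the reported exaggeration of fit for fitting items, but it is an interpretation of the sketch, not a finding stated in the abstract.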

Author information

Corresponding author

Correspondence to Daniel Bergh.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bergh, D. (2015). Sample Size and Chi-Squared Test of Fit—A Comparison Between a Random Sample Approach and a Chi-Square Value Adjustment Method Using Swedish Adolescent Data. In: Zhang, Q., Yang, H. (eds) Pacific Rim Objective Measurement Symposium (PROMS) 2014 Conference Proceedings. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47490-7_15
