Original Article

Much ado About Nothing, or Much to do About Something?

Effects of Scale Shortening on Criterion Validity and Mean Differences

Moritz Heene, Stella Bollmann, and Markus Bühner

Department of Psychology and Educational Sciences, LMU Munich, Germany

Published Online: January 01, 2014

https://doi.org/10.1027/1614-0001/a000146

Abstract

Short scales have become widely used in settings where participant time is limited and assessment would otherwise be impossible. Using data simulated at the population level, we investigate whether scale shortening affects the desired invariance of criterion-related validities as well as differences between the estimated expected values of populations. We conclude that, under a unidimensional model, decreasing the number of items affects neither criterion validity nor the difference in expected values between two populations. We discuss, however, that scale shortening can cause problems at the construct level and, more important in practice, at the level of individual scores.

Volume 35, Issue 4, November 2014

ISSN: 1614-0001, eISSN: 2151-2299

History

Accepted: August 5, 2014
