Abstract
This study empirically examines the effects of data screening practices in survey research, focusing on how screening techniques affect data and statistical analyses. It also serves as an initial attempt to estimate descriptive statistics and graphically display the distributions of popular screening techniques. Data were obtained from an online sample of participants who completed demographic items and measures of character strengths (N = 307). Screening indices demonstrate minimal overlap and differ in the number of participants flagged. Existing cutoff scores for most screening techniques seem appropriate, but cutoff values for consistency-based indices may be too liberal. Screens differ in the extent to which they impact survey results. The use of screening techniques can impact inter-item correlations, inter-scale correlations, reliability estimates, and statistical results. While data screening can improve the quality and trustworthiness of data, screening techniques are not interchangeable. Researchers and practitioners should be aware of the differences between data screening techniques and apply appropriate screens for their survey characteristics and study design. Low-impact direct and unobtrusive screens such as self-report indicators, bogus items, instructed items, longstring, individual response variability, and response time are relatively simple to administer and analyze. The fact that data screening can influence the statistical results of a study demonstrates that low-quality data can distort hypothesis testing in organizational research and practice. We recommend analyzing results both before and after screens have been applied.
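Two of the unobtrusive screens named above, longstring and individual response variability (IRV), can be illustrated with a minimal sketch. It assumes longstring is the longest run of identical consecutive responses and IRV is the within-person standard deviation across items. The function names, the example response vectors, and both cutoff values are hypothetical and are not taken from the study; the sample (rather than population) standard deviation is also an assumption.

```python
from itertools import groupby
from statistics import stdev


def longstring(responses):
    """Length of the longest run of identical consecutive responses."""
    # groupby yields one group per run of equal adjacent values.
    return max(len(list(run)) for _, run in groupby(responses))


def irv(responses):
    """Individual response variability: within-person SD across items."""
    # Sample SD is an assumption; needs at least two responses.
    return stdev(responses)


def flag(responses, max_run=8, min_irv=0.5):
    """Flag a respondent on either index.

    Both thresholds are hypothetical placeholders, not cutoffs
    reported in the study.
    """
    return longstring(responses) > max_run or irv(responses) < min_irv


# Hypothetical responses to ten 5-point items.
careful = [4, 2, 5, 3, 4, 1, 5, 2, 3, 4]
straightliner = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

for name, resp in [("careful", careful), ("straightliner", straightliner)]:
    print(name, longstring(resp), round(irv(resp), 2), flag(resp))
```

In line with the recommendation above, such flags would be used to rerun the same analysis with and without the flagged respondents rather than to silently drop cases.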
Notes
It is noteworthy that subsequent work has failed to replicate this six-factor structure but has also failed to consistently support an alternative factor structure using confirmatory analysis (Vanhove et al., 2016).
Using listwise deletion did not substantially change the results of the analysis. For example, the percentage of participants flagged differed by 2.7% or less for each screening technique and correlated higher than 0.99 with percentages computed using the data imputation technique.
References
Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16, 270–301.
Allison, P. D. (2009). Missing data. In R. E. Millsap & A. Maydeu-Olivares (Eds.), The SAGE handbook of quantitative methods in psychology (pp. 72–90). Los Angeles: Sage.
Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27, 17–21.
Anscombe, F. J., & Guttman, I. (1960). Rejection of outliers. Technometrics, 2, 123–147.
Bagby, R. M., Gillis, J. R., & Rogers, R. (1991). Effectiveness of the Millon clinical multiaxial inventory validity index in the detection of random responding. Psychological Assessment, 3, 285–287.
Behrend, T. S., Sharek, D. J., Meade, A. W., & Wiebe, E. N. (2011). The viability of crowdsourcing for survey research. Behavior Research Methods, 43, 800–813.
Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software.
Berry, D. T. R., Wetter, M. W., Baer, R. A., Larsen, L., Clark, C., & Monroe, K. (1992). MMPI-2 random responding indices: Validation using a self-report methodology. Psychological Assessment, 4, 340–345.
Berry, D. T. R., Wetter, M. W., Baer, R. A., Widiger, T. A., Sumpter, J. C., Reynolds, S. K., & Hallam, R. A. (1991). Detection of random responding on the MMPI-2: Utility of F, back F, and VRIN scales. Psychological Assessment, 3, 418–423.
Bowling, N. A., Huang, J. L., Bragg, C. B., Khazon, S., Liu, M., & Blackmore, C. E. (2016). Who cares and who is careless? Insufficient effort responding as a reflection of respondent personality. Journal of Personality and Social Psychology, 111, 218–229.
Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296–322.
Bruehl, S., Lofland, K. R., Sherman, J. J., & Carlson, C. R. (1998). The variable responding scale for detection of random responding on the multidimensional pain inventory. Psychological Assessment, 10, 3–9.
Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods, 46, 112–130.
Chen, Z., Watson, P. J., Biderman, M., & Ghorbani, N. (2016). Investigating the properties of the general factor (M) in bifactor models applied to the big five or HEXACO data in terms of method or meaning. Imagination, Cognition, and Personality: Consciousness in Theory, Research, and Clinical Practice, 35, 216–243.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.
Clark, M. E., Gironda, R. J., & Young, R. W. (2003). Detection of back random responding: Effectiveness of MMPI-2 and personality assessment inventory validity indices. Psychological Assessment, 15, 223–234.
Costa, P. T., & McCrae, R. R. (1997). Stability and change in personality assessment: The revised NEO personality inventory in the year 2000. Journal of Personality Assessment, 68, 86–94.
Costa, P. T., & McCrae, R. R. (2008). The revised NEO personality inventory (NEO-PI-R). In G. J. Boyle, G. Matthews, & D. H. Saklofske (Eds.), The SAGE handbook of personality theory and assessment (pp. 179–198). London: SAGE.
Couch, A., & Keniston, K. (1960). Yeasayers and naysayers: Agreeing response set as a personality variable. Journal of Abnormal and Social Psychology, 60, 151–174.
Credé, M. (2010). Random responding as a threat to the validity of effect size estimates in correlation research. Educational and Psychological Measurement, 70, 596–612.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4–19.
De Ayala, R. J., & Sava-Bolesta, M. (1999). Item parameter recovery for the nominal response model. Applied Psychological Measurement, 23, 3–19.
DeMars, C. E. (2003). Sample size and the recovery of nominal response model item parameters. Applied Psychological Measurement, 27, 275–288.
DeSimone, J. A. (2015). New techniques for evaluating temporal consistency. Organizational Research Methods, 18, 133–152.
DeSimone, J. A., Harms, P. D., & DeSimone, A. J. (2015). Best practice recommendations for data screening. Journal of Organizational Behavior, 36, 171–181.
Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Nels, T. (in press). Intra-individual response variability as an indicator of insufficient effort responding: Comparison to other indicators and relationships with individual differences. Journal of Business and Psychology. Available from https://link.springer.com/article/10.1007/s10869-016-9479-0. Accessed 11 April 2017.
Edwards, A. L. (1957). The social desirability variable in personality assessment and research. New York: Dryden.
Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in personality measurement: Issues of applicant comparison and construct validity. Journal of Applied Psychology, 84, 155–166.
Frederiksen, N. (1965). Response set scores as predictors of performance. Personnel Psychology, 18, 225–244.
Gallen, R. T., & Berry, D. T. R. (1996). Detection of random responding in MMPI-2 protocols. Assessment, 3, 171–178.
Gallen, R. T., & Berry, D. T. R. (1997). Partially random MMPI-2 protocols: When are they interpretable? Assessment, 4, 61–68.
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96.
Harms, P. D., & DeSimone, J. A. (2015). Caution! MTurk workers ahead—Fines doubled. Industrial and Organizational Psychology, 8, 183–190.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581–595.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Huang, J. L., Bowling, N. A., Liu, M., & Li, Y. (2015). Detecting insufficient effort responding with an infrequency scale: Evaluating validity and participant reactions. Journal of Business and Psychology, 30, 299–311.
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27, 99–114.
Huang, J. L., Liu, M., & Bowling, N. A. (2015). Insufficient effort responding: Examining an insidious confound in survey data. Journal of Applied Psychology, 100, 828–845.
Jackson, D. N. (1976). The appraisal of personal reliability. Paper presented at the meetings of the Society of Multivariate Experimental Psychology, University Park, PA.
Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39, 103–129.
Kurtz, J. E., & Parrish, C. L. (2001). Semantic response consistency and protocol validity in structured personality assessment: The case of the NEO-PI-R. Journal of Personality Assessment, 76, 315–332.
Leary, M. R., & Kowalski, R. M. (1990). Impression management: A literature review and two-component model. Psychological Bulletin, 107, 34–47.
Littman-Ovadia, H., & Lavy, S. (2012). Character strengths in Israel. European Journal of Psychological Assessment, 28, 41–50.
Liu, M., Bowling, N. A., Huang, J. L., & Kent, T. A. (2013). Insufficient effort responding to surveys as a threat to validity: The perceptions and practices of SIOP members. The Industrial-Organizational Psychologist, 51, 32–38.
Liu, M., Huang, J. L., Bowling, N. A., & Bragg, C. (2013). Attenuating effect of insufficient effort responding on relationships between measures. Paper presented at the 28th annual conference for the Society for Industrial and Organizational Psychology, Houston, TX.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49–55.
Maniaci, M. R., & Rogge, R. D. (2014). Caring about carelessness: Participant inattention and its effects on research. Journal of Research in Personality, 48, 61–83.
McCrae, R. R., Costa, P. T., Grant, W., Barefoot, J. C., Siegler, I. C., & Williams Jr., R. B. (1989). A caution on the use of the MMPI K-correction in research on psychosomatic medicine. Psychosomatic Medicine, 51, 58–65.
McGrath, R. E., Mitchell, M. K., Kim, B. H., & Hough, L. (2010). Evidence for response bias as a source of error variance in applied assessment. Psychological Bulletin, 136, 450–470.
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17, 437–455.
Meijer, R. R., Molenaar, I. W., & Sijtsma, K. (1994). Influence of test and person characteristics on nonparametric appropriateness measurement. Applied Psychological Measurement, 18, 111–120.
Messick, S. (1960). Dimensions of social desirability. Journal of Consulting Psychology, 24, 279–287.
Nichols, D. S., Greene, R. L., & Schmolck, P. (1989). Criteria for assessing inconsistent patterns of item endorsement on the MMPI: Rationale, development, and empirical trials. Journal of Clinical Psychology, 45, 239–250.
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in web-based questionnaires: Which method to use? Journal of Research in Personality, 63, 1–11.
O’Rourke, T. W. (2000). Techniques for screening and cleaning data for analysis. American Journal of Health Studies, 16, 205–207.
Orne, M. T. (1962). On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. American Psychologist, 17, 776–783.
Paolo, A. M., & Ryan, J. J. (1992). Detection of random response sets on the MMPI-2. Psychotherapy in Private Practice, 4, 1–8.
Peterson, C., & Seligman, M. E. P. (2004). Character strengths and virtues: A handbook and classification. New York, NY: Oxford University Press.
Pinsoneault, T. B. (2007). Detecting random, partially random, and nonrandom Minnesota multiphasic personality inventory-2 protocols. Psychological Assessment, 19, 159–164.
Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634–644.
Ruch, W., Proyer, R. T., Harzer, C., Park, N., Peterson, C., & Seligman, M. E. P. (2010). Values in action inventory of strengths (VIA–IS): Adaptation of the German version and the development of a peer-rating form. Journal of Individual Differences, 31, 138–149.
Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research validity scales for the NEO-PI-R: Development and initial validation. Journal of Personality Assessment, 68, 127–138.
Schmitt, N., & Stults, D. M. (1985). Factors defined by negatively keyed items: The result of careless respondents? Applied Psychological Measurement, 9, 367–373.
Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54, 93–105.
Snell, A. F., Sydell, E. J., & Lueke, S. B. (1999). Towards a theory of applicant faking: Integrating studies of deception. Human Resource Management Review, 9, 219–242.
Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3, 271–295.
Stevens, J. P. (1984). Outliers and influential data points in regression analysis. Psychological Bulletin, 95, 334–344.
Vanhove, A. J., Harms, P. D., & DeSimone, J. A. (2016). The abbreviated character strengths test (ACST): A preliminary assessment of test validity. Journal of Personality Assessment, 98, 536–544.
Wetter, M. W., Baer, R. A., Berry, D. T. R., Smith, G. T., & Larsen, L. H. (1992). Sensitivity of MMPI-2 validity scales to random responding and malingering. Psychological Assessment, 4, 369–374.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54, 594–604.
Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28, 189–194.
Yu, C. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Unpublished doctoral dissertation. Retrieved from http://statmodel2.com/download/Yudissertation.pdf on February 18, 2014.
Zickar, M. J., & Robie, C. (1999). Modeling faking good on personality items: An item-level analysis. Journal of Applied Psychology, 84, 551–563.
About this article
Cite this article
DeSimone, J.A., Harms, P.D. Dirty Data: The Effects of Screening Respondents Who Provide Low-Quality Data in Survey Research. J Bus Psychol 33, 559–577 (2018). https://doi.org/10.1007/s10869-017-9514-9
DOI: https://doi.org/10.1007/s10869-017-9514-9