Dirty Data: The Effects of Screening Respondents Who Provide Low-Quality Data in Survey Research

DeSimone, Justin A.; Harms, P. D.

doi:10.1007/s10869-017-9514-9

Original Paper
Published: 02 September 2017

Volume 33, pages 559–577, (2018)

Journal of Business and Psychology

Justin A. DeSimone¹ & P. D. Harms¹

Abstract

The purpose of this study is to empirically address questions pertaining to the effects of data screening practices in survey research. This study addresses questions about the impact of screening techniques on data and statistical analyses. It also serves as an initial attempt to estimate descriptive statistics and graphically display the distributions of popular screening techniques. Data were obtained from an online sample who completed demographic items and measures of character strengths (N = 307). Screening indices demonstrate minimal overlap and differ in the number of participants flagged. Existing cutoff scores for most screening techniques seem appropriate, but cutoff values for consistency-based indices may be too liberal. Screens differ in the extent to which they impact survey results. The use of screening techniques can impact inter-item correlations, inter-scale correlations, reliability estimates, and statistical results. While data screening can improve the quality and trustworthiness of data, screening techniques are not interchangeable. Researchers and practitioners should be aware of the differences between data screening techniques and apply appropriate screens for their survey characteristics and study design. Low-impact direct and unobtrusive screens such as self-report indicators, bogus items, instructed items, longstring, individual response variability, and response time are relatively simple to administer and analyze. The fact that data screening can influence the statistical results of a study demonstrates that low-quality data can distort hypothesis testing in organizational research and practice. We recommend analyzing results both before and after screens have been applied.
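To make two of the unobtrusive screens named in the abstract concrete, the following is a minimal Python sketch of longstring (the longest run of identical consecutive answers) and individual response variability (the standard deviation of one respondent's answers). It is an illustration only, not the authors' implementation: the function names, the toy data, and the cutoffs `max_run=4` and `min_irv=0.5` are assumptions made for this example and are not values reported in the article.

```python
from itertools import groupby
from statistics import stdev


def longstring(responses):
    """Length of the longest run of identical consecutive responses."""
    return max(len(list(run)) for _, run in groupby(responses))


def irv(responses):
    """Individual response variability: sample SD of one respondent's answers."""
    return stdev(responses)


def flag_respondents(data, max_run, min_irv):
    """Return indices of respondents flagged by either screen.

    max_run and min_irv are hypothetical cutoffs chosen for illustration.
    """
    return [
        i for i, row in enumerate(data)
        if longstring(row) > max_run or irv(row) < min_irv
    ]


# Toy data (invented): three respondents, eight 5-point Likert items.
data = [
    [4, 2, 5, 3, 4, 1, 2, 5],  # varied responding
    [3, 3, 3, 3, 3, 3, 3, 3],  # same answer to every item
    [5, 4, 4, 4, 4, 4, 2, 1],  # one run of five identical answers
]

flagged = flag_respondents(data, max_run=4, min_irv=0.5)
# Keep the unscreened data as well, so results can be compared
# before and after the screens are applied.
screened = [row for i, row in enumerate(data) if i not in flagged]
```

In line with the abstract's recommendation, an analysis would then be run on both `data` and `screened` and the two sets of results compared.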

Notes

It is noteworthy that subsequent work has failed to replicate this six-factor structure but has also failed to consistently support an alternative factor structure using confirmatory analysis (Vanhove et al., 2016).
Using listwise deletion did not substantially change the results of the analysis. For example, the percentage of participants flagged differed by 2.7% or less for each screening technique and correlated higher than 0.99 with percentages computed using the data imputation technique.

References

Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16, 270–301.
Allison, P. D. (2009). Missing data. In R. E. Millsap & A. Maydeu-Olivares (Eds.), The SAGE handbook of quantitative methods in psychology (pp. 72–90). Los Angeles: Sage.
Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27, 17–21.
Anscombe, F. J., & Guttman, I. (1960). Rejection of outliers. Technometrics, 2, 123–147.
Bagby, R. M., Gillis, J. R., & Rogers, R. (1991). Effectiveness of the Millon clinical multiaxial inventory validity index in the detection of random responding. Psychological Assessment, 3, 285–287.
Behrend, T. S., Sharek, D. J., Meade, A. W., & Wiebe, E. N. (2011). The viability of crowdsourcing for survey research. Behavior Research Methods, 43, 800–813.
Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software.
Berry, D. T. R., Wetter, M. W., Baer, R. A., Larsen, L., Clark, C., & Monroe, K. (1992). MMPI-2 random responding indices: Validation using a self-report methodology. Psychological Assessment, 4, 340–345.
Berry, D. T. R., Wetter, M. W., Baer, R. A., Widiger, T. A., Sumpter, J. C., Reynolds, S. K., & Hallam, R. A. (1991). Detection of random responding on the MMPI-2: Utility of F, back F, and VRIN scales. Psychological Assessment, 3, 418–423.
Bowling, N. A., Huang, J. L., Bragg, C. B., Khazon, S., Liu, M., & Blackmore, C. E. (2016). Who cares and who is careless? Insufficient effort responding as a reflection of respondent personality. Journal of Personality and Social Psychology, 111, 218–229.
Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296–322.
Bruehl, S., Lofland, K. R., Sherman, J. J., & Carlson, C. R. (1998). The variable responding scale for detection of random responding on the multidimensional pain inventory. Psychological Assessment, 10, 3–9.
Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods, 46, 112–130.
Chen, Z., Watson, P. J., Biderman, M., & Ghorbani, N. (2016). Investigating the properties of the general factor (M) in bifactor models applied to the big five or HEXACO data in terms of method or meaning. Imagination, Cognition, and Personality: Consciousness in Theory, Research, and Clinical Practice, 35, 216–243.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.
Clark, M. E., Gironda, R. J., & Young, R. W. (2003). Detection of back random responding: Effectiveness of MMPI-2 and personality assessment inventory validity indices. Psychological Assessment, 15, 223–234.
Costa, P. T., & McCrae, R. R. (1997). Stability and change in personality assessment: The revised NEO personality inventory in the year 2000. Journal of Personality Assessment, 68, 86–94.
Costa, P. T., & McCrae, R. R. (2008). The revised NEO personality inventory (NEO-PI-R). In G. J. Boyle, G. Matthews, & D. H. Saklofske (Eds.), The SAGE handbook of personality theory and assessment (pp. 179–198). London: SAGE.
Couch, A., & Keniston, K. (1960). Yeasayers and naysayers: Agreeing response set as a personality variable. Journal of Abnormal and Social Psychology, 60, 151–174.
Credé, M. (2010). Random responding as a threat to the validity of effect size estimates in correlation research. Educational and Psychological Measurement, 70, 596–612.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4–19.
De Ayala, R. J., & Sava-Bolesta, M. (1999). Item parameter recovery for the nominal response model. Applied Psychological Measurement, 23, 3–19.
DeMars, C. E. (2003). Sample size and the recovery of nominal response model item parameters. Applied Psychological Measurement, 27, 275–288.
DeSimone, J. A. (2015). New techniques for evaluating temporal consistency. Organizational Research Methods, 18, 133–152.
DeSimone, J. A., Harms, P. D., & DeSimone, A. J. (2015). Best practice recommendations for data screening. Journal of Organizational Behavior, 36, 171–181.
Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Nels, T. (in press). Intra-individual response variability as an indicator of insufficient effort responding: Comparison to other indicators and relationships with individual differences. Journal of Business and Psychology. Available from https://link.springer.com/article/10.1007/s10869-016-9479-0. Accessed 11 April 2017.
Edwards, A. L. (1957). The social desirability variable in personality assessment and research. New York: Dryden.
Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in personality measurement: Issues of applicant comparison and construct validity. Journal of Applied Psychology, 84, 155–166.
Frederiksen, N. (1965). Response set scores as predictors of performance. Personnel Psychology, 18, 225–244.
Gallen, R. T., & Berry, D. T. R. (1996). Detection of random responding in MMPI-2 protocols. Assessment, 3, 171–178.
Gallen, R. T., & Berry, D. T. R. (1997). Partially random MMPI-2 protocols: When are they interpretable? Assessment, 4, 61–68.
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96.
Harms, P. D., & DeSimone, J. A. (2015). Caution! MTurk workers ahead—Fines doubled. Industrial and Organizational Psychology, 8, 183–190.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581–595.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Huang, J. L., Bowling, N. A., Liu, M., & Li, Y. (2015). Detecting insufficient effort responding with an infrequency scale: Evaluating validity and participant reactions. Journal of Business and Psychology, 30, 299–311.
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27, 99–114.
Huang, J. L., Liu, M., & Bowling, N. A. (2015). Insufficient effort responding: Examining an insidious confound in survey data. Journal of Applied Psychology, 100, 828–845.
Jackson, D. N. (1976). The appraisal of personal reliability. Paper presented at the meetings of the Society of Multivariate Experimental Psychology, University Park, PA.
Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39, 103–129.
Kurtz, J. E., & Parrish, C. L. (2001). Semantic response consistency and protocol validity in structured personality assessment: The case of the NEO-PI-R. Journal of Personality Assessment, 76, 315–332.
Leary, M. R., & Kowalski, R. M. (1990). Impression management: A literature review and two-component model. Psychological Bulletin, 107, 34–47.
Littman-Ovadia, H., & Lavy, S. (2012). Character strengths in Israel. European Journal of Psychological Assessment, 28, 41–50.
Liu, M., Bowling, N. A., Huang, J. L., & Kent, T. A. (2013). Insufficient effort responding to surveys as a threat to validity: The perceptions and practices of SIOP members. The Industrial-Organizational Psychologist, 51, 32–38.
Liu, M., Huang, J. L., Bowling, N. A., & Bragg, C. (2013). Attenuating effect of insufficient effort responding on relationships between measures. Paper presented at the 28th annual conference for the Society for Industrial and Organizational Psychology, Houston, TX.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49–55.
Maniaci, M. R., & Rogge, R. D. (2014). Caring about carelessness: Participant inattention and its effects on research. Journal of Research in Personality, 48, 61–83.
McCrae, R. R., Costa, P. T., Grant, W., Barefoot, J. C., Siegler, I. C., & Williams Jr., R. B. (1989). A caution on the use of the MMPI K-correction in research on psychosomatic medicine. Psychosomatic Medicine, 51, 58–65.
McGrath, R. E., Mitchell, M. K., Kim, B. H., & Hough, L. (2010). Evidence for response bias as a source of error variance in applied assessment. Psychological Bulletin, 136, 450–470.
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17, 437–455.
Meijer, R. R., Molenaar, I. W., & Sijtsma, K. (1994). Influence of test and person characteristics on nonparametric appropriateness measurement. Applied Psychological Measurement, 18, 111–120.
Messick, S. (1960). Dimensions of social desirability. Journal of Consulting Psychology, 24, 279–287.
Nichols, D. S., Greene, R. L., & Schmolck, P. (1989). Criteria for assessing inconsistent patterns of item endorsement on the MMPI: Rationale, development, and empirical trials. Journal of Clinical Psychology, 45, 239–250.
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in web-based questionnaires: Which method to use? Journal of Research in Personality, 63, 1–11.
O’Rourke, T. W. (2000). Techniques for screening and cleaning data for analysis. American Journal of Health Studies, 16, 205–207.
Orne, M. T. (1962). On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. American Psychologist, 17, 776–783.
Paolo, A. M., & Ryan, J. J. (1992). Detection of random response sets on the MMPI-2. Psychotherapy in Private Practice, 4, 1–8.
Peterson, C., & Seligman, M. E. P. (2004). Character strengths and virtues: A handbook and classification. New York, NY: Oxford University Press.
Pinsoneault, T. B. (2007). Detecting random, partially random, and nonrandom Minnesota multiphasic personality inventory-2 protocols. Psychological Assessment, 19, 159–164.
Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634–644.
Ruch, W., Proyer, R. T., Harzer, C., Park, N., Peterson, C., & Seligman, M. E. P. (2010). Values in action inventory of strengths (VIA–IS): Adaptation of the German version and the development of a peer-rating form. Journal of Individual Differences, 31, 138–149.
Schinka, J. A., Kinder, B. N., & Kremer, T. (1997). Research validity scales for the NEO-PI-R: Development and initial validation. Journal of Personality Assessment, 68, 127–138.
Schmitt, N., & Stults, D. M. (1985). Factors defined by negatively keyed items: The result of careless respondents? Applied Psychological Measurement, 9, 367–373.
Schwarz, N. (1999). Self-reports: How the questions shape the answers. American Psychologist, 54, 93–105.
Snell, A. F., Sydell, E. J., & Lueke, S. B. (1999). Towards a theory of applicant faking: Integrating studies of deception. Human Resource Management Review, 9, 219–242.
Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3, 271–295.
Stevens, J. P. (1984). Outliers and influential data points in regression analysis. Psychological Bulletin, 95, 334–344.
Vanhove, A. J., Harms, P. D., & DeSimone, J. A. (2016). The abbreviated character strengths test (ACST): A preliminary assessment of test validity. Journal of Personality Assessment, 98, 536–544.
Wetter, M. W., Baer, R. A., Berry, D. T. R., Smith, G. T., & Larsen, L. H. (1992). Sensitivity of MMPI-2 validity scales to random responding and malingering. Psychological Assessment, 4, 369–374.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54, 594–604.
Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28, 189–194.
Yu, C. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Unpublished doctoral dissertation. Retrieved from http://statmodel2.com/download/Yudissertation.pdf on February 18, 2014.
Zickar, M. J., & Robie, C. (1999). Modeling faking good on personality items: An item-level analysis. Journal of Applied Psychology, 84, 551–563.

Author information

Authors and Affiliations

Department of Management, University of Alabama, 361 Stadium Drive, Tuscaloosa, AL, 35487-0025, USA
Justin A. DeSimone & P. D. Harms

Corresponding author

Correspondence to Justin A. DeSimone.

About this article

Cite this article

DeSimone, J.A., Harms, P.D. Dirty Data: The Effects of Screening Respondents Who Provide Low-Quality Data in Survey Research. J Bus Psychol 33, 559–577 (2018). https://doi.org/10.1007/s10869-017-9514-9

Published: 02 September 2017
Issue Date: October 2018
DOI: https://doi.org/10.1007/s10869-017-9514-9
