Ascertaining the validity of individual protocols from Web-based personality inventories

https://doi.org/10.1016/j.jrp.2004.09.009

Abstract

The research described in this article estimated the relative incidence of protocols invalidated by linguistic incompetence, inattentiveness, and intentional misrepresentation in Web-based versus paper-and-pencil personality measures. Estimates of protocol invalidity were derived from a sample of 23,994 protocols produced by individuals who completed an on-line version of the 300-item IPIP representation of the NEO-PI-R (Goldberg, 1999). Approximately 3.8% of the protocols were judged to be products of repeat participants, many of whom apparently resubmitted after changing some of their answers. Among non-duplicate protocols, about 3.5% came from individuals who apparently selected a response option repeatedly without reading the item, compared to .9% in a sample of paper-and-pencil protocols. The missing response rate was 1.2%, which is 2–10 times higher than the rate found in several samples of paper-and-pencil inventories of comparable length. Two measures of response consistency indicated that perhaps 1% of the protocols were invalid due to linguistic incompetence or inattentive responding, but that Web participants were as consistent as individuals responding to a paper-and-pencil inventory. Inconsistency did not affect factorial structure and was found to be related positively to neuroticism and negatively to openness to experience. Intentional misrepresentation was not studied directly, but arguments for a low incidence of misrepresentation are presented. Methods for preventing, detecting, and handling invalid response patterns are discussed. Suggested for future research are studies that assess the moderating effects of linguistic incompetence, inattentiveness, and intentional misrepresentation on agreement between self-report and acquaintance judgments about personality.

Introduction

World Wide Web-based personality measures have become increasingly popular in recent years due to the ease of administering, scoring, and providing feedback over the Internet. Web-based measures allow researchers to collect data inexpensively from large numbers of individuals around the world in a manner that is convenient to both researchers and participants. This emerging technology has raised two important questions about Web-based measures. The first is the degree to which established paper-and-pencil personality measures retain their reliability and validity after being ported to the Web (Kraut et al., 2004). Although this question should be answered empirically for each personality measure in question, studies to date suggest that personality measures retain their psychometric properties on the Web (Buchanan et al., in press; Gosling et al., 2004).

This article addresses a second kind of validity concern for Web-based measures, protocol validity (Kurtz & Parrish, 2001). The term protocol validity refers to whether an individual protocol is interpretable via the standard algorithms for scoring and assigning meaning. For decades psychologists have realized that even a well-validated personality measure can generate uninterpretable data in individual cases. The introduction of this article first reviews what we know about the impact of three major influences on the protocol validity of paper-and-pencil measures: linguistic incompetence, careless inattentiveness, and deliberate misrepresentation. Next, the introduction discusses why these threats to protocol validity might be more likely to affect Web-based measures than paper-and-pencil measures. The empirical portion of this article provides estimates of the incidence of protocol invalidity for one particular Web-based personality inventory, and compares these estimates to similar data for paper-and-pencil inventories. Finally, the discussion reflects on the significance of protocol invalidity for Web-based measures and suggests strategies for preventing, detecting, and handling invalid protocols.

Section snippets

Three major threats to protocol validity

Researchers have identified three major threats to the validity of individual protocols. These threats can affect protocol validity, regardless of the mode of presentation (paper-and-pencil or Web). The first is linguistic incompetence. A research participant who has a limited vocabulary, poor verbal comprehension, an idiosyncratic way of interpreting item meaning, and/or an inability to appreciate the impact of language on an audience will be unable to produce a valid protocol, even for a

Incidence and detection of invalid protocols for paper-and-pencil inventories

Many of the major personality inventories, e.g., the California Psychological Inventory (CPI; Gough & Bradley, 1996), Hogan Personality Inventory (HPI; Hogan & Hogan, 1992), Multidimensional Personality Questionnaire (MPQ; Tellegen, in press), and Minnesota Multiphasic Personality Inventory (MMPI; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), have built-in protocol validity scales to detect cases in which individuals are not attending to item meanings, are failing to understand them, or are

Linguistic incompetence as a special problem for Web-based measures

Because unregulated Web-based personality measures are readily accessible to non-native speakers from all backgrounds around the world, linguistic competency may be a greater concern for Web-based measures than for paper-and-pencil measures administered to the native-speaking college students often used in research. Non-native speakers may have difficulty with both the literal meanings of items and the more subtle sociolinguistic trait implications of items (Johnson, 1997a). At the time

Summary of the present research plan

The most direct way of assessing protocol validity would be to compare the results of testing (trait level scores, narrative descriptions) with another source of information about personality in which we have confidence (e.g., averaged ratings or the consensus of descriptions from knowledgeable acquaintances—see Hofstee, 1994). Gathering such non-self-report criteria validly over the Internet while protecting anonymity is logistically complex, and ongoing research toward that end is still in

Participants

Before screening for repeat participation, the sample consisted of 23,994 protocols (8,764 male, 15,229 female, 1 unknown) from individuals who anonymously completed a Web-based version of the IPIP-NEO (Goldberg, 1999; described below). Reported ages ranged from 10 to 99 years, with a mean of 26.2 and a standard deviation of 10.8 years. Participants were not actively recruited; they discovered the Web site on their own or by word-of-mouth. Protocols used in the present analyses were collected between August 6,

Duplicate protocols

The SPSS LAG function revealed 747 protocols (sorted first by time and then by nickname) in which all 300 responses were identical to the previous protocol. Also identified were an additional 34 cases in which the first 120 responses were identical. A few additional protocols contained nearly all identical responses (e.g., four protocols contained 299 identical responses, one contained 298 identical responses). Protocols with 298, 299, or 300 identical responses to 300 items (or 118, 119, or 120
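
The adjacent-row comparison described above can be sketched outside of SPSS as well. The following Python fragment is not the author's code; it assumes a hypothetical data frame with columns named "nickname", "timestamp", and "item_001" through "item_300", and it flags any protocol that shares at least a chosen number of responses with the row immediately preceding it.

    import pandas as pd

    def flag_repeat_protocols(df, n_items=300, min_matches=298):
        """Flag protocols whose responses (nearly) duplicate the preceding row."""
        item_cols = [f"item_{i:03d}" for i in range(1, n_items + 1)]  # hypothetical column names
        ordered = df.sort_values(["nickname", "timestamp"])           # group resubmissions together
        # Count, for each row, how many item responses match the previous row.
        matches = (ordered[item_cols] == ordered[item_cols].shift(1)).sum(axis=1)
        return matches >= min_matches

With min_matches set to 298 of 300 items, the rule mirrors the tolerance for near-duplicates described in the text; the sort keys and threshold are adjustable and should be treated as illustrative rather than as the exact screening rule used in this study.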

Discussion

The present study investigated the degree to which the unique characteristics of a Web-based personality inventory produced uninterpretable protocols. It was hypothesized that the ease of accessing a personality inventory on the Web and the reduced accountability from anonymity might lead to a higher incidence (compared to paper-and-pencil inventories) of four types of problematic protocols. These problems are as follows: (a) the submission of duplicate protocols (some of which might be

Conclusions

Of more substance and practical importance than the specter of radical misrepresentation on Web-based personality measures are issues such as detecting multiple participation and identifying protocols completed too carelessly or inattentively to be subjected to normal interpretation. The incidences of (a) repeat participation, (b) selecting the same response category repeatedly without reading the item, and (c) skipping items all exceed the levels found in paper-and-pencil measures. Nonetheless,
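
For readers who wish to implement similar screens, the short Python sketch below illustrates two of the checks discussed in this article: the length of the longest run of an identical response option (a signature of selecting the same category without reading the items) and the proportion of skipped items. The example protocol, and any cut-off values a user might apply to these statistics, are illustrative assumptions rather than the thresholds used in the present study.

    from itertools import groupby

    def longest_identical_run(responses):
        """Length of the longest streak of the same response option (None = skipped item)."""
        runs = (sum(1 for _ in group) for value, group in groupby(responses) if value is not None)
        return max(runs, default=0)

    def missing_rate(responses):
        """Proportion of items left unanswered."""
        return sum(r is None for r in responses) / len(responses)

    # Example: a short 1-5 Likert protocol with one skipped item and a run of four 3s.
    protocol = [3, 3, 3, 3, 2, None, 4, 5, 1, 2]
    print(longest_identical_run(protocol))  # 4
    print(missing_rate(protocol))           # 0.1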

Acknowledgments

Some of these findings were first presented in an invited talk to the Annual Joint Bielefeld-Groningen Personality Research Group meeting, University of Groningen, The Netherlands, May 9, 2001. I thank Alois Angleitner, Wim Hofstee, Karen van Oudenhoven-van der Zee, Frank Spinath, and Heike Wolf for their feedback and suggestions at that meeting. Some of the research described in this article was conducted while I was on sabbatical at the Oregon Research Institute, supported by a Research

References (55)

  • J.A. Johnson. Units of analysis for description and explanation in psychology.
  • D.L. Paulhus. Measurement and control of response bias.
  • J.S. Wiggins. In defense of traits.
  • Buchanan, T., Johnson, J. A., & Goldberg, L. R. (in press). Implementing a five-factor personality inventory for use on...
  • J.N. Butcher et al. (1989). Minnesota Multiphasic Personality Inventory-2 (MMPI-2): Manual for administration and scoring.
  • J.N. Butcher et al. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology.
  • R.B. Cattell (1966). The scree test for the number of factors. Multivariate Behavioral Research.
  • P.T. Costa et al. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual.
  • P.T. Costa et al. (1997). Stability and change in personality assessment: The Revised NEO Personality Inventory in the year 2000. Journal of Personality Assessment.
  • Costa, P. T., Jr., & McCrae, R. R. (in press). The Revised NEO Personality Inventory (NEO-PI-R). In: S. R. Briggs, J....
  • M.D. Dunnette et al. (1962). A study of faking behavior on a forced choice self-description checklist. Personnel Psychology.
  • R.C. Fraley (2004). How to conduct behavioral research over the Internet.
  • L.R. Goldberg (1992). The development of markers for the Big-Five factor structure. Psychological Assessment.
  • L.R. Goldberg. A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models.
  • Goldberg, L. R. (in press). The comparative validity of adult personality inventories: Applications of a...
  • L.R. Goldberg et al. (1985). The prediction of semantic consistency in self-descriptions: Characteristics of persons and of terms that affect the consistency of responses to synonym and antonym pairs. Journal of Personality and Social Psychology.
  • S.D. Gosling et al. (2004). Should we trust Web-based studies? A comparative analysis of six preconceptions about Internet questionnaires. American Psychologist.
  • H.G. Gough et al. (1996). CPI manual: Third edition.
  • Hendriks, A. A. J. (1997). The construction of the Five Factor Personality Inventory. Unpublished doctoral...
  • W.K.B. Hofstee (1994). Who should own the definition of personality? European Journal of Personality.
  • W.K.B. Hofstee et al. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology.
  • R. Hogan. Personality psychology: Back to basics.
  • R. Hogan et al. (1992). Hogan Personality Inventory manual.
  • Jackson, D. N. (1976). The appraisal of personal reliability. Paper presented at the meetings of the Society of...
  • D.N. Jackson (1977). Jackson Vocational Interest Survey manual.
  • O.P. John et al. The Big Five trait taxonomy: History, measurement, and theoretical perspectives.

Prepared for the special issue of the Journal of Research in Personality 39 (1), February 2005, containing the proceedings of the 2004 meeting of the Association for Research in Personality.