Editorial

On Issues of Validity and Especially on the Misery of Convergent Validity

Published Online: https://doi.org/10.1027/1015-5759/a000156

Summary

Here, we would like to consider the concept of convergent validity from both a general and a specific perspective. The general perspective suggests that special attention should be given to the purpose of application in defining the concept. It is therefore proposed that the relationship between test construction and test application be structured according to standard purposes of application. The specific perspective on convergent validity reveals indeterminacy and vagueness. Aside from the core meaning of convergent validity, four different elaborations can be identified: convergent validity as a simple transfer procedure, as a trait-controlled transfer procedure, as a trait-method-controlled transfer procedure, and as an equivalence check. The less sophisticated elaborations lack explicit limits, so that the ascription of convergent validity appears to be a matter of arbitrariness, as does the selection of the type of elaboration. All this adds up to what may be considered the misery of convergent validity.

Introduction

Because psychological assessment addresses latent human attributes, whether a test actually represents the attribute is a question of utmost importance. What in many other sciences is taken for granted needs to be explicitly established in psychological assessment, since the link from the latent to the manifest level is not an obvious one. Furthermore, the situation is complicated by the fact that the validity of a test appears to depend on the specific use of the test and on the state of the cultural framework. This dependency is especially obvious in the meaning of test items for persons from different cultures (van de Vijver, 2011). As a consequence, it is necessary to ensure an appropriate adaptation when a test is transferred from one language into another (Schweizer, 2010; van de Vijver, 2003). In line with the recognition of these relationships, more recent inquiries into validity no longer consider it a property of the test but rather a property of the use of the test for a particular purpose (Kane, 2006; Messick, 1989a; Sireci, 2007).

This way of emphasizing purpose and situation appears very reasonable as a legal conceptualization that is largely guided by legal standards and the demands of proper practice (Sireci, 2006). In a way it reflects the growing importance of psychological assessment in social life. However, irrespective of the necessity of considering the purpose of assessment, this recent notion poses a danger to psychometric standards, since the specificity of the assessment situation associated with a particular purpose can become an excuse for low psychometric standards. Emphasizing the interpretation of test scores, as recommended by Messick (1989a, 1989b), may increase the tolerance for deviations from a solid psychometric frame of reference. Furthermore, there is even the possibility of abusing psychological assessment by assigning too much weight to the purpose of the application. Because of this danger we recommend investigating and establishing validity with respect to “standard” purposes, which enables the separation of the actual purpose of application from the context in which validity is established. Several types of validity have been proposed and are reiterated in virtually all textbooks on test construction (e.g., McDonald, 1999; Lewis-Beck, 1994). These types of validity can be established by considering one or several “standard” purposes of application that lie outside the actual purpose of the application. Of course, such separation demands that more weight be given to the definition and consideration of “standard” purposes in future test construction than is presently the case.

Convergent Validity and Its Shortcomings

In line with these preliminary remarks concerning the recent notion of validity, the following inquiry into convergent validity assumes the framework of a standard purpose of application. Given this framework, convergent validity denotes the observation of a considerable correlation between tests that refer to the same psychological concept, mostly a construct. In other words, “two different measures of the same thing should intercorrelate highly” (Ghiselli, Campbell, & Zedeck, 1981, p. 285). According to these two descriptions of the core meaning of convergent validity, an appropriate theoretical background is an essential prerequisite, and it appears to be closely linked to the concept of operationalization, which is thought to link entities of the latent level to entities of the manifest level. Unsurprisingly, the idea that there are psychological constructs (MacCorquodale & Meehl, 1948) was fundamental to the development of convergent validity, much as it was to construct validity (Cronbach & Meehl, 1955). Constructs are usually considered the theoretical basis providing justification for the expectation of a considerable correlation between tests. It even appears that validity did not play a major role in the theory of mental testing before the invention of the idea of the psychological construct. In Gulliksen’s (1950) influential textbook, written before the introduction of the construct, validity is defined as “the correlation of the test with some criterion” (p. 88), and no more than half a page is spent reflecting on it. Since that time, however, sensitivity to the various issues of validity has increased considerably, so that nowadays validity is considered “the most important concept in psychometrics” (Sireci, 2007, p. 477). A number of different types of validity are to be taken into consideration.

Unfortunately, aside from its core meaning, convergent validity is characterized by indeterminacy and vagueness, which may be perceived, from the scientific point of view, as the misery of convergent validity. In this and the following sections this assertion is elaborated in some detail. First, consider the vagueness of convergent validity. Given a standard purpose of application, it is necessary to distinguish between two types of outcomes of an investigation of convergent validity. Since convergent validity is closely linked to the concept of correlation, two types of coefficients must be considered: coefficients indicating convergent validity and coefficients indicating the lack of it. It is interesting to observe that most textbooks avoid stating an explicit limit. It might thus seem sufficient to check whether the correlation reaches the level of significance. However, significance cannot really be the solution to the problem, since large sample sizes can cause rather small correlations to become significant. Furthermore, if there were an explicit limit, it would be necessary to also consider the properties of the tests and of the sample. For example, it is known that the size of such a correlation depends on the reliability of the tests that are correlated with each other (Lord & Novick, 1968). Moreover, there is the influence of the observational method on the size of the correlation between two tests (Campbell & Fiske, 1959). This problem is discussed in more detail in a later section. Additionally, there is the problem of how to deal with inconsistent results: a few tests of a study referring to the same construct may yield correlations larger than a given limit, whereas other correlations of tests also referring to the same construct may be smaller. Apparently, a simple correlation between two tests is a vague argument in favor of convergent validity. All that is clear is that the larger the correlation, the more likely convergent validity becomes.
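The dependence of observable correlations on reliability can be made concrete. The following minimal sketch (not part of the original editorial; the numerical values are invented for illustration) applies Spearman's classical correction for attenuation, which follows from the test-theoretic results summarized by Lord and Novick (1968):

```python
import math

def max_observable_correlation(rel_x: float, rel_y: float) -> float:
    """Upper bound on the manifest correlation of two tests whose
    latent (true-score) correlation is perfect (1.0)."""
    return math.sqrt(rel_x * rel_y)

def disattenuated_correlation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate of the true-score correlation underlying an observed r_xy."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Two tests with reliabilities .80 and .70 cannot correlate above ~.75,
# even if their latent referents are identical.
print(round(max_observable_correlation(0.80, 0.70), 2))       # 0.75
print(round(disattenuated_correlation(0.60, 0.80, 0.70), 2))  # 0.8
```

The sketch illustrates why any fixed limit for "convergent" correlations would have to be conditional on the reliabilities of the tests involved: an observed correlation of .60 between these two tests corresponds to a latent correlation of about .80.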

The following sections discuss the indeterminacy of convergent validity. It becomes obvious that there are different elaborations of the core meaning, all of which are in use so that convergent validity can mean any one of them.

Convergent Validity as a Simple Transfer Procedure

First, there is the elaboration of convergent validity as a transfer procedure. Convergent validity is investigated by correlating two tests with each other, where one test shows an established validity and the other one is in need of validation. An important characteristic of the test with the established validity is that it is not a perfect representation of the construct; the association of this test with the construct, however, is beyond debate. Employing such a procedure may be perceived as an attractive way of securing the validity of a new test, since the necessary expenses in time and effort are quite low. However, this procedure is not really suitable for avoiding the type of vagueness described in the previous section.

Nevertheless, convergent validity appears to be a very popular type of validity. Papers reporting on the construction of a new test or the validation of an already established test usually include a section on convergent validity – but frequently fail to consider additional measures referring to other contents and alternative observational methods. In such papers, variation can only be found in the number of other tests referring to the same construct and in the consideration of specific facets of the construct. A check of the issues of the last 2 years of the European Journal of Psychological Assessment reveals that even nowadays convergent validity is quite common as the sole type of validity (Balducci, Fraccaroli, & Schaufeli, 2010; Balzarotti, John, & Gross, 2010; Campos & Gonçalves, 2011; Carelli, Wiberg, & Wiberg, 2011; Cui, Teng, Li, & Oei, 2010; Höfling, Moosbrugger, Schermelleh-Engel, & Heidenreich, 2011; Kazarian & Taher, 2010; Kuntsche, Knibbe, Engels, & Gmel, 2010). All of these papers are in line with the core meaning of convergent validity, although there may be different opinions on what is to be considered “high.”

Convergent Validity as a Trait-Controlled Transfer Procedure

However, the concentration on the described transfer procedure was already called into question by Campbell and Fiske (1959) when they proposed the multitrait-multimethod methodology. Their analyses of the results of previous research revealed that correlations obtained for tests referring to the same construct often overestimate the relationship between these tests because of common method variance. Scores obtained by the same observational method can be found to correlate with each other simply because of the common observational method, since the observational method can be a source of systematic variation and can thus contribute to the correlation of tests. One consequence of this observation is the demand that each investigation of convergent validity be accompanied by an investigation of discriminant validity, since the influence of the observational method may become obvious in such an investigation. The expected value for a correlation thought to establish discriminant validity is zero, since tests referring to different constructs should not be related to each other. Correlations establishing convergent validity should considerably exceed correlations establishing discriminant validity. Interestingly, a check of the issues of the last 2 years of the European Journal of Psychological Assessment reveals quite a number of papers reporting results on convergent as well as discriminant validity without considering construct validity (De Carvalho Leite, Seminotti, Freitas, & de Lourdes Drachler, 2011; Fernandez, Dufey, & Kramp, 2011; Fossati, Borroni, Marchione, & Maffei, 2011; Glaesmer, Grande, Braehler, & Roth, 2011; Gorostiaga, Balluerka, Alonso-Arbiol, & Haranburu, 2011; Petermann, Petermann, & Schreyer, 2010; Rivero, Garcia-Lopez, & Hofmann, 2010; Teubert & Pinquart, 2011; Veirman, Brouwers, & Fontaine, 2011; Zohar & Cloninger, 2011).
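The comparison logic of Campbell and Fiske (1959) can be illustrated with a toy fragment of a multitrait-multimethod matrix. The correlations below are hypothetical values chosen for illustration, not data from any of the cited studies; the check mirrors the requirement that convergent (monotrait-heteromethod) correlations clearly exceed the discriminant (heterotrait) correlations, including those inflated by a shared method:

```python
# Hypothetical 2-trait x 2-method fragment: correlations grouped by
# the Campbell-Fiske cell types they would occupy in the full matrix.
convergent = {                      # same trait, different methods
    ("T1", "T1"): 0.62,
    ("T2", "T2"): 0.58,
}
discriminant_heteromethod = {       # different traits, different methods
    ("T1", "T2"): 0.15,
}
discriminant_monomethod = {         # different traits, SAME method:
    ("T1", "T2"): 0.35,             # inflated by common method variance
}

weakest_convergent = min(convergent.values())
strongest_discriminant = max(
    list(discriminant_heteromethod.values())
    + list(discriminant_monomethod.values())
)

# Campbell and Fiske's demand: even the weakest validity-diagonal entry
# should exceed the strongest heterotrait correlation.
print(weakest_convergent > strongest_discriminant)  # True
```

Note that the monomethod discriminant correlation (.35) is more than twice the heteromethod one (.15) although both involve different traits: in this sketch that gap is the footprint of method variance, which is exactly what the transfer procedure alone cannot detect.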

Convergent Validity as a Trait-Method-Controlled Transfer Procedure

There is another consequence of the observation reported by Campbell and Fiske (1959), and it is the one promoted by the authors themselves: the consideration of alternative observational methods. This requires that different observational methods be applied in the operationalization of the construct. For example, the test can be designed as a self-report measure and at the same time as a peer-report measure. The same trait can be assessed by a questionnaire and a rating scale, or by a set of related rating scales. Such combinations enable the estimation of the relationship of interest while excluding the influence of the observational method. Based on these possibilities, Campbell and Fiske recommend the systematic combination of several constructs and several observational methods in order to achieve an investigation of validity that is not impaired by method-induced distortion. This elaboration of convergent validity is referred to as the trait-method-controlled transfer procedure.

Unfortunately, the advantage of the original multitrait-multimethod approach was undone by the associated necessity of comparing a large number of correlations with each other in the evaluation of the multitrait-multimethod matrix. This disadvantage has now been overcome by the introduction of methods for evaluating the multitrait-multimethod matrix as a whole. Confirmatory factor analysis proved to be especially useful for this purpose, and several confirmatory factor models with slightly differing properties have been developed (e.g., Eid, 2000; Marsh & Grayson, 1995). The availability of this method of data analysis finally turns the multitrait-multimethod methodology into a truly valuable research approach. Within this approach, convergent validity no longer plays a major role; the advanced multitrait-multimethod methodology rather concentrates on construct validity. The issues of the last 2 years of the European Journal of Psychological Assessment include a number of papers reporting such an investigation of construct validity – something clearly desirable for an assessment journal (Backenstrass, Joest, Gehrig, Pfeiffer, Mearns, & Catanzaro, 2010; Bäccman & Carlstedt, 2010; Blickle, Momm, Liu, Witzki, & Steinmayr, 2011; Crocetti, Schwartz, Fermani, & Meeus, 2010; Derkman, Scholte, Van der Veld, & Markland, 2010; Di Giunta, Eisenberg, Kupfer, Steca, Tramontano, & Caprara, 2010; Gorska, 2011; Isoard-Gautheur, Oger, Guillet, & Martin-Krumm, 2010; Lehmann-Willenbrock, Grohmann, & Kauffeld, 2011; Maiano, Morin, Monthuy-Blanc, & Garbarino, 2010; Pereda, Arch, Peró, Guàrdia, & Forns, 2011; Vissers, Keijsers, van der Veld, de Jong, & Hutschemaekers, 2010; Wright, Creed, & Zimmer-Gembeck, 2010; Zohar, Denollet, Ari, & Cloninger, 2011).

Convergent Validity as an Equivalence Check

The consideration of confirmatory factor analysis for investigating multitrait-multimethod matrices has not only advanced the multitrait-multimethod methodology but also introduced a new and promising perspective on convergent validity. Convergent validity can now be considered with respect to two different levels, the manifest and the latent level. The original notion of convergent validity applies to the manifest level. Measurements occur on the manifest level and are therefore considered to be flawed; some of what is characterized here as the “misery of convergent validity” is due to this assignment of measurements. The manifest level is contrasted with the latent level, which is thought to be free of error. The implications of this assumption for expectations regarding convergent validity are remarkable. Considered on the latent level, convergent validity implies a perfect relationship, since the error and specificity characterizing individual tests, which could impair the relationship, are excluded. The latent referents of two tests that are assumed to represent the same construct should show nothing less than equivalence in the sense of a perfect correlation. The demonstration of this trait-specific or ability-specific equivalence requires the step from the manifest to the latent level by means of appropriate models of measurement that take disturbing structural specificity into consideration (Schweizer, Rauch, & Gold, 2011; Schweizer & Schreiner, 2010). The logic of such a demonstration can be borrowed from the methodology developed for investigating the dimensionality of scales (DiStefano & Motl, 2006; Vautier, Raufaste, & Carou, 2003). Assuming that the latent referents of the two tests give rise to two dimensions, one can investigate whether the two dimensions can be replaced by a single dimension without impairing the model fit.
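The equivalence check amounts to a comparison of nested confirmatory factor models. A minimal sketch of the decision step, with invented fit statistics standing in for the output of an actual confirmatory factor analysis, is the likelihood-ratio (chi-square difference) test between the two-dimensional model (each test defines its own latent variable) and the one-dimensional model (a single latent variable underlies both tests):

```python
from scipy.stats import chi2

# Hypothetical chi-square fit statistics for two nested models;
# the one-dimensional model is the more restrictive one.
chisq_two_dim, df_two_dim = 48.3, 34   # two correlated latent variables
chisq_one_dim, df_one_dim = 52.1, 35   # single latent variable

delta_chisq = chisq_one_dim - chisq_two_dim
delta_df = df_one_dim - df_two_dim
p = chi2.sf(delta_chisq, delta_df)  # survival function of the chi-square dist.

# A nonsignificant difference means that collapsing the two latent
# variables into one does not impair model fit - consistent with
# trait-specific equivalence on the latent level.
print(delta_chisq, delta_df, p > .05)
```

With these made-up numbers the difference of 3.8 on 1 degree of freedom narrowly misses significance, so the single dimension would be retained; a clearly significant difference would instead speak against latent-level equivalence of the two tests.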

This further elaboration of convergent validity emerged only recently and turns the investigation of convergent validity into an equivalence check. Since trait-specific or ability-specific equivalence should characterize the relationships on the latent level, there is no more room for the vagueness that characterizes convergent validity on the manifest level: Referring to the same construct can only mean equivalence in the sense of a perfect correlation when considered at the latent level. Although this expectation is a general one in the sense that it applies to all combinations of tests, it appears to be especially relevant for abridged, brief, and short tests, which not only claim to represent the same construct as the original tests and to measure virtually the same thing but also share items with them. The relationships between abridged, brief, and short tests on the one hand and original tests on the other should be especially close, and the authors of such tests usually assert that the abridged, brief, or short test can in fact replace the original one without any loss of information. Unfortunately, there is at present the practice of developing abridged, brief, and short tests by more or less repeating the steps of the original test construction with a subset of items (Laverdière, Diguer, Gamache, & Evans, 2010; van Baardewijk, Andershed, Stegge, Nilsson, Scholte, & Vermeiren, 2010). This approach can be expected to produce tests that are similar to the original tests. However, it cannot guarantee that the original tests and the abridged, brief, or short tests actually measure exactly the same thing. There is always the danger that the abridged, brief, or short test is somewhat biased by neglecting or emphasizing something: a facet of the construct that is covered by the original test or a process that is stimulated by the original test. The equivalence check can be instrumental in ruling out these possibilities.

Discussion

The investigation of the concept of convergent validity and of the practice of test construction and psychometric evaluation has revealed both vagueness and indeterminacy. This observation does not preclude that different researchers agree about the core meaning of convergent validity: the relationship of two tests referring to the same construct. However, such agreement may disappear as soon as convergent validity is considered in more detail, since it becomes apparent that the concept is imprecise and even vague because of its close association with the concept of correlation without a specific limit being stated. Furthermore, there are the various elaborations of the concept, with different implications for the investigation of convergent validity.

The basic elaboration is convergent validity as a simple transfer procedure. The establishment of convergent validity according to this elaboration means the simple transfer of validity from one test to another test. It was found in 8 studies. The next elaboration is convergent validity as a trait-controlled transfer procedure. This demands the additional consideration of discriminant validity and was observed in another 10 studies. The third elaboration is the trait-method-controlled transfer procedure of the multitrait-multimethod approach. However, in the combination of this approach with confirmatory factor analysis, convergent validity is virtually dissolved into construct validity. The literature search revealed the investigation of construct validity in 14 studies. Finally, there is the elaboration as an equivalence check, which results from switching to the latent variable approach for investigating convergent validity. No study designed according to this elaboration has so far appeared in the European Journal of Psychological Assessment.

Finally, the question needs to be addressed whether it is justified to equate vagueness and indeterminacy with misery. From the point of view of the test constructor, vagueness and indeterminacy may be perceived as useful, since these characteristics can be instrumental in highlighting the positive results of test construction and in preventing disadvantageous observations from turning a long-term research project into a disaster. Vagueness and indeterminacy can help to present the outcome of virtually every evaluation of a test as a success. However, is this science? More importantly, does it contribute to progress in science? If failure is virtually impossible because of vagueness and indeterminacy, then there is no guidance for the applied researcher who is searching for a test that properly represents a specific construct. Furthermore, there is no obvious reason for improving less than optimal tests. If vagueness and indeterminacy characterize basic concepts like convergent validity, the true researcher must, from the scientific point of view, experience misery. Relief from this misery would be highly appreciated.

References

  • Backenstrass, M., Joest, K., Gehrig, N., Pfeiffer, N., Mearns, J., & Catanzaro, S. J. (2010). The German version of the Generalized Expectancies for Negative Mood Regulation Scale: A construct validity study. European Journal of Psychological Assessment, 26, 28–38.

  • Bäccman, C., & Carlstedt, B. (2010). A construct validation of a Profession-Focused Personality Questionnaire (PQ) versus the FFPI and the SIMP. European Journal of Psychological Assessment, 26, 136–142.

  • Balducci, C., Fraccaroli, F., & Schaufeli, W. B. (2010). Psychometric properties of the Italian version of the Utrecht Work Engagement Scale (UWES-9): A cross-cultural analysis. European Journal of Psychological Assessment, 26, 143–149.

  • Balzarotti, S., John, O. P., & Gross, J. J. (2010). An Italian adaptation of the Emotional Regulation Questionnaire. European Journal of Psychological Assessment, 26, 61–67.

  • Blickle, G., Momm, T., Liu, Y., Witzki, A., & Steinmayr, R. (2011). Construct validation of the Test of Emotional Intelligence (TEMINT): A two-study investigation. European Journal of Psychological Assessment, 27, 282–298.

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

  • Campos, R., & Gonçalves, B. (2011). The Portuguese version of the Beck Depression Inventory-II (BDI-II): Preliminary psychometric data with two nonclinical samples. European Journal of Psychological Assessment, 27, 258–264.

  • Carelli, M. G., Wiberg, B., & Wiberg, M. (2011). Development and construct validation of the Swedish Zimbardo Time Perspective Inventory. European Journal of Psychological Assessment, 27, 220–227.

  • Crocetti, E., Schwartz, S. J., Fermani, A., & Meeus, W. (2010). The Utrecht-Management of Identity Commitments Scale (U-MICS): Italian validation and cross-national comparisons. European Journal of Psychological Assessment, 26, 172–186.

  • Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

  • Cui, L., Teng, X., Li, X., & Oei, T. P. S. (2010). The factor structure and psychometric properties of the Resiliency Scale in Chinese undergraduates. European Journal of Psychological Assessment, 26, 162–171.

  • De Carvalho Leite, J. C., Seminotti, N., Freitas, P. F., & de Lourdes Drachler, M. (2011). The Psychosocial Treatment Expectations Questionnaire (PTEQ) for alcohol problems: Development and early validation. European Journal of Psychological Assessment, 27, 228–236.

  • Derkman, M. M. S., Scholte, R. H. J., Van der Veld, W. M., & Markland, R. C. M. E. (2010). Factorial and construct validity of the Sibling Relationship Questionnaire. European Journal of Psychological Assessment, 26, 277–283.

  • Di Giunta, L., Eisenberg, N., Kupfer, A., Steca, P., Tramontano, C., & Caprara, G. V. (2010). Assessing perceived empathic and social self-efficacy across countries. European Journal of Psychological Assessment, 26, 77–86.

  • DiStefano, C., & Motl, R. W. (2006). Further investigating method effects associated with negatively worded items on self-report surveys. Structural Equation Modeling, 13, 440–464.

  • Eid, M. (2000). A multitrait-multimethod model with minimal assumptions. Psychometrika, 65, 241–261.

  • Fernandez, A. M., Dufey, M., & Kramp, U. (2011). Testing the psychometric properties of the Interpersonal Reactivity Index (IRI) in Chile: Empathy in a different cultural context. European Journal of Psychological Assessment, 27, 179–185.

  • Fossati, A., Borroni, S., Marchione, D., & Maffei, C. (2011). The Big Five Inventory (BFI): Reliability and validity of its Italian translation in three independent nonclinical samples. European Journal of Psychological Assessment, 27, 50–58.

  • Ghiselli, E. E., Campbell, J. P., & Zedeck, S. (1981). Measurement theory for the behavioral sciences. San Francisco, CA: W. H. Freeman.

  • Glaesmer, H., Grande, G., Braehler, E., & Roth, M. (2011). The German version of the Satisfaction with Life Scale (SWLS): Psychometric properties, validity, and population-based norms. European Journal of Psychological Assessment, 27, 127–132.

  • Gorostiaga, A., Balluerka, N., Alonso-Arbiol, I., & Haranburu, M. (2011). Validation of the Basque Revised NEO Personality Inventory (NEO PI-R). European Journal of Psychological Assessment, 27, 193–204.

  • Gorska, M. (2011). Psychometric properties of the Polish version of the Interpersonal Competence Questionnaire (ICQ-R). European Journal of Psychological Assessment, 27, 186–192.

  • Gulliksen, H. (1950). Theory of mental tests. New York, NY: Wiley.

  • Höfling, V., Moosbrugger, H., Schermelleh-Engel, K., & Heidenreich, T. (2011). Mindfulness or mindlessness? A modified version of the Mindful Attention and Awareness Scale (MAAS). European Journal of Psychological Assessment, 27, 59–64.

  • Isoard-Gautheur, S., Oger, M., Guillet, E., & Martin-Krumm, C. (2010). Validation of a French version of the Athlete Burnout Questionnaire (ABQ) in competitive sport and physical education context. European Journal of Psychological Assessment, 26, 203–211.

  • Kane, M. T. (2006). Test validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education/Praeger.

  • Kazarian, S. S., & Taher, D. (2010). Validation of the Arabic Center for Epidemiological Studies Depression (CES-D) Scale in a Lebanese community sample. European Journal of Psychological Assessment, 26, 68–73.

  • Kuntsche, E., Knibbe, R., Engels, R., & Gmel, G. (2010). Being drunk to have fun or to forget problems? Identifying enhancement and coping drinkers among risky drinking adolescents. European Journal of Psychological Assessment, 26, 46–54.

  • Laverdière, O., Diguer, L., Gamache, D., & Evans, D. E. (2010). The French adaptation of the short form of the Adult Temperament Questionnaire. European Journal of Psychological Assessment, 26, 212–219.

  • Lehmann-Willenbrock, N., Grohmann, A., & Kauffeld, S. (2011). Task and relationship conflict at work: Construct validity of a German version of Jehn’s Intragroup Conflict Scale. European Journal of Psychological Assessment, 27, 171–178.

  • Lewis-Beck, M. S. (1994). Basic measurement. International Handbooks of Quantitative Applications in the Social Sciences (Vol. 4). Singapore: Sage.

  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Menlo Park, CA: Addison-Wesley.

  • MacCorquodale, K., & Meehl, P. E. (1948). On a distinction between hypothetical constructs and intervening variables. Psychological Review, 55, 95–107.

  • Maiano, C., Morin, A. J. S., Monthuy-Blanc, J., & Garbarino, J.-M. (2010). Construct validity of the Fear of Negative Appearance Evaluation Scale in a community sample of French adolescents. European Journal of Psychological Assessment, 26, 19–27.

  • Marsh, H., & Grayson, D. (1995). Latent variable models of multitrait-multimethod data. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 177–187). Thousand Oaks, CA: Sage.

  • McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.

  • Messick, S. (1989a). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Washington, DC: American Council on Education/Macmillan.

  • Messick, S. (1989b). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18, 5–11.

  • Pereda, N., Arch, M., Peró, M., Guàrdia, J., & Forns, M. (2011). Assessing guilt after traumatic events: The Spanish adaptation of the Trauma-Related Guilt Inventory. European Journal of Psychological Assessment, 27, 251–257.

  • Petermann, U., Petermann, F., & Schreyer, I. (2010). The German Strengths and Difficulties Questionnaire: Validity of the teacher version for preschoolers. European Journal of Psychological Assessment, 26, 256–262.

  • Rivero, R., Garcia-Lopez, L. J., & Hofmann, S. G. (2010). The Spanish version of the Self-Statements During Public Speaking Scale: Validation in adolescents. European Journal of Psychological Assessment, 26, 129–135.

  • Schweizer, K. (2010). The adaptation of assessment instruments to the various European languages. European Journal of Psychological Assessment, 26, 75–76.

  • Schweizer, K., Rauch, W., & Gold, A. (2011). Bipolar items for the measurement of personal optimism instead of unipolar items. Psychological Test and Assessment Modeling, 53, 399–413.

  • Schweizer, K., & Schreiner, M. (2010). Avoiding the effect of item wording by means of bipolar instead of unipolar items: An application to social optimism. European Journal of Personality, 24, 137–150.

  • Sireci, S. G. (2006). Validity on trial: Psychometric and legal conceptualizations of validity. Educational Measurement: Issues and Practice, 25, 27–34.

  • Sireci, S. G. (2007). On validity theory and test validation. Educational Researcher, 36, 477–481.

  • Teubert, D., & Pinquart, M. (2011). The Coparenting Inventory for Parents and Adolescents (CI-PA): Reliability and validity. European Journal of Psychological Assessment, 27, 206–214.

  • van Baardewijk, Y., Andershed, H., Stegge, H., Nilsson, K. W., Scholte, E., & Vermeiren, R. (2010). Development and test of short versions of the Youth Psychopathic Traits Inventory and the Youth Psychopathic Traits Inventory – Child Version. European Journal of Psychological Assessment, 26, 122–128.

  • van de Vijver, F. J. R. (2003). Test adaptation/translation methods. In R. Fernández-Ballesteros (Ed.), Encyclopedia of psychological assessment (pp. 960–964). Thousand Oaks, CA: Sage.

  • van de Vijver, F. J. R. (2011). Bias and real differences in cross-cultural differences: Neither friends nor foes. In F. J. R. van de Vijver, A. Chasiotis, & S. M. Breugelmans (Eds.), Fundamental questions in cross-cultural psychology (pp. 235–257). New York, NY: Cambridge University Press.

  • Vautier, S., Raufaste, E., & Carou, M. (2003). Dimensionality of the revised Life Orientation Test and the status of filler items. International Journal of Psychology, 38, 390–400.

  • Veirman, E., Brouwers, S. A., & Fontaine, J. (2011). The assessment of emotional awareness in children: Validation of the Levels of Emotional Awareness for Children. European Journal of Psychological Assessment, 27, 265–273.

  • Vissers, W., Keijsers, G. P. J., van der Veld, W. M., de Jong, C. A. J., & Hutschemaekers, G. J. M. (2010). Development of the Remoralization Scale: An extension of contemporary psychotherapy outcome measurement. European Journal of Psychological Assessment, 26, 293–301.

  • Wright, M., Creed, P., & Zimmer-Gembeck, M. J. (2010). The development and initial validation of a Brief Daily Hassles Scale suitable for use in adolescents. European Journal of Psychological Assessment, 26, 220–226.

  • Zohar, A. H., & Cloninger, C. R. (2011). The psychometric properties of the TCI-140 in Hebrew. European Journal of Psychological Assessment, 27, 73–80.

  • Zohar, A. H., Denollet, J., Ari, L. L., & Cloninger, C. R. (2011). The psychometric properties of the DS14 in Hebrew and the prevalence of type D personality in Israeli adults. European Journal of Psychological Assessment, 27, 274–281.

Karl Schweizer, Department of Psychology, Goethe University Frankfurt, Mertonstr. 17, 60054 Frankfurt a. M., Germany, +49 69 798-22081, +49 69 798-23847