Open Access | Original Article

Development of a Scale for Assessing Basic Psychotherapeutic Skills

Published Online: https://doi.org/10.1026/1616-3443/a000623

Abstract

Abstract. Background: Well-established scales for the observation-based assessment of psychotherapy competence encompass multiple domains, require extensive rater training, and are rather cost-intensive. Objective: To develop a comprehensive but easy-to-administer instrument for the observation-based assessment of basic communication and counseling skills in both real and simulated patient encounters, the Clinical Communication Skills Scale (CCSS). Methods: We investigated the content validity and applicability of this scale. We then presented videos of simulated therapy sessions conducted by a competent vs. noncompetent therapist online to N = 209 laypersons and psychology students. Results: Results suggested a one-factorial solution. Internal consistency was excellent (α = .94). For most aspects, convergent validity with established scales was moderate to high. The CCSS effectively differentiated between both levels of skill. Conclusions: The CCSS appears to be a feasible, reliable, and valid instrument. Nonetheless, its psychometric criteria should be investigated further in clinical samples, with licensed therapists, and in other languages.

Entwicklung einer Skala zur Erfassung psychotherapeutischer Basisfertigkeiten

Abstract (German). Theoretical background: Established scales for the observation-based assessment of psychotherapeutic competencies cover several domains, require extensive rater training, and are rather cost-intensive to use. Objective: The aim was to develop a comprehensive but easily applicable instrument for the observation-based assessment of basic counseling and communication skills in encounters with real and simulated patients, the Skala zur Einschätzung Klinischer Gesprächsführung (SEKG). Methods: First, the content validity and applicability of the scale were examined. Then, videos of simulated therapy sessions with a competent vs. an incompetent therapist were presented online to N = 209 laypersons and psychology students. Results: The results suggest a one-factor structure. Internal consistency was excellent (α = .94). Convergent validity with established scales was mostly moderate to high. The SEKG differentiated well between the two skill levels. Conclusions: In our study, the SEKG proved to be practicable, reliable, and valid. Nevertheless, its psychometric criteria should be examined further in clinical samples, with licensed therapists, and in other languages.

Definitions of Psychotherapy Competence

One of the most prominent definitions of psychotherapy competence can be traced back to Barber et al. (2007), who distinguished global competence (i. e., whether a therapist “appropriately and independently manages several clinical problems and can adequately help patients realize their treatment goals”) from limited-domain competence (applied “within the context of a specific psychotherapy intervention or treatment modality,” p. 494). The latter is conceptualized to include treatment-unique (e. g., Socratic questioning) as well as general psychotherapy techniques (e. g., the exploration of symptoms; Barber et al., 2007; Weck, 2014). Applying these techniques requires the simultaneous consideration of the patient’s context (Barber et al., 2007). Accordingly, other authors also highlight treatment-specific competencies in addition to generic and metacompetencies as highly relevant (University College London, 2022).

Competence Models

For competence assessments, a framework first proposed nearly 30 years ago in the medical field is Miller’s (1990) pyramid model, which comprises the stages of knowledge (knows), competence (knows how), performance (shows how), and action (does). It was later adapted to cognitive-behavioral therapy by Muse and McManus (2013) and has been cited increasingly over time. The stages of knowledge, practical understanding, practical application, and clinical practice are important for assessment as well as for competence development. Whereas Muse and McManus propose standardized role-plays to assess the practical application of knowledge and skills (“shows how”), assessments of clinical practice (“does”) require ratings of treatment sessions. Thus, depending on the purpose and objective, it is reasonable to implement role-plays either with standardized patients (SPs) or with real ones. Whereas ratings of therapy sessions apply to real patients, using SPs enables the consistent presentation of symptoms and thus reduces confounding (Weck et al., 2019). To improve competence assessment and development, viable rating scales are needed not only for the assessment of patient encounters, but also for rating simulated interactions.

Competence Assessments in the Medical Field

In the medical field, role-plays are an integral part of Objective Structured Clinical Examinations (OSCEs). Despite the importance of OSCEs in the training of medical students, a systematic review of instruments for evaluating communication skills within OSCEs (Cömert et al., 2016) judged the methodological quality of most associated studies to be poor and the psychometric quality of the instruments to be only average. In addition, the instrument the authors rated as most appropriate for assessing students’ medical communication skills (Huntley et al., 2012) covers various aspects (e. g., conversation techniques, professional behavior, or empathy). Similarly, other instruments from the medical field must first be adapted to the psychotherapy setting (e. g., Burt et al., 2014; Scheffer et al., 2008; Simmenroth-Nayda et al., 2014). Counseling-specific scales, in contrast, focus mainly on helping skills (Hill, 2020; Hill & Kellems, 2002).

Competence Assessments in Psychotherapy

For the assessment of psychotherapy competence, the publication of the Cognitive Therapy Scale (CTS; Young & Beck, 1980) was highly influential. Since then, modifications and disorder-specific developments, a revised version (CTS-R; Blackburn et al., 2001), and a related instrument derived specifically for training purposes (Kühne, Lacki et al., 2019; Muse et al., 2017) have been proposed. The CTS and the CTS-R are among the most prominent and widely used competence scales today (Muse et al., 2017), and both have been suggested for training and accreditation purposes (Kazantzis et al., 2018).

To apply the above-mentioned competence scales correctly, extensive rater training is necessary (for an online example, see https://www.accs-scale.co.uk), lasting from one to several days (Kazantzis et al., 2018; Muse & McManus, 2016). Hence, rater training is cost-intensive, and in our experience, it may be difficult to acquire a pool of qualified raters for subsequent studies. Experts view competence in cognitive-behavioral therapy (CBT) as a “complex and fuzzy concept” (Muse & McManus, 2016), and, regarding the CTS, they even disagree on which specific therapeutic behaviors are necessary for evaluating each item (Schmidt et al., 2018). Using the above rating scales is also time-consuming, which makes external ratings rather cost-intensive as well. This is problematic, as the major barriers to the implementation of procedures for ensuring treatment integrity (of which competence assessments are a part) are time, cost, and labor constraints (Perepletchikova et al., 2009).

For this reason, Roth and colleagues (2019) introduced two scales for the assessment of generic and CBT-specific skills observed in whole CBT sessions in routine care. The authors provide several questions as behavioral anchors for each of the 39 items (Roth, 2016). They trained six raters experienced in using the CTS-R to apply the new instrument in two team meetings. The raters assessed a nonrandomized sample of video recordings provided by psychotherapy trainees but reached only poor to moderate interrater reliability (Roth et al., 2019). All of the above-mentioned scales refer to whole sessions and thus comprise numerous domains, from agenda-setting to giving homework. Nevertheless, training often targets specific skills, such as the identification of key cognitions.

The Current Study

A comprehensive but more efficient, easy-to-administer scale could be helpful, particularly to support the effective implementation of skills assessments in psychotherapy training, simulated role-plays, and process research. Thus, the current study served to develop and validate a feasible instrument for the observation-based assessment of basic communication and counseling skills in real and simulated-patient encounters. The scale was designed as an adjunct to more comprehensive rating scales. It should be self-explanatory, easy to administer without intensive training, and cover basic counseling techniques. We explored the content validity and applicability as well as the reliability, validity, and dimensionality of the scale. Furthermore, we investigated its ability to differentiate between (higher and lower) levels of skill.

Methods

Development of the Measure

Since the medical field has a long tradition of using standardized patients (SPs), role-plays, and competence assessments, the Clinical Communication Skills Scale (CCSS) was derived from well-established measures mainly from the medical field, specifically the Calgary-Cambridge Guide (Simmenroth-Nayda et al., 2014), the Global Consultation Rating Scale (Burt et al., 2014), the Berlin Global Rating (Scheffer et al., 2008), the counseling-specific Supervisor or Peer Rating Form of Helper Exploration Skills (Hill, 2020), and a publication on communication skills by Martin et al. (2017). Relevant items were adapted to the context of psychotherapy (for details, see Appendix A).

We developed a preliminary instrument with 48 items. To capture not only fidelity or the frequency of behaviors but also their appropriateness, raters evaluated how appropriately the therapist applied each skill on a Likert-type 4-point scale (0 = not at all appropriately, 1 = not particularly appropriately, 2 = generally appropriately, 3 = entirely appropriately). If the option “not possible to evaluate” was chosen, these values were treated as missing data. Furthermore, we included an item on general counseling skills (rated from 1 = insufficient to 6 = excellent) and the option to provide qualitative feedback.

Two authors (FK, a licensed psychotherapist; DSAB, a psychotherapist in training) piloted the CCSS with two teaching videos for psychotherapists in training (Brakemeier & Jacobi, 2017), as well as with ten videotapes of role-play interactions with psychology students (see Appendix A).

Content Validity and Applicability

To examine the content validity of the scale, we then conducted an online survey. We distributed an invitation to the anonymous survey to colleagues via our professional networks and had the participants evaluate the relevance and comprehensibility of each item on a Likert-type 5-point scale (1 = not very relevant / comprehensible to 5 = very relevant / comprehensible). We further provided open comment fields for each item and, at the end, asked the participants whether they thought any important aspects were missing.

Video Material

We produced two videotapes, both featuring an 8-minute segment of a session on behavioral activation with a simulated depressed patient (SP). Both segments were standardized regarding therapist, content, and simulated patient; only the skill level of the therapist was manipulated. Based on a manual on CBT for depression (Hautzinger, 2013), the therapist (FK) first assessed the SP’s current mood and activity level. The patient was then encouraged to try out one activity (e. g., taking a walk) repeatedly during the ensuing week. For the competent-therapist video, the therapist was instructed to display as many appropriate behaviors listed in the CCSS as possible, e. g., being empathic (item 25) or exploring the patient’s beliefs, emotions, or behaviors (item 27). For the noncompetent-therapist video, the therapist was instructed to omit, or to apply inappropriately, as many of those aspects as possible, for example, by behaving condescendingly or remaining superficial regarding thoughts and feelings. Three authors (PEH, UM, FW) provided feedback during the video recordings, resulting in two videos that were unanimously considered to demonstrate the most and least skillful therapist behaviors.

Online Procedure and Participants

We used a cross-sectional study design. The study was conducted via an online platform of the University of Potsdam (UP Survey; June – August 2019). German-speaking adults (≥ 18 years) were eligible to participate. They were informed about the content of the study, anonymous data collection and privacy, and gave informed consent for participation.

We recruited two subsamples (see Table 1). Since the scale was developed to be easy to administer without extensive training, the first subsample consisted of laypersons with no psychological expertise. These participants were recruited via a commercial online platform (Clickworker, 2019), watched and evaluated the competent-therapist video, and then received an expense allowance of €3.00. The second subsample was a convenience student sample recruited via the department’s participant pool, social media, and student mailing lists. These participants received course credit. We assumed that if laypersons and students were able to use the CCSS appropriately, assistant raters in research or training programs would be capable of doing so, too.

Table 1 Characteristics of the convenience and commercial sample

To investigate the ability of the measure to differentiate between levels of skill, each participant in the convenience sample was randomized to watch either the competent- or the noncompetent-therapist video. To keep the survey brief, laypersons from the commercial sample were presented only with the competent-therapist video.

Overall, N = 209 participants with a mean age of 33.14 years (SD = 11.91) completed the study (see Table 1). If students from the convenience sample provided information on their subject, they mostly indicated attending psychology or linguistics courses. The commercial subsample was significantly older (t(198.34) = 6.55, p < .001, Cohen’s d = .91), less educated (Χ2(4) = 31.21, p < .001, Cramer’s V = .39), and included a larger proportion of male participants (Χ2(2) = 40.85, p < .001, Cramer’s V = .44) than the convenience (student) sample.

Convergent Validity

Convergent validity was examined via correlations between the CCSS and measures that also assess therapist competence, empathy, and certain personality variables (i. e., agreeableness, openness) that we considered relevant to therapeutic skills. For these variables, we expected moderate positive correlations.

Competence

To keep the current study feasible, we selected items of the Cognitive Therapy Scale (CTS; Young & Beck, 1980; German version: Weck et al., 2010), namely, those on Communication and Rationale, as well as the Overall Competence Rating. The items intercorrelated as follows: r(Communication–Rationale) = .62, r(Communication–Overall) = .69, r(Rationale–Overall) = .63; their reliability was good (α = .85). Responses were given on a Likert-type 7-point scale (0 = poor to 6 = excellent); descriptions of the categories poor, mediocre, good, and excellent were provided to the participants.

Empathy

The therapist’s empathy was rated on the Empathy Scale (ES; adapted from Persons & Burns, 1985; German: Partschefeld et al., 2013). This instrument consists of 10 items and uses a Likert-type 4-point scale (0 = not at all to 3 = strongly). In the present study, its reliability was good (α = .86).

Agreeableness and Openness

Participants further rated the therapist on the scales Agreeableness and Openness from the short version of the Big Five Inventory (BFI-K; German: Rammstedt & John, 2004). The BFI-K uses a Likert-type 5-point scale (0 = disagree strongly to 4 = agree strongly). We decided on these traits, as they were predictors of active and empathic communication in a prior study (Sims, 2017). In the current study, however, the reliabilities were less than satisfactory (Agreeableness: α = .65; Openness: α = .57) but comparable to other studies using this measure (e. g., Rammstedt & John, 2004).

Divergent Validity

Extraversion

In a previous study, extraversion had not predicted active and empathic listening (Sims, 2017). Therefore, in the current survey, participants also rated the therapist’s Extraversion (α = .69) on the BFI-K. It was hypothesized that the CCSS would not correlate significantly with Extraversion.

Statistical Analysis

Analyses on the reliability, dimensionality, and validity of the CCSS were performed with the data on the competent-therapist video for both samples (convenience and commercial) combined.

Reliability

Cronbach’s α was used to estimate internal consistency (Cronbach, 1951). Overall, values above .80 are considered “good” (Field et al., 2012). To test each item’s impact on the overall score, we calculated item-total correlations, with values between .40 and .70 being considered “good” (Moosbrugger & Kelava, 2012).
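The authors computed these statistics in R; purely as an illustration (not the authors’ code), Cronbach’s α and corrected item-total correlations can be sketched in a few lines, assuming a raters × items matrix of CCSS scores with missing values already handled:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha (Cronbach, 1951) for an (n_raters x n_items) matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

def corrected_item_total(items):
    """Correlation of each item with the summed score of the remaining items,
    i.e., the corrected item-total correlation (.40 - .70 considered good)."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])
```

In such an analysis, items whose corrected item-total correlation falls below .40 would be flagged for closer inspection.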

Dimensionality and Validity

We used exploratory factor analysis to determine the dimensionality of the CCSS. First, we used the Kaiser-Meyer-Olkin measure (KMO) and Bartlett’s test of sphericity to check the suitability of our data for subsequent factor analyses (Field et al., 2012); KMO values between .70 and .80 were considered “good” (Kaiser, 1974). We expected the KMO measure to exceed .70 (Kaiser, 1974) and assumed the significance of Bartlett’s test (p < .05) indicating sufficient correlations between items. We determined an adequate number of factors using the Kaiser–1 heuristic (eigenvalues above 1.0), scree plots, and parallel analyses. Factor loadings of items were considered substantial if they exceeded .40 (Stevens, 2002).
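Parallel analysis retains only those components whose observed eigenvalues exceed the eigenvalues obtained from random data of the same dimensions. A minimal sketch of the idea (assuming a raters × items data matrix; an illustration, not the authors’ R code):

```python
import numpy as np

def parallel_analysis(data, n_sims=200, seed=0):
    """Horn's parallel analysis: keep components whose observed eigenvalue
    exceeds the mean eigenvalue of simulated random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    # Eigenvalues of the observed correlation matrix, largest first
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_mean = np.zeros(k)
    for _ in range(n_sims):
        sim = rng.standard_normal((n, k))
        random_mean += np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    random_mean /= n_sims
    return int(np.sum(observed > random_mean)), observed, random_mean
```

Unlike the Kaiser–1 heuristic, which tends to overextract components, this criterion usually suggests a more parsimonious solution.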

To test whether the CCSS can differentiate between levels of skill, we used t-tests for independent samples. To determine convergent and discriminant validity, we used Pearson’s correlations. Setting the level of significance at .05, we performed all analyses with R software (R Core Team, 2018).
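The fractional degrees of freedom reported for the sample comparisons (e. g., t(198.34) = 6.55) indicate Welch’s t-test for unequal variances; a minimal numpy sketch of that statistic (illustrative only, not the authors’ R code):

```python
import numpy as np

def welch_t(x, y):
    """Welch's t-test for two independent samples with unequal variances;
    returns the t statistic and the (fractional) degrees of freedom."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    se2 = vx / nx + vy / ny
    t = (x.mean() - y.mean()) / np.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2**2 / ((vx / nx)**2 / (nx - 1) + (vy / ny)**2 / (ny - 1))
    return t, df
```

With equal sample variances and group sizes, df reduces to nx + ny − 2; with unequal variances, it becomes fractional.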

Results

Content Validity and Applicability

Altogether, 10 persons participated in the online survey: three licensed psychotherapists, one psychotherapist in training, two researchers in psychology, and four persons experienced in role-plays (i. e., associates of theater arts). Eighty percent were female; their mean age was 34.56 years (SD = 7.06). The participants had an average of 7.93 years (SD = 5.10) of work experience. After conducting the survey, we eliminated 10 items and added a new one (item 18), resulting in the 37-item version (plus one item on general counseling skills). All items were considered very comprehensible (M = 4.64, SD = .31) and fairly relevant (M = 4.33, SD = .28; Likert-type 5-point scale). According to the participants’ notes, some items were reworded or merged for better comprehensibility, whereas others remained unchanged; all survey results and the changes made according to them are documented in Appendix B.

Descriptive Results and Item Analysis

Appendix C shows the item means, standard deviations, and item-total correlations, separately for the noncompetent- and competent-therapist videos. First, we excluded 3 items from further analyses because of low positive or even negative item-total correlations (i. e., items 33, 34, and 37). The high number of missing values was the second reason for their exclusion (item 33: 33.01 %; item 34: 31.58 %; item 37: 25.36 %).

In our manipulation check, the noncompetent therapist was rated as significantly less skillful than the competent therapist (CCSS mean and general skills, p < .01). Except for Extraversion (t(135.44) = 1.60, p = .11), this was also the case for the other items and scales on Competence, Empathy, Agreeableness, and Openness (see below, all p < .01).

Reliability

Across the main 37 items (without the general skills rating), the internal consistency of the CCSS was excellent (α = .94). For the subsamples, it ranged from α = .93 (convenience sample, noncompetent-therapist video) and α = .94 (commercial sample, competent-therapist video only) to α = .97 (convenience sample, competent-therapist video). Moreover, item-total correlations exceeded .40, except for items 14, 33, 34, and 37 (see Appendix C). This indicates that ratings on most items resembled the CCSS total score.

Dimensionality

To examine the dimensionality of the CCSS, we performed a principal component analysis on the final 34-item CCSS. According to the Kaiser-Meyer-Olkin measure (KMO = .91; all items > .84), our data were suitable for factor analysis. Bartlett’s test of sphericity was significant (Χ2(561) = 2905.10, p < .001), indicating sufficient between-item correlations. The exploratory factor analysis yielded seven components with eigenvalues above Kaiser’s criterion of 1, but the scree plot and the parallel analysis clearly suggested retaining one component (see Appendix D). The factor loadings between .42 and .75 imply that each item contributed substantially to that single component, which accounted for 40 % of the variance.

Validity

Table 2 presents the results on convergent and discriminant validity. As expected, the CCSS mean correlated significantly and positively with the CTS Communication and Rationale scores, and with its Overall Competence Rating. Furthermore, the CCSS correlated significantly and positively with the therapist’s Empathy, Agreeableness, and Openness. There was a small nonsignificant correlation with Extraversion.

Table 2 Correlations based on the competent-therapist video (n = 154)

Differences Between Skill Assessments

Concerning the competent-therapist video (both samples), the therapist’s skill was rated on average as generally to entirely adequate (M = 2.33, SD = 0.50, range = 0 – 3). The general counseling skills mean was 4.68 (SD = 1.08, range = 1 – 6), which represents good competence. In contrast, the noncompetent therapist’s skills (convenience sample) were rated on average as not at all to not particularly adequate (M = 0.89, SD = 0.42, range = 0 – 1.91). Concerning general counseling skills, the noncompetent therapist’s mean was 2.04 (SD = 0.86, range = 1 – 5).

As mentioned above, participants in the commercial sample watched only the competent-therapist video. Regarding general counseling skills, they rated the therapist as significantly less competent than participants from the convenience sample (p < .05). There was no significant difference in the CCSS mean ratings (p = .23).

Discussion

As a complement to current comprehensive rating instruments, we developed and validated a viable scale for the observation-based assessment of basic communication and counseling skills, the Clinical Communication Skills Scale (CCSS; see Appendices E and F). Item development was based on a comprehensive process of literature search, expert ratings, refinement of items, and piloting in two samples. The internal consistency of the CCSS was excellent, and the factor analysis suggested a one-factorial solution. Nevertheless, a factor analysis based on an independent dataset is necessary. As expected, the CCSS correlated significantly with the CTS Communication and Rationale scales and with overall competence. If subsequent studies draw on whole sessions, they could also use the complete CTS for validation purposes (Roth et al., 2019). Such studies are appropriate for further investigating differential correlation patterns between subscales of comprehensive rating scales and the CCSS.

Concerning therapist characteristics, the CCSS correlated most strongly with therapist empathy, showed strong relationships with agreeableness and openness, and, as expected, only a much smaller, nonsignificant association with Extraversion. Empathy correlated more strongly with the CCSS mean than with its overall item on general counseling skills, which is why subsequent studies should investigate the specificity of such an overall item. These results are in line with studies on patient preferences for an empathic, accepting, appreciative, and honest therapist (DeGeorge et al., 2013). They are also consistent with meta-analytic results underscoring the relationships between therapist empathy and genuineness, on the one hand, and the therapeutic alliance, on the other (Nienhuis et al., 2018). Nonetheless, Nienhuis et al. point to the conceptual and methodological overlap and lack of discreteness between the constructs, especially when rated by the same individual. Therefore, future studies should also provide an operationalization of empathy to minimize rater effects. The therapist in our study was perceived as less skillful in the noncompetent condition, which indicates that the manipulation was effective. Except for Extraversion, the noncompetent therapist was also rated as less empathic, agreeable, and open. Therefore, the lack of discreteness seems to apply to the ratings in our study – and presumably to other observer-based studies as well.

Established observer-based instruments such as the CTS (Young & Beck, 1980) or the Assessment of Core CBT Skills (ACCS; Muse et al., 2017) are highly relevant for education, accreditation, and research. In addition, we strove to develop a measure that is easier to apply than more comprehensive rating scales. To examine the scale’s feasibility, we included students and laypersons who had not received specific training. This, in turn, might have contributed to rater effects. Participants in our commercial sample were mainly older, less educated, and more often male than the individuals in the convenience sample, who were mainly female psychology (or linguistics) students. For the latter group, we assume some knowledge of CBT for treating depression. Accordingly, when watching the competent-therapist video, participants in the commercial sample perceived the therapist as acting less skillfully than did individuals in the convenience sample. Therefore, a certain amount of background knowledge seems necessary to minimize rater effects such as a “bias to remain consistent” (i. e., the halo effect; Wirtz, 2017). Future studies should examine in more detail the influence of rater effects and the amount of necessary training. Furthermore, subsequent studies should investigate whether the instrument also remains feasible when more ambivalent or difficult therapy situations are to be rated. Finally, the scale should also demonstrate its reliability and validity in skills assessments of psychotherapy trainees with more divergent competence levels.

During the development of the CCSS, we focused on specific indicators based on concrete behaviors and decided on a multi-item scale. As outlined in the Introduction, feasible skills assessment is relevant not only to simulated interactions but remains challenging for patient encounters as well. However, our scale should apply to both domains and be easy to administer. As is the case in established observation-based scales (e. g., Kazantzis et al., 2018; Kühne, Lacki et al., 2019), the CCSS’s high internal consistency of above .90 may indicate the presence of redundant items. Therefore, the ideal number of CCSS items remains an open empirical question. Following the descriptive results, we excluded 3 items from psychometric analyses. For now, we decided to retain them in the scale for two reasons: First, we considered the integration of writing up information into the discussion (item 33), the graphical demonstration of content (item 34), and the farewell (item 37) as therapeutically relevant; second, we did not present whole sessions to the participants, but only 8-minute videos because of the practicability of the online survey. These segments did not cover the above-mentioned integration of writing and visualizations, and both might be more relevant to whole sessions. In subsequent studies, the 3 items require specific psychometric consideration.

While the complete CCSS should be advantageous, for example, for the detailed evaluation of whole sessions, a rigorously shortened version could be helpful if more economical ratings are necessary. Thus, we are developing a CCSS short version, which is currently being validated in different samples (for updates on that project, see Maass, 2021, May 28, Clinical Communication Skills Scale – Short Version, osf.io/xbeqa). Furthermore, we used the German CCSS for our evaluation (see Appendix E). Although we translated the CCSS into English (see Appendix F) according to scientific standards (Wild et al., 2005), psychometric testing of the English translation is pending. Finally, the CCSS aims to assess skills across therapeutic orientations, although further studies should investigate whether the scale is applicable across therapy schools (as intended), or whether it is more applicable, for example, in cognitive-behavioral therapy than in psychodynamic therapy.

In contrast to the usual empirical studies on therapeutic competencies (Kühne, Meister et al., 2019), we did not include a small number of raters who assess a larger number of therapy situations, but rather two situations that were evaluated by a large number of raters. Consequently, we could not calculate interrater reliabilities. Because of the online design and for confidentiality reasons, we did not use video recordings of real patients but of simulated interactions. Although our study differed in these respects from other research in the field, we tested the practicability of the design for rating competencies online. Subsequent studies should examine the CCSS in recordings of clients and patients, investigate other mental disorders than depression, refer to various therapeutic interventions, include psychotherapists with varying levels of skill, and compare the ratings of clinicians, psychotherapists, and patients.

In summary, we succeeded in developing an economical measure for observer-based ratings, the Clinical Communication Skills Scale (CCSS). According to our data, the instrument enables reliable and valid measurement and is feasible for use by untrained raters, which is highly relevant to the integration of competence assessments into practice. Subsequent studies could compare the CCSS in simulated- vs. real-patient encounters or investigate the amount of information or training necessary to utilize the measure correctly.

Conclusion

To summarize our experiences with developing and using skills measurements, we encourage comprehensive literature searches, expert ratings on content validity, piloting, and a multistep psychometric evaluation of newly developed instruments. Nonetheless, a variety of research questions regarding the CCSS as well as comparable scales require further investigation, for example:

  • Using the scale in whole therapy sessions with different patient populations, with psychotherapy trainees who have diverse competence levels, in more ambiguous therapy situations, or with therapists with different backgrounds.
  • Comparing the ratings of different stakeholders such as laypersons, psychotherapists, and patients; calculating interrater reliabilities.
  • Comparing the scale with more comprehensive competence assessments (such as the CTS or ACCS), defining the optimal length of the scale, considering rater effects, and determining the best way to train the raters.

We would like to thank Dr. Brian Bloch for editing the English.

References

  • Barber, J. P., Sharpless, B. A., Klostermann, S., & McCarthy, K. (2007). Assessing intervention competence and its relation to therapy outcome: A selected review derived from the outcome literature. Professional Psychology: Research and Practice, 38, 493 – 500. https://doi.org/10.1037/0735-7028.38.5.493

  • Blackburn, I. M., James, I. A., Milne, D. L., Baker, C., Standart, S., Garland, A., & Reichelt, F. K. (2001). The revised Cognitive Therapy Scale (CTS-R): Psychometric properties. Behavioural and Cognitive Psychotherapy, 29, 431 – 446. https://doi.org/10.1017/S1352465801004040

  • Brakemeier, E. L., & Jacobi, F. (2017). Verhaltenstherapie in der Praxis: Beltz Video-Learning [Behavioral therapy in practice]. DVD. Beltz.

  • Burt, J., Abel, G., Elmore, N., Campbell, J., Roland, M., Benson, J., & Silverman, J. (2014). Assessing communication quality of consultations in primary care: Initial reliability of the Global Consultation Rating Scale, based on the Calgary-Cambridge Guide to the medical interview. BMJ Open, 4, e004339. https://doi.org/10.1136/bmjopen-2013-004339

  • Clickworker. (2019). Clickworker surveys. Crowdsourcing platform. Retrieved from https://www.clickworker.com/surveys/

  • Cömert, M., Zill, J. M., Christalle, E., Dirmaier, J., Härter, M., & Scholl, I. (2016). Assessing communication skills of medical students in objective structured clinical examinations (OSCE): A systematic review of rating scales. PLoS ONE, 11, e0152717. https://doi.org/10.1371/journal.pone.0152717

  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297 – 334. https://doi.org/10.1007/BF02310555

  • DeGeorge, J., Constantino, M. J., Greenberg, R. P., Swift, J. K., & Smith-Hansen, L. (2013). Sex differences in college students’ preferences for an ideal psychotherapist. Professional Psychology: Research and Practice, 44, 29 – 36. https://doi.org/10.1037/a0029299

  • Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage.

  • Hautzinger, M. (2013). Kognitive Verhaltenstherapie bei Depressionen [Cognitive-behavioral therapy for depression]. Beltz.

  • Hill, C. E. (2020). Helping skills: Facilitating exploration, insight, and action (4th ed.). Web form B: Supervisor rating form. APA. Retrieved from http://supp.apa.org/books/Helping-Skills-Fourth/student/PDF/WebFormB.pdf

  • Hill, C. E., & Kellems, I. S. (2002). Development and use of the helping skills measure to assess client perceptions of the effects of training and of helping skills in sessions. Journal of Counseling Psychology, 49, 264 – 272. https://doi.org/10.1037/0022-0167.49.2.264

  • Huntley, C. D., Salmon, P., Fisher, P. L., Fletcher, I., & Young, B. (2012). LUCAS: A theoretically informed instrument to assess clinical communication in objective structured clinical examinations. Medical Education, 46, 267 – 276. https://doi.org/10.1111/j.1365-2923.2011.04162.x

  • Kaiser, H. F. (1974). An index of factorial simplicity. Psychometrika, 39, 31 – 36. https://doi.org/10.1007/BF02291575

  • Kazantzis, N., Clayton, X., Cronin, T. J., Farchione, D., Limburg, K., & Dobson, K. S. (2018). The Cognitive Therapy Scale and Cognitive Therapy Scale-Revised as measures of therapist competence in cognitive behavior therapy for depression: Relations with short- and long-term outcome. Cognitive Therapy and Research, 42, 385 – 397. https://doi.org/10.1007/s10608-018-9919-4

  • Kühne, F., Lacki, F. J., Muse, K., & Weck, F. (2019). Strengthening the competence of therapists-in-training in the treatment of health anxiety (hypochondriasis): Validation of the Assessment of Core CBT Skills (ACCS). Clinical Psychology & Psychotherapy, 26, 319 – 327. https://doi.org/10.1002/cpp.2353

  • Kühne, F., Meister, R., Maaß, U., Paunov, T., & Weck, F. (2019). How reliable are therapeutic competence ratings? Results of a systematic review and meta-analysis. Cognitive Therapy and Research, 44 (2), 241 – 257. https://doi.org/10.1007/s10608-019-10056-5

  • Martin, O., Rockenbauch, K., Kleinert, E., & Stöbel-Richter, Y. (2017). Aktives Zuhören effektiv vermitteln [Conveying active listening effectively]. Der Nervenarzt, 88, 1026 – 1035. https://doi.org/10.1007/s00115-016-0178-x

  • Miller, G. E. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65, 563 – 570. https://doi.org/10.1097/00001888-199009000-00045

  • Moosbrugger, H., & Kelava, A. (2012). Testtheorie und Fragebogenkonstruktion [Test theory and questionnaire construction]. Springer.

  • Muse, K., & McManus, F. (2013). A systematic review of methods for assessing competence in cognitive–behavioural therapy. Clinical Psychology Review, 33, 484 – 499. https://doi.org/10.1016/j.cpr.2013.01.010

  • Muse, K., & McManus, F. (2016). Expert insight into the assessment of competence in cognitive-behavioural therapy: A qualitative exploration of experts’ experiences, opinions and recommendations. Clinical Psychology and Psychotherapy, 23, 246 – 259. https://doi.org/10.1002/cpp.1952

  • Muse, K., McManus, F., Rakovshik, S., & Thwaites, R. (2017). Development and psychometric evaluation of the Assessment of Core CBT Skills (ACCS): An observation-based tool for assessing cognitive behavioral therapy competence. Psychological Assessment, 29, 542 – 555. https://doi.org/10.1037/pas0000372

  • Nienhuis, J. B., Owen, J., Valentine, J. C., Winkeljohn Black, S., Halford, T. C., Parazak, S. E., … Hilsenroth, M. (2018). Therapeutic alliance, empathy, and genuineness in individual adult psychotherapy: A meta-analytic review. Psychotherapy Research, 28, 593 – 605. https://doi.org/10.1080/10503307.2016.1204023

  • Partschefeld, E., Strauß, B., Geyer, M., & Philipp, S. (2013). Simulationspatienten in der Psychotherapieausbildung [Simulation patients in psychotherapy training]. Psychotherapeut, 58, 438 – 445. https://doi.org/10.1007/s00278-013-1002-8

  • Perepletchikova, F., Hilt, L. M., Chereji, E., & Kazdin, A. E. (2009). Barriers to implementing treatment integrity procedures: Survey of treatment outcome researchers. Journal of Consulting and Clinical Psychology, 77, 212 – 218. https://doi.org/10.1037/a0015232

  • Persons, J. B., & Burns, D. D. (1985). Mechanisms of action of cognitive therapy: The relative contributions of technical and interpersonal interventions. Cognitive Therapy and Research, 9, 539 – 551. https://doi.org/10.1007/BF01173007

  • Rammstedt, B., & John, O. P. (2005). Die Kurzversion des Big Five Inventory (BFI-K): Entwicklung und Validierung eines ökonomischen Inventars zur Erfassung der fünf Faktoren der Persönlichkeit [The short version of the Big Five Inventory (BFI-K): Development and validation of an economical inventory for assessing the five factors of personality]. Diagnostica, 51 (4), 195 – 206. https://doi.org/10.1026/0012-1924.51.4.195

  • R Core Team. (2018). R: A language and environment for statistical computing. Software. Retrieved from https://www.R-project.org

  • Roth, A. D., Myles-Hooton, P., & Branson, A. (2019). Judging clinical competence using structured observation tools: A cautionary tale. Behavioural and Cognitive Psychotherapy, 47, 736 – 744. https://doi.org/10.1017/S1352465819000316

  • Roth, A. D. (2016). A new scale for the assessment of competences in cognitive and behavioural therapy. Behavioural and Cognitive Psychotherapy, 44, 620 – 624. https://doi.org/10.1017/S1352465816000011

  • Scheffer, S., Muehlinghaus, I., Froehmel, A., & Ortwein, H. (2008). Assessing students’ communication skills: Validation of a global rating. Advances in Health Sciences Education, 13, 583 – 592. https://doi.org/10.1007/s10459-007-9074-2

  • Schmidt, I. D., Strunk, D. R., DeRubeis, R. J., Conklin, L. R., & Braun, J. D. (2018). Revisiting how we assess therapist competence in cognitive therapy. Cognitive Therapy and Research, 42, 369 – 384. https://doi.org/10.1007/s10608-018-9908-7

  • Simmenroth-Nayda, A., Heinemann, S., Nolte, C., Fischer, T., & Himmel, W. (2014). Psychometric properties of the Calgary Cambridge guides to assess communication skills of undergraduate medical students. International Journal of Medical Education, 5, 212 – 218. https://doi.org/10.5116/ijme.5454.c665

  • Sims, C. M. (2017). Do the Big-Five personality traits predict empathic listening and assertive communication? International Journal of Listening, 31, 163 – 188. https://doi.org/10.1080/10904018.2016.1202770

  • Stevens, J. P. (2002). Applied multivariate statistics for the social sciences. Routledge.

  • University College London. (2022). Competence frameworks for the delivery of effective psychological interventions. University College London. Retrieved from https://www.ucl.ac.uk/pals/research/clinical-educational-and-health-psychology/research-groups/competence-frameworks

  • Weck, F. (2014). Psychotherapeutische Kompetenzen: Theorien, Erfassung, Förderung [Psychotherapeutic competencies: Theories, assessment, facilitation]. Springer-Verlag.

  • Weck, F., Hautzinger, M., Heidenreich, T., & Stangier, U. (2010). Assessing psychotherapeutic competencies: Validation of a German version of the Cognitive Therapy Scale. Zeitschrift für Klinische Psychologie und Psychotherapie, 39, 244 – 250. https://doi.org/10.1026/1616-3443/a000055

  • Weck, F., Wald, A., & Kühne, F. (2019). Erfassung therapeutischer Kompetenzen in der Forschung und Praxis [Assessment of therapeutic competencies in research and practice]. PiD – Psychotherapie im Dialog, 20, 23 – 27. https://doi.org/10.1055/a-0771-7989

  • Wild, D., Grove, A., Martin, M., Eremenco, S., McElroy, S., Verjee-Lorenz, A., & Erikson, P. (2005). Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: Report of the ISPOR Task Force for Translation and Cultural Adaptation. Value in Health, 8, 94 – 104. https://doi.org/10.1111/j.1524-4733.2005.04054.x

  • Wirtz, M. A. (2017). Interrater reliability. In V. Zeigler-Hill & T. Shackelford (Eds.), Encyclopedia of Personality and Individual Differences. Springer, Cham. https://doi.org/10.1007/978-3-319-28099-8_1317-1

  • Young, J. E., & Beck, A. T. (1980). Cognitive Therapy Scale: Rating manual. Unpublished manuscript, University of Pennsylvania, Philadelphia, PA.

Appendix A

Table A1 Final Items and References and Their Sources

Appendix B

Table B1 Results of the Expert Survey on Content Validity and Applicability

Appendix C

Table C1 CCSS Item Characteristics

Appendix D

Figure D1 Scree Plot and Parallel Analysis

Appendix E

Table E1 Skala zur Einschätzung klinischer Gesprächsführung (SEKG)

Appendix F

Table F1 Clinical Communication Skills Scale (CCSS; English translation)