Assessments of psychotherapeutic competencies play a crucial role in research and training. However, research on the reliability and validity of such assessments is sparse. This study aimed to provide an overview of the current evidence and to estimate the average interrater reliability (IRR) of psychotherapeutic competence ratings. A systematic review identified 20 studies reported in 32 publications. All 20 studies were included in a narrative synthesis, and 20 coefficients were entered into the meta-analysis. Most primary studies referred to cognitive-behavioral therapies and the treatment of depression, used the Cognitive Therapy Scale, based ratings on videos, and trained the raters. Our meta-analysis revealed a pooled intraclass correlation coefficient (ICC) of 0.82 but also severe heterogeneity. The evidence map highlighted a variety of variables related to competence assessments. Further aspects influencing the reliability of competence ratings, as well as the considerable heterogeneity, are discussed in detail in the manuscript.