Evaluation and Program Planning

Volume 48, February 2015, Pages 100-116

Comparing rating paradigms for evidence-based program registers in behavioral health: Evidentiary criteria and implications for assessing programs

https://doi.org/10.1016/j.evalprogplan.2014.09.007

Highlights

  • Registers tend to recognize the “traditional” Campbellian hierarchy of evidence.

  • Registers sometimes disagree on the criteria that define program effectiveness and research quality.

  • Registers use statistical significance as evidence of effectiveness, but usually ignore program effect size.

  • Registers avoid assessing qualitative evidence.

  • Using meta-analyses to guide program-level decision-making can be problematic.

Abstract

Decision makers need timely and credible information about the effectiveness of behavioral health interventions. Online evidence-based program registers (EBPRs) have been developed to address this need. However, the methods by which these registers designate programs and practices as "evidence-based" have not been investigated in detail. This paper examines the evidentiary criteria EBPRs use to rate programs and the implications for how different registers rate the same programs. Although the registers tend to employ a standard Campbellian hierarchy of evidence to assess evaluation results, there is also considerable disagreement among the registers about what constitutes an adequate research design and sufficient data for designating a program as evidence-based. Additionally, differences exist in how registers report findings of "no effect," which may deprive users of important information. Of all programs on the 15 registers that rate individual programs, 79% appear on only one register. Among a random sample of 100 programs rated by more than one register, 42% were rated inconsistently across registers to some degree.
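As a concrete illustration of the two summary figures above (the share of programs appearing on only one register, and the share of multi-register programs rated inconsistently), the following minimal Python sketch tabulates them from a small hypothetical table of (program, register, rating) records. It is not the authors' analysis code; every program name, register name, and rating in it is invented.

```python
from collections import defaultdict

# Hypothetical (program, register, rating) records; all names and ratings
# below are illustrative, not data from the study.
records = [
    ("Program A", "Register 1", "evidence-based"),
    ("Program A", "Register 2", "promising"),
    ("Program B", "Register 1", "evidence-based"),
    ("Program C", "Register 2", "evidence-based"),
    ("Program C", "Register 3", "evidence-based"),
]

# Collect each program's ratings across registers.
ratings_by_program = defaultdict(list)
for program, register, rating in records:
    ratings_by_program[program].append(rating)

single_listed = [p for p, r in ratings_by_program.items() if len(r) == 1]
multi_listed = {p: r for p, r in ratings_by_program.items() if len(r) > 1}
inconsistent = [p for p, r in multi_listed.items() if len(set(r)) > 1]

print(f"Programs on only one register: {len(single_listed)}/{len(ratings_by_program)}")
print(f"Multi-register programs rated inconsistently: {len(inconsistent)}/{len(multi_listed)}")
```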

Introduction

Since the 1960s, professional evaluation has aimed to apply scientific research methods to develop evidence-based practices or programs (EBPs). The term “evidence-based” gained traction in a number of primary disciplines ranging from healthcare to education to law enforcement. Most recently, federal, state, and local governments have begun to increase their mandates for the use of EBPs (Hawai’i State Center for Nursing, 2013, Minnesota Department of Corrections, 2011, Office of the President of the United States of America, 2014a, Office of the President of the United States of America, 2014b, Reickmann et al., 2011, United States Office of Management and Budget, 2013). Such mandates may require recipients of funding to spend a stated portion of their funds on EBPs (Office of Adolescent Health/Mathematica Policy Research, 2014). A policy memorandum from the United States Office of Management and Budget (OMB) directs federal agencies to use credible evidence in the formulation of budget proposals and performance plans. OMB further encourages funding of programs that are backed by strong evidence of effectiveness (United States Office of Management and Budget, 2013). However, exactly what constitutes credible evidence continues to be debated (Donaldson et al., 2009, Sackett et al., 1996).

Discussions surrounding credible evidence have largely focused on the philosophical and ideological disagreements concerning taxonomies and hierarchies of methodological quality and rigor. Relatively well-established hierarchies describing standards of evidence do exist, ranging from highly rigorous systematic reviews to randomized controlled trials to single-case designs. The preferred designs in those hierarchies are those likely to yield evidence subject to the least amount of bias (Rossi et al., 2004, Shadish et al., 2002). However, evidence derived outside the sphere of experiments is often accepted as credible as well (Donaldson et al., 2009). Furthermore, Gambrill (2006) points out that what qualifies a program as evidence-based must not be limited to the quality of evidence (i.e., what method was used), but must also include factors such as ethics and impartiality. The evaluation field has come to recognize that the context of evidence is also important. Information beyond manifest outcomes must be considered when making decisions about program implementation (Donaldson et al., 2009, Scriven, 2014, Shadish et al., 1991). Although considering multiple forms and sources of credible evidence sounds ideal in theory, it may cause problems for decision makers who need concrete and unambiguous rules to identify EBPs.
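To show how a Campbellian-style hierarchy can be turned into the kind of concrete, unambiguous rule decision makers ask for, the sketch below encodes a simplified, hypothetical tiering scheme in Python. The design ranks, tier labels, and thresholds are assumptions for illustration only, not the criteria of any actual register; note that, like many of the registers examined later, the rule keys on statistical significance and never consults effect size.

```python
# Hypothetical operationalization of a Campbellian-style evidence hierarchy.
# The design ranks, tier labels, and thresholds are illustrative assumptions,
# not the rating rule of any actual EBPR.
DESIGN_RANK = {
    "systematic_review": 4,
    "randomized_controlled_trial": 3,
    "quasi_experiment": 2,
    "single_case_design": 1,
    "non_experimental": 0,
}

def rate_program(studies):
    """Assign a summary tier from study dicts with keys
    'design', 'significant' (bool), and 'effect_size' (float or None)."""
    supportive = [s for s in studies if s["significant"]]
    if not supportive:
        return "no demonstrated effect"
    # Effect size is deliberately ignored here, mirroring the observation
    # that many registers rely on statistical significance alone.
    best_rank = max(DESIGN_RANK[s["design"]] for s in supportive)
    if best_rank >= 3 and len(supportive) >= 2:
        return "evidence-based (top tier)"
    if best_rank >= 2:
        return "promising"
    return "insufficient evidence"

# Example: two supportive studies, at least one at RCT level or above,
# clears this (hypothetical) top-tier bar.
studies = [
    {"design": "randomized_controlled_trial", "significant": True, "effect_size": 0.20},
    {"design": "quasi_experiment", "significant": True, "effect_size": 0.35},
]
print(rate_program(studies))  # -> "evidence-based (top tier)"
```

Even small changes to such thresholds (e.g., requiring two randomized trials rather than one) would move programs between tiers, which is one mechanism by which registers applying nominally similar hierarchies can rate the same program differently.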

It is a daunting process for individuals and even organizations to access, review, and synthesize the large and rapidly growing body of research literature in many of the primary disciplines (Bastian et al., 2010, Jette et al., 2003, Shadish et al., 1991). Policymakers, administrators, clinicians, and practitioners need assistance in efficiently collecting, aggregating, and interpreting what constitutes valid evidence of a program's effectiveness. Consumers need resources that will help them filter information in order to make sound decisions for program participation. Evidence-based program registers (EBPRs) are a relatively recent mechanism for assisting in this process in the behavioral health field (United States Government Accountability Office, 2009). These registers were established to assess applied research and evaluation studies of programs/interventions according to evidentiary (evidence-based) standards, in order to help potential users decide which programs/interventions to support or select for implementation.

Several recent reviews of EBPRs, primarily focused on federally sponsored registers – such as the National Registry of Evidence-Based Programs and Practices, the What Works Clearinghouse, and Social Programs that Work – identified a need for greater transparency in how such registers assess program effectiveness, apply evidentiary standards, and disseminate information to the public (Hennessy and Green-Hennessy, 2011, United States Government Accountability Office, 2009, United States Government Accountability Office, 2010).

At the inception of the present study, it was unknown whether these problems had been remedied. The authors’ preliminary examination of relevant EBPRs found many different paradigms for rating programs. In particular, some registers listed programs as “evidence-based” (or not) with no further gradation in ratings, while others employed graded levels (“tiers”) of evidence in support of a program. Among the latter type of register, the top two tiers of evidence appeared to correspond to what the field has termed “evidence-based,” although the distinctions between the top two tiers varied among registers. In general, there seemed to be many, and often not obvious, differences among registers in designating programs as evidence-based.

The purpose of this study is to describe the similarities and differences in evidentiary criteria of effectiveness and program rating schemes for EBPRs in behavioral health, with special attention to variation in evidentiary criteria between the top two tiers for registers with multiple rating tiers. This study also attempts to determine the extent to which such differences in evidentiary criteria affect ratings of the same program appearing in more than one register.

Section snippets

Methods

The present study was phase 2 of a sequential mixed methods study. In phase 1, the authors identified 20 active evidence-based program registers that included interventions pertinent to behavioral health. This was followed by analysis of the websites of those registers to identify their scope, purpose, key structural elements, funding sources, marketing and dissemination strategies, and challenges associated with maintaining them. Interviews with the register managers were also conducted in

Results

Twenty EBPRs are included in the content analysis of evidentiary criteria (Table 1). Twelve of the registers rate only individual treatment, prevention, or service programs (“programs”), five registers rate only broad modalities or types of intervention (“modalities”), and three rate both individual programs and modalities (“mixed”). Four of the registers that rate modalities do so using meta-analytic or systematic review procedures. One register of modalities uses a rating scheme similar to the

Discussion

Decision makers need independent and objective information about which programs have the strongest and most credible evidence of effectiveness. They may be faced with a large number of candidate programs and, given resource constraints, need to select the most promising for their circumstances. Alternatively, they may be attracted to a specific program or modality and need answers about whether it is likely to succeed in their setting and/or whether it is likely to be an improvement over what

Conclusions

Several conclusions can be drawn about how EBPRs rate behavioral health interventions and the consequent implications for program selection. Because many programs have detectable positive effects, a procedure is needed to differentiate among them, whether best, good, marginally effective, or ineffective. Although EBPRs are meant to be that mechanism, only registers that rank programs and include a “no effect” category potentially provide enough information to make those

Lessons learned

Future research should more closely examine the implications of program or intervention scale on the types of evidence that are feasible and meaningful to collect. Large-scale, multi-site studies have the resources to collect extensive outcome and implementation data, whereas most evaluations are more modest in capability. How this translates into assessment of which programs are evidence-based should be further studied.

Additionally, future research should examine the ways in which

Acknowledgments

We would like to extend our appreciation to Dr. Robert Slavin, who has done much work related to evidence-based education reform. His contributions to this study helped us make several critical leaps in our thinking, and although the Best Evidence Encyclopedia was ultimately not incorporated into our final analysis sample, we feel that his work is an important contribution to the field. http://www.bestevidence.org/.

We would also like to acknowledge our expert review panel, which


References (40)

  • J. Greene et al.

    The impact of tobacco dependence treatment coverage and copayments in Medicaid

    American Journal of Preventive Medicine

    (2014)
  • P. 6 et al.

    Principles of methodology: Research design in social science

    (2012)
  • J. Abraham

    How might the Affordable Care Act's coverage expansion provisions influence demand for medical care?

    The Milbank Quarterly

    (2014)
  • Agency for Healthcare Research and Quality

    Effective health care program: Helping you make better treatment choices

    (2013)
  • M.C. Alkin et al.

    The evaluator's role in valuing: Who and with whom

  • APA Presidential Task Force

    Evidence-based practice in psychology

    American Psychologist

    (2006)
  • H. Bastian et al.

    Seventy-five trials and eleven systematic reviews a day: How will we ever keep up?

    PLoS Medicine

    (2010)
  • J.T. Burkhardt et al.

    An overview of evidence-based program registers (EBPRs) for behavioral health

    Evaluation and Program Planning

    (2014)
  • D.L. Chambless et al.

    Defining empirically supported therapies

    Journal of Consulting and Clinical Psychology

    (1998)
  • E. Chelimsky

    Valuing, evaluation methods, and the politicization of the evaluation process

  • C. Claes et al.

    An integrative approach to evidence-based practice

    Evaluation and Program Planning

    (2014)
  • Council for Training in Evidence-Based Behavioral Practice

    Definition and competencies for Evidence-Based Behavioral Practice (EBBP)

    (2008)
  • S.I. Donaldson et al.

    What counts as credible evidence in applied research and evaluation practice?

    (2009)
  • E. Gambrill

    Evidence-based practice and policy: Choices ahead

    Research on Social Work Practice

    (2006)
  • Q. Gu et al.

    The Medicare hospital readmissions reduction program: Potential unintended consequences for hospitals serving vulnerable populations

    Health Services Research

    (2014)
  • Hawai’i State Center for Nursing

    Hawai’i State Center for Nursing

    (2013)
  • K.D. Hennessy et al.

    A review of mental health interventions in SAMHSA's National Registry of Evidence-Based Programs and Practices

    Psychiatric Services

    (2011)
  • U.D. Jette et al.

    Evidence-based practice: Beliefs, attitudes, knowledge, and behaviors of physical therapists

    Physical Therapy

    (2003)
Cited by (45)

    • The influence of evidence-based program registry websites for dissemination of evidence-based interventions in behavioral healthcare

      2023, Evaluation and Program Planning
      Citation Excerpt:

      EBPRs use standardized research criteria to evaluate the merit and worth of behavioral health interventions based on existing evaluation studies. Their paradigms follow well-accepted hierarchies of evidence that usually define the randomized controlled trial as the highest form of evidence for program effectiveness (Burkhardt et al., 2015; Means et al., 2015; Horne, 2017; Shadish et al., 2002). The result of this process is usually a summary rating of the strength, and often quality, of evidence supporting or not supporting the effectiveness of a given behavioral health program or practice.

    • “What works” registries of interventions to improve child and youth psychosocial outcomes: A critical appraisal

      2022, Children and Youth Services Review
      Citation Excerpt:

      For instance, Means et al. (2015) examined a random sample of 100 programs assessed by more than one registry and found that 53% received different classifications across organizations. There are several reasons for these inconsistencies (Means et al., 2015; Fagan & Buchanan, 2016; Zack et al., 2019). First, registries use different processes and criteria to select studies that inform the rating.

    • State behavioral health agency website references to evidence-based program registers

      2021, Evaluation and Program Planning
      Citation Excerpt:

      This process results in a summary rating of the program or practice (e.g., “evidence-based,” “top tier,” “well-supported”), which the EBPR makes available to clinicians, provider agency leaders, policymakers, journalists, researchers, and the general public. Most EBPRs have some method of periodically incorporating new evidence into their synthesis, although the frequency of these updates varies from register to register (Burkhardt et al., 2015; Hallfors & Cho, 2007; Jossie, 2019; Means et al., 2015; Petrosino, 2014). The summary ratings produced by the EBPRs are primarily intended to allow EBPR users with only minimal knowledge of research methods to select a new program or clinical practice to implement or to facilitate the vetting of programs or practices already being implemented.


    Stephanie N. Means is a project manager at WMU's Evaluation Center and a Doctoral Candidate in the Interdisciplinary Doctoral Program in Evaluation (IDPE). She is currently working on several projects, including the Urban Core Collective, Professional Development in Science Teaching, and an evaluation of the Literacy Center of West Michigan's adult literacy programs. Her past grant work includes projects funded by the National Institutes of Health, the National Science Foundation, and the W.K. Kellogg Foundation, as well as several other state- and local-level evaluations.

    Stephen Magura is the Director of the Evaluation Center at Western Michigan University and former Deputy and Acting Executive Director at National Development and Research Institutes. He has led clinical efficacy trials of behavioral, psychosocial and medication treatments as well as evaluations of social services programs. He has been principal investigator of 19 research grants from the National Institutes of Health and is the current Editor-in-Chief of the journal Substance Use and Misuse. He has also contributed three evidence-based programs to the National Registry of Evidence-based Programs and Practices (NREPP).

    Jason T. Burkhardt is a project manager at WMU's Evaluation Center, and a Doctoral Candidate in the Interdisciplinary Doctoral Program in Evaluation (IDPE). He is currently working on several projects, including the Evaluation Resource Center for the National Science Foundation's Advanced Technical Education Program and an evaluation of WMU's Upward Bound Program. His past grant work includes projects with the National Institute on Drug Abuse, the American Red Cross, the American Council on Education, and several state and local level evaluations. His dissertation work focuses on “Users of Evidence-Based Practice Registers”.

    Daniela C. Schröter is the Director of Research in WMU's Evaluation Center (EC) and an Associate Faculty member in the Interdisciplinary Ph.D. in Evaluation program (IDPE). She leads evaluation studies, conducts research on evaluation, provides professional development in evaluation, serves on the EC's leadership team, and works with a diverse group of doctoral students. Schröter received the 2013 AEA Guttentag Outstanding New Evaluator Award and represents The Evaluation Center on the Joint Committee on Standards for Education Evaluation.

    Chris L.S. Coryn is the Director of the Interdisciplinary Ph.D. in Evaluation (IDPE) program and an Associate Professor in the Evaluation, Measurement, and Research (EMR) program at Western Michigan University (WMU). He has been involved in and led numerous research studies and evaluations, funded by the Department of Justice, National Science Foundation, National Institutes of Health, and others, across several substantive domains, including research and evaluation in education, science and technology, health and medicine, community and international development, and social and human services. He has given numerous lectures, speeches, and workshops, both nationally and internationally.
