Preamble

Charlotte Johnston, Editor in Chief

The Journal of Abnormal Child Psychology receives an increasing number of manuscripts reporting tests of associations between candidate genetic polymorphisms and measures of psychopathology in children and adolescents, including tests of gene-environment interactions. These studies reflect the exciting new directions that research in child psychopathology is taking, but they also pose a challenge with regard to the criteria used to evaluate them. Across several fields, standards for the publication of such studies are currently in flux. It therefore seemed appropriate to advance an editorial policy for the Journal to guide both authors and reviewers as we evaluate and incorporate candidate gene work into the study of child psychopathology.

To this end, I invited Dr. Benjamin Lahey, a previous president of ISRCAP, a long-standing editorial board member, and an active contributor to research on candidate genes and gene-environment interactions in child psychopathology, to partner with Dr. Walter Matthys, a current Associate Editor for the Journal, to prepare a statement outlining such an editorial policy. I express my appreciation to both of them for their efforts in producing the well-reasoned and clear policy presented below. I am confident that this policy will serve our journal well as we move forward in learning about genetic factors in youth psychopathology.

Editorial Policy

Benjamin B. Lahey and Walter Matthys

We appreciate the opportunity to articulate an editorial policy regarding candidate gene studies for the Journal of Abnormal Child Psychology. Although this policy outlines standards for evaluating submitted work, these standards may be of the greatest value when planning new studies of such topics. This policy is largely based on the similar policy adopted by the journal Behavior Genetics (Hewitt 2012), but it is expanded to be directly relevant to this journal. Finally, we note that the emerging nature of candidate gene studies will likely necessitate ongoing revision to this policy as methods and techniques advance.

Much has been written recently about the general issue of replicability in psychology, psychiatry, and medicine. Serious concerns have been raised about an excessive number of false-positive findings (i.e., findings that have failed to be replicated) (Ioannidis 2005c). The concern is that such false positives mislead both scientists and the public, and may direct the allocation of scientific resources in less than optimal ways. There will always be false positives in science, even when the strongest methods are used, but a number of practices commonly engaged in by researchers may increase the number of findings that cannot be replicated (Simmons et al. 2011). Moreover, there is good reason to believe that science is not quick to detect false positives and correct misimpressions (Ioannidis 2012). Thus, the issue of replicability is front and center today in many sciences.

The tendency for novel findings to subsequently fail replication may be particularly great in new and “hot” areas of research, such as candidate gene associations and gene-environment interactions. A strong publication bias toward positive findings exists, driven partly by incentives for both authors and editors to publish positive reports. Other things being equal, reviewers and editors may be more likely to agree that exciting and novel findings should be published than findings on more established topics.

Particular attention has been given to the problem of replicability in research using molecular genetic methods. Although there are many ways to study associations between genetic variants and behavior, each with its own pitfalls, particular concerns have been raised about replicability in candidate gene studies (Ioannidis 2005a, 2005b; Ioannidis and Khoury 2011; Ioannidis et al. 2001; Munafo 2009; Sullivan 2007). This is not to say that there have not been replicated candidate gene findings; there have been (Gizer et al. 2009). Nonetheless, the number of statistically significant candidate gene associations and gene-environment interactions that appear not to be replicated is easily large enough to warrant particular concern (Duncan and Keller 2011; Hewitt 2012; Hunter 2005; Mill and Petronis 2007; Moffitt et al. 2005).

Any set of restrictive standards designed to reduce the number of published false positives will almost certainly have the disadvantage of also reducing the number of published novel findings that are later replicated. Keeping this in mind, the position of the Journal is that special care nonetheless needs to be taken to make editorial decisions in a manner that reduces false-positive findings of candidate gene associations and gene-environment interactions. Therefore, manuscripts involving candidate genes will be viewed favorably only if they meet the standards below. These standards are not entirely new to the journal; rather, they reflect a synthesis of the standards used by reviewers and editors in recent editorial decisions. Sound arguments for special cases will, of course, be considered, but the following list sets out the standards we regard as appropriate.

  1. Measurement. As in any area of research, tests of genetic associations and gene-environment interactions require reliable and valid measures of all variables. Findings of gene-environment interactions are strengthened, moreover, when the environments and phenotypes are measured using separate informants (Moffitt et al. 2005).

  2. Biological and psychological plausibility. In studies of candidate genetic variants and gene-environment interactions, findings are more convincing when the genetic variants, environments, and phenotypes are all selected based on a well-articulated and plausible biological and psychological model of the phenomenon. Choosing environments and genetic variants with known functional consequences strengthens such a case. Because each candidate genetic variant represents only one of an extremely large number of possible variants, a strong case must be made for studying the selected variant and its associated psychological function over other possible choices.

  3. Attention to population heterogeneity. A satisfactory approach to population heterogeneity is required because molecular genetic studies pose special problems for sample selection. On the one hand, there are important advantages inherent in studying diverse samples, if only to conduct research that applies to all children and adolescents. On the other hand, the world’s ancestry groups differ in both allele frequencies and complexity. This creates opportunities for mistaken findings based on population stratification (Rosenberg et al. 2010) and may even require different genotyping platforms for different ancestry groups (Hoffmann et al. 2011). These problems are particularly acute for smaller diverse samples, which rarely have the statistical power needed to address them.

  4. Prospective designs. Prospective longitudinal designs provide much stronger tests of genetic associations and gene-environment interactions than cross-sectional designs because prospective studies can rule out reverse causation (Kraemer 2010). It may be possible to justify cross-sectional studies of genetic associations and gene-environment interactions, however. As noted below, such studies must fully take gene-environment correlation into account, which is difficult to accomplish in cross-sectional designs.

  5. Statistical power and control of alpha. Manuscripts will be considered for publication only if the tests are based on sample sizes that provide sufficient statistical power under reasonable estimates of effect sizes. The sample size must allow the analyses to correct the alpha level for the number of related statistical tests performed, including tests that are not reported (Ioannidis 2005c; Little et al. 2009); a minimal illustration appears in the first sketch following this policy.

  6. Statistical tests of gene-environment interaction. Special care needs to be taken when conducting any statistical test of interaction (McClelland and Judd 1993). Because violations of the assumption of multivariate normality can result in the “detection” of interactions when none exists (Eaves 2006), statistical methods must reduce the likelihood of scaling artifacts. This is especially pertinent for psychopathology research because the highly skewed and kurtotic distributions of dimensions of psychopathology almost never meet the assumption of multivariate normality. In addition, tests of gene-environment interaction can be difficult to interpret in the presence of gene-environment correlation (Eaves et al. 2003; Rathouz et al. 2008); tests of interaction must therefore take this phenomenon into account. Gene-environment correlation may be circumvented in a randomized controlled trial because randomization breaks any potential gene-environment correlation (van IJzendoorn et al. 2011). Finally, tests of gene-environment interaction must consider the fact that the power to detect interactions is often lower than the power to detect main effects (Duncan and Keller 2011). The second sketch following this policy illustrates several of these points.

  7. Need for replication. Tests of simple and moderated associations with genetic variants will be considered for publication if they reflect sound attempts to directly replicate previously published findings. Successful replications and failures to replicate will be given equal priority, although, to be convincing, failures to replicate must be close replications in terms of methods and samples and must be adequately powered. For replications, adequate sample size is challenging to estimate for at least two reasons. First, confidence intervals must be constructed around the original effect size, and the lower bound (not the original point estimate) should serve as the predicted effect size when determining the size of the replication sample (Greene et al. 2009). Second, the well-documented “shrinking effect size” and “winner’s curse” phenomena (Ioannidis et al. 2001; Xiao and Boehnke 2009) must be considered. These terms refer to the tendency for findings with large effect sizes for a genetic variant to be published first (perhaps partly because only large effect sizes will be statistically significant in small, preliminary studies); very often, the effect sizes in subsequent studies are smaller even when they are statistically significant. Thus, even conservative estimates of the sample size needed for tests of replication, based on the lower bound of the original effect size estimate, may be too small. Statistical methods have been developed to improve sample size estimates under conditions of expected shrinking effect sizes, however (Xiao and Boehnke 2009). The third sketch following this policy gives a minimal worked example.

When a manuscript presents the first test of a given candidate gene association, it is necessary to provide a successful replication in a second, independent sample in the same manuscript (Caspi et al. 2008). With large samples, random splits of the sample provide particularly strong tests of replication because of the lack of differences in methods and samples. Replication has become a minimum standard for novel findings of associations (Hewitt 2012). Such replications do not guarantee that a statistically significant association accurately reflects a process of nature (Hewitt 2012), but they are the strongest standard we possess. An argument could be made to waive the requirement of replication, however, for an otherwise strong test of association in which the finding meets the strict statistical criteria for genome-wide association tests (e.g., p < 10⁻⁸) (Hewitt 2012). Of course, the journal also welcomes meta-analytic summaries of candidate gene work because such papers offer another avenue for clarifying the robustness of findings.
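The first sketch below is a minimal, purely illustrative example of the power and alpha-control logic described in standard 5. It is not part of the policy itself: the number of tests, the hypothesized effect size, and the target power are hypothetical placeholders, and statsmodels’ TTestIndPower is used only as one convenient way to carry out the calculation.

```python
# Illustrative only: Bonferroni-style alpha correction across all related
# tests (reported or not) and the per-group sample size implied by a
# hypothesized effect size. All numbers below are hypothetical.
from statsmodels.stats.power import TTestIndPower

n_tests = 12                      # all related tests performed, reported or not
alpha_corrected = 0.05 / n_tests  # Bonferroni-corrected alpha

effect_size = 0.15                # hypothesized Cohen's d for a candidate gene effect

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=effect_size,
                                   alpha=alpha_corrected,
                                   power=0.80,
                                   alternative="two-sided")
print(f"Corrected alpha: {alpha_corrected:.4f}")
print(f"Approximate n per group for 80% power: {n_per_group:.0f}")
```

In this hypothetical case the corrected alpha and the small hypothesized effect imply roughly a thousand participants per group, which makes concrete why the policy ties publication to adequately powered designs.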
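The second sketch illustrates, with simulated data, the kind of gene-environment interaction test described in standard 6: both main effects are retained alongside the product term, gene-environment correlation is checked before the interaction is interpreted, and a rank-based inverse-normal transform is applied to a skewed phenotype to reduce scaling artifacts. All variable names and parameter values are hypothetical, and the transform shown is only one of several reasonable ways to address non-normality.

```python
# Illustrative only: a gene-environment interaction test with simulated data.
import numpy as np
import pandas as pd
import scipy.stats as st
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
g = rng.binomial(2, 0.3, n)                     # additive allele count (0/1/2)
env = 0.1 * g + rng.normal(size=n)              # environment, mildly correlated with g
y_raw = np.exp(0.2 * env + rng.normal(size=n))  # skewed phenotype, no true G x E

# Gene-environment correlation: a nonzero rGE complicates interpretation.
r_ge, p_ge = st.pearsonr(g, env)
print(f"rGE = {r_ge:.3f} (p = {p_ge:.3g})")

# Rank-based inverse-normal transform of the skewed outcome.
ranks = st.rankdata(y_raw)
y = st.norm.ppf((ranks - 0.5) / n)

df = pd.DataFrame({"y": y, "g": g, "env": env})

# Interaction model with both main effects retained alongside the product term.
model = smf.ols("y ~ g * env", data=df).fit()
print(model.summary().tables[1])
```

In this simulated example there is no true interaction; the point of the sketch is only to show the three safeguards side by side, not to prescribe a particular analytic pipeline.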
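The third sketch illustrates the replication sample-size logic in standard 7: the planning effect size is taken from the lower bound of the confidence interval around the original estimate rather than from the published point estimate, as a partial guard against the winner’s curse. The original correlation and sample size below are hypothetical, and the Fisher-z approximation is used only for simplicity.

```python
# Illustrative only: replication N based on the lower 95% confidence bound
# of a hypothetical originally reported correlation, via the Fisher z
# transformation, compared with the N implied by the point estimate itself.
import numpy as np
from scipy import stats

r_orig, n_orig = 0.15, 300        # hypothetical published effect size and sample size

z = np.arctanh(r_orig)            # Fisher z of the original correlation
se = 1.0 / np.sqrt(n_orig - 3)
r_lower = np.tanh(z - stats.norm.ppf(0.975) * se)

def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N to detect correlation r (two-sided test, Fisher z)."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    return ((z_alpha + z_beta) / np.arctanh(r)) ** 2 + 3

print(f"Lower 95% bound on r: {r_lower:.3f}")
print(f"N planned from point estimate: {np.ceil(n_for_correlation(r_orig)):.0f}")
print(f"N planned from lower bound:    {np.ceil(n_for_correlation(r_lower)):.0f}")
```

With these hypothetical values, planning from the lower bound calls for a far larger replication sample than the point estimate would suggest; as the policy notes, even such conservative estimates may still be too small, and Xiao and Boehnke (2009) describe more refined adjustments.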