Test–retest reliability of event-related functional MRI in a probabilistic reversal learning task

https://doi.org/10.1016/j.pscychresns.2009.03.003

Abstract

Repeated functional magnetic resonance imaging (fMRI) studies aim to detect changes in brain activity over time, e.g. to analyze the cerebral correlates of therapeutic interventions. This approach requires a high test–retest reliability of the measures used to rule out incidental findings. However, reliability studies, especially for cognitive tasks, are still difficult to find in the literature. In this study, 10 healthy adult subjects were scanned in two sessions, 16 weeks apart, while performing a probabilistic reversal learning task known to activate orbitofrontal–striatal circuitry. We quantified the reliability of brain activation by computing intra-class correlation coefficients. Group analysis revealed a high concordance for activation patterns in both measurements. Intra-class correlation coefficients (ICCs) were high for brain activation in the associated regions (dorsolateral prefrontal, anterior prefrontal/insular and cingulate cortices), often exceeding 0.8. We conclude that the probabilistic reversal learning task has a high test–retest reliability, making it suitable as a tool for evaluating the dynamics of deterioration in orbitofrontal–striatal circuitry, e.g. to illustrate the course of a psychiatric disorder.

Introduction

During the last decade functional magnetic resonance imaging (fMRI) has been established as a popular tool for non-invasive examination of the working human brain. Today a growing number of longitudinal fMRI experiments are being designed, e.g. to analyze the progression of a neuropsychiatric disorder, the functional reorganization of the brain after stroke or the cerebral correlates of therapeutic interventions. All of these approaches presuppose that fMRI is a valid and reliable method, in the sense that differences between measurements at different time points reflect only the effects of interest and not random or systematic effects produced by the demanding method itself. Despite the large number of fMRI publications, there are still few reports on the test–retest reliability of fMRI experiments.

Prior studies have used various approaches for evaluating the reproducibility of fMRI signals, differing primarily in the examined brain function: experiments range from sensorimotor control (Yetkin et al., 1996, Loubinoux et al., 2001, Maitra et al., 2002, Yoo et al., 2005), visual stimulation (Rombouts and Barkhof, 1997, Miki et al., 2000), fear and disgust processing (Stark et al., 2004), auditory odd-ball processing (Kiehl and Liddle, 2003), language production (Brannen et al., 2001, Rutten et al., 2002), verbal (Manoach et al., 2001, Wei et al., 2004, Wagner et al., 2005) and spatial (Casey et al., 1998) working memory tasks to different higher cognitive tasks (McGonigle et al., 2000, Aron et al., 2006). Furthermore, these studies vary broadly in the test–retest interval (from a few hours to more than 1 year) and in the mathematical approach used to determine reproducibility. Several studies only qualitatively assessed the consistency of suprathreshold activations in predefined brain areas and showed mostly analogous results over repeated measurements. For quantitative analyses, many different measures have been used to determine reliability, e.g. the number of activated voxels, overlap ratio, correlation of activation values or lateralization indices, intra-class correlation coefficient (ICC), intersect maps and conjunction analysis. Recently, the computation of ICCs, which index the degree of correlation between subjects at different time points by relating between-subject and total variance, has been proposed as the most exact approach to assess within-subject variability (Manoach et al., 2001, Aron et al., 2006). Therefore, we calculated ICCs of signal changes in previously determined regions of interest, which were derived from activations at the group level in session 1, session 2 or both.
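As a compact reference for the variance ratio described above, the ICC can be written in a generic form. The symbols for the between-subject and within-subject variance components are notation introduced here for illustration and are not taken from the study; the exact estimator depends on the ANOVA model chosen.

```latex
\mathrm{ICC} = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}}
```

Under this form the denominator is the total variance, so values close to 1 indicate that variation within subjects across sessions is small relative to the variation between subjects.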

To our knowledge there are only a few studies quantitatively examining the test–retest reliability of fMRI procedures using higher cognitive tasks (McGonigle et al., 2000, Aron et al., 2006). The present study aimed to establish the test–retest reliability of fMRI in a probabilistic reversal learning task, which requires subjects to adapt their response strategy according to changes in stimulus–reward contingencies. These set-shifting abilities are of interest in exploring psychiatric disorders, e.g. obsessive–compulsive disorder (OCD), which has been shown to be associated with executive dysfunctions including set-shifting deficits (Kuelz et al., 2004). Interestingly, neuroimaging studies of OCD demonstrate alterations in orbitofrontal cortex, prefrontal cortices, anterior cingulate cortex and the basal ganglia (Pujol et al., 2004, Mitterschiffthaler et al., 2006), structures that have been shown to be involved in probabilistic reversal learning (Cools et al., 2002, Remijnse et al., 2005). In a first behavioral experiment we found prolonged reaction times with increasing severity of compulsions in OCD patients (Valerius et al., 2008). Remijnse et al. were the first to conduct an fMRI experiment with the reversal learning task comparing OCD patients with healthy subjects, and they found behavioral impairments as well as reduced activation of the left posterior orbitofrontal cortex (OFC), bilateral anterior prefrontal cortex (PFC), bilateral dorsolateral prefrontal cortex (DLPFC) and bilateral insula in patients (Remijnse et al., 2006). Assuming that these results are reproducible and the method is reliable, it would be of interest to examine patients over the course of their disease or before and after cognitive-behavioral psychotherapy.

In this longitudinal event-related fMRI study, we examined the test–retest reliability of a probabilistic reversal learning task, hypothesizing that this task shows minor practice effects and produces stable activation patterns in prefrontal, insular and cingulate cortices and in the striatum, making it suitable as a tool for evaluating the dynamics of dysfunctional fronto-striatal brain activity due to psychiatric disorders.

Subjects

Ten right-handed subjects (4 female, mean age = 39.8 years, S.D. = 10.03 years) with no history of psychiatric or neurologic disorder and with normal or corrected-to-normal vision participated. Subjects did not take any psychotropic medication, and there was no substance abuse in the medical history. Informed written consent was obtained from each subject after the procedure had been fully explained. The study was approved by the Ethics Committee of the University of Freiburg according to the

Behavioral data

The number of errors and the reaction time for each event were recorded and analyzed for differences between sessions using repeated measures ANOVAs. Furthermore, we calculated ICCs (1,1) as measures of absolute agreement in behavioral data. There were no significant differences between the two sessions, although the differences in the number of preceding reversal errors (number: mean = 12.2, S.D. = 8.6, versus mean = 6.2, S.D. = 7.1; F = 3.67, df = 1,9, P = 0.09) and in reaction time just failed to reach significance (Table 1).
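The two behavioral statistics named above can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventional one-way random-effects form of ICC(1,1) and using the fact that a repeated measures ANOVA with only two sessions reduces to a squared paired t statistic with df = 1, n - 1. The function names and the toy numbers are hypothetical and are not the study's data or analysis code.

```python
from statistics import mean, variance


def icc_1_1(scores):
    """One-way random-effects ICC(1,1); rows are subjects, columns are sessions."""
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of sessions per subject
    grand = mean(x for row in scores for x in row)
    subject_means = [mean(row) for row in scores]
    # Between-subject mean square
    bms = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (includes any systematic session shift)
    wms = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)


def session_f(session1, session2):
    """F statistic and (df1, df2) for a two-session repeated measures factor."""
    diffs = [a - b for a, b in zip(session1, session2)]
    n = len(diffs)
    # Equivalent to the squared paired t statistic
    f_value = n * mean(diffs) ** 2 / variance(diffs)
    return f_value, (1, n - 1)


# Hypothetical toy data: three subjects, two sessions each
toy = [[1.0, 3.0], [3.0, 5.0], [5.0, 7.0]]
print(icc_1_1(toy))
print(session_f([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
```

In the first toy example every subject shifts by the same two units between sessions, so the sessions correlate perfectly, yet ICC(1,1) is only 0.6. This is the sense in which the coefficient measures absolute agreement and not just consistent ranking.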

Discussion

In this study we evaluated the test–retest reliability of event-related fMRI activations during a probabilistic reversal learning task. Strategy change in both sessions robustly activated prefrontal, cingulate and parietal areas. Group-activation patterns were similar across the two sessions except for slightly stronger overall activation in session 1 and more left-sided prefrontal and parietal activation in session 2. A direct comparison between sessions did not show any statistically

References (30)

  • Brannen, J.H. et al. Reliability of functional MR imaging with word-generation tasks for mapping Broca's area. American Journal of Neuroradiology (2001)
  • Cicchetti, D.V. The precision of reliability and validity estimates re-visited: distinguishing between clinical and statistical significance of sample size requirements. Journal of Clinical and Experimental Neuropsychology (2001)
  • Cicchetti, D.V. et al. Developing criteria for establishing interrater reliability of specific items: applications to assessment of adaptive behavior. American Journal of Mental Deficiency (1981)
  • Cools, R. et al. Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. Journal of Neuroscience (2002)
  • Friedman, L. et al. Test–retest and between-site reliability in a multicenter fMRI study. Human Brain Mapping (2008)

Location of work: Department of Diagnostic Radiology, Medical Physics, University Hospital, Hugstetter Str. 55, D-79106 Freiburg, Germany.