Choice-correlated activity fluctuations underlie learning of neuronal category representation

Engel, Tatiana A.; Chaisangmongkon, Warasinee; Freedman, David J.; Wang, Xiao-Jing

doi:10.1038/ncomms7454

Published: 11 March 2015

Nature Communications volume 6, Article number: 6454 (2015) Cite this article

Abstract

The ability to categorize stimuli into discrete behaviourally relevant groups is an essential cognitive function. To elucidate the neural mechanisms underlying categorization, we constructed a cortical circuit model that is capable of learning a motion categorization task through reward-dependent plasticity. Here we show that stable category representations develop in neurons intermediate to sensory and decision layers if they exhibit choice-correlated activity fluctuations (choice probability). In the model, choice probability and task-specific interneuronal correlations emerge from plasticity of top-down projections from decision neurons. Specific model predictions are confirmed by analysis of single-neuron activity from the monkey parietal cortex, which reveals a mixture of directional and categorical tuning, and a positive correlation between category selectivity and choice probability. Beyond demonstrating a circuit mechanism for categorization, the present work suggests a key role of plastic top-down feedback in simultaneously shaping both neural tuning and correlated neural variability.

Introduction

Through experience we can learn to classify a continuum of sensory stimuli into discrete meaningful categories, which are critical for guiding behaviour^1,2. Training improves our ability to discriminate stimuli belonging to different categories and to group together perceptually dissimilar items within the same category. Such learning and refinement of categorical discriminations occur continuously in everyday life; however, their neural basis is poorly understood.

During training on visual tasks, perceptual improvements are accompanied by only moderate tuning changes in the early visual cortex^3,4, whereas more dramatic changes occur in inferior temporal and posterior parietal cortices. In monkeys trained to classify directions of random dot motion into two arbitrary categories, neurons in the lateral intraparietal (LIP) area encoded learned motion categories in an almost binary manner⁵, whereas in naive animals LIP neurons represent directions uniformly with bell-shaped tuning functions⁶. In contrast, categorization training did not induce any apparent change in motion tuning of neurons in the middle temporal (MT) area. Similarly, changes in responses of LIP but not MT neurons were associated with improved behavioural sensitivity on visual discrimination tasks^7,8,9, which had been attributed to refinements of functional connectivity between MT and LIP through reinforcement learning^10,11; however, the underlying circuit mechanism remains unknown.

We examined whether changes in tuning of LIP neurons induced by training on a motion categorization task can emerge in a neural circuit model through biophysically plausible Hebbian synaptic plasticity modulated by reward prediction error (RPE) signals^12,13,14,15. Unlike the classical two-layer categorization model¹⁶, our model incorporated a layer of neurons intermediate to sensory and decision layers. We found that neurons in the intermediate layer develop stable category representation if fluctuations of their firing rates are correlated with behavioural choices. In contrast, behavioural performance and neuronal tuning deteriorate with training in networks where activity fluctuations are not correlated with choices. Weak but systematic correlations between neural fluctuations and choices, termed choice probability (CP), have been found in many cortical areas^17,18. Here we show that CP is critical for successful learning through reward-dependent Hebbian plasticity, which generally holds across different network architectures and behavioural tasks.

Our model predicts that a mixture of directional and categorical tuning and bimodal distribution of preferred directions emerge in the intermediate-layer neurons through learning. This prediction was confirmed by analysis of LIP responses recorded in monkeys trained on the motion categorization task. Moreover, the model predicts that neurons with larger CP exhibit a larger increase in their category sensitivity (CS), leading to a positive correlation between these measures, which was also found in the LIP data. Finally, the model suggests that task-specific noise correlations arise from the plasticity of top-down connections and makes testable predictions about changes of noise correlations throughout learning.

Results

A neural circuit model of category learning

We trained a neural circuit model to perform a motion categorization task⁵. Twelve motion directions were assigned to two categories, C1 and C2, defined by an arbitrary category boundary (Fig. 1a), and the model learned through trial and error to decide on the category membership of these stimuli.

**Figure 1: Categorization task and the neural circuit model.**

Our model is a recurrent neural network comprising three interconnected circuits (Fig. 1b). Sensory neurons (MT) encode motion directions with bell-shaped tuning functions (Fig. 1c), arising from direction-selective bottom-up inputs and structured recurrent excitation¹⁹. Association neurons (LIP) are also tuned to motion directions initially (Fig. 1c)—just like LIP neurons in naive monkeys⁶—because synaptic weights are initialized to be stronger between sensory and association neurons with similar preferred directions. Over the course of learning, tuning of association neurons changes through synaptic plasticity. The activity of association neurons is pooled by the decision network, which consists of two competing populations (C₁ and C₂, Fig. 1b,c) firing at higher rates for the two respective category decisions^20,21. These neurons encode the model’s choice and represent a subpopulation of neurons within LIP or in the prefrontal cortex. Synaptic connections between association and decision neurons are initialized at random values; therefore, the model’s categorization decisions are completely random initially.

Our model has plastic feedforward connections from sensory to association (c^S→A) and from association to decision (c^A→D) circuits, and plastic feedback connections from decision to association circuit (c^D→A, Fig. 1b). At the end of each trial, the strength c of each plastic synapse is updated according to a reward-dependent Hebbian plasticity rule:

$$\Delta c = q\, r_{\mathrm{pre}}\, r_{\mathrm{post}} \left( R - \langle R|\theta \rangle \right) \qquad (1)$$

where r_pre and r_post are the trial-average firing rates of pre- and postsynaptic neurons, q is the learning rate parameter, R is the reward received on each trial (1 or 0 for correct and incorrect decisions, respectively), θ stands for a motion direction stimulus and ‹R|θ› is a stimulus-specific reward expectation, which may be encoded in the orbitofrontal cortex or basal ganglia. For simplicity, we computed ‹R|θ› as a running average of reward history¹⁴. Phasic activity of dopamine neurons encodes the difference R−‹R|θ›, called the RPE signal^12,22,23, and dopamine concentration modulates long-term plasticity^24,25. In our model, positive RPE signals lead to potentiation, while negative RPE signals lead to depression. Finally, the synaptic strengths c are bounded between 0 and 1.
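The update described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes the weight change is the product of the learning rate, the pre- and postsynaptic rates and the RPE, and it assumes the running average of reward history is an exponential moving average. The function names and the parameters `q` and `alpha` are hypothetical.

```python
import numpy as np

def update_weight(c, r_pre, r_post, reward, expected_reward, q=0.01):
    """One end-of-trial update of a single plastic synapse (assumed product form)."""
    rpe = reward - expected_reward            # reward prediction error R - <R|theta>
    c_new = c + q * rpe * r_pre * r_post      # reward-modulated Hebbian term
    return float(np.clip(c_new, 0.0, 1.0))    # synaptic strengths are bounded in [0, 1]

def update_expectation(expected_reward, reward, alpha=0.05):
    """Exponential running average of reward for one stimulus (assumed form)."""
    return expected_reward + alpha * (reward - expected_reward)
```

With these definitions, a rewarded trial with a below-1 reward expectation potentiates the synapse, an unrewarded trial depresses it, and a fully predicted reward leaves it unchanged.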

Model learning performance

We compared the learning performance of our model with that of two control networks: a network without feedback, which had only feedforward connections between the local circuits, and a network with fixed tuning of association neurons, which had only feedforward connections and no plasticity of synapses between sensory and association neurons (effectively, a classical two-layer categorization model¹⁶). Initially, performance of all models rapidly improved from the chance level to ~80% correct responses over several thousand trials (Fig. 2a). During this short period of associative learning, the models learn to associate motion directions and categories, driven by plasticity of the synapses from association to decision neurons. Plasticity transforms the profile of these synapses from random to nearly binary: association neurons with preferred directions in category C1 have strong weights to C₁ and nearly zero weights to C₂ decision neurons, and vice versa (Supplementary Fig. 1b). As a result, motion directions from category C1 generate stronger input into the C₁ decision population, which makes C₁ choices more likely, because the probability of choice in our model is determined by the difference in input currents to two competing populations²¹. At this stage of learning, the performance is less accurate for the stimuli closest to the category boundary (15° from it, Fig. 2d). Near-boundary stimuli activate a subpopulation of association neurons with preferred directions in both categories (Fig. 1c), resulting in comparable inputs to both decision populations and less reliable categorization behaviour.

**Figure 2: Behavioural performance of the network models during training on the motion categorization task.**

As training progressed, the three models began to exhibit markedly different performance trends (Fig. 2b,e). The network with feedback steadily improved performance over a hundred thousand trials (several months of training for monkeys), mainly due to increasing accuracy for the near-boundary stimuli (Fig. 2e). In contrast, the performance of the network without feedback gradually deteriorated, whereby accuracy decreased for all motion directions. The network with fixed tuning of association neurons maintained the same performance level as attained by the end of the associative learning period. These performance trends were preserved throughout extensively long training (Fig. 2c,f), by the end of which the performance of the network without feedback dropped to the chance level.

Transformation of tuning in association neurons

The striking differences in learning performance of the three models cannot be explained by the synaptic connections from the association to decision neurons, as they are shaped equally in all networks during associative learning and remain virtually unchanged later on (Supplementary Fig. 1b). The reason for the observed performance differences is the change in tuning of association neurons, driven by the plasticity of synapses between sensory and association neurons (Supplementary Figs 1 and 2). In the networks with and without feedback, association neurons initially have the same uniform direction tuning, which is only slightly altered after a short period of learning (6,000 trials, Fig. 3a, upper row), but becomes dramatically different in the two models after extensively long training (420,000 trials, Fig. 3a, lower row). In the network without feedback, the direction tuning deteriorates: the association neurons fire at the same rate for all motion directions. Consequently, the decision circuit receives nonselective inputs and the performance is at the chance level. In contrast, tuning transforms from directional to categorical in the network with feedback: two nonoverlapping subpopulations emerge in the association circuit that respond selectively to stimuli from their preferred categories. As a result, category decisions are very accurate even for near-boundary stimuli.

**Figure 3: Mixed direction and category tuning emerges in association neurons through learning.**

To quantify the development of category selectivity throughout learning, we computed the average category-tuning index⁵ (CTI) of association neurons in the model with feedback. Categorical tuning entails that neurons respond differently to stimuli in different categories and do not differentiate between stimuli in the same category. Accordingly, the CTI varies from −1.0 to 1.0, where positive values indicate larger response differences for stimuli in different categories and negative values indicate larger differences within each category (see Methods). Before learning, the average CTI of association neurons was zero, indicating uniform direction tuning (Fig. 3b), and then CTI gradually increased. At the intermediate learning stage corresponding to the amount of categorization training received by monkeys (65,000 trials or ~10–12 weeks), the average CTI was 0.18, comparable to the CTI value 0.125 previously reported for LIP neurons⁵.

The gradual increase in the CTI was accompanied by changes in the tuning curves of individual association neurons, which followed two systematic trends. In neurons that initially preferred directions near category centres, tuning curves broadened (Fig. 3c, right), while in neurons that initially preferred directions near category boundaries, tuning curves shifted so that their preferred directions moved towards centres of the respective categories (Fig. 3c, left). Broadening and shifting of tuning curves led to mixed tuning, whereby direction and category signals were combined on the single-cell level. To quantify this mixture, we fitted the tuning curve of each association neuron with a generalized linear model (GLM)²⁶, which contained a linear combination of two regressor functions: a direction (bell-shaped, equation (12)) and a category (binary step-like, equation (13)) tuning profiles (see Methods). The tuning was classified as pure directional, pure categorical or mixed, according to GLM coefficients that were significantly different from 0. At the intermediate learning stage (65,000 trials), 15.6% of association neurons exhibited a significant influence of category on their tuning curves, while 84.4% remained purely direction-tuned. We examined the distribution of preferred directions in direction-tuned neurons, and found that more neurons were tuned to category centres than to category boundaries (Fig. 3d, the result did not change if all neurons were included).

Broadening and shifting of tuning curves alter the representation of motion directions in a way that facilitates the discrimination of categories. We visualized the ensuing representation on the population level using classical multidimensional scaling²⁷ (MDS). In this framework, stimuli are represented as vectors in a high-dimensional space of neural firing rates, where each dimension corresponds to a neuron in the population. The MDS algorithm finds a two-dimensional configuration of the stimuli that preserves the distances between them as much as possible. In the sensory circuit, the MDS algorithm yields a circular configuration (Fig. 3e, left) that faithfully reproduces the arrangement of directions in the physical space. In the association circuit, the configuration is elongated along the axis perpendicular to the category boundary (Fig. 3e, right), which increases the distances between near-boundary stimuli in different categories making them more easily discriminable and decreases distances between stimuli within the same category making them less discriminable.
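The population-level visualization described above can be sketched with a textbook implementation of classical MDS. This is an illustrative sketch, not the analysis code used in the study: the function name `classical_mds`, the von Mises tuning curves, the concentration `kappa` and the population size are all assumptions made for the example.

```python
import numpy as np

def classical_mds(rates, n_dims=2):
    """Classical MDS. rates: (n_stimuli, n_neurons) -> (n_stimuli, n_dims) coordinates."""
    diff = rates[:, None, :] - rates[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)                 # squared Euclidean distances between stimuli
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering        # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]      # keep the largest n_dims eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical sensory population: 60 neurons with bell-shaped (von Mises) tuning
# whose preferred directions tile the circle uniformly, probed with 12 directions.
kappa = 2.0
directions = np.deg2rad(np.arange(0, 360, 30))
preferred = np.deg2rad(np.arange(0, 360, 6))
rates = np.exp(kappa * (np.cos(directions[:, None] - preferred[None, :]) - 1.0))
coords = classical_mds(rates)
radii = np.linalg.norm(coords, axis=1)              # distance of each stimulus from the centre
```

For this uniformly tuned toy population the twelve stimuli fall on a circle, so all entries of `radii` are equal. Replacing `rates` with tuning curves that are broadened or shifted towards category centres would be expected to stretch the configuration, in the spirit of Fig. 3e.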

Mixed direction and category tuning in LIP neurons

We compared tuning changes in our model to the tuning (during the period of stimulus presentation) of MT and LIP neurons recorded in monkeys trained to categorize motion directions⁵. Such a comparison is meaningful if the model and monkeys experienced a similar amount of categorization training and reached similar behavioural performance. In the model, the time course of learning depends on the learning rate q and the maximal strength of feedback connections (Supplementary Fig. 3). We simulated the model for a range of q and of maximal feedback strengths, and used the parameters that provided a good match to experimental data for a similar number of training trials (that is, 65,000 trials, see Supplementary Fig. 4).

We fitted the tuning curve of each neuron in our database (67 MT and 156 LIP neurons) with direction and category-tuning functions and then classified tuning as directional, categorical or mixed following the same procedure that was used for model neurons. The majority of MT (91.0%) and LIP neurons (69.9%) exhibited pure direction tuning (Fig. 4a, upper panels, Fig. 4b). In agreement with our model prediction, the distribution of preferred directions was significantly bimodal among direction-tuned LIP neurons (Hartigan’s dip test P=0.003, Fig. 4c), but not among MT neurons (Hartigan’s dip test P=0.08). A considerable fraction of LIP neurons (18.0%) showed a mixture of directional and categorical tuning (Fig. 4a, lower panels). The distribution of preferred directions remained significantly bimodal when the mixed-tuned LIP neurons were included in the analysis (Hartigan’s dip test P<10⁻⁷). A small fraction of LIP neurons (3.9%) exhibited pure category tuning (Fig. 4a, middle panels), and the rest (8.3%) were not stimulus-selective. As a control, we repeated the analyses in different time epochs during the trial (Supplementary Table 1) and using a smoothed category-tuning function (Supplementary Table 2), and obtained similar results.

**Figure 4: Mixed direction and category tuning in LIP neurons.**

The representation of motion directions at the population level was consistent with the model prediction as well: the MDS algorithm revealed a nearly circular configuration of motion directions in MT (Fig. 4d, upper panel), whereas in LIP motion directions were arranged on an elongated ellipse with the major axis perpendicular to the category boundary (Fig. 4d, lower panel, see Supplementary Note 1 for statistical significance test). Similarly, CTI was significantly higher in LIP than in MT as has been previously reported for the same dataset⁵. Although the LIP population demonstrated high heterogeneity, the main tuning features in LIP bear a remarkable resemblance to the tuning transformation induced by learning in our model.

Reward-driven learning depends on choice probability

To understand the effects of learning on the tuning of association neurons, we need to examine the reward-dependent Hebbian plasticity rule (equation (1)). The plasticity rule entails that the expected weight change for each stimulus ‹Δc|θ› is proportional to the covariance between the reward R and neural activity N=r_pre r_post (ref. 15) (see Supplementary Note 2):

$$\langle \Delta c|\theta \rangle = q\, \mathrm{Cov}[R, N|\theta] \qquad (2)$$

This means that average synaptic weight changes across many trials are driven by covariation between trial-to-trial fluctuations of the firing rates and reward. Thereby, synapses change to increase the expected reward. If for a particular synapse the neural activity is systematically higher on trials when the reward is above its mean, then the covariance is positive, the synapse is potentiated, and hence the mean neural activity and the expected reward increase (and analogously for negative covariance). Fluctuations of both reward and neural activity are critical for learning: if either R or N is deterministic, the covariance equals zero and learning does not increase expected reward.

Covariation between neural activity and reward entails covariation between neural activity and choices, if reward is assigned on the basis of behavioural responses. This simple intuition can be formalized mathematically, if we express the covariance Cov[R,N|θ] in terms of expectations conditioned on choices. For tasks with only two possible choices, we obtain a simple expression (see Methods for derivation and generalization to arbitrary number of choices):

$$\mathrm{Cov}[R, N|\theta] = P_{1,\theta} \left( 1 - P_{1,\theta} \right) \left( R_{1,\theta} - R_{2,\theta} \right) \left( N_{1,\theta} - N_{2,\theta} \right) \qquad (3)$$

Here P_i,θ is the probability that C_i choice is made for the stimulus θ; R_i,θ=‹R|θ,C_i› is the reward expected for choosing C_i for stimulus θ; and N_i,θ=‹N|θ,C_i› is the expected neural activity conditioned on the stimulus θ and choice C_i.
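The two-choice decomposition can be checked numerically. The sketch below assumes the factored form Cov[R,N|θ] = P_1,θ(1−P_1,θ)(R_1,θ−R_2,θ)(N_1,θ−N_2,θ), and it assumes that, for a fixed stimulus and choice, reward and neural activity are uncorrelated (which holds when the reward is fully determined by the stimulus and the choice). The function names are hypothetical.

```python
def covariance_direct(p1, r1, r2, n1, n2):
    """Cov[R, N | theta] from its definition, averaging over the two choices."""
    p2 = 1.0 - p1
    mean_r = p1 * r1 + p2 * r2                 # <R | theta>
    mean_n = p1 * n1 + p2 * n2                 # <N | theta>
    mean_rn = p1 * r1 * n1 + p2 * r2 * n2      # <R N | theta>, R and N uncorrelated given choice
    return mean_rn - mean_r * mean_n

def covariance_factored(p1, r1, r2, n1, n2):
    """Factored two-choice form: P1 (1 - P1) (R1 - R2) (N1 - N2)."""
    return p1 * (1.0 - p1) * (r1 - r2) * (n1 - n2)
```

Both functions agree for any choice probability, conditional rewards and conditional mean activities. The factored form makes two properties explicit: the covariance vanishes when N_1,θ equals N_2,θ, and it is largest in magnitude when the two choices are equally likely.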

The term (N_1,θ−N_2,θ) represents the difference between the means of two neural activity distributions obtained on trials when different choices are made for the same stimulus θ, and is monotonically related to a measure called choice probability^17,28 (CP, Supplementary Fig. 5a). CP quantifies the accuracy with which an ideal observer could predict choices given neuronal firing rates on a trial-by-trial basis. A CP of 0.5 indicates no correlation between neural fluctuations and choices (N_1,θ≈N_2,θ, Fig. 5b), whereas a CP of 1 (or 0) indicates that the neuron’s firing rate is always higher (or lower) on trials when C₁ is chosen than on trials when C₂ is chosen for the same stimulus θ (N_1,θ>N_2,θ in Fig. 5c; our convention of computing CP differs from refs 17, 29, see Supplementary Note 3).
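A common way to compute such an ideal-observer measure is the area under the ROC curve, which equals the probability that a rate drawn from C₁-choice trials exceeds a rate drawn from C₂-choice trials. The sketch below uses that standard estimator with ties counted as one half. It is an assumption that this matches the exact procedure in the Methods, and the function name is hypothetical.

```python
import numpy as np

def choice_probability(rates_c1, rates_c2):
    """ROC area: P(rate on a C1-choice trial > rate on a C2-choice trial), ties count 0.5.

    Both inputs should come from trials with the same stimulus.
    """
    a = np.asarray(rates_c1, dtype=float)[:, None]
    b = np.asarray(rates_c2, dtype=float)[None, :]
    return float(np.mean((a > b) + 0.5 * (a == b)))
```

Under this convention, identical rate distributions give a value of 0.5, and fully separated distributions give 1 or 0 depending on which choice is associated with the higher rates.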

**Figure 5: Choice probability determines the direction and magnitude of synaptic changes.**

Equation (3) demonstrates that synaptic updates lead to an increase in expected reward if CP≠0.5 for pre- or postsynaptic neurons; however, if CP≈0.5 for both pre- and postsynaptic neurons, the covariance Cov[R,N|θ] vanishes irrespective of the reward expectation. This result is a general property of reward-modulated Hebbian plasticity and holds across different tasks and network architectures. It can be illustrated using a single toy-model neuron (Fig. 5a), whose firing rates for C₁ and C₂ choices are sampled from two Gaussian distributions with different means, without specifying mechanisms generating CP. We assumed that C₁ choices are rewarded, leaving other task details unspecified. The synapse of this toy-model neuron is updated according to the reward-modulated Hebbian plasticity rule. As predicted by equation (3), CP determines the direction and magnitude of synaptic changes in the toy model. If CP>0.5, the covariance Cov[R,N] is positive and the synapse is potentiated (red traces in Fig. 5d), and if CP<0.5 the synapse is depressed (blue traces in Fig. 5d). The covariance magnitude is larger for larger |CP−0.5|, resulting in faster synaptic changes. If CP≈0.5, the covariance vanishes; hence, synaptic modifications are driven by noise similar to a random walk (yellow traces in Fig. 5d) and over a long period of learning any synaptic weight becomes equally likely (Fig. 5e).
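The toy model lends itself to a short simulation. The following is a sketch under stated assumptions, not the simulation behind Fig. 5: choices are drawn at random with probability `p_c1`, the neural activity term is a single Gaussian rate, the reward expectation is an exponential running average, and every parameter value (`q`, `sd`, `alpha`, `n_trials`, the means) is hypothetical.

```python
import numpy as np

def simulate_toy_synapse(mean_c1, mean_c2, n_trials=5000, q=1e-4,
                         p_c1=0.5, sd=2.0, alpha=0.05, c0=0.5, seed=0):
    """Final weight of one synapse when C1 choices are rewarded (toy sketch)."""
    rng = np.random.default_rng(seed)
    c, expected = c0, p_c1                       # start from the true mean reward
    for _ in range(n_trials):
        chose_c1 = rng.random() < p_c1           # choice is random in this toy task
        rate = rng.normal(mean_c1 if chose_c1 else mean_c2, sd)
        reward = 1.0 if chose_c1 else 0.0        # only C1 choices are rewarded
        c = float(np.clip(c + q * (reward - expected) * rate, 0.0, 1.0))
        expected += alpha * (reward - expected)  # running average of reward
    return c

w_high_cp = simulate_toy_synapse(13.0, 7.0)      # higher rate on C1 choices (CP > 0.5)
w_low_cp = simulate_toy_synapse(7.0, 13.0)       # higher rate on C2 choices (CP < 0.5)
```

With these settings the weight drifts to the upper bound when the rate is higher on rewarded choices and to the lower bound in the opposite case. Setting `mean_c1` equal to `mean_c2` removes the drift, so the weight only diffuses around its starting value.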

This general principle explains both the fast associative learning and slower behavioural improvements in our model. Since activities of decision neurons directly represent the model’s choices, the magnitude of their CP is large; hence, the synapses of decision neurons change rapidly towards increasing expected reward, underpinning fast associative learning. In the network with feedback, CP arises via feedback from the decision circuit, which produces multiplicative rate modulations in association neurons^30,31 (Supplementary Fig. 2). Initially, CP is scattered around 0.5; however, when feedback connections become structured (~500 trials), neurons receiving stronger input from the C₁ (C₂) decision population fire at higher rates when C₁ (C₂) choices are made and exhibit CP>0.5 (CP<0.5, Fig. 6b). The magnitude of CP is smaller in association than in decision neurons; therefore, the tuning changes of association neurons and ensuing behavioural improvements happen more slowly than associative learning. In the network without feedback, CP≈0.5 in all association neurons and at all learning stages (Fig. 6a), because local noise in the decision circuit—required to attain realistic behavioural performance in the categorization task—diminishes the influence of association neurons' rate fluctuations on choices (see Supplementary Note 4 for details). Resulting unstructured synaptic changes lead to deterioration of tuning and behavioural performance. Regardless of which mechanism—feedforward or feedback—is more plausible for generating CP in real neurons, our results demonstrate the significance of CP for reward-dependent learning.

**Figure 6: Association neurons in the network with feedback, but not in the network without feedback, exhibit choice-correlated fluctuations.**

Choice-correlated fluctuations shape neural tuning changes

Over many trials, synaptic weight changes Δc_ij between the association neuron i and sensory neurons j=1…N follow the same two trends as observed in tuning functions (Fig. 3c). For neurons tuned to category centres, the initial bell-shaped profile widens on both sides until it transforms into a step-like profile aligned with the category boundary (Fig. 7e); hence, the tuning curves broaden. For neurons tuned to directions near category boundaries, synapses are strengthened on one side and weakened on the other side of the initial bell-shaped profile (Fig. 7f); hence, the tuning curves shift towards the category centre. Using equation (2), the expected weight change for stimulus θ can be expressed as ‹Δc_ij|θ›=q Cov[R,r_ir_j|θ]≈q ‹r_j|θ›Cov[R,r_i|θ] (see Supplementary Note 2). The overall expected weight change is then the average of ‹Δc_ij|θ› across all stimuli. Thus, synaptic changes are determined by the covariance Cov[R,r_i|θ] weighted by the rates of sensory neurons.

**Figure 7: The covariance between reward and neural activity drives tuning changes in association neurons.**

For neurons initially tuned to directions in category C1, CP>0.5 and the covariance Cov[R,r_i|θ] is positive for stimuli θ∈C1 and negative for θ∈C2 (Fig. 7a,b), since the term (R_1,θ−R_2,θ) in equation (3) changes sign for θ in different categories. The covariance magnitude is proportional to the product of probabilities of the correct response and error, P_1,θ(1−P_1,θ), which is largest for near-boundary stimuli (P_1,θ~0.5). When this covariance is combined with the firing rates of sensory neurons, the overall synaptic weight change is step-like for neurons tuned to category centres (Fig. 7c), and skewed towards the category centre for neurons tuned near category boundaries (Fig. 7d). For neurons initially tuned to directions in category C2, CP<0.5; hence, the covariance has just the opposite sign leading to the preference for category C2. Such tuning changes lead to behavioural improvements because the feedforward and feedback connections become aligned through learning.

Plastic top-down feedback induces task-specific correlations

In our model, category-tuning and neural fluctuations are simultaneously shaped through plasticity of feedforward and feedback connections to association neurons, giving rise to testable model predictions.

First, our model predicts that association neurons with larger CP exhibit greater sensitivity of their tuning curve to the stimulus category (Fig. 8a). The latter is quantified by category sensitivity (CS), which is the accuracy with which an ideal observer could discriminate between stimuli from categories C1 and C2 given the neuron’s firing rates on correct trials. A positive correlation between CP and CS arises because of the reciprocal interaction of plasticity on the feedforward c^S→A and feedback c^D→A connections to association neurons. On one hand, plasticity of feedforward connections from sensory neurons leads to a greater increase in CS for neurons with larger CP (Fig. 5). On the other hand, plasticity of feedback connections from decision neurons generates a greater difference in top-down inputs from two decision populations, hence larger CP, for neurons with larger CS. The correlation between CP and CS is not an a priori given, because these measures quantify independent aspects of neuronal response. CS measures the difference in response to stimuli from different categories on correct trials, whereas CP measures the difference in response to the same stimulus on correct versus error trials. The correlation between CP and CS is abolished if the learned profile of feedback connections is randomized (Supplementary Fig. 6c).

**Figure 8: Model predicts interdependence between the CP, CS and noise correlations.**

We tested whether the predicted correlation between CP and CS exists in MT and LIP neurons. The overall magnitude of CP was significantly greater in the LIP than in the MT population (Wilcoxon rank-sum test comparing distributions of |CP−0.5|, P=0.0006, Fig. 8c). Ten LIP neurons (11.4%, N=88) and none of the MT neurons (0%, N=31) showed individually significant CP (shuffle test with 1,000 shuffles and two-sample t-test, P<0.05, see Methods). In agreement with the model prediction, CP and CS were significantly correlated in the LIP population (Fig. 8b, Pearson correlation, r=0.494, N=88, P=10⁻⁶), but not in the MT population (r=−0.181, N=31, P=0.33, Supplementary Fig. 6d). We also repeated the analyses using CP computed relative to the preferred category of each neuron^17,29 and obtained similar results (Supplementary Note 3 and Supplementary Fig. 6e–h). Although CP magnitude is slightly lower in LIP data than in the model, smaller CP magnitudes can be obtained in the model with weaker top-down connections (Supplementary Fig. 3). In addition, since recorded LIP neurons were sampled randomly, some of them might not be engaged in the categorization task and some were not visually responsive. This sampling heterogeneity may reduce the average effect size in the data and it is not incorporated in our model.

Second, our model predicts that interneuronal correlations depend on CS. In association neurons, correlations between their trial-to-trial rate fluctuations, termed noise correlations (a Pearson correlation coefficient between rate fluctuations of neurons i and j), arise from shared recurrent and feedforward inputs. In the network without feedback, noise correlations simply decrease with the difference in neurons’ preferred directions, reflecting the bell-shaped profile of their recurrent and feedforward connections (Supplementary Fig. 5b). In the network with feedback, association neurons with the same category preference also share top-down input from decision neurons; consequently, noise correlations are stronger among neurons that contribute to the same category decision (Fig. 8d), similar to previous experimental reports³². Moreover, noise correlations are larger in neural pairs with smaller absolute difference in their category sensitivities (ΔCS_ij=|CS_i−CS_j|): the noise correlation is positive in pairs with similar CS (ΔCS_ij~0) and negative in pairs with opposite category preference (ΔCS_ij~1, Fig. 8e). In addition, the magnitude of noise correlations is larger in neural pairs with higher CS strength, defined as (|CS_i−0.5|+|CS_j−0.5|)/2 (Fig. 8e). Such structured noise correlations, which remained static throughout learning, were required in a feedforward model^10,29 to capture the correlation between CP and task sensitivity observed in several experimental studies^7,8,29,33. However, the a priori assumption that noise correlations depend on CS is not realistic, since categories are assigned arbitrarily. Alternatively, our model suggests that plasticity of feedback connections represents a common mechanism by which the structure of noise correlations, CP and CS all develop dynamically through learning.
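The pairwise quantities used in this prediction can be sketched as follows. The code assumes that rate fluctuations are obtained by subtracting each neuron's mean rate for the presented stimulus before computing the Pearson correlation, which is a common convention but not necessarily the exact procedure of the study. The function names are hypothetical.

```python
import numpy as np

def noise_correlation(rates_i, rates_j, stimulus):
    """Pearson correlation of trial-to-trial fluctuations around stimulus-specific means."""
    rates_i = np.asarray(rates_i, dtype=float)
    rates_j = np.asarray(rates_j, dtype=float)
    stimulus = np.asarray(stimulus)
    res_i = np.empty_like(rates_i)
    res_j = np.empty_like(rates_j)
    for s in np.unique(stimulus):
        mask = stimulus == s                     # trials on which stimulus s was shown
        res_i[mask] = rates_i[mask] - rates_i[mask].mean()
        res_j[mask] = rates_j[mask] - rates_j[mask].mean()
    return float(np.corrcoef(res_i, res_j)[0, 1])

def pair_cs_features(cs_i, cs_j):
    """Return (delta CS, CS strength) for a pair of neurons, as defined in the text."""
    delta_cs = abs(cs_i - cs_j)
    strength = (abs(cs_i - 0.5) + abs(cs_j - 0.5)) / 2.0
    return delta_cs, strength
```

Removing the stimulus-specific means matters: two neurons with similar tuning have correlated mean responses across stimuli, and that signal correlation would otherwise be mistaken for shared trial-to-trial variability.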

Discussion

Here we proposed a neural circuit mechanism for visual category learning. Our findings represent two major advances going beyond a model for categorization. First, we demonstrated that choice-correlated activity fluctuations, ubiquitous across cortical areas^7,8,9,17,34, are critical for learning through reward-dependent Hebbian plasticity, which generally holds across different network architectures and behavioural tasks. Second, we showed how behavioural improvements, neuronal tuning changes, CP and noise correlations can be all simultaneously shaped by a common plasticity mechanism in a network incorporating top-down feedback. Several model predictions about ensuing interdependences between these measures were confirmed by the analysis of LIP recordings.

The reward-dependent Hebbian plasticity in our model belongs to the family of covariance-based learning rules¹⁵ using a stimulus-specific RPE signal, which is critical for successful learning¹⁴ (Supplementary Fig. 7). The idea to harness local fluctuations for reward-dependent learning was first proposed for connectionist networks³⁵, and later instantiated in networks of spiking neurons by exploiting either randomness of Poisson spiking^36,37 or stochasticity of synaptic transmission³⁸. Such plasticity rules can successfully learn precise spike patterns in networks of just a few neurons, but fail in larger networks and when behavioural outcomes are determined by population firing rates rather than by spike times of individual neurons^39,40. The reason for their failure in these situations is precisely the lack of correlation between population-level choices and local activity fluctuations. To overcome this problem, plasticity rules have been employed that incorporate behavioural choice explicitly as a multiplicative factor^10,41,42. In contrast, our solution does not require any special plasticity rule, but instead utilizes a network architecture in which feedback from decision neurons generates choice-correlated variability.

Task-specific neural representations develop in many training paradigms across different cortical areas^{43,44,45,46,47,48,49,50,51,52,53,54,55}. Our model demonstrates how such task-specific representations can emerge through reward-dependent plasticity. Although task-specific selectivity could arise through activity modulation via plastic feedback connections⁵⁶, in our model, top-down modulation has a negligible effect on selectivity of association neurons (Supplementary Fig. 2), yet it is critical to guide learning of task-relevant features⁵⁷.

Tuning changes of association neurons in our model allow for more accurate categorization of near-boundary stimuli than in the classical categorization model with fixed tuning¹⁶. In our model, tuning changes arise from plasticity of feedforward synapses from sensory (MT) to association (LIP) neurons; however, similar results are obtained if plasticity acts only on the recurrent synapses within the association circuit, or on both the feedforward and recurrent synapses (Supplementary Fig. 8). In our model, the initial direction tuning of association neurons sets the profile of choice-correlated fluctuations, which in turn governs tuning changes. However, initial tuning is not required for successful learning: a population of nonselective neurons carrying choice-correlated fluctuations develops categorical tuning just as well. In this case, neurons develop purely binary category selectivity with the category preference determined solely by their CP (Supplementary Fig. 9). Last, retraining on a categorization task with a new category boundary results in readjustment of neural tuning (Supplementary Fig. 10) similar to experimental observations⁵.

It has been speculated that category signals in LIP represent abstract perceptual decisions: category C1 versus C2 (ref. 58). In the motion categorization task, but not in classic motion discrimination work in LIP⁷, abstract decisions were dissociated from the actions signalling those decisions by using a two-interval match-to-category design, where the required motor response was unknown at the time of the first stimulus presentation. Moreover, receptive fields of LIP neurons in the motion discrimination task were aligned with the saccadic choice targets and not with the motion stimulus as in our case; hence, that design was better suited to examine response-related rather than perceptual signals in LIP. Accordingly, these data were interpreted using a feedforward model, where LIP neurons represent a decision-variable pooling activity of MT neurons with weights adjusted by a reinforcement learning rule¹⁰, and behavioural improvements were ascribed to selective strengthening of connections from the most sensitive sensory to decision neurons^8,10. In contrast, we find that during motion categorization the representation of motion stimuli in LIP constitutes a mixture of directional and categorical tuning that facilitates discrimination of learned categories. Therefore, both mechanisms—that co-exist in our model—may be concurrently employed in the brain: refinements of sensory representations and of their readout by decision neurons.

In our model, mixed selectivity is robustly observed over a period from a few thousand to several hundred thousand trials, accompanied by increasing category tuning. Consistent with high CTI values reported previously in LIP⁵, we find that two factors contribute to the increasing population CTI: shift of preferred directions and emergence of mixed and pure category tuning. Some LIP neurons carried category selectivity throughout the delay period of the match-to-category task, which indicates that category encoding may not be a purely feedforward effect.

Our work demonstrates the significance of CP for reward-dependent learning regardless of its origin. The origin of CP has been recently debated, with accumulating evidence for top-down contributions^18,34,59. Notably, CP signals we observe in LIP are distinct from signals related to reward, attention and upcoming movements^5,54,60,61. Although origins of CP may differ between earlier sensory areas such as MT and more cognitive areas such as LIP, our model provides a common framework for understanding the impact of CP on plasticity of neuronal representation.

We proposed a novel model for how CP influences plasticity in LIP, although CP effects in sensory areas (for example, MT) have been modelled previously^18,29,62. Our model demonstrates how a task-specific structure of CP, noise correlations and CS can arise dynamically through reward-dependent plasticity of top-down connections and predicts that neurons with larger CP develop larger CS. Thus, learning-induced tuning changes may be more pronounced in cortical areas that exhibit greater CP (Supplementary Fig. 3). Interestingly, both CP and CS were found to be significantly larger in LIP than in the prefrontal cortex⁵⁴. Similarly, low CP of MT neurons might explain the absence of obvious tuning changes in this area through categorization training. Small but significant learning-related tuning changes have been observed in other sensory areas⁴ that also exhibit CP⁶³. Therefore, our findings may generalize across sensory areas not limited to LIP.

Methods

Neural circuit model

Network architecture. The network model comprises three interconnected local circuits: sensory, association and decision. All three are strongly recurrent networks with dynamics governed by local excitation and feedback inhibition^19,20,64. In simulations, we used a reduced mean-field model that has been shown to reproduce neural activity of a full spiking neural network²¹. The dynamics of each excitatory neural population is described by a single variable s representing the fraction of activated N-methyl-D-aspartate receptor conductance, governed by

$$\frac{ds}{dt} = -\frac{s}{\tau_s} + \gamma\,(1-s)\,r \tag{4}$$

with γ=0.641 and τ_s=60 ms. The firing rate r is a function of the total synaptic current I (refs 21, 65):

$$r = \frac{aI - b}{1 - \exp[-d\,(aI - b)]} \tag{5}$$

with a=270 Hz nA⁻¹, b=108 Hz and d=0.154 s.
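The single-population dynamics can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB code: it assumes the standard reduced mean-field forms from the cited model (gating variable ds/dt = −s/τ_s + γ(1−s)r and transfer function r = (aI−b)/(1−exp[−d(aI−b)])) and holds the input current fixed during a step. The function names are hypothetical.

```python
import numpy as np

# Parameter values quoted in the text.
GAMMA = 0.641                    # dimensionless
TAU_S = 0.060                    # s (60 ms)
A, B, D = 270.0, 108.0, 0.154    # Hz/nA, Hz, s


def firing_rate(current_nA):
    """Assumed transfer function r = (aI - b) / (1 - exp(-d (aI - b))), in Hz."""
    x = A * np.asarray(current_nA, dtype=float) - B
    near_zero = np.abs(x) < 1e-9
    # Guard the removable singularity at x = 0, where the limit is 1/d.
    safe = np.where(near_zero, 1.0, x)
    return np.where(near_zero, 1.0 / D, safe / (1.0 - np.exp(-D * safe)))


def ds_dt(s, current_nA):
    """Assumed gating dynamics ds/dt = -s/tau_s + gamma (1 - s) r(I)."""
    return -s / TAU_S + GAMMA * (1.0 - s) * firing_rate(current_nA)


def heun_step(s, current_nA, dt=0.001):
    """One Heun (explicit trapezoidal) step of 1 ms with the current held fixed."""
    k1 = ds_dt(s, current_nA)
    k2 = ds_dt(s + dt * k1, current_nA)
    return s + 0.5 * dt * (k1 + k2)
```

Iterating `heun_step` with a constant current relaxes s to the fixed point γrτ_s/(1+γrτ_s), which is a convenient sanity check before coupling populations together.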

The total synaptic current I consists of recurrent and noisy components, I=I_r+I_n. Recurrent input to a neuron i in the population A originating from the population B reads:

$$I_{r,i}^{A} = \frac{1}{N_B}\sum_{j} g_{ij}^{AB}\, s_j^{B} \tag{6}$$

where g_ij^AB is the synaptic coupling between the neuron j in the population B and the neuron i in the population A. The current is normalized by the number of presynaptic neurons N_B. Noisy current replicates background synaptic inputs and obeys: τ_n dI_n/dt=−I_n+η(t)√(τ_n σ_n²), where η(t) is a white Gaussian noise, ‹η(t)›=0, ‹η(t)η(t′)›=δ(t−t′), τ_n=2 ms and σ_n=0.009 nA.

The sensory and association circuits were each simulated by 128 discrete units with equally spaced preferred directions from 0° to 360°. Within each circuit, the synaptic couplings g_ij between neurons with preferred directions θ_i and θ_j have a periodic Gaussian profile:

$$g_{ij} = J_- + J_+ \exp\!\left[-\frac{(\theta_i-\theta_j)^2}{2\sigma^2}\right] \tag{7}$$

with σ=43.2°. Parameters J₋ and J₊ determine the amount of recurrent excitation and inhibition. In sensory and association networks, the recurrent inhibition is stronger than recurrent excitation, , , and . The particularly strong recurrent inhibition in the association circuit sets this module in the normalization regime⁶⁶, where the total population activity remains approximately constant for different stimuli¹⁹.
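As a rough sketch of how such a ring connectivity could be built, the code below constructs a coupling matrix for 128 units and computes the normalized recurrent input. It assumes the profile has the form J₋ + J₊·exp(−d²/2σ²), with d the circular distance between preferred directions. The values of `j_minus` and `j_plus` are placeholders for illustration only, not the values used in the paper, and the function names are hypothetical.

```python
import numpy as np


def periodic_gaussian_couplings(n_units=128, sigma_deg=43.2,
                                j_minus=-0.5, j_plus=1.0):
    """Couplings g[i, j] = j_minus + j_plus * exp(-d_ij**2 / (2 sigma**2)).

    d_ij is the circular distance (degrees) between preferred directions.
    j_minus and j_plus are placeholder values, not taken from the paper.
    """
    theta = np.arange(n_units) * 360.0 / n_units
    diff = np.abs(theta[:, None] - theta[None, :])
    d = np.minimum(diff, 360.0 - diff)
    return j_minus + j_plus * np.exp(-d ** 2 / (2.0 * sigma_deg ** 2))


def recurrent_input(g, s_pre):
    """Recurrent current to each postsynaptic unit, divided by the
    number of presynaptic units."""
    s_pre = np.asarray(s_pre, dtype=float)
    return g @ s_pre / s_pre.size
```

Using the wrapped (circular) distance is one simple way to make the Gaussian periodic; summing over shifted copies of the Gaussian would be an alternative with nearly identical values at this width.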

The decision circuit consists of two populations (C₁ and C₂) representing categorical choice, which pool activity of the association neurons. When stimulated, activities of the C₁ and C₂ populations diverge according to winner-take-all dynamics. This behaviour is attained through global inhibition and structured recurrent excitation within the decision circuit²¹, with J_C1,C1=J_C2,C2=0.3725 nA and J_C1,C2=J_C2,C1=−0.1137 nA.

Plastic synapses. All synapses connecting the three local circuits (from sensory to association, and between association and decision neurons) are plastic and excitatory. Synaptic strengths of plastic connections are expressed as g_ij=g_max·c_ij, where g_max is the maximal connection strength and c_ij is bounded between 0 and 1, and represents the fraction of potentiated synapses between neurons i and j. At the end of each trial, all c_ij are updated according to the Hebbian plasticity rule modulated by the RPE as specified in equation (1), where the learning rate q=0.00003, and r_pre and r_post are average firing rates during the stimulus period. The stimulus-specific predicted reward ‹R|θ› was estimated by a running trial average¹⁴: , where τ_R=5, and n enumerates trials with stimulus θ.

Plastic synapses between sensory and association neurons were initialized with the periodic Gaussian profile as in equation (7) with . Plastic synapses between association and decision neurons ( and ) were initialized randomly from a uniform distribution on [0.25, 0.75]. The maximal connection strengths of plastic synapses were , and .

Simulation protocol and external inputs

Each simulation trial starts with a 200-ms pre-stimulus period (no external inputs), followed by a 1-s presentation of a motion direction stimulus and then by a 500-ms intertrial interval. When a motion direction stimulus θ_s is presented, neurons in the sensory network receive additional input current I_s that depends on the neuron’s preferred direction θ:

$$I_s(\theta) = g_s \exp\!\left[-\frac{(\theta-\theta_s)^2}{2\sigma_s^2}\right] \tag{8}$$

where σ_s=43.2° and g_s=0.1 nA. Neurons in the decision circuit receive a nonselective gating current of 0.01 nA during the stimulus period, which sets the circuit in the decision-making regime, and a brief −0.08 nA reset current during the first 300 ms of the intertrial interval, which represents the corollary discharge⁶⁷ and resets activity to the spontaneous level.
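The trial timeline and external currents described above can be sketched as two small functions. This is an illustration under stated assumptions: the stimulus current is taken to be a Gaussian of the circular distance between preferred and stimulus directions, scaled by g_s, and time is measured in milliseconds from trial start. The function and constant names are hypothetical.

```python
import numpy as np

PRE_MS, STIM_MS, ITI_MS = 200.0, 1000.0, 500.0   # trial epochs from the text
SIGMA_S_DEG, G_S_NA = 43.2, 0.1                  # stimulus width and gain


def sensory_input(t_ms, theta_pref_deg, theta_stim_deg):
    """Stimulus current (nA) to sensory units at time t_ms within a trial."""
    prefs = np.asarray(theta_pref_deg, dtype=float)
    if not (PRE_MS <= t_ms < PRE_MS + STIM_MS):
        return np.zeros_like(prefs)
    diff = np.abs(prefs - theta_stim_deg) % 360.0
    d = np.minimum(diff, 360.0 - diff)           # circular distance (assumed)
    return G_S_NA * np.exp(-d ** 2 / (2.0 * SIGMA_S_DEG ** 2))


def decision_input(t_ms):
    """Nonselective current (nA) to the decision circuit: gating during the
    stimulus, reset during the first 300 ms of the intertrial interval."""
    stim_off = PRE_MS + STIM_MS
    if PRE_MS <= t_ms < stim_off:
        return 0.01
    if stim_off <= t_ms < stim_off + 300.0:
        return -0.08
    return 0.0
```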

The model’s response on each trial was determined by comparing firing rates of two decision populations with a 20-Hz threshold during the last 25 ms of the stimulus period. The response is considered invalid if both or neither population reaches threshold, or if either population reaches threshold before the stimulus onset. Across trials, choices of the decision network are stochastic and are characterized by a sigmoidal dependence of the probability of choice C₁ on the difference ΔI in synaptic input currents to two competing populations⁶⁸. Reward equals R=1 on valid correct trials and R=0 on valid incorrect trials; no plasticity is triggered on invalid trials.
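The readout rule can be made concrete with a short sketch. It assumes that each population's rate is averaged over the final 25 ms of the stimulus period before being compared with the threshold, which is one possible reading of the description. The function name and the default stimulus timing (onset at 200 ms, offset at 1,200 ms) are illustrative.

```python
import numpy as np


def classify_trial(t_ms, rate_c1, rate_c2, stim_on_ms=200.0,
                   stim_off_ms=1200.0, threshold_hz=20.0, window_ms=25.0):
    """Return 1 or 2 for a valid choice, or None for an invalid trial."""
    t = np.asarray(t_ms, dtype=float)
    r1 = np.asarray(rate_c1, dtype=float)
    r2 = np.asarray(rate_c2, dtype=float)

    # Invalid if either population crosses threshold before stimulus onset.
    pre = t < stim_on_ms
    if (r1[pre] >= threshold_hz).any() or (r2[pre] >= threshold_hz).any():
        return None

    # Mean rate in the last `window_ms` of the stimulus (an assumption).
    win = (t >= stim_off_ms - window_ms) & (t < stim_off_ms)
    c1_up = r1[win].mean() >= threshold_hz
    c2_up = r2[win].mean() >= threshold_hz

    # Invalid if both or neither population is above threshold.
    if c1_up == c2_up:
        return None
    return 1 if c1_up else 2
```

A reward of 1 would then be assigned when the returned choice matches the stimulus category, 0 when it does not, and the trial would be skipped for plasticity when the function returns `None`.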

Noise correlations, CP and CS for the model neurons were estimated from 10,000 simulated trials with synapses ‘frozen’ (that is, no plasticity) at values attained after a specified number of learning trials. Noise correlation was computed as the Pearson correlation coefficient between the firing rates of neurons i and j across all correct trials for the same stimulus, and then averaged across stimuli. CP and CS were computed as described in the Data analysis section, except that for CP estimation the model’s choice was known explicitly and did not have to be inferred.
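The noise-correlation estimate described here (Pearson correlation within each stimulus on correct trials, then averaged across stimuli) might be computed as in the sketch below. The function name, the argument layout and the minimum of three trials per stimulus are assumptions made for the example.

```python
import numpy as np


def noise_correlation(rates_i, rates_j, stimulus, correct, min_trials=3):
    """Mean over stimuli of the Pearson correlation between two neurons'
    firing rates, using correct trials only."""
    rates_i = np.asarray(rates_i, dtype=float)
    rates_j = np.asarray(rates_j, dtype=float)
    stimulus = np.asarray(stimulus)
    correct = np.asarray(correct, dtype=bool)

    coeffs = []
    for s in np.unique(stimulus[correct]):
        sel = correct & (stimulus == s)
        x, y = rates_i[sel], rates_j[sel]
        # Skip stimuli with too few trials or no variability (assumed rule).
        if sel.sum() < min_trials or x.std() == 0.0 or y.std() == 0.0:
            continue
        coeffs.append(np.corrcoef(x, y)[0, 1])
    return float(np.mean(coeffs)) if coeffs else float("nan")
```

Conditioning on the stimulus before correlating removes the contribution of shared tuning, so two neurons with similar tuning curves can still have a negative noise correlation.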

Simulations were performed using custom code written in MATLAB implementing Heun integration with a time step of 1 ms. Code implementing the model is available upon request via email.

Derivation of equation (3)

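The covariance in equation (2), Cov[R,N|θ]=‹RN|θ›−‹R|θ›‹N|θ›, can be expressed in terms of expectations conditioned on the choice:

$$\mathrm{Cov}[R,N|\theta] = \sum_{i=1}^{n} P_{i,\theta}\,\langle RN|C_i,\theta\rangle - \left(\sum_{i=1}^{n} P_{i,\theta}\,\langle R|C_i,\theta\rangle\right)\left(\sum_{j=1}^{n} P_{j,\theta}\,\langle N|C_j,\theta\rangle\right) \tag{9}$$

for a task with n possible choices C_i, which are selected with probabilities P_i,θ for stimulus θ. In tasks where reward is delivered on the basis of behavioural response, the reward is independent of neural activity when conditioned on the choice; therefore,

$$\langle RN|C_i,\theta\rangle = \langle R|C_i,\theta\rangle\,\langle N|C_i,\theta\rangle = R_{i,\theta}\,N_{i,\theta} \tag{10}$$

where R_i,θ and N_i,θ denote the conditional expectations of reward and neural activity, respectively, for choice C_i and stimulus θ. In these terms, equation (9) can be rewritten as

$$\mathrm{Cov}[R,N|\theta] = \sum_{i<j} P_{i,\theta}\,P_{j,\theta}\,(R_{i,\theta}-R_{j,\theta})\,(N_{i,\theta}-N_{j,\theta}) \tag{11}$$

For tasks with only two possible choices, equation (11) simplifies to equation (3). In the categorization task reward is a deterministic function of choice (1 and 0 for correct and error choice, respectively); hence, the term (R_1,θ−R_2,θ) in equation (3) becomes +1 or −1 for stimuli θ∈C₁ or θ∈C₂, respectively.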
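The algebra of this derivation can be checked numerically. The sketch below compares the covariance written as choice-conditioned expectations with a pairwise form, Σ_{i<j} P_i P_j (R_i−R_j)(N_i−N_j), which for two choices reduces to P₁P₂(R₁−R₂)(N₁−N₂). It assumes, as in the text, that reward and neural activity are independent once the choice is given. The function names are hypothetical.

```python
import numpy as np


def covariance_conditioned(p, r, n):
    """Cov[R, N] from choice probabilities p and the conditional means of
    reward (r) and neural activity (n), assuming R and N are independent
    given the choice."""
    p, r, n = (np.asarray(a, dtype=float) for a in (p, r, n))
    return float(np.sum(p * r * n) - np.sum(p * r) * np.sum(p * n))


def covariance_pairwise(p, r, n):
    """The same covariance as a sum over unordered pairs of choices."""
    total = 0.0
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            total += p[i] * p[j] * (r[i] - r[j]) * (n[i] - n[j])
    return float(total)
```

For example, with P=(0.7, 0.3), R=(1, 0) and N=(55, 50) Hz, both functions give 0.7×0.3×1×5=1.05, so the sign of the expected synaptic change follows the sign of (R₁−R₂)(N₁−N₂).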

Toy-model neuron

We simulated a toy-model neuron (Fig. 5) to illustrate that CP drives synaptic changes independently of a particular network architecture and behavioural task. On each trial, a choice C₁ or C₂ was selected with probability 0.5. The firing rate of the toy-model neuron was then sampled from a Gaussian distribution with the mean N_i for choice C_i and variance 5 Hz. To generate different CP values, the following (N₁,N₂) pairs were used: (55, 50), (51, 50), (50, 50), (50, 51) and (50, 55) Hz. Synaptic changes were simulated with the plasticity rule in equation (1). For simplicity, the firing rate of the neuron on the other side of the synapse was assumed to be static through learning and set to 1. The mean firing rate and CP of the toy-model neuron were also assumed not to change through learning. As in the circuit model, the predicted reward ‹R› was estimated by the running average with τ_R=5, the learning rate was q=0.00003 and the synapse was initialized at 0.5.
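A toy simulation along these lines might look as follows. Several details are assumptions made for illustration, since they are not specified in this section: the update is taken to be Δc = q·(R−‹R›)·r_pre·r_post with c clipped to [0, 1], the running average is updated as ‹R› ← ‹R› + (R−‹R›)/τ_R starting from 0.5, and choice C₁ is taken to be the rewarded choice. The function name is hypothetical.

```python
import numpy as np


def simulate_toy_synapse(n1_hz, n2_hz, n_trials=5000, q=3e-5, tau_r=5.0,
                         variance=5.0, r_pre=1.0, seed=0):
    """Final synaptic variable c after reward-modulated Hebbian learning
    for a neuron with mean rate n1_hz on choice C1 and n2_hz on choice C2."""
    rng = np.random.default_rng(seed)
    c = 0.5       # initial fraction of potentiated synapses
    r_bar = 0.5   # assumed initial reward prediction
    for _ in range(n_trials):
        chose_c1 = rng.random() < 0.5
        mean_rate = n1_hz if chose_c1 else n2_hz
        r_post = rng.normal(mean_rate, np.sqrt(variance))
        reward = 1.0 if chose_c1 else 0.0   # assumption: C1 is rewarded
        # Assumed covariance-type update, clipped to the allowed range.
        c = min(1.0, max(0.0, c + q * (reward - r_bar) * r_pre * r_post))
        # Running average of reward with time constant tau_r (in trials).
        r_bar += (reward - r_bar) / tau_r
    return c
```

Under these assumptions the synapse drifts upward when the neuron fires more on rewarded choices (N₁>N₂), downward when it fires less (N₁<N₂), and stays near its initial value when N₁=N₂, which is the qualitative point of the toy model.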

Behavioural task and neurophysiological recordings

All monkey data are from ref. 5, where experimental protocol and recording procedures were described in detail⁶⁹. Two rhesus monkeys (Macaca mulatta, weighing about 14 kg) were trained to classify random-dot motion stimuli according to an arbitrary category boundary, which divided 360° of motion directions into two 180°-wide categories. Stimuli were circular patches (9° in diameter) of high-contrast square dots that moved with 100% motion coherence and at a speed of 12° s⁻¹. Stimuli were always centred in the response field (RF) of the neuron under study. To dissociate categorical decisions from motor or premotor signals, the animals indicated category membership of the first stimulus (sample) by reporting (with a hand movement) whether it matched the category of the second stimulus (test). We focused on the categorization process of the sample stimulus and studied neural activity during the sample period (150–750 ms after stimulus onset, stimulus duration was 650 ms). To combine data from the two monkeys, all stimulus directions were rotated so that the category boundary was aligned with a 0°–180° axis.

The monkeys were implanted with a head post, scleral search coil and recording chamber. Recording chambers were implanted in accordance with coordinates (approximate centres at P3, L10) determined by magnetic resonance imaging, and allowed access to both the intraparietal sulcus (IPS) and the superior temporal sulcus by means of a dorsal approach. All surgical and experimental procedures followed the Harvard Medical School and National Institutes of Health guidelines. During LIP recordings, electrode penetrations sequentially encountered both the medial and lateral banks of the IPS. Most IPS neurons were tested with a memory-saccade task and a passive viewing flash-mapping task to generate detailed spatial maps of neuronal RFs. Neurons were considered to be in LIP if they showed spatially selective delay activity during the memory-saccade task or were located between such neurons in that electrode penetration. LIP neurons were not prescreened for direction selectivity. Area MT neurons were distinguished by direction-selective responses to moving spots and bars, and RF sizes that were roughly proportional to their eccentricity.

Data analysis

Tuning curve characterization. The firing rates of MT and LIP neurons were transformed to standard z-scores. Tuning curves r(θ) were then constructed by computing average standardized firing rates in response to 12 motion direction stimuli θ. Tuning curves r(θ) of MT and LIP neurons, as well as those of association neurons in the circuit model were fitted by directional and categorical tuning profiles (least squares fit). The directional tuning profile was modelled by an exponential cosine function:

where r₀ is the baseline firing rate, r_max is the peak amplitude, w is the tuning width parameter and θ₀ is the preferred direction. First, we obtained the median tuning width w for each population from the unconstrained fit, and then refitted tuning curves with w constrained within the 10 percentile range around the median (93.4°–126.1° for MT and 101.4°–142.7° for LIP and association neurons) to avoid very broad low amplitude (that is, nearly flat) directional fits. The resulting median tuning width was 120.9° for LIP and 104.9° for MT neurons, similar to previous reports⁶. The categorical tuning profile was modelled by a step function:

$$r(\theta) = \begin{cases} \bar{r}_1, & \theta \in C_1 \\ \bar{r}_2, & \theta \in C_2 \end{cases}$$

where r̄_i is the average firing rate across stimuli in category C_i. We repeated the analysis with more complex categorical tuning profiles (a periodic sigmoid function and a step-like function with a smoother firing rate change near category boundaries), and it did not change the conclusions of our study.

We then used a regularized GLM²⁶ to determine the relative contribution of fitted directional and categorical tuning profiles to neural firing rates. Regularized GLM provides a principled way to assess the relative strength of direction and category tuning in each neuron, without overfitting and avoiding confounds because of correlation between direction and category-tuning profiles for neurons tuned to category centres. The regression algorithm solves the matrix equation β=(X^TX+λI)⁻¹X^Tr, where X is the matrix of three factors: fitted directional tuning profile, categorical tuning profile and a baseline, r is the vector of neuron’s firing rates across trials, β are the regression coefficients for each of the factors in X and λ is a ridge regression coefficient. The value of λ was chosen on the basis of a leave-one-trial-out cross-validation procedure, such that λ minimized the mean squared difference between predicted and actual firing rates⁷⁰.
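The ridge estimate β=(XᵀX+λI)⁻¹Xᵀr and the leave-one-trial-out choice of λ can be sketched with plain NumPy. This is an illustration rather than the authors' implementation: the candidate grid for λ is left to the caller, and the penalty is applied to all columns of X, including the baseline, exactly as the matrix formula is written. Function names are hypothetical.

```python
import numpy as np


def ridge_fit(X, r, lam):
    """Solve (X^T X + lam I) beta = X^T r for the regression coefficients."""
    n_factors = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_factors), X.T @ r)


def loo_mse(X, r, lam):
    """Mean squared prediction error under leave-one-trial-out cross-validation."""
    errors = []
    for k in range(len(r)):
        keep = np.arange(len(r)) != k
        beta = ridge_fit(X[keep], r[keep], lam)
        errors.append((r[k] - X[k] @ beta) ** 2)
    return float(np.mean(errors))


def choose_lambda(X, r, candidates):
    """Return the candidate lambda with the smallest leave-one-out error."""
    scores = [loo_mse(X, r, lam) for lam in candidates]
    return candidates[int(np.argmin(scores))]
```

Here `X` would hold, per trial, the fitted directional profile, the categorical profile and a constant column, and `r` the trial-by-trial firing rates.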

To determine whether the resulting β coefficients were significantly different from zero, we used a standard t-test to compare β against the distribution of shuffled β values, which was obtained by randomizing the trial order and then refitting the linear regression model (1,000 reshuffles). Each neuron was then classified as direction-tuned or category-tuned if the corresponding β was significantly different from zero (P<0.05), mixed direction- and category-tuned if both β’s were significantly different from zero and nonselective if neither β was significantly different from zero.

CTI and CS. The CTI compared, for each neuron, the difference in firing rate (averaged across all trials for each direction) between pairs of directions in different categories (the between-category difference, BCD) with the difference in activity between pairs of directions in the same category (the within-category difference, WCD). The CTI was defined as the between-category difference minus the within-category difference, divided by their sum: CTI=(BCD−WCD)/(BCD+WCD). Values of the index could vary from 1 (strong differences in activity to directions in the two categories) to −1 (large activity differences between directions in the same category, no difference between categories). A CTI value of 0 indicates the same difference in the firing rate between and within categories.
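A simplified version of the index can be written in a few lines. The sketch assumes that the between- and within-category differences are mean absolute differences in average firing rate taken over all pairs of directions; the original analysis may have used a specific subset of direction pairs, so this is only an approximation of the procedure. The function name is hypothetical.

```python
import numpy as np


def category_tuning_index(mean_rates, category):
    """CTI = (BCD - WCD) / (BCD + WCD) from per-direction mean firing rates
    and per-direction category labels, using all pairs of directions."""
    rates = np.asarray(mean_rates, dtype=float)
    cat = np.asarray(category)
    i, j = np.triu_indices(len(rates), k=1)      # all unordered pairs
    diffs = np.abs(rates[i] - rates[j])
    same = cat[i] == cat[j]
    wcd = diffs[same].mean()                     # within-category difference
    bcd = diffs[~same].mean()                    # between-category difference
    if bcd + wcd == 0.0:
        return float("nan")                      # flat tuning: index undefined
    return float((bcd - wcd) / (bcd + wcd))
```

A purely binary category response, such as rates (1, 1, 1, 0, 0, 0) for categories (1, 1, 1, 2, 2, 2), gives WCD=0 and therefore CTI=1.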

CS was estimated using a receiver-operating characteristic (ROC) analysis¹⁷ applied to the distributions of firing rates on correct trials with stimuli from categories C1 and C2. CS is the area under the ROC curve, which ranges between 0 and 1, and indicates the accuracy with which an ideal observer can assign category membership of a stimulus on the basis of the neuron’s trial-by-trial firing rate. Values of 1 and 0 correspond to strong preference for categories C1 and C2, respectively. Values of 0.5 indicate complete overlap of the firing rate distributions for the two categories, that is, no category selectivity.

Estimation of CP in MT and LIP neurons. CP was estimated on trials for which the test stimulus was far from (45° or 75°) the category boundary. The monkeys were proficient in categorizing such stimuli (97% correct when both sample and test were far from the boundary); therefore, we assumed that on these trials the test stimulus was categorized correctly and inferred the monkey’s decision about the sample category to be the same as the test category if the monkey responded match, and different category if the monkey responded nonmatch⁵⁴. For each stimulus, CP was estimated using an ROC analysis applied to the distributions of firing rates on trials with different category decisions for the same stimulus (that is, correct versus error trials). CP is the area under the ROC curve that ranges between 0 and 1 and indicates the accuracy with which an ideal observer can predict the monkey’s category decision on a trial-by-trial basis given neuron’s firing rate. Values of 1 and 0 correspond to strong preference (higher firing rate) for C₁ and C₂ category decisions, respectively. Values of 0.5 indicate complete overlap of the firing rate distributions for two decisions. To reliably estimate CP, only stimuli with at least three trials for each category choice were included in the analysis, and only those neurons were included that had a valid CP estimate for at least one stimulus in each category, which resulted in 88 LIP and 31 MT neurons left for the analysis. The CP reported for each neuron was the average CP across all stimuli that passed the inclusion criteria. Significance of CP values for individual neurons was assessed with a shuffle test. To this end, choices of the monkey were randomly assigned to the firing rate data (separately for each stimulus), and then CP was recomputed (1,000 reshuffles). The actual CP was compared with the shuffled distribution with a two-sample t-test.
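Both CS and CP are areas under an ROC curve, which can be computed directly as the probability that a rate drawn from one distribution exceeds a rate drawn from the other (ties counted as one half). The sketch below applies this to CP, averaging across stimuli that have at least three trials per choice, and builds a shuffled null distribution by permuting choices within each stimulus. It omits the additional neuron-level inclusion criterion and the final statistical comparison described in the text. The function names and the coding of choices as 1 and 2 are assumptions.

```python
import numpy as np


def roc_area(rates_a, rates_b):
    """Area under the ROC curve: P(a > b) + 0.5 * P(a == b)."""
    a = np.asarray(rates_a, dtype=float)[:, None]
    b = np.asarray(rates_b, dtype=float)[None, :]
    return float(np.mean((a > b) + 0.5 * (a == b)))


def choice_probability(rates, choices, stimuli, min_trials=3):
    """Mean ROC area (choice 1 versus choice 2) across eligible stimuli."""
    rates = np.asarray(rates, dtype=float)
    choices = np.asarray(choices)
    stimuli = np.asarray(stimuli)
    values = []
    for s in np.unique(stimuli):
        sel = stimuli == s
        r1 = rates[sel & (choices == 1)]
        r2 = rates[sel & (choices == 2)]
        if len(r1) >= min_trials and len(r2) >= min_trials:
            values.append(roc_area(r1, r2))
    return float(np.mean(values)) if values else float("nan")


def shuffled_cps(rates, choices, stimuli, n_shuffles=1000, seed=0):
    """Null distribution of CP with choices permuted within each stimulus."""
    rng = np.random.default_rng(seed)
    choices = np.asarray(choices)
    stimuli = np.asarray(stimuli)
    null = []
    for _ in range(n_shuffles):
        perm = choices.copy()
        for s in np.unique(stimuli):
            idx = np.flatnonzero(stimuli == s)
            perm[idx] = rng.permutation(choices[idx])
        null.append(choice_probability(rates, perm, stimuli))
    return np.array(null)
```

The same `roc_area` function gives CS when it is applied to firing rates on correct trials with stimuli from category C1 versus C2.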

Additional information

How to cite this article: Engel, T. A. et al. Choice-correlated activity fluctuations underlie learning of neuronal category representation. Nat. Commun. 6:6454 doi: 10.1038/ncomms7454 (2015).

References

Rosch, E. Principles of categorization. in: Concepts: Core Readings (eds Margolis E., Laurence S. 189–206MIT Press: Cambridge, Massachusetts, (1999) .
Ashby, F. G. & Maddox, W. T. Human category learning. Annu. Rev. Psychol. 56, 149–178 (2005) .
Article Google Scholar
Ghose, G. M., Yang, T. & Maunsell, J. H. R. Physiological correlates of perceptual learning in monkey V1 and V2. J. Neurophys. 87, 1867–1888 (2002) .
Article Google Scholar
Yang, T. & Maunsell, J. H. R. The effect of perceptual learning on neuronal responses in monkey visual area V4. J. Neurosci. 24, 1617–1626 (2004) .
Article Google Scholar
Freedman, D. J. & Assad, J. A. Experience-dependent representation of visual categories in parietal cortex. Nature 443, 85–88 (2006) .
Article CAS ADS Google Scholar
Fanini, A. & Assad, J. A. Direction selectivity of neurons in the macaque lateral intraparietal area. J. Neurophys. 101, 289–305 (2009) .
Article Google Scholar
Law, C.-T. & Gold, J. I. Neural correlates of perceptual learning in a sensory-motor, but not a sensory, cortical area. Nat. Neurosci. 11, 505–513 (2008) .
Article CAS Google Scholar
Purushothaman, G. & Bradley, D. C. Neural population code for fine perceptual decisions in area MT. Nat. Neurosci. 8, 99–106 (2005) .
Article CAS Google Scholar
Uka, T., Sasaki, R. & Kumano, H. Change in choice-related response modulation in area MT during learning of a depth-discrimination task is consistent with task learning. J. Neurosci. 32, 13689–13700 (2012) .
Article CAS Google Scholar
Law, C.-T. & Gold, J. I. Reinforcement learning can account for associative and perceptual learning on a visual-decision task. Nat. Neurosci. 12, 655–663 (2009) .
Article CAS Google Scholar
Rombouts, J., Bohte, S. & Roelfsema, P. Neurally plausible reinforcement learning of working memory tasks. Adv. Neural Inf. Process. Syst. 25, 1880–1888 (2012) .
Google Scholar
Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997) .
Article CAS Google Scholar
Schultz, W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 30, 259–288 (2007) .
Article CAS Google Scholar
Fremaux, N., Sprekeler, H. & Gerstner, W. Functional requirements for reward-modulated spike-timing-dependent plasticity. J. Neurosci. 30, 13326–13337 (2010) .
Article CAS Google Scholar
Loewenstein, Y. & Seung, H. S. Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity. Proc. Natl Acad. Sci. USA 103, 15224–15229 (2006) .
Article CAS ADS Google Scholar
Bishop, C. M. Neural Networks for Pattern Recognition Oxford University Press (1995) .
Britten, K. H., Newsome, W. T., Shadlen, M. N., Celebrini, S. & Movshon, J. A. A relationship between behavioral choice and the visual responses of neurons in macaque MT. Vis. Neurosci. 13, 87–100 (1996) .
Article CAS Google Scholar
Nienborg, H., Cohen, M. R. & Cumming, B. G. Decision-related activity in sensory neurons: correlations among neurons and with behavior. Annu. Rev. Neurosci. 35, 463–483 (2012) .
Article CAS Google Scholar
Engel, T. A. & Wang, X.-J. Same or different? A neural circuit mechanism of similarity-based pattern match decision making. J. Neurosci. 31, 6982–6996 (2011) .
Article CAS Google Scholar
Wang, X. J. Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36, 955–968 (2002) .
Article CAS Google Scholar
Wong, K.-F. & Wang, X.-J. A recurrent network mechanism of time integration in perceptual decisions. J. Neurosci. 26, 1314–1328 (2006) .
Article CAS Google Scholar
Fiorillo, C. D., Tobler, P. N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003) .
Article CAS ADS Google Scholar
Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141 (2005) .
Article CAS Google Scholar
Reynolds, J. N., Hyland, B. I. & Wickens, J. R. A cellular mechanism of reward-related learning. Nature 413, 67–70 (2001) .
Article CAS ADS Google Scholar
Shen, W., Flajolet, M., Greengard, P. & Surmeier, D. J. Dichotomous dopaminergic control of striatal synaptic plasticity. Science 321, 848–851 (2008) .
Article CAS ADS Google Scholar
Dobson, A. An Introduction to Generalized Linear Models Chapman & Hall/CRC (2001) .
Borg, I. & Groenen, P. J. F. Modern Multidimensional Scaling: Theory and Applications Springer (1997) .
Marzban, C. The ROC curve and the area under it as performance measures. Wea. Forecasting 19, 1106–1114 (2004) .
Article ADS Google Scholar
Shadlen, M. N., Britten, K. H., Newsome, W. T. & Movshon, J. A. A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J. Neurosci. 16, 1486–1510 (1996) .
Article CAS Google Scholar
Martinez-Trujillo, J. C. & Treue, S. Feature-based attention increases the selectivity of population responses in primate visual cortex. Curr. Biol. 14, 744–751 (2004) .
Article CAS Google Scholar
Ardid, S., Wang, X.-J. & Compte, A. An integrated microcircuit model of attentional processing in the neocortex. J. Neurosci. 27, 8486–8495 (2007) .
Article CAS Google Scholar
Cohen, M. R. & Newsome, W. T. Context-dependent changes in functional circuitry in visual area MT. Neuron 60, 162–173 (2008) .
Article CAS Google Scholar
Gu, Y., Angelaki, D. E. & DeAngelis, G. C. Neural correlates of multisensory cue integration in macaque MSTd. Nat. Neurosci. 11, 1201–1210 (2008) .
Article CAS Google Scholar
Nienborg, H. & Cumming, B. G. Decision-related activity in sensory neurons reflects more than a neuron’s causal effect. Nature 459, 89–92 (2009) .
Article CAS ADS Google Scholar
Williams, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992) .
MATH Google Scholar
Xie, X. & Seung, H. S. Learning in neural networks by reinforcement of irregular spiking. Phys. Rev. E 69, 1–10 (2004) .
MathSciNet Google Scholar
Pfister, J. P., Toyoizumi, T., Barber, D. & Gerstner, W. Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning. Neural Comput. 18, 1318–1348 (2006) .
Article MathSciNet Google Scholar
Seung, H. S. Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40, 1063–1073 (2003) .
Article CAS Google Scholar
Vasilaki, E., Fremaux, N., Urbanczik, R., Senn, W. & Gerstner, W. Spike-based reinforcement learning in continuous state and action space: when policy gradient methods fail. PLoS Comp. Biol. 5, e1000586 (2009) .
Article ADS Google Scholar
Urbanczik, R. & Senn, W. Reinforcement learning in populations of spiking neurons. Nat. Neurosci. 12, 250–252 (2009) .
Article CAS Google Scholar
Roelfsema, P. R. & van Ooyen, A. Attention-gated reinforcement learning of internal representations for classification. Neural Comput. 17, 2176–2214 (2005) .
Article Google Scholar
Soltani, A. & Wang, X.-J. Synaptic computation underlying probabilistic inference. Nat. Neurosci. 13, 112–119 (2010).
Fitzgerald, J. K., Freedman, D. J. & Assad, J. A. Generalized associative representations in parietal cortex. Nat. Neurosci. 14, 1075–1079 (2011).
Goodwin, S. J., Blackman, R. K., Sakellaridi, S. & Chafee, M. V. Executive control over cognition: stronger and earlier rule-based modulation of spatial category signals in prefrontal cortex relative to parietal cortex. J. Neurosci. 32, 3499–3515 (2012).
Toth, L. J. & Assad, J. A. Dynamic coding of behaviourally relevant stimuli in parietal cortex. Nature 415, 165–168 (2002).
Stoet, G. & Snyder, L. H. Single neurons in posterior parietal cortex of monkeys encode cognitive set. Neuron 42, 1003–1012 (2004).
Ferrera, V. P., Yanike, M. & Cassanello, C. Frontal eye field neurons signal changes in decision criteria. Nat. Neurosci. 12, 1458–1462 (2009).
Mante, V., Sussillo, D., Shenoy, K. V. & Newsome, W. T. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503, 78–84 (2013).
Swaminathan, S. K., Masse, N. Y. & Freedman, D. J. A comparison of lateral and medial intraparietal areas during a visual categorization task. J. Neurosci. 33, 13157–13170 (2013).
Sigala, N. & Logothetis, N. K. Visual categorization shapes feature selectivity in the primate temporal cortex. Nature 415, 318–320 (2002).
Freedman, D. J., Riesenhuber, M., Poggio, T. & Miller, E. K. A comparison of primate prefrontal and inferior temporal cortices during visual categorization. J. Neurosci. 23, 5235–5246 (2003).
Wallis, J. & Miller, E. From rule to response: neuronal processes in the premotor and prefrontal cortex. J. Neurophysiol. 90, 1790–1806 (2003).
Freedman, D. J., Riesenhuber, M., Poggio, T. & Miller, E. K. Categorical representation of visual stimuli in the primate prefrontal cortex. Science 291, 312–316 (2001).
Swaminathan, S. K. & Freedman, D. J. Preferential encoding of visual categories in parietal cortex compared with prefrontal cortex. Nat. Neurosci. 15, 315–320 (2012).
Cromer, J. A., Roy, J. E. & Miller, E. K. Representation of multiple, independent categories in the primate prefrontal cortex. Neuron 66, 796–807 (2010).
Szabo, M. et al. Learning to attend: modeling the shaping of selectivity in infero-temporal cortex in a categorization task. Biol. Cybern. 94, 351–365 (2006).
Roelfsema, P. R., van Ooyen, A. & Watanabe, T. Perceptual learning rules based on reinforcers and attention. Trends Cogn. Sci. 14, 64–71 (2010).
Freedman, D. J. & Assad, J. A. A proposed common neural mechanism for categorization and perceptual decisions. Nat. Neurosci. 14, 143–146 (2011).
DeLa Rocha, J., Wimmer, K., Renart, A., Roxin, A. & Compte, A. in Society for Neuroscience Annual Meeting (New Orleans, LA, USA, 2012).
Freedman, D. J. & Assad, J. A. Distinct encoding of spatial and nonspatial visual information in parietal cortex. J. Neurosci. 29, 5671–5680 (2009).
Rishel, C. A., Huang, G. & Freedman, D. J. Independent category and spatial encoding in parietal cortex. Neuron 77, 969–979 (2013).
Haefner, R. M., Gerwinn, S., Macke, J. H. & Bethge, M. Inferring decoding strategies from choice probabilities in the presence of correlated variability. Nat. Neurosci. 16, 235–242 (2013).
Shiozaki, H. M., Tanabe, S., Doi, T. & Fujita, I. Neural activity in cortical area V4 underlies fine disparity discrimination. J. Neurosci. 32, 3830–3841 (2012).
Compte, A., Brunel, N., Goldman-Rakic, P. S. & Wang, X. J. Synaptic mechanisms and network dynamics underlying spatial working memory in a cortical network model. Cerebral Cortex 10, 910–923 (2000).
Abbott, L. F. & Chance, F. S. in Cortical Function: a View from the Thalamus, vol. 149 of Progress in Brain Research (eds Guillery V. C. R., Sherman S.) 147–155 (Elsevier, 2005).
Carandini, M. & Heeger, D. J. Normalization as a canonical neural computation. Nat. Rev. Neurosci. 13, 51–62 (2011).
Crapse, T. B. & Sommer, M. A. Corollary discharge circuits in the primate brain. Curr. Opin. Neurobiol. 18, 552–557 (2008).
Soltani, A. & Wang, X.-J. A biophysically based neural model of matching law behavior: melioration by stochastic synapses. J. Neurosci. 26, 3731–3744 (2006).
Freedman, D. & Assad, J. Distinct encoding of spatial and nonspatial visual information in parietal cortex. J. Neurosci. 29, 5671–5680 (2009).
Hastie, T., Tibshirani, R. & Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009).

Acknowledgements

This work was supported by NIH grant R01MH092927, the Swartz Foundation, the Kavli Foundation, the McKnight Endowment Fund for Neuroscience and the Alfred P. Sloan Foundation. We thank John Assad for valuable contributions during all phases of the neurophysiological studies that produced the data examined here.

Author information

Tatiana A. Engel and Warasinee Chaisangmongkon: These authors contributed equally to this work.

Authors and Affiliations

Department of Neurobiology, Yale University School of Medicine, Kavli Institute for Neuroscience, 333 Cedar Street, New Haven, 06510, Connecticut, USA
Tatiana A. Engel, Warasinee Chaisangmongkon & Xiao-Jing Wang
Department of Bioengineering, Stanford University, 318 Campus Drive, Stanford, 94305, California, USA
Tatiana A. Engel
Department of Neurobiology, The University of Chicago, 5812 S. Ellis Ave., Chicago, 60637, Illinois, USA
David J. Freedman
Center for Neural Science, New York University, 4 Washington Place, New York, 10003, New York, USA
Xiao-Jing Wang
NYU-ECNU Joint Institute of Brain and Cognitive Science, NYU-Shanghai, Shanghai, 200122, China
Xiao-Jing Wang

Authors

Tatiana A. Engel
Warasinee Chaisangmongkon
David J. Freedman
Xiao-Jing Wang

Contributions

T.A.E., W.C. and X.-J.W. designed research. T.A.E. and W.C. performed model simulations and analysed data. D.J.F. designed and performed experiments. T.A.E., W.C., D.J.F. and X.-J.W. discussed results and wrote the paper.

Corresponding author

Correspondence to Xiao-Jing Wang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures 1-10, Supplementary Tables 1-2, Supplementary Notes 1-4 and Supplementary References (PDF 1234 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Engel, T., Chaisangmongkon, W., Freedman, D. et al. Choice-correlated activity fluctuations underlie learning of neuronal category representation. Nat Commun 6, 6454 (2015). https://doi.org/10.1038/ncomms7454

Received: 31 October 2014
Accepted: 29 January 2015
Published: 11 March 2015
DOI: https://doi.org/10.1038/ncomms7454
