Published in: Psychological Research 3/2008

01-05-2008 | Original Article

Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes

Authors: Wai-Tat Fu, John R. Anderson

Abstract

In most problem-solving activities, feedback is received at the end of an action sequence. This creates a credit-assignment problem: the learner must associate the feedback with earlier actions, and because the actions are interdependent, the learner must also remember past choices. In two studies, we investigated the nature of explicit and implicit learning processes in the credit-assignment problem using a probabilistic sequential choice task with and without a secondary memory task. We found that when explicit learning was dominant, participants learned to select the better option faster in their first choices than in their last choices. When implicit reinforcement learning was dominant, participants learned to select the better option faster in their last choices than in their first choices. Consistent with the probability-learning and sequence-learning literature, the results show that credit assignment involves two processes: an explicit memory encoding process that requires memory rehearsals and an implicit reinforcement-learning process that propagates credits backwards to previous choices.
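To make the idea of propagating credit backwards concrete, the following is a minimal sketch of a temporal-difference style update for a two-choice sequence. It is an illustration under stated assumptions, not the model used in the article: the function name `td_backup`, the learning rate `alpha`, and the use of a fixed expected `reward` in place of sampled probabilistic outcomes are all hypothetical choices made for this example.

```python
def td_backup(n_trials, alpha=0.2, reward=1.0):
    """Return (q_first, q_last) after n_trials of a two-choice sequence.

    Assumptions for illustration only: the learner repeats the same
    first and last choice on every trial, and `reward` stands in for
    the expected value of the probabilistic outcome.
    """
    q_first, q_last = 0.0, 0.0
    for _ in range(n_trials):
        # The first choice gets no immediate feedback, so it is updated
        # toward the current value estimate of the choice that follows it.
        q_first += alpha * (q_last - q_first)
        # The last choice is updated directly toward the feedback
        # received at the end of the sequence.
        q_last += alpha * (reward - q_last)
    return q_first, q_last


if __name__ == "__main__":
    for n in (1, 5, 20):
        first, last = td_backup(n)
        print(f"trials={n:2d}  first={first:.3f}  last={last:.3f}")
```

In this sketch the value of the last choice rises first and the value of the first choice follows only after the last choice has acquired some value, which mirrors the pattern the abstract reports when implicit reinforcement learning dominates.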
Footnotes
1
The measure of choice proportions is similar to the error measures in the studies by Schvaneveldt and Gomez (1998).
 
References
Ashby, F. G., Queller, S., & Berretty, P. M. (1999). On the dominance of unidimensional rules in unsupervised categorization. Perception and Psychophysics, 61, 1178–1199.
Allen, S. W., & Brooks, L. R. (1991). Specializing the operation of an explicit rule. Journal of Experimental Psychology: General, 120, 3–19.
Cleeremans, A. (1997). Sequence learning in a dual-stimulus setting. Psychological Research, 60, 72–86.
Cleeremans, A., & McClelland, J. L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235–253.
Cohen, A., Ivry, R., & Keele, S. (1990). Attention and structure in sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 17–30.
Curran, T., & Keele, S. W. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 189–202.
Daw, N., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8, 1704–1711.
Drewnowski, A., & Murdock, B. B. (1980). The role of auditory features in memory span for words. Journal of Experimental Psychology: Human Learning and Memory, 6, 319–332.
Estes, W. K. (1964). Probability learning. In: A. W. Melton (Ed.), Categories of human learning. New York: Academic.
Estes, W. K. (2002). Traps in the route to models of memory and decision. Psychonomic Bulletin and Review, 9(1), 3–25.
Frensch, P., Buchner, A., & Lin, J. (1994). Implicit learning of unique and ambiguous transitions in the presence and absence of a secondary task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 567–584.
Frensch, P. (1998). One concept, multiple meanings: On how to define the concept of implicit learning. In: M. A. Stadler, & P. A. Frensch (Eds.), Handbook of implicit learning (pp. 47–104). Thousand Oaks: Sage Publications.
Friedman, M. P., Burke, C. J., Cole, M., Keller, L., Millward, R. B., & Estes, W. K. (1964). Two-choice behavior under extended training with shifting probabilities of reinforcement. In: R. C. Atkinson (Ed.), Studies in mathematical psychology (pp. 250–316). Stanford, CA, USA: Stanford University Press.
Fu, W., & Anderson, J. (2006). From recurrent choice to skill learning: A model of reinforcement learning. Journal of Experimental Psychology: General, 135, 184–206.
Gallistel, C. R. (2005). Deconstructing the law of effect. Games and Economic Behavior, 52, 410–423.
Grafton, S. T., Hazeltine, E., & Ivry, R. (1995). Functional mapping of sequence learning in normal humans. Journal of Cognitive Neuroscience, 7, 497–510.
Graybiel, A. M. (1995). Building action repertoires: Memory and learning functions of the basal ganglia. Current Opinion in Neurobiology, 5, 733–741.
Haider, H., & Frensch, P. A. (2002). Why aggregated learning follows the power law of practice when individual learning does not: Comment on Rickard (1997, 1999), Delaney et al. (1998), and Palmeri (1999). Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 392–406.
Jimenez, L., Mendez, G., & Cleeremans, A. (1996). Direct and indirect measures of implicit learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 948–969.
Keele, S., Ivry, R., Mayr, U., Hazeltine, E., & Heuer, H. (2003). The cognitive and neural architecture of sequence representation. Psychological Review, 110, 316–339.
Knowlton, B. J., Squire, L. R., & Gluck, M. (1994). Probabilistic classification learning in amnesia. Learning and Memory, 1, 106–120.
McCallum, A. K. (1995). Reinforcement learning with selective perception and hidden state. Ph.D. thesis, University of Rochester.
Nissen, M., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32.
Packard, M., & Knowlton, B. (2002). Learning and memory functions of the basal ganglia. Annual Review of Neuroscience, 25, 563–593.
Perruchet, P., & Amorim, P. A. (1992). Conscious knowledge and changes in performance in sequence learning: Evidence against dissociation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 785–800.
Poldrack, R., Clark, J., Pare-Blagoev, E., Shohamy, D., Moyano, J., Myers, C., & Gluck, M. (2001). Interactive memory systems in the human brain. Nature, 414, 546–550.
Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235.
Reed, J., & Johnson, P. (1994). Assessing implicit learning with indirect tests: Determining what is learned about sequence structure. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 585–594.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275, 1593–1599.
Schvaneveldt, R., & Gomez, R. (1998). Attention and probabilistic sequence learning. Psychological Research, 61, 175–190.
Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447.
Stadler, M. A. (1992). Statistical structure and implicit serial learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 318–327.
Stadler, M. A. (1995). The role of attention in implicit learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 674–685.
Sun, R., Slusarz, P., & Terry, C. (2005). The interaction of the explicit and the implicit in skill learning: A dual-process approach. Psychological Review, 112, 159–192.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT.
Vulkan, N. (2000). An economist’s perspective on probability matching. Journal of Economic Surveys, 14, 101–118.
Ward, L. B. (1937). Reminiscence and rote learning. Psychological Monographs, 49, 64.
Waldron, E., & Ashby, G. (2001). The effects of concurrent task interference on category learning: Evidence for multiple category learning systems. Psychonomic Bulletin and Review, 8, 168–176.
Willingham, D. (1998). A neuropsychological theory of motor skill learning. Psychological Review, 105, 558–584.
Willingham, D., Nissen, M., & Bullemer, P. (1989). On the development of procedural knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1047–1060.
Yellott, J. L. (1969). Probability learning with noncontingent success. Journal of Mathematical Psychology, 6, 541–575.
Ziessler, M. (1994). The impact of motor responses on serial pattern learning. Psychological Research, 57, 30–41.
Ziessler, M. (1998). Response-effect learning as a major component of implicit serial learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 962–978.
Ziessler, M., & Nattkemper, D. (2001). Learning of event sequences is based on response-effect learning: Further evidence from a serial reaction task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 595–613.
Ziessler, M., Nattkemper, D., & Frensch, P. A. (2004). The role of anticipation and intention in the learning of effects of self-performed actions. Psychological Research, 68, 163–175.
Metadata
Title
Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes
Authors
Wai-Tat Fu
John R. Anderson
Publication date
01-05-2008
Publisher
Springer-Verlag
Published in
Psychological Research / Issue 3/2008
Print ISSN: 0340-0727
Electronic ISSN: 1430-2772
DOI
https://doi.org/10.1007/s00426-007-0113-7