Abstract
This paper presents ILGM (the Infant Learning to Grasp Model), the first computational model of infant grasp learning that is constrained by the infant motor development literature. By grasp learning we mean learning how to make motor plans in response to sensory stimuli such that open-loop execution of the plan leads to a successful grasp. The open-loop assumption is justified by the behavioral evidence that early grasping is based on open-loop control rather than on-line visual feedback. Key elements of the infancy period, namely elementary motor schemas, the exploratory nature of infant motor interaction, and inherent motor variability are captured in the model. In particular we show, through computational modeling, how an existing behavior (reaching) yields a more complex behavior (grasping) through interactive goal-directed trial and error learning. Our study focuses on how the infant learns to generate grasps that match the affordances presented by objects in the environment. ILGM was designed to learn execution parameters for controlling the hand movement as well as for modulating the reach to provide a successful grasp matching the target object affordance. Moreover, ILGM produces testable predictions regarding infant motor learning processes and poses new questions to experimentalists.
Notes
Technically, REINFORCE requires that the firing probability of a neuron be specified as a differentiable function of its input. Our algorithm does not conform to this criterion (i.e., Step 1), but it can be shown that a softmax approximation to Step 1 yields a learning rule similar to the one we used.
References
Arbib MA, Hoff B (1994) Trends in neural modeling for reach to grasp. In: Bennett KMB, Castiello U (eds) Insights into the reach to grasp movement. North-Holland, Amsterdam
Barto A, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems (in press)
Baud-Bovy G, Soechting JF (2001) Two virtual fingers in the control of the tripod grasp. J Neurophysiol 86:604–615
Bayley N (1936) The California infant scale of motor development, birth to three years. University of California Press, Berkeley, CA
Bernstein NA (1967) The coordination and regulation of movements. Pergamon Press, Oxford
Berthier NE, Clifton RK, Gullapalli V, McCall DD, Robin DJ (1996) Visual information and object size in the control of reaching. J Motor Behav 28:187–197
Bradley NS (2000) Motor control: developmental aspects of motor control in skill acquisition. In: Campbell SK, Van der Linden DW, Palisano RJ (eds) Physical therapy for children, a comprehensive reference for pediatric practice, 2nd edn. Saunders, Philadelphia, pp 45–87
Butterworth G, Verweij E, Hopkins B (1997) The development of prehension in infants: Halverson revisited. Brit J Dev Psychol 15:223–236
Clifton RK, Muir DW, Ashmead DH, Clarkson MG (1993) Is visually guided reaching in early infancy a myth? Child Dev 64:1099–1110
Dayan P, Hinton G (1993) Feudal reinforcement learning. In: Lippman DS, Moody JE, Touretzky DS (eds) Advances in neural information processing systems 5. Morgan Kaufmann, San Mateo, CA, pp 271–278
Diamond A, Lee EY (2000) Inability of five-month-old infants to retrieve a contiguous object: a failure of conceptual understanding or of control of action? Child Dev 71:1477–1494
Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13:227–303
Doya K, Samejima K, Katagiri K, Kawato M (2002) Multiple model-based reinforcement learning. Neural Comput 14:1347–1369
Fagg AH, Arbib MA (1998) Modeling parietal-premotor interactions in primate control of grasping. Neural Netw 11:1277–1303
Fearing RS (1986) Simplified grasping and manipulation with dexterous robot hands. IEEE Journal of Robotics and Automation 2:188–195
Gibson EJ (1969) Principles of perceptual learning and development. Prentice-Hall, Englewood Cliffs, NJ
Gibson EJ (1988) Exploratory behavior in the development of perceiving, acting and acquiring of knowledge. Ann Rev Psychol 39:1–41
Halverson HM (1931) An experimental study of prehension in infants by means of systematic cinema records. Genet Psychol Monogr 10:107–285
Iberall T, Arbib MA (1990) Schemas for the control of hand movements: an essay on cortical localization. In: Goodale MA (ed) Vision and action: the control of grasping. Ablex, Norwood, NJ
Jeannerod M, Decety J (1990) The accuracy of visuomotor transformation: an investigation into the mechanisms of visual recognition of objects. In: Goodale MA (ed) Vision and action: the control of grasping. Ablex, Norwood, NJ
Jeannerod M, Arbib MA, Rizzolatti G, Sakata H (1995) Grasping objects—the cortical mechanisms of visuomotor transformation. Trends Neurosci 18:314–320
Jeannerod M, Paulignan Y, Weiss P (1998) Grasping an object: one movement, several components. Novartis Foundation Symposium 218:5–16; discussion 16–20
Johansson RS, Westling G (1987a) Signals in tactile afferents from the fingers eliciting adaptive motor responses during precision grip. Exp Brain Res 66:141–154
Johansson RS, Westling G (1987b) Significance of cutaneous input for precise hand movements. Electroencephalogr Clin Neurophysiol 39[Suppl]:53–57
Lantz C, Melen K, Forssberg H (1996) Early infant grasping involves radial fingers. Dev Med Child Neurol 38:668–674
Lasky RE (1977) The effect of visual feedback of the hand on the reaching and retrieval behavior of young infants. Child Dev 48:112–117
Lockman J, Ashmead DH, Bushnell EW (1984) The development of anticipatory hand orientation during infancy. J Exp Child Psychol 37:176–186
MacKenzie CL, Iberall T (1994) The grasping hand. North-Holland, Amsterdam, New York
McCarty MK, Clifton RK, Ashmead DH, Lee P, Goulet N (2001) How infants use vision for grasping objects. Child Dev 72:973–987
Meltzoff AN (1988) Infant imitation after a 1-week delay: long-term memory for novel acts and multiple stimuli. Dev Psychol 24:470–476
Meltzoff AN, Moore K (1977) Imitation of facial and manual gestures by human neonates. Science 198:75–78
Murata A, Gallese V, Kaseda M, Sakata H (1996) Parietal neurons related to memory-guided hand manipulation. J Neurophysiol 75:2180–2186
Murata A, Gallese V, Luppino G, Kaseda M, Sakata H (2000) Selectivity for the shape, size, and orientation of objects for grasping in neurons of monkey parietal area AIP. J Neurophysiol 83:2580–2601
Newell KM (1986) Motor development in children: aspects of coordination and control. In: Wade MG, Whiting HTA (eds) Motor development in children: aspects of coordination and control. Nijhoff, Boston, pp 341–360
Newell KM, Scully DM, McDonald PV, Baillargeon R (1989) Task constraints and infant grip configurations. Dev Psychobiol 22:817–831
Newell KM, McDonald PV, Baillargeon R (1993) Body scale and infant grip configurations. Dev Psychobiol 26:195–205
Olmos M, Carranza JA, Ato M (2000) Force-related information and exploratory behavior in infancy. Infant Behav Dev 23:407–419
Oztop E, Arbib MA (2002) Schema design and implementation of the grasp-related mirror neuron system. Biol Cybern 87:116–140
Rochat P (1998) Self-perception and action in infancy. Exp Brain Res 123:102–109
Rochat P, Morgan R (1995) Spatial determinants in the perception of self-produced leg movements by 3–5 month-old infants. Dev Psychol 31:626–636
Sakata H, Taira M, Kusunoki M, Murata A, Tanaka Y, Tsutsui K (1998) Neural coding of 3D features of objects for hand action in the parietal cortex of the monkey. Philos Trans R Soc Lond B Biol Sci 353:1363–1373
Sakata H, Taira M, Kusunoki M, Murata A, Tsutsui K, Tanaka Y, Shein WN, Miyashita Y (1999) Neural representation of three-dimensional features of manipulation objects with stereopsis. Exp Brain Res 128:160–169
Sciavicco L, Siciliano B (2000) Modelling and control of robot manipulators. Springer, London New York
Smeets JB, Brenner E (1999) A new view on grasping. Motor Control 3:237–271
Smeets JB, Brenner E (2001) Independent movements of the digits in grasping. Exp Brain Res 139:92–100
Sporns O, Edelman GM (1993) Solving Bernstein’s problem: a proposal for the development of coordinated movement by selection. Child Dev 64:960–981
Streri A (1993) Seeing, reaching, touching: the relations between vision and touch in infancy. Harvester Wheatsheaf, London, New York
Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge, MA
Taira M, Mine S, Georgopoulos AP, Murata A, Sakata H (1990) Parietal cortex neurons of the monkey related to the visual guidance of hand movement. Exp Brain Res 83:29–36
Thelen E (2000) Motor development as foundation and future of developmental psychology. Int J Behav Dev 24:385–397
Twitchell TE (1970) Reflex mechanisms and the development of prehension. In: Connolly KJ (ed) Mechanisms of motor skill development. Academic Press, London, New York
van Rullen R, Thorpe SJ (2001) Rate coding versus temporal order coding: what the retinal ganglion cells tell the visual cortex. Neural Comput 13:1255–1283
von Hofsten C (1982) Eye-hand coordination in the newborn. Dev Psychol 18:450–461
von Hofsten C (1984) Developmental changes in the organization of prereaching movements. Dev Psychol 20:378–388
von Hofsten C (1991) Structuring of early reaching movements: a longitudinal study. J Mot Behav 23:280–292
von Hofsten C (1993) The structuring of neonatal arm movements. Child Dev 64:1046–1057
von Hofsten C, Fazel-Zandy S (1984) Development of visually guided hand orientation in reaching. J Exp Child Psychol 38:208–219
von Hofsten C, Ronnqvist L (1988) Preparation for grasping an object: a developmental study. J Exp Psychol Hum Percept Perform 14:610–621
Westling G, Johansson RS (1987) Responses in glabrous skin mechanoreceptors during precision grip in humans. Exp Brain Res 66:128–140
Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229–256
Zemel RS, Dayan P, Pouget A (1998) Probabilistic interpretation of population codes. Neural Comput 10:403–430
Acknowledgements
This work was supported in part by a Human Frontier Science Program grant to MAA, which made possible useful discussions with Giacomo Rizzolatti and Vittorio Gallese, to whom we express our gratitude. The work of MAA was supported in part by grant DP0210118 from the Australian Research Council (R.A. Owens, Chief Investigator). The work of EO was supported in part by ATR Computational Neuroscience Laboratories, Kyoto, Japan.
Appendices
Appendix 1
The simulation loop
The global logic of a simulation session is given below. This applies to all simulations except Simulation 1, where the function of the Wrist Rotation layer output is replaced by the automatic palm orientation described in Appendix 2. The Affordance layer encodes the orientation of the target object in SE3b, and the location of the object in SE4. In the remaining simulations it encodes the existence of a target object.
- Step 1: Encode the affordances presented to the circuit in the Affordance layer (A). Set the vector o to the center of the target object.
- Step 2: Compute the outputs of the Virtual Finger (V), Hand Position (H) and Hand Rotation (R) layers.
- Step 3: Generate the movement parameters v, h, r.
- Step 4: Generate the reach using the parameters v, h, r and o while monitoring for contact.
- Step 5: On contact (or failure to contact), compute a contact list.
- Step 6: Compute stability and generate the reward signal rs based on the contact list.
- Step 7: Adapt the weights using rs.
- Step 8: Go to Step 2, unless interrupted by the user.
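The loop above can be sketched in code. The following Python skeleton is illustrative only: every layer, reach and reward function is a stand-in stub rather than the authors' implementation, and the reward-weighted update is a simplified placeholder for the REINFORCE-style rule used in the paper.

```python
import random

# Skeletal sketch of the simulation loop (Steps 1-8).
# All functions below are illustrative stubs, not the authors' code.

def encode_affordance():
    """Step 1: encode the target affordance; return the object center o."""
    return (0.0, 0.0, 0.0)

def layer_outputs(weights):
    """Steps 2-3: sample movement parameters v, h, r around current weights."""
    return {k: weights[k] + random.gauss(0.0, 1.0) for k in ("v", "h", "r")}

def execute_reach(params, o):
    """Steps 4-5: open-loop reach; return a (stub) contact list."""
    return ["index", "thumb"] if params["v"] > 0 else []

def stability_reward(contacts):
    """Step 6: reward a stable (two-contact) grasp, small punishment otherwise."""
    return 1.0 if len(contacts) >= 2 else -0.1

def simulate(n_trials, eta=0.1):
    weights = {"v": 0.0, "h": 0.0, "r": 0.0}
    o = encode_affordance()                      # Step 1
    for _ in range(n_trials):                    # Step 8: repeat until stopped
        params = layer_outputs(weights)          # Steps 2-3
        contacts = execute_reach(params, o)      # Steps 4-5
        rs = stability_reward(contacts)          # Step 6
        for k in weights:                        # Step 7: reward-weighted update
            weights[k] += eta * rs * (params[k] - weights[k])
    return weights

weights = simulate(200)
```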
Hand Position layer output encoding
The Hand Position layer generates a vector composed of azimuth, elevation and radius (α, β, r). The vector is related to the rectangular coordinates (in a left-handed coordinate system) as follows:
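The conversion equation itself is not reproduced above; a standard spherical-to-Cartesian form consistent with the description (the exact axis assignment within the left-handed frame is our assumption) is:

$$x = r\cos\beta\sin\alpha, \qquad y = r\sin\beta, \qquad z = r\cos\beta\cos\alpha$$

where α is the azimuth, β the elevation and r the radius.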
Figure 14 illustrates the conversion graphically.
Appendix 2
The arm/hand model
Since the arm/hand model used in our simulations is a kinematic one, the absolute values of the modeled segment lengths are irrelevant and thus not specified here; however, the relative sizes of the segments are as shown in Fig. 15.
The arm is modeled with a 3DOF joint at the shoulder to mimic the human ball-and-socket shoulder joint and a 1DOF joint at the elbow for lower arm extension/flexion movements (see Fig. 15). The wrist is modeled with 3DOFs to account for extension/flexion, pronation/supination, and ulnar/radial deviation movements of the hand. Each finger except the thumb is modeled with 2DOFs to simulate the metacarpophalangeal and distal interphalangeal joints of the human hand. The thumb is modeled with 3DOFs: one for the metacarpophalangeal joint and the remaining two (i.e., the carpometacarpal joint of the thumb) for extension/flexion and ulnar/radial movements of the thumb.
Inverse kinematics
The simulator coordinate system for the forward and inverse kinematics is left-handed. The zero posture and the effect of positive rotations of the arm joints are shown in Fig. 16.
When we mention the reach component, we imply the computation of trajectories of the arm joint angles (θ1, θ2, θ3, θ4) that achieve a desired position for the end effector (i.e., the inverse kinematics problem). The end effector could be any point on the hand (e.g., the wrist, the index fingertip, the middle finger joints) as long as it is fixed with respect to the arm and the length of the lower arm segment is extended (for the sake of the kinematics computations) to account for the end-effector position. The end effector used in the simulations was the tip of either the index or the middle finger (see Appendix 3). The forward mapping F of the kinematic chain is a vector-valued function relating the joint angles to the end-effector position.
The Jacobian-transpose method for inverse kinematics can be derived as a gradient descent algorithm minimizing the squared distance between the current (p) and desired (p_desired) end-effector positions. The key to the algorithm is a special matrix called the geometric Jacobian matrix (J), which relates the end-effector Cartesian velocity to the angular velocities of the arm joints (Sciavicco and Siciliano 2000); in vector notation, \( \dot{p} = J\dot{\theta} \).
Representing the upper arm length and the (extended) lower arm length by l_1 and l_2, respectively, and abbreviating sin(θ_i) and cos(θ_i) by s_i and c_i for i = 1, …, 4, the Jacobian matrix of our arm model can be written as:
The algorithm then simply iterates the following update rule until p = p_desired, where η represents the joint-angle update rate:

$$\theta_{t+1} = \theta_{t} + \eta\, J_{\theta_t}^{T}\,(p_{\mathrm{desired}} - p_{t})$$
Note that the time dependency of the variables and the dependency of the Jacobian matrix on the joint angles are indicated with subscripts.
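As an illustration of the Jacobian-transpose update rule θ ← θ + η Jᵀ(p_desired − p), the following sketch applies it to a simplified planar 2-link arm rather than the paper's 4-DOF arm model; the segment lengths, step size and tolerance are illustrative choices.

```python
import numpy as np

# Jacobian-transpose IK on a planar 2-link arm (illustrative reduction of
# the paper's 4-DOF arm). Update rule: theta += eta * J^T (p_desired - p).

L1, L2 = 1.0, 1.0  # illustrative segment lengths

def forward(theta):
    """Forward kinematics: joint angles -> end-effector position."""
    t1, t2 = theta
    return np.array([L1*np.cos(t1) + L2*np.cos(t1 + t2),
                     L1*np.sin(t1) + L2*np.sin(t1 + t2)])

def jacobian(theta):
    """Geometric Jacobian relating joint velocities to end-effector velocity."""
    t1, t2 = theta
    return np.array([[-L1*np.sin(t1) - L2*np.sin(t1 + t2), -L2*np.sin(t1 + t2)],
                     [ L1*np.cos(t1) + L2*np.cos(t1 + t2),  L2*np.cos(t1 + t2)]])

def solve_ik(p_desired, theta=np.array([0.3, 0.3]), eta=0.1, tol=1e-6):
    """Iterate the Jacobian-transpose update until the position error is small."""
    for _ in range(10000):
        error = p_desired - forward(theta)
        if np.linalg.norm(error) < tol:
            break
        theta = theta + eta * jacobian(theta).T @ error
    return theta
```

Gradient descent of this kind converges for reachable, non-singular targets; a damped or pseudo-inverse variant is typically used when the arm nears a singular configuration.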
Automatic hand orientation
The automatic hand orientation employed in SE1a is modeled as minimizing the angle between the palm normal (X) and the vector (d) connecting the center of the object to the index finger knuckle, using the hand's extension/flexion degree of freedom (see Fig. 17). The angle is minimized when the palm normal coincides with the projection of d onto the extension/flexion plane of the hand.
When the hand is rotated by φ radians as illustrated in Fig. 17, the palm normal coincides with the projection of d onto the extension/flexion plane of the hand. Noting that object, index, pinky, wrist and elbow in Fig. 17 denote three-dimensional position vectors, the angle φ can be obtained as follows:
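The expression for φ is not reproduced above; one consistent reconstruction, writing Y for the unit vector along the hand's flexion axis (our assumption: the direction from the index to the pinky knuckle) and d for the object-to-knuckle vector, is:

$$\mathbf{d}_{\mathrm{prj}} = \mathbf{d} - (\mathbf{d}\cdot\mathbf{Y})\,\mathbf{Y}, \qquad \varphi = \cos^{-1}\!\left(\frac{\mathbf{X}\cdot\mathbf{d}_{\mathrm{prj}}}{\lVert\mathbf{d}_{\mathrm{prj}}\rVert}\right)$$

with the sign of φ chosen according to which side of the palm normal d_prj falls on. The exact vector definitions are our assumption.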
When automatic hand orientation is engaged, the hand is rotated by φ radians at each cycle of the simulation while the reach is taking place. Note that when d_prj is zero, the angle φ is undefined; in that case the angle is returned as zero (i.e., the hand is not rotated). This situation occurs when the extension/flexion movement of the hand has no effect on the angle between the palm normal and d, in other words, when d is perpendicular to both X and Y.
Appendix 3
ILGM simulation parameters
The main behavior of the simulation system is determined by a resource file where many simulation parameters can be set. In this file, the three-dimensional positions and vectors are defined using a spherical coordinate system. The PAR, MER and RAD tags are used to indicate elevation, azimuth and radius components, respectively.
Object axis orientation range parameters
Object axis orientation range parameters define the minimum and maximum allowed tilt of the object around the z-axis (in the frontal plane). Ten units are allocated for encoding the tilt amount. These parameters are only used in SE3.
Base learning rate parameter
Base learning rate parameter is used as the common multiplier for all the learning rates in the grasp learning circuit: η_AR = η_VR = η_HR = η and η_AV = η_AH = η/MAXROTATE.
ILGM layer size parameters
ILGM layer size parameters define the number of units to allocate for the layers generating the motor parameters. The Virtual Finger (V) layer is composed of ten units that specify the synergistic control of the fingers. In what follows, the BANK, PITCH and HEADING tags indicate the supination/pronation, wrist extension/flexion and radial/ulnar deviation movements, respectively. The size of the Wrist Rotation (R) layer is determined by the following parameters (in this example 9×9×1 units will be allocated):
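The listing itself is not reproduced here; consistent with the stated 9×9×1 allocation, a plausible reconstruction of the resource-file entries (tag names from the text) is:

```
hand_rotBANK_code_len    9
hand_rotPITCH_code_len   9
hand_rotHEADING_code_len 1
```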
In what follows, the tags locMER, locPAR and locRAD indicate the Hand Position (H) layer components. The size of the Hand Position layer is determined by the following parameters (in this example 7×7×1 units will be allocated):
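Again the listing is not reproduced; consistent with the stated 7×7×1 allocation, a plausible reconstruction is:

```
hand_locMER_code_len 7
hand_locPAR_code_len 7
hand_locRAD_code_len 1
```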
Learning session parameters
Learning session parameters define the behavior of the simulator during learning. For a learning session, the simulator makes MAXBABBLE reach/grasp attempts. For each approach direction (H layer output), the simulator makes MAXROTATE grasping attempts. After MAXREACH reaches are done, the next input condition is selected (e.g., the object orientation is changed). MAXBABBLE limits the maximum number of attempts the simulator will make. A particular simulation may be stopped at any instant; the saved connection weights can then be used for testing the performance later. The Reach2Target parameter indicates which part of the hand the MG module should use as the end effector for reach execution. The possible values are [INDEX, MIDDLE, THUMB] × [0, 1, 2], where 2 indicates the tip and 0 the knuckle. An example set of parameter specifications is as follows:
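The example listing itself is not reproduced; a plausible fragment using the parameters named above (the values are illustrative, taken from the SE4 table below) is:

```
MAXBABBLE    10000
MAXREACH     10
MAXROTATE    30
Reach2Target MIDDLE0
```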
Grasp stability parameters
Grasp stability parameters define the acceptable grasps in terms of physical stability. costThreshold specifies the allowable inaccuracy in grasping: ideally, the cost of grasping should be small, indicating that the grasp is successful. Empirically, a threshold (E_threshold) between 0.5 and 0.8 gives good results for the implemented cost function. If the distance of the touched object from the palm is less than palmThreshold and the movement of the object due to finger contact is towards the palm, then the palm is used as a virtual finger to counteract the force exerted by the fingers. The negReinforcement parameter specifies the level of punishment returned when a grasp attempt fails (rs_neg). Empirically, values greater than −0.1 and less than 0 result in good learning; in general, a large negative reinforcement overwhelms the positively reinforced plans before they have a chance to be represented in the layers.
Exploration and exploitation parameter
α (Randomness) specifies how often the learned distribution is used to generate grasp plans. A value of 1 means parameters are always selected randomly, while a value of 0 means they are always generated from the current distribution of the layer. In all the simulations, the Virtual Finger layer used its probability distribution representation to generate the enclosure parameter (v). The tables below give the parameters used for the other layers in each simulation experiment; the default parameter values used as examples in the descriptions above are not repeated.
Parameters for SE1a

| Parameter | Value |
|---|---|
hand_rotBANK_code_len | N/A |
hand_rotPITCH_code_len | N/A |
hand_rotHEADING_code_len | N/A |
hand_locMER_code_len | 10 |
hand_locPAR_code_len | 10 |
hand_locRAD_code_len | 10 |
MAXREACH | N/A |
MAXROTATE | N/A |
MAXBABBLE | 1000 |
Reach2Target | INDEX0 |
costThreshold (E_threshold) | 0.75 |
palmThreshold | 125 |
negReinforcement (rs_neg) | −0.1 |
Randomness (α) | 0.85 |
Parameters for SE2

| Parameter | Value |
|---|---|
hand_rotBANK_code_len | 9 |
hand_rotPITCH_code_len | 9 |
hand_rotHEADING_code_len | 1 |
hand_locMER_code_len | 7 |
hand_locPAR_code_len | 7 |
hand_locRAD_code_len | 1 |
MAXREACH | 1 |
MAXROTATE | 7 |
MAXBABBLE | 45000 |
Reach2Target | INDEX2 |
costThreshold (E_threshold) | 0.80 |
palmThreshold | 150 |
negReinforcement (rs_neg) | −0.05 |
Randomness (α) | 1 |
Parameters for SE1b

| Parameter | Value |
|---|---|
hand_rotBANK_code_len | 11 |
hand_rotPITCH_code_len | 11 |
hand_rotHEADING_code_len | 6 |
hand_locMER_code_len | 6 |
hand_locPAR_code_len | 6 |
hand_locRAD_code_len | 1 |
MAXREACH | N/A |
MAXROTATE | 55 |
MAXBABBLE | 10000 |
Reach2Target | INDEX0 |
costThreshold (E_threshold) | 0.75 |
palmThreshold | 125 |
negReinforcement (rs_neg) | −0.1 |
Randomness (α) | 0.95 |
Parameters for SE3

| Parameter | Value |
|---|---|
hand_rotBANK_code_len | 12 |
hand_rotPITCH_code_len | 7 |
hand_rotHEADING_code_len | 1 |
hand_locMER_code_len | 5 |
hand_locPAR_code_len | 5 |
hand_locRAD_code_len | 5 |
MAXREACH | 1 |
MAXROTATE | 25 |
MAXBABBLE | 20000 |
Reach2Target | MIDDLE0 |
costThreshold (E_threshold) | 0.85 |
palmThreshold | 150 |
negReinforcement (rs_neg) | −0.1 |
Randomness (α) | 0.95 |
Parameters for SE4

| Parameter | Value |
|---|---|
hand_rotBANK_code_len | 10 |
hand_rotPITCH_code_len | 7 |
hand_rotHEADING_code_len | 5 |
hand_locMER_code_len | 10 |
hand_locPAR_code_len | 10 |
hand_locRAD_code_len | 1 |
MAXREACH | 10 |
MAXROTATE | 30 |
MAXBABBLE | 10000 |
Reach2Target | MIDDLE0 |
costThreshold (E_threshold) | 0.75 |
palmThreshold | 128 |
negReinforcement (rs_neg) | −0.1 |
Randomness (α) | 1.0 |
Oztop, E., Bradley, N.S. & Arbib, M.A. Infant grasp learning: a computational model. Exp Brain Res 158, 480–503 (2004). https://doi.org/10.1007/s00221-004-1914-1