1 Introduction

Children who have been diagnosed with autism tend to perceive human social behaviours as complex, difficult to interpret, and potentially overwhelming. As such, they might subsequently withdraw from social interaction because they can find it difficult to communicate or socially interact with other people. If they do not understand how to practice proper social interaction and/or communication, these children can face many difficulties later in life [3]. However, because children with autism enjoy playing with mechanical devices, particularly robots [39, 42], one of our laboratory’s projects, AuRoRA, has pioneered the use of robotic toys as therapeutic and educational aids. These robotic toys are designed to teach children with autism basic social skills, such as turn-taking and imitation, to help them communicate and interact with others [1]. Since the project’s beginnings in 1998, many encouraging results have been found [13, 15, 64]. Specifically, discoveries have been made regarding how children with autism interact differently with other people than with robots, and how interactions among children with autism can be successfully mediated by these robots [47–49, 65].

Research has shown that humanoid robots, whether used as toys programmed to dance to specific pieces of music or remotely-operated robotic “puppets”, can promote imitative free-form play among pairs of children with autism [49]. Such robots can also foster triadic interactions among themselves, a child with autism, and a human experimenter [47]. These behaviours are necessary in order for children to engage in social play, a form of play in which children with autism have significant difficulty participating due to the social impairments that are characteristic of their disorder [28]. Previous work has shown that children with autism can engage in free-form, unstructured forms of social play known as associative play, and our earlier research has suggested that children with autism can engage in a more organized and complex form of social play, known as cooperative play, with robots in the context of an after-school club [61]. However, it has not yet been shown whether children with autism can participate in cooperative play, specifically when this form of play is implemented as a simple, dyadic, collaborative video game. Furthermore, it has not been shown whether playing cooperatively with humanoid robots has any effect on collaborative play skills among children with autism when compared to playing with a human being. This article presents a novel experimental setup consisting of an autonomous humanoid robot playing a collaborative, dyadic video game with children with autism. The article also describes our evaluations of the children’s experiences using our setup, our results from data analysis of the interaction games, and our suggestions for improving the systems which comprise our experimental setup and the ways in which they are used.

The remainder of this article is structured as follows: Sect. 1.1 discusses work related to key topics in this article, and Sect. 1.2 describes the approach taken towards the implementation of the experimental setup, the research question which motivated the design of our experimental setup, and our expectations for how children would respond to our setup as well as the trends we would expect to see in the data gathered. Section 1.3 discusses the participants of our study and our reasons for choosing them, and Sect. 2.1 describes the methodology behind our pilot study. Section 3 explains the logic and design choices of our dyadic video game and autonomous robot, as well as hurdles we had to overcome in their implementation. Section 4 lists the kinds of data gathered in our study and explains why those data were chosen. Section 5 discusses how the data were analyzed as well as which trends were observed in them, and Sect. 6 offers our evaluations of the children’s experiences with our experimental setup, as well as interpretations of the study’s findings and the implications that these will have for future work. Section 7 summarizes the article and, based on the lessons learnt from the pilot study, outlines a more thorough experiment to be conducted in the future which will use an improved experimental setup. Individuals whose assistance and efforts have helped make this work possible are listed in Acknowledgements.

1.1 Related Work

Autism is a lifelong developmental disability which is characterized by deficits in social interaction, impaired social communication, and restricted interests as well as stereotyped behaviours [3]. Although these impairments can appear in a variety of forms and degrees of severity among the individuals diagnosed, they will generally impair the person’s ability to understand, relate to, and socially interact with other people. Children can manifest these symptoms through specific behaviours, such as displaying positive affect in social settings significantly less often than neurotypical (non-autistic) children [16], displaying positive affect while looking directly at another person significantly less often than either neurotypical or mentally retarded children [31], initiating joint attention using pointing (the selection and focus of gaze on the same object as someone else) far less than other children [23], and having difficulties in initiating and sustaining social play [30]. By observing how often these behaviours occur, one can quantify the quality of specific social interactions among those with autism [7].

Because children with autism particularly enjoy playing with computers and electronic devices [39], some researchers have studied how video games can be used to help these children. Because horizontal visual displays promote more group work and cooperation than vertical ones [51], some researchers have used video games displayed on horizontal interfaces to promote collaboration and social interaction among children with autism. Piper, O’Brien, Morris, et al. developed a game called SIDES (Shared Interfaces to Develop Effective Social skills) using a Diamondtouch table display to detect and distinguish the hand-table contact of up to four players. In evaluating the game, the researchers found that while one group of children with autism played more cooperatively when the game enforced its own rules of turn-taking and piece ownership, another group played best when the game’s rules were not enforced at all [41]. Bauminger, Goren-Bar, Gal, et al. also developed a collaborative electronic interface based on a Diamondtouch display known as StoryTable, in which pairs of children could create different stories by jointly touching and dragging items on the display surface. Three pairs of children diagnosed with high-functioning autism played with this interface multiple times per week over the course of three weeks, and the researchers found that after participating in all of the play sessions, the children displayed more social behaviours, such as making eye contact, showing positive affect while making eye contact, and sharing emotions, than they did beforehand [8]. Additionally, the children displayed fewer stereotypically “autistic” behaviours while playing with the StoryTable than they did while participating in other activities, and also spent more time playing social games, whether simple or complex, as well as less time playing in parallel with another child after participating in the study [25].
In a similar study, children with autism played with digital jigsaw puzzles on a Diamondtouch table which could be programmed to require either cooperative or individual touching and dragging in order for pieces to be moved around on the board. After pairs of children with autism repeatedly participated in each of the game’s play styles, it was found that the children exhibited more coordinative moves, more moves in general, and had greater proportions of simultaneous activity while playing the puzzle game cooperatively than when the children played separately but in parallel [6].

Since it was first suggested that robots positively affect the social interactions of children with autism [62], many researchers have studied this phenomenon in more detail. In addition to the above mentioned projects, Fasel and others used simulated systems and robotic ones to study normal and abnormal development of joint attention in infants with and without autism [18]. Later, the small robot Keepon, which was developed by Kozima, Nakagawa, and others, showed that it could establish triadic interactions among itself, a young child with autism, and either another child or the autistic child’s parent/caregiver [33, 34]. In order to improve the diagnostic methods for autism among children, Scassellati worked on open-loop robots and systems for automatically tracking the movement of a child, measuring their direction of gaze, and categorizing the prosody of their voice [54, 55]. Feil-Seifer and Matarić found that children with autism socially interacted more with robots that directly responded to their actions than they did to robots that behaved randomly or were completely unresponsive [19, 21]. Michaud and Théberge-Turmel built robots with many different designs (a ball, an elephant, etc.) and examined how children with autism interacted with them in order to see which one was both played best with and was most helpful for developing social skills [38]. Similarly, Kim, Leyzberg, Short and others had children with autism interact with Pleo, a dinosaur-like robot, as well as interview with an adult human, and found that the children were more socially engaged while interacting with the robot [32].

Social play is an effective method through which children learn about social interaction. Research suggests that since children with autism find it difficult to play with other children as well as participate in pretend play, it is particularly difficult for them to learn about social interaction [45]. As such, a great deal of work has focused on why children with autism show difficulties in engaging in the abovementioned two forms of play. Wolfberg and Schuler discovered that children with autism find it easier to participate in symbolic play with neurotypical children through assistance from external support structures, such as a helpful teacher [67]. Similarly, it was discovered by Charman and Baron-Cohen that more children with autism could engage in pretend play, in which a simple object is substituted for another more complicated one, when they were assisted with appropriate prompting from the experimenters [11].

Drawing on studies about how groups of people can constructively work together, researchers of human-robot interaction have examined how heterogeneous groups of robots and other people can best collaborate with each other. Fong, Thorpe, and Baur found that participants in their experiment were able to accomplish more tasks when they collaborated with many different autonomous robots than when they manually controlled the robots’ behaviours [22]. Hinds, Roberts, and Jones studied how different appearances and status roles of robots affected the task-solving performance of different collaborative pairings of humans and robots [27]. Drury, Scholtz, and Yanco described an awareness framework for different collaboration scenarios of humans and robots, and were able to re-examine specific failures among teams of humans and robots in terms of various awareness deficiencies [17]. Sidner, Lee, and Lesh studied how conversational gestures and gaze patterns can be used by robots to better engage people in collaborative, socially assistive interactions [57].

This present study incorporates ideas from the many different research areas mentioned in this section. Specifically, because research shows that there are specific social behaviours that children with autism will perform less often than non-autistic children due to their impairments in interacting and communicating with others, the experimental setup described in this article utilizes the frequency with which these behaviours are displayed to determine the change in social engagement between a child and their play partner over the course of different play sessions. In our setup, one of these play partners is the autonomous humanoid robot KASPAR which also reacts to specific forms of communication, as research shows that children with autism are particularly socially engaged when interacting with robots that respond to the children’s behaviour. The play sessions focus on social, cooperative play, as studies have shown that the particular difficulty which children with autism have with this style of play may further hinder their development of basic social skills, and the video game used in the play sessions uses a horizontally-oriented screen because it has successfully been shown to foster cooperative play among participants with autism. Furthermore, the robot’s behaviours, its role in the play sessions, and its degrees of expressiveness were designed according to findings from related research on successful collaboration between humans and robots.

1.2 Purpose of Experimental Setup, Goal of Study, and Expectations

While our earlier research suggested that children with autism were capable of playing cooperatively in the context of an after-school robotics class, its experimental setup was designed for children who were relatively high-functioning and therefore capable of interacting with others in a group setting [61]. Because this earlier design was limited in terms of the variety of children with autism who could benefit from it, we designed a new experimental setup that would still be engaging for the children while focusing on cooperative play and robots, without requiring the participants to have as developed sets of social skills. Furthermore, we designed this form of cooperative play and the set of robotic behaviours such that the experimental setup could be used in studies that would more easily and readily compare the children’s degrees of social interaction with other children to their social interactions with a robot.

1.2.1 Experimental Setup

The purpose of designing an experimental setup involving an autonomous, humanoid robot playing a dyadic cooperative video game was to have children with autism become engaged in both the cooperative form of play as well as their social interactions with the other player. This supports the intended usage of our experimental setup in a number of ways.

Firstly, while children with autism have difficulties in participating in social play because of their impairments in socially interacting and communicating with other people [28], our setup instantiates a social play setting as a cooperative video game in an effort to make children with autism play with others as well as interact with them. Because such a setting uses clearly structured and codified forms of interaction as well as electronic components in the forms of game controllers, the cooperative video game is intended to be both more appealing for children with autism as well as a simpler interactive context in which they can participate. Furthermore, because successfully playing our cooperative video game would require participants to socially interact, we also felt that the children with autism would socially interact with others simply out of a desire to accomplish tasks in the video game.

Secondly, children with autism have shown both greater social engagement while interacting with robots than while interacting with people [49] and a preference for interacting with a humanoid robot over a zoomorphic robot [29]. As such, we felt that when the children with autism were presented with a humanoid robot as a play partner, they would become more engaged in both their cooperative play activities and their social interactions.

We therefore expected that even though children with autism do not often participate in social play because of their social impairments, the children would both participate in and enjoy playing the cooperative video game because of its clearly-defined rules and the simplified nature of social interactions within its context. Furthermore, the nature of the cooperative video game would also help the children to socially interact with other individuals, even if they only wanted to accomplish tasks in the video game. Additionally, we also expected that the humanoid robot would serve as a catalyst to make children with autism become more socially engaged and socially interactive while playing the cooperative video game.

1.2.2 Novel Pilot Study

While the goal of this study was to evaluate and test our experimental setup using children with autism, the additional aim of gathering data in this study’s equipment test was to practice analyzing the kinds of data that we expect to find in future studies, in which we will use objective measurements to determine whether dyadically collaborating with a humanoid robot while playing an explicitly cooperative game would change a child with autism’s collaborative dyadic interactions with a human in the same context. This is a novel and interesting aim for a number of reasons.

Firstly, previous research has shown that when used as social mediators, robots can help children with autism to interact with other people, including other autistic children, in novel ways [20, 33, 34, 47–49, 65]. These earlier studies compared the children’s interactions in the contexts of the experiments with second-hand reports of the children’s earlier interactions in different settings. In addition, such studies have mainly focused either on single autistic children interacting dyadically with a robot or on single children triadically interacting with a robot as well as their parent or carer. However, no earlier studies have used the same experimental setting to compare dyadic interactions of single autistic children and a human adult with the dyadic interactions of the same children and a humanoid robot.

Secondly, the abovementioned earlier studies examined how children with autism interacted and played in open-ended, exploratory settings with robots. According to Parten’s research on play [40], we can classify some of the forms of play in these studies as parallel (two children with autism play in their own ways with the same robot at the same time, either without acknowledging each other or by acknowledgment without communication [48, 65]), some as associative (a child with autism imitates a robot and communicates with its human adult controller [29, 46–49]), and on a few occasions, cooperative (two high-functioning children with autism spontaneously interact and communicate to organize a game together with a reactive robot [65], or a child with autism plays a two-player game with an experimenter while interacting minimally with them [29]). In these studies as well as others, there have been few cases of the children participating in cooperative play. This is not surprising, as that specific form of play requires frequent communication and interaction among its participants, and by definition, children with autism have great difficulty with these social activities. However, this study is novel because it asked autistic children to participate in cooperative play by continually communicating and interacting with both a human and a robot. Additionally, although almost all of the previous studies involved children with autism playing with robots in semi-organized ways without any specific goals, this study asked multiple autistic children to play in an organized, collaborative manner with a robot to achieve a specific, common goal.

Thirdly, few pilot studies on autonomous social robots for children with autism actually evaluate the impact of their complete systems on members of their target audience, in the sense that they might not test real autonomous robots in the presence of children with autism while also observing changes in their behaviour. Instead, pilot studies might utilize virtual robots in place of real, physical robots [63], they might only ask neurotypical children to participate in their pilot studies instead of children with autism [10], or they might only test to see whether their robotic system is capable of behaving successfully with children with autism while also observing whether the children can interact with their robot [24]. In contrast, this pilot study observed and tracked social behaviours of children with autism over multiple sessions of interacting with a real, fully autonomous robotic system.

Because children with autism have difficulties with generalizing behaviour and skills between settings [26], we wanted to design the autistic children’s interactions with both the robot and the human adult to be as similar as possible in order to ensure the highest likelihood of skill transference between the two settings. To this end, we used a humanoid robot known as KASPAR [14] (see Fig. 1) and programmed it to play with an autistic child using actions, gestures, and spoken phrases similar to those used by the human adult participating in our study. KASPAR is a minimally expressive robot that has a simplified form of human-like features and behaviours, thus allowing children with autism to explore social interaction in a safe and predictable environment in which they feel comfortable (see Sect. 3.2 for details). Drawing upon the deliberately strong similarities between the behaviours of the human and the robot, as well as earlier studies’ claims of autistic children’s increased displays of social engagement with robots, we expected that, in future studies, the autistic children’s social engagement and displays of positive affect during a play session with KASPAR would partially transfer over into a subsequent play session with a human adult. Furthermore, we also expected that such objective measurements during a subsequent play session with a human adult would be greater and more frequent than those during a play session which preceded playing with KASPAR; in short, the children would play more collaboratively with a human partner after having played with the robot than they did beforehand.

Fig. 1 KASPAR is a child-sized humanoid robot and was developed to study human-robot interaction by the Adaptive Systems Research Group at the University of Hertfordshire

1.3 Participants

Six children with autism participated in this preliminary study from a local school for children with special needs; none of these children had interacted with KASPAR or played our collaborative game before. We specifically did not include a group of neurotypical, or non-autistic, children as a controlling factor in our study. This is because we did not want to distinguish or contrast neurotypical children with autistic children, as our research group is more interested in studying robot-assisted play as a tool for autism therapy than studying the nature of autism as a psychological disorder. We therefore adopted an approach commonly used in the field of assistive technology and focused on our particular user group. Five boys and one girl participated in our study (see Table 1), and while we did not have access to the children’s individual diagnoses for autism, their head teacher confirmed for us that each child had previously been diagnosed with autism by a medical professional. We received permission to report each child’s degree of communicative competency according to the P-scale (performance scale), which is a set of performance criteria used by all British schools for children with special needs working below level 1 of the UK’s national curriculum. The criteria in the P-scales rate the children’s ability to listen properly and speak coherently on a scale from one (being briefly aware of interactions with familiar people) to eight (linking up to four key-words in sentences while demonstrating an understanding of causality, or listening and responding appropriately to questions regarding causality) [2]. The study lasted three weeks, and almost all the participants played one game session per day on four days during this period; one of the children played only three video game sessions. 
Additionally, because the children themselves were underage and had difficulties in communicating, the parents signed consent forms on behalf of their children before the study began for them to participate in our study and be recorded on video.

Table 1 Descriptions of the children participating in this study

2 Experimental Method and Procedures

2.1 Method

This study was carried out with the approval of the Faculty Ethics Committee of the University of Hertfordshire’s Faculty of Engineering and Information Science. Because we designed our experimental setup, in which KASPAR autonomously played a collaborative video game, to be used in studies involving children with autism playing the video game with typically developed individuals as well as playing the same game with KASPAR, we felt that we should test the setup in similar circumstances. This would allow us to determine which aspects of the experimental setup should be changed to better accommodate realistic demands of future studies, and it would also give us experience analyzing data from such studies. Because we finished implementing our experimental setup near the end of the academic school year, we did not have sufficient time to make our first study into an extensive, long-term experiment. However, we had enough time to determine how well the setup worked and for each of the children in our study to participate in four separate video game play sessions, with each child playing only one game session on any given day.

Using the above time constraints, we considered many possible configurations of which partners (typically developed human, or KASPAR) the children would play with during their four play sessions and the order in which they would do so, with each configuration based on specific experimental designs. However, most of the possible configurations were judged as not yielding data that would be appropriately useful. For example, a multiple baseline design would have allowed us to determine whether children with autism would play more cooperatively only when they started playing with KASPAR or whether similar results would occur after a certain number of consecutively-scheduled sessions involving playing with a human partner. However, this design could not be appropriately implemented in the four play sessions that were available. In contrast, the reversal, withdrawal, or ABAB design was one that could be feasibly conducted in a limited amount of time and which had been used in previous experimental research [43, 44, 56].

In the reversal, withdrawal, or ABAB design, participants alternate between two distinct experimental phases: a phase in which a baseline of behaviour is tracked for some period of time (the “A” phase) and a phase in which an experimental intervention is implemented while the same behaviours are tracked (the “B” phase) [53]. In our implementation, the phase of playing with a typically developed human player was considered the baseline phase and was referred to as H, while the phase of playing with KASPAR was considered the intervention phase and was referred to as K. In our experiment, each phase was defined as one play session for one child, with no child having multiple play sessions on any single day. To distinguish whether it was each child’s first or second time playing the collaborative game with a human or robotic partner, we added a number suffix and wrote the partner ordering as H1–K1–H2–K2. Because each child alternated between partners and because we used the same standardized methods for describing the collaboration in both kinds of sessions, we could determine whether the children played more collaboratively during their second session with a human partner (H2) than during their first (H1), or whether any behavioural changes that occurred during the first intervention phase with the robot (K1) would disappear during other conditions. Additionally, since none of the children with autism knew their play partners (the adult human or KASPAR) from before the experiment, none of the children’s behaviours could have been affected by experiences with the partners from before the experiment. If each child’s human partner had instead been a family member or a friend, each participant’s game-playing experiences would be qualitatively, and possibly quantitatively, difficult to compare with those of anyone else.
As such, because the interactions themselves were standardized, we could compare each child’s interaction with a specific play partner to those of every other child with the same partner.
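
The partner ordering described above can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions: the names `Session`, `build_schedule`, and `baseline_change`, as well as the notion of a single numeric collaboration score per session, are hypothetical and are not taken from the study's software or analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    label: str    # session label, e.g. "H1"
    partner: str  # "human" or "KASPAR"
    phase: str    # "baseline" (A phase) or "intervention" (B phase)


def build_schedule(repetitions: int = 2) -> list[Session]:
    """Return the alternating reversal (ABAB) ordering, e.g. H1, K1, H2, K2."""
    schedule = []
    for i in range(1, repetitions + 1):
        schedule.append(Session(f"H{i}", "human", "baseline"))
        schedule.append(Session(f"K{i}", "KASPAR", "intervention"))
    return schedule


def baseline_change(scores: dict[str, float]) -> float:
    """Difference in a hypothetical collaboration score between H2 and H1."""
    return scores["H2"] - scores["H1"]


schedule = build_schedule()
labels = [session.label for session in schedule]
```

Because each child would pass through the same four labelled sessions, a per-child value such as `baseline_change` could be compared across children, subject to the caveat that a single session per phase cannot separate the effect of K1 from simple familiarization with the human partner.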

It should be pointed out that because each phase consisted of a single play session, we could not determine whether a change in a child’s collaborative behaviours between H1 and H2 was due to the intermediary session with KASPAR (K1) or whether it was due to familiarization with the typically developed human from repeated play sessions. To properly distinguish which of these two factors was the cause of a change in a child’s behaviour between H1 and H2, a more thorough experiment would require multiple play sessions per child in each game phase. Despite this drawback, this particular implementation of the reversal design, which only had four play sessions for each child, was considered appropriate for the aim of this pilot study, which was to test our experimental setup in a setting and manner that would approximate the conditions of a properly designed scientific experiment. Furthermore, because the reversal design allows for exploring the effects of inserting an intervention phase after a baseline as well as a baseline after an intervention, allows each child to act as their own control group, and is a useful experimental method when dealing with small sample sizes, it is a design that would be very effective in a more complete and properly-designed experiment.

2.2 Procedure

In this pilot study, each child played two game sessions with the same human partner, H1 and H2, and two sessions with the humanoid robot KASPAR, K1 and K2, for a total of four game sessions altogether. During each game session, the child with autism stood on one side of a horizontally-oriented screen, while their play partner, whether KASPAR or the typically developed human, stood on the opposite side of the screen. Both players faced each other during every play session, and in order to cooperatively play the video game on the horizontal screen, the players had to properly synchronize and coordinate their actions; the game would not register the actions of a single player if they were not performed at the same time and in the same manner as those of the other player. During each game session, the only people in the room in addition to the child with autism were the child’s carer, who would remind the child of the game rules or keep the child focused on playing the game if they became distracted; the experimenter, in order to record their own impressions about each interaction and help out if KASPAR did not operate correctly; and the typically-developed human player, who would unobtrusively operate the recording equipment when not acting as a human partner for the autistic children. This player had been trained to interact the same way with every child according to a well-rehearsed script, and KASPAR had been programmed to interact the same way with every child according to a specific set of inputs. Although the sessions lasted for up to 25 minutes, the children were free to stop playing earlier if they were bored or uncomfortable. During the course of each play session, the video game logged the in-game actions of the players, both the child with autism and their play partner, as well as the times at which they occurred.
Additionally, two video cameras recorded the facial expressions, speech, and behaviours of both the players as well those of the child with autism’s carer. In order to become familiar with analyzing the data that this experimental setup would produce, both the game logs and the video recordings of the players’ behaviours were later analyzed for specific behavioural trends and tendencies.

3 System Development and Artifacts Used

3.1 Dyadic Cooperative Video Game

The video game used in every phase of this experiment was designed to promote collaboration among its players; we define “collaboration” in this study as any shared activity requiring communication, coordination, and synchronization among two or more co-located parties in order to achieve a common goal, which is a stricter definition than is generally used in research [52]. In this game, the two players stood on opposite sides of a horizontally-oriented screen while facing each other. On the screen were a number of colourful 3D shapes, such as spheres, donuts, and Platonic solids, on a black background as well as two perpendicular lines, one orange and one light blue (see Fig. 2). The child with autism was given a Wiimote with an orange stripe, and by rolling their controller from side to side (i.e. rotating it about the axis running from the front of the controller to its back), they could make the orange line move left or right. The other player, whether a human or a robot, was given a different Wiimote with a blue stripe. By tilting their controller downward or upward (rotating it along the axis running from the left side of the controller to its right side), the other player could make the blue line move up or down. When both lines intersected near a shape and both players pulled the triggers on their Wiimote controllers at the same time, a happy sound or music sample would play from a nearby speaker and the shape would spin around while fading in and out of transparency before disappearing. After all of the shapes had disappeared, a different set of shapes would appear on the screen and the game would continue.
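The joint-selection rule described above can be illustrated with a minimal sketch. It assumes normalized screen coordinates, a hypothetical proximity radius, and a hypothetical tolerance for what counts as pulling the triggers "at the same time"; none of these values, names, or data structures come from the game itself.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    x: float  # horizontal centre of the shape on the screen (0..1)
    y: float  # vertical centre of the shape on the screen (0..1)


def crosshair_near(shape, orange_x, blue_y, radius):
    """True when the intersection of the two lines lies within `radius` of the shape.

    The orange line is vertical, so it only has an x position;
    the blue line is horizontal, so it only has a y position.
    """
    return abs(shape.x - orange_x) <= radius and abs(shape.y - blue_y) <= radius


def jointly_selected(shape, orange_x, blue_y, child_trigger_t, partner_trigger_t,
                     radius=0.1, max_gap_s=0.5):
    """True when both triggers were pulled close together in time near the shape.

    `radius` and `max_gap_s` are assumed values chosen only for illustration.
    """
    if child_trigger_t is None or partner_trigger_t is None:
        return False  # a single player's action is never registered on its own
    simultaneous = abs(child_trigger_t - partner_trigger_t) <= max_gap_s
    return simultaneous and crosshair_near(shape, orange_x, blue_y, radius)
```

Under these assumptions, a trigger pull by only one player, or by both players too far apart in time, leaves the shape untouched, which mirrors the requirement that the game only rewards coordinated action.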

Fig. 2

The dyadic collaborative game. The player on the left (stand-in for the child with autism) controls the location of the orange selection line, while the player on the right (stand-in for the human adult or the humanoid robot) controls the location of the blue selection line (Color figure online)

In order for either player to coordinate the joint selection of a specific shape, they had to communicate their intentions to the other player by keeping their line on the screen positioned over the desired shape, speaking about the shape or pointing to it, and pressing a button on the top of their Wiimote. When this was done successfully, the non-autistic player would acknowledge the child’s choice, move their own line over to the specific shape, and then try to arrange it such that both players would pull the triggers on their Wiimotes when they counted to three. While testing the experimental setup, the role of the non-autistic player was to try and prompt the child with autism to pick a shape once every five seconds if the latter were being unresponsive or were not taking the initiative. If the child with autism was not looking at the game or if they had trouble picking a shape properly, then their carer would assist them. The only time that the non-autistic player could pick their own shape would be if they unsuccessfully prompted the child with autism to pick a shape three times in a row.

The game was designed, implemented, and play tested in the lab until the predicted bugs and glitches had been eliminated. In the course of the game’s design and implementation, we incorporated certain features into it for specific reasons. We decided to have the two video game players stand on opposite sides of a flatbed monitor instead of having them stand next to each other while facing an upright monitor because a horizontally-oriented screen has been found to promote greater collaborative interaction and turn-taking than a vertical, upright one [51]. Furthermore, because children with autism have difficulties in understanding the importance of another individual’s gaze changes or bids for joint attention [35], we felt that if the game players were standing side by side while facing a screen in front of them, the fact that each player would be out of each other’s visual fields of view would exacerbate the existing difficulties of children with autism and negatively impact their ability to play our game. Instead of designing the game such that two individuals could potentially act independently of each other and still play successfully, despite there being little to no active cooperation between them (such as in the video games Rampage [5], Bubble Bobble [59], or Joust [66]), we designed the game to require coordinated, synchronous, and cooperative actions on behalf of both players. If the gameplay were not designed to be as collaborative as possible, then because children with autism will naturally engage in nonsocial, solitary play much more frequently than social play [58], we felt that, given the option, autistic children would readily engage in solitary, noncommunicative play in our video game. Because we also wanted the children to play the game freely without feeling overly pressured or stressed, we excluded time limits, losing conditions, and elements of scoring or grading from gameplay. 
Had these elements been included in the game design, we felt that they would have put unnecessary pressure on the children to perform and made it more difficult for them to socially interact with others.

To make the game accessible and appealing to people with potentially different levels of cognitive development, we designed the game with bright, distinct colours, a simple visual layout, and easily identifiable 3D shapes such as cubes, diamonds, and pyramids (see Fig. 3). We rendered the 3D graphics in the game with the OpenGL API v3.2 because it let us easily draw impressive-looking three-dimensional shapes and change many of their visual qualities, such as orientation, colour, lighting conditions, and opacity. It was important to be able to change the shapes’ visual qualities because offering such sensory rewards served as one of the primary incentives for children with autism to participate in our games [50]. Since the autistic children playing the game had impaired communication skills by definition, we designed the gameplay and the game’s visual layout to require as little explanation as possible.

Fig. 3

Some of the 3D shapes, or Platonic solids, used in the video game. From left to right, they are a tetrahedron, a dodecahedron, and an octahedron, respectively (from http://en.wikipedia.org/wiki/Platonic_solid)

Additionally, we allowed each player to control a line on the screen by changing the orientation of a Wiimote, which was read using the wiiuse v0.12 open-source library; one Wiimote could be rolled from side to side about its Y-axis to translate the vertical crosshair-line left and right, and the other Wiimote could be tilted backward and forward about its X-axis to translate the horizontal crosshair-line up and down (see Fig. 4). This intuitive set of controls was used to allow the game to automatically track which shape the players selected in real time, to make it as easy as possible for the children to control what happened in the game, and to allow KASPAR, a robot without functional hands, to appear to play the game as easily as a human. We had originally wanted to use the Wiimotes to implement a form of control based on pattern recognition and various series of gestures, which was quite feasible from a technological standpoint, but due to time constraints and reservations as to whether KASPAR or children with autism would be able to accurately reproduce gestures with sufficient ranges of force, we opted to implement a set of game controls based on reading pitch and roll values from the Wiimotes.

Fig. 4

The axes of each Wiimote according to their accelerometers. “Pitch” or “Tilt” was considered a rotation about the red X-axis, and “Roll” was considered a rotation about the blue Y-axis (from http://wiibrew.org/wiki/Wiimote)
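To make the pitch-and-roll control scheme concrete, the following sketch shows one common way of estimating roll and pitch from a three-axis accelerometer and mapping an angle onto a line position. It is not the game's code and does not use the wiiuse API: the static-tilt formulas assume the controller is held nearly still so that the accelerometer mostly measures gravity, and the sign conventions, the ±45 degree range, and the function names are assumptions made for illustration.

```python
import math


def roll_and_pitch(ax, ay, az):
    """Estimate roll (about the Y-axis) and pitch (about the X-axis) in degrees.

    ax, ay, az are accelerations along the controller's X, Y and Z axes,
    assumed to be dominated by gravity (controller held roughly still).
    """
    roll = math.degrees(math.atan2(ax, az))
    pitch = math.degrees(math.atan2(ay, az))
    return roll, pitch


def angle_to_position(angle_deg, max_angle_deg=45.0):
    """Map an angle in [-max, +max] degrees linearly onto a coordinate in [0, 1].

    Angles beyond the assumed range are clamped so the line stays on screen.
    """
    clamped = max(-max_angle_deg, min(max_angle_deg, angle_deg))
    return (clamped + max_angle_deg) / (2.0 * max_angle_deg)
```

With such a mapping, the child's roll angle would drive the horizontal position of the orange line and the partner's pitch angle would drive the vertical position of the blue line.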

The game was developed as a single-threaded application and written in C++. After successfully connecting to two specific Wiimotes and initializing a number of different data structures, the game entered a perpetual loop. While in this loop, the game checked for and handled any button activity from the Wiimotes, displayed its graphics at a rate of 77 frames per second, and checked for any keyboard input, which would show that the game should be paused, quit, or toggled between typical gameplay and a one-player mode which was used to verify that the child with autism understood the basic game mechanics. In displaying the graphics, an orthographic projection was applied to all of the 3D shapes in order to give the game display a uniform appearance in which all shapes had the same orientation and large size, no matter where they appeared on the screen. In the course of displaying its graphics, the game first applied a low-pass filter to the roll and pitch values of each Wiimote in the forms of windowed running averages and drew the players’ colour-coded selection bars accordingly. The game then determined whether each shape should be lit up depending on whether a player’s selection bar was close to it, determined whether the players simultaneously selected the same shape, and if the players had done so, then the game displayed a sensory reward in the form of spinning, flashing shapes and pleasant music, before making the shape disappear. After the last shape had disappeared from the screen, a new set of four shapes would appear and the game would continue. While all of this happened, the game also kept a text-based log of every significant event that happened and continually sent all game-related data to the software process controlling KASPAR.
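The low-pass filtering step mentioned above, a windowed running average over the roll and pitch values, can be sketched as follows. The window length is an assumed value, since the text does not state the one used in the game, and the class name is hypothetical.

```python
from collections import deque


class RunningAverage:
    """Windowed running average, a simple low-pass filter for noisy angle readings."""

    def __init__(self, window=8):
        # `window` is an assumed value; older samples fall out automatically.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Add a new reading and return the mean of the readings in the window."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)
```

A longer window gives a steadier selection line at the cost of a slower response to deliberate movements, so the choice is a trade-off between smoothness and responsiveness.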

3.2 KASPAR, the Autonomous Humanoid Robot

KASPAR, the minimally-expressive humanoid robot with which the autistic children played, was developed by the Adaptive Systems Research Group at the University of Hertfordshire. Designed to interact with people in HRI studies by performing simple gestures, displaying basic facial expressions, and, in our setup, using speech and low-level socially communicative behaviours such as joint attention and pointing, KASPAR is equipped with two 4 degree-of-freedom (DOF) arms as well as an 8 DOF head capable of panning, tilting, blinking and moving its eyes, and displaying a range of smiles and frowns [14]. KASPAR’s face was designed to be minimally expressive, in that while it was meant to approximate the facial structure and movements associated with human faces, the robot’s face was more iconic and stylized than a human’s. This was meant to make its facial features easily recognizable while also making people focus more on the meaning behind KASPAR’s facial expressions instead of the details of its face. To this end, KASPAR’s facial skin was taken from a rubber CPR dummy and was affixed to the robot’s face at the ears and nose. As such, while the skin does not draw attention to itself, its elasticity allows movements of the mouth to affect the skin near the eyes and nose, making its facial expressions appear more genuine [9].

We used the Yarp (Yet another robot platform) middleware to communicate with KASPAR’s hardware [37], and designed a simple event-driven sense-plan-act architecture which made the robot autonomously play the dyadic video game with a child with autism in the same way that the human player was trained to do (see Fig. 5). In contrast with previous HRI studies that involved children with autism, our robot behaved autonomously instead of being controlled in a “Wizard of Oz” fashion, in the sense that it was not remotely controlled by a hidden human operator [12].

Fig. 5

Left: One of the children plays the collaborative game with H. Right: The same child plays with K

3.2.1 Sensing

KASPAR used an event-driven form of sensing and only ran most of its planning and acting modules when it received specific forms of sensory data about the video game, which could contain information such as the colours and positions of the remaining shapes in the game, the positions of the players’ lines, the successful selection of a shape, or the beginning of a new round. However, instead of KASPAR receiving this sensory data by grabbing images from the cameras in its eyes, this data came directly to the control architecture from the software thread running the collaborative video game via a Yarp connection. This was done because considering that the robot would only interact with the children in the context of playing a video game, all the pertinent information about the children’s actions would either be contained within the video game itself (i.e. what actions the child took and when they were taken) or dictated by the setting in which the video game was played (i.e. the physical location of the child with respect to the robot). Given these experimental constraints, it would have been needlessly complicated to perform feature detection and shape recognition on images from KASPAR’s eye-cameras in order to determine what each child with autism was doing. Furthermore, because the video game sent sensory data to KASPAR’s control architectures fairly frequently (at a rate of 11.11 Hz), the robot could sense the child’s actions in the video game quickly enough so as to be sufficiently responsive to them.

Furthermore, we originally wanted children to be able to talk to KASPAR and for the robot to be able to recognize certain words that the children would say in the context of the game. This is because we felt that the children would expect that a robot which could “talk”, or synthesize speech, would also be capable of “listening”, or recognizing speech. However, using a speech recognition system as a means of communication was ruled out for two reasons: firstly, the extensive amount of training that the system would have to undergo to learn each child’s pronunciation and intonation of each word was likely to be so uninteresting for most of the children as to dissuade them from participating in our study; secondly, some of the children’s limited communicative abilities would make it difficult for any speech recognition system to consistently and correctly interpret their speech. As such, we programmed KASPAR to instead respond to the buttons that the children pressed on their Wiimotes (this information was also received directly from the video game via a Yarp connection), since all of the children were theoretically capable of communicating in this manner and the easily-identifiable nature of the button’s signal would guarantee that it would always be reliably and correctly interpreted by the robot.

3.2.2 Planning

KASPAR’s control architecture prepared different responses depending on the kind of sensory data that it received. If the data dealt with a shape being successfully selected, signified by the game causing a “reward” sound to stop playing, then the architecture would first reset a number of timer variables regulating when KASPAR should perform certain periodic actions. The architecture would then set other state variables which would prepare the robot to pose and speak to the child in a congratulatory way, and temporarily lock its ability to interrupt KASPAR’s observable reactions until it was finished speaking.

If the sensory data instead dealt with the statuses of the shapes and the players in the game (which was much more likely to happen), then the architecture would process this game data by first updating and re-sorting its internal lists of which shapes were still available, as well as their colours and other attributes. It would then update and re-sort the lists of actions that each player was doing, the times that they started performing these actions, and their validity. Additionally, if the number of available shapes changed from the last time it received sensory data, the architecture would reset a number of KASPAR’s internal state and timer variables to reflect the fact that a new round had started.

If the sensory data came in the form of a button-press, which meant that a child wanted KASPAR to move its line toward a visible shape, the robot would first verbally announce that it would move its line toward the specified shape and speak the specified shape’s colour in order to make its goal clear to the child with autism. Then, after comparing the specified shape’s position with that of the robot’s blue line in the video game, KASPAR would activate a simple position control system that would slowly tilt the robot’s arm holding the Wiimote until the blue line intersected with the desired shape in the video game.
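The "simple position control system" is not specified further in the text. One plausible reading is a proportional controller with a capped step size, sketched below. For simplicity the sketch moves the line position directly instead of commanding an arm motor, and the gain, step limit, and tolerance are all assumed values.

```python
def tilt_step(line_y, target_y, gain=0.5, max_step=0.02, tolerance=0.01):
    """Return the change to apply to the line position in one control cycle.

    The step is proportional to the remaining error but capped at `max_step`,
    which keeps the motion slow; all parameter values are assumptions.
    """
    error = target_y - line_y
    if abs(error) <= tolerance:
        return 0.0  # close enough: the line is considered to intersect the shape
    step = gain * error
    return max(-max_step, min(max_step, step))


def move_line_to(line_y, target_y, max_cycles=1000):
    """Iterate the controller until the line reaches the target or cycles run out."""
    for cycles in range(max_cycles):
        step = tilt_step(line_y, target_y)
        if step == 0.0:
            return line_y, cycles
        line_y += step
    return line_y, max_cycles
```

Capping the step size is one way to obtain the slow, legible arm movement described above, since it gives the child time to follow the blue line as it approaches the chosen shape.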

KASPAR would also perform actions in the absence of sensory data. In addition to the robot periodically blinking its eyes regardless of whatever else it was doing, if the control architecture did not receive any indication either that the child with autism wanted to select a shape in the video game in the preceding 5 seconds or that the robot was in the process of trying to select a shape, KASPAR would prompt the child to choose a shape or ask them what to do. If KASPAR had consecutively prompted the child to pick a shape three times, the robot would then take the initiative in the game. This was done by the control architecture randomly selecting an available shape, after which the robot would politely ask the child to select the shape, and then activate a positional control system to make its arm tilt until the blue line intersected the specific shape. Similarly, if a shape had been chosen by either KASPAR or the child with autism and both players’ lines intersected near the specified shape, KASPAR would announce that both players should click on the shape at the same time and begin a countdown to both players pressing the buttons on their Wiimotes, after which the control architecture would send an artificial “button-press” signal for KASPAR’s Wiimote (the robot could not actually move its fingers to press any Wiimote buttons).
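The prompting rule in this paragraph can be sketched as a small timer-driven policy. The 5-second interval and the limit of three prompts come from the text; the class, its method names, and the choice to wait one further interval after the third prompt before taking the initiative are assumptions made for illustration.

```python
PROMPT_INTERVAL_S = 5.0  # from the text: prompt after 5 seconds without a choice
MAX_PROMPTS = 3          # from the text: take the initiative after three prompts


class PromptPolicy:
    """Decides when the robot should prompt the child or pick a shape itself."""

    def __init__(self):
        self.last_event_t = 0.0
        self.prompts_given = 0

    def child_chose(self, now):
        """The child indicated a shape: restart the timer and the prompt count."""
        self.last_event_t = now
        self.prompts_given = 0

    def tick(self, now, robot_busy=False):
        """Return 'prompt', 'take_initiative', or None for this control cycle."""
        if robot_busy or now - self.last_event_t < PROMPT_INTERVAL_S:
            return None
        self.last_event_t = now
        if self.prompts_given >= MAX_PROMPTS:
            # Assumption: the robot waits one more interval after the third prompt.
            self.prompts_given = 0
            return "take_initiative"
        self.prompts_given += 1
        return "prompt"
```

In this sketch, `tick` would be called on every pass of the control loop with the current time, and the `robot_busy` flag stands in for the case in which the robot is already trying to select a shape.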

3.2.3 Acting

KASPAR’s primary mode of acting involved communicating with the children with autism through gestures, facial expressions (see Fig. 6), and speech. The robot’s voice was created by the Acapela text-to-speech generator using the male English voice of “Graham”, which spoke using an accent of Received Pronunciation, also known as the Queen’s English. This speech generator was selected because its English voices were voted by fellow labmates to have better cadences to their speech, more natural speech rhythms, and were generally much easier to understand than freeware speech generators, such as Festival for Linux. Furthermore, we selected a male voice speaking the Queen’s English because we, as well as multiple teachers at Southfield School, felt this accent would be both the easiest for the children to understand as well as one that they had probably heard more often than any other accent offered by Acapela (e.g. Irish, Scottish). However, in order to make the voice sound slightly more childish, we raised the pitch on all of the speech samples by 21 % using Audacity, a free software package used for the mixing and editing of sound and music files. This form of voice modification was felt to be more suitable on KASPAR than a higher-pitched feminine robotic voice or the normal voice of “Graham” from the speech generator.

Fig. 6

Four different facial expressions that KASPAR can make. Clockwise from the top left, they are: neutral, small, medium, and large smiles [9]

The robot’s control architectures implemented its actions by calling one specific function for posing and speaking (controlling the blinking of KASPAR’s eyes was handled in a separate function that was called periodically, and having KASPAR move its right arm to play the game was governed by a positional control system). The posing and speaking function first opened up the appropriate gesture file and used its contents to set KASPAR’s motors to the appropriate positions as well as determine the title of the sound file that would accompany the gesture. Furthermore, because KASPAR had multiple sound files that could potentially be used in any given situation, the function would also randomly determine which version of the appropriate sound file would be selected. Lastly, the function would modify various state-related variables and make note of the expected duration of the sound file, all of which were necessary in determining KASPAR’s actions in the future.

4 Data Collection

While one of the goals of this pilot study was to evaluate and test our experimental setup using children with autism, another aim was to practice gathering and analyzing the kinds of data that would be gathered in later studies using our setup, in which children will play collaboratively with a humanoid robot to determine whether this form of play will affect the ways that the same children will play collaboratively with a human. To practice gathering these kinds of data, we had to both define collaboration and then quantify how often the children collaborated and interacted with their human/robot partners. We defined collaboration through in-game actions and observable social behaviours, so we used two camcorders to videotape the children’s social behaviours during the play sessions and used the video game software to automatically record and timestamp the in-game actions of both players. The social behaviours manually coded by watching the videotapes and in-game actions automatically recorded in the game’s log files include:

  1. prompting: when the autistic child’s partner or carer posed a question or made a suggestion about making the child choose a shape;

  2. choosing: when one player expressed their desire to select a specific shape and for the other player to move their line to the said shape; this could have been done by speaking or by pushing a button;

  3. successful shape selection: both players agreed on choosing a specific shape, moved each of their lines (the crosshair) near it, and pressed their Wiimotes’ trigger buttons at the same time;

  4. unsuccessful shape selection: the child with autism pressed the trigger button on their Wiimote when either of the two lines which constitute the crosshair was not near a shape;

  5. gaze and gaze shift: the direction in which the child’s eyes focused while playing the game. This behaviour was included because one of the core deficits of autism is impaired gaze patterns [3]. The children’s gazes were categorized as looking at the game itself, looking at the human/robot partner, looking at the experimenter, looking at the carer, or looking at something else in the environment not relevant to the study;

  6. positive affect: the child with autism laughed or smiled while playing the game (see Fig. 7).

Fig. 7

An example of one child’s coded behaviours represented on both a graphical timeline (top) as well as a movie player (bottom) in Noldus’s Observer software package. Both the timeline’s large red vertical bar and the movie player’s position box represent our current position in time (Color figure online)

While some of the above behaviours are social activities by nature (e.g. communicating one’s choice to another individual), some of these are only social in the context of the goals of this study. For example, although successfully accomplishing a task in a video game is generally not seen as inherently social, doing so in this study’s collaborative video game becomes both social and cooperative. This is because successfully performing any action in this study’s game requires two players to coordinate their actions both spatially (i.e. moving each player’s line to a specific shape) and temporally (i.e. synchronizing the button-pressing on both of their controllers) towards the common goal of selecting shapes. Furthermore, because it had not been decided beforehand when each shape would be selected, the players needed to communicate with each other in order to properly coordinate their actions in time and space. Since all of these actions are collaborative/cooperative in nature [36], the game behaviours that accomplish them are therefore also collaborative.

These behaviours were coded by both the experimenter and a second independent rater who coded 10 % of the data in order to ensure inter-rater reliability. When the two sets of codings were compared to see how well they agreed with each other, the average agreement value was 0.80, which is generally considered to be good. We also examined the codings for reliability and calculated an average Cohen’s kappa of κ=0.74. This is acceptable, since a good agreement between the raters which is not due to chance alone is defined as having a Cohen’s kappa value higher than 0.60 [4].
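For readers unfamiliar with the two reliability measures, the sketch below computes raw agreement and Cohen’s kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater’s category frequencies. The function names are hypothetical and the sketch is a generic illustration of the statistic, not the procedure or software used to obtain the values reported above.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of items to which both raters assigned the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa for two raters coding the same items.

    Undefined (division by zero) when chance agreement is 1, i.e. when
    both raters used a single identical category for every item.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: sum over categories of the product of marginal proportions.
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)
```

Because kappa subtracts the agreement expected by chance, it is always at most as large as the raw agreement value, which is consistent with the two figures reported above.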

5 Analysis and Results

Because our paired sets of data had small sample sizes and abnormal distributions, we used Wilcoxon’s matched pairs signed-rank tests instead of using paired t-tests to determine which game session pairs had statistically significant differences (p<0.05) in the frequency of certain behaviours occurring. We also used the Mann-Whitney U test to evaluate hypotheses on whether the children had different gaze patterns with different partners and whether they displayed more positive affect while playing with a specific partner. Table 2 summarizing the results from all of the tests performed on our behavioural data can be found at the end of our article in the Appendix.
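As an illustration of the matched-pairs test named above, the following sketch computes the Wilcoxon signed-rank sums for two paired samples, dropping zero differences and giving tied absolute differences their average rank. The function name is hypothetical, the sketch stops at the rank sums, and converting them into a p-value (from exact tables or a normal approximation) is left to a statistics package.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus), the rank sums of positive and negative differences.

    Pairs with a zero difference are dropped; tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [b - a for a, b in zip(x, y) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of differences with the same absolute value.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus
```

The smaller of the two rank sums is the usual test statistic; the more lopsided the sums, the stronger the evidence that one session systematically differs from its paired session.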

We expected our findings to show that the children interacted with and displayed positive affect with KASPAR more than they did while playing with the human adult. We felt these outcomes were likely to occur because we expected the children would want to spend more time with an enjoyable partner and previous research has shown that robots can elicit uniquely positive interactions from children with autism. We were therefore surprised when we did not find significant trends on the total time the children spent interacting with either partner. Previous research also suggested that the children from our experiment would look at KASPAR more often and for longer periods of time than they would the human adult, in addition to showing more interest in playing the video game with KASPAR than with the human adult. As such, it was surprising to see that the children did not show more interest in playing the game with KASPAR; they did not select more shapes, they did not take greater initiative in choosing shapes, and they did not display other game-related social behaviours requiring engagement and interaction more while playing with KASPAR. However, our greatest expectation was that the children would collaborate better with the human player after playing the collaborative game with KASPAR. The following section goes into greater detail on how our results supported or subverted our expectations.

The graph in Fig. 8 indicates that the children switched between looking at the game and the other player a significantly greater number of times during the play sessions with KASPAR. Children looked at the game after looking away from their partner for 80 % of the total gaze shifts during the experiment, and these kinds of changes in focus occurred significantly more during play sessions with KASPAR. We also found that in addition to the children switching between what they looked at significantly more during the play sessions with the robot, the children did the same thing more during H2 than H1. Furthermore, while the children played with KASPAR, they spent proportionally less time focusing on the game screen or controller (“the game”) and proportionally more time looking at the other player (see Fig. 9).

Fig. 8

The children’s eye gaze shift trends

Fig. 9

The children’s eye gaze while playing with either partner

While there were no trends among the sessions regarding the amount of time the children displayed positive affect, we found that the children usually looked either at the game or the other player when we only examined the data from the sessions that featured the children displaying positive affect. Specifically, the children spent a greater proportion of time displaying positive affect while looking at the other player (Z=−2.511,p=0.012) and less time displaying positive affect and looking at the game (Z=−3.24,p=0.001) during the play sessions with KASPAR (see Fig. 10).

Fig. 10

The children’s eye gaze trends while displaying positive affect

The children had lower average rates of choosing shapes per minute (through speaking or pressing a button on their Wiimotes) in play sessions with KASPAR than with the human adult. Furthermore, although they chose significantly fewer shapes while looking at the game during sessions with KASPAR than during sessions with the human adult, there was no significant difference in the number of shapes the children chose while looking at the opposite player, whether KASPAR or the human. The children also chose more shapes without any external prompting to do so (took the initiative) during H2 than during H1, in addition to successfully selecting significantly more shapes by cooperating with the other player during H2 than H1 (see Fig. 11).

Fig. 11

The children’s trends on taking the initiative in choosing shapes and cooperatively selecting them

After we conducted our final game session, we met with the children’s teacher to learn more about how the children behaved outside of our experimental setting and to understand certain sporadic behaviours we observed in some of the children. All of the children participating in our study were described as having difficulties playing with other children of similar ages; while a few were able to play by themselves near others, some could only play while separated from other children, and some had no interest in most toys. However, all were reported as having problems with turn-taking, sharing, and playing synchronously with other children. Therefore, while it is interesting that all of the children participating in our study were capable of playing the dyadic video game with an adult human, the fact that they were also capable of playing the game with a child-like robotic partner is particularly noteworthy. Additionally, some of the children would mimic KASPAR’s facial expressions, gestures, or vocal phrases while playing with the robot, but would not mimic their human partner’s behaviours or phrases; these same children were described by their teacher as fond of mimicking actions or phrases from television and computer games. Furthermore, some of the children’s reactions to KASPAR or the game were considered by the teacher to be very rare. For example, one child found it very enjoyable and funny to make KASPAR change what it was saying in mid-sentence by choosing shapes at specific times. Another, who had no play skills and was normally uninterested in any sort of play, willingly played with KASPAR and the typically-developed human player in addition to expressing positive affect while playing with and looking at the robot. Though not representative of all children, these instances show that interacting and playing with KASPAR can be a singular experience for some children with autism.

6 Discussion

6.1 Evaluating Setup of KASPAR and Collaborative Video Game

From watching the video footage of the children playing our collaborative video game as well as their interactions with KASPAR, we learned that there were many aspects of our experimental setup to which the children positively responded, as was intended. Firstly, the children enjoyed the sensory rewards that they received for successfully selecting shapes in the collaborative game. Specifically, the children displayed positive affect upon receiving some of the game’s sensory rewards, and many of the children watched the screen while the shapes spun around and blinked. Similarly, none of the children got upset at having to play the game in a cooperative manner, nor did any of them actively ignore the other individual playing with them or insist on playing alone. This suggests that the children were not averse to playing cooperatively, and that this simple collaborative video game also had the potential to be used as a setting for promoting social interaction among children with autism. There is also a great deal of evidence for the children enjoying their interactions with KASPAR the robot: many of the children spent time smiling while looking at the robot, while much less time was spent smiling while looking at the human player; some children practiced “scripting”, or repeating a part of dialogue or speech overheard from an enjoyable form of media or toy, by imitating KASPAR’s speech and/or behaviour in an echolalic manner; and some children happily talked to their carers about KASPAR in limited ways while they played with the robot. These positive behaviours suggest that children with autism could enjoy playing an explicitly collaborative game with others, particularly with the humanoid robot KASPAR.

However, although the idea of having children with autism play an explicitly collaborative video game with a humanoid robot seems to show promise, our experiences in evaluating our experimental setup also showed that there were aspects of it that should be changed in future iterations. Firstly, some of the children showed difficulties in properly communicating with KASPAR, in the sense that the children had difficulties pressing buttons while speaking and tilting their Wii controllers. We believe that this was not simply another manifestation of the children’s communicative impairments, since the children were taught to communicate with the adult human partner in the same way and did so more effectively; rather, it is possible that performing all of these behaviours correctly at once was too complex for the children. We believe this discrepancy partly existed because the human partner and the children’s carer would also occasionally remind the children about correct communication/choosing procedures whenever they showed difficulties in performing all of the communicative actions at once; in contrast, the robotic partner had neither the sensors nor the programming required for understanding speech, so that when a child had difficulty communicating, the carer reminded the child to also press the correct button while KASPAR behaved as it normally would, oblivious to the child’s difficulty. As such, we feel that our experimental setup could be improved by giving the children a simpler method of communicating in the context of the game, whether this would involve removing button-pushing from the equation and relying only on simpler physical gestures, giving KASPAR the ability to properly and accurately detect and interpret vocal forms of communication, or other means.

Secondly, some of the children had difficulties in understanding the game’s basic mechanics, even when they played alone during their first play sessions. Specifically, although the children had to roll their Wiimotes from side to side in order to move the orange line on the screen and select shapes in the collaborative video game, many of the children first tried to move the orange line by either rotating their Wiimote like a compass about its Z-axis or by translating their Wiimote from side to side along its X-axis. These children then had to be untrained in moving their Wiimotes incorrectly and properly trained in rolling their Wiimote from side to side. Although none of the children seemed upset at this turn of events, we believe that the additional time and effort required for the children to learn how to properly play the game may have detracted from their already-limited interactive abilities. This is because the time that the children spent learning how to use their Wiimote was uniformly spent looking at their controller and not speaking, instead of potentially looking at the other player and/or communicating with them. As such, we feel that future iterations of the game should involve more natural and more intuitive methods of control that more clearly match a child’s expectations upon looking at a game screen, such as using pitch-control in a Pong-like game to make a paddle move up or down, or using simple poses to play a game that involves mimicking stick figures.
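One possible reason why the compass-like rotation and the side-to-side translation had no lasting effect on the orange line can be illustrated with a short sketch. It assumes, purely for illustration, that the controller's roll is estimated from the direction of gravity in a three-axis accelerometer reading; the article does not state how the game actually computed roll, and the function names, the 45-degree range, and the 800-pixel screen width below are hypothetical.

```python
import math


def roll_degrees(ax: float, az: float) -> float:
    """Estimate roll about the controller's long (Y) axis from gravity.

    ax and az are accelerations along the X and Z axes in units of g.
    Assumed convention: a controller lying flat, face up, reads ax=0, az=1.
    """
    return math.degrees(math.atan2(ax, az))


def line_position(roll_deg: float, max_roll_deg: float = 45.0,
                  screen_width: int = 800) -> int:
    """Map a roll angle to a horizontal pixel position for the line.

    Roll is clamped to [-max_roll_deg, +max_roll_deg] (hypothetical range)
    and scaled linearly across the screen.
    """
    clamped = max(-max_roll_deg, min(max_roll_deg, roll_deg))
    fraction = (clamped + max_roll_deg) / (2.0 * max_roll_deg)
    return round(fraction * (screen_width - 1))


# A flat controller turned like a compass about its Z-axis still measures
# gravity along Z only, so the estimated roll stays at zero.
flat_or_yawed = roll_degrees(ax=0.0, az=1.0)

# Rolling the controller onto its side moves gravity onto the X-axis.
rolled_on_side = roll_degrees(ax=1.0, az=0.0)
```

Under these assumptions, turning a flat controller about its Z-axis leaves the gravity reading unchanged, and a sideways translation produces only a brief acceleration transient, so neither gesture would move the line in a sustained way.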

Thirdly, KASPAR’s behaviours unintentionally rewarded some of the children for not doing anything, despite the robot’s stated goal of rewarding the children for cooperating successfully. Specifically, because the robot was programmed to prompt a child to choose a shape if they remained inactive for 5 consecutive seconds, as well as to repeatedly attempt to take the initiative in choosing shapes when the child did not respond to two consecutive prompts, one child discovered that KASPAR would essentially speak to them every 5 seconds provided that they did not play at all. This allowed the child to stare raptly at the robot for long periods of time without responding to the robot’s prompts, until the child’s carer jogged him out of this routine. Although this loophole in KASPAR’s behaviour should clearly be fixed in future iterations of the system, this incident also provided evidence that KASPAR’s behaviours and interactions could serve as their own reward, in addition to the sensory rewards provided by the video game. As such, because KASPAR’s speech and behaviours can entertain children with autism, future versions of the robot’s programming should limit the frequency and/or duration of the interactions if the children either do not play cooperatively or passively fixate too much on KASPAR instead of actively playing the game and communicating with the other players.
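The difference between the fixed prompting schedule described above and a schedule that speaks less often as a child grows more passive can be sketched as follows. This is only an illustration of one possible fix, not KASPAR's actual control code: the exponential back-off, its doubling factor, and the 60-second cap are assumptions introduced here, and only the 5-second inactivity threshold comes from the study.

```python
def fixed_delay(ignored: int, base: float = 5.0) -> float:
    """Schedule as described in the study: speak after every 5 s of inactivity."""
    return base


def backoff_delay(ignored: int, base: float = 5.0, factor: float = 2.0,
                  cap: float = 60.0) -> float:
    """Hypothetical alternative: wait longer after each ignored utterance.

    The factor and cap are illustrative values, not taken from the article.
    """
    return min(cap, base * factor ** ignored)


def count_utterances(delay_fn, window: float) -> int:
    """Count robot utterances within `window` seconds of total child inactivity."""
    elapsed = 0.0
    ignored = 0  # utterances so far that the child has not responded to
    while True:
        elapsed += delay_fn(ignored)
        if elapsed > window:
            return ignored
        ignored += 1


# One minute in which the child does nothing at all.
fixed_count = count_utterances(fixed_delay, 60.0)
backoff_count = count_utterances(backoff_delay, 60.0)
```

With these illustrative values, a child who stays inactive for one minute would hear 12 utterances under the fixed schedule but only 3 under the back-off schedule, which reduces the reward for remaining passive while leaving the first prompt unchanged.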

6.2 Interpretation of Findings

Although the data gathered in the course of this exploratory study were only selected and analyzed as a preparatory exercise, and despite the fact that this study’s design cannot distinguish between effects from the children becoming familiar with playing the cooperative game and effects from the children learning about cooperative play from their interactions with KASPAR, one can still gain insight on trends to look out for in later studies by making inferences on this pilot study’s data. Having said this, the fact that the children performed more actively collaborative behaviours (changing the direction in which one looks, choosing shapes through taking the initiative, and successfully selecting shapes) during their second session of playing with the human adult than during their first session of doing so is particularly interesting when one considers that there were no similar increases in actively collaborative behaviours between the children’s first and second sessions of playing with KASPAR. Because this trend was not seen during the robot play sessions, this might indicate that the children wanted to play the game more, or grew more willing to play collaboratively, when they played with the human partner. Additionally, since the two play sessions involving the human adult occurred both before and after a play session involving KASPAR, it could also mean that during the second play session with the human adult, the children were able to apply what they learned about collaboration from playing with the robot. This interpretation would support the experimental hypothesis which was described in Sect. 1.2.2. On the other hand, the children’s growing display of actively collaborative behaviours during the two sessions of playing with the adult human could also be caused by their gradually becoming more comfortable interacting with the human partner.
To properly determine the cause of this trend, another study involving multiple play sessions in each phase of the experiment would have to be undertaken; if similar increases in collaborative behaviour were also observed between two different sets of play sessions with an adult human that bracketed a set of play sessions with KASPAR, it would provide strong evidence that interacting with robots improved autistic children’s collaborative behaviours. On another note, the fact that the children played differently depending on whether they played with the human partner or KASPAR supports the findings of previous research. Specifically, since the children spent more time looking at KASPAR and would also switch between looking at the game and the robot more times than they switched between looking at the game and the human adult, one could argue that the children simply thought that KASPAR was more interesting than a human adult. Furthermore, since the children also displayed more positive affect while looking at KASPAR than they did while looking at the human adult, the children may have thought KASPAR was more enjoyable and fun than the human.

There are also certain findings from this pilot study that were surprising and/or not easily explained. Specifically, the children did not collaborate more or better with KASPAR, as they instead chose fewer shapes and passively followed the robot’s suggestions instead of taking the initiative in choosing shapes. This suggests that the children were neither as engaged in the game nor as able to perform cooperative actions when interacting with KASPAR as they were when interacting with the human player. Additionally, some of the children engaged in “scripting”, or mimicking actions and speech from different forms of media, in that they freely and happily mimicked KASPAR’s actions and speech. Similarly, one child was observed performing actions that, instead of being helpful for selecting shapes, served only to make the robot act in an amusing manner. These phenomena suggest that although the autistic children from our study saw the robot as more entertaining than the video game, they also seemed to pay less attention to the content and meaning of KASPAR’s speech than to the fact that KASPAR spoke to them at all.

At first glance, the data might suggest that the children perceived KASPAR as a source of humor and interest instead of an entity with which they could communicate and play; this might be due to the novelty of the children interacting and playing with a humanoid robot. Specifically, because none of the children had interacted with a humanoid robot before, much less played a game with one, they may have found the experience of KASPAR interacting with them to be so interesting that they wanted to observe the robot and its behaviours instead of actually communicating with it. As such, future studies involving KASPAR should also contain periods of familiarization, in which the children could learn how KASPAR behaves and reacts to the children’s own actions, which would precede the main phases involving the children collaborating with the robot. Having the children gain some experience in interacting with KASPAR could decrease the potential for the robot’s “novelty effect” to influence the children’s interactions with it, as well as reduce the amount of time spent gazing in awe at the robot instead of playing with it.

7 Conclusions

This article presents our findings from an exploratory pilot study that tested and evaluated an experimental setup involving a dyadic collaborative video game and an autonomous version of the humanoid robot KASPAR, both of which were designed for children with autism. In addition, this study also served as a preparation for a longer and more extensive study by having children with autism alternate between playing the video game with a human partner and playing the same game with the humanoid robot. The results from the present study’s evaluation and testing of the systems developed showed that while the children willingly and happily played our collaborative video game and were fascinated by KASPAR’s autonomous behaviour, there were certain aspects of the systems that could be designed better, such as the method of communicating with the robot, the intuitiveness of the video game’s controls, and the implications of the robot’s behaviour patterns. Similarly, the results from gathering initial data on the children’s behaviour upon alternating between playing the collaborative video game with a human adult and an autonomous robot suggest that the children were more entertained, seemed more interested in the game, and collaborated better with a partner during their second sessions of playing with a human than their first; in contrast, there were no significant differences when comparing how the children played in their first and second sessions with the humanoid robot. While the changes in the children’s social behaviour with the human player may be due to the children’s intermediary play session with the robotic partner, there is also a chance that such changes might also occur after enough repeated interactions with a human adult, without any child-robot interaction whatsoever.
Additionally, while the children seemed to see their robotic partner as being more interesting and more entertaining than their human partner, they seemed to solve problems collaboratively and work together better with people. This phenomenon might be due to the novelty of interacting with a robot overtaking the desire to interact productively with it.

To explore these phenomena in more depth and to conduct a more extensive trial of the collaborative video game and the autonomous robot, a similar study over a longer period of time and involving alternating baseline/intervention phases of play will be conducted, with each phase comprising multiple play interactions between the child and a human player (in the case of baseline) or the child and a robotic player (in the case of intervention). A familiarization phase will also be included to keep the “novelty effect” from interfering with the children’s behaviours. By comparing the frequency and duration of the children’s social behaviours both over the course of each baseline phase as well as between the averages of the two baseline phases, one should be able to more easily determine whether any changes in the children’s displays of social behaviours were due to repeated interactions with a human adult or the after-effects of a special set of interactions with a robot during the first intervention phase. Similarly, if there were no significant changes in the children’s displays of social behaviour with the robot during the course of each intervention phase, and if the interactions with the robot produced consistently different displays of social behaviour than the interactions with the human, this would argue against the novelty factor being a driving force behind the uniqueness of an autistic child interacting with an autonomous robot. To ensure that the children interact easily with both players, in the future we will simplify the methods for in-game communication and program the robot with better sensing and filtering algorithms to more easily interpret the children’s actions.
In addition, in an attempt to keep the children from fixating on the robot’s interactions without attempting to communicate with it, we will program the robot to interact with the children less often or for shorter durations as the children become more and more passive in their communication. While this discussion has focused on the lessons learnt from this pilot study in addition to leading to concrete plans for the next study, we believe that the insights gained in this study can also benefit other human-robot interaction research, in particular in the area of robot-assisted play for children with autism. The technical contribution of this article concerns the development and implementation of a setup for collaborative dyadic and triadic interactions with an autonomous humanoid robot, and may also be used for different applications and/or user groups in the future.