1 Introduction

Some neurodegenerative syndromes result in substantial speech impairments that reduce the intelligibility of spoken language, thereby affecting everyday communication and eventually contributing to social isolation [1]. For example, dysarthria is a speech motor impairment that results from disturbed neuromuscular control and is a symptom of Parkinson’s disease. In this case, speech therapy can help to support communication, provided that it is delivered with sufficient frequency and intensity [2]. During sessions, therapists introduce and monitor exercises for the treatment of articulation, prosody and pitch range, speech rate, vocal volume, or resonance. A high frequency of exercises requires additional intensive training besides the speech therapy sessions. As therapists cannot supervise all training, they need to give patients control over self-regulated training. Only then can the necessary repetition of several exercise units a day be ensured.

Since such exercises are often strenuous, monotonous, repetitive, and boring, many patients lack the necessary motivation to keep up with their training [3]. In addition, self-awareness of speech quality is often reduced in patients with Parkinson’s disease. They need accurate and immediate feed-back, for example via automatic speech recognition, in order to evaluate intelligibility and progress. Game-based technology might provide these patients with an adequate tool for independent and high-frequency training, tapping into the empowerment of the patient [4]. In the R&D project ISi-Speech (‘Individualisierte Spracherkennung in der Rehabilitation für Menschen mit Beeinträchtigung in der Sprechverständlichkeit’ (in German) [individual speech recognition in therapy for people with motor speech disorders]), we are joining efforts in an interdisciplinary team of engineers for speech signal processing and informatics, media designers, and researchers from the fields of speech pathology and psychology to develop a digital training system for people with dysarthria. The challenge is to develop an automatic speech recognition system applicable to distorted speech and to integrate this system into a speech therapy application whose motivational potential contributes to frequent and autonomous usage. To this end, motivation theory related to a game-based context will be applied.

2 Gaming Motivation

Games show the potential to satisfy psychological needs and provide deeper and long-lasting experiences for players [5]. Motivational theories such as self-determination theory (SDT) have been applied to games, the motivation of players, and the well-being outcomes of play. SDT assumes that all humans are driven by the basic psychological needs for competence, autonomy, and relatedness [6]. While the need for autonomy reflects a desire to engage in activities of one’s choice, the need for competence implies a desire to interact effectively with the environment. The need for relatedness involves the feeling of being socially connected. By tapping into these components, technology, and particularly game-based interventions, shows considerable potential to facilitate health behavior [7, 8]. The beneficial effects of enhancing the components autonomy and competence have already been observed for features of the gaming environment and in the context of acceptance of video game play [7,8,9]. Specifically, features such as flexibility, choice, and structured rewards support a sense of autonomy. The feeling of competence may be supported by game features such as intuitiveness, optimal challenges, or adequate feed-back. Social elements such as avatars or agents, as well as real players connected through a game, support the experience of relatedness.

Szalma [5], for example, recently suggested factors for game design that affect human needs in technology use based on SDT. Such factors include choice setting, informational feed-back, accountability for performance outcome, acknowledgement of the user’s experience, a meaningful rationale for tasks, and an explicit acknowledgement that a specific task may be experienced as uninteresting. In a large study, evidence was accumulated for six clusters of player motivation: action, social, mastery, achievement, creativity, and immersion [10]. Knowing that players have different motivations resulting in different profiles can help to implement adequate motivational elements in the game context. Peng and colleagues [11] implemented game features based on SDT in the context of video games. For example, features such as character customization, treasure collection, and different strands of conversation are recognized as supporting players’ perception of autonomy. Furthermore, system adaptation, a heroism meter, and achievement badges are designed to enhance satisfaction of competence. Finally, context information and supportive dialogue were intended to increase relatedness.

What can be learned from gaming studies for the health context? Health games are becoming increasingly popular (e.g. [4, 12, 13]). They incorporate game design elements in a non-game context and are therefore defined as gamification [14]. Health games, specifically when applied in an intervention format, represent an opportunity to meaningfully engage patients and improve their enjoyment of training. However, most health technology still neglects the potential of motivational elements and their supportive character [3]. The goal of a supportive environment will be to overcome the boredom of repetitive and monotonous tasks and to engage users by giving them a contextualized meaning for the repetitive task. The inclusion of motivational elements in the task, the interface, or the context offers opportunities to satisfy the core human needs for autonomy, competence, and relatedness. Within this paper we want to apply insights from both SDT and game-based studies to ISi-Speech as a therapeutic game.

3 Therapeutic Games

Mader and colleagues [3] defined therapeutic games as “games that produce a direct, expected, and intended therapeutic effect on patients playing them. This therapeutic effect may be to alleviate, to improve or to heal the specific condition of the patients” [3, p. 1]. This means, for example, that the therapeutic effect of improved intelligibility of spoken language derives from the loudness adaptation and not directly from the game session.

Sustaining a patient’s motivation for therapy in chronic conditions is one of the most difficult long-term challenges [15]. The combination of game and therapy is a very promising way to maintain patients’ motivation to follow their exercises in speech therapy. In recent years, a broad body of research has accumulated on the positive effects of play in the health context [4, 16,17,18,19,20]. Researchers have discussed the potential of games to offer therapeutic benefits to various subgroups of people and conditions, and how to use them appropriately. One of the most important challenges of therapeutic game design is the complex development process combining game action and therapeutic competence [3]. How to choose therapeutic games wisely, i.e. based on evidence, has been proposed by speech and language therapists (e.g. [21, 22]). These authors suggest a procedure for an evidence-based selection of apps that integrates both the classical evidence-based approach to the selection of therapeutic methods and the evaluation of apps, as well as their integration into the individual therapy setting.

4 Embedding Self-determination Theory into Game-Based Speech Rehabilitation

In this section we argue for the necessity of combining the evidence on gaming motivation based on SDT with the knowledge on health gaming in order to develop a therapeutic game such as ISi-Speech.

Evidence suggests that interventions based on theoretical approaches are more effective than ad hoc developments (for a review see [23]). As mentioned in Sect. 2, SDT is a successful example of linking theory to health intervention design (e.g. [24, 25]; for a summary see [26]). Furthermore, any rehabilitation training technology needs to match the therapeutic intervention. Without explicit attention to theory, health behavior interventions might fail.

Within the health domain, game-based interventions are increasingly shown to enhance motivation (e.g. [18, 27,28,29]) and motor control (e.g. [30, 31]) in patients with neurological disorders. While a few systematic reviews focusing on physical activity have examined the effectiveness of game-based interventions (e.g. [32]), studies investigating game-based tools for speech intervention are scarce (e.g. [33]). As a result, interventions seem to enhance the selected therapy goal, but their effect in the longer run has not yet been clarified. Gamification elements such as feed-back, adaptability, motivational elements, and monitoring have been identified as important features for games used as rehabilitation tools (e.g. [34]).

Within ISi-Speech we assigned gamification elements to the patients’ needs for autonomy, competence, and relatedness as described in SDT (Table 1). This theory-based approach allows for a better understanding of how technology and psychology need to work closely together in order to prompt a usage of the system that will eventually result in effectiveness. The first column contains the operationalization of autonomy, competence, and relatedness within the context of digital interactive media. Column two defines the goal to be achieved within the therapeutic game. The last column gives examples of how this goal is supposed to be attained within ISi-Speech as a tool for the improvement of speech in patients with Parkinson’s disease. These examples are derived from a script implemented in ISi-Speech in which a patient is supposed to order coffee and cake in a bakery. This script was designed to facilitate common interactions in a rather public space.

Table 1. Implementing SDT-based gamification elements into ISi-Speech.

4.1 Autonomy

Allowing as much choice as is practical in exercising has been demonstrated to be an important feature supporting autonomy [5]. In ISi-Speech, for each training session the patient will have the choice between activities within the task. For example, the patient can choose whether s/he needs an auditory repetition of the word s/he has to produce.

Another control feature is based on different strands of conversation resulting in true interactivity. If patients can choose between options that result in different interactions, conversation styles are personalized, too [35].

Furthermore, the experience of autonomy could be reached by giving patients a choice in the reward system [11]. ISi-Speech will implement extrinsic and performance-based rewards to provide incentives to patients in order to reinforce the desired response. Social and individual norms will both be used as anchors for comparison. Goals are made transparent and associated with the reward, for example how many points can be earned for completing a specified minimum number of speech exercises per day. Positive evidence with respect to the implementation of (social) reward systems in the game sector [12] encourages us to apply those motivational components to ISi-Speech.
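The transparent link between a daily goal and its reward can be illustrated with a minimal Python sketch. It is not the ISi-Speech implementation: the names `DailyGoal`, `points_earned`, and `exercises_remaining`, as well as all point values, are hypothetical and chosen only to show one way such a rule could be expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyGoal:
    """Hypothetical reward rule; all values are illustrative assumptions."""
    min_exercises: int        # specified minimum number of exercises per day
    points_per_exercise: int  # points granted for every completed exercise
    bonus_points: int         # one-time bonus once the daily minimum is reached


def points_earned(completed: int, goal: DailyGoal) -> int:
    """Return the points for one day, given the number of completed exercises."""
    if completed < 0:
        raise ValueError("completed must not be negative")
    base = completed * goal.points_per_exercise
    bonus = goal.bonus_points if completed >= goal.min_exercises else 0
    return base + bonus


def exercises_remaining(completed: int, goal: DailyGoal) -> int:
    """Return how many exercises are still missing to reach the daily minimum."""
    return max(goal.min_exercises - completed, 0)


if __name__ == "__main__":
    goal = DailyGoal(min_exercises=5, points_per_exercise=10, bonus_points=50)
    for done in (3, 5):
        print(done, points_earned(done, goal), exercises_remaining(done, goal))
```

Showing the remaining number of exercises next to the points already earned is one possible way to keep the goal visible to the patient at all times.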

4.2 Competence

The most important motivational design element in ISi-Speech is the feed-back component. Feed-back is given an additional role to forestall the tendency of patients with Parkinson’s disease to falsely attribute problems in communication to others (e.g. to attribute hearing loss to their conversation partner rather than acknowledging that they themselves speak too softly and are therefore not intelligible enough). In this case, tailored feed-back facilitates the patient’s experience of competence during exercising. We implement an automatic speech recognition system that has been shown to be effective in patients with dysarthria [33]. Feed-back that combines play-back with an accurate and specific evaluation of recorded speech will guide patients towards more adequate self-perception.

The achievement component seems to be highly important for the patients’ satisfaction of competence as well. ISi-Speech will incorporate various types of achievement badges per exercise. Throughout all exercises, patients can, for example, browse a dedicated achievement menu [11]. To support the perception of progress, ISi-Speech will make the advancement of each user constantly visible [36]. Thus, patients will be able to easily inform themselves about progress and power gain online.

Another competence-supporting element is adaptivity to the performance of the user. In this case, the ISi-Speech mechanism will adjust constantly along with the user’s performance in the exercise. For example, when a patient successfully exercises a predefined number of task items, s/he will be offered the next task unit both immediately and when all task items are completed. Variables such as time and number of repetitions per task are important measurements for a subsequent adaptation. However, there is still discussion about how incremental levels have to be designed in health games in order to impact health behavior [37].
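A minimal Python sketch of such an adaptation rule is given below, under stated assumptions. The thresholds, the level range, and the names `AdaptationRule`, `TaskResult`, and `adapt_level` are hypothetical and not taken from ISi-Speech. In particular, stepping down a level after many repetitions or long task times is an assumption added for illustration, since the appropriate design of incremental levels is still under discussion [37].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdaptationRule:
    """Hypothetical thresholds; all values are illustrative assumptions."""
    items_to_advance: int = 5             # predefined number of successful items
    max_repetitions_per_item: float = 3.0
    max_seconds_per_item: float = 30.0


@dataclass(frozen=True)
class TaskResult:
    attempted_items: int
    successful_items: int
    total_repetitions: int
    total_seconds: float


def adapt_level(level: int, result: TaskResult, rule: AdaptationRule,
                min_level: int = 1, max_level: int = 5) -> int:
    """Return the level of the next task unit based on the last result."""
    items = max(result.attempted_items, 1)  # avoid division by zero
    reps_per_item = result.total_repetitions / items
    secs_per_item = result.total_seconds / items

    # Enough successful items: offer the next, more demanding task unit.
    if result.successful_items >= rule.items_to_advance:
        return min(level + 1, max_level)
    # Many repetitions or long task times: assume the unit was too hard.
    if (reps_per_item > rule.max_repetitions_per_item
            or secs_per_item > rule.max_seconds_per_item):
        return max(level - 1, min_level)
    # Otherwise the patient stays on the current level.
    return level


if __name__ == "__main__":
    rule = AdaptationRule()
    print(adapt_level(2, TaskResult(5, 5, 6, 60.0), rule))
```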

4.3 Relatedness

We aim to develop a so-called para-social relationship between the patient user and ISi-Speech by implementing a virtual coach (VC). By building a tutor character, we aim to achieve a personalization of the system [11, 36, 38]. This VC can be a constant companion during the training. If patients are given the tools to choose the VC’s features, such as gender, voice, age, and personality, as well as the character’s appearance, learner motivation will be enhanced [39,40,41]. Para-social propositions are shown to have a huge potential for the feeling of social inclusion [42]. For example, the VC can fulfill the desire to form long-term meaningful relationships with others [38] and might facilitate motivation and outcome when the valence of social cues is taken into account [39]. Especially when the VC is self-made and implemented in a social dialogue, exercises might be perceived as more pleasant [40, 43] and VC acceptance is higher [44]. In addition, information about the tasks in the exercises will create an environment for users that enhances the experience of social presence [11]. Based on this evidence, the implementation of a virtual coach in ISi-Speech will strengthen the patient-VC relationship (relatedness) and thereby support the patients’ need to be respected and understood and, consequently, to feel socially included.

Other components of ISi-Speech are competition and collaboration. Studies demonstrated sustainable usage when competition was included as an element (e.g. [45]). In collaborative teamwork situations, patients might derive satisfaction from the perception of being part of a group effort [38]. ISi-Speech will provide users with social support that encourages them to engage with the speech therapy exercise. For example, patients are assigned to teams within an exercise for clear and exaggerated articulation of words. In order to unlock the next articulation level on the articulation exercise of phrases, patients must work as a team to achieve the therapy goal of speaking ten words with clear articulation. Here, the automatic speech recognition system implemented in ISi-Speech will support the patient in achieving her/his goal. In collaborative teamwork situations patients could have an interest in helping and chatting with others [38]. In contrast, the competition will enable each patient to evaluate her/his own performance on the exercise by comparing it to that of other patients. The patient’s motivation is based on her/his focus on a challenge and a competition with others in the exercise. Patients will be ranked on a leaderboard by the number of speech exercises they logged over a day and across different training sessions.
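The two social mechanisms described above, a shared team goal and a leaderboard, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the data layout, and the alphabetical tie-breaking are assumptions, and only the team goal of ten clearly articulated words is taken from the text.

```python
from collections import Counter
from typing import Dict, List, Tuple

TEAM_GOAL_CLEAR_WORDS = 10  # therapy goal named in the text


def team_level_unlocked(clear_words_by_member: Dict[str, int],
                        goal: int = TEAM_GOAL_CLEAR_WORDS) -> bool:
    """Return True once the team has jointly reached the goal.

    Each value is the number of words a team member spoke with clear
    articulation, e.g. as judged by the speech recognition system.
    """
    return sum(clear_words_by_member.values()) >= goal


def leaderboard(exercise_log: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Rank patients by the total number of logged exercises.

    `exercise_log` holds (patient, exercises_in_session) entries from
    different training sessions. Ties are broken alphabetically, which
    is an arbitrary choice made for this sketch.
    """
    totals: Counter = Counter()
    for patient, exercises in exercise_log:
        totals[patient] += exercises
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(team_level_unlocked({"anna": 4, "ben": 6}))
    print(leaderboard([("anna", 3), ("ben", 5), ("anna", 4), ("cara", 5)]))
```

Pseudonymous patient identifiers are assumed here; how names are displayed to other users would be a separate design decision.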

5 Summary and Conclusion

We reported how we use the potential of gamification elements to develop ISi-Speech as a rehabilitation tool in speech therapy based on SDT. The core principles of SDT, namely autonomy, competence, and relatedness, will facilitate activity, engagement, and social interaction in our potential user group, patients with Parkinson’s disease. We assume that the ISi-Speech training tool for individuals with Parkinson’s disease increases the motivation for self-sustainable usage when intervention components are linked to those three core elements of SDT. Patients’ autonomy will be reached by enabling the patients to be independent from others (e.g. family) while training with ISi-Speech. The experience of competence requires a comparison with previous achievements, with others, or with a goal during training. We aim to reach relatedness in ISi-Speech by facilitating the connectedness to other users as a source for feeling embedded. The incorporation of SDT during the development process of ISi-Speech as a game-based intervention promises acceptance and effectiveness of training in patients with Parkinson’s disease. We envision thereby being able to substantially empower patients with impaired speech and contribute to their social inclusion.