Elsevier

Sleep Medicine

Volume 10, Issue 5, May 2009, Pages 556-565
Original Article
Development of a Japanese version of the Epworth Sleepiness Scale (JESS) based on Item Response Theory

https://doi.org/10.1016/j.sleep.2008.04.015

Abstract

Background

Various Japanese versions of the Epworth Sleepiness Scale (ESS) have been used, but none was developed via standard procedures. Here we report on the construction and testing of the developer-authorized Japanese version of the ESS (JESS).

Methods

Developing the JESS involved translations, back translations, a pilot study, and psychometric testing. We identified questions in the ESS that were difficult to answer or were inappropriate in Japan, proposed possible replacements for those questions, and tested them with analyses based on item response theory (IRT) and classical test theory. The subjects were healthy people and patients with narcolepsy, idiopathic hypersomnia, or obstructive sleep apnea syndrome.

Results

We identified two of our proposed questions as appropriate replacements for two problematic questions in the ESS. The JESS had very few missing data. Internal consistency reliability and test–retest reliability were high. The patients had significantly higher JESS scores than did the healthy people, and higher JESS scores were associated with worse daytime function, as measured with the Pittsburgh Sleep Quality Index.

Conclusions

In Japan, the JESS provides reliable and valid information on daytime sleepiness. Researchers who use the ESS with other populations should combine their knowledge of local conditions with the results of psychometric tests.

Introduction

Daytime sleepiness is an important manifestation of sleep disorders. It can disrupt a patient’s social life and threaten public health and safety [1], [2], [3], [4]. Daytime sleepiness is an important marker in assessments of sleep disorders, and is measured both subjectively and objectively. The “gold standard” index of sleepiness is provided by the Multiple Sleep Latency Test (MSLT) [5], but this test is costly and time-consuming. Requiring much less time and money is the Epworth Sleepiness Scale (ESS), a self-report instrument for measuring a patient’s perception of sleepiness. Guidelines for the control of Obstructive Sleep Apnea Syndrome (OSAS), narcolepsy, and insomnia recommend the ESS [6], [7], [8], and it has also been used in occupational and community-based studies [9], [10].

The ESS comprises questions about subjective sleepiness in eight situations [11], [12]. Respondents use a 4-point scale (scored 0–3) to respond to each of the eight questions, and the scores are summed to give an overall score of 0–24. Higher scores indicate stronger subjective daytime sleepiness, and scores below 10 are considered to indicate no problem [9].
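The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name `ess_total` is hypothetical, and rejecting incomplete questionnaires is an assumption, since the text here does not say how missing answers are handled.

```python
from typing import Sequence


def ess_total(responses: Sequence[int]) -> int:
    """Sum eight item responses (each scored 0-3) into a total of 0-24."""
    # Assumption: a questionnaire with missing items is rejected rather
    # than imputed; the source text does not specify a rule for this.
    if len(responses) != 8:
        raise ValueError("expected responses to all eight questions")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(responses)
```

For example, `ess_total([1, 2, 0, 3, 1, 0, 2, 1])` sums the eight responses to a total of 10.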

Several Japanese-language versions of the ESS are available, but their relation to the original version and their acceptability are questionable because none was developed in accordance with standard procedures or in coordination with the developer of the original (English-language) version. An advisory committee on sleep apnea syndrome within the Japanese Respiratory Society reported in 2004 that 165 of 277 hospitals used various Japanese-language versions of the ESS, and used them in different ways (self-administration or interview). This inconsistency hampers comparisons among hospitals [13]. In addition, although many papers from Japan, including one by us [14], have reported ESS data, the possibility remains that the sleepiness measured with those versions differs in important ways from that measured with the original ESS. Furthermore, the questions in the original ESS ask about sleepiness in various daily life situations, but we should not assume that all of those situations are familiar to respondents in Japan. Such a lack of familiarity could explain the reported high rates of missing data (9.5–19.2%) [15]. For example, the ubiquity of public transportation in Japan could account for the fact that 19.2% of the subjects in one study did not answer question 8, which asks about sleepiness “In a car, while stopped for a few minutes.”

Item Response Theory (IRT) is increasingly used in the construction of scales for measuring subjective attributes, particularly in research on quality of life. Common applications of IRT are the construction of shorter versions of existing scales, the establishment of scoring algorithms, and Computerized Adaptive Testing (CAT) [16], [17], [18], [19]. In IRT, the probability of a particular response to a question is described as a function of a latent trait assumed to underlie the manifest response. The value of the latent trait is called “theta” [20], [21], and in this study it is the actual subjective daytime sleepiness. The probability of a particular response is typically described with a function that has two parameters: “location” is the value of the latent trait about which a response provides the most information (in educational testing this is called “difficulty”), and “slope” is the degree to which responses can be used to distinguish between small differences in the latent trait [22]. In the ESS, questions with higher values of the location parameter provide more information about people whose daytime sleepiness is severe, and questions with lower values of the location parameter provide more information about people whose daytime sleepiness is milder. Questions with higher values of the slope parameter allow one to make fine distinctions between severities of daytime sleepiness, i.e., to measure small differences in daytime sleepiness, while questions with lower values of the slope parameter allow one to measure only relatively larger differences in daytime sleepiness. Analyses based on IRT allow more precise examinations of the characteristics of each question than do those based on Classical Test Theory (CTT) [20], [21]. In addition to analyses of each question, IRT allows construction of an information function for each question and also for the scale as a whole. That test information function reflects the accuracy of measurement at different values of the latent trait [20], [21]. Given a sufficiently large and wide-ranging group of questions (the “item pool”), and knowledge of each question’s location and slope, a scale with a desired test information function can be constructed.

Here, at the request of the Japanese Respiratory Society, we report the development of a Japanese version of the ESS (the JESS). We were able to ensure that the JESS was as close as practically possible to the ESS, because one of us (M.J.) is the developer of the ESS. Our purposes were to translate the ESS into Japanese using commonly accepted methods, to clarify problems with unauthorized Japanese-language versions of the ESS, to use IRT to develop a better Japanese-language version (the JESS), and to study the reliability and validity of the JESS.

Translation and preparation of the item pool

To develop the Japanese version we used a method that has been used in many countries and for many self-report scales [23]. The process includes translation from the source language into the target language (i.e., forward translation), translation from the target language back into the source language (i.e., back translation) so the developer of the source-language version can participate fully, and examination of translation quality. We also included a pilot test.

At the forward translation

Translation and item pool

Discussions between the Japanese team and the developer of the original version resulted in some differences between the JESS and previous Japanese-language versions.

In our first translation and in several previous Japanese-language versions the instructions and response options used the expression “nemutteshimau,” which means “fall asleep.” Because “dozing off” was meant to indicate sleeping for a short time, we instead used “utoutosuru (suubyou∼suufun nemutteshimau),” which means “doze off

Discussion

This study showed that the authorized Japanese translation of the ESS (the JESS) measures a construct similar to that measured by the original ESS. Moreover, we showed how the original ESS question about being in a car was problematic, and we identified an appropriate replacement for that question.

While the original ESS asked questions about probabilities (“how likely”), previous Japanese versions asked factual questions (“how often”) and measured more severe sleepiness. To avoid that content

Conclusion

Using standard, internationally recognized methods we developed and tested a version of the ESS for use in Japan (the JESS). To our knowledge, this study is the first application of IRT to the selection of questions for the ESS. Two questions from the original ESS were replaced. The two new questions were psychometrically similar to, or better than, the originals. The JESS is characterized by content equivalence with the original ESS and appropriateness for use in Japan, and its use is expected

Acknowledgements

This study was supported by grants from the Institute for Health Outcomes and Process Evaluation Research (iHope International). We are grateful to Itsunari Minami and Yuriko Nakayama for recruiting subjects and collecting data. We also wish to thank Tsutomu Namikawa for his suggestions on our analysis.

References (36)

  • Scottish Intercollegiate Guidelines Network. Management of obstructive sleep apnoea/hypopnoea syndrome in adults. A...
  • M. Littner et al. Practice parameters for the treatment of narcolepsy: an update for 2000. Sleep (2001)
  • A. Chesson et al. Practice parameters for the evaluation of chronic insomnia. An American academy of sleep medicine report. Standards of practice committee of the American academy of sleep medicine. Sleep (2000)
  • M. Johns et al. Daytime sleepiness and sleep habits of Australian workers. Sleep (1997)
  • M.W. Johns. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep (1991)
  • M.W. Johns. Reliability and factor analysis of the Epworth sleepiness scale. Sleep (1992)
  • T. Akashiba et al. Current situation of the SAS diagnosis and treatment in facilities recognized by the Japanese respiratory society: results of the questionnaire survey. Nihon Kokyuki Gakkai Zasshi (2004)
  • K. Chin et al. Response shift in perception of sleepiness in obstructive sleep apnea–hypopnea syndrome before and after treatment with nasal CPAP. Sleep (2004)

Disclosure statement: This study was supported by grants from the Institute for Health Outcomes and Process Evaluation Research (iHope International). All authors have indicated no financial conflict of interest.