The pre-Socratic Greek philosopher Protagoras (490 BC–420 BC) made an argument about 2,500 years ago: “Man is the measure of all things: of things which are, how they are, and of things which are not, how they are not” [1]. This controversial idea contrasts with a long-standing paradigm holding that the universe rests on objective matters beyond human subjective influence. Similarly, the use of objective methods such as laboratory tests and diagnostic imaging, rather than subjective and self-reported methods, to evaluate patient health outcomes has dominated the medical sciences for several hundred years. Although clinicians and policy makers have gradually accepted the notions of “quality of life (QOL),” “health-related quality of life (HRQOL),” or “patient-reported outcomes (PROs)” as endpoints in the adult setting, the pediatric setting has lagged behind in considering self-reports in concept initiation, instrument development, and clinical application of PROs. The conventional belief that children’s self-reported health information is unreliable, together with complex developmental issues, has resulted in insufficient attention to and under-utilization of pediatric PRO data. In fact, children are able to report their health status in an adequate manner [2], and pediatric PRO instruments need to carefully accommodate content specific to children’s cognitive development as well as their reading ability, vocabulary, and language skills.

This special section of Quality of Life Research includes four articles focusing exclusively on the theme of pediatric PROs. Two articles applied the International Classification of Functioning, Disability, and Health: Children and Youth Version (ICF-CY) as a framework to compare the conceptual content of pediatric PROs embedded in existing instruments for children and adolescents with chronic conditions [3, 4]. One article synthesized the constructs of PRO instruments for pediatric cancer [5], and another article summarized the development of the KIDSCREEN, which is commonly used in European pediatric studies [6]. We applaud the authors for their efforts and useful findings in pediatric PRO measurement.

The comparison among pediatric PRO instruments is a challenging task because these instruments were not created on the same conceptual foundation. Instrument developers use different conceptual frameworks to generate target PROs with specific content and domains and employ various stakeholder-engagement processes to create items. They also apply different psychometric methods to evaluate PRO measures. Previous review articles on pediatric PRO instruments might be misleading because the comparison is confined to the “face value” of the instrument domains [7–9]. These studies often assumed that the domains named by individual instruments (e.g., physical functioning) reflected the genuine concepts of PROs, neglecting the actual content of the items embedded in each instrument. Yet the same domain name across different instruments does not necessarily mean that they measure the same concepts. For example, a recent pediatric study of approximately 1,000 participants found that although four generic pediatric instruments (the Child Health and Illness Profile, the KIDSCREEN-52, the KINDL, and the Pediatric Quality of Life Inventory) use similar labels for their physical and psychological domains, the correlations among the physical domains of the four instruments were about the same as the correlations between the physical and psychological domains [10]. This finding suggests poor convergent/discriminant validity and raises a serious question: what are the elements captured by our pediatric PRO instruments? In this special section, Ahuja et al. [4] concluded that the majority of pediatric PRO instruments for obesity primarily capture the concepts of functional status rather than HRQOL or QOL. The unique contribution of this study is the application of the ICF-CY as a framework to level the field for a systematic, head-to-head comparison among the item content of different instruments.
Interestingly, analyzing a clinical trial registration database, Fayed et al. [3] found that non-drug interventions or late-phase (phase IV) clinical trials were more likely to include activity and participation as endpoints than drug interventions or earlier-phase (phase II or III) clinical trials.
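
The convergent/discriminant comparison described above can be illustrated with a minimal Python sketch. It is not the analysis used in the cited study [10]: the function names, the instrument labels, and the data are hypothetical, and the scores are simulated so that one shared factor drives every domain. Under that assumption, correlations between same-label domains (convergent) and different-label domains (discriminant) come out similar, which is the pattern the text flags as problematic.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def convergent_discriminant(scores):
    """Average cross-instrument correlations by domain label.

    `scores` maps (instrument, domain) -> list of respondent scores.
    Returns (mean convergent r, mean discriminant r), where convergent
    pairs share a domain label and discriminant pairs do not.
    """
    convergent, discriminant = [], []
    for (key_a, a), (key_b, b) in combinations(scores.items(), 2):
        (inst_a, dom_a), (inst_b, dom_b) = key_a, key_b
        if inst_a == inst_b:
            continue  # compare domains across instruments only
        (convergent if dom_a == dom_b else discriminant).append(pearson(a, b))
    return sum(convergent) / len(convergent), sum(discriminant) / len(discriminant)


if __name__ == "__main__":
    # Synthetic data only: every domain score is driven mostly by one shared
    # "general" factor, so the physical and psychological labels barely differ.
    rng = random.Random(0)
    general = [rng.gauss(0, 1) for _ in range(1000)]
    scores = {
        (inst, dom): [g + rng.gauss(0, 0.5) for g in general]
        for inst in ("instrument_A", "instrument_B")
        for dom in ("physical", "psychological")
    }
    conv, disc = convergent_discriminant(scores)
    print(f"mean convergent r = {conv:.2f}, mean discriminant r = {disc:.2f}")
```

With well-differentiated domains, one would expect the convergent average to exceed the discriminant average clearly; how large a gap is adequate is a judgment the sketch does not settle.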

It is encouraging that the two articles using ICF-CY [3, 4] aim to enable universal comparisons of different PRO instruments. Nonetheless, several methodological issues require careful attention. First, there continues to be confusion about the concepts of functional status, HRQOL, and QOL. Although no gold standard definitions are available for these concepts, Wilson and Cleary have proposed a comprehensive model in which HRQOL is a broad concept comprising three key components: functional status, general health perception, and QOL. However, the term “HRQOL” as used by the two above-mentioned articles actually captures the notion of “general health perception” described in Wilson and Cleary’s classic model [11]. Second, the two articles restrict themselves to mapping items of pediatric PROs to different categories (or codes) of ICF-CY. Although this is a useful approach for demonstrating the content validity of PRO tools, empirical data have yet to be collected to confirm the qualitative comparisons. Modern test theory [e.g., item response theory (IRT)] serves as a foundation to assess measurement properties of instruments at the item and domain levels. Evidence on the IRT parameters of items assigned to the same ICF-CY category (e.g., ICF-CY code b152: emotional functions) helps to inform whether these items truly measure different levels of the same underlying latent construct (i.e., emotional function). Combining ICF-CY with IRT methodologies will empower researchers to establish item banks with content-valid and psychometrically robust properties.
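
To make the role of IRT parameters concrete, the following Python sketch uses the two-parameter logistic (2PL) model as a simplified stand-in. This is an assumption for illustration: pediatric PRO items are usually polytomous and would more likely be fitted with a graded response model, and the item names and parameter values below are hypothetical. The sketch shows how items mapped to one category can differ in location (b) and discrimination (a), and therefore in where on the latent trait they are most informative.

```python
from math import exp


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at latent trait level theta.

    a = discrimination, b = location (difficulty) on the theta scale.
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical items, imagined as all mapped to one ICF-CY category.
    items = {"item_1": (1.8, -1.0), "item_2": (1.2, 0.0), "item_3": (2.0, 1.5)}
    for theta in (-2.0, 0.0, 2.0):
        row = ", ".join(
            f"{name}: P={p_endorse(theta, a, b):.2f} I={item_information(theta, a, b):.2f}"
            for name, (a, b) in items.items()
        )
        print(f"theta={theta:+.1f} -> {row}")
```

Items whose information peaks at different trait levels cover different parts of the same continuum, which is the property an item bank needs.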

The article by Anthony et al. [5] specifically compared the content of pediatric PRO instruments designed for children with cancer. In this study, the authors adopted an inductive reasoning approach to summarize the content of domains/subdomains from existing generic and cancer-specific tools. They created a working PRO model with four domains for children with cancer: physical health, psychological health, social health, and general health. The resulting four-domain solution is not surprising because it represents the essential constructs of the existing instruments. The question is whether these domains are comprehensive enough to capture PROs important to cancer patients in general and to survivors in particular. Given that the survival rate is greater than 80%, pediatric cancer patients are facing many life-challenging issues. For example, one recent article discusses the potentially missing content of PROs critical to childhood cancer survivors by reviewing published qualitative studies and comparing the results to the conventional PRO framework with physical, psychological, and social domains [12]. This study found that childhood cancer survivors are behind in several developmental milestones and have unique cancer-related concerns such as normalcy, independence, fertility, self-identity, and body image. However, these issues are either neglected or captured by only a single item in the existing PRO tools [12]. Additionally, as emphasized by Anthony et al. [5], symptom impact and burden are important concepts in pediatric cancer research, yet symptoms are largely overlooked in existing PRO tools. Symptoms are often graded by clinicians rather than self-reported by patients [e.g., National Cancer Institute Common Terminology Criteria for Adverse Events (NCI CTCAE)].
Recent evidence suggests that the prevalence of symptoms increases significantly with age in childhood cancer survivors, and that symptom presence explains about 60% of the variance in the physical and mental aspects of HRQOL [13].

Ravens-Sieberer et al. [6] summarized the development and psychometric evaluation of the KIDSCREEN over the past 15 years. The popularity of the KIDSCREEN in Europe parallels that of the PedsQL, which is commonly used in the United States and other countries [14]. The KIDSCREEN was developed on the basis of qualitative methods for content identification and item generation, followed by quantitative methods (classical test theory and IRT) to generate short and long forms and a utility index for different research purposes. The KIDSCREEN is unique in including domains related to children’s psycho-social-behavioral development, such as autonomy and social acceptance (bullying), and in its new initiative of a computerized adaptive test module, the KIDS-CAT. A recent national survey of pediatricians and specialists suggests that the most beneficial solution for increasing PRO application in clinical practice is to use CAT equipped with programming to automatically calculate an individual’s PRO scores and to provide guidance for interventions and follow-up [15].

In contrast to four decades ago, when the term “quality of life” first appeared in the pediatric literature [16], researchers have taken a major step forward in improving pediatric PRO measurement and application. However, in celebrating this achievement, we must also focus on the next best steps for advancing pediatric PRO measurement in the coming decade. First, there is a great need to explore the issues related to changes in pediatric PRO scores using longitudinal study designs. These issues include responsiveness and minimally important differences (MIDs). Responsiveness methods focus on whether PRO scores change when there is a true change in an individual’s health status, even if the magnitude of the change in PRO scores is small. In contrast, MIDs quantify the magnitude of change in PRO scores that is clinically meaningful or interpretable, using appropriate anchors and statistical methods [17]. Without information on responsiveness and MIDs, we are not able to monitor and interpret PROs over disease or developmental trajectories. Another important but neglected topic is potential response shift: children and adolescents might change their internal standards and interpret the meaning of PROs differently as they experience significant events (e.g., cancer) or move through different stages of neurocognitive development. Schwartz et al. [18] found that response shift does exist in childhood cancer survivors and that this phenomenon tends to overestimate treatment effects over a shorter term (1 week) and underestimate treatment effects over a longer term (3 months). Unfortunately, the prevalence of and mechanisms underlying response shift in pediatric populations are still largely unknown. Research that applies advanced quantitative and qualitative methods is needed to address this puzzle.
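
The distinction between responsiveness and MIDs drawn above can be sketched in Python. The choices here are assumptions for illustration rather than methods prescribed by this editorial or by [17]: the standardized response mean as a responsiveness index, half a baseline standard deviation and the standard error of measurement as distribution-based MID estimates, and the mean change among respondents choosing a hypothetical "a little better" anchor category as an anchor-based estimate. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Responsiveness index: mean change divided by the SD of change."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def distribution_based_mid(baseline, reliability):
    """Two common distribution-based MID estimates from baseline scores."""
    sd = stdev(baseline)
    return {"half_sd": 0.5 * sd, "sem": sd * sqrt(1.0 - reliability)}


def anchor_based_mid(changes, anchors, target="a little better"):
    """Mean score change among respondents in the target anchor category."""
    selected = [c for c, a in zip(changes, anchors) if a == target]
    return mean(selected)


if __name__ == "__main__":
    # Hypothetical PRO scores for five children at two time points.
    baseline = [48, 55, 60, 52, 65]
    follow_up = [52, 57, 66, 52, 75]
    anchors = ["a little better", "a little better", "a little better",
               "no change", "much better"]
    changes = [f - b for b, f in zip(baseline, follow_up)]
    print("SRM:", round(standardized_response_mean(baseline, follow_up), 2))
    print("Distribution-based:", distribution_based_mid(baseline, reliability=0.85))
    print("Anchor-based MID:", anchor_based_mid(changes, anchors))
```

The sketch treats the two questions separately on purpose: an instrument can detect change reliably while the detected change still falls below what patients would regard as meaningful.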
Last but not least, there is an urgent demand for methodology to monitor and compare PROs from childhood to adulthood, especially in children with chronic conditions (e.g., diabetes) or survivors of traumatic events (e.g., cancer, life-limiting conditions). Current PRO measurement systems are challenged by this issue because pediatric and adult instruments emphasize different developmental issues and were not calibrated on the same metric. We truly need developmental and life-course theories to generate a latent continuum of PROs (e.g., social functioning) that depicts different levels of functional status appropriately for children and adults. Such theories should be integrated with advanced psychometric methodology to link and calibrate PRO items and measurement tools across different developmental stages, from childhood to adulthood. Researchers of the National Institutes of Health-sponsored Patient Reported Outcomes Measurement Information System (PROMIS®) [19] are calibrating items in selected domains (e.g., pain behavior and physical functioning) across adult and pediatric populations. Although in its early stages, this idea of a common metric for measuring PROs across age groups might be reaching fruition.
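
One way to picture what placing pediatric and adult items "on the same metric" involves is the mean-sigma linking method from IRT, sketched below in Python. This is an illustrative assumption, not a description of how PROMIS performs its calibration: the sketch presumes a set of common items whose difficulty parameters have been estimated separately on a pediatric and an adult scale, and all parameter values are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_source, b_target):
    """Slope A and intercept B that map the source scale onto the target scale.

    b_source and b_target are the difficulty parameters of the same common
    items, estimated separately on each scale.
    """
    slope = pstdev(b_target) / pstdev(b_source)
    intercept = mean(b_target) - slope * mean(b_source)
    return slope, intercept


def rescale_item(a, b, slope, intercept):
    """Express a source-scale item (discrimination a, difficulty b) on the target scale."""
    return a / slope, slope * b + intercept


def rescale_theta(theta, slope, intercept):
    """Express a source-scale trait estimate on the target scale."""
    return slope * theta + intercept


if __name__ == "__main__":
    # Hypothetical difficulties of three common items on each scale.
    pediatric_b = [-1.2, 0.1, 0.9]
    adult_b = [-0.8, 0.6, 1.5]
    slope, intercept = mean_sigma_constants(pediatric_b, adult_b)
    print("slope, intercept:", round(slope, 3), round(intercept, 3))
    print("pediatric theta 0.0 on adult scale:",
          round(rescale_theta(0.0, slope, intercept), 3))
```

A linear transformation of this kind only aligns the scales numerically; whether the common items carry the same meaning for children and adults is the substantive question that the developmental and life-course theories mentioned above would need to answer.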

Children’s health is our society’s ultimate treasure. Instrument development for pediatric PROs is an ongoing endeavor. As PRO researchers, we are passionate and optimistic about the future, but we still have a long way to go.