Open Access 25-11-2024

Myelofibrosis symptom assessment form total symptom score version 4.0: measurement properties from the MOMENTUM phase 3 study

Authors: Christina Daskalopoulou, Boris Gorsh, Gerasimos Dumi, Samineh Deheshi, Chad Gwaltney, Jean Paty, Catherine Ellis, Jun Kawashima, Ruben Mesa

Published in: Quality of Life Research

Abstract

Purpose

The Myelofibrosis Symptom Assessment Form version 4.0 (MFSAF v4.0) comprises 7 common MF symptom items (fatigue, night sweats, pruritus, abdominal discomfort, pain under the left ribs, early satiety, bone pain) and is the first patient-reported outcome (PRO) instrument designed to assess MF symptom burden. Given that information on the psychometric properties of this instrument has been limited, we sought to evaluate its measurement properties and validate its use in the phase 3 MOMENTUM trial.

Methods

Data were pooled to assess MFSAF item distribution, structural validity, reliability (test-retest and internal consistency), construct validity (convergent, divergent, and known-groups), and sensitivity to change. Other PRO measures included Patient Global Impression of Severity/Change (PGIS/PGIC), EORTC QLQ-C30, PROMIS Physical Function Short Form 10b, and ECOG performance status.

Results

Participants (N = 195) showed high completion rates (> 93%) across 24 weeks. Moderate to strong Spearman correlation coefficients among items were mostly observed at baseline (range, 0.289–0.772) and week 24 (range, 0.391–0.829), which supported combining items into a multi-item scale and total score. Internal consistency (Cronbach’s α, 0.877 at baseline and 0.903 at week 24) and test-retest reliability (intraclass correlation coefficient, ≥ 0.829) were satisfactory across selected time intervals. Reliability was also supported by McDonald’s omega (ω) coefficient (≥ 0.875). MFSAF moderately correlated with PRO measures of similar content, differentiated between PGIS and ECOG groups (P < 0.001), and was able to detect change over time.

Conclusions

The MFSAF v4.0 is a valid tool to assess MF symptom burden, supporting its use in future trials in similar populations.

Notes

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1007/s11136-024-03855-1.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Myelofibrosis (MF) is a debilitating myeloproliferative neoplasm originating in hematopoietic stem cells [1]. The resulting clinical manifestations of MF are heterogeneous, with most patients presenting with symptoms associated with anemia and splenomegaly [2–4]. MF is rare, with an incidence of 0.1 to 1 per 100,000 individuals per year, but has a higher prevalence of 6 per 100,000 person-years due to its chronic nature and disabling course [1, 3]. The median age at diagnosis is 67 years [3], and the median survival for all patients with MF is approximately 6 years [5, 6].
Stem cell transplant is potentially curative but associated with high rates of morbidity and mortality [1, 7]. For the majority, alleviating symptoms, reducing clinical complications, and slowing progression are key treatment goals. Recently, the US Food and Drug Administration (FDA) approved momelotinib for the treatment of adult patients with intermediate- or high-risk MF, including primary or secondary MF, and anemia [8]. Momelotinib demonstrated clinically meaningful benefits in MF-associated symptoms, anemia measures, and splenomegaly vs danazol in patients with Janus kinase (JAK) inhibitor–experienced MF who were anemic (hemoglobin < 10 g/dL) and symptomatic (Total Symptom Score [TSS] ≥ 10 at screening) in the phase 3 MOMENTUM trial (NCT04173494) [9]. The primary endpoint of MOMENTUM was the Myelofibrosis Symptom Assessment Form (MFSAF) TSS response rate at week 24 (defined as ≥ 50% reduction in mean MFSAF TSS over the 28 days immediately before the end of week 24 compared with baseline) [9].
Considering the symptom burden of patients with MF, evaluation of symptoms via patient-reported outcome (PRO) measures is critical [10]. The PRO Consortium’s MF Working Group was established to review existing questionnaires and develop a consensus-based PRO questionnaire for future MF trials. The resulting MFSAF version 4.0 (v4.0) comprises 7 symptom items (fatigue, night sweats, pruritus, abdominal discomfort, pain under the left ribs, early satiety, bone pain) and produces a TSS, calculated as the sum of the 7 individual item responses, to assess MF symptom severity [11].
Investigation of the MFSAF TSS v4.0 psychometric properties has been limited. This analysis aimed to provide preliminary evaluation of these properties and validate MFSAF TSS v4.0 use as a trial endpoint in the MOMENTUM study. Intent-to-treat (ITT) data from baseline and weeks 4, 8, 12, 16, 20, and 24 were used. The following were examined: instrument completion rates; descriptive analyses; confirmatory factor analysis (CFA) to assess the unidimensionality of the instrument; item-to-item and item-to-total correlations to evaluate the hypothesized relationships within the MFSAF’s scale; internal consistency reliability to assess the degree to which responses are consistent across the items of the multi-item scale score; test-retest reliability to evaluate score reproducibility; construct validity to establish that the MFSAF TSS measures the construct of interest; and sensitivity to change to demonstrate that health-related changes in participants’ status are reflected in changes in the MFSAF TSS.

Methods

MOMENTUM study design

MOMENTUM (N = 195) is an international, randomized, double-blind, phase 3 study. Patients were ≥ 18 years, anemic (hemoglobin < 10 g/dL), and symptomatic (TSS ≥ 10 at screening) with a confirmed diagnosis of primary or secondary MF previously treated with approved JAK inhibitor therapy. The main objective was to evaluate the efficacy of momelotinib vs danazol as assessed by improvement in MFSAF TSS v4.0. Additionally, a range of PRO instruments assessing patient-perceived disease severity and impact were administered [9].

Clinical outcome assessment instruments

MFSAF TSS v4.0

The MFSAF TSS v4.0, a daily diary-based measure with a 24-hour recall interval, comprises 7 symptoms of MF: fatigue, night sweats, pruritus, abdominal discomfort, pain under the left ribs, early satiety, and bone pain [11]. The MFSAF TSS v4.0 was completed electronically, with each symptom item assessed on an 11-point (0–10) scale and summed to create the TSS. TSS ranges from 0 to 70, with higher scores corresponding to more severe symptoms. Baseline MFSAF TSS was averaged across 7 consecutive days (baseline days 1–7) before randomization. If > 3 daily MFSAF TSS results were missing, the baseline score was considered missing. At post-baseline weeks, the MFSAF TSS was averaged from the daily MFSAF TSS from a consecutive 28-day period before the week considered. If < 20 daily measurements were available, MFSAF TSS was set to missing for the time point considered.
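
The scoring and missing-data rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the trial's analysis code: the function names are hypothetical, and a missing diary day is assumed to be represented as `None`.

```python
from statistics import mean
from typing import Optional, Sequence

N_ITEMS = 7  # fatigue, night sweats, pruritus, abdominal discomfort,
             # pain under the left ribs, early satiety, bone pain


def daily_tss(items: Sequence[int]) -> int:
    """Sum the 7 item scores (each 0-10) into a daily TSS (0-70)."""
    if len(items) != N_ITEMS or any(not 0 <= s <= 10 for s in items):
        raise ValueError("expected 7 item scores, each between 0 and 10")
    return sum(items)


def window_mean(daily: Sequence[Optional[float]], min_days: int) -> Optional[float]:
    """Average the completed days; return None if fewer than min_days exist."""
    observed = [d for d in daily if d is not None]
    if len(observed) < min_days:
        return None
    return mean(observed)


def baseline_tss(daily: Sequence[Optional[float]]) -> Optional[float]:
    """7-day baseline window; missing if > 3 days are missing (needs >= 4)."""
    if len(daily) != 7:
        raise ValueError("baseline window must contain 7 days")
    return window_mean(daily, min_days=4)


def post_baseline_tss(daily: Sequence[Optional[float]]) -> Optional[float]:
    """28-day window before a given week; missing if < 20 days are available."""
    if len(daily) != 28:
        raise ValueError("post-baseline window must contain 28 days")
    return window_mean(daily, min_days=20)
```

For example, a baseline week with 4 completed days yields an average, whereas a week with only 3 completed days is set to missing.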

European Organisation for Research and Treatment of Cancer QLQ-C30 (EORTC QLQ-C30)

The EORTC QLQ-C30 [12] comprises 5 functional scales (physical, role, emotional, social, cognitive), 8 single-item symptom scales (fatigue, pain, nausea/vomiting, appetite loss, constipation, diarrhea, insomnia, dyspnea), and a global health status/quality of life (QOL) scale and financial impact. Most items use a 4-point scale from “not at all” to “very much” and a 1-week recall period; global health status/QOL uses a 7-point scale. Raw scores are transformed to a 0 to 100 scale, with higher scores representing better functioning/QOL and higher symptomatology. The EORTC QLQ-C30 was completed electronically at baseline before randomization and during weeks 12 and 24 of the randomized treatment period.
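
The article does not state the transformation formula. The sketch below assumes the standard linear transformation from the EORTC scoring manual, in which the raw score is the mean of a scale's items and is rescaled to 0–100, with functional scales reversed so that higher scores mean better functioning. The function name and arguments are hypothetical.

```python
from statistics import mean
from typing import Sequence


def eortc_scale_score(item_responses: Sequence[int],
                      item_range: int = 3,
                      functional: bool = False) -> float:
    """Linear 0-100 transformation (assumed from the EORTC scoring manual).

    item_range is the maximum minus the minimum item response:
    3 for the 4-point items, 6 for the 7-point global health status/QOL items.
    """
    raw = mean(item_responses)          # raw score = mean of the scale's items
    scaled = (raw - 1) / item_range     # 0 = lowest response, 1 = highest
    if functional:
        scaled = 1 - scaled             # reverse: higher = better functioning
    return scaled * 100
```

With this convention, answering "not at all" to every item of a functional scale gives 100 (best functioning), while answering "very much" to every item of a symptom scale gives 100 (highest symptomatology).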

Patient-Reported Outcomes Measurement Information System (PROMIS) physical functioning items

The PROMIS physical functioning item bank includes several items assessing universal physical function [13]. The PROMIS Physical Function Short Form 10b comprises 10 questions, for which a total ranging from 10 to 50 is calculated (higher score corresponds to lower levels of physical functioning), and 4 additional questions related to physical function from the PROMIS item bank: “Are you able to climb several flights of stairs?”, “Does your health now limit you in lifting or carrying groceries?”, “Does your health now limit you in going for a short walk (less than 15 minutes)?”, and “How much difficulty do you have doing your daily physical activities, because of your health?”. The PROMIS-physical functioning questionnaire was completed at baseline before randomization and at weeks 2, 4, 8, 12, 16, 20, and 24.

Patient Global Impression of Severity (PGIS) and Patient Global Impression of Change (PGIC)

The PGIS and PGIC are single items that capture a patient’s overall perception of symptom severity and treatment benefit, respectively [14]. The PGIS consisted of a single question relating to MF symptoms (PGIS-Symptoms [PGIS-S]) and another relating to fatigue (PGIS-Fatigue [PGIS-F]), each with a 4-point response including “none”, “mild”, “moderate”, and “severe”. Similarly, the PGIC consisted of a single question relating to MF symptoms (PGIC-S) and another relating to fatigue (PGIC-F), each with a 5-point response including “much improved”, “minimally improved”, “no change”, “minimally worse”, and “much worse”. The PGIS was collected at baseline before randomization and at weeks 2, 4, 8, 12, 16, 20, and 24. The PGIC was collected at weeks 12 and 24.

EuroQoL 5D-5L (EQ-5D-5L)

The EQ-5D-5L is a questionnaire measuring QOL on each of 5 dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression [15]. Each dimension is rated on a 5-level scale: no problems, slight problems, moderate problems, severe problems, and extreme problems. This tool also has an overall health scale where the rater selects a number between 0 and 100 to describe the condition of their health, with 100 being the best imaginable. The EQ-5D-5L was completed at baseline before randomization and at weeks 12 and 24.

Dynamic International Prognostic Scoring System (DIPSS), DIPSS-plus

DIPSS is a prognostic scoring system, validated by Passamonti et al. [16], which evaluates MF based on 5 risk factors: age, white blood cell count, hemoglobin level, peripheral blood blasts, and constitutional symptoms. The sum of the DIPSS scores categorizes patients as low (0 points), intermediate-1 (1–2 points), intermediate-2 (3–4 points), and high (5–6 points) risk.
The DIPSS-plus score, developed and validated by Gangat et al. [17], adds DIPSS-independent risk factors such as karyotype, transfusion dependency, and platelet count. The sum of the DIPSS-plus scores allows categorization of patients as low (0 points), intermediate-1 (1 point), intermediate-2 (2–3 points), and high (4–6 points) risk. DIPSS and DIPSS-plus assessments were each taken as single assessments before baseline.
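
The two risk categorizations above amount to simple lookups on the summed points. A minimal sketch follows, with hypothetical function names. It covers only the mapping from total points to risk category as stated in the text, not the assignment of points to individual risk factors.

```python
def dipss_category(points: int) -> str:
    """Map a DIPSS point total (0-6) to its risk category."""
    if not 0 <= points <= 6:
        raise ValueError("DIPSS total must be between 0 and 6")
    if points == 0:
        return "low"
    if points <= 2:
        return "intermediate-1"
    if points <= 4:
        return "intermediate-2"
    return "high"


def dipss_plus_category(points: int) -> str:
    """Map a DIPSS-plus point total (0-6) to its risk category."""
    if not 0 <= points <= 6:
        raise ValueError("DIPSS-plus total must be between 0 and 6")
    if points == 0:
        return "low"
    if points == 1:
        return "intermediate-1"
    if points <= 3:
        return "intermediate-2"
    return "high"
```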

Eastern Cooperative Oncology Group performance status (ECOG PS)

ECOG PS provides criteria for a patient’s level of functioning with respect to ambulatory status, daily activity, and physical ability and is graded on a scale from 0 to 5, with 0 representing normal activity [18]. ECOG PS was assessed at screening, baseline, and weeks 2, 4, 8, 12, 16, 20, and 24.

Analysis population

Analyses were performed on the ITT population, which included all randomized patients pooled across arms. No patients had missing baseline scores.

Statistical methodology

PRO completion

MFSAF completion was defined as reporting a score for all 7 symptoms. Instrument completion rate was calculated at baseline and weeks 4, 8, 12, 16, 20, and 24. The evaluation looked at completion on the required 4 of 7 days during a 1-week period at baseline and on the required 20 of 28 days during a 28-day period before the end of each week post baseline.

Descriptive analyses

Evaluation of potential floor and ceiling effects was supported by a cutoff value based on the expectation that items would be normally distributed. Pole values greater than what would be expected for a uniform distribution (100/k%, where k is the number of response options) indicated the existence of floor/ceiling effects. This 100/k method, or a close approximation, is often applied for measures with ordinal categories (e.g., the 15% threshold specified for a measure with 6 response options by McHorney and colleagues [19]). Considering the MFSAF instrument’s 11 response options (0–10), for which 100/k ≈ 9.1%, the employed cutoff value was 10%. A floor or ceiling effect on an item was considered present when more than 10% of patients endorsed the lowest or highest score for that item, respectively.
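
The 100/k rule can be illustrated with a short Python sketch. The function names are hypothetical, and the example assumes one integer response (0–10) per patient for a single item, which simplifies the averaged diary scores used in the study.

```python
from collections import Counter
from typing import Sequence, Tuple


def uniform_expectation(k: int) -> float:
    """Percentage expected at each response option under a uniform distribution."""
    return 100 / k


def pole_percentages(responses: Sequence[int],
                     lowest: int = 0,
                     highest: int = 10) -> Tuple[float, float]:
    """Percentage of patients endorsing the lowest and the highest option."""
    counts = Counter(responses)
    n = len(responses)
    return 100 * counts[lowest] / n, 100 * counts[highest] / n


def floor_ceiling_flags(responses: Sequence[int],
                        cutoff_pct: float = 10.0) -> Tuple[bool, bool]:
    """Flag a floor/ceiling effect when a pole exceeds the cutoff percentage."""
    floor_pct, ceiling_pct = pole_percentages(responses)
    return floor_pct > cutoff_pct, ceiling_pct > cutoff_pct
```

For 11 response options, `uniform_expectation(11)` is about 9.1%, which the analysis rounds to the 10% cutoff. An item where 21% of patients report "0" would be flagged for a floor effect.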

Structural validity (CFA, item-to-item and item-to-total correlation)

Due to the reflective nature of the MFSAF instrument, CFA was performed to investigate unidimensionality (see Supplementary Materials for additional information).
Item-to-item correlation analyses at baseline and week 24 were also performed to evaluate the degree of association between items and whether their bivariate distributions were consistent with expectations. Spearman rank correlation coefficients > 0.40 supported combining items into a multi-item scale [20].
The correlation between each individual item and the total score omitting the item was assessed at baseline and week 24 to examine the consistency of item behavior relative to the total score. Coefficients ≥ 0.50 for correlation between items and the total score indicate a contribution of information, while coefficients ≥ 0.90 generally suggest potential redundancy [20].
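
As a rough illustration of the two correlation analyses above, the sketch below computes a Spearman coefficient (Pearson correlation of average ranks) and a corrected item-to-total correlation, in which the total omits the item being examined. It is written in plain Python with hypothetical function names. The data layout (`items[i][p]` is the score of patient `p` on item `i`) is an assumption for the example.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def corrected_item_total(items: Sequence[Sequence[float]], index: int) -> float:
    """Spearman correlation of one item with the total of all other items."""
    n_patients = len(items[0])
    rest_total = [
        sum(col[p] for i, col in enumerate(items) if i != index)
        for p in range(n_patients)
    ]
    return spearman(items[index], rest_total)
```

Under the thresholds stated above, an item-to-item coefficient > 0.40 would support combining the items, and a corrected item-to-total coefficient between 0.50 and 0.90 would suggest the item contributes information without being redundant.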

Test-retest reliability

Test-retest reliability was assessed using all patient data from the screening and baseline periods as stability was hypothesized since no drug administration took place. In addition, stable patients were identified with the aid of PGIS (i.e., PGIS-S, PGIS-F) as an anchor. The 2-way mixed, absolute agreement, single-measure intraclass correlation coefficient (ICC) was used [21]. ICC values together with their 95% confidence intervals (CIs) are usually interpreted within research or clinical applications as follows: values < 0.50 indicate poor reliability, values between 0.50 and 0.75 indicate moderate reliability, values between 0.75 and 0.90 indicate good reliability, and values > 0.90 indicate excellent reliability [22].
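
A minimal sketch of the single-measure, absolute-agreement ICC follows. It is computed from two-way ANOVA mean squares, assuming the McGraw and Wong ICC(A,1) formulation. The article does not give its computational details, so the function names and the handling of values exactly on a category boundary are assumptions.

```python
from typing import Sequence


def icc_a1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(A,1): two-way model, absolute agreement, single measure.

    scores[p] holds the k repeated measurements for patient p,
    e.g. [test_score, retest_score].
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(value: float) -> str:
    """Reliability label per the thresholds cited in the text [22]."""
    if value < 0.50:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.90:   # boundary handling is an assumption
        return "good"
    return "excellent"
```

Because the absolute-agreement form penalizes systematic shifts between occasions, a constant 5-point increase at retest lowers the ICC even though the rank order of patients is unchanged.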

Internal consistency reliability

Internal consistency (specified via Cronbach’s α, ≥ 0.70) and its corresponding 95% CI were calculated for the MFSAF TSS at baseline and week 24. “Cronbach’s α if item deleted” was also calculated. An increase in the coefficient α resulting from removal of an item indicates that the item might be detrimental to inter-item consistency and a candidate for elimination. In addition, reliability was examined via McDonald’s omega (ω) coefficient (see Supplementary Methods for additional information).
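
Cronbach’s α and the “α if item deleted” diagnostic can be sketched as below, using the usual variance-based formula α = k/(k − 1) × (1 − Σ item variances / variance of the total). The function names and data layout (`items[i][p]` is the score of patient `p` on item `i`) are hypothetical. Confidence intervals and McDonald’s ω are not covered.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha from sample variances of the items and of their sum."""
    k = len(items)
    n_patients = len(items[0])
    totals = [sum(col[p] for col in items) for p in range(n_patients)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items: Sequence[Sequence[float]]) -> List[float]:
    """Alpha recomputed with each item left out in turn."""
    items = list(items)
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]
```

An item whose removal raises α above the full-scale value would be a candidate for closer inspection, as described above.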

Construct validity: convergent and divergent

Correlation coefficients were estimated to evaluate convergent and divergent validity for the MFSAF TSS. Moderate to strong correlations, defined as those ≥ 0.50 [23] in absolute value, were hypothesized a priori between the MFSAF TSS and other theoretically related PROs collected within the trial in order to provide evidence of convergent validity. Lower correlations (< 0.50 in absolute value) were hypothesized between the MFSAF TSS and theoretically less related PROs, aiming to provide evidence of divergent validity. More specifically, moderate to strong correlations were hypothesized between the MFSAF TSS and select EORTC QLQ-C30 scales (fatigue, pain, insomnia, global health status/QOL, physical functioning, and role functioning); the PROMIS (total score and 4 additional items); and the EQ-5D-5L (pain/discomfort and usual activities scores). Lower correlations were expected between the MFSAF TSS and other EORTC QLQ-C30 scales (emotional and cognitive functioning).

Construct validity: known-groups

Known-groups validity was evaluated via analysis of variance (ANOVA) with α = 0.05 by comparing mean scores across PGIS-S (none, mild, moderate, severe), PGIS-F (none, mild, moderate, severe), and ECOG PS (0–5) groups at baseline and week 24. Comparison of mean scores across DIPSS and DIPSS-plus groups at baseline was also tested and is presented in the Supplementary Materials. Evidence of known-groups validity would be supported by identification of monotonically increasing mean MFSAF TSS across the above-examined groups. More specifically, our hypotheses stated that patients in the severe groups (in terms of symptom severity and/or functional impairment) would show higher MFSAF TSS (indicative of higher symptomatology). Cohen’s d effect size (ES) value was also calculated for MFSAF TSS between each pair of the above defined known groups. The magnitude of the d value was interpreted as small for values of 0.2 ≤ d < 0.5, moderate for 0.5 ≤ d < 0.8, and large for values ≥ 0.8 [24]. Moderate to large ESs were expected for the above-defined known groups. Refer to Supplementary Materials for additional details.
All hypotheses for construct validity (convergent/divergent validity and known-groups validity) are based on previous PRO studies, which have demonstrated that patients with higher disease burden (as measured by higher risk categories and poorer ECOG PS) report worse QOL and more symptoms, such as fatigue and pain, as assessed via prior MFSAF versions and other PROs [25, 26]. Therefore, patients with more advanced disease seem to experience symptoms more intensely, and the hypotheses of the present analysis reflect the expectation that this burden will align with higher MFSAF scores.
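
The effect-size interpretation used for the known-groups comparisons can be illustrated as follows. The sketch assumes Cohen’s d is computed with a pooled standard deviation. The article defers the exact computation to the Supplementary Materials, so this choice, the function names, and the "negligible" label for |d| < 0.2 are assumptions.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d (group_b minus group_a) using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)


def interpret_d(d: float) -> str:
    """Label the magnitude of d with the thresholds given in the text [24]."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed; the text defines no category below 0.2
```

For instance, `interpret_d(0.62)` returns "moderate" and `interpret_d(1.13)` returns "large".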

Sensitivity to change

Sensitivity to change was examined using the analysis of covariance (ANCOVA) at baseline, week 12, and week 24. Anchors were used to define responder groups, with the PGIS-S and PGIS-F (prospectively measured items) serving as the primary anchoring method, and the PGIC-S and PGIC-F (retrospectively measured items) serving as a supportive method. The dependent variable was the change from baseline at the respective time point, and the model included a “responder” factor as a fixed factor. Both within- and between-groups ES was estimated. The hypothesis was that patients indicating symptom improvement (based on the PGIS-S, PGIC-S) and fatigue improvement (based on the PGIS-F, PGIC-F) would report higher mean MFSAF TSS change from baseline (in absolute terms) compared with patients who reported no change or worsening, with moderate to high between-groups ES (as defined above). See Supplementary Materials for additional information.

Results

PRO completion

At baseline, 195 patients were enrolled in the study. Baseline characteristics are provided in Supplementary Table 1. Completion was high at baseline (100%) and at key analysis time points (weeks 12 and 24). Overall completion rate ranged from 92.2% at week 4 to 100% at baseline (Supplementary Table 2).

Descriptive analyses

Floor effects were observed during the baseline period for item 2 (night sweats), item 3 (itching), item 5 (pain under ribs [left]), and item 7 (bone pain [not joint or arthritis]), as the lowest response option “0” (no symptoms) was endorsed by 21.0%, 27.2%, 19.5%, and 21.0% of patients, respectively. Floor effects became more pronounced at later time points, as the percentages for the 4 aforementioned items increased (28.4–35.8% at week 12, 26.4–35.7% at week 24). At weeks 12 and 24, item 4 (abdominal discomfort) and item 6 (early satiety) also showed floor effects, with the percentage of patients endorsing response option “0” being 12.3% and 10.5% at week 12, respectively, and 17.1% and 15.5% at week 24, respectively (Supplementary Table 3).
Table 1 presents the summary statistics of MFSAF items and TSS at baseline and weeks 12 and 24. At baseline, item 3 (itching) (mean = 2.9), item 5 (pain under ribs [left]) (mean = 3.1), and item 7 (bone pain [not joint or arthritis]) (mean = 3.1) had the lowest means (least symptom severity). Items with the highest means (highest symptom severity) at baseline were item 1 (fatigue) (mean = 6.1) and item 6 (early satiety) (mean = 4.5). Overall, mean item scores decreased over time, indicating symptom improvement (range: 2.9–6.1 at baseline, 1.8–4.5 at week 12, 1.7–4.4 at week 24). The MFSAF TSS mean score was 27.2 at baseline and decreased to 19.1 at week 12 and 18.7 at week 24. Skewness at baseline was > 0.5 for item 3 (itching), item 5 (pain under ribs [left]), and item 7 (bone pain [not joint or arthritis]), indicating moderately skewed data. At the subsequent time points, moderate skewness was present across all items except item 1 (fatigue). Supplementary Fig. 1 depicts item response distribution over time.
Table 1
Summary statistics for MFSAF items and TSS at BL and weeks 12 and 24 – ITT population

| Item | Mean (SD) | Median | Q1/Q3 | Min/Max | Skewness |
|---|---|---|---|---|---|
| Time point: BL (N = 195)^a | | | | | |
| 1: Worst fatigue | 6.1 (2.13) | 6.2 | 4.6/7.8 | 0.9/10.0 | −0.27 |
| 2: Worst night sweats | 3.3 (2.72) | 2.8 | 0.9/5.4 | 0.0/10.0 | 0.44 |
| 3: Worst itching | 2.9 (2.71) | 2.3 | 0.4/5.0 | 0.0/10.0 | 0.73 |
| 4: Worst abdominal discomfort | 4.3 (2.59) | 4.3 | 2.3/6.3 | 0.0/10.0 | 0.05 |
| 5: Worst pain under ribs (left) | 3.1 (2.62) | 2.7 | 0.8/4.9 | 0.0/9.9 | 0.60 |
| 6: Worst early satiety | 4.5 (2.45) | 4.6 | 2.4/6.3 | 0.0/9.7 | 0.13 |
| 7: Worst bone pain (not joint or arthritis) | 3.1 (2.56) | 2.8 | 1.0/4.9 | 0.0/9.3 | 0.53 |
| MFSAF TSS | 27.2 (13.51) | 25.4 | 16.3/36.9 | 4.9/67.7 | 0.49 |
| Time point: week 12 (n = 162)^b | | | | | |
| 1: Worst fatigue | 4.5 (2.32) | 4.4 | 2.9/6.1 | 0.0/10.0 | 0.23 |
| 2: Worst night sweats | 2.0 (2.06) | 1.5 | 0.1/3.3 | 0.0/9.1 | 1.03 |
| 3: Worst itching | 1.8 (1.77) | 1.3 | 0.0/2.9 | 0.0/8.3 | 0.95 |
| 4: Worst abdominal discomfort | 3.0 (2.18) | 2.8 | 1.2/4.6 | 0.0/10.0 | 0.60 |
| 5: Worst pain under ribs (left) | 2.0 (2.14) | 1.4 | 0.1/3.1 | 0.0/8.4 | 1.05 |
| 6: Worst early satiety | 3.2 (2.25) | 2.8 | 1.3/4.7 | 0.0/10.0 | 0.65 |
| 7: Worst bone pain (not joint or arthritis) | 2.6 (2.46) | 2.1 | 0.0/4.1 | 0.0/10.0 | 0.77 |
| MFSAF TSS | 19.1 (11.81) | 16.3 | 10.9/26.5 | 0.0/56.5 | 0.84 |
| Time point: week 24 (n = 129)^b | | | | | |
| 1: Worst fatigue | 4.4 (2.36) | 4.0 | 3.0/6.1 | 0.0/10.0 | 0.33 |
| 2: Worst night sweats | 2.1 (2.28) | 1.3 | 0.0/3.2 | 0.0/9.2 | 1.16 |
| 3: Worst itching | 1.7 (1.89) | 1.1 | 0.0/2.6 | 0.0/9.3 | 1.45 |
| 4: Worst abdominal discomfort | 2.8 (2.16) | 2.6 | 1.1/4.3 | 0.0/10.0 | 0.71 |
| 5: Worst pain under ribs (left) | 1.9 (2.10) | 1.2 | 0.1/3.0 | 0.0/7.6 | 1.03 |
| 6: Worst early satiety | 2.9 (2.21) | 2.4 | 1.3/4.7 | 0.0/10.0 | 0.71 |
| 7: Worst bone pain (not joint or arthritis) | 2.8 (2.55) | 2.4 | 0.3/4.4 | 0.0/10.0 | 0.77 |
| MFSAF TSS | 18.7 (12.40) | 16.2 | 9.2/24.7 | 0.1/57.9 | 1.06 |

BL indicates baseline; ITT, intent to treat; max, maximum; MFSAF, Myelofibrosis Symptom Assessment Form; min, minimum; Q1/Q3, first/third quartile; SD, standard deviation; TSS, Total Symptom Score
^a Item response score for BL is the average of the item’s daily score reported during 7 consecutive days (baseline days 1–7) before randomization
^b Item response score for weeks 12 and 24 is the average of the item’s daily score reported during 28 consecutive days before the end of that week

Structural validity (CFA, item-to-item and item-to-total correlation)

The results of CFA revealed that the comparative fit index (CFI) and the standardized root mean square residual (SRMR) values were acceptable for the assumed unidimensional model (i.e., SRMR value equal to 0.05, CFI value close to 0.95, and root mean square error of approximation value equal to 0.11) (Supplementary Table 4 and Supplementary Fig. 2).
Spearman correlation coefficients were mostly moderate to strong at baseline, ranging from 0.384 to 0.772. Exceptions include item 1 (fatigue) with item 3 (itching) (r = 0.289), and item 3 (itching) with item 6 (early satiety) (r = 0.298). Week 24 correlations were generally higher, and all but one exceeded the 0.40 threshold (range: 0.391–0.829) (Table 2).
Table 2
Item-to-item and item-to-total correlations of MFSAF items at BL and week 24 – ITT population^a

| Item | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Item 6 | Item 7 | Item total^b |
|---|---|---|---|---|---|---|---|---|
| Time point: BL (N = 195)^c | | | | | | | | |
| 1: Worst fatigue | 1.000 | | | | | | | 0.574 |
| 2: Worst night sweats | 0.399 | 1.000 | | | | | | 0.618 |
| 3: Worst itching | 0.289 | 0.436 | 1.000 | | | | | 0.473 |
| 4: Worst abdominal discomfort | 0.509 | 0.526 | 0.384 | 1.000 | | | | 0.779 |
| 5: Worst pain under ribs (left) | 0.405 | 0.534 | 0.398 | 0.752 | 1.000 | | | 0.734 |
| 6: Worst early satiety | 0.604 | 0.460 | 0.298 | 0.772 | 0.600 | 1.000 | | 0.716 |
| 7: Worst bone pain (not joint or arthritis) | 0.415 | 0.511 | 0.444 | 0.508 | 0.565 | 0.478 | 1.000 | 0.625 |
| Time point: week 24 (n = 129)^d | | | | | | | | |
| 1: Worst fatigue | 1.000 | | | | | | | 0.618 |
| 2: Worst night sweats | 0.439 | 1.000 | | | | | | 0.612 |
| 3: Worst itching | 0.424 | 0.473 | 1.000 | | | | | 0.579 |
| 4: Worst abdominal discomfort | 0.538 | 0.542 | 0.428 | 1.000 | | | | 0.763 |
| 5: Worst pain under ribs (left) | 0.418 | 0.501 | 0.490 | 0.706 | 1.000 | | | 0.706 |
| 6: Worst early satiety | 0.517 | 0.435 | 0.479 | 0.829 | 0.625 | 1.000 | | 0.716 |
| 7: Worst bone pain (not joint or arthritis) | 0.410 | 0.496 | 0.468 | 0.430 | 0.564 | 0.391 | 1.000 | 0.570 |

BL indicates baseline; ITT, intent to treat; MFSAF, Myelofibrosis Symptom Assessment Form; TSS, Total Symptom Score
^a The presented correlation coefficient values refer to Spearman correlation coefficients
^b Spearman correlation coefficients among each item with the MFSAF TSS omitting that item
^c Item response score and item total score for BL are the average of the item’s daily score reported during 7 consecutive days (baseline days 1–7) before randomization
^d Item response score and item total score for week 24 are the average of the item’s daily score reported during 28 consecutive days before the end of that week

Correlations between each item and the remainder of the total score were moderate to high (Table 2). All item-to-total correlations exceeded the 0.50 threshold except item 3 (itching) at baseline. The item-to-total correlations ranged from 0.473 to 0.779 at baseline and 0.570 to 0.763 at week 24.

Test-retest reliability

Moderate test-retest reliability for the MFSAF TSS score (ICC = 0.645) was observed in all patients across screening and baseline (N = 195). When stable patients across baseline and week 4 were defined with PGIS (N = 82 for PGIS-F and N = 76 for PGIS-S), good reliability (ICC = 0.845 for PGIS-F and ICC = 0.829 for PGIS-S) was observed. Excellent test-retest reliability was demonstrated in the PGIS-defined stable condition between weeks 4 and 8 (ICC = 0.911 for PGIS-F [N = 89] and ICC = 0.915 for PGIS-S [N = 83]) (Table 3).
Table 3
Test-retest reliability of MFSAF TSS – ITT population

| Time point^a | Anchor | N | ICC^b | 95% CI |
|---|---|---|---|---|
| Screening to BL | | 195 | 0.645 | 0.555–0.720 |
| BL to week 4 | PGIS-S | 76 | 0.829 | 0.638–0.909 |
| | PGIS-F | 82 | 0.845 | 0.685–0.915 |
| Weeks 4 to 8 | PGIS-S | 83 | 0.915 | 0.846–0.951 |
| | PGIS-F | 89 | 0.911 | 0.821–0.951 |

BL indicates baseline; ICC, intraclass correlation coefficient; ITT, intent to treat; MFSAF, Myelofibrosis Symptom Assessment Form; PGIS-F, Patient Global Impression of Severity-Fatigue; PGIS-S, Patient Global Impression of Severity-Symptoms; TSS, Total Symptom Score
^a MFSAF TSS for screening is the daily TSS on that assessment day (excluding patients with multiple days of screening assessments), and MFSAF TSS for BL is the average of the daily MFSAF TSS for the period of 7 consecutive days (baseline days 1–7) before randomization. MFSAF TSS for the other specified time points is the average of the daily MFSAF TSS from a period of 28 consecutive days before the end of each week
^b ICC values of 0.50 to 0.90 are considered to represent moderate to good reliability, and values > 0.90 represent excellent reliability [22]

Internal consistency reliability

Internal consistency reliability was calculated at baseline (α = 0.877) and week 24 (α = 0.903) (Table 4). “Cronbach’s α if item deleted” did not lead to an increase for most items. The only exception was MFSAF item 3 (itching) at baseline, for which the increase was minimal. McDonald’s omega (ω) coefficient was equal to 0.875 at baseline and 0.899 at week 24, further supporting the reliability of the instrument.
Table 4
Internal consistency of MFSAF TSS – ITT population

| Statistic | Item | MFSAF TSS (95% CI) |
|---|---|---|
| Time point: BL (N = 195)^a | | |
| Cronbach’s α | Total | 0.877 (0.849–0.902) |
| Cronbach’s α if item deleted | 1: Worst fatigue | 0.867 (0.836–0.893) |
| | 2: Worst night sweats | 0.860 (0.828–0.888) |
| | 3: Worst itching | 0.883 (0.856–0.906) |
| | 4: Worst abdominal discomfort | 0.843 (0.807–0.875) |
| | 5: Worst pain under ribs (left) | 0.850 (0.816–0.880) |
| | 6: Worst early satiety | 0.850 (0.816–0.880) |
| | 7: Worst bone pain (not joint or arthritis) | 0.859 (0.826–0.887) |
| Time point: week 24 (n = 129)^b | | |
| Cronbach’s α | Total | 0.903 (0.875–0.926) |
| Cronbach’s α if item deleted | 1: Worst fatigue | 0.898 (0.868–0.922) |
| | 2: Worst night sweats | 0.888 (0.856–0.915) |
| | 3: Worst itching | 0.893 (0.863–0.919) |
| | 4: Worst abdominal discomfort | 0.878 (0.843–0.908) |
| | 5: Worst pain under ribs (left) | 0.882 (0.848–0.910) |
| | 6: Worst early satiety | 0.882 (0.849–0.911) |
| | 7: Worst bone pain (not joint or arthritis) | 0.897 (0.868–0.922) |

BL indicates baseline; ITT, intent to treat; MFSAF, Myelofibrosis Symptom Assessment Form; TSS, Total Symptom Score
^a Item response scores for BL are the average of the item’s daily score reported during 7 consecutive days (baseline days 1–7) before randomization
^b Item response score for week 24 is the average of the item’s daily score reported during 28 consecutive days before the end of that week

Construct validity: convergent and divergent

Table 5 presents the correlations between the MFSAF TSS and the EORTC QLQ-C30 scale scores, PROMIS Physical Function Short Form 10b Total Score, PROMIS Physical Function 4 additional item scores, and EQ-5D-5L item scores hypothesized to be related to it, at baseline and week 24. EORTC QLQ-C30 Pain and EQ-5D-5L Pain/Discomfort achieved the predefined threshold (correlation higher than 0.5) at both time points and demonstrated the highest associations with MFSAF TSS. At baseline, the correlations between the EORTC QLQ-C30 Cognitive and Emotional Functioning scales and MFSAF TSS were lower than 0.5 in absolute value, as per our initial hypothesis.
Table 5
Construct validity: convergent and divergent validity – ITT population

| Score | Type of correlation | r at BL (n)ᵃ | r at week 24 (n)ᵇ |
|---|---|---|---|
| EORTC QLQ-C30 Global health status/QOL | Pearson | −0.365 (194) | −0.393 (120) |
| EORTC QLQ-C30 Physical Functioning | Pearson | −0.407 (194) | −0.435 (120) |
| EORTC QLQ-C30 Role Functioning | Pearson | −0.357 (194) | −0.345 (120) |
| EORTC QLQ-C30 Emotional Functioning | Pearson | −0.354 (194) | −0.514 (120) |
| EORTC QLQ-C30 Cognitive Functioning | Pearson | −0.244 (194) | −0.324 (120) |
| EORTC QLQ-C30 Fatigue | Pearson | 0.366 (194) | 0.437 (120) |
| EORTC QLQ-C30 Pain | Pearson | 0.597 (194) | 0.525 (120) |
| EORTC QLQ-C30 Insomnia | Pearson | 0.332 (194) | 0.415 (119) |
| PROMIS Physical Function Item: “Are you able to climb several flights of stairs?” | Polyserial | −0.338 (193) | −0.252 (120) |
| PROMIS Physical Function Item: “Does your health now limit you in lifting or carrying groceries?” | Polyserial | −0.371 (193) | −0.382 (120) |
| PROMIS Physical Function Item: “Does your health now limit you in going for a short walk (less than 15 minutes)?” | Polyserial | −0.373 (193) | −0.298 (120) |
| PROMIS Physical Function Item: “How much difficulty do you have doing your daily physical activities, because of your health?” | Polyserial | −0.265 (193) | −0.400 (120) |
| PROMIS Physical Function Short Form 10b Total Score | Pearson | −0.350 (193) | −0.380 (120) |
| EQ-5D-5L Usual Activities | Polyserial | 0.429 (193) | 0.390 (120) |
| EQ-5D-5L Pain/Discomfort | Polyserial | 0.579 (193) | 0.520 (120) |

BL indicates baseline; EORTC, European Organisation for Research and Treatment of Cancer; EQ-5D-5L, EuroQoL Five Dimension 5-Level; ITT, intent to treat; MFSAF, Myelofibrosis Symptom Assessment Form; PROMIS, Patient-Reported Outcomes Measurement Information System; QOL, quality of life; TSS, Total Symptom Score
ᵃ MFSAF TSS for BL is the average of the item’s daily MFSAF TSS for the period of 7 consecutive days (baseline days 1–7) before randomization
ᵇ MFSAF TSS for week 24 is the average of the daily MFSAF TSS from a period of 28 consecutive days before the end of that week
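As an illustration of the threshold check applied to Table 5, the following minimal Python sketch flags which baseline correlations exceed the predefined 0.5 cutoff. The correlation values are copied from Table 5; the dictionary layout, the name `meets_threshold`, and the choice to apply the cutoff to the absolute value of r are assumptions made for illustration and are not taken from the study's analysis code.

```python
# Baseline correlations between MFSAF TSS and selected scores, copied from Table 5.
BASELINE_R = {
    "EORTC QLQ-C30 Global health status/QOL": -0.365,
    "EORTC QLQ-C30 Physical Functioning": -0.407,
    "EORTC QLQ-C30 Role Functioning": -0.357,
    "EORTC QLQ-C30 Emotional Functioning": -0.354,
    "EORTC QLQ-C30 Cognitive Functioning": -0.244,
    "EORTC QLQ-C30 Fatigue": 0.366,
    "EORTC QLQ-C30 Pain": 0.597,
    "EORTC QLQ-C30 Insomnia": 0.332,
    "PROMIS Physical Function Short Form 10b Total Score": -0.350,
    "EQ-5D-5L Usual Activities": 0.429,
    "EQ-5D-5L Pain/Discomfort": 0.579,
}

THRESHOLD = 0.5  # predefined cutoff stated in the text


def meets_threshold(r, threshold=THRESHOLD):
    """Assumption: the cutoff is applied to the absolute correlation."""
    return abs(r) > threshold


# Names of the scores whose baseline correlation passes the cutoff.
flagged = [name for name, r in BASELINE_R.items() if meets_threshold(r)]
print(flagged)
```

Under these assumptions, only the two pain-related scores pass the cutoff at baseline, which matches the statement that EORTC QLQ-C30 Pain and EQ-5D-5L Pain/Discomfort showed the highest associations with the MFSAF TSS.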

Construct validity: known-groups

Results from the ANOVA comparing mean MFSAF TSS between consecutive groups defined by PGIS and ECOG PS are presented in Table 6; similar analyses using DIPSS and DIPSS-plus are presented in Supplementary Table 5.
Table 6
Construct validity: known-group validity – ITT population

| Anchor | Anchor group | n | LS mean (SE)ᵃ | 95% CIᵃ | P valueᵃ | ESᵇ |
|---|---|---|---|---|---|---|
| Time point: BLᶜ | | | | | | |
| PGIS-S categories | None or mild | 35 | 18.42 (1.97) | 14.53–22.30 | < 0.001 | |
| | Moderate | 112 | 25.31 (1.10) | 23.14–27.48 | | 0.62 |
| | Severe | 47 | 38.61 (1.70) | 35.26–41.96 | | 1.13 |
| PGIS-F categories | None or mild | 26 | 18.58 (2.45) | 13.74–23.42 | < 0.001 | |
| | Moderate | 99 | 25.10 (1.26) | 22.62–27.58 | | 0.57 |
| | Severe | 69 | 33.70 (1.51) | 30.73–36.67 | | 0.68 |
| ECOG PS | 0 (fully active) | 31 | 21.91 (2.33) | 17.31–26.51 | < 0.001 | |
| | 1 (restricted in strenuous activity) | 117 | 25.99 (1.20) | 23.62–28.35 | | 0.33 |
| | 2 (ambulatory and capable of self-care) | 47 | 33.73 (1.89) | 30.00–37.47 | | 0.59 |
| | 3 (capable of limited self-care) | 0 | | | | |
| | 4 (completely disabled) | 0 | | | | |
| | 5 (dead) | 0 | | | | |
| Time point: week 24ᵈ | | | | | | |
| PGIS-S categories | None or mild | 51 | 12.00 (1.59) | 8.85–15.16 | < 0.001 | |
| | Moderate or severe | 70 | 23.48 (1.36) | 20.78–26.17 | | 1.01 |
| PGIS-F categories | None or mild | 36 | 10.83 (1.90) | 7.07–14.59 | < 0.001 | |
| | Moderate | 64 | 20.10 (1.42) | 17.28–22.92 | | 0.91 |
| | Severe | 21 | 27.59 (2.48) | 22.67–32.51 | | 0.60 |
| ECOG PS | 0 (fully active) | 34 | 15.38 (2.14) | 11.14–19.61 | 0.164 | |
| | 1 (restricted in strenuous activity) | 68 | 19.16 (1.51) | 16.17–22.15 | | 0.31 |
| | 2 (ambulatory and capable of self-care) | 22 | 22.55 (2.66) | 17.29–27.81 | | 0.26 |
| | 3 (capable of limited self-care) | 0 | | | | |
| | 4 (completely disabled) | 0 | | | | |
| | 5 (dead) | 0 | | | | |

ANOVA indicates analysis of variance; BL, baseline; ECOG PS, Eastern Cooperative Oncology Group performance status; ES, effect size; ITT, intent to treat; LS, least squares; MFSAF, Myelofibrosis Symptom Assessment Form; PGIS-F, Patient Global Impression of Severity-Fatigue; PGIS-S, Patient Global Impression of Severity-Symptoms; SE, standard error; TSS, Total Symptom Score
ᵃ LS mean, SE, CI, and P value are produced with an ANOVA model for BL MFSAF TSS with the anchor group (categorical) as the covariate. If the sample size of a particular category group is < 15 patients, adjacent groups are combined. A nonparametric alternative to ANOVA (Kruskal-Wallis test) is conducted for the P value when the sample size is low (i.e., < 30 per category)
ᵇ ES is calculated as the mean difference in MFSAF TSS between groups divided by the pooled standard deviation of those groups
ᶜ MFSAF TSS for BL is the average of the item’s daily MFSAF TSS for the period of 7 consecutive days (baseline days 1–7) before randomization
ᵈ MFSAF TSS for week 24 is the average of the daily MFSAF TSS from a period of 28 consecutive days before the end of each week
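The group-combining rule in the Table 6 footnote (adjacent groups are combined when a category has fewer than 15 patients) can be sketched as follows. This is only an illustration: the footnote does not say which neighbour absorbs a small group, so merging with the preceding category (or the following one when the small group is first) is an assumption, as are the function name `combine_small_groups` and the example counts, which are hypothetical and not study data.

```python
MIN_GROUP_SIZE = 15  # threshold stated in the Table 6 footnote


def combine_small_groups(groups, min_size=MIN_GROUP_SIZE):
    """Merge ordered (label, n) groups until each has at least min_size members.

    Assumption: a small group is merged with the neighbour that precedes it,
    or with the one that follows it if it is the first group.
    """
    merged = [list(g) for g in groups]
    i = 0
    while i < len(merged) and len(merged) > 1:
        if merged[i][1] >= min_size:
            i += 1
            continue
        j = i - 1 if i > 0 else i + 1  # neighbour that absorbs the small group
        lo, hi = min(i, j), max(i, j)
        merged[lo] = [
            f"{merged[lo][0]} or {merged[hi][0]}",
            merged[lo][1] + merged[hi][1],
        ]
        del merged[hi]
        i = lo  # re-check the newly merged group
    return [tuple(g) for g in merged]


# Hypothetical counts, chosen only to trigger one merge.
example = [("None or mild", 40), ("Moderate", 55), ("Severe", 9)]
print(combine_small_groups(example))
```

A rule of this kind would produce a collapsed category such as the “Moderate or severe” PGIS-S group reported at week 24.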
In accordance with expectations, lower (i.e., better) MFSAF TSS mean scores were observed for patients with better PGIS and ECOG responses. Patterns were more clearly observed among PGIS levels at baseline and week 24 and among ECOG levels at baseline. The MFSAF TSS at baseline based on PGIS-S categorization was 18.42 for the “none or mild” group, 25.31 for the “moderate” group, and 38.61 for the “severe” group. ES was moderate (threshold: 0.5 ≤ d < 0.8) to large (threshold: d ≥ 0.8) for PGIS at baseline (absolute range: 0.57–1.13 for PGIS-F and PGIS-S) and week 24 (absolute range: 0.60–1.01 for PGIS-F and PGIS-S), providing further evidence of known-groups validity.
Statistically significant differences among PGIS categories were observed at both baseline and week 24 (P < .001 for both PGIS-F and PGIS-S), and among ECOG PS groups at baseline (P < .001) (Table 6).
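The between-group ES defined in the Table 6 footnote (mean difference divided by the pooled standard deviation of the two groups) and the interpretation thresholds quoted above can be sketched in Python as follows. The function names and the two small score lists are hypothetical and serve only to make the example runnable; they are not patient data, and the study's own software and handling of LS means may differ.

```python
import math
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    n1, n2 = len(a), len(b)
    return math.sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))


def effect_size(a, b):
    """Mean difference (b minus a) divided by the pooled standard deviation."""
    return (mean(b) - mean(a)) / pooled_sd(a, b)


def classify(d):
    """Label an ES using the thresholds in the text: 0.5 <= d < 0.8 moderate, d >= 0.8 large."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "moderate"
    return "below moderate"


# Hypothetical TSS values for two adjacent anchor groups (illustration only).
milder = [10, 14, 18, 22]
more_severe = [20, 24, 28, 32]
d = effect_size(milder, more_severe)
print(round(d, 3), classify(d))

# Applying the same labels to ES values reported in Table 6 at baseline.
print(classify(0.62), classify(1.13), classify(0.33))
```

With these labels, the baseline PGIS-S values of 0.62 and 1.13 fall in the moderate and large bands, whereas the ECOG PS value of 0.33 falls below the moderate threshold.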

Sensitivity to change

Sensitivity to change analysis (Supplementary Table 6) revealed that patients who indicated symptom improvement on the PGIS-S/PGIC-S and fatigue improvement on the PGIS-F/PGIC-F reported a larger (in absolute terms) mean change from baseline than patients who reported “no change” or worsening. The mean change from baseline to week 12 in MFSAF TSS for the 3 PGIS-S collapsed categories was −13.85 (improved), −5.49 (no change), and −1.66 (worsening), and from baseline to week 24 was −16.06 (improved), −5.30 (no change), and −1.68 (worsening). PGIC anchors showed similar trends to those observed with PGIS anchors. The mean changes from baseline to week 12 in the MFSAF TSS for the 3 PGIC-F collapsed categories were −10.32 (improved), −6.82 (no change), and −5.35 (worsening).
Within-group ES was moderate (threshold: 0.5 ≤ d < 0.8) to large (threshold: d ≥ 0.8) in absolute value for “improved” groups (collapsed categories) across all anchors, indicating a substantial change in the MFSAF TSS between baseline and week 12 (absolute range: 0.69–1.08) and an even greater change between baseline and week 24 (absolute range: 0.79–1.16) (Supplementary Table 6). This was also observed in PGIC uncollapsed categories, as the “much improved” group showed large ES changes vs other groups (absolute within-group ES = 1.12 for PGIC-S and 1.14 for PGIC-F at week 12). Between-group ES also showed clear differentiation of the “improved” group relative to the “no change” group, with PGIS anchors showing higher ES than PGIC anchors.
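The ordering of the mean changes reported in this section can be checked with a short Python sketch. The values are copied from the text; the data layout and the helper name `is_monotonic` are assumptions made for illustration. The raw gap between the “improved” and “no change” groups is printed only as a descriptive aid and is not the between-group ES reported in Supplementary Table 6.

```python
# Mean change from baseline in MFSAF TSS by collapsed anchor category,
# copied from the text, in the order (improved, no change, worsening).
MEAN_CHANGE = {
    ("PGIS-S", "week 12"): (-13.85, -5.49, -1.66),
    ("PGIS-S", "week 24"): (-16.06, -5.30, -1.68),
    ("PGIC-F", "week 12"): (-10.32, -6.82, -5.35),
}


def is_monotonic(changes):
    """True if the mean change increases strictly from improved to worsening."""
    return all(a < b for a, b in zip(changes, changes[1:]))


for key, changes in MEAN_CHANGE.items():
    gap = changes[1] - changes[0]  # raw separation of "improved" from "no change"
    print(key, is_monotonic(changes), round(gap, 2))
```

All three reported sets of means are strictly ordered from improved to worsening, and the raw separation between the “improved” and “no change” groups is wider for the PGIS-S anchor than for the PGIC-F anchor.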

Discussion

Our analyses summarize the psychometric properties of the MFSAF v4.0 and provide preliminary evidence of its validity and appropriateness for use in MF clinical trials. The MFSAF TSS v4.0 is a diary-based measure that assesses the most important MF symptoms that affect a patient’s QOL. Data from qualitative interviews showed that its 11-point rating scale is relevant and understandable to patients [27], and this format has been recommended in other conditions where capturing patients’ perspectives is crucial [28]. The MFSAF v4.0 may be used as a daily diary or a weekly assessment; although our study used the former for capturing MF symptoms, future work may benefit from selection of a recall period appropriate for the goals of the study [11].
A total of 195 patients were enrolled in this trial, which is close to the recommended sample size to establish PRO validity [29]. The high compliance rate achieved with the electronic diary was expected, in contrast to paper diaries, which have shown lower compliance rates [30]. Summary statistics revealed moderate skewness at baseline and later time points, which can be attributed to high endorsement of low severity options for item 2 (night sweats), item 3 (itching), item 5 (pain under ribs [left]), and item 7 (bone pain [not joint or arthritis]). Consistent with our study, low severity for some of these items (i.e., night sweats, itching) has been demonstrated in previous studies of the MFSAF [25, 31]. Item-to-item and item-to-total analysis revealed moderate to strong correlations at baseline and week 24, supporting the multi-item TSS scale. In addition, CFA performed at baseline confirmed the hypothesized unidimensionality of this tool. Future studies may consider using data from a single day when performing these investigations, as averaging across time has the potential to inflate the correlation strength. In addition, research may examine unidimensionality via a multilevel CFA in which the different levels of variance (within-person and between-person) are examined separately.
Consistent with the MFSAF v2.0, which has previously demonstrated reliability [25], test-retest and internal consistency reliability of the MFSAF v4.0 were established in the current study. Similarly, convergent and divergent validity seem to be well supported, as some associations, though not all, met the hypotheses. It should be noted that the convergent validity cutoffs utilized in the present analysis are arbitrary and quite high, as other PRO-related research has utilized a lower threshold (e.g., 0.40) [32]. Adequate known-groups validity was also demonstrated, as the MFSAF TSS could distinguish among PGIS groups at baseline and week 24 and among ECOG groups at baseline. As expected, lower (i.e., better) MFSAF TSS was observed for patients with better PGIS and ECOG PS, with mostly moderate to large ESs as per the a priori hypotheses. DIPSS and DIPSS-plus did not provide evidence of known-groups validity for MFSAF, which may be because there were no patients in the low-risk group based on DIPSS and in the low-risk or intermediate-1–risk groups (1–2 points) based on DIPSS-plus, so the differentiation of the remaining groups was not distinct. With regard to sensitivity to change, a comparison of mean change scores across anchors (PGIS and PGIC) indicated that the MFSAF TSS was sensitive to change at weeks 12 and 24, particularly in the direction of improvement. A potential limitation of this approach is that PGIC has a lengthy recall period and as a result is prone to bias, which could potentially affect patients’ perceptions of changes in symptoms [33]. Nevertheless, the monotonic patterns across the change groups are consistent with the previously studied MFSAF v2.0 [25], and results support the sensitivity to change property of the MFSAF TSS.
Literature suggests providing a positive rating for construct validity if 75% of the prespecified hypotheses are met [34]. However, the present analysis evaluated the psychometric properties of the MFSAF TSS within a sample that had not been collected for this purpose. Thus, even though all hypotheses were prespecified based on previous literature, determination of construct validity was made by inspecting all results collectively and not by inspecting how many hypotheses were met. Similarly, expected differences among the known groups and the various anchor groups used in the sensitivity to change analyses were not explicitly defined. However, ES results were provided, aiming to contextualize whether the identified differences were of low, moderate, or high magnitude.
As with all psychometric studies undertaken using clinical trial data, it must be acknowledged that MOMENTUM was not designed to assess the psychometric properties of the MFSAF v4.0. However, sufficient data and information were available to derive preliminary and exploratory evidence of the measurement properties of this instrument. Although the analyses conducted in this study are therefore exploratory in nature, our findings provide supportive evidence of structural validity, reliability (test-retest and internal consistency), construct validity, and sensitivity to change. Evaluation of a clinically meaningful change threshold of the MFSAF TSS is an area for future work.

Conclusions

The analyses described here provide preliminary evidence that the MFSAF TSS v4.0 is an appropriate instrument for clinical trial endpoint use in patients with JAK inhibitor–experienced MF who are anemic and symptomatic. Further work replicating our findings in other similar populations is highly recommended to support efficacy endpoints and provide an interpretable and meaningful evaluation of treatment benefits.

Acknowledgments

We thank all participating patients and their families and all study site staff. Medical writing support was provided by Prasanthi Mandalay, PhD, of Nucleus Global, an Inizio company, and funded by GSK.

Declarations

Ethical approval

The current analysis is based on results of the MOMENTUM trial. MOMENTUM was done in accordance with the Declaration of Helsinki and the International Council for Harmonisation guidelines on Good Clinical Practice. Institutional review boards or independent ethics committees at each site approved the protocol.
Written consent was obtained from all individual participants included in the MOMENTUM study.
Not applicable. All personal data are deidentified.

Competing interests

Christina Daskalopoulou holds an advisory role within IQVIA and reports grants from GSK during the conduct of this study. Boris Gorsh has been employed by GSK and Daiichi Sankyo during the production of this manuscript and has stock or stock options with both companies. Gerasimos Dumi and Jean Paty are employees of IQVIA. Samineh Deheshi is an employee of GSK. Chad Gwaltney has received consulting fees from IQVIA for this MFSAF evaluation. Catherine Ellis is an employee of GSK and reports stock or stock options. Jun Kawashima is an employee of Sierra Oncology, a GSK company. Ruben Mesa reports consulting fees/honoraria from AbbVie, Blueprint, Bristol Myers Squibb, CTI, Genentech, Geron, GSK, Incyte, MorphoSys, Novartis, Sierra, Sierra Oncology, and Telios. All authors acknowledge medical writing support related to this manuscript, funded by GSK.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

The online version of this article contains electronic supplementary material.
References
1. O’Sullivan, J. M., & Harrison, C. N. (2018). Myelofibrosis: Clinicopathologic features, prognosis, and management. Clinical Advances in Hematology & Oncology: H&O, 16, 121–131.
9. Verstovsek, S., Gerds, A. T., Vannucchi, A. M., Al-Ali, H. K., Lavie, D., Kuykendall, A. T., Grosicki, S., Iurlo, A., Goh, Y. T., Lazaroiu, M. C., et al. (2023). Momelotinib versus danazol in symptomatic patients with anaemia and myelofibrosis (MOMENTUM): Results from an international, double-blind, randomised, controlled, phase 3 study. Lancet, 401, 269–280. https://doi.org/10.1016/S0140-6736(22)02036-0
10. Scherber, R., Dueck, A. C., Johansson, P., Barbui, T., Barosi, G., Vannucchi, A. M., Passamonti, F., Andreasson, B., Ferarri, M. L., Rambaldi, A., et al. (2011). The Myeloproliferative Neoplasm Symptom Assessment Form (MPN-SAF): International prospective validation and reliability trial in 402 patients. Blood, 118, 401–408. https://doi.org/10.1182/blood-2011-01-328955
12. Aaronson, N. K., Ahmedzai, S., Bergman, B., Bullinger, M., Cull, A., Duez, N. J., Filiberti, A., Flechtner, H., Fleishman, S. B., de Haes, J. C., et al. (1993). The European Organization for Research and Treatment of Cancer QLQ-C30: A quality-of-life instrument for use in international clinical trials in oncology. Journal of the National Cancer Institute, 85, 365–376. https://doi.org/10.1093/jnci/85.5.365
13. Hays, R. D., Spritzer, K. L., Amtmann, D., Lai, J. S., Dewitt, E. M., Rothrock, N., Dewalt, D. A., Riley, W. T., Fries, J. F., & Krishnan, E. (2013). Upper-extremity and mobility subdomains from the Patient-Reported Outcomes Measurement Information System (PROMIS) adult physical functioning item bank. Archives of Physical Medicine and Rehabilitation, 94, 2291–2296. https://doi.org/10.1016/j.apmr.2013.05.014
14. Eremenco, S., Chen, W. H., Blum, S. I., Bush, E. N., Bushnell, D. M., DeBusk, K., Gater, A., Nelsen, L., Coons, S. J., & PRO Consortium’s Communication Subcommittee. (2022). Comparing patient global impression of severity and patient global impression of change to evaluate test-retest reliability of depression, non-small cell lung cancer, and asthma measures. Quality of Life Research, 31, 3501–3512. https://doi.org/10.1007/s11136-022-03180-5
16. Passamonti, F., Cervantes, F., Vannucchi, A. M., Morra, E., Rumi, E., Pereira, A., Guglielmelli, P., Pungolino, E., Caramella, M., Maffioli, M., et al. (2010). A dynamic prognostic model to predict survival in primary myelofibrosis: A study by the IWG-MRT (International Working Group for Myeloproliferative Neoplasms Research and Treatment). Blood, 115, 1703–1708. https://doi.org/10.1182/blood-2009-09-245837
17. Gangat, N., Caramazza, D., Vaidya, R., George, G., Begna, K., Schwager, S., Van Dyke, D., Hanson, C., Wu, W., Pardanani, A., et al. (2011). DIPSS plus: A refined Dynamic International Prognostic Scoring System for primary myelofibrosis that incorporates prognostic information from karyotype, platelet count, and transfusion status. Journal of Clinical Oncology, 29, 392–397. https://doi.org/10.1200/JCO.2010.32.2446
20. Fayers, P. M., & Machin, D. (2013). Quality of Life: The Assessment, Analysis and Interpretation of Patient-Reported Outcomes. John Wiley & Sons.
23. Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied statistics for the behavioral sciences. Pearson.
24. Cohen, J. (2013). Statistical power analysis for the behavioral sciences. Academic Press.
25. Mesa, R. A., Gotlib, J., Gupta, V., Catalano, J. V., Deininger, M. W., Shields, A. L., Miller, C. B., Silver, R. T., Talpaz, M., Winton, E. F., et al. (2013). Effect of ruxolitinib therapy on myelofibrosis-related symptoms and other patient-reported outcomes in COMFORT-I: A randomized, double-blind, placebo-controlled trial. Journal of Clinical Oncology, 31, 1285–1292. https://doi.org/10.1200/JCO.2012.44.4489
26. Palandri, F., Breccia, M., Selleri, C., Mendicino, F., Palumbo, G. A., Abruzzese, E., Liberati, A. M., Di Renzo, N., Pane, F., Tiribelli, M., et al. (2019). Impact of disease burden in myelofibrosis patients: A sub analysis from Italian ROMEI observational study. Blood, 134, 4188. https://doi.org/10.1182/blood-2019-125778
29. Frost, M. H., Reeve, B. B., Liepa, A. M., Stauffer, J. W., Hays, R. D., & Mayo/FDA Patient-Reported Outcomes Consensus Meeting Group. (2007). What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value in Health: The Journal of the International Society for Pharmacoeconomics and Outcomes Research, 10(Suppl 2), S94–S105. https://doi.org/10.1111/j.1524-4733.2007.00272.x
31. Mesa, R. A., Kantarjian, H., Tefferi, A., Dueck, A., Levy, R., Vaddi, K., Erickson-Viitanen, S., Thomas, D. A., Cortes, J., Borthakur, G., et al. (2011). Evaluating the serial use of the myelofibrosis symptom assessment form for measuring symptomatic improvement: Performance in 87 myelofibrosis patients on a JAK1 and JAK2 inhibitor (INCB018424) clinical trial. Cancer, 117, 4869–4877. https://doi.org/10.1002/cncr.26129
32. Cappelleri, J. C., Zou, K., Bushmakin, A. G., & Alvir, M. J. J. (2014). Chapter 3: Validity. In: Patient-Reported Outcomes: Measurement, Implementation and Interpretation (Chapman & Hall/CRC Biostatistics Series). London: Chapman and Hall/CRC.
Metadata
Title: Myelofibrosis symptom assessment form total symptom score version 4.0: measurement properties from the MOMENTUM phase 3 study
Authors: Christina Daskalopoulou, Boris Gorsh, Gerasimos Dumi, Samineh Deheshi, Chad Gwaltney, Jean Paty, Catherine Ellis, Jun Kawashima, Ruben Mesa
Publication date: 25 November 2024
Publisher: Springer International Publishing
Published in: Quality of Life Research
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI: https://doi.org/10.1007/s11136-024-03855-1