Tenosynovial giant cell tumor (TGCT) is a locally aggressive neoplasm associated with limited range of motion (ROM), stiffness, joint damage, pain, and reduced physical functioning (PF). The MOTION Phase 3 trial (NCT05059262) was a randomized, placebo-controlled, double-blind study of vimseltinib among patients with TGCT. The objective of the current study was to define meaningful changes in clinical outcome assessments (COAs) measuring active ROM, PF, and stiffness using qualitative and quantitative data from patients in the MOTION trial.
Methods
Embedded exit interviews with patients in MOTION were conducted to explore meaningful changes in Patient Global Impression of Change (PGIC) anchors, active ROM, Patient-Reported Outcomes Measurement Information System (PROMIS)–PF, and Worst Stiffness numeric rating scale (NRS). Anchor- and distribution-based analyses of the MOTION data, informed by the exit interviews, were used to define responder thresholds.
Results
In the MOTION trial, 96/123 patients (78%) completed an exit interview. Most considered a “minimally improved” response to each question (PGIC-PF: 67%; PGIC-ROM: 73%) to be meaningful. Responder estimates ranged from 1.45 to 4.9 (PROMIS-PF), from 6.0 to 14.8 (active ROM), and from −2.3 to −0.5 (Stiffness). The cumulative distribution function curves show a clear separation between treatment groups at a wide range of values around the proposed thresholds.
Conclusions
The responder definitions were at least a 3-point improvement for PROMIS-PF, a 10% improvement for active ROM, and a 2-point improvement for the Worst Stiffness NRS. Qualitative interviews facilitate integrating the patient perspective in the selection of anchors and defining meaningful change.
Tenosynovial giant cell tumor (TGCT) is a locally aggressive neoplasm originating from the synovium of joints, bursae, and tendon sheaths [1]. The lesion can present as either a singular nodule (localized) or multiple nodules (diffuse), with surgery being considered the first line of treatment [2]. When surgery is not an option, there are currently only limited treatment alternatives [3], representing a significant unmet need.
There has been substantial research on TGCT over the past 10 years to characterize the burden of disease from the patient perspective, including patients’ reports on physical functioning (PF), pain, stiffness, and range of motion (ROM) [4, 5]. In a previously published qualitative study of 22 patients with TGCT, patients reported experiencing reduced ROM, stiffness, pain, swelling, and joint instability, which impacted their PF [4]. These results were consistent with clinical expert input [4]. A 2-year longitudinal follow-up study of 176 patients living with TGCT further established patients’ long-term challenges with pain interference and severity, as well as stiffness [5]. These prior studies have supported the inclusion of PF, pain, and stiffness as key patient-reported endpoints in clinical trials for TGCT [6, 7].
The content validity and psychometric properties have previously been demonstrated in the TGCT population for PF and stiffness, as measured by specific clinical outcome assessments (COAs) such as the Patient-Reported Outcomes Measurement Information System—Physical Functioning (PROMIS-PF) and Worst Stiffness numeric rating scale (NRS) [4, 8‐10]. Additionally, there is some published information on thresholds for meaningful change (i.e., responder definitions or individual-level estimates) in this population. In a prior quantitative study, triangulation of anchor- and distribution-based analyses resulted in estimated responder definitions of ≥ 3 for the PROMIS-PF and improvement of ≥ 1 for the Worst Stiffness NRS [9].
Although clinical trials using COA endpoints may identify statistically significant differences across treatment groups, the magnitude of those differences may not be important to patients [11]. Therefore, direct input from patients to establish what constitutes a meaningful change (or minimum clinically important difference; MCID – “the smallest difference in score in the domain of interest which patients perceive as beneficial”) is critical [12‐15]. This research study extends prior approaches by using mixed-methods and gathering both quantitative and qualitative data (i.e., clinical trial exit interviews) on meaningful change directly from participants in a single randomized controlled trial (RCT).
Mixed-methods approaches exploring meaningful change are broadly considered to be studies where both qualitative and quantitative data are used to inform the definition of meaningful treatment change. Mixed-methods approaches can be valuable in that they capitalize concurrently on the strengths of both qualitative approaches (e.g., a patient-centered perspective and additional insight into why specific changes are important) and quantitative approaches (e.g., concrete point estimates and well-established, consistent methods). Several prior studies have used mixed-methods in osteomalacia [16], atopic dermatitis [17, 18], non-segmental vitiligo [19], multiple myeloma [20], and aphasia [21]. While these prior studies covered a broad range of disease areas and effectively used mixed-methods to elucidate information on meaningful change, they were characterized by small numbers of participants and/or qualitative and quantitative data gathered from disparate sources.
The current mixed-methods study expands on prior applications by capitalizing on qualitative and quantitative data collected from a single clinical trial. Furthermore, it uses data from substantially larger numbers of patients participating in the same trial to iteratively identify and characterize thresholds for meaningful change in multiple COA measures. This includes direct qualitative patient input gathered through blinded in-trial interviews to select the most relevant categories of meaningful change in the anchor variables. This information was subsequently applied in the quantitative analyses of the blinded data, and then further contextualized through insights from the qualitative patient input, for example, specific descriptions of the changes the patients had actually experienced, along with information on why they considered these changes to be meaningful. The approach presented here may be of interest to those using COAs across a vast range of indications as the methods are not specific to any single disease or measure.
The objective of the current study was to use combined qualitative exit interview and quantitative data to define responder thresholds in COAs measuring active ROM, PF, and worst stiffness among patients with TGCT.
Methods
Data: MOTION Phase 3 trial
MOTION (NCT05059262) is a Phase 3, multicenter, international, randomized, double-blind, placebo-controlled study evaluating the efficacy and safety of vimseltinib [6]. A total of 123 patients with TGCT were randomly assigned 2:1 to vimseltinib 30 mg twice weekly or placebo for 24 weeks (Part 1) before proceeding to open-label vimseltinib (Part 2) (Supplemental Figure S1). Results of the MOTION trial have been previously reported [6]. Along with a statistically significantly greater objective tumor response rate among patients on vimseltinib compared with placebo (primary endpoint), the results also included statistically significantly greater improvements in ROM, PF, stiffness, EQ-VAS (health status), and pain among patients on active treatment (secondary endpoints) [6]. All analyses to define the responder thresholds were conducted using blinded data.
Measures
ROM for the affected joint was assessed using goniometry [22], where the measurement (in degrees) was used to derive a relative ROM, for each type (active or passive), obtained through normalization to the measurement from a reference standard value provided by the American Medical Association (AMA) per motion. A 15-item short form of the PROMIS-PF questionnaire assessing PF among patients with TGCT, with specific items that applied to upper extremity (11 items) or lower extremity (13 items) tumors, was completed [8]. PROMIS-PF scores are expressed as T-scores (i.e., mean = 50, standard deviation [SD] = 10 in the general population), with higher scores indicating better PF. Item-response theory was used to score the PROMIS-PF. Patients also completed a single NRS item evaluating worst stiffness in the past 24 hours on a scale from 0 (no stiffness) to 10 (worst imaginable). Patient Global Impression of Severity (PGIS) and Patient Global Impression of Change (PGIC) items were included as anchors in the trial. The two PGIS items evaluated the severity of PF limitations and limited ROM at the site of the tumor on a 5-point scale ranging from “none” to “very severe.” The three PGIC items assessed overall change in patients’ symptoms at the site of the tumor, change in ROM at the site of the tumor, and change in tumor-related PF. The scale for all three PGIC items was a 7-point scale ranging from “very much improved” to “very much worse.” There were no PGIS or PGIC questions on stiffness in the trial.
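The normalization of goniometry measurements described above can be illustrated with a minimal sketch. It assumes that relative ROM is expressed as the measured angle as a percentage of the reference value for that motion, and that change is the simple difference between two relative values; the function name and the 150-degree reference are hypothetical and are not taken from the trial or from the AMA standard.

```python
def relative_rom(measured_deg: float, reference_deg: float) -> float:
    """Return measured ROM as a percentage of the reference value (assumed formula)."""
    if reference_deg <= 0:
        raise ValueError("reference_deg must be positive")
    return 100.0 * measured_deg / reference_deg


# Hypothetical values for illustration only; 150 degrees is NOT an AMA figure.
baseline = relative_rom(90.0, 150.0)   # 60.0 (% of reference)
week25 = relative_rom(105.0, 150.0)    # 70.0 (% of reference)
change = week25 - baseline             # 10.0 percentage points
```

Under these assumptions, a patient moving from 60% to 70% of the reference value would show a change of 10, which is how the change scores for active ROM are read in this sketch.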
Qualitative Exit Interviews
A cross-sectional, embedded exit interview study was conducted with a subset of patients enrolled in the Phase 3 clinical trial, MOTION. The interviews were completed within 28 days before the End of Part 1 and prior to unblinding (Supplemental Figure S1). The exit interviews received local ethics approval at each site. Patients were eligible if they were enrolled in the trial through Cycle 5 and provided written consent. Those who declined consent or missed the interview window were not eligible. The multinational interviews were conducted in the participants’ native languages by 10 trained interviewers (including KC, TF, and YH) using a semi-structured interview guide. Non-English guides were translated by native-language translators and reviewed by independent bilingual speakers to ensure accuracy and cultural appropriateness. For the COAs, the PROMIS-PF, PGIS, PGIC, and Worst Stiffness NRS were forward and back-translated according to published guidance [23].
The interview guide included three core sections: 1) general experience with TGCT; 2) cognitive debriefing; and 3) usability of the data collection devices. To ensure reasonable patient burden, patients were assigned to one of three cognitive debriefing groups: a) PGIS and PGIC for PF and ROM, b) PROMIS-PF, or c) Worst Stiffness NRS. During cognitive debriefing, patients were asked to describe changes, including the smallest change needed, to feel the treatment was meaningful or worthwhile (see Supplemental Document S1). Since patients with upper extremity tumors are relatively rare, they were all assigned to the PROMIS-PF group. The remaining patients were randomly assigned to one of the three groups (1:1:1). As part of the cognitive debriefing section, meaningful change in the respective measures was discussed with the participants. This discussion was based on the patients’ recalled responses at baseline and at the time of the interview.
The target sample size for the qualitative interviews was 100 participants. As the content of the interviews was divided into three blocks for the more specific interview content, this resulted in an adequate planned sample of approximately 33 patients per block [24]. All interviews were audio recorded and transcribed, with non-English interviews simultaneously translated into English by a professional transcription vendor. Non-English transcripts were quality checked by the native language interviewer to ensure accuracy. Transcripts were analyzed using ATLAS.ti version 22 [25]. An initial coding dictionary, developed from the interview guide, was iteratively refined to capture emerging concepts. The first transcript was triple coded to confirm agreement across coders. The three trained Evidera staff independently coded the remaining transcripts. Code frequencies and supporting quotations were then exported for analysis and synthesis following a summative content approach [26].
Quantitative analyses were conducted on a blinded interim intent-to-treat set of 93 patients from the total sample. These analyses were pre-specified to occur prior to database lock when approximately 75% of the sample had completed their Week 25 visit. The analyses included participants with Baseline and Week 25 scores for each respective measure. Of the 93 participants, data were available at the time of the interim data extraction for: n = 71 participants for the PROMIS-PF, n = 81 for ROM, and n = 73 for the Worst Stiffness NRS.
There is no universally accepted or standardized method for triangulating to identify a single point estimate for meaningful change. Anchor- and distribution-based methods were used to estimate meaningful change in the PROMIS-PF, active ROM, and Worst Stiffness NRS. Consistent with existing guidance on the topic, anchor-based methods for score interpretation were considered the primary analyses, while distribution-based methods (half the SD at screening and the standard error of measurement [SEM]) were secondary and supportive [27]. For this study, a PGIS and/or PGIC item was used as the anchor to assess the mean change score of each COA within each subgroup of participants for an anchor category (e.g., “minimally improved”). The Spearman’s rank correlations between proposed anchors (PGIS and PGIC items) and the change score of interest were investigated prior to the anchor-based analyses; a correlation of at least 0.30 to 0.35 is recommended as the minimum acceptable association [28]. In this study, the responder definition point estimates were derived from the minimal meaningful change group(s) for each anchor, with the primary anchor category for each variable selected as the one identified by the majority of the responses from the qualitative interviews. Triangulation of these point estimates was then performed to converge on a single point estimate for each responder definition [28, 29]. Cumulative distribution function (CDF) curves were plotted to visually explore the proportion of responders by treatment group. These CDF curves were developed after completion of the trial, using the full unblinded data from MOTION.
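As an illustration of the two families of estimates described above, the following sketch computes the mean change score within each anchor category and the two distribution-based quantities. It is not the trial's analysis code: the function names, the toy data, and the reliability coefficient of 0.84 are all hypothetical, and the SEM is computed with the common formula SD × √(1 − reliability), which is an assumption because the reliability estimate used in the study is not stated here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_change_by_anchor(records):
    """Return {anchor_category: (n, mean change)} from (category, change) pairs."""
    groups = defaultdict(list)
    for category, change in records:
        groups[category].append(change)
    return {cat: (len(vals), mean(vals)) for cat, vals in groups.items()}


def half_sd(screening_scores):
    """Distribution-based estimate: half the sample SD at screening."""
    return 0.5 * stdev(screening_scores)


def sem(screening_scores, reliability):
    """Standard error of measurement, assumed as SD * sqrt(1 - reliability)."""
    return stdev(screening_scores) * sqrt(1.0 - reliability)


# Hypothetical data for illustration; these are not MOTION values.
records = [
    ("minimally improved", 2.0),
    ("minimally improved", 4.0),
    ("much improved", 5.0),
    ("no change", 0.0),
    ("no change", 1.0),
]
by_anchor = mean_change_by_anchor(records)

screening = [40.0, 44.0, 48.0, 52.0, 56.0]
half_sd_estimate = half_sd(screening)
sem_estimate = sem(screening, reliability=0.84)  # assumed reliability
```

In this sketch the anchor-based estimate for the primary category is read from `by_anchor["minimally improved"]`, while `half_sd_estimate` and `sem_estimate` play the secondary, supportive role.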
Results
Sample characteristics
Of the 123 patients enrolled in the MOTION trial, 120 were invited to participate in exit interviews; the other 3 were either unblinded prior to being interviewed or not dosed. Of the 120, 96 patients completed exit interviews. Thirteen participants declined to participate, 7 were ineligible, 2 were not interviewed due to lack of language capabilities among interviewers, and 2 were lost to follow-up. Patients were recruited from 27 of 35 clinical sites. The interview sample constituted 78% of the clinical trial sample. Table 1 presents key sociodemographic and clinical characteristics for the MOTION trial, exit interview, and meaningful change samples. Both subsamples are highly similar to the overall clinical trial sample. The baseline and change scores on each COA measure have been reported previously [6]. Briefly, improvements from Baseline to Week 25 were significantly larger in the vimseltinib group than in the placebo group for active ROM (18.4% [SE = 6.5] versus 3.8% [SE = 7.2]), PROMIS-PF (4.6 [SE = 1.0] versus 1.3 [SE = 0.9]), and Worst Stiffness NRS (−2.1 [SE = 0.2] versus −0.3 [SE = 0.3]) [6].
Table 1
Key sociodemographic and clinical characteristics

| Characteristic | MOTION (N = 123)^a | Exit Interview Sample (n = 96)^b | Meaningful Change Sample (n = 93)^c |
| --- | --- | --- | --- |
| Median (IQR) Age (years) | 44.0 (21.0) | 43.5 (20.5) | 44.0 (20.0) |
| Sex, n (%) | | | |
| Male | 50 (40.7%) | 40 (41.7%) | 37 (39.8%) |
| Female | 73 (59.3%) | 56 (58.3%) | 56 (60.2%) |
| Geographic region, n (%) | | | |
| Europe | 89 (72.4%) | 71 (74.0%) | 68 (73.1%) |
| North America | 19 (15.4%) | 14 (14.6%) | 12 (12.9%) |
| Australia | 14 (11.4%) | 10 (10.4%) | 12 (12.9%) |
| Asia | 1 (0.8%) | 1 (1.0%) | 1 (1.1%) |
| Race, n (%) | | | |
| Asian | 5 (4.1%) | 5 (5.2%) | 4 (4.3%) |
| Black or African American | 4 (3.3%) | 2 (2.1%) | 2 (2.2%) |
| White | 80 (65.0%) | 64 (66.7%) | 62 (66.7%) |
| Not reported | 31 (25.5%) | 23 (24.0%) | 23 (24.7%) |
| Unknown | 3 (2.4%) | 2 (2.1%) | 2 (2.2%) |
| Extremity involvement, n (%) | | | |
| Upper | 11 (8.9%) | 8 (8.3%) | 7 (7.5%) |
| Lower | 112 (91.1%) | 88 (91.7%) | 86 (92.5%) |
| Primary affected joint, n (%) | | | |
| Ankle | 15 (12.2%) | 10 (10.4%) | 9 (9.7%) |
| Elbow | 1 (0.8%) | 1 (1.0%) | 1 (1.1%) |
| Foot | 4 (3.3%) | 4 (4.2%) | 4 (4.3%) |
| Hand | 2 (1.6%) | 2 (2.1%) | 1 (1.1%) |
| Hip | 12 (9.8%) | 9 (9.4%) | 8 (8.6%) |
| Knee | 83 (67.5%) | 66 (68.8%) | 66 (71.0%) |
| Shoulder | 2 (1.6%) | 1 (1.0%) | 1 (1.1%) |
| Wrist | 3 (2.4%) | 2 (2.1%) | 2 (2.2%) |
| Other, temporomandibular | 1 (0.8%) | 1 (1.0%) | 1 (1.1%) |

IQR = interquartile range
^a The MOTION sample consists of the ITT population for the MOTION trial
^b The Exit Interview sample consists of all patients who agreed to participate in an exit interview among all patients from the full MOTION sample that were invited to participate
^c All analyses to estimate the responder definitions on this “Meaningful Change Sample” were conducted on an interim ITT set of approximately 75% of the total patient sample. The 75% was the pre-determined cut-off for conducting these analyses prior to database lock for the clinical trial
Qualitative support for meaningful change
For both the PGIC questions, the “minimally improved” response option represented meaningful change for a large majority of patients (PF: 67%; ROM: 73%) (Supplemental Tables S1 and S2). Given limited treatment options and a natural disease history that is progressive, some patients reported that stabilization was, in and of itself, a meaningful change.
Patient 65, [Ankle]: The only treatment for it was further surgery, which would leave me in a worse state or some sort of radiation therapy to diffuse the joint, which would leave me again in a worse state. So, any sort of treatment that provides for some improvement or even just halts the progression of the tumor, is worthwhile taking, in my view.
Patient 64, [Knee]: I think it’s better to see a small improvement than a big growth. That’s the thing with TGCT, is if left unchecked, it’s only going to get worse.
For the PGIC-PF, most patients (n = 14/21, 67%) selected “minimally improved” as representing the minimally meaningful change. Patients described specific improvements including being able to exercise for longer durations, climbing stairs without assistance, better mobility at work, playing with children, and ability to participate in new activities. Less than one-quarter of the study participants (n = 5/21, 24%) selected “much improved,” and only one participant selected “very much improved” (see Supplemental Tables S1 and S2 for participant quotes).
Patient 79, [Hip]: Even again, minimally improved, even if I just had – I could exercise for a little bit longer or I could climb steps without using a rail. Even if I could just have that more minimal improvement, that would’ve been enough to just feel satisfied with the treatment.
Patient 22, [Hip]: It would be really nice to be able to squat again…Because it’s something I can’t do currently, and it’s a very natural movement in many situations. Both at work and when playing with the children…you can definitely feel it when it’s no longer an option.
Patient 79, [Hip]: Gosh, I think most people on this trial are just desperate for any change, even if it’s minimal, just any bit of relief or like any new activity or thing that you can do, just adds a lot to your life.
Similarly, for the PGIC-ROM, most patients (n = 19/26, 73%) selected “minimally improved” as the smallest meaningful change. Only 23% of patients selected “much improved,” and one patient selected “very much improved” (Supplemental Tables S1 and S2). Patients described minimal improvement in relation to doing daily activities such as household chores or participating in activities with family:
Patient 93, [Foot]: Especially when you have kids, that minimal change may be able to run with my kids, play with my kids. So that is a lot of improvement because being able to spend time with your kids is very meaningful to me. So that minimal change, like I told you, it made me go and I can now go and put my daughter on swing, on a slide, or play ball with my son. Even though it’s a minimal change for me, but it means a lot to me because I can do a lot with that. Even going to work right now is right now very easy for me going to work.
Based on these results, the “minimally improved” groups were selected as the primary anchor category for the subsequent quantitative anchor-based meaningful change analyses for both PF and ROM.
When asked specifically about what constituted a meaningful change on individual PROMIS-PF items, most patients reported that a 1- or 2-category improvement was meaningful to them depending on the item.
Patient 72, [Knee]: The one category for me, if there was a little bit, even in half of those areas, if there was a one category change, that’s significant change. Because I think it all depends on the question too. If there was a one category change, you’d walk out the stairs. Oh, happy days. That’s amazing. If it’s a one category change in exercising for an hour, that’s incredible.
Patient 65, [Ankle]: Even a change in one category, going from with some difficulty to improving to a little difficulty, for me, makes it worthwhile. And I say that because particularly in the discussions with my orthopedic surgeon and oncologist, it was a condition that was going to worsen…. So any sort of treatment that provides for some improvement or even just halts the progression of the tumor, is worthwhile taking, in my view.
Patient 59, [Knee]: I don’t know, any improvement I suppose. … I could do more and I suppose I get concerned in terms of like my ability to do that kind of thing in the future. And I suppose also, like standing still for an hour, I expect to be able to do that without any difficulty at my age, but scrubbing floors, I don’t expect that to be completely pain free and easy. So I would accept a small amount of discomfort doing that, whereas standing still for an hour just feels like it should be something that shouldn’t cause you any discomfort.
As there was no specific anchor for stiffness, the discussion with the patients was focused on changes in stiffness that patients had experienced during the trial. Meaningful improvement in the Worst Stiffness NRS was most commonly considered to be around 2–3 points.
Patient 20, [Knee] (went from 6 to 2 or 3 on Worst Stiffness NRS): Because I can control the leg better, you could say and that it acts on the commands I give it.
Patient 34, [Foot] (went from 8 or 9 to 7 on Worst Stiffness NRS): Simply because, as I explained before, I don’t fall anymore, or at least I don’t fall as much as I did before. So I think that for me it’s quite important.
Quantitative analyses
Correlations between anchors and COAs
The correlations between changes in PROMIS-PF, stiffness, and active ROM and each of the corresponding anchor questions were estimated (Supplemental Table S3). For PROMIS-PF, correlations ranged from − 0.43 for change in PGIS-PF to − 0.54 for the relevant PGIC items (all P < 0.001). For stiffness, the correlation with the PGIC-Overall score was acceptable (r = 0.41, P = 0.0006). The correlations for active ROM anchors were acceptable for the change in PGIS-ROM (r = − 0.34, P = 0.0045), PGIC-ROM (r = − 0.39, P = 0.0010), and PGIC-Overall (r = − 0.38; P = 0.0012) (Supplemental Table S3).
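The anchor screening step above can be sketched as a Spearman rank correlation between an ordinal anchor and a change score, followed by a check against the 0.30 minimum. The sketch below uses only the standard library; the data are hypothetical, and the anchor coding (1 = “very much improved” through 7 = “very much worse”) is an assumption made so that improvement on a higher-is-better score yields a negative coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: assumed PGIC coding (lower = more improved) and change scores.
pgic = [1, 2, 2, 3, 4, 5]
change = [9.0, 6.0, 4.0, 3.0, 1.0, -1.0]
rho = spearman(pgic, change)
anchor_is_acceptable = abs(rho) >= 0.30
```

The sign of the coefficient depends only on the direction of the coding, which is why the magnitude rather than the sign is compared with the threshold in this sketch.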
Active ROM
For the change in PGIS-ROM, the mean change at Week 25 in active ROM among patients with a 1-point improvement was 9.0 (SD = 13.47). For PGIC-ROM, the mean change in active ROM assessment among those “minimally improved” was 6.0 (SD = 22.04) and 13.0 (SD = 22.25) among those “much improved.” The mean change in active ROM assessment for patients reporting “minimally improved” on the PGIC-Overall condition was 9.4 (SD = 22.57) and 9.0 (SD = 23.74) for those “much improved” (Table 2). Distribution-based analyses for active ROM yielded supportive estimates of 14.75 and 9.33 for the ½*SD and SEM estimates, respectively. Anchor-based analyses for the subgroup with knee tumors (only subgroup with sufficient data) are included in Supplemental Table S4. The results are comparable to the full sample results, with mean values that are slightly lower.
Table 2
Responder Definition of the Active ROM^a Assessment Using Anchor-based Methods – Change in Active ROM Assessment from Baseline to Week 25 by Change in PGIS-ROM, PGIC-ROM, and PGIC-Overall from Baseline to Week 25

| Score | Any Improvement | 3-Point Improvement | 2-Point Improvement | 1-Point Improvement | No Change (0-Point Change) | 1-Point Worsening | 2-Point Worsening | 3-Point Worsening |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Change in PGIS-ROM From Baseline to Week 25 | | | | | | | | |
| n | 30 | 1 | 8 | 21 | 31 | 5 | 1 | 0 |
| Mean (SD) | 15.3 (28.69) | 23.7 (N/A) | 30.9 (49.82) | 9.0 (13.47) | 6.8 (20.54) | 7.7 (39.01) | −10.0 (N/A) | – |
| 95% CI | 4.60 to 26.02 | – | −10.72 to 72.57 | 2.83 to 15.09 | −0.70 to 14.37 | −40.69 to 56.17 | – | – |

| Score | Any Improvement | Very Much Improved | Much Improved | Minimally Improved | No Change | Minimally Worse | Much Worse | Very Much Worse |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PGIC-ROM at Week 25 | | | | | | | | |
| n | 46 | 8 | 23 | 15 | 15 | 5 | 2 | 0 |
| Mean (SD) | 14.5 (29.15) | 34.7 (48.02) | 13.0 (22.25) | 6.0 (22.04) | 4.2 (9.96) | 4.3 (20.70) | −14.4 (15.06) | – |
| 95% CI | 5.84 to 23.16 | −5.43 to 74.87 | 3.40 to 22.65 | −6.23 to 18.18 | −1.33 to 9.70 | −21.37 to 30.04 | −149.6 to 120.95 | – |
| PGIC-Overall Condition at Week 25 | | | | | | | | |
| n | 47 | 12 | 23 | 12 | 15 | 5 | 1 | 0 |
| Mean (SD) | 14.3 (28.95) | 29.1 (39.33) | 9.0 (23.74) | 9.4 (22.57) | 3.8 (14.03) | −0.1 (15.22) | −3.7 (N/A) | – |
| 95% CI | 5.75 to 22.75 | 4.09 to 54.06 | −1.22 to 19.31 | −4.93 to 23.75 | −3.97 to 11.57 | −19.00 to 18.78 | – | – |

CI = confidence interval; N/A = not applicable; PGIC = Patient Global Impression of Change; PGIS = Patient Global Impression of Severity; ROM = range of motion; SD = standard deviation
Based on the qualitative input from the exit interview study, “minimally improved” on PGIC was used as one of the primary anchor categories, as noted in bold. The second primary anchor category was a 1-point improvement in the PGIS
^a Active ROM is relative to American Medical Association (AMA) standard
Physical function (PROMIS-PF)
For the change in PGIS-PF at Week 25, the mean change in PROMIS-PF score among patients with a 1-point improvement was 4.4 (SD = 4.14). The mean change in PROMIS-PF score among those “minimally improved” on the PGIC-PF was 2.6 (SD = 4.05) and 4.9 (SD = 4.67) for those “much improved” (Table 3). Similar results were observed for the PGIC-Overall condition, with a mean change in PROMIS-PF score of 2.9 (SD = 4.29) for those “minimally improved” and 4.8 (SD = 5.06) for those “much improved.” Due to small sample sizes in the worsening groups, some mean PROMIS-PF scores still demonstrated an improvement, although the overall trend was declining across the groups. Distribution-based analyses for PF yielded estimates of 2.89 and 2.00 for the ½*SD and SEM estimates, respectively.
Table 3
Responder Definition of the PROMIS-PF Using Anchor-based Methods – Change in PROMIS-PF Score from Baseline to Week 25 by Change in PGIS-PF, PGIC-PF, and PGIC-Overall from Baseline to Week 25

| Score | Any Improvement | 3-Point Improvement | 2-Point Improvement | 1-Point Improvement | No Change (0-Point Change) | 1-Point Worsening | 2-Point Worsening | 3-Point Worsening |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Change in PGIS-PF From Baseline to Week 25 | | | | | | | | |
| n | 36 | 2 | 8 | 26 | 29 | 6 | 0 | 0 |
| Mean (SD) | 6.0 (5.81) | 22.5 (9.19) | 7.3 (2.49) | 4.4 (4.14) | 2.2 (4.90) | 1.8 (5.31) | – | – |
| 95% CI | 4.06 to 7.99 | −60.09 to 105.09 | 5.17 to 9.33 | 2.71 to 6.06 | 0.38 to 4.11 | −3.74 to 7.40 | – | – |

| Score | Any Improvement | Very Much Improved | Much Improved | Minimally Improved | No Change | Minimally Worse | Much Worse | Very Much Worse |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PGIC-PF at Week 25 | | | | | | | | |
| n | 52 | 13 | 24 | 15 | 13 | 5 | 1 | 0 |
| Mean (SD) | 5.4 (5.80) | 9.7 (7.17) | 4.9 (4.67) | 2.6 (4.05) | 0.8 (3.49) | 0.4 (4.22) | −1.0 (N/A) | – |
| 95% CI | 3.81 to 7.04 | 5.36 to 14.02 | 2.90 to 6.85 | 0.36 to 4.84 | −1.34 to 2.88 | −4.84 to 5.64 | – | – |
| PGIC-Overall Condition at Week 25 | | | | | | | | |
| n | 51 | 14 | 25 | 12 | 14 | 5 | 1 | 0 |
| Mean (SD) | 5.4 (5.85) | 8.7 (7.11) | 4.8 (5.06) | 2.9 (4.29) | 0.5 (3.50) | 2.2 (4.32) | −1.0 (N/A) | – |
| 95% CI | 3.77 to 7.06 | 4.61 to 12.82 | 2.67 to 6.85 | 0.19 to 5.65 | −1.52 to 2.52 | −3.17 to 7.57 | – | – |

CI = confidence interval; N/A = not applicable; PF = physical functioning; PGIC = Patient Global Impression of Change; PGIS = Patient Global Impression of Severity; PROMIS = Patient-Reported Outcomes Measurement Information System; SD = standard deviation
Based on the qualitative input from the exit interview study, “minimally improved” on the PGIC-PF was used as one of the primary anchor categories, as noted in bold. For the other variables, the anchor categories were a 1-point improvement in PGIS and “minimally improved” on PGIC-Overall
Worst stiffness
There were no PGIS or PGIC questions specific to stiffness included in the MOTION clinical trial. The mean change in the Worst Stiffness NRS among those “minimally improved” on the PGIC-Overall condition at Week 25 was − 0.9 (SD = 1.31) and − 2.3 (SD = 1.76) for those “much improved” (Table 4). Distribution-based analyses for worst stiffness yielded estimates of − 0.99 and − 0.58 for the ½*SD and SEM estimates, respectively.
Table 4
Responder Definition of the Worst Stiffness NRS Using Anchor-based Methods – Change in Worst Stiffness NRS Score from Baseline to Week 25

| Score | Any Improvement | Very Much Improved | Much Improved | Minimally Improved | No Change | Minimally Worse | Much Worse | Very Much Worse |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PGIC-Overall Condition at Week 25 | | | | | | | | |
| n | 46 | 11 | 24 | 11 | 15 | 5 | 1 | 0 |
| Mean (SD) | −2.2 (2.10) | −3.4 (2.77) | −2.3 (1.76) | −0.9 (1.31) | −0.6 (1.85) | −1.4 (1.46) | −0.2 (N/A) | – |
| 95% CI | −2.86 to −1.62 | −5.21 to −1.49 | −3.08 to −1.59 | −1.81 to −0.05 | −1.62 to 0.42 | −3.21 to 0.40 | – | – |

CI = confidence interval; N/A = not applicable; NRS = numeric rating scale; PGIC = Patient Global Impression of Change; SD = standard deviation
Triangulation
The triangulation of the anchor- and distribution-based methods resulted in the proposed responder thresholds shown in Table 5. When triangulating many results to derive a single proposed threshold, no one value determines the result; rather, some results are prioritized and weighted more heavily than others. In the current study, qualitative feedback from a large majority of the exit interview participants (PF: 67%; ROM: 73%) supported “minimally improved” as the most appropriate anchor category on the PGIC items for both PF and ROM (Supplemental Tables S1 and S2). Thus, the minimally improved category results were most heavily weighted during triangulation. Additionally, values from the “no change” categories were considered, since some participants explicitly mentioned that halting or slowing progression would be meaningful. Data from the PGIS change categories, which for ROM, for example, were slightly higher (1-point improvement in PGIS = 9.0), contributed to an estimated responder definition that was larger than the value obtained from the PGIC “minimally improved” anchor group. For all three measures, the final threshold value selected was slightly higher (more conservative) because of the intended application to data from a pivotal clinical trial. The distribution-based methods were considered secondary, supportive evidence. The proposed responder thresholds are at least a 10% improvement in active ROM, a 3-point improvement on the PROMIS-PF, and a 2-point improvement (i.e., a change of −2 points) on the Worst Stiffness NRS (Table 5).
Table 5. Triangulation of Responder Definitions from Baseline to Week 25

| Measure | Type of Assessment | Criterion | Anchor Category | Estimated Change | Proposed Responder Definition |
| --- | --- | --- | --- | --- | --- |
| Active ROM^a | Anchor-based methods | Change in PGIS-ROM from baseline to Week 25 | 0-Point Change | 6.8 | + 10% |
| | | | 1-Point Improvement | 9.0 | |
| | | | 2-Point Improvement | 30.9 | |
| | | PGIC-ROM at Week 25 | No Change | 4.2 | |
| | | | Minimally Improved | 6.0 | |
| | | | Much Improved | 13.0 | |
| | | PGIC-overall condition at Week 25 | No Change | 3.8 | |
| | | | Minimally Improved | 9.4 | |
| | | | Much Improved | 9.0 | |
| | Distribution-based methods | 0.50*SDSC | | 14.75 | |
| | | SEM | | 9.33 | |
| PROMIS-PF | Anchor-based methods | Change in PGIS-PF from baseline to Week 25 | 0-Point Change | 2.2 | + 3-points |
| | | | 1-Point Improvement | 4.4 | |
| | | | 2-Point Improvement | 7.3 | |
| | | PGIC-PF at Week 25 | No Change | 0.8 | |
| | | | Minimally Improved | 2.6 | |
| | | | Much Improved | 4.9 | |
| | | PGIC-overall condition at Week 25 | No Change | 0.5 | |
| | | | Minimally Improved | 2.9 | |
| | | | Much Improved | 4.8 | |
| | Distribution-based methods | 0.50*SDSC | | 2.89 | |
| | | SEM | | 2.00 | |
| Worst Stiffness NRS | Anchor-based methods | PGIC-overall condition at Week 25 | No Change | − 0.6 | − 2-points |
| | | | Minimally Improved | − 0.9 | |
| | | | Much Improved | − 2.3 | |
| | Distribution-based methods | 0.50*SDSC | | − 0.99 | |
| | | SEM | | − 0.58 | |

A negative value indicates improving stiffness
NRS = numeric rating scale; PF = physical functioning; PGIC = Patient Global Impression of Change; PGIS = Patient Global Impression of Severity; PROMIS = Patient-Reported Outcomes Measurement Information System; ROM = range of motion; SDSC = standard deviation at screening; SEM = standard error of measurement
^a Active ROM is relative to American Medical Association (AMA) standard
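As an illustration of the two distribution-based criteria listed in Table 5, the following minimal Python sketch computes 0.50 times a standard deviation and a standard error of measurement. It assumes the conventional formula SEM = SD × sqrt(1 − reliability); the source does not state which formula or reliability coefficient was used in MOTION, so the function names, the example scores, and the reliability value are all hypothetical.

```python
import math
import statistics


def half_sd(scores):
    """Return 0.50 times the sample standard deviation of the scores."""
    return 0.5 * statistics.stdev(scores)


def sem(scores, reliability):
    """Return SD * sqrt(1 - reliability), the conventional SEM formula.

    Assumption: this is the textbook definition; the study's exact
    computation is not described in the text.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return statistics.stdev(scores) * math.sqrt(1.0 - reliability)


if __name__ == "__main__":
    # Hypothetical screening scores; these are NOT MOTION trial data.
    screening_scores = [38.0, 41.5, 35.2, 44.8, 39.9, 36.4, 42.1, 40.3]
    assumed_reliability = 0.88  # hypothetical reliability coefficient

    print("0.50*SD:", round(half_sd(screening_scores), 2))
    print("SEM:", round(sem(screening_scores, assumed_reliability), 2))
```

Both quantities depend only on the score distribution and measurement precision, not on patient judgments of change, which is consistent with their role here as secondary, supportive evidence.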
Observed changes in active ROM, PF, and worst stiffness from baseline to week 25
CDF curves were used to visualize the differences between the vimseltinib and placebo arms at the end of the double-blind period using the full unblinded MOTION data. Clear separation of the treatment and placebo arms can be observed at the proposed threshold values for active ROM, PF, and worst stiffness (Fig. 1). Notably, there is also very clear separation between the treatment groups for all three outcomes across a wide range of values both above and below the proposed thresholds.
Fig. 1
Cumulative Distribution Function of Change from Baseline in Active ROM^a, PROMIS-PF, and Worst Stiffness NRS at Week 25. MCID = minimal clinically important difference; NRS = numeric rating scale; PF = physical functioning; PROMIS = Patient-Reported Outcomes Measurement Information System; ROM = range of motion. Baseline is defined as the most recent non-missing measurement prior to the first administration of study drug. ^a Active ROM is relative to AMA standard
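The kind of comparison that CDF curves support can be sketched in a few lines of Python. The example below computes an empirical CDF of change scores and the proportion of responders in each arm over a range of candidate thresholds. The function names and the change scores are hypothetical and are not taken from MOTION; the sketch only illustrates how separation between arms can be examined above and below a proposed threshold.

```python
def empirical_cdf(changes):
    """Return sorted (value, cumulative proportion) pairs, one per distinct value."""
    ordered = sorted(changes)
    n = len(ordered)
    cdf = {}
    for rank, value in enumerate(ordered, start=1):
        cdf[value] = rank / n  # for tied values the highest rank is kept
    return sorted(cdf.items())


def responder_proportion(changes, threshold, higher_is_better=True):
    """Proportion of patients whose change meets or exceeds the threshold.

    For scales where improvement is a decrease (e.g., a stiffness rating),
    pass higher_is_better=False and a negative threshold.
    """
    if not changes:
        raise ValueError("changes must not be empty")
    if higher_is_better:
        hits = sum(1 for c in changes if c >= threshold)
    else:
        hits = sum(1 for c in changes if c <= threshold)
    return hits / len(changes)


if __name__ == "__main__":
    # Hypothetical change-from-baseline scores; NOT MOTION trial data.
    active_arm = [5.1, 3.2, 0.4, 6.8, 2.9, 4.4, 7.3, 1.0]
    placebo_arm = [0.5, 2.2, -1.0, 0.8, 3.1, -0.3, 1.5, 0.0]

    for threshold in (1.0, 2.0, 3.0, 4.0, 5.0):
        p_active = responder_proportion(active_arm, threshold)
        p_placebo = responder_proportion(placebo_arm, threshold)
        print(threshold, round(p_active, 2), round(p_placebo, 2))
```

Sweeping the threshold in this way mirrors reading the vertical gap between two CDF curves at several points along the horizontal axis rather than at a single cut-off.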
Symptoms of TGCT can cause significant burden for patients, impairing their PF and ability to perform daily activities [4, 30]. Vimseltinib was recently approved by the US FDA for patients for whom surgery is not an option and is under review by the EMA [2, 3, 31]. The data used in this study came from the pivotal trial for vimseltinib. The approach used to investigate and define meaningful change was undertaken to ensure that the results of the trial reflected patient-centered perceptions of what magnitude of change is meaningful, to provide data to better characterize and support the changes that patients actually experienced in the trial, and to address regulatory questions around meaningful change and other aspects of the patient-reported outcome (PRO) measures that were included as endpoints in the trial. This type of mixed-methods research is increasingly common, and the approach taken here could be applied in other trials, with different PROs and across a wide range of indications.
There is a growing body of research on patient experiences with TGCT. The current study extends this existing research by employing a more robust, mixed-methods approach to elicit direct input from patients, including those who experienced improvements in the clinical trial, to further inform the anchor-based definitions of meaningful change for multiple COAs. This mixed-methods study is strengthened by the fact that both qualitative and quantitative data were obtained from patients participating in the same trial. Qualitative input was used to directly inform the selection of the most appropriate anchor categories in the quantitative responder analyses; to our knowledge, this is the first published report using this approach.
A total of 96 patients from the Phase 3 MOTION trial completed embedded exit interviews, representing 78% of the total sample. Of those patients who debriefed on the PGIS and PGIC, the majority considered a “minimally improved” response on the PGIC-ROM (n = 19/26, 73%) and PGIC-PF (n = 14/21, 67%) as representing meaningful change, supporting the justification for the “minimally improved” category as the primary anchor category for the responder definition analyses. The proportion of patients who indicated in the qualitative interviews that they consider this response category to represent meaningful change is valuable as it provides concrete patient-centered evidence supporting the “minimally improved” category as the key data point for the responder analyses. Although researchers often pre-specify and/or use this value to define responder thresholds, it is often without direct patient input or objective support and is therefore a frequent point of regulatory scrutiny. The interviews also provided data on how strongly to consider data from the “much improved” category, as approximately one-fourth of the patients reported that they considered this to be a minimally meaningful change. The inclusion of qualitative data as supportive evidence for anchor-based quantitative methods allows for patient insights on the MCID, reflecting the patient treatment experience. This approach goes beyond statistically significant differences to highlight the more concrete improvements that patients experience and the tangible reasons that the changes matter. While the responder definition requires a fixed threshold, the CDF curves showed that the separation between the treatment groups for all three COAs persisted across a very wide range of possible thresholds below and above the proposed values.
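The anchor-based step that underlies this discussion amounts to summarizing change scores within each anchor response category. A minimal Python sketch is shown below; the function name, the category labels used as dictionary keys, and the example records are hypothetical and are not MOTION data.

```python
from collections import defaultdict
from statistics import mean


def mean_change_by_anchor(records):
    """Return the mean change score for each anchor category.

    `records` is an iterable of (anchor_category, change_score) pairs,
    e.g. a PGIC response paired with a change from baseline.
    """
    groups = defaultdict(list)
    for category, change in records:
        groups[category].append(change)
    return {category: mean(values) for category, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical (PGIC category, change score) pairs; NOT MOTION data.
    example_records = [
        ("No Change", 1.0),
        ("No Change", 0.0),
        ("Minimally Improved", 2.0),
        ("Minimally Improved", 4.0),
        ("Much Improved", 5.5),
    ]
    for category, value in mean_change_by_anchor(example_records).items():
        print(category, round(value, 2))
```

In a triangulation such as the one described here, the category means would then be weighed against each other, with the category that patients identified as meaningful given the most weight.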
It is also important to point out that while participants in the study were queried on what level of change for individual items on the PROMIS-PF would be meaningful to them, this information cannot be directly translated into a multi-item, scale-level meaningful change threshold. For example, although most participants who were asked reported that a 1- or 2-point change on each PROMIS-PF item would represent a meaningful change on that particular item, the overall threshold estimate of meaningful change for the full PROMIS-PF scale did not reflect a value that would represent a change of 1 or 2 points or more on every single item. There was substantial qualitative and quantitative data in the current study supporting that such a threshold value would represent an unreasonably high bar. Thus, although it may be interesting and helpful from a qualitative perspective to understand what a meaningful change is on any particular individual item, this information, by itself, cannot be combined across all items to yield a scale-level meaningful change threshold.
The quantitative results are similar to those reported in the prior psychometric validation published by Speck et al. (2020) on the 15-item PROMIS-PF TGCT-specific short form and Worst Stiffness NRS, with the prior study proposing the same responder definition threshold of at least 3 for the PROMIS-PF and a slightly lower value of − 1 for the Worst Stiffness NRS [9]. The samples in both studies were highly similar in terms of sociodemographic and clinical characteristics and tumor location, and both used quantitative data from Phase 3 clinical trials.
One notable result is that the qualitative results supported an improvement of − 1 point on the Worst Stiffness NRS as did the prior quantitative research [9]; however, the triangulation process from the current study resulted in a threshold estimate of − 2 points. The triangulation for the Worst Stiffness in the current study considered the fact that the clinical trial did not include any concept-specific anchors for stiffness (only the PGIC-Overall variable was available). As a consequence, a conservative approach was taken when triangulating, yielding the − 2 point estimate. Both the results of prior studies and the quantitative data from the “minimally improved” category of the PGIC-Overall anchor variable fully support a − 1 point change as an appropriate responder definition. However, given the context of use (i.e., to support the meaningfulness of results from a pivotal clinical trial), in this particular use case the more conservative estimate was reported.
The results of the current study should be interpreted with consideration of the following limitations. Due to practical limitations and concerns for patient burden, not all measures were debriefed with every patient. In addition, there were 10 different interviewers who worked on the study, which was required due to the multiple languages but may have led to greater variability. Initial transcripts from each interviewer were reviewed to ensure consistency. The multilingual design also represents a notable strength of the study, as input was obtained from patients across many languages and countries. It is possible that those who opted not to participate in the exit interviews were different from those who did participate, thereby introducing some bias, though the sociodemographic and clinical characteristics of the exit interview sample aligned very closely with the full clinical trial sample. Findings may be subject to recall and social desirability bias, which could influence how patients reported their experiences and perceived changes. An additional limitation is that the anchor items referenced physical function "at the site of the tumor," which may limit generalizability to broader physical functioning. Although alternative anchors could have been used, the focus applied here specifically targeted tumor-related impacts. There were some instances where anchor categories indicating deterioration had observed mean scores representing improvements in the constructs of interest. These cases likely reflect some combination of placebo effects, small numbers of participants in the groups of interest, and/or measurement error. As the primary endpoint for the trial was at 25 weeks, it is highly likely that many patients did not experience substantial declines in their condition over the study period. Lastly, the current study reported on one method for calculating the MCID; all approaches have limitations, and other approaches have been proposed [32].
It is important to note that while single point estimates for defining a meaningful change are necessary to conduct responder analyses, there is no single value that represents a meaningful change to all individuals. Meaningful change is an intensely individual construct that varies considerably across a multitude of factors. The wide confidence intervals observed for some estimates may reflect this inherent subjectivity and individual variability, rather than statistical imprecision. The triangulated point estimates presented here represent one approach that was acceptable for the specific research context in which it was applied (i.e., selection of thresholds for regulatory review and to support PRO labeling claims for pharmaceutical products). The approach described herein can be applied to other PRO measures and has been shown to adequately support PRO endpoints undergoing regulatory review.
Conclusions
Combined evidence from this mixed-methods meaningful change study, which included anchor- and distribution-based methods supported by qualitative patient insights, supports responder definitions for improvements of + 3 points for PROMIS-PF, + 10% for active ROM, and − 2 points for the Worst Stiffness NRS. This study provides support for the continued use of both quantitative and qualitative methods to ascertain and contextualize meaningful change thresholds and ensure that the patient perspective is explicitly and comprehensively considered in the drug development process.
Acknowledgements
We thank the patients and their families and caregivers, the investigators, and the investigational site staff for the MOTION trial.
Declarations
Conflict of interest
Heather Gelhorn, Katelyn Cutts, Tsion Fikre, and Yipin Han work for Evidera, a scientific research consultancy. Evidera was paid to conduct this work by Deciphera Pharmaceuticals, LLC. Brooke Harrow, Christopher Tait, Amanda Saunders, and Nicholas A. Zeringo are employees of Deciphera Pharmaceuticals, LLC. Michiel van de Sande reports institutional research funding from Implantcast GmbH and consulting/advisory roles for Merck, SynOx, and Deciphera Pharmaceuticals, LLC. Dr. Tap reports personal fees from: Daiichi Sankyo, Deciphera, Servier, Bayer Pharmaceuticals, Cogent, Amgen, AmMax Bio, Boehringer Ingelheim, BioAtla, Inhibrx, PharmaEssentia, Avacta, Ipsen, Sonata, Abbisko, Aadi, IMGT, Ikena, Curadev, Ratio, C4 Therapeutics, Synox, Recordati. In addition, Dr. Tap has a patent Companion Diagnostic for CDK4 inhibitors – 14/854,329 pending to MSKCC/SKI, and a patent Enigma and CDH18 as companion Diagnostics for CDK4 inhibition – SKI2016-021–03 issued to MSKCC/SKI and Scientific Advisory Board – Certis Oncology Solutions, Stock Ownership, Co-Founder – Atropos Therapeutics, Stock Ownership, Scientific Advisory Board Innova Therapeutics Strategic Advisory Board Osteosarcoma Institute, and Chair, Scientific Advisory Board Avacta. Hans Gelderblom reports institutional financial compensation for entering patients in TGCT studies from AmmaxBio, Deciphera, Abbisko and SynOx Therapeutics. Nicholas Bernthal reports Scientific Advisor to Daiichi Sankyo, Merck, Deciphera, ZimmerBiomet, Onkos Surgical.
Ethics approval and consent to participate
This study was approved by an institutional review board or ethics committee at each clinical site.
Consent for publication
Not applicable.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Mixed-Methods to Define Meaningful Change using Exit Interview and Clinical Trial Data in Patients with Tenosynovial Giant Cell Tumor (TGCT)
Authors
Heather L. Gelhorn
Katelyn N. Cutts
Brooke Harrow
Christopher Tait
Amanda Saunders
Tsion Fikre
Yipin Han
Nicholas A. Zeringo
Michiel Van De Sande
William Tap
Hans Gelderblom
Nicholas Bernthal
de Saint Aubain Somerhausen, N. S., & van de Rijn, M. (2013). Tenosynovial giant cell tumor: localized type, diffuse type. In C. D. M. Fletcher, J. A. Bridge, P. C. W. Hogendoorn, & F. Mertens (Eds.), WHO Classification of Tumours of Soft Tissue and Bone (4th ed., Vol. 5, pp. 100–103). International Agency for Research on Cancer.
2. van der Heijden, L., Gibbons, C. L., Dijkstra, P. D., Kroep, J. R., van Rijswijk, C. S., Nout, R. A., Bradley, K. M., Athanasou, N. A., Hogendoorn, P. C., & van de Sande, M. A. (2012). The management of diffuse-type giant cell tumour (pigmented villonodular synovitis) and giant cell tumour of tendon sheath (nodular tenosynovitis). Journal of Bone and Joint Surgery. British Volume, 94(7), 882–888. https://doi.org/10.1302/0301-620X.94B7.28927
Gelhorn, H. L., Tong, S., McQuarrie, K., Vernon, C., Hanlon, J., Maclaine, G., Lenderking, W., Ye, X., Speck, R. M., Lackman, R. D., Bukata, S. V., Healey, J. H., Keedy, V. L., Anthony, S. P., Wagner, A. J., Von Hoff, D. D., Singh, A. S., Becerra, C. R., Hsu, H. H., & Tap, W. D. (2016). Patient-reported symptoms of tenosynovial giant cell tumors. Clinical Therapeutics, 38(4), 778–793. https://doi.org/10.1016/j.clinthera.2016.03.008
5. Palmerini, E., Healey, J. H., Bernthal, N. M., Bauer, S., Schreuder, H., Leithner, A., Martin-Broto, J., Gouin, F., Lopez-Bastida, J., Gelderblom, H., Staals, E. L., Mercier, F., Laeis, P., Ye, X., & van de Sande, M. (2023). Tenosynovial Giant Cell Tumor Observational Platform Project (TOPP) Registry: A 2-year analysis of patient-reported outcomes and treatment strategies. The Oncologist, 28(6), e425–e435. https://doi.org/10.1093/oncolo/oyad011
6. Gelderblom, H., Bhadri, V., Stacchiotti, S., Bauer, S., Wagner, A. J., van de Sande, M., Bernthal, N. M., Lopez Pousa, A., Razak, A. A., Italiano, A., Ahmed, M., Le Cesne, A., Tinoco, G., Boye, K., Martin-Broto, J., Palmerini, E., Tafuto, S., Pratap, S., & Powers, B. C. (2024). Vimseltinib versus placebo for tenosynovial giant cell tumour (MOTION): A multicentre, randomised, double-blind, placebo-controlled, phase 3 trial. Lancet, 403(10445), 2709–2719. https://doi.org/10.1016/S0140-6736(24)00885-7
7. Sande, M., Tap, W. D., Gelhorn, H. L., Ye, X., Speck, R. M., Palmerini, E., Stacchiotti, S., Desai, J., Wagner, A. J., Alcindor, T., Ganjoo, K., Martin-Broto, J., Wang, Q., Shuster, D., Gelderblom, H., & Healey, J. H. (2021). Pexidartinib improves physical functioning and stiffness in patients with tenosynovial giant cell tumor: Results from the ENLIVEN randomized clinical trial. Acta Orthopaedica, 92(4), 493–499. https://doi.org/10.1080/17453674.2021.1922161
8. Gelhorn, H. L., Ye, X., Speck, R. M., Tong, S., Healey, J. H., Bukata, S. V., Lackman, R. D., Murray, L., Maclaine, G., Lenderking, W. R., Hsu, H. H., Lin, P. S., & Tap, W. D. (2019). The measurement of physical functioning among patients with Tenosynovial Giant Cell Tumor (TGCT) using the Patient-Reported Outcomes Measurement Information System (PROMIS). Journal of Patient-Reported Outcomes, 3(1), Article 6. https://doi.org/10.1186/s41687-019-0099-0
9. Speck, R. M., Ye, X., Bernthal, N. M., & Gelhorn, H. L. (2020). Psychometric properties of a custom Patient-Reported Outcomes Measurement Information System (PROMIS) physical function short form and worst stiffness numeric rating scale in tenosynovial giant cell tumors. Journal of Patient-Reported Outcomes, 4(1), Article 61. https://doi.org/10.1186/s41687-020-00217-6
10. Tap, W. D., Singh, A. S., Anthony, S. P., Sterba, M., Zhang, C., Healey, J. H., Chmielowski, B., Cohn, A. L., Shapiro, G. I., Keedy, V. L., Wainberg, Z. A., Puzanov, I., Cote, G. M., Wagner, A. J., Braiteh, F., Sherman, E., Hsu, H. H., Peterfy, C., Gelhorn, H. L., & Tong-Starksen, S. (2022). Results from phase I extension study assessing pexidartinib treatment in six cohorts with solid tumors including TGCT, and abnormal CSF1 transcripts in TGCT. Clinical Cancer Research, 28(2), 298–307. https://doi.org/10.1158/1078-0432.CCR-21-2007
11. Keefe, R. S., Kraemer, H. C., Epstein, R. S., Frank, E., Haynes, G., Laughren, T. P., McNulty, J., Reed, S. D., Sanchez, J., & Leon, A. C. (2013). Defining a clinically meaningful effect for the design and interpretation of randomized controlled trials. Innovations in Clinical Neuroscience, 10(5-6 Suppl A), 4S-19S.
12. Coon, C. D., & Cappelleri, J. C. (2016). Interpreting change in scores on Patient-Reported Outcome Instruments. Therapeutic Innovation & Regulatory Science, 50(1), 22–29. https://doi.org/10.1177/2168479015622667
13. Jaeschke, R., Singer, J., & Guyatt, G. H. (1989). Measurement of health status. Ascertaining the minimal clinically important difference. Controlled Clinical Trials, 10(4), 407–415. https://doi.org/10.1016/0197-2456(89)90005-6
14. McLeod, L. D., Coon, C. D., Martin, S. A., Fehnel, S. E., & Hays, R. D. (2011). Interpreting patient-reported outcome results: US FDA guidance and emerging methods. Expert Review of Pharmacoeconomics & Outcomes Research, 11(2), 163–169. https://doi.org/10.1586/erp.11.12
15. Staunton, H., Willgoss, T., Nelsen, L., Burbridge, C., Sully, K., Rofail, D., & Arbuckle, R. (2019). An overview of using qualitative techniques to explore and define estimates of clinically important change on clinical outcome assessments. Journal of Patient-Reported Outcomes, 3(1), Article 16. https://doi.org/10.1186/s41687-019-0100-y
16. de Jan Beur, S. M., Cimms, T., Nixon, A., Theodore-Oklota, C., Luca, D., Roberts, M. S., Egan, S., Graham, C. A., Hribal, E., Evans, C. J., Wood, S., & Williams, A. (2023). Burosumab improves patient-reported outcomes in adults with tumor-induced osteomalacia: Mixed-methods analysis. Journal of Bone and Mineral Research, 38(11), 1654–1664. https://doi.org/10.1002/jbmr.4900
17. Dias-Barbosa, C., Puelles, J., Fofana, F., Gabriel, S., Rodriguez, D., Chavda, R., & Piketty, C. (2023). An explanatory sequential mixed-methods design to establish thresholds of within-individual meaningful change on a sleep disturbance numerical rating scale score in atopic dermatitis. Quality of Life Research, 32(3), 881–893. https://doi.org/10.1007/s11136-022-03294-w
18. Rams, A., Baldasaro, J., Bunod, L., Delbecque, L., Strzok, S., Meunier, J., ElMaraghy, H., Sun, L., & Pierce, E. (2024). Assessing itch severity: Content validity and psychometric properties of a patient-reported pruritus numeric rating scale in atopic dermatitis. Advances in Therapy, 41(4), 1512–1525. https://doi.org/10.1007/s12325-024-02802-3
19. Ezzedine, K., Soliman, A. M., Camp, H. S., Ladd, M. K., Pokrzywinski, R., Coyne, K. S., Sen, R., Schlosser, B. J., Bae, J. M., & Hamzavi, I. (2024). Psychometric properties and meaningful change thresholds of the Vitiligo Area Scoring Index. JAMA Dermatology. https://doi.org/10.1001/jamadermatol.2024.4534
20. Sully, K., Trigg, A., Bonner, N., Moreno-Koehler, A., Trennery, C., Shah, N., Yucel, E., Panjabi, S., & Cocks, K. (2019). Estimation of minimally important differences and responder definitions for EORTC QLQ-MY20 scores in multiple myeloma patients. European Journal of Haematology, 103(5), 500–509. https://doi.org/10.1111/ejh.13316
21. Zingelman, S., Cadilhac, D. A., Kim, J., Stone, M., Harvey, S., Unsworth, C., O’Halloran, R., Hersh, D., Mainstone, K., & Wallace, S. J. (2024). ‘A meaningful difference, but not ultimately the difference I would want’: A mixed-methods approach to explore and benchmark clinically meaningful changes in aphasia recovery. Health Expectations, 27(4), Article e14169. https://doi.org/10.1111/hex.14169
22. Gerhardt, J. J., Cocchiarelle, L., & Lea, R. D. (2002). The practical guide to range of motion assessment. American Medical Association.
23. Eremenco, S., Pease, S., Mann, S., & Berry, P. (2017). Patient-Reported Outcome (PRO) Consortium translation process: Consensus development of updated best practices. Journal of Patient-Reported Outcomes, 2(1), Article 12. https://doi.org/10.1186/s41687-018-0037-6
24. Turner-Bowker, D. M., Lamoureux, R. E., Stokes, J., Litcher-Kelly, L., Galipeau, N., Yaworsky, A., Solomon, J., & Shields, A. L. (2018). Informing a priori sample size estimation in qualitative concept elicitation interview studies for clinical outcome assessment instrument development. Value in Health, 21(7), 839–842. https://doi.org/10.1016/j.jval.2017.11.014
Food and Drug Administration. (2023). Patient-Focused Drug Development: Incorporating Clinical Outcome Assessments Into Endpoints For Regulatory Decision-Making Guidance for Industry, Food and Drug Administration Staff, and Other Stakeholders. Retrieved December 2024 from https://www.fda.gov/media/166830/download
28. Revicki, D., Hays, R. D., Cella, D., & Sloan, J. (2008). Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. Journal of Clinical Epidemiology, 61(2), 102–109. https://doi.org/10.1016/j.jclinepi.2007.03.012
29. Leidy, N. K., & Wyrwich, K. W. (2005). Bridging the gap: Using triangulation methodology to estimate minimal clinically important differences (MCIDs). COPD, 2(1), 157–165. https://doi.org/10.1081/copd-200050508
30. Mastboom, M. J., Planje, R., & van de Sande, M. A. (2018). The patient perspective on the impact of tenosynovial giant cell tumors on daily living: Crowdsourcing study on physical function and quality of life. Interactive Journal of Medical Research, 7(1), Article e4. https://doi.org/10.2196/ijmr.9325
Terwee, C. B., Peipert, J. D., Chapman, R., Lai, J. S., Terluin, B., Cella, D., Griffiths, P., & Mokkink, L. B. (2021). Minimal important change (MIC): A conceptual clarification and systematic review of MIC estimates of PROMIS measures. Quality of Life Research, 30(10), 2729–2754. https://doi.org/10.1007/s11136-021-02925-y