Introduction
Values for the minimal important change (MIC) have become increasingly important in the era of large-scale register-based research because clinically irrelevant changes may become statistically significant due to large sample sizes [1]. Several different terms for the minimal important change are used interchangeably: minimal important change (MIC), minimal/minimum clinically important difference (MCID), and minimal clinically important improvement (MCII). In this paper we define MIC as the smallest difference in score in the domain of interest that patients perceive as beneficial (i.e., the definition of MCID used by Jaeschke et al. [2]). MIC values are used to evaluate changes within a group, e.g., before and after a medical intervention. In contrast, the minimal important difference (MID) is used to evaluate differences between groups [3]. An equally important concept is the minimal/smallest detectable change (MDC/SDC), also called the smallest real difference (SRD), which is the smallest measurement change that can be interpreted as a real difference (i.e., not a measurement error) [4]. The concept of MIC is controversial, and there are concerns that clinical importance is not adequately captured by MIC values [5].
Historically, there have been two major methodological approaches to determine MIC values: (1) distribution-based methods and (2) anchor-based methods [6]. Terwee et al. [7], in a conceptual clarification, questioned the use of distribution-based methods because these methods evaluate measurement error (e.g., MDC) but do not relate to the importance of change. However, information about the measurement error is important for assessing the quality of the measurement. If the measurement error is larger than the MIC, measures should be taken to reduce the measurement error in order to evaluate the MIC [8].
Studies on MIC often focus on the minimal important improvement. The rationale for this is that MIC values are commonly used to assess the effects of medical interventions aimed at improving health. However, the minimal important deterioration is equally important. One approach to assess deterioration is to simply use the MIC for improvement but with the opposite sign. However, previous studies have reported differences in magnitude between the MIC for improvement and the MIC for deterioration. For example, based on data from the Norwegian registry for spine surgery, Werner et al. [9] reported MIC cutoff values for failure for common patient-reported outcome measures (PROMs) used in spine surgery that differed from the corresponding values for success reported by Solberg et al. [10].
Elective spine surgery aims to reduce pain and disability. Consequently, spine surgery outcome measures focus on pain and disability measurements. Commonly used outcome measures are numeric rating scales (NRS) for back and/or leg pain and disease-specific disability measures such as the Oswestry disability index. Previous studies have reported the MIC values of these outcome measures [9, 10]. The MIC values can be used in clinical practice to inform patients about the expected effects of surgical procedures, e.g., the percentage of patients who experience a minimal important change after a given surgical procedure [7]. Equally important is the assessment of general health-related quality of life (HRQoL) after spine surgery. The EQ-5D index [11] is a commonly used instrument for HRQoL assessment and is also used to evaluate medical interventions from an economic perspective.
The EQ VAS is an integral part of EQ-5D. Surprisingly few investigations have evaluated the MIC for the EQ VAS in orthopedic conditions [6, 12]. In this study, we used data from the Swedish spine register, Swespine, to calculate anchor-based MIC values (improvement and deterioration) for the EQ VAS for two common spine surgery procedures: disk herniation surgery and spinal stenosis surgery.
Discussion
In the present study, we report the MIC values for improvement and deterioration 1 year after surgery for disk herniation and spinal stenosis. Our MIC values were similar to the previously reported EQ VAS MIC values for orthopedic conditions. Soer et al. [27] reported an EQ VAS MIC value of 10.5 points when studying effects of rehabilitation for low back pain (n = 151). Paulsen et al. [28] reported an EQ VAS MIC value of 23 points when using a disease-specific anchor in patients surgically treated with total hip arthroplasty for hip osteoarthritis (n = 1335). The correlation between the anchor and EQ VAS, however, was weak. Paulsen et al. [28] also reported a MIC value of 12 points for a general health change anchor. The correlation between this anchor and the EQ VAS was 0.35, but the ROC AUC was only 0.60. This illustrates the importance of detailed knowledge of MIC validation (type of anchor, anchor-PROM correlation, AUC, sample sizes, etc.) when using specific MIC values in clinical trials.
To the best of our knowledge, there are no previous reports on EQ VAS MIC for deterioration after spine surgery. Werner et al. [9] reported MIC values for failure after disk herniation surgery for several commonly used patient-reported outcome measures (PROMs): the EQ-5D index, the Oswestry disability index, and numeric rating scales for leg and back pain. A general health transition item was used as anchor. Interestingly, the MIC values were greater than zero, which means that the PROMs of the patients improved although the health transition item showed a health deterioration. In contrast, we report negative MIC values for deterioration in EQ VAS. A possible explanation for this difference is that Werner et al. [9] included patients reporting "no change" in the definition of failure, whereas we excluded patients reporting "about the same" (response option three) from our definition of deterioration. Again, this illustrates the importance of detailed knowledge of the anchor when using anchor-based MIC values.
We found a marked difference between the MIC value for improvement and the MIC value for deterioration. One explanation might be an imbalance in the distribution of the answers to the SF-36 health transition item among the improved and the deteriorated patients (Table S3). For example, for disk herniation surgery, the answers for improvement in health are shifted towards better health (61% much better vs. 19% somewhat better), which means that the much better group contributes more to the MIC than the somewhat better group, resulting in a high MIC value. For deterioration after disk herniation surgery, the answers are likewise shifted towards better health (4.8% somewhat worse vs. 2.4% much worse), which results in a lower MIC value for deterioration. Consequently, because the properties of the distribution of anchor response options (e.g., skewness) affect the MIC values, detailed knowledge of the anchor distribution is essential when calculating MIC values.
An essential part of the MIC ROC analysis is to determine the optimal threshold for the MIC. We used two optimization criteria for the estimation of the MIC: (1) the point on the ROC curve closest to the top left corner of the ROC plot and (2) the maximum Youden index. Our analysis yielded inconsistent results for these methods (Table 3 and Figure 2). Perkins et al. [21] argued for the use of the maximum Youden index when the results of the two methods were inconsistent.
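The two optimization criteria can be illustrated with a minimal sketch. The data below are hypothetical and are not taken from Swespine; the function names and the rule that a patient counts as improved when the change score is at or above the cut-off are assumptions made for illustration only.

```python
from math import hypot

def roc_points(changes, improved):
    """Return (cutoff, sensitivity, specificity) for every observed change score.

    Assumption: a patient is classified as improved when change >= cutoff.
    """
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    points = []
    for cutoff in sorted(set(changes)):
        tp = sum(1 for c, y in zip(changes, improved) if y and c >= cutoff)
        tn = sum(1 for c, y in zip(changes, improved) if not y and c < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points

def youden_cutoff(points):
    """Cut-off that maximizes the Youden index J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]

def closest_to_corner_cutoff(points):
    """Cut-off whose ROC point lies closest to (sensitivity, specificity) = (1, 1)."""
    return min(points, key=lambda p: hypot(1 - p[1], 1 - p[2]))[0]

# Hypothetical 1-year EQ VAS change scores and a dichotomized anchor
# (1 = improved according to the anchor, 0 = not improved).
changes = [-10, -5, 0, 5, 5, 10, 15, 20, 25, 30]
improved = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

points = roc_points(changes, improved)
print("Youden cut-off:", youden_cutoff(points))
print("Closest-to-corner cut-off:", closest_to_corner_cutoff(points))
```

On this small example both criteria select the same cut-off; on real data they need not agree, which is the situation described above.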
Additionally, the ROC analysis and the logistic regression model gave inconsistent results (Table 3). The most pronounced differences were observed in deterioration after surgery for disk herniation. Terluin et al. [17] argued in favor of using logistic regression to determine the MIC because MIC estimates based on logistic regression models appear to have smaller variance.
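As a rough illustration of a regression-based MIC, the sketch below uses one common formulation, the predictive-modeling approach, in which the MIC is taken as the change score at which the predicted probability of improvement equals the observed proportion of improved patients. Whether this matches the exact procedure used in this study is an assumption. The data are hypothetical, and the plain gradient-ascent fit is a stand-in for a statistical package.

```python
from math import exp, log, sqrt

def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))

def fit_logistic(x, y, steps=5000, lr=1.0):
    """Fit P(y = 1) = sigmoid(intercept + slope * x) by gradient ascent.

    The predictor is standardized internally so that a fixed step size is stable.
    """
    n = len(x)
    mean = sum(x) / n
    sd = sqrt(sum((v - mean) ** 2 for v in x) / n)
    z = [(v - mean) / sd for v in x]
    a = b = 0.0
    for _ in range(steps):
        resid = [yi - sigmoid(a + b * zi) for zi, yi in zip(z, y)]
        a += lr * sum(resid) / n
        b += lr * sum(r * zi for r, zi in zip(resid, z)) / n
    # Convert the coefficients back to the raw change-score scale.
    slope = b / sd
    intercept = a - b * mean / sd
    return intercept, slope

def predictive_mic(x, y):
    """Change score at which the predicted probability equals the prevalence."""
    intercept, slope = fit_logistic(x, y)
    prevalence = sum(y) / len(y)
    prior_log_odds = log(prevalence / (1 - prevalence))
    return (prior_log_odds - intercept) / slope

# Hypothetical change scores and dichotomized anchor (1 = improved).
changes = [-10, -5, 0, 5, 5, 10, 15, 20, 25, 30]
improved = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

mic = predictive_mic(changes, improved)
print("Regression-based MIC:", round(mic, 1))
```

Because every observation contributes to the fitted curve, the estimate does not jump between observed scores the way an ROC cut-off can, which is consistent with the smaller variance noted above.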
When our suggested MIC values for improvement and deterioration after surgery for disk herniation or spinal stenosis (12 and −7) were applied to our data (Table S4), we found that the percentage of improved patients was lower (68.4% vs. 80%) and the percentage of deteriorated patients was higher (10.4% vs. 7.2%) than the corresponding percentages for the SF-36 health transition item (Table 2). Consequently, our suggested EQ VAS thresholds provide a more conservative estimate of the benefit with regard to general health perceptions after surgery for disk herniation or spinal stenosis than the SF-36 health transition item. Guyatt et al. [29] reported that transition ratings might be biased by the current health state. Our data confirm this finding (Table 2). This means that transition ratings may overestimate the effect of a surgical intervention, which might be part of the explanation for the difference between Tables 2 and S4.
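Applying the two thresholds to individual change scores amounts to a three-way classification. The sketch below shows this with hypothetical change scores; treating the thresholds as inclusive (change of 12 or more is improved, change of −7 or less is deteriorated) is an assumption, as are the function and variable names.

```python
MIC_IMPROVEMENT = 12    # suggested EQ VAS threshold for improvement
MIC_DETERIORATION = -7  # suggested EQ VAS threshold for deterioration

def classify(change):
    """Classify a 1-year EQ VAS change score (assumed inclusive thresholds)."""
    if change >= MIC_IMPROVEMENT:
        return "improved"
    if change <= MIC_DETERIORATION:
        return "deteriorated"
    return "unchanged"

def percentages(changes):
    """Percentage of patients in each category."""
    counts = {"improved": 0, "unchanged": 0, "deteriorated": 0}
    for change in changes:
        counts[classify(change)] += 1
    return {label: 100 * count / len(changes) for label, count in counts.items()}

# Hypothetical change scores, not study data.
example_changes = [-20, -7, -5, 0, 5, 12, 15, 20, 30, 40]
shares = percentages(example_changes)
print(shares)
```

Comparing such percentages with the distribution of the anchor responses is the kind of check summarized in Tables 2 and S4.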
The correlation between the anchor and the 1-year change in EQ VAS was −0.47 for disk herniation surgery and −0.48 for spinal stenosis surgery (Table 4). Revicki et al. [23] recommend 0.30–0.35 as a correlation threshold to define an acceptable association between an anchor and the PROM change score. In contrast, Guyatt et al. [29] take a more restrictive approach and recommend a correlation threshold of 0.50. Since there is no consensus regarding correlation thresholds, and because our correlations are in the upper region of the medium correlation range proposed by Cohen [26], we find it reasonable to use the SF-36 transition item as anchor for MIC calculations for the EQ VAS.
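The anchor-PROM association can be checked with a rank correlation, as in the sketch below. This section does not state which correlation coefficient was used, so the choice of Spearman's coefficient (a natural option for an ordinal anchor) is an assumption, and the data are hypothetical. The anchor is coded 1 (much better) to 5 (much worse), which is why a useful anchor yields a negative correlation with the change score.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical anchor responses (1 = much better ... 5 = much worse)
# and the corresponding EQ VAS change scores.
anchor = [1, 1, 2, 2, 3, 3, 4, 5]
change = [30, 25, 20, 10, 5, 0, -10, -15]

rho = spearman(anchor, change)
print("Spearman rho:", round(rho, 2))
print("Meets 0.30 threshold:", abs(rho) >= 0.30)
print("Meets 0.50 threshold:", abs(rho) >= 0.50)
```

The absolute value is compared with the thresholds because the sign only reflects the coding direction of the anchor.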
SF-36 provides alternative measures that could be used as anchors for MIC calculation: SF-36 item one (the single item for self-rated health assessment, SRH) and the general health (GH) domain. In our prior work on HRQoL, however, we noted that the responsiveness to change after spine surgery for SF-36 item one and the GH domain was limited, which makes these measures less suitable as anchors [16, 30].
The findings of our study should be evaluated in the light of several limitations. First, the data were limited to patients surgically treated for disk herniation or spinal stenosis. The MIC values of our study should be applied to other populations with caution. Second, we recognize the inherent limitations of register data, e.g., lack of confounder information, missing data, or unknown data quality [1]. Third, information about comorbidities that might affect general health perceptions was lacking. Fourth, data were incomplete for 20,886 (44%) of the procedures. Fifth, we did not evaluate the MDC of EQ VAS. The MIC has to be greater than the MDC to be a valid threshold [8]. Sixth, we did not adjust our MIC values for differences in EQ VAS at baseline. This is recognized as a limitation because previous studies have suggested that differences in baseline PROMs may affect MIC thresholds [9, 31]. Seventh, data on socioeconomic factors were lacking. The study of Iderberg et al. [32] demonstrated that socioeconomic indicators were associated with outcomes of surgery for lumbar spinal stenosis.
Despite these limitations, we believe that the results of our study are still fairly accurate estimates of the MIC values for EQ VAS and that future studies may now use EQ VAS as a complement to the widely used EQ-5D index in assessing changes in general HRQoL after spine surgery.