Introduction

Fusion of a symptomatic lumbar spinal motion segment is the most commonly used operative treatment for patients with degenerative disc disease (DDD) not responding to conservative care [1]. However, spinal fusion is associated with negative side effects such as proximal facet-joint violation, pseudarthrosis and symptomatic adjacent level disease (ASD) [2,3,4,5,6,7,8,9]. In order to avoid those fusion-related side effects, lumbar total disc replacement (TDR) has been introduced. However, TDR has also been associated with drawbacks, such as subsidence, dislocation, or malposition of the implant [10,11,12].

Subsidence of a TDR, defined as the penetration of the prosthetic endplate into the vertebral endplate (Fig. 1), is a frequently documented complication [10, 11, 13,14,15,16]. Subsidence occurs presumably due to non-central implantation [17, 18], implant undersizing [19, 20], or reduced bone quality [21]. It may ultimately lead to spontaneous fusion of the vertebral segment or to failure of the TDR [14, 22]. Consequently, patients with symptoms and radiographic subsidence, even without clear signs of wear and/or displacement, may undergo revision surgery [22]. However, there are no studies describing the relation between the occurrence of subsidence and signs or symptoms of the patient.

Fig. 1 An example of subsidence

The purpose of this study was to investigate to what extent subsidence of the TDR is related to clinical outcome. A secondary goal was to investigate if subsidence could be predicted by the position and relative size of the TDR on the direct post-operative radiographs.

Material and methods

Patient selection

The current study was approved by the local medical ethics committee METC Z (16-N-22) and registered at the Dutch Trial Registry (NTR5710). The medical records of all patients who underwent a TDR with an SB Charité III prosthesis (Waldemar Link, Germany; DePuy Spine, Raynham, MA) between 1994 and 2000 at the Zuyderland Medical Centre, Sittard, The Netherlands, were reviewed. In 1998, a bioactive hydroxyapatite coating of the prosthetic endplates was introduced.

TDR had been performed by a single surgeon for the treatment of patients with lumbar DDD causing predominantly axial low back pain. Care was taken intra-operatively to avoid violation of the bony endplate by the implant. The diagnosis was based on plain standing radiographs of the lumbar spine taken in antero-posterior (AP) and lateral views. Preoperatively, all patients had undergone fluoroscopically guided provocation discography to confirm a painful disc. No facet joint injections had been performed. Nerve root compression and/or spinal stenosis were considered contraindications for TDR. All patients were contacted with the request to visit the outpatient clinic for clinical evaluation and AP and lateral radiographs.

Radiological analysis

Subsidence as assessed by penetrated bone volume

A custom-developed and validated software package implemented in MATLAB (MATLAB R2017b, Mathworks, MA) was used to create a three-dimensional graphical representation of the implant [20]. By projecting the prosthetic endplate on the plane representing the vertebral endplate, the penetrated bone volume (PBV) was calculated in mm³ (Fig. 2). The dimensions (width/length) of the prosthetic endplate were based on the size of the circular polyethylene insert, as documented in the patient’s operative records. The anterior and posterior sides of the prosthetic endplate were represented by parabolic functions. This resulted in a shape that closely matches the actual endplate (Fig. 3). The PBV was calculated simultaneously for both the upper and lower part of the TDR, and these values were added together.

Fig. 2 Three-dimensional graphical representation of the TDR implant in relation to the bony endplates. This representation can be rotated manually until its contour best replicates the outline of the implant on both AP and lateral radiographs. Next, the most lateral left and right points of the bony endplate on the AP radiograph and the most anterior and posterior points of the bony endplate on the lateral radiograph were identified (red circles). Similar points had to be indicated on the metal ring of the circular polyethylene insert (green squares). The latter were used to correct for the difference in magnification factor between the AP and lateral radiograph of the same patient

Fig. 3 Bottom view of the graphical TDR representation, where the red lines indicate the contour of the surface used for calculating the PBV

Subsidence as assessed by angular rotation (AR)

A second custom-developed software package implemented in MATLAB was used to simultaneously display AP and lateral radiographs, direct post-operative and at last follow-up. On both the AP and lateral image, the angle between the prosthetic and the vertebral endplate was calculated for the upper and lower part of the prosthesis (Fig. 4), using Cobb’s method [23, 24]. The highest value (upper- or lower part) was used for this analysis [15]. Analyses were done for the direct post-operative and for the last follow-up radiographs. The differences (Δ) between the AR at last follow-up and direct post-operative (upper- and lower part) for each individual patient were calculated. The highest value was used for this analysis.

Fig. 4 Angular rotation between the vertebral and prosthetic endplate on an AP and LAT radiograph
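The angle measurement described above can be illustrated with a minimal sketch. It assumes that each endplate is marked by two landmark points in image coordinates; the function names and the point-pair representation are illustrative assumptions and are not taken from the MATLAB package used in the study.

```python
import math


def line_angle_deg(p1, p2):
    """Orientation of the line through p1 and p2 in degrees, folded to [0, 180)."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    return math.degrees(math.atan2(dy, dx)) % 180.0


def angular_rotation_deg(prosthetic, vertebral):
    """Acute angle between a prosthetic and a vertebral endplate line.

    Each argument is a pair of (x, y) landmark points (hypothetical input format).
    """
    diff = abs(line_angle_deg(*prosthetic) - line_angle_deg(*vertebral))
    return min(diff, 180.0 - diff)


def highest_ar_deg(upper, lower):
    """Highest AR of the upper and lower part, as used for the analysis.

    `upper` and `lower` are (prosthetic_line, vertebral_line) tuples.
    """
    return max(angular_rotation_deg(*upper), angular_rotation_deg(*lower))
```

Folding the line orientation to a half-turn makes the result independent of the order in which the two landmarks of a line are clicked.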

Areal Undersizing Index (AUI)

Using the same custom MATLAB software package, the potential mismatch between the surface area of the vertebral endplate \(A_{\text{vertebra}}\) and that of the prosthetic endplate \(A_{\text{TDR}}\) was determined (Fig. 5). For this analysis, the contours of the vertebra and the prosthesis were assumed to be elliptical, and the surface area was calculated as \(A = \pi a b\), where \(a\) and \(b\) are the semi-major and semi-minor axes of the fitted ellipse, for both the vertebra (\(A_{\text{vertebra}}\)) and the TDR (\(A_{\text{TDR}}\)). Subsequently, the AUI was determined on the upper and the lower part of the prosthesis. The highest value (least coverage) was used for the analysis.

$$\text{Areal Undersizing Index} = \frac{A_{\text{vertebra}} - A_{\text{TDR}}}{A_{\text{vertebra}}}$$

Fig. 5 Representation of the semi-major axis (blue line) and semi-minor axis (orange line) of the ellipse fitted around the prosthesis

A value of zero implies that the contour of the TDR is perfectly matched with the contour of the vertebrae, whereas a large value indicates undersizing of the implant.
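As a minimal sketch of this index, the following assumes that both contours are summarised by the semi-axes of a fitted ellipse, measured in the same units. The function names and the tuple input format are illustrative assumptions, not the authors' implementation.

```python
import math


def ellipse_area(semi_major, semi_minor):
    """Area of an ellipse, A = pi * a * b."""
    return math.pi * semi_major * semi_minor


def areal_undersizing_index(vertebra_axes, tdr_axes):
    """AUI = (A_vertebra - A_TDR) / A_vertebra.

    Each argument is a (semi_major, semi_minor) tuple (hypothetical input format).
    """
    a_vertebra = ellipse_area(*vertebra_axes)
    a_tdr = ellipse_area(*tdr_axes)
    if a_vertebra <= 0:
        raise ValueError("vertebral endplate area must be positive")
    return (a_vertebra - a_tdr) / a_vertebra


def highest_aui(upper, lower):
    """Highest AUI (least coverage) of the upper and lower part.

    `upper` and `lower` are (vertebra_axes, tdr_axes) tuples.
    """
    return max(areal_undersizing_index(*upper), areal_undersizing_index(*lower))
```

Because the index is a ratio of areas, any common magnification factor of the radiograph cancels out.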

Implantation asymmetry

Using the same MATLAB package, implantation asymmetry (IA) was defined as the shortest distance (d) between the middle of both the vertebral and the prosthetic endplate (Fig. 6), divided by the corresponding vertebral endplate diameter. The measurements were done for the upper and lower part of the TDR, and the highest value was used for the analysis. The differences (Δ) between the IA at last follow-up and direct post-operative (upper- and lower part) for each individual patient were calculated. The highest value was used for this analysis.

Fig. 6 Implant asymmetry in percentage is the shortest distance (d) between the middle of both the vertebral and the prosthetic endplate, divided by the corresponding vertebral endplate diameter (blue line)

A value of zero implies that the prosthesis is perfectly aligned with the vertebrae, whereas a large value indicates a translation from the centre. These values were measured on both the AP and lateral radiographs such that the symmetry can be quantified in two directions.
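The definition of implantation asymmetry can be sketched as follows. The sketch assumes that the midpoints of the vertebral and prosthetic endplates are available as (x, y) image coordinates on a single radiograph and that the diameter is measured in the same units; all names are illustrative assumptions.

```python
import math


def implantation_asymmetry(vertebral_mid, prosthetic_mid, vertebral_diameter):
    """IA = d / diameter, with d the distance between the two endplate midpoints."""
    if vertebral_diameter <= 0:
        raise ValueError("vertebral endplate diameter must be positive")
    d = math.dist(vertebral_mid, prosthetic_mid)
    return d / vertebral_diameter


def highest_ia(upper, lower):
    """Highest IA of the upper and lower part, as used for the analysis.

    `upper` and `lower` are (vertebral_mid, prosthetic_mid, vertebral_diameter) tuples.
    """
    return max(implantation_asymmetry(*upper), implantation_asymmetry(*lower))
```

Multiplying the result by 100 gives the asymmetry as a percentage of the vertebral endplate diameter. The same function would be applied separately to the AP and the lateral radiograph.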

Clinical outcome evaluation

At last follow-up, back- and leg-pain intensity was recorded in all patients with a Visual Analog Scale (VAS, 0 to 100, 100 being ‘worst pain’). The highest value was used for the analysis. Functional well-being was evaluated using the Oswestry Disability Index (ODI, 0 to 100, 100 being maximally disabled).

Data analysis and statistics

All radiological measurements were performed by two independent observers who were not involved in patient care (JK, VV). Mean values of their measurements were calculated. The intraclass correlation coefficient (ICC) was used to quantify agreement between the two observers.

Patients were assigned to a success or failure group based on their reported VAS and ODI scores (failure was defined as VAS ≥ 50 in combination with an ODI ≥ 40) [16, 25]. In addition, patients with a revision by spinal fusion were included if both the direct post-operative radiographs and the radiographs taken before their revision were available. They were all considered failures of the TDR. The independent samples t test was used to test for differences in the means of the radiological parameters between both groups. Using receiver operating characteristic (ROC) curves, possible threshold values were analyzed. A p value < 0.05 was considered statistically significant. All analyses were performed using IBM SPSS (Version 23.0).
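The grouping rule and the ROC-based threshold search can be illustrated with a short sketch. It is written in plain Python rather than SPSS, and it assumes that a cut-off is chosen by minimizing the sum of false positive and false negative classifications; the function names are illustrative assumptions, and the exact SPSS procedure may differ.

```python
def is_failure(vas, odi, revised=False):
    """Failure: VAS >= 50 combined with ODI >= 40, or a revision by spinal fusion."""
    return revised or (vas >= 50 and odi >= 40)


def roc_auc(values, failed):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [v for v, f in zip(values, failed) if f]
    negatives = [v for v, f in zip(values, failed) if not f]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def best_threshold(values, failed):
    """Cut-off t (failure predicted when value >= t) minimizing FP + FN.

    When several cut-offs give the same error count, the lowest one is returned.
    """
    best_t, best_err = None, None
    for t in sorted(set(values)):
        false_pos = sum(1 for v, f in zip(values, failed) if v >= t and not f)
        false_neg = sum(1 for v, f in zip(values, failed) if v < t and f)
        err = false_pos + false_neg
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t
```

Here `values` would hold one radiological parameter per patient (for example PBV at last follow-up) and `failed` the corresponding outcome of `is_failure`.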

Results

Study population

Altogether, 225 patients who had undergone a TDR at level L4–L5 and/or L5–S1 were identified, of whom 16 (7.1%) had died. The remaining 209 patients were contacted by mail and subsequently by phone, with the request to visit our outpatient clinic. A total of 152 patients (72.7%) were available for analysis. In 32 patients (15.3%), a revision by spinal fusion had been performed prior to our study. In only five patients was this revision because of subsidence or malposition of the implant. In the remaining patients, the reason for revision was facet joint degeneration (n = 14), ASD (n = 10), or dislocation of the implant (n = 3). In eight of these 32 patients, a complete set of radiographs was available, and they were included in the analysis. Informed consent was obtained from all patients.

Mean follow-up after implantation was 16.7 years (median 16.4, range 13.6–23.0 years). In 18 of the 120 patients without a revision (15.0%), the direct post-operative radiographs were not available. The remaining 102 patients were included for radiological analysis. Patient characteristics are listed in Table 1. Due to over-projection of the pelvis on the AP radiographs, the PBV, AUI and IA-AP could not be determined for patients with a single TDR at L5–S1 (n = 56). Consequently, the AR and IA-LAT could be determined in 110 patients, and the PBV, AUI and IA-AP in 54 patients. A dual-energy X-ray absorptiometry (DEXA) scan was available in only four patients; hence, we were unable to report on bone mineral density in relation to the occurrence of subsidence.

Table 1 Summary of subgroup patient demographic and surgical data presented as mean (standard deviation) or proportions (%)

Radiological analysis in relation to clinical outcome

High ICCs between the two observers were found for AR (R ≥ 0.90, p < 0.01), IA (R ≥ 0.88, p < 0.01), AUI (R ≥ 0.85, p < 0.01) and especially PBV (R ≥ 0.972, p < 0.01). As shown in Table 2, there was a significant difference for AR (5.58° vs. 6.80°, p = 0.047), but no significant differences in the mean values for AR, IA, PBV and AUI direct post-operative (DPO) between the success (N = 61) and failure group (N = 49). At last follow-up (LFU), both the AR (8.89° vs. 6.51°, p = 0.019) and PBV (1757.2 mm³ vs. 1058.7 mm³, p = 0.003) were significantly higher in the failure group than in the success group. When the differences between the mean values at LFU and DPO were calculated, a significant difference was again observed for AR (ΔAR, 4.33° vs. 1.83° for the failure and success groups, respectively, p = 0.001) and PBV (ΔPBV, 1448.4 mm³ vs. 747.3 mm³ for the failure and success groups, respectively, p = 0.003). Both PBV-LFU and ΔPBV were significantly higher in patients with a revision than in those in the success group (p = 0.009 and p = 0.001, respectively). No significant differences were observed between the patients with (n = 68) and without (n = 42) the porous coating of the endplates.

Table 2 Mean values (standard deviation) of the success and failure group and the differences (95% confidence interval) between the two groups

Subsequently, ROC curves were plotted for the occurrence of failure in relation to AR, IA, AUI or PBV. Possible threshold values were determined by minimizing the false positive and false negative classifications (Table 3). A threshold of 6.23° was obtained for AR-LFU [area under the curve (AUC) 0.625, p = 0.026]. For ΔAR, an increase over time of 1.85° (AUC 0.685, p = 0.001) was associated with failure. For PBV-LFU, a threshold of 1223 mm³ (AUC 0.724, p = 0.005) was determined and for ΔPBV an increase of 829 mm³ (AUC 0.723, p = 0.003) was established. For IA and AUI, no significant associations were seen. When applying these thresholds for PBV, 27 (54.0%, PBV-LFU) and 23 (46.0%, ΔPBV) of the studied patients without a revision (N = 50) had radiographic subsidence.

Table 3 ROC curve association for failure presented as the area under the curve (standard error)

Subsidence in relation to the position and relative size of the TDR

To investigate whether subsidence could be predicted by the position and relative size of the TDR on the direct post-operative radiographs, we also investigated associations between the position as measured on these radiographs and symptomatic subsidence as outcome. We defined symptomatic subsidence as a PBV-LFU of ≥ 1223 mm³ or a ΔPBV of ≥ 829 mm³, since these threshold values displayed the largest AUC. In addition, both can detect parallel subsidence, in contrast to ΔAR. In seven patients (6.4%), a ΔPBV of ≥ 829 mm³ with a ΔAR < 1.85°, indicative of parallel subsidence, was observed. ROC curves were plotted for both PBV-LFU (Table 4) and ΔPBV (Table 5) in relation to AR, IA and AUI measured direct post-operatively.
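The definitions in this paragraph translate directly into two rules, sketched below with the threshold values reported in this study. The constant and function names are illustrative assumptions.

```python
# Threshold values reported in this study.
PBV_LFU_THRESHOLD = 1223.0   # mm^3, penetrated bone volume at last follow-up
DELTA_PBV_THRESHOLD = 829.0  # mm^3, increase in PBV since direct post-operative
DELTA_AR_THRESHOLD = 1.85    # degrees, increase in angular rotation


def symptomatic_subsidence(pbv_lfu, delta_pbv):
    """PBV-LFU >= 1223 mm^3 or delta-PBV >= 829 mm^3."""
    return pbv_lfu >= PBV_LFU_THRESHOLD or delta_pbv >= DELTA_PBV_THRESHOLD


def parallel_subsidence(delta_pbv, delta_ar):
    """Volume increase above threshold without a matching increase in angle."""
    return delta_pbv >= DELTA_PBV_THRESHOLD and delta_ar < DELTA_AR_THRESHOLD
```

The second rule shows why an angle-based criterion alone misses parallel subsidence: the implant sinks into the endplate while the angle between the two endplates hardly changes.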

Table 4 ROC curve predictors for subsidence defined as a penetrated bone volume at follow-up ≥ 1223 mm³ presented as the area under the curve (standard error)
Table 5 ROC curve predictors for subsidence defined as a Δ penetrated bone volume at follow-up ≥ 829 mm³ presented as the area under the curve (standard error)

The occurrence of symptomatic subsidence defined as a PBV-LFU of ≥ 1223 mm³ was associated with an AR-DPO of ≥ 3.96° (AUC 0.690, p = 0.022) and with an AUI-DPO of > 0.50 (AUC 0.750, p = 0.002). When the occurrence of symptomatic subsidence was defined as a ΔPBV of ≥ 829 mm³, only an association with an AUI-DPO of > 0.51 (AUC 0.718, p = 0.008) was found. For IA, no significant associations were seen.

Discussion

This study represents a long-term follow-up of patients after lumbar TDR for the treatment of symptomatic DDD, and is the first study to establish a clear relation between the occurrence of radiographic subsidence and signs or symptoms of patients. Furthermore, the occurrence of subsidence could be predicted by the AR and AUI of the TDR measured on the direct post-operative radiographs. The high ICCs indicate good agreement between the two observers.

Subsidence may ultimately lead to spontaneous fusion of the vertebral segment or to failure of the TDR due to wear or displacement [14]. To quantify radiographic subsidence, different methods have previously been described. Lee et al. defined subsidence as an increase over time of 5° in AR, measured on lateral radiographs [15]. They found no significant difference in clinical outcome between the patients with or without subsidence. However, parallel subsidence cannot be detected using this method. In the present study, we identified seven patients (6.4%) with parallel subsidence.

Punt et al. [20] considered radiographic subsidence to be present if the PBV-LFU was more than 1300 mm³ or if the PBV-LFU was between 700 and 1300 mm³ in combination with an AR of more than 7.5°. These values are similar to our findings. However, in contrast to the current study, no direct post-operative images were available. Consequently, they could not investigate whether initial malpositioning or migration of the implant over time had led to the apparent radiographic subsidence at last follow-up. In addition, no clinical outcomes were reported, so they could not look for an association between the occurrence of subsidence and signs or symptoms.

Radiographic subsidence in relation to clinical outcome

In the current study, we determined that at last follow-up both the AR and PBV were significantly higher in the failure group (VAS ≥ 50 and ODI ≥ 40). This also applies when the differences between the mean values at last follow-up and direct post-operative were calculated (ΔAR and ΔPBV). It must be noted that 40.2% of the patients (n = 41) were classified as failures based on their clinical outcome, a number exceeding the number of patients with a revision in our population (n = 32). Nevertheless, these findings indicate that there is a relation between the occurrence of radiographic subsidence in terms of PBV and AR and signs or symptoms of the patient. Having established this, revision surgery for patients with radiographic subsidence and signs or symptoms seems a more viable option. This finding does not imply that a worse clinical outcome is exclusively due to radiographic subsidence in all patients. Using ROC curves, clinically applicable threshold values (ΔPBV ≥ 829 mm³ or PBV-LFU ≥ 1223 mm³) were obtained to assess which patients are at risk for symptomatic subsidence and are most likely to benefit from revision surgery.

Symptomatic subsidence in relation to the position and relative size of the TDR

ROC curves were plotted to investigate whether symptomatic subsidence could be predicted by the position and relative size of the TDR on the direct post-operative radiographs. It seems that the AR should not exceed 4°. In addition, a reduced risk of symptomatic subsidence was found if at least 50% of the area of the bony endplate of the vertebra was covered by the TDR endplate. This value is consistent with, although slightly lower than, the 60% described by Punt et al. [20]. We believe that our threshold is a better representation because the current study included not only patients with clinical problems after receiving TDR but also asymptomatic patients, and because a correlation with clinical outcome was established.

Initially, the relation between implant size and failure of the TDR was not sufficiently emphasized. Gstoettner et al. reported a maximum allowed distance of 5 mm between the edges of the TDR and vertebral endplates on either side, on both AP and lateral views, to prevent subsidence [19]. In the current study, mainly (98%) sizes 2 (25–31.5 mm) to 4 (29–38.5 mm) of the Charité III lumbar TDR were inserted. Using the product specifications, we can calculate the AUI that corresponds to their criterion for the different sizes. Doing so, an AUI of 0.46 was calculated for size 2 and an AUI of 0.41 for size 4 (minimal coverage between 54 and 59%). These values are comparable with our findings. Similar to our results, their study strongly advised using the larger sizes of the Charité III TDR whenever possible.
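The conversion of the 5 mm criterion into an AUI can be reproduced with a short calculation. The sketch assumes that the two quoted dimensions per size are the full length and width of the prosthetic endplate, that the vertebral endplate extends exactly 5 mm beyond the implant on every side, and that both contours are elliptical, so that the factor π/4 cancels in the ratio. These are assumptions made for illustration; the function name is hypothetical.

```python
MARGIN_MM = 5.0  # maximum allowed edge distance reported by Gstoettner et al. [19]


def aui_for_margin(length_mm, width_mm, margin_mm=MARGIN_MM):
    """AUI when the vertebral endplate exceeds the implant by `margin_mm` per side.

    Both contours are modelled as ellipses, so the area ratio reduces to the
    ratio of the products of the two diameters.
    """
    tdr = length_mm * width_mm
    vertebra = (length_mm + 2 * margin_mm) * (width_mm + 2 * margin_mm)
    return (vertebra - tdr) / vertebra


print(round(aui_for_margin(25.0, 31.5), 2))  # size 2 -> 0.46
print(round(aui_for_margin(29.0, 38.5), 2))  # size 4 -> 0.41
```

Under these assumptions the calculation returns the values quoted above, which correspond to a coverage of 54% and 59%, respectively.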

The present study did not find an association between implantation asymmetry and clinical outcome or the occurrence of subsidence. A study by McAfee et al. [18] found that non-central implantation of the Charité TDR (n = 205, follow-up 24 months) negatively affected clinical outcome and range of motion. No associations with the occurrence of subsidence were studied. Possibly, non-central implantation influences clinical outcome through a mechanism other than subsidence, or its effect diminishes over time.

Study limitations and strengths

The current study’s main limitation is its retrospective nature. We were only able to report on the changes in AR and PBV between direct post-operative and last follow-up, which was not a standardized interval. In addition, we were only able to report on eight of the 32 patients with a revision of their TDR. Therefore, it was not possible to correlate the obtained threshold values for symptomatic subsidence with the likelihood of a revision. The mean follow-up of 16.7 years is substantial and might explain the relatively large number of patients who were lost to follow-up, mainly because they had died or could not be traced. The direct post-operative radiographs were unavailable in only 15% of the patients. Therefore, the number of patients included in this study is such that the outcomes may be considered valid and representative. Although the Charité III TDR has not been available on the market since 2012, the basic design features of many TDRs used today are still very comparable, and we think important lessons can be drawn for other designs as well. Subsidence is a recognized concern in TDR surgery, and this is the first study to report on the association between radiographic subsidence and clinical outcome. In addition, this study indicates that the occurrence of symptomatic subsidence is related to the position and relative size of the TDR, which are factors that can be optimized by the surgeon pre- or intraoperatively.