Introduction

Pelvic organ prolapse (POP) is part of a range of conditions related to pelvic floor dysfunction, such as urinary incontinence, bowel disorders and the report of vaginal bulging [1, 2]. The lifetime prevalence in women over 50 years of age is 30–50% [3]. Women have an 11–11.8% chance of undergoing at least one surgical intervention for POP or incontinence by the age of 79 years, with a re-operation rate of 29.2% [4, 5]. Complication rates after surgery in different racial groups were reported to be 19.4% (white), 34.1% (black) and 27.4% (other) [6]. POP is therefore associated with a high financial burden on health care [4]. Although only a relatively small number of women with POP seek treatment, this number is expected to increase in the future [7]. At present, the direct cost of POP surgery already exceeds $1 billion per year in the United States alone [8].

To estimate the care requirements of women with POP in the future, it is important to have reliable prevalence data from a general female population. Data must be obtained using a questionnaire and vaginal examination because there is only a moderate correlation between the signs of POP (measured by vaginal examination) and the symptoms (measured by a questionnaire) [9, 10].

To obtain reference data on the prevalence and distribution of POP signs and the POP symptom of feeling and/or seeing a vaginal bulge, we conducted a cross-sectional study on a general population of women aged 45–85 years. Our first aim was to develop and validate a prediction model that will be helpful to researchers and health care policy makers. We focussed on three different cut-off points because the literature shows a discrepancy between feeling and/or seeing a vaginal bulge (the cornerstone symptom of POP) and the presence of POP signs. The correlation between signs and symptoms may be higher when a different cut-off point is used to define the presence of POP. Therefore, we looked at 1 cm above the hymen, at the hymen and 1 cm beyond the hymen. Our second aim was to create a POP score chart and a prognostic index to estimate, without vaginal examination, the presence of POP in a general population and the amount of care needed by women.

Materials and methods

A cross-sectional study was performed on a general population of women aged 45 to 85 years. A flowchart of the study design is presented in Fig. 1.

Fig. 1 Flowchart of the study

The study population comprised 2,979 women, registered in the office records of eight out of nine general practitioners in Brielle. Brielle is a town near Rotterdam (the Netherlands) with 16,000 citizens. Because all inhabitants are obliged to register with a general practitioner's clinic, the study population contained 95% of all women in Brielle. The town has an almost exclusively white population (98.4%). Names and addresses of all 2,979 eligible women aged 45–85 years were obtained from the general practitioners. The women were sent information about the pelvic floor study and could be enrolled by filling out an informed consent form. They were offered three options: to sign a refusal form, to fill out the questionnaire only, or to fill out the questionnaire and undergo vaginal examination.

All the women were asked to complete a self-report questionnaire. A reminder, containing the same questionnaire, was sent 8 weeks after the first contact. The data were collected anonymously. To limit selection bias, non-responders were invited to complete a short questionnaire that comprised five questions about age, parity, presence of stress urinary incontinence (yes/no), faecal incontinence (yes/no) and feeling of vaginal bulging (yes/no). To encourage a high response to the questionnaire, we used envelopes with the name and logo of the Erasmus University, coloured paper and stamped, addressed return envelopes [11].

The questionnaire used in this study combined several Dutch validated pelvic floor questionnaires, such as the Urogenital Distress Inventory [12] and the Defaecation Distress Inventory [13]. In addition, subjects were asked about ethnicity, parity, vaginal bulging, incontinence, pelvic girdle pain and vaginal bulging during pregnancy, family history, menopausal status, hormone replacement therapy (HRT), previous pelvic floor surgery, educational level, smoking and heavy physical work at present or in the past.

Vaginal examination

Of all the participants who gave informed consent at the beginning of the study to undergo vaginal examination (n = 1,140), 800 women were randomly selected by age for Pelvic Organ Prolapse Quantification (POPQ) measurement. (All response forms were registered with a number that identified the woman's age, and forms were drawn at random by a research assistant.) The POPQ was introduced by the International Continence Society (ICS). It has become widely accepted and has proven to be valid [1] and reliable [14].

A gynaecologist (MV) and a physiotherapist (MS) performed the vaginal examinations. The two examiners practiced the POPQ measurement protocol until they reached agreement about the test and registration scores. This process was performed at the Pelvic Floor Centre at the Erasmus Medical Centre in Rotterdam. POPQ measurements were carried out in conformity with the ICS standardisation report [1]. After each examination, all the details were entered into the three-by-three POPQ grid. The two examiners were blinded to the results of the questionnaire. The women were asked to empty their bladder before the examination.

Women were assigned to one of five ordinal stages of prolapse (0–4) in accordance with the POPQ grading system. All the methods, definitions and descriptions were in line with the ICS [1].

To make a detailed analysis of stage 2, we divided it into 2A (1 cm above the hymen), 2B (at the hymen) and 2C (1 cm beyond the hymen). For example, the cut-off point 2A means that all subjects with POP stage 2A up to and including stage 4 are used for the analyses. Within the vaginal examination group, the women were classified into the symptomatic group if they reported feeling and/or seeing vaginal bulging, and all others were included in the asymptomatic group.
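The three cut-off points can be illustrated with a minimal sketch. It assumes that the leading edge of the prolapse is recorded in centimetres relative to the hymen (negative values above, positive values beyond); the function names and the example measurements are hypothetical and are not taken from the study data.

```python
# Cut-off thresholds in cm relative to the hymen
# (assumption: negative = above the hymen, positive = beyond the hymen).
CUTOFF_CM = {"2A": -1.0, "2B": 0.0, "2C": 1.0}


def has_pop(leading_edge_cm: float, cutoff: str) -> bool:
    """True if the leading edge lies at or beyond the chosen cut-off point."""
    return leading_edge_cm >= CUTOFF_CM[cutoff]


def prevalence(leading_edges_cm, cutoff: str) -> float:
    """Fraction of women counted as having POP under the chosen cut-off."""
    flags = [has_pop(x, cutoff) for x in leading_edges_cm]
    return sum(flags) / len(flags)


# Hypothetical measurements for five women.
example = [-3.0, -1.0, 0.0, 1.0, 2.0]
for cutoff in ("2A", "2B", "2C"):
    print(cutoff, prevalence(example, cutoff))
```

A stricter cut-off point (2C rather than 2A) can only lower the number of women counted as having POP, which is why the three definitions yield different prevalence figures.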

Statistical analysis

Logistic regression was used to develop multivariate prediction models on the risk of prolapse. Three different definitions of prolapse were used and compared, based on the three cut-off points on the POPQ scale: 2A, 2B and 2C. Backwards elimination of the predictors was used. Variables with P < 0.3 were kept in the model. This strategy eliminates most of the purely random variables and improves the chance that the model will perform well in future patients [15]. The predictive performance of the three resulting models was summarised in receiver operating characteristic (ROC) curves, and the areas under the curves (AUC) were compared. There was a limited number of missing values on many of the variables. As multivariate analysis is severely hampered by missing values and, more importantly, results may be biased, we used multiple imputation with ten datasets [16].
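The selection and evaluation steps described above can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: `fit_pvalues` is a hypothetical callable that refits the logistic model on the given predictors and returns one P value per predictor, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
def backward_eliminate(predictors, fit_pvalues, threshold=0.3):
    """Drop the least significant predictor until all remaining P values are below the threshold.

    fit_pvalues(kept) is a hypothetical helper that refits the model on the
    predictors in `kept` and returns a dict {predictor: P value}.
    """
    kept = list(predictors)
    while kept:
        pvals = fit_pvalues(kept)
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] < threshold:
            break  # every remaining predictor satisfies P < threshold
        kept.remove(worst)
    return kept


def auc(risks, outcomes):
    """Area under the ROC curve: probability that a case receives a higher
    predicted risk than a non-case, counting ties as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

With a threshold of 0.3, the selection is deliberately liberal: weak but plausible predictors stay in the model, and only variables that look like pure noise are removed.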

Internal validation of the models was performed by a bootstrap re-sampling procedure: the model building process was repeated 200 times after creating 200 new datasets (bootstrap samples) by randomly drawing cases (with replacement) from the original data. The variable selection and estimation procedure was performed on each bootstrap sample. This yielded 200 sets of predictors and parameter estimates. The model estimates of each bootstrap sample were evaluated on the basis of the original data. On average, the predicted and observed outcomes should agree. Predictions that deviate strongly from the mean usually differ greatly from the observed outcomes due to over-fitting of the model. The size of the over-fitting effect was estimated by averaging over the 200 bootstrap samples. This produced a shrinkage factor c to compensate for the over-fitting [17]. The bootstrap method was also used to estimate the amount of optimism in the AUC that arises from optimally fine-tuning a model and subsequently evaluating its predictive performance on the same data [17].
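A minimal sketch of the bootstrap estimate of optimism is given below. It assumes two hypothetical callables that are not part of the original analysis: `fit(data)`, which repeats the whole model-building process (including variable selection) and returns a model, and `performance(model, data)`, which returns a performance measure such as the AUC.

```python
import random


def bootstrap_optimism(data, fit, performance, n_boot=200, seed=0):
    """Average difference between performance on the bootstrap sample
    (where the model was built) and performance on the original data."""
    rng = random.Random(seed)
    differences = []
    for _ in range(n_boot):
        # Draw as many cases as in the original data, with replacement.
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        differences.append(performance(model, sample) - performance(model, data))
    return sum(differences) / n_boot


def optimism_corrected(data, fit, performance, n_boot=200, seed=0):
    """Apparent performance minus the bootstrap estimate of optimism."""
    apparent = performance(fit(data), data)
    return apparent - bootstrap_optimism(data, fit, performance, n_boot, seed)
```

Because the selection step is repeated inside every bootstrap sample, the estimated optimism also reflects the over-fitting caused by choosing the predictors, not only that caused by estimating their coefficients.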

The prediction model that showed the highest AUC was translated into a pragmatic prognostic score, the Slieker POP score. For each prognostic factor in the model, the regression coefficients in the logistic model were converted into score points. For ease of use, the regression coefficients were scaled and rounded to whole numbers, such that the minimum and maximum score of women in our data set were 0 and 100, respectively. From a graph, the corresponding risk of POP can be read off.
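The conversion of regression coefficients into score points can be sketched as below. The coefficients and predictor names are hypothetical and do not reproduce the Slieker POP score; the sketch also assumes yes/no predictors with positive coefficients, so that the lowest possible sum is 0 and the highest is the sum of all coefficients.

```python
def to_score_points(coefficients, max_score=100):
    """Scale logistic regression coefficients to whole-number score points.

    Assumes binary (0/1) predictors with positive coefficients, so the
    linear predictor ranges from 0 to the sum of all coefficients.
    Rounding may make the maximum total differ slightly from max_score.
    """
    factor = max_score / sum(coefficients.values())
    return {name: round(beta * factor) for name, beta in coefficients.items()}


def total_score(points, answers):
    """Sum the points of every predictor that is answered with 'yes' (True)."""
    return sum(points[name] for name, present in answers.items() if present)


# Hypothetical coefficients, for illustration only.
coefficients = {"vaginal_bulging": 2.0, "parity_2_or_more": 1.0, "pop_in_mother": 1.0}
points = to_score_points(coefficients)
print(points)
print(total_score(points, {"vaginal_bulging": True, "parity_2_or_more": False, "pop_in_mother": True}))
```

Rounding to whole numbers costs a little precision but lets the score be added up by hand, which is the purpose of a score chart.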

The analyses were performed using the Statistical Package for Social Science (SPSS) 15.0. The Medical Ethics Research Committee of the Erasmus Medical Centre in Rotterdam, the Netherlands, approved this study.

Results

Response rate

The response rate to the questionnaire was 62.7% (1,869 of 2,979). In the group of 1,869 responders, 472 (15.8%) women refused to participate, 1,397 (46.8%; group 1) women agreed to fill out the large questionnaire and 1,140 (38.2%; group 2) agreed to fill out the large questionnaire and undergo vaginal examination. In the non-responder group 3, 20.8% returned the completed short questionnaire (620 of 2,979). Feeling vaginal bulging was reported by 6.7% (n = 41) of this non-responder group versus 9.8% in the responder group (135 of 1,397). From group 2, 800 out of the 1,140 women who consented to undergo vaginal examination were selected at random and sent an invitation for vaginal examination: 649 women participated (81.1%), which was 21.7% of the total study population and 46.4% of the women who filled in the questionnaire.

The vaginal examination group of 649 women was stratified into an asymptomatic control group (n = 570) and a symptomatic (n = 79) group in which the women had reported seeing and/or feeling vaginal bulging. Combining the data on the large and short questionnaires from the responders and the initial non-responders (1,397 + 620 = 2,017) revealed the report of a feeling of vaginal bulging prevalence rate of 8.7% (n = 176).

Baseline characteristics

Baseline characteristics of the total study population (group 1), the vaginal examination group (group 2, divided into a symptomatic and an asymptomatic group) and the non-responder group (group 3) are shown in Table 1.

Table 1 Baseline characteristics of the total study population (group 1), the women who underwent vaginal examination (group 2, divided into symptomatic and asymptomatic women) and the non-responders who filled out the short questionnaire (group 3), expressed as percentages (%) or means

No significant differences were found between group 1 and group 3 or between the asymptomatic women and the symptomatic women in group 2.

The prevalence of POP per POP stage in relation to the report of vaginal bulging in our general population is presented in Table 2. The overall prevalence of ≥stage 2B (all the women with stages 2B, 2C, 3 and 4) was 17.5% (114 of 649), of whom 30.7% (35 of 114) had symptoms of vaginal bulging.

Table 2 The prevalence of POP stage in relation to the report of vaginal bulging, in percentage (n). POP data were missing in six women; the vaginal bulging question had not been answered by ten women

The results of the multivariate analyses on POP stages 2A, 2B and 2C are shown in Table 3. Significantly higher odds ratios were found especially in stages 2B (at the hymen) and 2C (beyond the hymen) for the report of vaginal bulging (3.80 and 5.47, resp.), for ageing (1.04 and 1.04, resp.), parity of 2 (2.84 and 3.06, resp.), parity of ≥3 (2.63 and 3.33, resp.) and POP in the mother (1.96 and 2.00, resp.). The ROC curves in Fig. 2 show that the largest AUCs were 0.759 for ‘beyond the hymen’ and 0.723 for ‘at or beyond the hymen’. After correction for optimism, these AUC values were 0.672 and 0.648, respectively.

Table 3 Results of the multivariate logistic regression analysis with test scores and area under the curve (AUC) in POP substages 2A, 2B and 2C in relation to the hymen (pregn. POP = vaginal bulging symptoms during pregnancy with at least a little bother)

Fig. 2 Receiver operating characteristic curves of the multivariate analysis, with the area under the curve, for stages 2A, 2B and 2C

Due to small sample sizes, no odds ratios could be calculated in the multivariate analyses for HRT, hysterectomy, incontinence during pregnancy and incontinence in the mother. In Table 4, the Slieker-POP-Score-Chart and POP prognostic index are presented. The score chart is based on POP stage 2B (and 2C), i.e. POP at or beyond the hymen. After filling in the numbers on the score chart, the total score can be read off on the prognostic curve, which gives the risk for the presence of POP as a percentage. A shrinkage factor of 0.63, estimated from the bootstrap validation procedure, was applied to this model to enable optimal predictive performance in new subjects.
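How a shrinkage factor affects a predicted risk can be illustrated with the logistic function. In this sketch, only the value 0.63 comes from the study; the intercept and the linear predictor are hypothetical, and the sketch assumes that shrinkage is applied by multiplying the linear predictor (the sum of coefficient times predictor value) by the factor. In practice, the intercept is usually re-estimated after shrinkage.

```python
import math


def predicted_risk(linear_predictor, intercept, shrinkage=1.0):
    """Risk from a logistic model whose coefficients are multiplied by a shrinkage factor."""
    logit = intercept + shrinkage * linear_predictor
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical woman with a linear predictor of 2.0 and a hypothetical intercept of -2.5.
unshrunk = predicted_risk(2.0, -2.5)
shrunk = predicted_risk(2.0, -2.5, shrinkage=0.63)
print(unshrunk, shrunk)
```

A factor below 1 pulls every woman's contribution from her risk factors towards zero, so extreme predictions become less extreme, which is the intended correction for over-fitting.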

Table 4 The Slieker-POP-Score Chart and the prognostic index to read the sum score

Discussion

The present study was designed to investigate the prevalence of POP in a general female population and to develop a prediction model based on prognostic factors that could be combined into a prognostic index.

Prevalence

The distribution of pelvic organ prolapse in this population indicated that POP was present at or beyond the hymen (≥stage 2B) in 21% of the women. Within this 21%, 45% of the women had reported the symptom of seeing and/or feeling vaginal bulging. If stage 2A had been included, the prevalence would have increased to 36.4%, which is in line with the study by Gutman et al. [18]. Our results are also comparable with those reported in many other studies, but as yet, no reliable explanation has been put forward for the discrepancy between POP stage and symptoms of vaginal bulging [3, 9, 19–21]. Explanations for the discrepancies between POP signs and POP symptoms might lie in the personal sphere, such as coping strategies, attitudes and beliefs, or in social and economic circumstances, such as quality of life. We recommend that future research focuses on these personal and socio-economic factors in relation to POP.

Another possible factor underlying the lack of agreement between the reported symptom of vaginal bulging and the signs of POP stage is the reliability of the POPQ measurement. Results can be influenced by the fullness of the bladder and bowel and by reluctance to perform a maximum Valsalva manoeuvre on command during vaginal examination. We tried to standardise the measurements as much as possible to diminish this bias.

Our results are in contrast with a recent study by Kluivers et al. [22] who were able to distinguish symptomatic women with clinically relevant POP from asymptomatic women without any clinically relevant POP. These differences in outcome can be explained on the basis of population selection in the study by Kluivers et al. [22] because women were included who were seeking treatment for one or more pelvic floor disorders at a pelvic floor centre.

Prediction model

We developed a prediction model that has substantial sensitivity and specificity to help researchers estimate the prevalence of clinically relevant POP on the basis of a short questionnaire alone. To identify POP symptoms, we focused on the report of feeling and/or seeing vaginal bulging, as the cornerstone symptom of POP, because in the literature, other variables such as urinary splinting, digital manipulation, defaecation disorders and pelvic heaviness show no correlations with the presence of POP [21, 23, 24].

In our study, the AUC was analysed based on risk factors presented in an earlier study [25], such as age, BMI, parity, menopausal status and HRT, smoking, hysterectomy, incontinence surgery, education level, heavy physical work currently or in the past, pelvic girdle pain, incontinence and/or the report of vaginal bulging during pregnancy and incontinence or POP in the mother. The highest sensitivity and specificity were reached using the 2C cut-off point ‘beyond the hymen’ (AUC 0.759). However, the differences between the AUCs of the cut-off points ‘at’ and ‘beyond the hymen’ are small. This indicates that no higher correlation is present between signs and symptoms using these cut-off points. In particular, the report of vaginal bulging, age, parity ≥2 and POP in the mother contributed to the sensitivity and specificity of this prediction model. Nevertheless, we recommend the use of ‘at or beyond the hymen’ as the cut-off point during POP examination instead of ‘beyond the hymen’. Although its AUC was slightly lower (0.723), this cut-off point may have the advantage of detecting POP earlier. This approach might also enhance preventive strategies for more advanced POP stages [26–29]. Our findings are in line with Barber et al. [9], who studied the prevalence of clinically relevant POP (at or beyond the hymen) in what they referred to as a ‘low risk’ population, which is also applicable to a general population.

Higher AUC scores (of 0.90) were recently demonstrated in the study by Robinson et al. [23] with an artificial neural network in which 20 variables made the largest contributions to the prediction model, such as age, gravidity, parity, number of vaginal deliveries, weight of the largest vaginal delivery, BMI, menstrual status, number of years postmenopausal, race, history of chronic disease, hypertension, diabetes, chronic obstructive pulmonary disease, prior hysterectomy, prior prolapse or incontinence surgery and the use of antihypertensives. In contrast, our study included family history, smoking behaviour, education, heavy physical work, pelvic girdle pain and POP symptoms during pregnancy. Furthermore, the definition of POP was different in the study by Robinson et al. They defined clinically relevant POP as ≥2 cm beyond the hymen. The location of stages 2B and 2C ‘at or beyond the hymen’ was not used in their artificial neural network at all. Thus, the definition of POP (≥2 cm beyond the hymen) used by Robinson et al. may account for the high AUC score.

We developed and validated a simple, inexpensive tool, the Slieker-POP-Score-Chart, to predict the outcome of clinically relevant stage 2B (at or beyond the hymen) and 2C (beyond the hymen) POP with AUC scores of 0.640 and 0.672, respectively. This simple self-diagnostic instrument can also help women to estimate the severity of their POP. Until now, only a small number of women with POP have sought treatment. Awareness of the potential presence of POP will encourage women to seek advice on how to deal with their symptoms, or they can be advised to consult a gynaecologist or a physiotherapist for pelvic floor training [26–29] before surgery because, in our opinion, surgery is not the only treatment option.

Raising this awareness can also have an adverse effect, in that women become more aware of bulging feelings in the vagina, but if prevention is a goal, detection is important. Therefore, the Slieker-POP-Score-Chart can be used not only by midwives but also on the internet or in other informative media, and it can be used for research into preventive strategies.

Strengths and limitations

One of the strengths of this study was the use of vaginal examination in a large cross-sectional design. This large study population was a subgroup of an even larger group of 95% of all eligible women of Brielle. Because there was no referral by the general practitioners and women were addressed directly by mail, there was no strong selection bias in the group. Another strength was that the results led to the development of a validated POP score instrument to estimate the presence of clinically relevant POP on the basis of eight questions, without the need for vaginal examination, which could also be helpful in epidemiological studies. Furthermore, this is a self-report instrument that can support a woman’s decision to consult a physician.

This study also had limitations. A questionnaire can elicit socially desirable answers, although this was probably minimal due to the anonymity of responses to the questionnaire. It can be difficult to identify POP, as the situation can change over the course of the day, and there is considerable dependence on performing a maximum Valsalva manoeuvre. Although women were not selected by their physician, differences in outcome might still have occurred because of selection bias in the population: women who experienced some POP symptoms could be more likely to agree to vaginal examination. This could cause overestimation of the real prevalence of POP.

Conclusion

The prevalence of POP symptoms in a general Dutch female population was 12.2%. The prevalence of POP, scored by the POPQ, ‘at or beyond the hymen’ (≥2B) was 17.5% in the overall group. In the symptomatic group, 45.5% had POP stage ≥2B. With the newly developed prediction model (based on 17 questions) and the Slieker POP Score Chart (based on eight questions), the risk of developing POP can be estimated by healthy women themselves. According to the researchers, the Slieker POP Score Chart can be used to estimate the prevalence of POP ‘at or beyond the hymen’ in a general Caucasian population.