Original Article
Hybrid models were found to be very elegant to disentangle longitudinal within- and between-subject relationships

https://doi.org/10.1016/j.jclinepi.2018.11.021

Highlights

  • The between-subject part is obtained by the individual mean value over time.

  • The within-subject part is obtained by using the deviation score.

  • The deviation score is the difference between each observation and the individual mean.

  • The results of a hybrid logistic model should be interpreted with caution.

Abstract

Objectives

The interpretation of a regression coefficient obtained from a longitudinal data analysis is a combination of a within-subject part and a between-subject part. The hybrid model is used to disentangle the two components. The purpose of this article was to illustrate and discuss the use of the hybrid model in epidemiologic studies.

Study Design and Setting

In the hybrid model, the between-subject part of the relationship is obtained using the individual mean value over time, whereas the within-subject part is obtained using the deviation score, that is, the difference between each observation and the individual mean value.

Results

It was shown that the regression coefficient of a standard mixed model analysis is a kind of weighted average of the between- and within-subject parts of the relationship. When the outcome was continuous, the estimates from the separate analyses of the two components were equal to those from the hybrid model. For a dichotomous outcome, however, the estimates differed slightly.

Conclusion

The hybrid model is an elegant, easy-to-perform method for disentangling the within- and between-subject parts of a relationship in longitudinal studies.

Introduction

Within the field of epidemiology, there is an increasing interest in observational longitudinal studies. For the analysis of longitudinal data, mixed model analysis and generalized estimating equations (GEE analysis) are the two most commonly used methods [1], [2]. Both techniques are extensions of regression analysis, and the general idea behind both is that an adjustment is made for the dependency of the observations within the subject. Mixed model analysis performs this adjustment by modeling the differences between subjects, either in the intercept (i.e., by adding a random intercept to the model) or in the regression coefficients for time-dependent independent variables (i.e., by adding random slopes to the model). GEE analysis, on the other hand, performs this adjustment by directly modeling the correlations between the repeated measurements within the subjects. Although linear mixed model analysis and GEE analysis give highly similar results, linear mixed model analysis is the more widely used of the two in longitudinal epidemiologic studies. This is probably because mixed model analysis handles missing data slightly better and is slightly more flexible in modeling the dependency of the observations within the subject.

One of the problems with mixed model analysis is the confusing terminology. Mixed model analysis is also known as hierarchical linear modeling, multilevel analysis, random effects modeling, or random coefficient analysis: many different names for the same method. Furthermore, when mixed model analysis is used in epidemiologic studies, the regression model is said to be divided into two parts. The fixed part contains the regression coefficients, whereas the random part contains the random intercept and/or random slope variance [3], [4]. Within econometrics and sociology, however, a distinction is made for longitudinal studies between fixed effects models, between-effects models, and random effects models. In that terminology, a fixed effects model is not simply a model containing the regression coefficients (the fixed part), but a model in which only the within-subject part of the relationship is estimated. In a between-effects model, only the between-subject part of the relationship is estimated, whereas a random effects model is basically the same as a regular mixed model analysis in epidemiology [5], [6], [7].

In longitudinal studies, the interpretation of the regression coefficient deserves specific attention, particularly when analyzing the association between two variables that vary over time. When the independent variable is time-dependent, the regression coefficient has a twofold interpretation: a between-subject component and a within-subject component. Although the combined interpretation reflects the total longitudinal relationship between two (time-dependent) variables, in some situations the researcher may want to disentangle the within- and between-subject interpretations. Several models are available to disentangle the two effects [2]. Of these, the hybrid model seems to be the best option [3], [8], [9], [10], [11], [12]. However, this method is rarely used in epidemiologic practice.

Therefore, the purpose of the present article was to illustrate and discuss the use of the hybrid model as a possible tool to disentangle the within- and between-subject parts of the relationship in longitudinal epidemiologic studies.

Section snippets

The hybrid model

When a longitudinal data analysis is performed, the between-subject relationship is basically nothing more than the relationship between the mean value of the particular independent variable for each subject and the outcome (Equation 1). To obtain the within-subject part of the relationship, the independent variable must be centered around the mean value of the particular subject; the resulting centered value is known as the deviation score (Equation 2). To obtain both the within- and
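The two derived variables described in this section can be computed directly from long-format data. The following is a minimal sketch in Python rather than the authors' own code: the function name `add_hybrid_terms`, the tuple layout `(subject, x, y)`, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def add_hybrid_terms(records):
    """Return (subject, individual_mean, deviation, y) for each observation.

    records: iterable of (subject, x, y) tuples in long format,
    one tuple per subject per measurement occasion.
    """
    records = list(records)
    # Collect every observed x value per subject.
    x_by_subject = defaultdict(list)
    for subject, x, _y in records:
        x_by_subject[subject].append(x)
    # Between-subject term: the individual mean of x over time.
    individual_mean = {s: mean(xs) for s, xs in x_by_subject.items()}
    # Within-subject term: the deviation of each observation from that mean.
    return [
        (subject, individual_mean[subject], x - individual_mean[subject], y)
        for subject, x, y in records
    ]


# Hypothetical toy data: two subjects, three measurements each.
toy = [
    ("a", 1.0, 2.0), ("a", 2.0, 3.0), ("a", 3.0, 4.0),
    ("b", 4.0, 3.0), ("b", 5.0, 4.0), ("b", 6.0, 5.0),
]
rows = add_hybrid_terms(toy)
for row in rows:
    print(row)
```

The individual mean is constant within a subject and the deviation scores sum to zero within each subject. In a hybrid analysis, both columns would then be entered together as independent variables in the mixed model; that fitting step is not shown here.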

Results

Table 1 shows descriptive information regarding the datasets used in the examples of the present article.

Table 2 shows the results of the longitudinal analyses regarding the relationship between cholesterol and SSF. As can be seen from the results of the first three analyses, the regression coefficients obtained from the hybrid model, including the individual mean and the deviation score, are equal to the regression coefficients obtained from the two separate analyses. This can be explained by

Discussion

In the present article, the hybrid model (including both the individual mean and the deviation score as independent variables) was illustrated and discussed as a way to disentangle the within- and between-subject parts of a longitudinal relationship. It was shown that the overall regression coefficient obtained from a regular mixed model analysis is a kind of weighted average of the two separate relationships obtained from a hybrid model. The hybrid model thus offers a possibility to
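To make the relation between the two models concrete, they can be written side by side. The notation is an assumption made for this sketch and is not taken from the article's Equations 1 and 2: $Y_{it}$ is the outcome for subject $i$ at time $t$, $X_{it}$ the time-dependent independent variable, $\bar{X}_i$ the individual mean over time, $u_i$ a random intercept, and $\varepsilon_{it}$ the residual.

```latex
\begin{align}
\text{standard mixed model:}\quad
  Y_{it} &= \beta_0 + \beta_1 X_{it} + u_i + \varepsilon_{it} \\
\text{hybrid model:}\quad
  Y_{it} &= \beta_0 + \beta_B \bar{X}_i + \beta_W \,(X_{it} - \bar{X}_i) + u_i + \varepsilon_{it}
\end{align}
```

Because $X_{it} = \bar{X}_i + (X_{it} - \bar{X}_i)$, the standard model is the special case of the hybrid model in which $\beta_B = \beta_W = \beta_1$. When the between- and within-subject coefficients differ, the single coefficient $\beta_1$ has to compromise between them.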

Conclusion

The hybrid model is an elegant, easy-to-perform method for disentangling the within- and between-subject parts of a relationship in longitudinal studies.

References (18)

  • S.C. Newman. Commonalities in the classical, collapsibility and counterfactual concepts in confounding. J Clin Epidemiol (2004)

  • G. Fitzmaurice et al. Applied longitudinal analysis (2011)

  • J.W.R. Twisk. Applied longitudinal data analysis for epidemiology. A practical guide (2013)

  • P.J. Curran et al. The disaggregation of within-person and between-person effects in longitudinal models of change. Annu Rev Psychol (2001)

  • J.W.R. Twisk. Applied mixed model analysis. A practical guide (2018)

  • C. Hsiao. Analysis of panel data (2003)

  • C.N. Halaby. Panel models in sociological research: theory into practice. Annu Rev Sociol (2004)

  • J.M. Wooldridge. Econometric analysis of cross section and panel data (2010)

  • J.M. Neuhaus et al. Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics (1998)

There are more references available in the full text version of this article.

Declarations of interest: none.
