Original Article
Hybrid models were found to be very elegant to disentangle longitudinal within- and between-subject relationships
Introduction
Within the field of epidemiology, there is an increasing interest in observational longitudinal studies. For the analysis of longitudinal data, mixed model analysis and generalized estimating equations (GEE analysis) are the two most commonly used methods [1], [2]. Both techniques are extensions of regression analysis, and the general idea behind both is that an adjustment is made for the dependency of the observations within the subject. Mixed model analysis performs this adjustment by modeling the differences between the subjects either in the intercept (i.e., by adding a random intercept to the model) or in the regression coefficients for time-dependent independent variables (i.e., by adding random slopes to the model). GEE analysis, on the other hand, performs this adjustment by directly modeling the correlations between the repeated measurements within the subjects. Although linear mixed model analysis and GEE analysis give highly similar results, linear mixed model analysis is the more commonly used of the two in longitudinal epidemiologic studies. This is probably because mixed model analysis performs slightly better when data are missing and is slightly more flexible in modeling the dependency of the observations within the subject.
One of the problems with mixed model analysis is its confusing terminology. Mixed model analysis is also known as hierarchical linear modeling, multilevel analysis, random effects modeling, or random coefficient analysis: many different names for the same method. Furthermore, when mixed model analysis is used in epidemiologic studies, the regression model is said to be divided into two parts. The fixed part contains the regression coefficients, whereas the random part contains the random intercept and/or random slope variance [3], [4]. Within econometrics and sociology, by contrast, a distinction is made for longitudinal studies between fixed effects models, between-effects models, and random effects models. A fixed effects model in this sense is not just the part of the model that contains the regression coefficients; it is a model in which only the within-subject part of the relationship is estimated. In a between-effects model, only the between-subject part of the relationship is estimated, whereas a random effects model is basically the same as a regular mixed model analysis in epidemiology [5], [6], [7].
In longitudinal studies, the interpretation of the regression coefficient deserves specific attention, in particular when analyzing the association between two variables that vary over time. When the independent variable is time-dependent, the regression coefficient has a twofold interpretation: a between-subject component and a within-subject component. Although the combined interpretation reflects the total longitudinal relationship between two (time-dependent) variables, in some situations the researcher may want to disentangle the within- and between-subject interpretations. Several models are available to disentangle the two effects [2]. Of these, the hybrid model seems to be the best option [3], [8], [9], [10], [11], [12]. However, this method is not widely used in epidemiologic practice.
Therefore, the purpose of the present article was to illustrate and discuss the use of the hybrid model as a possible tool to disentangle the within- and between-subject part of the relationship in longitudinal epidemiologic studies.
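As a sketch of what disentangling the two parts means, the model can be written in generic notation. The symbols below are assumptions chosen for illustration and are not taken from the article's own equations: $Y_{it}$ is the outcome and $X_{it}$ the time-dependent independent variable for subject $i$ at measurement $t$, $T_i$ is the number of measurements for subject $i$, $u_i$ is a random intercept, and $\varepsilon_{it}$ is the residual.

```latex
% Subject-specific mean of the independent variable (between-subject part)
\bar{X}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} X_{it}

% Hybrid model: individual mean and deviation score entered together
Y_{it} = \beta_0 + \beta_B \, \bar{X}_i + \beta_W \, (X_{it} - \bar{X}_i) + u_i + \varepsilon_{it}

% Regular mixed model: the special case \beta_B = \beta_W = \beta
Y_{it} = \beta_0 + \beta \, X_{it} + u_i + \varepsilon_{it}
```

In this notation, $\beta_B$ carries the between-subject interpretation and $\beta_W$ the within-subject interpretation, whereas the single coefficient $\beta$ of the regular mixed model combines the two.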
The hybrid model
When a longitudinal data analysis is performed, the between-subject relationship is basically nothing more than the relationship between the outcome and each subject's mean value of the particular independent variable (Equation 1). To obtain the within-subject part of the relationship, the independent variable must be centered around the mean value of the particular subject; the resulting centered value is known as the deviation score (Equation 2). To obtain both the within- and
Results
Table 1 shows descriptive information regarding the datasets used in the examples of the present article.
Table 2 shows the results of the longitudinal analyses regarding the relationship between cholesterol and SSF. As can be seen from the results of the first three analyses, the regression coefficients obtained from the hybrid model, including the individual mean and the deviation score, are equal to the regression coefficients obtained from the two separate analyses. This can be explained by
Discussion
In the present article, the hybrid model (including both the individual mean and the deviation score as independent variables) was illustrated and discussed as a way to disentangle the within- and between-subject parts of a longitudinal relationship. It was shown that the overall regression coefficient obtained from a regular mixed model analysis is a kind of weighted average of the two separate relationships obtained from a hybrid model. The hybrid model thus offers a possibility to
Conclusion
The hybrid model is an elegant, easy-to-perform method to disentangle the within- and between-subject parts of a relationship in longitudinal studies.
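To illustrate how little is needed to construct the two predictors, the following is a minimal sketch of the decomposition step only. It is not the article's analysis: the data are a made-up toy example, the function names `decompose` and `ols_slope` are hypothetical, and ordinary least squares slopes stand in for the mixed model fit (no random intercept is estimated here).

```python
from collections import defaultdict
from statistics import mean


def decompose(records):
    """Split a time-dependent predictor into a subject mean and a deviation score.

    records: iterable of (subject_id, x, y), one tuple per repeated measurement.
    Returns one dict per measurement with the between-subject part (x_mean)
    and the within-subject part (x_dev = x - x_mean).
    """
    records = list(records)
    x_by_subject = defaultdict(list)
    for subject_id, x, _y in records:
        x_by_subject[subject_id].append(x)
    x_mean = {sid: mean(xs) for sid, xs in x_by_subject.items()}
    return [
        {"id": sid, "x": x, "x_mean": x_mean[sid], "x_dev": x - x_mean[sid], "y": y}
        for sid, x, y in records
    ]


def ols_slope(xs, ys):
    """Slope of a simple least squares regression of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data: three subjects, three measurements each.
# Constructed so that y = 2 * (subject mean of x) + 0.5 * (deviation score).
toy = [
    ("A", 1.0, 3.5), ("A", 2.0, 4.0), ("A", 3.0, 4.5),
    ("B", 4.0, 9.5), ("B", 5.0, 10.0), ("B", 6.0, 10.5),
    ("C", 7.0, 15.5), ("C", 8.0, 16.0), ("C", 9.0, 16.5),
]

rows = decompose(toy)
ys = [r["y"] for r in rows]

between = ols_slope([r["x_mean"] for r in rows], ys)  # between-subject part
within = ols_slope([r["x_dev"] for r in rows], ys)    # within-subject part
overall = ols_slope([r["x"] for r in rows], ys)       # undecomposed predictor

print(round(between, 2), round(within, 2), round(overall, 2))
```

In this toy example the between-subject slope is 2, the within-subject slope is 0.5, and the slope for the undecomposed predictor lies between them at about 1.85. Because the deviation scores sum to zero within each subject, they are uncorrelated with the subject means, which is why the two slopes can be estimated separately here. In practice, `x_mean` and `x_dev` would be entered together as independent variables in a mixed model, where the weighting of the two parts differs from this least squares illustration.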
References (18)
Commonalities in the classical, collapsibility and counterfactual concepts in confounding. J Clin Epidemiol (2004)
et al. Applied longitudinal analysis (2011)
Applied longitudinal data analysis for epidemiology. A practical guide (2013)
et al. The disaggregation of within-person and between-person effects in longitudinal models of change. Annu Rev Psychol (2001)
Applied mixed model analysis. A practical guide (2018)
Analysis of panel data (2003)
Panel models in sociological research: theory into practice. Annu Rev Sociol (2004)
Econometric analysis of cross section and panel data (2010)
et al. Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics (1998)
Declarations of interest: none.