On multi-level modeling of data from repeated measures designs: a tutorial
Introduction
Like other behavioral disciplines, the study of speech communication progresses mainly by means of statistical inference. Using strict tools and procedures, researchers generalize from a sample of observed cases to broader contexts. Statistics as a discipline aims to facilitate and improve this inference, and to ensure the validity of the resulting insights. Meanwhile, however, new insights are also achieved within the field of statistics itself, although these insights do not always percolate into actual research practice (Max and Onghena, 1999). Statistical insight and actual research practice are thus at risk of diverging, to the detriment of the latter. In particular, multi-level modeling (henceforth MLM) has emerged in the past decades as a highly flexible and useful tool for statistical analysis and inference (Searle et al., 1992; Bryk and Raudenbush, 1992; Goldstein, 1995; Hox, 1995; Kreft and De Leeuw, 1998; Snijders and Bosker, 1999; McCulloch and Searle, 2001; Raudenbush and Bryk, 2002; Maxwell and Delaney, 2004, Chapter 15). This new tool is also known as the hierarchical linear model, variance component model, or mixed-effects model. MLM has already found wide deployment in disciplines such as sociology (e.g. Carvajal et al., 2001), education (e.g. Broekkamp et al., 2002), biology (e.g. Agrawal et al., 2001; Hall and Bailey, 2001), and medicine (e.g. Beacon and Thompson, 1996; Merlo et al., 2001; Lochner et al., 2001). Its many advantages have also made MLM increasingly popular in behavioral research (e.g. Van der Leeden, 1998; Reise and Duan, 2001; Raudenbush and Bryk, 2002), but so far MLM has made few inroads into speech research.
The purpose of this tutorial is to explain the basics of multi-level modeling, to compare MLM against its more conventional counterpart for hypothesis testing, ANOVA, and to demonstrate the use of MLM in actual research in our field. Some readers might hesitate to learn about, let alone adopt, statistical innovations such as MLM. It will be argued below that the advantages of MLM for inference and insight outweigh the effort that learning and adopting it requires.
The outline of this tutorial is as follows. First, three well-known problems with ANOVA are reviewed, using a fictitious data set. Multi-level modeling promises to solve all three problems: sphericity, hierarchical sampling, and missing data. The subsequent section explains the basics of multi-level modeling, using the same fictitious data set. Analysis results from MLM and from RM-ANOVA, based on the same data set, are then compared and discussed. One notable advantage of MLM is its higher power in hypothesis testing. MLM is then demonstrated in an example analysis of real data from a recently published study. MLM and ANOVA are also compared in a more general fashion, using Monte Carlo simulations. Finally, we discuss the advantages and drawbacks of using multi-level models for research in speech communication.
Sphericity
This tutorial focuses on repeated-measurement designs, with two nested random factors: subjects (or participants), and trials (or occasions) within subjects, respectively. Hence, data are obtained from a multi-level sampling scheme, in which subjects have been sampled first, and trials have been sampled within subjects. These two levels of sampling are usually called level-1 (lower) and level-2 (higher). The factor of interest, like “treatment” or “condition”, constitutes a fixed factor, so
Multi-level modeling
Multi-level modeling promises to solve all three problems with conventional RM-ANOVA discussed above. It is robust against violations of homoscedasticity and sphericity. It is suitable for analyzing data from multi-level sampling schemes. It is also robust against missing data. But how does it work? This section explains the basics of multi-level modeling. Several comprehensive textbooks are available for further study, including Goldstein (1995), Bryk and Raudenbush (1992), Hox (1995), Kreft
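As a rough orientation only, the simplest two-level model for a score on trial i within subject j can be sketched in a common textbook notation. The symbols below are assumptions of this sketch and are not taken from the text above; the tutorial's own notation may differ.

```latex
% Random-intercept model: trials (level 1) nested within subjects (level 2)
Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}, \qquad
u_{0j} \sim N\!\left(0, \sigma^2_{u_0}\right), \quad
e_{ij} \sim N\!\left(0, \sigma^2_{e}\right)

% The total variance splits into a between-subject and a within-subject part
\operatorname{Var}(Y_{ij}) = \sigma^2_{u_0} + \sigma^2_{e}, \qquad
\rho = \frac{\sigma^2_{u_0}}{\sigma^2_{u_0} + \sigma^2_{e}}
```

Here \gamma_{00} is the grand mean, u_{0j} is the deviation of subject j from that mean, e_{ij} is the trial-level residual, and \rho is the intraclass correlation, that is, the share of the total variance located at the subject level. Fixed effects such as treatment or condition would enter as additional terms on the right-hand side of the first equation.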
Multi-level modeling of existing data
As a further demonstration, let us apply the multi-level modeling technique outlined above to real data, to show what could be gained by using such modeling in actual research. To this end, we have re-analyzed data from a recent prosody study with 9 esophageal, 10 tracheoesophageal and 10 laryngeal control speakers (Van Rossum et al., 2002, Experiment 2). The 29 speakers read 10 sentences. Each speaker read each sentence twice, with two different preceding sentences that induced a contrastive
Monte Carlo simulations
In two examples presented above, analyses with MLM yielded a significant main effect or interaction that was not reported by RM-ANOVA. This greater power in detecting effects is due to the more accurate modeling of the variance–covariance matrix, or matrices, at each level of the sampling hierarchy. This reduces the standard errors of the estimated variance components, which in turn leads to more sensitive testing of fixed effects and contrasts. Hence, it seems that MLM has more power to reject
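The idea of simulating two-level data and estimating a variance component at each level can be illustrated with a minimal sketch. It is not the simulation design used in this tutorial: the sample sizes, standard deviations, and function names are hypothetical, and the variance components are estimated with simple method-of-moments formulas for a balanced design rather than with the likelihood-based estimation that MLM software typically uses.

```python
import random
import statistics


def simulate_two_level(n_subjects, n_trials, sd_subject, sd_trial,
                       grand_mean=0.0, rng=None):
    """Simulate trials (level 1) nested within subjects (level 2).

    Returns one list of trial scores per subject. All parameter values
    are hypothetical choices for illustration.
    """
    rng = rng or random.Random()
    data = []
    for _ in range(n_subjects):
        subject_effect = rng.gauss(0.0, sd_subject)  # level-2 deviation
        row = [grand_mean + subject_effect + rng.gauss(0.0, sd_trial)
               for _ in range(n_trials)]             # level-1 residuals
        data.append(row)
    return data


def variance_components(data):
    """Method-of-moments estimates for a balanced two-level design.

    Uses E[MS_between] = var_trial + n_trials * var_subject and
    E[MS_within] = var_trial.
    """
    n_trials = len(data[0])
    subject_means = [statistics.fmean(row) for row in data]
    ms_within = statistics.fmean([statistics.variance(row) for row in data])
    ms_between = n_trials * statistics.variance(subject_means)
    var_trial = ms_within
    # Truncate at zero: a variance estimate cannot be negative.
    var_subject = max(0.0, (ms_between - ms_within) / n_trials)
    return var_subject, var_trial


def run_simulation(n_reps=500, seed=1, n_subjects=20, n_trials=10,
                   sd_subject=1.0, sd_trial=2.0):
    """Average the variance estimates over many simulated data sets."""
    rng = random.Random(seed)
    estimates = [
        variance_components(
            simulate_two_level(n_subjects, n_trials, sd_subject, sd_trial,
                               rng=rng))
        for _ in range(n_reps)
    ]
    mean_subject = statistics.fmean([e[0] for e in estimates])
    mean_trial = statistics.fmean([e[1] for e in estimates])
    return mean_subject, mean_trial


if __name__ == "__main__":
    mean_subject, mean_trial = run_simulation()
    print("mean between-subject variance estimate:", round(mean_subject, 3))
    print("mean within-subject variance estimate:", round(mean_trial, 3))
```

With the hypothetical settings above, the true between-subject variance is 1.0 and the true within-subject variance is 4.0, so the averaged estimates should land close to those values. A power comparison would additionally require a fixed effect and a test statistic for each replication, which this sketch leaves out.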
Discussion and conclusion
The two analyses and the simulations in this tutorial have attempted to demonstrate several important advantages of MLM in comparison with other analysis tools. First, MLM has higher power in finding effects and contrasts in the data. Second, there is no need for disputable assumptions, notably those of homoscedasticity (homogeneity of variance), and of sphericity. Variance and covariance components are estimated from the data, rather than postulated a priori. These variance estimates may in
Acknowledgements
Our sincere thanks are due to Maya van Rossum, for providing data for re-analysis, and for helpful discussions. We also thank Sieb Nooteboom, Guus de Krom, Anne Cutler, Frank Wijnen, Saskia te Riele, Jeroen Raaijmakers, Brian McElree, Peter Dixon, and Robert L. Greene, for valuable discussions, comments and suggestions.
References (42)
- et al. How to deal with “the language-as-fixed-effect fallacy”: common misconceptions and alternative solutions. J. Memory Language (1999)
- et al. On indirect genetic effects in structured populations. Amer. Nat. (2001)
- et al. Multi-level models for repeated measurement data: application to quality of life data in clinical trials. Statist. Med. (1996)
- et al. Importance in instructional text: teachers' and students' perception of task demands. J. Educ. Psychol. (2002)
- Bryk, A., Raudenbush, S., Congdon, R., 2001. HLM: hierarchical linear and nonlinear modeling. Computer program....
- et al. Hierarchical Linear Models: Applications and Data Analysis Methods (1992)
- et al. Multilevel models and unbiased tests for group based interventions: examples from the safer choices study. Multivar. Behav. Res. (2001)
- Sampling Techniques (1977)
- Statistical Power Analysis for the Behavioral Sciences (1988)
- A power primer. Psychological Bull. (1992)
- The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles
- Nonlinear multilevel models with an application to discrete response data. Biometrika
- Multilevel Statistical Models
- A general model for the analysis of multilevel data. Psychometrika
- Multilevel time series models with applications to repeated measures data. Statist. Med.
- Intraclass Correlation and the Analysis of Variance
- Modeling and prediction of forest growth variables based on multilevel nonlinear mixed models. Forest Sci.
- Applied Multilevel Analysis
- Experimental Design: Procedures for the Behavioral Sciences
- Survey Sampling
- Introducing Multilevel Modeling