The probability of false positives in zero-dimensional analyses of one-dimensional kinematic, force and EMG trajectories
Introduction
In classical hypothesis testing p values represent the probability that a random process would produce an effect larger than the observed one. There are unfortunately many ways in which p value computations can go astray to yield “false positives” (Knudson, 2005; Knudson and Lindsey, 2014): the mistake of inferring an experimental effect when none exists in reality. This paper deals with one specific pitfall in p value computations which has not previously been quantified and is relevant to many branches of Biomechanics: zero-dimensional (0D) p values for one-dimensional (1D) (e.g. time varying) data.
Imagine a simple experiment which yields five scalar 1D force trajectories for each of two groups (Fig. 1b). Classical hypothesis testing can be conducted using either a “0D” or a “1D” approach (Pataky et al., 2015).
Zero-dimensional analysis: One could analyze the data using a 0D summary metric like the local maximum (Fig. 1a). In this case a one-tailed two-sample t test of the maxima yields t=2.357, p=0.023 and one would reject the null hypothesis at α=0.05. More completely, the 0D residuals (Fig. 1c) represent the variance about the group means, and one judges the effect size x (Fig. 1a) against this variance. If the null hypothesis (x=0) were true, random 0D data with the same variance would produce a distribution of t values over an infinite number of identical experiments (Fig. 1e) and only 2.3% of those values would be greater than the observed t=2.357. The null hypothesis is rejected because the observed t value exceeds the threshold corresponding to α (Fig. 1e and g). This value can be rapidly computed in statistical software packages, or it can be computed by iteratively simulating thousands of t tests on randomly generated samples of 0D Gaussian data.
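The iterative simulation mentioned above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names `two_sample_t` and `simulate_0d_p`, the iteration count and the random seed are all assumptions introduced here. It draws two groups of five 0D Gaussian values under the null hypothesis, computes a pooled-variance two-sample t statistic, and counts how often that statistic exceeds the observed t=2.357.

```python
import math
import random


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic for two lists of scalars."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))


def simulate_0d_p(t_observed, n=5, iterations=20000, seed=0):
    """Fraction of simulated null experiments whose t exceeds t_observed.

    n, iterations and seed are illustrative choices, not values from the paper
    (apart from n=5, which matches the five trajectories per group in Fig. 1).
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(iterations):
        # Both groups come from the same 0D Gaussian, so the null hypothesis holds.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sample_t(a, b) > t_observed:
            exceed += 1
    return exceed / iterations


p_0d = simulate_0d_p(2.357)
print("simulated one-tailed 0D p value: %.3f" % p_0d)
```

With enough iterations the simulated fraction should settle near the analytical one-tailed p value for 8 degrees of freedom, subject to Monte Carlo error.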
One-dimensional analysis: One could alternatively analyze the data using 1D methods (Lenhoff et al., 1999, Pataky et al., 2015) (Fig. 1, right panels). Analogous to the 0D procedure, the 1D residuals (Fig. 1d) embody the variance about mean trajectories, and the null hypothesis is the null difference trajectory: x(q)=0, where q is time or the 1D measurement domain. If the null hypothesis were true, random 1D data with the same variance and same smoothness would produce a distribution of t trajectory maxima over an infinite number of identical experiments (Fig. 1f), and in this case random 1D data would produce the observed 0D effect of t=2.357 with a probability of approximately 55%, which is well above α. In other words, random 1D data produce particular t values with generally much greater probability than do random 0D data. The null hypothesis is not rejected because the observed maximum t value, across the whole trajectory, does not exceed the threshold corresponding to α (Fig. 1f and h). The value can be analytically calculated using random field theory (RFT) (Adler and Taylor, 2007, Friston et al., 2007), or it can be computed by iteratively simulating thousands of t tests on randomly generated samples of smooth 1D Gaussian data (Fig. 2) (Pataky, 2016).
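The 1D simulation can be sketched in the same way. The code below is an illustrative stand-in and does not use the rft1d package: it generates smooth 1D Gaussian trajectories by convolving white noise with a Gaussian kernel, computes a two-sample t trajectory, and records how often the trajectory maximum exceeds t=2.357. The helper names, the 101-node domain, the FWHM of 20 nodes, the iteration count and the seed are all assumptions chosen for illustration, so the resulting probability should not be expected to match the approximately 55% reported for Fig. 1.

```python
import math
import random


def gaussian_kernel(fwhm):
    """Gaussian smoothing kernel (truncated at 3 SD), scaled to unit output variance."""
    sd = fwhm / math.sqrt(8.0 * math.log(2.0))
    half = int(math.ceil(3.0 * sd))
    k = [math.exp(-0.5 * (i / sd) ** 2) for i in range(-half, half + 1)]
    norm = math.sqrt(sum(v * v for v in k))
    return [v / norm for v in k]


def smooth_trajectory(rng, kernel, q):
    """One smooth, unit-variance Gaussian trajectory with q nodes."""
    m = len(kernel)
    # Padding the white noise keeps the variance constant across the domain.
    noise = [rng.gauss(0.0, 1.0) for _ in range(q + m - 1)]
    return [sum(k * w for k, w in zip(kernel, noise[i:i + m])) for i in range(q)]


def max_t(group_a, group_b):
    """Maximum of the two-sample t trajectory (equal group sizes assumed)."""
    n = len(group_a)
    best = -float("inf")
    for ya, yb in zip(zip(*group_a), zip(*group_b)):  # iterate over nodes
        ma, mb = sum(ya) / n, sum(yb) / n
        va = sum((v - ma) ** 2 for v in ya) / (n - 1)
        vb = sum((v - mb) ** 2 for v in yb) / (n - 1)
        best = max(best, (ma - mb) / math.sqrt((va + vb) / n))
    return best


def simulate_1d_p(t_observed, n=5, q=101, fwhm=20.0, iterations=300, seed=0):
    """Fraction of null 1D experiments whose maximum t exceeds t_observed."""
    rng = random.Random(seed)
    kernel = gaussian_kernel(fwhm)
    exceed = 0
    for _ in range(iterations):
        a = [smooth_trajectory(rng, kernel, q) for _ in range(n)]
        b = [smooth_trajectory(rng, kernel, q) for _ in range(n)]
        if max_t(a, b) > t_observed:
            exceed += 1
    return exceed / iterations


p_1d = simulate_1d_p(2.357)
print("simulated probability that max t exceeds 2.357: %.3f" % p_1d)
```

Lowering `fwhm` (rougher trajectories) or lengthening the domain increases the number of effectively independent nodes, so the simulated probability rises further above the 0D value.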
The 0D and 1D procedures have yielded opposite hypothesis testing results, so which is correct? The answer depends on the a priori hypothesis. If, prior to conducting an experiment, one explicitly identified that particular 0D metric as the sole metric of empirical interest then the empirical question is inherently 0D and the 0D result is correct. If, however, one measured 1D data and did not specify that particular 0D metric prior to the experiment then the empirical question is inherently 1D and the 1D result is correct. Failing to specify a 0D metric prior to a 1D experiment and then adopting 0D methods has been termed “regional focus bias” (Pataky et al., 2013) and is a potential source of false positives. False positive prevalence for 1D biomechanics datasets has previously been estimated for 0D procedures (Knudson, 2005) but not, to our knowledge, in the context of 0D vs. 1D procedures.
The purpose of this study was to quantify the false positive rates that could be expected in real 1D biomechanical datasets when employing 0D statistical inference. To that end we analyzed nine public datasets (Table 2) representing a variety of experimental tasks (walking, running, cutting, and cycling) and data modalities (forces, kinematics, and EMG). Based on the data's temporal smoothness we estimated the likelihood of producing false positives using a simplified experimental design: a two-sample t test with N=10 for each group. Analyses for more complex designs like ANOVA are addressed in the Discussion (Section 4.3). The key theoretical concept we shall attempt to convey is that two parameters – the mean (μ) and standard deviation (σ) – describe 0D Gaussian behavior, and that only one additional parameter – 1D smoothness (FWHM) (Fig. 2) (Appendices A and B) – is needed to describe 1D Gaussian behavior.
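To make the smoothness parameter concrete, the following sketch estimates FWHM from a set of residual trajectories using a gradient-based estimator in the spirit of the robust smoothness estimation approach listed in the references (NeuroImage, 1999). Whether this matches the exact procedure of Appendices A and B is an assumption, and the function names, the simulated test data (20 trajectories, 101 nodes, true FWHM of 20 nodes) and the seed are hypothetical choices made only for illustration.

```python
import math
import random


def smooth_gaussian_trajectory(rng, fwhm, q):
    """Smooth Gaussian trajectory: white noise convolved with a Gaussian kernel."""
    sd = fwhm / math.sqrt(8.0 * math.log(2.0))
    half = int(math.ceil(3.0 * sd))
    kernel = [math.exp(-0.5 * (i / sd) ** 2) for i in range(-half, half + 1)]
    m = len(kernel)
    noise = [rng.gauss(0.0, 1.0) for _ in range(q + m - 1)]
    return [sum(k * w for k, w in zip(kernel, noise[i:i + m])) for i in range(q)]


def gradient(y):
    """Central differences in the interior, one-sided differences at the ends."""
    n = len(y)
    g = [0.0] * n
    g[0] = y[1] - y[0]
    g[-1] = y[-1] - y[-2]
    for i in range(1, n - 1):
        g[i] = 0.5 * (y[i + 1] - y[i - 1])
    return g


def estimate_fwhm(residuals):
    """Estimate FWHM (in nodes) from a list of residual trajectories."""
    q = len(residuals[0])
    grads = [gradient(r) for r in residuals]
    resels_per_node = []
    for i in range(q):
        ssq = sum(r[i] ** 2 for r in residuals)      # residual sum of squares
        ssq_grad = sum(g[i] ** 2 for g in grads)     # gradient sum of squares
        resels_per_node.append(math.sqrt(ssq_grad / ssq / (4.0 * math.log(2.0))))
    return q / sum(resels_per_node)  # reciprocal of the mean resels per node


rng = random.Random(1)
trajectories = [smooth_gaussian_trajectory(rng, 20.0, 101) for _ in range(20)]
node_means = [sum(col) / len(col) for col in zip(*trajectories)]
residuals = [[v - m for v, m in zip(y, node_means)] for y in trajectories]
fwhm_hat = estimate_fwhm(residuals)
print("estimated FWHM: %.1f nodes (simulated with FWHM = 20)" % fwhm_hat)
```

Because the estimate is normalized by the residual variance at each node, it depends only on how quickly the residuals change along q and not on their amplitude, which is why FWHM can serve as a third parameter alongside μ and σ.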
To clarify our “0D” and “1D” terminology we shall also employ the term: “nDmD”, where n and m are the dimensionalities of the measurement domain and dependent variable, respectively (Table 1). In nDmD datasets the physical nature of the variables changes across the m components but not across the nD measurement domain. Biomechanics studies often measure 1DmD data but use 0D1D models of randomness to define critical statistical thresholds, and this paper quantifies false positive rates associated with that approach. Throughout this paper “0D” and “1D” refer to “0D1D” and “1DmD”, respectively.
We emphasize that this paper focusses on just a single statistical issue: the probability of false positives in a single 0D1D two-sample t test conducted on 1DmD data. We use only the two-sample t test because this simple test sufficiently demonstrates the magnitude of the false positives problem and because the problem is exacerbated in more complex designs like ANOVA. We acknowledge that many other issues must be considered when conducting statistical analyses including: small sample sizes, non-sphericity, normality, outliers, etc. Just as it is useful to consider each of these issues individually, we feel it is equally useful to consider 0D vs. 1D analysis individually because this issue is relevant to all 1D data analyses but has not been explicitly addressed in the literature.
Methods
All analyses were implemented in Python 2.7 (van Rossum, 2014) using Canopy 1.4 (Enthought Inc., Austin, USA) and the open-source software package “rft1d” (Pataky, 2016) (www.spm1d.org/rft1d). For readers unfamiliar with Python, MATLAB source code (The MathWorks, Natick, USA) replicating the study's main analyses and results is provided as Supplementary Material (Appendix C).
Experimental data smoothness
Residual 1D trajectories from all datasets, two of which are depicted in Fig. 3, were qualitatively consistent with simulated smooth Gaussian 1D trajectories (Fig. 2). Quantitative consistency between 1D residuals and Gaussian 1D trajectories has been demonstrated elsewhere (Pataky et al., 2015).
Residual smoothness estimates yielded minimum, median and maximum FWHM values of: 6.2%, 16.5% and 67.0%, respectively, across all datasets (Table 3). Kinematic residuals were smoothest on average,
Main implications
The convention of α=0.05 implies that one accepts a 5% false positive rate when conducting classical hypothesis testing. The main result of this study was that smooth, random 1D trajectories generally produce false positives in 0D analyses with a probability much higher than α. Even for the best case – maximum smoothness (FWHM=67.0%) and one scalar trajectory – false positive rates were nearly three times greater than α (p=0.145, Table 4). For the median smoothness observed across all datasets
Conflict of interest
None declared.
Acknowledgments
This work was supported by Wakate A Grant 15H05360 from the Japan Society for the Promotion of Science. We also wish to thank Cyril J. Donnelly for helpful discussions and continued support.
References (28)
- et al. Knee muscle forces during walking and running in patellofemoral pain patients and pain-free controls. J. Biomech. (2009)
- et al. Ground reaction forces in distance running. J. Biomech. (1980)
- et al. Robust smoothness estimation in statistical parametric maps using standardized residuals from the general linear model. NeuroImage (1999)
- et al. Bootstrap prediction and confidence bands: a superior statistical method for analysis of gait data. Gait Posture (1999)
- et al. Vector field statistical analysis of kinematic and force trajectories. J. Biomech. (2013)
- et al. Zero- vs. one-dimensional, parametric vs. non-parametric, and confidence interval vs. hypothesis testing procedures in one-dimensional biomechanical trajectory analysis. J. Biomech. (2015)
- et al. Statistical parametric mapping for alpha-based statistical analyses of multi-muscle EMG time-series. J. Electromyogr. Kinesiol. (2015)
- et al. The gait deviation index: a new comprehensive index of gait pathology. Gait Posture (2008)
- et al. Unified univariate and multivariate random field theory. NeuroImage (2004)
- et al. Random Fields and Geometry. (2007)
- External loading of the knee joint during running and cutting maneuvers. Med. Sci. Sports Exercise
- Muscle activation strategies at the knee during running and cutting maneuvers. Med. Sci. Sports Exercise
- The detection of local shape changes via the geometry of Hotelling's T2 fields. Ann. Stat.
- Muscular strategy shift in human running: dependence of running speed on hip and ankle muscle performance. J. Exp. Biol.