Skewness and Kurtosis in Real Data Samples
Abstract
Parametric statistics are based on the assumption of normality. Recent findings suggest that Type I error and power can be adversely affected when data are non-normal. This paper assesses the distributional shape of real data by examining the third and fourth central moments as measures of skewness and kurtosis in small samples. The analysis covered 693 distributions with sample sizes ranging from 10 to 30. Measures of cognitive ability and of other psychological variables were included. Skewness ranged between −2.49 and 2.33, and kurtosis ranged between −1.92 and 7.41. Considering skewness and kurtosis together, only 5.5% of the distributions were close to the values expected under normality. Although extreme contamination does not seem to be very frequent, the findings are consistent with previous research suggesting that normality is not the rule with real data.
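As an illustration of the quantities discussed in the abstract, the following minimal sketch computes skewness and kurtosis from the second, third, and fourth central moments of a small sample. It assumes the simple moment-based definitions (skewness as m3 / m2^1.5 and kurtosis as m4 / m2^2 − 3, so that a normal distribution has kurtosis 0). The abstract does not state which estimator or bias correction the paper used, so the function names and the example sample here are hypothetical.

```python
def central_moment(xs, k):
    """k-th central moment of a sample, dividing by n (no bias correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def skewness(xs):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    m2 = central_moment(xs, 2)
    m3 = central_moment(xs, 3)
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so the normal distribution scores 0."""
    m2 = central_moment(xs, 2)
    m4 = central_moment(xs, 4)
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    # Hypothetical small sample (n = 10) with one large value in the right tail.
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
    print("skewness:", skewness(sample))
    print("excess kurtosis:", excess_kurtosis(sample))
```

Under these definitions the kurtosis can never fall below −2, a bound reached by a sample split evenly between two values. The reported minimum of −1.92 is consistent with kurtosis being expressed relative to the normal value in this way, although that reading is an inference from the abstract rather than something it states.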