The multisample Cucconi test

Marozzi, Marco

doi:10.1007/s10260-014-0255-x

Published: 15 February 2014

Volume 23, pages 209–227, (2014)
Statistical Methods & Applications

Abstract

The multisample version of the Cucconi rank test for the two-sample location-scale problem is proposed. Even though little known, the Cucconi test is of interest for several reasons. The test is compared with some Lepage-type tests. It is shown that the multisample Cucconi test is slightly more powerful than the multisample Lepage test. Moreover, its test statistic can be computed analytically whereas several others cannot. A practical application example in experimental nutrition is presented. An R function to perform the multisample Cucconi test is given.

References

Adamson GCD, Nash DJ (2013) Long-term variability in the date of monsoon onset over western India. Clim Dyn 40:2589–2603
Akkouchi M (2005) On the convolution of gamma distributions. Soochow J Math 31:205–211
Baumgartner W, Weiss P, Schindler H (1998) A nonparametric test for the general two-sample problem. Biometrics 54:1129–1135
Bausch J (2012) On the efficient calculation of a linear combination of chi-square random variables with an application in counting string vacua, arXiv:1208.2691v2
Boos DD, Zhang J (2000) Monte Carlo evaluation of resampling-based hypothesis tests. J Am Stat Assoc 95:486–492
Buning H, Thadewald T (2000) An adaptive two-sample location-scale test of Lepage-type for symmetric distributions. J Stat Comput Simul 65:287–310
Castano-Martinez A, Lopez-Blazquez F (2005) Distribution of a sum of weighted noncentral chi-square variables. Test 14:397–415
Cucconi O (1968) Un nuovo test non parametrico per il confronto tra due gruppi campionari. Giornale degli Economisti 27:225–248
Gerhard D, Hothorn LA (2010) Rank transformation in Haseman-Elston regression using scores for location-scale alternatives. Hum Hered 69:143–151
Hajek J, Sidak Z, Sen PK (1998) Theory of rank tests, 2nd edn. Academic Press, New York
Kruskal WH, Wallis WA (1952) Use of ranks in one criterion variance analysis. J Am Stat Assoc 47:583–621
Lepage Y (1971) A combination of Wilcoxon’s and Ansari-Bradley’s statistics. Biometrika 58:213–217
Lindsay BG, Pilla RS, Basak P (2000) Moment-based approximations of distributions using mixtures: theory and applications. Ann Inst Math Stat 52:215–230
Lunde A, Timmermann A (2004) Duration dependence in stock prices: an analysis of bull and bear markets. J Bus Econ Stat 22:253–273
Marozzi M (2007) Multivariate tri-aspect non-parametric testing. J Nonparametr Stat 19:269–282
Marozzi M (2009) Some notes on the location-scale Cucconi test. J Nonparametr Stat 21:629–647
Marozzi M (2012a) A combined test for differences in scale based on the interquantile range. Stat Pap 53:61–72
Marozzi M (2012b) A modified Hall–Padmanabhan test for the homogeneity of scales. Commun Stat–Theory Methods 41(16–17):3068–3078
Marozzi M (2012c) A modified Cucconi test for location and scale change alternatives. Colomb J Stat 35:369–382
Marozzi M (2013) Nonparametric simultaneous tests for location and scale testing: a comparison of several methods. Commun Stat–Simul Comput 42(6):1298–1317
Manly BFJ, Francis RICC (2002) Testing for mean and variance differences with samples from distributions that may be non-normal with unequal variances. J Stat Comput Simul 72(8):633–646
Moore DS, McCabe GP (2009) Introduction to the practice of statistics, 6th edn. Freeman, New York
Muccioli C, Belford R, Podgor M, Sampaio P, de Smet M, Nussenblatt R (1993) The diagnosis of intraocular inflammation and cytomegalovirus retinitis in HIV-infected patients by laser flare photometry. Ocul Immunol Inflamm 4(2):75–81
Murakami H (2007) Lepage-type statistic based on the modified Baumgartner statistic. Comput Stat Data Anal 51:5061–5067
Murakami H (2008) A multisample rank test for location-scale parameters. Commun Stat–Simul Comput 37:1347–1355
Neuhauser M (2000) An exact two-sample test based on the Baumgartner–Weiss–Schindler statistic and a modification of Lepage’s test. Commun Stat–Theory Methods 29:67–78
Neuhauser M (2001) An adaptive location-scale test. Biom J 43:809–819
Neuhauser M, Kotzmann J, Walier M, Poulin R (2010) The comparison of mean crowding between two groups. J Parasitol 96:477–481
Oden NL (1991) Allocation of effort in Monte Carlo simulation for power of permutation tests. J Am Stat Assoc 86:1074–1076
Pesarin F, Salmaso L (2010) Permutation tests for complex data. Wiley, Chichester
Podgor MJ, Gastwirth JL (1994) On non-parametric and generalized tests for the two-sample problem with location and scale change alternatives. Stat Med 13:747–758
Palar K, Sturm R (2009) Potential societal savings from reduced sodium consumption in the U.S. adult population. Am J Health Promot 24:49–57
Puri ML (1965) On some tests of homogeneity of variances. Ann Inst Stat Math 17:323–330
Rublik F (2005) The multisample version of the Lepage test. Kybernetika 41:713–733

Author information

Authors and Affiliations

Department of Economics, Statistics and Finance, University of Calabria, Rende CS, Italy
Marco Marozzi

Corresponding author

Correspondence to Marco Marozzi.

Appendices

Appendix 1

Cucconi (1968) does not organize the formal results on $\sum _{i=1}^{n_{k}}R_{ki}^{2}$ and $\sum _{i=1}^{n_{k}}( n+1-R_{ki})^{2}$ into lemmas and theorems, but only outlines some of the derivations. Here we reorganize the results on the sums of squared ranks and squared antiranks more clearly, giving complete proofs of all results. Well-known results on the sums of powers of the first $n$ natural numbers are used throughout this section.

Theorem 1

Under the null hypothesis

$$\begin{aligned} E\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) =n_{k}\left( n+1\right) \left( 2n+1\right) /6 \ \forall k. \end{aligned}$$

Proof

Consider the population of the first $n$ squared natural numbers $1^{2},2^{2},\ldots ,n^{2}$. The quantity $\frac{1}{n_k} \sum _{i=1}^{n_{k}}R_{ki}^{2}$ may be seen as the mean of a random sample of $n_{k}$ values drawn without replacement from this population. Since the expected value of the sample mean equals the population mean, it follows that

$$\begin{aligned} E\left( \frac{1}{n_{k}} \sum _{i=1}^{n_{k}}R_{ki}^{2}\right) =\frac{1}{n}\sum _{i=1}^{n}i^{2}=\left( n+1\right) \left( 2n+1\right) /6 \end{aligned}$$

$\forall k$, and the result follows on multiplying both sides by $n_{k}$. $\square $
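Theorem 1 can be checked numerically for small cases. The following is a minimal sketch, not taken from the paper, written in Python rather than the R of Appendix 2. It assumes, as the proof does, that under the null hypothesis the ranks of sample $k$ form a simple random sample of size $n_{k}$ drawn without replacement from $1,\ldots ,n$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def exact_mean_sum_sq_ranks(n, n_k):
    """Exact null mean of the sum of squared ranks, by enumerating all subsets."""
    subsets = list(combinations(range(1, n + 1), n_k))
    total = sum(sum(r * r for r in s) for s in subsets)
    return Fraction(total, len(subsets))


def theorem1_mean(n, n_k):
    """Closed form stated in Theorem 1."""
    return Fraction(n_k * (n + 1) * (2 * n + 1), 6)


if __name__ == "__main__":
    # Print the enumerated and closed-form values side by side.
    for n, n_k in [(5, 2), (7, 3), (8, 4)]:
        print(n, n_k, exact_mean_sum_sq_ranks(n, n_k), theorem1_mean(n, n_k))
```

Exact rational arithmetic is used so that agreement with the closed form is not blurred by rounding. Full enumeration is only feasible for small $n$.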

Theorem 2

Under the null hypothesis

$$\begin{aligned} Var\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) =n_{k}\left( n-n_{k}\right) \left( n+1\right) \left( 2n+1\right) \left( 8n+11\right) /180 \ \forall k. \end{aligned}$$

Proof

With simple algebra we first compute the variance $\tau ^2$ of the population of the first $n$ squared natural numbers

$$\begin{aligned} \tau ^{2}&= \frac{1}{n}\sum _{i=1}^{n}\left[ i^{2}-\left( 2n+1\right) \left( n+1\right) /6\right] ^{2}\\&= \left( n^{2}-1\right) \left( 2n+1\right) \left( 8n+11\right) /180. \end{aligned}$$

Now, since the sampling is without replacement

$$\begin{aligned} Var\left( \frac{1}{n_{k}}\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) = \frac{n-n_{k}}{n-1}\frac{\tau ^{2}}{n_{k}} \end{aligned} \tag{8}$$

$\forall k$, and the result follows on multiplying both sides by $n_{k}^{2}$. $\square $
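The variance formula of Theorem 2 can be checked in the same spirit. This is a minimal sketch under the same assumption (the ranks of sample $k$ are a simple random sample without replacement from $1,\ldots ,n$), in Python rather than the paper's R, with hypothetical function names. For $n=5$ and $n_{k}=2$ both the enumeration and the closed form give $561/5$.

```python
from fractions import Fraction
from itertools import combinations


def exact_var_sum_sq_ranks(n, n_k):
    """Exact null variance of the sum of squared ranks, by enumerating all subsets."""
    sums = [sum(r * r for r in s) for s in combinations(range(1, n + 1), n_k)]
    mean = Fraction(sum(sums), len(sums))
    return sum((x - mean) ** 2 for x in sums) / len(sums)


def theorem2_var(n, n_k):
    """Closed form stated in Theorem 2."""
    return Fraction(n_k * (n - n_k) * (n + 1) * (2 * n + 1) * (8 * n + 11), 180)


if __name__ == "__main__":
    # Print the enumerated and closed-form values side by side.
    for n, n_k in [(5, 2), (7, 3), (8, 4)]:
        print(n, n_k, exact_var_sum_sq_ranks(n, n_k), theorem2_var(n, n_k))
```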

The following lemmas will be used for proving Theorem 3.

Lemma 1

Under the null hypothesis

$$\begin{aligned} E\left[ \left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) ^{2}\right]&= n_{k}\left( n-n_{k}\right) \left( n+1\right) \left( 2n+1\right) \left( 8n+11\right) /180\\&\quad +\,n_{k}^{2}\left( n+1\right) ^{2}\left( 2n+1\right) ^{2}/36 \ \forall k. \end{aligned}$$

Proof

Straightforward from Theorems 1 and 2, since $E(X^{2})=Var(X)+\left[ E(X)\right] ^{2}$ (we deliberately do not factor the result). $\square $

Lemma 2

Under the null hypothesis

$$\begin{aligned} E\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\sum _{i=1}^{n_{k}}R_{ki}\right) =n_{k}n\left( n+1\right) ^{2}\left( 2n_{k}+1\right) /12 \ \forall k. \end{aligned}$$

Proof

$$\begin{aligned} E\left( \, \sum _{i=1}^{n_{k}}R_{ki}^{2}\sum _{i=1}^{n_{k}}R_{ki}\right) =E\left( \,\sum _{i=1}^{n_{k}} R_{ki}^{3}\right) +E\left( \,\sum _{i=1}^{n_{k}}\sum _{j\ne i}^{n_{k}}R_{ki}^{2}R_{kj}\right) . \end{aligned}$$

We have $E\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{3}\right) =\frac{n_k}{n}\sum _{j=1}^{n}j^{3}=n_kn\left( n+1\right) ^{2}/4$ and

$$\begin{aligned} E\left( \sum _{i=1}^{n_{k}}\sum _{j\ne i}^{n_{k}}R_{ki}^{2}R_{kj}\right)&= \frac{n_k(n_k-1)}{n\left( n-1\right) }\sum _{j=1}^{n}j^{2}\sum _{l\ne j}^{n}l\\&= \frac{n_k(n_k-1)}{n\left( n-1\right) }\left( \,\sum _{j=1}^{n}j^{2}\sum _{l=1}^{n}l-\sum _{j=1}^{n}j^{3}\right) \\&= n_k(n_k-1)n\left( n+1\right) ^{2}/6. \end{aligned}$$

Finally it follows with simple algebra that

$$\begin{aligned} E\left( \sum _{i=1}^{n_{k}}R_{ki}^{2}\sum _{i=1}^{n_{k}}R_{ki}\right) =n_{k}n\left( n+1\right) ^{2}\left( 2n_{k}+1\right) /12 \end{aligned}$$

$\forall k$. $\square $
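The cross moment of Lemma 2 can also be verified by enumeration. This is a minimal sketch in Python (not the paper's R), again assuming that the ranks of sample $k$ are a simple random sample without replacement from $1,\ldots ,n$; the function names are hypothetical. For $n=5$ and $n_{k}=2$ both sides equal $150$.

```python
from fractions import Fraction
from itertools import combinations


def exact_cross_moment(n, n_k):
    """Exact null value of E[(sum of squared ranks) * (sum of ranks)]."""
    subsets = list(combinations(range(1, n + 1), n_k))
    total = sum(sum(r * r for r in s) * sum(s) for s in subsets)
    return Fraction(total, len(subsets))


def lemma2_cross_moment(n, n_k):
    """Closed form stated in Lemma 2."""
    return Fraction(n_k * n * (n + 1) ** 2 * (2 * n_k + 1), 12)


if __name__ == "__main__":
    # Print the enumerated and closed-form values side by side.
    for n, n_k in [(5, 2), (7, 3), (8, 4)]:
        print(n, n_k, exact_cross_moment(n, n_k), lemma2_cross_moment(n, n_k))
```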

Theorem 3

Under the null hypothesis

$$\begin{aligned} Cor\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2},\sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) =-\frac{30n+14n^{2}+19}{\left( 8n+11\right) \left( 2n+1\right) } \ \forall k. \end{aligned}$$

Proof

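Since the antiranks are themselves a random sample drawn without replacement from $1,\ldots ,n$, the sum of squared antiranks has the same null mean and variance as the sum of squared ranks. Hence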

$$\begin{aligned}&Cor\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2},\sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) \nonumber \\&\quad =\frac{E\left[ \left( \sum _{i=1}^{n_{k}}R_{ki}^{2}\right) \left( \, \sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) \right] -n_{k}^{2}\left( 2n+1\right) ^{2}\left( n+1\right) ^{2}/36}{n_{k}\left( n-n_{k}\right) \left( n+1\right) \left( 2n+1\right) \left( 8n+11\right) /180}. \end{aligned}$$

It remains to compute

$$\begin{aligned}&E\left[ \left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) \left( \, \sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) \right] \nonumber \\&\quad \quad =n_{k}\left( n+1\right) ^{2}E\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) +E \left[ \!\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) ^{2}\right] -2\left( n+1\right) E\!\left( \, \sum _{i=1}^{n_{k}}R_{ki}^{2}\sum _{i=1}^{n_{k}}R_{ki}\right) \!. \end{aligned}$$

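Subtracting the product of the two expectations, which equals $\left[ E\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2}\right) \right] ^{2}$, and using Lemma 1, it follows that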

$$\begin{aligned}&Cov\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2},\sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) \nonumber \\&\quad \quad =n_{k}\left( n+1\right) ^{2}E\left( \, \sum _{i=1}^{n_{k}}R_{ki}^{2}\right) +Var\left( \, \sum _{i=1}^{n_{k}}R_{ki}^{2}\right) -2\left( n+1\right) E\left( \, \sum _{i=1}^{n_{k}}R_{ki}^{2}\sum _{i=1}^{n_{k}}R_{ki}\right) . \end{aligned}$$

Using Theorem 1, Theorem 2 and Lemma 2 it follows with simple algebra that

$$\begin{aligned}&Cov\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2},\sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) \nonumber \\&\quad =-n_{k}\left( n+1\right) \left( 30n+14n^{2}+19\right) \left( n-n_{k}\right) /180 \end{aligned}$$

and finally that

$$\begin{aligned} Cor\left( \,\sum _{i=1}^{n_{k}}R_{ki}^{2},\sum _{i=1}^{n_{k}}\left( n+1-R_{ki}\right) ^{2}\right) =-\frac{30n+14n^{2}+19}{\left( 8n+11\right) \left( 2n+1\right) } \end{aligned}$$

$\forall k$. $\square $
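Theorem 3 states that the correlation depends on $n$ only, not on $n_{k}$. The following minimal sketch, in Python rather than the paper's R and with hypothetical function names, checks this by exact enumeration under the same sampling assumption as the proofs above. It uses the fact that both sums have the same variance, so the correlation equals the covariance divided by that common variance. For $n=5$ the closed form reduces to $-173/187$.

```python
from fractions import Fraction
from itertools import combinations


def exact_correlation(n, n_k):
    """Exact null correlation between the sums of squared ranks and antiranks."""
    subsets = list(combinations(range(1, n + 1), n_k))
    m = len(subsets)
    s = [sum(r * r for r in sub) for sub in subsets]            # squared ranks
    a = [sum((n + 1 - r) ** 2 for r in sub) for sub in subsets]  # squared antiranks
    mean_s, mean_a = Fraction(sum(s), m), Fraction(sum(a), m)
    cov = sum((x - mean_s) * (y - mean_a) for x, y in zip(s, a)) / m
    var_s = sum((x - mean_s) ** 2 for x in s) / m
    var_a = sum((y - mean_a) ** 2 for y in a) / m
    # Equal variances, so sqrt(var_s * var_a) is simply var_s.
    assert var_s == var_a
    return cov / var_s


def theorem3_correlation(n):
    """Closed form stated in Theorem 3."""
    return -Fraction(14 * n * n + 30 * n + 19, (8 * n + 11) * (2 * n + 1))


if __name__ == "__main__":
    # The enumerated value should not change with n_k for a fixed n.
    for n, n_k in [(5, 2), (6, 2), (6, 3), (8, 4)]:
        print(n, n_k, exact_correlation(n, n_k), theorem3_correlation(n))
```

The subset sizes must satisfy $0<n_{k}<n$, since otherwise the variance is zero and the correlation is undefined.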

Appendix 2

An R function for performing the multisample Cucconi test follows.

To analyze the data considered in Sect. 6 run the following code.

About this article

Cite this article

Marozzi, M. The multisample Cucconi test. Stat Methods Appl 23, 209–227 (2014). https://doi.org/10.1007/s10260-014-0255-x

Received: 08 March 2013
Revised: 15 July 2013
Accepted: 26 January 2014
Published: 15 February 2014
Issue Date: June 2014
DOI: https://doi.org/10.1007/s10260-014-0255-x
