Abstract:
Measures of dependence among several random vectors, and the associated tests of independence, play a major role in many statistical applications. First, we consider the problem of measuring dependence among several random variables and testing the statistical significance of that measure. We propose a new copula-based dependency measure based on the Gaussian kernel and develop a test of independence based on it. Unlike many existing measures of dependence, our proposed measure is invariant under strictly monotone transformations of the variables, and the proposed test is distribution-free. We establish the consistency of this test under suitable conditions. We also consider multi-scale versions of this test, where we look at the results for various choices of the bandwidth associated with the Gaussian kernel and then aggregate them to arrive at the final decision. Several simulated and real data sets are analyzed to demonstrate the utility of the proposed methods. However, these tests can be used only when the variables are continuous. To handle discrete, ordinal, or binary variables, we propose a measure based on the checkerboard copula and construct a test based on it. Again, single-scale and multi-scale versions of this test are considered, and their consistency is proved under suitable regularity conditions.
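The invariance property stated above can be illustrated with a minimal sketch. It assumes that the copula-based quantities are computed from normalized ranks (pseudo-observations) of a continuous sample without ties; the function name `pseudo_observations` and the scaling by n + 1 are illustrative assumptions, not the definitions used in this work. Because a strictly increasing map leaves the ranks unchanged, any statistic computed from them is unchanged as well.

```python
import numpy as np

def pseudo_observations(x):
    """Map a 1-D sample to normalized ranks in (0, 1); assumes no ties.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n
    return ranks / (len(x) + 1.0)

rng = np.random.default_rng(0)
x = rng.normal(size=200)

u_raw = pseudo_observations(x)
u_transformed = pseudo_observations(np.exp(x))  # exp is strictly increasing

# The two sets of pseudo-observations coincide exactly.
same_ranks = bool(np.array_equal(u_raw, u_transformed))
```

The sketch covers only strictly increasing maps; a strictly decreasing map reverses the ranks and is not demonstrated here.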
Next, we propose and investigate some methods for testing independence among several random vectors of arbitrary dimensions. We propose two general recipes for multivariate generalizations of the tests discussed above, one based on linear projections and the other on pairwise Euclidean distances. In both cases, we transform the observations on sub-vectors into univariate observations and then apply our proposed copula-based tests to the transformed data. We investigate the theoretical as well as the empirical performance of the resulting tests.
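As a rough illustration of the projection-based reduction described above, the following sketch maps each sub-vector to a univariate sample by projecting it onto a single direction. How the directions are chosen or aggregated is not specified here, so the use of one random unit direction per sub-vector, and the name `project_blocks`, are assumptions made purely for illustration.

```python
import numpy as np

def project_blocks(blocks, rng):
    """Project each (n, d_j) block of observations onto a random unit direction.

    Hypothetical helper: the random choice of direction is an assumption.
    """
    projected = []
    for block in blocks:
        block = np.asarray(block, dtype=float)
        direction = rng.normal(size=block.shape[1])
        direction /= np.linalg.norm(direction)  # normalize to unit length
        projected.append(block @ direction)     # one value per observation
    return projected

rng = np.random.default_rng(1)
n = 100
# Two sub-vectors of dimensions 3 and 5, observed on the same n units.
blocks = [rng.normal(size=(n, 3)), rng.normal(size=(n, 5))]

z = project_blocks(blocks, rng)
shapes = [zi.shape for zi in z]
```

Each element of `z` is a univariate sample of length n, to which a test for several random variables could then be applied.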
Finally, we consider some tests based on ranks of nearest neighbours. Most of these proposed tests are based on multivariate rank functions, and some of them also use the idea of maximum mean discrepancy (MMD). The empirical performance of these tests is investigated by analyzing several simulated and real data sets. Our proposed multivariate tests can also be used for testing independence among several random functions. We carry out some simulation studies to investigate their empirical performance in this context.