Tom BERRETT (University of Cambridge) – “Efficient multivariate functional estimation and independence testing”
February 4, 2:00 pm - 3:15 pm
The Statistical Seminar: Every Monday at 2:00 pm.
Time: 2:00 pm – 3:15 pm
Date: 4th of February 2019
Place: Room 3001.
Tom BERRETT (University of Cambridge) – “Efficient multivariate functional estimation and independence testing”
Abstract: Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this talk I will first describe new entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. These estimators are constructed as weighted averages of the estimators originally proposed by Kozachenko and Leonenko (1987), based on the k-nearest neighbour distances of a sample of n independent and identically distributed random vectors taking values in R^d. A careful choice of weights enables us to obtain an efficient estimator for arbitrary d, given sufficient smoothness, while the original unweighted estimator is typically only efficient for d < 3. I will also discuss newer results on the estimation of more general functionals, in settings where we have samples from two different distributions.
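To make the construction concrete, the following is a minimal sketch of a Kozachenko-Leonenko-type entropy estimate and of a weighted average of such estimates over j = 1, ..., k. It is an illustration under stated assumptions, not the speaker's implementation: the function names (`kl_entropy`, `weighted_kl_entropy` and the helpers) are hypothetical, the neighbour search is brute force, and the weights are supplied by the user. The careful choice of weights that the abstract refers to is not reproduced here. The estimate uses one common form, the sample mean of log(rho^d V_d (n - 1)) minus the digamma function at k, where rho is the k-th nearest neighbour distance and V_d is the volume of the unit ball in R^d.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer k: psi(k) = -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def log_unit_ball_volume(d):
    """Log of V_d = pi^(d/2) / Gamma(1 + d/2), the volume of the unit ball in R^d."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(1.0 + 0.5 * d)


def sorted_neighbour_distances(points, k):
    """For each point, the sorted distances to its k nearest neighbours (O(n^2))."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def weighted_kl_entropy(points, weights):
    """Weighted average of the j-nearest-neighbour estimates, j = 1..len(weights).

    `weights` is a user-supplied sequence assumed to sum to 1 (an assumption of
    this sketch). Duplicate points give a zero distance and hence a math domain
    error, so the sample is assumed to have no ties.
    """
    n, d, k = len(points), len(points[0]), len(weights)
    nn = sorted_neighbour_distances(points, k)
    const = log_unit_ball_volume(d) + math.log(n - 1)
    total = 0.0
    for j, w in enumerate(weights, start=1):
        if w == 0.0:
            continue
        mean_log = sum(d * math.log(row[j - 1]) for row in nn) / n
        total += w * (mean_log + const - digamma_int(j))
    return total


def kl_entropy(points, k=1):
    """Unweighted estimate: all of the weight is placed on the k-th neighbour."""
    return weighted_kl_entropy(points, [0.0] * (k - 1) + [1.0])
```

Here `points` is a list of equal-length tuples of floats. For a standard normal sample in R^2 the returned value should be close to the true entropy log(2 pi e), and `weighted_kl_entropy(points, [0.0, 0.5, 0.5])` averages the 2- and 3-nearest-neighbour estimates.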
In the next part of the talk, I will use our entropy estimators to propose a test of independence of two multivariate random vectors, given a sample from the underlying population. Our approach, which we call MINT, is based on the estimation of mutual information, which we may decompose into joint and marginal entropies. The proposed critical values, which may be obtained from simulation in the case where an approximation to one marginal is available or resampling otherwise, facilitate size guarantees, and we provide local power analyses, uniformly over classes of densities whose mutual information satisfies a lower bound. Our ideas may be extended to provide new goodness-of-fit tests of normal linear models based on assessing the independence of our vector of covariates and an appropriately defined notion of an error vector.
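The decomposition of mutual information into entropies, I(X; Y) = H(X) + H(Y) - H(X, Y), suggests a simple plug-in test, sketched below. This is only an illustration in the spirit of the abstract, not the MINT procedure itself: the function names are hypothetical, the entropy estimate is an unweighted Kozachenko-Leonenko-type estimate computed by brute force, and the resampling scheme is assumed here to be a plain permutation of the second sample, which the abstract does not specify.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def kl_entropy(points, k=1):
    """Kozachenko-Leonenko-type k-NN entropy estimate (brute force, O(n^2))."""
    n, d = len(points), len(points[0])
    log_vd = 0.5 * d * math.log(math.pi) - math.lgamma(1.0 + 0.5 * d)
    psi_k = -EULER_GAMMA + sum(1.0 / j for j in range(1, k))
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += d * math.log(dists[k - 1])
    return total / n + log_vd + math.log(n - 1) - psi_k


def permutation_independence_test(xs, ys, k=1, n_perm=99, rng=None):
    """Return (estimated mutual information, permutation p-value).

    xs and ys are equal-length lists of tuples. Permuting ys leaves both
    marginal entropy estimates unchanged, so they are computed once.
    """
    rng = rng or random.Random()
    marginals = kl_entropy(xs, k) + kl_entropy(ys, k)

    def statistic(ys_current):
        # Tuple concatenation builds the joint sample (x_i, y_i).
        joint = [x + y for x, y in zip(xs, ys_current)]
        return marginals - kl_entropy(joint, k)

    observed = statistic(ys)
    exceed = 0
    for _ in range(n_perm):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # break any dependence between xs and ys
        if statistic(shuffled) >= observed:
            exceed += 1
    # The observed statistic is counted among the n_perm + 1 values.
    return observed, (1 + exceed) / (1 + n_perm)
```

A small p-value is evidence against independence. Because the observed statistic is counted among the permuted ones, the p-value is never below 1 / (n_perm + 1), so `n_perm` should be chosen with the intended significance level in mind.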
Cristina BUTUCEA, Alexandre TSYBAKOV, Julie JOSSE, Eric MOULINES, Mathieu ROSENBAUM