Jaouad MOURTADA (CREST) – “Statistical leverage in prediction: from least squares to logistic regression”
September 21, 2:00 pm - 3:15 pm
The Statistical Seminar: Every Monday at 2:00 pm.
Time: 2:00 pm – 3:15 pm
Date: 21st of September 2020
Abstract: We consider statistical learning/prediction problems: based on an i.i.d. sample, one aims to find a good predictor of a response given some features. In this context, an input sample has high leverage (relative to some learning algorithm or class of predictors) if the prediction at this point is sensitive to the associated output.
In the context of random-design linear prediction with square loss, we show that the hardness of the problem is characterized by the distribution of leverage scores of feature points. As an application, in high dimension, Gaussian design (with nearly constant leverage) is seen to be nearly most favorable, through lower bounds on the minimax risk as well as refinements depending on the signal-to-noise ratio.
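As a rough illustration of the notion of leverage in the least-squares case, the following sketch computes the classical in-sample leverage scores (the diagonal of the hat matrix) for a Gaussian design. This is an assumption made for illustration: the talk may use a different, out-of-sample definition of leverage. The function name `leverage_scores`, the sizes `n` and `d`, and the random seed are likewise hypothetical choices.

```python
import numpy as np

def leverage_scores(X):
    """Classical leverage scores: diagonal of H = X (X^T X)^{-1} X^T.

    Computed through a thin QR factorization, assuming X has full column rank.
    """
    Q, _ = np.linalg.qr(X)          # reduced QR: Q has shape (n, d)
    return np.sum(Q ** 2, axis=1)   # h_i = squared norm of the i-th row of Q

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
n, d = 2000, 50                     # illustrative sample size and dimension
X = rng.standard_normal((n, d))     # Gaussian design
h = leverage_scores(X)

# Leverage as sensitivity: fitted values are y_hat = H y, so shifting y_i by 1
# shifts the prediction at x_i by exactly h_i.
y = rng.standard_normal(n)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
i = 0
y_shifted = y.copy()
y_shifted[i] += 1.0
beta_shifted, *_ = np.linalg.lstsq(X, y_shifted, rcond=None)
sensitivity = X[i] @ (beta_shifted - beta)

print("average leverage:", h.mean(), "(equals d / n =", d / n, ")")
print("largest leverage:", h.max())
print("sensitivity at point 0:", sensitivity, "vs leverage:", h[i])
```

The scores always sum to d for a full-rank design, so their average is d/n; for a Gaussian design they should all stay close to that average, which is the "nearly constant leverage" regime mentioned above.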
We then turn to conditional density estimation with entropy risk. We study this problem in a statistical learning framework, where given a class of conditional densities (a model), the goal is to find estimators that predict almost as well as the best distribution in the model, without restrictive assumptions on the true distribution. We introduce a new estimator, given by the solution of some min-max problem involving ‘virtual’ samples, which quantifies uncertainty according to a notion of leverage. In the case of logistic regression, this procedure achieves fast learning rates under weak assumptions, improving over within-model estimators such as regularized MLE; it also admits a simple form and does not require posterior sampling, providing a computationally less demanding alternative to Bayesian approaches.
Cristina BUTUCEA (CREST), Alexandre TSYBAKOV (CREST), Karim LOUNICI (CMAP), Zoltan SZABO (CMAP)