• This event has passed.

# Guillaume LECUÉ (CREST) – “A geometrical viewpoint on the benign overfitting property of the minimum l_2-norm interpolant estimator “

May 2 @ 2:00 pm - 3:15 pm

Statistical Seminar: Every Monday at 2:00 pm.
Time: 2:00 pm – 3:15 pm
Date: 2nd of May 2022
Place: salle 3001

Guillaume LECUÉ (CREST) – “A geometrical viewpoint on the benign overfitting property of the minimum l_2-norm interpolant estimator”

Abstract: Practitioners have observed that some deep learning models generalize well even with a perfect fit to noisy training data [1,2]. Since then many theoretical works have revealed some facets of this phenomenon [3,4,5,6] known as benign overfitting. In particular, in the linear regression model, the minimum l_2-norm interpolant estimator \hat\bbeta has received a lot of attention [3,7] since it was proved to be consistent even though it perfectly fits noisy data under some condition on the covariance matrix \Sigma of the input vector. Motivated by this phenomenon, we study the generalization property of this estimator from a geometrical viewpoint. Our main results extend and improve the convergence rates as well as the deviation probability from [7]. Our proof differs from the classical bias/variance analysis and is based on the self-induced regularization property introduced in [4]: \hat\bbeta can be written as a sum of a ridge estimator \hat\bbeta_{1:k} and an overfitting component \hat\bbeta_{k+1:p} which follows a decomposition of the features space \bR^p=V_{1:k}\oplus^\perp V_{k+1:p} into the space V_{1:k} spanned by the top k eigenvectors of \Sigma and the ones V_{k+1:p} spanned by the p-k last ones. We also prove a matching lower bound for the expected prediction risk. The two geometrical properties of random Gaussian matrices at the heart of our analysis are the Dvoretsky-Milman theorem and isomorphic and restricted isomorphic properties. In particular, the Dvoretsky dimension appearing naturally in our geometrical viewpoint coincides with the effective rank from [3,7] and is the key tool to handle the behavior of the design matrix restricted to the sub-space V_{k+1:p} where overfitting happens. Joint work with Zong Shang.

Organizers:
Cristina BUTUCEA (CREST), Alexandre TSYBAKOV (CREST), Karim LOUNICI (CMAP) , Jaouad MOURTADA (CREST)