Is the performance of my deep network too good to be true? A direct approach to estimating the Bayes error in binary classification
There is a fundamental limitation in the prediction performance that a machine learning model can achieve due to the inevitable uncertainty of the prediction target. In classification problems, this c ...
Proceedings of the 11th International Conference on Learning Representations (ICLR 2023), 2023
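For reference, the quantity named in the title has a standard form; the definition below is the textbook binary Bayes error (the symbols beta and eta are introduced here only for illustration), not the estimator developed in the paper:

\[
  \beta = \mathbb{E}_{x}\!\left[\min\{\eta(x),\, 1 - \eta(x)\}\right], \qquad \eta(x) = P(Y = 1 \mid X = x),
\]

i.e., no classifier can achieve a binary classification error below beta, since even the Bayes-optimal rule errs whenever the less probable class is realized.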
Mediated Uncoupled Learning and Validation with Bregman Divergences: Loss Family with Maximal Generality
In mediated uncoupled learning (MU-learning), the goal is to predict an output variable Y given an input variable X as in ordinary supervised learning, while the training dataset has no joint samples of ...
Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), Proceedings of Machine Learning Research, vol. 206, pp. 4768-4801, 2023
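As background for the loss family named in the title, the Bregman divergence generated by a differentiable, strictly convex function phi is the standard object

\[
  D_{\phi}(a, b) = \phi(a) - \phi(b) - \langle \nabla \phi(b),\, a - b \rangle,
\]

which recovers the squared loss for phi(t) = t^2 and a Kullback-Leibler-type divergence for the negative entropy; how the paper characterizes the maximally general family is not reproduced here.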
Mediated Uncoupled Learning: Learning Functions Without Direct Input-output Correspondences
Ordinary supervised learning is useful when we have paired training data of input X and output Y. However, such paired data can be difficult to collect in practice. In this paper, we consider the tas ...
Proceedings of the 38th International Conference on Machine Learning (ICML 2021), Proceedings of Machine Learning Research, vol. 139, pp. 11637-11647, 2021
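The setting sketched in the abstract is commonly formalized with a mediating variable U that is observed jointly with X in one dataset and with Y in another; assuming, as an illustration and not necessarily the paper's exact condition, that Y is conditionally mean-independent of X given U,

\[
  \mathbb{E}[Y \mid X = x] = \mathbb{E}\big[\, \mathbb{E}[Y \mid U] \mid X = x \,\big],
\]

so a predictor of Y from X can in principle be composed from the two marginal sample sets (X, U) and (U, Y) without any joint (X, Y) pairs.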
Skew-symmetrically perturbed gradient flow for convex optimization
Recently, many methods for optimization and sampling have been developed by designing continuous dynamics followed by discretization. The dynamics that have been used for optimization have their corre ...
Proceedings of the 13th Asian Conference on Machine Learning (ACML 2021), Proceedings of Machine Learning Research, vol. 157, pp. 721-736, 2021
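A generic way to read the title, shown here only as an illustrative construction and not necessarily the paper's exact dynamics, is to add a skew-symmetric matrix S to the gradient flow:

\[
  \dot{x}(t) = -(I + S)\,\nabla f(x(t)), \qquad S^{\top} = -S,
\]
\[
  \frac{d}{dt} f(x(t)) = -\nabla f^{\top}(I + S)\,\nabla f = -\|\nabla f\|^{2},
\]

since v^T S v = 0 for every vector v when S is skew-symmetric, so the objective decreases along the perturbed flow at exactly the same rate as along plain gradient flow.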
A One-Step Approach to Covariate Shift Adaptation
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution. However, such an assumption is often violated in the rea ...
SN Computer Science, vol. 2, no. 319, 12 pages, 2021
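For contrast with the one-step method in the title, the conventional covariate shift pipeline is two-step: estimate importance weights w(x) = p_test(x) / p_train(x), then minimize the importance-weighted loss. The sketch below implements only that baseline on synthetic data (all values and the polynomial model are illustrative); it is not the paper's one-step approach.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data: training inputs come from a Gaussian,
# while the test inputs are assumed to follow a shifted Gaussian,
# i.e. a covariate shift between p_train and p_test.
x_tr = rng.normal(loc=0.0, scale=1.0, size=200)
y_tr = np.sin(x_tr) + 0.1 * rng.normal(size=200)

def gauss(x, mu, sigma):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Step 1: importance weights from the (here, known) input densities.
w = gauss(x_tr, 1.0, 1.0) / gauss(x_tr, 0.0, 1.0)   # p_test(x) / p_train(x)

# Step 2: importance-weighted least squares on polynomial features.
Phi = np.vander(x_tr, N=4, increasing=True)         # [1, x, x^2, x^3]
W = np.diag(w)
theta = np.linalg.solve(Phi.T @ W @ Phi, Phi.T @ W @ y_tr)
print("importance-weighted coefficients:", theta)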
Do We Need Zero Training Loss After Achieving Zero Training Error?
Overparameterized deep networks have the capacity to memorize training data with zero training error. Even after memorization, the training loss continues to approach zero, making the mo ...
Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Proceedings of Machine Learning Research, vol. 119, pp. 4604-4614, 2020
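This paper is commonly associated with the "flooding" regularizer, which keeps the training loss hovering around a small positive constant instead of letting it reach zero. The snippet below is a minimal, framework-free sketch of that idea with an illustrative flood level b; it is not the authors' implementation.

# Flooding-style objective: |loss - b| + b equals the raw loss above the
# flood level b, and pushes the loss back up once it drops below b.
def flooded_loss(loss, b=0.05):
    return abs(loss - b) + b

for raw in [0.40, 0.10, 0.05, 0.01]:
    print(f"raw loss {raw:.2f} -> flooded loss {flooded_loss(raw):.2f}")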
A One-Step Approach to Covariate Shift Adaptation
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution. However, such an assumption is often violated in the rea ...
Proceedings of the 12th Asian Conference on Machine Learning (ACML 2020), Proceedings of Machine Learning Research, vol. 129, pp. 65-80, 2020
Uplift Modeling from Separate Labels
Uplift modeling is aimed at estimating the incremental impact of an action on an individual's behavior, which is useful in various application domains such as targeted marketing (advertisement campaig ...
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 9949-9959, 2018
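For context, the quantity that uplift modeling targets is the conditional effect of the action; how the paper estimates it from separately labeled data is not reproduced here:

\[
  \tau(x) = \mathbb{E}[Y \mid X = x,\, T = 1] - \mathbb{E}[Y \mid X = x,\, T = 0],
\]

the expected change in the outcome Y (e.g., a purchase) for an individual with covariates x when the action T (e.g., showing the advertisement) is taken versus withheld.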
Multitask Principal Component Analysis
Principal Component Analysis (PCA) is a canonical and well-studied tool for dimensionality reduction. However, when few data are available, the poor quality of the covariance estimator at its core may ...
Proceedings of the Asian Conference on Machine Learning (ACML 2016), Proceedings of Machine Learning Research, vol. 63, pp. 302-317, 2016
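The abstract points to the sample covariance as the fragile ingredient of PCA when samples are scarce. The sketch below is plain single-task PCA, included only to make that dependence explicit; it is not the multitask estimator proposed in the paper.

import numpy as np

def pca(X, k):
    """Top-k principal directions of the rows of X (shape n x d)."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance, d x d
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]          # leading k eigenvectors

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))               # few samples, many dimensions
print(pca(X, k=2).shape)                    # (50, 2)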
Regularized Multi-Task Learning for Multi-Dimensional Log-Density Gradient Estimation
Log-density gradient estimation is a fundamental statistical problem and possesses various practical applications such as clustering and measuring non-Gaussianity. A naive two-step approach of first es ...
Neural Computation, vol. 28, no. 6, pp. 1388-1410, 2016
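The object being estimated, and the naive route the abstract begins to describe, can be written compactly (the multi-task estimator of the paper is not reproduced here):

\[
  \nabla_{x} \log p(x) = \frac{\nabla_{x} p(x)}{p(x)},
\]

so the naive two-step approach first fits a density estimate and then differentiates its logarithm, which can magnify estimation error wherever the estimated density is small; estimating the log-gradient directly avoids that intermediate step.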