Universal robust regression via maximum mean discrepancy

Publication type:
Article
Authors:
Alquier, P.; Gerber, M.
Affiliations:
ESSEC Business School; University of Bristol
Journal:
BIOMETRIKA
ISSN/ISBN:
0006-3444
DOI:
10.1093/biomet/asad031
Publication date:
2024
Pages:
71-92
Keywords:
sample selection models inference
Abstract:
Many modern datasets are collected automatically and are thus easily contaminated by outliers. This has led to a renewed interest in robust estimation, including new notions of robustness such as robustness to adversarial contamination of the data. However, most robust estimation methods are designed for a specific model. Notably, many methods were proposed recently to obtain robust estimators in linear models, or generalized linear models, and a few were developed for very specific settings, for example beta regression or sample selection models. In this paper we develop a new approach for robust estimation in arbitrary regression models, based on maximum mean discrepancy minimization. We build two estimators that are both proven to be robust to Huber-type contamination. For one of them, we obtain a non-asymptotic error bound and show that it is also robust to adversarial contamination, but this estimator is computationally more expensive to use in practice than the other one. As a by-product of our theoretical analysis of the proposed estimators, we derive new results on kernel conditional mean embedding of distributions that are of independent interest.
Source URL: