SUPERMIX: SPARSE REGULARIZATION FOR MIXTURES
Document type:
Article
Authors:
De Castro, Y.; Gadat, S.; Marteau, C.; Maugis-Rabusseau, C.
Affiliations:
Centre National de la Recherche Scientifique (CNRS); Ecole Centrale de Lyon; Institut National des Sciences Appliquees de Lyon - INSA Lyon; Universite Claude Bernard Lyon 1; Universite Jean Monnet; Universite de Toulouse; Universite Toulouse 1 Capitole; Toulouse School of Economics; Universite Toulouse III - Paul Sabatier
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/20-AOS2022
Publication date:
2021
Pages:
1779-1809
Keywords:
maximum-likelihood
convergence
finite
rates
Abstract:
This paper investigates the statistical estimation of a discrete mixing measure mu(0) involved in a kernel mixture model. Using some recent advances in l(1)-regularization over the space of measures, we introduce a data fitting and regularization convex program for estimating mu(0) in a grid-less manner from a sample drawn from the mixture law; this method is referred to as Beurling-LASSO. Our contribution is two-fold: we derive a lower bound on the bandwidth of our data fitting term, depending only on the support of mu(0) and its so-called minimum separation, to ensure quantitative support localization error bounds; and, under a so-called nondegenerate source condition, we derive a nonasymptotic support stability property. The latter shows that, for a sufficiently large sample size n, our estimator has exactly as many weighted Dirac masses as the target mu(0), converging in amplitude and localization towards the true ones. Finally, we also introduce some tractable algorithms for solving this convex program, based on Sliding Frank-Wolfe or Conic Particle Gradient Descent. The statistical performance of this estimator is investigated by designing a so-called dual certificate that is appropriate to our setting. Some classical situations, such as mixtures of super-smooth distributions (e.g., Gaussian distributions) or ordinary-smooth distributions (e.g., Laplace distributions), are discussed at the end of the paper.