AMF: Aggregated Mondrian forests for online learning
成果类型:
Article
署名作者:
Mourtada, Jaouad; Gaiffas, Stephane; Scornet, Erwan
署名单位:
Institut Polytechnique de Paris; Ecole Polytechnique; Centre National de la Recherche Scientifique (CNRS); Centre National de la Recherche Scientifique (CNRS); Universite Paris Cite; Universite PSL; Ecole Normale Superieure (ENS); Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI)
刊物名称:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISSBN:
1369-7412
DOI:
10.1111/rssb.12425
发表日期:
2021
页码:
505-533
关键词:
tree weighting method
摘要:
Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: a remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to features scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants, however, are 'offline' algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online RF algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this enables to obtain a truly online parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with respect to several strong baselines on a large number of datasets for multi-class classification.
来源URL: