-
Authors: Scornet, Erwan; Biau, Gerard; Vert, Jean-Philippe
Affiliations: Sorbonne Universite; Universite PSL; MINES ParisTech; UNICANCER; Universite PSL; Institut Curie; Universite PSL; UNICANCER; Institut Curie; Institut National de la Sante et de la Recherche Medicale (Inserm)
Abstract: Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5-32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical performance, little is known about the mathematical properties of the procedure. This disparity between theory and practice originates in the difficulty of simultaneously analyzing both the randomization process and the highly data-dependent tree structure. In the p...
-
Authors: Basu, Sumanta; Michailidis, George
Affiliations: University of Michigan System; University of Michigan
Abstract: Many scientific and economic problems involve the analysis of high-dimensional time series datasets. However, theoretical studies in high-dimensional statistics to date rely primarily on the assumption of independent and identically distributed (i.i.d.) samples. In this work, we focus on stable Gaussian processes and investigate the theoretical properties of $\ell_1$-regularized estimates in two important statistical problems in the context of high-dimensional time series: (a) stochastic regressio...
-
Authors: Shang, Zuofeng; Cheng, Guang
Affiliations: Purdue University System; Purdue University
Abstract: We propose a roughness regularization approach in making nonparametric inference for generalized functional linear models. In a reproducing kernel Hilbert space framework, we construct asymptotically valid confidence intervals for regression mean, prediction intervals for future response and various statistical procedures for hypothesis testing. In particular, one procedure for testing global behaviors of the slope function is adaptive to the smoothness of the slope function and to the structu...
-
Authors: Castillo, Ismael
Affiliations: Sorbonne Universite; Centre National de la Recherche Scientifique (CNRS); Centre National de la Recherche Scientifique (CNRS); Universite Paris Cite
-
Authors: Mai, Qing; Zou, Hui
Affiliations: State University System of Florida; Florida State University; University of Minnesota System; University of Minnesota Twin Cities
Abstract: A new model-free screening method called the fused Kolmogorov filter is proposed for high-dimensional data analysis. This new method is fully nonparametric and can work with many types of covariates and response variables, including continuous, discrete and categorical variables. We apply the fused Kolmogorov filter to deal with variable screening problems emerging from a wide range of applications, such as multiclass classification, nonparametric regression and Poisson regression, among other...
-
Authors: Chatterjee, Sabyasachi; Guntuboyina, Adityanand; Sen, Bodhisattva
Affiliations: University of Chicago; University of California System; University of California Berkeley; Columbia University
Abstract: We consider the problem of estimating an unknown $\theta \in \mathbb{R}^n$ from noisy observations under the constraint that $\theta$ belongs to certain convex polyhedral cones in $\mathbb{R}^n$. Under this setting, we prove bounds for the risk of the least squares estimator (LSE). The obtained risk bound behaves differently depending on the true sequence $\theta$, which highlights the adaptive behavior of the LSE. As special cases of our general result, we derive risk bounds for the LSE in univariate isotonic a...
-
Authors: Li, Kang; Zheng, Wei; Ai, Mingyao
Affiliations: Peking University; Peking University; Purdue University System; Purdue University; Purdue University in Indianapolis
Abstract: The interference model has been widely used and studied in block experiments where the treatment for a particular plot has effects on its neighbor plots. In this paper, we study optimal circular designs for the proportional interference model, in which the neighbor effects of a treatment are proportional to its direct effect. Kiefer's equivalence theorems for estimating both the direct and total treatment effects are developed with respect to the criteria of A, D, E and T. Parallel studies are...
-
Authors: Fan, Jianqing; Ke, Zheng Tracy; Liu, Han; Xia, Lucy
Affiliations: Princeton University; University of Chicago
Abstract: We propose a novel Rayleigh quotient based sparse quadratic dimension reduction method, named QUADRO (Quadratic Dimension Reduction via Rayleigh Optimization), for analyzing high-dimensional data. Unlike in the linear setting where Rayleigh quotient optimization coincides with classification, these two problems are very different under nonlinear settings. In this paper, we clarify this difference and show that Rayleigh quotient optimization may be of independent scientific interest. One major c...
-
Authors: Byrne, Simon; Dawid, A. Philip
Affiliations: University of London; University College London; University of Cambridge
Abstract: This paper considers the problem of defining distributions over graphical structures. We propose an extension of the hyper Markov properties of Dawid and Lauritzen [Ann. Statist. 21 (1993) 1272-1317], which we term structural Markov properties, for both undirected decomposable and directed acyclic graphs, which requires that the structure of distinct components of the graph be conditionally independent given the existence of a separating component. This allows the analysis and comparison of mu...
-
Authors: Nickl, Richard
Affiliations: University of Cambridge