-
Authors: Chernozhukov, Victor; Haerdle, Wolfgang Karl; Huang, Chen; Wang, Weining
Affiliations: Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Humboldt University of Berlin; Aarhus University; CREATES; Aarhus University; University of York - UK
Abstract: We consider estimation and inference in a system of high-dimensional regression equations allowing for temporal and cross-sectional dependency in covariates and error processes, covering rather general forms of weak temporal dependence. A sequence of regressions with many regressors using LASSO (Least Absolute Shrinkage and Selection Operator) is applied for variable selection, and an overall penalty level is carefully chosen by a block multiplier bootstrap procedure to account for...
-
Authors: Chetverikov, Denis; Liao, Zhipeng; Chernozhukov, Victor
Affiliations: University of California System; University of California Los Angeles; Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT)
Abstract: In this paper, we derive nonasymptotic error bounds for the Lasso estimator when the penalty parameter for the estimator is chosen using K-fold cross-validation. Our bounds imply that the cross-validated Lasso estimator has nearly optimal rates of convergence in the prediction, $L^2$, and $L^1$ norms. For example, we show that in the model with Gaussian noise and under fairly general assumptions on the candidate set of values of the penalty parameter, the estimation error of the cross-validate...
-
Authors: Najafi, Amir; Ilchi, Saeed; Saberi, Amir Hossein; Motahari, Seyed Abolfazl; Khalaj, Babak H.; Rabiee, Hamid R.
Affiliations: Sharif University of Technology; Sharif University of Technology; Sharif University of Technology
Abstract: We study the sample complexity of learning a high-dimensional simplex from a set of points uniformly sampled from its interior. Learning of simplices is a long-studied problem in computer science and has applications in computational biology and remote sensing, mostly under the name of spectral unmixing. We theoretically show that a sufficient sample complexity for reliable learning of a K-dimensional simplex up to a total-variation error of $\epsilon$ is $O\left(\frac{K^2}{\epsilon}\log\frac{K}{\epsilon}\right)$, wh...
-
Authors: Cai, T. Tony; Zhang, Linjun
Affiliations: University of Pennsylvania; Rutgers University System; Rutgers University New Brunswick
Abstract: In this paper, we study high-dimensional sparse Quadratic Discriminant Analysis (QDA) and aim to establish the optimal convergence rates for the classification error. Minimax lower bounds are established to demonstrate the necessity of structural assumptions such as sparsity conditions on the discriminating direction and differential graph for the possible construction of consistent high-dimensional QDA rules. We then propose a classification algorithm called SDAR using constrained convex opti...
-
Authors: Vovk, Vladimir; Wang, Ruodu
Affiliations: University of London; Royal Holloway University London; University of Waterloo
Abstract: Multiple testing of a single hypothesis and testing multiple hypotheses are usually done in terms of p-values. In this paper, we replace p-values with their natural competitor, e-values, which are closely related to betting, Bayes factors and likelihood ratios. We demonstrate that e-values are often mathematically more tractable; in particular, in multiple testing of a single hypothesis, e-values can be merged simply by averaging them. This allows us to develop efficient procedures using e-val...
-
Authors: Ghosh, Satyajit; Khare, Kshitij; Michailidis, George
Affiliations: Rutgers University System; Rutgers University New Brunswick; State University System of Florida; University of Florida; State University System of Florida; University of Florida
Abstract: Vector autoregressive (VAR) models aim to capture linear temporal interdependencies among multiple time series. They have been widely used in macroeconomics and financial econometrics and more recently have found novel applications in functional genomics and neuroscience. These applications have also accentuated the need to investigate the behavior of the VAR model in a high-dimensional regime, which will provide novel insights into the role of temporal dependence for regularized estimates of ...
-
Authors: Gregory, Karl; Mammen, Enno; Wahl, Martin
Affiliations: University of South Carolina System; University of South Carolina Columbia; Ruprecht Karls University Heidelberg; Humboldt University of Berlin
Abstract: In this paper, we discuss the estimation of a nonparametric component $f_1$ of a nonparametric additive model $Y = f_1(X_1) + \dots + f_q(X_q) + \varepsilon$. We allow the number $q$ of additive components to grow to infinity and we make sparsity assumptions about the number of nonzero additive components. We compare this estimation problem with that of estimating $f_1$ in the oracle model $Z = f_1(X_1) + \varepsilon$, for which the additive components $f_2, \dots, f_q$ are known. We construct a two-step presmoo...
-
Authors: Fan, Zhou; Sun, Yi; Wang, Zhichao
Affiliations: Yale University; University of Chicago; University of California System; University of California San Diego
Abstract: We study the principal components of covariance estimators in multivariate mixed-effects linear models. We show that, in high dimensions, the principal eigenvalues and eigenvectors may exhibit bias and aliasing effects that are not present in low-dimensional settings. We derive the first-order limits of the principal eigenvalue locations and eigenvector projections in a high-dimensional asymptotic framework, allowing for general population spectral distributions for the random effects and exte...
-
Authors: Brecheteau, Claire; Fischer, Aurelie; Levrard, Clement
Affiliations: Universite Rennes 2; Universite de Rennes; Universite Paris Cite
Abstract: Clustering with Bregman divergences encompasses a wide family of clustering procedures that are well suited to mixtures of distributions from exponential families (J. Mach. Learn. Res. 6 (2005) 1705-1749). However, these techniques are highly sensitive to noise. To address the issue of clustering data with possibly adversarial noise, we introduce a robustified version of Bregman clustering based on a trimming approach. We investigate its theoretical properties, showing for instance that our es...
-
Authors: Duchi, John C.; Namkoong, Hongseok
Affiliations: Stanford University; Columbia University
Abstract: A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts or unmodeled temporal effects. We develop and analyze a distributionally robust stochastic optimization (DRO) framework that learns a model providing good performance against perturbations to the data-generating distribution. We give a convex formulation for the problem, providing several convergence guara...