-
Authors: Bresler, Guy; Karzand, Mina
Affiliations: Massachusetts Institute of Technology (MIT); University of Wisconsin System; University of Wisconsin Madison
Abstract: We study the problem of learning a tree Ising model from samples such that subsequent predictions made using the model are accurate. The prediction task considered in this paper is that of predicting the values of a subset of variables given values of some other subset of variables. Virtually all previous work on graphical model learning has focused on recovering the true underlying graph. We define a distance (small set TV or ssTV) between distributions P and Q by taking the maximum, over all...
-
Authors: El Alaoui, Ahmed; Krzakala, Florent; Jordan, Michael
Affiliations: Stanford University; Sorbonne Universite; Universite PSL; Ecole Normale Superieure (ENS); Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); University of California System; University of California Berkeley
Abstract: We study the fundamental limits of detecting the presence of an additive rank-one perturbation, or spike, to a Wigner matrix. When the spike comes from a prior that is i.i.d. across coordinates, we prove that the log-likelihood ratio of the spiked model against the nonspiked one is asymptotically normal below a certain reconstruction threshold which is not necessarily of a spectral nature, and that it is degenerate above. This establishes the maximal region of contiguity between the planted an...
-
Authors: Wong, Kam Chung; Li, Zifan; Tewari, Ambuj
Affiliations: University of Michigan System; University of Michigan; Yale University
Abstract: Many theoretical results for lasso require the samples to be i.i.d. Recent work has provided guarantees for lasso assuming that the time series is generated by a sparse Vector Autoregressive (VAR) model with Gaussian innovations. Proofs of these results rely critically on the fact that the true data generating mechanism (DGM) is a finite-order Gaussian VAR. This assumption is quite brittle: linear transformations, including selecting a subset of variables, can lead to the violation of this ass...
-
Authors: Guntuboyina, Adityanand; Lieu, Donovan; Chatterjee, Sabyasachi; Sen, Bodhisattva
Affiliations: University of California System; University of California Berkeley; University of Illinois System; University of Illinois Urbana-Champaign; Columbia University
Abstract: We study trend filtering, a relatively recent method for univariate nonparametric regression. For a given integer r >= 1, the rth order trend filtering estimator is defined as the minimizer of the sum of squared errors when we constrain (or penalize) the sum of the absolute rth order discrete derivatives of the fitted function at the design points. For r = 1, the estimator reduces to total variation regularization which has received much attention in the statistics and image processing literat...
-
Authors: Kolaczyk, Eric D.; Lin, Lizhen; Rosenberg, Steven; Walters, Jackson; Xu, Jie
Affiliations: Boston University; University of Notre Dame
Abstract: It is becoming increasingly common to see large collections of network data objects, that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic tools already standard for scalar and vector data. In this paper, our focus is on averages of unlabeled, undirected networks with edge weights. Specifically, we (i) characterize a certain notion of the space of all such net...
-
Authors: Koltchinskii, Vladimir; Loffler, Matthias; Nickl, Richard
Affiliations: University System of Georgia; Georgia Institute of Technology; University of Cambridge
Abstract: We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations X_1, ..., X_n in a separable Hilbert space H with unknown covariance operator Sigma. The complexity of the problem is characterized by its effective rank r(Sigma) := tr(Sigma)/||Sigma||, where tr(Sigma) denotes the trace of Sigma and ||Sigma|| denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvect...
-
Authors: Tan, Kai; Shi, Lei; Yu, Zhou
Affiliations: East China Normal University; Fudan University
Abstract: Sliced inverse regression (SIR) is an innovative and effective method for sufficient dimension reduction and data visualization. Recently, an impressive range of penalized SIR methods has been proposed to estimate the central subspace in a sparse fashion. Nonetheless, few of them considered the sparse sufficient dimension reduction from a decision-theoretic point of view. To address this issue, we in this paper establish the minimax rates of convergence for estimating the sparse SIR directions...
-
Authors: Katsevich, Eugene; Ramdas, Aaditya
Affiliations: Carnegie Mellon University
Abstract: While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (Statist. Sci. 26 (2011) 584-597) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing simultaneous FDP bounds are bas...
-
Authors: Castillo, Ismael; Roquain, Etienne
Affiliations: Universite Paris Cite; Sorbonne Universite
Abstract: This paper explores a connection between empirical Bayes posterior distributions and false discovery rate (FDR) control. In the Gaussian sequence model this work shows that empirical Bayes-calibrated spike and slab posterior distributions allow a correct FDR control under sparsity. Doing so, it offers a frequentist theoretical validation of empirical Bayes methods in the context of multiple testing. Our theoretical results are illustrated with numerical experiments.
-
Authors: Kennedy, Edward H.; Balakrishnan, Sivaraman; G'Sell, Max
Affiliations: Carnegie Mellon University
Abstract: It is well known that, without restricting treatment effect heterogeneity, instrumental variable (IV) methods only identify local effects among compliers, that is, those subjects who take treatment only when encouraged by the IV. Local effects are controversial since they seem to only apply to an unidentified subgroup; this has led many to denounce these effects as having little policy relevance. However, we show that such pessimism is not always warranted: it can be possible to accurately pre...