-
Authors: Montanari, Andrea; Zhong, Yiqiao
Affiliations: Stanford University; Stanford University
Abstract: Modern neural networks are often operated in a strongly overparametrized regime: they comprise so many parameters that they can interpolate the training set, even if actual labels are replaced by purely random ones. Despite this, they achieve good prediction error on unseen data: interpolating the training set does not lead to a large generalization error. Further, overparametrization appears to be beneficial in that it simplifies the optimization landscape. Here, we study these phenomena in t...
-
Authors: Austern, Morgane; Orbanz, Peter
Affiliations: Harvard University; University of London; University College London
Abstract: A distributional symmetry is invariance of a distribution under a group of transformations. Exchangeability and stationarity are examples. We explain that a result of ergodic theory implies a law of large numbers for such invariant distributions: If the group satisfies suitable conditions, expectations can be estimated by averaging over subsets of transformations, and these estimators are strongly consistent. We show that, if a mixing condition holds, the averages also satisfy a central limit ...
-
Authors: Huang, Hsueh-Han; Chan, Ngai Hang; Chen, Kun; Ing, Ching-Kang
Affiliations: National Tsing Hua University; Chinese University of Hong Kong; Southwestern University of Finance & Economics - China
Abstract: Estimating the orders of the autoregressive fractionally integrated moving average (ARFIMA) model has been a long-standing problem in time series analysis. This paper tackles this challenge by establishing the consistency of the Bayesian information criterion (BIC) for ARFIMA models with independent errors. Since the memory parameter of the model can be any real number, this consistency result is valid for short memory, long memory and nonstationary time series. This paper further extends the ...
-
Authors: Gamarnik, David; Zadik, Ilias
Affiliations: Massachusetts Institute of Technology (MIT)
Abstract: We consider a sparse high-dimensional regression model where the goal is to recover a k-sparse unknown binary vector $\beta^*$ from n noisy linear observations of the form $Y = X\beta^* + W \in \mathbb{R}^n$, where $X \in \mathbb{R}^{n \times p}$ has i.i.d. $N(0, 1)$ entries and $W \in \mathbb{R}^n$ has i.i.d. $N(0, \sigma^2)$ entries. In the high signal-to-noise ratio regime and sublinear sparsity regime, while the order of the sample size needed to recover the unknown vector information-theoretically is kn...
-
Authors: Xia, Dong; Zhang, Anru R.; Zhou, Yuchen
Affiliations: Hong Kong University of Science & Technology; Duke University; Duke University; Duke University; University of Pennsylvania
Abstract: In this paper, we consider the statistical inference for several low-rank tensor models. Specifically, in the Tucker low-rank tensor PCA or regression model, provided with any estimates achieving some attainable error rate, we develop the data-driven confidence regions for the singular subspace of the parameter tensor based on the asymptotic distribution of an updated estimate by two-iteration alternating minimization. The asymptotic distributions are established under some essential condition...
-
Authors: Gao, Chao; Ma, Zongming
Affiliations: University of Chicago; University of Pennsylvania
Abstract: In this paper, we test whether two data sets measured on the same set of subjects share a common clustering structure. As a leading example, we focus on comparing clustering structures in two independent random samples from two deterministic two-component mixtures of multivariate Gaussian distributions. Mean parameters of these Gaussian distributions are treated as potentially unknown nuisance parameters and are allowed to differ. Assuming knowledge of mean parameters, we first determine the p...
-
Authors: Yuan, Mingao; Liu, Ruiqi; Feng, Yang; Shang, Zuofeng
Affiliations: North Dakota State University Fargo; Texas Tech University System; Texas Tech University; New York University; New Jersey Institute of Technology
Abstract: Many complex networks in the real world can be formulated as hypergraphs where community detection has been widely used. However, the fundamental question of whether communities exist or not in an observed hypergraph remains unclear. This work aims to tackle this important problem. Specifically, we systematically study when a hypergraph with community structure can be successfully distinguished from its Erdős-Rényi counterpart, and propose concrete test statistics when the models are distingui...
-
Authors: Goto, Yuichi; Kley, Tobias; Van Hecke, Ria; Volgushev, Stanislav; Dette, Holger; Hallin, Marc
Affiliations: Kyushu University; University of Gottingen; Ruhr University Bochum; University of Toronto; Universite Libre de Bruxelles; Universite Libre de Bruxelles
Abstract: Frequency domain methods form a ubiquitous part of the statistical toolbox for time-series analysis. In recent years, considerable interest has been given to the development of new spectral methodology and tools capturing dynamics in the entire joint distributions, and thus avoiding the limitations of classical, L2-based spectral methods. Most of the spectral concepts proposed in that literature suffer from one major drawback, though: their estimation requires the choice of a smoothing param...
-
Authors: Hanneke, Steve; Kpotufe, Samory
Affiliations: Purdue University System; Purdue University; Columbia University
Abstract: Multitask learning and related areas such as multisource domain adaptation address modern settings where data sets from N related distributions {P_t} are to be combined toward improving performance on any single such distribution D. A perplexing fact remains in the evolving theory on the subject: while we would hope for performance bounds that account for the contribution from multiple tasks, the vast majority of analyses result in bounds that improve at best in the number n of samples per ...
-
Authors: Ekvall, Karl Oskar; Bottai, Matteo
Affiliations: Karolinska Institutet
Abstract: We propose confidence regions with asymptotically correct uniform coverage probability of parameters whose Fisher information matrix can be singular at important points of the parameter set. Our work is motivated by the need for reliable inference on scale parameters close or equal to zero in mixed models, which is obtained as a special case. The confidence regions are constructed by inverting a continuous extension of the score test statistic standardized by expected information, which we sho...