-
Authors: Han, Xiao; Yang, Qing; Fan, Yingying
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; University of Southern California
Abstract: Determining the precise rank is an important problem in many large-scale applications with matrix data exploiting low-rank plus noise models. In this paper, we suggest a universal approach to rank inference via residual sub-sampling (RIRS) for testing and estimating rank in a wide family of models, including many popularly used network models such as the degree corrected mixed membership model as a special case. Our procedure constructs a test statistic via subsampling entries of the residual ...
-
Authors: Steinberger, Lukas; Leeb, Hannes
Affiliations: University of Vienna
Abstract: We investigate generically applicable and intuitively appealing prediction intervals based on k-fold cross-validation. We focus on the conditional coverage probability of the proposed intervals, given the observations in the training sample (hence, training conditional validity), and show that it is close to the nominal level, in an appropriate sense, provided that the underlying algorithm used for computing point predictions is sufficiently stable when feature-response pairs are omitted. Our...
-
Authors: Komarova, Tatiana; Hidalgo, Javier
Affiliations: University of Manchester; University of London; London School of Economics & Political Science
Abstract: We describe and examine a test for a general class of shape constraints, such as signs of derivatives, U-shape, quasi-convexity, log-convexity, among others, in a nonparametric framework using partial sums empirical processes. We show that, after a suitable transformation, its asymptotic distribution is a functional of a Brownian motion indexed by the c.d.f. of the regressor. As a result, the test is distribution-free and critical values are readily available. However, due to the possible poor a...
-
Authors: Weinstein, Asaf; Su, Weijie J.; Bogdan, Malgorzata; Barber, Rina Foygel; Candes, Emmanuel J.
Affiliations: Hebrew University of Jerusalem; University of Pennsylvania; University of Wroclaw; University of Chicago; Stanford University; Stanford University
Abstract: Variable selection properties of procedures utilizing penalized-likelihood estimates are a central topic in the study of high-dimensional linear regression problems. Existing literature emphasizes the quality of ranking of the variables by such procedures as reflected in the receiver operating characteristic curve or in prediction performance. Specifically, recent works have harnessed modern theory of approximate message-passing (AMP) to obtain, in a particular setting, exact asymptotic predict...
-
Authors: Abbe, Emmanuel; Li, Shuangping; Sly, Allan
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; Princeton University; Princeton University
Abstract: The problem of learning graphons has attracted considerable attention across several scientific communities, with significant progress over the recent years in sparser regimes. Yet, the current techniques still require diverging degrees in order to succeed with efficient algorithms in the challenging cases where the local structure of the graph is homogeneous. This paper provides an efficient algorithm to learn graphons in the constant expected degree regime. The algorithm is shown to succe...
-
Authors: Lyu, Zhongyuan; Xia, Dong
Affiliations: Hong Kong University of Science & Technology
Abstract: Structural matrix-variate observations routinely arise in diverse fields such as multilayer network analysis and brain image clustering. While data of this type have been extensively investigated with fruitful outcomes being delivered, fundamental questions such as statistical optimality and computational limits are largely under-explored. In this paper, we propose a low-rank Gaussian mixture model (LrMM) assuming each matrix-valued observation has a planted low-rank structure. Minimax low...
-
Authors: Wang, Di; Tsay, Ruey S.
Affiliations: Shanghai Jiao Tong University; University of Chicago
Abstract: High-dimensional time series data appear in many scientific areas in the current data-rich environment. Analysis of such data poses new challenges to data analysts because of not only the complicated dynamic dependence between the series, but also the existence of aberrant observations, such as missing values, contaminated observations, and heavy-tailed distributions. For high-dimensional vector autoregressive (VAR) models, we introduce a unified estimation procedure that is robust to model mi...
-
Authors: Fan, Jianqing; Lou, Zhipeng; Yu, Mengxin
Affiliations: Princeton University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; University of Pennsylvania
Abstract: A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference. Yet, the existing developments such as Winsorization, Huberization and median of means require bounded second moments and involve variable-dependent tuning parameters, which hamper their fidelity in applications to large-scale problems. To liberate these constraints, this paper revisits the celebrated Hodges-Lehmann (...
-
Authors: Li, Harrison H.; Owen, Art B.
Affiliations: Stanford University
Abstract: Tie-breaker designs trade off a measure of statistical efficiency against a short-term gain from preferentially treating subjects with higher values of a running variable x. The efficiency measure can be any continuous function of the expected information matrix in a two-line regression model. The short-term gain is expressed as the covariance between the running variable and the treatment indicator. We investigate how to choose design functions p(x) specifying the probability of treating a subject with running variable x in order to optimize the...
-
Authors: Qiu, Jiaxin; Li, Zeng; Yao, Jianfeng
Affiliations: University of Hong Kong; Southern University of Science & Technology; The Chinese University of Hong Kong, Shenzhen
Abstract: The asymptotic normality for a large family of eigenvalue statistics of a general sample covariance matrix is derived under the ultrahigh-dimensional setting, that is, when the dimension to sample size ratio p/n → ∞. Based on this CLT result, we extend the covariance matrix test problem to the new ultrahigh-dimensional context, and apply it to test a matrix-valued white noise. Simulation experiments are conducted for the investigation of finite-sample properties of the general as...