-
Authors: Niu, Ziang; Chakraborty, Abhinav; Dukes, Oliver; Katsevich, Eugene
Affiliations: University of Pennsylvania; Ghent University
Abstract: Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in sample. We investigate the consequences of this choice through the lens of the distilled conditional randomization test (dCRT). We find that Type-I error control is st...
-
Authors: Chang, Jinyuan; Hu, Qiao; Kolaczyk, Eric D.; Yao, Qiwei; Yi, Fengting
Affiliations: Southwestern University of Finance & Economics - China; Chinese Academy of Sciences; McGill University; University of London; London School of Economics & Political Science; Yunnan University
Abstract: A standing challenge in data privacy is the trade-off between the level of privacy and the efficiency of statistical inference. Here, we conduct an in-depth study of this trade-off for parameter estimation in the beta-model (Ann. Appl. Probab. 21 (2011) 1400-1435) for edge differentially private network data. Unlike most previous approaches based on maximum likelihood estimation for this network model, we proceed via the method of moments. This choice facilitates our exploration of a substantia...
-
Authors: Mesters, Geert; Zwiernik, Piotr
Affiliations: Pompeu Fabra University; University of Toronto
Abstract: A seminal result in the ICA literature states that for AY = s, if the components of s are independent and at most one is Gaussian, then A is identified up to sign and permutation of its rows (Signal Process. 36 (1994)). In this paper we study to what extent the independence assumption can be relaxed by replacing it with restrictions on higher order moment or cumulant tensors of s. We document new conditions that establish identification for several nonindependent component models, for example...
-
Authors: Zhao, Alex; Li, Changcheng; Li, Runze; Zhang, Zhe
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Dalian University of Technology
Abstract: This paper is concerned with statistical inference for regression coefficients in high-dimensional linear regression models. We propose a new method for testing the coefficient vector of the high-dimensional linear models, and establish the asymptotic normality of our proposed test statistic with the aid of the martingale central limit theorem. We derive the asymptotic relative efficiency (ARE) of the proposed test with respect to the test proposed in [...] and show that the ARE is always greater than or equal to o...
-
Authors: Soloff, Jake A.; Xiang, Daniel; Fithian, William
Affiliations: University of Chicago; University of California System; University of California Berkeley
Abstract: Despite the popularity of the false discovery rate (FDR) as an error control metric for large-scale multiple testing, its close Bayesian counterpart the local false discovery rate (lfdr), defined as the posterior probability that a particular null hypothesis is true, is a more directly relevant standard for justifying and interpreting individual rejections. However, the lfdr is difficult to work with in small samples, as the prior distribution is typically unknown. We propose a simple multipl...
-
Authors: Cheng, Chen; Montanari, Andrea
Affiliations: Stanford University; Stanford University
Abstract: Random matrix theory has become a widely useful tool in high-dimensional statistics and theoretical machine learning. However, random matrix theory is largely focused on the proportional asymptotics in which the number of columns grows proportionally to the number of rows of the data matrix. This is not always the most natural setting in statistics where columns correspond to covariates and rows to samples. With the objective to move beyond the proportional asymptotics, we revisit ridge regres...
-
Authors: Gu, Jia; Chen, Song Xi
Affiliations: Zhejiang University; Tsinghua University
Abstract: This paper considers decentralized Federated Learning (FL) under heterogeneous distributions among distributed clients or data blocks for the M-estimation. The mean squared error and consensus error across the estimators from different clients via the decentralized stochastic gradient descent algorithm are derived. The asymptotic normality of the Polyak-Ruppert (PR) averaged estimator in the decentralized distributed setting is attained, which shows that its statistical efficiency comes at a co...
-
Authors: Zhang, Zhenyuan; Ramdas, Aaditya; Wang, Ruodu
Affiliations: Stanford University; Carnegie Mellon University; University of Waterloo
Abstract: Given a composite null P and composite alternative Q, when and how can we construct a p-value whose distribution is exactly uniform under the null, and stochastically smaller than uniform under the alternative? Similarly, when and how can we construct an e-value whose expectation exactly equals one under the null, but its expected logarithm under the alternative is positive? We answer these basic questions, and other related ones, when P and Q are convex polytopes (in the space of probability ...
-
Authors: Chen, YinFeng; Jiao, YuLing; Qiu, Rui; Hu, Zhou
Affiliations: East China Normal University; Wuhan University
Abstract: Linear sufficient dimension reduction, as exemplified by sliced inverse regression, has seen substantial development in the past thirty years. However, with the advent of more complex scenarios, nonlinear dimension reduction has gained considerable interest recently. This paper introduces a novel method for nonlinear sufficient dimension reduction, utilizing the generalized martingale difference divergence measure in conjunction with deep neural networks. The optimal solution of the proposed o...
-
Authors: Ding, Yi; Zheng, Xinghua
Affiliations: University of Macau; Hong Kong University of Science & Technology
Abstract: We study the estimation of high-dimensional covariance matrices and their empirical spectral distributions under dynamic volatility models. Data under such models have nonlinear dependency both cross-sectionally and temporally. We establish the condition under which the limiting spectral distribution (LSD) of the sample covariance matrix under scalar BEKK models is different from the i.i.d. case. We then propose a time-variation adjusted (TV-adj) sample covariance matrix and prove that its LSD...