-
Authors: Ignatiadis, Nikolaos; Saha, Sujayam; Sun, Dennis L.; Muralidharan, Omkar
Affiliations: Stanford University; Alphabet Inc.; Google Incorporated; California State University System; California Polytechnic State University San Luis Obispo
Abstract: We study empirical Bayes estimation of the effect sizes of N units from K noisy observations on each unit. We show that it is possible to achieve near-Bayes optimal mean squared error, without any assumptions or knowledge about the effect size distribution or the noise. The noise distribution can be heteroscedastic and vary arbitrarily from unit to unit. Our proposal, which we call Aurora, leverages the replication inherent in the K observations per unit and recasts the effect size estimation ...
-
Authors: Cai, Chencheng; Chen, Rong; Xie, Min-ge
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University; Rutgers University System; Rutgers University New Brunswick
Abstract: Many massive data sets are assembled through collections of information of a large number of individuals in a population. The analysis of such data, especially in the aspect of individualized inferences and solutions, has the potential to create significant value for practical applications. Traditionally, inference for an individual in the dataset is either solely relying on the information of the individual or from summarizing the information about the whole population. However, with the avai...
-
Authors: Xing, Xin; Zhao, Zhigen; Liu, Jun S.
Affiliations: Virginia Polytechnic Institute & State University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University; Harvard University
Abstract: Simultaneously finding multiple influential variables and controlling the false discovery rate (FDR) for linear regression models is a fundamental problem. We here propose the Gaussian Mirror (GM) method, which creates for each predictor variable a pair of mirror variables by adding and subtracting a randomly generated Gaussian perturbation, and proceeds with a certain regression method, such as ordinary least squares or the Lasso (the mirror variables can also be created after selection)....
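The mirror-variable construction described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the fixed perturbation scale `c`, and the choice of ordinary least squares are hypothetical, and because the abstract is truncated, the statistic `|b_plus + b_minus| - |b_plus - b_minus|` is an assumed way of combining the two mirror coefficients.

```python
import numpy as np


def mirror_pair(x_j, c, rng):
    """Return (x_j + c*z, x_j - c*z) for a fresh Gaussian perturbation z."""
    z = rng.standard_normal(x_j.shape[0])
    return x_j + c * z, x_j - c * z


def mirror_statistic_ols(X, y, j, c, rng):
    """Assumed mirror statistic for predictor j, fitted by least squares.

    Column j of X is replaced by its two mirror variables; the remaining
    columns are kept unchanged. The scale c is a hypothetical fixed value.
    """
    x_plus, x_minus = mirror_pair(X[:, j], c, rng)
    X_rest = np.delete(X, j, axis=1)
    design = np.column_stack([x_plus, x_minus, X_rest])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    b_plus, b_minus = beta[0], beta[1]
    # Large and positive when both mirror coefficients agree in sign and size;
    # roughly symmetric around zero when predictor j carries no signal.
    return abs(b_plus + b_minus) - abs(b_plus - b_minus)
```

Under these assumptions, a predictor with a real effect tends to produce a clearly positive statistic, while a null predictor produces values scattered around zero, which is the kind of asymmetry an FDR-controlling threshold can exploit.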
-
Authors: Niezink, Nynke M. D.
Affiliations: Carnegie Mellon University
-
Authors: Dai, Chenguang; Lin, Buyu; Xing, Xin; Liu, Jun S.
Affiliations: Harvard University; Virginia Polytechnic Institute & State University
Abstract: Selecting relevant features associated with a given response variable is an important problem in many scientific fields. Quantifying quality and uncertainty of a selection result via false discovery rate (FDR) control has been of recent interest. This article introduces a data-splitting method (referred to as DS) to asymptotically control the FDR while maintaining a high power. For each feature, DS constructs a test statistic by estimating two independent regression coefficients via data split...
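The idea of estimating two independent coefficients per feature from a data split can be sketched as below. This is a minimal illustration under assumptions, not the authors' procedure: the function name is hypothetical, ordinary least squares on a low-dimensional design stands in for whatever estimator the article uses, and since the abstract is truncated, the combination `sign(b1 * b2) * (|b1| + |b2|)` is an assumed form of the test statistic.

```python
import numpy as np


def ds_test_statistics(X, y, rng):
    """Split the rows in two, fit least squares on each half, and combine.

    Returns one statistic per feature. The combining rule is an assumption:
    it is positive when the two independent estimates agree in sign and
    negative when they disagree.
    """
    n = X.shape[0]
    perm = rng.permutation(n)
    half1, half2 = perm[: n // 2], perm[n // 2:]
    b1, *_ = np.linalg.lstsq(X[half1], y[half1], rcond=None)
    b2, *_ = np.linalg.lstsq(X[half2], y[half2], rcond=None)
    return np.sign(b1 * b2) * (np.abs(b1) + np.abs(b2))
```

With this assumed rule, a relevant feature yields a large positive value because both halves recover the same sign, whereas an irrelevant feature yields a small value whose sign is essentially arbitrary.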
-
Authors: Avarucci, Marco; Zaffaroni, Paolo
Affiliations: University of Glasgow; Imperial College London; Sapienza University Rome
Abstract: This article studies estimation of linear panel regression models with heterogeneous coefficients using a class of weighted least squares estimators, when both the regressors and the error possibly contain a common latent factor structure. Our theory is robust to the specification of such a factor structure because it does not require any information on the number of factors or estimation of the factor structure itself. Moreover, our theory is efficient, in certain circumstances, because it ne...
-
Authors: Alquier, Pierre; Cherief-Abdellatif, Badr-Eddine; Derumigny, Alexis; Fermanian, Jean-David
Affiliations: RIKEN; University of Oxford; Delft University of Technology; Institut Polytechnique de Paris; ENSAE Paris
Abstract: This article deals with robust inference for parametric copula models. Estimation using canonical maximum likelihood might be unstable, especially in the presence of outliers. We propose to use a procedure based on the maximum mean discrepancy (MMD) principle. We derive nonasymptotic oracle inequalities, consistency and asymptotic normality of this new estimator. In particular, the oracle inequality holds without any assumption on the copula family, and can be applied in the presence of outlie...
-
Authors: Li, Sai; Cai, T. Tony; Li, Hongzhe
Affiliations: Renmin University of China; University of Pennsylvania
Abstract: Transfer learning for high-dimensional Gaussian graphical models (GGMs) is studied. The target GGM is estimated by incorporating the data from similar and related auxiliary studies, where the similarity between the target graph and each auxiliary graph is characterized by the sparsity of a divergence matrix. An estimation algorithm, Trans-CLIME, is proposed and shown to attain a faster convergence rate than the minimax rate in the single-task setting. Furthermore, we introduce a universal debi...
-
Authors: Ma, Pulong; Bhadra, Anindya
Affiliations: Clemson University; Purdue University System; Purdue University
Abstract: The Matérn covariance function is a popular choice for prediction in spatial statistics and uncertainty quantification literature. A key benefit of the Matérn class is that it is possible to get precise control over the degree of mean-square differentiability of the random process. However, the Matérn class possesses exponentially decaying tails, and thus, may not be suitable for modeling polynomially decaying dependence. This problem can be remedied using polynomial covariances; however, one ...
-
Authors: Zhen, Yaoming; Wang, Junhui
Affiliations: City University of Hong Kong
Abstract: Conventional network data have largely focused on pairwise interactions between two entities, yet multi-way interactions among multiple entities have been frequently observed in real-life hypergraph networks. In this article, we propose a novel method for detecting community structure in general hypergraph networks, uniform or non-uniform. The proposed method introduces a null vertex to augment a nonuniform hypergraph into a uniform multi-hypergraph, and then embeds the multi-hypergraph in a l...