-
Author: Allen, Genevera I.
Affiliation: Rice University
-
Authors: Banerjee, Trambak; Mukherjee, Gourab; Sun, Wenguang
Affiliation: University of Southern California
Abstract: The article considers the problem of estimating a high-dimensional sparse parameter in the presence of side information that encodes the sparsity structure. We develop a general framework that involves first using an auxiliary sequence to capture the side information, and then incorporating the auxiliary sequence in inference to reduce the estimation risk. The proposed method, which carries out adaptive Stein's unbiased risk estimate-thresholding using side information (ASUS), is shown to have...
-
Authors: Romano, Yaniv; Sesia, Matteo; Candes, Emmanuel
Affiliation: Stanford University
Abstract: This article introduces a machine for sampling approximate model-X knockoffs for arbitrary and unspecified data distributions using deep generative models. The main idea is to iteratively refine a knockoff sampling mechanism until a criterion measuring the validity of the produced knockoffs is optimized; this criterion is inspired by the popular maximum mean discrepancy in machine learning and can be thought of as measuring the distance to pairwise exchangeability between original and knockoff...
-
Authors: Gerstenberger, Carina; Vogel, Daniel; Wendler, Martin
Affiliations: Ruhr University Bochum; University of Aberdeen; Universitat Greifswald
Abstract: In many applications it is important to know whether the amount of fluctuation in a series of observations changes over time. In this article, we investigate different tests for detecting changes in the scale of mean-stationary time series. The classical approach, based on the CUSUM test applied to the squared centered observations, is very vulnerable to outliers and impractical for heavy-tailed data, which leads us to contemplate test statistics based on alternative, less outlier-sensitive sc...
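Illustration (not from the article): the classical approach named in this abstract can be sketched as follows. This is a minimal sketch under a simplifying assumption: the observations are treated as independent, so the plain sample standard deviation of the squared centered values stands in for the long-run variance estimate a time-series version would need. The function name `cusum_scale_statistic` and the simulated data are hypothetical.

```python
import math
import random


def cusum_scale_statistic(x):
    """CUSUM statistic on squared centered observations (i.i.d. simplification)."""
    n = len(x)
    mean = sum(x) / n
    y = [(v - mean) ** 2 for v in x]  # squared centered observations
    y_bar = sum(y) / n
    # Assumption: sample standard deviation replaces a long-run variance estimate.
    sd = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))
    if sd == 0.0:
        return 0.0  # no variation in y at all, hence no evidence of a scale change
    partial, max_dev = 0.0, 0.0
    for v in y:
        partial += v - y_bar  # centered partial sum up to the current index
        max_dev = max(max_dev, abs(partial))
    return max_dev / (sd * math.sqrt(n))


rng = random.Random(0)
stable = [rng.gauss(0, 1) for _ in range(400)]
shifted = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(0, 3) for _ in range(200)]
stat_stable = cusum_scale_statistic(stable)
stat_shifted = cusum_scale_statistic(shifted)
```

Under the independence simplification, values well above roughly 1.36 (the asymptotic 5% critical value of the Kolmogorov distribution) point to a change in scale. Because the statistic is built from squared values, a single extreme observation can dominate it, which is the outlier sensitivity the abstract refers to.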
-
Authors: Li, Han; Xu, Minxuan; Liu, Jun S.; Fan, Xiaodan
Affiliations: Shenzhen University; Chinese University of Hong Kong; University of California System; University of California Los Angeles; Harvard University
Abstract: In this article, we study the rank aggregation problem, which aims to find a consensus ranking by aggregating multiple ranking lists. To address the problem probabilistically, we formulate an elaborate ranking model for full and partial rankings by generalizing the Mallows model. Our model assumes that the ranked data are generated through a multistage ranking process that is explicitly governed by parameters that measure the overall quality and stability of the process. The new model is quite...
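Illustration (not from the article): for orientation, here is a minimal sketch of the basic Mallows model that this abstract takes as its starting point. It is not the authors' multistage model. A ranking's probability decays exponentially with its Kendall tau distance from a central ranking. The names `kendall_tau_distance`, `mallows_pmf`, and the dispersion parameter `theta` are assumptions chosen for the example.

```python
import math
from itertools import permutations


def kendall_tau_distance(a, b):
    """Number of item pairs that the two rankings order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    mapped = [pos_b[item] for item in a]
    return sum(
        1
        for i in range(len(mapped))
        for j in range(i + 1, len(mapped))
        if mapped[i] > mapped[j]
    )


def mallows_pmf(center, theta):
    """Probability of every full ranking under a Mallows model (small n only)."""
    weights = {
        perm: math.exp(-theta * kendall_tau_distance(perm, center))
        for perm in permutations(center)
    }
    total = sum(weights.values())  # normalizing constant, by brute-force enumeration
    return {perm: w / total for perm, w in weights.items()}


pmf = mallows_pmf(("A", "B", "C"), theta=1.0)
```

With `theta = 0` every ranking is equally likely. Larger values concentrate the probability mass on the central ranking, which is what makes that ranking a natural consensus estimate.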
-
Authors: Fan, Yingying; Demirkaya, Emre; Li, Gaorong; Lv, Jinchi
Affiliations: University of Southern California; University of Tennessee System; University of Tennessee Knoxville; Beijing University of Technology
Abstract: Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this article, we provide theoretical foundations on the power and robustness of the model-X knockoffs procedure, recently introduced by Candes, Fan, Janson, and Lv, in the high-dimensional setting where the covariate distribution is characterized by a Gaussian graphical model. We establish that under mild regularity conditions, the power o...
-
Authors: Bradic, Jelena; Claeskens, Gerda; Gueuning, Thomas
Affiliations: University of California System; University of California San Diego; KU Leuven; KU Leuven
Abstract: Many scientific and engineering challenges, ranging from pharmacokinetic drug dosage allocation and personalized medicine to marketing mix (4Ps) recommendations, require an understanding of the unobserved heterogeneity to develop the best decision-making processes. In this article, we develop a hypothesis test and the corresponding p-value for testing the significance of the homogeneous structure in linear mixed models. A robust matching moment construction is used for creating a test that a...
-
Authors: Rosset, Saharon; Tibshirani, Ryan J.
Affiliations: Tel Aviv University; Carnegie Mellon University; Carnegie Mellon University
Abstract: In statistical prediction, classical approaches for model selection and model evaluation based on covariance penalties are still widely used. Most of the literature on this topic is based on what we call the Fixed-X assumption, where covariate values are assumed to be nonrandom. By contrast, it is often more reasonable to take a Random-X view, where the covariate values are independently drawn for both training and prediction. To study the applicability of covariance penalties in this setting,...
-
Authors: Stephenson, Briana J. K.; Herring, Amy H.; Olshan, Andrew
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Duke University; University of North Carolina; University of North Carolina Chapel Hill
Abstract: The National Birth Defects Prevention Study (NBDPS) is a case-control study of birth defects conducted across 10 U.S. states. Researchers are interested in characterizing the etiologic role of maternal diet, collected using a food frequency questionnaire. Because diet is multidimensional, dimension reduction methods such as cluster analysis are often used to summarize dietary patterns. In a large, heterogeneous population, traditional clustering methods, such as latent class analysis, used to ...
-
Authors: Tabouy, Timothee; Barbillon, Pierre; Chiquet, Julien
Affiliations: AgroParisTech; Universite Paris Saclay; INRAE
Abstract: This article deals with nonobserved dyads during the sampling of a network and consecutive issues in the inference of the stochastic block model (SBM). We review sampling designs and recover missing at random (MAR) and not missing at random (NMAR) conditions for the SBM. We introduce variants of the variational EM algorithm for inferring the SBM under various sampling designs (MAR and NMAR) all available as an R package. Model selection criteria based on integrated classification likelihood ar...