-
Authors: Chen, Elynn Y.; Tsay, Ruey S.; Chen, Rong
Affiliations: Princeton University; University of Chicago; Rutgers University System; Rutgers University New Brunswick
Abstract: High-dimensional matrix-variate time series data are becoming widely available in many scientific fields, such as economics, biology, and meteorology. To achieve significant dimension reduction while preserving the intrinsic matrix structure and temporal dynamics in such data, Wang, Liu, and Chen proposed a matrix factor model that has been shown to provide effective analysis. In this article, we establish a general framework for incorporating domain and prior knowledge in the matrix fa...
-
Authors: Wang, Jingshen; He, Xuming; Xu, Gongjun
Affiliations: University of Michigan System; University of Michigan
Abstract: This article concerns the potential bias in statistical inference on treatment effects when a large number of covariates are present in a linear or partially linear model. While the estimation bias in an under-fitted model is well understood, we address a lesser-known bias that arises from an over-fitted model. The over-fitting bias can be eliminated through data splitting at the cost of statistical efficiency, and we show that smoothing over random data splits can be pursued to mitigate the e...
-
Authors: Wu, Peng; Zeng, Donglin; Wang, Yuanjia
Affiliations: Columbia University; University of North Carolina; University of North Carolina Chapel Hill
Abstract: Current guidelines for treatment decision making largely rely on data from randomized controlled trials (RCTs) studying average treatment effects. They may be inadequate for making individualized treatment decisions in real-world settings. Large-scale electronic health records (EHR) provide opportunities to fulfill the goals of personalized medicine and learn individualized treatment rules (ITRs) depending on patient-specific characteristics from real-world patient data. In this work, we tackle c...
-
Authors: Allen, Genevera I.
Affiliations: Rice University
-
Authors: Banerjee, Trambak; Mukherjee, Gourab; Sun, Wenguang
Affiliations: University of Southern California
Abstract: The article considers the problem of estimating a high-dimensional sparse parameter in the presence of side information that encodes the sparsity structure. We develop a general framework that involves first using an auxiliary sequence to capture the side information, and then incorporating the auxiliary sequence in inference to reduce the estimation risk. The proposed method, which carries out adaptive Stein's unbiased risk estimate-thresholding using side information (ASUS), is shown to have...
-
Authors: Romano, Yaniv; Sesia, Matteo; Candes, Emmanuel
Affiliations: Stanford University
Abstract: This article introduces a machine for sampling approximate model-X knockoffs for arbitrary and unspecified data distributions using deep generative models. The main idea is to iteratively refine a knockoff sampling mechanism until a criterion measuring the validity of the produced knockoffs is optimized; this criterion is inspired by the popular maximum mean discrepancy in machine learning and can be thought of as measuring the distance to pairwise exchangeability between original and knockoff...
-
Authors: Gerstenberger, Carina; Vogel, Daniel; Wendler, Martin
Affiliations: Ruhr University Bochum; University of Aberdeen; Universitat Greifswald
Abstract: In many applications it is important to know whether the amount of fluctuation in a series of observations changes over time. In this article, we investigate different tests for detecting changes in the scale of mean-stationary time series. The classical approach, based on the CUSUM test applied to the squared centered observations, is very vulnerable to outliers and impractical for heavy-tailed data, which leads us to contemplate test statistics based on alternative, less outlier-sensitive sc...
-
Authors: Li, Han; Xu, Minxuan; Liu, Jun S.; Fan, Xiaodan
Affiliations: Shenzhen University; Chinese University of Hong Kong; University of California System; University of California Los Angeles; Harvard University
Abstract: In this article, we study the rank aggregation problem, which aims to find a consensus ranking by aggregating multiple ranking lists. To address the problem probabilistically, we formulate an elaborate ranking model for full and partial rankings by generalizing the Mallows model. Our model assumes that the ranked data are generated through a multistage ranking process that is explicitly governed by parameters that measure the overall quality and stability of the process. The new model is quite...
-
Authors: Fan, Yingying; Demirkaya, Emre; Li, Gaorong; Lv, Jinchi
Affiliations: University of Southern California; University of Tennessee System; University of Tennessee Knoxville; Beijing University of Technology
Abstract: Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this article, we provide theoretical foundations on the power and robustness of the model-X knockoffs procedure, introduced recently by Candes, Fan, Janson, and Lv, in the high-dimensional setting when the covariate distribution is characterized by a Gaussian graphical model. We establish that under mild regularity conditions, the power o...
-
Authors: Bradic, Jelena; Claeskens, Gerda; Gueuning, Thomas
Affiliations: University of California System; University of California San Diego; KU Leuven
Abstract: Many scientific and engineering challenges, ranging from pharmacokinetic drug dosage allocation and personalized medicine to marketing mix (4Ps) recommendations, require an understanding of the unobserved heterogeneity to develop the best decision-making processes. In this article, we develop a hypothesis test and the corresponding p-value for testing the significance of the homogeneous structure in linear mixed models. A robust matching moment construction is used for creating a test that a...