-
Authors: Romano, Gaetano; Rigaill, Guillem; Runge, Vincent; Fearnhead, Paul
Affiliations: Lancaster University; Universite Paris Saclay; Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); INRAE; Centre National de la Recherche Scientifique (CNRS); Universite Paris Saclay
Abstract: While there are a plethora of algorithms for detecting changes in mean in univariate time-series, almost all struggle in real applications where there is autocorrelated noise or where the mean fluctuates locally between the abrupt changes that one wishes to detect. In these cases, default implementations, which are often based on assumptions of a constant mean between changes and independent noise, can lead to substantial over-estimation of the number of changes. We propose a principled approa...
-
Authors: Shi, Chengchun; Li, Lexin
Affiliations: University of California System; University of California Berkeley
Abstract: A central question in high-dimensional mediation analysis is to infer the significance of individual mediators. The main challenge is that the total number of potential paths that go through any mediator is super-exponential in the number of mediators. Most existing mediation inference solutions either explicitly impose that the mediators are conditionally independent given the exposure, or ignore any potential directed paths among the mediators. In this article, we propose a novel hypothesis ...
-
Authors: Paindaveine, Davy; Rasoafaraniaina, Josea; Verdebout, Thomas
Affiliations: Universite Libre de Bruxelles; Universite Libre de Bruxelles; Universite de Toulouse; Universite Toulouse 1 Capitole; Toulouse School of Economics
Abstract: Multisample covariance estimation, that is, estimation of the covariance matrices associated with k distinct populations, is a classical problem in multivariate statistics. A common solution is to base estimation on the outcome of a test that these covariance matrices show some given pattern. Such a preliminary test may, for example, investigate whether or not the various covariance matrices are equal to each other (test of homogeneity), or whether or not they have common eigenvectors (test of c...
-
Authors: Nandy, Debmalya; Chiaromonte, Francesca; Li, Runze
Affiliations: Colorado School of Public Health; University of Colorado System; University of Colorado Anschutz Medical Campus; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Scuola Superiore Sant'Anna; Scuola Superiore Sant'Anna
Abstract: Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates (p >> n), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covar...
-
Authors: Abadie, Alberto; Spiess, Jann
Affiliations: Massachusetts Institute of Technology (MIT); Stanford University
Abstract: Nearest-neighbor matching is a popular nonparametric tool to create balance between treatment and control groups in observational studies. As a preprocessing step before regression, matching reduces the dependence on parametric modeling assumptions. In current empirical practice, however, the matching step is often ignored in the calculation of standard errors and confidence intervals. In this article, we show that ignoring the matching step results in asymptotically valid standard errors if m...
-
Authors: Liu, Yang; Hu, Feifang
Affiliations: George Washington University
Abstract: Balancing important covariates is often critical in clinical trials and causal inference. Stratified permuted block (STR-PB) and covariate-adaptive randomization (CAR) procedures are widely used to balance observed covariates in practice. The balance properties of these procedures with respect to the observed covariates have been well studied. However, it has been questioned whether these methods will also yield a good balance for the unobserved covariates. In this article, we develop a genera...
-
Authors: Lemyre, Felix Camirand; Carroll, Raymond J.; Delaigle, Aurore
Affiliations: University of Sherbrooke; University of Sherbrooke; Texas A&M University System; Texas A&M University College Station; University of Technology Sydney; University of Melbourne; University of Melbourne
Abstract: Dietary data collected from 24-hour dietary recalls are observed with significant measurement errors. In the nonparametric curve estimation literature, much of the effort has been devoted to designing methods that are consistent under contamination by noise, and which have been traditionally applied for analyzing those data. However, some foods such as alcohol or fruits are consumed only episodically, and may not be consumed during the day when the 24-hour recall is administered. These so-call...
-
Authors: Dai, Ben; Shen, Xiaotong; Wang, Junhui
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; City University of Hong Kong
Abstract: Numerical embedding has become one standard technique for processing and analyzing unstructured data that cannot be expressed in a predefined fashion. It stores the main characteristics of data by mapping it onto a numerical vector. An embedding is often unsupervised and constructed by transfer learning from large-scale unannotated data. Given an embedding, a downstream learning method, referred to as a two-stage method, is applicable to unstructured data. In this article, we introduce a novel...
-
Authors: Xie, Dongyue; Stephens, Matthew
Affiliations: University of Chicago; University of Chicago
-
Authors: Park, Hyunwoo
Affiliations: Seoul National University (SNU)