-
Authors: Lu, Xin; Liu, Tianle; Liu, Hanzhong; Ding, Peng
Affiliations: Tsinghua University; Harvard University; University of California System; University of California Berkeley
Abstract: Complete randomization balances covariates on average, but covariate imbalance often exists in finite samples. Rerandomization can ensure covariate balance in the realized experiment by discarding the undesired treatment assignments. Many field experiments in public health and social sciences assign the treatment at the cluster level due to logistical constraints or policy considerations. Moreover, they are frequently combined with rerandomization in the design stage. We refer to cluster reran...
-
Authors: van den Boom, W.; Reeves, G.; Dunson, D. B.
Affiliations: Yale NUS College; National University of Singapore; Duke University
-
Authors: Sei, T.; Komaki, F.
Affiliations: University of Tokyo
Abstract: A Bayesian prediction problem for the two-dimensional Wishart model is investigated within the framework of decision theory. The loss function is the Kullback-Leibler divergence. We construct a scale-invariant and permutation-invariant prior distribution that shrinks the correlation coefficient. The prior is the geometric mean of the right invariant prior with respect to permutation of the indices, and is characterized by a uniform distribution for Fisher's z-transformation of the correlation ...
-
Authors: Wang, Wenshuo; Janson, Lucas
Affiliations: Harvard University
Abstract: In many scientific applications, researchers aim to relate a response variable Y to a set of potential explanatory variables X = (X_1, ..., X_p) and start by trying to identify variables that contribute to this relationship. In statistical terms, this goal can be understood as trying to identify those X_j on which Y is conditionally dependent. Sometimes it is of value to simultaneously test for each j, which is more commonly known as variable selection. The conditional randomization test, CRT,...
-
Authors: Song, Hoseung; Chen, Hao
Affiliations: University of California System; University of California Davis
Abstract: A nonparametric framework for changepoint detection, based on scan statistics utilizing graphs that represent similarities among observations, is gaining attention owing to its flexibility and good performance for high-dimensional and non-Euclidean data sequences. However, this graph-based framework faces challenges when there are repeated observations in the sequence, which is often the case for discrete data such as network data. In this article we extend the graph-based framework to solve t...
-
Authors: Zhao, Peng; Yang, Yun; He, Qiao-Chu
Affiliations: Texas A&M University System; Texas A&M University College Station; University of Illinois System; University of Illinois Urbana-Champaign; Southern University of Science & Technology
Abstract: Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined through a discretized gradient dynamic system under overparameterization. We show that, under suitable restricted isometry conditions, overparameterization leads to implicit regularization: if we directly apply gradient descent to the residual sum of squares with ...
-
Authors: Zhao, Haibing
Affiliations: Shanghai University of Finance & Economics
Abstract: Post-selection inference on thousands of parameters has attracted considerable research interest in recent years. Specifically, Benjamini & Yekutieli (2005) considered constructing confidence intervals after selection. They proposed adjusting the confidence levels of marginal confidence intervals for the selected parameters to ensure control of the false coverage-statement rate. However, although Benjamini-Yekutieli confidence intervals are widely used, they are uniformly inflated. In this art...
-
Authors: Heinrich-Mertsching, Claudio; Fissler, Tobias
Affiliations: Vienna University of Economics & Business
Abstract: A statistical functional is said to be elicitable if there exists a loss or scoring function under which the functional is the optimal point forecast in expectation. While the mean and quantiles are elicitable, it has been shown that the mode is not elicitable if the true distribution can follow any Lebesgue density. We strengthen this result substantially, showing that the mode is not elicitable if the true distribution can be any strongly unimodal distribution with continuous Lebesgue d...
-
Authors: McKeague, Ian W.; Zhang, Xin
Affiliations: Columbia University; State University System of Florida; Florida State University
Abstract: We consider the problem of testing for the presence of linear relationships between large sets of random variables based on a postselection inference approach to canonical correlation analysis. The challenge is to adjust for the selection of subsets of variables having linear combinations with maximal sample correlation. To this end, we construct a stabilized one-step estimator of the Euclidean norm of the canonical correlations maximized over subsets of variables of prespecified cardinality. ...
-
Authors: Shi, Pixu; Zhou, Yuchen; Zhang, Anru R.
Affiliations: Duke University; University of Wisconsin System; University of Wisconsin Madison
Abstract: In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the classic log-contrast model is often used where read counts are normalized into compositions. However, zero read counts and the randomness in covariates remain critical issues. We introduce a surprisingly simple, interpretable and efficient method for the estimat...