-
Authors: Fortini, Sandra; Petrone, Sonia
Affiliations: Bocconi University
Abstract: Bayesian methods are often optimal, yet increasing pressure for fast computations, especially with streaming data, brings renewed interest in faster, possibly suboptimal, solutions. The extent to which these algorithms approximate Bayesian solutions is a question of interest, but often unanswered. We propose a methodology to address this question in predictive settings, when the algorithm can be reinterpreted as a probabilistic predictive rule. We specifically develop the proposed methodology ...
-
Authors: Berrett, Thomas B.; Wang, Yi; Barber, Rina Foygel; Samworth, Richard J.
Affiliations: University of Cambridge; University of Chicago
Abstract: We propose a general new method, the conditional permutation test, for testing the conditional independence of variables X and Y given a potentially high dimensional random vector Z that may contain confounding factors. The test permutes entries of X non-uniformly, to respect the existing dependence between X and Z and thus to account for the presence of these confounders. Like the conditional randomization test of Candes and co-workers in 2018, our test relies on the availability of an approx...
-
Authors: Dette, Holger; Kokot, Kevin; Volgushev, Stanislav
Affiliations: Ruhr University Bochum; University of Toronto
Abstract: We develop methodology for testing relevant hypotheses about functional time series in a tuning-free way. Instead of testing for exact equality, e.g. for the equality of two mean functions from two independent time series, we propose to test the null hypothesis of no relevant deviation. In the two-sample problem this means that an L2-distance between the two mean functions is smaller than a prespecified threshold. For such hypotheses self-normalization, which was introduced in 2010 by Shao, an...
-
Authors: Yang, Shu; Kim, Jae Kwang; Song, Rui
Affiliations: North Carolina State University; Iowa State University
Abstract: We consider integrating a non-probability sample with a probability sample which provides high dimensional representative covariate information of the target population. We propose a two-step approach for variable selection and finite population inference. In the first step, we use penalized estimating equations with folded concave penalties to select important variables and show selection consistency for general samples. In the second step, we focus on a doubly robust estimator of the finite ...
-
Authors: Gorgi, Paolo
Affiliations: Vrije Universiteit Amsterdam; Tinbergen Institute
Abstract: The paper introduces a general class of heavy-tailed auto-regressions for modelling integer-valued time series with outliers. The specification proposed is based on a heavy-tailed mixture of negative binomial distributions that features an observation-driven dynamic equation for the conditional expectation. The existence of a stationary and ergodic solution for the class of auto-regressive processes is shown under general conditions. The estimation of the model can be easily performed by maxim...
-
Authors: Richardson, Robert; Kottas, Athanasios; Sanso, Bruno
Affiliations: Brigham Young University; University of California System; University of California Santa Cruz
Abstract: An integro-difference equation can be represented as a hierarchical spatiotemporal dynamic model using appropriate parameterizations. The dynamics of the process defined by an integro-difference equation depends on the choice of a bivariate kernel distribution, where more flexible shapes generally result in more flexible models. Under a Bayesian modelling framework, we consider the use of the stable family of distributions for the kernel, as they are infinitely divisible and offer a variety of...
-
Authors: Chen, Fan; Zhang, Yini; Rohe, Karl
Affiliations: University of Wisconsin System; University of Wisconsin Madison
Abstract: The paper provides statistical theory and intuition for personalized PageRank (called 'PPR'): a popular technique that samples a small community from a massive network. We study a setting where the entire network is expensive to obtain thoroughly or to maintain, but we can start from a seed node of interest and 'crawl' the network to find other nodes through their connections. By crawling the graph in a designed way, the PPR vector can be approximated without querying the entire massive graph,...
-
Authors: Rad, Kamiar Rahnama; Maleki, Arian
Affiliations: City University of New York (CUNY) System; Columbia University
Abstract: The paper considers the problem of out-of-sample risk estimation in high dimensional settings where standard techniques such as K-fold cross-validation suffer from large biases. Motivated by the low bias of the leave-one-out cross-validation method, we propose a computationally efficient closed form approximate leave-one-out formula (ALO) for a large class of regularized estimators. Given the regularized estimate, calculating ALO requires a minor computational overhead. With minor assumpti...
-
Authors: Shah, Rajen D.; Frot, Benjamin; Thanei, Gian-Andrea; Meinshausen, Nicolai
Affiliations: University of Cambridge; Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: We consider the problem of estimating a high dimensional $p \times p$ covariance matrix $\Sigma$, given $n$ observations of confounded data with covariance $\Sigma + \Gamma\Gamma^{T}$, where $\Gamma$ is an unknown $p \times q$ matrix of latent factor loadings. We propose a simple and scalable estimator based on the projection onto the right singular vectors of the observed data matrix, which we call right singular vector projection (RSVP). Our theoretical analysis of this method reveals that, in contrast with approaches based ...
-
Authors: Rosenblum, Michael; Fang, Ethan X.; Liu, Han
Affiliations: Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Northwestern University
Abstract: Adaptive enrichment designs involve preplanned rules for modifying enrolment criteria based on accruing data in a randomized trial. We focus on designs where the overall population is partitioned into two predefined subpopulations, e.g. based on a biomarker or risk score measured at baseline. The goal is to learn which populations benefit from an experimental treatment. Two critical components of adaptive enrichment designs are the decision rule for modifying enrolment, and the multiple-testin...