-
Authors: Dirks, Sjoerd; Maly, Johannes; Rauhut, Holger
Affiliations: Utrecht University; University of Munich; RWTH Aachen University
Abstract: We consider the classical problem of estimating the covariance matrix of a sub-Gaussian distribution from i.i.d. samples in the novel context of coarse quantization, that is, instead of having full knowledge of the samples, they are quantized to one or two bits per entry. This problem occurs naturally in signal processing applications. We introduce new estimators in two different quantization scenarios and derive nonasymptotic estimation error bounds in terms of the operator norm. In the first...
-
Authors: Goto, Yuichi; Kley, Tobias; Van Hecke, Ria; Volgushev, Stanislav; Dette, Holger; Hallin, Marc
Affiliations: Kyushu University; University of Gottingen; Ruhr University Bochum; University of Toronto; Universite Libre de Bruxelles; Universite Libre de Bruxelles
Abstract: Frequency domain methods form a ubiquitous part of the statistical toolbox for time-series analysis. In recent years, considerable interest has been given to the development of new spectral methodology and tools capturing dynamics in the entire joint distributions, and thus avoiding the limitations of classical, L2-based spectral methods. Most of the spectral concepts proposed in that literature suffer from one major drawback, though: their estimation requires the choice of a smoothing param...
-
Authors: Hanneke, Steve; Kpotufe, Samory
Affiliations: Purdue University System; Purdue University; Columbia University
Abstract: Multitask learning and related areas such as multisource domain adaptation address modern settings where data sets from N related distributions {P_t} are to be combined toward improving performance on any single such distribution D. A perplexing fact remains in the evolving theory on the subject: while we would hope for performance bounds that account for the contribution from multiple tasks, the vast majority of analyses result in bounds that improve at best in the number n of samples per ...
-
Authors: Bing, Xin; Bunea, Florentina; Strimas-Mackey, Seth; Wegkamp, Marten
Affiliations: University of Toronto; Cornell University; Cornell University; Cornell University
Abstract: This paper studies the estimation of high-dimensional, discrete, possibly sparse, mixture models in the context of topic models. The data consists of observed multinomial counts of p words across n independent documents. In topic models, the p x n expected word frequency matrix is assumed to be factorized as a p x K word-topic matrix A and a K x n topic-document matrix T. Since columns of both matrices represent conditional probabilities belonging to probability simplices, columns of A are vie...
-
Authors: Wang, Yuhao; Li, Xinran
Affiliations: Tsinghua University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Completely randomized experiments have been the gold standard for drawing causal inference because they can balance all potential confounding on average. However, they may suffer from unbalanced covariates for realized treatment assignments. Rerandomization, a design that rerandomizes the treatment assignment until a prespecified covariate balance criterion is met, has recently gained attention due to its easy implementation, improved covariate balance and more efficient inference. Researchers ...
-
Authors: Fithian, William; Lei, Lihua
Affiliations: University of California System; University of California Berkeley; Stanford University
Abstract: We introduce a new class of methods for finite-sample false discovery rate (FDR) control in multiple testing problems with dependent test statistics where the dependence is known. Our approach separately calibrates a data-dependent p-value rejection threshold for each hypothesis, relaxing or tightening the threshold as appropriate to target exact FDR control. In addition to our general framework, we propose a concrete algorithm, the dependence-adjusted Benjamini-Hochberg (dBH) procedure, whi...
-
Authors: Yang, Wenhao; Zhang, Liangyu; Zhang, Zhihua
Affiliations: Peking University; Peking University
Abstract: In this paper, we study the nonasymptotic and asymptotic performances of the optimal robust policy and value function of robust Markov Decision Processes (MDPs), where the optimal robust policy and value function are estimated from a generative model. While prior work focusing on nonasymptotic performances of robust MDPs is restricted to the setting of the KL uncertainty set and the (s, a)-rectangular assumption, we improve their results and also consider other uncertainty sets, including the L1 ...
-
Authors: Chakraborty, Anirvan; Panaretos, Victor M.
Affiliations: Indian Institute of Science Education & Research (IISER) - Kolkata; Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
Abstract: How can we discern whether the covariance operator of a stochastic process is of reduced rank, and if so, what its precise rank is? And how can we do so at a given level of confidence? This question is central to a great deal of methods for functional data, which require low-dimensional representations whether by functional PCA or other methods. The difficulty is that the determination is to be made on the basis of i.i.d. replications of the process observed discretely and with measurement e...
-
Authors: Waghmare, Kartik G.; Panaretos, Victor M.
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
Abstract: We consider the problem of positive-semidefinite continuation: extending a partially specified covariance kernel from a subdomain Omega of a rectangular domain I x I to a covariance kernel on the entire domain I x I. For a broad class of domains Omega called serrated domains, we are able to present a complete theory. Namely, we demonstrate that a canonical completion always exists and can be explicitly constructed. We characterise all possible completions as suitable perturbations of the canon...
-
Authors: Aswani, Anil; Olfat, Matt
Affiliations: University of California System; University of California Berkeley
Abstract: Data-driven decision making has drawn scrutiny from policy makers due to fears of potential discrimination, and a growing literature has begun to develop fair statistical techniques. However, these techniques are often specialized to one model context and based on ad hoc arguments, which makes it difficult to perform theoretical analysis. This paper develops an optimization hierarchy, which is a sequence of optimization problems with an increasing number of constraints, for fair statistical d...