-
Authors: Howard, Steven R.; Ramdas, Aaditya; McAuliffe, Jon; Sekhon, Jasjeet
Affiliations: University of California System; University of California Berkeley; Carnegie Mellon University; Carnegie Mellon University; Yale University
Abstract: A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions. We draw connections between the Cramér-Chernoff method for exponential concentration, the law of the iterated logarithm (LIL) and the sequential probability ratio test: our confidence sequences are time-uniform extensions of the first; provide tigh...
-
Authors: Schmidt, Sara K.; Wornowizki, Max; Fried, Roland; Dehling, Herold
Affiliations: Ruhr University Bochum; Dortmund University of Technology
Abstract: We present a novel approach to test for heteroscedasticity of a nonstationary time series that is based on Gini's mean difference of logarithmic local sample variances. In order to analyse the large sample behaviour of our test statistic, we establish new limit theorems for U-statistics of dependent triangular arrays. We derive the asymptotic distribution of the test statistic under the null hypothesis of a constant variance and show that the test is consistent against a large class of alterna...
-
Authors: Chatterjee, Sabyasachi; Goswami, Subhajit
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign; Tata Institute of Fundamental Research (TIFR)
Abstract: Proposed by Donoho (Ann. Statist. 25 (1997) 1870-1911), Dyadic CART is a nonparametric regression method which computes a globally optimal dyadic decision tree and fits piecewise constant functions in two dimensions. In this article, we define and study Dyadic CART and a closely related estimator, namely Optimal Regression Tree (ORT), in the context of estimating piecewise smooth functions in general dimensions in the fixed design setup. More precisely, these optimal decision tree estimators f...
-
Authors: Morikawa, Kosuke; Kim, Jae Kwang
Affiliations: University of Osaka; Iowa State University
Abstract: When the response mechanism is believed to be not missing at random (NMAR), a valid analysis requires stronger assumptions on the response mechanism than standard statistical methods would otherwise require. Semiparametric estimators have been developed under parametric model assumptions on the response mechanism. In this paper, a new statistical test is proposed to guarantee model identifiability without using an instrumental variable assumption. Furthermore, we develop optimal semiparametri...
-
Authors: Rao, Suhasini Subba; Yang, Junho
Affiliations: Texas A&M University System; Texas A&M University College Station; Academia Sinica - Taiwan
Abstract: In time series analysis there is an apparent dichotomy between time and frequency domain methods. The aim of this paper is to draw connections between frequency and time domain methods. Our focus will be on reconciling the Gaussian likelihood and the Whittle likelihood. We derive an exact, interpretable bound between the Gaussian and Whittle likelihood of a second order stationary time series. The derivation is based on obtaining the transformation which is biorthogonal to the discrete Fourie...
-
Authors: Ye, Ting; Shao, Jun; Kang, Hyunseung
Affiliations: University of Pennsylvania; East China Normal University; University of Wisconsin System; University of Wisconsin Madison
Abstract: Mendelian randomization (MR) has become a popular approach to study the effect of a modifiable exposure on an outcome by using genetic variants as instrumental variables. A challenge in MR is that each genetic variant explains a relatively small proportion of variance in the exposure and there are many such variants, a setting known as many weak instruments. To this end, we provide a theoretical characterization of the statistical properties of two popular estimators in MR: the inverse-varianc...
-
Authors: Camerlenghi, Federico; Lijoi, Antonio; Prunster, Igor
Affiliations: University of Milano-Bicocca; Bocconi University; Bocconi University
Abstract: Hierarchical nonparametric processes are popular tools for defining priors on collections of probability distributions, which induce dependence across multiple samples. In survival analysis problems, one is typically interested in modeling the hazard rates, rather than the probability distributions themselves, and the currently available methodologies are not applicable. Here, we fill this gap by introducing a novel, and analytically tractable, class of multivariate mixtures whose distribution...
-
Authors: Jiang, Sheng; Tokdar, Surya T.
Affiliations: Duke University
Abstract: Bayesian nonparametric regression under a rescaled Gaussian process prior offers smoothness-adaptive function estimation with near minimax-optimal error rates. Hierarchical extensions of this approach, equipped with stochastic variable selection, are known to also adapt to the unknown intrinsic dimension of a sparse true regression function. But it remains unclear if such extensions offer variable selection consistency, that is, if the true subset of important variables could be consistently l...
-
Authors: Bellec, Pierre C.; Zhang, Cun-Hui
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: Stein's formula states that a random variable of the form $z^{\top} f(z) - \operatorname{div} f(z)$ is mean-zero for all functions $f$ with integrable gradient. Here, $\operatorname{div} f$ is the divergence of the function $f$ and $z$ is a standard normal vector. This paper aims to propose a second-order Stein formula to characterize the variance of such random variables for all functions $f(z)$ with square integrable gradient, and to demonstrate the usefulness of this second-order Stein formula in various applicat...
-
Authors: Klochkov, Yegor; Kroshnin, Alexey; Zhivotovskiy, Nikita
Affiliations: University of Cambridge; HSE University (National Research University Higher School of Economics); Russian Academy of Sciences; Kharkevich Institute for Information Transmission Problems of the RAS; Alphabet Inc.; Google Incorporated
Abstract: We consider the robust algorithms for the k-means clustering problem where a quantizer is constructed based on N independent observations. Our main results are median of means based nonasymptotic excess distortion bounds that hold under the two bounded moments assumption in a general separable Hilbert space. In particular, our results extend the renowned asymptotic result of (Ann. Statist. 9 (1981) 135-140) who showed that the existence of two moments is sufficient for strong consistency of an...