-
Authors: Balakrishnan, Sivaraman; Wasserman, Larry
Affiliation: Carnegie Mellon University
Abstract: We consider the goodness-of-fit testing problem of distinguishing whether the data are drawn from a specified distribution, versus a composite alternative separated from the null in the total variation metric. In the discrete case, we consider goodness-of-fit testing when the null distribution has a possibly growing or unbounded number of categories. In the continuous case, we consider testing a Hölder density with exponent 0 < s <= 1, with possibly unbounded support, in the low-smoothness reg...
-
Authors: Williams, Jonathan P.; Hannig, Jan
Affiliations: University of North Carolina; University of North Carolina Chapel Hill
Abstract: Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. This new procedure is able to effectively account for linear dependencies among subsets of cova...
-
Authors: Neykov, Matey; Lu, Junwei; Liu, Han
Affiliations: Carnegie Mellon University; Princeton University; Northwestern University
Abstract: We propose a new family of combinatorial inference problems for graphical models. Unlike classical statistical inference where the main interest is point estimation or parameter testing, combinatorial inference aims at testing the global structure of the underlying graph. Examples include testing the graph connectivity, the presence of a cycle of certain size, or the maximum degree of the graph. To begin with, we study the information-theoretic limits of a large family of combinatorial inferen...
-
Authors: He, Hera Y.; Basu, Kinjal; Zhao, Qingyuan; Owen, Art B.
Affiliations: Stanford University; University of Pennsylvania
Abstract: It is common for genomic data analysis to use p-values from a large number of permutation tests. The multiplicity of tests may require very tiny p-values in order to reject any null hypotheses and the common practice of using randomly sampled permutations then becomes very expensive. We propose an inexpensive approximation to p-values for two sample linear test statistics, derived from Stolarsky's invariance principle. The method creates a geometrically derived reference set of approximate p-v...
-
Authors: Ramdas, Aaditya K.; Barber, Rina F.; Wainwright, Martin J.; Jordan, Michael I.
Affiliations: Carnegie Mellon University; University of Chicago; University of California System; University of California Berkeley
Abstract: There is a significant literature on methods for incorporating knowledge into multiple testing procedures so as to improve their power and precision. Some common forms of prior knowledge include (a) beliefs about which hypotheses are null, modeled by nonuniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) multiple arbitrary partitions of the hypotheses into (possibly overlapping) groups and (d) knowledge of independence, posi...
-
Author: Bobkov, Sergey G.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; HSE University (National Research University Higher School of Economics)
Abstract: Let F_n denote the distribution function of the normalized sum of n i.i.d. random variables. In this paper, polynomial rates of approximation of F_n by the corrected normal laws are considered in the model where the underlying distribution has a convolution structure. As a basic tool, the convergence part of Khinchine's theorem in metric theory of Diophantine approximations is extended to the class of product characteristic functions.
-
Authors: Lin, Yi; Martin, Ryan; Yang, Min
Affiliations: University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital; North Carolina State University
Abstract: Classically, Fisher information is the relevant object in defining optimal experimental designs. However, for models that lack certain regularity, the Fisher information does not exist, and hence, there is no notion of design optimality available in the literature. This article seeks to fill the gap by proposing a so-called Hellinger information, which generalizes Fisher information in the sense that the two measures agree in regular problems, but the former also exists for certain types of no...
-
Author: Zhu, Ke
Affiliation: University of Hong Kong
Abstract: This paper provides an entire inference procedure for the autoregressive model under (conditional) heteroscedasticity of unknown form with a finite variance. We first establish the asymptotic normality of the weighted least absolute deviations estimator (LADE) for the model. Second, we develop the random weighting (RW) method to estimate its asymptotic covariance matrix, leading to the implementation of the Wald test. Third, we construct a portmanteau test for model checking, and use the RW me...
-
Authors: Chen, Ningyuan; Lee, Donald K. K.; Negahban, Sahand N.
Affiliations: Hong Kong University of Science & Technology; Yale University
Abstract: Exploiting the fact that most arrival processes exhibit cyclic behaviour, we propose a simple procedure for estimating the intensity of a nonhomogeneous Poisson process. The estimator is the super-resolution analogue to Shao (2010) and Shao and Lii [J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 (2011) 99-122], which is a sum of p sinusoids where p and the amplitude and phase of each wave are not known and need to be estimated. This results in an interpretable yet flexible specification that is s...
-
Authors: Athey, Susan; Tibshirani, Julie; Wager, Stefan
Affiliation: Stanford University
Abstract: We propose generalized random forests, a method for nonparametric statistical estimation based on random forests (Breiman [Mach. Learn. 45 (2001) 5-32]) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dime...