-
Authors: Battey, Heather; Fan, Jianqing; Liu, Han; Lu, Junwei; Zhu, Ziwei
Affiliations: Imperial College London; Princeton University; Fudan University
Abstract: This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low dimensional and sparse high dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-...
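As a rough illustration of the aggregation idea in this abstract (not the paper's own estimators or test statistics), the sketch below splits a sample of size n into k blocks of size n/k, fits a least-squares slope on each block, and averages the k slopes. The helper names `ols_slope` and `divide_and_conquer_slope`, the simulated data, and the choice of a simple linear model are all assumptions made for this example.

```python
import random
from statistics import fmean


def ols_slope(xs, ys):
    """Least-squares slope of y on x (with intercept) for one sample."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def divide_and_conquer_slope(xs, ys, k):
    """Average the per-block slopes over k contiguous blocks of size n/k.

    Hypothetical helper: a plain average of subsample estimates, used here
    only to illustrate aggregation across subsamples.
    """
    n = len(xs)
    if k < 1 or n % k != 0:
        raise ValueError("k must be >= 1 and divide n")
    m = n // k  # subsample size n/k
    slopes = [
        ols_slope(xs[j * m:(j + 1) * m], ys[j * m:(j + 1) * m])
        for j in range(k)
    ]
    return fmean(slopes)


# Simulated data (assumed model): y = 2x + standard normal noise.
rng = random.Random(0)
n, true_slope = 10_000, 2.0
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]

full_sample = ols_slope(xs, ys)                    # uses all n points at once
aggregated = divide_and_conquer_slope(xs, ys, 20)  # 20 blocks of 500 points
```

Comparing `full_sample` with `aggregated` for increasing k gives an informal feel for the efficiency question the abstract raises; the paper's actual results concern likelihood-based statistics and are not reproduced here.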
-
Authors: Cui, Hengjian; Guo, Wenwen; Zhong, Wei
Affiliations: Capital Normal University; Xiamen University
Abstract: Testing a hypothesis for high-dimensional regression coefficients is of fundamental importance in statistical theory and applications. In this paper, we develop a new test for the overall significance of coefficients in high-dimensional linear regression models based on an estimated U-statistic of order two. With the aid of the martingale central limit theorem, we prove that the asymptotic distributions of the proposed test are normal under two different distribution assumptions. Refitted...
-
Authors: Escanciano, Juan Carlos; Pardo-Fernandez, Juan Carlos; Van Keilegom, Ingrid
Affiliations: Indiana University System; Indiana University Bloomington; Universidade de Vigo; KU Leuven
Abstract: This article proposes a new general methodology for constructing nonparametric and semiparametric Asymptotically Distribution-Free (ADF) tests for semiparametric hypotheses in regression models for possibly dependent data coming from a strictly stationary process. Classical tests based on the difference between the estimated distributions of the restricted and unrestricted regression errors are not ADF. In this article, we introduce a novel transformation of this difference that leads to ADF t...
-
Authors: Mukherjee, Rajarshi; Mukherjee, Sumit; Sen, Subhabrata
Affiliations: University of California System; University of California Berkeley; Columbia University; Microsoft
Abstract: In this paper, we study sharp thresholds for detecting sparse signals in beta-models for potentially sparse random graphs. The results demonstrate an interesting interplay between graph sparsity, signal sparsity and signal strength. In regimes of moderately dense signals, irrespective of graph sparsity, the detection thresholds mirror corresponding results in independent Gaussian sequence problems. For sparser signals, extreme graph sparsity implies that all tests are asymptotically powerless, ir...
-
Authors: Pan, Wenliang; Tian, Yuan; Wang, Xueqin; Zhang, Heping
Affiliations: Sun Yat Sen University; Yale University
Abstract: In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail...
-
Authors: Kong, Xin-Bing
Affiliations: Nanjing Audit University
Abstract: In this paper, we separate the integrated (spot) volatility of an individual Itô process into integrated (spot) systematic and idiosyncratic volatilities, and estimate them by aggregation of local factor analysis (localization) with large-dimensional high-frequency data. We show that, when both the sampling frequency n and the dimensionality p go to infinity and p ≥ C√n for some constant C, our estimators of ...
-
Authors: Cai, T. Tony; Guntuboyina, Adityanand; Wei, Yuting
Affiliations: University of Pennsylvania; University of California System; University of California Berkeley
Abstract: In this paper, we consider adaptive estimation of an unknown planar compact, convex set from noisy measurements of its support function. Both the problem of estimating the support function at a point and that of estimating the whole convex set are studied. For pointwise estimation, we consider the problem in a general nonasymptotic framework, which evaluates the performance of a procedure at each individual set, instead of the worst case performance over a large parameter space as in conventio...
-
Authors: Gregory, Karl B.; Lahiri, Soumendra N.; Nordman, Daniel J.
Affiliations: University of South Carolina System; University of South Carolina Columbia; North Carolina State University; Iowa State University
Abstract: Quantile regression allows for broad (conditional) characterizations of a response distribution beyond conditional means and is of increasing interest in economic and financial applications. Because quantile regression estimators have complex limiting distributions, several bootstrap methods for the independent data setting have been proposed, many of which involve smoothing steps to improve bootstrap approximations. Currently, no similar advances in smoothed bootstraps exist for quantile regr...
-
Authors: Bai, Zhidong; Choi, Kwok Pui; Fujikoshi, Yasunori
Affiliations: Northeast Normal University - China; National University of Singapore; Hiroshima University
Abstract: In this paper, we study the problem of estimating the number of significant components in principal component analysis (PCA), which corresponds to the number of dominant eigenvalues of the covariance matrix of p variables. Our purpose is to examine the consistency of the estimation criteria AIC and BIC based on the model selection criteria by Akaike [In 2nd International Symposium on Information Theory (1973) 267-281, Akademia Kiado] and Schwarz [Estimating the dimension of a model 6 (1978) 46...
-
Authors: Fan, Jianqing; Shao, Qi-Man; Zhou, Wen-Xin
Affiliations: Fudan University; Princeton University; Chinese University of Hong Kong; University of California System; University of California San Diego
Abstract: Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries from these data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions about the exogeneity of the covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distribution...