-
Authors: Battey, Heather; Fan, Jianqing; Liu, Han; Lu, Junwei; Zhu, Ziwei
Affiliations: Imperial College London; Princeton University; Fudan University
Abstract: This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low-dimensional and sparse high-dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-...
-
Authors: Cui, Hengjian; Guo, Wenwen; Zhong, Wei
Affiliations: Capital Normal University; Xiamen University
Abstract: Testing a hypothesis for high-dimensional regression coefficients is of fundamental importance in statistical theory and applications. In this paper, we develop a new test for the overall significance of coefficients in high-dimensional linear regression models based on an estimated U-statistic of order two. With the aid of the martingale central limit theorem, we prove that the asymptotic distributions of the proposed test are normal under two different distribution assumptions. Refitted...
-
Authors: Aletti, Giacomo; Ghiglietti, Andrea; Rosenberger, William F.
Affiliations: University of Milan; Catholic University of the Sacred Heart; George Mason University
Abstract: In this paper, we propose a general class of covariate-adjusted response-adaptive (CARA) designs based on a new functional urn model. We prove strong consistency concerning the functional urn proportion and the proportion of subjects assigned to the treatment groups, in the whole study and for each covariate profile, allowing the distribution of the responses conditioned on covariates to be estimated nonparametrically. In addition, we establish joint central limit theorems for the above quanti...
-
Authors: Weng, Haolei; Maleki, Arian; Zheng, Le
Affiliations: Columbia University; Columbia University
Abstract: We study the problem of estimating a sparse vector $\beta \in \mathbb{R}^p$ from the response variables $y = X\beta + \omega$, where $\omega \sim N(0, \sigma_\omega^2 I_{n \times n})$, under the following high-dimensional asymptotic regime: given a fixed number $\delta$, $p \to \infty$, while $n/p \to \delta$. We consider the popular class of $\ell_q$-regularized least squares (LQLS), a.k.a. bridge estimators, given by the optimization problem $\hat{\beta}(\lambda, q) \in \arg\min_{\beta} \frac{1}{2} \|$...
-
Authors: Efron, Bradley
Affiliations: Stanford University
Abstract: Maximum likelihood estimates are sufficient statistics in exponential families, but not in general. The theory of statistical curvature was introduced to measure the effects of MLE insufficiency in one-parameter families. Here, we analyze curvature in the more realistic venue of multiparameter families, or more exactly, curved exponential families, a broad class of smoothly defined nonexponential family models. We show that within the set of observations giving the same value for the MLE, there is...
-
Authors: Uhler, Caroline; Lenkoski, Alex; Richards, Donald
Affiliations: Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Gaussian graphical models have received considerable attention during the past four decades from the statistical and machine learning communities. In Bayesian treatments of this model, the G-Wishart distribution serves as the conjugate prior for inverse covariance matrices satisfying graphical constraints. While it is straightforward to posit the unnormalized densities, the normalizing constants of these distributions have been known only for graphs that are chordal, or decomposable. Up until ...
-
Authors: Elsener, Andreas; van de Geer, Sara
Affiliations: Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: Many results have been proved for various nuclear norm penalized estimators of the uniform sampling matrix completion problem. However, most of these estimators are not robust: in most cases the quadratic loss function and its modifications are used. We consider robust nuclear norm penalized estimators using two well-known robust loss functions: the absolute value loss and the Huber loss. Under several conditions on the sparsity of the problem (i.e., the rank of the parameter matrix) an...
-
Authors: Vu Dinh; Lam Si Tung Ho; Suchard, Marc A.; Matsen, Frederick A.
Affiliations: Fred Hutchinson Cancer Center; University of California System; University of California Los Angeles; University of California System; University of California Los Angeles
Abstract: It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and to wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct gene tree. Although the gene tree may deviate from the species tree due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that t...
-
Authors: Javanmard, Adel; Montanari, Andrea
Affiliations: University of Southern California; Stanford University; Stanford University
Abstract: Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. Given a set of null hypotheses $H(n) = (H_1, \dots, H_n)$, Benjamini and Hochberg [J. R. Stat. Soc. Ser. B. Stat. Methodol. 57 (1995) 289-300] introduced the false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and proposed a testing procedure that controls FDR below a preassigned significance level. Nowadays FDR is the criteri...
-
Authors: Wang, Guanghui; Zou, Changliang; Yin, Guosheng
Affiliations: Nankai University; Nankai University; University of Hong Kong
Abstract: We consider a sequence of multinomial data for which the probabilities associated with the categories are subject to abrupt changes of unknown magnitudes at unknown locations. When the number of categories is comparable to or even larger than the number of subjects allocated to these categories, conventional methods such as the classical Pearson's chi-squared test and the deviance test may not work well. Motivated by high-dimensional homogeneity tests, we propose a novel change-point detection...