-
Authors: Janson, Lucas; Barber, Rina Foygel; Candes, Emmanuel
Affiliations: Stanford University; University of Chicago
Abstract: Consider the following three important problems in statistical inference: constructing confidence intervals for the error of a high dimensional (p > n) regression estimator, the linear regression noise level and the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the ℓ2-norm of the signal in high dimensional linear regression. We derive a novel p...
-
Authors: Jiang, Runchao; Lu, Wenbin; Song, Rui; Davidian, Marie
Affiliations: North Carolina State University
Abstract: A treatment regime is a deterministic function that dictates personalized treatment based on patients' individual prognostic information. There is increasing interest in finding optimal treatment regimes, which determine treatment at one or more treatment decision points to maximize expected long-term clinical outcomes, where larger outcomes are preferred. For chronic diseases such as cancer or human immunodeficiency virus infection, survival time is often the outcome of interest, and the goal...
-
Authors: Kennedy, Edward H.; Ma, Zongming; McHugh, Matthew D.; Small, Dylan S.
Affiliations: University of Pennsylvania
Abstract: Continuous treatments (e.g. doses) arise often in practice, but many available causal effect estimators are limited by either requiring parametric models for the effect curve, or by not allowing doubly robust covariate adjustment. We develop a novel kernel smoothing approach that requires only mild smoothness assumptions on the effect curve and still allows for misspecification of either the treatment density or outcome regression. We derive asymptotic properties and give a procedure for data-...
-
Authors: Barber, Rina Foygel; Ramdas, Aaditya
Affiliations: University of Chicago; University of California System; University of California Berkeley
Abstract: In many practical applications of multiple testing, there are natural ways to partition the hypotheses into groups by using the structural, spatial or temporal relatedness of the hypotheses, and this prior knowledge is not used in the classical Benjamini-Hochberg procedure for controlling the false discovery rate (FDR). When one can define (possibly several) such partitions, it may be desirable to control the group FDR simultaneously for all partitions (as special cases, the 'finest' partition...
-
Authors: Cannings, Timothy I.; Samworth, Richard J.
Affiliations: University of Cambridge
Abstract: We introduce a very general method for high dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower dimensional space. In one special case that we study in detail, the random projections are divided into disjoint groups, and within each group we select the projection yielding the smallest estimate of the test error. Our random-projection ensemble classifier then aggregates the res...
-
Authors: Zwiernik, Piotr; Uhler, Caroline; Richards, Donald
Affiliations: Pompeu Fabra University; Massachusetts Institute of Technology (MIT); Institute of Science & Technology - Austria; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: We study parameter estimation in linear Gaussian covariance models, which are p-dimensional Gaussian models with linear constraints on the covariance matrix. Maximum likelihood estimation for this class of models leads to a non-convex optimization problem which typically has many local maxima. Using recent results on the asymptotic distribution of extreme eigenvalues of the Wishart distribution, we provide sufficient conditions for any hill climbing method to converge to the global maximum. Al...
-
Authors: Bastide, Paul; Mariadassou, Mahendra; Robin, Stephane
Affiliations: AgroParisTech; Universite Paris Saclay; INRAE; INRAE; Universite Paris Saclay
Abstract: Comparative and evolutive ecologists are interested in the distribution of quantitative traits between related species. The classical framework for these distributions consists of a random process running along the branches of a phylogenetic tree relating the species. We consider shifts in the process parameters, which reveal fast adaptation to changes of ecological niches. We show that models with shifts are not identifiable in general. Constraining the models to be parsimonious in the number...
-
Authors: Roy, Sandipan; Atchade, Yves; Michailidis, George
Affiliations: University of London; University College London; University of Michigan System; University of Michigan; State University System of Florida; University of Florida
Abstract: The paper investigates a change point estimation problem in the context of high dimensional Markov random-field models. Change points represent a key feature in many dynamically evolving network structures. The change point estimate is obtained by maximizing a profile penalized pseudolikelihood function under a sparsity assumption. We also derive a tight bound for the estimate, up to a logarithmic factor, even in settings where the number of possible edges in the network far exceeds the sample...
-
Authors: Fan, Jianqing; Han, Xu
Affiliations: Princeton University; Fudan University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University
Abstract: Large-scale multiple testing with correlated test statistics arises frequently in much scientific research. Incorporating correlation information in approximating the false discovery proportion (FDP) has attracted increasing attention in recent years. When the covariance matrix of test statistics is known, Fan and his colleagues provided an accurate approximation of the FDP under arbitrary dependence structure and some sparsity assumption. However, the covariance matrix is often unknown in man...
-
Authors: Matias, Catherine; Miele, Vincent
Affiliations: Centre National de la Recherche Scientifique (CNRS); Sorbonne Universite; Sorbonne Universite; Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); Universite Claude Bernard Lyon 1; VetAgro Sup
Abstract: Statistical node clustering in discrete time dynamic networks is an emerging field that raises many challenges. Here, we explore statistical properties and frequentist inference in a model that combines a stochastic block model for its static part with independent Markov chains for the evolution of the nodes groups through time. We model binary data as well as weighted dynamic random graphs (with discrete or continuous edges values). Our approach, motivated by the importance of controlling fo...