-
Authors: Donoho, David; Gavish, Matan
Affiliations: Stanford University
Abstract: An unknown $m$ by $n$ matrix $X_0$ is to be estimated from noisy measurements $Y = X_0 + Z$, where the noise matrix $Z$ has i.i.d. Gaussian entries. A popular matrix denoising scheme solves the nuclear norm penalization problem $\min_X \|Y - X\|_F^2/2 + \lambda \|X\|_*$, where $\|X\|_*$ denotes the nuclear norm (sum of singular values). This is the analog, for matrices, of $\ell_1$ penalization in the vector case. It has been empirically observed that ...
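The penalized problem stated in the abstract above has a well-known closed-form solution: soft-threshold the singular values of $Y$ at level $\lambda$. The following is a minimal sketch of that step only, not the authors' code. The function name `svt_denoise`, the simulated rank-3 signal, and the choice `lam = sqrt(m) + sqrt(n)` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def svt_denoise(Y, lam):
    """Minimize ||Y - X||_F^2 / 2 + lam * ||X||_* over X.

    The minimizer keeps the singular vectors of Y and shrinks each
    singular value s_i to max(s_i - lam, 0).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Hypothetical test case: a rank-3 signal plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
m, n, rank = 50, 40, 3
X0 = rng.standard_normal((m, rank)) @ rng.standard_normal((rank, n))
Y = X0 + rng.standard_normal((m, n))

# Illustrative threshold near the largest singular value of pure noise
# (an assumption for this sketch, not a tuned value from the paper).
X_hat = svt_denoise(Y, lam=np.sqrt(m) + np.sqrt(n))
print("relative error:", np.linalg.norm(X_hat - X0) / np.linalg.norm(X0))
```

Larger values of `lam` zero out more singular values and so return a lower-rank estimate; `lam = 0` returns `Y` unchanged.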
-
Authors: Fryzlewicz, Piotr
Affiliations: University of London; London School Economics & Political Science
Abstract: We propose a new technique, called wild binary segmentation (WBS), for consistent estimation of the number and locations of multiple change-points in data. We assume that the number of change-points can increase to infinity with the sample size. Due to a certain random localisation mechanism, WBS works even for very short spacings between the change-points and/or very small jump magnitudes, unlike standard binary segmentation. On the other hand, despite its use of localisation, WBS does not re...
-
Authors: Buehlmann, Peter; Peters, Jonas; Ernest, Jan
Affiliations: Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: We develop estimation for potentially high-dimensional additive structural equation models. A key component of our approach is to decouple order search among the variables from feature or edge selection in a directed acyclic graph encoding the causal structure. We show that the former can be done with nonregularized (restricted) maximum likelihood estimation while the latter can be efficiently addressed using sparse regression techniques. Thus, we substantially simplify the problem of structur...
-
Authors: Ke, Zheng Tracy; Jin, Jiashun; Fan, Jianqing
Affiliations: University of Chicago; Carnegie Mellon University; Princeton University
Abstract: Consider a linear model $Y = X\beta + z$, where $X = X_{n,p}$ and $z \sim N(0, I_n)$. The vector $\beta$ is unknown but is sparse in the sense that most of its coordinates are 0. The main interest is to separate its nonzero coordinates from the zero ones (i.e., variable selection). Motivated by examples in long-memory time series (Fan and Yao [Nonlinear Time Series: Nonparametric and Parametric Methods (2003) Springer]) and the change-point problem (Bhattacharya [In Change-Point Problems (South Had...
-
Authors: Bhaskar, Anand; Song, Yun S.
Affiliations: University of California System; University of California Berkeley
Abstract: The sample frequency spectrum (SFS) is a widely used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS f...
-
Authors: Schmidt-Hieber, Johannes
Affiliations: Leiden University - Excl LUMC; Leiden University
Abstract: Consider estimation of the regression function based on a model with equidistant design and measurement errors generated from a fractional Gaussian noise process. In previous literature, this model has been heuristically linked to an experiment, where the anti-derivative of the regression function is continuously observed under additive perturbation by a fractional Brownian motion. Based on a reformulation of the problem using reproducing kernel Hilbert spaces, we derive abstract approximation...
-
Authors: Szekely, Gabor J.; Rizzo, Maria L.
Affiliations: National Science Foundation (NSF); University System of Ohio; Bowling Green State University
Abstract: Distance covariance and distance correlation are scalar coefficients that characterize independence of random vectors in arbitrary dimension. Properties, extensions and applications of distance correlation have been discussed in the recent literature, but the problem of defining the partial distance correlation has remained an open question of considerable interest. The problem of partial distance correlation is more complex than partial correlation partly because the squared distance covarian...
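For orientation, the ordinary (not partial) sample distance correlation mentioned in the abstract above can be computed from double-centered pairwise distance matrices. The sketch below uses the classical biased (V-statistic) estimator; it does not implement the partial distance correlation that the paper defines, and the function names are illustrative assumptions.

```python
import numpy as np


def _centered_distances(x):
    """Pairwise Euclidean distance matrix of the rows of x, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n scalar observations
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (biased V-statistic version)."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = (A * B).mean()      # squared sample distance covariance
    dvar2_x = (A * A).mean()    # squared sample distance variance of x
    dvar2_y = (B * B).mean()    # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# y depends on x nonlinearly, so the Pearson correlation vanishes by symmetry
# while the distance correlation stays clearly positive.
x = np.linspace(-1.0, 1.0, 101)
y = x ** 2
print("Pearson:", np.corrcoef(x, y)[0, 1])
print("distance correlation:", distance_correlation(x, y))
```

The quadratic example illustrates the property the abstract refers to: the coefficient detects dependence that a linear correlation misses.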
-
Authors: Chatterjee, Sourav
Affiliations: Stanford University
Abstract: Consider the problem of estimating the mean of a Gaussian random vector when the mean vector is assumed to be in a given convex set. The most natural solution is to take the Euclidean projection of the data vector on to this convex set; in other words, performing least squares under a convex constraint. Many problems in modern statistics and statistical signal processing theory are special cases of this general situation. Examples include the lasso and other high-dimensional regression techniq...
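As one concrete instance of the projection estimator described in the abstract above, the sketch below projects a data vector onto an $\ell_1$ ball using the standard sort-and-threshold method. The choice of the $\ell_1$ ball (which corresponds to a constrained lasso only under an identity design), the radius, and the function name `project_l1_ball` are assumptions made for illustration; the paper treats general convex sets.

```python
import numpy as np


def project_l1_ball(y, radius=1.0):
    """Euclidean projection of y onto {w : ||w||_1 <= radius}, with radius > 0."""
    y = np.asarray(y, dtype=float)
    abs_y = np.abs(y)
    if abs_y.sum() <= radius:
        return y.copy()  # already inside the set: the projection is y itself
    u = np.sort(abs_y)[::-1]           # magnitudes in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, u.size + 1)
    # Last index at which the candidate threshold still leaves a positive entry.
    rho = np.nonzero(u - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    # Soft-threshold the magnitudes and restore the signs.
    return np.sign(y) * np.maximum(abs_y - theta, 0.0)


# Hypothetical example: a sparse mean vector observed in standard Gaussian noise.
rng = np.random.default_rng(1)
mu = np.zeros(100)
mu[:5] = 3.0
data = mu + rng.standard_normal(100)
mu_hat = project_l1_ball(data, radius=np.abs(mu).sum())
print("l1 norm of estimate:", np.abs(mu_hat).sum())
```

Here the radius is set to the true $\ell_1$ norm of the mean, which would not be known in practice; it only serves to make the constraint set contain the truth.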
-
Authors: Bailey, R. A.; Druilhet, P.
Affiliations: University of London; Queen Mary University London; University of St Andrews; Universite Clermont Auvergne (UCA)
Abstract: We consider repeated measurement designs when a residual or carry-over effect may be present in at most one later period. Since assuming an additive model may be unrealistic for some applications and leads to biased estimation of treatment effects, we consider a model with interactions between carryover and direct treatment effects. When the aim of the experiment is to study the effects of a treatment used alone, we obtain universally optimal approximate designs. We also propose some efficient...
-
Authors: Binev, Peter; Cohen, Albert; Dahmen, Wolfgang; DeVore, Ronald
Affiliations: University of South Carolina System; University of South Carolina Columbia; Sorbonne Universite; Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI); RWTH Aachen University; Texas A&M University System; Texas A&M University College Station
Abstract: Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335-...