-
Authors: Fang, Ethan X.; Ning, Yang; Li, Runze
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Cornell University
Abstract: This paper concerns statistical inference for longitudinal data with ultrahigh-dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inferenc...
-
Authors: Porter, Thomas; Stewart, Michael
Affiliations: University of Melbourne; University of Sydney
Abstract: Higher criticism (HC) is a popular method for large-scale inference problems based on identifying unusually high proportions of small p-values. It has been shown to enjoy a lower-order optimality property in a simple normal location mixture model which is shared by the 'tailor-made' parametric generalised likelihood ratio test (GLRT) for the same model; however, HC has also been shown to perform well outside this 'narrow' model. We develop a higher-order framework for analysing the power of th...
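As a rough illustration of what "unusually high proportions of small p-values" means, the sketch below computes one commonly used form of the HC statistic, the maximum standardised gap between the empirical proportion i/n and the i-th smallest p-value. The abstract does not state a formula, so this particular form, the function name `higher_criticism`, and the `alpha0` cutoff are assumptions made for illustration, not the authors' definition.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Return one common form of the higher criticism statistic.

    Assumed form (not taken from the abstract):
        max over 1 <= i <= alpha0 * n of
        sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))
    where p_(i) is the i-th smallest p-value.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    sorted_p = sorted(p_values)
    # Only the smallest alpha0 fraction of p-values is scanned.
    limit = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, limit + 1):
        p = sorted_p[i - 1]
        if p <= 0.0 or p >= 1.0:
            # The standardisation is undefined at 0 and 1; skip those values.
            continue
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


# Evenly spread p-values (no signal) versus a sample with ten very small ones.
null_like = [i / 101 for i in range(1, 101)]
with_signal = [0.001] * 10 + [i / 100 for i in range(1, 91)]
hc_null = higher_criticism(null_like)
hc_signal = higher_criticism(with_signal)
```

With these hypothetical inputs, `hc_signal` comes out far larger than `hc_null`, which is the behaviour the statistic is meant to capture.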
-
Authors: Chen, Xi; Lee, Jason D.; Tong, Xin T.; Zhang, Yichen
Affiliations: New York University; University of Southern California; National University of Singapore
Abstract: The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing works focus on the convergence of the objective function or the error of the obtained solution, we investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain smoothness conditions. Our main contributions are two...
-
Authors: Paindaveine, Davy; Remy, Julien; Verdebout, Thomas
Affiliations: Universite Libre de Bruxelles; Universite Libre de Bruxelles
Abstract: We consider the problem of testing, on the basis of a p-variate Gaussian random sample, the null hypothesis $H_0: \theta_1 = \theta_1^0$ against the alternative $H_1: \theta_1 \neq \theta_1^0$, where $\theta_1$ is the first eigenvector of the underlying covariance matrix and $\theta_1^0$ is a fixed unit p-vector. In the classical setup where eigenvalues $\lambda_1 > \lambda_2 \geq \ldots \geq \lambda_p$ are fixed, the Anderson (Ann. Math. Stat. 34 (1963) 122-148) likelihood ratio test (LRT) and the...
-
Authors: Ray, Kolyan; van der Vaart, Aad
Affiliations: Imperial College London; Leiden University; Leiden University - Excl LUMC
Abstract: We develop a semiparametric Bayesian approach for estimating the mean response in a missing data model with binary outcomes and a nonparametrically modelled propensity score. Equivalently, we estimate the causal effect of a treatment, correcting nonparametrically for confounding. We show that standard Gaussian process priors satisfy a semiparametric Bernstein-von Mises theorem under smoothness conditions. We further propose a novel propensity score-dependent prior that provides efficient infere...
-
Authors: Alquier, Pierre; Ridgway, James
Affiliations: RIKEN
Abstract: While Bayesian methods are extremely popular in statistics and machine learning, their application to massive data sets is often challenging, when possible at all. The classical MCMC algorithms are prohibitively slow when both the model dimension and the sample size are large. Variational Bayesian methods aim at approximating the posterior by a distribution in a tractable family F. Thus, MCMC algorithms are replaced by an optimization algorithm which is orders of magnitude faster. VB methods have been ap...
-
Authors: Xu, Min; Jog, Varun; Loh, Po-Ling
Affiliations: Rutgers University System; Rutgers University New Brunswick; University of Wisconsin System; University of Wisconsin Madison
Abstract: Community identification in a network is an important problem in fields such as social science, neuroscience and genetics. Over the past decade, stochastic block models (SBMs) have emerged as a popular statistical framework for this problem. However, SBMs have an important limitation in that they are suited only for networks with unweighted edges; in various scientific applications, disregarding the edge weights may result in a loss of valuable information. We study a weighted generalization o...
-
Authors: Heng, Jeremy; Bishop, Adrian N.; Deligiannidis, George; Doucet, Arnaud
Affiliations: ESSEC Business School; Commonwealth Scientific & Industrial Research Organisation (CSIRO); University of Oxford
Abstract: Sequential Monte Carlo methods, also known as particle methods, are a popular set of techniques for approximating high-dimensional probability distributions and their normalizing constants. These methods have found numerous applications in statistics and related fields; for example, for inference in nonlinear non-Gaussian state space models, and in complex static models. Like many Monte Carlo sampling schemes, they rely on proposal distributions which crucially impact their performance. We int...
-
Authors: Jeon, Jeong Min; Park, Byeong U.
Affiliations: Seoul National University (SNU)
Abstract: This paper develops a foundation of methodology and theory for the estimation of structured nonparametric regression models with Hilbertian responses. Our method and theory are focused on the additive model, while the main ideas may be adapted to other structured models. For this, the notion of Bochner integration is introduced for Banach-space-valued maps as a generalization of Lebesgue integration. Several statistical properties of Bochner integrals, relevant for our method and theory and al...
-
Authors: Nickl, Richard; Ray, Kolyan
Affiliations: University of Cambridge; University of London; King's College London
Abstract: The problem of determining a periodic Lipschitz vector field $b = (b_1, \ldots, b_d)$ from an observed trajectory of the solution $(X_t : 0 \leq t \leq T)$ of the multi-dimensional stochastic differential equation $dX_t = b(X_t)\,dt + dW_t$, $t \geq 0$, where $W_t$ is a standard $d$-dimensional Brownian motion, is considered. Convergence rates of a penalised least squares estimator, which equals the maximum a posteriori (MAP) estimate corresponding to a high-dimensional Gaussian product prior, are derived. The...
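To make the data-generating model in this abstract concrete, the sketch below simulates a discretised trajectory of the stochastic differential equation with unit diffusion using the Euler-Maruyama scheme. The scheme, the function names `simulate_path` and `example_drift`, and the sine drift are assumptions chosen for illustration; the abstract concerns estimating the drift from a continuously observed path and does not describe a simulation method.

```python
import math
import random


def example_drift(x):
    """Hypothetical 1-periodic Lipschitz vector field, one sine per coordinate."""
    return [math.sin(2.0 * math.pi * xi) for xi in x]


def simulate_path(drift, x0, T, n_steps, seed=0):
    """Euler-Maruyama approximation of dX_t = b(X_t) dt + dW_t on [0, T].

    Returns a list of n_steps + 1 states (tuples), starting at x0.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x = list(x0)
    path = [tuple(x)]
    for _ in range(n_steps):
        b = drift(x)
        # Each coordinate receives an independent N(0, dt) Brownian increment.
        x = [
            xi + bi * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for xi, bi in zip(x, b)
        ]
        path.append(tuple(x))
    return path


# A two-dimensional trajectory on [0, 1] with 1000 steps.
path = simulate_path(example_drift, x0=(0.0, 0.25), T=1.0, n_steps=1000, seed=42)
```

A finer grid (larger `n_steps`) brings the discretised path closer to the continuous-time solution that the abstract assumes is observed.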