-
Authors: Cappello, Lorenzo; Veber, Amandine; Palacios, Julia A.
Affiliations: Pompeu Fabra University; Barcelona School of Economics; Centre National de la Recherche Scientifique (CNRS); Universite Paris Cite; Stanford University
Abstract: Molecular sequence variation at a locus informs about the evolutionary history of the sample and past population size dynamics. The Kingman coalescent is used in a generative model of molecular sequence variation to infer evolutionary parameters. However, it is well understood that inference under this model does not scale well with sample size. Here, we build on recent work based on a lower resolution coalescent process, the Tajima coalescent, to model longitudinal samples. While the Kingman ...
-
Authors: Sen, Bodhisattva
Affiliations: Columbia University
-
Authors: Gnecco, Nicola; Terefe, Edossa Merga; Engelke, Sebastian
Affiliations: University of Copenhagen; University of Geneva; Hawassa University
Abstract: Classical methods for quantile regression fail in cases where the quantile of interest is extreme and only few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is co...
-
Authors: Su, Chang; Zhang, Jingfei; Zhao, Hongyu
Affiliations: Emory University; Yale University; Emory University
Abstract: Inferring and characterizing gene co-expression networks has led to important insights on the molecular mechanisms of complex diseases. Most co-expression analyses to date have been performed on gene expression data collected from bulk tissues with different cell type compositions across samples. As a result, the co-expression estimates only offer an aggregated view of the underlying gene regulations and can be confounded by heterogeneity in cell type compositions, failing to reveal gene coord...
-
Authors: Wang, Xueqin; Zhu, Jin; Pan, Wenliang; Zhu, Junhao; Zhang, Heping
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; Sun Yat Sen University; University of London; London School Economics & Political Science; Chinese Academy of Sciences; Yale University; University of Toronto
Abstract: The distribution function is essential in statistical inference and connected with samples to form a directed closed loop by the correspondence theorem in measure theory and the Glivenko-Cantelli and Donsker properties. This connection creates a paradigm for statistical inference. However, existing distribution functions are defined in Euclidean spaces and are no longer convenient to use in rapidly evolving data objects of complex nature. It is imperative to develop the concept of the distribu...
-
Authors: Schnell, Patrick M.; Wascher, Matthew; Rempala, Grzegorz A.
Affiliations: University System of Ohio; Ohio State University; University System of Ohio; University of Dayton
Abstract: During the COVID-19 pandemic, many institutions such as universities and workplaces implemented testing regimens with every member of some population tested longitudinally, and those testing positive isolated for some time. Although the primary purpose of such regimens was to suppress disease spread by identifying and isolating infectious individuals, testing results were often also used to obtain prevalence and incidence estimates. Such estimates are helpful in risk assessment and institution...
-
Authors: He, Yifan; Wu, Ruiyang; Zhou, Yong; Feng, Yang
Affiliations: Chinese University of Hong Kong; New York University; East China Normal University; East China Normal University
Abstract: Distributed statistical learning has become a popular technique for large-scale data analysis. Most existing work in this area focuses on dividing the observations, but we propose a new algorithm, DDAC-SpAM, which divides the features under a high-dimensional sparse additive model. Our approach involves three steps: divide, decorrelate, and conquer. The decorrelation operation enables each local estimator to recover the sparsity pattern for each additive component without imposing strict const...
-
Authors: Chen, Zhehui; Mak, Simon; Wu, C. F. Jeff
Affiliations: University System of Georgia; Georgia Institute of Technology; Duke University
Abstract: The Expected Improvement (EI) method, proposed by Jones, Schonlau, and Welch, is a widely used Bayesian optimization method, which makes use of a fitted Gaussian process model for efficient black-box optimization. However, one key drawback of EI is that it is overly greedy in exploiting the fitted Gaussian process model for optimization, which results in suboptimal solutions even with large sample sizes. To address this, we propose a new hierarchical EI (HEI) framework, which makes use of a hie...
-
Authors: Yang, Ying; Yao, Fang; Zhao, Peng
Affiliations: Chinese Academy of Sciences; Academy of Mathematics & System Sciences, CAS; Peking University; Jiangsu Normal University; Jiangsu Normal University
Abstract: We propose an online smoothing backfitting method for generalized additive models coupled with local linear estimation. The idea can be extended to general nonlinear optimization problems. The strategy is to use an appropriate-order expansion to approximate the nonlinear equations and store the coefficients as sufficient statistics which can be updated in an online manner by the dynamic candidate bandwidth method. We investigate the statistical and algorithmic convergences of the proposed meth...
-
Authors: Weitkamp, Christoph Alexander; Proksch, Katharina; Tameling, Carla; Munk, Axel
Affiliations: University of Gottingen; University of Twente; Max Planck Society
Abstract: In this article, we aim to provide a statistical theory for object matching based on a lower bound of the Gromov-Wasserstein distance related to the distribution of (pairwise) distances of the considered objects. To this end, we model general objects as metric measure spaces. Based on this, we propose a simple and efficiently computable asymptotic statistical test for pose invariant object discrimination. This is based on a (beta-trimmed) empirical version of the aforementioned lower bound. W...