-
Authors: Shi, Pixu; Zhou, Yuchen; Zhang, Anru R.
Affiliations: Duke University; University of Wisconsin System; University of Wisconsin Madison
Abstract: In microbiome and genomic studies, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the classic log-contrast model is often used where read counts are normalized into compositions. However, zero read counts and the randomness in covariates remain critical issues. We introduce a surprisingly simple, interpretable and efficient method for the estimat...
-
Authors: Wang, Rui; Xu, Wangli
Affiliations: Renmin University of China
Abstract: This paper is concerned with the problem of comparing the population means of two groups of independent observations. An approximate randomization test procedure based on the test statistic of is proposed. The asymptotic behaviour of the test statistic, as well as the randomized statistic, is studied under weak conditions. In our theoretical framework, observations are not assumed to be identically distributed even within groups. No condition on the eigenstructure of the covariance matrices is...
-
Authors: Masoero, Lorenzo; Camerlenghi, Federico; Favaro, Stefano; Broderick, Tamara
Affiliations: Massachusetts Institute of Technology (MIT); University of Milano-Bicocca; University of Turin
Abstract: While the cost of sequencing genomes has decreased dramatically in recent years, this expense often remains nontrivial. Under a fixed budget, scientists face a natural trade-off between quantity and quality: spending resources to sequence a greater number of genomes or spending resources to sequence genomes with increased accuracy. Our goal is to find the optimal allocation of resources between quantity and quality. Optimizing resource allocation promises to reveal as many new variations in th...
-
Authors: Moon, Haeun; Chen, Kehui
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: We generalize the sign covariance introduced by Bergsma & Dassios (2014) to multivariate random variables and beyond. The new interpoint-ranking sign covariance is applicable to general types of random objects as long as a meaningful similarity measure can be defined, and it is shown to be zero if and only if the two random variables are independent. The test statistic is a $U$-statistic, whose large-sample behaviour guarantees that the proposed test is consistent against general types of alte...
-
Authors: Deresa, N. W.; Van Keilegom, I.
Affiliations: KU Leuven
-
Authors: Schiavon, L.; Canale, A.; Dunson, D. B.
Affiliations: University of Padua; Duke University
Abstract: Factorization models express a statistical object of interest in terms of a collection of simpler objects. For example, a matrix or tensor can be expressed as a sum of rank-one components. In practice, however, it can be challenging to infer the number of components and the relative impact of the different components. A popular idea is to include infinitely many components whose impact decreases with the component index. This article is motivated by two limitations of such existing methods: (i...
-
Authors: Luo, Wei; Xue, Lingzhou; Yao, Jiawei; Yu, Xiufan
Affiliations: Zhejiang University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Princeton University; University of Notre Dame
Abstract: We consider forecasting a single time series using a large number of predictors in the presence of a possible nonlinear forecast function. Assuming that the predictors affect the response through the latent factors, we propose to first conduct factor analysis and then apply sufficient dimension reduction on the estimated factors to derive the reduced data for subsequent forecasting. Using directional regression and the inverse third-moment method in the stage of sufficient dimension reduction,...
-
Authors: Li, Yichao; Wang, Wenshuo; Deng, K. E.; Liu, Jun S.
Affiliations: Tsinghua University; Harvard University
Abstract: Sequential Monte Carlo algorithms are widely accepted as powerful computational tools for making inference with dynamical systems. A key step in sequential Monte Carlo is resampling, which plays the role of steering the algorithm towards the future dynamics. Several strategies have been used in practice, including multinomial resampling, residual resampling, optimal resampling, stratified resampling and optimal transport resampling. In one-dimensional cases, we show that optimal transport resa...
-
Authors: Cohen, E. A. K.; Gibberd, A. J.
Affiliations: Imperial College London; Lancaster University
Abstract: Wavelets provide the flexibility for analysing stochastic processes at different scales. In this article we apply them to multivariate point processes as a means of detecting and analysing unknown nonstationarity, both within and across component processes. To provide statistical tractability, a temporally smoothed wavelet periodogram is developed and shown to be equivalent to a multi-wavelet periodogram. Under a stationarity assumption, the distribution of the temporally smoothed wavelet peri...
-
Authors: Shi, H.; Drton, M.; Han, F.
Affiliations: University of Washington; University of Washington Seattle; Technical University of Munich
Abstract: Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much attention recently. The coefficient has the unusual appeal that it not only estimates a population quantity first proposed by that is zero if and only if the underlying pair of random variables is independent, but also is asymptotically normal under independence. This paper compares Chatterjee's new correlation coefficient with three established rank correlations that also facilitate consistent tests...