-
Authors: Hu, Xiaoyu; Lin, Zhenhua
Affiliations: Xi'an Jiaotong University; National University of Singapore
Abstract: Two-sample hypothesis testing is a fundamental statistical problem for inference about two populations. In this paper, we construct a novel test statistic to detect high-dimensional distributional differences based on the max-sliced Wasserstein distance to mitigate the curse of dimensionality. By exploiting an intriguing link between the distance and suprema of empirical processes, we develop an effective bootstrapping procedure to approximate the null distribution of the test statistic. One d...
-
Authors: Wang, Ruodu
Affiliations: University of Waterloo
Abstract: In this paper it is proved that the only admissible way of merging arbitrary $e$-values is to use a weighted arithmetic average. This result completes the picture of merging methods for arbitrary $e$-values and generalizes the result that the only admissible way of symmetrically merging $e$-values is to use the arithmetic average combined with a constant. Although the proved statement is naturally anticipated, its proof relies on a sophisticated application of optimal transport dualit...
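As a minimal illustration of the merging rule named in this abstract (not the paper's own code), a weighted arithmetic average of e-values can be sketched as follows. The function name `merge_e_values` and the optional `constant_weight` argument are hypothetical; the constant term is an assumption modelled on the "combined with a constant" wording of the symmetric case. The merged value is again an e-value because each input has expectation at most 1 under the null, so any convex combination of the inputs and the constant 1 does too.

```python
import math
from typing import Sequence


def merge_e_values(
    e_values: Sequence[float],
    weights: Sequence[float],
    constant_weight: float = 0.0,
) -> float:
    """Merge e-values by a weighted arithmetic average.

    Hypothetical helper for illustration only. `constant_weight` is the
    weight placed on the trivial e-value 1; it and `weights` must be
    nonnegative and sum to 1.
    """
    if len(e_values) != len(weights):
        raise ValueError("e_values and weights must have the same length")
    if any(e < 0 for e in e_values):
        raise ValueError("e-values must be nonnegative")
    if constant_weight < 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    if not math.isclose(constant_weight + sum(weights), 1.0):
        raise ValueError("weights (including constant_weight) must sum to 1")
    # Convex combination of the constant 1 and the individual e-values.
    return constant_weight + sum(w * e for w, e in zip(weights, e_values))


if __name__ == "__main__":
    # Equal weights on two e-values reduce to the plain arithmetic mean.
    print(merge_e_values([2.0, 4.0], [0.5, 0.5]))
```

Under these assumptions, setting `constant_weight=0.0` with equal weights recovers the plain arithmetic mean.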
-
Authors: Dixit, Vaidehi; Martin, Ryan
Affiliations: University of Missouri System; University of Missouri Columbia; North Carolina State University
Abstract: Distinguishing two models is a fundamental and practically important statistical problem. Error rate control is crucial to the testing logic, but in complex nonparametric settings can be difficult to achieve, especially when the stopping rule that determines the data collection process is not available. This paper proposes an $e$-process construction based on the predictive recursion algorithm originally designed to recursively fit nonparametric mixture models. The resulting predictive recur...
-
Authors: Pozza, F.; Zanella, G.
Affiliations: Bocconi University
Abstract: We study multi-proposal Markov chain Monte Carlo algorithms, such as multiple-try or generalized Metropolis-Hastings schemes, which have recently received renewed attention due to their amenability to parallel computing. First, we prove that no multi-proposal scheme can speed up convergence relative to the corresponding single-proposal scheme by more than a factor of $K$, where $K$ denotes the number of proposals at each iteration. This result applies to arbitrary target distributions and ...
-
Authors: Saefken, B.; Kneib, T.; Wood, S. N.
Affiliations: TU Clausthal; University of Gottingen; University of Edinburgh
Abstract: The smoothing parameters in a semiparametric model are estimated based on criteria such as generalized cross-validation or restricted maximum likelihood. As these parameters are estimated in a data-driven manner, they influence the degrees of freedom of a semiparametric model, based on Stein's lemma. This allows us to associate parts of the degrees of freedom of a semiparametric model with the smoothing parameters. A framework is introduced that enables these degrees of freedom of the smoothin...
-
Authors: Li, Jinming; Xu, Gongjun; Zhu, Ji
Affiliations: University of Michigan System; University of Michigan
Abstract: Factor analysis is a statistical tool widely used in many disciplines, such as psychology, economics and sociology. As observations linked by networks become increasingly common, incorporating network structures into factor analysis is an important problem that remains open. This article focuses on high-dimensional factor analysis involving network-connected observations, and we propose a generalized factor model with latent factors that account for both the network structure and the dependenc...
-
Authors: Roycraft, B.; Rajaratnam, B.
Affiliations: State University System of Florida; University of Florida; University of California System; University of California Davis
Abstract: Graphical and sparse (inverse) covariance models have found widespread use in modern sample-starved high-dimensional applications. A part of their wide appeal stems from the significantly low sample sizes required for existence of the estimators, especially in comparison with the classical full covariance model. For undirected Gaussian graphical models, the minimum sample size required for the existence of maximum likelihood estimators had been an open question for almost half a century, and h...
-
Authors: Stolf, F.; Dunson, D. B.
Affiliations: Duke University
Abstract: Joint species distribution models are popular in ecology for modelling covariate effects on species occurrence, while characterizing cross-species dependence. Data consist of multivariate binary indicators of the occurrences of different species in each sample, along with sample-specific covariates. A key problem is that current models implicitly assume that the list of species under consideration is predefined and finite, while for highly diverse groups of organisms, it is impossible to antic...
-
Authors: Zu, Tianhai; Qin, Yichen
Affiliations: University of Texas System; University of Texas at San Antonio; University System of Ohio; University of Cincinnati
Abstract: In network analysis, one frequently needs to conduct inference for network parameters based on a single observed network. Since the sampling distribution of the statistic is often unknown, one has to rely on the bootstrap. However, because of the complex dependence structure among vertices, existing bootstrap methods often yield unsatisfactory performance, especially for small or moderate sample sizes. Here we propose a new network bootstrap procedure, termed the local bootstrap, for estimatin...
-
Authors: Gang, B.; Banerjee, T.
Affiliations: Fudan University; University of Kansas
Abstract: Heteroskedasticity poses several methodological challenges in designing valid and powerful procedures for simultaneous testing of composite null hypotheses. In particular, the conventional practice of standardizing or rescaling heteroskedastic test statistics in this setting may severely affect the power of the underlying multiple testing procedure. Additionally, when the inferential parameter of interest is correlated with the variance of the test statistic, methods that ignore this dependenc...