-
Authors: Cai, T. Tony; Sun, Wenguang; Wang, Weinan
Affiliations: University of Pennsylvania; University of Southern California
Abstract: Two-sample multiple testing has a wide range of applications. The conventional practice first reduces the original observations to a vector of p-values and then chooses a cut-off to adjust for multiplicity. However, this data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. We introduce a new framework for two-sample multiple testing by incorporating a carefully constructed auxiliary variable in inference to improve the power. A data-dr...
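The "conventional practice" described in this abstract (reduce to p-values, then choose a cut-off) can be illustrated with a minimal sketch. The Benjamini-Hochberg step-up rule is used here only as one familiar example of such a cut-off; the abstract does not name a specific rule, and this is not the auxiliary-variable procedure the paper proposes. The function name and the default `alpha` are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Assumption: the Benjamini-Hochberg step-up rule stands in for the
    generic "choose a cut-off" step; the abstract does not specify one.
    """
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest 1-based rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

The rule sees only the p-value vector, which is exactly the data reduction step the abstract argues can discard information.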
-
Authors: Han, Peisong; Kong, Linglong; Zhao, Jiwei; Zhou, Xingcai
Affiliations: University of Michigan System; University of Michigan; University of Alberta; State University of New York (SUNY) System; University at Buffalo, SUNY; Nanjing Audit University
Abstract: Quantile estimation has attracted significant research interest in recent years. However, there has been only a limited literature on quantile estimation in the presence of incomplete data. We propose a general framework to address this problem. Our framework combines the two widely adopted approaches for missing data analysis, the imputation approach and the inverse probability weighting approach, via the empirical likelihood method. The method proposed is capable of dealing with many differe...
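As a rough illustration of one of the two ingredients named in this abstract, the sketch below computes an inverse-probability-weighted quantile from incompletely observed data. It assumes the response probabilities are known and supplied by the caller; it is not the empirical likelihood combination the paper proposes, and the function and argument names are hypothetical.

```python
def ipw_quantile(values, observed, propensities, tau):
    """Weighted tau-quantile using only the observed values.

    Assumptions: `observed[i]` is True when `values[i]` is available, and
    `propensities[i]` is the (known, positive) probability of observing
    unit i. Each observed unit receives weight 1 / propensity.
    """
    pairs = sorted(
        (v, 1.0 / p)
        for v, r, p in zip(values, observed, propensities)
        if r
    )
    total = sum(w for _, w in pairs)
    # Smallest observed value whose cumulative weight reaches tau * total.
    cumulative = 0.0
    for v, w in pairs:
        cumulative += w
        if cumulative >= tau * total:
            return v
    raise ValueError("no observed values to compute a quantile from")
```

Units that are unlikely to be observed get larger weights, which is how the weighting compensates for the missing cases.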
-
Authors: Delaigle, Aurore; Hall, Peter; Pham, Tung
Affiliations: University of Melbourne
Abstract: We show that, in the functional data context, by appropriately exploiting the functional nature of the data, it is possible to cluster the observations asymptotically perfectly. We demonstrate that this level of performance can sometimes be achieved by the k-means algorithm as long as the data are projected on a carefully chosen finite dimensional space. In general, the notion of an ideal cluster is not clearly defined. We derive our results in the setting where the data come from two populati...
-
Authors: Godolphin, J. D.
Affiliations: University of Surrey
Abstract: The arrangement of 2^n factorials in row-column designs to estimate main effects and two-factor interactions is investigated. Single-replicate constructions are given which enable estimation of all main effects and maximize the number of estimable two-factor interactions. Constructions and guidance are given for multireplicate designs in single arrays and in multiple arrays. Consideration is given to constructions for 2^(n-t) fractional factorials.
-
Authors: Piao, Jin; Ning, Jing; Shen, Yu
Affiliations: University of Southern California; University of Texas System; UTMD Anderson Cancer Center
Abstract: To understand better the relationship between patient characteristics and their residual survival after an intermediate event such as the local recurrence of cancer, it is of interest to identify patients with the intermediate event and then to analyse their residual survival data. One challenge in analysing such data is that the observed residual survival times tend to be longer than those in the target population, since patients who die before experiencing the intermediate event are excluded...
-
Authors: Ditlevsen, Susanne; Samson, Adeline
Affiliations: University of Copenhagen; Communaute Universite Grenoble Alpes; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS)
Abstract: The statistical problem of parameter estimation in partially observed hypoelliptic diffusion processes is naturally occurring in many applications. However, because of the noise structure, where the noise components of the different co-ordinates of the multi-dimensional process operate on different timescales, standard inference tools are ill conditioned. We propose to use a higher order scheme to approximate the likelihood, such that the different timescales are appropriately accounted for. W...
-
Authors: Bernton, Espen; Jacob, Pierre E.; Gerber, Mathieu; Robert, Christian P.
Affiliations: Harvard University; University of Bristol; Universite PSL; Universite Paris-Dauphine; University of Warwick
Abstract: A growing number of generative statistical models do not permit the numerical evaluation of their likelihood functions. Approximate Bayesian computation has become a popular approach to overcome this issue, in which one simulates synthetic data sets given parameters and compares summaries of these data sets with the corresponding observed values. We propose to avoid the use of summaries and the ensuing loss of information by instead using the Wasserstein distance between the empirical distribu...
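The idea of comparing whole samples rather than summaries can be sketched with a toy rejection sampler. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional Gaussian model, the uniform prior, the threshold `epsilon`, the plain rejection scheme, and all function names. For two equal-size samples on the real line, the Wasserstein-1 distance between their empirical distributions reduces to the mean absolute difference of the sorted values, which is what `wasserstein_1d` computes.

```python
import random

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches sorted values in order.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def abc_rejection(observed, simulate, sample_prior, n_draws, epsilon, rng):
    """Keep prior draws whose synthetic data lie within epsilon of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        synthetic = simulate(theta, len(observed), rng)
        if wasserstein_1d(observed, synthetic) <= epsilon:
            accepted.append(theta)
    return accepted

def simulate(theta, n, rng):
    # Hypothetical toy model: n independent draws from Normal(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def sample_prior(rng):
    # Hypothetical prior: Uniform(-5, 5) on the location parameter.
    return rng.uniform(-5.0, 5.0)

rng = random.Random(0)
observed = simulate(2.0, 200, rng)  # data generated with theta = 2
posterior_draws = abc_rejection(observed, simulate, sample_prior, 2000, 0.3, rng)
```

No summary statistic is chosen anywhere in this sketch; the accepted values in `posterior_draws` concentrate around the parameter that generated `observed` because the distance compares the full empirical distributions.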
-
Authors: Zhao, Junlong; Liu, Chao; Niu, Lu; Leng, Chenlei
Affiliations: Beijing Normal University; Beihang University; University of Warwick; Alan Turing Institute
Abstract: Influence diagnosis is an integrated component of data analysis but has been severely underinvestigated in a high dimensional regression setting. One of the key challenges, even in a fixed dimensional setting, is how to deal with multiple influential points that give rise to masking and swamping effects. The paper proposes a novel group deletion procedure referred to as multiple influential point detection by studying two extreme statistics based on a marginal-correlation-based influence measu...
-
Authors: Liang, Tengyuan; Su, Weijie J.
Affiliations: University of Chicago; University of Pennsylvania
Abstract: Modern statistical inference tasks often require iterative optimization methods to compute the solution. Convergence analysis from an optimization viewpoint informs us only how well the solution is approximated numerically but overlooks the sampling nature of the data. In contrast, recognizing the randomness in the data, statisticians are keen to provide uncertainty quantification, or confidence, for the solution obtained by using iterative optimization methods. The paper makes progress along ...