-
Authors: Chakraborty, Anirvan; Chaudhuri, Probal
Affiliations: Indian Statistical Institute; Indian Statistical Institute Kolkata
Abstract: The Wilcoxon-Mann-Whitney test is a robust competitor of the t test in the univariate setting. For finite-dimensional multivariate non-Gaussian data, several extensions of the Wilcoxon-Mann-Whitney test have been shown to outperform Hotelling's T² test. In this paper, we study a Wilcoxon-Mann-Whitney-type test based on spatial ranks in infinite-dimensional spaces, investigate its asymptotic properties, and compare it with several existing tests. The proposed test is shown to be robust with respec...
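To make the idea of a spatial-rank two-sample statistic concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common form of such a statistic: the average of the spatial signs (y - x) / ||y - x|| over all cross-sample pairs. Curves are assumed to be observed on a shared grid, and the function-space norm is approximated by the Euclidean norm on that grid; the names `spatial_sign` and `wmw_spatial_statistic` and the toy data are hypothetical.

```python
import math


def spatial_sign(v):
    """Return v / ||v||, or the zero vector when v is zero."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [0.0] * len(v)
    return [c / norm for c in v]


def wmw_spatial_statistic(xs, ys):
    """Average spatial sign of y - x over all pairs (x in xs, y in ys).

    Assumption: every observation is a list of values on the same grid.
    """
    dim = len(xs[0])
    total = [0.0] * dim
    for x in xs:
        for y in ys:
            s = spatial_sign([b - a for a, b in zip(x, y)])
            total = [t + c for t, c in zip(total, s)]
    pairs = len(xs) * len(ys)
    return [t / pairs for t in total]


# Hypothetical toy data: each curve is sampled at three grid points.
xs = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.3]]
ys = [[1.0, 1.1, 1.2], [0.9, 1.2, 1.0]]
stat = wmw_spatial_statistic(xs, ys)
```

Under this form, a statistic close to the zero vector is what one would expect when the two samples share a distribution, while a large norm suggests a location difference. Calibrating an actual test requires the asymptotic theory the paper develops, which this sketch does not attempt.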
-
Authors: Wu, Yuanshan; Yin, Guosheng
Affiliations: Wuhan University; University of Hong Kong
Abstract: To accommodate the heterogeneity that is often present in ultrahigh-dimensional data, we propose a conditional quantile screening method, which enables us to select features that contribute to the conditional quantile of the response given the covariates. The method can naturally handle censored data by incorporating a weighting scheme through redistribution of the mass to the right; moreover, it is invariant to monotone transformation of the response and requires substantially weaker conditio...
-
Authors: Rootzen, Holger; Zholud, Dmitrii
Affiliations: Chalmers University of Technology; University of Gothenburg
Abstract: This paper develops tail estimation methods to handle false positives in multiple testing problems where testing is done at extreme significance levels and with low degrees of freedom, and where the true null distribution may differ from the theoretical one. We show that the number of false positives, conditional on the total number of positives, has an approximately binomial distribution, and we find estimators of the distribution parameter. We also develop methods for estimation of the true ...
-
Authors: Scott, J. G.; Shively, T. S.; Walker, S. G.
Affiliations: University of Texas System; University of Texas Austin
Abstract: This paper adopts a nonparametric Bayesian approach to testing whether a function is monotone. Two new families of tests are constructed. The first uses constrained smoothing splines with a hierarchical stochastic-process prior that explicitly controls the prior probability of monotonicity. The second uses regression splines together with two proposals for the prior over the regression coefficients. Via simulation, the finite-sample performance of the tests is shown to improve upon existing fr...
-
Authors: Cao, Hongyuan; Wu, Wei Biao
Affiliations: University of Missouri System; University of Missouri Columbia; University of Chicago
Abstract: We consider large-scale multiple testing for data that have locally clustered signals. With this structure, we apply techniques from changepoint analysis and propose a boundary detection algorithm so that the clustering information can be utilized. Consequently, the precision of the multiple testing procedure is substantially improved. We study tests with independent as well as dependent p-values. Monte Carlo simulations suggest that the methods perform well with realistic sample sizes and show...
-
Authors: Xia, Yin; Cai, Tianxi; Cai, T. Tony
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Harvard University; Harvard T.H. Chan School of Public Health; University of Pennsylvania
Abstract: Model organisms and human studies have yielded increasing empirical evidence that interactions among genes contribute broadly to genetic variation of complex traits. In the presence of gene-gene interactions, the dimensionality of the feature space becomes extremely high relative to the sample size. This poses a significant methodological challenge in the identification of gene-gene interactions. In this paper, by using a Gaussian graphical model framework, we translate the problem of identify...
-
Authors: Wadsworth, Jennifer L.
Affiliations: Lancaster University
Abstract: Full likelihood-based inference for high-dimensional multivariate extreme value distributions, or max-stable processes, is feasible when incorporating occurrence times of the maxima; without this information, d-dimensional likelihood inference is usually precluded due to the large number of terms in the likelihood. However, some studies have noted bias when performing high-dimensional inference that incorporates such event information, particularly when dependence is weak. We elucidate this ph...
-
Authors: Wu, Yichao; Stefanski, Leonard A.
Affiliations: North Carolina State University
Abstract: We propose an automatic structure recovery method for additive models, based on a backfitting algorithm coupled with local polynomial smoothing, in conjunction with a new kernel-based variable selection strategy. Our method produces estimates of the set of noise predictors, the sets of predictors that contribute polynomially at different degrees up to a specified degree M, and the set of predictors that contribute beyond polynomially of degree M. We prove consistency of the proposed method, an...
-
Authors: Stallings, J. W.; Morgan, J. P.
Affiliations: North Carolina State University; Virginia Polytechnic Institute & State University
Abstract: The standard approach to finding optimal experimental designs employs conventional measures of design efficacy, such as the A, E, and D-criterion, that assume equal interest in all estimable functions of model parameters. This paper develops a general theory for weighted optimality, allowing precise design selection according to expressed relative interest in different functions in the estimation space. The approach employs a very general class of matrix-specified weighting schemes that produc...
-
Authors: Walter, V.; Wright, F. A.; Nobel, A. B.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Penn State Health; North Carolina State University; University of North Carolina; University of North Carolina Chapel Hill
Abstract: We consider the detection and identification of recurrent departures from stationary behaviour in genomic or similarly arranged data containing measurements at an ordered set of variables. Our primary focus is on departures that occur only at a single variable, or within a small window of contiguous variables, but involve more than one sample. This encompasses the identification of aberrant markers in genome-wide measurements of DNA copy number and DNA methylation, as well as meta-analyses of ...