-
Authors: Li, Tianxi; Lei, Lihua; Bhattacharyya, Sharmodeep; Van den Berge, Koen; Sarkar, Purnamrita; Bickel, Peter J.; Levina, Elizaveta
Affiliations: University of Virginia; Stanford University; Oregon State University; University of California System; University of California Berkeley; Ghent University; University of Texas System; University of Texas Austin; University of Michigan System; University of Michigan
Abstract: The problem of community detection in networks is usually formulated as finding a single partition of the network into some correct number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule sug...
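The top-down procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the split uses the sign of the eigenvector for the second-largest-magnitude eigenvalue of the adjacency matrix, and because the abstract's stopping rule is truncated, the sketch substitutes a hypothetical one (a depth limit `max_depth` and a minimum community size `min_size`). All function and parameter names are assumptions made for illustration.

```python
import numpy as np


def split_by_sign(adj):
    """Split nodes in two by the sign of the eigenvector belonging to the
    second-largest eigenvalue (in absolute value) of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(adj)
    order = np.argsort(-np.abs(vals))
    second = vecs[:, order[1]]
    return second >= 0


def recursive_bipartition(adj, nodes=None, max_depth=2, min_size=2, depth=0):
    """Return a nested list of node indices (a binary tree of communities).

    Stopping rule here is a placeholder assumption: stop at max_depth or
    when a split would create a community smaller than min_size.
    """
    if nodes is None:
        nodes = np.arange(adj.shape[0])
    if depth >= max_depth or len(nodes) < 2 * min_size:
        return nodes.tolist()
    mask = split_by_sign(adj[np.ix_(nodes, nodes)])
    left, right = nodes[mask], nodes[~mask]
    if len(left) < min_size or len(right) < min_size:
        return nodes.tolist()
    return [
        recursive_bipartition(adj, left, max_depth, min_size, depth + 1),
        recursive_bipartition(adj, right, max_depth, min_size, depth + 1),
    ]


# Toy weighted network: two blocks of 4 nodes, strong ties within a block
# (weight 1.0) and weak ties between blocks (weight 0.1), no self-loops.
adj = np.full((8, 8), 0.1)
adj[:4, :4] = 1.0
adj[4:, 4:] = 1.0
np.fill_diagonal(adj, 0.0)

tree = recursive_bipartition(adj, max_depth=1)
```

With `max_depth=1` the toy network is split once, so `tree` holds two leaf communities; a larger depth would continue splitting each leaf until the placeholder rule stops it.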
-
Authors: Li, Yunxiao; Hu, Yi-Juan; Satten, Glen A.
Affiliations: Emory University; Centers for Disease Control & Prevention - USA
Abstract: Modern statistical analyses often involve testing large numbers of hypotheses. In many situations, these hypotheses may have an underlying tree structure that both helps determine the order in which tests should be conducted and imposes a dependency between tests that must be accounted for. Our motivating example comes from testing the association between a trait of interest and groups of microbes that have been organized into operational taxonomic units (OTUs) or amplicon sequence variants (...
-
Authors: McCulloch, C. E.; Neuhaus, J. M.
-
Authors: Wang, Jiayi; Wong, Raymond K. W.; Zhang, Xiaoke
Affiliations: Texas A&M University System; Texas A&M University College Station; George Washington University
Abstract: Multidimensional function data arise from many fields nowadays. The covariance function plays an important role in the analysis of such increasingly common data. In this article, we propose a novel nonparametric covariance function estimation approach under the framework of reproducing kernel Hilbert spaces (RKHS) that can handle both sparse and dense functional data. We extend multilinear rank structures for (finite-dimensional) tensors to functions, which allow for flexible modeling of both ...
-
Authors: Fan, Jianqing; Fan, Yingying; Han, Xiao; Lv, Jinchi
Affiliations: Princeton University; University of Southern California; Chinese Academy of Sciences; University of Science & Technology of China, CAS
Abstract: Characterizing the asymptotic distributions of eigenvectors for large random matrices poses important challenges yet can provide useful insights into a range of statistical applications. To this end, in this article we introduce a general framework of asymptotic theory of eigenvectors for large spiked random matrices with diverging spikes and heterogeneous variances, and establish the asymptotic properties of the spiked eigenvectors and eigenvalues for the scenario of the generalized Wigner ma...
-
Authors: Fu, Luella; Gang, Bowen; James, Gareth M.; Sun, Wenguang
Affiliations: California State University System; San Francisco State University; Fudan University; University of Southern California
Abstract: Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to ...
-
Authors: Peruzzi, Michele; Banerjee, Sudipto; Finley, Andrew O.
Affiliations: Michigan State University; Duke University; University of California System; University of California Los Angeles
Abstract: We introduce a class of scalable Bayesian hierarchical models for the analysis of massive geostatistical datasets. The underlying approach combines ideas on high-dimensional geostatistics by partitioning the spatial domain and modeling the regions in the partition using a sparsity-inducing directed acyclic graph (DAG). We extend the model over the DAG to a well-defined spatial process, which we call the meshed Gaussian process (MGP). A major contribution is the development of MGPs on tessellate...
-
Authors: Wang, Zeya; Baladandayuthapani, Veerabhadran; Kaseb, Ahmed O.; Amin, Hesham M.; Hassan, Manal M.; Wang, Wenyi; Morris, Jeffrey S.
Affiliations: Rice University; University of Texas System; UTMD Anderson Cancer Center; University of Michigan System; University of Michigan; University of Texas System; UTMD Anderson Cancer Center; University of Texas System; UTMD Anderson Cancer Center; University of Texas System; UTMD Anderson Cancer Center; University of Texas System; UTMD Anderson Cancer Center; University of Pennsylvania
Abstract: It is well established that interpatient heterogeneity in cancer may significantly affect genomic data analyses and in particular, network topologies. Most existing graphical model methods estimate a single population-level graph for a genomic or proteomic network. In many investigations, these networks depend on patient-specific indicators that characterize the heterogeneity of individual networks across subjects with respect to subject-level covariates. Examples include assessments of how the ...
-
Authors: Liu, Jeremiah Zhe; Deng, Wenying; Lee, Jane; Lin, Pi-I Debby; Valeri, Linda; Christiani, David C.; Bellinger, David C.; Wright, Robert O.; Mazumdar, Maitreyi M.; Coull, Brent A.
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard University Medical Affiliates; Boston Children's Hospital; Harvard University; Harvard T.H. Chan School of Public Health; Icahn School of Medicine at Mount Sinai
Abstract: Gene-environment and nutrition-environment studies often involve testing of high-dimensional interactions between two sets of variables, each having potentially complex nonlinear main effects on an outcome. Construction of a valid and powerful hypothesis test for such an interaction is challenging, due to the difficulty in constructing an efficient and unbiased estimator for the complex, nonlinear main effects. In this work, we address this problem by proposing a cross-validated ensemble of ke...
-
Authors: Jordanger, Lars Arne; Tjostheim, Dag
Affiliations: Western Norway University of Applied Sciences; University of Bergen
Abstract: The spectral distribution $f(\omega)$ of a stationary time series $\{Y_t\}_{t \in \mathbb{Z}}$ can be used to investigate whether or not periodic structures are present in $\{Y_t\}_{t \in \mathbb{Z}}$, but $f(\omega)$ has some limitations due to its dependence on the autocovariances $\gamma(h)$. For example, $f(\omega)$ cannot distinguish white iid noise from GARCH-type models (whose terms are dependent, but uncorrelated), which implies that $f(\omega)$ can be an inadequate tool when $\{Y_t\}$(t is an element of ...
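The limitation mentioned in this abstract can be illustrated numerically. The sketch below is not taken from the paper: it simulates a GARCH(1,1) series with arbitrarily chosen, assumed parameters (`omega=0.1`, `alpha=0.2`, `beta=0.6`, Gaussian innovations) and compares the lag-1 sample autocorrelation of the series with that of its squares. Since the spectrum depends only on the autocovariances, a series whose autocorrelations are near zero looks like white noise to it, even though the squared values are clearly dependent.

```python
import numpy as np


def simulate_garch11(n, omega=0.1, alpha=0.2, beta=0.6, seed=0):
    """Simulate a GARCH(1,1) series; parameter values are illustrative only."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    y = np.empty(n)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        y[t] = np.sqrt(sigma2) * z[t]
        sigma2 = omega + alpha * y[t] ** 2 + beta * sigma2
    return y


def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


y = simulate_garch11(20000)
rho_y = sample_autocorr(y, 1)        # close to zero: uncorrelated levels
rho_y2 = sample_autocorr(y ** 2, 1)  # clearly positive: dependent squares

print(f"lag-1 autocorrelation of y:   {rho_y:.3f}")
print(f"lag-1 autocorrelation of y^2: {rho_y2:.3f}")
```

The levels are essentially uncorrelated while the squares are not, which is the kind of dependence an autocovariance-based spectrum cannot see.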