-
Authors: Zhang, Yaowu; Zhu, Liping
Affiliations: Shanghai University of Finance & Economics; Renmin University of China
Abstract: Testing independence between high-dimensional random vectors is fundamentally different from testing independence between univariate random variables. Taking the projection correlation as an example, it suffers from at least three problems. First, it has a high computational complexity of O{n^3(p+q)}, where n, p and q are the sample size and dimensions of the random vectors; this limits its usefulness substantially when n is extremely large. Second, the asymptotic null distribution of the proje...
-
Authors: Wood, S. N.
-
Authors: Gorgi, P.; Lauria, C. S. A.; Luati, A.
Affiliations: Vrije Universiteit Amsterdam; University of Bologna; Imperial College London
Abstract: Score-driven models have recently been introduced as a general framework to specify time-varying parameters of conditional densities. The score enjoys stochastic properties that make these models easy to implement and convenient to apply in several contexts, ranging from biostatistics to finance. Score-driven parameter updates have been shown to be optimal in terms of locally reducing a local version of the Kullback-Leibler divergence between the true conditional density and the postulated den...
-
Authors: Li, Shuwei; Hu, Tao; Wang, Lianming; McMahan, Christopher S.; Tebbs, Joshua M.
Affiliations: Guangzhou University; Capital Normal University; University of South Carolina System; University of South Carolina Columbia; Clemson University; University of South Carolina System; University of South Carolina Columbia
Abstract: Group testing is an effective way to reduce the time and cost associated with conducting large-scale screening for infectious diseases. Benefits are realized through testing pools formed by combining specimens, such as blood or urine, from different individuals. In some studies, individuals are assessed only once and a time-to-event endpoint is recorded, for example, the time until infection. Combining group testing with this type of endpoint results in group-tested current status data. To ...
-
Authors: Wang, Yuyao; Ying, Andrew; Xu, Ronghui
Affiliations: University of California System; University of California San Diego; University of Pennsylvania
Abstract: In prevalent cohort studies with follow-up, the time-to-event outcome is subject to left truncation leading to selection bias. For estimation of the distribution of the time to event, conventional methods adjusting for left truncation tend to rely on the quasi-independence assumption that the truncation time and the event time are independent on the observed region. This assumption is violated when there is dependence between the truncation time and the event time, possibly induced by measured...
-
Authors: Maugis, P. A.
Affiliations: University of London; University College London
Abstract: Subgraph counts, in particular the number of occurrences of small shapes such as triangles, characterize properties of random networks. As a result, they have seen wide use as network summary statistics. Subgraphs are typically counted globally, making existing approaches unable to describe vertex-specific characteristics. In contrast, rooted subgraphs focus on vertex neighbourhoods, and are fundamental descriptors of local network properties. We derive the asymptotic joint distribution of roo...
-
Authors: Abadir, Karim M.; Lubrano, Michel
Affiliations: Imperial College London; Aix-Marseille Universite
Abstract: We show that least-squares cross-validation methods share a common structure that has an explicit asymptotic solution, when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student-t(nu) kernel, the cross-validation criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple and noniterative, thus leading to very fast computations; their integrated squared-error dominates tradi...
-
Authors: Lewis, R. M.; Battey, H. S.
Affiliations: Imperial College London
Abstract: Direct use of the likelihood function typically produces severely biased estimates when the dimension of the parameter vector is large relative to the effective sample size. With linearly separable data generated from a logistic regression model, the loglikelihood function asymptotes and the maximum likelihood estimator does not exist. We show that an exact analysis for each regression coefficient produces half-infinite confidence sets for some parameters when the data are separable. Such conc...
-
Authors: Yu, X.; Zhu, J.
Affiliations: University of Michigan System; University of Michigan
Abstract: In many real-world networks, it is often observed that subgraphs or higher-order structures of certain configurations, e.g., triangles and bi-fans, are overly abundant compared to standard randomly generated networks. However, statistical models accounting for this phenomenon are limited, especially when community structure is of interest. This limitation is coupled with a lack of community detection methods that leverage subgraphs or higher-order structures. In this paper, we propose a new...
-
Authors: Song, Hoseung; Chen, Hao
Affiliations: University of California System; University of California Davis
Abstract: Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do not work well for some scenarios when the dimension of the data is moderate to high due to the curse of dimensionality. We propose a new test statistic that makes use of a common pattern under moderate and high dimensions and achieves substantial power impr...