-
Authors: Cheng, Yu; Fine, Jason P.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; University of North Carolina; University of North Carolina Chapel Hill
Abstract: Association models, like frailty and copula models, are frequently used to analyse clustered survival data and to evaluate within-cluster associations. The assumption of non-informative censoring is commonly applied to these models, though it may not be true in many situations. We consider bivariate competing risk data and focus on association models specified for the bivariate cumulative incidence function (CIF), which is a non-parametrically identifiable quantity. Copula models are propose...
-
Authors: Delaigle, Aurore; Hall, Peter
Affiliations: University of Melbourne
Abstract: We show that, in functional data classification problems, perfect asymptotic classification is often possible, making use of the intrinsic very high dimensional nature of functional data. This performance is often achieved by linear methods, which are optimal in important cases. These results point to a marked contrast between classification for functional data and its counterpart in conventional multivariate analysis, where the dimension is kept fixed as the sample size diverges. In the lat...
-
Authors: Yuan, Ying; Zhu, Hongtu; Lin, Weili; Marron, J. S.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Local polynomial regression has received extensive attention for the non-parametric estimation of regression functions when both the response and the covariate are in Euclidean space. However, little has been done when the response is in a Riemannian manifold. We develop an intrinsic local polynomial regression estimate for the analysis of symmetric positive definite matrices as responses that lie in a Riemannian manifold with covariate in Euclidean space. The primary motivation and applicatio...
-
Authors: Sang, Huiyan; Huang, Jianhua Z.
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n^3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new appr...
-
Authors: Casella, G.; Roberts, G.
-
Authors: Allen, Genevera I.; Tibshirani, Robert
Affiliations: Rice University; Baylor College of Medicine; Stanford University
Abstract: We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Many of these data matrices are transposable, meaning that neither the row variables nor the column variables can be considered independent instances. An example of this scenario is detecting significant genes in microarrays when the samples may be dependent because of latent variables or unknown batch effects. By modelling this matrix data by using the matrix variate normal distri...
-
Authors: Ambroise, Christophe; Matias, Catherine
Affiliations: Universite Paris Saclay; Centre National de la Recherche Scientifique (CNRS)
Abstract: Random-graph mixture models are very popular for modelling real data networks. Parameter estimation procedures usually rely on variational approximations, either combined with the expectation-maximization (EM) algorithm or with Bayesian approaches. Despite good results on synthetic data, the validity of the variational approximation is, however, not established. Moreover, these variational approaches aim at approximating the maximum likelihood or the maximum a posteriori estimators, whose beh...
-
Authors: Cai, T. Tony; Jeng, X. Jessie; Li, Hongzhe
Affiliations: University of Pennsylvania
Abstract: Copy number variants (CNVs) are alterations of DNA of a genome that result in the cell having less or more than two copies of segments of the DNA. CNVs correspond to relatively large regions of the genome, ranging from about one kilobase to several megabases, that are deleted or duplicated. Motivated by CNV analysis based on next generation sequencing data, we consider the problem of detecting and identifying sparse short segments hidden in a long linear sequence of data with an unspecified...
-
Authors: Fan, Jianqing; Feng, Yang; Tong, Xin
Affiliations: Princeton University; Columbia University; Princeton University
Abstract: For high dimensional classification, it is well known that naively performing the Fisher discriminant rule leads to poor results due to diverging spectra and accumulation of noise. Therefore, researchers proposed independence rules to circumvent the diverging spectra, and sparse independence rules to mitigate the issue of accumulation of noise. However, in biological applications, often a group of correlated genes are responsible for clinical outcomes, and the use of the covariance information...
-
Authors: Polson, Nicholas G.; Scott, James G.
Affiliations: University of Texas System; University of Texas Austin; University of Chicago
Abstract: We use Lévy processes to generate joint prior distributions, and therefore penalty functions, for a location parameter as p grows large. This generalizes the class of local-global shrinkage rules based on scale mixtures of normals, illuminates new connections between disparate methods and leads to new results for computing posterior means and modes under a wide class of priors. We extend this framework to large-scale regularized regression problems where p > n, and we provide comparisons with o...