-
Authors: Cai, Tianxi; Cai, T. Tony; Zhang, Anru
Affiliations: University of Pennsylvania
Abstract: Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed ...
-
Authors: Feng, Long; Zou, Changliang; Wang, Zhaojun
Affiliations: Nankai University; Nankai University
Abstract: This article concerns tests for the two-sample location problem when data dimension is larger than the sample size. Existing multivariate-sign-based procedures are not robust against high dimensionality, producing tests with Type I error rates far away from nominal levels. This is mainly due to the biases from estimating location parameters. We propose a novel test to overcome this issue by using the leave-one-out idea. The proposed test statistic is scalar-invariant and thus is particularly u...
-
Authors: Fogarty, Colin B.; Mikkelsen, Mark E.; Gaieski, David F.; Small, Dylan S.
Affiliations: University of Pennsylvania
Abstract: Motivated by an observational study of the effect of hospital ward versus intensive care unit admission on severe sepsis mortality, we develop methods to address two common problems in observational studies: (1) when there is a lack of covariate overlap between the treated and control groups, how to define an interpretable study population wherein inference can be conducted without extrapolating with respect to important variables; and (2) how to use randomization inference to form confidence ...
-
Authors: Plumlee, Matthew; Joseph, V. Roshan; Yang, Hui
Affiliations: University of Michigan System; University of Michigan
Abstract: Computational modeling is a popular tool to understand a diverse set of complex systems. The output from a computational model depends on a set of parameters that are unknown to the designer, but a modeler can estimate them by collecting physical data. In the described study of the ion channels of ventricular myocytes, the parameter of interest is a function as opposed to a scalar or a set of scalars. This article develops a new modeling strategy to nonparametrically study the functional param...
-
Authors: Breidt, F. Jay; Opsomer, Jean D.; Sanchez-Borrego, Ismael
Affiliations: Colorado State University System; Colorado State University Fort Collins
Abstract: Fine stratification is commonly used to control the distribution of a sample from a finite population and to improve the precision of resulting estimators. One-per-stratum designs represent the finest possible stratification and occur in practice, but designs with very low numbers of elements per stratum (say, two or three) are also common. The classical variance estimator in this context is the collapsed stratum estimator, which relies on creating larger pseudo-strata and computing the sum of...
-
Authors: Tang, Jin; Li, Yehua; Guan, Yongtao
Affiliations: Iowa State University; Iowa State University
Abstract: We model generalized longitudinal data from multiple treatment groups by a class of semiparametric analysis of covariance models, which take into account the parametric effects of time-dependent covariates and the nonparametric time effects. In these models, the treatment effects are represented by nonparametric functions of time and we propose a generalized quasi-likelihood ratio test procedure to test if these functions are identical. Our estimation procedure is based on profile estimating e...
-
Authors: Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim
Affiliations: Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health
Abstract: Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components (PCs) from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap PCs, eigenvalues, and scores. Our methods leverage the fact th...
-
Authors: Tibshirani, Ryan J.; Taylor, Jonathan; Lockhart, Richard; Tibshirani, Robert
Affiliations: Carnegie Mellon University; Carnegie Mellon University
Abstract: We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection event that can be characterized as y falling into a polyhedral set. This framework allows us to derive conditional (post-selection) hypothesis tests at any step of forward stepwise or least angle regression, or any step along the lasso regularization path, ...
-
Authors: Huang, Chiung-Yu; Qin, Jing; Tsai, Huei-Ting
Affiliations: Johns Hopkins University; Johns Hopkins Medicine
Abstract: With the rapidly increasing availability of data in the public domain, combining information from different sources to draw inferences about associations or differences of interest has become an emerging challenge to researchers. This article presents a novel approach to improve efficiency in estimating the survival time distribution by synthesizing information from the individual-level data with t-year survival probabilities from external sources such as disease registries. While disease registries pro...
-
Authors: Bien, Jacob; Bunea, Florentina; Xiao, Luo
Affiliations: Cornell University
Abstract: We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator that tapers the sample covariance matrix by a Toeplitz, sparsely banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimator...