-
Authors: Hans, Chris; Dobra, Adrian; West, Mike
Affiliations: University System of Ohio; Ohio State University; University of Washington; University of Washington Seattle; Duke University
Abstract: Model search in regression with very large numbers of candidate predictors raises challenges for both model specification and computation, for which standard approaches such as Markov chain Monte Carlo (MCMC) methods are often infeasible or ineffective. We describe a novel shotgun stochastic search (SSS) approach that explores interesting regions of the resulting high-dimensional model spaces and quickly identifies regions of high posterior probability over models. We describe algorithmic and ...
-
Authors: Cai, Jianwen; Fan, Jianqing; Jiang, Jiancheng; Zhou, Haibo
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Princeton University; University of North Carolina Charlotte
Abstract: This article studies estimation of partially linear hazard regression models for multivariate survival data. A profile pseudo-partial likelihood estimation method is proposed under the marginal hazard model framework. The estimation of the parameters for the linear part is accomplished by maximization of a pseudo-partial likelihood profiled over the nonparametric part. This enables us to obtain root-n consistent estimators of the parametric component. Asymptotic normality is obtained for the e...
-
Authors: Dahl, David B.; Newton, Michael A.
Affiliations: Texas A&M University System; Texas A&M University College Station; University of Wisconsin System; University of Wisconsin Madison
Abstract: Multiple hypothesis testing and clustering have been the subject of extensive research in high-dimensional inference, yet these problems usually have been treated separately. By defining true clusters in terms of shared parameter values, we could improve the sensitivity of individual tests, because more data bearing on the same parameter values are available. We develop and evaluate a hybrid methodology that uses clustering information to increase testing sensitivity and accommodates uncertain...
-
Authors: Smith, Michael; Fahrmeir, Ludwig
Affiliations: University of Melbourne; University of Sydney; University of Munich
Abstract: We propose a procedure to undertake Bayesian variable selection and model averaging for a series of regressions located on a lattice. For those regressors that are in common in the regressions, we consider using an Ising prior to smooth spatially the indicator variables representing whether or not the variable is zero or nonzero in each regression. This smooths spatially the probabilities that each independent variable is nonzero in each regression and indirectly smooths spatially the regressi...
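
To illustrate how an Ising prior can favor spatially smooth inclusion indicators, here is a minimal sketch of one common form of the prior on a rectangular lattice. The function name `ising_log_prior`, the interaction parameter `theta`, and the four-neighbor structure are assumptions for illustration; the abstract does not state the paper's exact parameterization.

```python
def ising_log_prior(gamma, theta):
    """Unnormalized log prior for a 2-D grid of 0/1 inclusion indicators.

    Assumed form (not taken from the paper): add `theta` for every pair of
    horizontally or vertically adjacent sites whose indicators agree.
    """
    rows, cols = len(gamma), len(gamma[0])
    agreements = 0
    for r in range(rows):
        for c in range(cols):
            # Count each neighboring pair once: look down and to the right.
            if r + 1 < rows and gamma[r][c] == gamma[r + 1][c]:
                agreements += 1
            if c + 1 < cols and gamma[r][c] == gamma[r][c + 1]:
                agreements += 1
    return theta * agreements


smooth = [[1, 1], [1, 1]]   # every neighboring pair agrees
checker = [[1, 0], [0, 1]]  # no neighboring pair agrees
print(ising_log_prior(smooth, 0.5), ising_log_prior(checker, 0.5))
```

With a positive `theta`, configurations in which neighboring regressions share the same indicator receive higher prior mass, which is the smoothing effect the abstract describes.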
-
Authors: Javaras, Kristin N.; Ripley, Brian D.
Affiliations: University of Oxford
Abstract: Likert attitude data consist of responses to favorable and unfavorable statements about an entity, where responses fall into ordered categories ranging from disagreement to agreement. Social science and marketing researchers frequently use data of this type to measure attitudes toward an entity such as a policy or product. We focus on data on American and British attitudes toward their respective nations (national pride). We introduce a multidimensional unfolding model (MUM) to describe the ...
-
Authors: Wang, Lifeng; Shen, Xiaotong
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: Binary support vector machines (SVMs) have been proven to deliver high performance. In multiclass classification, however, issues remain with respect to variable selection. One challenging issue is classification and variable selection when the variables number in the thousands, greatly exceeding the size of the training sample. This often occurs in genomics classification. To meet the challenge, this article proposes a novel multiclass support vector machine, which performs classi...
-
Authors: Cooner, Freda; Banerjee, Sudipto; Carlin, Bradley P.; Sinha, Debajyoti
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Medical University of South Carolina
Abstract: With rapid improvements in medical treatment and health care, many datasets dealing with time to relapse or death now reveal a substantial portion of patients who are cured (i.e., who never experience the event). Extended survival models called cure rate models account for the probability of a subject being cured and can be broadly classified into the classical mixture models of Berkson and Gage (BG type) or the stochastic tumor models pioneered by Yakovlev and extended to a hierarchical frame...
-
Authors: Hoff, Peter D.
Affiliations: University of Washington; University of Washington Seattle
Abstract: Many multivariate data-analysis techniques for an m x n matrix Y are related to the model Y = M + E, where Y is an m x n matrix of full rank and M is an unobserved mean matrix of rank K < (m ∧ n). Typically the rank of M is estimated in a heuristic way and then the least-squares estimate of M is obtained via the singular value decomposition of Y, yielding an estimate that can have a very high variance. In this article we suggest a model-based alternative to the preceding approach by...
-
Authors: Keller-McNulty, Sallie
Affiliations: Rice University
Abstract: Science, engineering, technology, and people: these are the ingredients that must come together to support the growing complexity of today's global challenges, ranging from international security to space exploration. As scientists and engineers, it is essential that we develop the means to put our work into a decision context for policy makers; otherwise, our efforts will only inform the writers of textbooks and not the leaders who shape the world within which we live. Statisticians must step ...
-
Authors: Chan, Hock Peng; Zhang, Nancy Ruonan
Affiliations: National University of Singapore; Stanford University
Abstract: We examine scan statistics for one-dimensional marked Poisson processes. Such statistics tabulate the maximum weighted count of event occurrences within a window of predetermined width over all windows within an observed interval. We derive analytical formulas and also give an importance sampling method for approximating the tail probabilities of scan statistics. Because high-throughput genomic sequencing has led to the availability of massive amounts of biomolecular sequence data, it is often...
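
The definition of a scan statistic in this abstract can be made concrete with a short sketch. It computes the maximum total weight of events falling in any closed window of fixed width. The function name `scan_statistic` and its arguments are hypothetical, and the sketch assumes nonnegative weights, so that only windows starting at an event time need to be checked. It does not cover the tail-probability approximations the article derives.

```python
from bisect import bisect_right
from itertools import accumulate


def scan_statistic(times, weights, width):
    """Maximum total weight of events in any window [t, t + width].

    Assumes nonnegative weights: sliding a window right until its left edge
    meets an event never lowers its total, so event-aligned windows suffice.
    """
    if not times:
        return 0.0
    order = sorted(range(len(times)), key=lambda i: times[i])
    ts = [times[i] for i in order]
    ws = [weights[i] for i in order]
    prefix = [0.0] + list(accumulate(ws))  # prefix[k] = sum of first k weights
    best = 0.0
    for i, t in enumerate(ts):
        j = bisect_right(ts, t + width)  # first index with time > t + width
        best = max(best, prefix[j] - prefix[i])
    return best


# Events at integer positions; the heavier events cluster near 10 to 13.
print(scan_statistic([0, 1, 2, 10, 11, 12, 13], [1, 1, 1, 2, 2, 2, 2], 2))
```

Sorting followed by a binary search per event gives O(n log n) time, which matters for the long sequences the abstract mentions.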