-
Authors: Little, Roderick J.
Affiliations: University of Michigan System; University of Michigan
Abstract: Ronald Fisher was by all accounts a first-rate mathematician, but he saw himself as a scientist, not a mathematician, and he railed against what George Box called (in his Fisher lecture) mathematistry. Mathematics is the indispensable foundation of statistics, but for me the real excitement and value of our subject lies in its application to other disciplines. We should not view statistics as another branch of mathematics and favor mathematical complexity over clarifying, formulating, and solv...
-
Authors: Sadinle, Mauricio; Fienberg, Stephen E.
Affiliations: Carnegie Mellon University
Abstract: We present a probabilistic method for linking multiple datafiles. This task is not trivial in the absence of unique identifiers for the individuals recorded. This is a common scenario when linking census data to coverage measurement surveys for census coverage evaluation, and in general when multiple record systems need to be integrated for posterior analysis. Our method generalizes the Fellegi-Sunter theory for linking records from two datafiles and its modern implementations. The goal of mult...
-
Authors: Haberman, Shelby J.; Sinharay, Sandip
Affiliations: Educational Testing Service (ETS)
Abstract: Generalized residuals are a tool employed in the analysis of contingency tables to examine possible sources of model error. They have typically been applied to log-linear models and to latent-class models. A general approach to generalized residuals is developed for a very general class of models for contingency tables. To illustrate their use, generalized residuals are applied to models based on item response theory (IRT) models. Such models are commonly applied to analysis of standardized ac...
-
Authors: Ryu, Duchwan; Liang, Faming; Mallick, Bani K.
Affiliations: University System of Georgia; Augusta University; Texas A&M University System; Texas A&M University College Station
Abstract: The sea surface temperature (SST) is an important factor of the earth climate system. A deep understanding of SST is essential for climate monitoring and prediction. In general, SST follows a nonlinear pattern in both time and location and can be modeled by a dynamic system which changes with time and location. In this article, we propose a radial basis function network-based dynamic model which is able to catch the nonlinearity of the data and propose to use the dynamically weighted particle ...
-
Authors: Lee, Juhee; Mueller, Peter; Zhu, Yitan; Ji, Yuan
Affiliations: University System of Ohio; Ohio State University; University of Texas System; University of Texas Austin; NorthShore University Health System
Abstract: We propose a nonparametric Bayesian local clustering (NoB-LoC) approach for heterogeneous data. NoB-LoC implements inference for nested clusters as posterior inference under a Bayesian model. Using protein expression data as an example, the NoB-LoC model defines a protein (column) cluster as a set of proteins that give rise to the same partition of the samples (rows). In other words, the sample partitions are nested within protein clusters. The common clustering of the samples gives meaning to...
-
Authors: Jensen, Shane T.; Park, Jared; Braunstein, Alexander F.; McAuliffe, Jon
Affiliations: University of Pennsylvania; University of California System; University of California Berkeley
Abstract: A major challenge for the treatment of human immunodeficiency virus (HIV) infection is the development of therapy-resistant strains. We present a statistical model that quantifies the evolution of HIV populations when exposed to particular therapies. A hierarchical Bayesian approach is used to estimate differences in rates of nucleotide changes between treatment- and control-group sequences. Each group's rates are allowed to vary spatially along the HIV genome. We employ a coalescent structure...
-
Authors: Kim, Young Min; Lahiri, Soumendra N.; Nordman, Daniel J.
Affiliations: Radiation Effects Research Foundation - Japan; North Carolina State University; Iowa State University
Abstract: This article develops a new blockwise empirical likelihood (BEL) method for stationary, weakly dependent time processes, called the progressive block empirical likelihood (PBEL). In contrast to the standard version of BEL, which uses data blocks of constant length for a given sample size and whose performance can depend crucially on the block length selection, this new approach involves a data-blocking scheme where blocks increase in length by an arithmetic progression. Consequently, no block ...
-
Authors: Kunihama, Tsuyoshi; Dunson, David B.
Affiliations: Duke University
Abstract: It is of interest in many applications to study trends over time in relationships among categorical variables, such as age group, ethnicity, religious affiliation, political party, and preference for particular policies. At each time point, a sample of individuals provides responses to a set of questions, with different individuals sampled at each time. In such settings, there tend to be an abundance of missing data and the variables being measured may change over time. At each time point, we ...
-
Authors: Yang, Min; Biedermann, Stefanie; Tang, Elina
Affiliations: University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital; University of Southampton
Abstract: Finding optimal designs for nonlinear models is challenging in general. Although some recent results allow us to focus on a simple subclass of designs for most problems, deriving a specific optimal design still mainly depends on numerical approaches. There is need for a general and efficient algorithm that is more broadly applicable than the current state-of-the-art methods. We present a new algorithm that can be used to find optimal designs with respect to a broad class of optimality criteria...
-
Authors: Cai, Tianxi; Zheng, Yingye
Affiliations: Harvard University; Fred Hutchinson Cancer Center
Abstract: The nested case-control (NCC) design has been widely adopted as a cost-effective solution in many large cohort studies for risk assessment with expensive markers, such as the emerging biologic and genetic markers. To analyze data from NCC studies, conditional logistic regression and maximum likelihood-based methods have been proposed. However, most of these methods either cannot be easily extended beyond the Cox model or require additional modeling assumptions. More generally applicable approa...