-
Authors: Pannekoek, Jeroen; Shlomo, Natalie; De Waal, Ton
Affiliations: University of Manchester
Abstract: A common problem faced by statistical institutes is that data may be missing from collected data sets. The typical way to overcome this problem is to impute the missing data. The problem of imputing missing data is complicated by the fact that statistical data often have to satisfy certain edit rules and that values of variables across units sometimes have to sum up to known totals. For numerical data, edit rules are most often formulated as linear restrictions on the variables. For example, f...
-
Authors: Konomi, Bledar A.; Dhavala, Soma S.; Huang, Jianhua Z.; Kundu, Subrata; Huitink, David; Liang, Hong; Ding, Yu; Mallick, Bani K.
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specif...
-
Authors: Quick, Harrison; Banerjee, Sudipto; Carlin, Bradley P.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: Advances in Geographical Information Systems (GIS) have led to the enormous recent burgeoning of spatial-temporal databases and associated statistical modeling. Here we depart from the rather rich literature in space-time modeling by considering the setting where space is discrete (e.g., aggregated data over regions), but time is continuous. Our major objective in this application is to carry out inference on gradients of a temporal process in our data set of monthly county level asthma hospi...
-
Authors: Ye, Zhi-Sheng; Hong, Yili; Xie, Yimeng
Affiliations: Hong Kong Polytechnic University; Virginia Polytechnic Institute & State University
Abstract: The main objective of accelerated life tests (ALTs) is to predict fraction failings of products in the field. However, there are often discrepancies between the predicted fraction failing from the lab testing data and that from the field failure data, due to the yet unobserved heterogeneities in usage and operating conditions. Most previous research on ALT planning and data analysis ignores the discrepancies, resulting in inferior test plans and biased predictions. In this paper we model the h...
-
Authors: Lin, Winston
Affiliations: University of California System; University of California Berkeley
Abstract: Freedman [Adv. in Appl. Math. 40 (2008) 180-193; Ann. Appl. Stat. 2 (2008) 176-196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments, using Neyman's model for randomization inference. Contrary to conventional wisdom, he argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This paper shows that in sufficiently large samples, those problems are either minor or easily ...
-
Authors: Rusch, Thomas; Hofmarcher, Paul; Hatzinger, Reinhold; Hornik, Kurt
Affiliations: Vienna University of Economics & Business; Johannes Kepler University Linz
Abstract: The WikiLeaks Afghanistan war logs contain nearly 77,000 reports of incidents in the US-led Afghanistan war, covering the period from January 2004 to December 2009. The recent growth of data on complex social systems and the potential to derive stories from them has shifted the focus of journalistic and scientific attention increasingly toward data-driven journalism and computational social science. In this paper we advocate the usage of modern statistical methods for problems of data journali...
-
Authors: Gruhl, Jonathan; Erosheva, Elena A.; Crane, Paul K.
Affiliations: University of Washington; University of Washington Seattle; Harborview Medical Center
Abstract: Multivariate data that combine binary, categorical, count and continuous outcomes are common in the social and health sciences. We propose a semiparametric Bayesian latent variable model for multivariate data of arbitrary type that does not require specification of conditional distributions. Drawing on the extended rank likelihood method by Hoff [Ann. Appl. Stat. 1 (2007) 265-283], we develop a semiparametric approach for latent variable modeling with mixed outcomes and propose associated Mark...
-
Authors: Castruccio, Stefano; Stein, Michael L.
Affiliations: University of Chicago
Abstract: Global climate models aim to reproduce physical processes on a global scale and predict quantities such as temperature given some forcing inputs. We consider climate ensembles made of collections of such runs with different initial conditions and forcing scenarios. The purpose of this work is to show how the simulated temperatures in the ensemble can be reproduced (emulated) with a global space/time statistical model that addresses the issue of capturing nonstationarities in latitude more effe...
-
Authors: Yau, Christopher; Holmes, Christopher C.
Affiliations: Imperial College London; University of Oxford
Abstract: This paper is concerned with statistical methods for the segmental classification of linear sequence data where the task is to segment and classify the data according to an underlying hidden discrete state sequence. Such analysis is commonplace in the empirical sciences including genomics, finance and speech processing. In particular, we are interested in answering the following question: given data y and a statistical model pi(x, y) of the hidden states x, what should we report as the predict...
-
Authors: Krafty, Robert T.; Hall, Martica
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: Although many studies collect biomedical time series signals from multiple subjects, there is a dearth of models and methods for assessing the association between frequency domain properties of time series and other study outcomes. This article introduces the random Cramér representation as a joint model for collections of time series and static outcomes where power spectra are random functions that are correlated with the outcomes. A canonical correlation analysis between cepstral coefficient...