-
Authors: Alfons, Andreas; Croux, Christophe; Gelper, Sarah
Affiliations: KU Leuven; Erasmus University Rotterdam; Erasmus University Rotterdam - Excl Erasmus MC
Abstract: Sparse model estimation is a topic of high importance in modern data analysis due to the increasing availability of data sets with a large number of variables. Another common problem in applied statistics is the presence of outliers in the data. This paper combines robust regression and sparse model estimation. A robust and sparse estimator is introduced by adding an L1 penalty on the coefficient estimates to the well-known least trimmed squares (LTS) estimator. The breakdown point of this sp...
-
Authors: Kleiber, William; Katz, Richard W.; Rajagopalan, Balaji
Affiliations: University of Colorado System; University of Colorado Boulder; National Center Atmospheric Research (NCAR) - USA; University of Colorado System; University of Colorado Boulder
Abstract: Spatiotemporal simulation of minimum and maximum temperature is a fundamental requirement for climate impact studies and hydrological or agricultural models. Particularly over regions with variable orography, these simulations are difficult to produce due to terrain-driven nonstationarity. We develop a bivariate stochastic model for the spatiotemporal field of minimum and maximum temperature. The proposed framework splits the bivariate field into two components of local climate and weather. Th...
-
Authors: Imai, Kosuke; Ratkovic, Marc
Affiliations: Princeton University
Abstract: When evaluating the efficacy of social programs and medical treatments using randomized experiments, the estimated overall average causal effect alone is often of limited value and the researchers must investigate when the treatments do and do not work. Indeed, the estimation of treatment effect heterogeneity plays an essential role in (1) selecting the most effective treatment from a large number of available treatments, (2) ascertaining subpopulations for which a treatment is effective or ha...
-
Authors: Gramacy, Robert B.; Taddy, Matt; Wild, Stefan M.
Affiliations: University of Chicago; United States Department of Energy (DOE); Argonne National Laboratory; University of Chicago; University of Chicago
Abstract: We investigate an application in the automatic tuning of computer codes, an area of research that has come to prominence alongside the recent rise of distributed scientific processing and heterogeneity in high-performance computing environments. Here, the response function is nonlinear and noisy and may not be smooth or stationary. Clearly needed are variable selection, decomposition of influence, and analysis of main and secondary effects for both real-valued and binary inputs and outputs. Ou...
-
Authors: Shen, Ronglai; Wang, Sijian; Mo, Qianxing
Affiliations: Memorial Sloan Kettering Cancer Center; University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison; Baylor College of Medicine
Abstract: High resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation and gene expression associated with a disease. An integrated genomic profiling approach measures multiple omics data types simultaneously in the same set of biological samples. Such an approach renders an integrated data resolution that would not be available with any single data type. In this study, we use penalized latent variable regre...
-
Authors: Quick, Harrison; Banerjee, Sudipto; Carlin, Bradley P.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: Advances in Geographical Information Systems (GIS) have led to the enormous recent burgeoning of spatial-temporal databases and associated statistical modeling. Here we depart from the rather rich literature in space-time modeling by considering the setting where space is discrete (e.g., aggregated data over regions), but time is continuous. Our major objective in this application is to carry out inference on gradients of a temporal process in our data set of monthly county level asthma hospi...
-
Authors: Lin, Winston
Affiliations: University of California System; University of California Berkeley
Abstract: Freedman [Adv. in Appl. Math. 40 (2008) 180-193; Ann. Appl. Stat. 2 (2008) 176-196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments, using Neyman's model for randomization inference. Contrary to conventional wisdom, he argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This paper shows that in sufficiently large samples, those problems are either minor or easily ...
-
Authors: Krafty, Robert T.; Hall, Martica
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: Although many studies collect biomedical time series signals from multiple subjects, there is a dearth of models and methods for assessing the association between frequency domain properties of time series and other study outcomes. This article introduces the random Cramér representation as a joint model for collections of time series and static outcomes where power spectra are random functions that are correlated with the outcomes. A canonical correlation analysis between cepstral coefficient...
-
Authors: Yuan, Ying; Zhu, Hongtu; Styner, Martin; Gilmore, John H.; Marron, J. S.
Affiliations: St Jude Children's Research Hospital; University of North Carolina; University of North Carolina Chapel Hill
Abstract: Diffusion tensor imaging provides important information on tissue structure and orientation of fiber tracts in brain white matter in vivo. It results in diffusion tensors, which are 3 × 3 symmetric positive definite (SPD) matrices, along fiber bundles. This paper develops a functional data analysis framework to model diffusion tensors along fiber tracts as functional data in a Riemannian manifold with a set of covariates of interest, such as age and gender. We propose a statistical model with ...