-
Authors: Strawderman, William E.; Rukhin, Andrew L.
Affiliations: Rutgers University System; Rutgers University New Brunswick; National Institute of Standards & Technology (NIST) - USA
Abstract: Several procedures that are designed to reduce nonconformity in interlaboratory studies by shrinking data towards a consensus weighted mean are suggested. Some of them are shown to have a smaller quadratic risk than the vector sample means. Shrinkage towards a weighted mean in a random-effects model and a statistic appearing in models which allow for systematic errors are also considered. The results are illustrated by two examples of collaborative studies.
-
Authors: Copas, John; Eguchi, Shinto
Affiliations: University of Warwick; Research Organization of Information & Systems (ROIS); Institute of Statistical Mathematics (ISM) - Japan
Abstract: In likelihood inference we usually assume that the model is fixed and then base inference on the corresponding likelihood function. Often, however, the choice of model is rather arbitrary, and there may be other models which fit the data equally well. We study robustness of likelihood inference over such 'statistically equivalent' models and suggest a simple 'envelope likelihood' to capture this aspect of model uncertainty. Robustness depends critically on how we specify the parameter of inter...
-
Authors: Ranjan, Roopesh; Gneiting, Tilmann
Affiliations: Ruprecht Karls University Heidelberg; University of Washington; University of Washington Seattle
Abstract: Linear pooling is by far the most popular method for combining probability forecasts. However, any non-trivial weighted average of two or more distinct, calibrated probability forecasts is necessarily uncalibrated and lacks sharpness. In view of this, linear pooling requires recalibration, even in the ideal case in which the individual forecasts are calibrated. Towards this end, we propose a beta-transformed linear opinion pool for the aggregation of probability forecasts from distinct, calibr...
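The claim that a linear pool of calibrated forecasts is itself uncalibrated can be checked by hand on a small example. The sketch below is an illustration only and is not taken from the paper: the two-signal toy world, the probability table `q`, and the function names are all hypothetical. It uses exact fractions to compare each issued forecast value with the true conditional event probability.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# Hypothetical toy world (an assumption for illustration): two independent
# fair binary signals A and B, so each state (a, b) has probability 1/4.
# q[(a, b)] = P(Y = 1 | A = a, B = b).
q = {(0, 0): F(1, 10), (0, 1): F(1, 2), (1, 0): F(1, 2), (1, 1): F(9, 10)}
states = list(product((0, 1), repeat=2))

def forecast_1(a, b):
    # Forecaster 1 observes only A and reports P(Y = 1 | A = a).
    return (q[(a, 0)] + q[(a, 1)]) / 2

def forecast_2(a, b):
    # Forecaster 2 observes only B and reports P(Y = 1 | B = b).
    return (q[(0, b)] + q[(1, b)]) / 2

def linear_pool(a, b, w=F(1, 2)):
    # Weighted average of the two forecasts (equal weights by default).
    return w * forecast_1(a, b) + (1 - w) * forecast_2(a, b)

def calibration_table(forecast):
    """Map each issued forecast value p to the true P(Y = 1 | forecast = p)."""
    total = defaultdict(F)    # sum of q over the states that issue p
    count = defaultdict(int)  # number of (equally likely) states that issue p
    for s in states:
        p = forecast(*s)
        total[p] += q[s]
        count[p] += 1
    return {p: total[p] / count[p] for p in total}

if __name__ == "__main__":
    for name, f in [("forecast 1", forecast_1),
                    ("forecast 2", forecast_2),
                    ("linear pool", linear_pool)]:
        for p, truth in sorted(calibration_table(f).items()):
            print(f"{name}: issued {p}, actual frequency {truth}")
```

In this toy setting each individual forecast matches its conditional frequency exactly, while the pooled forecast of 3/10 corresponds to an actual frequency of 1/10 and the pooled 7/10 to 9/10. The pool is pulled towards the middle, which is the kind of miscalibration that a recalibrating transform is meant to correct.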
-
Authors: Chun, Hyonho; Keles, Suenduez
Affiliations: University of Wisconsin System; University of Wisconsin Madison
Abstract: Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with par...
-
Authors: Ning, Jing; Qin, Jing; Shen, Yu
Affiliations: University of Texas System; University of Texas Health Science Center Houston; University of Texas School Public Health; National Institutes of Health (NIH) - USA; NIH National Institute of Allergy & Infectious Diseases (NIAID); University of Texas System; UTMD Anderson Cancer Center
Abstract: Testing the equality of two survival distributions can be difficult in a prevalent cohort study when non-random sampling of subjects is involved. Owing to the biased sampling scheme, the independent censoring assumption is often violated. Although the issues about biased inference caused by length-biased sampling have been widely recognized in the statistical, epidemiological and economical literature, there is no satisfactory solution for efficient two-sample testing. We propose an asymptotic...
-
Authors: Hall, Peter; Titterington, D. M.; Xue, Jing-Hao
Affiliations: University of London; University College London; University of Melbourne; University of Glasgow
Abstract: Many contemporary classifiers are constructed to provide good performance for very high dimensional data. However, an issue that is at least as important as good classification is determining which of the many potential variables provide key information for good decisions. Responding to this issue can help us to determine which aspects of the data-generating mechanism (e.g. which genes in a genomic study) are of greatest importance in terms of distinguishing between populations. We introduce ti...
-
Authors: Drost, Feike C.; van den Akker, Ramon; Werker, Bas J. M.
Affiliations: Tilburg University
Abstract: Integer-valued auto-regressive (INAR) processes have been introduced to model non-negative integer-valued phenomena that evolve over time. The distribution of an INAR(p) process is essentially described by two parameters: a vector of auto-regression coefficients and a probability distribution on the non-negative integers, called an immigration or innovation distribution. Traditionally, parametric models are considered where the innovation distribution is assumed to belong to a parametric famil...
-
Authors: Garcia-Escudero, L. A.; Gordaliza, A.; San Martin, R.; Van Aelst, S.; Zamar, R.
Affiliations: Universidad de Valladolid; Ghent University; University of British Columbia
Abstract: Non-hierarchical clustering methods are frequently based on the idea of forming groups around 'objects'. The main exponent of this class of methods is the k-means method, where these objects are points. However, clusters in a data set may often be due to certain relationships between the measured variables. For instance, we can find linear structures such as straight lines and planes, around which the observations are grouped in a natural way. These structures are not well represented by point...
-
Authors: Guindani, Michele; Mueller, Peter; Zhang, Song
Affiliations: University of New Mexico; University of Texas System; UTMD Anderson Cancer Center; University of Texas System; University of Texas Southwestern Medical Center
Abstract: We discuss a Bayesian discovery procedure for multiple-comparison problems. We show that, under a coherent decision theoretic framework, a loss function combining true positive and false positive counts leads to a decision rule that is based on a threshold of the posterior probability of the alternative. Under a semiparametric model for the data, we show that the Bayes rule can be approximated by the optimal discovery procedure, which was recently introduced by Storey. Improving the approximat...
-
Authors: Hall, Peter; Maiti, Tapabrata
Affiliations: Michigan State University; University of Melbourne
Abstract: We develop a general non-parametric approach to the analysis of clustered data via random effects. Assuming only that the link function is known, the regression functions and the distributions of both cluster means and observation errors are treated non-parametrically. Our argument proceeds by viewing the observation error at the cluster mean level as though it were a measurement error in an errors-in-variables problem, and using a deconvolution argument to access the distribution of the clust...