-
Authors: Berger, Yves G.; Rao, J. N. K.
Affiliations: University of Reading; Carleton University
Abstract: Imputation is commonly used to compensate for item non-response in sample surveys. If we treat the imputed values as if they are true values, and then compute the variance estimates by using standard methods, such as the jackknife, we can seriously underestimate the true variances. We propose a modified jackknife variance estimator which is defined for any without-replacement unequal probability sampling design in the presence of imputation and non-negligible sampling fraction. Mean, ratio and...
-
Authors: Slud, EV; Maiti, T
Affiliations: University System of Maryland; University of Maryland College Park; Iowa State University
Abstract: The problem of accurately estimating the mean-squared error of small area estimators within a Fay-Herriot normal error model is studied theoretically in the common setting where the model is fitted to a logarithmically transformed response variable. For bias-corrected empirical best linear unbiased predictor small area point estimators, mean-squared error formulae and estimators are provided, with biases of smaller order than the reciprocal of the number of small areas. The performance of thes...
-
Authors: Zou, H; Hastie, T
Affiliations: Stanford University
-
Authors: Berger, YG; Skinner, CJ
Affiliations: University of Southampton
Abstract: The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
-
Authors: Dellaportas, P; Tarantola, C
Affiliations: University of Pavia; Athens University of Economics & Business
Abstract: We deal with contingency table data that are used to examine the relationships between a set of categorical variables or factors. We assume that such relationships can be adequately described by the conditional independence structure that is imposed by an undirected graphical model. If the contingency table is large, a desirable simplified interpretation can be achieved by combining some categories, or levels, of the factors. We introduce conditions under which such an operation does not alte...
-
Authors: Hall, P; Samworth, RJ
Affiliations: University of Cambridge; Australian National University
Abstract: It is shown that bagging, a computationally intensive method, asymptotically improves the performance of nearest neighbour classifiers provided that the resample size is less than 69% of the actual sample size, in the case of with-replacement bagging, or less than 50% of the sample size, for without-replacement bagging. However, for larger sampling fractions there is no asymptotic difference between the risk of the regular nearest neighbour classifier and its bagged version. In particular, nei...
-
Authors: Johnson, VE
Affiliations: University of Texas System; UTMD Anderson Cancer Center
Abstract: Traditionally, the use of Bayes factors has required the specification of proper prior distributions on model parameters that are implicit to both null and alternative hypotheses. I describe an approach to defining Bayes factors based on modelling test statistics. Because the distributions of test statistics do not depend on unknown model parameters, this approach eliminates much of the subjectivity that is normally associated with the definition of Bayes factors. For standard test statistics,...
-
Authors: Hall, P; Marron, JS; Neeman, A
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Australian National University
Abstract: High dimension, low sample size data are emerging in various areas of science. We find a common structure underlying many such data sets by using a non-standard type of asymptotics: the dimension tends to infinity while the sample size is fixed. Our analysis shows a tendency for the data to lie deterministically at the vertices of a regular simplex. Essentially all the randomness in the data appears only as a random rotation of this simplex. This geometric representation is used to obtain seve...
-
Authors: Owen, AB
Affiliations: Stanford University
Abstract: In high throughput genomic work, a very large number d of hypotheses are tested based on n << d data samples. The large number of tests necessitates an adjustment for false discoveries in which a true null hypothesis was rejected. The expected number of false discoveries is easy to obtain. Dependences between the hypothesis tests greatly affect the variance of the number of false discoveries. Assuming that the tests are independent gives an inadequate variance formula. The paper presents a var...
-
Authors: Baddeley, A; Turner, R; Moller, J; Hazelton, M
Affiliations: University of Western Australia; University of New Brunswick; Aalborg University
Abstract: We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools ...