-
Authors: Zou, H; Hastie, T
Affiliations: Stanford University
-
Authors: Berger, YG; Skinner, CJ
Affiliations: University of Southampton
Abstract: The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
-
Authors: Dellaportas, P; Tarantola, C
Affiliations: University of Pavia; Athens University of Economics & Business
Abstract: We deal with contingency table data that are used to examine the relationships between a set of categorical variables or factors. We assume that such relationships can be adequately described by the conditional independence structure that is imposed by an undirected graphical model. If the contingency table is large, a desirable simplified interpretation can be achieved by combining some categories, or levels, of the factors. We introduce conditions under which such an operation does not alte...
-
Authors: Hall, P; Samworth, RJ
Affiliations: University of Cambridge; Australian National University
Abstract: It is shown that bagging, a computationally intensive method, asymptotically improves the performance of nearest neighbour classifiers provided that the resample size is less than 69% of the actual sample size, in the case of with-replacement bagging, or less than 50% of the sample size, for without-replacement bagging. However, for larger sampling fractions there is no asymptotic difference between the risk of the regular nearest neighbour classifier and its bagged version. In particular, nei...
-
Authors: Johnson, VE
Affiliations: University of Texas System; UTMD Anderson Cancer Center
Abstract: Traditionally, the use of Bayes factors has required the specification of proper prior distributions on model parameters that are implicit to both null and alternative hypotheses. I describe an approach to defining Bayes factors based on modelling test statistics. Because the distributions of test statistics do not depend on unknown model parameters, this approach eliminates much of the subjectivity that is normally associated with the definition of Bayes factors. For standard test statistics,...
-
Authors: Hall, P; Marron, JS; Neeman, A
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Australian National University
Abstract: High dimension, low sample size data are emerging in various areas of science. We find a common structure underlying many such data sets by using a non-standard type of asymptotics: the dimension tends to infinity while the sample size is fixed. Our analysis shows a tendency for the data to lie deterministically at the vertices of a regular simplex. Essentially all the randomness in the data appears only as a random rotation of this simplex. This geometric representation is used to obtain seve...
-
Authors: Owen, AB
Affiliations: Stanford University
Abstract: In high throughput genomic work, a very large number d of hypotheses are tested based on n << d data samples. The large number of tests necessitates an adjustment for false discoveries in which a true null hypothesis was rejected. The expected number of false discoveries is easy to obtain. Dependences between the hypothesis tests greatly affect the variance of the number of false discoveries. Assuming that the tests are independent gives an inadequate variance formula. The paper presents a var...
-
Authors: Baddeley, A; Turner, R; Moller, J; Hazelton, M
Affiliations: University of Western Australia; University of New Brunswick; Aalborg University
Abstract: We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools ...
-
Authors: Huang, WZ; Fitzmaurice, GM
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard University Medical Affiliates; Brigham & Women's Hospital
Abstract: The paper considers modelling, estimating and diagnostically verifying the response process generating longitudinal data, with emphasis on association between repeated measures from unbalanced longitudinal designs. Our model is based on separate specifications of the moments for the mean, standard deviation and correlation, with different components possibly sharing common parameters. We propose a general class of correlation structures that comprise random effects, measurement errors and a s...
-
Authors: Chen, YG; Xie, JY; Liu, JS
Affiliations: Harvard University; Duke University
Abstract: Motivated by the statistical inference problem in population genetics, we present a new sequential importance sampling with resampling strategy. The idea of resampling is key to the recent surge of popularity of sequential Monte Carlo methods in the statistics and engineering communities, but existing resampling techniques do not work well for coalescent-based inference problems in population genetics. We develop a new method called 'stopping-time resampling', which allows us to compare parti...