-
Authors: Bühlmann, P; Yu, B
Affiliations: Swiss Federal Institutes of Technology Domain; ETH Zurich; University of California System; University of California Berkeley
Abstract: Bagging is one of the most effective computationally intensive procedures to improve on unstable estimators or classifiers, useful especially for high dimensional data set problems. Here we formalize the notion of instability and derive theoretical results to analyze the variance reduction effect of bagging (or variants thereof) in mainly hard decision problems, which include estimation after testing in regression and decision trees for regression functions and classifiers. Hard decisions crea...
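As an illustration only (not the authors' procedure), the idea of bagging a hard-decision estimator can be sketched as follows. The estimator here is a hypothetical "estimation after testing" rule that reports the sample mean only when it exceeds a threshold `c` in absolute value; the function names, the threshold, and the number of bootstrap replicates are all assumptions made for this sketch.

```python
import random
from statistics import mean


def hard_threshold_estimator(sample, c):
    """Hypothetical hard-decision rule: keep the sample mean only if |mean| > c."""
    m = mean(sample)
    return m if abs(m) > c else 0.0


def bagged_estimator(sample, c, n_boot=200, seed=0):
    """Average the hard-decision estimator over bootstrap resamples (bagging)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(hard_threshold_estimator(resample, c))
    return mean(estimates)
```

Averaging over resamples replaces the 0-or-mean jump of the hard rule with a value that varies more smoothly with the data, which is the kind of smoothing effect the abstract refers to.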
-
Authors: Friedman, JH; Stuetzle, W
Affiliations: Stanford University; University of Washington; University of Washington Seattle
Abstract: If there ever was a tool that could stimulate the imagination and profit from the intuition and creativity of John Tukey, it was computer graphics. John always saw graphics as being central to exploratory data analysis: Since the aim of exploratory data analysis is to learn what seems to be, it should be no surprise that pictures play a vital role in doing it well. There is nothing better than a picture for making you think of questions you had forgotten to ask (even mentally). Much of his wor...
-
Authors: Bar-Lev, SK; Bshouty, D; Letac, G
Affiliations: University of Haifa; Technion Israel Institute of Technology; Universite de Toulouse; Universite Toulouse III - Paul Sabatier; Centre National de la Recherche Scientifique (CNRS)
Abstract: Consider an NEF F on the real line parametrized by $\theta \in \Theta$. Also let $\theta_0$ be a specified value of $\theta$. Consider the test of size $\alpha$ for a simple hypothesis $H_0: \theta = \theta_0$ versus the two-sided alternative $H_1: \theta \neq \theta_0$. A UMPU test of size $\alpha$ then exists for any given $\alpha$. Suppose that F is continuous. Therefore the UMPU test is nonrandomized and then becomes comparable with the generalized likelihood ratio test (GLR). Under mild conditions we show ...
-
Authors: Stein, ML
Affiliations: University of Chicago
Abstract: When predicting the value of a stationary random field at a location x in some region in which one has a large number of observations, it may be difficult to compute the optimal predictor. One simple way to reduce the computational burden is to base the predictor only on those observations nearest to x. As long as the number of observations used in the predictor is sufficiently large, one might generally expect the best predictor based on these observations to be nearly optimal relative to the...
-
Authors: Dey, A; Suen, CY
Affiliations: Indian Statistical Institute; Indian Statistical Institute Delhi; University System of Ohio; Cleveland State University
Abstract: Finite projective geometry is used to obtain fractional factorial plans for m-level symmetrical factorial experiments, where m is a prime or a prime power. Under a model that includes the mean, all main effects and a specified set of two-factor interactions, the plans are shown to be universally optimal within the class of all plans involving the same number of runs.
-
Authors: Brown, LD; Cai, TT; DasGupta, A
Affiliations: University of Pennsylvania; Purdue University System; Purdue University
Abstract: We address the classic problem of interval estimation of a binomial proportion. The Wald interval $\hat{p} \pm z_{\alpha/2}\, n^{-1/2} (\hat{p}(1 - \hat{p}))^{1/2}$ is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probab...
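To make the Wald formula above concrete, the following is a minimal sketch that computes the interval and its exact coverage probability for a given n and p by summing binomial probabilities. The function names and the default 95% critical value are assumptions for illustration; the sketch does not reproduce the paper's asymptotic expansions or its alternative intervals.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile z_{alpha/2} for alpha = 0.05


def wald_interval(successes, n, z=Z_95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wald_coverage(n, p, z=Z_95):
    """Exact coverage: total binomial probability of outcomes whose interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = wald_interval(x, n, z)
        if lo <= p <= hi:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total
```

For example, `wald_coverage(20, 0.1)` falls well below the nominal 0.95, in part because the outcome x = 0 yields the degenerate interval [0, 0], which cannot contain p.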
-
Authors: Rosenberg, D; Solan, E; Vieille, N
Affiliations: Universite Paris 13; Tel Aviv University; Northwestern University; Hautes Etudes Commerciales (HEC) Paris
Abstract: A Blackwell epsilon-optimal strategy in a Markov Decision Process is a strategy that is epsilon-optimal for every discount factor sufficiently close to 1. We prove the existence of Blackwell epsilon-optimal strategies in finite Markov Decision Processes with partial observation.
-
Authors: Hall, P; Peng, L; Tajvidi, N
Affiliations: Australian National University; Lund University
Abstract: A feature that distinguishes extreme-value contexts from more conventional statistical problems is that in the former we often wish to make predictions well beyond the range of the data. For example, one might have a 10-year sequence of observations of a phenomenon, and wish to make forecasts for the next 20 to 30 years. It is generally unclear how such long ranges of extrapolation affect prediction. In the present paper, and for extremes from a distribution with regularly varying tails at inf...
-
Authors: Efromovich, S
Affiliations: University of New Mexico
-
Authors: Speed, TP
Affiliations: University of California System; University of California Berkeley
Abstract: John Tukey connected the theory underlying simple random sampling without replacement, cumulants, expected mean squares and spectrum analysis. He gave us one degree of freedom for nonadditivity, and he pioneered finite population models for understanding ANOVA. He wrote widely on the nature and purpose of ANOVA, and he illustrated his approach. In this appreciation of Tukey's work on ANOVA we summarize and comment on his contributions, and refer to some relevant recent literature.