-
Authors: Witten, Daniela M.; Tibshirani, Robert
Affiliations: University of Washington; University of Washington Seattle; Stanford University
Abstract: We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high dimensional setting where p >> n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is ...
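The singularity claim in this abstract can be checked on a toy data set. The sketch below is not from the paper: it uses made-up integer data (p = 5 features, n = 4 observations, K = 2 classes) and the hypothetical helper names `within_class_scatter` and `rank`. It computes the rank in exact fractions to illustrate that the pooled within-class scatter matrix has rank at most n - K, which is less than p here.

```python
from fractions import Fraction

def within_class_scatter(X, y):
    """Pooled within-class scatter: sum over classes of (x - mean_k)(x - mean_k)^T."""
    p = len(X[0])
    S = [[Fraction(0)] * p for _ in range(p)]
    for k in set(y):
        rows = [x for x, label in zip(X, y) if label == k]
        mean = [Fraction(sum(col), len(rows)) for col in zip(*rows)]
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mean)]
            for i in range(p):
                for j in range(p):
                    S[i][j] += d[i] * d[j]
    return S

def rank(M):
    """Matrix rank by Gaussian elimination in exact arithmetic."""
    A = [row[:] for row in M]
    r = 0
    for c in range(len(A[0])):
        # Find a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Toy data, illustrative only: 4 observations, 5 features, 2 classes.
X = [[1, 2, 0, 3, 1],
     [2, 0, 1, 1, 4],
     [0, 1, 3, 2, 2],
     [3, 1, 1, 0, 5]]
y = [0, 0, 1, 1]

S = within_class_scatter(X, y)
n, p, K = len(X), len(X[0]), len(set(y))
print("rank:", rank(S), "bound n - K:", n - K, "dimension p:", p)
```

Because the 5 x 5 matrix `S` is rank deficient, it has no inverse, so a discriminant rule that requires inverting it cannot be evaluated without some form of regularization.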
-
Authors: Ma, Yanyuan; Hart, Jeffrey D.; Janicki, Ryan; Carroll, Raymond J.
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: We consider functional measurement error models, i.e. models where covariates are measured with error and yet no distributional assumptions are made about the mismeasured variable. We propose and study a score-type local test and an orthogonal series-based, omnibus goodness-of-fit test in this context, where no likelihood function is available or calculated, i.e. all the tests are proposed in the semiparametric model framework. We demonstrate that our tests have optimality properties and comput...
-
Authors: Fryzlewicz, P.; Oh, H. -S.
Affiliations: University of London; London School Economics & Political Science; Seoul National University (SNU)
Abstract: Traditional visualization of time series data often consists of plotting the time series values against time and 'connecting the dots'. We propose an alternative, multiscale visualization technique, motivated by the scale-space approach in computer vision. In brief, our method also 'connects the dots' but uses a range of pens of varying thicknesses for this. The resulting multiscale map, which is termed the thick pen transform, corresponds to viewing the time series from a range of distances. ...
-
Author: Tibshirani, Robert
Affiliation: Stanford University
Abstract: In the paper I give a brief review of the basic idea and some history and then discuss some developments since the original paper on regression shrinkage and selection via the lasso.
-
Authors: Yau, C.; Papaspiliopoulos, O.; Roberts, G. O.; Holmes, C.
Affiliations: Pompeu Fabra University; University of Oxford; University of Warwick
Abstract: We propose a flexible non-parametric specification of the emission distribution in hidden Markov models and we introduce a novel methodology for carrying out the computations. Whereas current approaches use a finite mixture model, we argue in favour of an infinite mixture model given by a mixture of Dirichlet processes. The computational framework is based on auxiliary variable representations of the Dirichlet process and consists of a forward-backward Gibbs sampling algorithm of similar compl...
-
Authors: Guillotte, Simon; Perron, Francois; Segers, Johan
Affiliations: Universite Catholique Louvain; University of Prince Edward Island; Universite de Montreal
Abstract: The tail of a bivariate distribution function in the domain of attraction of a bivariate extreme value distribution may be approximated by that of its extreme value attractor. The extreme value attractor has margins that belong to a three-parameter family and a dependence structure which is characterized by a probability measure on the unit interval with mean equal to 1/2, which is called the spectral measure. Inference is done in a Bayesian framework using a censored likelihood approach. A pr...
-
Authors: McCabe, Brendan P. M.; Martin, Gael M.; Harris, David
Affiliations: Monash University; University of Liverpool; University of Melbourne
Abstract: Efficient probabilistic forecasts of integer-valued random variables are derived. The optimality is achieved by estimating the forecast distribution non-parametrically over a given broad model class and proving asymptotic (non-parametric) efficiency in that setting. The method is developed within the context of the integer auto-regressive class of models, which is a suitable class for any count data that can be interpreted as a queue, stock, birth-and-death process or branching process. The th...
-
Authors: Bradic, Jelena; Fan, Jianqing; Wang, Weiwei
Affiliations: Princeton University; University of Texas System; University of Texas Health Science Center Houston
Abstract: In high dimensional model selection problems, penalized least squares approaches have been extensively used. The paper addresses the question of both robustness and efficiency of penalized model selection methods and proposes a data-driven weighted linear combination of convex loss functions, together with a weighted L1-penalty. It is completely data adaptive and does not require prior knowledge of the error distribution. The weighted L1-penalty is used both to ensure the convexity of the penal...
-
Authors: Lindgren, Finn; Rue, Havard; Lindstrom, Johan
Affiliations: Norwegian University of Science & Technology (NTNU); Lund University
Abstract: Continuously indexed Gaussian fields (GFs) are the most important ingredient in spatial statistical modelling and geostatistics. The specification through the covariance function gives an intuitive interpretation of the field properties. On the computational side, GFs are hampered with the big n problem, since the cost of factorizing dense matrices is cubic in the dimension. Although computational power today is at an all time high, this fact seems still to be a computational bottleneck in man...
-
Authors: Bickel, Peter J.; Gel, Yulia R.
Affiliations: University of Waterloo; University of California System; University of California Berkeley
Abstract: The paper addresses a 'large p-small n' problem in a time series framework and considers properties of banded regularization of an empirical autocovariance matrix of a time series process. Utilizing the banded autocovariance matrix enables us to fit a much longer auto-regressive AR(p) model to the observed data than typically suggested by the Akaike information criterion, while controlling how many parameters are to be estimated precisely and the level of accuracy. We present results on asympt...
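As a rough illustration of what banding an empirical autocovariance matrix means, the sketch below builds the Toeplitz matrix of sample autocovariances and sets every entry more than `band` positions off the diagonal to zero. This is a minimal sketch and not necessarily the paper's exact estimator (the abstract is truncated); the function names `sample_autocov` and `banded_autocov_matrix`, the divisor n, and the toy series are all assumptions made for illustration.

```python
def sample_autocov(x, lag):
    """Sample autocovariance at a given lag, using divisor n (an assumption)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

def banded_autocov_matrix(x, dim, band):
    """dim x dim Toeplitz autocovariance matrix with entries beyond `band` zeroed.

    Requires dim <= len(x) so that every lag used has at least one term.
    """
    gamma = [sample_autocov(x, h) for h in range(dim)]
    return [
        [gamma[abs(i - j)] if abs(i - j) <= band else 0.0 for j in range(dim)]
        for i in range(dim)
    ]

# Toy series, illustrative only.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
M = banded_autocov_matrix(x, dim=4, band=1)
for row in M:
    print(row)
```

With `band=1` only the main diagonal and the first off-diagonals survive, so the number of autocovariances that must be estimated stays fixed by the band width rather than growing with the matrix dimension.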