-
Authors: Chen, Kehui; Mueller, Hans-Georg
Affiliations: University of California System; University of California Davis
Abstract: Motivated by the conditional growth charts problem, we develop a method for conditional quantile analysis when predictors take values in a functional space. The method proposed aims at estimating conditional distribution functions under a generalized functional regression framework. This approach facilitates balancing of model flexibility and the curse of dimensionality for the infinite dimensional functional predictors. Its good performance in comparison with other methods, both for sparsely ...
-
Authors: Neill, Daniel B.
Affiliations: Carnegie Mellon University
Abstract: We propose a new fast subset scan approach for accurate and computationally efficient event detection in massive data sets. We treat event detection as a search over subsets of data records, finding the subset which maximizes some score function. We prove that many commonly used functions (e.g. Kulldorff's spatial scan statistic and extensions) satisfy the linear time subset scanning property, enabling exact and efficient optimization over subsets. In the spatial setting, we demonstrate that...
-
Authors: Yekutieli, Daniel
Affiliations: Tel Aviv University
Abstract: We address the problem of providing inference from a Bayesian perspective for parameters selected after viewing the data. We present a Bayesian framework for providing inference for selected parameters, based on the observation that providing Bayesian inference for selected parameters is a truncated data problem. We show that if the prior for the parameter is non-informative, or if the parameter is a fixed unknown constant, then it is necessary to adjust the Bayesian inference for selection....
-
Authors: Cho, Haeran; Fryzlewicz, Piotr
Affiliations: University of London; London School Economics & Political Science
Abstract: The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly spurious) high correlations between the variables, which result in marginal correlation being unreliable as a measure of association between the variables and the response. We propose a new way of measuring the contribution of each variable to the response whic...
-
Authors: Fan, Jianqing; Guo, Shaojun; Hao, Ning
Affiliations: Princeton University; Chinese Academy of Sciences; University of Arizona
Abstract: Variance estimation is a fundamental problem in statistical modelling. In ultrahigh dimensional linear regression where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the...
-
Authors: Lee, Stephen M. S.
Affiliations: University of Hong Kong
Abstract: We consider the general problem of constructing confidence regions for, possibly multi-dimensional, parameters when we have available more than one approach for the construction. These approaches may be motivated by different model assumptions, different levels of approximation, different settings of tuning parameters or different Monte Carlo algorithms. Their effectiveness is often governed by different sets of conditions which are difficult to vindicate in practice. We propose two procedures...
-
Authors: Farrington, C. Paddy; Unkel, Steffen; Anaya-Izquierdo, Karim
Affiliations: Open University - UK; University of London; London School of Hygiene & Tropical Medicine
Abstract: The relative frailty variance among survivors provides a readily interpretable measure of how the heterogeneity of a population, as represented by a frailty model, evolves over time. We discuss the properties of the relative frailty variance, show that it characterizes frailty distributions and that, suitably rescaled, it may be used to compare patterns of dependence across models and data sets. In shared frailty models, the relative frailty variance is closely related to the cross-ratio fun...
-
Authors: Casella, G.; Roberts, G. Z.
Abstract: Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n^3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new appr...
-
Authors: van Erven, Tim; Grunwald, Peter; de Rooij, Steven
Affiliations: University of Cambridge
Abstract: Prediction and estimation based on Bayesian model selection and model averaging, and derived methods such as the Bayesian information criterion BIC, do not always converge at the fastest possible rate. We identify the catch-up phenomenon as a novel explanation for the slow convergence of Bayesian methods, which inspires a modification of the Bayesian predictive distribution, called the switch distribution. When used as an adaptive estimator, the switch distribution does achieve optimal cumul...