-
Authors: Sadinle, Mauricio; Lei, Jing; Wasserman, Larry
Affiliations: University of Washington; University of Washington Seattle; Carnegie Mellon University
Abstract: In most classification tasks, there are observations that are ambiguous and therefore difficult to correctly label. Set-valued classifiers output sets of plausible labels rather than a single label, thereby giving a more appropriate and informative treatment to the labeling of ambiguous instances. We introduce a framework for multiclass set-valued classification, where the classifiers guarantee user-defined levels of coverage or confidence (the probability that the true label is contained in t...
-
Authors: Daouia, Abdelaati; Gijbels, Irene; Stupfler, Gilles
Affiliations: Universite de Toulouse; Universite Toulouse 1 Capitole; Toulouse School of Economics; KU Leuven; KU Leuven; University of Nottingham
Abstract: Quantiles and expectiles of a distribution are found to be useful descriptors of its tail in the same way as the median and mean are related to its central behavior. This article considers a valuable alternative class to expectiles, called extremiles, which parallels the class of quantiles and includes the family of expected minima and expected maxima. The new class is motivated via several angles, which reveals its specific merits and strengths. Extremiles suggest better capability of fitting...
-
Authors: Saul, Bradley C.; Hudgens, Michael G.; Mallin, Michael A.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina; University of North Carolina Wilmington
Abstract: The United States Environmental Protection Agency considers nutrient pollution in stream ecosystems one of the United States' most pressing environmental challenges. But limited independent replicates, lack of experimental randomization, and space- and time-varying confounding handicap causal inference on effects of nutrient pollution. In this article, the causal g-methods are extended to allow for exposures to vary in time and space in order to assess the effects of nutrient pollution on chlo...
-
Authors: Hornstein, Michael; Fan, Roger; Shedden, Kerby; Zhou, Shuheng
Affiliations: University of Michigan System; University of Michigan; University of California System; University of California Riverside
Abstract: It has been proposed that complex populations, such as those that arise in genomics studies, may exhibit dependencies among observations as well as among variables. This gives rise to the challenging problem of analyzing unreplicated high-dimensional data with unknown mean and dependence structures. Matrix-variate approaches that impose various forms of (inverse) covariance sparsity allow flexible dependence structures to be estimated, but cannot directly be applied when the mean and covarianc...
-
Authors: Tibshirani, Ryan J.; Rosset, Saharon
Affiliations: Carnegie Mellon University; Carnegie Mellon University; Tel Aviv University
Abstract: Nearly all estimators in statistical prediction come with an associated tuning parameter, in one way or another. Common practice, given data, is to choose the tuning parameter value that minimizes a constructed estimate of the prediction error of the estimator; we focus on Stein's unbiased risk estimator, or SURE, which forms an unbiased estimate of the prediction error by augmenting the observed training error with an estimate of the degrees of freedom of the estimator. Parameter tuning via S...
-
Authors: Salter, James M.; Williamson, Daniel B.; Scinocca, John; Kharin, Viatcheslav
Affiliations: University of Exeter; Environment & Climate Change Canada; Canadian Centre for Climate Modelling & Analysis (CCCma)
Abstract: The calibration of complex computer codes using uncertainty quantification (UQ) methods is a rich area of statistical methodological development. When applying these techniques to simulators with spatial output, it is now standard to use principal component decomposition to reduce the dimensions of the outputs in order to allow Gaussian process emulators to predict the output for calibration. We introduce the terminal case, in which the model cannot reproduce observations to within model discr...
-
Authors: Cadonna, Annalisa; Kottas, Athanasios; Prado, Raquel
Affiliations: Vienna University of Economics & Business; University of California System; University of California Santa Cruz
Abstract: We develop a novel Bayesian modeling approach to spectral density estimation for multiple time series. The log-periodogram distribution for each series is modeled as a mixture of Gaussian distributions with frequency-dependent weights and mean functions. The implied model for the log-spectral density is a mixture of linear mean functions with frequency-dependent weights. The mixture weights are built through successive differences of a logit-normal distribution function with frequency-dependen...
-
Authors: De Backer, Mickael; Ghouch, Anouar El; Van Keilegom, Ingrid
Affiliations: Universite Catholique Louvain; KU Leuven
Abstract: In this article, we study a novel approach for the estimation of quantiles when facing potential right censoring of the responses. Contrary to the existing literature on the subject, the adopted strategy of this article is to tackle censoring at the very level of the loss function usually employed for the computation of quantiles, the so-called check function. For interpretation purposes, a simple comparison with the latter reveals how censoring is accounted for in the newly proposed loss func...
-
Authors: Tavakoli, Shahin; Pigoli, Davide; Aston, John A. D.; Coleman, John S.
Affiliations: University of Warwick; University of London; King's College London; University of Cambridge; University of Oxford
-
Authors: Guo, Zijian; Wang, Wanjie; Cai, T. Tony; Li, Hongzhe
Affiliations: Rutgers University System; Rutgers University New Brunswick; National University of Singapore; University of Pennsylvania; University of Pennsylvania
Abstract: Estimating the genetic relatedness between two traits based on the genome-wide association data is an important problem in genetics research. In the framework of high-dimensional linear models, we introduce two measures of genetic relatedness and develop optimal estimators for them. One is genetic covariance, which is defined to be the inner product of the two regression vectors, and another is genetic correlation, which is a normalized inner product by their lengths. We propose functional de-...