-
Authors: Zhu, Li-Ping; Li, Lexin; Li, Runze; Zhu, Li-Xing
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Shanghai University of Finance & Economics; North Carolina State University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Hong Kong Baptist University
Abstract: With the recent explosion of scientific data of unprecedented size and complexity, feature ranking and screening are playing an increasingly important role in many scientific studies. In this article, we propose a novel feature screening procedure under a unified model framework, which covers a wide variety of commonly used parametric and semiparametric models. The new method does not require imposing a specific model structure on regression functions, and thus is particularly appealing to ult...
-
Authors: Matteson, David S.; Tsay, Ruey S.
Affiliations: Cornell University; University of Chicago
Abstract: We introduce dynamic orthogonal components (DOC) for multivariate time series and propose a procedure for estimating and testing the existence of DOCs for a given time series. We estimate the dynamic orthogonal components via a generalized decorrelation method that minimizes the linear and quadratic dependence across components and across time. We then use Ljung-Box type statistics to test the existence of dynamic orthogonal components. When DOCs exist, univariate analysis can be applied to bu...
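For orientation, the classical univariate Ljung-Box statistic, on which "Ljung-Box type" tests are modeled, is Q = n(n+2) * sum over k = 1..m of rho_k^2 / (n - k), where rho_k is the lag-k sample autocorrelation. The sketch below computes only this textbook quantity; it is not the multivariate cross-component statistic of the article, and the function name `ljung_box_q` is a hypothetical choice made for illustration.

```python
import numpy as np


def ljung_box_q(x, max_lag):
    """Classical univariate Ljung-Box Q statistic (illustrative helper).

    Q = n (n + 2) * sum_{k=1}^{max_lag} rho_k^2 / (n - k),
    where rho_k is the lag-k sample autocorrelation of x.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(x)")
    xc = x - x.mean()          # center the series
    denom = np.sum(xc ** 2)    # lag-0 sum of squares
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(xc[k:] * xc[:-k]) / denom  # lag-k autocorrelation
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy input (made up for illustration): a strictly alternating series
# has strong negative lag-1 autocorrelation, so Q is large.
x_demo = np.array([1.0, -1.0] * 4)
q_demo = ljung_box_q(x_demo, max_lag=1)
```

Under the null hypothesis of no serial correlation, Q is usually compared with a chi-square distribution with `max_lag` degrees of freedom.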
-
Authors: Toth, Daniell; Eltinge, John L.
Affiliations: United States Department of Labor
Abstract: In the past several years a wide range of methods for the construction of regression trees and other estimators based on the recursive partitioning of samples have appeared in the statistics literature. Many applications involve data collected through a complex sample design. At present, however, relatively little is known regarding the properties of these methods under complex designs. This article proposes a method for incorporating information about the complex sample design when building a...
-
Authors: Efron, Bradley
Affiliations: Stanford University
Abstract: We suppose that the statistician observes some large number of estimates $z_i$, each with its own unobserved expectation parameter $\mu_i$. The largest few of the $z_i$'s are likely to substantially overestimate their corresponding $\mu_i$'s, this being an example of selection bias, or regression to the mean. Tweedie's formula, first reported by Robbins in 1956, offers a simple empirical Bayes approach for correcting selection bias. This article investigates its merits and limitations. In addition t...
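As a reminder of what Tweedie's formula states in the normal case: if z given mu is N(mu, sigma^2) and f denotes the marginal density of z, then E[mu | z] = z + sigma^2 * d/dz log f(z). The sketch below applies the formula with a known score function. In an empirical Bayes analysis the score would instead be estimated from the observed z values, which this sketch does not attempt. The function names and the normal-prior example are assumptions made for illustration, not taken from the article.

```python
import numpy as np


def tweedie_posterior_mean(z, score, sigma2=1.0):
    """Tweedie's formula for z | mu ~ N(mu, sigma2).

    `score` must return d/dz log f(z), where f is the marginal density of z.
    """
    z = np.asarray(z, dtype=float)
    return z + sigma2 * score(z)


# Hypothetical setting: mu ~ N(0, A) and z | mu ~ N(mu, 1),
# so marginally z ~ N(0, A + 1) and the score is -z / (A + 1).
A = 3.0


def normal_marginal_score(z):
    return -z / (A + 1.0)


z_obs = np.array([-2.0, 0.0, 4.0])
corrected = tweedie_posterior_mean(z_obs, normal_marginal_score)
# In this conjugate case the correction shrinks each z toward 0
# by the factor A / (A + 1).
```

The largest observation is pulled back the most in absolute terms, which is the selection-bias correction described in the abstract.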
-
Authors: Corradi, Valentina; Distaso, Walter; Swanson, Norman R.
Affiliations: University of Warwick; Imperial College London; Rutgers University System; Rutgers University New Brunswick
Abstract: Numerous volatility-based derivative products have been engineered in recent years. This has led to interest in constructing conditional predictive densities and confidence intervals for integrated volatility. In this article we propose nonparametric estimators of the aforementioned quantities, based on model-free volatility estimators. We establish consistency and asymptotic normality for the feasible estimators and study their finite-sample properties through a Monte Carlo experiment. Finall...
-
Authors: Hero, Alfred; Rajaratnam, Bala
Affiliations: University of Michigan System; University of Michigan; University of Michigan System; University of Michigan; University of Michigan System; University of Michigan; Stanford University
Abstract: This article addresses the problem of screening for variables with high correlations in high-dimensional data in which there can be many fewer samples than variables. We focus on threshold-based correlation screening methods for three related applications: screening for variables with large correlations within a single treatment (autocorrelation screening), screening for variables with large cross-correlations over two treatments (cross-correlation screening), and screening for variables that ...
-
Authors: Cai, Tony; Liu, Weidong
Affiliations: University of Pennsylvania; Shanghai Jiao Tong University; Shanghai Jiao Tong University
Abstract: This article considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix $\Omega$ and the difference $\delta$ of the mean vectors, we introduce a simple and effective classifier by estimating the product $\Omega\delta$ directly through constrained $\ell_1$ minimization. The estimator can be implemented efficiently using linear programming and the resulting classifier is called the linear programmi...
-
Authors: Ruth, David M.; Koyak, Robert A.
Affiliations: United States Department of Defense; United States Navy; United States Naval Academy; United States Department of Defense; United States Navy; Naval Postgraduate School
Abstract: Given a sequence of observations, has a change occurred in the underlying probability distribution with respect to observation order? This problem of detecting change points arises in a variety of applications including health prognostics for mechanical systems, syndromic disease surveillance in geographically dispersed populations, anomaly detection in information networks, and multivariate process control in general. Detecting change points in high-dimensional settings is challenging, and mo...
-
Authors: Chen, Lin S.; Paul, Debashis; Prentice, Ross L.; Wang, Pei
Affiliations: University of Chicago; University of California System; University of California Davis; Fred Hutchinson Cancer Center
Abstract: Recent proteomic studies have identified proteins related to specific phenotypes. In addition to marginal association analysis for individual proteins, analyzing pathways (functionally related sets of proteins) may yield additional valuable insights. Identifying pathways that differ between phenotypes can be conceptualized as a multivariate hypothesis testing problem: whether the mean vector $\mu$ of a $p$-dimensional random vector $X$ is $\mu_0$. Proteins within the same biological pathway may correla...
-
Authors: Kim, Mi-Ok; Yang, Yunwen
Affiliations: Cincinnati Children's Hospital Medical Center; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: We consider a random effects quantile regression analysis of clustered data and propose a semiparametric approach using empirical likelihood. The random regression coefficients are assumed independent with a common mean, following parametrically specified distributions. The common mean corresponds to the population-average effects of explanatory variables on the conditional quantile of interest, whereas the random coefficients represent cluster-specific deviations in the covariate effects. We ...