-
Authors: Arias-Castro, Ery; Castro, Rui M.; Tanczos, Ervin; Wang, Meng
Affiliations: University of California System; University of California San Diego; Eindhoven University of Technology; Stanford University
Abstract: The scan statistic is by far the most popular method for anomaly detection, with applications in syndromic surveillance, signal and image processing, and target detection based on sensor networks, among others. The use of scan statistics in such settings yields a hypothesis testing procedure in which the null hypothesis corresponds to the absence of anomalous behavior. If the null distribution is known, then calibration of a scan-based test is relatively easy, as it can be done by Mo...
-
Authors: Tang, Xueying; Ghosh, Malay; Ha, Neung Soo; Sedransk, Joseph
Affiliations: Columbia University; State University System of Florida; University of Florida; University System of Maryland; University of Maryland College Park
Abstract: Small area estimation is becoming increasingly popular among survey statisticians. One very important program is Small Area Income and Poverty Estimation, undertaken by the United States Bureau of the Census, which aims to provide estimates related to income and poverty based on American Community Survey data at the state level and even at lower levels of geography. This article introduces global-local (GL) shrinkage priors for random effects in small area estimation to capture wide area level ...
-
Authors: Tansey, Wesley; Koyejo, Oluwasanmi; Poldrack, Russell A.; Scott, James G.
Affiliations: University of Texas System; University of Texas Austin; University of Illinois System; University of Illinois Urbana-Champaign; Stanford University
Abstract: We present false discovery rate (FDR) smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems. FDR smoothing automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level. This results in increased power and cleaner spatial separation of signals fr...
-
Authors: Chen, Zhao; Fan, Jianqing; Li, Runze
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Fudan University; Princeton University
Abstract: Error variance estimation plays an important role in statistical inference for high-dimensional regression models. This article concerns error variance estimation in high-dimensional sparse additive models. We study the asymptotic behavior of the traditional mean squared error, the naive estimate of error variance, and show that it may significantly underestimate the error variance due to spurious correlations, which are even higher in nonparametric models than in linear models. We further pro...
-
Authors: Keich, Uri; Noble, William Stafford
Affiliations: University of Sydney; University of Washington; University of Washington Seattle
Abstract: We consider the problem of controlling the false discovery rate (FDR) among discoveries from searching an incomplete database. This problem differs from the classical multiple testing setting because there are two different types of false discoveries: those arising from objects that have no match in the database and those that are incorrectly matched. We show that commonly used FDR controlling procedures are inadequate for this setup, a special case of which is tandem mass spectrum identificat...
-
Authors: Kuhnert, Petra M.
Affiliations: Commonwealth Scientific & Industrial Research Organisation (CSIRO); CSIRO Data61
-
Authors: Zhou, Quan; Guan, Yongtao
Affiliations: Baylor College of Medicine
Abstract: We show that under the null, the 2 log(Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and the normal prior. Our results have three immediate impacts. First, we can compute analytically a p-value associated with a Bayes factor without the need of permutation. We provide a software package that can evaluate t...
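Illustration (not from the paper): the null distribution named in this abstract, a weighted sum of chi-squared random variables with a shifted mean, can be explored numerically. The sketch below estimates a tail probability by plain Monte Carlo simulation, which is a stand-in for, not a reproduction of, the analytic computation the authors describe. The function name `weighted_chisq_pvalue`, the `weights`, and the `shift` are hypothetical inputs chosen for illustration; the abstract does not give their form.

```python
import random


def weighted_chisq_pvalue(observed, weights, shift=0.0, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(shift + sum_i w_i * Z_i^2 >= observed).

    Z_i are independent standard normals, so each Z_i^2 is chi-squared
    with one degree of freedom. `weights` and `shift` are assumed inputs.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        # One draw from the shifted, weighted chi-squared sum.
        draw = shift + sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if draw >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_draws + 1)


if __name__ == "__main__":
    # With a single unit weight and no shift this is a chi-squared(1) tail.
    print(weighted_chisq_pvalue(3.841, [1.0]))
```

A simulation like this can only resolve p-values down to roughly 1/`n_draws`, which is one reason an analytic evaluation is attractive for very small p-values.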
-
Authors: Davidov, Ori; Jelsema, Casey M.; Peddada, Shyamal
Affiliations: University of Haifa; West Virginia University; National Institutes of Health (NIH) - USA; NIH National Institute of Environmental Health Sciences (NIEHS)
Abstract: There are many applications in which a statistic follows, at least asymptotically, a normal distribution with a singular or nearly singular variance matrix. A classic example occurs in linear regression models under multicollinearity, but there are many more such examples. There is well-developed theory for testing linear equality constraints when the alternative is two-sided and the variance matrix is either singular or nonsingular. In recent years, there has been considerable, and growing, interest...
-
Authors: DeYoreo, Maria; Kottas, Athanasios
Affiliations: RAND Corporation; Duke University; University of California System; University of California Santa Cruz
Abstract: We develop a Bayesian nonparametric framework for modeling ordinal regression relationships that evolve in discrete time. The motivating application involves a key problem in fisheries research on estimating dynamically evolving relationships between age, length, and maturity, the latter recorded on an ordinal scale. The methodology builds from nonparametric mixture modeling for the joint stochastic mechanism of covariates and latent continuous responses. This approach yields highly flexible...
-
Authors: Ganong, Peter; Jaeger, Simon
Affiliations: National Bureau of Economic Research; University of Chicago; Massachusetts Institute of Technology (MIT); University of Bonn; IZA Institute Labor Economics; Leibniz Association; Ifo Institut
Abstract: The regression kink (RK) design is an increasingly popular empirical method for estimating causal effects of policies, such as the effect of unemployment benefits on unemployment duration. Using simulation studies based on data from existing RK designs, we empirically document that the statistical significance of RK estimators based on conventional standard errors can be spurious. In the simulations, false positives arise as a consequence of nonlinearities in the underlying relationship betwee...