-
Authors: Barut, Emre; Wang, Huixia Judy
Affiliations: George Washington University
-
Authors: Jobe, J. Marcus; Pokojovy, Michael
Affiliations: University System of Ohio; Miami University; University of Konstanz
Abstract: Detection power of the squared Mahalanobis distance statistic is significantly reduced when several outliers exist within a multivariate dataset of interest. To overcome this masking effect, we propose a computer-intensive cluster-based approach that incorporates a reweighted version of Rousseeuw's minimum covariance determinant method with a multi-step cluster-based algorithm that initially filters out potential masking points. Compared to the most robust procedures, simulation studies show t...
-
Authors: Martin, Ryan; Liu, Chuanhai
Affiliations: University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital; Purdue University System; Purdue University
Abstract: The inferential models (IM) framework provides prior-free, frequency-calibrated, and posterior probabilistic inference. The key is the use of random sets to predict unobservable auxiliary variables connected to the observable data and unknown parameters. When nuisance parameters are present, a marginalization step can reduce the dimension of the auxiliary variable which, in turn, leads to more efficient inference. For regular problems, exact marginalization can be achieved, and we give conditi...
-
Authors: Brown, Lawrence D.; McCarthy, Daniel
Affiliations: University of Pennsylvania
-
Authors: Morganstein, David
Affiliations: Westat
-
Authors: Shah, Rajen D.; Samworth, Richard J.
Affiliations: University of Cambridge
-
Authors: Jiang, Bo; Liu, Jun S.
Affiliations: Harvard University; Harvard University
Abstract: Expression quantitative trait loci (eQTLs) are genomic locations associated with changes of expression levels of certain genes. By assaying gene expressions and genetic variations simultaneously on a genome-wide scale, scientists wish to discover genomic loci responsible for expression variations of a set of genes. The task can be viewed as a multivariate regression problem with variable selection on both responses (gene expression) and covariates (genetic variations), including also multi-way ...
-
Authors: Jiang, Wenxin; Zhao, Yu
Affiliations: Shandong University; Northwestern University; Amazon.com
Abstract: A LIFT measure, such as the response rate, lift, or the percentage of captured response, is a fundamental measure of effectiveness for a scoring rule obtained from data mining, which is estimated from a set of validation data. In this article, we study how to construct confidence intervals of the LIFT measures. We point out the subtlety of this task and explain how simple binomial confidence intervals can have incorrect coverage probabilities, due to omitting variation from the sample percenti...
-
Authors: Calonico, Sebastian; Cattaneo, Matias D.; Titiunik, Rocio
Affiliations: University of Miami; University of Michigan System; University of Michigan; University of Michigan System; University of Michigan
Abstract: Exploratory data analysis plays a central role in applied statistics and econometrics. In the popular regression-discontinuity (RD) design, the use of graphical analysis has been strongly advocated because it provides both easy presentation and transparent validation of the design. RD plots are nowadays widely used in applications, despite their formal properties being unknown: these plots are typically presented employing ad hoc choices of tuning parameters, which makes these procedures less au...
-
Authors: Martin, Ryan
Affiliations: University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital
Abstract: In the frequentist program, inferential methods with exact control on error rates are a primary focus. The standard approach, however, is to rely on asymptotic approximations, which may not be suitable. This article presents a general framework for the construction of exact frequentist procedures based on plausibility functions. It is shown that the plausibility function-based tests and confidence regions have the desired frequentist properties in finite samples: no large-sample justification n...