-
Authors: Ma, Yanyuan; Carroll, Raymond J.
Affiliations: University of South Carolina System; University of South Carolina Columbia; Texas A&M University System; Texas A&M University College Station
Abstract: We study the regression relationship between covariates in case-control data: an area known as the secondary analysis of case-control studies. The context is such that only the form of the regression mean is specified, so that we allow an arbitrary regression error distribution, which can depend on the covariates and thus can be heteroscedastic. Under mild regularity conditions we establish the theoretical identifiability of such models. Previous work in this context has either specified a ful...
-
Authors: Delaigle, Aurore; Hall, Peter
Affiliations: University of Melbourne
Abstract: In the non-parametric deconvolution problem, to estimate consistently a density or distribution from a sample of data contaminated by additive random noise, it is often assumed that the noise distribution is completely known or that an additional sample of replicated or validation data is available. Methods also have been suggested for estimating the scale of the error distribution, but they require somewhat restrictive smoothness assumptions on the signal distribution, which can be difficult ...
-
Authors: Genovese, Christopher R.; Perone-Pacifico, Marco; Verdinelli, Isabella; Wasserman, Larry
Affiliations: Carnegie Mellon University; Sapienza University Rome
Abstract: We derive non-parametric confidence intervals for the eigenvalues of the Hessian at modes of a density estimate. This provides information about the strength and shape of modes and can also be used as a significance test. We use a data splitting approach in which potential modes are identified by using the first half of the data and inference is done with the second half of the data. To obtain valid confidence sets for the eigenvalues, we use a bootstrap based on an elementary symmetric polyno...
-
Authors: Fryzlewicz, Piotr; Van Keilegom, Ingrid
-
Authors: Daouia, Abdelaati; Noh, Hohsuk; Park, Byeong U.
Affiliations: Universite de Toulouse; Universite Catholique Louvain; Sookmyung Women's University; Seoul National University (SNU)
Abstract: Estimation of support frontiers and boundaries often involves monotone and/or concave edge data smoothing. This estimation problem arises in various unrelated contexts, such as optimal cost and production assessments in econometrics and master curve prediction in the reliability programmes of nuclear reactors. Very few constrained estimators of the support boundary of a bivariate distribution have been introduced in the literature. They are based on simple envelopment techniques which often su...
-
Authors: Bickel, Peter J.; Sarkar, Purnamrita
Affiliations: University of California System; University of California Berkeley; University of Texas System; University of Texas Austin
Abstract: Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web. The problem of finding communities or clusters in a network has received much attention from statistics, physics and computer science. However, most clustering algorithms assume knowledge of the number of clusters k. We propose to determine k automatically in a graph generated f...
-
Authors: Zhang, Xiang; Wu, Yichao; Wang, Lan; Li, Runze
Affiliations: North Carolina State University; University of Minnesota System; University of Minnesota Twin Cities; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: The support vector machine (SVM) is a powerful binary classification tool with high accuracy and great flexibility. It has achieved great success, but its performance can be seriously impaired if many redundant covariates are included. Some efforts have been devoted to studying variable selection for SVMs, but asymptotic properties, such as variable selection consistency, are largely unknown when the number of predictors diverges to infinity. We establish a unified theory for a general class of non-c...
-
Authors: Prus, Maryna; Schwabe, Rainer
Affiliations: Otto von Guericke University
Abstract: Characterizations of optimal designs are derived for the prediction of individual response curves within the framework of hierarchical linear mixed models. It is shown that the so-obtained optimal designs may differ substantially from those propagated in the literature so far and that the latter may become useless in terms of their performance.
-
Authors: Bonhomme, Stephane; Jochmans, Koen; Robin, Jean-Marc
Affiliations: University of Chicago; Institut d'Etudes Politiques Paris (Sciences Po); University of London; University College London
Abstract: This paper provides methods to estimate finite mixtures from data with repeated measurements non-parametrically. We present a constructive identification argument and use it to develop simple two-step estimators of the component distributions and all their functionals. We discuss a computationally efficient method for estimation and derive asymptotic theory. Simulation experiments suggest that our theory provides confidence intervals with good coverage in small samples.
-
Authors: Lee, Sokbae; Seo, Myung Hwan; Shin, Youngki
Affiliations: Seoul National University (SNU); University of London; London School Economics & Political Science; Western University (University of Western Ontario)
Abstract: We consider a high dimensional regression model with a possible change point due to a covariate threshold and develop the lasso estimator of regression coefficients as well as the threshold parameter. Our lasso estimator not only selects covariates but also selects a model between linear and threshold regression models. Under a sparsity assumption, we derive non-asymptotic oracle inequalities for both the prediction risk and the ℓ1-estimation loss for regression coefficients. Since the lasso...