-
Authors: Athey, Susan; Imbens, Guido W.; Wager, Stefan
Affiliation: Stanford University
Abstract: There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment assignment be as good as random conditional on pretreatment variables. The unconfoundedness assumption is often more plausible if a large number of pretreatment variables are included in the analysis, but this can worsen the performance of standard approaches to treatment effect estimation. We develop a me...
-
Authors: Sengupta, Srijan; Chen, Yuguo
Affiliations: Virginia Polytechnic Institute & State University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: The community structure that is observed in empirical networks has been of particular interest in the statistics literature, with a strong emphasis on the study of block models. We study an important network feature called node popularity, which is closely associated with community structure. Neither the classical stochastic block model nor its degree-corrected extension can satisfactorily capture the dynamics of node popularity as observed in empirical networks. We propose a popularity-adjust...
-
Authors: Goncalves, Flavio B.; Gamerman, Dani
Affiliations: Universidade Federal de Minas Gerais; Universidade Federal do Rio de Janeiro
Abstract: We present a novel inference methodology to perform Bayesian inference for spatiotemporal Cox processes where the intensity function depends on a multivariate Gaussian process. Dynamic Gaussian processes are introduced to enable evolution of the intensity function over discrete time. The novelty of the method lies in the fact that no discretization error is involved despite the intractability of the likelihood function and the infinite dimensionality of the problem. The method is based on a Mark...
-
Author: Schouten, Barry
Affiliation: Utrecht University
Abstract: In most real-life studies, auxiliary variables are available and are employed to explain and understand missing data patterns and to evaluate and control causal relationships with variables of interest. Usually their availability is assumed to be a fact, even if the variables are measured without the objectives of the study in mind. As a result, inference with missing data and causal inference require some assumptions that cannot easily be validated or checked. In this paper, a framework is co...
-
Authors: Pfister, Niklas; Buhlmann, Peter; Schoelkopf, Bernhard; Peters, Jonas
Affiliations: Max Planck Society; University of Copenhagen
Abstract: We investigate the problem of testing whether d possibly multivariate random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two-variable Hilbert-Schmidt independence criterion but allows for an arbitrary number of variables. We embed the joint distribution and the product of the marginals in a reproducing kernel Hilbert space and define the d-variable Hilbert-Schmidt independence criterion dHSIC as the squared distance be...
-
Authors: Li, Weiming; Yao, Jianfeng
Affiliations: Shanghai University of Finance & Economics; University of Hong Kong
Abstract: By studying the family of p-dimensional scale mixtures, the paper shows for the first time a non-trivial example where the eigenvalue distribution of the corresponding sample covariance matrix does not converge to the celebrated Marčenko-Pastur law. A different and new limit is found and characterized. The failure of the Marčenko-Pastur limit in this situation is found to stem from a strong dependence between the p coordinates of the mixture. Next, we address the problem of testing whethe...
-
Authors: Linero, Antonio R.; Yang, Yun
Affiliations: State University System of Florida; Florida State University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Ensembles of decision trees are a useful tool for obtaining flexible estimates of regression functions. Examples of these methods include gradient-boosted decision trees, random forests and Bayesian classification and regression trees. Two potential shortcomings of tree ensembles are their lack of smoothness and their vulnerability to the curse of dimensionality. We show that these issues can be overcome by instead considering sparsity-inducing soft decision trees in which the decisions are tr...
-
Authors: Gronsbell, Jessica L.; Cai, Tianxi
Affiliation: Harvard University
Abstract: In many modern machine learning applications, the outcome is expensive or time-consuming to collect whereas the predictor information is easy to obtain. Semi-supervised (SS) learning aims at utilizing large amounts of 'unlabelled' data along with small amounts of 'labelled' data to improve the efficiency of a classical supervised approach. Though numerous SS learning classification and prediction procedures have been proposed in recent years, no methods currently exist to evaluate the prediction...
-
Authors: Wang, Boxiang; Zou, Hui
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: Distance-weighted discrimination (DWD) is a modern margin-based classifier with an interesting geometric motivation. It was proposed as a competitor to the support vector machine (SVM). Despite many recent references on DWD, DWD is far less popular than the SVM, mainly for computational and theoretical reasons. We greatly advance the current DWD methodology and its learning theory. We propose a novel thrifty algorithm for solving standard DWD and generalized DWD, and our algorithm can b...
-
Author: Kallus, Nathan
Affiliation: Cornell University
Abstract: We develop a unified theory of designs for controlled experiments that balance baseline covariates a priori (before treatment and before randomization) using the framework of minimax variance and a new method called kernel allocation. We show that any notion of a priori balance must go hand in hand with a notion of structure, since with no structure on the dependence of outcomes on baseline covariates, complete randomization (no special covariate balance) is always minimax optimal. Restricting ...