-
Authors: Jewell, Nicholas P.
Affiliations: University of London; London School of Hygiene & Tropical Medicine
-
Authors: Mao, Xueyu; Sarkar, Purnamrita; Chakrabarti, Deepayan
Affiliations: University of Texas System; University of Texas Austin
Abstract: We consider the problem of estimating community memberships of nodes in a network, where every node is associated with a vector determining its degree of membership in each community. Existing provably consistent algorithms often require strong assumptions about the population, are computationally expensive, and only provide an overall error bound for the whole community membership matrix. This article provides uniform rates of convergence for the inferred community membership vector of each nod...
-
Authors: de Micheaux, Pierre Lafaye; Mozharovskyi, Pavlo; Vimond, Myriam
Affiliations: University of New South Wales Sydney; IMT - Institut Mines-Telecom; Institut Polytechnique de Paris; Telecom Paris; Centre National de la Recherche Scientifique (CNRS); CNRS - Institute for Humanities & Social Sciences (INSHS); Ecole Nationale de la Statistique et de l'Analyse de l'Information (ENSAI)
Abstract: In 1975, John W. Tukey defined statistical data depth as a function that determines the centrality of an arbitrary point with respect to a data cloud or to a probability measure. During the last decades, this seminal idea of data depth evolved into a powerful tool proving to be useful in various fields of science. Recently, extending the notion of data depth to the functional setting attracted a lot of attention among theoretical and applied statisticians. We go further and suggest a notion of...
-
Authors: Agarwal, Anish; Shah, Devavrat; Shen, Dennis; Song, Dogyoon
Affiliations: Massachusetts Institute of Technology (MIT)
Abstract: Principal component regression (PCR) is a simple, but powerful and ubiquitously utilized method. Its effectiveness is well established when the covariates exhibit low-rank structure. However, its ability to handle settings with noisy, missing, and mixed-valued, that is, discrete and continuous, covariates is not understood and remains an important open challenge. As the main contribution of this work, we establish the robustness of PCR, without any change, in this respect and provide meaningfu...
-
Authors: Lunagomez, Simon; Olhede, Sofia C.; Wolfe, Patrick J.
Affiliations: Lancaster University; Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; University of London; University College London; Purdue University System; Purdue University
Abstract: This article introduces a new class of models for multiple networks. The core idea is to parameterize a distribution on labeled graphs in terms of a Frechet mean graph (which depends on a user-specified choice of metric or graph distance) and a parameter that controls the concentration of this distribution about its mean. Entropy is the natural parameter for such control, varying from a point mass concentrated on the Frechet mean itself to a uniform distribution over all graphs on a given vert...
-
Authors: Xue, Fei; Qu, Annie
Affiliations: University of Pennsylvania; University of California System; University of California Irvine
Abstract: For multisource data, blocks of variable information from certain sources are likely missing. Existing methods for handling missing data do not take structures of block-wise missing data into consideration. In this article, we propose a multiple block-wise imputation (MBI) approach, which incorporates imputations based on both complete and incomplete observations. Specifically, for a given missing pattern group, the imputations in MBI incorporate more samples from groups with fewer observed va...
-
Authors: Gerber, Guillaume; Le Faou, Yohann; Lopez, Olivier; Trupin, Michael
Affiliations: Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); Sorbonne Universite
Abstract: In the insurance broker market, commissions received by brokers are closely related to so-called customer value: the longer a policyholder keeps their contract, the more profit there is for the company and therefore the broker. Hence, predicting the time at which a potential policyholder will surrender their contract is essential to optimize a commercial process and define a prospect scoring. In this article, we propose a weighted random forest model to address this problem. Our model is desig...
-
Authors: Su, Qihui; Qin, Zhongling; Peng, Liang; Qin, Gengsheng
Affiliations: Jilin University; Auburn University System; Auburn University; University System of Georgia; Georgia State University
Abstract: Given the importance of backtesting risk models and forecasts for financial institutions and regulators, we develop an efficient empirical likelihood backtest for either conditional value-at-risk or conditional expected shortfall when the given risk variable is modeled by an ARMA-GARCH process. Using a two-step procedure, the proposed backtests require less finite moments than existing backtests, allowing for robustness to heavier tails. Furthermore, we add a constraint on the goodness of fit ...
-
Authors: Shaikh, Azeem M.; Toulis, Panos
Affiliations: University of Chicago
Abstract: This article considers the problem of inference in observational studies with time-varying adoption of treatment. In addition to an unconfoundedness assumption that the potential outcomes are independent of the times at which units adopt treatment conditional on the units' observed characteristics, our analysis assumes that the time at which each unit adopts treatment follows a Cox proportional hazards model. This assumption permits the time at which each unit adopts treatment to depend on the...
-
Authors: Masini, Ricardo; Medeiros, Marcelo C.
Affiliations: Getulio Vargas Foundation; Pontificia Universidade Catolica do Rio de Janeiro; Princeton University
Abstract: Recently, there has been growing interest in developing statistical tools to conduct counterfactual analysis with aggregate data when a single treated unit suffers an intervention, such as a policy change, and there is no obvious control group. Usually, the proposed methods are based on the construction of an artificial counterfactual from a pool of untreated peers, organized in a panel data structure. In this article, we consider a general framework for counterfactual analysis for high-dimen...