-
Authors: Paindaveine, Davy; Rasoafaraniaina, Josea; Verdebout, Thomas
Affiliations: Universite Libre de Bruxelles; Universite Libre de Bruxelles; Universite de Toulouse; Universite Toulouse 1 Capitole; Toulouse School of Economics
Abstract: Multisample covariance estimation, that is, estimation of the covariance matrices associated with k distinct populations, is a classical problem in multivariate statistics. A common solution is to base estimation on the outcome of a test that these covariance matrices show some given pattern. Such a preliminary test may, for example, investigate whether or not the various covariance matrices are equal to each other (test of homogeneity), or whether or not they have common eigenvectors (test of c...
-
Authors: Nandy, Debmalya; Chiaromonte, Francesca; Li, Runze
Affiliations: Colorado School of Public Health; University of Colorado System; University of Colorado Anschutz Medical Campus; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Scuola Superiore Sant'Anna; Scuola Superiore Sant'Anna
Abstract: Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates (p >> n), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covar...
-
Authors: Abadie, Alberto; Spiess, Jann
Affiliations: Massachusetts Institute of Technology (MIT); Stanford University
Abstract: Nearest-neighbor matching is a popular nonparametric tool to create balance between treatment and control groups in observational studies. As a preprocessing step before regression, matching reduces the dependence on parametric modeling assumptions. In current empirical practice, however, the matching step is often ignored in the calculation of standard errors and confidence intervals. In this article, we show that ignoring the matching step results in asymptotically valid standard errors if m...
-
Authors: Liu, Yang; Hu, Feifang
Affiliations: George Washington University
Abstract: Balancing important covariates is often critical in clinical trials and causal inference. Stratified permuted block (STR-PB) and covariate-adaptive randomization (CAR) procedures are widely used to balance observed covariates in practice. The balance properties of these procedures with respect to the observed covariates have been well studied. However, it has been questioned whether these methods will also yield a good balance for the unobserved covariates. In this article, we develop a genera...
-
Authors: Lemyre, Felix Camirand; Carroll, Raymond J.; Delaigle, Aurore
Affiliations: University of Sherbrooke; University of Sherbrooke; Texas A&M University System; Texas A&M University College Station; University of Technology Sydney; University of Melbourne; University of Melbourne
Abstract: Dietary data collected from 24-hour dietary recalls are observed with significant measurement errors. In the nonparametric curve estimation literature, much of the effort has been devoted to designing methods that are consistent under contamination by noise, and which have been traditionally applied for analyzing those data. However, some foods such as alcohol or fruits are consumed only episodically, and may not be consumed during the day when the 24-hour recall is administered. These so-call...
-
Authors: Dai, Ben; Shen, Xiaotong; Wang, Junhui
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; City University of Hong Kong
Abstract: Numerical embedding has become one standard technique for processing and analyzing unstructured data that cannot be expressed in a predefined fashion. It stores the main characteristics of data by mapping it onto a numerical vector. An embedding is often unsupervised and constructed by transfer learning from large-scale unannotated data. Given an embedding, a downstream learning method, referred to as a two-stage method, is applicable to unstructured data. In this article, we introduce a novel...
-
Authors: Xie, Dongyue; Stephens, Matthew
Affiliations: University of Chicago; University of Chicago
-
Authors: Park, Hyunwoo
Affiliations: Seoul National University (SNU)
-
Authors: Laga, Ian; Niu, Xiaoyue; Bao, Le
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Certain subpopulations like female sex workers (FSW), men who have sex with men (MSM), and people who inject drugs (PWID) often have higher prevalence of HIV/AIDS and are difficult to map directly due to stigma, discrimination, and criminalization. Fine-scale mapping of those populations contributes to the progress toward reducing the inequalities and ending the AIDS epidemic. In 2016 and 2017, the PLACE surveys were conducted at 3290 venues in 20 out of the total 28 districts in Malawi to est...
-
Authors: Zhang, Yingying; Wang, Huixia Judy; Zhu, Zhongyi
Affiliations: East China Normal University; George Washington University; Fudan University
Abstract: Threshold regression models are useful for identifying subgroups with heterogeneous parameters. The conventional threshold regression models split the sample based on a single and observed threshold variable, which enforces the threshold point to be equal for all subgroups of the population. In this article, we consider a more flexible single-index threshold model in the quantile regression setup, in which the sample is split based on a linear combination of predictors. We propose a new estima...