-
Authors: Wu, Jason; Ding, Peng
Affiliations: University of California System; University of California Berkeley
Abstract: The Fisher randomization test (FRT) is appropriate for any test statistic, under a sharp null hypothesis that can recover all missing potential outcomes. However, it is often of interest to test a weak null hypothesis that the treatment does not affect the units on average. To use the FRT for a weak null hypothesis, we must address two issues. First, we need to impute the missing potential outcomes although the weak null hypothesis cannot determine all of them. Second, we need to choose a pro...
-
Authors: Bai, Jushan; Ng, Serena
Affiliations: Columbia University; National Bureau of Economic Research
Abstract: This article proposes an imputation procedure that uses the factors estimated from a tall block along with the re-rotated loadings estimated from a wide block to impute missing values in a panel of data. Assuming that a strong factor structure holds for the full panel of data and its sub-blocks, it is shown that the common component can be consistently estimated at four different rates of convergence without requiring regularization or iteration. An asymptotic analysis of the estimation error ...
-
Authors: Khan, Md Kamrul Hasan; Chakraborty, Avishek; Petris, Giovanni; Wilson, Barry T.
Affiliations: University of Arkansas System; University of Arkansas Fayetteville; United States Department of Agriculture (USDA); United States Forest Service
Abstract: The USDA Forest Service uses satellite imagery, along with a sample of national forest inventory field plots, to monitor and predict changes in forest conditions over time throughout the United States. We specifically focus on a 230,400 ha region in north-central Wisconsin between 2003 and 2012. The auxiliary data from the satellite imagery of this region are relatively dense in space and time, and can be used to learn how forest conditions changed over that decade. However, these records have...
-
Authors: Lee, Youjin; Ogburn, Elizabeth L.
Affiliations: University of Pennsylvania; Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health
Abstract: Researchers across the health and social sciences generally assume that observations are independent, even while relying on convenience samples that draw subjects from one or a small number of communities, schools, hospitals, etc. A paradigmatic example of this is the Framingham Heart Study (FHS). Many of the limitations of such samples are well-known, but the issue of statistical dependence due to social network ties has not previously been addressed. We show that, along with anticonservative...
-
Authors: Ferman, Bruno
Abstract: We consider the asymptotic properties of the synthetic control (SC) estimator when both the number of pretreatment periods and control units are large. If potential outcomes follow a linear factor model, we provide conditions under which the SC unit asymptotically recovers the factor structure of the treated unit, even when the pretreatment fit is imperfect. This happens when there are weights diluted among an increasing number of control units such that a weighted average of the factor struct...
-
Authors: Shafer, Glenn
Affiliations: Rutgers University System; Rutgers University Newark; Rutgers University New Brunswick
-
Authors: Nemeth, Christopher; Fearnhead, Paul
Affiliations: Lancaster University
Abstract: Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large datasets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that h...
-
Authors: Yan, Xiaohan; Bien, Jacob
Abstract: It is common in modern prediction problems for many predictor variables to be counts of rarely occurring events. This leads to design matrices in which many columns are highly sparse. The challenge posed by such rare features has received little attention despite its prevalence in diverse areas, ranging from natural language processing (e.g., rare words) to biology (e.g., rare species). We show, both theoretically and empirically, that not explicitly accounting for the rareness of features can...
-
Authors: Shi, Xu; Li, Xiaoou; Cai, Tianxi
Affiliations: University of Michigan System; University of Michigan; University of Minnesota System; University of Minnesota Twin Cities; Harvard University
Abstract: Motivated by a series of applications in data integration, language translation, bioinformatics, and computer vision, we consider spherical regression with two sets of unit-length vectors when the data are corrupted by a small fraction of mismatch in the response-predictor pairs. We propose a three-step algorithm in which we initialize the parameters by solving an orthogonal Procrustes problem to estimate a translation matrix ignoring the mismatch. We then estimate a mapping matrix aiming to c...
-
Authors: Abadie, Alberto; L'Hour, Jeremy
Affiliations: Massachusetts Institute of Technology (MIT); National Bureau of Economic Research; Institut Polytechnique de Paris; ENSAE Paris
Abstract: Synthetic control methods are commonly applied in empirical research to estimate the effects of treatments or interventions on aggregate outcomes. A synthetic control estimator compares the outcome of a treated unit to the outcome of a weighted average of untreated units that best resembles the characteristics of the treated unit before the intervention. When disaggregated data are available, constructing separate synthetic controls for each treated unit may help avoid interpolation biases. Ho...