LARGE SAMPLE THEORY FOR MERGED DATA FROM MULTIPLE SOURCES

Document type:
Article
Author:
Saegusa, Takumi
Affiliation:
University System of Maryland; University of Maryland College Park
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/18-AOS1727
Publication date:
2019
Pages:
1585-1615
Keywords:
2-phase stratified samples; central limit-theorems; semiparametric models; weighted likelihood; invariance-principle; efficient estimation; regression-analysis; Wilms-tumor; frame; design
Abstract:
We develop large sample theory for merged data from multiple sources. The main statistical issues treated in this paper are that (1) the same unit may appear in multiple datasets from overlapping data sources, (2) duplicated items are not identified, and (3) a sample from the same data source is dependent due to sampling without replacement. We propose and study a new weighted empirical process and extend empirical process theory to a dependent and biased sample with duplication. Specifically, we establish the uniform law of large numbers and the uniform central limit theorem over a class of functions, along with several empirical process results, under conditions identical to those in the i.i.d. setting. As applications, we study infinite-dimensional M-estimation and develop its consistency, rates of convergence, and asymptotic normality. Our theoretical results are illustrated with simulation studies and a real data example.