MOTIF ESTIMATION VIA SUBGRAPH SAMPLING: THE FOURTH-MOMENT PHENOMENON
Document type:
Article
Authors:
Bhattacharya, Bhaswar B.; Das, Sayan; Mukherjee, Sumit
Affiliations:
University of Pennsylvania; Columbia University; Columbia University
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/21-AOS2134
Publication date:
2022
Pages:
987-1011
Keywords:
Social networks
connected components
convergent sequences
degree distributions
limit theorems
graph
number
sums
Abstract:
Network sampling is an indispensable tool for understanding features of large complex networks where it is practically impossible to search over the entire graph. In this paper, we develop a framework for statistical inference for counting network motifs, such as edges, triangles and wedges, in the widely used subgraph sampling model, where each vertex is sampled independently, and the subgraph induced by the sampled vertices is observed. We derive necessary and sufficient conditions for the consistency and the asymptotic normality of the natural Horvitz-Thompson (HT) estimator, which can be used for constructing confidence intervals and hypothesis tests for the motif counts based on the sampled graph. In particular, we show that the asymptotic normality of the HT estimator exhibits an interesting fourth-moment phenomenon, which asserts that the HT estimator (appropriately centered and rescaled) converges in distribution to the standard normal whenever its fourth moment converges to 3 (the fourth moment of the standard normal distribution). As a consequence, we derive the exact thresholds for consistency and asymptotic normality of the HT estimator in various natural graph ensembles, such as sparse graphs with bounded degree, Erdős-Rényi random graphs, random regular graphs and dense graphons.
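The subgraph sampling model and the HT estimator described in the abstract can be illustrated with a minimal sketch. It assumes the usual Horvitz-Thompson form for this setting: a motif on r vertices survives sampling with probability p^r, so the count observed in the induced subgraph is divided by p^r. The function names (`sample_vertices`, `count_edges`, `count_triangles`, `ht_estimate`), the adjacency-dictionary representation, and the example graph are hypothetical choices made for illustration and are not taken from the paper.

```python
import random
from itertools import combinations


def sample_vertices(vertices, p, rng):
    """Keep each vertex independently with probability p (subgraph sampling)."""
    return {v for v in vertices if rng.random() < p}


def count_edges(adj, keep):
    """Count edges of the subgraph induced by `keep`.

    `adj` maps each vertex to the set of its neighbours; vertex labels are
    assumed to be mutually comparable so each edge is counted once.
    """
    return sum(1 for u in keep for v in adj[u] if v in keep and u < v)


def count_triangles(adj, keep):
    """Count triangles of the induced subgraph, each once via its smallest vertex."""
    total = 0
    for u in keep:
        larger = sorted(x for x in adj[u] if x in keep and x > u)
        for v, w in combinations(larger, 2):
            if w in adj[v]:
                total += 1
    return total


def ht_estimate(observed_count, p, motif_size):
    """Horvitz-Thompson estimate: observed count divided by p ** motif_size."""
    return observed_count / p ** motif_size


if __name__ == "__main__":
    # Illustrative graph (an assumption): the complete graph on 30 vertices.
    n = 30
    adj = {i: {j for j in range(n) if j != i} for i in range(n)}
    rng = random.Random(0)
    p = 0.5
    keep = sample_vertices(adj, p, rng)
    print("edge estimate:", ht_estimate(count_edges(adj, keep), p, 2))
    print("triangle estimate:", ht_estimate(count_triangles(adj, keep), p, 3))
```

Dividing by p^r makes the estimator unbiased for the true motif count, since each copy of the motif appears in the sample exactly when all r of its vertices are kept. The sketch does not address the variance or the normal approximation, which are the subject of the paper's conditions.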