High-dimension, low-sample size perspectives in constrained statistical inference: The SARSCoV RNA genome in illustration

Publication type:
Article
Authors:
Sen, Pranab K.; Tsai, Ming-Tien; Jou, Yuh-Shan
Affiliations:
University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina; University of North Carolina Chapel Hill
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214507000000077
Publication date:
2007
Pages:
686-694
Keywords:
coronavirus; diversity; epidemiology; sequences
Abstract:
High-dimensional categorical data models, often with inadequate sample sizes, arise in many fields of application. The SARS epidemic, which originated in southern China in 2002, was traced to a single-stranded, positive-sense RNA virus with a large genome and a moderate mutation rate. The present genomic study serves as a prime illustration motivating appropriate statistical methodology for understanding genomic variation in such high-dimensional categorical data models. Because of underlying restraints, a pseudomarginal approach based on Hamming distance is considered in a constrained statistical inference setup. The union-intersection principle and jackknifing methods are incorporated in exploring appropriate statistical procedures.
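As a rough illustration of two ingredients named in the abstract, the sketch below computes a per-site average pairwise Hamming distance over a set of aligned sequences and attaches a delete-one jackknife standard error to it. It is not the authors' procedure: the pseudomarginal formulation, the constraints, and the union-intersection tests are not reproduced here. The function names and the toy sequences are hypothetical and were chosen only for this example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(seqs):
    """Average Hamming distance over all pairs, scaled by sequence length."""
    pairs = list(combinations(seqs, 2))
    total = sum(hamming(a, b) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


def jackknife(seqs, stat):
    """Delete-one jackknife: bias-corrected estimate and standard error."""
    seqs = list(seqs)
    n = len(seqs)
    full = stat(seqs)
    # Recompute the statistic with each sequence left out in turn.
    loo = [stat(seqs[:i] + seqs[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    variance = (n - 1) / n * sum((v - loo_mean) ** 2 for v in loo)
    estimate = n * full - (n - 1) * loo_mean
    return estimate, variance ** 0.5


# Toy aligned RNA fragments (made up for illustration, not SARS-CoV data).
toy = ["ACGU", "ACGA", "ACCA", "UCGA"]
estimate, std_err = jackknife(toy, mean_pairwise_hamming)
print(estimate, std_err)
```

With only four sequences the standard error is illustrative at best; the point is the mechanics of leaving one sequence out at a time, which remain usable when the number of sites far exceeds the number of sequences.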