A Nonparametric Bayesian Model for Local Clustering With Application to Proteomics
Document type:
Article
Authors:
Lee, Juhee; Mueller, Peter; Zhu, Yitan; Ji, Yuan
Affiliations:
University System of Ohio; Ohio State University; University of Texas System; University of Texas Austin; NorthShore University Health System
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2013.784705
Publication date:
2013
Pages:
775-788
Keywords:
gene-expression data
breast-cancer
microarray data
cells
selection
tumors
Abstract:
We propose a nonparametric Bayesian local clustering (NoB-LoC) approach for heterogeneous data. NoB-LoC implements inference for nested clusters as posterior inference under a Bayesian model. Using protein expression data as an example, the NoB-LoC model defines a protein (column) cluster as a set of proteins that give rise to the same partition of the samples (rows). In other words, the sample partitions are nested within protein clusters. The common clustering of the samples gives meaning to the protein clusters. Any pair of samples might belong to the same cluster for one protein set but to different clusters for another protein set. These local features are different from features obtained by global clustering approaches such as hierarchical clustering, which create only one partition of samples that applies for all the proteins in the dataset. In addition, the NoB-LoC model is different from most other local or nested clustering methods, which define clusters based on common parameters in the sampling model. As an added and important feature, the NoB-LoC method probabilistically excludes sets of irrelevant proteins and samples that do not meaningfully cocluster with other proteins and samples, thus improving the inference on the clustering of the remaining proteins and samples. Inference is guided by a joint probability model for all the random elements. We provide a simulation study and a motivating example to demonstrate the unique features of the NoB-LoC model. Supplementary materials for this article are available online.
Source URL: