Penalized clustering of large-scale functional data with multiple covariates
成果类型:
Article
署名作者:
Ma, Ping; Zhong, Wenxuan
署名单位:
University of Illinois System; University of Illinois Urbana-Champaign
刊物名称:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISSBN:
0162-1459
DOI:
10.1198/016214508000000247
发表日期:
2008
页码:
625-636
关键词:
bayesian confidence-intervals
gene-expression
em algorithm
likelihood
MODEL
yeast
cycle
摘要:
In this article we propose a penalized clustering method for large-scale data with multiple covariates through a functional data approach. In our proposed method, responses and covariates are linked together through nonparametric multivariate functions (fixed effects), which have great flexibility in modeling various function features, such as jump points, branching, and periodicity. Functional ANOVA is used to further decompose multivariate functions in a reproducing kernel Hilbert space and provide associated notions of main effect and interaction. Parsimonious random effects are used to capture various correlation structures. The mixed-effects models are nested under a general mixture model in which the heterogeneity of functional data is characterized. We propose a penalized Henderson's likelihood approach for model fitting and design a rejection-controlled EM algorithm for the estimation. Our method selects smoothing parameters through generalized cross-validation. Furthermore, Bayesian confidence intervals are used to measure the clustering uncertainty. Simulation studies and real-data examples are presented to investigate the empirical performance of the proposed method. Open-source code is available in the R package MFDA.