A Generalized Least-Square Matrix Decomposition
Publication type:
Article
Authors:
Allen, Genevera I.; Grosenick, Logan; Taylor, Jonathan
Affiliations:
Rice University; Baylor College of Medicine; Baylor College Medical Hospital; Stanford University
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2013.852978
Publication date:
2014
Pages:
145-159
Keywords:
Principal component analysis
functional principal
fMRI data
brain
likelihood
selection
sparsity
systems
models
Abstract:
Variables in many big-data settings are structured, arising, for example, from measurements on a regular grid as in imaging and time series or from spatial-temporal measurements as in climate studies. Classical multivariate techniques ignore these structural relationships, often resulting in poor performance. We propose a generalization of principal components analysis (PCA) that is appropriate for massive datasets with structured variables or known two-way dependencies. By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the generalized least-square matrix decomposition (GMD), directly accounts for structural relationships. As many variables in high-dimensional settings are often irrelevant, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. We develop fast computational algorithms using our methods to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on massive datasets. Through simulations and a whole-brain functional MRI example, we demonstrate the utility of our methodology for dimension reduction, signal recovery, and feature selection with high-dimensional structured data. Supplementary materials for this article are available online.
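The abstract's central idea, a best low-rank approximation under a transposable quadratic norm, can be illustrated with a small dense sketch. It is not the article's fast algorithm. It assumes the norm of a residual E is tr(Q E R Eᵀ) for positive-definite row and column matrices Q and R, and that the factors are normalized so that UᵀQU = I and VᵀRV = I. The names `gmd` and `sym_sqrt` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def gmd(X, Q, R, k):
    """Rank-k factors (U, d, V) with X ~ U diag(d) V^T, chosen to minimize
    tr(Q E R E^T) for the residual E (assumes Q and R are positive definite)."""
    Qh, Rh = sym_sqrt(Q), sym_sqrt(R)
    # Ordinary SVD of the transformed data Q^{1/2} X R^{1/2}
    Ut, d, Vt = np.linalg.svd(Qh @ X @ Rh, full_matrices=False)
    # Map the leading factors back: U = Q^{-1/2} U~, V = R^{-1/2} V~
    U = np.linalg.solve(Qh, Ut[:, :k])
    V = np.linalg.solve(Rh, Vt[:k].T)
    return U, d[:k], V
```

With Q and R set to identity matrices this reduces to the ordinary truncated SVD, that is, classical PCA on the data matrix. Forming dense matrix square roots is practical only at small scale, so the sketch should be read as a definition check and not as a substitute for the fast algorithms the abstract refers to.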