Interpreting latent variables in factor models via convex optimization
成果类型:
Article
署名作者:
Taeb, Armeen; Chandrasekaran, Venkat
署名单位:
California Institute of Technology; California Institute of Technology; California Institute of Technology
刊物名称:
MATHEMATICAL PROGRAMMING
ISSN/ISSBN:
0025-5610
DOI:
10.1007/s10107-017-1187-7
发表日期:
2018
页码:
129-154
关键词:
matrix
selection
摘要:
Latent or unobserved phenomena pose a significant difficulty in data analysis as they induce complicated and confounding dependencies among a collection of observed variables. Factor analysis is a prominent multivariate statistical modeling approach that addresses this challenge by identifying the effects of (a small number of) latent variables on a set of observed variables. However, the latent variables in a factor model are purely mathematical objects that are derived from the observed phenomena, and they do not have any interpretation associated to them. A natural approach for attributing semantic information to the latent variables in a factor model is to obtain measurements of some additional plausibly useful covariates that may be related to the original set of observed variables, and to associate these auxiliary covariates to the latent variables. In this paper, we describe a systematic approach for identifying such associations. Our method is based on solving computationally tractable convex optimization problems, and it can be viewed as a generalization of the minimum-trace factor analysis procedure for fitting factor models via convex optimization. We analyze the theoretical consistency of our approach in a high-dimensional setting as well as its utility in practice via experimental demonstrations with real data.
来源URL: