Robust Principal Component Analysis for Power Transformed Compositional Data
Output type:
Article
Authors:
Scealy, J. L.; de Caritat, Patrice; Grunsky, Eric C.; Tsagris, Michail T.; Welsh, A. H.
Affiliations:
Australian National University; Geoscience Australia; Australian National University; Natural Resources Canada; Lands & Minerals Sector - Natural Resources Canada; Geological Survey of Canada; Australian National University
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2014.990563
Publication date:
2015
Pages:
136-148
Keywords:
element concentrations
multivariate analysis
distributions
geochemistry
regression
soils
Abstract:
Geochemical surveys collect sediment or rock samples, measure the concentrations of chemical elements, and typically report these in either weight percent or parts per million (ppm). Usually a large number of elements are measured, and their distributions are often skewed and contain many potential outliers. We present a new robust principal component analysis (PCA) method for geochemical survey data that involves first transforming the compositional data onto a manifold using a relative power transformation. A flexible set of moment assumptions is made that takes the special geometry of the manifold into account. The Kent distribution moment structure arises as a special case when the chosen manifold is the hypersphere. We derive simple moment and robust estimators (RO) of the parameters, which are also applicable in high-dimensional settings. The resulting PCA based on these estimators is carried out in the tangent space and is related to the power transformation method used in correspondence analysis. To illustrate, we analyze major oxide data from the National Geochemical Survey of Australia. When compared with the traditional approach in the literature based on the centered log-ratio transformation, the new PCA method is shown to be more successful at dimension reduction and gives interpretable results.
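The general idea of transforming compositions onto a manifold and then doing PCA in a tangent space can be illustrated with a minimal sketch. It is not the estimator derived in the article. It assumes the square-root case of a power transformation, which maps a closed composition (rows summing to 1) onto the unit hypersphere, and it uses plain, non-robust sample moments. The function names `sqrt_transform` and `tangent_pca` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def sqrt_transform(X):
    """Close each row to sum to 1, then take square roots.

    Assumption for illustration: the square-root power case. Because the
    closed parts sum to 1, each transformed row has unit Euclidean norm,
    so it lies on the unit hypersphere.
    """
    X = np.asarray(X, dtype=float)
    P = X / X.sum(axis=1, keepdims=True)  # closure to the simplex
    return np.sqrt(P)

def tangent_pca(U):
    """Ordinary (non-robust) PCA in the tangent space at the mean direction."""
    mu = U.mean(axis=0)
    mu = mu / np.linalg.norm(mu)          # sample mean direction on the sphere
    T = U - np.outer(U @ mu, mu)          # remove the component along mu
    T = T - T.mean(axis=0)                # center the tangent coordinates
    _, s, Vt = np.linalg.svd(T, full_matrices=False)
    variances = s**2 / (U.shape[0] - 1)   # variance along each principal axis
    return mu, Vt, variances

# Synthetic compositions with 5 parts (not real survey data).
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(5), size=50)
U = sqrt_transform(X)
mu, axes, variances = tangent_pca(U)
```

Because the tangent space at `mu` has one dimension fewer than the number of parts, the last entry of `variances` is numerically zero. A robust variant would replace the sample mean direction and the SVD of the centered tangent coordinates with robust location and scatter estimates.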