BAYESIAN MATCHING OF UNLABELED MARKED POINT SETS USING RANDOM FIELDS, WITH AN APPLICATION TO MOLECULAR ALIGNMENT
Document type:
Article
Authors:
Czogiel, Irina; Dryden, Ian L.; Brignell, Christopher J.
Affiliations:
Max Planck Society; University of Nottingham; University of South Carolina System; University of South Carolina Columbia
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/11-AOAS486
Publication date:
2011
Pages:
2603-2629
Keywords:
Similarity
Abstract:
Statistical methodology is proposed for comparing unlabeled marked point sets, with an application to aligning steroid molecules in chemoinformatics. Methods from statistical shape analysis are combined with techniques for predicting random fields in spatial statistics in order to define a suitable measure of similarity between two marked point sets. Bayesian modeling of the predicted field overlap between pairs of point sets is proposed, and posterior inference of the alignment is carried out using Markov chain Monte Carlo simulation. By representing the fields in reproducing kernel Hilbert spaces, the degree of overlap can be computed without expensive numerical integration. Superimposing entire fields rather than the configuration matrices of point coordinates thereby avoids the problem that there is usually no clear one-to-one correspondence between the points. In addition, mask parameters are introduced in the model, so that partial matching of the marked point sets can be carried out. We also propose an adaptation of the generalized Procrustes analysis algorithm for the simultaneous alignment of multiple point sets. The methodology is illustrated with a simulation study and then applied to a data set of 31 steroid molecules, where the relationship between shape and binding activity to the corticosteroid binding globulin receptor is explored.
Source URL: