The Annals of Statistics, 2010, Issue 3

SUCCESSIVE NORMALIZATION OF RECTANGULAR ARRAYS

Document type:

Article

Authors:

Olshen, Richard A.; Rajaratnam, Bala

Affiliations:

Stanford University

Journal:

ANNALS OF STATISTICS

ISSN/ISBN:

0090-5364

DOI:

10.1214/09-AOS743

Publication date:

2010

Pages:

1638-1664

Keywords:

Abstract:

Standard statistical techniques often require transforming data to have mean 0 and standard deviation 1. Typically, this process of standardization or normalization is applied across subjects when each subject produces a single number. High throughput genomic and financial data often come as rectangular arrays, where each coordinate in one direction concerns subjects who might have different status (case or control, say), and each coordinate in the other designates the outcome for a specific feature, for example, a gene, a polymorphic site or some aspect of a financial profile. It may happen, when analyzing data that arrive as a rectangular array, that one requires BOTH the subjects and the features to be on the same footing. Thus there may be a need to standardize across both rows and columns of the rectangular matrix. There arises the question of how to achieve this double normalization. We propose and investigate the convergence of what seems to us a natural approach to successive normalization, which we learned from our colleague Bradley Efron. We also study the implementation of the method on simulated data and on data that arose from scientific experimentation.
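
The abstract does not spell out the iteration, so the following is only a minimal sketch of one plausible reading of successive normalization: standardize every row to mean 0 and standard deviation 1, then every column, and repeat until the array stops changing. The function names, the tolerance `tol`, the sweep limit `max_iter`, the choice to start with rows, and the use of the population standard deviation (dividing by n) are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def standardize_rows(a):
    """Return a copy of `a` with every row shifted to mean 0 and scaled to SD 1."""
    out = []
    for row in a:
        n = len(row)
        mean = sum(row) / n
        # Population SD (divide by n): an assumption, not stated in the abstract.
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / n)
        if sd == 0.0:
            raise ValueError("a constant row or column cannot be scaled to SD 1")
        out.append([(x - mean) / sd for x in row])
    return out


def transpose(a):
    """Swap rows and columns of a list-of-lists array."""
    return [list(col) for col in zip(*a)]


def successive_normalization(a, tol=1e-10, max_iter=10_000):
    """Alternate row and column standardization until the array stops changing.

    Returns the normalized array and the number of sweeps performed.
    """
    x = [[float(v) for v in row] for row in a]
    for sweep in range(1, max_iter + 1):
        y = standardize_rows(x)                        # rows: mean 0, SD 1
        y = transpose(standardize_rows(transpose(y)))  # columns: mean 0, SD 1
        # Largest entrywise change produced by this sweep.
        change = max(abs(u - v) for ry, rx in zip(y, x) for u, v in zip(ry, rx))
        x = y
        if change < tol:
            return x, sweep
    raise RuntimeError("no convergence within max_iter sweeps")


if __name__ == "__main__":
    random.seed(0)
    data = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(5)]
    normalized, sweeps = successive_normalization(data)
    print(f"stopped after {sweeps} sweeps")
```

Each column step generally disturbs the row standardization achieved just before it, which is why a single pass is not enough and why convergence of the repeated procedure is a question in its own right. The sketch raises an error on a constant row or column because such a line cannot be rescaled to standard deviation 1.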