Analysis of two-way layout of count data involving multiple counts in each cell
Type:
Article
Authors:
Paul, SR; Banerjee, T
Affiliations:
University of Windsor; University of Calcutta
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.2307/2670056
Publication date:
1998
Pages:
1419-1429
Keywords:
Poisson regression
tests
models
overdispersion
proportions
hypotheses
Abstract:
Multiple counts may occur in each cell of an a x b two-way layout (balanced or unbalanced) of two fixed factors A and B. Standard log-linear model analysis based on a Poisson distribution assumption for the cell counts is not applicable here, either because of the unbalanced nature of the table or because the Poisson distribution assumption is not valid. We develop C(alpha) tests for interaction and main effects assuming the data to be Poisson distributed and also assuming that the data within the cells have extra (over/under) dispersion beyond that explained by a Poisson distribution. For this we consider an extended negative binomial distribution and a semiparametric model using the quasi-likelihood. We show that in all situations the C(alpha) tests for interaction are of very simple forms. For C(alpha) tests for the main effect in the absence of interaction, such simplification is possible only under certain conditions. A score test for detecting extra dispersion in the presence of interaction is also obtained and is of simple form. This test is to be used before testing for interaction. If evidence of extra dispersion is found, then the score tests for interaction developed under the assumption of extra-dispersed distributions are used; otherwise, the score test developed under the assumption of a Poisson distribution is used. Further, maximum likelihood (ML) and quasi-likelihood (QL) estimates under different relevant null hypotheses are obtained. For testing for interaction under the assumption of Poisson-distributed data, we show that to obtain ML estimates of a + b - 1 parameters, we need to solve only min(a - 1, b - 1) equations simultaneously. For example, for a 2 x 10 table we need to solve only one equation iteratively. When data are extra dispersed, we need to solve a + b equations simultaneously to obtain ML or QL estimates of the a + b parameters.
For testing for a main effect in the absence of interaction, closed-form estimates exist when data are Poisson distributed. However, when data follow a negative binomial or a semiparametric model based on the knowledge of only the first two moments (quasi-likelihood), only one equation involving the dispersion parameter is to be solved iteratively. Some simulations are performed, and two real datasets are analyzed for illustrative purposes.
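
The abstract does not give the form of the score test for extra dispersion, so the following is only an illustrative sketch, not the authors' statistic. It assumes the familiar score statistic for negative binomial-type dispersion against a Poisson null, T = sum[(y - mu)^2 - y] / sqrt(2 * sum mu^2), with each cell's mean estimated by its sample mean (the ML estimate when interaction is present and every cell has its own mean). The function name `dispersion_score_test` and the list-of-lists data layout are hypothetical choices made for this example.

```python
import math


def dispersion_score_test(cells):
    """Score-type test for extra dispersion in a two-way layout of counts.

    `cells` is a list with one entry per (i, j) cell; each entry is the
    list of counts observed in that cell (cell sizes may differ).
    Assumption: the statistic used here is the standard score statistic
    for negative binomial dispersion against Poisson, not necessarily
    the exact form derived in the paper.
    Returns (z, two_sided_p) using a standard normal reference.
    """
    numerator = 0.0
    sum_mu_sq = 0.0
    for counts in cells:
        if not counts:
            continue  # an empty cell carries no information
        mu = sum(counts) / len(counts)  # Poisson ML estimate of the cell mean
        for y in counts:
            numerator += (y - mu) ** 2 - y
            sum_mu_sq += mu ** 2
    if sum_mu_sq == 0.0:
        raise ValueError("all cell means are zero; statistic is undefined")
    z = numerator / math.sqrt(2.0 * sum_mu_sq)
    # Two-sided p-value, since the abstract allows over- and underdispersion.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Toy 1 x 2 layout with two counts per cell (made-up numbers).
z, p = dispersion_score_test([[0, 10], [5, 5]])
print(round(z, 4), round(p, 4))
```

A large positive z points toward overdispersion and a large negative z toward underdispersion; following the strategy described in the abstract, the outcome would decide whether the Poisson-based or the extra-dispersion-based test for interaction is applied next.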