Date Available

12-16-2023

Year of Publication

2023

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Arts and Sciences

Department/School/Program

Statistics

First Advisor

Dr. Derek S. Young

Abstract

For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize data over- or under-dispersion. While the classic parameterization of the CMP has been well-studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be composed of subpopulations, each possibly having a different degree of data dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An EM algorithm is constructed to perform maximum likelihood estimation of the model, while bootstrapping is employed to obtain estimated standard errors. A simulation study is used to demonstrate the flexibility of the proposed mixture model relative to mixtures of Poissons and mixtures of negative binomials. An analysis of dog mortality data is presented.
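As a rough illustration of the over- and under-dispersion mentioned above, the following minimal sketch evaluates the CMP probability mass function in its classic parameterization, P(X = x) proportional to lambda^x / (x!)^nu, and compares the mean and variance for a few values of nu. It is not the dissertation's code: the function names, the truncation point `max_count`, and the parameter values are assumptions chosen for illustration, and the mean-parameterized version and the EM algorithm are not implemented here.

```python
import math


def cmp_pmf(lam, nu, max_count=200):
    """Classic-parameterization CMP pmf on 0..max_count (illustrative sketch).

    Assumption: the infinite normalizing sum is truncated at max_count,
    which is adequate only when the tail mass beyond it is negligible.
    """
    # Unnormalized log-weights: x*log(lam) - nu*log(x!)
    log_w = [x * math.log(lam) - nu * math.lgamma(x + 1)
             for x in range(max_count + 1)]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def mean_and_variance(pmf):
    """Mean and variance of a pmf supported on 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, var


if __name__ == "__main__":
    # nu = 1 recovers the Poisson; nu < 1 gives over-dispersion,
    # nu > 1 gives under-dispersion.
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(cmp_pmf(3.0, nu))
        print(f"nu={nu}: mean={mean:.4f}, variance={var:.4f}")
```

In this parameterization the mean is not a model parameter but a function of both lambda and nu, which is the drawback that the mean-parameterized version is meant to address.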

As a generalization of the Poisson distribution and a common alternative to other discrete distributions, the Conway-Maxwell-Poisson (CMP) distribution has the flexibility to explicitly characterize data over- or under-dispersion. The mean-parameterized version of the CMP has received increasing attention in the literature due to its ability to directly model the data mean. When the mean further depends on covariates, the mean-parameterized CMP regression model can be treated in a generalized linear models framework. In this work, we propose a mixture of mean-parameterized CMP regressions model for data that are potentially composed of subpopulations with different conditional means and varying degrees of dispersion. An EM algorithm is constructed to find maximum likelihood estimates of the model. A simulation study is performed to test the proposed mixture of mean-parameterized CMP regressions model, and to compare it to model fits using mixtures of Poisson regressions and mixtures of negative binomial regressions. An analysis of the spread of a viral infection in potato plants is performed using these different mixtures of count regressions models, where we show the mixture of mean-parameterized CMP regressions to be an effective model.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023/469
