A flexible univariate moving average time-series model for dispersed count data

Correspondence: kfs7@georgetown.edu; Department of Mathematics and Statistics, Georgetown University, Washington, DC, USA; Center for Statistical Research and Methodology Division, U.S. Census Bureau, Washington, DC, USA

Abstract

Al-Osh and Alzaid (1988) consider a Poisson moving average (PMA) model to describe the relation among integer-valued time series data; this model, however, is constrained by the underlying equi-dispersion assumption for count data (i.e., that the variance and the mean are equal). This work instead introduces a flexible integer-valued moving average model for count data that contain over- or under-dispersion via the Conway-Maxwell-Poisson (CMP) distribution and related distributions. This first-order sum-of-Conway-Maxwell-Poissons moving average (SCMPMA(1)) model offers a generalizable construct that includes the PMA (among others) as a special case. We highlight the SCMPMA model properties and illustrate its flexibility via simulated data examples.

Keywords: Over-dispersion, Under-dispersion, Conway-Maxwell-Poisson (COM-Poisson or CMP), Sum-of-Conway-Maxwell-Poisson (sCMP)

Introduction

Integer-valued thinning-based models have been proposed to model time series data represented as counts. Al-Osh and Alzaid (1988) introduce a generally defined integer-valued moving average (INMA) process as an analog to the moving average (MA) model for continuous data, which assumes an underlying Gaussian distribution. This INMA process instead utilizes a thinning operator that maintains an integer-valued range of possible outcomes. To form such a model, they consider the "survivals" of independent and identically distributed (iid) non-negative integer-valued random innovations to maintain and ensure discrete data outcomes (Weiss 2021). Al-Osh and Alzaid (1988) particularly consider a first-order Poisson moving average (PMA(1)), i.e. a stationary sequence $\{U_t\}$ of the form $U_t = \gamma \circ \epsilon_{t-1} + \epsilon_t$, where $\{\epsilon_t\}$ is a sequence of iid Poisson($\eta$) random variables and $\gamma \circ \epsilon = \sum_{i=1}^{\epsilon} B_i$ for a sequence of iid Bernoulli($\gamma$) random variables $\{B_i\}$ independent of $\{\epsilon_t\}$. By design, the PMA(1) is an INMA whose maximum stay time in the sequence is two time units. Consequently, the components of $U_t$ are dependent, while the components $\epsilon_t$ and $\gamma \circ \epsilon_{t-1}$ are independent.

Given the PMA(1) structure,

$$E(U_t) = \mathrm{Var}(U_t) = (1 + \gamma)\eta, \qquad (1)$$

and the covariance of consecutive variables is $\mathrm{Cov}(U_{t-1}, U_t) = \gamma\eta$; this implies that the correlation is

$$\rho_U(r) = \mathrm{Corr}(U_{t-r}, U_t) = \begin{cases} \dfrac{\gamma}{1+\gamma}, & r = 1 \\ 0, & r > 1. \end{cases} \qquad (2)$$
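The PMA(1) construction above is easy to check by direct simulation. The following is a minimal sketch in R (the function name and simulation settings are ours, not from the paper); the sample mean, variance, and lag-one autocorrelation of a long simulated path should approximate Eqs. (1) and (2).

```r
# Minimal sketch: simulate a PMA(1) path U_t = gamma o eps_{t-1} + eps_t,
# where "o" denotes binomial thinning and eps_t ~ iid Poisson(eta).
simulate_pma1 <- function(n, eta, gamma) {
  eps <- rpois(n + 1, lambda = eta)                    # eps_0, ..., eps_n
  thinned <- rbinom(n, size = eps[1:n], prob = gamma)  # gamma o eps_{t-1}
  thinned + eps[2:(n + 1)]                             # U_1, ..., U_n
}

set.seed(1)
u <- simulate_pma1(10000, eta = 2, gamma = 0.4)
c(mean(u), var(u))            # both should be near (1 + gamma) * eta = 2.8
acf(u, plot = FALSE)$acf[2]   # should be near gamma / (1 + gamma) = 0.2857
```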
Meanwhile, the probability generating function (pgf) of $U_t$ is $\Phi_{U_t}(u) = e^{-\eta(1+\gamma)(1-u)}$, the joint pgf of $\{U_1, \ldots, U_r\}$ is

$$\Phi_{U_1,\ldots,U_r}(u_1, \ldots, u_r) = \exp\left\{-\eta\left[r + \gamma - (1-\gamma)\sum_{i=1}^{r} u_i - \gamma(u_1 + u_r) - \gamma\sum_{i=1}^{r-1} u_i u_{i+1}\right]\right\}$$

(which implies that time reversibility holds for the PMA), and the pgf of $T_{U,r} = \sum_{i=1}^{r} U_i$ is

$$\Phi_{T_{U,r}}(u) = \exp\left\{-\eta\left[(1-\gamma)r + 2\gamma\right](1-u) - \eta\gamma(r-1)(1-u^2)\right\}.$$

Al-Osh and Alzaid (1988) note that $T_{U,r}$ does not have a Poisson distribution, which is in contrast to the standard MA(1) process. The conditional mean and variance of $U_{t+1}$ given $U_t = u$ are both linear in $U_t$, namely

$$E(U_{t+1} \mid U_t = u) = \eta + \gamma u/(1+\gamma), \ \text{and} \qquad (3)$$

$$\mathrm{Var}(U_{t+1} \mid U_t = u) = \eta + \gamma u/(1+\gamma). \qquad (4)$$

The PMA is a natural choice for modeling an integer-valued process, in part because of its tractability (Al-Osh and Alzaid 1988). This model, however, is limited by its constraining equi-dispersion property, i.e. the assumption that the mean and variance of the underlying process are equal. Real data do not generally conform to this construct (Hilbe 2014; Weiss 2018); they usually display over-dispersion relative to the Poisson model (i.e. the variance is greater than the mean); however, integer-valued data that express under-dispersion relative to the Poisson (i.e. the variance is less than the mean) are surfacing with greater frequency. Accordingly, it would be fruitful to instead consider a flexible time series model that can accommodate data over- and/or under-dispersion.

Alzaid and Al-Osh (1993) introduce a first-order generalized Poisson moving average (GPMA(1)) process as an alternative to the PMA. The associated model has the form

$$W_t = Q_t^*(\epsilon_{t-1}^*) + \epsilon_t^*, \quad t = 0, 1, 2, \ldots, \qquad (5)$$

where $\{\epsilon_t^*\}$ is a sequence of iid generalized Poisson GP($\mu^*, \theta$) random variables, and $\{Q_t^*(\cdot)\}$ is a sequence of quasi-binomial QB($p^*, \theta/\mu^*, \cdot$) random operators independent of $\{\epsilon_t^*\}$. As with the PMA, $W_{t+r}$ and $W_t$ are independent for $|r| > 1$. The marginal distribution of $W_t$ is GP($(1+p^*)\mu^*, \theta$). Recognizing the relationship between moving average and autoregressive models, Alzaid and Al-Osh (1993) equate terms in this GPMA(1) model to their first-order generalized Poisson autoregressive (GPAR(1)) counterpart,

$$W_t = Q_t(W_{t-1}) + \epsilon_t, \quad t = 0, 1, 2, \ldots, \qquad (6)$$

where $\{\epsilon_t\}$ is a sequence of iid GP($q\mu, \theta$) random variables with $q = 1 - p$, and $\{Q_t(\cdot)\}$ is a sequence of QB($p, \theta/\mu, \cdot$) random operators, independent of $\{\epsilon_t\}$; i.e., they let $\mu = (1 + p^*)\mu^*$ and $p = \frac{p^*}{1+p^*}$. The bivariate pgf of $W_{t+1}$ and $W_t$ can thus be represented as

$$\Phi_{W_{t+1},W_t}(u_1, u_2) = \exp\left\{\mu^*\left(A_\theta(u_1) + A_\theta(u_2) - 2\right) + \mu^* p^*\left(A_\theta(u_1 u_2) - 1\right)\right\} \qquad (7)$$

$$= \exp\left\{\mu q\left(A_\theta(u_1) + A_\theta(u_2) - 2\right) + \mu p\left(A_\theta(u_1 u_2) - 1\right)\right\}, \qquad (8)$$

where $A_\theta(s)$ is the inverse function satisfying $A_\theta\!\left(s e^{-\theta(s-1)}\right) = s$; see Alzaid and Al-Osh (1993). This substitution in Eq. (7) to obtain Eq. (8) further illustrates the relationship between the GPMA(1) and GPAR(1) models such that they have the same joint pgf. Eq. (8) and the related GPAR work of Alzaid and Al-Osh (1993) therefore show that

$$E(W_t \mid W_{t-1} = w) = pw + \frac{q\mu}{1-\theta}. \qquad (9)$$

The joint pgf of $(W_t, W_{t-1}, \ldots, W_{t-r+1})$ is given by

$$\Phi(u_1, \ldots, u_r) = \exp\left\{\mu q\sum_{i=1}^{r}\left(A_\theta(u_i) - 1\right) + \mu p\sum_{i=1}^{r-1}\left(A_\theta(u_i u_{i+1}) - 1\right)\right\}. \qquad (10)$$

From the joint pgf, we see that the GPMA(1) is also time-reversible, because it has the same dynamics if time is reversed. Further, the pgf associated with the total counts occurring during time lag $r$
(i.e. $T_{W,r} = \sum_{i=1}^{r} W_{t-r+i}$) is

$$\Phi_{T_{W,r}}(u) = \exp\left\{\mu q r\left(A_\theta(u) - 1\right) + \mu p (r-1)\left(A_\theta(u^2) - 1\right)\right\}.$$

Alzaid and Al-Osh (1993) note that this result extends the analogous PMA result to the broader GPMA(1) model. Finally, the GPMA autocorrelation function is

$$\rho_W(r) = \mathrm{Corr}(W_t, W_{t+r}) = \begin{cases} p, & |r| = 1 \\ 0, & |r| > 1, \end{cases}$$

where $p = p^*(1 + p^*)^{-1}$; by definition, $\rho_W(r) \in [0, 0.5]$ (Alzaid and Al-Osh 1993). Even though the GPMA can be considered to model over- or under-dispersed count time series, it may not be a viable option for count data that express extreme under-dispersion; see, e.g. Famoye (1993). This work instead introduces another alternative for modeling integer-valued time series data. The subsequent writing proceeds as follows. We first provide background regarding the probability distributions that motivate the development of our flexible INMA model. Then, we introduce the SCMPMA(1) model to the reader and discuss its statistical properties. The subsequent section illustrates the model flexibility through simulated and real data examples. Finally, the manuscript concludes with discussion.

Motivating distributions

While the above constructs show increased ability and improvement towards modeling integer-valued time series data with various forms of dispersion, each of the models suffers from respective limitations. In order to develop and describe our SCMPMA(1), we first introduce its underlying motivating distributions: the CMP distribution and its generalized sum-of-CMPs distribution (sCMP), as well as the Conway-Maxwell-Binomial (CMB) along with a generalized CMB (gCMB) distribution.

The Conway-Maxwell-Poisson distribution and its generalization

The Conway-Maxwell-Poisson (CMP) distribution (introduced by Conway and Maxwell (1962), and revived by Shmueli et al. (2005)) is a viable count distribution that generalizes the Poisson distribution in light of potential data dispersion. The CMP probability mass function (pmf) takes the form

$$P(X = x \mid \lambda, \nu) = \frac{\lambda^{x}}{(x!)^{\nu}\,\zeta(\lambda, \nu)}, \quad x = 0, 1, 2, \ldots, \qquad (11)$$

for a random variable $X$, where $\lambda = E(X^{\nu}) \ge 0$, $\nu \ge 0$ is the associated dispersion parameter, and $\zeta(\lambda, \nu) = \sum_{s=0}^{\infty} \frac{\lambda^{s}}{(s!)^{\nu}}$ is the normalizing constant. The CMP distribution includes three well-known distributions as special cases, namely the Poisson ($\nu = 1$), geometric ($\nu = 0$, $\lambda < 1$), and Bernoulli ($\nu \to \infty$, with success probability $\frac{\lambda}{1+\lambda}$) distributions.

The associated pgf of $X$ is $\Phi_X(u) = E(u^X) = \frac{\zeta(\lambda u, \nu)}{\zeta(\lambda, \nu)}$, and its moment generating function (mgf) is $M_X(u) = E(e^{Xu}) = \frac{\zeta(\lambda e^{u}, \nu)}{\zeta(\lambda, \nu)}$. The moments can meanwhile be represented recursively as

$$E(X^{g+1}) = \begin{cases} \lambda\left[E(X+1)\right]^{1-\nu}, & g = 0 \\ \lambda \dfrac{\partial}{\partial \lambda} E(X^{g}) + E(X)E(X^{g}), & g > 0. \end{cases} \qquad (12)$$

In particular, the expected value and variance can be written in the form and approximated respectively as

$$E(X) = \frac{\partial \ln \zeta(\lambda, \nu)}{\partial \ln \lambda} \approx \lambda^{1/\nu} - \frac{\nu - 1}{2\nu}, \qquad (13)$$

$$\mathrm{Var}(X) = \frac{\partial E(X)}{\partial \ln \lambda} \approx \frac{1}{\nu}\lambda^{1/\nu}, \qquad (14)$$

where the approximations are especially good for $\nu \le 1$ or $\lambda > 10^{\nu}$ (Shmueli et al. 2005). This distribution is a member of the exponential family, where the joint pmf of the random sample $\mathbf{x} = (x_1, \ldots, x_N)$ is

$$P(\mathbf{x} \mid \lambda, \nu) = \frac{\lambda^{\sum_{i=1}^{N} x_i}}{\prod_{i=1}^{N}(x_i!)^{\nu}} \cdot \zeta^{-N}(\lambda, \nu) = \lambda^{S_1} \exp(-\nu S_2)\,\zeta^{-N}(\lambda, \nu),$$

where $S_1 = \sum_{i=1}^{N} x_i$ and $S_2 = \sum_{i=1}^{N} \log(x_i!)$ are joint sufficient statistics for $\lambda$ and $\nu$. Further, because the CMP distribution belongs to the exponential family, the conjugate prior distribution has the form $h(\lambda, \nu) = \lambda^{a-1} e^{-\nu b}\, \zeta^{-c}(\lambda, \nu)\, \delta(a, b, c)$, where $\lambda > 0$, $\nu \ge 0$, and $\delta(a, b, c)$ is a normalizing constant such that $\delta^{-1}(a, b, c) = \int_0^{\infty}\!\!\int_0^{\infty} \lambda^{a-1} e^{-b\nu}\, \zeta^{-c}(\lambda, \nu)\, d\lambda\, d\nu < \infty$.
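Numerically, the pmf in Eq. (11) is straightforward to evaluate by truncating the series that defines $\zeta(\lambda, \nu)$. The sketch below is our own helper (the truncation point is an assumption, not part of the paper) and is reused in later examples.

```r
# Minimal sketch: CMP(lambda, nu) pmf with zeta(lambda, nu) truncated at
# `max_terms` (increase max_terms for large lambda or very small nu).
dcmp <- function(x, lambda, nu, max_terms = 300) {
  s <- 0:max_terms
  log_terms <- s * log(lambda) - nu * lgamma(s + 1)   # log of lambda^s / (s!)^nu
  log_zeta <- max(log_terms) + log(sum(exp(log_terms - max(log_terms))))  # log-sum-exp
  exp(x * log(lambda) - nu * lgamma(x + 1) - log_zeta)
}

dcmp(0:3, lambda = 0.5, nu = 1)   # matches dpois(0:3, 0.5) in the Poisson special case
```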
Meanwhile, letting $X_* = \sum_{i=1}^{n} X_i$ for iid random variables $X_i \sim$ CMP($\lambda, \nu$), $i = 1, \ldots, n$, we say that $X_*$ is distributed as a sum-of-CMPs [denoted sCMP($\lambda, \nu, n$)] variable, and has the pmf

$$P(X_* = x_*) = \frac{\lambda^{x_*}}{(x_*!)^{\nu}\,\zeta^{n}(\lambda, \nu)} \sum_{\substack{a_1, \ldots, a_n = 0 \\ a_1 + \cdots + a_n = x_*}}^{x_*} \binom{x_*}{a_1, \cdots, a_n}^{\nu}, \quad x_* = 0, 1, 2, \ldots,$$

where $\zeta^{n}(\lambda, \nu)$ is the $n$th power of $\zeta(\lambda, \nu)$, and $\binom{x_*}{a_1, \cdots, a_n} = \frac{x_*!}{a_1! \cdots a_n!}$ is a multinomial coefficient. The sCMP($\lambda, \nu, n$) distribution encompasses the Poisson distribution with rate parameter $n\lambda$ (for $\nu = 1$), the negative binomial($n$, $1 - \lambda$) distribution (for $\nu = 0$ and $\lambda < 1$), and the Binomial($n, p$) distribution as $\nu \to \infty$ with success probability $p = \frac{\lambda}{\lambda + 1}$ as special cases. Further, for $n = 1$, the sCMP($\lambda, \nu, n = 1$) is simply the CMP($\lambda, \nu$) distribution. The mgf and pgf for a sCMP($\lambda, \nu, n$) random variable $X_*$ are

$$M_{X_*}(t) = \left[\frac{\zeta(\lambda e^{t}, \nu)}{\zeta(\lambda, \nu)}\right]^{n} \quad \text{and} \quad \Phi_{X_*}(t) = \left[\frac{\zeta(\lambda t, \nu)}{\zeta(\lambda, \nu)}\right]^{n},$$

respectively; accordingly, the sCMP($\lambda, \nu, n$) has mean $E(X_*) = nE(X)$ and variance $\mathrm{Var}(X_*) = n\mathrm{Var}(X)$, where $E(X)$ and $\mathrm{Var}(X)$ are defined in Eqs. (13)-(14), respectively. Invariance under addition holds for two independent sCMP distributions with the same rate and dispersion parameters. See Sellers et al. (2017) for additional information regarding the sCMP distribution.
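Because an sCMP($\lambda, \nu, n$) variable is by construction a sum of $n$ iid CMP($\lambda, \nu$) variables, its pmf can also be computed by repeated convolution of the CMP pmf rather than by evaluating the multinomial sum directly. A minimal sketch, reusing `dcmp` from above and truncating the support at `max_x` (an assumption):

```r
# Minimal sketch: sCMP(lambda, nu, n) pmf on 0..max_x via n-fold convolution of
# the (truncated) CMP(lambda, nu) pmf.
dscmp <- function(x, lambda, nu, n, max_x = 300) {
  p <- dcmp(0:max_x, lambda, nu)
  total <- p
  if (n > 1) {
    for (i in 2:n) {
      # exact discrete convolution of `total` with one more CMP pmf
      total <- sapply(0:max_x, function(k) sum(total[1:(k + 1)] * p[(k + 1):1]))
    }
  }
  total[x + 1]
}

sum((0:300) * dscmp(0:300, lambda = 0.5, nu = 1, n = 3))  # ~ n * lambda = 1.5 (Poisson case)
```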
The Conway-Maxwell-Binomial distribution and its generalization

The Conway-Maxwell-Binomial distribution of Kadane (2016) (also known as the Conway-Maxwell-Poisson-Binomial distribution by Borges et al. (2014)) is a three-parameter generalization of the Binomial distribution. Denoted as CMB($d, p, \nu$) distributed, its pmf is

$$P(Y = y) = \frac{\binom{d}{y}^{\nu} p^{y} (1-p)^{d-y}}{\chi(p, \nu, d)}, \quad y = 0, \ldots, d, \qquad (15)$$

for some random variable $Y$, where $0 \le p \le 1$, $\nu \in \mathbb{R}$, and $\chi(p, \nu, d) = \sum_{y=0}^{d} \binom{d}{y}^{\nu} p^{y}(1-p)^{d-y}$ is the associated normalizing constant. The Binomial($d, p$) distribution is the special case of the CMB($d, p, \nu$) where $\nu = 1$. Meanwhile, $\nu > (<)\,1$ corresponds to under-dispersion (over-dispersion) relative to the Binomial distribution. For $\nu \to \infty$, the pmf is concentrated on the point $dp$ while, for $\nu \to -\infty$, the pmf is concentrated at 0 or $d$. For independent $X_i \sim$ CMP($\lambda_i, \nu$), $i = 1, 2$, the conditional distribution of $X_1$ given that $X_1 + X_2 = d$ has a CMB($d, \frac{\lambda_1}{\lambda_1 + \lambda_2}, \nu$) distribution.

The pgf and mgf of $Y$ have the form

$$\Phi_Y(u) = E\left(u^{Y}\right) = \frac{\tau\!\left(\frac{up}{1-p}, \nu, d\right)}{\tau\!\left(\frac{p}{1-p}, \nu, d\right)} \quad \text{and} \quad M_Y(u) = \frac{\tau\!\left(\frac{p e^{u}}{1-p}, \nu, d\right)}{\tau\!\left(\frac{p}{1-p}, \nu, d\right)}, \qquad (16)$$

respectively, where $\tau(\theta_*, \nu, d) = \sum_{y=0}^{d} \binom{d}{y}^{\nu} \theta_*^{y}$ for some $\theta_*$. The CMB distribution is a member of the exponential family whose joint pmf of the random sample $\mathbf{y} = \{y_1, \ldots, y_N\}$ is

$$P(\mathbf{y} \mid p, \nu) \propto (1-p)^{dN}\left(\frac{p}{1-p}\right)^{\sum_{i=1}^{N} y_i} \prod_{i=1}^{N}\left[\frac{d!}{y_i!\,(d-y_i)!}\right]^{\nu} \propto \exp\left\{S_{*1}\log\frac{p}{1-p} - \nu S_{*2}\right\},$$

where $S_{*1} = \sum_{i=1}^{N} y_i$ and $S_{*2} = \sum_{i=1}^{N} \log\left[y_i!\,(d-y_i)!\right]$ are the joint sufficient statistics for $p$ and $\nu$. Further, its existence as a member of the exponential family implies that a conjugate prior family exists of the form

$$h(\theta_*, \nu) = \theta_*^{a-1} e^{-\nu b}\, \omega^{-c}(\theta_*, \nu)\, \psi(a, b, c), \quad 0 < \theta_* < \infty,\ 0 < \nu < \infty,$$

where $\omega(\theta_*, \nu) = \sum_{y=0}^{d} \theta_*^{y}/\left[y!\,(d-y)!\right]^{\nu}$ and $\psi^{-1}(a, b, c) = \int_0^{\infty}\!\!\int_0^{\infty} \theta_*^{a-1} e^{-\nu b}\, \omega^{-c}(\theta_*, \nu)\, d\theta_*\, d\nu < \infty$ (Kadane 2016).

Sellers et al. (2017) further introduce a generalized Conway-Maxwell-Binomial (gCMB) distribution whose pmf is

$$P(Z = z) \propto p^{z}(1-p)^{s-z}\binom{s}{z}^{\nu}\left[\sum_{\substack{a_1,\ldots,a_{n_1}=0\\ a_1+\cdots+a_{n_1}=z}}^{z}\binom{z}{a_1,\ldots,a_{n_1}}^{\nu}\right]\left[\sum_{\substack{b_1,\ldots,b_{n_2}=0\\ b_1+\cdots+b_{n_2}=s-z}}^{s-z}\binom{s-z}{b_1,\ldots,b_{n_2}}^{\nu}\right] \qquad (17)$$

for a random variable $Z$ with parameters $(p, \nu, s, n_1, n_2)$. As with the conditional probability of a CMP random variable given the sum of it and another independent CMP random variable sharing the same dispersion parameter, a special case of a gCMB distribution can be derived as the conditional distribution of $X_{*1}$, given the sum $X_{*1} + X_{*2} = d$, for independent sCMP random variables $X_{*i} \sim$ sCMP($\lambda_i, \nu, n_i$), $i = 1, 2$; the resulting distribution is analogously a gCMB($\frac{\lambda_1}{\lambda_1 + \lambda_2}, \nu, d, n_1, n_2$) distribution. The gCMB distribution contains several special cases, including the CMB($d, p, \nu$) distribution (for $n_1 = n_2 = 1$); the Binomial($d, p$) distribution (when $n_1 = n_2 = 1$ and $\nu = 1$); and, for $\lambda_1 = \lambda_2 = \lambda$, the hypergeometric distribution when $\nu \to \infty$ and the negative hypergeometric distribution when $\nu = 0$ and $\lambda < 1$.

First-order sCMP time series models

This section highlights two first-order models for discrete time series data that have a sCMP marginal distribution, namely the first-order sCMP autoregressive (SCMPAR(1)) model, and a first-order sCMP moving average (SCMPMA(1)) model with the same marginal distribution structure.

First-order sCMP autoregressive (SCMPAR(1)) model

Sellers et al. (2020) introduce a first-order sCMP autoregressive (SCMPAR(1)) model to describe count data correlated in time that express over- or under-dispersion. Based on the sCMP and gCMB distributions, respectively (as described in the "Motivating distributions" section, with more detail available in Sellers et al. (2017)), we use the sCMP distribution to model the marginals of the first-order integer-valued autoregressive (INAR(1)) process as

$$X_t = C_t(X_{t-1}) + \epsilon_t, \quad t = 1, 2, \ldots, \qquad (18)$$

where $\epsilon_t \sim$ sCMP($\lambda, \nu, n_2$), and $\{C_t(\bullet) : t = 1, 2, \ldots\}$ is a sequence of independent gCMB($\frac{1}{2}, \nu, \bullet, n_1, n_2$) operators, independent of $\{\epsilon_t\}$. This flexible INAR(1) model contains the first-order Poisson autoregressive (PAR(1)) model as described in several references (Al-Osh and Alzaid 1987; McKenzie 1988; Weiss 2008), and the first-order binomial autoregressive model of Al-Osh and Alzaid (1991), as special cases. It likewise contains an INAR(1) model that allows for negative binomial marginals with a thinning operator whose pmf is negative hypergeometric.

The SCMPAR(1) model is yet another special case of the infinitely divisible convolution-closed class of first-order autoregressive (AR(1)) models described in Joe (1996), and satisfies the Markov property with the transition probability

$$P(X_t = x_t \mid X_{t-1} = x_{t-1}) = \sum_{k=0}^{\min(x_t, x_{t-1})} \frac{\binom{x_{t-1}}{k}^{\nu}\left[\sum\limits_{\substack{a_1,\ldots,a_{n_1}=0\\ a_1+\cdots+a_{n_1}=k}}^{k}\binom{k}{a_1,\ldots,a_{n_1}}^{\nu}\right]\left[\sum\limits_{\substack{b_1,\ldots,b_{n_2}=0\\ b_1+\cdots+b_{n_2}=x_{t-1}-k}}^{x_{t-1}-k}\binom{x_{t-1}-k}{b_1,\ldots,b_{n_2}}^{\nu}\right]}{\sum\limits_{\substack{c_1,\ldots,c_{n_1+n_2}=0\\ c_1+\cdots+c_{n_1+n_2}=x_{t-1}}}^{x_{t-1}}\binom{x_{t-1}}{c_1,\ldots,c_{n_1+n_2}}^{\nu}} \times \frac{\lambda^{x_t-k}}{\left[(x_t-k)!\right]^{\nu}\,\zeta^{n_2}(\lambda, \nu)}\sum_{\substack{d_1,\ldots,d_{n_2}=0\\ d_1+\cdots+d_{n_2}=x_t-k}}^{x_t-k}\binom{x_t-k}{d_1,\ldots,d_{n_2}}^{\nu}. \qquad (19)$$

The SCMPAR(1) model has an ergodic Markov chain; thus $X_t$ has a stationary sCMP($\lambda, \nu, n_1 + n_2$) distribution that is unique.
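The gCMB thinning operator $C_t(\cdot)$ in Eq. (18) (and in the SCMPMA(1) model introduced below) can be sampled directly through the conditional construction just described: given the value $x$ to be thinned, draw from the distribution of an sCMP($\lambda, \nu, n_1$) component given that it and an independent sCMP($\lambda, \nu, n_2$) component sum to $x$. A minimal sketch, reusing `dscmp`; because both components share the same rate parameter, $\lambda$ cancels in the conditioning, so any positive value may be used.

```r
# Minimal sketch: draw C(x) ~ gCMB(1/2, nu, x, n1, n2) as X1 | X1 + X2 = x,
# with X1 ~ sCMP(lambda, nu, n1) and X2 ~ sCMP(lambda, nu, n2) independent.
# The common rate lambda cancels in the ratio and does not affect the draw.
rgcmb_thin <- function(x, nu, n1, n2, lambda = 0.5, max_x = 300) {
  if (x == 0) return(0)
  w <- dscmp(0:x, lambda, nu, n1, max_x) * dscmp(x:0, lambda, nu, n2, max_x)
  sample(0:x, size = 1, prob = w)   # `prob` is normalized internally by sample()
}
```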
The joint pgf associated with the SCMPAR(1) model is

$$\phi_{X_{t+1}, X_t}(u, l) = \left[\frac{\zeta(\lambda u, \nu)}{\zeta(\lambda, \nu)}\right]^{n_2}\left[\frac{\zeta(\lambda u l, \nu)}{\zeta(\lambda, \nu)}\right]^{n_1}\left[\frac{\zeta(\lambda l, \nu)}{\zeta(\lambda, \nu)}\right]^{n_2} = \frac{\left(\zeta(\lambda u, \nu)\,\zeta(\lambda l, \nu)\right)^{n_2}\,\zeta^{n_1}(\lambda u l, \nu)}{\zeta^{n_1 + 2n_2}(\lambda, \nu)}, \qquad (20)$$

where the pgf is symmetric in $u$ and $l$, and hence the joint distribution of $X_{t+1}$ and $X_t$ is time-reversible. The regression form for the SCMPAR(1) process can be determined, and the general autocorrelation function for the process $\{X_t\}$ is $\rho_r = \mathrm{Corr}(X_t, X_{t-r}) = \left(\frac{n_1}{n_1 + n_2}\right)^{r}$ for $r = 0, 1, 2, \ldots$. Parameter estimation can be conducted via conditional maximum likelihood with statistical computation tools (e.g. in R); see Sellers et al. (2020) for details.

Introducing the sCMPMA(1) model

Motivated by the SCMPAR(1) model of Sellers et al. (2020), we introduce a first-order sum-of-CMPs moving average (SCMPMA(1)) process $\{X_t\}$ by

$$X_t = C_t(\epsilon_{t-1}^*) + \epsilon_t^*, \quad t = 1, 2, \ldots, \qquad (21)$$

where $\{\epsilon_t^*\}$ is a sequence of iid sCMP($\lambda, \nu, m_1 + m_2$) random variables and $\{C_t(\bullet)\}$ is a sequence of independent gCMB($1/2, \nu, \bullet, m_1, m_2$) operators independent of $\{\epsilon_t^*\}$. By definition, $\{X_t\}$ is a stationary process with the sCMP($\lambda, \nu, 2m_1 + m_2$) distribution, and $X_{t+r}$ and $X_t$ are independent for $|r| > 1$. While this model can analogously be viewed as a special case of the infinitely divisible convolution-closed class of discrete MA models (Joe 1996), unlike the SCMPAR(1) process, the SCMPMA(1) process is not Markovian.

The autocorrelation between $X_t$ and $X_{t+1}$ is

$$\rho_1 = \mathrm{Corr}(X_t, X_{t+1}) = \frac{\mathrm{Cov}\left(C_t(\epsilon_{t-1}^*) + \epsilon_t^*,\; C_{t+1}(\epsilon_t^*) + \epsilon_{t+1}^*\right)}{\sqrt{\mathrm{Var}(X_t)\,\mathrm{Var}(X_{t+1})}} = \frac{\mathrm{Cov}\left(\epsilon_t^*,\; C_{t+1}(\epsilon_t^*)\right)}{\sqrt{\mathrm{Var}(X_t)\,\mathrm{Var}(X_{t+1})}}$$

by the independence assumptions, where $C_{t+1}(\epsilon_t^*) = \sum_{i=1}^{m_1} Y_i$ and $\epsilon_t^* = \sum_{i=1}^{m_1+m_2} Y_i$ are, respectively, sCMP($\lambda, \nu, m_1$) and sCMP($\lambda, \nu, m_1 + m_2$) random variables; i.e. each sCMP random variable can be viewed as a respective sum of iid CMP($\lambda, \nu$) random variables $Y_i$. Thus,

$$\mathrm{Cov}\left(\epsilon_t^*, C_{t+1}(\epsilon_t^*)\right) = \mathrm{Cov}\left(\sum_{i=1}^{m_1+m_2} Y_i, \sum_{i=1}^{m_1} Y_i\right) = \mathrm{Var}\left(\sum_{i=1}^{m_1} Y_i\right) = m_1 \mathrm{Var}(Y),$$

where, without loss of generality, we let $Y$ denote any of the iid $Y_i$ random variables. Meanwhile, because $\{X_t\}$ is a sCMP($\lambda, \nu, 2m_1 + m_2$) distributed stationary process, we can likewise represent $\mathrm{Var}(X_t) = \mathrm{Var}\left(\sum_{i=1}^{2m_1+m_2} Y_i\right) = \sum_{i=1}^{2m_1+m_2} \mathrm{Var}(Y_i) = (2m_1 + m_2)\mathrm{Var}(Y)$ for all $t$. We therefore find that

$$\rho_1 = \mathrm{Corr}(X_t, X_{t+1}) = \frac{\mathrm{Cov}\left(\epsilon_t^*, C_{t+1}(\epsilon_t^*)\right)}{\sqrt{\mathrm{Var}(X_t)\,\mathrm{Var}(X_{t+1})}} = \frac{m_1 \mathrm{Var}(Y)}{(2m_1 + m_2)\mathrm{Var}(Y)} = \frac{m_1}{2m_1 + m_2}. \qquad (22)$$

Because $m_1, m_2 \ge 1$, the one-step range of possible correlation values is $0 \le \rho_1 \le 0.5$. In particular, for $m_1 = m_2$, we have the special case where $\rho_1 = 1/3$. Meanwhile, $\rho_k = 0$ for all $k > 1$ because, by definition of the SCMPMA(1) model assumptions, there is no dependence structure between $X_t$ and $X_{t+r}$ for $r > 1$.
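The covariance derivation above also yields a direct way to simulate the process: write each innovation $\epsilon_t^*$ as the sum of $m_1 + m_2$ iid CMP($\lambda, \nu$) components and let $C_{t+1}(\epsilon_t^*)$ retain the first $m_1$ of them. A minimal sketch under that constructive representation (helper names are ours; `dcmp` is the truncated pmf defined earlier):

```r
# Minimal sketch: iid CMP(lambda, nu) draws by inverse-cdf sampling on a
# truncated support (an assumption), then an SCMPMA(1) path built from the
# componentwise representation X_t = C_t(eps*_{t-1}) + eps*_t.
rcmp <- function(n, lambda, nu, max_x = 300) {
  sample(0:max_x, size = n, replace = TRUE, prob = dcmp(0:max_x, lambda, nu))
}

simulate_scmpma1 <- function(n, lambda, nu, m1, m2) {
  Y <- matrix(rcmp((n + 1) * (m1 + m2), lambda, nu), nrow = n + 1)  # row t: components of eps*_{t-1}
  eps <- rowSums(Y)                          # eps*_t ~ sCMP(lambda, nu, m1 + m2)
  kept <- rowSums(Y[, 1:m1, drop = FALSE])   # C_{t+1}(eps*_t): the first m1 components survive
  eps[2:(n + 1)] + kept[1:n]                 # X_1, ..., X_n
}
```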
Recall from the "The Conway-Maxwell-Poisson distribution and its generalization" section that $\Phi_G(w) = \left[\frac{\zeta(\lambda w, \nu)}{\zeta(\lambda, \nu)}\right]^{\pi}$ is the pgf for a sCMP($\lambda, \nu, \pi$) distributed random variable, say $G$. Using this knowledge along with Eq. (21), the joint pgf can be derived as

$$\begin{aligned}
\phi_{X_{t+1}, X_t}(u, l) &= E\left(u^{X_{t+1}}\, l^{X_t}\right) \\
&= E\left(u^{C_{t+1}(\epsilon_t^*) + \epsilon_{t+1}^*}\, l^{X_t}\right) \\
&= E\left(u^{C_{t+1}(\epsilon_t^*) + \epsilon_{t+1}^*}\, l^{C_{t+1}(\epsilon_t^*)}\, l^{X_t - C_{t+1}(\epsilon_t^*)}\right) \\
&= E\left((ul)^{C_{t+1}(\epsilon_t^*)}\, u^{\epsilon_{t+1}^*}\, l^{X_t - C_{t+1}(\epsilon_t^*)}\right) \\
&= E\left((ul)^{C_{t+1}(\epsilon_t^*)}\right) E\left(u^{\epsilon_{t+1}^*}\right) E\left(l^{X_t - C_{t+1}(\epsilon_t^*)}\right) \quad \text{by independence} \\
&= \phi_{C_{t+1}(\epsilon_t^*)}(ul)\; \phi_{\epsilon_{t+1}^*}(u)\; \phi_{X_t - C_{t+1}(\epsilon_t^*)}(l) \\
&= \left[\frac{\zeta(\lambda u l, \nu)}{\zeta(\lambda, \nu)}\right]^{m_1}\left[\frac{\zeta(\lambda u, \nu)}{\zeta(\lambda, \nu)}\right]^{m_1+m_2}\left[\frac{\zeta(\lambda l, \nu)}{\zeta(\lambda, \nu)}\right]^{m_1+m_2} \\
&= \frac{\left(\zeta(\lambda u, \nu)\,\zeta(\lambda l, \nu)\right)^{m_1+m_2}\left(\zeta(\lambda u l, \nu)\right)^{m_1}}{\left(\zeta(\lambda, \nu)\right)^{3m_1+2m_2}}, \qquad (23)
\end{aligned}$$

where $X_t - C_{t+1}(\epsilon_t^*) = C_t(\epsilon_{t-1}^*) + \left[\epsilon_t^* - C_{t+1}(\epsilon_t^*)\right]$ is an sCMP($\lambda, \nu, m_1 + m_2$) random variable, and where Eq. (23) is equivalent to Eq. (20) (i.e. the SCMPMA(1) process is comparable to the SCMPAR(1) process) when $m_1 = n_1 = n_2 - m_2$. Given this comparison, we can easily determine the conditional mean $E(X_{t+1} \mid X_t = x)$ and conditional variance $\mathrm{Var}(X_{t+1} \mid X_t = x)$. Eq. (23) further demonstrates that the SCMPMA(1) model is time-reversible.

Parameter estimation via maximum likelihood (ML) is a difficult task with INMA models given the complex form of the underlying distributions. Even a conditional least squares approach does not appear to be feasible "because of the thinning operators, unless randomization is used" (Brännäs and Hall 2001). We therefore instead consider the following ad hoc procedure for parameter estimation. Given a data set with an observed correlation $\rho_1$, we first propose values for $m_1, m_2 \in \mathbb{N}$ that satisfy the constraint $\rho_1 \approx \frac{m_1}{2m_1 + m_2}$. Given $m_1$ and $m_2$, and recognizing that $X_t$ is stationary with a sCMP($\lambda, \nu, 2m_1 + m_2$) distribution, we proceed with ML estimation to determine $\hat{\lambda}$ and $\hat{\nu}$ as described in Zhu et al. (2017) for conducting sCMP($\lambda, \nu, s = 2m_1 + m_2$) parameter estimation with regard to a CMP process over an interval of length $s \ge 1$. The corresponding variation for $\hat{\lambda}$ and $\hat{\nu}$ can be quantified via the Fisher information matrix or nonparametric bootstrapping. While the sampling distribution for $\hat{\lambda}$ is approximately symmetric, the sampling distribution for $\hat{\nu}$ is considerably right-skewed, hence analysts are advised to quantify estimator variation via nonparametric bootstrapping. While this is a means to an end, it only determines an appropriate distributional form for the data; it does not fully address the nature of the time series.
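A minimal sketch of this ad hoc procedure, under the same assumptions as the helpers above (truncated support, `dscmp`) and with a log-parameterized `optim` call standing in for the ML step of Zhu et al. (2017); the candidate $(m_1, m_2)$ pair is chosen by the analyst from the sample lag-one autocorrelation, and the starting values are arbitrary:

```r
# Minimal sketch: given proposed (m1, m2), fit lambda and nu by maximizing the
# iid sCMP(lambda, nu, 2*m1 + m2) likelihood of the observed counts x.
fit_scmpma1 <- function(x, m1, m2, max_x = 300) {
  s <- 2 * m1 + m2
  nll <- function(par) {
    pr <- dscmp(x, exp(par[1]), exp(par[2]), s, max_x)
    -sum(log(pmax(pr, 1e-300)))              # guard against numerical underflow
  }
  fit <- optim(c(log(0.5), log(1)), nll)     # Nelder-Mead over (log lambda, log nu)
  list(lambda = exp(fit$par[1]), nu = exp(fit$par[2]), nll = fit$value)
}
```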
Data examples

To illustrate the flexibility of our INMA model, we consider various data simulations and a real data example. Below contains the respective details and associated commentary.

Simulated data examples

Table 1 reports the estimated mean, variance, and autocorrelation that result from various data simulations of SCMPMA(1) data given parameters $(\lambda, \nu, m_1, m_2)$. In all examples, we let $\lambda = 0.5$, $m_1, m_2 \in \{1, 2\}$, and $\nu \in \{0, 0.5, 1, 2, 35\}$, where $\nu = 0$ captures the case of extreme over-dispersion, $\nu = 1$ denotes equi-dispersion, and $\nu = 35$ sufficiently illustrates computationally the case of utmost under-dispersion where $\nu \to \infty$.

Table 1. Estimated mean, variance, and autocorrelation for various SCMPMA(1) data simulations of length 10,000 given parameters $(\lambda, \nu, m_1, m_2)$; $\lambda = 0.5$ for all simulations.

| $m_1$ | $m_2$ | $\nu$ | Est. Mean | True Mean | Est. Var. | True Var. | $\hat{\rho}$ | $\rho$ |
|-------|-------|-------|-----------|-----------|-----------|-----------|--------------|--------|
| 1 | 1 | 0   | 3.0073 | 3.00 | 5.9076  | 6.00  | 0.333 | 0.333 |
| 1 | 1 | 0.5 | 1.8846 |      | 2.3395  |       | 0.335 | 0.333 |
| 1 | 1 | 1   | 1.4842 | 1.50 | 1.4507  | 1.50  | 0.328 | 0.333 |
| 1 | 1 | 2   | 1.2359 |      | 1.0358  |       | 0.338 | 0.333 |
| 1 | 1 | 35  | 0.9826 | 1.00 | 0.6568  | 0.67  | 0.333 | 0.333 |
| 1 | 2 | 0   | 4.0027 | 4.00 | 8.0309  | 8.00  | 0.248 | 0.250 |
| 1 | 2 | 0.5 | 2.4891 |      | 3.0010  |       | 0.253 | 0.250 |
| 1 | 2 | 1   | 1.9861 | 2.00 | 1.9755  | 2.00  | 0.252 | 0.250 |
| 1 | 2 | 2   | 1.6427 |      | 1.3284  |       | 0.257 | 0.250 |
| 1 | 2 | 35  | 1.3355 | 1.33 | 0.8964  | 0.89  | 0.254 | 0.250 |
| 2 | 1 | 0   | 5.0408 | 5.00 | 10.2256 | 10.00 | 0.404 | 0.400 |
| 2 | 1 | 0.5 | 3.1414 |      | 3.9010  |       | 0.402 | 0.400 |
| 2 | 1 | 1   | 2.4823 | 2.50 | 2.4689  | 2.50  | 0.396 | 0.400 |
| 2 | 1 | 2   | 2.0094 |      | 1.6561  |       | 0.401 | 0.400 |
| 2 | 1 | 35  | 1.6583 | 1.67 | 1.1135  | 1.11  | 0.390 | 0.400 |
| 2 | 2 | 0   | 6.0019 | 6.00 | 12.2343 | 12.00 | 0.331 | 0.333 |
| 2 | 2 | 0.5 | 3.7195 |      | 4.5267  |       | 0.338 | 0.333 |
| 2 | 2 | 1   | 2.9873 | 3.00 | 3.0130  | 3.00  | 0.329 | 0.333 |
| 2 | 2 | 2   | 2.4488 |      | 2.0178  |       | 0.336 | 0.333 |
| 2 | 2 | 35  | 1.9861 | 2.00 | 1.2964  | 1.33  | 0.326 | 0.333 |

For the special cases (i.e. $\nu = 0, 1, 35$, where $\nu = 35$ sufficiently represents performance as $\nu \to \infty$), we likewise provide the expected/true mean and variance. Along with the estimated autocorrelation $\hat{\rho}$ that results from the data, the table reports the true autocorrelation, $\rho = \frac{m_1}{2m_1 + m_2}$, for all $\{m_1, m_2\}$ and any $\nu$ (Eq. (22)).

For all examples, we find that the associated mean and variance compare with each other as expected, i.e. the variance is greater than the mean when $\nu < 1$ (i.e. the data are over-dispersed), the variance and mean are approximately equal when $\nu = 1$ (i.e. equi-dispersion holds), and the variance is less than the mean (i.e. the data are under-dispersed) when $\nu > 1$. In particular, we can easily verify that the three special case models perform as expected. For the Poisson cases ($\nu = 1$), we expect the mean and variance to both equal $(2m_1 + m_2)\lambda$, while the binomial cases (i.e. $\nu \to \infty$ and $p = \frac{\lambda}{\lambda+1}$) produce a mean equal to $(2m_1 + m_2)\frac{\lambda}{\lambda+1}$ and variance equaling $(2m_1 + m_2)\frac{\lambda}{\lambda+1}\left(1 - \frac{\lambda}{\lambda+1}\right)$, and the negative binomial cases ($\nu = 0$ with $p = 1 - \lambda$) have a mean of $\frac{(2m_1+m_2)\lambda}{1-\lambda}$ and variance equaling $\frac{(2m_1+m_2)\lambda}{(1-\lambda)^2}$. In fact, even with the $\nu \to \infty$ case approximated by letting $\nu = 35$, we still obtain reasonable estimates for the mean and variance for all of the associated cases of $m_1$ and $m_2$.

For each $\{m_1, m_2\}$ pair, the mean and variance both decrease as $\nu$ increases while, for all of the considered examples, we obtain estimated correlation values $\hat{\rho}$ that approximately equal the true correlation, $\rho$. In particular, for those cases where $m_1 = m_2$, we obtain $\hat{\rho} \approx 1/3$ as expected (see Eq. (22)).
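As a check, the first block of Table 1 can be reproduced (up to simulation error) with the sketches above; the commented values are the targets from the table rather than re-run output.

```r
set.seed(123)
x <- simulate_scmpma1(n = 10000, lambda = 0.5, nu = 0, m1 = 1, m2 = 1)
c(mean(x), var(x))            # targets: 3 and 6 (negative binomial case, nu = 0)
acf(x, plot = FALSE)$acf[2]   # target: m1 / (2 * m1 + m2) = 1/3
```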
The ACF and PACF plots of these data, however, do not clearly distinguish between considering a first-order autoregressive or a moving average model; see Fig. 1a-b. Further, recognizing that the data express apparent under- to equi-dispersion, we therefore consider the SCMPMA(1) as an illustrative model for analysis. We perform ML estimation assuming various combinations for (m , m ) (i.e. {(1,1), 1 2 1 1 (1,2), (2,2)}) as these values contain the observed correlation, 0.25 = < ρˆ < ≈ 0.33. 4 3 Table 2 contains the resulting parameter estimates for λ and ν, along with the respective Akaike Information Criterion (AIC). While the SCMPMA(1) model with m = m = 2 1 2 has the lowest AIC among the four models considered, all of these models produce approximately equal AIC values (i.e. 695.2) where the increasing m and m values asso- 1 2 ciate with decreasing λ and increasing ν ˆ. This makes sense because the resulting estimates rely solely on the assumed underlying sCMP(λ, ν,2m + m ) distributional form for the 1 2 data. The dispersion estimates in Table 2 are all greater than 1, thus implying a perceived level of data under-dispersion. These results naturally stem from the reported mean of the data (1.286) being greater than its corresponding variance (1.205). Their associated Table 2 Estimated parameters, the 95% confidence intervals for λ and ν derived from nonparametric bootstrapping, and Akaike Information Criterion (AIC) values for various SCMPMA(1) models for the IP data λν m m Est. 95% CI Est. 95% CI AIC 1 2 1 1 0.461 (0.374, 0.585) 1.285 (0.650, 2.291) 695.24 1 2 0.346 (0.278, 0.433) 1.387 (0.582, 2.937) 695.21 2 1 0.277 (0.225, 0.348) 1.493 (0.500, 4.594) 695.20 2 2 0.231 (0.188, 0.289) 1.612 (0.429, 25.802) 695.19 (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 11 of 12 95% confidence intervals (determined via nonparametric bootstrapping; also supplied in Table 2), however, are sufficiently large such that they contain ν = 1. This suggests that the apparent data under-dispersion is not statistically significant, thus instead suggesting that the data can be analyzed via the Al-Osh and Alzaid (1988)PMA(1)model.Itisfurther striking to see that the respective 95% confidence intervals associated with the dispersion parameter increase with the size of the underlying sCMP(2m + m )model.Thisisan 1 2 artifact of the (s)CMP distribution, namely that the distribution of ν is a right-skewed distribution (as discussed in Zhu et al. (2017)). This approach confirms interest in the PMA(1) model where Eqs. (1)-(2) imply that associated estimated parameters are γˆ ≈ 0.4124 and η ˆ ≈ 0.9105. Thus, we benefit from the SCMPMA(1) as a tool for parsimonious model determination. Discussion This work utilizes the sCMP distribution of Sellers et al. (2017)todevelop aSCMPMA(1) model that serves as a flexible moving average time series model for discrete data where data dispersion is present. The SCMPMA(1) model captures the PMA(1), as well as versions of a negative binomial and binomial MA(1) structure, respectively, as special cases. This along with the flexible SCMPAR(1) can be used further to derive broader auto-regressive moving average (ARMA) and auto-regressive integrated moving average (ARIMA) models based on the sCMP distribution. The SCMPMA(1) shares many properties with the analogous SCMPAR(1) model by Sellers et al. (2020). The presented models rely on predefining discrete values (i.e. m , m 1 2 for the SCMPMA(1)) for parameter estimation. As done in Sellers et al. 
Discussion

This work utilizes the sCMP distribution of Sellers et al. (2017) to develop a SCMPMA(1) model that serves as a flexible moving average time series model for discrete data where data dispersion is present. The SCMPMA(1) model captures the PMA(1), as well as versions of a negative binomial and binomial MA(1) structure, respectively, as special cases. This, along with the flexible SCMPAR(1), can be used further to derive broader auto-regressive moving average (ARMA) and auto-regressive integrated moving average (ARIMA) models based on the sCMP distribution.

The SCMPMA(1) shares many properties with the analogous SCMPAR(1) model by Sellers et al. (2020). The presented models rely on predefining discrete values (i.e. $m_1$, $m_2$ for the SCMPMA(1)) for parameter estimation. As done in Sellers et al. (2017) and Sellers and Young (2019), we utilize a profile likelihood approach where, given $m_1$ and $m_2$, we estimate the remaining model coefficients and then identify that collection of parameter estimates that produces the largest likelihood, thus identifying these parameter estimates as the MLEs. While this profile likelihood approach is acceptable as demonstrated in other applications, directly estimating $m_1$, $m_2$ along with the other SCMPMA(1) model estimates would likewise prove beneficial, as would redefining the model to allow for real-valued estimators for $m_1$ and $m_2$. These generalizations and estimation approaches can be explored in future work.

Simulated data examples illustrate that the SCMPMA(1) model can obtain unbiased estimates, and the model demonstrates potential for accurate forecasts given data containing any measure of data dispersion. The real data illustration, however, highlights the complexities that come with parameter estimation. While we nonetheless present a means towards achieving this goal, this approach does not perform especially strongly with regard to prediction and forecasting. It nonetheless serves as a starting point for parameter estimation that we will continue to investigate in future work. Moreover, the flexibility of the SCMPMA(1) aids in determining a parsimonious model form as appropriate.

Abbreviations

AR(1): First-order autoregressive; ARIMA: Auto-regressive integrated moving average; ARMA: Auto-regressive moving average; CMB: Conway-Maxwell-Binomial; CMP: Conway-Maxwell-Poisson; gCMB: Generalized Conway-Maxwell-Binomial; GPAR(1): First-order generalized Poisson autoregressive; GPMA(1): First-order generalized Poisson moving average; INAR(1): First-order integer-valued autoregressive; INMA: Integer-valued moving average; MA: Moving average; mgf: Moment generating function; PAR(1): First-order Poisson autoregressive; pgf: Probability generating function; PMA: Poisson moving average; PMA(1): First-order Poisson moving average; QB: Quasi-binomial; sCMP: Sum-of-Conway-Maxwell-Poisson; SCMPAR(1): First-order sum-of-Conway-Maxwell-Poisson autoregressive; SCMPMA(1): First-order sum-of-Conway-Maxwell-Poissons moving average

Acknowledgements

This paper is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the U.S. Census Bureau. SM and FC thank the Georgetown Undergraduate Research Opportunities Program (GUROP) for their support. All authors thank Dr. Christian Weiss for use of the IP dataset, and the reviewers for their feedback and comments.

Authors' contributions

KFS developed the research idea. All authors contributed towards the literature review, theoretical developments, and statistical computing. The author(s) read and approved the final manuscript.

Funding

SM was funded in part by the GUROP.

Availability of data and materials

Simulated data can vary given the generation process. Simulation code(s) can be supplied upon request. IP data set obtained from Dr. Christian Weiss of Helmut Schmidt University.

Competing interests

No authors have competing interests relating to this work.

Received: 22 April 2020. Accepted: 26 January 2021.

References

Al-Osh, M. A., Alzaid, A. A.: First-order integer valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8(3), 261-275 (1987)
Al-Osh, M. A., Alzaid, A. A.: Integer-valued moving average (INMA) process. Stat. Pap. 29(1), 281-300 (1988)

Al-Osh, M. A., Alzaid, A. A.: Binomial autoregressive moving average models. Commun. Stat. Stoch. Model. 7(2), 261-282 (1991)

Alzaid, A. A., Al-Osh, M. A.: Some autoregressive moving average processes with generalized Poisson marginal distributions. Ann. Inst. Stat. Math. 45(2), 223-232 (1993)

Borges, P., Rodrigues, J., Balakrishnan, N., Bazán, J.: A COM-Poisson type generalization of the binomial distribution and its properties and applications. Stat. Probab. Lett. 87, 158-166 (2014)

Brännäs, K., Hall, A.: Estimation in integer-valued moving average models. Appl. Stoch. Model. Bus. Ind. 17, 277-291 (2001)

Conway, R. W., Maxwell, W. L.: A queuing model with state dependent service rates. J. Ind. Eng. 12, 132-136 (1962)

Famoye, F.: Restricted generalized Poisson regression model. Commun. Stat. Theory Methods 22(5), 1335-1354 (1993)

Hilbe, J. M.: Modeling Count Data. Cambridge University Press, New York, NY (2014)

Joe, H.: Time series models with univariate margins in the convolution-closed infinitely divisible class. J. Appl. Probab. 33(3), 664-677 (1996)

Kadane, J. B.: Sums of possibly associated Bernoulli variables: The Conway-Maxwell-Binomial distribution. Bayesian Anal. 11(2), 403-420 (2016)

McKenzie, E.: ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20(4), 822-835 (1988)

Sellers, K. F., Peng, S. J., Arab, A.: A flexible univariate autoregressive time-series model for dispersed count data. J. Time Ser. Anal. 41(3), 436-453 (2020). https://doi.org/10.1111/jtsa.12516

Sellers, K. F., Swift, A. W., Weems, K. S.: A flexible distribution class for count data. J. Stat. Distrib. Appl. 4(22), 1-21 (2017). https://doi.org/10.1186/s40488-017-0077-0

Sellers, K. F., Young, D. S.: Zero-inflated sum of Conway-Maxwell-Poissons (ZISCMP) regression. J. Stat. Comput. Simul. 89(9), 1649-1673 (2019)

Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S., Boatwright, P.: A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Appl. Stat. 54, 127-142 (2005)

Weiss, C. H.: Controlling correlated processes of Poisson counts. Qual. Reliab. Eng. Int. 23(6), 741-754 (2007)

Weiss, C. H.: Thinning operations for modeling time series of counts - a survey. Adv. Stat. Anal. 92, 319-341 (2008)

Weiss, C. H.: An Introduction to Discrete-Valued Time Series. John Wiley & Sons, Inc., Hoboken, NJ (2018)

Weiss, C. H.: Stationary count time series models. Wiley Interdiscip. Rev. Comput. Stat. 13(1), 1502 (2021). https://doi.org/10.1002/wics.1502

Zhu, L., Sellers, K. F., Morris, D. S., Shmueli, G.: Bridging the gap: A generalized stochastic process for count data. Am. Stat. 71(1), 71-80 (2017)

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A flexible univariate moving average time-series model for dispersed count data

Loading next page...
 
/lp/springer-journals/a-flexible-univariate-moving-average-time-series-model-for-dispersed-YnFH1OCf5y

References (19)

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2021
eISSN
2195-5832
DOI
10.1186/s40488-021-00115-2
Publisher site
See Article on Publisher Site

Abstract

kfs7@georgetown.edu Department of Mathematics and Al-Osh and Alzaid (1988) consider a Poisson moving average (PMA) model to describe Statistics, Georgetown University, the relation among integer-valued time series data; this model, however, is constrained Washington, DC, USA Center for Statistical Research and by the underlying equi-dispersion assumption for count data (i.e., that the variance and Methodology Division, U.S. Census the mean equal). This work instead introduces a flexible integer-valued moving Bureau, Washington, DC, USA average model for count data that contain over- or under-dispersion via the Conway-Maxwell-Poisson (CMP) distribution and related distributions. This first-order sum-of-Conway-Maxwell-Poissons moving average (SCMPMA(1)) model offers a generalizable construct that includes the PMA (among others) as a special case. We highlight the SCMPMA model properties and illustrate its flexibility via simulated data examples. Keywords: Over-dispersion, Under-dispersion, Conway-Maxwell-Poisson (COM-Poisson or CMP), Sum-of-Conway-Maxwell-Poisson (sCMP) Introduction Integer-valued thinning-based models have been proposed to model time series data rep- resented as counts. Al-Osh and Alzaid (1988) introduce a generally defined integer-valued moving average (INMA) process as an analog to the moving average (MA) model for continuous data which assumes an underlying Gaussian distribution. This INMA process instead utilizes a thinning operator that maintains an integer-valued range of possible outcomes. To form such a model, they consider the “survivals” of independent and iden- tically distributed (iid) non-negative integer valued random innovations to maintain and ensure discrete data outcomes (Weiss 2021). Al-Osh and Alzaid (1988) particularly con- sider a first-order Poisson moving average (PMA(1)), i.e. a stationary sequence U of the form U = γ ◦  +  where { } is a sequence of iid Poisson(η) random variables and t t−1 t t (γ ◦ ) = B for a sequence of iid Bernoulli(γ ) random variables {B } independent i i i=1 of {}. By design, the PMA(1) is an INMA whose maximum stay time in the sequence is two time units. Consequently, components of U are dependent, while the components of and (γ ◦  ) are independent. t t−1 Given the PMA(1) structure, E(U ) = Var(U ) = (1 + γ)η,(1) t t © The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 2 of 12 and the covariance of consecutive variables Cov(U , U ) = γη; this implies that the t−1 t correlation is r = 1 1+γ ρ (r) = Corr(U , U ) = (2) U t−r t 0 r > 1. 
Meanwhile, the probability generating function (pgf) of U is  (u) = t U −η(1+γ)(1−u) e , the joint pgf of {U , ... , U } is  (u , ... , u ) = exp (−η r + γ − 1 r r 1 r r r−1 (1 − γ) u − γ(u + u ) − γ u u (which infers that time reversibility i 1 r i i+1 i=1 i=1 holds for the PMA), and the pgf of T = U is U,r i i=1 (u) = exp −η [(1 − γ)r + 2γ ] (1 − u) − ηγ (r − 1)(1 − u ) . U,r Al-Osh and Alzaid (1988)notethat T does not have a Poisson distribution, which is U,r in contrast to the standard MA(1) process. The conditional mean and variance of U t+1 given U = u are both linear in U ,namely t t E(U | U = u) = η + γ u/(1 + γ),and (3) t+1 t Var(U | U = u) = η + γ u/(1 + γ).(4) t+1 t The PMA is a natural choice for modeling an integer-valued process, in part because of its tractability (Al-Osh and Alzaid 1988). This model, however, is limited by its con- straining equi-dispersion property, i.e. the assumption that the mean and variance of the underlying process equal. Real data do not generally conform to this construct (Hilbe 2014;Weiss 2018); they usually display over-dispersion relative to the Poisson model (i.e. where the variance is greater than the mean), however integer-valued data are surfacing with greater frequency that express data under-dispersion relative to Poisson (i.e. the vari- ance is less than the mean). Accordingly, it would be fruitful to instead consider a flexible time series model that can accommodate data over- and/or under-dispersion. Alzaid and Al-Osh (1993) introduce a first-order generalized Poisson moving average (GPMA(1)) process as an alternative to the PMA. The associated model has the form, ∗ ∗ ∗ W = Q  +  , t = 0, 1, 2, ...,(5) t t−1 t ∗ ∗ ∗ where  is a sequence of iid generalized Poisson GP(μ , θ), and {Q (·)} is a sequence t t ∗ ∗ ∗ of quasi-binomial QB(p , θ/μ , ·) random operators independent of { }. As with the PMA, W and W are independent for |r| > 1. The marginal distribution of W is t+r t t ∗ ∗ GP((1 + p )μ , θ). Recognizing the relationship between moving average and autoregres- sive models, Alzaid and Al-Osh (1993) equate terms in this GPMA(1) model to their first-order generalized Poisson autoregressive (GPAR(1)) counterpart, W = Q (W ) +  , t = 0, 1, 2, ...,(6) t t t−1 t where { } is a sequence of iid GP(qμ, θ) random variables where q = 1 − p,and {Q (·)} t t is a sequence of QB(p, θ/μ, ·) random operators, independent of { }; i.e., they let μ = ∗ ∗ (1 + p )μ and p = .The bivariatepgf of W and W can thus be represented as ∗ t+1 t 1+p ∗ ∗ ∗ (u , u ) = exp μ (A (u ) + A (u ) − 2) + μ p (A (u u ) − 1) (7) W ,W 1 2 θ 1 θ 2 θ 1 2 t+1 t = exp μq (A (u ) + A (u ) − 2) + μp (A (u u ) − 1),(8) θ 1 θ 2 θ 1 2 −θ(s−1) where A (s) is the inverse function that satisfies A se = s;see Alzaid andAl- θ θ Osh (1993). This substitution in Eq. (7)toobtainEq. (8) further illustrates the relationship (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 3 of 12 between the GPMA(1) and GPAR(1) models such that they have the same joint pgf. Eq. (8) and the related GPAR work of Alzaid and Al-Osh (1993) therefore show that qμ E(W | W = w) = pw +.(9) t t−1 1 − θ Thejoint pgfof(W , W , ... , W )isgiven by t t−1 t−r+1 r r (u , ... , u ) = exp μq (A (u ) − 1) + μp (A (u u ) − 1) . (10) 1 r θ i θ i i+1 i=1 i=1 From the joint pgf, we see that the GPMA(1) is also time-reversible, because it has the same dynamics if time is reversed. Further, the pgf associated with the total counts occurring during time lag r (i.e. 
T = W )is  (u) = W,r t−r+i T i=1 w,r exp μqr(A (u) − 1) + μp(r − 1)(A (u ) − 1) . Alzaid and Al-Osh (1993) note that this θ θ result extends the analogous PMA result to the broader GPMA(1) model. Finally, the GPMA autocorrelation function is p |r|= 1 ρ (r) = Corr(W , W ) = W t t+r 0 |r| > 1, ∗ ∗ −1 where p = p (1 + p ) ; by definition, ρ (r) ∈[ 0, 0.5] (Alzaid and Al-Osh 1993). Even though the GPMA can be considered to model over- or under-dispersed count time series, it may not be a viable option for count data that express extreme under- dispersion; see, e.g. Famoye (1993). This work instead introduces another alternative for modeling integer-valued time series data. The subsequent writing proceeds as follows. We first provide background regarding the probability distributions that motivate the devel- opment of our flexible INMA model. Then, we introduce the SCMPMA(1) model to the reader and discuss its statistical properties. The subsequent section illustrates the model flexibility through simulated and real data examples. Finally, the manuscript concludes with discussion. Motivating distributions While the above constructs show increased ability and improvement towards modeling integer-valued time series data with various forms of dispersion, each of the models suf- fers from respective limitations. In order to develop and describe our SCMPMA(1), we first introduce its underlying motivating distributions: the CMP distribution and its gen- eralized sum-of-CMPs distribution (sCMP), as well as the Conway-Maxwell-Binomial (CMB) along with a generalized CMB (gCMB) distribution. The Conway-Maxwell-Poisson distribution and its generalization The Conway-Maxwell-Poisson (CMP) distribution (introduced by Conway and Maxwell (1962), and revived by Shmueli et al. (2005)) is a viable count distribution that generalizes the Poisson distribution in light of potential data dispersion. The CMP probability mass function (pmf) takes the form P(X = x | λ, ν) = , x = 0, 1, 2, ... , (11) (x! ) ζ(λ, ν) for a random variable X,where λ = E(X ) ≥ 0, ν ≥ 0 is the associated dispersion param- ∞ λ eter, and ζ(λ, ν) = is the normalizing constant. The CMP distribution includes s=0 (s!) (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 4 of 12 three well-known distributions as special cases, namely the Poisson (ν = 1), geometric (ν = 0, λ< 1), and Bernoulli ν →∞ with probability distributions. 1+λ ζ(λu,ν) The associated pgf of X is  (u) = E(u ) = , and its moment generating func- ζ(λ,ν) ζ(λe ,ν) Xu tion (mgf) is M (u) = E(e ) = . The moments can meanwhile be represented ζ(λ,ν) recursively as 1−ν λ[ E(X + 1)] , g = 0 g+1 E(X ) = (12) ∂ g g λ E(X ) + E(X)E(X ), g > 0. ∂λ In particular, the expected value and variance can be written in the form and approxi- mated respectively as ∂ ln ζ(λ, ν) ν − 1 1/ν E(X) = ≈ λ − , (13) ∂ ln λ 2ν ∂E(X) 1 1/ν Var(X) = ≈ λ , (14) ∂ ln λ ν where the approximations are especially good for ν ≤ 1or λ> 10 (Shmueli et al. 2005). This distribution is a member of the exponential family, where the joint pmf of the random sample x = (x , ... , x ) is 1 N i=1 −N S −N P(x | λ, ν) = · ζ (λ, ν) = λ exp(−νS )ζ (λ, ν), x ! i=1 N N where S = x and S = log(x ! ) are joint sufficient statistics for λ and 1 i 2 i i=1 i=1 ν. 
Further, because the CMP distribution belongs to the exponential family, the con- a−1 −νb −c jugate prior distribution has the form, h(λ, ν) = λ e ζ (λ, ν)δ(a, b, c),where −1 λ> 0, ν ≥ 0, and δ(a, b, c) is a normalizing constant such that δ (a, b, c) = ∞ ∞ a−1 −bν −c λ e ζ (λ, ν)dλdν< ∞. 0 0 Meanwhile, letting X = X for iid random variables X ∼ CMP(λ, ν), i = 1, ... , n, ∗ i i i=1 we say that X is distributed as a sum-of-CMPs [denoted sCMP(λ, ν, n)] variable, and has the pmf ∗ ν λ x P(X = x ) = , x = 0, 1, 2, ... , ∗ ∗ ∗ ν n (x ! ) ζ (λ, ν) a , ··· , a ∗ 1 n a ,...,a =0 a +...+a =x 1 n ∗ x x ! n ∗ ∗ where ζ (λ, ν) is the nth power of ζ(λ, ν),and = is a multinomial coef- a , ··· , a a !···a ! 1 n 1 n ficient. The sCMP(λ, ν, n) distribution encompasses the Poisson distribution with rate parameter nλ (for ν = 1), negative binomial(n,1 − λ) distribution (for ν = 0and λ< 1), and Binomial(n, p) distribution as ν →∞ with success probabilityp = as special λ+1 cases. Further, for n = 1, the sCMP(λ, ν, n = 1) is simply the CMP(λ, ν) distribution. The mgf and pgf for a sCMP(λ, ν, n) random variable X are n n ζ(λe , ν) ζ(λt, ν) M (t) = and  (t) = , X X ∗ ∗ ζ(λ, ν) ζ(λ, ν) respectively; accordingly, the sCMP(λ, ν)has mean E(X ) = nE(X) and variance V (X ) = ∗ ∗ nV (X),where E(X) and V (X) are defined in Eqs. (13)-(14), respectively. Invariance under addition holds for two independent sCMP distributions with the same rate and disper- sion parameters. See Sellers et al. (2017) for additional information regarding the sCMP distribution. (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 5 of 12 The Conway-Maxwell-Binomial distribution and its generalization The Conway-Maxwell-Binomial distribution of Kadane (2016) (also known as the Conway-Maxwell-Poisson-Binomial distribution by Borges et al. (2014)) is a three- parameter generalization of the Binomial distribution. Denoted as CMB(d, p, ν)dis- tributed, its pmf is y d−y p (1 − p) P(Y = y) = , y = 0, ... , d (15) χ(p, ν, d) d d for some random variable Y where 0 ≤ p ≤ 1, ν ∈ R,and χ(p, ν, d) = p (1 − y=0 d−y p) is the associated normalizing constant. The Binomial(d, p) distribution is the special case of the CMB(d, p, ν)where ν = 1. Meanwhile, ν> (<)1 corresponds to under- dispersion (over-dispersion) relative to the Binomial distribution. For ν →∞,the pmf is concentrated on the point dp while, for ν →−∞, the pmf is concentrated at 0 or d. For independent X ∼ CMP(λ , ν), i = 1, 2, the conditional distribution of X given that i i 1 X + X = d has a CMB d, , ν distribution. 1 2 λ +λ 1 2 The pgf and mgf of Y have the form, up pe τ , ν, d τ , ν, d 1−p 1−p (u) = E u =   and M (u) =  , (16) Y Y p p τ , ν, d τ , ν, d 1−p 1−p d d respectively, where τ(θ , ν, d) = θ for some θ . The CMB distribution is a ∗ ∗ ∗ y=0 member of the exponential family whose joint pmf of the random sample y ={y , ... , y } 1 N is i N ν p d! dN P(y | p, ν) ∝ (1 − p) 1 − p [ y ! (d − y )!] i i i=1 ∝ exp S log − νS , ∗1 ∗2 1 − p N N where S = y and S = log[ y ! (d − y )! ] are the joint sufficient statistics ∗1 i ∗2 i i i=1 i=1 for p and ν. Further, its existence as a member of the exponential family implies that a conjugate prior family exists of the form, a−1 −νb −c h(θ , ν) = θ e ω (θ , ν)ψ(a, b, c),0 <θ < ∞,0 <ν < ∞, ∗ ∗ ∗ ν −1 where ω(θ , ν) = θ /[ y! (d − y)!] , ψ (a, b, c) = ∗ ∗ y=0 ∞ ∞ a−1 −νb −c θ e ω (θ , ν)dθ dν< ∞ (Kadane 2016). ∗ ∗ 0 0 Sellers et al. 
(2017) further introduce a generalized Conway-Maxwell-Binomial (gCMB) distribution whose pmf is ⎡ ⎤ ⎡ ⎤ z s−z ν ν ⎢ ν ⎥ ⎢ ⎥ s z s − z ⎢ ⎥ z s−z ⎢ ⎥ P(Z = z) ∝ p (1 − p) ⎢ ⎥ ⎣ ⎦ ⎣ ⎦ z a , ... , a b , ... , b 1 n 1 n 1 2 a ,...,a =0 b ,...,b =0 1 n n 1 1 2 a +...+a =z 1 n b +...+b =s−z 1 1 n (17) for a random variable Z with parameters (p, ν, s, n , n ). As with the conditional probabil- 1 2 ity of a CMP random variable given the sum of it and another independent CMP random variable sharing the same dispersion parameter, a special case of a gCMB distribution can be derived as the conditional distribution of X ,given thesum X + X = d for ∗1 ∗1 ∗2 (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 6 of 12 independent sCMP random variables, X ∼ sCMP(λ , ν, n ), i = 1, 2; the resulting dis- ∗i i i tribution is analogously a gCMB , ν, d, n , n distribution. The gCMB distribution 1 2 λ +λ 1 2 contains several special cases, including the CMB(d, p, ν) distribution (for n = n = 1); 1 2 the Binomial(d, p) distribution (when n = n = 1and ν = 1); and, for λ = λ = λ,the 1 2 1 2 hypergeometric distribution when ν →∞ and the negative hypergeometric distribution when ν = 0and λ< 1. First-order sCMP time series models This section highlights two first-order models for discrete time series data that have a sCMP marginal distribution, namely the first-order sCMP autoregressive (SCMPAR(1)) model, and a first-order SCMP moving average (SCMPMA(1)) model with the same marginal distribution structure. First-order sCMP autoregressive (SCMPAR(1)) model Sellers et al. (2020) introduce a first-order sCMP autoregressive (SCMPAR(1)) model to describe count data correlated in time that express over- or under-dispersion. Based on the sCMP and gCMB distributions, respectively (as described in the “Motivating distri- butions” section with more detail available in Sellers et al. (2017)), we use the sCMP distri- bution to model the marginals of the first-order integer-valued autoregressive (INAR(1)) process as X = C (X ) +  t = 1, 2, ... , (18) t t t−1 t where  ∼ sCMP(λ, ν, n ),and {C (•) : t = 1, 2, ...} is asequenceofindependent t 2 t gCMB , ν, •, n , n operators, independent of { }. This flexible INAR(1) model con- 1 2 t tains the first-order Poisson autoregressive (PAR(1)) as described in several references (Al-Osh and Alzaid 1987;McKenzie 1988;Weiss 2008), and the first-order binomial auto- regressive model of Al-Osh and Alzaid (1991) as special cases. It likewise contains an INAR(1) model that allows for negative binomial marginals with a thinning operator whose pmf is negative hypergeometric. The SCMPAR(1) model is yet another special case of the infinitely divisible convolution- closed class of first-order autoregressive (AR(1)) models described in Joe (1996), and satisfies the Markov property with the transition probability, ⎡ ⎤ ν x −k x k k t−1 x −k t−1 t−1 ⎣ ⎦ a ,...,a =0 n b ,...,b =0 k 1 a ,...,a n b ,...,b 1 1 n 1 1 n min(x ,x ) 1 2 2 t t−1 a +...+a =k 1 n b +...+b =x −k 1 1 n t−1 P(X |X ) = t t−1 x ν t−1 x t−1 c ,...,c =0 n +n c ...c k=0 1 1 n +n 1 2 1 2 c +...+c =x 1 n +n t−1 1 2 x −k t ν x −k λ x − k × . ν n [ (x − k)!] Z (λ, ν) d , ... , d t 1 n d ,...,d =0 d +...+d =x −k 1 n t−1 (19) The SCMPAR(1) model has an ergodic Markov chain, thus X has a stationary sCMP(λ, ν, n + n ) distribution that is unique. The joint pgf associated with the 1 2 SCMPAR(1) model is (2021) 8:1 Sellers et al. 
Journal of Statistical Distributions and Applications Page 7 of 12 n n n 2 n 2 1 2 1 ζ (λu, ν) ζ (λul, ν) ζ (λl, ν) (ζ(λu, ν)ζ (λl, ν)) ζ (λul, ν) φ (u, l) = = , (20) X ,X t+1 t n n n n +2n 2 1 2 1 2 ζ (λ, ν) ζ (λ, ν) ζ (λ, ν) ζ (λ, ν) where the pgf is symmetric in u and l, and hence the joint distribution of X and X t+1 t is time reversible. The regression form for the SCMPAR(1) process can be determined, and the general autocorrelation function for the process {X } is ρ = Corr(X , X ) = t r t t−r for r = 0, 1, 2, .... Parameter estimation can be conducted via conditional max- n +n 1 2 imum likelihood with statistical computation tools (e.g. in R); see Sellers et al. (2020)for details. Introducing the sCMPMA(1) model Motivated by the SCMPAR(1) model of Sellers et al. (2020), we introduce a first-order sum-of-CMPs moving average (SCMPMA(1)) process X by ∗ ∗ ∗ X = C  +  , t = 1, 2, ... , (21) t t−1 t ∗ ∗ where  is a sequence of iid sCMP(λ, ν, m + m ) random variables and C (•) is a 1 2 t t sequence of independent gCMB(1/2, ν, •, m , m ) operators independent of  . By defini- 1 2 tion, X is a stationary process with the sCMP(λ, ν,2m + m ) distribution, and X and t 1 2 t+r X are independent for |r| > 1. While this model can analogously be viewed as a special case of the infinitely divisible convolution-closed class of discrete MA models (Joe 1996), unlike the sCMPAR(1) process, the sCMPMA(1) process is not Markovian. The autocorrelation between X and X is t t+1 ∗ ∗ ∗ ∗ ∗ ∗ Cov C  +  , C  + t t t t−1 t+1 t+1 ρ = Corr(X , X ) = 1 t t+1 Var(X )Var(X ) t t+1 ∗ ∗ ∗ Cov  , C t t+1 t =  by the independence assumptions, Var(X )Var(X ) t t+1 m m +m ∗ ∗ 1 ∗ 1 2 where C ( ) = Y and  = Y ,respectively,aresCMP(λ, ν, m ) and i i 1 t+1 t i=1 t i=1 sCMP(λ, ν, m + m ) random variables; i.e. each sCMP random variable can be viewed as 1 2 respective sums of iid CMP(λ, ν) random variables, Y .Thus, m +m m m 1 2 1 1 ∗ ∗ ∗ Cov  , C  = Cov Y , Y = Var Y = m Var(Y ), i i i 1 t t+1 t i=1 i=1 i=1 where, without loss of generality, we let Y denote any of the iid Y random variables. Meanwhile, because {X } is a sCMP(λ, ν,2m + m ) distributed stationary process, we t 1 2 2m +m 2m +m 1 2 1 2 can likewise represent Var(X ) = Var Y = Var(Y ) = (2m + t i i 1 i=1 i=1 m )Var(Y ) for all t. We therefore find that ∗ ∗ ∗ Cov  , C  m Var(Y ) m t t 1 1 t+1 ρ = Corr (X , X ) =  = = .(22) 1 t t+1 (2m + m )Var(Y ) 2m + m Var X Var(X ) ( ) 1 2 1 2 t t+1 Because m , m ≥ 1, the one-step range of possible correlation values is 0 ≤ ρ ≤ 0.5. 1 2 1 In particular, for m = m , we have the special case where ρ = 1/3. Meanwhile, ρ = 0 1 2 1 k for all k > 1 because, by definition of the SCMPMA(1) model assumptions, there is no dependent structure between X and X for r > 1. t t+r (2021) 8:1 Sellers et al. Journal of Statistical Distributions and Applications Page 8 of 12 Recall from the “The Conway-Maxwell-Poisson distribution and its generalization” ζ(λw,ν) section that  (w) = is thepgf forasCMP(λ, ν, π) distributed random variable, ζ(λ,ν) (say) G. Using this knowledge along with Eq. 
Recall from the "The Conway-Maxwell-Poisson distribution and its generalization" section that the pgf of a CMP(λ, ν) random variable is Z(λw, ν)/Z(λ, ν), so that a sCMP(λ, ν, n) random variable (say, G) has pgf Π_G(w) = [Z(λw, ν)/Z(λ, ν)]^n. Using this knowledge along with Eq. (21), the joint pgf can be derived as

$$\begin{aligned}
\phi_{X_{t+1}, X_t}(u, l) &= E\left(u^{X_{t+1}}\, l^{X_t}\right) \\
&= E\left(u^{C_{t+1}(\epsilon^*_t) + \epsilon^*_{t+1}}\, l^{X_t}\right) \\
&= E\left(u^{C_{t+1}(\epsilon^*_t) + \epsilon^*_{t+1}}\, l^{X_t - C_{t+1}(\epsilon^*_t)}\, l^{C_{t+1}(\epsilon^*_t)}\right) \\
&= E\left((ul)^{C_{t+1}(\epsilon^*_t)}\, u^{\epsilon^*_{t+1}}\, l^{X_t - C_{t+1}(\epsilon^*_t)}\right), \quad \text{where } X_t - C_{t+1}(\epsilon^*_t) = C_t(\epsilon^*_{t-1}) + \left[\epsilon^*_t - C_{t+1}(\epsilon^*_t)\right] \\
&= E\left((ul)^{C_{t+1}(\epsilon^*_t)}\right) E\left(u^{\epsilon^*_{t+1}}\right) E\left(l^{X_t - C_{t+1}(\epsilon^*_t)}\right) \quad \text{by independence} \\
&= \phi_{C_{t+1}(\epsilon^*_t)}(ul)\; \phi_{\epsilon^*_{t+1}}(u)\; \phi_{X_t - C_{t+1}(\epsilon^*_t)}(l) \\
&= \left(\frac{Z(\lambda u l, \nu)}{Z(\lambda, \nu)}\right)^{m_1} \left(\frac{Z(\lambda u, \nu)}{Z(\lambda, \nu)}\right)^{m_1 + m_2} \left(\frac{Z(\lambda l, \nu)}{Z(\lambda, \nu)}\right)^{m_1 + m_2} \\
&= \frac{\left(Z(\lambda u, \nu)\, Z(\lambda l, \nu)\right)^{m_1 + m_2} \left(Z(\lambda u l, \nu)\right)^{m_1}}{\left(Z(\lambda, \nu)\right)^{3 m_1 + 2 m_2}},
\end{aligned} \tag{23}$$

where Eq. (23) is equivalent to Eq. (20) (i.e. the SCMPMA(1) process is comparable to the SCMPAR(1) process) when m_1 = n_1 = n_2 − m_2. Given this comparison, we can easily determine the conditional mean E(X_{t+1} | X_t = x) and conditional variance Var(X_{t+1} | X_t = x). Eq. (23) further demonstrates that the SCMPMA(1) model is time-reversible.

Parameter estimation via maximum likelihood (ML) is a difficult task with INMA models given the complex form of the underlying distributions. Even a conditional least squares approach does not appear to be feasible "because of the thinning operators, unless randomization is used" (Brännäs and Hall 2001). We therefore instead consider the following ad hoc procedure for parameter estimation. Given a data set with an observed lag-1 correlation ρ̂_1, we first propose values for m_1, m_2 ∈ N that satisfy the constraint ρ̂_1 ≈ m_1/(2m_1 + m_2). Given m_1 and m_2, and recognizing that X_t is stationary with a sCMP(λ, ν, 2m_1 + m_2) distribution, we proceed with ML estimation to determine λ̂ and ν̂ as described in Zhu et al. (2017) for conducting sCMP(λ, ν, s = 2m_1 + m_2) parameter estimation with regard to a CMP process over an interval of length s ≥ 1. The corresponding variation for λ̂ and ν̂ can be quantified via the Fisher information matrix or nonparametric bootstrapping; because the sampling distribution for λ̂ is approximately symmetric while the sampling distribution for ν̂ is considerably right-skewed, analysts are advised to quantify estimator variation via nonparametric bootstrapping. While this is a means to an end, it only determines an appropriate distributional form for the data; it does not fully address the time-series nature of the data.
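To make the ad hoc procedure concrete, here is a hypothetical Python sketch (not the authors' code): (m_1, m_2) are fixed from the observed lag-1 correlation, and (λ, ν) are then estimated by maximizing the marginal sCMP(λ, ν, 2m_1 + m_2) log-likelihood, with the pmf computed by truncated convolution as in the earlier sketch. As noted above, this pins down only the distributional form and ignores the serial dependence.

```python
import numpy as np
from scipy.optimize import minimize

def cmp_pmf(lam, nu, max_y=200):
    """Truncated CMP(lam, nu) pmf (same helper as in the earlier sketch)."""
    y = np.arange(max_y + 1)
    log_w = y * np.log(lam) - nu * np.cumsum(np.log(np.maximum(y, 1)))
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def scmp_pmf(lam, nu, n, max_y=200):
    """sCMP(lam, nu, n) pmf via n-fold convolution of the CMP pmf."""
    pmf = cmp_pmf(lam, nu, max_y)
    out = pmf.copy()
    for _ in range(n - 1):
        out = np.convolve(out, pmf)[: max_y + 1]
    return out

def fit_scmpma1(x, m1, m2):
    """Ad hoc fit: (m1, m2) fixed from the observed lag-1 correlation, then (lam, nu)
    maximize the marginal sCMP(lam, nu, 2*m1 + m2) log-likelihood of the observed counts."""
    x = np.asarray(x, dtype=int)
    n = 2 * m1 + m2

    def negloglik(theta):
        lam, nu = np.exp(theta)  # optimize on the log scale so lam, nu stay positive
        return -np.sum(np.log(scmp_pmf(lam, nu, n)[x] + 1e-300))

    start = np.log([x.mean() / n, 1.0])
    res = minimize(negloglik, x0=start, method="Nelder-Mead")
    return np.exp(res.x)  # (lam_hat, nu_hat)

# Usage sketch: choose (m1, m2) so that m1 / (2*m1 + m2) is close to the observed
# lag-1 correlation of the count series x, then estimate the distributional parameters:
# lam_hat, nu_hat = fit_scmpma1(x, m1=1, m2=2)
```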
Data examples
To illustrate the flexibility of our INMA model, we consider various data simulations and a real data example. Below are the respective details and associated commentary.

Simulated data examples
Table 1 reports the estimated mean, variance, and autocorrelation that result from various data simulations of SCMPMA(1) data given parameters (λ, ν, m_1, m_2).

Table 1  Estimated mean, variance, and autocorrelation for various SCMPMA(1) data simulations of length 10,000 given parameters (λ, ν, m_1, m_2); λ = 0.5 for all simulations

m_1  m_2  ν     Est. Mean  True Mean  Est. Var.  True Var.  ρ̂      ρ
1    1    0     3.0073     3.00       5.9076     6.00       0.333   0.333
1    1    0.5   1.8846                2.3395                0.335   0.333
1    1    1     1.4842     1.50       1.4507     1.50       0.328   0.333
1    1    2     1.2359                1.0358                0.338   0.333
1    1    35    0.9826     1.00       0.6568     0.67       0.333   0.333
1    2    0     4.0027     4.00       8.0309     8.00       0.248   0.250
1    2    0.5   2.4891                3.0010                0.253   0.250
1    2    1     1.9861     2.00       1.9755     2.00       0.252   0.250
1    2    2     1.6427                1.3284                0.257   0.250
1    2    35    1.3355     1.33       0.8964     0.89       0.254   0.250
2    1    0     5.0408     5.00       10.2256    10.00      0.404   0.400
2    1    0.5   3.1414                3.9010                0.402   0.400
2    1    1     2.4823     2.50       2.4689     2.50       0.396   0.400
2    1    2     2.0094                1.6561                0.401   0.400
2    1    35    1.6583     1.67       1.1135     1.11       0.390   0.400
2    2    0     6.0019     6.00       12.2343    12.00      0.331   0.333
2    2    0.5   3.7195                4.5267                0.338   0.333
2    2    1     2.9873     3.00       3.0130     3.00       0.329   0.333
2    2    2     2.4488                2.0178                0.336   0.333
2    2    35    1.9861     2.00       1.2964     1.33       0.326   0.333

For the special cases (i.e. ν = 0, 1, 35, where ν = 35 sufficiently represents performance as ν → ∞), we likewise provide the expected/true mean and variance. Along with the estimated autocorrelation ρ̂ that results from the data, the table reports the true autocorrelation, ρ = m_1/(2m_1 + m_2), for all {m_1, m_2} and any ν (Eq. (22)).

In all examples, we let λ = 0.5, m_1, m_2 ∈ {1, 2}, and ν ∈ {0, 0.5, 1, 2, 35}, where ν = 0 captures the case of extreme over-dispersion, ν = 1 denotes equi-dispersion, and ν = 35 sufficiently illustrates, computationally, the case of utmost under-dispersion where ν → ∞. For all examples, we find that the associated mean and variance compare with each other as expected: the variance is greater than the mean when ν < 1 (i.e. the data are over-dispersed), the variance and mean are approximately equal when ν = 1 (i.e. equi-dispersion holds), and the variance is less than the mean when ν > 1 (i.e. the data are under-dispersed). In particular, we can easily verify that the three special-case models perform as expected. For the Poisson cases (ν = 1), we expect the mean and variance to both equal (2m_1 + m_2)λ, while the binomial cases (i.e. ν → ∞ with p = λ/(λ + 1)) produce a mean equal to (2m_1 + m_2)·λ/(λ + 1) and a variance equaling (2m_1 + m_2)·(λ/(λ + 1))·(1 − λ/(λ + 1)), and the negative binomial cases (ν = 0 with p = 1 − λ) have a mean of (2m_1 + m_2)λ/(1 − λ) and a variance equaling (2m_1 + m_2)λ/(1 − λ)². In fact, even with the ν → ∞ case approximated by letting ν = 35, we still obtain reasonable estimates for the mean and variance for all of the associated cases of m_1 and m_2.

For each {m_1, m_2} pair, the mean and variance both decrease as ν increases while, for all of the considered examples, we obtain estimated correlation values ρ̂ that approximately equal the true correlation, ρ. In particular, for those cases where m_1 = m_2, we obtain ρ̂ ≈ 1/3 as expected (see Eq. (22)).
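As an illustration of the closed-form special-case moments just described, the following short Python snippet (illustrative only; the function name is hypothetical) computes the true means and variances reported in Table 1 for λ = 0.5 and each (m_1, m_2) pair.

```python
def special_case_moments(lam, m1, m2):
    """True mean/variance of the sCMP(lam, nu, 2*m1 + m2) marginal in the three special
    cases reported in Table 1: nu = 1 (Poisson), nu -> inf (binomial), nu = 0 (neg. binomial)."""
    n = 2 * m1 + m2
    p = lam / (lam + 1)
    return {
        "Poisson (nu = 1)":       (n * lam, n * lam),
        "binomial (nu -> inf)":   (n * p, n * p * (1 - p)),
        "neg. binomial (nu = 0)": (n * lam / (1 - lam), n * lam / (1 - lam) ** 2),
    }

for m1, m2 in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    print((m1, m2), special_case_moments(0.5, m1, m2))
```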
Fig. 1  (a) ACF plot for IP data; (b) PACF plot for IP data

Real data example: IP address counts
Weiss (2007) considers a modified dataset regarding the number of unique IP addresses that access the webpages of the Department of Statistics of the University of Würzburg in 240 two-minute intervals. Collected on November 29, 2005 (from 10:00:00 to 18:00:00), these data have an associated mean and variance equaling 1.286 and 1.205, respectively. Weiss (2007) considers a PAR(1) model, noting that "the empirical partial autocorrelation function indicates that a first order [autoregressive] model may be an appropriate choice", with ρ̂ = 0.292; Sellers et al. (2020), following suit, consider a SCMPAR(1) model as a flexible alternative to the PAR(1) model.

The ACF and PACF plots of these data, however, do not clearly distinguish between a first-order autoregressive and a moving average model; see Fig. 1a-b. Further, recognizing that the data express apparent under- to equi-dispersion, we therefore consider the SCMPMA(1) as an illustrative model for analysis. We perform ML estimation assuming various combinations for (m_1, m_2) (i.e. {(1,1), (1,2), (2,1), (2,2)}), as these values yield model correlations m_1/(2m_1 + m_2) near the observed correlation, 0.25 = 1/4 < ρ̂ < 1/3 ≈ 0.33. Table 2 contains the resulting parameter estimates for λ and ν, along with the respective Akaike Information Criterion (AIC) values. While the SCMPMA(1) model with m_1 = m_2 = 2 has the lowest AIC among the four models considered, all of these models produce approximately equal AIC values (i.e. ≈ 695.2), where increasing m_1 and m_2 values associate with decreasing λ̂ and increasing ν̂. This makes sense because the resulting estimates rely solely on the assumed underlying sCMP(λ, ν, 2m_1 + m_2) distributional form for the data.

Table 2  Estimated parameters, 95% confidence intervals for λ and ν derived from nonparametric bootstrapping, and Akaike Information Criterion (AIC) values for various SCMPMA(1) models for the IP data

m_1  m_2  λ̂ Est.  λ 95% CI         ν̂ Est.  ν 95% CI          AIC
1    1    0.461   (0.374, 0.585)   1.285   (0.650, 2.291)    695.24
1    2    0.346   (0.278, 0.433)   1.387   (0.582, 2.937)    695.21
2    1    0.277   (0.225, 0.348)   1.493   (0.500, 4.594)    695.20
2    2    0.231   (0.188, 0.289)   1.612   (0.429, 25.802)   695.19

The dispersion estimates in Table 2 are all greater than 1, thus implying a perceived level of data under-dispersion. These results naturally stem from the reported mean of the data (1.286) being greater than its corresponding variance (1.205). Their associated 95% confidence intervals (determined via nonparametric bootstrapping; also supplied in Table 2), however, are sufficiently large that they contain ν = 1. This suggests that the apparent data under-dispersion is not statistically significant, and thus that the data can instead be analyzed via the Al-Osh and Alzaid (1988) PMA(1) model. It is further striking that the respective 95% confidence intervals associated with the dispersion parameter widen with the size of the underlying sCMP(λ, ν, 2m_1 + m_2) model. This is an artifact of the (s)CMP distribution, namely that the sampling distribution of ν̂ is right-skewed (as discussed in Zhu et al. (2017)). This approach confirms interest in the PMA(1) model, where Eqs. (1)-(2) imply that the associated estimated parameters are γ̂ ≈ 0.4124 and η̂ ≈ 0.9105. Thus, we benefit from the SCMPMA(1) as a tool for parsimonious model determination.
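For reference, the PMA(1) parameter values quoted above follow directly from the sample mean and lag-1 autocorrelation via the moment relations in Eqs. (1)-(2). A small Python sketch (illustrative only; the helper name is hypothetical):

```python
def pma1_moment_estimates(xbar, rho1):
    """PMA(1) estimates implied by the moment relations mean = (1 + gamma) * eta
    and rho_1 = gamma / (1 + gamma), per Eqs. (1)-(2)."""
    gamma = rho1 / (1 - rho1)
    eta = xbar / (1 + gamma)
    return gamma, eta

# IP-address data: sample mean 1.286 and lag-1 autocorrelation 0.292
print(pma1_moment_estimates(1.286, 0.292))  # approximately (0.4124, 0.9105)
```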
Discussion
This work utilizes the sCMP distribution of Sellers et al. (2017) to develop a SCMPMA(1) model that serves as a flexible moving average time series model for discrete data where data dispersion is present. The SCMPMA(1) model captures the PMA(1), as well as versions of a negative binomial and a binomial MA(1) structure, respectively, as special cases. This, along with the flexible SCMPAR(1), can further be used to derive broader auto-regressive moving average (ARMA) and auto-regressive integrated moving average (ARIMA) models based on the sCMP distribution.

The SCMPMA(1) shares many properties with the analogous SCMPAR(1) model of Sellers et al. (2020). The presented models rely on predefining discrete values (i.e. m_1, m_2 for the SCMPMA(1)) for parameter estimation. As done in Sellers et al. (2017) and Sellers and Young (2019), we utilize a profile likelihood approach where, given m_1 and m_2, we estimate the remaining model coefficients and then identify the collection of parameter estimates that produces the largest likelihood, thus identifying these parameter estimates as the MLEs. While this profile likelihood approach is acceptable, as demonstrated in other applications, directly estimating m_1 and m_2 along with the other SCMPMA(1) model estimates would likewise prove beneficial, as would redefining the model to allow for real-valued estimators of m_1 and m_2. These generalizations and estimation approaches can be explored in future work.

Simulated data examples illustrate that the SCMPMA(1) model can obtain unbiased estimates, and the model demonstrates potential for accurate forecasts given data containing any measure of data dispersion. The real data illustration, however, highlights the complexities that come with parameter estimation. While we nonetheless present a means toward achieving this goal, this approach does not perform especially strongly with regard to prediction and forecasting. It nonetheless serves as a starting point for parameter estimation that we will continue to investigate in future work. Moreover, the flexibility of the SCMPMA(1) aids in determining a parsimonious model form as appropriate.

Abbreviations
AR(1): First-order autoregressive; ARIMA: Auto-regressive integrated moving average; ARMA: Auto-regressive moving average; CMB: Conway-Maxwell-Binomial; CMP: Conway-Maxwell-Poisson; gCMB: Generalized Conway-Maxwell-Binomial; GPAR(1): First-order generalized Poisson autoregressive; GPMA(1): First-order generalized Poisson moving average; INAR(1): First-order integer-valued autoregressive; INMA: Integer-valued moving average; MA: Moving average; mgf: Moment generating function; PAR(1): First-order Poisson autoregressive; pgf: Probability generating function; PMA: Poisson moving average; PMA(1): First-order Poisson moving average; QB: Quasi-binomial; sCMP: Sum-of-Conway-Maxwell-Poisson; SCMPAR(1): First-order sum-of-Conway-Maxwell-Poisson autoregressive; SCMPMA(1): First-order sum-of-Conway-Maxwell-Poissons moving average

Acknowledgements
This paper is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the U.S. Census Bureau. SM and FC thank the Georgetown Undergraduate Research Opportunities Program (GUROP) for their support. All authors thank Dr. Christian Weiss for use of the IP dataset, and the reviewers for their feedback and comments.

Authors' contributions
KFS developed the research idea. All authors contributed towards the literature review, theoretical developments, and statistical computing. The author(s) read and approved the final manuscript.

Funding
SM was funded in part by the GUROP.

Availability of data and materials
Simulated data can vary given the generation process. Simulation code(s) can be supplied upon request. The IP data set was obtained from Dr. Christian Weiss of Helmut Schmidt University.

Competing interests
No authors have competing interests relating to this work.

Received: 22 April 2020  Accepted: 26 January 2021

References
Al-Osh, M. A., Alzaid, A. A.: First-order integer valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8(3), 261–275 (1987)
Al-Osh, M. A., Alzaid, A. A.: Integer-valued moving average (INMA) process. Stat. Pap. 29(1), 281–300 (1988)
Al-Osh, M. A., Alzaid, A. A.: Binomial autoregressive moving average models. Commun. Stat. Stoch. Model. 7(2), 261–282 (1991)
Alzaid, A. A., Al-Osh, M. A.: Some autoregressive moving average processes with generalized Poisson marginal distributions. Ann. Inst. Stat. Math. 45(2), 223–232 (1993)
Borges, P., Rodrigues, J., Balakrishnan, N., Bazán, J.: A COM-Poisson type generalization of the binomial distribution and its properties and applications. Stat. Probab. Lett. 87, 158–166 (2014)
Brännäs, K., Hall, A.: Estimation in integer-valued moving average models. Appl. Stoch. Model. Bus. Ind. 17, 277–291 (2001)
Conway, R. W., Maxwell, W. L.: A queuing model with state dependent service rates. J. Ind. Eng. 12, 132–136 (1962)
Famoye, F.: Restricted generalized Poisson regression model. Commun. Stat. Theory Methods 22(5), 1335–1354 (1993)
Hilbe, J. M.: Modeling Count Data. Cambridge University Press, New York, NY (2014)
Joe, H.: Time series models with univariate margins in the convolution-closed infinitely divisible class. J. Appl. Probab. 33(3), 664–677 (1996)
Kadane, J. B.: Sums of possibly associated Bernoulli variables: The Conway-Maxwell-Binomial distribution. Bayesian Anal. 11(2), 403–420 (2016)
McKenzie, E.: ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20(4), 822–835 (1988)
Sellers, K. F., Peng, S. J., Arab, A.: A flexible univariate autoregressive time-series model for dispersed count data. J. Time Ser. Anal. 41(3), 436–453 (2020). https://doi.org/10.1111/jtsa.12516
Sellers, K. F., Swift, A. W., Weems, K. S.: A flexible distribution class for count data. J. Stat. Distrib. Appl. 4(22), 1–21 (2017). https://doi.org/10.1186/s40488-017-0077-0
Sellers, K. F., Young, D. S.: Zero-inflated sum of Conway-Maxwell-Poissons (ZISCMP) regression. J. Stat. Comput. Simul. 89(9), 1649–1673 (2019)
Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S., Boatwright, P.: A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Appl. Stat. 54, 127–142 (2005)
Weiss, C. H.: Controlling correlated processes of Poisson counts. Qual. Reliab. Eng. Int. 23(6), 741–754 (2007)
Weiss, C. H.: Thinning operations for modeling time series of counts – a survey. Adv. Stat. Anal. 92, 319–341 (2008)
Weiss, C. H.: An Introduction to Discrete-Valued Time Series. John Wiley & Sons, Inc., Hoboken, NJ (2018)
Weiss, C. H.: Stationary count time series models. Wiley Interdiscip. Rev. Comput. Stat. 13(1), 1502 (2021). https://doi.org/10.1002/wics.1502
Zhu, L., Sellers, K. F., Morris, D. S., Shmueli, G.: Bridging the gap: A generalized stochastic process for count data. Am. Stat. 71(1), 71–80 (2017)

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
