Journal of Statistical Theory and Applications

Volume 19, Issue 3, September 2020, Pages 415 - 431

First-Order Integer-Valued Moving Average Process with Power Series Innovations

Authors
Eisa Mahmoudi*, ORCID, Ameneh Rostami
Department of Statistics, Yazd University, P.O. Box 89175-741, Yazd, Iran
*Corresponding author. Email: emahmoudi@yazd.ac.ir
Received 5 September 2019, Accepted 19 May 2020, Available Online 28 September 2020.
DOI
10.2991/jsta.d.200917.001
Keywords
First-order integer-valued moving average process; Poisson thinning operator; Moving average model; Power series family of distributions
Abstract

In this paper, we introduce a first-order nonnegative integer-valued moving average process with power series innovations based on a Poisson thinning operator (PINMAPS(1)) for modeling overdispersed, equidispersed and underdispersed count time series. This process contains the PINMA process with geometric, Bernoulli, Poisson, binomial, negative binomial and logarithmic innovations, some of which are studied in detail. Some statistical properties of the process are obtained. The unknown parameters of the model are estimated using the Yule-Walker, conditional least squares and feasible generalized least squares methods, and the performance of the estimators is evaluated in a simulation study. Finally, we apply the model to three real data sets and show its ability to predict the data compared to competing models.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

In recent decades, integer-valued time series have played an important role in research. This type of time series is extensively used in different sciences such as the natural, social, agricultural, medical and health sciences. Examples include the number of chromosome interchanges in cells, the number of bases in DNA sequences, the number of births in a hospital in successive months, the number of customers of an internet server in a period of time and the number of shares sold in the stock market.

Many models have been proposed by researchers for modeling integer-valued time series, such as the first-order integer-valued autoregressive (INAR(1)) process (Al-Osh and Alzaid [1]), the integer-valued moving average process (INMA(1)) (McKenzie [2]; Al-Osh and Alzaid [3]), estimation in INMA models (Brännäs and Hall [4]), bivariate time series modeling of financial count data (Quoreshi [5]), a new geometric INAR(1) process based on the negative binomial thinning operator (Ristić et al. [6]), integer-valued moving average modeling of the number of transactions in stocks (Brännäs and Quoreshi [7]), a combined geometric INAR(p) model based on negative binomial thinning (Nastić et al. [8]), a bivariate INAR(1) time series model with geometric marginals (Ristić et al. [9]), compound Poisson INAR(1) processes (Schweer and Weiß [10]), INMA models with structural changes (Yu et al. [11]), INAR(1) processes with power series innovations (Bourguignon and Vasconcellos [12]), the combined Poisson INMA(q) models for time series of counts (Yu and Zou [13]) and the INAR(1) model with Poisson-Lindley marginal distribution (Mohammadpour et al. [14]).

One important characteristic of count data time series is the overdispersion, equidispersion or underdispersion property. The INAR models are the most widely used integer-valued time series models for dealing with this type of data. However, INAR models are sometimes no longer the best choice when the data have very short-run autocorrelation. In this case, the INMA models work better: for an INMA(q) model, the correlation between $Y_t$ and $Y_{t-k}$ is zero for $k>q$, whereas in INAR models this correlation only decreases gradually as $k$ increases. Thus, we need to introduce a new INMA-based model for modeling overdispersed, equidispersed and underdispersed count data with very short-run autocorrelation.

The aim of this paper is to introduce a new INMA(1) process with power series innovations based on a Poisson thinning operator. In this process, we use the Poisson thinning operator, whose counting series are Poisson. Unlike the binomial thinning operator (Steutel and van Harn [15]), whose counting series can only take the values 0 or 1 and which is appropriate for modeling random events that can only survive or disappear after a time period, the counting series of the Poisson thinning operator, like those of the negative binomial thinning operator (Ristić et al. [6]), can take any nonnegative integer value and are appropriate for modeling random events capable of replicating themselves. The main reason for using the Poisson thinning operator instead of the negative binomial thinning operator in this model is that the INMA(1) process based on the Poisson thinning operator produces more dependence between the time series variables than the INMA(1) process based on the negative binomial thinning operator under the same assumptions.

To clarify why the proposed model is useful, the following points are essential.

  1. This model is suitable for modeling time series count data with overdispersion, equidispersion and underdispersion that have a very short-run autocorrelation.

  2. In this model, we use innovations that come from the power series family of distributions. This family has a flexible functional form in its parameters and is useful for modeling overdispersed, equidispersed and underdispersed count data. Moreover, the power series family includes many important discrete distributions, such as the Poisson, geometric, Bernoulli, binomial, negative binomial and logarithmic distributions, which are flexible and extensively used for fitting count data.

The paper is organized as follows:

The INMA(1) model with power series innovations based on the Poisson thinning operator is defined in Section 2 and some of its statistical properties are presented. In Section 3, the estimators of the model parameters are obtained using the Yule-Walker (YW), conditional least squares (CLS) and feasible generalized least squares (FGLS) methods. In Section 4, four special cases of the model are given. Some simulation results of the estimators are provided in Section 5. Section 6 deals with three real applications of the proposed model. Finally, we conclude the paper in Section 7.

2. CONSTRUCTION OF THE MODEL

In this section, we introduce an INMA model of the first order generated by the Poisson thinning operator and with the power series innovations. Therefore, we first introduce the Poisson thinning operator.

The Poisson thinning operator, denoted by $\ast$, is defined by

$$\alpha \ast X=\sum_{i=1}^{X}Z_i,$$
where $\{Z_i\}$ is a sequence of independent and identically distributed (i.i.d.) Poisson random variables with mean $\alpha$, and $\alpha \ast X=0$ a.s. when $X=0$. For a given random variable $X$, the conditional distribution of $\alpha \ast X$ given $X$ is Poisson with mean $\alpha X$. This implies that the conditional mean and variance of $\alpha \ast X$ given $X$ are $E(\alpha \ast X\mid X)=\alpha X$ and $Var(\alpha \ast X\mid X)=\alpha X$, respectively. Then, the unconditional mean and variance of $\alpha \ast X$ are given by $E(\alpha \ast X)=\alpha E(X)$ and $Var(\alpha \ast X)=\alpha E(X)+\alpha^2 Var(X)$, respectively. Since the counting series are Poisson random variables, the probability generating function (pgf) of $\alpha \ast X$ is given by $\varphi_{\alpha \ast X}(s)=\varphi_X(e^{\alpha(s-1)})$.
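These moment formulas can be checked numerically. The sketch below is a minimal illustration assuming NumPy; the helper name `poisson_thin` is ours. It uses the fact that a sum of $X$ i.i.d. Poisson($\alpha$) variables is itself Poisson($\alpha X$):

```python
import numpy as np

def poisson_thin(alpha, x, rng):
    # Sum of x i.i.d. Poisson(alpha) counting variables; for integer x >= 0
    # this sum is distributed as a single Poisson(alpha * x) draw.
    return rng.poisson(alpha * x)

rng = np.random.default_rng(1)
alpha = 0.5
x = rng.poisson(2.0, size=200_000)      # an arbitrary nonnegative integer X
thinned = poisson_thin(alpha, x, rng)

# E(alpha*X) = alpha E(X) = 1.0 ; Var(alpha*X) = alpha E(X) + alpha^2 Var(X) = 1.5
print(thinned.mean(), thinned.var())
```

The sample mean and variance should be close to 1.0 and 1.5, matching the unconditional formulas above.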

As mentioned above, we consider a model whose innovations are distributed according to the power series family of distributions, so we first review some properties of this family. A nonnegative integer-valued random variable $X$ is said to have a power series distribution if its probability mass function is given by

$$P(X=x)=\frac{a(x)\theta^x}{C(\theta)},\quad x\in T,\qquad (1)$$
where $T$ is a subset of the nonnegative integers, $a(x)\ge 0$, $\theta>0$ and $C(\theta)=\sum_{x\in T}a(x)\theta^x$.

This family of distributions contains some well-known distributions such as the Bernoulli, binomial, Poisson, geometric, negative binomial and logarithmic distributions, as shown in Table 1.

For each distribution, the table lists $\theta$, $C(\theta)$, $a(x)$ and $T$:

- Bernoulli with parameter $p$: $\theta=p/(1-p)$; $C(\theta)=1+\theta$; $a(x)=1$; $T=\{0,1\}$
- Binomial with parameters $n$, $p$: $\theta=p/(1-p)$; $C(\theta)=(1+\theta)^n$; $a(x)=\binom{n}{x}$; $T=\{0,1,\ldots,n\}$
- Poisson with parameter $\lambda$: $\theta=\lambda$; $C(\theta)=e^{\theta}$; $a(x)=1/x!$; $T=\{0,1,\ldots\}$
- Geometric with parameter $p$: $\theta=1-p$; $C(\theta)=(1-\theta)^{-1}$; $a(x)=1$; $T=\{0,1,\ldots\}$
- Negative binomial with parameters $r$, $p$: $\theta=1-p$; $C(\theta)=(1-\theta)^{-r}$; $a(x)=\binom{x+r-1}{r-1}$; $T=\{0,1,\ldots\}$
- Logarithmic with parameter $p$: $\theta=p$; $C(\theta)=-\log(1-\theta)$; $a(x)=x^{-1}$; $T=\{1,2,\ldots\}$
Table 1

Special cases of power series distribution.

The pgf of a random variable $X$ with a power series distribution is given by $\varphi_X(s)=C(\theta s)/C(\theta)$. Using this fact, the expectation and variance of $X$ are $E(X)=\theta G'(\theta)$ and $Var(X)=\theta^2 G''(\theta)+\theta G'(\theta)$, respectively, where $G(\theta)=\log C(\theta)$ and $G'$ and $G''$ are the first two derivatives of $G$. The dispersion index is given by

$$I_X=\frac{\theta^2 G''(\theta)+\theta G'(\theta)}{\theta G'(\theta)}=1+\frac{\theta G''(\theta)}{G'(\theta)}.$$

Thus, this distribution is overdispersed if $C(\theta)=(1-\theta)^{-1}$, $C(\theta)=(1-\theta)^{-r}$ or $C(\theta)=-\log(1-\theta)$ with $-\log(1-\theta)>1$; underdispersed if $C(\theta)=1+\theta$, $C(\theta)=(1+\theta)^{n}$ or $C(\theta)=-\log(1-\theta)$ with $0<-\log(1-\theta)<1$; and equidispersed if $C(\theta)=e^{\theta}$.
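These dispersion regimes can be verified numerically from $I_X=1+\theta G''(\theta)/G'(\theta)$ alone. The sketch below assumes NumPy; the helper `dispersion_index` is ours and approximates $G'$ and $G''$ by central finite differences:

```python
import numpy as np

def dispersion_index(C, theta, h=1e-5):
    # I_X = 1 + theta * G''(theta)/G'(theta), with G = log C,
    # using central finite differences for the derivatives of G.
    G = lambda t: np.log(C(t))
    g1 = (G(theta + h) - G(theta - h)) / (2 * h)
    g2 = (G(theta + h) - 2 * G(theta) + G(theta - h)) / h ** 2
    return 1 + theta * g2 / g1

i_pois = dispersion_index(lambda t: np.exp(t), 0.7)      # Poisson: exactly 1
i_geom = dispersion_index(lambda t: 1 / (1 - t), 0.3)    # geometric: > 1
i_binom = dispersion_index(lambda t: (1 + t) ** 5, 0.3)  # binomial: < 1
print(i_pois, i_geom, i_binom)
```

For the geometric case the closed form is $I_X=1+\theta/(1-\theta)$, and for the binomial case $I_X=1-\theta/(1+\theta)$, which the numerical values reproduce.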

Definition 2.1

A time series model {Yt} given by

$$Y_t=\alpha \ast \varepsilon_{t-1}+\varepsilon_t,\quad t\in\{0,\pm 1,\pm 2,\ldots\},\qquad (2)$$
is called the integer-valued moving average model with power series innovations based on the Poisson thinning operator (PINMAPS(1)) if the following conditions are satisfied:
  1. {εt} is a sequence of i.i.d. random variables with a power series distribution given by (1).

  2. The counting series $\{Z_i^{(t)}\}$ incorporated in $\alpha \ast \varepsilon_t$ have the Poisson distribution with parameter $\alpha\in(0,1)$ for all $t$ and $i$.

  3. All the counting series incorporated in $\alpha \ast \varepsilon_s$ and $\alpha \ast \varepsilon_t$ are independent for all $s\ne t$.

  4. The counting series {Zi(s)} are independent of the random variables εt for all t, s and i.

Under the above assumptions, we obtain the expectation and variance of the random variable Yt, respectively, as follows:

$$E(Y_t)=(1+\alpha)\theta G'(\theta),\qquad Var(Y_t)=\alpha\theta G'(\theta)+(1+\alpha^2)\left[\theta^2 G''(\theta)+\theta G'(\theta)\right].$$

Thus, the dispersion index is given by
$$I_{Y_t}=1+\frac{\theta G''(\theta)/G'(\theta)+\alpha^2\left[\theta G''(\theta)/G'(\theta)+1\right]}{1+\alpha}.$$

Remark 2.1

  1. If $C(\theta)=(1-\theta)^{-1}$, $C(\theta)=(1-\theta)^{-r}$ or $C(\theta)=e^{\theta}$, the PINMAPS(1) process is overdispersed.

  2. If $C(\theta)=1+\theta$ or $C(\theta)=(1+\theta)^{n}$, this process is overdispersed when $\theta<\alpha^2$, underdispersed when $\theta>\alpha^2$ and equidispersed when $\theta=\alpha^2$.

  3. If $C(\theta)=-\log(1-\theta)$, this process is overdispersed when $-\log(1-\theta)>1$ and underdispersed when $0<-\log(1-\theta)<1$.

The covariance between the random variables $Y_t$ and $Y_{t-1}$ is $Cov(Y_t,Y_{t-1})=\alpha[\theta^2 G''(\theta)+\theta G'(\theta)]$, which implies that the lag-1 serial correlation of the PINMAPS(1) model is

$$\rho(1)=\frac{\alpha\sigma_\varepsilon^2}{\alpha\mu_\varepsilon+(1+\alpha^2)\sigma_\varepsilon^2}=\frac{\alpha\left[\theta^2 G''(\theta)+\theta G'(\theta)\right]}{\alpha\theta G'(\theta)+(1+\alpha^2)\left[\theta^2 G''(\theta)+\theta G'(\theta)\right]},$$
where $\mu_\varepsilon$ and $\sigma_\varepsilon^2$ are the expectation and variance of the innovation $\varepsilon_t$. All serial correlations at lags $k\ge 2$ are equal to 0. We can see that $\rho(1)$ is nonnegative and bounded above by $\frac{1}{2}$, as for the INMA(1) model introduced by Al-Osh and Alzaid [3].
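A short simulation illustrates these correlation properties; this is a minimal sketch assuming NumPy, with helper names (`simulate`, `acf`) of our own. For Poisson($\theta$) innovations, $\mu_\varepsilon=\sigma_\varepsilon^2=\theta$, so $\rho(1)$ reduces to $\alpha/(1+\alpha+\alpha^2)$:

```python
import numpy as np

def simulate(alpha, theta, T, seed=0):
    # Y_t = alpha * eps_{t-1} + eps_t with Poisson(theta) innovations; the
    # Poisson-thinned term is drawn directly as Poisson(alpha * eps_{t-1}).
    rng = np.random.default_rng(seed)
    eps = rng.poisson(theta, size=T + 1)
    return rng.poisson(alpha * eps[:-1]) + eps[1:]

def acf(y, k):
    # Sample autocorrelation at lag k.
    return np.corrcoef(y[:-k], y[k:])[0, 1]

alpha, theta = 0.7, 3.0
y = simulate(alpha, theta, 200_000)
rho1 = alpha / (1 + alpha + alpha ** 2)   # theoretical lag-1 correlation
print(acf(y, 1), rho1)                    # sample lag-1 ACF vs. theory
print(acf(y, 2))                          # near 0: the MA(1)-type cut-off
```

The lag-1 sample autocorrelation matches the theoretical $\rho(1)$, while lag-2 and beyond vanish, which is the short-run autocorrelation pattern the model targets.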

Theorem 2.1.

Let $\{Y_t\}$ be the process defined in (2); then $\{Y_t\}$ is covariance stationary.

Proof.

Since the expectation and variance of the process are constant and the autocovariance function does not depend on time, the process (2) is covariance stationary.

Theorem 2.2.

The PINMAPS(1) process is ergodic in the mean and autocovariance function.

Proof.

The proof is similar to that of Theorem 7 of Yu and Zou [13] and is omitted.

Using the pgf of the power series distribution and the independence of the counting series and the random variables $\varepsilon_{t-1}$ and $\varepsilon_t$, we obtain that the pgf of the random variable $Y_t$ is given by

$$\varphi_{Y_t}(s)=E\left(s^{\alpha \ast \varepsilon_{t-1}}\right)E\left(s^{\varepsilon_t}\right)=\varphi_{\varepsilon_t}\left(e^{\alpha(s-1)}\right)\varphi_{\varepsilon_t}(s)=\frac{C(\theta e^{\alpha(s-1)})}{C(\theta)}\times\frac{C(s\theta)}{C(\theta)}.$$

Using the series expansion of the function C, we obtain

$$\frac{C(\theta e^{\alpha(s-1)})}{C(\theta)}=\frac{a(0)}{C(\theta)}+\sum_{x=1}^{T}\frac{a(x)\theta^x}{C(\theta)}e^{\alpha x(s-1)},$$
which implies that $Y_t$ is distributed as $X+S_{\alpha W}$, where $X$ and $S_{\alpha W}$ are independent, $X$ and $W$ have the power series distribution given by (1), $S_{\alpha W}$ given $W\ge 1$ has the Poisson distribution with mean $\alpha W$, and $S_0=0$ a.s. From this result, we can easily obtain the probability mass function of $Y_t$ as
$$P(Y_t=y)=\sum_{x=0}^{\min(y,T)}\frac{a(x)\theta^x}{C(\theta)}\left[\frac{a(0)}{C(\theta)}I\{x=y\}+\sum_{w=1}^{T}\frac{a(w)(\alpha w)^{y-x}\theta^w e^{-\alpha w}}{C(\theta)(y-x)!}\right],\quad y\ge 0.$$

Unlike the power series innovations, which take values in the set $\{0,1,\ldots,T\}$, the random variable $Y_t$ always takes values on the whole set of nonnegative integers. This is a consequence of using the Poisson thinning operator.
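For a concrete check of this probability mass function, take Bernoulli innovations ($C(\theta)=1+\theta$, $a(x)=1$, $T=\{0,1\}$), for which both sums are finite. The sketch below assumes NumPy; `pmf_y` is our helper, and it compares the formula with empirical frequencies from simulation:

```python
import math
import numpy as np

alpha, theta = 0.6, 0.8
C = 1 + theta   # C(theta) for the Bernoulli case

def pmf_y(y):
    # P(Y_t = y) for Bernoulli innovations: the outer sum runs over the
    # value x of eps_t, the inner bracket over the thinned term alpha*eps_{t-1}.
    total = 0.0
    for x in range(min(y, 1) + 1):
        p_eps = theta ** x / C
        p_thin = (1 / C if x == y else 0.0) \
            + theta * math.exp(-alpha) * alpha ** (y - x) / (C * math.factorial(y - x))
        total += p_eps * p_thin
    return total

rng = np.random.default_rng(11)
n = 500_000
eps_prev = rng.binomial(1, theta / C, size=n)   # P(eps = 1) = theta / C(theta)
eps_now = rng.binomial(1, theta / C, size=n)
y = rng.poisson(alpha * eps_prev) + eps_now

for k in range(4):
    print(pmf_y(k), np.mean(y == k))            # formula vs. simulation
```

The formula values agree with the simulated frequencies, and the pmf sums to 1 over the nonnegative integers.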

Al-Osh and Alzaid [3] have shown that the Poisson INMA(1) process is the only INMA(1) process that has a linear regression. An example of an INMA(1) process with a nonlinear regression is the geometric INMA(1) process introduced by Alzaid and Al-Osh [16]. Now, we consider the regression of the proposed model. To this end, we first derive the joint pgf of the random variables $Y_{t-1}$ and $Y_t$. It is given by

$$\varphi_{Y_{t-1},Y_t}(s_1,s_2)=\frac{C(\theta s_2)}{C(\theta)}\times\frac{C(\theta e^{\alpha(s_1-1)})}{C(\theta)}\times\frac{C(\theta s_1 e^{\alpha(s_2-1)})}{C(\theta)}.$$

The joint pgf can be used to derive the conditional pgf of the random variable $Y_t$ given $Y_{t-1}$, whose expression is given in the following theorem.

Theorem 2.3.

The conditional pgf of the random variable $Y_t$ given $Y_{t-1}=x$, $x\in\{0,1,2,\ldots\}$, is given by

$$\varphi_{Y_t|Y_{t-1}=x}(s)=\frac{C(\theta s)\displaystyle\sum_{j=0}^{x}\sum_{i=1}^{j}\binom{x}{j}\alpha^j a_{ji}\theta^{i+x-j}e^{-i\alpha+\alpha(x-j)(s-1)}C^{(i)}(\theta e^{-\alpha})C^{(x-j)}(0)}{C(\theta)\displaystyle\sum_{j=0}^{x}\sum_{i=1}^{j}\binom{x}{j}\alpha^j a_{ji}\theta^{i+x-j}e^{-i\alpha}C^{(i)}(\theta e^{-\alpha})C^{(x-j)}(0)},$$
where the coefficients $a_{ji}$ are given recursively by $a_{ji}=i\,a_{j-1,i}+a_{j-1,i-1}$, $i\in\{2,3,\ldots,j-1\}$, and $a_{j1}=a_{jj}=1$.

Proof.

According to Theorem 1.3.1 of Kocherlakota and Kocherlakota [17], the conditional pgf of the random variable $Y_t$ given $Y_{t-1}=x$ can be derived from the joint pgf of $Y_{t-1}$ and $Y_t$ as

$$\varphi_{Y_t|Y_{t-1}=x}(s)=\frac{\left.\partial^x\varphi_{Y_{t-1},Y_t}(s_1,s_2)/\partial s_1^x\right|_{s_1=0,\,s_2=s}}{\left.\partial^x\varphi_{Y_{t-1},Y_t}(s_1,s_2)/\partial s_1^x\right|_{s_1=0,\,s_2=1}}.\qquad (3)$$

Let us first consider the partial derivative $\partial^x\varphi_{Y_{t-1},Y_t}(s_1,s_2)/\partial s_1^x$. By Leibniz's rule, we have

$$\frac{\partial^x\varphi_{Y_{t-1},Y_t}(s_1,s_2)}{\partial s_1^x}=\frac{C(\theta s_2)}{C^3(\theta)}\sum_{j=0}^{x}\binom{x}{j}\frac{\partial^j C(\theta e^{\alpha(s_1-1)})}{\partial s_1^j}\,\frac{\partial^{x-j} C(\theta s_1 e^{\alpha(s_2-1)})}{\partial s_1^{x-j}}.\qquad (4)$$

It is easy to derive the partial derivatives of the function $C(\theta s_1 e^{\alpha(s_2-1)})$ as

$$\frac{\partial^k C(\theta s_1 e^{\alpha(s_2-1)})}{\partial s_1^k}=\theta^k e^{k\alpha(s_2-1)}C^{(k)}(\theta s_1 e^{\alpha(s_2-1)}),\quad k=0,1,\ldots,x.\qquad (5)$$

On the other hand, after some calculations, we obtain that the partial derivatives of the function $C(\theta e^{\alpha(s_1-1)})$ are given by

$$\frac{\partial^k C(\theta e^{\alpha(s_1-1)})}{\partial s_1^k}=\alpha^k\sum_{i=1}^{k}a_{ki}\theta^i e^{i\alpha(s_1-1)}C^{(i)}(\theta e^{\alpha(s_1-1)}),\quad k=0,1,\ldots,x,\qquad (6)$$
where the coefficients $a_{ki}$ are given in the statement of the theorem. Finally, substituting (5) and (6) into (4) with $s_1=0$ and $s_2=s$, we obtain the numerator of (3). In a similar way, we obtain the denominator of (3), which proves the theorem.

Now we are able to derive the regression function of the introduced model. It is given as follows:

Corollary 2.1.

The regression of $Y_t$ on $Y_{t-1}=x$ is the nonlinear function

$$E(Y_t|Y_{t-1}=x)=\frac{\displaystyle\sum_{j=0}^{x}\sum_{i=1}^{j}\binom{x}{j}\alpha^j a_{ji}\theta^{i+x-j}e^{-i\alpha}C^{(i)}(\theta e^{-\alpha})C^{(x-j)}(0)\,\frac{\theta C'(\theta)+\alpha(x-j)C(\theta)}{C(\theta)}}{\displaystyle\sum_{j=0}^{x}\sum_{i=1}^{j}\binom{x}{j}\alpha^j a_{ji}\theta^{i+x-j}e^{-i\alpha}C^{(i)}(\theta e^{-\alpha})C^{(x-j)}(0)}.$$

Proof.

The proof follows from Corollary 1.3.1 (Kocherlakota and Kocherlakota [17]) and the previous theorem.

The conditional mean and conditional variance of $Y_t$ given $\mathcal{F}_{t-1}$ are obtained as follows:

$$E(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\mu_\varepsilon,\qquad (7)$$
$$Var(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\sigma_\varepsilon^2,\qquad (8)$$
where $\mathcal{F}_{t-1}$ is the information set at time $t-1$.

3. ESTIMATION OF THE UNKNOWN PARAMETERS

In this section, we derive the estimators of the unknown parameters $\alpha$ and $\theta$ using the YW, CLS and FGLS methods. Let $Y_1,\ldots,Y_T$, $T\in\mathbb{N}$, be a random sample of size $T$ from the PINMAPS(1) process.

3.1. The YW Method

The YW estimators of the parameters α and θ are obtained via solving the following equations:

$$\bar{Y}-(1+\alpha)\theta G'(\theta)=0,$$
$$S^2-\alpha\theta G'(\theta)-(1+\alpha^2)\left(\theta^2 G''(\theta)+\theta G'(\theta)\right)=0,$$
$$r_1 S^2-\alpha\left(\theta^2 G''(\theta)+\theta G'(\theta)\right)=0,$$
where $\bar{Y}$, $S^2$ and $r_1$ are the sample mean, the sample variance and the sample autocorrelation at lag one, respectively.

Theorem 3.1.

The YW estimators of the parameters α and θ are consistent.

Proof.

According to Theorem 2.2, we have $\bar{Y}\xrightarrow{p}\mu_Y$, $S^2\xrightarrow{p}\sigma_Y^2$ and $r_1\xrightarrow{p}\rho(1)$, where $\mu_Y:=E(Y_t)$ and $\sigma_Y^2:=Var(Y_t)$. Thus, using the properties of convergence in probability, the consistency of the YW estimators follows.

3.2. The CLS Method

The CLS estimators of the parameters $\alpha$ and $\theta$ are obtained by minimizing the following function with respect to the parameter $\mu_\varepsilon$:

$$S_{CLS}=\sum_{t=2}^{T}e_{1t}^{2},$$
where $\mu_\varepsilon:=E(\varepsilon_t)$ and $e_{1t}=Y_t-E(Y_t|\mathcal{F}_{t-1})$.

According to Section 3 of Brännäs and Quoreshi [7] and Equation (7), $e_{1t}$ is given by

$$e_{1t}=\varepsilon_t-\mu_\varepsilon,$$
so the CLS estimator of $\mu_\varepsilon$ is given by
$$\hat{\mu}_{\varepsilon(CLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t}{T-1}.$$

The CLS estimators of $\alpha$ and $\theta$ are then obtained by substituting $\hat{\mu}_{\varepsilon(CLS)}$ into the equations

$$\bar{Y}=(1+\hat{\alpha}_{CLS})\hat{\mu}_{\varepsilon(CLS)},\qquad \hat{\mu}_{\varepsilon(CLS)}=\hat{\theta}_{CLS}\,G'(\hat{\theta}_{CLS}).$$

3.3. The FGLS Method

This method includes three stages:

Stage 1: Obtaining the CLS estimator of the parameters.

Stage 2: Minimizing the function

$$S=\sum_{t=2}^{T}e_{2t}^{2},$$
with respect to the parameter $\sigma_\varepsilon^2$, where $\sigma_\varepsilon^2:=Var(\varepsilon_t)$ and $e_{2t}=\left(Y_t-E(Y_t|\mathcal{F}_{t-1})\right)^2-Var(Y_t|\mathcal{F}_{t-1})$.

Thus, we have

$$\hat{\sigma}_\varepsilon^{2}=\frac{\sum_{t=2}^{T}\left[\left(\varepsilon_t-\hat{\mu}_{\varepsilon(CLS)}\right)^{2}-\hat{\alpha}_{CLS}\,\varepsilon_{t-1}\right]}{T-2}.$$

Stage 3: Minimizing the function

$$S_{FGLS}=\sum_{t=2}^{T}\frac{e_{1t}^{2}}{\widehat{Var}(Y_t|\mathcal{F}_{t-1})},$$
with respect to the parameter $\mu_\varepsilon$, and obtaining the FGLS estimators of $\alpha$ and $\theta$. Minimizing $S_{FGLS}$ with respect to $\mu_\varepsilon$ gives
$$\hat{\mu}_{\varepsilon(FGLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}{\sum_{t=2}^{T}1/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}.$$

So, the FGLS estimators of $\alpha$ and $\theta$ are given by substituting $\hat{\mu}_{\varepsilon(FGLS)}$ into the equations

$$\bar{Y}=(1+\hat{\alpha}_{FGLS})\hat{\mu}_{\varepsilon(FGLS)},\qquad \hat{\mu}_{\varepsilon(FGLS)}=\hat{\theta}_{FGLS}\,G'(\hat{\theta}_{FGLS}).$$
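The CLS recipe above can be sketched in code for the Poisson-innovation special case, treating the innovations as observable, as the derivation does; this is a minimal sketch assuming NumPy, with variable names of our own:

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, theta, T = 0.7, 3.0, 100_000
eps = rng.poisson(theta, size=T + 1)
y = rng.poisson(alpha * eps[:-1]) + eps[1:]   # a PINMAPS(1) sample path

mu_hat = eps[1:].mean()                   # mu_eps_CLS = sum_t eps_t / (T - 1)
alpha_hat = (y.mean() - mu_hat) / mu_hat  # from Ybar = (1 + alpha) mu_eps
theta_hat = mu_hat                        # Poisson case: theta G'(theta) = theta
print(alpha_hat, theta_hat)
```

The FGLS step would reweight the same residuals by $\widehat{Var}(Y_t|\mathcal{F}_{t-1})=\hat{\alpha}\varepsilon_{t-1}+\hat{\sigma}_\varepsilon^2$; in practice the innovations are not observed, which is why the applications in Section 6 rely on the YW estimates.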

4. SPECIAL CASES OF THE PINMAPS(1) PROCESS

In this section, we consider four special cases of the PINMAPS(1) process.

4.1. First-Order Integer-Valued Moving Average Process with Geometric Innovations

If $C(\theta)=\frac{1}{1-\theta}$ $(\theta\in(0,1))$, $\varepsilon_t$ has a geometric distribution, and $Y_t$ is called the first-order integer-valued moving average process with geometric innovations (PINMAG(1)).

The mean, variance and pgf of this process are given by

$$E(Y_t)=(1+\alpha)\frac{\theta}{1-\theta},\qquad Var(Y_t)=\alpha\frac{\theta}{1-\theta}+(1+\alpha^2)\frac{\theta}{(1-\theta)^2},$$
$$\varphi_{Y_t}(s)=\frac{C(e^{\alpha(s-1)}\theta)}{C(\theta)}\,\frac{C(s\theta)}{C(\theta)}=\frac{(1-\theta)^2}{(1-s\theta)\left(1-e^{\alpha(s-1)}\theta\right)}.$$

The autocovariance function, conditional mean and variance of Yt are given, respectively, by

$$\gamma_{Y_t}(1)=\frac{\alpha\theta}{(1-\theta)^2},\qquad E(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{\theta}{1-\theta},\qquad Var(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{\theta}{(1-\theta)^2}.$$

The YW estimators of α and θ are obtained via solving the following equations:

$$\bar{Y}-(1+\alpha)\frac{\theta}{1-\theta}=0,\qquad (9)$$
$$S^2-\alpha\frac{\theta}{1-\theta}-(1+\alpha^2)\frac{\theta}{(1-\theta)^2}=0,\qquad (10)$$
$$r_1 S^2-\alpha\frac{\theta}{(1-\theta)^2}=0.\qquad (11)$$

To facilitate obtaining the estimator of $\theta$, the auxiliary parameter $\nu=\frac{\theta}{1-\theta}$ may be used. Therefore, Equations (9)-(11) can be written as follows:

$$\bar{Y}-(1+\alpha)\nu=0,\qquad S^2-\alpha\nu-(1+\alpha^2)(\nu+\nu^2)=0,\qquad r_1 S^2-\alpha(\nu+\nu^2)=0.$$

So, the YW estimators of $\nu$ and $\alpha$ are given by

$$\hat{\nu}_{YW}=\frac{(\bar{Y}-1)+\sqrt{(1-\bar{Y})^2-4(r_1S^2-\bar{Y})}}{2},\qquad \hat{\alpha}_{YW}=\frac{\bar{Y}-\hat{\nu}_{YW}}{\hat{\nu}_{YW}}.$$

The YW estimator of $\theta$ is

$$\hat{\theta}_{YW}=\frac{\hat{\nu}_{YW}}{1+\hat{\nu}_{YW}}.$$
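These closed forms can be checked by simulation; the sketch below assumes NumPy (variable names are ours). It generates a PINMAG(1) sample, forms $\bar{Y}$, $S^2$ and $r_1$, and recovers $\nu$, $\alpha$ and $\theta$ from the quadratic $\nu^2+(1-\bar{Y})\nu+(r_1S^2-\bar{Y})=0$:

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, theta, T = 0.5, 0.4, 200_000
# geometric innovations on {0,1,...} with pmf (1-theta) * theta^x
eps = rng.geometric(1 - theta, size=T + 1) - 1
y = rng.poisson(alpha * eps[:-1]) + eps[1:]

ybar, s2 = y.mean(), y.var()
r1 = np.corrcoef(y[:-1], y[1:])[0, 1]

# positive root of nu^2 + (1 - Ybar) nu + (r1 S^2 - Ybar) = 0
nu = ((ybar - 1) + np.sqrt((1 - ybar) ** 2 - 4 * (r1 * s2 - ybar))) / 2
alpha_hat = (ybar - nu) / nu
theta_hat = nu / (1 + nu)
print(alpha_hat, theta_hat)   # close to the true (0.5, 0.4)
```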

According to the previous section, the CLS and FGLS estimators of α and θ are obtained, respectively, by

$$\hat{\theta}_{CLS}=\frac{\hat{\mu}_{\varepsilon(CLS)}}{1+\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\alpha}_{CLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(CLS)}}{\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\mu}_{\varepsilon(CLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t}{T-1},$$
and
$$\hat{\alpha}_{FGLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(FGLS)}}{\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\theta}_{FGLS}=\frac{\hat{\mu}_{\varepsilon(FGLS)}}{1+\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\mu}_{\varepsilon(FGLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}{\sum_{t=2}^{T}1/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}.$$

4.2. First-Order Integer-Valued Moving Average Process with Poisson Innovations

If $C(\theta)=e^{\theta}$ $(\theta>0)$, $\varepsilon_t$ has a Poisson distribution, and $Y_t$ is called the first-order integer-valued moving average process with Poisson innovations (PINMAP(1)).

The mean, variance and pgf of this process are given by

$$E(Y_t)=(1+\alpha)\theta,\qquad Var(Y_t)=\alpha\theta+(1+\alpha^2)\theta,$$
$$\varphi_{Y_t}(s)=\frac{C(e^{\alpha(s-1)}\theta)}{C(\theta)}\,\frac{C(s\theta)}{C(\theta)}=\frac{e^{\theta e^{\alpha(s-1)}}e^{s\theta}}{e^{2\theta}}.$$

The autocovariance function, conditional mean and variance of Yt are given, respectively, by

$$\gamma_{Y_t}(1)=\alpha\theta,\qquad E(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\theta,\qquad Var(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\theta.$$

Since in this process $\alpha\ast\varepsilon_{t-1}$ given $\mathcal{F}_{t-1}$ is Poisson with mean $\alpha\varepsilon_{t-1}$ and is independent of $\varepsilon_t\sim\mathrm{Poisson}(\theta)$, it can be concluded that $Y_t$ given $\mathcal{F}_{t-1}$ has a Poisson distribution with mean $\alpha\varepsilon_{t-1}+\theta$; in particular, the conditional mean equals the conditional variance.

The YW estimators of α and θ are obtained via solving the following equations:

$$\bar{Y}-(1+\alpha)\theta=0,\qquad S^2-\alpha\theta-(1+\alpha^2)\theta=0,\qquad r_1S^2-\alpha\theta=0.$$

Thus, the YW estimators of $\theta$ and $\alpha$ are

$$\hat{\theta}_{YW}=\bar{Y}-r_1S^2,\qquad \hat{\alpha}_{YW}=\frac{r_1S^2}{\hat{\theta}_{YW}}.$$
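These estimators are simple enough to verify directly by simulation; this is a minimal sketch assuming NumPy (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, theta, T = 0.8, 4.0, 50_000
eps = rng.poisson(theta, size=T + 1)
y = rng.poisson(alpha * eps[:-1]) + eps[1:]   # a PINMAP(1) sample path

ybar, s2 = y.mean(), y.var()
r1 = np.corrcoef(y[:-1], y[1:])[0, 1]
theta_hat = ybar - r1 * s2                    # theta_YW
alpha_hat = r1 * s2 / theta_hat               # alpha_YW
print(theta_hat, alpha_hat)   # close to the true (4.0, 0.8)
```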

According to the previous section, the CLS and FGLS estimators of α and θ are obtained, respectively, by

$$\hat{\alpha}_{CLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(CLS)}}{\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\theta}_{CLS}=\hat{\mu}_{\varepsilon(CLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t}{T-1},$$
and
$$\hat{\alpha}_{FGLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(FGLS)}}{\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\theta}_{FGLS}=\hat{\mu}_{\varepsilon(FGLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}{\sum_{t=2}^{T}1/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}.$$

4.3. First-Order Integer-Valued Moving Average Process with Binomial Innovations

If $C(\theta)=(1+\theta)^n$ $(\theta>0)$, $\varepsilon_t$ has a binomial distribution, and $Y_t$ is called the first-order integer-valued moving average process with binomial innovations (PINMAB(1)).

The mean, variance and pgf of this process are given by

$$E(Y_t)=(1+\alpha)\frac{n\theta}{1+\theta},\qquad Var(Y_t)=\alpha\frac{n\theta}{1+\theta}+(1+\alpha^2)\frac{n\theta}{(1+\theta)^2},$$
$$\varphi_{Y_t}(s)=\frac{C(e^{\alpha(s-1)}\theta)}{C(\theta)}\,\frac{C(s\theta)}{C(\theta)}=\frac{\left(1+e^{\alpha(s-1)}\theta\right)^n(1+s\theta)^n}{(1+\theta)^{2n}}.$$

The autocovariance function, conditional mean and conditional variance of Yt are given, respectively, by

$$\gamma_{Y_t}(1)=\frac{\alpha n\theta}{(1+\theta)^2},\qquad E(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{n\theta}{1+\theta},\qquad Var(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{n\theta}{(1+\theta)^2}.$$

Assuming $n$ is known, the estimators of the parameters $\alpha$ and $\theta$ are obtained. The YW estimators of $\alpha$ and $\theta$ are obtained by solving the following equations:

$$\bar{Y}-(1+\alpha)\frac{n\theta}{1+\theta}=0,\qquad (12)$$
$$S^2-\alpha\frac{n\theta}{1+\theta}-(1+\alpha^2)\frac{n\theta}{(1+\theta)^2}=0,\qquad (13)$$
$$r_1S^2-\alpha\frac{n\theta}{(1+\theta)^2}=0.\qquad (14)$$

To facilitate obtaining the YW estimator of $\theta$, the auxiliary parameter $\tau=\frac{\theta}{1+\theta}$ may be used. Therefore, Equations (12)-(14) can be written as follows:

$$\bar{Y}-(1+\alpha)n\tau=0,\qquad S^2-\alpha n\tau-n(1+\alpha^2)(\tau-\tau^2)=0,\qquad r_1S^2-n\alpha(\tau-\tau^2)=0.$$

So, the YW estimators of $\tau$ and $\alpha$ are

$$\hat{\tau}_{YW}=\frac{(n+\bar{Y})-\sqrt{(n+\bar{Y})^2-4n(\bar{Y}-r_1S^2)}}{2n},\qquad \hat{\alpha}_{YW}=\frac{\bar{Y}-n\hat{\tau}_{YW}}{n\hat{\tau}_{YW}}.$$

The YW estimator of $\theta$ is given by

$$\hat{\theta}_{YW}=\frac{\hat{\tau}_{YW}}{1-\hat{\tau}_{YW}}.$$

According to the previous section, the CLS and FGLS estimators of α and θ are obtained, respectively, by

$$\hat{\alpha}_{CLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(CLS)}}{\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\theta}_{CLS}=\frac{\hat{\mu}_{\varepsilon(CLS)}}{n-\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\mu}_{\varepsilon(CLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t}{T-1},$$
and
$$\hat{\alpha}_{FGLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(FGLS)}}{\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\theta}_{FGLS}=\frac{\hat{\mu}_{\varepsilon(FGLS)}}{n-\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\mu}_{\varepsilon(FGLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}{\sum_{t=2}^{T}1/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}.$$

4.4. First-Order Integer-Valued Moving Average Process with Negative Binomial Innovations

For $C(\theta)=(1-\theta)^{-r}$ $(\theta\in(0,1))$, $\varepsilon_t$ has a negative binomial distribution, and $Y_t$ is called the first-order integer-valued moving average process with negative binomial innovations (PINMANB(1)).

The mean, variance and pgf of this process are given by

$$E(Y_t)=(1+\alpha)\frac{r\theta}{1-\theta},\qquad Var(Y_t)=\alpha\frac{r\theta}{1-\theta}+(1+\alpha^2)\frac{r\theta}{(1-\theta)^2},$$
$$\varphi_{Y_t}(s)=\frac{C(e^{\alpha(s-1)}\theta)}{C(\theta)}\,\frac{C(s\theta)}{C(\theta)}=\frac{(1-\theta)^{2r}}{(1-s\theta)^r\left(1-e^{\alpha(s-1)}\theta\right)^r}.$$

Also, the autocovariance function, conditional mean and variance of Yt are given, respectively, by

$$\gamma_{Y_t}(1)=\frac{\alpha r\theta}{(1-\theta)^2},\qquad E(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{r\theta}{1-\theta},\qquad Var(Y_t|\mathcal{F}_{t-1})=\alpha\varepsilon_{t-1}+\frac{r\theta}{(1-\theta)^2}.$$

Assuming r is known, the estimators of the parameters α and θ are obtained. The YW estimators of α and θ are obtained via solving the following equations:

$$\bar{Y}-(1+\alpha)\frac{r\theta}{1-\theta}=0,\qquad (15)$$
$$S^2-\alpha\frac{r\theta}{1-\theta}-(1+\alpha^2)\frac{r\theta}{(1-\theta)^2}=0,\qquad (16)$$
$$r_1S^2-\alpha\frac{r\theta}{(1-\theta)^2}=0.\qquad (17)$$

To facilitate obtaining the YW estimator of $\theta$, the auxiliary parameter $\delta=\frac{\theta}{1-\theta}$ may be used. Therefore, Equations (15)-(17) can be written as follows:

$$\bar{Y}-(1+\alpha)r\delta=0,\qquad S^2-\alpha r\delta-r(1+\alpha^2)(\delta+\delta^2)=0,\qquad r_1S^2-r\alpha(\delta+\delta^2)=0.$$

So, the YW estimators of $\delta$ and $\alpha$ are

$$\hat{\delta}_{YW}=\frac{(\bar{Y}-r)+\sqrt{(r-\bar{Y})^2-4r(r_1S^2-\bar{Y})}}{2r},\qquad \hat{\alpha}_{YW}=\frac{\bar{Y}-r\hat{\delta}_{YW}}{r\hat{\delta}_{YW}}.$$

The YW estimator of $\theta$ is given by

$$\hat{\theta}_{YW}=\frac{\hat{\delta}_{YW}}{1+\hat{\delta}_{YW}}.$$

According to the previous section, the CLS and FGLS estimators of α and θ are obtained, respectively, by

$$\hat{\alpha}_{CLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(CLS)}}{\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\theta}_{CLS}=\frac{\hat{\mu}_{\varepsilon(CLS)}}{r+\hat{\mu}_{\varepsilon(CLS)}},\qquad \hat{\mu}_{\varepsilon(CLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t}{T-1},$$
and
$$\hat{\alpha}_{FGLS}=\frac{\bar{Y}-\hat{\mu}_{\varepsilon(FGLS)}}{\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\theta}_{FGLS}=\frac{\hat{\mu}_{\varepsilon(FGLS)}}{r+\hat{\mu}_{\varepsilon(FGLS)}},\qquad \hat{\mu}_{\varepsilon(FGLS)}=\frac{\sum_{t=2}^{T}\varepsilon_t/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}{\sum_{t=2}^{T}1/\widehat{Var}(Y_t|\mathcal{F}_{t-1})}.$$

5. SIMULATION STUDY

In this section, the performance of the YW, CLS and FGLS estimators is evaluated. For this purpose, we simulate 1000 samples of size $T=100, 200, 300$ from the PINMAP(1), PINMAB(1) and PINMANB(1) models for different values of $\alpha$ and $\theta$ and estimate the parameters using the three methods. Tables 2-4 present the estimates, biases and root mean square (RMS) errors of the estimators. As Tables 2-4 show, in each of the three models the estimates approach the true values as $T$ grows, and the bias and RMS of the CLS estimators are smaller than those of the other two methods; thus the CLS estimators perform best.
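A scaled-down version of this experiment can be reproduced in a few lines; the sketch below assumes NumPy and computes only the YW estimator of $\theta$ for PINMAP(1) with $\alpha=0.7$, $\theta=3$, $T=300$, over 200 replications rather than 1000:

```python
import numpy as np

rng = np.random.default_rng(2020)
alpha, theta, T, reps = 0.7, 3.0, 300, 200
est = []
for _ in range(reps):
    eps = rng.poisson(theta, size=T + 1)
    y = rng.poisson(alpha * eps[:-1]) + eps[1:]
    r1 = np.corrcoef(y[:-1], y[1:])[0, 1]
    est.append(y.mean() - r1 * y.var())      # theta_YW for this replication
est = np.array(est)
bias = est.mean() - theta
rms = np.sqrt(np.mean((est - theta) ** 2))
print(bias, rms)
```

The resulting bias and RMS are of the same order of magnitude as the YW entries reported for this configuration in Table 2.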

T $\hat{\alpha}_{YW}$ $\hat{\theta}_{YW}$ $\hat{\alpha}_{CLS}$ $\hat{\theta}_{CLS}$ $\hat{\alpha}_{FGLS}$ $\hat{\theta}_{FGLS}$
α=0.4,θ=1
100 0.2058(0.2655) 1.3060(0.3669) 0.5478(0.1681) 1.0005(0.0960) 0.5491(0.1926) 1.0026(0.1087)
Bias 0.1942 0.3060 0.1478 0.0005 0.1491 0.0026
200 0.2004(0.2350) 1.3007(0.3332) 0.5455(0.1568) 1.0020(0.0697) 0.5460(0.1671) 1.0030(0.0792)
Bias 0.1996 0.3007 0.1455 0.0020 0.1460 0.0030
300 0.1989(0.2232) 1.3003(0.3219) 0.5474(0.1547) 1.0023(0.0590) 0.5475(0.1621) 1.0032(0.0664)
Bias 0.2011 0.3003 0.1474 0.0023 0.1475 0.0032

α=0.7,θ=3
100 0.7173(0.5780) 3.2053(0.7794) 0.7061(0.0497) 3.0069(0.1721) 0.6932(0.3595) 2.9924(0.4516)
Bias 0.0173 0.2053 0.0061 0.0069 0.0068 0.0076
200 0.6639(0.3392) 3.1806(0.5663) 0.7096(0.0355) 2.9997(0.1270) 0.7101(0.0563) 3.0009(0.1469)
Bias 0.0361 0.1806 0.0096 0.0003 0.0101 0.0009
300 0.6734(0.2483) 3.1216(0.4439) 0.7104(0.0294) 2.9963(0.1027) 0.7124(0.0451) 2.9939(0.1181)
Bias 0.0265 0.1216 0.0104 0.0037 0.0124 0.0061

α=0.8,θ=4
100 0.8993(1.2971) 4.2544(1.1438) 0.7987(0.0449) 3.9985(0.1972) 0.7934(0.1754) 3.9904(0.8252)
Bias 0.0993 0.2544 0.0012 0.0015 0.0065 0.0095
200 0.8141(0.4376) 4.1525(0.8417) 0.8002(0.0315) 3.9978(0.1434) 0.7741(0.5837) 3.9803(0.8145)
Bias 0.0141 0.1525 0.0002 0.0022 0.0259 0.0197
300 0.8147(0.3147) 4.0835(0.6595) 0.8006(0.0249) 4.0064(0.1200) 0.8007(0.0599) 4.0122(0.2408)
Bias 0.0147 0.0835 0.0006 0.0064 0.0007 0.0122
Table 2

Estimated parameters, Bias and RMS (in parentheses) for the PINMAP(1) model.

T $\hat{\alpha}_{YW}$ $\hat{\theta}_{YW}$ $\hat{\alpha}_{CLS}$ $\hat{\theta}_{CLS}$ $\hat{\alpha}_{FGLS}$ $\hat{\theta}_{FGLS}$
α=0.3,θ=0.5,n=5
100 0.2466(0.1740) 0.5716(0.1371) 0.3222(0.0506) 0.5015(0.0495) 0.3224(0.0601) 0.5020(0.0523)
Bias 0.0534 0.0716 0.0222 0.0015 0.0224 0.0020
200 0.2269(0.1451) 0.5762(0.1247) 0.3251(0.0410) 0.5001(0.0341) 0.3251(0.0464) 0.5004(0.0364)
Bias 0.0730 0.0762 0.0251 0.0001 0.0251 0.0004
300 0.2314(0.1226) 0.5683(0.1036) 0.3231(0.0350) 0.5016(0.0286) 0.3232(0.0392) 0.5017(0.0303)
Bias 0.0686 0.0684 0.0231 0.0016 0.0232 0.0017

α=0.4,θ=0.8,n=5
100 0.3604(0.2118) 0.9178(0.2833) 0.4063(0.0422) 0.8005(0.0737) 0.3933(0.5406) 0.8035(0.1715)
Bias 0.0396 0.1178 0.0063 0.0005 0.0067 0.0035
200 0.3559(0.1488) 0.8952(0.2148) 0.4099(0.0328) 0.8028(0.0518) 0.4102(0.0611) 0.8052(0.0694)
Bias 0.0441 0.0953 0.0099 0.0028 0.0102 0.0052
300 0.3563(0.1271) 0.8787(0.1682) 0.4077(0.0265) 0.8006(0.0424) 0.4082(0.0378) 0.8009(0.0495)
Bias 0.0437 0.0787 0.0077 0.0006 0.0082 0.0009

α=0.7,θ=0.9,n=5
100 0.6523(0.3188) 1.1090(0.5563) 0.7081(0.0569) 0.9034(0.0816) 0.6860(1.2838) 0.8839(0.5572)
Bias 0.0477 0.2089 0.0082 0.0034 0.0140 0.0161
200 0.6485(0.2197) 1.0316(0.3248) 0.7092(0.0390) 0.9033(0.0557) 0.7184(0.3054) 0.9151(0.2037)
Bias 0.0514 0.1315 0.0092 0.0033 0.0184 0.0151
300 0.6426(0.1837) 1.0162(0.2531) 0.7097(0.0332) 0.9009(0.0475) 0.7149(0.0777) 0.9009(0.0827)
Bias 0.0574 0.1162 0.0097 0.0009 0.0149 0.0009

Note: RMS, root mean square; PINMAPS(1), first-order nonnegative integer-valued moving average process with power series innovations based on a Poisson thinning operator; YW, Yule-Walker; CLS, conditional least squares; FGLS, feasible generalized least squares.

Table 3

Estimated parameters, Bias and RMS (in parentheses) for the PINMAB(1) model.

T $\hat{\alpha}_{YW}$ $\hat{\theta}_{YW}$ $\hat{\alpha}_{CLS}$ $\hat{\theta}_{CLS}$ $\hat{\alpha}_{FGLS}$ $\hat{\theta}_{FGLS}$
α=0.2,θ=0.3,r=10
100 0.2106(0.1986) 0.3017(0.0328) 0.1992(0.0229) 0.3002(0.0123) 0.1986(0.0234) 0.3003(0.0124)
Bias 0.0106 0.0017 0.0008 0.0002 0.0013 0.00034
200 0.1993(0.1259) 0.3016(0.0225) 0.1999(0.01601) 0.2999(0.0084) 0.1996(0.0165) 0.30004(0.0084)
Bias 0.0007 0.0016 0.0001 0.00001 0.0003 0.00004
300 0.1917(0.0949) 0.3023(0.0174) 0.1997(0.0129) 0.3000(0.0071) 0.1996(0.0132) 0.30003(0.0071)
Bias 0.0083 0.0023 0.0002 0.00001 0.0004 0.00003

α=0.3,θ=0.3,r=10
100 0.2911(0.4048) 0.3046(0.0387) 0.3003(0.0277) 0.3004(0.0122) 0.2990(0.0288) 0.3006(0.0123)
Bias 0.0089 0.0046 0.0003 0.0004 0.0009 0.0006
200 0.3024(0.1770) 0.3019(0.0253) 0.2999(0.01967) 0.3002(0.0085) 0.2997(0.0207) 0.3002(0.0085)
Bias 0.0024 0.0019 0.0001 0.0002 0.0003 0.0002
300 0.3031(0.1367) 0.3006(0.0217) 0.3010(0.0153) 0.2994(0.0068) 0.3010(0.0161) 0.2995(0.0069)
Bias 0.0031 0.0006 0.0010 0.0005 0.0010 0.0005

α=0.5,θ=0.4,r=10
100 0.5859(0.6263) 0.3997(0.0611) 0.4954(0.0296) 0.4001(0.0122) 0.4936(0.0313) 0.4004(0.0123)
Bias 0.0859 0.0003 0.0045 0.0001 0.0063 0.0003
200 0.5589(0.4651) 0.3972(0.0441) 0.4966(0.0197) 0.4003(0.0086) 0.4959(0.0209) 0.4004(0.0087)
Bias 0.0589 0.0028 0.0033 0.0003 0.0040 0.0004
300 0.5257(0.2669) 0.3996(0.0351) 0.4995(0.0157) 0.4000(0.0070) 0.4992(0.0167) 0.4001(0.0071)
Bias 0.0257 0.0004 0.0004 0.00003 0.0008 0.0001

Note: RMS, root mean square; PINMAPS(1), first-order nonnegative integer-valued moving average process with power series innovations based on a Poisson thinning operator; YW, Yule-Walker; CLS, conditional least squares; FGLS, feasible generalized least squares.

Table 4

Estimated parameters, Bias and RMS (in parentheses) for the PINMANB(1) model.

6. APPLICATION

In this section, we fit the PINMAPS(1) model to three real data sets and determine which model gives the best fit to the count data. For this purpose, we compare the PINMAP(1), PINMAG(1), PINMABE(1) (INMA(1) with Bernoulli innovations based on the Poisson thinning operator), PINMA(1) (INMA(1) with Poisson innovations based on the binomial thinning operator, proposed by Al-Osh and Alzaid [3]), GINMA(1) (INMA(1) with geometric innovations based on the binomial thinning operator, proposed by Alzaid and Al-Osh [16]), NBINMAP(1) (INMA(1) with Poisson innovations based on the negative binomial thinning operator), NBINMAG(1) (INMA(1) with geometric innovations based on the negative binomial thinning operator) and NBINMABE(1) (INMA(1) with Bernoulli innovations based on the negative binomial thinning operator) models.

6.1. Number of Polio Cases

The first example considers the monthly number of polio cases from Jan 1980 to Dec 1983 in the United States (https://books.google.com/books/about/Multivariate_Statistical_Modelling_Based.html). The sample path, autocorrelation and partial autocorrelation functions are shown in Figure 1. According to Figure 1, an INMA(1) process can be suitable for modeling the polio series, since the sample autocorrelation cuts off after lag 1. The sample mean, variance and empirical index of dispersion are 0.7708, 1.2868 and 1.6693, respectively. Since the index of dispersion exceeds 1, the polio series is overdispersed, so an overdispersed model must be assumed. We fit the PINMAP(1), PINMAG(1), PINMA(1), GINMA(1), NBINMAP(1) and NBINMAG(1) models to this data set. For the INMA(1) models mentioned, we obtain the YW estimates of the unknown parameters (since the CLS and FGLS estimates depend on the innovations, whose values we do not observe) and the root mean square (RMS) of the differences between observations and predicted values. The results are presented in Table 5. According to this table, the PINMAG(1) model gives the lowest RMS value compared to the other models; thus it can be concluded that the PINMAG(1) model provides the best forecasts for the polio series. Figure 2 shows the plots of the polio data series and their predicted values based on the PINMAG(1) model.

Figure 1

The sample path, autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the monthly number of polio cases from Jan 1980 to Dec 1983 in the United States.

Model $\hat{\alpha}$ $\hat{\lambda}$ / $\hat{p}$ RMS
PINMAP(1) 0.221 1.014 1.156
PINMAG(1) 0.201 0.609 1.085
PINMA(1) 0.210 1.063 1.175
GINMA(1) 0.196 0.608 1.103
NBINMAP(1) 0.466 0.479 1.112
NBINMAG(1) 0.207 0.610 1.183

RMS, root mean square; PINMAPS(1), first-order nonnegative integer-valued moving average process with power series innovations based on a Poisson thinning operator; YW, Yule-Walker.

Table 5

YW estimates of the parameters and RMS for the polio series.

Figure 2

Polio data and their predicted values.

6.2. Number of Rubella Cases

The second example considers the monthly number of rubella cases from Jan 2013 to Jul 2017 in Spain (https://www.ecdc.europa.eu/sites/portal/files/media/en/publications/Publications/measles-rubella-monitoring-jan-(2013-2018).pdf). The sample path, autocorrelation and partial autocorrelation functions are shown in Figure 3. According to Figure 3, an INMA(1) process can be appropriate for modeling the rubella series, since the sample autocorrelation cuts off after lag 1. The sample mean, variance and empirical index of dispersion are 0.291, 0.321 and 1.104, respectively. Since the index of dispersion exceeds 1, the rubella series is overdispersed and must be modeled by an overdispersed model. We fit the PINMAP(1), PINMAG(1), PINMA(1), GINMA(1), NBINMAP(1) and NBINMAG(1) models to this data set. For each model, we obtain the YW estimates of the unknown parameters and the RMS. The results are reported in Table 6. As we can see from Table 6, the PINMAP(1) model has the lowest RMS compared to the other models, which indicates that the PINMAP(1) model provides the best forecasts for the rubella series. Figure 4 shows the plots of the rubella data series and their predicted values based on the PINMAP(1) model.

Figure 3

The sample path, ACF and PACF plots of the number of rubella cases, monthly from Jan 2013 to Jul 2017 in Spain.

Model         YW Estimates        RMS
              α̂        λ̂ / p̂
PINMAP(1)     0.178     0.265     0.5586
PINMAG(1)     0.175     0.802     0.5588
PINMA(1)      0.172     0.274     0.5595
GINMA(1)      0.170     0.801     0.5594
NBINMAP(1)    0.368     0.128     0.5663
NBINMAG(1)    0.182     0.802     0.6111


Table 6

YW estimates of the parameters and RMS for the rubella series.

Figure 4

Rubella data and their predicted values.

6.3. Number of Earthquakes Magnitude 8.0 to 9.9

The third example considers the annual number of earthquakes of magnitude 8.0 to 9.9 worldwide from 1977 to 2006 (http://www.johnstonsarchive.net/other/quake1.html). The sample path, autocorrelation and partial autocorrelation functions are shown in Figure 5. From Figure 5, an INMA(1) process appears suitable for modeling this data set, since the sample autocorrelation cuts off after lag 1. The sample mean, variance and empirical index of dispersion are, respectively, 0.67, 0.57 and 0.86. Since the index of dispersion is lower than 1, the earthquakes series is underdispersed, so an underdispersed model must be used. We therefore fit the PINMABE(1) and NBINMABE(1) models to this data set. For both models, we obtain the YW estimates and the RMS. The results are shown in Table 7. According to Table 7, the PINMABE(1) model gives better forecasts than the NBINMABE(1) model for the earthquakes series because its RMS is smaller. Figure 6 shows the earthquakes data series and the predicted values based on the PINMABE(1) model.
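All three comparisons rank the fitted models by the RMS of the one-step prediction errors. A minimal sketch of this criterion (the observed and predicted values below are illustrative, not the earthquake data or the paper's fitted predictions):

```python
# Sketch: root mean square (RMS) of prediction errors,
# the model-comparison criterion used in Tables 5-7.
import math

def rms(obs, pred):
    """Root mean square of the differences obs[t] - pred[t]."""
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)

obs = [1, 0, 2, 1, 0, 1]            # illustrative observed counts
pred = [0.7, 0.5, 1.2, 0.9, 0.6, 0.8]  # illustrative fitted predictions
print(f"RMS = {rms(obs, pred):.4f}")
```

The model with the smallest RMS over the observed series is preferred, which is how the PINMABE(1) model is selected here.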

Figure 5

The sample path, ACF and PACF plots of the number of earthquakes magnitude 8.0 to 9.9, annually from 1977 to 2006 in the world.

Model          YW Estimates       RMS
               α̂        p̂
PINMABE(1)     0.234     0.54     0.704
NBINMABE(1)    0.271     0.52     0.719


Table 7

YW estimates of the parameters and RMS for the earthquakes series.

Figure 6

Earthquakes data and their predicted values.

7. CONCLUSION

In this paper, we have introduced the PINMAPS(1) process and obtained some of its statistical properties. The stationarity and ergodicity of the process were investigated. The parameters of the model were estimated using three methods, namely YW, CLS and FGLS, and their performance was evaluated via simulation. Some sub-models were studied in detail. Finally, the model was applied to three real data sets and was shown to perform better than competing models in predicting future values of overdispersed and underdispersed count data.

CONFLICTS OF INTEREST

The authors declare that there is no conflict of interest regarding this article.

AUTHORS' CONTRIBUTIONS

Ms. Rostami is a Ph.D. student under the supervision of Prof. Eisa Mahmoudi, and this work forms part of her thesis.

ACKNOWLEDGMENTS

The authors thank the anonymous referees for their valuable comments and careful reading, which led to an improvement of the presentation and results of the article. The authors are grateful to the Editor-in-Chief and Editor for their helpful remarks on improving this manuscript. The authors are also indebted to Yazd University for supporting this research.
