Journal of Statistical Theory and Applications

Volume 20, Issue 1, March 2021, Pages 33 - 45

Statistical Properties and Different Estimation Procedures of Poisson–Lindley Distribution

Authors
Mohammed Amine Meraou1, Mohammad Z. Raqab*
1Department of Mathematics, The University of Jordan, Amman, 11942, Jordan
*Corresponding author. Email: mraqab@ju.edu.jo
Received 21 October 2020, Accepted 28 October 2020, Available Online 11 January 2021.
DOI
10.2991/jsta.d.210105.001
Keywords
Anderson–Darling method; Cramer–von Mises; Least square estimators; Maximum likelihood estimators; Poisson–Lindley distribution
Abstract

In this paper, we propose a new class of distributions by compounding Lindley distributed random variates, where the number of variates follows a zero-truncated Poisson distribution. This model is called a compound zero-truncated Poisson–Lindley distribution with two parameters. Different statistical properties of the proposed model are discussed. We describe different methods of estimation for the unknown parameters involved in the model. These methods include maximum likelihood, least squares, weighted least squares, Cramer–von Mises, maximum product of spacings, Anderson–Darling and right-tail Anderson–Darling methods. Numerical simulation experiments are conducted to assess the performance of the estimators obtained from these methods. Finally, the potential of the model is studied using a real data set representing the monthly highest snowfall during February 2018 for a subset of stations in the Global Historical Climatological Network of USA.

Copyright
© 2021 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

In recent years, many researchers have been interested in obtaining new continuous distributions by compounding an absolutely continuous distribution with a discrete distribution. This method is used widely in engineering applications including risk measurement, flood reliability and survival analysis. For example, Adamidis and Loukas [1] proposed a two-parameter lifetime distribution by compounding exponential and geometric distributions. The exponential Poisson (EP) and exponential logarithmic distributions were introduced by Kus [2] and Tahmasbi and Rezaei [3], respectively. Marshall and Olkin [4] developed some new extensions based on random minimum and maximum. Barreto-Souza and Cribari-Neto [5] introduced the exponentiated exponential Poisson (EEP).

Cancho et al. [6] introduced the Poisson-exponential (PE) distribution by compounding the exponential and zero-truncated Poisson distributions. Chahkandi and Ganjali [7] introduced a class of distributions, namely exponential power series (EPS) distributions, by compounding exponential and power series distributions. In the same way, Mahmoudi and Jafari [8] introduced the generalized exponential power series (GEPS) distribution by compounding the generalized exponential (GE) distribution with the power series distribution. The comparison of estimators through intensive simulation experiments has received considerable attention in the literature. See, among others, Gupta and Kundu [9], Kundu and Raqab [10], Alkasabeh and Raqab [11], Asgharzadeh et al. [12], Dey et al. [13] and Rodrigues et al. [14].

The main aim of the present study is two-fold. The first main aim is to introduce a new model which is flexible in fitting a wide range of data sets by compounding Lindley and zero-truncated Poisson distributions. The basic idea can be described as follows. Consider a random variable X having the Lindley distribution with probability density function (PDF)

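$$f_X(x;\lambda)=\frac{\lambda^{2}}{\lambda+1}(1+x)e^{-\lambda x},\quad x>0,\ \lambda>0, \tag{1}$$
and cumulative distribution function (CDF)
$$F_X(x;\lambda)=1-\left(1+\frac{\lambda}{\lambda+1}x\right)e^{-\lambda x},\quad x>0,\ \lambda>0. \tag{2}$$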

Given $M=m$, let $X_1,X_2,\ldots,X_m$ be independent and identically distributed (iid) random variables from the Lindley distribution. The random variable $M$ follows a zero-truncated Poisson distribution with probability mass function

$$P(M=m)=\frac{\theta^{m}}{m!\,(e^{\theta}-1)},\quad m=1,2,3,\ldots,\ \theta>0. \tag{3}$$

Here, we introduce a new class of distributions based on the maximal random variate $Y=\max\{X_1,X_2,\ldots,X_M\}$. Now we have

$$\begin{aligned}
P(Y\le y)&=P(\max\{X_1,X_2,\ldots,X_M\}\le y)\\
&=\sum_{m=1}^{\infty}P(\max\{X_1,X_2,\ldots,X_M\}\le y\mid M=m)\,P(M=m)\\
&=\sum_{m=1}^{\infty}[F_X(y;\lambda)]^{m}\,P(M=m)\\
&=\sum_{m=1}^{\infty}\frac{[\theta F_X(y;\lambda)]^{m}}{m!}\,\frac{1}{e^{\theta}-1}
=\frac{e^{\theta F_X(y;\lambda)}-1}{e^{\theta}-1},
\end{aligned} \tag{4}$$
where $F_X(x;\lambda)$ is the CDF of the Lindley distribution defined in (2). The model so obtained is called the compound zero-truncated Poisson Lindley (ZTPL) distribution with two parameters. It has an absolutely continuous distribution function. Moreover, the Lindley distribution can be obtained as a special case of the compound ZTPL.
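The compounding construction in (3) and (4) can be checked by simulation. The following is a minimal sketch, not the authors' code: it draws M from the zero-truncated Poisson law, takes the maximum of M Lindley variates, and compares the empirical CDF with the closed form in (4). The function names, the seed and the parameter values are illustrative assumptions, and the Lindley sampler relies on the standard representation of the Lindley law as a mixture of an exponential and a gamma distribution.

```python
import math
import random


def lindley_cdf(x, lam):
    """Lindley CDF from equation (2)."""
    return 1.0 - (1.0 + lam * x / (lam + 1.0)) * math.exp(-lam * x)


def sample_lindley(lam, rng):
    """One Lindley draw via its mixture form: Exp(lam) with probability
    lam/(1+lam), otherwise Gamma(shape 2, rate lam) as a sum of two exponentials."""
    if rng.random() < lam / (lam + 1.0):
        return rng.expovariate(lam)
    return rng.expovariate(lam) + rng.expovariate(lam)


def sample_zero_truncated_poisson(theta, rng):
    """Inverse-transform draw from the probability mass function (3)."""
    u = rng.random()
    m = 1
    pmf = theta / math.expm1(theta)  # P(M = 1)
    cumulative = pmf
    while u > cumulative and pmf > 0.0:
        m += 1
        pmf *= theta / m  # P(M = m) = P(M = m - 1) * theta / m
        cumulative += pmf
    return m


def sample_ztpl_by_compounding(lam, theta, rng):
    """Y = max(X_1, ..., X_M) with M zero-truncated Poisson."""
    m = sample_zero_truncated_poisson(theta, rng)
    return max(sample_lindley(lam, rng) for _ in range(m))


def ztpl_cdf(y, lam, theta):
    """Closed form on the right-hand side of equation (4)."""
    return math.expm1(theta * lindley_cdf(y, lam)) / math.expm1(theta)


rng = random.Random(2021)          # fixed seed, illustrative only
lam, theta = 1.0, 2.0              # illustrative parameter values
draws = [sample_ztpl_by_compounding(lam, theta, rng) for _ in range(20000)]

max_gap = 0.0
for y in (0.5, 1.0, 2.0, 4.0):
    empirical = sum(d <= y for d in draws) / len(draws)
    exact = ztpl_cdf(y, lam, theta)
    max_gap = max(max_gap, abs(empirical - exact))
    print(f"y={y:.1f}  empirical={empirical:.4f}  closed form={exact:.4f}")
```

With 20000 draws the empirical and closed-form values should agree to about two decimal places; larger gaps would point to a mistake in the sampler rather than in (4).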

The second aim is to present various methods for estimating the two parameters of the compound ZTPL model. The estimators considered are the maximum likelihood estimators (MLEs), least square estimators (LSEs), weighted least square estimators (WLSEs), Cramer–von Mises type minimum distance estimators (CMDEs), maximum product of spacings estimators (MPSEs), Anderson–Darling estimators (ADEs) and right-tail Anderson–Darling estimators (RTADEs). An intensive simulation study is performed to compare the effectiveness of these estimators.

This paper is organized as follows: In Section 2, the ZTPL model is described and its distributional properties are discussed. Also, different methods for estimating the parameters of the ZTPL model are developed in Section 3. Numerical simulation results are presented in Section 4. The analysis of monthly highest snowfall data during the month of February 2018, for a subset of stations in the Global Historical Climatological Network of USA is performed for validation purposes in Section 5. Some concluding remarks are presented in Section 6.

2. ZTPL DISTRIBUTION

A random variable Y is said to have a compound ZTPL distribution if its CDF is given by

$$F_Y(y;\lambda,\theta)=\frac{\exp\left\{\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\}-1}{e^{\theta}-1}. \tag{5}$$

The corresponding PDF of Y is

$$f_Y(y;\lambda,\theta)=\frac{\theta\lambda^{2}(1+y)e^{-\lambda y}}{(e^{\theta}-1)(\lambda+1)}\exp\left\{\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\},\quad y>0;\ \theta,\lambda>0. \tag{6}$$

From (6), it is easily seen that the Lindley distribution is a special case of the compound ZTPL when θ → 0. Hence the parameter θ can be interpreted as a concentration parameter. Figure 1 provides plots of the compound ZTPL density for some selected choices of λ and θ. It is observed that the compound ZTPL density can be decreasing or unimodal.
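As a numerical sanity check on (5) and (6), the sketch below codes both functions, verifies that the PDF integrates to one, and confirms that a very small θ reproduces the Lindley density. It is only an illustration: the helper names, the trapezoidal rule and the parameter values are assumptions, not part of the original study.

```python
import math


def lindley_pdf(x, lam):
    """Lindley PDF, equation (1)."""
    return lam ** 2 / (lam + 1.0) * (1.0 + x) * math.exp(-lam * x)


def lindley_cdf(x, lam):
    """Lindley CDF, equation (2)."""
    return 1.0 - (1.0 + lam * x / (lam + 1.0)) * math.exp(-lam * x)


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    return math.expm1(theta * lindley_cdf(y, lam)) / math.expm1(theta)


def ztpl_pdf(y, lam, theta):
    """Compound ZTPL PDF, equation (6)."""
    return (theta * lindley_pdf(y, lam)
            * math.exp(theta * lindley_cdf(y, lam)) / math.expm1(theta))


def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h


lam, theta = 2.0, 3.0  # illustrative values
# The upper limit 40 stands in for infinity; the neglected tail is negligible here.
area = trapezoid(lambda y: ztpl_pdf(y, lam, theta), 0.0, 40.0, 40000)

# Limit theta -> 0: the ZTPL density should collapse to the Lindley density.
lindley_gap = max(abs(ztpl_pdf(y, lam, 1e-8) - lindley_pdf(y, lam))
                  for y in (0.1, 0.5, 1.0, 3.0))

print(f"integral of the PDF: {area:.8f}")
print(f"largest gap to the Lindley PDF at theta=1e-8: {lindley_gap:.2e}")
```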

Figure 1

Probability density function (PDF) plots of zero-truncated Poisson–Lindley (ZTPL) distribution for different parameter values: (a) (0.25, 1) (b) (2, 3) (c) (1, 4) (d) (3, 0.75) (e) (2, 0.5) (f) (0.2, 0.0001).

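The joint PDF of Y and M is given as

$$f_{Y,M}(y,m)=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(m-1)!\,(\lambda+1)}(1+y)e^{-\lambda y}\left\{\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\}^{m-1}. \tag{7}$$

Further, from (6) and (7), it can be shown that, given $Y=y$, the random variable $M-1$ follows a Poisson distribution with mean $\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]$. That is,

$$P(M=m\mid Y=y)=\frac{\left\{\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\}^{m-1}}{(m-1)!}\exp\left\{-\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\},\quad m=1,2,\ldots.$$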

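The survival function and hazard rate (HR) of the ZTPL(λ,θ) distribution are given respectively by

$$S(y)=\frac{e^{\theta}}{e^{\theta}-1}\left[1-\exp\left\{-\theta\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right\}\right],$$
and
$$h(y)=\frac{\theta\lambda^{2}(1+y)e^{-\lambda y}\exp\left\{\theta\left[1-\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right]\right\}}{(\lambda+1)\,e^{\theta}\left[1-\exp\left\{-\theta\left(1+\frac{\lambda y}{1+\lambda}\right)e^{-\lambda y}\right\}\right]}.$$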

Figure 2 presents different shapes of HR for the compound ZTPL(λ,θ) distribution considering different values of λ and θ. It is observed from Figure 2 that the HR function is increasing for all λ>0 and θ>0.
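The monotonicity of the HR reported for Figure 2 can be checked numerically on a grid. The sketch below is an assumption-laden illustration rather than the code behind the figure: it evaluates h(y) in the algebraically equivalent form θ f_X(y) e^{-θ t} / (1 - e^{-θ t}), where t = 1 - F_X(y), for the λ values of Figure 2 with θ = 1. The grid and function name are hypothetical.

```python
import math


def ztpl_hazard(y, lam, theta):
    """Hazard rate of ZTPL(lam, theta), written with t = 1 - F_X(y)
    so that the ratio stays numerically stable for large y."""
    tail = (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)          # 1 - F_X(y)
    density = lam ** 2 / (lam + 1.0) * (1.0 + y) * math.exp(-lam * y)  # f_X(y)
    return theta * density * math.exp(-theta * tail) / (-math.expm1(-theta * tail))


grid = [0.1 * k for k in range(0, 101)]  # y from 0 to 10, illustrative
increasing = {}
for lam in (0.5, 1.0, 1.5, 2.0, 2.5):    # lambda values listed for Figure 2
    values = [ztpl_hazard(y, lam, 1.0) for y in grid]
    increasing[lam] = all(a < b for a, b in zip(values, values[1:]))
    print(f"lambda={lam}: h(0)={values[0]:.4f}, h(10)={values[-1]:.4f}, "
          f"increasing on grid: {increasing[lam]}")
```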

Figure 2

HR plots for λ = 0.5, 1, 1.5, 2, 2.5 and fixed θ = 1.

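The r-th moment of Y can be obtained as

$$E(Y^{r})=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(r+j+1)}{[\lambda(i+1)]^{r+j+1}}+\frac{\Gamma(r+j+2)}{[\lambda(i+1)]^{r+j+2}}\right],$$
where $C_{k,i,j}$ denotes the triple summation applied to the bracketed term,
$$C_{k,i,j}=\sum_{k=0}^{\infty}\sum_{i=0}^{k}\sum_{j=0}^{i}\frac{\theta^{k}}{k!}\binom{k}{i}(-1)^{i}\binom{i}{j}\left(\frac{\lambda}{\lambda+1}\right)^{j}.$$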
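The series for the r-th moment can be compared with direct numerical integration of y^r f_Y(y). The sketch below does this for the mean (r = 1). It is a minimal illustration under assumptions: the series is truncated at a hypothetical k_max, the integral is cut off at a finite upper limit, and the parameter values are arbitrary. It needs Python 3.8 or later for `math.comb`.

```python
import math


def ztpl_moment_series(r, lam, theta, k_max=40):
    """r-th moment from the triple series, truncated at k = k_max (assumption)."""
    prefactor = theta * lam ** 2 / (math.expm1(theta) * (lam + 1.0))
    total = 0.0
    for k in range(k_max + 1):
        weight_k = theta ** k / math.factorial(k)
        for i in range(k + 1):
            rate = lam * (i + 1)
            for j in range(i + 1):
                coeff = (weight_k * math.comb(k, i) * (-1) ** i
                         * math.comb(i, j) * (lam / (lam + 1.0)) ** j)
                total += coeff * (math.gamma(r + j + 1) / rate ** (r + j + 1)
                                  + math.gamma(r + j + 2) / rate ** (r + j + 2))
    return prefactor * total


def ztpl_pdf(y, lam, theta):
    """Compound ZTPL PDF, equation (6)."""
    lindley_cdf = 1.0 - (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)
    lindley_pdf = lam ** 2 / (lam + 1.0) * (1.0 + y) * math.exp(-lam * y)
    return theta * lindley_pdf * math.exp(theta * lindley_cdf) / math.expm1(theta)


def ztpl_moment_numeric(r, lam, theta, upper=60.0, steps=60000):
    """r-th moment by the trapezoidal rule on [0, upper]."""
    h = upper / steps
    total = 0.5 * (upper ** r) * ztpl_pdf(upper, lam, theta)  # integrand is 0 at y = 0 for r >= 1
    for k in range(1, steps):
        y = k * h
        total += y ** r * ztpl_pdf(y, lam, theta)
    return total * h


lam, theta = 1.0, 1.0  # illustrative values
mean_series = ztpl_moment_series(1, lam, theta)
mean_numeric = ztpl_moment_numeric(1, lam, theta)
print(f"mean from the series:      {mean_series:.8f}")
print(f"mean from the integration: {mean_numeric:.8f}")
```

The alternating signs in the series cause cancellation, so for large θ a higher truncation point or extended precision may be needed; the values used here are small enough for double precision.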

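Therefore, the mean and variance of Y are given by

$$E(Y)=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(j+2)}{[\lambda(i+1)]^{j+2}}+\frac{\Gamma(j+3)}{[\lambda(i+1)]^{j+3}}\right],$$
and
$$\mathrm{Var}(Y)=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(j+3)}{[\lambda(i+1)]^{j+3}}+\frac{\Gamma(j+4)}{[\lambda(i+1)]^{j+4}}\right]-\left\{\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(j+2)}{[\lambda(i+1)]^{j+2}}+\frac{\Gamma(j+3)}{[\lambda(i+1)]^{j+3}}\right]\right\}^{2}.$$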

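The skewness measure of Y is given by

$$\gamma_{3}=\frac{E(Y^{3})}{[\mathrm{Var}(Y)]^{3/2}},\qquad E(Y^{3})=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(j+4)}{[\lambda(i+1)]^{j+4}}+\frac{\Gamma(j+5)}{[\lambda(i+1)]^{j+5}}\right],$$
with Var(Y) as given above.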

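The moment generating function (MGF) of Y, $M_Y(t)$, is

$$M_Y(t)=\frac{\theta\lambda^{2}}{(e^{\theta}-1)(\lambda+1)}\,C_{k,i,j}\left[\frac{\Gamma(j+1)}{[\lambda(i+1)-t]^{j+1}}+\frac{\Gamma(j+2)}{[\lambda(i+1)-t]^{j+2}}\right],\quad t<\lambda.$$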

3. METHODS OF ESTIMATION

In this section, we present different estimation methods for obtaining the estimators of the parameters λ and θ of the compound ZTPL distribution. These methods are quite useful in obtaining the estimators of λ and θ and other related inferences.

3.1. Maximum Likelihood Estimation and its Asymptotics

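Let $\{y_1,\ldots,y_n\}$ be a random sample of size n from ZTPL(λ,θ). Then the log-likelihood function is given as

$$l(\lambda,\theta)=n\log(\theta)+2n\log(\lambda)+\sum_{i=1}^{n}\log(1+y_i)-\lambda\sum_{i=1}^{n}y_i+\theta\sum_{i=1}^{n}\left[1-\left(1+\frac{\lambda y_i}{1+\lambda}\right)e^{-\lambda y_i}\right]-n\log(e^{\theta}-1)-n\log(1+\lambda). \tag{8}$$

The MLEs $\hat{\lambda}_{MLE}$ and $\hat{\theta}_{MLE}$ of λ and θ are obtained by solving the two nonlinear equations

$$\frac{\partial l(\lambda,\theta)}{\partial\lambda}=\frac{2n}{\lambda}-\frac{n}{1+\lambda}-\sum_{i=1}^{n}y_i+\frac{\theta}{(\lambda+1)^{2}}\sum_{i=1}^{n}\lambda y_i e^{-\lambda y_i}(\lambda y_i+y_i+\lambda+2)=0,$$
and
$$\frac{\partial l(\lambda,\theta)}{\partial\theta}=\frac{n}{\theta}-\frac{n e^{\theta}}{e^{\theta}-1}+\sum_{i=1}^{n}\left[1-\left(1+\frac{\lambda y_i}{1+\lambda}\right)e^{-\lambda y_i}\right]=0.$$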
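The log-likelihood (8) and its two partial derivatives translate directly into code, and a finite-difference comparison is a cheap way to confirm that the analytic score matches (8). The sketch below is illustrative only: the data vector is hypothetical (it is not the snowfall data), and the evaluation point is arbitrary rather than an MLE.

```python
import math


def lindley_cdf(x, lam):
    """Lindley CDF, equation (2)."""
    return 1.0 - (1.0 + lam * x / (lam + 1.0)) * math.exp(-lam * x)


def log_likelihood(lam, theta, ys):
    """Log-likelihood of ZTPL(lam, theta), equation (8)."""
    n = len(ys)
    data_part = sum(math.log1p(y) - lam * y + theta * lindley_cdf(y, lam) for y in ys)
    return (n * math.log(theta) + 2 * n * math.log(lam) + data_part
            - n * math.log(math.expm1(theta)) - n * math.log1p(lam))


def score(lam, theta, ys):
    """Partial derivatives of (8) with respect to lam and theta."""
    n = len(ys)
    d_lam = (2 * n / lam - n / (1.0 + lam) - sum(ys)
             + theta / (lam + 1.0) ** 2
             * sum(lam * y * math.exp(-lam * y) * (lam * y + y + lam + 2.0) for y in ys))
    d_theta = (n / theta - n * math.exp(theta) / math.expm1(theta)
               + sum(lindley_cdf(y, lam) for y in ys))
    return d_lam, d_theta


# Hypothetical observations, used only to exercise the functions.
ys = [2.1, 5.3, 7.9, 8.4, 10.2, 12.6]
lam, theta = 0.4, 5.0   # arbitrary evaluation point, not an estimate
step = 1e-6

numeric_d_lam = (log_likelihood(lam + step, theta, ys)
                 - log_likelihood(lam - step, theta, ys)) / (2 * step)
numeric_d_theta = (log_likelihood(lam, theta + step, ys)
                   - log_likelihood(lam, theta - step, ys)) / (2 * step)
analytic_d_lam, analytic_d_theta = score(lam, theta, ys)

print(f"d/d lambda: analytic {analytic_d_lam:.6f}, numeric {numeric_d_lam:.6f}")
print(f"d/d theta:  analytic {analytic_d_theta:.6f}, numeric {numeric_d_theta:.6f}")
```

A root finder or Newton–Raphson iteration on `score` would then give the MLEs; that step is omitted here.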

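Another aspect of estimation is the construction of confidence intervals (CIs) for the parameters using the asymptotic distribution theory of the MLE. Denoting the parameter vector by $\vartheta=(\lambda,\theta)$, the asymptotic distribution of the MLE $\hat{\vartheta}$ is $(\hat{\vartheta}-\vartheta)\xrightarrow{D}N_2(0,I^{-1}(\vartheta))$, where $I^{-1}(\vartheta)$ is the inverse of the observed information matrix of $\vartheta=(\lambda,\theta)$, which can be approximated by $I^{-1}(\hat{\vartheta})$, where

$$I(\hat{\vartheta})=\left[-\frac{\partial^{2}l}{\partial\vartheta_i\,\partial\vartheta_j}\right]_{\vartheta=\hat{\vartheta}},\quad i,j=1,2,$$
with $\vartheta_1=\lambda$ and $\vartheta_2=\theta$.

The elements of the matrix $I(\vartheta)$ are derived from (8) as follows:

$$I_{11}=\frac{2n}{\lambda^{2}}-\frac{n}{(1+\lambda)^{2}}+\frac{\theta}{(\lambda+1)^{3}}\sum_{i=1}^{n}y_i e^{-\lambda y_i}\left(\lambda^{3}y_i^{2}+\lambda^{3}y_i+2\lambda^{2}y_i^{2}+3\lambda^{2}y_i+\lambda y_i^{2}+\lambda y_i-y_i-2\right),$$
$$I_{12}=I_{21}=-\frac{1}{(\lambda+1)^{2}}\sum_{i=1}^{n}\lambda y_i e^{-\lambda y_i}(\lambda y_i+y_i+\lambda+2),$$
and
$$I_{22}=\frac{n}{\theta^{2}}-\frac{n e^{\theta}}{(e^{\theta}-1)^{2}}.$$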

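Therefore, the lower confidence bound (LCB) and upper confidence bound (UCB) of the 100(1−α)% CI of $\vartheta_j$ are

$$\mathrm{LCB}=\hat{\vartheta}_j-z_{\alpha/2}\sqrt{[I^{-1}(\hat{\vartheta})]_{jj}},\quad j=1,2,$$
and
$$\mathrm{UCB}=\hat{\vartheta}_j+z_{\alpha/2}\sqrt{[I^{-1}(\hat{\vartheta})]_{jj}},\quad j=1,2,$$
where $z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution, N(0,1). However, the drawback of this method is that the lower limit of the CI may be negative, which is inadmissible because both parameters are positive. To avoid this problem, one may use the delta method with a logarithmic transformation. The asymptotic distribution of $\ln\hat{\vartheta}_j$ is
$$\ln\hat{\vartheta}_j-\ln\vartheta_j\xrightarrow{D}N\left(0,\mathrm{var}(\ln\hat{\vartheta}_j)\right),$$
where
$$\mathrm{var}(\ln\hat{\vartheta}_j)=\frac{\mathrm{var}(\hat{\vartheta}_j)}{\hat{\vartheta}_j^{2}}=\frac{[I^{-1}(\hat{\vartheta})]_{jj}}{\hat{\vartheta}_j^{2}}.$$

Then the 100(1−α)% CI of $\vartheta_j$ can be written as

$$\left(\hat{\vartheta}_j\exp\left\{-z_{\alpha/2}\sqrt{\mathrm{var}(\ln\hat{\vartheta}_j)}\right\},\ \hat{\vartheta}_j\exp\left\{z_{\alpha/2}\sqrt{\mathrm{var}(\ln\hat{\vartheta}_j)}\right\}\right).$$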
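The difference between the two interval constructions is easiest to see with numbers. The sketch below computes both intervals for a single parameter from an estimate and its asymptotic variance. The estimate 0.50 and variance 0.09 are hypothetical values chosen so that the plain normal interval dips below zero; they do not come from the paper. `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def wald_interval(estimate, variance, alpha=0.05):
    """Normal-approximation interval: estimate -/+ z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def log_transformed_interval(estimate, variance, alpha=0.05):
    """Interval built on ln(estimate); the delta method gives
    var(ln estimate) = variance / estimate**2."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    sd_log = math.sqrt(variance) / estimate
    return estimate * math.exp(-z * sd_log), estimate * math.exp(z * sd_log)


estimate, variance = 0.50, 0.09   # hypothetical inputs, for illustration only
wald = wald_interval(estimate, variance)
log_ci = log_transformed_interval(estimate, variance)
print(f"normal interval:          ({wald[0]:.4f}, {wald[1]:.4f})")
print(f"log-transformed interval: ({log_ci[0]:.4f}, {log_ci[1]:.4f})")
```

The log-transformed interval is always positive and is symmetric around the estimate on the log scale, so the geometric mean of its two limits equals the estimate.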

3.2. Least Square and Weighted LSEs

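Swain et al. [15] proposed an alternative method for estimating unknown parameters, which leads to the LSEs and WLSEs. The basic idea is as follows. Let $y_1,\ldots,y_n$ be a random sample of size n from the ZTPL distribution and let $y_{(1)}<\cdots<y_{(n)}$ denote the order statistics of the random sample. The LSEs of λ and θ (say, $\hat{\lambda}_{LSE}$ and $\hat{\theta}_{LSE}$) can be obtained by minimizing

$$\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]^{2},$$
with respect to λ and θ, where $F(y|\lambda,\theta)$ is given by (5). Equivalently, they can be obtained by solving the following nonlinear equations:
$$\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]\eta_1(y_{(i)}|\lambda,\theta)=0,$$
$$\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]\eta_2(y_{(i)}|\lambda,\theta)=0,$$
where $\eta_1$ and $\eta_2$ are the partial derivatives of $F$ with respect to λ and θ, respectively:
$$\eta_1(y_{(i)}|\lambda,\theta)=\frac{\theta\lambda y_{(i)}e^{-\lambda y_{(i)}}}{(1+\lambda)^{2}(e^{\theta}-1)}\exp\left\{\theta\left[1-\left(1+\frac{\lambda y_{(i)}}{1+\lambda}\right)e^{-\lambda y_{(i)}}\right]\right\}\times\left[\lambda(1+y_{(i)})+y_{(i)}+2\right], \tag{9}$$
$$\eta_2(y_{(i)}|\lambda,\theta)=\frac{1}{(e^{\theta}-1)^{2}}\left\{(e^{\theta}-1)\left[1-\left(1+\frac{\lambda y_{(i)}}{1+\lambda}\right)e^{-\lambda y_{(i)}}\right]\exp\left\{\theta\left[1-\left(1+\frac{\lambda y_{(i)}}{1+\lambda}\right)e^{-\lambda y_{(i)}}\right]\right\}-\exp\left\{\theta\left[2-\left(1+\frac{\lambda y_{(i)}}{1+\lambda}\right)e^{-\lambda y_{(i)}}\right]\right\}+e^{\theta}\right\}. \tag{10}$$

The WLSEs of λ and θ, say $\hat{\lambda}_{WLSE}$ and $\hat{\theta}_{WLSE}$, respectively, can be found by minimizing

$$\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]^{2}.$$

Hence, the estimates $\hat{\lambda}_{WLSE}$ and $\hat{\theta}_{WLSE}$ can be obtained by solving the following nonlinear equations:

$$\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]\eta_1(y_{(i)}|\lambda,\theta)=0,$$
$$\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\left[F(y_{(i)}|\lambda,\theta)-\frac{i}{n+1}\right]\eta_2(y_{(i)}|\lambda,\theta)=0.$$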
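Both least squares criteria share one objective function that differs only in the weights. The sketch below codes that objective and, as a stand-in for a proper numerical solver, minimizes it by a coarse grid search on an idealized sample whose order statistics sit exactly at the quantiles i/(n+1). The grid, the idealized sample and the helper names are assumptions made for illustration; they are not the procedure used in the paper.

```python
import math


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    lindley_cdf = 1.0 - (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)
    return math.expm1(theta * lindley_cdf) / math.expm1(theta)


def ztpl_quantile(u, lam, theta):
    """Quantile by bisection on the CDF (no closed form is assumed)."""
    lo, hi = 0.0, 1.0
    while ztpl_cdf(hi, lam, theta) < u:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ztpl_cdf(mid, lam, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ls_objective(lam, theta, ys, weighted=False):
    """Least squares (or weighted least squares) distance between the
    fitted CDF at the order statistics and the plotting positions i/(n+1)."""
    ordered = sorted(ys)
    n = len(ordered)
    total = 0.0
    for i, y in enumerate(ordered, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) if weighted else 1.0
        total += weight * (ztpl_cdf(y, lam, theta) - i / (n + 1)) ** 2
    return total


true_lam, true_theta, n = 1.0, 2.0, 50  # illustrative values
ideal = [ztpl_quantile(i / (n + 1), true_lam, true_theta) for i in range(1, n + 1)]

# Coarse grid search, used here only in place of a numerical optimizer.
grid = [(lam, theta)
        for lam in (0.5, 0.75, 1.0, 1.25, 1.5)
        for theta in (1.0, 1.5, 2.0, 2.5, 3.0)]
best_ls = min(grid, key=lambda p: ls_objective(p[0], p[1], ideal))
best_wls = min(grid, key=lambda p: ls_objective(p[0], p[1], ideal, weighted=True))
print("LSE on the grid: ", best_ls)
print("WLSE on the grid:", best_wls)
```

On real data the objective would be handed to a general-purpose optimizer, or the two estimating equations would be solved with (9) and (10).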

3.3. Maximum Product of Spacings

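Cheng and Amin [16,17] introduced the maximum product of spacings (MPS) method for estimating the unknown parameters of continuous univariate distributions. It was developed independently by Ranneby [18] as an approximation to the Kullback–Leibler measure of information. The idea can be described as follows. Let

$$D_i(\lambda,\theta)=F(y_{(i)}|\lambda,\theta)-F(y_{(i-1)}|\lambda,\theta),\quad i=1,\ldots,n+1,$$
where
$$F(y_{(0)}|\lambda,\theta)=0\quad\text{and}\quad F(y_{(n+1)}|\lambda,\theta)=1.$$

Clearly $\sum_{i=1}^{n+1}D_i(\lambda,\theta)=1$.

The MPS estimators (MPSEs) of λ and θ, say $\hat{\lambda}_{MPS}$ and $\hat{\theta}_{MPS}$, respectively, can be obtained by maximizing the geometric mean of the spacings

$$G(\lambda,\theta)=\left[\prod_{i=1}^{n+1}D_i(\lambda,\theta)\right]^{\frac{1}{n+1}}. \tag{11}$$

Equivalently, they can be obtained by maximizing the logarithm of the geometric mean of sample spacings:

$$H(\lambda,\theta)=\frac{1}{n+1}\sum_{i=1}^{n+1}\log D_i(\lambda,\theta). \tag{12}$$

The estimates $\hat{\lambda}_{MPS}$ and $\hat{\theta}_{MPS}$ can be obtained by solving the two nonlinear equations:

$$\frac{\partial H(\lambda,\theta)}{\partial\lambda}=\frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{D_i(\lambda,\theta)}\left[\eta_1(y_{(i)}|\lambda,\theta)-\eta_1(y_{(i-1)}|\lambda,\theta)\right]=0,$$
and
$$\frac{\partial H(\lambda,\theta)}{\partial\theta}=\frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{D_i(\lambda,\theta)}\left[\eta_2(y_{(i)}|\lambda,\theta)-\eta_2(y_{(i-1)}|\lambda,\theta)\right]=0,$$
where $\eta_1(\cdot|\lambda,\theta)$ and $\eta_2(\cdot|\lambda,\theta)$ are given in (9) and (10), respectively.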
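The criterion H in (12) is short to code. Because the spacings sum to one, H can never exceed -log(n+1), and it reaches that bound only when all spacings are equal. The sketch below uses this fact as a check on an idealized sample placed at the quantiles i/(n+1); the sample, parameter values and function names are illustrative assumptions.

```python
import math


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    lindley_cdf = 1.0 - (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)
    return math.expm1(theta * lindley_cdf) / math.expm1(theta)


def ztpl_quantile(u, lam, theta):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while ztpl_cdf(hi, lam, theta) < u:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ztpl_cdf(mid, lam, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_log_spacing(lam, theta, ys):
    """H(lam, theta) from equation (12). Tied observations give a zero
    spacing and would need special handling, which is omitted here."""
    ordered = sorted(ys)
    cdf_values = [0.0] + [ztpl_cdf(y, lam, theta) for y in ordered] + [1.0]
    spacings = [b - a for a, b in zip(cdf_values, cdf_values[1:])]
    return sum(math.log(d) for d in spacings) / len(spacings)


true_lam, true_theta, n = 1.0, 2.0, 50  # illustrative values
ideal = [ztpl_quantile(i / (n + 1), true_lam, true_theta) for i in range(1, n + 1)]

h_true = mean_log_spacing(true_lam, true_theta, ideal)
h_other = mean_log_spacing(1.5, 2.0, ideal)
print(f"H at the generating parameters: {h_true:.6f}")
print(f"upper bound -log(n+1):          {-math.log(n + 1):.6f}")
print(f"H at (1.5, 2.0):                {h_other:.6f}")
```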

3.4. Cramer–von Mises Minimum Distance Estimators

The CMDEs are obtained by minimizing a distance between the fitted CDF and the empirical distribution function. MacDonald [19] provided empirical evidence that the bias of this estimator is smaller than that of other minimum distance estimators. The Cramer–von Mises estimates $\hat{\lambda}_{CME}$ and $\hat{\theta}_{CME}$ are obtained by minimizing

$$C(\lambda,\theta)=\frac{1}{12n}+\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{2i-1}{2n}\right]^{2}. \tag{13}$$

These estimates can also be obtained by solving the nonlinear equations:

$$\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{2i-1}{2n}\right]\eta_1(y_{(i)}|\lambda,\theta)=0,$$
$$\sum_{i=1}^{n}\left[F(y_{(i)}|\lambda,\theta)-\frac{2i-1}{2n}\right]\eta_2(y_{(i)}|\lambda,\theta)=0,$$
where $\eta_1(\cdot|\lambda,\theta)$ and $\eta_2(\cdot|\lambda,\theta)$ are given in (9) and (10), respectively.
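The criterion (13) has the lower bound 1/(12n), attained only when the fitted CDF at each order statistic equals (2i-1)/(2n). The sketch below codes C(λ,θ) and checks this bound on an idealized sample placed at those quantiles. As before, the sample, parameter values and names are assumptions for illustration.

```python
import math


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    lindley_cdf = 1.0 - (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)
    return math.expm1(theta * lindley_cdf) / math.expm1(theta)


def ztpl_quantile(u, lam, theta):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while ztpl_cdf(hi, lam, theta) < u:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ztpl_cdf(mid, lam, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cvm_objective(lam, theta, ys):
    """Cramer-von Mises criterion C(lam, theta) from equation (13)."""
    ordered = sorted(ys)
    n = len(ordered)
    total = 1.0 / (12.0 * n)
    for i, y in enumerate(ordered, start=1):
        total += (ztpl_cdf(y, lam, theta) - (2 * i - 1) / (2.0 * n)) ** 2
    return total


true_lam, true_theta, n = 1.0, 2.0, 50  # illustrative values
ideal = [ztpl_quantile((2 * i - 1) / (2.0 * n), true_lam, true_theta)
         for i in range(1, n + 1)]

c_true = cvm_objective(true_lam, true_theta, ideal)
c_other = cvm_objective(0.75, 2.0, ideal)
print(f"C at the generating parameters: {c_true:.8f} (bound {1 / (12 * n):.8f})")
print(f"C at (0.75, 2.0):               {c_other:.8f}")
```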

3.5. Anderson–Darling and Right-Tail Anderson–Darling

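The ADE is another type of minimum distance estimator, based on the statistic introduced by Anderson and Darling [20]. The ADEs $\hat{\lambda}_{ADE}$ and $\hat{\theta}_{ADE}$ of λ and θ are, respectively, obtained by minimizing

$$A(\lambda,\theta)=-n-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\left\{\log F(y_{(i)}|\lambda,\theta)+\log S(y_{(n+1-i)}|\lambda,\theta)\right\}.$$

These estimates can also be obtained by solving the following equations:

$$\sum_{i=1}^{n}(2i-1)\left[\frac{\eta_1(y_{(i)}|\lambda,\theta)}{F(y_{(i)}|\lambda,\theta)}-\frac{\eta_1(y_{(n+1-i)}|\lambda,\theta)}{S(y_{(n+1-i)}|\lambda,\theta)}\right]=0,$$
$$\sum_{i=1}^{n}(2i-1)\left[\frac{\eta_2(y_{(i)}|\lambda,\theta)}{F(y_{(i)}|\lambda,\theta)}-\frac{\eta_2(y_{(n+1-i)}|\lambda,\theta)}{S(y_{(n+1-i)}|\lambda,\theta)}\right]=0,$$
where $\eta_1(\cdot|\lambda,\theta)$ and $\eta_2(\cdot|\lambda,\theta)$ are given in (9) and (10), respectively.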
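A direct implementation of the Anderson–Darling criterion pairs the i-th smallest observation with the i-th largest one. The sketch below codes A(λ,θ) with the index n+1-i in the survival term and evaluates it on an idealized sample at the quantiles i/(n+1). The survival function is computed in the equivalent form (1 - e^{-θt})/(1 - e^{-θ}), with t = 1 - F_X(y), to avoid cancellation in the upper tail. All names and values are illustrative assumptions.

```python
import math


def lindley_tail(y, lam):
    """1 - F_X(y) for the Lindley distribution."""
    return (1.0 + lam * y / (lam + 1.0)) * math.exp(-lam * y)


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    return math.expm1(theta * (1.0 - lindley_tail(y, lam))) / math.expm1(theta)


def ztpl_survival(y, lam, theta):
    """Survival function S(y) = 1 - F(y), in a cancellation-free form."""
    return math.expm1(-theta * lindley_tail(y, lam)) / math.expm1(-theta)


def ztpl_quantile(u, lam, theta):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while ztpl_cdf(hi, lam, theta) < u:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ztpl_cdf(mid, lam, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ad_objective(lam, theta, ys):
    """Anderson-Darling criterion A(lam, theta)."""
    ordered = sorted(ys)
    n = len(ordered)
    total = 0.0
    for i in range(1, n + 1):
        log_f = math.log(ztpl_cdf(ordered[i - 1], lam, theta))       # y_(i)
        log_s = math.log(ztpl_survival(ordered[n - i], lam, theta))  # y_(n+1-i)
        total += (2 * i - 1) * (log_f + log_s)
    return -n - total / n


true_lam, true_theta, n = 1.0, 2.0, 40  # illustrative values
ideal = [ztpl_quantile(i / (n + 1), true_lam, true_theta) for i in range(1, n + 1)]

a_true = ad_objective(true_lam, true_theta, ideal)
a_other = ad_objective(1.5, 2.0, ideal)
print(f"A at the generating parameters: {a_true:.6f}")
print(f"A at (1.5, 2.0):                {a_other:.6f}")
```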

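Similarly, the RTADEs $\hat{\lambda}_{RTADE}$ and $\hat{\theta}_{RTADE}$ are obtained by minimizing

$$R(\lambda,\theta)=\frac{n}{2}-2\sum_{i=1}^{n}\log F(y_{(i)}|\lambda,\theta)-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\log S(y_{(n+1-i)}|\lambda,\theta).$$

These estimates can be obtained by solving the following equations:

$$-2\sum_{i=1}^{n}\frac{\eta_1(y_{(i)}|\lambda,\theta)}{F(y_{(i)}|\lambda,\theta)}+\frac{1}{n}\sum_{i=1}^{n}(2i-1)\frac{\eta_1(y_{(n+1-i)}|\lambda,\theta)}{S(y_{(n+1-i)}|\lambda,\theta)}=0,$$
$$-2\sum_{i=1}^{n}\frac{\eta_2(y_{(i)}|\lambda,\theta)}{F(y_{(i)}|\lambda,\theta)}+\frac{1}{n}\sum_{i=1}^{n}(2i-1)\frac{\eta_2(y_{(n+1-i)}|\lambda,\theta)}{S(y_{(n+1-i)}|\lambda,\theta)}=0,$$
where $\eta_1(\cdot|\lambda,\theta)$ and $\eta_2(\cdot|\lambda,\theta)$ are given in (9) and (10), respectively.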

4. NUMERICAL EXPERIMENTS AND DISCUSSIONS

In this section, we present the results of a Monte Carlo simulation study comparing the efficiency of the estimation procedures proposed in the previous section. For a given set of parameter values for λ and θ and for a given sample size n, we first generate a random sample of size n from the compound ZTPL model. Secondly, we compute the average estimates (AEs) and the associated mean squared errors (MSEs) over 1000 replications. The results are recorded in Tables 1 and 2. The simulation study was carried out using the software R. The performance of the different methods of estimation is evaluated in terms of MSEs.
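The two ingredients of such a study, sample generation and the AE/MSE summary, can be sketched as follows. The text does not say how the ZTPL samples were generated in R, so the inverse-transform sampler below is an assumption: it inverts (5) analytically down to the Lindley CDF and finishes with bisection. The list of estimates passed to the summary helper is hypothetical and only shows the arithmetic.

```python
import math
import random


def lindley_cdf(x, lam):
    """Lindley CDF, equation (2)."""
    return 1.0 - (1.0 + lam * x / (lam + 1.0)) * math.exp(-lam * x)


def ztpl_cdf(y, lam, theta):
    """Compound ZTPL CDF, equation (5)."""
    return math.expm1(theta * lindley_cdf(y, lam)) / math.expm1(theta)


def lindley_quantile(p, lam):
    """Lindley quantile by bisection."""
    lo, hi = 0.0, 1.0
    while lindley_cdf(hi, lam) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lindley_cdf(mid, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ztpl_quantile(u, lam, theta):
    """Solve F_Y(y) = u: first F_X(y) = log(1 + u(e^theta - 1)) / theta,
    then invert the Lindley CDF."""
    p = math.log1p(u * math.expm1(theta)) / theta
    return lindley_quantile(p, lam)


def sample_ztpl(n, lam, theta, rng):
    """Random sample of size n by the inverse-transform method."""
    return [ztpl_quantile(rng.random(), lam, theta) for _ in range(n)]


def average_and_mse(estimates, true_value):
    """Average estimate (AE) and mean squared error (MSE) over replications."""
    ae = sum(estimates) / len(estimates)
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return ae, mse


rng = random.Random(1000)                 # illustrative seed
sample = sample_ztpl(25, 0.5, 0.5, rng)   # one replication with n = 25
print("first three draws:", [round(y, 4) for y in sample[:3]])

# Hypothetical estimates from three replications, only to show the summary step.
ae, mse = average_and_mse([0.4, 0.5, 0.6], 0.5)
print(f"AE = {ae:.4f}, MSE = {mse:.6f}")
```

In a full study, each replication would pass its sample to one of the estimators of Section 3 and the resulting estimates would be collected before calling `average_and_mse`.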

| Par | n | Est. | MLE | LSE | WLSE | CME | MPS | ADE | RTADE |
|---|---|---|---|---|---|---|---|---|---|
| λ = 0.5 | 25 | AE | 0.5673 | 0.4897 | 0.4929 | 0.5263 | 0.4816 | 0.5102 | 0.4686 |
| | | MSE | 0.0200 | 0.0316 | 0.0315 | 0.0313 | 0.0041 | 0.0257 | 0.0218 |
| | 50 | AE | 0.5437 | 0.4829 | 0.4802 | 0.5040 | 0.4983 | 0.4975 | 0.4902 |
| | | MSE | 0.0089 | 0.0170 | 0.0146 | 0.0158 | 0.0024 | 0.0175 | 0.0113 |
| | 100 | AE | 0.5227 | 0.4873 | 0.5003 | 0.4997 | 0.4992 | 0.4914 | 0.5027 |
| | | MSE | 0.0042 | 0.0096 | 0.0067 | 0.0084 | 0.0011 | 0.0046 | 0.0064 |
| | 300 | AE | 0.5098 | 0.4891 | 0.4990 | 0.4986 | 0.5004 | 0.4942 | 0.5050 |
| | | MSE | 0.0015 | 0.0036 | 0.0020 | 0.0028 | 0.0009 | 0.0020 | 0.0023 |
| | 500 | AE | 0.5026 | 0.5049 | 0.4968 | 0.5007 | 0.5090 | 0.5008 | 0.5022 |
| | | MSE | 0.0010 | 0.0014 | 0.0013 | 0.0016 | 0.0003 | 0.0013 | 0.0010 |
| θ = 0.5 | 25 | AE | 1.1585 | 0.2207 | 0.1742 | 0.6038 | 0.4670 | 0.4126 | 0.1417 |
| | | MSE | 1.6850 | 2.6828 | 3.2857 | 3.2989 | 0.0076 | 2.7555 | 3.8645 |
| | 50 | AE | 0.8743 | 0.2743 | 0.2389 | 0.4794 | 0.4896 | 0.4011 | 0.3443 |
| | | MSE | 0.8223 | 2.4417 | 1.7143 | 1.6978 | 0.0010 | 1.9692 | 1.4611 |
| | 100 | AE | 0.5960 | 0.3460 | 0.4395 | 0.4713 | 0.4963 | 0.4337 | 0.4771 |
| | | MSE | 0.3872 | 0.9877 | 0.7283 | 0.9177 | 0.0008 | 0.5503 | 0.8661 |
| | 300 | AE | 0.3400 | 0.4181 | 0.4931 | 0.4756 | 0.5049 | 0.4486 | 0.5061 |
| | | MSE | 0.1264 | 0.3812 | 0.2301 | 0.2816 | 0.0007 | 0.2131 | 0.2165 |
| | 500 | AE | 0.5292 | 0.5073 | 0.4734 | 0.5011 | 0.5097 | 0.4915 | 0.5216 |
| | | MSE | 0.0960 | 0.1335 | 0.1338 | 0.1657 | 0.0003 | 0.1373 | 0.0969 |

AE, average estimate; MSE, mean squared error; MLE, maximum likelihood estimator; LSE, least square estimator; WLSE, weighted least square estimator; CME, Cramer–von Mises estimator; MPS, maximum product spacing; ADE, Anderson–Darling estimator; RTADE, right-tail Anderson–Darling estimator.

Table 1

The AEs and the associated MSEs for the estimates of λ and θ (λ = 0.5, θ = 0.5) considering different sample sizes.

| Par | n | Est. | MLE | LSE | WLSE | CME | MPS | ADE | RTADE |
|---|---|---|---|---|---|---|---|---|---|
| λ = 1 | 25 | AE | 1.0759 | 0.9318 | 0.9798 | 1.0336 | 0.9598 | 0.9866 | 1.0077 |
| | | MSE | 0.0508 | 0.1168 | 0.0937 | 0.0938 | 0.0202 | 0.0864 | 0.0954 |
| | 50 | AE | 1.0092 | 0.9861 | 0.9562 | 1.0146 | 0.9808 | 0.9430 | 1.0216 |
| | | MSE | 0.0207 | 0.0601 | 0.0437 | 0.0565 | 0.0084 | 0.0446 | 0.0477 |
| | 100 | AE | 1.0223 | 0.9873 | 1.0170 | 1.0215 | 0.9912 | 0.9958 | 1.0153 |
| | | MSE | 0.0143 | 0.0351 | 0.0220 | 0.0262 | 0.0030 | 0.0244 | 0.0248 |
| | 300 | AE | 1.0015 | 0.9865 | 0.9780 | 1.0080 | 1.0056 | 1.0096 | 0.9913 |
| | | MSE | 0.0048 | 0.0098 | 0.0082 | 0.0086 | 0.0026 | 0.0085 | 0.0070 |
| | 500 | AE | 1.0034 | 1.0043 | 1.0053 | 1.0014 | 1.0176 | 1.0025 | 1.0078 |
| | | MSE | 0.0037 | 0.0049 | 0.0044 | 0.0047 | 0.0020 | 0.0044 | 0.0034 |
| θ = 0.75 | 25 | AE | 1.0774 | 0.2428 | 0.5457 | 0.8378 | 0.7257 | 0.4698 | 0.7177 |
| | | MSE | 1.2320 | 3.6427 | 2.4906 | 2.2156 | 0.0071 | 2.8561 | 3.0602 |
| | 50 | AE | 0.8486 | 0.6438 | 0.5907 | 0.7168 | 0.7453 | 0.4715 | 0.8298 |
| | | MSE | 0.5147 | 1.7936 | 1.2246 | 1.1642 | 0.0012 | 1.3342 | 1.3694 |
| | 100 | AE | 0.8745 | 0.6277 | 0.8020 | 0.8503 | 0.7480 | 0.6720 | 0.8176 |
| | | MSE | 0.4271 | 0.8848 | 0.5889 | 0.5837 | 0.0008 | 0.6024 | 0.6815 |
| | 300 | AE | 0.7589 | 0.6678 | 0.6496 | 0.7904 | 0.7520 | 0.7519 | 0.7044 |
| | | MSE | 0.1057 | 0.2395 | 0.1890 | 0.1721 | 0.0006 | 0.1865 | 0.1870 |
| | 500 | AE | 0.7682 | 0.7524 | 0.7613 | 0.7523 | 0.7583 | 0.7775 | 0.8085 |
| | | MSE | 0.1000 | 0.1024 | 0.1045 | 0.1041 | 0.0005 | 0.1090 | 0.0978 |

AE, average estimate; MSE, mean squared error; MLE, maximum likelihood estimator; LSE, least square estimator; WLSE, weighted least square estimator; CME, Cramer–von Mises estimator; MPS, maximum product spacing; ADE, Anderson–Darling estimator; RTADE, right-tail Anderson–Darling estimator.

Table 2

The AEs and the associated MSEs for the estimates of λ and θ (λ = 1, θ = 0.75) considering different sample sizes.

Some points are quite clear from Tables 1 and 2. As the sample size increases, the AEs based on all estimation methods tend to the true parameter values and the MSEs decrease. This indicates that all estimators are consistent and asymptotically unbiased. Furthermore, it is observed that as λ increases, the MSEs of its estimates increase for all methods, while the MSEs of the estimates of θ do not change uniformly. Based on the MSE as an optimality criterion, the MPSEs perform better than the MLEs and the other types of estimators for the compound ZTPL distribution. These results are consistent with other studies; see, for example, Ramos and Louzada [21] and Sharma et al. [22]. The RTADE has a smaller MSE than the ADE only in some settings, most consistently at n = 500. Figures 3 and 4 confirm these remarks.

Figure 3

MSE of λ̂ for λ = 0.5 (left) and MSE of θ̂ for θ = 0.5 (right).

Figure 4

MSE of λ̂ for λ = 1 (left) and MSE of θ̂ for θ = 0.75 (right).

5. APPLICATION TO MONTHLY MAXIMUM SNOWFALL DATA

In this section, we analyze a real data set representing the monthly highest snowfall, measured in inches (in), during February 2018 at a subset of stations in the United States. This data set was reported by the National Centers for Environmental Information (NCEI) (https://www.ncdc.noaa.gov/cdoweb/datatools/records). Here, we only wish to demonstrate the use of the estimation procedures based on samples from the compound ZTPL model. Table 3 summarizes some basic statistics of the monthly maximum snowfall data set.

| Mean | Median | Std. Dev. | Q1 | Q3 |
|---|---|---|---|---|
| 8.25 | 7.99 | 3.50 | 5.98 | 10.09 |

Table 3

Basic statistics of the monthly highest snowfall data set (in inches).

The following distributions are used in the literature as fitting models for such data. For example, the Poisson Lomax PL (α,β,λ) distribution was introduced by Bander and Hanaa [23], the exponentiated Weibull–Poisson (EWP) (α,β,γ,θ) distribution was considered by Mahmoudi and Sepahdar [24] and the Lindley (θ) distribution was used by Ghitany et al. [25]. Here, we show that the compound ZTPL distribution is also a suitable fitting distribution as an alternative to the PL, EWP and Lindley distributions. We fit the compound ZTPL distribution to the monthly highest snowfall data set. The MLEs and the corresponding log-likelihood (ll) values for each distribution are computed. The results are reported in Table 4. The MLEs of λ and θ are computed numerically using the Newton–Raphson (NR) method to be λ̂ = 0.4259 and θ̂ = 5.8033. The Kolmogorov–Smirnov (K-S) distance between the fitted and the empirical distribution functions is 0.1109, and the corresponding p value is 0.5197. These values indicate that the two-parameter compound ZTPL distribution fits the data set well.

| Distribution | Parameters | ll |
|---|---|---|
| Lindley | λ̂ = 0.2205 | −156.4475 |
| ZTPL | λ̂ = 0.4259, θ̂ = 5.8033 | −143.1997 |
| PL | α̂ = 0.0107, β̂ = 6.7197, λ̂ = 29.0468 | −145.3201 |
| EWP | α̂ = 0.1814, β̂ = 4.2158, γ̂ = 0.4335, θ̂ = 319.3034 | −152.0237 |

MLE, maximum likelihood estimator; ZTPL, zero-truncated Poisson–Lindley; PL, Poisson Lomax; EWP, exponentiated Weibull–Poisson.

Table 4

The MLEs and the corresponding log-likelihood (ll) values for different fitting distributions.

To further check model validity and for comparison, the K-S distances, their corresponding p values and other criteria are computed for the other distributions, namely the Lindley, PL and EWP distributions. The additional criteria considered are the Akaike information criterion (AIC), the corrected Akaike information criterion (AICc), the Hannan–Quinn information criterion (HQIC) and the Bayesian information criterion (BIC). Table 5 presents the values of these statistics. Note that the smaller the value of the considered criterion, the better the fit to the data. Clearly, the compound ZTPL distribution is a good alternative model compared with the other fitted models. Figure 5 shows the plots of the fitted PDFs and CDFs with their corresponding empirical values. In addition, the empirical survival function (ESF) and fitted survival functions are presented in Figure 6. All these plots support the same conclusion. Now, we obtain the estimators of the unknown parameters of the compound ZTPL model using the different methods of estimation discussed in Section 3. The estimates, together with the lower and upper confidence bounds (LCB and UCB) of the 95% CIs of the parameters, are displayed in Table 6. Based on the K-S distance and log-likelihood criteria, the MPS method performs slightly better than the other methods, although the values are close.

| Distribution | AIC | AICc | HQIC | BIC | K-S | p-value |
|---|---|---|---|---|---|---|
| Lindley | 314.8949 | 314.9719 | 315.6620 | 316.8839 | 0.2398 | 0.004 |
| ZTPL | 290.3994 | 290.6347 | 291.9335 | 294.3774 | 0.1109 | 0.5197 |
| PL | 296.6401 | 297.1201 | 298.9414 | 302.6071 | 0.1267 | 0.3512 |
| EWP | 312.0474 | 312.8637 | 315.1156 | 320.0033 | 0.1587 | 0.1317 |

ZTPL, zero-truncated Poisson–Lindley; PL, Poisson Lomax; EWP, exponentiated Weibull–Poisson; AIC, Akaike information criterion; AICc, corrected Akaike information criterion; HQIC, Hannan–Quinn information criterion; BIC, Bayesian information criterion; K-S, Kolmogorov–Smirnov.

Table 5

The goodness of fit tests for monthly highest snowfall data set.
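The information criteria in Table 5 follow from the log-likelihood, the number of parameters k and the sample size n by the standard formulas. The sketch below applies them to the ZTPL row. The sample size is not stated in the text; n = 54 is an assumption, chosen because it reproduces the tabulated AICc, HQIC and BIC values from the reported log-likelihood. The function name is illustrative.

```python
import math


def information_criteria(log_lik, k, n):
    """Standard definitions of AIC, AICc, HQIC and BIC for a model with
    k parameters fitted to n observations."""
    aic = 2 * k - 2 * log_lik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    hqic = 2 * k * math.log(math.log(n)) - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    return {"AIC": aic, "AICc": aicc, "HQIC": hqic, "BIC": bic}


# ZTPL row: ll = -143.1997 with two parameters; n = 54 is an assumed sample size.
ztpl = information_criteria(-143.1997, k=2, n=54)
for name, value in ztpl.items():
    print(f"{name}: {value:.4f}")
```

Under the same assumption the other rows of Table 5 are recovered up to rounding of the reported log-likelihoods, for example `information_criteria(-156.4475, k=1, n=54)` for the Lindley fit.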

Figure 5

Fitted and empirical densities and cumulative distribution functions (CDFs) of zero-truncated Poisson–Lindley (ZTPL), PL, Lindley and exponentiated Weibull–Poisson (EWP) distributions for the monthly highest snowfall data set.

Figure 6

The empirical survival function (ESF) and fitted survival functions for different distributions: (a) zero-truncated Poisson–Lindley (ZTPL), (b) PL, (c) exponentiated Weibull–Poisson (EWP) and (d) Lindley.

| Method | λ̂ Est. | λ̂ LCB | λ̂ UCB | θ̂ Est. | θ̂ LCB | θ̂ UCB | ll | K-S |
|---|---|---|---|---|---|---|---|---|
| MLE | 0.4258 | 0.3471 | 0.5078 | 5.8033 | 4.1696 | 7.4369 | −143.1997 | 0.1109 |
| LSE | 0.4309 | 0.3477 | 0.5140 | 6.1238 | 4.4186 | 7.8289 | −143.2309 | 0.1173 |
| WLSE | 0.4239 | 0.3421 | 0.5056 | 5.8177 | 4.1809 | 7.4544 | −143.2049 | 0.1165 |
| CME | 0.4423 | 0.3569 | 0.5276 | 6.6075 | 4.7906 | 8.4243 | −143.3629 | 0.1162 |
| MPS | 0.4341 | 0.3503 | 0.5178 | 5.8224 | 4.1845 | 7.4602 | −143.1989 | 0.1104 |
| ADE | 0.4278 | 0.3453 | 0.5102 | 5.9683 | 4.2981 | 7.6384 | −143.2100 | 0.1160 |
| RTADE | 0.4383 | 0.3537 | 0.5228 | 6.4321 | 4.6562 | 8.2079 | −143.3023 | 0.1165 |

CI, confidence interval; LCB, lower confidence bound; UCB, upper confidence bound; ll, log-likelihood; K-S, Kolmogorov–Smirnov; MLE, maximum likelihood estimator; LSE, least square estimator; WLSE, weighted least square estimator; CME, Cramer–von Mises estimator; MPS, maximum product spacing; ADE, Anderson–Darling estimator; RTADE, right-tail Anderson–Darling estimator.

Table 6

Estimates of λ and θ and the respective 95% CIs under various methods and goodness of fit statistics.

6. CONCLUSION

In this paper, a new family of distributions is proposed based on the maximum of a zero-truncated Poisson number of Lindley random variates. It is called the compound ZTPL model. Some distributional properties of this model are discussed and different methods of estimation are derived for the unknown parameters, namely, maximum likelihood, least squares, weighted least squares, Cramer–von Mises, maximum product of spacings, Anderson–Darling and right-tail Anderson–Darling. It is observed that the estimators obtained by the maximum product of spacings method outperform all other estimators when the mean squared error is considered as an optimality criterion. For the snowfall data analyzed here, the compound ZTPL model provides a better fit than the Lindley, Poisson Lomax and exponentiated Weibull–Poisson distributions.

CONFLICTS OF INTEREST

The authors have no conflicts of interest to declare.

Funding Statement

No funding

ACKNOWLEDGEMENTS

Thanks are due to the referees for their comments, which helped to improve the paper.

REFERENCES

12. A. Asgharzadeh, R. Rezaie, and M. Abdi, Selcuk J. Appl. Math., Special Issue, 2011, pp. 93-108.
16. R. Cheng and N. Amin, Maximum Product of Spacings Estimation with Application to the Lognormal Distribution, Math Report 79-1, Department of Mathematics, UWIST, Cardiff, UK, 1979.
18. B. Ranneby, Scand. J. Stat., Vol. 11, 1984, pp. 93-112. https://www.jstor.org./stable/4615946
ISSN (Online)
2214-1766
ISSN (Print)
1538-7887
Cite this article

TY  - JOUR
AU  - Mohammed Amine Meraou
AU  - Mohammad Z. Raqab
PY  - 2021
DA  - 2021/01/11
TI  - Statistical Properties and Different Estimation Procedures of Poisson–Lindley Distribution
JO  - Journal of Statistical Theory and Applications
SP  - 33
EP  - 45
VL  - 20
IS  - 1
SN  - 2214-1766
UR  - https://doi.org/10.2991/jsta.d.210105.001
DO  - 10.2991/jsta.d.210105.001
ID  - Meraou2021
ER  -