Journal of Statistical Theory and Applications

Volume 19, Issue 1, March 2020, Pages 49 - 58

Finite Mixture Modeling via Skew-Laplace Birnbaum–Saunders Distribution

Authors
Mehrdad Naderi1, *, Mahdieh Mozafari2, Kheirolah Okhli3
1Department of Statistics, Faculty of Natural and Agricultural Sciences, University of Pretoria, Pretoria, South Africa
2Department of Statistics, Faculty of Mathematics and Computing, Higher Education Complex of Bam, Bam, Iran
3Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran
*Corresponding author. Email: m.naderi@up.ac.za
Received 16 September 2017, Accepted 19 October 2018, Available Online 5 March 2020.
DOI
10.2991/jsta.d.200224.008
Keywords
Birnbaum–Saunders distribution; Normal mean-variance mixture model; Skew-Laplace distribution; Finite mixture model; ECM algorithm
Abstract

The finite mixture model is a widely acknowledged model-based clustering method for analyzing data. In this paper, a new finite mixture model based on an extension of the Birnbaum–Saunders distribution is introduced. The new mixture model provides a useful generalization of heavy-tailed lifetime models, since the mixing components accommodate both skewness and kurtosis. Some properties and characteristics of the model are derived, and an expectation-maximization (EM)-type algorithm is developed to compute the maximum likelihood estimates. The asymptotic standard errors of the parameter estimates are obtained via an information-based approach. Finally, the performance of the methodology is illustrated on both simulated and real datasets.

Copyright
© 2020 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The Birnbaum–Saunders (BS) distribution [1,2] is a positively skewed and unimodal distribution with non-negative support. The BS distribution has recently received considerable attention in the statistical literature, including lifetime, survival and environmental data analysis (see, e.g., [3-6]). An important property of the BS distribution is that it is closely related to the normal distribution by means of a simple stochastic representation. A random variable T is said to have a BS distribution with shape and scale parameters α and β, respectively, if it is generated by

T = (β/4)[αZ + √((αZ)² + 4)]²,  (1)
where Z follows a standard normal distribution (denoted by N(0,1)). It can be easily shown that the probability density function (pdf) of T is
f(t; α, β) = A(t, α, β) φ(a(t, α, β)),  t > 0, α > 0, β > 0,  (2)
where φ(·) is the pdf of the standard normal distribution, a(t, α, β) = (√(t/β) − √(β/t))/α, and A(t, α, β) = ∂a(t, α, β)/∂t = (t + β)/(2α√β t^{3/2}). A comprehensive review of the BS distribution can be found in [7].
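As a quick numerical check of representation (1) against pdf (2), the following Python sketch (our illustration, not the authors' code; it assumes NumPy and SciPy are available) draws BS variates and evaluates the density:

```python
# Monte Carlo check of the Birnbaum-Saunders stochastic representation (1)
# against the pdf (2); a minimal sketch, assuming NumPy/SciPy are available.
import numpy as np
from scipy.stats import norm

def bs_pdf(t, alpha, beta):
    """pdf (2): f(t) = A(t) * phi(a(t)), with a(t) = (sqrt(t/b) - sqrt(b/t)) / alpha."""
    a = (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha
    A = (t + beta) / (2.0 * alpha * np.sqrt(beta) * t**1.5)  # A = da/dt
    return A * norm.pdf(a)

def bs_rvs(alpha, beta, size, rng):
    """Representation (1): T = (beta/4) * [alpha*Z + sqrt((alpha*Z)^2 + 4)]^2."""
    z = rng.standard_normal(size)
    return (beta / 4.0) * (alpha * z + np.sqrt((alpha * z) ** 2 + 4.0)) ** 2

rng = np.random.default_rng(1)
t = bs_rvs(0.5, 2.0, 200_000, rng)
# Z = 0 maps to T = beta, so the sample median should sit near beta = 2.
print(np.median(t))
```

Since a(β, α, β) = 0, the density at t = β equals A(β)φ(0) = φ(0)/(αβ), a convenient spot check on the implementation.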

Recently, some generalizations and extensions of the BS distribution have been proposed by replacing the standard normal variable Z in (1) with other random variables, or by replacing φ(·) in (2) with other pdfs. The main motivation for introducing these extended distributions is to widen the range of skewness and kurtosis and to make the BS distribution more flexible, since the BS distribution may not fit some data well [8]. Therefore, skewed and heavy-tailed families of distributions have been used by several researchers to enhance the flexibility of the BS distribution against skewness and outliers. For instance, based on the skew-normal distribution and its extensions, such as the skew-t distribution [9], the related BS versions have been proposed (see [10-12], among others).

Another important family of skewed distributions is the class of normal mean-variance (NMV) mixture models [13]. Concerning the asymmetric properties of the NMV models, Arslan [14] introduced the skew-Laplace (SL) distribution via considering the inverse gamma distribution as a mixing random variable. Arslan also developed the expectation and maximization (EM) algorithm [15] for computing maximum likelihood (ML) estimates.

In reliability studies, data often arise from heterogeneous populations and so they need to be modeled by a mixture of two or more life distributions. The finite mixture of distributions (FM) is a finite convex linear combination of distribution functions which is also useful to approximate complicated probability densities presenting multimodality. A comprehensive survey of the FM models can be found in [16,17]. Because of the usefulness of the FM models in reliability, different kinds of the lifetime mixture models have been recently proposed. Ali [18] introduced a mixture of the inverse Rayleigh model for lifetime study in engineering processes and studied different properties of the proposed model. Balakrishnan et al. [19] proposed three different mixture models based on the BS distribution as (a) mixture of two different BS distributions, (b) mixture of a BS distribution and a length-biased version of another BS distribution and (c) mixture of a BS distribution and its length-biased version.

Due to the asymmetric properties and robustness of the SL distribution, the main objective of this paper is to propose a g-component FM model via an extension of the BS distribution based on the SL model. After studying some properties of the new model, an EM-type algorithm is implemented to facilitate the ML estimation. It also becomes possible to compute asymptotic standard errors of the ML estimates using the information-based approach of [20]. Finally, through a simulation study, we show that the proposed model is robust for analyzing heavy-tailed lifetime data.

The remainder of the paper is unfolded as follows. In Section 2, we briefly review the univariate SL distribution and some of its properties. We also present the skew-Laplace Birnbaum–Saunders (SLBS) distribution with its characteristics in this section. Section 3 describes the finite mixture model of the proposed SLBS distribution and develops the EM-type algorithm for estimating parameters. In Section 4, we analyze a real data set for the purpose of illustrating the performance of the proposed mixture model. Two simulation studies are carried out to verify the robustness of the new model and to check finite-sample properties of the ML estimates in Section 5. Finally, the paper is closed with a short summary in Section 6.

2. PRELIMINARIES

A random variable X is said to have a univariate SL distribution (denoted by X~SL(μ,σ,λ)) if its pdf is given by

f_SL(x; μ, σ, λ) = (1/(2τσ)) exp{−τ|x − μ|/σ + λ(x − μ)/σ²},  x ∈ ℝ,
where σ > 0, μ ∈ ℝ and λ ∈ ℝ are the scale, location and skewness parameters, respectively, and τ = √(1 + (λ/σ)²). It can be easily shown that the SL distribution is generated through the NMV representation. Specifically, let Z and W be two independent random variables following the N(0,1) distribution and the exponential distribution with mean 2 (Exp(0.5)), respectively. Then the random variable X ~ SL(μ, σ, λ) admits the stochastic representation
X =_d μ + Wλ + σ√W Z.  (3)

Proposition 2.1.

The skewness and kurtosis of X~SL(μ,σ,λ) are respectively given by

γ_X = (16λ³ + 12λ)/(4λ² + 2)^{3/2},  and  κ_X = (36λ⁴ + 36λ² + 6)/(2λ² + 1)² − 3.

Proof.

The results are straightforward by using representation (3).
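Proposition 2.1 can be verified by Monte Carlo through representation (3); the sketch below (our illustration, with NumPy) compares sample skewness and kurtosis with the closed forms:

```python
# Monte Carlo check of Proposition 2.1 via representation (3):
# X = mu + W*lam + sigma*sqrt(W)*Z, with W ~ Exp(0.5) (mean 2) and Z ~ N(0,1).
import numpy as np

def sl_rvs(mu, sigma, lam, size, rng):
    w = rng.exponential(scale=2.0, size=size)  # Exp(0.5) rate parametrization: mean 2
    z = rng.standard_normal(size)
    return mu + w * lam + sigma * np.sqrt(w) * z

def sl_skew_kurt(lam):
    """Closed forms from Proposition 2.1 (standard SL, mu = 0, sigma = 1)."""
    skew = (16 * lam**3 + 12 * lam) / (4 * lam**2 + 2) ** 1.5
    kurt = (36 * lam**4 + 36 * lam**2 + 6) / (2 * lam**2 + 1) ** 2 - 3
    return skew, kurt

rng = np.random.default_rng(0)
x = sl_rvs(0.0, 1.0, 0.8, 2_000_000, rng)
c = x - x.mean()
m2, m3, m4 = (c**2).mean(), (c**3).mean(), (c**4).mean()
print(m3 / m2**1.5, m4 / m2**2 - 3, sl_skew_kurt(0.8))
```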

Figure 1 displays the curves of skewness and kurtosis of the SL distribution. It can be observed that the SL distribution covers wider ranges of skewness and kurtosis than the skew-normal distribution. In the following, the notation f_SL(·; λ) stands for the pdf of the standard SL distribution (μ = 0, σ = 1), denoted by X ~ SL(λ).

Figure 1

The skewness and kurtosis plots of the skew-Laplace (SL) distribution.

Replacing Z by X~SL(λ) in (1), the SLBS distribution with the following pdf can be introduced

f_SLBS(t; α, β, λ) = A(t, α, β) f_SL(a(t, α, β); λ),  t > 0, α > 0, β > 0, λ ∈ ℝ.  (4)

This model will be denoted by SLBS(α,β,λ) henceforth. The stochastic representation of the SLBS distribution can be readily obtained by (1) and (3) as

T =_d (β/4)[α(Wλ + √W Z) + √(α²(Wλ + √W Z)² + 4)]²,
where Z~N(0,1) and W~Exp(0.5), independently.
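Sampling from SLBS(α, β, λ) therefore takes two steps: draw (W, Z), form the SL variate, and push it through (1); a sketch (function names are ours, not the authors' code):

```python
# Draw from SLBS(alpha, beta, lam) via the two-step stochastic representation
# above; a sketch, not the authors' implementation.
import numpy as np

def slbs_rvs(alpha, beta, lam, size, rng):
    w = rng.exponential(scale=2.0, size=size)  # W ~ Exp(0.5), mean 2
    z = rng.standard_normal(size)
    x = w * lam + np.sqrt(w) * z               # X ~ SL(0, 1, lam)
    return (beta / 4.0) * (alpha * x + np.sqrt((alpha * x) ** 2 + 4.0)) ** 2

rng = np.random.default_rng(2)
t = slbs_rvs(0.5, 1.0, 1.0, 100_000, rng)
print(t.min() > 0)  # the support is (0, infinity)
```

Transforming the draws back through (1/α)(√(T/β) − √(β/T)) should recover an SL(λ) sample, whose mean is 2λ.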

Theorem 2.1.

Some properties of the SLBS distribution are as follows:

  1. The density of the SLBS distribution tends to the density of the Laplace BS distribution as λ → 0.

  2. The random variable T distributed as SLBS(α, β, λ) degenerates to β as α tends to zero.

  3. If T ~ SLBS(α, β, λ), then X =_d (1/α)(√(T/β) − √(β/T)) ~ SL(λ).

  4. Let T ~ SLBS(α, β, λ). It can be easily shown that the cumulative distribution function (cdf) of T is F_SLBS(t; α, β, λ) = F_GH(a(t, α, β); 1, 0, 1, 0, 1, λ), where F_GH(·; κ, χ, ψ, μ, σ, λ) represents the cdf of the generalized hyperbolic distribution with parameters (κ, χ, ψ, μ, σ, λ). This leads readily to the hazard rate function of T:

    H(t) = f_SLBS(t; α, β, λ)/[1 − F_GH(a(t, α, β); 1, 0, 1, 0, 1, λ)].  (5)

Using the “ghyp” package in the statistical software R and part (4) of Theorem 2.1, the hazard rate function of T can be computed. Figure 2 presents the hazard function curves of (5) for some parameter values. Since the hazard function of the proposed distribution can be decreasing, increasing, or upside-down bathtub shaped, the proposed three-parameter distribution is quite flexible and can be used effectively in analyzing data with monotone as well as non-monotone hazard shapes, which are quite common in reliability and biological studies.
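Because the GH cdf enters (5) only through the survival function, the hazard can also be approximated by plain quadrature rather than the ghyp package; a Python sketch (our illustration, using the standard SL pdf with σ = 1):

```python
# Numeric hazard H(t) = f(t) / (1 - F(t)) for the SLBS density (4), with the
# survival function obtained by quadrature; a sketch, not the authors' code.
import numpy as np
from scipy.integrate import quad

def sl_pdf_std(x, lam):
    tau = np.sqrt(1.0 + lam**2)
    return np.exp(-tau * np.abs(x) + lam * x) / (2.0 * tau)

def slbs_pdf(t, alpha, beta, lam):
    a = (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha
    A = (t + beta) / (2.0 * alpha * np.sqrt(beta) * t**1.5)
    return A * sl_pdf_std(a, lam)

def slbs_hazard(t, alpha, beta, lam):
    surv = quad(lambda s: slbs_pdf(s, alpha, beta, lam), t, np.inf)[0]
    return slbs_pdf(t, alpha, beta, lam) / surv

print(slbs_hazard(1.0, 0.5, 1.0, 1.0))
```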

Figure 2

The hazard rate plots of the skew-Laplace Birnbaum-Saunders (SLBS) distribution for some parameter values.

Now, in the following theorem, we present a transformational result for a random variable T with SLBS distribution. This theorem can be useful for proposing a regression model with errors following the SLBS distribution.

Theorem 2.2.

Let T ~ SLBS(α, β, λ). The random variable Y = log(T) has the following pdf:

f_Y(y; α, γ, λ) = U(y, α, γ) f_SL(u(y, α, γ); λ),
where γ = log(β), u(y, α, γ) = (2/α) sinh((y − γ)/2) and U(y, α, γ) = ∂u(y, α, γ)/∂y = (1/α) cosh((y − γ)/2).

Remark 2.1.

As a measure of the uncertainty or randomness of a system, entropy plays an important role in many sciences. Entropy has been used in various situations, and numerous measures of it have been studied and compared in the literature. For a random variable X, the Rényi entropy, which is important in quantum information, and the Shannon entropy, which plays a role similar to the kurtosis measure in comparing the shapes of various densities and measuring heaviness of tails, are respectively defined by

I_R = (1 − δ)^{−1} log ∫ f(x)^δ dx,  and  I_S = −E[log f(X)],
where δ>0, δ1 and f(x) represents the pdf of X. The Rényi and Shannon entropies of T~SLBS(α,β,λ) can be expressed as
I_R = −δ(1 − δ)^{−1} log(2) − δ[2(1 − δ)]^{−1} log(1 + λ²) + (1 − δ)^{−1} log ∫₀^∞ A(t, α, β)^δ exp{−τδ|a(t, α, β)| + λδ a(t, α, β)} dt,
I_S = log(4ατ√β) + τE[|a(T, α, β)|] − λE[a(T, α, β)] + (3/2)E[log(T)] − E[log(T + β)].

There is no closed-form expression for the integral and the expectations above; hence, Theorems 2.1 and 2.2 are useful for computing them numerically.

Consequently, we establish the following proposition and theorem, which are useful for obtaining the complete log-likelihood function and for the calculation of some conditional expectations involved in the proposed EM-type algorithm discussed in the next section.

Proposition 2.2.

Let T be the random variable following the SLBS(α,β,λ) distribution. The hierarchical representation of T is

T | W = w ~ EBS(α√w, β, 2, √w λ),  W ~ Exp(0.5),
where EBS denotes the extended Birnbaum–Saunders distribution [21].

Theorem 2.3.

Let W ~ Exp(0.5) and T ~ SLBS(α, β, λ). For every t, the conditional distribution of W given T = t is the generalized inverse Gaussian (GIG) distribution, GIG(0.5, a²(t, α, β), 1 + λ²). Moreover,

E[W^r | T = t] = (|a(t, α, β)|/√(1 + λ²))^r R(√(1 + λ²) |a(t, α, β)|, 0.5, r),  r = ±1, ±2, …,
where R(c, a, b) = K_{a+b}(c)/K_a(c) and K_κ(·) denotes the modified Bessel function of the third kind with index κ.

Proof.

By Bayes' rule, the conditional distribution is simply obtained. Moreover, the conditional expectation can be derived by using properties of the GIG distribution and the Bessel function, see [22].
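The moments E[W⁻¹ | t] and E[W | t] delivered by Theorem 2.3 with r = −1 and r = 1 are exactly the weights needed later in the E-step; a sketch of the formula using scipy.special.kv for K_κ (our illustration):

```python
# Conditional moments E[W^r | T = t] from Theorem 2.3, written with the
# modified Bessel function K (scipy.special.kv); a sketch of the formula.
import numpy as np
from scipy.special import kv

def bessel_ratio(c, a, b):
    """R(c, a, b) = K_{a+b}(c) / K_a(c)."""
    return kv(a + b, c) / kv(a, c)

def cond_moment(r, t, alpha, beta, lam):
    a_t = (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha
    tau = np.sqrt(1.0 + lam**2)
    return (np.abs(a_t) / tau) ** r * bessel_ratio(tau * np.abs(a_t), 0.5, r)

# u_hat = E[W^{-1} | t] and w_hat = E[W | t], the two E-step weights
print(cond_moment(-1, 1.2, 0.5, 1.0, 1.0), cond_moment(1, 1.2, 0.5, 1.0, 1.0))
```

Since K_{−1/2} = K_{1/2} and K_{3/2}(c)/K_{1/2}(c) = 1 + 1/c in closed form, these two moments reduce to τ/|a| and |a|/τ + 1/τ², which is a handy sanity check.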

3. FINITE MIXTURE OF THE SLBS DISTRIBUTIONS

Consider n independent random variables T_1, …, T_n taken from a mixture of SLBS (Mix-SLBS) distributions. The pdf of the g-component Mix-SLBS model is

f(t_j; Θ) = Σ_{i=1}^g π_i f_SLBS(t_j; θ_i),  j = 1, 2, …, n,  (6)
where the π_i's are mixing proportions subject to Σ_{i=1}^g π_i = 1, f_SLBS(·; θ_i) is the SLBS density (4) with θ_i = (α_i, β_i, λ_i), and Θ = (π_1, …, π_{g−1}, θ_1, …, θ_g). Given observed data t = (t_1, …, t_n), the observed log-likelihood function is
ℓ(Θ|t) = Σ_{j=1}^n log Σ_{i=1}^g π_i f_SLBS(t_j; θ_i).  (7)
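Evaluating (7) is a log-sum-exp over components; a minimal Python sketch (function and variable names are ours):

```python
# Observed log-likelihood (7) of the g-component Mix-SLBS model, evaluated
# with the log-sum-exp trick for numerical stability; a sketch.
import numpy as np

def sl_logpdf_std(x, lam):
    tau = np.sqrt(1.0 + lam**2)
    return -np.log(2.0 * tau) - tau * np.abs(x) + lam * x

def slbs_logpdf(t, alpha, beta, lam):
    a = (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha
    logA = np.log(t + beta) - np.log(2.0 * alpha) - 0.5 * np.log(beta) - 1.5 * np.log(t)
    return logA + sl_logpdf_std(a, lam)

def mix_loglik(t, pi, theta):
    """pi: mixing proportions; theta: list of (alpha, beta, lam) per component."""
    comp = np.stack([np.log(p) + slbs_logpdf(t, *th) for p, th in zip(pi, theta)])
    m = comp.max(axis=0)
    return np.sum(m + np.log(np.exp(comp - m).sum(axis=0)))  # log-sum-exp

t = np.array([0.3, 0.8, 1.5, 2.2])
print(mix_loglik(t, [0.6, 0.4], [(0.5, 1.0, 1.0), (0.3, 2.0, -0.5)]))
```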

In principle, the ML estimate of the parameters could be obtained by maximizing (7) directly; however, this maximization is complicated. An alternative approach for obtaining the ML estimator is the EM algorithm, whose idea is to solve a sequence of tractable complete-data log-likelihood problems instead of a difficult incomplete-data log-likelihood problem. To apply this approach to the Mix-SLBS model, it is convenient to construct the complete log-likelihood by introducing a set of allocation variables Z_j = (Z_1j, …, Z_gj) for j = 1, …, n, with Z_ij = 1 if t_j belongs to the ith component and Z_ij = 0 otherwise. This implies that Z_j independently follows a multinomial distribution with one trial and probabilities (π_1, …, π_g), denoted by Z_j ~ M(1; π_1, …, π_g). It also follows from Proposition 2.2 that the hierarchical formulation of (6) can be represented by

T_j | W_j = w_j, Z_ij = 1 ~ EBS(α_i√w_j, β_i, 2, √w_j λ_i),  W_j | Z_ij = 1 ~ Exp(0.5),  Z_j ~ M(1; π_1, …, π_g).

Therefore, the complete data log-likelihood function for Θ associated with the observed variable t and hidden variables w=(w1,,wn) and Z=(Z1,,Zn), omitting additive constants, is

ℓ_c(Θ|t, w, z) = Σ_{j=1}^n Σ_{i=1}^g z_ij [log(π_i/α_i) + log((t_j + β_i)/√β_i) − δ(t_j, β_i)/(2α_i² w_j) − λ_i² w_j/2 + (λ_i/α_i) ξ(t_j, β_i)],  (8)
where δ(t, β) = t/β + β/t − 2 and ξ(t, β) = a(t, 1, β) = √(t/β) − √(β/t).

3.1. Parameter Estimation via ECM Algorithm

To compute the ML estimate of the unknown parameters involved in (8), we adopt the Expectation Conditional Maximization (ECM) algorithm [23]. The ECM algorithm is a variant of the EM algorithm with the maximization (M) step of EM replaced by a sequence of computationally simpler conditional maximization (CM) steps. The ECM algorithm for ML estimation of the Mix-SLBS proceeds as follows:

  • E-step: At iteration k, we compute the so-called Q-function, defined as the expected value of the complete-data log-likelihood (8) given the data and Θ̂^(k):

    Q(Θ|Θ̂^(k)) = Σ_{j=1}^n Σ_{i=1}^g ẑ_ij^(k) [log(π_i/α_i) + log((t_j + β_i)/√β_i) − û_ij^(k) δ(t_j, β_i)/(2α_i²) − λ_i² ŵ_ij^(k)/2 + (λ_i/α_i) ξ(t_j, β_i)],  (9)
    where ẑ_ij^(k) = π̂_i^(k) f_SLBS(t_j; θ̂_i^(k))/f(t_j; Θ̂^(k)) is the posterior probability that t_j belongs to the ith component, and û_ij^(k) = E[W_j^{−1}|t_j, θ̂_i^(k)] and ŵ_ij^(k) = E[W_j|t_j, θ̂_i^(k)] are calculated from Theorem 2.3 with r = −1 and r = 1.

  • CM-steps: Put n_i = Σ_{j=1}^n ẑ_ij^(k), A_i = Σ_{j=1}^n ẑ_ij^(k) ŵ_ij^(k), B_i = Σ_{j=1}^n ẑ_ij^(k) û_ij^(k), R_i = Σ_{j=1}^n ẑ_ij^(k) û_ij^(k) t_j and S_i = Σ_{j=1}^n ẑ_ij^(k) û_ij^(k)/t_j, and update Θ̂^(k) by maximizing (9) over Θ. This leads to the following CM estimators:

    π̂_i^(k+1) = n_i/n,
    λ̂_i^(k+1) = (1/A_i) Σ_{j=1}^n ẑ_ij^(k) a(t_j; α̂_i^(k), β̂_i^(k)),
    (α̂_i^(k+1))² = R_i/(n_i β̂_i^(k)) + β̂_i^(k) S_i/n_i − 2B_i/n_i − [Σ_{j=1}^n ẑ_ij^(k) ξ(t_j, β̂_i^(k))]²/(A_i n_i),
    β̂_i^(k+1) = argmax_{β_i} ℓ_obs(β_i; α̂_i^(k+1), λ̂_i^(k+1)),
    where
    ℓ_obs(β_i; α̂_i^(k+1), λ̂_i^(k+1)) = Σ_{j=1}^n ẑ_ij^(k) [log((t_j + β_i)/√β_i) − û_ij^(k) δ(t_j, β_i)/(2(α̂_i^(k+1))²) + (λ̂_i^(k+1)/α̂_i^(k+1)) ξ(t_j, β_i)].

The above procedure is iterated until a suitable convergence rule is satisfied. Because a small change in successive log-likelihood values can merely indicate lack of progress of the algorithm rather than convergence [24], we recommend adopting the Aitken acceleration method [25] as the stopping criterion. In this approach, the asymptotic estimate of the log-likelihood [16] is computed by

ℓ_∞^(k+1) = ℓ(θ̂^(k+1)) + [ℓ(θ̂^(k+1)) − ℓ(θ̂^(k))]/(1 − a^(k)),
where the Aitken acceleration factor is a^(k) = [ℓ(θ̂^(k+1)) − ℓ(θ̂^(k))]/[ℓ(θ̂^(k)) − ℓ(θ̂^(k−1))]. The ECM algorithm can be considered to have reached convergence when |ℓ_∞^(k+1) − ℓ(θ̂^(k))| < ε [26]. This paper uses the tolerance ε = 10^{−5}.
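The stopping rule can be packaged as a helper that inspects the last three log-likelihood values; a sketch consistent with the criterion above (names are ours):

```python
# Aitken-accelerated stopping rule for the ECM iterations; a sketch that
# tracks the last three log-likelihood values.
def aitken_converged(ll, tol=1e-5):
    """ll: the last three log-likelihoods (l^(k-1), l^(k), l^(k+1))."""
    l_prev2, l_prev, l_curr = ll
    a_k = (l_curr - l_prev) / (l_prev - l_prev2)      # Aitken acceleration factor
    l_inf = l_curr + (l_curr - l_prev) / (1.0 - a_k)  # asymptotic log-likelihood estimate
    return abs(l_inf - l_prev) < tol

print(aitken_converged([-100.0, -99.0, -98.9]))
```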

3.2. Estimation of Standard Errors

To compute the asymptotic covariance of the ML estimates Θ̂, we employ the information-based method suggested by Meilijson [27]. Formally, the empirical information matrix is defined as

I_e(Θ|t) = Σ_{j=1}^n s(t_j|Θ) sᵀ(t_j|Θ) − (1/n) S(t|Θ) Sᵀ(t|Θ),  (10)
where S(t|Θ) = Σ_{j=1}^n s(t_j|Θ) and the s(t_j|Θ) are individual scores, which can be determined from the result of [20] as s(t_j|Θ) = ∂ log f(t_j|Θ)/∂Θ = E[∂ℓ_c(Θ|t_j, w_j, z_j)/∂Θ | t_j, Θ]. Substituting the ML estimates Θ̂ = (π̂_1, …, π̂_{g−1}, α̂_1, …, α̂_g, β̂_1, …, β̂_g, λ̂_1, …, λ̂_g) into (10), and noting that S(t|Θ̂) = 0 at the ML solution, gives
I_e(Θ̂|t) = Σ_{j=1}^n ŝ_j ŝ_jᵀ,  (11)
where ŝ_j = (ŝ_{j,π1}, …, ŝ_{j,πg−1}, ŝ_{j,α1}, …, ŝ_{j,αg}, ŝ_{j,β1}, …, ŝ_{j,βg}, ŝ_{j,λ1}, …, ŝ_{j,λg}). The explicit expressions for the elements of ŝ_j are
ŝ_{j,πr} = ẑ_rj/π̂_r − ẑ_gj/π̂_g,
ŝ_{j,λr} = ẑ_rj[−λ̂_r ŵ_rj + a(t_j; α̂_r, β̂_r)],
ŝ_{j,αr} = ẑ_rj[−1/α̂_r + (û_rj/α̂_r³)(t_j/β̂_r + β̂_r/t_j − 2) − (λ̂_r/α̂_r²) ξ(t_j, β̂_r)],
ŝ_{j,βr} = ẑ_rj[1/(t_j + β̂_r) − 1/(2β̂_r) − (û_rj/(2α̂_r²))(1/t_j − t_j/β̂_r²) − (λ̂_r/(2β̂_r α̂_r))(√(t_j/β̂_r) + √(β̂_r/t_j))].

As a result, the standard errors of Θ̂ are obtained as the square roots of the diagonal elements of the inverse of (11).
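Operationally, this amounts to stacking the n score vectors ŝ_j, forming the outer-product sum (11), inverting, and taking square roots of the diagonal; a generic sketch (our illustration, with random numbers standing in for the real scores):

```python
# Standard errors from the empirical information matrix (11): stack the
# per-observation scores, form sum of s_j s_j^T, invert, take sqrt of diagonal.
import numpy as np

def standard_errors(scores):
    """scores: (n, p) array of individual score vectors evaluated at the MLE."""
    info = scores.T @ scores           # empirical information, eq. (11)
    cov = np.linalg.inv(info)          # asymptotic covariance of the MLE
    return np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
s = rng.standard_normal((200, 3))      # placeholder scores for illustration
print(standard_errors(s))
```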

4. REAL DATA ANALYSIS

In this section, we use the Enzyme data to illustrate the performance of the Mix-SLBS model. These data, originally analyzed by Bechtel et al. [28], correspond to the enzymatic activity in the blood and represent the metabolism of carcinogenic substances among 245 unrelated individuals. The enzymatic activity is quantified by the molar ratio between two metabolites of caffeine. Bechtel et al. concluded that a mixture of two skewed distributions is suitable for analyzing these data. Recently, Balakrishnan et al. [19] used these data to compare three kinds of BS mixture models: (a) a mixture of two BS distributions (Mix-BS), (b) a mixture of length-biased BS and BS distributions (Mix-LBBS) and (c) a mixture of length-biased BS and BS distributions with the same parameters (Mix-LBSBS). Here, the Akaike information criterion (AIC) [29] and the Bayesian information criterion (BIC) [30] are computed to compare our proposed model with these three models. The AIC and BIC are formulated as m c_n − 2ℓ(Θ̂), where ℓ(Θ̂) is the maximized log-likelihood, m is the number of free parameters estimated under the model and the penalty term c_n is a convenient sequence of positive numbers: c_n = 2 for AIC and c_n = log(n) for BIC. Also, the Kolmogorov–Smirnov (KS) test is carried out as a goodness-of-fit measure. The KS statistic is the distance between the empirical cdf and the estimated theoretical cdf of a model, and the p-value of the KS test assesses the compatibility of the data with the fitted distribution.
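With the form m c_n − 2ℓ(Θ̂), both criteria are one-liners; the sketch below (ours) plugs in the Mix-SLBS values reported in Table 1 (m = 7 free parameters for g = 2, n = 245) and reproduces the reported AIC and BIC up to rounding of ℓ(Θ̂):

```python
# AIC/BIC in the paper's form m*c_n - 2*loglik, with c_n = 2 for AIC and
# c_n = log(n) for BIC; a small helper used to rank the fitted models.
import numpy as np

def aic_bic(loglik, m, n):
    return m * 2 - 2 * loglik, m * np.log(n) - 2 * loglik

# Mix-SLBS on the Enzyme data: m = 7 free parameters, n = 245, loglik = -46.394
aic, bic = aic_bic(-46.394, 7, 245)
print(round(aic, 3), round(bic, 3))
```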

We fit the Mix-SLBS model for different numbers of components g and find that g = 2 is best according to the AIC and BIC criteria. The results of the study are summarized in Table 1. Based on AIC and BIC, the Mix-SLBS provides a markedly improved fit over the three competitors. The results in Table 1 can also be compared with Table 5 of Benites et al. [31], where Mix-BS distributions with different numbers of components are fitted. Moreover, the Mix-SLBS model yields considerably smaller standard errors for the estimated parameters, which shows that the Mix-SLBS distribution produces more precise estimates for this data example.

Parameter    Mix-SLBS          Mix-BS            Mix-LBBS          Mix-LBSBS
             MLE      SE       MLE      SE       MLE      SE       MLE      SE
π            0.636    0.007    0.629    0.031    0.450    0.028    0.417    0.026
α1           0.399    0.001    0.533    0.032    0.365    0.034    1.038    0.084
β1           0.185    0.012    0.175    0.007    0.171    0.007    0.216    0.012
λ1           -0.029   0.001    -        -        -        -        -        -
α2           0.210    0.016    0.319    0.025    1.274    0.114    1.038    0.084
β2           1.003    0.039    1.274    0.043    0.213    0.044    0.216    0.012
λ2           0.587    0.003    -        -        -        -        -        -
ℓ(Θ̂)        -46.394           -59.168           -71.091           -115.899
AIC          106.787           128.336           152.182           237.798
BIC          131.296           145.842           169.688           248.302
KS           0.039             0.053             0.111             0.151
p-value      0.832             0.507             0.005             <0.001

BS, Birnbaum–Saunders; LBBS, Length-biased Birnbaum–Saunders; LBSBS, Length-biased Birnbaum–Saunders and Birnbaum–Saunders distributions with the same parameters; MLE, maximum likelihood estimation; SE, standard error; SLBS, skew-Laplace Birnbaum-Saunders.

The bold numbers indicate the best model according to each model comparison measure.

Table 1

Parameter estimates with corresponding standard errors of the Enzyme data set.

On the other hand, the p-value of the KS test for the Mix-SLBS distribution is considerably greater than those of the Mix-BS, Mix-LBBS and Mix-LBSBS models, which strongly suggests that the Enzyme data follow the Mix-SLBS distribution.

Finally, we provide the probability-probability (PP)-plot of the two best fitted models in Figure 3. The PP-plot shows that the Mix-SLBS model adapts to the shape of the histogram very accurately.

Figure 3

The PP-plots of two best distributions.

5. SIMULATION STUDY

In this section, we conduct two simulation studies in order to examine the performance of the proposed method. The first study shows that the underlying Mix-SLBS model is robust in the ability to cluster heterogeneous lifetime data in the presence of outliers. In the second simulation study, we investigate if the ML estimates obtained using the proposed ECM algorithm can provide good asymptotic properties. All simulation studies are done in the statistical software R.

5.1. Robustness of the Model

In the first experiment, the ability of the Mix-SLBS model in clustering observations (allocating them into groups of observations that are similar in some sense) is verified. Although we know that each data point belongs to one of g heterogeneous populations, there is no observable label discriminating between them. Through mixture modeling, we can cluster the data in terms of the estimated (posterior) probability of belonging to a given group. Our main idea is to use the flexibility of these models for lifetime data and to extend their ability by accommodating possibly higher skewness and kurtosis in the components.

We generated 500 samples from a mixture of two BS densities and from a mixture of two SLBS densities, with the parameter values S1: π = 0.6, α = (2, 1), β = (2, 1) and λ = (2, 2), and S2: π = 0.6, α = (1, 1), β = (1, 1) and λ = (3, 0). For each artificial sample, we carried out clustering while treating the true classification as known. The related mixture models were then fitted in order to obtain the estimated posterior probabilities of component membership. Finally, for each replication v, v = 1, …, 500, we computed the number of correct allocations divided by the sample size, r_v.
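The rate r_v can be computed by hard-assigning each observation to its maximum-posterior component and comparing with the true labels; a sketch (ours; for g = 2 it also guards against label switching, a bookkeeping detail the paper does not spell out):

```python
# Correct-allocation rate r_v of a fitted mixture: assign each observation to
# the component with the largest posterior probability, compare with the known
# labels, and (for g = 2) take the best of the two label orderings; a sketch.
import numpy as np

def allocation_rate(post, true_labels):
    """post: (n, g) posterior probabilities; true_labels in {0, ..., g-1}."""
    pred = post.argmax(axis=1)
    hits = np.mean(pred == true_labels)
    return max(hits, 1.0 - hits) if post.shape[1] == 2 else hits

post = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
print(allocation_rate(post, np.array([0, 1, 0, 0])))
```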

Table 2 shows the mean value of r_v, i.e., (1/500) Σ_{v=1}^{500} r_v. We observe from Table 2 that the rate of correct allocations under both mixture models increases with the sample size when the samples are drawn from the Mix-BS model. However, compared with the results for the Mix-BS model, modeling with the Mix-SLBS distribution represents a substantial improvement in clustering when the samples are drawn from the Mix-SLBS model. This happens because of the heavier tails of the SLBS distribution. Hence, the Mix-SLBS performs better in clustering data with heavier tails, which shows its robustness to discrepant observations.

                           S1: Fitted Model          S2: Fitted Model
True Model   Sample Size   Mix-BS     Mix-SLBS       Mix-BS     Mix-SLBS
Mix-BS       100           0.8134     0.7861         0.7998     0.7460
             500           0.8231     0.8350         0.8067     0.7984
             1000          0.8358     0.8345         0.8109     0.8201
Mix-SLBS     100           0.5189     0.9722         0.6018     0.8874
             500           0.4477     0.9761         0.5929     0.9146
             1000          0.4449     0.9776         0.5901     0.9191

BS, Birnbaum–Saunders; SLBS, skew-Laplace Birnbaum-Saunders.

Table 2

Mean right allocations rates for fitted models.

5.2. Asymptotic Properties

We carry out a second simulation study to check the finite-sample performance of the ML estimates obtained by the ECM algorithm. To show the effect of the parameters on the estimation procedure, we consider three sets of parameter values Θ = (π, α1, β1, λ1, α2, β2, λ2) as follows:

S1) Θ = (0.7, 0.25, 1, 1, 1, 2, 2),  S2) Θ = (0.7, 1, 0.25, 2, 1, 1, 2),  and S3) Θ = (0.7, 1, 1, 0.5, 1, 1, 2).

As suggested in [32], the parameters were chosen to produce highly skewed and heavy-tailed distributions. In each of 500 replications, we generate data of sizes n = 100, 250, 500 and 1000 from the Mix-SLBS model. To investigate the estimation accuracy, the relative bias (R.Bias) and the mean squared error (MSE) over all samples are computed as:

R.Bias = (1/500) Σ_{i=1}^{500} |(θ̂_i − θ)/θ|,  and  MSE = (1/500) Σ_{i=1}^{500} (θ̂_i − θ)²,
where θ̂_i is the estimate of a given parameter θ from the ith simulated sample.
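A small helper implementing the two measures (our illustration; R.Bias is taken as the mean absolute relative deviation, consistent with the non-negative entries of Table 3):

```python
# Relative bias and MSE across Monte Carlo replications for one parameter;
# a small helper sketch, not the authors' code.
import numpy as np

def rbias_mse(estimates, theta):
    estimates = np.asarray(estimates)
    rbias = np.mean(np.abs((estimates - theta) / theta))
    mse = np.mean((estimates - theta) ** 2)
    return rbias, mse

rb, mse = rbias_mse([0.68, 0.72, 0.71, 0.69], 0.7)
print(rb, mse)
```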

Table 3 presents the results of the simulation. It is evident that both the bias and the MSE of the estimators decrease toward zero as the sample size increases, which suggests that the ML estimates obtained via the ECM algorithm are empirically consistent.

                      S1                                S2                                S3
Measure   Parameter   100    250    500    1000         100    250    500    1000         100    250    500    1000
R.Bias    π           0.237  0.034  0.024  0.017        0.211  0.097  0.046  0.026        0.030  0.020  0.022  0.001
          α1          0.450  0.372  0.284  0.108        0.445  0.397  0.308  0.134        0.296  0.176  0.098  0.038
          α2          0.182  0.134  0.110  0.094        0.195  0.114  0.054  0.009        0.084  0.066  0.062  0.007
          β1          0.098  0.089  0.069  0.043        0.984  0.569  0.327  0.209        0.247  0.209  0.150  0.095
          β2          0.456  0.369  0.301  0.198        0.365  0.259  0.186  0.097        0.080  0.025  0.008  0.002
          λ1          0.557  0.457  0.311  0.167        0.399  0.279  0.153  0.068        0.527  0.473  0.335  0.196
          λ2          0.209  0.094  0.045  0.013        0.188  0.078  0.026  0.009        0.219  0.208  0.189  0.163
MSE       π           0.050  0.027  0.012  0.005        0.059  0.034  0.017  0.008        0.058  0.043  0.031  0.024
          α1          0.053  0.034  0.026  0.015        0.444  0.219  0.134  0.086        0.256  0.185  0.099  0.010
          α2          0.156  0.089  0.055  0.019        0.201  0.152  0.091  0.032        0.113  0.066  0.043  0.009
          β1          0.009  0.009  0.008  0.006        1.628  1.079  0.793  0.258        0.062  0.040  0.015  0.001
          β2          0.331  0.218  0.161  0.083        0.242  0.173  0.109  0.045        0.049  0.023  0.014  0.008
          λ1          1.654  1.202  0.985  0.384        0.887  0.496  0.273  0.129        0.691  0.467  0.323  0.165
          λ2          1.029  0.504  0.321  0.176        0.835  0.638  0.356  0.192        0.612  0.412  0.361  0.136

EM, expectation and maximization; MSE, mean squared error.

Table 3

Bias and mean squared errors for EM estimates of simulated data.

6. CONCLUSIONS

In this paper, we have introduced a new finite mixture model based on a new extension of the BS distribution, called the Mix-SLBS. We have presented a convenient hierarchical representation and developed an ECM algorithm to obtain the ML estimates of the parameters. Numerical results suggest that the proposed Mix-SLBS distribution is well suited to the experimental data and can be more robust and flexible against outliers than the three mixture competitors proposed by Balakrishnan et al. [19].

Some directions of the current work deserve further attention. For instance, linear and nonlinear regression models with errors following the SLBS distribution can be developed using Theorem 2.2 [33,34]. As another extension, it is worthwhile to generalize the BS distribution further by considering a GIG mixing distribution instead of Exp(0.5). It would also be interesting to introduce a multivariate SLBS distribution and to develop finite mixtures of multivariate SLBS distributions [35].

CONFLICT OF INTEREST

No potential conflict of interest was reported by the authors.

AUTHORS' CONTRIBUTIONS

M. Naderi developed main methodological parts of the paper whereas M. Mozafari and K. Okhli implemented the methodology in the R program and applied them to the real and simulated data examples.

ACKNOWLEDGMENTS

We are grateful to the Editor-in-Chief and the anonymous referees for their comments, which greatly improved this work. M. Naderi also appreciates the support of the National Research Foundation, South Africa (Reference: CPRR160403161466, Grant No. 105840; SARChI Research Chair UID: 71199; and STATOMET).

REFERENCES

8.N. Balakrishnan, V. Leiva, A. Sanhueza, and F. Vilca, Sort, Vol. 33, 2009, pp. 171-192.
13.A. McNeil, R. Frey, and P. Embrechts, Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University Press, New Jersey, 2005.
17.G. McLachlan and D. Peel, Finite Mixture Models, John Wiley and Sons, New York, NY, USA, 2004.
26.B.G. Lindsay, in NSF-CBMS Regional Conference Series in Probability and Statistics (Alexandria, Virginia), 1995, pp. i-163.
31.L. Benites, R. Maehara, F. Vilca, and F. Marmolejo-Ramos, Finite mixture of Birnbaum-Saunders distributions using the k-bumps algorithm, 2017. arXiv preprint arXiv:1708.00476
