The Kumaraswamy Marshall-Olkin Log-Logistic Distribution with Application
- Selen Cakmakyapan (selencakmakyapan@hacettepe.edu.tr), Department of Statistics, Hacettepe University, Ankara, Turkey; Gamze Ozel (gamzeozl@hacettepe.edu.tr), Department of Statistics, Hacettepe University, Ankara, Turkey; Yehia Mousa Hussein El Gebaly (yehia1958@hotmail.com), Department of Statistics, Mathematics and Insurance, Benha University, Egypt; G. G. Hamedani (gholamhoss.email@example.com), Department of Mathematics, Statistics and Computer Science, Marquette University, USA
- https://doi.org/10.2991/jsta.2018.17.1.5
- Kumaraswamy-G, Maximum likelihood, Log-Logistic, Order statistic
In this paper, we define and study a new lifetime model called the Kumaraswamy Marshall-Olkin log-logistic distribution. The new model is capable of modeling various shapes of aging and failure criteria. It contains some well-known distributions as special cases, such as the Marshall-Olkin log-logistic, log-logistic, Lomax, Pareto type II and Burr XII distributions. Some of its mathematical properties, including explicit expressions for the quantile and generating functions, ordinary moments, skewness and kurtosis, are derived. The maximum likelihood estimators of the unknown parameters are obtained. The importance and flexibility of the new model are illustrated empirically using a real data set.
- Copyright © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).
There has been an increased interest in defining new generated classes of univariate continuous distributions by introducing additional shape parameter(s) to a baseline model. The extended distributions have attracted several statisticians to develop new models. The addition of parameters has proven useful in exploring skewness and tail properties, and also for improving the goodness-of-fit of the generated family. The well-known generators are the following: the Marshall-Olkin distribution family by Marshall and Olkin (1997), the beta-G by Eugene et al. (2002), the Kumaraswamy-G (Kw-G) by Cordeiro and de Castro (2011), the Logistic-G by Torabi and Montazari (2014), the transformed-transformer (T-X) by Alzaatreh et al. (2013), the odd exponentiated generalized by Cordeiro et al. (2013), the Weibull-G by Bourguignon et al. (2014), the Kumaraswamy Marshall-Olkin distribution family by Alizadeh et al. (2015), the transmuted geometric-G by Afify et al. (2016a) and the beta transmuted-H by Afify et al. (2017).
Marshall and Olkin (1997) proposed a flexible family of distributions and introduced an interesting method of adding a new parameter to an existing distribution. The resulting distribution includes the original distribution as a special case and gives more flexibility to model various types of data. For further information about the Marshall–Olkin family of distributions, see Barreto-Souza et al. (2013). The log-logistic (LL) distribution (known as the Fisk distribution in economics) has been widely used, particularly in survival and reliability analysis, over the last few decades. It is the probability distribution of a random variable whose logarithm has a logistic distribution, and it is an alternative to the log-normal distribution since it presents a failure rate function that increases initially and decreases later. The cumulative distribution function (cdf) and probability density function (pdf) of the LL distribution are given (for x > 0) by

$$G(x) = 1 - \left[1 + \left(\frac{x}{\alpha}\right)^{\gamma}\right]^{-1}$$

and

$$g(x) = \frac{\gamma}{\alpha}\left(\frac{x}{\alpha}\right)^{\gamma-1}\left[1 + \left(\frac{x}{\alpha}\right)^{\gamma}\right]^{-2},$$

respectively, where α > 0 is the scale parameter and γ > 0 is the shape parameter.
In search of a more flexible LL distribution, many authors have defined generalizations and modified forms of the LL distribution with different numbers of parameters. Examples include the Kumaraswamy log-logistic (de Santana et al., 2012), Marshall-Olkin LL (MOLL) (Gui, 2013), Lomax log-logistic (Cordeiro et al., 2014), McDonald log-logistic (Tahir et al., 2014), beta log-logistic (Lemonte, 2014), transmuted log-logistic (Granzotto and Louzada, 2015), Kumaraswamy transmuted log-logistic (Afify et al., 2016b) and generalized transmuted log-logistic (GTLL) (Nofal et al., 2017) distributions.
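Gui (2013) defined the cdf and pdf of the MOLL distribution (for x > 0) by

$$G_{MOLL}(x) = \frac{x^{\gamma}}{x^{\gamma} + \beta\alpha^{\gamma}}$$

and

$$g_{MOLL}(x) = \frac{\gamma\,\beta\,\alpha^{\gamma}\,x^{\gamma-1}}{\left(x^{\gamma} + \beta\alpha^{\gamma}\right)^{2}},$$

respectively, where α, γ, β > 0. For β = 1, we obtain the LL distribution.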
The goal of this paper is to define and study a new lifetime model called the Kumaraswamy Marshall-Olkin Log-Logistic (“KMOLL” for short) distribution. The main feature of this model is that two additional shape parameters are inserted in (2) to give more flexibility in the form of the generated distribution. Based on the Kumaraswamy-generalized (K-G) family proposed by Cordeiro and de Castro (2011), we construct the new five-parameter KMOLL distribution. We give some mathematical properties of the new distribution with the hope that it will attract wider applications in engineering, reliability, life testing and other areas of research. In the application of Section 5, the KMOLL distribution provides a better fit than the competing models considered.
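Let g(x) and G(x) denote the pdf and cdf of the baseline model. Cordeiro and de Castro (2011) defined the cdf of the K-G family by

$$F(x) = 1 - \left[1 - G(x)^{a}\right]^{b}. \tag{1.3}$$

The pdf corresponding to (1.3) is given by

$$f(x) = a\,b\,g(x)\,G(x)^{a-1}\left[1 - G(x)^{a}\right]^{b-1}, \tag{1.4}$$

where a > 0 and b > 0 are two extra shape parameters whose role is to govern skewness and tail weights. Clearly, for a = b = 1, we obtain the baseline distribution. Taking the MOLL distribution as the baseline in (1.3), the cdf of the KMOLL distribution (for x > 0) is

$$F(x) = 1 - \left\{1 - \left[\frac{x^{\gamma}}{x^{\gamma} + \beta\alpha^{\gamma}}\right]^{a}\right\}^{b}. \tag{1.5}$$

The pdf corresponding to (1.5) is

$$f(x) = a\,b\,\gamma\,\beta\,\alpha^{\gamma}\,x^{a\gamma-1}\left(x^{\gamma} + \beta\alpha^{\gamma}\right)^{-(a+1)}\left\{1 - \left[\frac{x^{\gamma}}{x^{\gamma} + \beta\alpha^{\gamma}}\right]^{a}\right\}^{b-1}, \tag{1.6}$$

where α is a scale parameter and the shape parameters a, b, γ and β govern the skewness of (1.6).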
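A random variable X with the pdf (1.6) is denoted by X ~ KMOLL(a,b,α,γ,β). The survival function, hazard rate function (hrf) and cumulative hazard rate function (chrf) of X are, respectively, given by

$$S(x) = \left\{1 - \left[\frac{x^{\gamma}}{x^{\gamma} + \beta\alpha^{\gamma}}\right]^{a}\right\}^{b},$$

$$h(x) = \frac{a\,b\,\gamma\,\beta\,\alpha^{\gamma}\,x^{a\gamma-1}\left(x^{\gamma} + \beta\alpha^{\gamma}\right)^{-(a+1)}}{1 - \left[x^{\gamma}/\left(x^{\gamma} + \beta\alpha^{\gamma}\right)\right]^{a}}$$

and

$$H(x) = -b\,\log\left\{1 - \left[\frac{x^{\gamma}}{x^{\gamma} + \beta\alpha^{\gamma}}\right]^{a}\right\}.$$

Some of the possible shapes of the pdf (1.6) for selected parameter values are illustrated in Figure 1. As seen from Figure 1, the density function can take various forms depending on the parameter values. The KMOLL distribution is much more flexible than the MOLL distribution, i.e. the additional shape parameters a and b allow for a high degree of flexibility. Both unimodal and monotonically decreasing shapes appear to be possible.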
Plots for the hrf of the KMOLL distribution for several parameter values are displayed in Figure 2. Figure 2 shows that the hrf of the KMOLL distribution can be bathtub, upside down bathtub (unimodal), increasing or decreasing. This attractive flexibility makes the hrf of the KMOLL useful and suitable for non-monotone empirical hazard behaviors which are more likely to be encountered or observed in real life situations.
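The curves in Figures 1 and 2 can be regenerated from the closed forms of the cdf, pdf and hrf. The following is a minimal Python sketch, assuming the parameter order (a, b, α, γ, β) and the MOLL baseline cdf x^γ/(x^γ + βα^γ); the function names are illustrative and do not come from the paper.

```python
import math


def moll_cdf(x, alpha, gamma, beta):
    """MOLL baseline cdf G(x) = x^gamma / (x^gamma + beta * alpha^gamma), x > 0."""
    xg = x ** gamma
    return xg / (xg + beta * alpha ** gamma)


def kmoll_cdf(x, a, b, alpha, gamma, beta):
    """KMOLL cdf F(x) = 1 - (1 - G(x)^a)^b, with G the MOLL cdf."""
    return 1.0 - (1.0 - moll_cdf(x, alpha, gamma, beta) ** a) ** b


def kmoll_pdf(x, a, b, alpha, gamma, beta):
    """KMOLL pdf f(x) = a b g(x) G(x)^(a-1) (1 - G(x)^a)^(b-1)."""
    c = beta * alpha ** gamma
    big_g = moll_cdf(x, alpha, gamma, beta)
    small_g = gamma * c * x ** (gamma - 1.0) / (x ** gamma + c) ** 2
    return a * b * small_g * big_g ** (a - 1.0) * (1.0 - big_g ** a) ** (b - 1.0)


def kmoll_hrf(x, a, b, alpha, gamma, beta):
    """KMOLL hazard rate h(x) = f(x) / S(x) = a b g(x) G(x)^(a-1) / (1 - G(x)^a)."""
    c = beta * alpha ** gamma
    big_g = moll_cdf(x, alpha, gamma, beta)
    small_g = gamma * c * x ** (gamma - 1.0) / (x ** gamma + c) ** 2
    return a * b * small_g * big_g ** (a - 1.0) / (1.0 - big_g ** a)
```

Evaluating `kmoll_hrf` on a grid of x values for a few parameter settings is enough to explore the hazard shapes by eye; the parameter values behind the published figures are not stated in the text, so any particular choice is an assumption.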
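We now state a useful expansion for the KMOLL density. Using the binomial expansion of the last factor in (1.6), the pdf of the KMOLL reduces to

$$f(x) = \sum_{k=0}^{\infty} w_k\, g_{MOLL}(x)\, G_{MOLL}(x)^{a(k+1)-1}, \tag{1.7}$$

where

$$w_k = a\,b\,(-1)^{k}\binom{b-1}{k},$$

and g_MOLL(.) and G_MOLL(.) denote the pdf and cdf of the MOLL distribution.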
The importance of the KMOLL distribution is that it contains several well-known distributions as special sub-models. Table 1 lists the special distributions related to the KMOLL distribution.
Table 1. Sub-models of the KMOLL distribution
The rest of the article is outlined as follows. In Section 2, we obtain the quantile function, shapes, skewness, kurtosis, moments, moment generating function, Rényi entropy, reliability function and order statistics of X. Certain characterizations are presented in Section 3. The maximum likelihood estimates (MLEs) of the model parameters are obtained in Section 4. An application to a real data set is considered in Section 5. Finally, Section 6 provides some concluding remarks.
2. The KMOLL Properties
In this section, we investigate mathematical properties of the KMOLL distribution including quantile function, skewness, kurtosis, shapes of functions, moments, the Rényi and Shannon entropies, reliability and order statistics.
2.1. Quantile function
Quantile functions are in widespread use in statistics and often find representations in terms of lookup tables for key percentiles. Let X ~ KMOLL(a,b,α,γ,β). The quantile function, say Q(u), is defined by inverting F(x) in (1.5) as

$$Q(u) = \alpha\left[\frac{\beta\,v(u)}{1 - v(u)}\right]^{1/\gamma}, \qquad v(u) = \left[1 - (1-u)^{1/b}\right]^{1/a}, \qquad 0 < u < 1.$$

The effect of the parameters a, b, α, γ and β on the skewness and kurtosis can be considered based on quantile measures. For many heavy-tailed distributions the classical moment-based kurtosis is infinite, so it becomes uninformative precisely when it is needed most. The Bowley skewness is based on quartiles:

$$S = \frac{Q(3/4) - 2Q(1/2) + Q(1/4)}{Q(3/4) - Q(1/4)},$$

and the Moors kurtosis is based on octiles:

$$K = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)},$$

where Q(.) represents the quantile function of X. These measures are less sensitive to outliers and they exist even for distributions without moments. Skewness measures the degree of the long tail and kurtosis is a measure of the degree of tail heaviness. When the distribution is symmetric, S = 0, and when the distribution is right (or left) skewed, S > 0 (or S < 0). As K increases, the tail of the distribution becomes heavier.
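Because the quantile function has a closed form, the Bowley and Moors measures need only a handful of evaluations. The sketch below is an illustration under the parameter order (a, b, α, γ, β); the function names are hypothetical. The same quantile function gives an inverse-transform sampler: applying it to uniform random numbers produces KMOLL variates.

```python
import random


def kmoll_quantile(u, a, b, alpha, gamma, beta):
    """Invert F(x) = 1 - (1 - G(x)^a)^b, where G(x) = x^gamma / (x^gamma + beta * alpha^gamma)."""
    v = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)  # value of the MOLL cdf at Q(u)
    return alpha * (beta * v / (1.0 - v)) ** (1.0 / gamma)


def bowley_skewness(q):
    """Quartile-based skewness; q is a function u -> Q(u)."""
    return (q(0.75) - 2.0 * q(0.5) + q(0.25)) / (q(0.75) - q(0.25))


def moors_kurtosis(q):
    """Octile-based kurtosis; q is a function u -> Q(u)."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))


def kmoll_sample(n, a, b, alpha, gamma, beta, seed=0):
    """Inverse-transform sampling: feed uniform draws through the quantile function."""
    rng = random.Random(seed)
    return [kmoll_quantile(rng.random(), a, b, alpha, gamma, beta) for _ in range(n)]


# Example with arbitrary (illustrative) parameter values.
params = (2.0, 1.5, 1.0, 3.0, 0.5)
S = bowley_skewness(lambda u: kmoll_quantile(u, *params))
K = moors_kurtosis(lambda u: kmoll_quantile(u, *params))
```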
2.2. Moments and moment generating function
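Some of the most important features and characteristics of a distribution can be studied through moments (e.g. tendency, dispersion, skewness and kurtosis). Now we obtain ordinary moments and the moment generating function of the KMOLL distribution. The ordinary moments E(X^n) = μ′n, n = 1,2,..., of the KMOLL distribution can be obtained, using (1.7), as

$$\mu'_n = \sum_{k=0}^{\infty} w_k \int_0^{\infty} x^{n}\, g_{MOLL}(x)\, G_{MOLL}(x)^{a(k+1)-1}\,dx, \tag{2.1}$$

where $w_k = a\,b\,(-1)^{k}\binom{b-1}{k}$. Here, g_MOLL(.) and G_MOLL(.) are the pdf and cdf of the MOLL distribution, respectively. Then, the integral part in (2.1) is defined as

$$\int_0^{1} Q_{MOLL}(u)^{n}\, u^{a(k+1)-1}\,du,$$

where $Q_{MOLL}(u) = \alpha\left[\beta u/(1-u)\right]^{1/\gamma}$ is the quantile function of the MOLL distribution for 0 < u < 1. Then, for n < γ, we obtain

$$\mu'_n = \alpha^{n}\beta^{n/\gamma}\sum_{k=0}^{\infty} w_k\, B\!\left(a(k+1) + \frac{n}{\gamma},\; 1 - \frac{n}{\gamma}\right),$$

where B(.,.) is the beta function.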
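Further, the central moments (μn) and cumulants (κn), n = 1, 2,..., of the KMOLL distribution can be obtained from

$$\mu_n = \sum_{k=0}^{n} (-1)^{k}\binom{n}{k}\,\mu_1'^{\,k}\,\mu'_{n-k} \qquad \text{and} \qquad \kappa_n = \mu'_n - \sum_{k=1}^{n-1}\binom{n-1}{k-1}\,\kappa_k\,\mu'_{n-k}.$$

Here, κ1 = µ′1, $\kappa_2 = \mu'_2 - \mu_1'^{2}$, $\kappa_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^{3}$, etc.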
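The beta-function series for the ordinary moments can be checked against direct numerical integration. The sketch below assumes that b is a positive integer, so that the binomial series terminates after b terms, and that n < γ, so that every beta function is finite; both restrictions are simplifications made here for illustration. The numerical check integrates Q_MOLL(t)^n against the Kumaraswamy(a, b) density on (0, 1) with a plain midpoint rule.

```python
import math


def beta_fn(p, q):
    """Beta function B(p, q) via gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def kmoll_moment_series(n, a, b, alpha, gamma, beta):
    """n-th ordinary moment from the finite sum (assumes integer b >= 1 and n < gamma)."""
    if not (isinstance(b, int) and b >= 1 and n < gamma):
        raise ValueError("this sketch needs integer b >= 1 and n < gamma")
    s = n / gamma
    total = 0.0
    for k in range(b):
        w_k = a * b * (-1) ** k * math.comb(b - 1, k)
        total += w_k * beta_fn(a * (k + 1) + s, 1.0 - s)
    return alpha ** n * beta ** s * total


def kmoll_moment_numeric(n, a, b, alpha, gamma, beta, steps=100_000):
    """Midpoint-rule value of the integral of Q_MOLL(t)^n * a*b*t^(a-1)*(1-t^a)^(b-1) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        q = alpha * (beta * t / (1.0 - t)) ** (1.0 / gamma)
        total += q ** n * a * b * t ** (a - 1.0) * (1.0 - t ** a) ** (b - 1.0)
    return total * h
```

For a = b = β = 1 the series collapses to the familiar log-logistic mean α(π/γ)/sin(π/γ), which gives a quick sanity check on the formula.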
The skewness and kurtosis are also computed from the second, third and fourth cumulants. Table 2 gives moments, skewness, and kurtosis of the KMOLL distribution for some parameter values.
|KMOLL(a, b, γ, α, β)||μ′1||μ′2||μ′3||μ′4||S||K|
Table 2. Moments, skewness and kurtosis of the KMOLL distribution for some parameter values
Table 2 indicates that the skewness can be positive, negative or close to zero. Hence, the KMOLL distribution can be right-skewed, left-skewed or approximately symmetric.
Figure 3 also depicts plots for the skewness and kurtosis coefficients related to the additional parameters. In the figure, a parameter decreases while the other parameters are kept fixed. These plots indicate that both measures can be very sensitive to these shape parameters, which indicates the importance of the proposed distribution.
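The moment generating function (mgf) is widely used as an alternative route to analytical results, compared with working directly with the pdf and cdf. The mgf of X is

$$M(t) = E\left(e^{tX}\right) = \int_0^{\infty} e^{tx} f(x)\,dx,$$

or another representation for M(t) can be obtained using (1.7):

$$M(t) = \sum_{k=0}^{\infty} w_k \int_0^{\infty} e^{tx}\, g_{MOLL}(x)\, G_{MOLL}(x)^{a(k+1)-1}\,dx, \tag{2.2}$$

where $w_k = a\,b\,(-1)^{k}\binom{b-1}{k}$. Then, the integral part in (2.2) is given as

$$\int_0^{1} \exp\{t\,Q_{MOLL}(u)\}\, u^{a(k+1)-1}\,du,$$

where Q_MOLL(.) is the quantile function of the MOLL distribution for 0 < u < 1. From the Maclaurin expansion, we obtain $\exp\{t\,Q_{MOLL}(u)\} = \sum_{m=0}^{\infty} t^{m}\,Q_{MOLL}(u)^{m}/m!$.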
Then we obtain
The pdf of the KMOLL model is decreasing or unimodal. In order to investigate the critical points of its density function, we consider its first derivative with respect to x and the equation df(x)/dx = 0, referred to as (2.4). There may be more than one root to (2.4). If x = x0 is a root of (2.4), then it corresponds to a local maximum if df(x)/dx > 0 for all x < x0 and df(x)/dx < 0 for all x > x0. It corresponds to a local minimum if df(x)/dx < 0 for all x < x0 and df(x)/dx > 0 for all x > x0. It corresponds to a point of inflexion if either df(x)/dx > 0 for all x ≠ x0 or df(x)/dx < 0 for all x ≠ x0.
The entropy of a random variable X with density function f(x) is a measure of variation of the uncertainty. Two popular entropy measures are the Rényi and Shannon entropies (Rényi, 1961; Shannon, 1951). Here, we derive expressions for the Rényi and the Shannon entropies of the KMOLL distribution. The Rényi entropy of a random variable with pdf f(x) is defined as

$$I_R(\delta) = \frac{1}{1-\delta}\,\log\int_0^{\infty} f(x)^{\delta}\,dx$$

for δ > 0 and δ ≠ 1. Then, we can write where and .
The integral part of (2.5) is Then the Rényi entropy of the KMOLL distribution is given by The Shannon entropy plays a similar role as the kurtosis measure in comparing the shapes of various densities and measuring heaviness of tails. The Shannon entropy of a random variable X is defined by E[−log f(X)]. It is the limiting case of the Rényi entropy as δ → 1. The Shannon entropy of the KMOLL distribution is To obtain the three expectation terms given above, we define and compute Here we have where 2F1 is the hypergeometric function defined by

$${}_2F_1(a, b; c; x) = \sum_{k=0}^{\infty}\frac{(a)_k\,(b)_k}{(c)_k}\,\frac{x^{k}}{k!}$$

and (a)k = a(a + 1)…(a + k − 1) denotes the ascending factorial.
Similarly, the following expectations are defined for (12) as and . Here, C is Euler’s constant (Nadarajah et al. 2012).
In the context of reliability, the stress-strength model describes the life of a component which has a random strength X1 that is subjected to a random stress X2. The component fails at the instant that the stress applied to it exceeds the strength, and the component will function satisfactorily whenever X1 > X2. Hence, R = Pr(X2 < X1) is a measure of component reliability. Here, we obtain the reliability R when X1 ~ KMOLL(a1,b1,α,γ,β) and X2 ~ KMOLL(a2,b2,α,γ,β) are independent random variables. Probabilities of this form have many applications, especially in engineering.
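Let fi and Fi denote the pdf and cdf of Xi for i = 1, 2. Then, the reliability function for the KMOLL distribution is given by

$$R = \int_0^{\infty} f_1(x)\,F_2(x)\,dx.$$

The cdf of X2 and the pdf of X1 follow from (1.5) and (1.6) with (a, b) replaced by (a2, b2) and (a1, b1), respectively. After some algebra, we arrive at where and .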
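Since X1 and X2 share α, γ and β, the substitution t = G_MOLL(x) removes those three parameters from the integral, and R depends only on (a1, b1, a2, b2). The sketch below evaluates the resulting integral over (0, 1) with a midpoint rule; it is a numerical illustration, not the closed-form expression of the paper, and the function name is hypothetical. The midpoint rule is adequate when a1 ≥ 1 and b1 ≥ 1; smaller values make the integrand singular at an endpoint and would call for a more careful quadrature.

```python
def kmoll_reliability(a1, b1, a2, b2, steps=100_000):
    """R = Pr(X2 < X1) for X1 ~ KMOLL(a1, b1, .) and X2 ~ KMOLL(a2, b2, .) with common alpha, gamma, beta.

    With t = G_MOLL(x), R is the integral over (0, 1) of
    [1 - (1 - t^a2)^b2] * a1 * b1 * t^(a1 - 1) * (1 - t^a1)^(b1 - 1).
    """
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        cdf2 = 1.0 - (1.0 - t ** a2) ** b2
        pdf1 = a1 * b1 * t ** (a1 - 1.0) * (1.0 - t ** a1) ** (b1 - 1.0)
        total += cdf2 * pdf1
    return total * h
```

When a1 = a2 the integral can be done by hand and gives R = b2/(b1 + b2), which is a convenient check: for example, `kmoll_reliability(2.0, 1.5, 2.0, 3.0)` should be close to 2/3.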
2.6. Order statistics
Order statistics make their appearance in many areas of statistical theory and practice. They enter the problems of estimation and hypothesis testing in a variety of ways. Therefore, we now discuss some properties of the order statistics for the proposed class of distributions. Let Xi:n denote the ith order statistic. Nadarajah et al. (2012) obtained general results for the Kumaraswamy-G distribution. We use their results about the pdf fi:n(x) of the ith order statistic. Then, we can give the pdf fi:n(x) for a random sample X1, X2,…,Xn from the KMOLL distribution. It is well-known that

$$f_{i:n}(x) = \frac{1}{B(i,\,n-i+1)}\,f(x)\,F(x)^{i-1}\left[1 - F(x)\right]^{n-i}$$

for i = 1, 2,..., n. Using the binomial expansion in the last equation, we obtain The pdf in (2.7) can also be defined as where and .
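The general order-statistic density can be evaluated directly from the KMOLL pdf and cdf without any series expansion. The following sketch is illustrative: `make_kmoll` and `order_stat_pdf` are hypothetical helper names, and the parameter order (a, b, α, γ, β) is an assumption.

```python
import math


def make_kmoll(a, b, alpha, gamma, beta):
    """Return (pdf, cdf) closures for KMOLL(a, b, alpha, gamma, beta)."""
    c = beta * alpha ** gamma

    def cdf(x):
        g = x ** gamma / (x ** gamma + c)
        return 1.0 - (1.0 - g ** a) ** b

    def pdf(x):
        g = x ** gamma / (x ** gamma + c)
        dg = gamma * c * x ** (gamma - 1.0) / (x ** gamma + c) ** 2
        return a * b * dg * g ** (a - 1.0) * (1.0 - g ** a) ** (b - 1.0)

    return pdf, cdf


def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n (1 <= i <= n)."""
    const = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    big_f = cdf(x)
    return const * pdf(x) * big_f ** (i - 1) * (1.0 - big_f) ** (n - i)
```

Setting i = 1 or i = n gives the densities of the sample minimum and maximum, whose cdfs 1 − [1 − F(x)]^n and F(x)^n provide an independent check.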
This section deals with various characterizations of the KMOLL distribution. These characterizations are based on: (i) the ratio of two truncated moments; (ii) the hazard function and (iii) certain functions of the random variable. It should be mentioned that for characterization (i) the cdf need not have a closed form. We present our characterizations (i) − (iii) in three subsections.
3.1. Characterizations based on ratio of two truncated moments
In this subsection, we present characterizations of the KMOLL distribution in terms of a simple relationship between two truncated moments. This characterization result employs a theorem due to Glänzel (1987); see Theorem 3.1 below. Note that the result holds also when the interval H is not closed. Moreover, as mentioned above, it can also be applied when the cdf F does not have a closed form. As shown in Glänzel (1990), this characterization is stable in the sense of weak convergence.
Theorem 3.1. Let (Ω, F, P) be a given probability space and let H = [d,e] be an interval for some d < e (d = −∞, e = ∞ might as well be allowed). Let X: Ω → H be a continuous random variable with the distribution function F and let g and h be two real functions defined on H such that

$$E\left[g(X) \mid X \geq x\right] = E\left[h(X) \mid X \geq x\right]\,\xi(x), \qquad x \in H,$$

is defined with some real function ξ. Assume that g, h ∈ C1(H), ξ ∈ C2(H) and F is a twice continuously differentiable and strictly monotone function on the set H. Finally, assume that the equation ξh = g has no real solution in the interior of H. Then, F is uniquely determined by the functions g, h and ξ, particularly

$$F(x) = \int_d^{x} C\left|\frac{\xi'(u)}{\xi(u)h(u) - g(u)}\right|\exp\left(-s(u)\right)du,$$

where the function s is a solution of the differential equation $s' = \xi' h/(\xi h - g)$ and C is the normalization constant, such that $\int_H dF = 1$.
Proposition 3.1. Let X : Ω → (0, ∞) be a continuous random variable and let and $g(x) = h(x)\left(x^{\gamma} + \beta\alpha^{\gamma}\right)^{-1}$ for x > 0. The random variable X has pdf (1.6) if and only if the function ξ defined in Theorem 3.1 has the form
Proof. Let X be a random variable with pdf (1.6). Then, and and finally Conversely, if ξ is given as above, then and hence Now, in view of Theorem 3.1, X has density (1.6).
Corollary 3.1. Let X : Ω → (0, ∞) be a continuous random variable and let h(x) be as in Proposition 3.1. The pdf of X is (1.6) if and only if there exist functions g and ξ defined in Theorem 3.1 satisfying the differential equation The general solution of the differential equation in Corollary 3.1 is where D is a constant. Note that a set of functions satisfying the above differential equation is given in Proposition 3.1 with D = 0. However, it should also be noted that there are other triplets (h,g,ξ) satisfying the conditions of Theorem 3.1.
For b = 1 (Mendoza et al., 2016), we let $h(x) = g(x)\left[x^{\beta} + \alpha^{\beta}\right]^{-1}$ with $g(x) = x^{-\beta(a-1)}$. Then for x > 0.
The differential equation and general solution in this case are, respectively, and
3.2. Characterization based on hazard function
It is known that the hazard function, hF, of a twice differentiable distribution function, F, satisfies the first order differential equation

$$\frac{f'(x)}{f(x)} = \frac{h_F'(x)}{h_F(x)} - h_F(x).$$

For many univariate continuous distributions, this is the only characterization available in terms of the hazard function. The following proposition establishes a characterization of the KMOLL distribution which is not of the above trivial form.
Proposition 3.2. Let X : Ω → (0, ∞) be a continuous random variable. The pdf of X is (1.6) if and only if its hazard function hF(x) satisfies the differential equation x > 0, with the initial condition hF(0) = 0 for aγ > 1.
Proof. If X has pdf (1.6), then clearly the above differential equation holds. Now, if it holds, then or which is the hazard function of the KMOLL distribution.
For a = b = 1 (special case of (1.4)), we have the following simple differential equation
3.3. Characterization based on certain functions of the random variable
The following propositions have already appeared in Hamedani (2013), so we just state them here; they can be used to characterize the KMOLL distribution.
Proposition 3.3. Let X : Ω → (d, e) be a continuous random variable with cdf F. Let ψ(x) be a differentiable function on (d, e) with limx→d+ ψ(x) = 1. Then, for δ ≠ 1, if and only if
It is easy to see that for certain functions, e.g., , and (d, e) = (0,∞).
Proposition 3.3 provides a characterization of the KMOLL distribution. Clearly there are other suitable functions ψ. We chose the above one for simplicity.
4. Maximum Likelihood Estimation
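Several approaches for parameter estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. Here we consider estimation of the unknown parameters of the KMOLL distribution by the method of maximum likelihood. Let x1, x2,..., xn be observed values from the KMOLL distribution with parameters a, b, γ, α and β. The log-likelihood function for (a,b,γ,α,β) is given by

$$\ell = n\log\left(a\,b\,\gamma\,\beta\,\alpha^{\gamma}\right) + (a\gamma - 1)\sum_{i=1}^{n}\log x_i - (a+1)\sum_{i=1}^{n}\log\left(x_i^{\gamma} + \beta\alpha^{\gamma}\right) + (b-1)\sum_{i=1}^{n}\log\left(1 - z_i^{a}\right),$$

where $z_i = x_i^{\gamma}/\left(x_i^{\gamma} + \beta\alpha^{\gamma}\right)$.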
The score functions are the derivatives of the log-likelihood function with respect to the parameters a, b, γ, α and β. The MLEs of (a, b, γ, α, β), say (â, b̂, γ̂, α̂, β̂), are the simultaneous solutions of the equations ∂ℓ/∂a = 0, ∂ℓ/∂b = 0, ∂ℓ/∂γ = 0, ∂ℓ/∂α = 0 and ∂ℓ/∂β = 0. Maximization of the likelihood function can be performed numerically, for example with the nlm or optim functions in the R statistical package.
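For readers working outside R, the log-likelihood can be coded directly from the KMOLL density and handed to any general-purpose optimizer. The sketch below is a Python illustration using only the standard library; the parameter order (a, b, α, γ, β) and the function names are assumptions made here. It also uses the fact that, for fixed (a, α, γ, β), the equation ∂ℓ/∂b = 0 can be solved in closed form, which reduces the numerical search to four parameters.

```python
import math


def kmoll_loglik(params, data):
    """KMOLL log-likelihood; params = (a, b, alpha, gamma, beta), data = positive observations."""
    a, b, alpha, gamma, beta = params
    if min(params) <= 0.0:
        return -math.inf  # outside the parameter space
    c = beta * alpha ** gamma
    ll = len(data) * math.log(a * b * gamma * c)
    for x in data:
        d = x ** gamma + c
        z = x ** gamma / d  # MOLL cdf at x
        ll += (a * gamma - 1.0) * math.log(x) - (a + 1.0) * math.log(d)
        ll += (b - 1.0) * math.log1p(-z ** a)
    return ll


def profile_b(a, alpha, gamma, beta, data):
    """Closed-form maximizer of the log-likelihood in b with the other parameters held fixed."""
    c = beta * alpha ** gamma
    s = sum(math.log1p(-((x ** gamma / (x ** gamma + c)) ** a)) for x in data)
    return -len(data) / s
```

A negative version of `kmoll_loglik` can be passed to a derivative-free minimizer; starting values matter for a five-parameter model, and trying several starting points is a sensible precaution.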
5. An Illustrative Application
In this section, we use a real data set to compare the fit of the KMOLL distribution with those of the MOLL, LL and Weibull Fréchet (WFr) (Afify et al., 2016c) distributions. We use a data set consisting of 63 observations of the strengths of 1.5 cm glass fibres (Smith and Naylor, 1987), originally obtained by workers at the UK National Physical Laboratory. Unfortunately, the measurement units are not given in their paper. We estimate the unknown parameters of the distributions by maximum likelihood. Then, we provide the values of the following statistics: Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC) and Bayesian Information Criterion (BIC).
In general, the smaller the values of these statistics, the better the fit to the data. Table 3 lists the MLEs of the parameters and the values of AIC, CAIC and BIC statistics.
|Distribution||Estimated Parameters (Standard Error)||AIC||CAIC||BIC|
|KMOLL(a, b, γ, α, β)||1.8355 (0.096), 0.0028 (0.002), 47.4236 (13.307), 0.0588 (0.030), 0.2786 (0.095)||28.0861||29.1387||38.8018|
|MOLL(γ, α, β)||2.3267 (1.289), 0.0353 (0.154), 7.9260 (0.873)||51.5799||51.9867||58.0093|
|LL(γ, α)||1.5262 (0.041), 7.9260 (0.873)||49.5799||49.7799||53.8662|
|WFr(α, β, a, b)||0.3865 (0.799), 0.2436 (0.285), 1.4762 (4.782), 16.8561 (20.485)||39.0||47.6||42.4|
Table 3. MLEs and the values of AIC, CAIC and BIC statistics
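The text does not spell out the formulas behind the three criteria. The standard definitions in the sketch below, with k the number of parameters, n = 63 the sample size and −2ℓ the minimized value of minus twice the log-likelihood, reproduce the tabulated KMOLL, MOLL and LL entries to four decimals, so they are presumably the ones used; the CAIC column then corresponds to the small-sample corrected form −2ℓ + 2kn/(n − k − 1). The −2ℓ inputs in the example are backed out from the reported AIC values rather than taken from the paper.

```python
import math


def information_criteria(neg2loglik, k, n):
    """Return (AIC, CAIC, BIC) for a model with k parameters fitted to n observations."""
    aic = neg2loglik + 2.0 * k
    caic = neg2loglik + 2.0 * k * n / (n - k - 1.0)
    bic = neg2loglik + k * math.log(n)
    return aic, caic, bic


# -2 * loglik backed out from the reported AIC: 28.0861 - 2 * 5 for KMOLL, 51.5799 - 2 * 3 for MOLL.
kmoll_ic = information_criteria(18.0861, k=5, n=63)
moll_ic = information_criteria(45.5799, k=3, n=63)
```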
Based on Table 3, the KMOLL distribution provides the best overall fit and could therefore be chosen as a more adequate model than the other models for explaining the data set. Table 4 gives the Cramér-von Mises (W) and Anderson-Darling (A) statistics for the three models, which are the KMOLL, MOLL and LL distributions. More information is provided by a histogram of the data given in Figure 4. Fitted lines in Figure 4 represent the KMOLL, MOLL, LL and WFr distributions. Figure 5 shows the empirical cdf and the fitted cdfs. Finally, we give Q-Q plots for all fitted models. The figures also reveal that the KMOLL distribution fits the data very well.
|Distribution||W||A|
|KMOLL(a, b, γ, α, β)||0.0181403||0.127219|
|MOLL(γ, α, β)||0.4969404||2.748973|
Table 4. Cramér-von Mises (W) and Anderson-Darling (A) statistics
In this paper, we introduce a five-parameter distribution called the Kumaraswamy Marshall-Olkin log-logistic (KMOLL) distribution. Interestingly, the proposed model has increasing, upside-down bathtub and bathtub shaped hazard rate functions. A study of the mathematical properties of the new distribution is presented. We obtain the moment generating function, ordinary moments, skewness, kurtosis, hazard and survival functions. The estimation of the model parameters is done via the maximum likelihood method. We also provide a numerical example of our findings. We hope that the proposed model may attract applications in areas such as survival analysis and customer lifetime duration.
Cite this article
Selen Cakmakyapan, Gamze Ozel, Yehia Mousa Hussein El Gebaly and G. G. Hamedani (2018). The Kumaraswamy Marshall-Olkin Log-Logistic Distribution with Application. Journal of Statistical Theory and Applications, 17(1), 59-76. ISSN 2214-1766. https://doi.org/10.2991/jsta.2018.17.1.5