The Extended Dagum Distribution: Properties and Application

Abstract: We study a new five-parameter model called the extended Dagum distribution. The proposed model contains as special cases the log-logistic and Burr III distributions, among others. We derive the moments, generating and quantile functions, mean deviations and Bonferroni, Lorenz and Zenga curves. We obtain the density function of the order statistics. The parameters are estimated by the method of maximum likelihood. The observed information matrix is determined. An application to real data illustrates the importance of the new model.


Introduction
The Dagum model pioneered by Camilo Dagum (1977, 1980) has been widely used in studies of income and wealth distributions. Its structural properties have been extensively investigated by several authors. For an excellent survey on the genesis and on empirical applications, see Kleiber and Kotz (2003) and Kleiber (2008). If a random variable Z has the Dagum distribution with positive parameters β, λ and δ, its cumulative distribution function (cdf) and probability density function (pdf) are given by (for z > 0)

$$G(z; \theta) = (1 + \lambda z^{-\delta})^{-\beta} \qquad (1)$$

and

$$g(z; \theta) = \beta \lambda \delta\, z^{-\delta-1} (1 + \lambda z^{-\delta})^{-\beta-1}, \qquad (2)$$

respectively, where $\theta = (\beta, \lambda, \delta)$. The parameter λ is a scale parameter, whereas β and δ are shape parameters. Henceforth, the Dagum distribution with parameters β, λ and δ will be denoted by Da(β, λ, δ). It has positive asymmetry and it is unimodal for βδ > 1 and zero-modal for βδ ≤ 1. The qth quantile and the rth ordinary moment of Z are given by

$$Q(q) = \lambda^{1/\delta} (q^{-1/\beta} - 1)^{-1/\delta}$$

and

$$E(Z^r) = \beta \lambda^{r/\delta} B(\beta + r/\delta,\, 1 - r/\delta), \qquad (3)$$

respectively, for r < δ, where B(·, ·) is the beta function.
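The closed forms above are easy to check numerically. The following Python sketch evaluates the Dagum cdf (1), pdf (2), quantile function and rth moment (3). The function names and the parameter values are illustrative assumptions and do not come from the paper; only the standard library is used.

```python
import math

def dagum_cdf(z, beta, lam, delta):
    """Dagum cdf (1): G(z) = (1 + lam * z^-delta)^-beta for z > 0."""
    return (1.0 + lam * z ** (-delta)) ** (-beta)

def dagum_pdf(z, beta, lam, delta):
    """Dagum pdf (2), the derivative of the cdf above."""
    return (beta * lam * delta * z ** (-delta - 1.0)
            * (1.0 + lam * z ** (-delta)) ** (-beta - 1.0))

def dagum_quantile(q, beta, lam, delta):
    """q-th quantile, obtained by solving G(z) = q for z (0 < q < 1)."""
    return lam ** (1.0 / delta) * (q ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)

def beta_fn(p, q):
    """Complete beta function B(p, q) written with gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def dagum_moment(r, beta, lam, delta):
    """r-th ordinary moment (3); it exists only for r < delta."""
    if r >= delta:
        raise ValueError("the r-th moment requires r < delta")
    return beta * lam ** (r / delta) * beta_fn(beta + r / delta, 1.0 - r / delta)

if __name__ == "__main__":
    beta, lam, delta = 2.0, 1.5, 3.0  # illustrative values only
    x = dagum_quantile(0.3, beta, lam, delta)
    # The cdf evaluated at the 0.3 quantile should be close to 0.3.
    print(x, dagum_cdf(x, beta, lam, delta))
    print(dagum_moment(1, beta, lam, delta))
```

For β = λ = 1 and δ = 2 the first moment reduces to B(3/2, 1/2) = π/2, which gives a quick sanity check of the moment formula.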
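Recently, several authors have studied the Dagum distribution from different perspectives; see, for example, Domma (2007). For a continuous baseline cdf G(x), the exponentiated generalized (EG) class of distributions of Cordeiro et al. (2013) is defined by the cdf

$$F(x) = [1 - \{1 - G(x)\}^a]^b, \qquad (4)$$

where a > 0 and b > 0 are two extra shape parameters whose role is to add skewness and to vary tail weights. Because of its tractable cdf (4), the EG class can be used quite effectively even if the data are censored. The pdf corresponding to (4) is given by

$$f(x) = a\, b\, g(x)\, \{1 - G(x)\}^{a-1} [1 - \{1 - G(x)\}^a]^{b-1},$$

where g(x) = dG(x)/dx is the baseline pdf. The baseline distribution G(x) is a basic exemplar of (4) when a = b = 1. Setting a = 1 gives the exponentiated-G (exp-G) distribution. The case b = 1 corresponds to the Lehmann type II distribution. So, the cdf (4) generalizes both the exponentiated and Lehmann type II distributions.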
In this paper, we propose a new lifetime model, named the extended Dagum (EDa) distribution, with cdf obtained from equation (4) by taking G(x) to be the cdf of the Dagum(β, λ, δ) distribution. We obtain some mathematical properties of the new distribution and discuss maximum likelihood estimation of its parameters. The rest of the paper is outlined as follows. In Section 2, we introduce the EDa distribution and provide a mixture representation for its density function. General expressions for the ordinary and incomplete moments are given in Section 3. In Section 4, we derive the moment generating function (mgf). In Section 5, we provide an expression for the quantile function (qf). The Rényi entropy is determined in Section 6. The mean deviations are calculated in Section 7. The Bonferroni, Lorenz and Zenga curves are obtained in Section 8. The density of the order statistics is determined in Section 9. The maximum likelihood estimation is addressed in Section 10. An application to real data is discussed in Section 11. Finally, concluding remarks are given in Section 12.

The Extended Dagum Distribution
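Replacing (1) in equation (4), we obtain the EDa distribution defined by the cdf and pdf (for x > 0)

$$F(x; \xi) = [1 - \{1 - (1 + \lambda x^{-\delta})^{-\beta}\}^a]^b \qquad (5)$$

and

$$f(x; \xi) = a\, b\, \beta \lambda \delta\, x^{-\delta-1} (1 + \lambda x^{-\delta})^{-\beta-1} \{1 - (1 + \lambda x^{-\delta})^{-\beta}\}^{a-1} [1 - \{1 - (1 + \lambda x^{-\delta})^{-\beta}\}^a]^{b-1}, \qquad (6)$$

respectively, where $\xi = (a, b, \beta, \lambda, \delta)$ is the parameter vector. Henceforth, a random variable X having the EDa pdf (6) with parameter vector ξ is denoted by X ∼ EDa(a, b, β, λ, δ). The hazard rate function (hrf) and reverse hazard function of X are given, respectively, by

$$h(x; \xi) = \frac{f(x; \xi)}{1 - F(x; \xi)} \quad \text{and} \quad r(x; \xi) = \frac{f(x; \xi)}{F(x; \xi)}.$$

Plots of the EDa pdf and hrf are displayed in Figures 1 and 2, respectively.

The pdf (6) includes several distributions as special models. In fact, the Dagum distribution (with parameters β, λ and δ) is clearly a basic exemplar for a = b = 1. The extended Fisk (EFisk) and extended Burr III (EBuIII) distributions are new models which arise for β = 1 and λ = 1, respectively. The case b = 1 corresponds to the Lehmann type II Dagum distribution.

For ρ > 0 real non-integer and |z| < 1, we have

$$(1 - z)^{\rho - 1} = \sum_{j=0}^{\infty} (-1)^j \binom{\rho - 1}{j} z^j. \qquad (7)$$

Expanding the binomials in (6) as in (7), we obtain f(x; ξ) as a mixture of Dagum densities

$$f(x; \xi) = \sum_{j=0}^{\infty} w_{j+1}\, g(x; (j+1)\beta, \lambda, \delta), \qquad (8)$$

where

$$w_{j+1} = \frac{a b}{j+1} \sum_{i=0}^{\infty} (-1)^{i+j} \binom{b-1}{i} \binom{a(i+1)-1}{j} \qquad (9)$$

and g(x; (j+1)β, λ, δ) denotes the Da((j+1)β, λ, δ) density. The EDa random variable can be generated by inverting (5), that is,

$$X = \lambda^{1/\delta} \Big\{ \big[1 - (1 - U^{1/b})^{1/a}\big]^{-1/\beta} - 1 \Big\}^{-1/\delta},$$

where U is a uniform random variable on (0, 1).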
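The mixture representation can be illustrated numerically. The sketch below codes the EDa cdf and pdf directly from the EG construction and compares the pdf with a truncated mixture of Dagum densities. The weight formula used here is the one obtained by expanding the two binomials of the EG pdf and is stated as an assumption of the sketch; all function names and truncation limits are hypothetical. The comparison is only exercised for integer a and b, where every sum is finite; for non-integer values the truncated double series may converge slowly or not at all.

```python
def gbinom(x, k):
    """Generalized binomial coefficient C(x, k) for real x, integer k >= 0."""
    out = 1.0
    for m in range(k):
        out *= (x - m) / (m + 1)
    return out

def dagum_pdf(x, beta, lam, delta):
    return (beta * lam * delta * x ** (-delta - 1.0)
            * (1.0 + lam * x ** (-delta)) ** (-beta - 1.0))

def eda_cdf(x, a, b, beta, lam, delta):
    """EDa cdf: the EG cdf [1 - (1 - G)^a]^b with a Dagum baseline G."""
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    return (1.0 - (1.0 - G) ** a) ** b

def eda_pdf(x, a, b, beta, lam, delta):
    """EDa pdf: a*b*g*(1 - G)^(a-1) * [1 - (1 - G)^a]^(b-1)."""
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    g = dagum_pdf(x, beta, lam, delta)
    return a * b * g * (1.0 - G) ** (a - 1.0) * (1.0 - (1.0 - G) ** a) ** (b - 1.0)

def mixture_weight(j, a, b, n_inner=50):
    """Weight w_{j+1}, with the inner sum truncated at n_inner terms."""
    s = 0.0
    for i in range(n_inner):
        s += (-1.0) ** (i + j) * gbinom(b - 1.0, i) * gbinom(a * (i + 1) - 1.0, j)
    return a * b / (j + 1) * s

def eda_pdf_mixture(x, a, b, beta, lam, delta, n_terms=20):
    """Truncated mixture: sum_j w_{j+1} * Dagum pdf with shape (j+1)*beta."""
    return sum(mixture_weight(j, a, b) * dagum_pdf(x, (j + 1) * beta, lam, delta)
               for j in range(n_terms))

if __name__ == "__main__":
    params = (2.0, 2.0, 1.0, 1.5, 3.0)  # a, b, beta, lam, delta (illustrative)
    print([mixture_weight(j, 2.0, 2.0) for j in range(5)])
    print(eda_pdf(1.3, *params), eda_pdf_mixture(1.3, *params))
```

For a = b = 2 the cdf is 4G² − 4G³ + G⁴, so the only non-zero weights are those multiplying G², G³ and G⁴, which gives an exact reference for the weights.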

Ordinary and Incomplete Moments
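The rth ordinary moment of X for r < δ comes from equations (3) and (8) as

$$\mu'_r = E(X^r) = \beta \lambda^{r/\delta} \sum_{j=0}^{\infty} (j+1)\, w_{j+1}\, B\big((j+1)\beta + r/\delta,\, 1 - r/\delta\big). \qquad (12)$$

Further, the central moments ($\mu_r$) and cumulants ($\kappa_r$) of X are easily obtained from equation (12) by

$$\mu_r = \sum_{k=0}^{r} (-1)^k \binom{r}{k} \mu_1'^{\,k}\, \mu'_{r-k} \quad \text{and} \quad \kappa_r = \mu'_r - \sum_{k=1}^{r-1} \binom{r-1}{k-1} \kappa_k\, \mu'_{r-k},$$

respectively, where $\kappa_1 = \mu'_1$. The rth incomplete moment of X, $m_r(q) = \int_0^q x^r f(x; \xi)\,dx$, follows from (8) as

$$m_r(q) = \beta \lambda^{r/\delta} \sum_{j=0}^{\infty} (j+1)\, w_{j+1}\, B\big(z_q;\, (j+1)\beta + r/\delta,\, 1 - r/\delta\big), \qquad (13)$$

where $z_q = (1 + \lambda q^{-\delta})^{-1}$ and $B(z; c, d) = \int_0^z t^{c-1} (1 - t)^{d-1}\,dt$ is the incomplete beta function.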
This result can be useful in the study of the income inequality measures.
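As a hedged illustration of the moment expansion, the following sketch sums the Dagum moments of the mixture components and compares the result with a crude midpoint quadrature of Q(u)^r over (0, 1). The weight formula, the function names and the truncation limits are assumptions of this sketch. As before, the series is only exercised for integer a and b, where it terminates.

```python
import math

def gbinom(x, k):
    """Generalized binomial coefficient C(x, k) for real x, integer k >= 0."""
    out = 1.0
    for m in range(k):
        out *= (x - m) / (m + 1)
    return out

def mixture_weight(j, a, b, n_inner=50):
    """Mixture weight w_{j+1} (inner sum truncated at n_inner terms)."""
    s = sum((-1.0) ** (i + j) * gbinom(b - 1.0, i) * gbinom(a * (i + 1) - 1.0, j)
            for i in range(n_inner))
    return a * b / (j + 1) * s

def beta_fn(p, q):
    """Complete beta function, computed on the log scale to avoid overflow."""
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))

def eda_moment(r, a, b, beta, lam, delta, n_terms=50):
    """r-th ordinary moment as a weighted sum of Dagum moments (needs r < delta)."""
    if r >= delta:
        raise ValueError("the expansion requires r < delta")
    total = 0.0
    for j in range(n_terms):
        w = mixture_weight(j, a, b)
        if w != 0.0:
            total += (j + 1) * w * beta_fn((j + 1) * beta + r / delta, 1.0 - r / delta)
    return beta * lam ** (r / delta) * total

def eda_quantile(u, a, b, beta, lam, delta):
    """Quantile function obtained by inverting the EDa cdf."""
    G = 1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return lam ** (1.0 / delta) * (G ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)

def eda_moment_quadrature(r, a, b, beta, lam, delta, n=100000):
    """Independent check: E(X^r) = integral of Q(u)^r over (0, 1), midpoint rule."""
    h = 1.0 / n
    return h * sum(eda_quantile((k + 0.5) * h, a, b, beta, lam, delta) ** r
                   for k in range(n))

def eda_variance(a, b, beta, lam, delta):
    """Second central moment from the first two ordinary moments (needs delta > 2)."""
    m1 = eda_moment(1, a, b, beta, lam, delta)
    return eda_moment(2, a, b, beta, lam, delta) - m1 ** 2

if __name__ == "__main__":
    params = (2.0, 2.0, 1.0, 1.0, 4.0)  # a, b, beta, lam, delta (illustrative)
    print(eda_moment(1, *params), eda_moment_quadrature(1, *params))
    print(eda_variance(*params))
```

The quadrature is deliberately simple and loses accuracy when the upper tail is heavy, so the two numbers should agree only to a few decimal places.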

Generating Function
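Here, we derive an explicit expression for the mgf M(t; ξ) of X. First, we have from (2) the mgf of the Da(β, λ, δ) distribution

$$M_Z(t; \theta) = \beta \lambda \delta \int_0^{\infty} e^{tz}\, z^{-\delta-1} (1 + \lambda z^{-\delta})^{-\beta-1}\,dz.$$

Using the power series in equation (14) for k > 0 and the integral $\int_0^{\infty} x^{p-1} e^{-x}\,dx = \Gamma(p)$ (Prudnikov et al., 1986), we obtain an explicit expression for $M_Z(t; \theta)$ (for t < 0). Combining (8) and the last result, the mgf of X follows as

$$M(t; \xi) = \sum_{j=0}^{\infty} w_{j+1}\, M_{j+1}(t),$$

where $M_{j+1}(t)$ denotes the mgf of the Da((j+1)β, λ, δ) distribution and $w_{j+1}$ is given by (9).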

Quantile Function
By inverting equation (5), the qf of X is given by

$$Q(u) = \lambda^{1/\delta} \Big\{ \big[1 - (1 - u^{1/b})^{1/a}\big]^{-1/\beta} - 1 \Big\}^{-1/\delta}, \quad 0 < u < 1. \qquad (16)$$

The effect of the shape parameters a and b on the skewness and kurtosis of the new distribution can be based on quantile measures. The shortcomings of the classical skewness and kurtosis measures are well-known. One of the earliest skewness measures to be suggested is the Bowley skewness (Kenney and Keeping, 1962), defined by

$$B = \frac{Q(3/4) + Q(1/4) - 2\,Q(1/2)}{Q(3/4) - Q(1/4)},$$

whereas the Moors kurtosis, which is based on octiles, is

$$M = \frac{Q(3/8) - Q(1/8) + Q(7/8) - Q(5/8)}{Q(6/8) - Q(2/8)}.$$

These measures are less sensitive to outliers and they exist even for distributions without moments.
In Figures 3 and 4, we plot the measures B and M for the EDa distribution as functions of a and b, respectively, for fixed values of the other parameters. These plots indicate that the skewness and kurtosis curves of the new distribution are highly flexible.
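The quantile-based measures are simple to compute once the qf is available. The sketch below inverts the EDa cdf, evaluates the Bowley and Moors measures, and draws pseudo-random values by the inverse transform method. Function names, parameter values and the seed are illustrative assumptions.

```python
import random

def eda_cdf(x, a, b, beta, lam, delta):
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    return (1.0 - (1.0 - G) ** a) ** b

def eda_quantile(u, a, b, beta, lam, delta):
    """Invert F(x) = u in two steps: first for the Dagum cdf value G, then for x."""
    G = 1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return lam ** (1.0 / delta) * (G ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)

def bowley_skewness(a, b, beta, lam, delta):
    """Quartile-based skewness B."""
    Q = lambda u: eda_quantile(u, a, b, beta, lam, delta)
    return (Q(0.75) + Q(0.25) - 2.0 * Q(0.5)) / (Q(0.75) - Q(0.25))

def moors_kurtosis(a, b, beta, lam, delta):
    """Octile-based kurtosis M."""
    Q = lambda u: eda_quantile(u, a, b, beta, lam, delta)
    return (Q(3 / 8) - Q(1 / 8) + Q(7 / 8) - Q(5 / 8)) / (Q(6 / 8) - Q(2 / 8))

def eda_sample(n, a, b, beta, lam, delta, seed=None):
    """Inverse transform sampling: X = Q(U) with U uniform on (0, 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # Q(0) is not defined, so redraw in that rare case
            u = rng.random()
        out.append(eda_quantile(u, a, b, beta, lam, delta))
    return out

if __name__ == "__main__":
    params = (2.0, 3.0, 1.5, 1.0, 4.0)  # a, b, beta, lam, delta (illustrative)
    print(bowley_skewness(*params), moors_kurtosis(*params))
    xs = eda_sample(5, *params, seed=1)
    print(xs)
```

Evaluating the two measures on a grid of a or b values, with the other parameters held fixed, gives curves of the kind described for Figures 3 and 4.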

Rényi Entropy
Entropy has been used in various situations in science and engineering. Numerous entropy measures have been proposed and studied in the literature. The entropy of a random variable X with pdf f(x) is a measure of variation of the uncertainty. The Rényi entropy is defined by

$$I_R(\gamma) = \frac{1}{1 - \gamma} \log\left( \int_0^{\infty} f^{\gamma}(x)\,dx \right),$$

where γ > 0 and γ ≠ 1. For further details, see Song (2001).
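When a closed form is not at hand, the Rényi entropy can be approximated numerically. The sketch below uses the substitution u = F(x), which turns the integral of f^γ into the integral of f(Q(u))^(γ−1) over (0, 1), and applies a midpoint rule. This numerical route, the function names and the grid size are assumptions of the sketch and not the derivation used in the paper. The rule works best for γ > 1; for γ < 1 the integrand is unbounded near the end points.

```python
import math

def eda_pdf(x, a, b, beta, lam, delta):
    t = 1.0 + lam * x ** (-delta)
    G = t ** (-beta)
    g = beta * lam * delta * x ** (-delta - 1.0) * t ** (-beta - 1.0)
    return a * b * g * (1.0 - G) ** (a - 1.0) * (1.0 - (1.0 - G) ** a) ** (b - 1.0)

def eda_quantile(u, a, b, beta, lam, delta):
    G = 1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return lam ** (1.0 / delta) * (G ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)

def renyi_entropy(gamma, a, b, beta, lam, delta, n=20000):
    """Renyi entropy of order gamma by midpoint quadrature on the u = F(x) scale."""
    if gamma <= 0.0 or gamma == 1.0:
        raise ValueError("gamma must be positive and different from 1")
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = eda_quantile((k + 0.5) * h, a, b, beta, lam, delta)
        # integral of f^gamma dx equals integral of f(Q(u))^(gamma - 1) du
        total += eda_pdf(x, a, b, beta, lam, delta) ** (gamma - 1.0)
    return math.log(h * total) / (1.0 - gamma)

if __name__ == "__main__":
    # Illustrative check: a = b = beta = lam = delta = 1 gives f(x) = (1 + x)^-2.
    print(renyi_entropy(2.0, 1.0, 1.0, 1.0, 1.0, 1.0), math.log(3.0))
```

In the check case the integral of f² over (0, ∞) equals 1/3, so the entropy of order 2 is log 3.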

Mean Deviation
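The amount of scatter in a population is evidently measured to some extent by the totality of deviations from the mean and median. These are known as the mean deviations about the mean and about the median, defined by

$$\delta_1(X) = \int_0^{\infty} |x - \mu'_1|\, f(x; \xi)\,dx \quad \text{and} \quad \delta_2(X) = \int_0^{\infty} |x - M|\, f(x; \xi)\,dx,$$

respectively, where $\mu'_1 = E(X)$ and M = Median(X) denotes the median. From equation (16), M = Q(1/2). These measures can be determined using the following relationships

$$\delta_1(X) = 2 \mu'_1 F(\mu'_1) - 2\, m_1(\mu'_1) \quad \text{and} \quad \delta_2(X) = \mu'_1 - 2\, m_1(M),$$

where $F(\mu'_1)$ is calculated from (5) and $m_1(q)$ is the first incomplete moment of X obtained from (13) with r = 1.
Finally, explicit expressions for $\delta_1(X)$ and $\delta_2(X)$ (for δ > 1) follow by inserting (5) and (13) into these relationships.

Bonferroni, Lorenz and Zenga Curves
We can write the Bonferroni, Lorenz and Zenga curves of X from (12) and (13) (for δ > 1). For a given probability p, with q = Q(p), the Bonferroni and Lorenz curves are

$$B(p) = \frac{m_1(q)}{p\, \mu'_1} \quad \text{and} \quad L(p) = \frac{m_1(q)}{\mu'_1},$$

respectively, and the Zenga curve is likewise a function of $\mu'_1$ and $m_1(\cdot)$.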
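The quantities in this section all reduce to the mean and the first incomplete moment. The following sketch evaluates them numerically through the identity m_1(Q(p)) = ∫ Q(u) du over (0, p) with a midpoint rule, and then forms the two mean deviations and the Lorenz and Bonferroni curves in their standard textbook form. The quadrature route, the function names and the grid size are assumptions of the sketch; δ > 1 is required for the mean to exist.

```python
def eda_cdf(x, a, b, beta, lam, delta):
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    return (1.0 - (1.0 - G) ** a) ** b

def eda_quantile(u, a, b, beta, lam, delta):
    G = 1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return lam ** (1.0 / delta) * (G ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)

def partial_mean(p, a, b, beta, lam, delta, n=50000):
    """Midpoint rule for the integral of Q(u) over (0, p), i.e. m_1(Q(p))."""
    h = p / n
    return h * sum(eda_quantile((k + 0.5) * h, a, b, beta, lam, delta)
                   for k in range(n))

def mean_deviations(a, b, beta, lam, delta):
    """Mean deviations about the mean and about the median."""
    mu = partial_mean(1.0, a, b, beta, lam, delta)
    F_mu = eda_cdf(mu, a, b, beta, lam, delta)
    # m_1(mu) is the integral of Q over (0, F(mu)); m_1(M) over (0, 1/2)
    d1 = 2.0 * mu * F_mu - 2.0 * partial_mean(F_mu, a, b, beta, lam, delta)
    d2 = mu - 2.0 * partial_mean(0.5, a, b, beta, lam, delta)
    return d1, d2

def lorenz(p, a, b, beta, lam, delta):
    """Lorenz curve L(p) = m_1(Q(p)) / mean."""
    mu = partial_mean(1.0, a, b, beta, lam, delta)
    return partial_mean(p, a, b, beta, lam, delta) / mu

def bonferroni(p, a, b, beta, lam, delta):
    """Bonferroni curve B(p) = L(p) / p."""
    return lorenz(p, a, b, beta, lam, delta) / p

if __name__ == "__main__":
    params = (2.0, 2.0, 1.0, 1.0, 4.0)  # a, b, beta, lam, delta (illustrative)
    print(mean_deviations(*params))
    print(lorenz(0.4, *params), bonferroni(0.4, *params))
```

For a = b = β = λ = 1 and δ = 2 the mean deviation about the median equals 1 exactly, which provides a reference value; the crude quadrature reproduces it only to about two decimal places because of the heavy upper tail.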

Order statistics
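Order statistics appear in many areas of statistics and play an important role in practice. The density $f_{i:n}(x)$ of the ith order statistic, for i = 1, ..., n, from independent and identically distributed random variables $X_1, \ldots, X_n$ is given by

$$f_{i:n}(x) = \frac{1}{B(i, n-i+1)}\, f(x)\, F(x)^{i-1} \{1 - F(x)\}^{n-i}. \qquad (19)$$

Substituting (5) and (6) in equation (19), we can write

$$f_{i:n}(x) = \frac{a\, b\, g(x; \theta)}{B(i, n-i+1)} \{1 - G(x; \theta)\}^{a-1} [1 - \{1 - G(x; \theta)\}^a]^{bi-1} \big(1 - [1 - \{1 - G(x; \theta)\}^a]^b\big)^{n-i},$$

where G(x; θ) and g(x; θ) are the Dagum cdf (1) and pdf (2). Applying the binomial expansion twice in the previous equation, we obtain $f_{i:n}(x)$ as a series in powers of G(x; θ) multiplied by g(x; θ), each term of which is proportional to a Dagum density. Then, the density of the EDa order statistics is a linear combination of Dagum densities. So, some mathematical properties of the EDa order statistics follow from the Dagum distribution.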
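A first step of this expansion can be verified numerically. Expanding only the factor {1 − F(x)}^(n−i) by the ordinary binomial theorem writes the order statistic density as a finite combination of EDa densities whose parameter b is replaced by b(i + k). The sketch below compares that finite sum with the direct formula; the function names and parameter values are illustrative assumptions.

```python
import math

def eda_cdf(x, a, b, beta, lam, delta):
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    return (1.0 - (1.0 - G) ** a) ** b

def eda_pdf(x, a, b, beta, lam, delta):
    t = 1.0 + lam * x ** (-delta)
    G = t ** (-beta)
    g = beta * lam * delta * x ** (-delta - 1.0) * t ** (-beta - 1.0)
    return a * b * g * (1.0 - G) ** (a - 1.0) * (1.0 - (1.0 - G) ** a) ** (b - 1.0)

def beta_fn(p, q):
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))

def order_stat_pdf(x, i, n, a, b, beta, lam, delta):
    """Density of the i-th order statistic: f * F^(i-1) * (1-F)^(n-i) / B(i, n-i+1)."""
    F = eda_cdf(x, a, b, beta, lam, delta)
    f = eda_pdf(x, a, b, beta, lam, delta)
    return f * F ** (i - 1) * (1.0 - F) ** (n - i) / beta_fn(i, n - i + 1)

def order_stat_pdf_expanded(x, i, n, a, b, beta, lam, delta):
    """Same density as a finite sum of EDa pdfs with b replaced by b * (i + k)."""
    total = 0.0
    for k in range(n - i + 1):
        # f * F^(i+k-1) is the EDa pdf with parameter b*(i+k), divided by (i+k)
        coef = (-1.0) ** k * math.comb(n - i, k) / (i + k)
        total += coef * eda_pdf(x, a, b * (i + k), beta, lam, delta)
    return total / beta_fn(i, n - i + 1)

if __name__ == "__main__":
    params = (1.5, 0.8, 2.0, 1.0, 3.0)  # a, b, beta, lam, delta (illustrative)
    print(order_stat_pdf(1.1, 2, 5, *params))
    print(order_stat_pdf_expanded(1.1, 2, 5, *params))
```

Each EDa density in the finite sum can in turn be expanded as a mixture of Dagum densities, which leads to the linear combination stated above.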
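
Maximum Likelihood Estimation
Let $x_1, \ldots, x_n$ be a random sample from the EDa distribution with parameter vector ξ. The MLE $\hat{\xi}$ maximizes the log-likelihood $\ell(\xi) = \sum_{i=1}^{n} \log f(x_i; \xi)$, with f(x; ξ) given by (6). The normal approximation of the MLE of ξ can be used for constructing approximate confidence intervals and for testing hypotheses on the parameters a, b, β, λ and δ. The elements of the observed information matrix $J(\xi) = -\partial^2 \ell(\xi) / \partial \xi\, \partial \xi^{\top}$ are given in the Appendix. We can construct approximate confidence regions for the parameters based on the multivariate normal $N_5(0, J(\hat{\xi})^{-1})$ distribution.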
Further, the likelihood ratio (LR) statistic can be used for comparing this distribution with some of its special models.We can compute the maximum values of the unrestricted and restricted log-likelihoods to obtain LR statistics for testing some sub-models of the EDa distribution.
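The pieces needed for these comparisons can be sketched as follows: the EDa log-likelihood written term by term, the LR statistic, and the AIC. The maximization itself is left to any numerical optimizer and is not shown. The function names and the small data set are hypothetical and serve only to exercise the code.

```python
import math

def eda_pdf(x, a, b, beta, lam, delta):
    t = 1.0 + lam * x ** (-delta)
    G = t ** (-beta)
    g = beta * lam * delta * x ** (-delta - 1.0) * t ** (-beta - 1.0)
    return a * b * g * (1.0 - G) ** (a - 1.0) * (1.0 - (1.0 - G) ** a) ** (b - 1.0)

def eda_loglik(data, a, b, beta, lam, delta):
    """Log-likelihood of an i.i.d. EDa sample, summing the log of each pdf factor."""
    n = len(data)
    ll = n * math.log(a * b * beta * lam * delta)
    for x in data:
        t = 1.0 + lam * x ** (-delta)
        G = t ** (-beta)
        ll += (-(delta + 1.0) * math.log(x)
               - (beta + 1.0) * math.log(t)
               + (a - 1.0) * math.log(1.0 - G)
               + (b - 1.0) * math.log(1.0 - (1.0 - G) ** a))
    return ll

def lr_statistic(loglik_full, loglik_restricted):
    """LR statistic from the maximized unrestricted and restricted log-likelihoods."""
    return 2.0 * (loglik_full - loglik_restricted)

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * maximized log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik

if __name__ == "__main__":
    data = [0.6, 0.9, 1.4, 2.2, 3.1]    # hypothetical observations
    theta = (1.5, 0.8, 2.0, 1.0, 3.0)   # a, b, beta, lam, delta (not fitted values)
    print(eda_loglik(data, *theta))
```

Under the usual regularity conditions, the LR statistic for a sub-model obtained by fixing k parameters is compared with a chi-square distribution with k degrees of freedom.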

Application
We use a real data set to compare the fits of the EDa distribution with three of its sub-models and with two other non-nested models, namely:

• the Kumaraswamy Burr XII (KwBuXII) distribution (Paranaíba et al., 2012). Its pdf (for x > 0) is given by

$$f(x) = a\, b\, c\, k\, s^{-c} x^{c-1} \big[1 + (x/s)^c\big]^{-k-1} \Big\{1 - \big[1 + (x/s)^c\big]^{-k}\Big\}^{a-1} \Big\{1 - \big(1 - \big[1 + (x/s)^c\big]^{-k}\big)^a\Big\}^{b-1},$$

where a, b, c, k and s are positive parameters;

• the beta Dagum (BDa) distribution (Domma and Condino, 2013). Its pdf (for x > 0) is given by

$$f(x) = \frac{\beta \lambda \delta\, x^{-\delta-1}}{B(a, b)} (1 + \lambda x^{-\delta})^{-\beta a - 1} \big[1 - (1 + \lambda x^{-\delta})^{-\beta}\big]^{b-1}.$$

The parameters are estimated by maximum likelihood as described in Section 10 using the R software. The data set was studied by Andrews and Herzberg (1985), Cooray and Ananda (2008) and, more recently, by Paranaíba et al. (2012). The n = 101 observations represent the stress-rupture lives of Kevlar 49/epoxy strands subjected to constant sustained pressure at the 90% stress level until all had failed, so that we have a complete data set with exact failure times.
Table 1 lists the MLEs (and their standard errors) of the parameters and the Akaike information criterion (AIC) for the fitted models. These results indicate that the EDa distribution has the lowest AIC value among all fitted models, and so it could be chosen as the best model. In order to assess whether the model is appropriate, Figures 5(a) and 5(b) display the histogram of the data with the fitted EDa, exp-Da and BDa densities, and the empirical and estimated survival functions, respectively. These plots indicate that the EDa distribution provides the best fit to these data. In addition, we use formal goodness-of-fit tests in order to verify which distribution fits the current data better. We consider the Cramér-von Mises (W*) and Anderson-Darling (A*) statistics (Chen and Balakrishnan, 1995). In general, the smaller the values of the statistics W* and A*, the better the fit to the data. The values of W* and A* for all models are listed in Table 2. Based on the values of these statistics, we conclude that the EDa model yields the best fit to the current data.

We demonstrate that the EDa density function can be expressed as a linear combination of Dagum densities. Based on this result, we derive some structural properties of the proposed distribution and provide the ordinary and incomplete moments, generating function and quantile function. We also obtain the Rényi entropy, mean deviations, Bonferroni, Lorenz and Zenga curves and the density of the ith order statistic. The estimation of the model parameters is approached by maximum likelihood and the observed information matrix is derived. The usefulness of the new distribution is illustrated by means of a real data set.
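The W* and A* statistics can be computed from the fitted cdf along the lines of Chen and Balakrishnan (1995): transform the ordered data through the fitted cdf, map to normal scores, standardize, map back to (0, 1) and evaluate the modified Cramér-von Mises and Anderson-Darling statistics. The sketch below is one reading of that procedure; the function names, the data and the parameter values are hypothetical and are not the Kevlar data or the fitted values of Table 1.

```python
import math
from statistics import NormalDist, mean, stdev

def eda_cdf(x, a, b, beta, lam, delta):
    G = (1.0 + lam * x ** (-delta)) ** (-beta)
    return (1.0 - (1.0 - G) ** a) ** b

def chen_balakrishnan(data, cdf):
    """Return (W*, A*) for a sample and a fitted cdf (a callable)."""
    n = len(data)
    std_normal = NormalDist()
    v = [cdf(x) for x in sorted(data)]           # fitted cdf at the ordered data
    y = [std_normal.inv_cdf(vi) for vi in v]     # normal scores
    y_bar, s_y = mean(y), stdev(y)               # stdev uses the n - 1 divisor
    u = [std_normal.cdf((yi - y_bar) / s_y) for yi in y]
    w2 = sum((ui - (2 * i - 1) / (2.0 * n)) ** 2
             for i, ui in enumerate(u, start=1)) + 1.0 / (12.0 * n)
    a2 = -n - sum((2 * i - 1) * math.log(ui)
                  + (2 * n + 1 - 2 * i) * math.log(1.0 - ui)
                  for i, ui in enumerate(u, start=1)) / n
    w_star = w2 * (1.0 + 0.5 / n)                # small-sample modifications
    a_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return w_star, a_star

if __name__ == "__main__":
    data = [0.4, 0.7, 0.9, 1.1, 1.3, 1.6, 2.0, 2.7]   # hypothetical sample
    fitted = lambda x: eda_cdf(x, 2.0, 2.0, 1.0, 1.0, 4.0)
    print(chen_balakrishnan(data, fitted))
```

Applying the same function to each fitted model, with its own estimated cdf, gives values that can be ranked as described above.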

Concluding remarks
Domma (2007) derived the asymptotic distribution of the maximum likelihood estimators (MLEs) of the parameters of the right-truncated Dagum distribution. Domma et al. (2009) determined the information matrix in doubly censored data. Domma et al. (2011a) calculated the observed information matrix in right censored samples and Domma et al. (2011b) discussed aspects of the maximum likelihood estimation for censored data. Shahzad and Asghar (2013) obtained the L-moments and TL-moments in closed form. Domma et al. (2011c) studied the Dagum distribution from the perspective of reliability, whereas Domma and Perri (2009) investigated some developments of the log-Dagum distribution. Oluyede and Rajasoorya (2013) proposed the Mc-Dagum distribution and studied some mathematical properties. Domma and Condino (2013) extended the Dagum distribution based on the beta generator pioneered by Eugene et al. (2002). Several methods for generating new classes of distributions, which extend well-known models and at the same time provide great flexibility in modeling real data, have been proposed in recent years. The use of new generators of continuous distributions from classical ones has become more common in the last ten years or so. Some examples are the beta-generated (Eugene et al., 2002), gamma-generated (Zografos and Balakrishnan, 2009) and generalized Kumaraswamy (Cordeiro and de Castro, 2011) classes of distributions. For a continuous baseline cdf G(x), Cordeiro et al. (2013) defined the exponentiated generalized ("EG" for short) class of distributions.

Figure 1: Plots of the EDa pdf.

Figure 3: Skewness and kurtosis of X as a function of a for some values of b.

Figure 4: Skewness and kurtosis of X as a function of b for some values of a.

Figure 5: (a) The estimated densities of the EDa, exp-Da and BDa models for stress data. (b) The estimated survival functions from the fitted EDa, exp-Da and BDa distributions and the empirical survival for stress data.

We propose a new five-parameter lifetime model referred to as the extended Dagum (EDa) distribution, which is based on the exponentiated generalized class of distributions recently introduced by Cordeiro et al. (2013). The new distribution contains as special models the extended Dagum, extended Burr III, extended log-logistic, log-logistic, Burr III and Dagum distributions.