The Kummer Beta Generalized Gamma Distribution

Abstract: A new six-parameter extension of the generalized gamma distribution, called the Kummer beta generalized gamma distribution, is introduced and studied. It contains at least 28 special models such as the beta generalized gamma, beta Weibull, beta exponential, generalized gamma, Weibull and gamma distributions and thus could be a better model for analyzing positively skewed data. The new density function can be expressed as a linear combination of generalized gamma densities. Various mathematical properties of the new distribution are derived, including explicit expressions for the ordinary and incomplete moments, generating function, mean deviations, entropy, density function of the order statistics and their moments. The elements of the observed information matrix are provided. We discuss the method of maximum likelihood and a Bayesian approach to fit the model parameters. The superiority of the new model is illustrated by means of three real data sets.


Introduction
The generalized gamma (GG) distribution (Stacy, 1962) is an important lifetime model since it includes as special models the exponential, Weibull, gamma and Rayleigh distributions, among others. It is suitable for modeling data with hazard rate function (hrf) of different forms (increasing, decreasing, bathtub and unimodal), and it is therefore useful for estimating individual hazard functions and both relative hazards and relative times (Cox, 2008). The GG distribution has been used in several research areas such as engineering, environment, hydrology and survival analysis. For example, Ortega et al. (2003) discussed influence diagnostics in GG regression models, Nadarajah and Gupta (2007) applied this distribution to drought data, Cox et al. (2007) presented a parametric survival analysis based on GG hazard functions and Cox (2008) discussed and compared the F-generalized family with the GG model. More recently, Barkauskas et al. (2009) modeled the noise part of a spectrum as an autoregressive moving average (ARMA) model with the innovations following the GG distribution, Malhotra et al. (2009) provided a unified analysis for wireless systems over generalized fading channels modeled by a two-parameter GG model, and Xie and Liu (2009) analyzed three-moment autoconversion parametrization based on this model. Further, Ortega et al. (2009) proposed a modified GG regression model to allow for the possibility that long-term survivors may be present in the data, and Cordeiro et al. (2011b) studied the exponentiated generalized gamma (EGG) distribution.
The class of distributions (4) includes two important special cases: the beta-generalized (BG) distributions when c = 0, and the exponentiated generalized (EG) distributions when c = 0 and b = 1. The BG distributions can be limited in one aspect: they have only two additional shape parameters and so can add only a limited structure to the generated distribution. For instance, a BG distribution may fail to capture the behavior of random variables with symmetric but highly leptokurtic distributions. While the beta parameters offer explicit control over skewness when the parent is symmetric, they have less control over higher moments such as kurtosis. Further, the EG distribution introduces only one extra shape parameter, whereas three may be required to control both tail weights and the distribution of weight in the center. Hence, the generated distribution (4) is a more flexible model since it has one more shape parameter than the classical beta generator.
In this paper, we study a new six-parameter model called the Kummer beta generalized gamma (KBGG) distribution, which contains at least 28 special models. The main motivation for this extension is that the new model is a highly flexible lifetime distribution which admits different degrees of kurtosis and asymmetry. The KBGG density function is defined from (4) by taking (1) and (2) as the baseline model. The six-parameter KBGG density function is given by equation (5), and the corresponding hrf by equation (6). Hereafter, we denote by X a random variable following (5), say X ∼ KBGG(a, b, c, α, β, k). This density has five shape parameters a, b, c, β and k, which allow for a high degree of flexibility. The parameter c controls tail weights to the extremes of the distribution. The study of the new distribution is important since it extends some distributions previously considered in the literature. In fact, the generalized gamma (GG) model is clearly a basic exemplar for a = b = 1 and c = 0, with a continuous crossover towards models with different shapes (e.g. a specified combination of skewness and kurtosis). The KBGG model contains as sub-models the beta generalized gamma (BGG) distribution (Cordeiro et al., 2013a) when c = 0 and the exponentiated generalized gamma (EGG) distribution (Cordeiro et al., 2011b) when b = 1 in addition to c = 0. Plots of the new density function for selected parameter values are displayed in Figure 1. It is evident that this density function is much more flexible than the GG density and hence can be used in many practical situations. In fact, it can be symmetric, asymmetric and also exhibit bimodality. We also provide a comprehensive description of some of its mathematical properties with the hope that it will attract wider applications in reliability, engineering, environment and other areas of research.
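As a rough numerical illustration of how (5) is built from (4), the following Python sketch evaluates a KBGG-type density. It is not taken from the paper. It assumes the usual Kummer beta generator, f(x) = K g(x) G(x)^(a−1) {1 − G(x)}^(b−1) exp{−c G(x)} with 1/K = B(a, b) 1F1(a; a + b; −c), and the GG baseline g(x) = β/(αΓ(k)) (x/α)^(βk−1) exp{−(x/α)^β}. All function names and the series truncation rules are hypothetical choices made for this example.

```python
import math

def reg_lower_gamma(k, u, max_terms=5000):
    """Incomplete gamma ratio gamma(k, u) / Gamma(k), via a non-alternating series."""
    if u <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    for n in range(1, max_terms):
        term *= u / (k + n)
        total += term
        if term < 1e-17 * total:
            break
    value = math.exp(k * math.log(u) - u - math.lgamma(k)) * total
    return min(value, 1.0)

def kummer_1f1(a, b, z, max_terms=500):
    """Confluent hypergeometric function 1F1(a; b; z) by its power series (moderate |z| only)."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
        if abs(term) < 1e-17 * abs(total):
            break
    return total

def gg_pdf(x, alpha, beta, k):
    """Assumed GG baseline density (Stacy form)."""
    if x <= 0.0:
        return 0.0
    u = (x / alpha) ** beta
    return beta / (alpha * math.gamma(k)) * (x / alpha) ** (beta * k - 1.0) * math.exp(-u)

def gg_cdf(x, alpha, beta, k):
    """Assumed GG baseline cdf, gamma_1[k, (x/alpha)^beta]."""
    if x <= 0.0:
        return 0.0
    return reg_lower_gamma(k, (x / alpha) ** beta)

def kbgg_pdf(x, a, b, c, alpha, beta, k):
    """KBGG-type density under the assumed Kummer beta generator."""
    G = gg_cdf(x, alpha, beta, k)
    if G <= 0.0 or G >= 1.0:
        return 0.0
    beta_fn = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    K = 1.0 / (beta_fn * kummer_1f1(a, a + b, -c))
    return K * gg_pdf(x, alpha, beta, k) * G ** (a - 1.0) * (1.0 - G) ** (b - 1.0) * math.exp(-c * G)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return s * h / 3.0

if __name__ == "__main__":
    # Sanity check: the density should integrate to approximately one.
    print(simpson(lambda x: kbgg_pdf(x, 2.0, 1.5, 0.5, 1.0, 2.0, 1.5), 0.0, 6.0))
```

Setting a = b = 1 and c = 0 in `kbgg_pdf` returns the GG baseline itself, which mirrors the sub-model statement in the text.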
The paper is outlined as follows. In Section 2, we derive more than 28 special distributions from the KBGG model. In Section 3, we demonstrate that the KBGG density function can be expressed as a linear combination of EGG density functions. This is an important result to provide some mathematical properties of the KBGG distribution. We obtain explicit expressions for the moments and generating function (Section 4), incomplete moments (Section 5), mean deviations and Rényi entropy (Section 6) and order statistics (Section 7). In Section 8, we discuss statistical inference by the maximum likelihood method and a Bayesian approach. Three applications given in Section 9 reveal the usefulness of the new distribution for analyzing real data. Concluding remarks are addressed in Section 10.

Special distributions
The following well-known distributions are special models of the KBGG distribution.

Exponentiated
where the coefficients are defined for r = 0, 1, .... Equation (7) reveals that the KB-G density function is a linear combination of EG densities. This result is important for deriving some mathematical properties of the KBGG distribution from those of the EGG distribution. The expansion holds for any real non-integer a, b and c.
The coefficients can also be written explicitly as functions of the quantities $a_m$ defined below. Further, combining equations (9) and (10), we obtain (12), where the coefficients are obtained from equation (11) with $a_m = (-1)^m/[(k+m)\,m!]$. Combining (8) and (12), and after some algebraic manipulation, we can rewrite the EGG density function as (13), where $k^\star = k(r+1)+m$ and $g_{\alpha,\beta,k^\star}(x)$ is the density function of the GG(α, β, k⋆) distribution. Combining (7) and (13), we obtain (14), whose coefficients are formed from those of (7) and (13). Equation (14) reveals that the KBGG density function can be expressed as a linear combination of GG densities. This equation is the main result of this section and plays an important role in this paper. In the next sections, based on this equation, we obtain some KBGG structural properties including explicit expressions for the ordinary and incomplete moments, generating function, mean deviations and order statistics.
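The expansions in this section rest on a power series for the incomplete gamma function ratio. A minimal Python sketch of that series, truncated at a finite number of terms, is given here. It assumes the standard series γ(k, u) = Σ_m (−1)^m u^(k+m) / [(k + m) m!]; the function name and the default truncation point are hypothetical.

```python
import math

def gamma1_series(k, u, n_terms=60):
    """Truncated power series for gamma_1(k, u) = gamma(k, u) / Gamma(k).

    The alternating series is only numerically reliable for moderate u,
    because of cancellation between large terms of opposite sign.
    """
    total = 0.0
    for m in range(n_terms):
        a_m = (-1.0) ** m / ((k + m) * math.factorial(m))
        total += a_m * u ** (k + m)
    return total / math.gamma(k)

if __name__ == "__main__":
    # For k = 1 the ratio has the closed form 1 - exp(-u).
    print(gamma1_series(1.0, 0.7), 1.0 - math.exp(-0.7))
```

For integer k the result can be compared with closed forms, which gives a quick check on how many terms are needed before using the series inside longer expansions.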

Moments and generating function
The sth moment of X can be expressed from (14) as a linear combination of GG moments, equation (15), where $Y_{k^\star}$ ∼ GG(α, β, k⋆). Equation (15) is an important result since it provides the moments of the KBGG distribution as a linear combination of GG moments. Replacing the sth moment of the GG distribution in (15), we obtain the sth moment of X as (16), whose coefficients are those defined in (14). Equation (16) is readily computed numerically using standard statistical software. It (and other expansions in this paper) can also be evaluated in symbolic computation software such as Mathematica and Maple. In numerical applications, a large natural number N can be used in the sums instead of infinity. Several mathematical quantities of X (central, incomplete and factorial moments, variance, skewness and kurtosis) can be derived from this result.
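The building block of (15) and (16) is the sth moment of a GG variable, and skewness and kurtosis then follow from the first four ordinary moments. The Python sketch below illustrates both steps. It assumes the Stacy parametrization of the GG density, for which the sth moment is α^s Γ(k + s/β)/Γ(k); the function names are hypothetical and the weights of the linear combination (14) are not reproduced.

```python
import math

def gg_moment(s, alpha, beta, k):
    """sth ordinary moment of GG(alpha, beta, k) under the assumed parametrization."""
    return alpha ** s * math.exp(math.lgamma(k + s / beta) - math.lgamma(k))

def skewness_kurtosis(m1, m2, m3, m4):
    """Skewness and kurtosis from the first four ordinary (raw) moments."""
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

if __name__ == "__main__":
    # GG(1, 1, 1) is the unit exponential distribution.
    raw = [gg_moment(s, 1.0, 1.0, 1.0) for s in (1, 2, 3, 4)]
    print(skewness_kurtosis(*raw))
```

The same `skewness_kurtosis` helper could be applied to KBGG moments once the sums in (16) have been truncated at a suitable N.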
The skewness and kurtosis measures can be determined from the ordinary moments using well-known relationships. Plots of the skewness and kurtosis of the KBGG distribution as functions of c, for selected values of a and b with α = 0.5, β = 1.0 and k = 2.0, are displayed in Figures 2 and 3, respectively. Figures 2a and 2b indicate that the additional parameter c promotes high levels of asymmetry.
Further, we provide a representation for the moment generating function (mgf) of X, say M(t) = E[exp(tX)], from the linear combination of GG generating functions. From equation (14), we have (17), where $M_{\alpha,\beta,k^\star}(t)$ denotes the mgf of the GG(α, β, k⋆) distribution. Using the power series for the exponential function and setting $u = (x/\alpha)^{\beta}$ in the integral that defines $M_{\alpha,\beta,k^\star}(t)$, this mgf reduces to (18). Computing the integral in (18) and expressing the result in terms of the Wright generalized hypergeometric function, we can rewrite the mgf of the GG distribution as (19), provided that β > 1.
The KBGG generating function follows by inserting (19) in equation (17). For β > 1, we have (20). Equations (16) and (20) are the main results of this section. The mgf of any KBGG sub-model, such as those discussed in Section 2, can be determined from (20) by substitution of known parameters.

Incomplete moments
The answers to many important questions in economics require knowing not just the mean of a distribution but its shape as well. This is true not only in econometrics but in other areas as well. Incomplete moments of the income distribution form natural building blocks for measuring inequality: for example, the Lorenz and Bonferroni curves and the Pietra and Gini measures of inequality depend upon the incomplete moments of the income distribution. The sth incomplete moment of X is defined by $m_s(y) = \int_0^y x^s f(x)\,dx$.
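From the linear combination (14), the sth incomplete moment of X can be written as a linear combination of GG incomplete moments, equation (21).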
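Each term of such a linear combination is an incomplete moment of a GG variable. A minimal Python sketch of that building block follows. It assumes the Stacy parametrization of the GG density, under which the sth incomplete moment is α^s γ(k + s/β, (y/α)^β)/Γ(k); the function names are hypothetical and the incomplete gamma ratio is computed by a simple series rather than a library routine.

```python
import math

def reg_lower_gamma(k, u, max_terms=5000):
    """Incomplete gamma ratio gamma(k, u) / Gamma(k), via a non-alternating series."""
    if u <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    for n in range(1, max_terms):
        term *= u / (k + n)
        total += term
        if term < 1e-17 * total:
            break
    value = math.exp(k * math.log(u) - u - math.lgamma(k)) * total
    return min(value, 1.0)

def gg_incomplete_moment(s, y, alpha, beta, k):
    """Integral of x^s g(x) over (0, y) for the assumed GG(alpha, beta, k) density g."""
    if y <= 0.0:
        return 0.0
    shape = k + s / beta
    u = (y / alpha) ** beta
    scale = alpha ** s * math.exp(math.lgamma(shape) - math.lgamma(k))
    return scale * reg_lower_gamma(shape, u)

if __name__ == "__main__":
    # Unit exponential: the first incomplete moment is 1 - exp(-y) (1 + y).
    y = 1.3
    print(gg_incomplete_moment(1, y, 1.0, 1.0, 1.0), 1.0 - math.exp(-y) * (1.0 + y))
```

As y grows, the value approaches the ordinary moment α^s Γ(k + s/β)/Γ(k), which is a convenient consistency check.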

Other Measures
Here, we derive the mean deviations, the Lorenz and Bonferroni curves and the Rényi entropy of the KBGG distribution.

Mean deviations
The mean deviations about the mean and about the median can be derived numerically. They can be expressed in terms of $m_1(\cdot)$, the first incomplete moment of X given by (22) with s = 1, which is written out in (23). The measures δ1 and δ2 are calculated from (23) by evaluating it at the mean $\mu_1'$ and at the median M of X, respectively.
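A short Python sketch of how mean deviations follow from a first incomplete moment is given here. It assumes the standard identities δ1 = 2 μ F(μ) − 2 m_1(μ) for the deviation about the mean μ and δ2 = μ − 2 m_1(M) for the deviation about the median M; the function name is hypothetical, and the unit exponential distribution is used only because its quantities are available in closed form.

```python
import math

def mean_deviations(mean, median, cdf, first_incomplete_moment):
    """Mean deviations about the mean (delta1) and about the median (delta2)."""
    delta1 = 2.0 * mean * cdf(mean) - 2.0 * first_incomplete_moment(mean)
    delta2 = mean - 2.0 * first_incomplete_moment(median)
    return delta1, delta2

def exp_cdf(x):
    """cdf of the unit exponential distribution."""
    return 1.0 - math.exp(-x)

def exp_first_incomplete_moment(y):
    """Integral of x exp(-x) over (0, y)."""
    return 1.0 - math.exp(-y) * (1.0 + y)

if __name__ == "__main__":
    d1, d2 = mean_deviations(1.0, math.log(2.0), exp_cdf, exp_first_incomplete_moment)
    print(d1, d2)  # closed forms for this toy case: 2/e and log(2)
```

For the KBGG distribution, the two callables would be replaced by its cdf and by a truncated version of the first incomplete moment.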

Rényi Entropy
Entropy has been used in various situations in science and engineering, and numerous measures of entropy have been studied and compared in the literature. The Rényi entropy of order γ (γ > 0, γ ≠ 1) is defined by $I_R(\gamma) = (1-\gamma)^{-1}\log\int_0^\infty f^{\gamma}(x)\,dx$. The integral in this definition, say I(γ), is obtained from (5) as (24). Using the exponential and binomial expansions in (24), we obtain (25). Noting that γ > 0 and a > 0 are real non-integers, we can expand the power of $\gamma_1[k,(x/\alpha)^{\beta}]$ in (25), so that I(γ) can be expressed in the form (26). Using expansion (12) in (26), we obtain (27). Calculating the integral in (27) gives a series for I(γ), from which the Rényi entropy follows.
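The series representation can be checked against direct numerical integration of the definition. The Python sketch below is one way to do that. It assumes the standard definition of the Rényi entropy of order γ, (1 − γ)^(−1) log ∫ f(x)^γ dx, and integrates with a composite Simpson rule on a finite interval; the function name, the integration limits and the test density are hypothetical choices.

```python
import math

def renyi_entropy(pdf, order, lo, hi, n=20000):
    """Renyi entropy of a density on (lo, hi) by composite Simpson integration.

    `order` must be positive and different from one; n must be even.
    """
    h = (hi - lo) / n
    total = pdf(lo) ** order + pdf(hi) ** order
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * pdf(lo + i * h) ** order
    integral = total * h / 3.0
    return math.log(integral) / (1.0 - order)

if __name__ == "__main__":
    # Unit exponential density; the order-2 entropy has the closed form log(2).
    print(renyi_entropy(lambda x: math.exp(-x), 2.0, 0.0, 50.0))
```

Passing a KBGG density in place of the exponential one gives a numerical value that a truncated series can be compared with.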

Order statistics
Here, we derive an explicit expression for the density function of the ith order statistic $X_{i:n}$, say $f_{i:n}(x)$, in a random sample of size n from X ∼ KBGG(a, b, c, α, β, k). It is well known that $f_{i:n}(x) = f(x)\,F(x)^{i-1}\{1-F(x)\}^{n-i}/B(i,n-i+1)$ and, using the binomial expansion, we obtain (28), namely $f_{i:n}(x) = B(i,n-i+1)^{-1}\sum_{j=0}^{n-i}(-1)^{j}\binom{n-i}{j} f(x)\,F(x)^{i+j-1}$. We demonstrate that $f_{i:n}(x)$ can be expressed as a linear combination of GG densities. First, we provide an expansion for the KBGG cdf. Pescim et al. (2012) demonstrated that the KB-G cdf admits the expansion (29), whose coefficients are sums of constants involving the coefficients defined in (7).
Equation (29) gives the KBGG cdf as an infinite weighted power series of the baseline cdf. Inserting (2) in (29), we have (30). Combining (7) and (30), the pdf of the ith order statistic reduces to (31). Applying the identity (10) in (31), we have (32), whose coefficients can be obtained from (11). Substituting (13) and (33) in equation (31), we can write (34), where k⋆⋆ = k(2r + 1) + 2m and $g_{\alpha,\beta,k^{\star\star}}(x)$ denotes the GG(α, β, k⋆⋆) density function. Equation (34) reveals that the density function of the KBGG order statistics is an infinite linear combination of GG densities. Hence, ordinary moments of order statistics can be determined directly from those quantities of the GG distribution.
Based upon these moments, we can derive expansions for the L-moments as infinite weighted linear combinations of suitable KBGG means. The L-moments are analogous to the ordinary moments but can be estimated by linear combinations of order statistics. They are linear functions of expected order statistics defined by Hosking (1990) and are relatively robust to the effects of outliers.
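The starting point of this section, the density of the ith order statistic and its binomial expansion, can be checked numerically for any pair of pdf and cdf. The Python sketch below compares the two forms. The function names are hypothetical, and the unit exponential distribution is used only as a convenient stand-in for the KBGG model.

```python
import math

def beta_fn(p, q):
    """Beta function B(p, q) computed through log-gamma functions."""
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))

def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the ith order statistic in a sample of size n (closed form)."""
    F = cdf(x)
    return pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i) / beta_fn(i, n - i + 1)

def order_stat_pdf_expanded(x, i, n, pdf, cdf):
    """Same density after expanding {1 - F(x)}^(n - i) with the binomial theorem."""
    F = cdf(x)
    total = 0.0
    for j in range(n - i + 1):
        total += (-1.0) ** j * math.comb(n - i, j) * F ** (i + j - 1)
    return pdf(x) * total / beta_fn(i, n - i + 1)

if __name__ == "__main__":
    pdf = lambda x: math.exp(-x)
    cdf = lambda x: 1.0 - math.exp(-x)
    print(order_stat_pdf(0.8, 2, 5, pdf, cdf), order_stat_pdf_expanded(0.8, 2, 5, pdf, cdf))
```

The expanded form is the one that combines with a power series for the cdf, since it contains only non-negative integer powers of F(x).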

The Classical Inference
Here, the estimation of the model parameters of the KBGG distribution is investigated by the maximum likelihood method. Let x = (x1, ..., xn) be a random sample of the new distribution with unknown parameter vector θ = (a, b, c, α, β, k). For interval estimation and hypothesis tests on the parameters in θ, we require the 6 × 6 total observed information matrix J(θ) = −{U_rs}, where the elements U_rs for r, s = α, β, k, a, b, c are given in Appendix A. The estimated asymptotic multivariate normal $N_6(\theta, J(\hat\theta)^{-1})$ distribution of $\hat\theta$ can be used to construct approximate confidence regions for the parameters. An asymptotic confidence interval (ACI) with significance level γ for each parameter θ_r is given by $ACI_r = \big(\hat\theta_r - z_{\gamma/2}\sqrt{\hat J^{r,r}},\ \hat\theta_r + z_{\gamma/2}\sqrt{\hat J^{r,r}}\big)$, where $\hat J^{r,r}$ is the rth diagonal element of $J(\theta)^{-1}$ estimated at $\hat\theta$, for r = 1, ..., 6, and $z_{\gamma/2}$ is the quantile 1 − γ/2 of the standard normal distribution.
We can compute the maximum values of the unrestricted and restricted log-likelihoods to construct likelihood ratio (LR) statistics for testing some sub-models of the KBGG distribution. For example, we may use LR statistics to check if the fit using the KBGG distribution is statistically "superior" to the fits using the KBW, BGHN, EW and GG distributions for a given data set. In any case, considering the partition $\theta = (\theta_1^{T}, \theta_2^{T})^{T}$, tests of hypotheses of the type $H_0: \theta_1 = \theta_1^{(0)}$ versus $H_A: \theta_1 \neq \theta_1^{(0)}$ can be performed using the LR statistic $w = 2\{\ell(\hat\theta) - \ell(\tilde\theta)\}$, where $\hat\theta$ and $\tilde\theta$ are the estimates of θ under H_A and H_0, respectively. Under the null hypothesis H_0, w converges in distribution to $\chi^2_q$, where q is the dimension of the vector θ1 of interest. The LR test rejects H_0 if $w > \xi_\gamma$, where $\xi_\gamma$ denotes the upper 100γ% point of the $\chi^2_q$ distribution.
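The two inferential tools of this section, the asymptotic confidence interval and the LR test, can be sketched in a few lines of Python once the maximized log-likelihoods and the inverse observed information are available. The sketch below uses only the standard library. The function names are hypothetical, the chi-square tail probability is computed through a simple series for the incomplete gamma ratio, and the numbers in the usage lines are made up for illustration and are not results from the paper.

```python
import math
from statistics import NormalDist

def reg_lower_gamma(k, u, max_terms=5000):
    """Incomplete gamma ratio gamma(k, u) / Gamma(k), via a non-alternating series."""
    if u <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    for n in range(1, max_terms):
        term *= u / (k + n)
        total += term
        if term < 1e-17 * total:
            break
    return min(math.exp(k * math.log(u) - u - math.lgamma(k)) * total, 1.0)

def asymptotic_ci(estimate, inv_info_diag, level_gamma=0.05):
    """Interval estimate +/- z * sqrt(diagonal element of the inverse information)."""
    z = NormalDist().inv_cdf(1.0 - level_gamma / 2.0)
    half_width = z * math.sqrt(inv_info_diag)
    return estimate - half_width, estimate + half_width

def chi2_sf(w, q):
    """Upper tail probability of a chi-square variable with q degrees of freedom."""
    return 1.0 - reg_lower_gamma(q / 2.0, w / 2.0)

def lr_test(loglik_full, loglik_restricted, q, level_gamma=0.05):
    """LR statistic, its asymptotic p-value and the rejection decision."""
    w = 2.0 * (loglik_full - loglik_restricted)
    p_value = chi2_sf(w, q)
    return w, p_value, p_value < level_gamma

if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(asymptotic_ci(1.0, 0.04))
    print(lr_test(-120.0, -125.5, q=3))
```

Here q is the number of parameters fixed under the null hypothesis, for example q = 3 when the GG model (a = b = 1, c = 0) is tested against the full KBGG model.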

The Bayesian Inference
As is well known, the Bayesian approach allows the incorporation of previous knowledge of the parameters through informative prior density functions. When this information is not available, we can consider a non-informative prior. In the Bayesian context, the information referring to the model parameters is obtained through a posterior marginal distribution. Thus, two difficulties usually arise. The first refers to attaining the marginal posterior distribution, and the second to the calculation of the moments of interest. Both cases require integration that often has no analytical solution. To overcome these problems, we use simulation methods based on Markov Chain Monte Carlo (MCMC), such as the Gibbs sampler and Metropolis-Hastings algorithms.
Combining the likelihood function (36) and the prior distribution (37), the joint posterior distribution for a, b, c, α, β and k reduces to (38). The joint posterior density (38) is analytically intractable because its integration is not easy to perform. So, the inference can be based on MCMC simulation methods such as the Gibbs sampler and the Metropolis-Hastings algorithm, which can be used to draw samples from which features of the marginal distributions of interest can be inferred. In this direction, we first obtain the full conditional distributions of the unknown quantities. Since the full conditional distributions do not have explicit expressions, we require the Metropolis-Hastings algorithm to generate the variables a, b, c, α, β and k for the KBGG distribution.
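For readers unfamiliar with the algorithm, a minimal random-walk Metropolis-Hastings sketch in Python follows. It is a generic illustration and not the sampler used in the paper: the proposal is a Gaussian random walk on all coordinates at once, the function names are hypothetical, and the target is a toy one-dimensional log posterior (proportional to a gamma density with shape 3 and rate 1) chosen because its mean is known. Positivity constraints, such as those on a, b, α, β and k, are handled by returning minus infinity outside the support.

```python
import math
import random

def metropolis_hastings(log_post, start, step, n_iter, seed=0):
    """Random-walk Metropolis-Hastings; returns the list of visited states."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_post(current)
    draws = []
    for _ in range(n_iter):
        proposal = [v + rng.gauss(0.0, step) for v in current]
        proposal_lp = log_post(proposal)
        diff = proposal_lp - current_lp
        # Accept with probability min(1, exp(diff)); exp(-inf) is 0, so
        # proposals outside the support are always rejected.
        if diff >= 0.0 or rng.random() < math.exp(diff):
            current, current_lp = proposal, proposal_lp
        draws.append(list(current))
    return draws

def toy_log_posterior(theta):
    """Unnormalized log density of a gamma(shape=3, rate=1) variable."""
    x = theta[0]
    if x <= 0.0:
        return float("-inf")
    return 2.0 * math.log(x) - x

if __name__ == "__main__":
    draws = metropolis_hastings(toy_log_posterior, [1.0], 1.5, 50000, seed=1)
    kept = [d[0] for d in draws[5000::10]]  # burn-in of 5000, spacing of 10
    print(sum(kept) / len(kept))            # should be near the true mean, 3
```

In a six-parameter setting, `log_post` would return the log of the joint posterior (38) evaluated at (a, b, c, α, β, k), and the step size would typically be tuned per coordinate.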

Applications
In this section, we use three real data sets from diverse fields, namely actuarial sciences (D1), environment (D2) and engineering (D3), to compare the fits of the KBGG distribution with those of three sub-models (i.e. the BGG, EGG and GG distributions) and also with the following non-nested model: the Kumaraswamy generalized gamma (KwGG) distribution (Pascoa et al., 2011). The primary reason for choosing these data is that they show how, in different fields, it is necessary to have positively skewed distributions with non-negative support. Moreover, these data sets present different degrees of variability, skewness and kurtosis. The engineering data (D3) were reported by Lawless (1982). Table 2 gives a descriptive summary of these data and suggests positively skewed distributions with different degrees of variability, skewness and kurtosis.

Maximum likelihood estimation
First, in order to estimate the model parameters, we consider the maximum likelihood estimation method discussed in Section 8.1. We take the estimates of α, β and k from the fitted GG distribution as starting values for the numerical iterative procedure. All the computations were performed using the R statistical software. Table 3 lists the MLEs of the parameters and the values of the following statistics for some models: Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC) and Bayesian Information Criterion (BIC). The results indicate that the KBGG model has the smallest values of the AIC and CAIC statistics among all fitted models. So, it could be chosen as the most suitable model.
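The three criteria are simple functions of the maximized log-likelihood, the number of parameters p and the sample size n. The Python sketch below shows one common set of definitions. The paper does not state its formulas, so these are assumptions: AIC = −2ℓ + 2p, BIC = −2ℓ + p log n, and CAIC = −2ℓ + p(log n + 1) in Bozdogan's sense (some authors use the same abbreviation for a small-sample corrected AIC instead). The function name and the input values are hypothetical.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, CAIC) under the definitions assumed in the lead-in."""
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    caic = -2.0 * loglik + n_params * (math.log(n_obs) + 1.0)  # assumed Bozdogan form
    return aic, bic, caic

if __name__ == "__main__":
    # Hypothetical log-likelihood for a six-parameter fit to 59 observations.
    print(information_criteria(-100.0, 6, 59))
```

Smaller values indicate a better trade-off between fit and complexity, which is how the entries of Table 3 are compared.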

Table 3: MLEs of the model parameters for the three data sets and the corresponding AIC, CAIC and BIC statistics.
A comparison of the proposed distribution with some of its sub-models using LR statistics is given in Table 4. The p-values indicate that the proposed model yields the best fit to the three data sets. This gives clear evidence of the potential of the three additional shape parameters when modeling real data.
In order to assess if the model is appropriate, Figure 4 displays histograms and the estimated KBGG density functions for these data sets. We can conclude that the new distribution is a very suitable model to fit the three data sets.

Bayesian analysis
For the three data sets, the following independent priors were considered to perform the Metropolis-Hastings algorithm: a ∼ Γ(0.01, 0.01), b ∼ Γ(0.01, 0.01), α ∼ Γ(0.01, 0.01), β ∼ Γ(0.01, 0.01), k ∼ Γ(0.01, 0.01) and c ∼ N(0, 100), so that we have vague prior distributions. Considering these prior density functions, we generate two parallel independent runs of the Metropolis-Hastings algorithm with size 300,000 for each parameter, disregarding the first 30,000 iterations to eliminate the effect of the initial values. To avoid correlation problems, we consider a spacing of size 10, obtaining a sample of size 27,000 from each chain. To monitor the convergence of the Metropolis-Hastings algorithm, we apply the methods suggested by Cowles and Carlin (1996) using the between and within sequence information, following the approach developed in Gelman and Rubin (1992) to obtain the potential scale reduction, $\hat{R}$. In all cases, these values were close to one, indicating the convergence of the chain. The approximate posterior marginal density functions for the parameters are displayed in Figures 5, 6 and 7 for the first, second and third data sets, respectively. In Table 5, we report posterior summaries for the parameters of the KBGG model for the three data sets. We note that the posterior means (Table 5) are quite close (as expected) to the MLEs obtained for the KBGG model given in Table 3. "SD" denotes the standard deviation from the posterior distributions of the parameters and "HPD" denotes the 95% highest posterior density intervals.
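The bookkeeping of the runs and the convergence diagnostic can be sketched in Python as follows. The sketch assumes the basic form of the Gelman and Rubin (1992) potential scale reduction for a scalar parameter, computed from the within-chain variance W and the between-chain variance B without the degrees-of-freedom correction; the function names are hypothetical and the chains in the usage lines are tiny artificial sequences, not output from the paper.

```python
import math
from statistics import mean, variance

def retained_draws(n_iter, burn_in, spacing):
    """Number of draws kept per chain after burn-in and thinning."""
    return (n_iter - burn_in) // spacing

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equally long chains of one parameter."""
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])   # W
    between = n * variance(chain_means)                    # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

if __name__ == "__main__":
    print(retained_draws(300000, 30000, 10))
    agreeing = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]
    shifted = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]
    print(potential_scale_reduction(agreeing), potential_scale_reduction(shifted))
```

Values close to one suggest that the chains are exploring the same distribution, while values well above one indicate that the chains have not mixed.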

Concluding remarks
We introduce the Kummer beta generalized gamma (KBGG) distribution with three additional shape parameters because of the wide usage of the GG distribution and the fact that the current generalization provides a means of its continuous extension to still more complex situations. The new distribution unifies more than 28 distributions and yields a general overview of these distributions for theoretical studies. In fact, the KBGG distribution (5) generalizes the Weibull, gamma, exponentiated Weibull, exponentiated gamma, beta Weibull, beta gamma, Kummer beta Weibull and Kummer beta gamma distributions and other important lifetime models. The KBGG density function can be expressed as a linear combination of GG density functions, which allows us to derive some of its mathematical properties. The estimation of the model parameters is approached by the method of maximum likelihood and by Bayesian analysis. We consider the likelihood ratio (LR) statistic and other criteria to compare the KBGG model with its sub-models and another non-nested model. The potentiality of the KBGG distribution is illustrated in three applications to real data sets. The new model provides a rather flexible mechanism for fitting a wide spectrum of real world lifetime data in reliability, biology and other areas.
The GG cumulative distribution function has the form $G(x;\alpha,\beta,k) = \gamma_1[k,(x/\alpha)^{\beta}]$, where $\gamma_1$ denotes the incomplete gamma function ratio.

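Next, evaluating the integral, the sth moment of the GG(α, β, k⋆) distribution reduces to $E(Y_{k^\star}^{s}) = \alpha^{s}\,\Gamma(k^\star + s/\beta)/\Gamma(k^\star)$.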

Figure 2: Skewness of the KBGG distribution as a function of c for some values of a and b for α = 0.5, β = 1.0 and k = 2.0. (a) b = 2.0 and (b) a = 1.2.

Figure 3: Kurtosis of the KBGG distribution as a function of c for some values of a and b for α = 0.5, β = 1.0 and k = 2.0. (a) b = 2.0 and (b) a = 1.2.
In (21), the sth incomplete moment of the GG distribution with parameters α, β and k⋆ is $\int_0^y x^s g_{\alpha,\beta,k^\star}(x)\,dx$. Calculating this integral and substituting the result in (21), we obtain (22).

Figure 4: Histograms and the estimated KBGG density functions for the current data sets.

Figure 5: Approximate posterior marginal densities for the parameters of the KBGG model for the first data set.

Figure 6: Approximate posterior marginal densities for the parameters of the KBGG model for the second data set.

Figure 7: Approximate posterior marginal densities for the parameters of the KBGG model for the third data set.
Description of the data sets
D1 Actuarial sciences: It is important for the Mexican Institute of Social Security (IMSS) to study the distributional behaviour of the mortality of retired people on disability because it enables the calculation of long and short term financial estimates, such as the assessment of the reserve required to pay the minimum pensions. The data set, corresponding to 280 lifetimes (in years) of retired women with temporary disabilities who were incorporated in the Mexican public insurance system and who died during 2004, was reported and analyzed by Balakrishnan et al. (2009).
D2 Environmental sciences: These data were analyzed by Leiva et al. (2009) and correspond to daily ozone level measurements in New York in May-September, 1973, from the New York State Department of Conservation.
D3 Engineering: Failures can occur in microcircuits because of the movement of atoms in the conductors of the circuit, which is referred to as electromigration. The data set refers to an accelerated life test of 59 conductors reported by Lawless (1982).

Table 2: Descriptive statistics for the three data sets.

Table 4: LR statistics for the three data sets.