The exponentiated generalized extended exponential distribution

We introduce and study a new four-parameter lifetime model named the exponentiated generalized extended exponential distribution. The proposed model has the advantage of including as special cases the exponential and exponentiated exponential distributions, among others, and its hazard function can take the classic shapes: bathtub, inverted bathtub, increasing, decreasing and constant, among others. We derive some mathematical properties of the new model such as a representation for the density function as a double mixture of Erlang densities, explicit expressions for the quantile function, ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves, generating function, Rényi entropy, density of order statistics and reliability. We use the maximum likelihood method to estimate the model parameters. Two applications to real data illustrate the flexibility of the proposed model.


Introduction
The exponential distribution is a very popular statistical model and is probably one of the parametric models most extensively applied in several fields (Lemonte, 2013). Its popularity can be explained, perhaps, by the simplicity of its cumulative distribution function (cdf), which involves only one unknown parameter $\lambda > 0$ and takes the simple form $G(x) = 1 - e^{-\lambda x}$, for $x > 0$, and by its constant hazard rate function (hrf). Due to its importance, several studies introducing and/or studying extensions of the exponential distribution are available in the literature. Here, we refer to the following papers: Gupta  We emphasize that the density (2) can also be obtained as a special case of the generalized Lindley distribution proposed by Zakerzadeh and Dolati (2009). However, these latter authors do not address this particular case in their research.
Several mathematical properties of the ƐƐ distribution, including expectation, variance, moment generating function (mgf), asymmetry and kurtosis coefficients, among others, were studied by Gómez et al. (2014). In particular, they proved that the density of the ƐƐ model is a mixture of the exponential and gamma densities. We believe that the addition of parameters to the ƐƐ model may generate new distributions with greater fitting capability and, for this reason, we propose a generalization of it.
In a recent paper, Cordeiro et al. (2013) applied the exponentiated generalized (Ɛɡ) cdf (3) to extend some well-known distributions such as the Fréchet, normal, gamma and Gumbel distributions. Moreover, they presented several properties of the Ɛɡ class, which provide motivations to adopt this generator. Next, we discuss some of these motivations. The first important point is the simplicity of equations (3) and (4). They involve no complicated functions and are always tractable when the cdf and pdf of the baseline distribution have simple analytic expressions. It is very easy, for example, to obtain the inverse of the cdf (3). Another important feature is that the Ɛɡ model contains as special cases the two classes of Lehmann's alternatives. In fact, for a = 1, equation (3) reduces to $F(x) = G(x)^{b}$ and, for b = 1, to $F(x) = 1 - [1 - G(x)]^{a}$.
In this paper, we define the exponentiated generalized extended exponential (ƐɡƐƐ) distribution by inserting (1) in equation (3). The ƐɡƐƐ model includes as special cases the exponential, Lindley and exponentiated exponential distributions, among others, which are very important statistical models, especially for applied work. The new density function is a double linear mixture of Erlang densities, and thus several properties of the ƐɡƐƐ model follow from this relationship. Moreover, the proposed model has both monotonic and non-monotonic hrfs. We hope that the new distribution can be widely used for data modeling in areas such as economics, finance, reliability, biology and medicine, among others.
The rest of the paper is organized as follows. In Section 2, we give the density and hazard functions of the ƐɡƐƐ model and corresponding plots for selected parameter values. We also provide expressions for the cumulative and reversed hazard functions. In Section 3, we investigate the shapes of the ƐɡƐƐ density function. In Section 4, we derive several mathematical properties of the proposed model, including mixture representations for the density and cumulative functions, explicit expressions for the quantile and generating functions, ordinary and incomplete moments, among others. Estimation and inference by maximum likelihood are discussed in Section 5. Two applications to real data are presented in Section 6. Section 7 provides concluding remarks.

The ƐɡƐƐ distribution
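The cdf and pdf of the ƐɡƐƐ distribution (omitting the dependence on the parameters a > 0, b > 0, α > 0 and β ≥ 0), for x > 0, are given by
$$F(x) = \left\{1 - \left[\frac{(\alpha+\beta+\alpha\beta x)\,e^{-\alpha x}}{\alpha+\beta}\right]^{a}\right\}^{b} \qquad (5)$$
and
$$f(x) = \frac{a\,b\,\alpha^{2}(1+\beta x)\,(\alpha+\beta+\alpha\beta x)^{a-1}\,e^{-a\alpha x}}{(\alpha+\beta)^{a}}\left\{1 - \left[\frac{(\alpha+\beta+\alpha\beta x)\,e^{-\alpha x}}{\alpha+\beta}\right]^{a}\right\}^{b-1}, \qquad (6)$$
respectively. Henceforth, a continuous random variable having pdf (6) is denoted by X ∼ ƐɡƐƐ(a, b, α, β).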
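As a minimal sketch, the cdf, pdf and hrf of the model can be coded directly. The sketch assumes the ƐƐ baseline survival function 1 − G(x) = (α + β + αβx) e^{−αx}/(α + β) and the Ɛɡ construction F(x) = [1 − {1 − G(x)}^a]^b; the function names are illustrative and are not taken from the paper.

```python
import math

def ee_survival(x, alpha, beta):
    """Assumed EE baseline survival function 1 - G(x), for x > 0."""
    return (alpha + beta + alpha * beta * x) * math.exp(-alpha * x) / (alpha + beta)

def ee_pdf(x, alpha, beta):
    """Assumed EE baseline density g(x)."""
    return alpha ** 2 * (1.0 + beta * x) * math.exp(-alpha * x) / (alpha + beta)

def egee_cdf(x, a, b, alpha, beta):
    """EG transform of the baseline: F(x) = [1 - {1 - G(x)}^a]^b."""
    return (1.0 - ee_survival(x, alpha, beta) ** a) ** b

def egee_pdf(x, a, b, alpha, beta):
    """f(x) = a b g(x) {1 - G(x)}^(a-1) [1 - {1 - G(x)}^a]^(b-1), for x > 0."""
    s = ee_survival(x, alpha, beta)
    return a * b * ee_pdf(x, alpha, beta) * s ** (a - 1.0) * (1.0 - s ** a) ** (b - 1.0)

def egee_hrf(x, a, b, alpha, beta):
    """Hazard rate f(x) / {1 - F(x)}."""
    return egee_pdf(x, a, b, alpha, beta) / (1.0 - egee_cdf(x, a, b, alpha, beta))
```

Setting a = b = 1 and β = 0 in these functions should recover the exponential model with rate α, whose hazard is constant; this gives a quick sanity check of the assumed forms.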
Several distributions are special cases of the ƐɡƐƐ model. Here, we mention some of them. Clearly, the ƐƐ distribution is the basic exemplar when a = b = 1. The exponential and Lindley distributions are obtained from (6) by setting β = 0 and β = 1, respectively, in addition to a = b = 1.

Shapes
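The first derivative of log{f(x)} for the ƐɡƐƐ model is given by
$$\frac{d\log\{f(x)\}}{dx} = \frac{\beta}{1+\beta x} + \frac{(a-1)\,\alpha\beta}{z(x)} - a\alpha + \frac{a\,(b-1)\,\alpha^{2}(1+\beta x)\,z(x)^{a-1}\,e^{-a\alpha x}}{(\alpha+\beta)^{a} - z(x)^{a}\,e^{-a\alpha x}},$$
where $z(x) = \beta + \alpha + \alpha\beta x$.
Thus, the critical values of f(x) are the roots of the equation
$$\frac{\beta}{1+\beta x} + \frac{(a-1)\,\alpha\beta}{z(x)} + \frac{a\,(b-1)\,\alpha^{2}(1+\beta x)\,z(x)^{a-1}\,e^{-a\alpha x}}{(\alpha+\beta)^{a} - z(x)^{a}\,e^{-a\alpha x}} = a\alpha. \qquad (7)$$
If the point x = x0 is a root of (7), then we can classify it as a local maximum, local minimum or inflection point when λ(x0) < 0, λ(x0) > 0 and λ(x0) = 0, respectively, where $\lambda(x) = d^{2}\log\{f(x)\}/dx^{2}$.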

Properties
In this section, we study some structural properties of the ƐɡƐƐ distribution.

A useful representation
First, we derive simple representations for the density and cumulative functions of the ƐɡƐƐ distribution. The starting point of our approach is the class of exponentiated distributions, which has been widely explored in recent works. A comprehensive review of these publications can be found in a recent paper by Tahir and Nadarajah (2015). For an arbitrary continuous baseline cdf G(x), a random variable Y is said to have the exponentiated-ɡ ("exp-ɡ" for short) distribution with power parameter a > 0, say Y ∼ exp-ɡ(a), if its cdf and pdf are $H_a(x) = G(x)^{a}$ and $h_a(x) = a\,g(x)\,G(x)^{a-1}$, respectively. Thus, "exp-ɡ" denotes the Lehmann type I transformation of G(x). Based on some results in Cordeiro and Lemonte (2014), we can express the Ɛɡ cdf (3) as
$$F(x) = \sum_{j=0}^{\infty} w_{j+1}\,H_{j+1}(x), \qquad (8)$$
where $w_{j+1} = \sum_{m=1}^{\infty} (-1)^{j+m+1}\binom{b}{m}\binom{ma}{j+1}$ and $H_{j+1}(x) = G(x)^{j+1}$ is the exp-ɡ cdf with power parameter j + 1. By differentiating (8), we obtain a similar mixture representation for f(x) as
$$f(x) = \sum_{j=0}^{\infty} w_{j+1}\,h_{j+1}(x), \qquad (9)$$
where $h_{j+1}(x) = dH_{j+1}(x)/dx$.
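The mixture form of the Ɛɡ cdf can be checked numerically. The sketch below assumes the weights w_{j+1} = Σ_m (−1)^{j+m+1} C(b, m) C(ma, j+1), which follow from expanding [1 − {1 − G}^a]^b with two binomial series. To keep the inner sum finite, it restricts b to a positive integer; the truncation point `j_max` and all function names are illustrative choices.

```python
import math

def gen_binom(r, k):
    """Generalized binomial coefficient C(r, k) for real r and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (r - i) / (i + 1.0)
    return out

def eg_weight(j, a, b):
    """Weight w_{j+1} for real a > 0 and positive integer b (finite sum over m)."""
    total = 0.0
    for m in range(1, b + 1):
        sign = -1.0 if (j + m + 1) % 2 else 1.0
        total += sign * math.comb(b, m) * gen_binom(m * a, j + 1)
    return total

def eg_cdf_series(G, a, b, j_max=200):
    """Truncated mixture sum_{j < j_max} w_{j+1} G^(j+1), for a baseline cdf value G."""
    return sum(eg_weight(j, a, b) * G ** (j + 1) for j in range(j_max))

def eg_cdf_exact(G, a, b):
    """Closed form [1 - (1 - G)^a]^b used as the reference value."""
    return (1.0 - (1.0 - G) ** a) ** b
```

For example, `eg_cdf_series(0.6, 1.5, 2)` and `eg_cdf_exact(0.6, 1.5, 2)` should agree closely, since the terms decay geometrically in G. For non-integer b the inner sum over m is infinite and would need its own truncation and convergence check.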
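By using (8) and (9) for the ƐƐ distribution (1), $h_{j+1}(x)$ becomes the exp-ƐƐ pdf with power parameter j + 1 (for j ≥ 0) given by
$$h_{j+1}(x) = \frac{(j+1)\,\alpha^{2}(1+\beta x)\,e^{-\alpha x}}{\alpha+\beta}\left[1 - \frac{(\alpha+\beta+\alpha\beta x)\,e^{-\alpha x}}{\alpha+\beta}\right]^{j}. \qquad (10)$$
Combining equations (9) and (10), we have an important result: the ƐɡƐƐ density function is a linear mixture of exp-ƐƐ densities. This result can be used to derive some mathematical properties of X.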
Next, we apply the binomial expansion in equation (10) in order to obtain a simple representation for the exp-ƐƐ density. We have By interchanging in the last equation, where and, after a simple algebraic manipulation, we obtain where Here, $\pi(x; i+1, (k+1)\alpha)$ denotes the pdf of the Erlang distribution with shape parameter i + 1 (for i ≥ 0) and scale parameter (k + 1)α. If Z is an Erlang random variable with shape parameter s (= 1, 2, 3, ...) and scale parameter λ > 0, its pdf is given by $\pi(z; s, \lambda) = \lambda^{s} z^{s-1} e^{-\lambda z}/(s-1)!$.
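Because the mixture components are Erlang densities, their moments and mgf are available in closed form. The sketch below codes the Erlang pdf as written above, in which λ multiplies z in the exponent and so acts as a rate, together with the standard results E(Z^n) = (s + n − 1)!/{(s − 1)! λ^n} and E(e^{tZ}) = {λ/(λ − t)}^s for t < λ. The Simpson-rule helper is only an illustrative numerical check.

```python
import math

def erlang_pdf(z, s, lam):
    """pi(z; s, lam) = lam^s z^(s-1) e^(-lam z) / (s-1)!, with s a positive integer."""
    return lam ** s * z ** (s - 1) * math.exp(-lam * z) / math.factorial(s - 1)

def erlang_moment(n, s, lam):
    """n-th ordinary moment E(Z^n) = (s+n-1)! / {(s-1)! lam^n}."""
    return math.factorial(s + n - 1) / (math.factorial(s - 1) * lam ** n)

def erlang_mgf(t, s, lam):
    """Moment generating function {lam / (lam - t)}^s, valid only for t < lam."""
    if t >= lam:
        raise ValueError("the mgf exists only for t < lam")
    return (lam / (lam - t)) ** s

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0
```

A moment or mgf of X would then be the corresponding weighted double sum of these Erlang quantities, with the mixture weights of the double-mixture representation.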
Second, combining equations (9) and (11) and changing the density function of the ƐɡƐƐ model reduces to where Equation (13) is the main result of this section. It gives the density function of X as a double linear mixture of Erlang densities. This result is important to obtain some mathematical properties of X, such as the ordinary and incomplete moments, generating function and mean deviations, from those of the Erlang distribution. For most practical purposes, the upper limit of k in equation (13) can be set equal to 20.

Quantile function
For many applications it is important to determine the quantile function (qf) of X. Based on this function, we can, for example, generate variates and obtain the median of the ƐɡƐƐ distribution. By inverting (5), the qf of X can be expressed as where 0 < u < 1 and W(·) denotes the Lambert W-function.
In a recent paper, Nadarajah et al. (2011) used the Lambert W-function to derive the qf of the ƐL distribution. For any complex t, the Lambert W-function is defined as the inverse of the function $g(t) = t\,e^{t}$. For more details, see http://mathworld.wolfram.com/LambertW-Function.html. An implementation in R is available through the LambertW package; see http://cran.r-project.org/web/packages/LambertW/LambertW.pdf.
Using the Lagrange inversion theorem, the power series for the W-function holds: By applying (15) in equation (14), we have One of the important applications of equation (14) is to determine the median of the ƐɡƐƐ distribution. The median of X, say M, is obtained by M = Q(1/2).
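The inversion can be sketched in code under the assumption that the cdf is F(x) = {1 − [(α + β + αβx) e^{−αx}/(α + β)]^a}^b. Solving F(x) = u for β > 0 then leads to x = −1/α − 1/β − W(t)/α, with t = −{(α + β)/β} e^{−(α+β)/β} (1 − u^{1/b})^{1/a}. Since x > 0 requires W(t) < −1, this derivation uses the lower real branch W_{−1}, which is a detail of the sketch rather than something stated in the text. The branch is evaluated here by plain bisection; all names are illustrative.

```python
import math

def lambert_w_minus1(t):
    """Lower real branch W_{-1}(t) for -1/e <= t < 0, by bisection on w e^w = t."""
    if not (-1.0 / math.e <= t < 0.0):
        raise ValueError("W_{-1} is real only for -1/e <= t < 0")
    lo, hi = -2.0, -1.0
    # On w <= -1 the map w -> w e^w is decreasing, so move lo left until it brackets t.
    while lo * math.exp(lo) < t:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) > t:
            lo = mid   # mid lies to the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi)

def egee_cdf(x, a, b, alpha, beta):
    """Assumed cdf of the model, used only to check the inversion."""
    s = (alpha + beta + alpha * beta * x) * math.exp(-alpha * x) / (alpha + beta)
    return (1.0 - s ** a) ** b

def egee_quantile(u, a, b, alpha, beta):
    """Quantile Q(u) for 0 < u < 1 under the assumed cdf."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie in (0, 1)")
    t = (1.0 - u ** (1.0 / b)) ** (1.0 / a)   # value of 1 - G(x) at the quantile
    if beta == 0.0:
        return -math.log(t) / alpha           # exponential baseline, no W needed
    c = (alpha + beta) / beta
    w = lambert_w_minus1(-c * math.exp(-c) * t)
    return -c / alpha - w / alpha
```

The median is then `egee_quantile(0.5, a, b, alpha, beta)`, and variates can be generated by applying the same function to uniform random numbers on (0, 1).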

Moments
The nth moment of X can be derived using the fact that the ƐɡƐƐ density function is a double linear mixture of Erlang densities. Thus, based on (13), the nth moment of X is given by Further, we have

Incomplete moments
The incomplete moments of a distribution play an important role in applications. The nth incomplete moment of X is given by $T_n(z) = \int_0^{z} x^{n} f(x)\,dx$ and using equation (13), we can write Thus, we can write $T_n(z)$ as where $\Gamma(\cdot)$ is the gamma function and $\Gamma(\cdot,\cdot)$ denotes the upper incomplete gamma function.
The first incomplete moment of X is important to determine the mean deviations, which can be used to measure the amount of scatter in a population, and the Bonferroni and Lorenz curves, which are useful for applications in areas such as economics, reliability, demography and many others. Setting n = 1 in equation (16) gives The mean deviations of X about the mean μ = E(X) and about the median M are given by $\delta_1 = \int_0^{\infty} |x - \mu|\,f(x)\,dx$ and $\delta_2 = \int_0^{\infty} |x - M|\,f(x)\,dx$, respectively, where f(x) is the pdf (6). Using equation (17), these measures follow as $\delta_1 = 2\mu F(\mu) - 2T_1(\mu)$ and $\delta_2 = \mu - 2T_1(M)$, where F(·) is the cdf (5) and $T_1(z)$ is given by (17).
Equation (17) can also be used to obtain the Bonferroni and Lorenz curves of X given by $B(p) = T_1(q)/(p\mu)$ and $L(p) = T_1(q)/\mu$, respectively, where q = Q(p) is determined by (14) for a specified probability p.
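The mean deviation and the two curves can be checked numerically without the series form of T_1(z). The sketch below assumes the cdf and pdf F(x) = {1 − [(α + β + αβx) e^{−αx}/(α + β)]^a}^b and f(x) = dF(x)/dx. It computes T_1(z) by Simpson's rule and Q(p) by bisection on the cdf, approximates μ by truncating the integral at a far upper quantile, and assumes the density is bounded near zero (for instance b ≥ 1). All function names and the `tail` tolerance are illustrative.

```python
import math

def egee_cdf(x, a, b, alpha, beta):
    s = (alpha + beta + alpha * beta * x) * math.exp(-alpha * x) / (alpha + beta)
    return (1.0 - s ** a) ** b

def egee_pdf(x, a, b, alpha, beta):
    s = (alpha + beta + alpha * beta * x) * math.exp(-alpha * x) / (alpha + beta)
    g = alpha ** 2 * (1.0 + beta * x) * math.exp(-alpha * x) / (alpha + beta)
    return a * b * g * s ** (a - 1.0) * (1.0 - s ** a) ** (b - 1.0)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def quantile(p, theta):
    """Q(p) by bisection on the cdf (a numerical stand-in for the closed form)."""
    lo, hi = 0.0, 1.0
    while egee_cdf(hi, *theta) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if egee_cdf(mid, *theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_incomplete_moment(z, theta):
    """T_1(z) = int_0^z x f(x) dx; the integrand is set to 0 at x = 0."""
    return simpson(lambda x: x * egee_pdf(x, *theta) if x > 0.0 else 0.0, 0.0, z)

def mean(theta, tail=1e-12):
    """Approximate mu by integrating up to the (1 - tail) quantile."""
    return first_incomplete_moment(quantile(1.0 - tail, theta), theta)

def lorenz(p, theta):
    return first_incomplete_moment(quantile(p, theta), theta) / mean(theta)

def bonferroni(p, theta):
    return lorenz(p, theta) / p

def mean_deviation_about_mean(theta):
    mu = mean(theta)
    return 2.0 * mu * egee_cdf(mu, *theta) - 2.0 * first_incomplete_moment(mu, theta)
```

For the exponential special case, `theta = (1.0, 1.0, 1.0, 0.0)`, the Lorenz curve has the known form L(p) = p + (1 − p) log(1 − p) and the mean deviation about the mean equals 2/e, which gives a convenient check of the routine.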

Generating function
The mgf of X can be determined from (13) as Then, for all t < (k + 1)α, we have

Rényi entropy
The entropy of X is a measure of variation of the uncertainty. There are many entropy measures studied and discussed in the literature, but the Rényi entropy is perhaps one of the most popular. The Rényi entropy of X with density (6) is given by
$$I_R(p) = \frac{1}{1-p}\,\log\left\{\int_0^{\infty} f(x)^{p}\,dx\right\}, \qquad (18)$$
where p > 0 and p ≠ 1. Now, we consider the generalized binomial expansion
$$(1-z)^{b} = \sum_{j=0}^{\infty} (-1)^{j}\binom{b}{j} z^{j}, \qquad (19)$$
which holds for any real non-integer b and |z| < 1. Using (19) twice in equation (4), we can write Inserting (1) and (2) in equation (20) and applying the binomial expansion twice from , we obtain Further, by inserting (21) in equation (18), the Rényi entropy reduces to where is the exponential integral function.
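As a numerical cross-check of any closed-form expression, the Rényi entropy can be evaluated directly from its definition, (1 − p)^{−1} log ∫ f(x)^p dx. The sketch below assumes the density f(x) = a b g(x){1 − G(x)}^{a−1}[1 − {1 − G(x)}^a]^{b−1} with the ƐƐ baseline and a density that is bounded near zero (for instance b ≥ 1). The upper integration limit `upper` is a user-chosen truncation point and is not part of the model.

```python
import math

def egee_pdf(x, a, b, alpha, beta):
    s = (alpha + beta + alpha * beta * x) * math.exp(-alpha * x) / (alpha + beta)
    g = alpha ** 2 * (1.0 + beta * x) * math.exp(-alpha * x) / (alpha + beta)
    return a * b * g * s ** (a - 1.0) * (1.0 - s ** a) ** (b - 1.0)

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def renyi_entropy(p, theta, upper):
    """Renyi entropy of order p (p > 0, p != 1), truncating the integral at `upper`."""
    if p <= 0.0 or p == 1.0:
        raise ValueError("p must be positive and different from 1")
    integral = simpson(lambda x: egee_pdf(x, *theta) ** p, 0.0, upper)
    return math.log(integral) / (1.0 - p)
```

In the exponential special case with rate α the entropy is −log α + log(p)/(p − 1), so `renyi_entropy(0.5, (1.0, 1.0, 2.0, 0.0), 80.0)` should be close to log 2.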

Order statistics
The density function $f_{i:n}(x)$ of the ith order statistic, say $X_{i:n}$, for i = 1, …, n, from a random sample $X_1, \ldots, X_n$ having the Ɛɡ distribution can be expressed as
$$f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\,f(x)\,F(x)^{i-1}\,\{1-F(x)\}^{n-i},$$
where f(x) is the pdf (4) and F(x) is the cdf (3).
Applying the binomial expansion in the last equation, we have
$$f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\sum_{j=0}^{n-i}(-1)^{j}\binom{n-i}{j}\,f(x)\,F(x)^{i+j-1}. \qquad (22)$$
Substituting (3) and (4) in equation (22) and applying the generalized binomial expansion (19), we can write Then, after a simple algebraic manipulation, we have where is given by and denotes the exp-ɡ density function with power parameter Equation (23) reveals that the density function of the Ɛɡ order statistic is a linear mixture of exp-ɡ densities. We emphasize that this result is not new and has already been presented by Cordeiro et al. (2013). However, we now give an alternative way of expressing the weights that compose this linear combination.
By combining equations (11) and (23) and after some algebra, we obtain where the quantity is given by (12) and $\pi(x; s+1, (m+1)\alpha)$ denotes the Erlang density with shape parameter s + 1 and scale parameter (m + 1)α.
Thus, based on (24), we obtain an important result that gives the density of $X_{i:n}$ as a double linear mixture of Erlang densities. Equation (24) has many applications, the most important being the calculation of the moments and the mgf of the ith order statistic. The rth moment of $X_{i:n}$ is given by Based on the results presented in Section 4.3, the last equation reduces to Next, the mgf of $X_{i:n}$ is given by Based on the results in Section 4.5, the last equation can be rewritten as for all t < (m + 1)α.

Reliability
Here, we derive the reliability, say R, for the ƐɡƐƐ model when $X_1$ ∼ ƐɡƐƐ($a_1$, $b_1$, α, β) and $X_2$ ∼ ƐɡƐƐ($a_2$, $b_2$, α, β) are two independent random variables with the same baseline parameters α and β. Let $f_1(x)$ denote the pdf of $X_1$ and $F_2(x)$ denote the cdf of $X_2$. The reliability can be expressed as $R = P(X_2 < X_1) = \int_0^{\infty} f_1(x)\,F_2(x)\,dx$ and using equations (8) and (9) gives where Thus, the reliability of X reduces to

Estimation and Inference
Several approaches for parameter estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy desirable properties and can be used when constructing confidence intervals and regions and also in test statistics. The normal approximation for these estimators in large sample distribution theory is easily handled either analytically or numerically. So, we consider the estimation of the unknown parameters a, b, α and β of the ƐɡƐƐ distribution from complete samples only by maximum likelihood. Let $x_1, \ldots, x_n$ be a random sample of size n from the ƐɡƐƐ distribution. The log-likelihood function for the vector of parameters $\theta = (a, b, \alpha, \beta)^{T}$, say $\ell(\theta)$, can be expressed as
$$\ell(\theta) = n\log(ab) + 2n\log\alpha - na\log(\alpha+\beta) + \sum_{i=1}^{n}\log(1+\beta x_i) + (a-1)\sum_{i=1}^{n}\log z_i - a\alpha\sum_{i=1}^{n}x_i + (b-1)\sum_{i=1}^{n}\log\left\{1 - \left[\frac{z_i\,e^{-\alpha x_i}}{\alpha+\beta}\right]^{a}\right\}, \qquad (25)$$
where $z_i = \alpha + \beta + \alpha\beta x_i$. Equation (25) can be maximized either directly by using the Ox program (sub-routine MaxBFGS), R (optim function) or SAS (PROC NLMIXED), or by solving the nonlinear likelihood equations obtained by differentiating $\ell(\theta)$.
The elements of the score vector are given by The MLE $\hat{\theta}$ of $\theta$ can be obtained numerically. For interval estimation and hypothesis tests on the parameters a, b, α and β, we determine the 4 × 4 observed information matrix given by $J(\theta) = \{-U_{rs}\}$, whose elements can be obtained from the authors upon request.
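A minimal sketch of the log-likelihood, suitable for handing to a numerical optimizer, is given below. It assumes the density f(x) = a b α²(1 + βx) z(x)^{a−1} e^{−aαx} (α + β)^{−a} {1 − [z(x) e^{−αx}/(α + β)]^a}^{b−1} with z(x) = α + β + αβx and strictly positive observations. The function name and the convention of returning −∞ outside the parameter space are illustrative choices.

```python
import math

def egee_loglik(theta, data):
    """Log-likelihood of theta = (a, b, alpha, beta) for a complete sample of x > 0."""
    a, b, alpha, beta = theta
    if a <= 0.0 or b <= 0.0 or alpha <= 0.0 or beta < 0.0:
        return -math.inf   # outside the parameter space
    n = len(data)
    # Terms that do not depend on the individual observations.
    total = n * (math.log(a * b) + 2.0 * math.log(alpha) - a * math.log(alpha + beta))
    for x in data:
        z = alpha + beta + alpha * beta * x
        s = z * math.exp(-alpha * x) / (alpha + beta)   # baseline survival 1 - G(x)
        total += (math.log1p(beta * x) + (a - 1.0) * math.log(z)
                  - a * alpha * x + (b - 1.0) * math.log1p(-s ** a))
    return total
```

In practice the negative of this function could be minimized with a general-purpose routine, for example `scipy.optimize.minimize` with the Nelder-Mead method (not imported here), starting from several initial values because the surface may have more than one local maximum.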

Applications to real data
Here, we present two applications to real data to illustrate the potential of the new distribution. First, in addition to the ƐɡƐƐ model, we consider the three-parameter ƐɡƐƐ(a, b, α, 0) and ƐɡƐƐ(a, b, α, 1) sub-models. Also, the three-parameter beta-Lindley (BL) distribution, proposed by Merovci and Sharma (2014), is compared with the ƐɡƐƐ distribution and its sub-models. All computations are performed using the SAS procedure NLMIXED.
The BL density is given by where B(a, b) = Γ(a)Γ(b)/Γ(a + b) is the beta function.
Table 1: The MLEs (and their standard errors in parentheses), AIC, BIC and CAIC statistics for the number of successive failures for the air conditioning system.
Further, we consider the formal goodness-of-fit tests based on the Cramér-von Mises (W*) and Anderson-Darling (A*) test statistics in order to verify which distribution fits the current data better. The W* and A* statistics are described in Chen and Balakrishnan (1995). In general, lower values of these statistics indicate a better fit to the data. Table 2 gives the values of the W* and A* statistics for all fitted models. Based on the figures in this table, we conclude that the ƐɡƐƐ distribution provides a better fit to these data than its sub-models and the BL distribution. Plots of the estimated pdf and cdf of the ƐɡƐƐ distribution and the histogram of the data are displayed in Figure 3. These plots reveal that the ƐɡƐƐ model fits the data adequately, so it can be chosen for modeling these data.
Second, we consider the data presented by  log-logistic, Fréchet and Birnbaum-Saunders (BS) distributions. The densities of these models are given on the Wolfram Alpha website (https://www.wolframalpha.com). Table 3 gives the MLEs of the fitted models to the current data with their corresponding standard errors, in addition to the AIC, BIC and CAIC statistics. Table 4 lists the values of the A* and W* statistics. The figures in Tables 3 and 4 suggest at least two important conclusions. The first is that the proposed ƐɡƐƐ model has the lowest values for the AIC, CAIC, A* and W* statistics and, therefore, may be chosen as the best model to analyze the current data. The second is that these results agree with what has been reported in the recent statistical literature: generalized models, such as the one proposed in this paper, usually fit better than non-generalized models. These conclusions emphasize the importance of the proposed model.
Finally, Figure 4 displays the estimated pdf and cdf of the ƐɡƐƐ model and the histogram of the data. These plots reveal that the proposed model is quite suitable for these data.

Conclusions
Recently, Cordeiro et al. (2013) introduced the exponentiated generalized (Ɛɡ) class of continuous distributions with two extra shape parameters. In this paper, we consider the Ɛɡ class to generalize the extended exponential (ƐƐ) distribution. We define a new four-parameter lifetime model called the exponentiated generalized extended exponential (ƐɡƐƐ) distribution, which includes as special cases the exponential, Lindley and exponentiated exponential distributions, among others. The hazard function of the new model can take the classic bathtub, inverted bathtub, increasing, decreasing and constant shapes. We demonstrate that the ƐɡƐƐ density can be expressed as a double linear mixture of Erlang densities. Further, we derive several basic mathematical properties of the ƐɡƐƐ model, including explicit expressions for the quantile function, ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves, generating function, Rényi entropy, density of the order statistics and reliability. We discuss the estimation of the model parameters by maximum likelihood. We conduct two applications to real data to illustrate the flexibility of the new model.
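Cordeiro et al. (2013) proposed a new way of adding two parameters to a continuous distribution. For a given continuous baseline cdf G(x), and x ∈ R, they defined the exponentiated generalized (Ɛɡ) class of distributions with two extra shape parameters a > 0 and b > 0 and cdf F(x) and pdf f(x) given by
$$F(x) = \left[1 - \{1 - G(x)\}^{a}\right]^{b} \qquad (3)$$
and
$$f(x) = a\,b\,g(x)\,\{1 - G(x)\}^{a-1}\left[1 - \{1 - G(x)\}^{a}\right]^{b-1}, \qquad (4)$$
respectively, where g(x) = dG(x)/dx and the dependence on the parameters of G(x) is implicit. To illustrate the flexibility of the Ɛɡ model, Cordeiro et al. (2013) applied (3) to extend some well-known distributions such as the Fréchet, normal, gamma and Gumbel distributions.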

For a = 1, equation (3) reduces to $F(x) = G(x)^{b}$ and, for b = 1, we obtain $F(x) = 1 - [1 - G(x)]^{a}$, which correspond to the cdfs of the Lehmann type I and II families (Lehmann, 1953), respectively. For this reason, the Ɛɡ model encompasses both Lehmann type I and type II classes. So, the Ɛɡ family can be derived from a double transformation using these classes. The two extra parameters a and b in the density (4) can control both tail weights, allowing one to generate flexible distributions with heavier or lighter tails, as appropriate. There is also an attractive physical interpretation of the model (3) when a and b are positive integers. This interpretation is described in Cordeiro and Lemonte (2014). The above properties and many others have been discussed and explored in recent works for the Ɛɡ class. Here, we refer to the papers Cordeiro et al. (2014), Cordeiro and Lemonte (2014), Elbatal and Muhammed (2014), Oguntunde et al. (2014) and da Silva et al. (2015), which used the Ɛɡ class to extend the Burr III, Birnbaum-Saunders, inverse Weibull, inverted exponential and generalized gamma distributions, respectively.
The exponential and Lindley distributions are obtained from (6) by setting β = 0 and β = 1, respectively, in addition to a = b = 1. The exponentiated generalized exponential (ƐɡƐ) model comes from (6) by setting β = 0 and the exponentiated generalized Lindley (ƐɡL) distribution follows when β = 1. The ƐɡƐƐ model also includes the Lehmann type I and type II transformations of the ƐƐ, exponential and Lindley distributions. For example, the well-known exponentiated exponential distribution (Gupta et al., 1998), also referred to in the literature as the generalized exponential distribution, follows when β = 0 and a = 1. For a brief discussion and some properties of the exponentiated exponential distribution, see a recent paper by Lemonte (2013). The exponentiated Lindley (ƐL) model by Nadarajah et al. (2011) (they called it the generalized Lindley distribution, but here we adopt the ƐL terminology) arises when a = β = 1. The hrf and reversed hazard rate function (rhrf) of X are given by $h(x) = f(x)/\{1 - F(x)\}$ and $r(x) = f(x)/F(x)$, respectively, with F(x) and f(x) as in (5) and (6). Plots of the ƐɡƐƐ density for selected parameter values are displayed in Figure 1. Figure 2 provides some possible shapes of the ƐɡƐƐ hazard function for appropriate parameter values, including bathtub, inverted bathtub, increasing, decreasing and constant shapes. These plots indicate that the ƐɡƐƐ model is fairly flexible and can be used to fit several types of positive data.

Figure 1: Plots of the ƐɡƐƐ density function for some parameter values.

Figure 2: Plots of the ƐɡƐƐ hazard function for some parameter values.

Figure 3: Plots of the estimated pdf and cdf of the ƐɡƐƐ model for the number of successive failures for the air conditioning system.

Figure 4: Plots of the estimated pdf and cdf of the ƐɡƐƐ model for the failure times of 50 components.

Table 2: Goodness-of-fit tests for the number of successive failures for the air conditioning system.

Table 3: The MLEs (and their standard errors in parentheses), AIC, BIC and CAIC statistics for the failure times of 50 components.

Table 4: Goodness-of-fit tests for the failure times of 50 components.