Comparing the exponentiated and generalized modified Weibull distributions

Abstract: In recent years, many modifications of the Weibull distribution have been proposed. Some of these modifications have a large number of parameters and so their real benefits over simpler modifications are questionable. Here, we use two data sets with modified unimodal (unimodal followed by increasing) hazard function for comparing the exponentiated Weibull and generalized modified Weibull distributions. We find no evidence that the generalized modified Weibull distribution can provide a better fit than the exponentiated Weibull distribution for data sets exhibiting the modified unimodal hazard function. In a related issue, we consider Carrasco et al. (2008), a widely cited paper, proposing the generalized modified Weibull distribution, and illustrating two real data applications. We point out that some of the results in both real data applications in Carrasco et al. (2008) are incorrect.


Introduction
The most popular lifetime distributions including the exponential, Weibull, gamma, Rayleigh, Pareto and Gompertz distributions have monotonic hazard functions (HFs), cf. Lawless (1982). However, certain lifetime data (for example, human mortality, machine life cycles and data from some biological and medical studies) require non-monotonic shapes like the bathtub shape, the unimodal (upside-down bathtub) shape or the modified unimodal (unimodal followed by increasing) shape.
The Weibull distribution is one of the most important, desirable and widely used lifetime distributions. It has been used in many different fields with many applications. The cumulative distribution function (CDF) of the Weibull distribution is simple and has a closed form, yielding simple expressions for its survival function (SF) and HF. It is a flexible distribution that can be used to fit different kinds of lifetime data sets in different fields. Moreover, its parameters have physical meanings and interpretations.
For many years, using different techniques, many researchers have developed various modified forms of the Weibull distribution to achieve non-monotonic shapes. Extensive reviews of some of these modifications have been presented, for example, see Rajarshi and Rajarshi (1988) and Murthy et al. (2003). Pham and Lai (2007) and Lai et al. (2011) presented brief reviews about modified Weibull models. Most of the modifications of the Weibull distribution (both continuous and discrete) were introduced in the last five years or so. Almalki and Nadarajah (2014) provide an extensive review of the continuous and discrete modifications of the Weibull distribution. Their review contains over 110 references on modifications/generalizations of the Weibull distribution and more than 55 percent of the cited references appeared in the last five years.
The main purpose of modified Weibull distributions is to fit data sets with non-monotonic HFs (bathtub, unimodal and modified unimodal). Many modifications of the Weibull distribution have achieved the above purpose. On the other hand, unfortunately, the number of parameters has increased, the forms of the SF and the HF have become complicated and estimation problems have arisen. Moreover, some of the modifications do not have closed form CDFs.
We believe that there are some modified Weibull distributions with a small number of parameters which have not received the attention they deserve. Also, there are modified Weibull distributions with a large number of parameters which need to be re-evaluated with respect to what they really contribute. Adding parameters to a nested model can never decrease the maximized likelihood. On the other hand, adding more parameters makes the estimation procedure more complicated.
Whenever a new distribution is proposed, its fit must be compared with all appropriate distributions having the same or fewer parameters. The fits can be compared by the likelihood ratio test if the distributions are nested or by information criteria like the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) if the distributions are not nested. Information criteria like the AIC and BIC account for the increase in the maximum likelihood value as well as the number of parameters added. The smaller the values of these criteria, the better the fit. Any newly proposed distribution should be shown to provide significantly better fits than all appropriate distributions having the same or fewer parameters for a range of real data sets. This exercise is often not performed for newly proposed distributions in the literature.
The GMW distribution has been widely cited in the statistics and related literature in the last few years. It is important to note for later reasons that (1), (2), (3) and (4) are not valid functions if λ < 0. For example, the CDF in (1) is not a monotonic increasing function of x if λ < 0. The PDF in (3) can take negative values if λ < 0. Also, the HF in (4) can take negative values if λ < 0.
Carrasco et al. (2008) applied the GMW distribution to two well-known censored data sets and compared its goodness-of-fit with that of its sub-models. The first data set is the serum-reversal data of Silva (2004) and Perdoná (2006). The TTT-plot for this data set is shown in Figure 1 (a); it takes a convex shape followed by a concave shape. This corresponds to a bathtub shaped HF.
The second data set, referred to as the radiotherapy data, consists of survival times in days of fifty-one cancer patients undergoing radiotherapy. The TTT-plot for this data set is shown in Figure 1 (b); it takes a concave shape followed by a convex shape followed by a concave shape. Carrasco et al. (2008) mention that this corresponds to a unimodal HF, whereas such a shape indicates a modified unimodal HF. Unfortunately, this is not the only mistake in Carrasco et al. (2008). There are some other mistakes in the results of both applications. We show these mistakes later.
The third data set in Table 1 consists of survival times of seventy-two pigs infected by virulent tubercle bacilli (Greenwich, 1992). We shall refer to it as the infected pigs data set. It has a modified unimodal shaped HF, as shown later.
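The remark that the GMW functions are not valid for λ < 0 can be checked numerically. The sketch below is only an illustration: equations (1)-(4) are not reproduced in this section, so the CDF form F(x) = [1 − exp(−α x^γ e^{λx})]^β is an assumption based on our reading of Carrasco et al. (2008), the function names are hypothetical, and the values of α, γ and β are arbitrary illustrative choices rather than fitted estimates.

```python
import math

def gmw_cdf(x, alpha, gamma, lam, beta):
    """Assumed GMW CDF: [1 - exp(-alpha * x**gamma * exp(lam * x))]**beta, x > 0."""
    z = alpha * x**gamma * math.exp(lam * x)
    return (1.0 - math.exp(-z)) ** beta

def gmw_pdf(x, alpha, gamma, lam, beta):
    """Derivative of gmw_cdf with respect to x (chain rule applied to the form above)."""
    z = alpha * x**gamma * math.exp(lam * x)
    return (alpha * beta * x ** (gamma - 1.0) * (gamma + lam * x)
            * math.exp(lam * x - z) * (1.0 - math.exp(-z)) ** (beta - 1.0))

def gmw_hf(x, alpha, gamma, lam, beta):
    """Hazard function: pdf divided by the survival function."""
    return gmw_pdf(x, alpha, gamma, lam, beta) / (1.0 - gmw_cdf(x, alpha, gamma, lam, beta))

# Hypothetical parameter values, chosen only to make the effect visible.
alpha, gamma, beta = 0.1, 1.0, 1.0
lam_neg = -0.023
x_turn = -gamma / lam_neg  # the factor (gamma + lam * x) changes sign here

for x in (10.0, x_turn, 60.0):
    print(x, gmw_cdf(x, alpha, gamma, lam_neg, beta), gmw_hf(x, alpha, gamma, lam_neg, beta))
```

Under these assumptions the factor (γ + λx) becomes negative once x exceeds −γ/λ, so the "CDF" starts to decrease and the "HF" becomes negative beyond that point.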
nlm converged every time, and it always converged to a unique maximum. This gives us confidence in the reported MLEs.
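The fits were compared using the p-values of the Kolmogorov-Smirnov (KS), Anderson-Darling (AD) and Cramér-von Mises (CvM) goodness of fit tests, and the AIC, BIC, CAIC and AICc information criteria.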

Serum-reversal data
Table 1 in Carrasco et al. (2008) shows the MLEs of the parameters of the GMW distribution and its sub-models (MW, EW, EE, Weibull and GR distributions) for the serum-reversal data. All these results appear correct. It is clear that the GMW distribution presents a very good fit for this data with respect to AIC, BIC and CAIC values. We computed the Kolmogorov-Smirnov statistic for the GMW distribution and its sub-models. Again, the GMW distribution has the smallest Kolmogorov-Smirnov statistic with the value 0.117, whilst the Kolmogorov-Smirnov statistics for the MW, EW, EE, Weibull and GR distributions are 0.155, 0.169, 0.259, 0.182 and 0.247, respectively. The HF for the fitted GMW distribution is plotted in Figure 4c of Carrasco et al. (2008).

Radiotherapy data
The fitted MLEs for this data are presented in Table 2 of Carrasco et al. (2008). Unfortunately, the MLEs for the GMW and MW distributions and the corresponding AIC, BIC and CAIC measures appear incorrect. We now explain the mistakes.

Modified Weibull distribution
The MLEs of the parameters of the MW distribution reported in Table 2 of Carrasco et al. (2008) are α̂ = 0.001, γ̂ = 1.245 and λ̂ = 0.001. But the reported values of AIC = 594.4, BIC = 600.1 and CAIC = 594.9 appear to have been computed using λ = −0.001 (an invalid value for λ). These values are very close to the values of AIC, BIC and CAIC reported in Table 2 of Carrasco et al. (2008) for the GMW distribution. But the shape of the HF of the MW distribution cannot be unimodal, so it is surprising that the MW and GMW distributions fit equally well for a data set exhibiting a unimodal HF.
For the MLEs of the MW distribution reported in Table 2 of
Note from Table 2 that the GMW distribution we fitted is actually an EW distribution since the MLE of λ is zero (that is, the likelihood for the GMW distribution for the given data appears largest when λ = 0). Also, the MW distribution we fitted is actually a Weibull distribution since the MLE of λ is zero (that is, the likelihood for the MW distribution for the given data appears largest when λ = 0). So, the added parameter λ does not improve the fit of the EW distribution or the fit of the Weibull distribution. This can happen sometimes when the parameter is restricted to be positive (Liddle, 2004).

Generalized modified Weibull distribution
According to Table 2 in Carrasco et al. (2008), the MLE of λ is 0.0002. But Carrasco et al. (2008) appear to have used λ̂ = −0.0002 (an invalid value for λ) to plot the SF and the HF for the fitted GMW distribution. Furthermore, the reported AIC, BIC and CAIC measures appear to have used the same negative value.
Figure 4a shows the SF for the fitted GMW distribution using the MLEs in Table 2 of Carrasco et al. (2008), and Figure 4c shows the corresponding HF. Table 2 shows the MLEs of the GMW distribution computed by Carrasco et al. (2008) (when λ̂ = −0.0002 and when λ̂ = 0.0002), the MLEs we obtained (with the corresponding standard errors in brackets) and the values of AIC, BIC, AICc, CAIC, KS, AD and CvM we obtained. The SF and the HF for the GMW distribution we fitted are plotted in Figures 4b and 4d.
Figure 5 plots the profile log likelihood functions for the GMW distribution around the MLEs reported in Table 2. The plots suggest that the MLEs reported in Table 2 are unique.
Figure 6 compares the QQ plots for the fits of the GMW and MW distributions. The points for the MW distribution are closer to the diagonal line, so it provides a better fit than the GMW distribution.

Infected pigs data
The TTT-plot for this data shown in Figure 7 shows a concave shape and then a convex shape followed by a concave shape. This corresponds to the HF being modified unimodal shaped. The EW distribution has the smaller values for the AIC, the BIC, the AICc and the CAIC and larger p-values. The GMW distribution has the larger log-likelihood. But the likelihood ratio test statistic for testing H0 : λ = 0 versus H1 : H0 is false is 0.154 and the corresponding p-value is 0.695, so there is no evidence to reject H0. Hence, the GMW distribution does not improve significantly on the fit of the EW distribution. Figures 8a and 8b show that both distributions provide good fits. Figure 8d shows that both distributions provide good fits to the initial and middle parts of the non-parametric HF. But neither of the distributions appears to capture the last part of the non-parametric HF well.
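The p-value quoted for the likelihood ratio test can be reproduced with a few lines of code. This is a minimal sketch, assuming the usual chi-square reference distribution with one degree of freedom (one extra parameter, λ); the function names are hypothetical. Because λ = 0 lies on the boundary of the parameter space, the chi-square reference is itself an approximation.

```python
import math

def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic: twice the gain in maximized log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)

def lrt_pvalue_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > s) = erfc(sqrt(s / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))

# Statistic reported above for H0: lambda = 0 on the infected pigs data.
print(round(lrt_pvalue_df1(0.154), 3))  # 0.695
```

A p-value this large means the extra parameter of the GMW distribution buys no detectable improvement over the EW distribution for this data set.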
Figure 9 plots the profile log likelihood functions for the EW distribution around the MLEs reported in Table 3. The plots suggest that the MLEs reported in Table 3 are unique.
Figure 10 compares the QQ plots for the fits of the GMW and EW distributions. There is little difference in terms of closeness of the points to the diagonal line. So, the EW distribution should be preferred since it is simpler.
Suppose each infected pig has β components working in parallel, and that the pig dies only when every component has failed. Suppose also that the components work independently and that the lifetime of each component is Weibull distributed. Under these assumptions, the survival time of each infected pig will have the EW distribution. Given the parameter estimates, there are approximately 310250 components in each infected pig, the mean lifetime of each component is approximately 0.112 and the variance of the lifetime of each component is approximately 2.271.
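The parallel-system interpretation can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it takes the Weibull CDF as 1 − exp(−α x^γ) and the EW CDF as [1 − exp(−α x^γ)]^β, with β a positive integer; these parameterizations are not reproduced in this section and are assumed here. The parameter values and function names are hypothetical and unrelated to the fitted estimates.

```python
import math
import random

def ew_cdf(x, alpha, gamma, beta):
    """Assumed EW CDF: [1 - exp(-alpha * x**gamma)]**beta."""
    return (1.0 - math.exp(-alpha * x**gamma)) ** beta

def simulate_system_lifetimes(n_systems, alpha, gamma, beta, seed=0):
    """Lifetime of a parallel system = maximum of beta independent Weibull lifetimes."""
    rng = random.Random(seed)
    # random.weibullvariate takes (scale, shape); scale = alpha**(-1/gamma)
    # gives the component CDF 1 - exp(-alpha * x**gamma).
    scale = alpha ** (-1.0 / gamma)
    return [max(rng.weibullvariate(scale, gamma) for _ in range(beta))
            for _ in range(n_systems)]

def empirical_cdf(sample, x):
    """Fraction of simulated lifetimes not exceeding x."""
    return sum(v <= x for v in sample) / len(sample)

# Hypothetical values for illustration only.
alpha, gamma, beta = 0.5, 1.5, 4
lifetimes = simulate_system_lifetimes(20000, alpha, gamma, beta)
for x in (1.0, 2.0, 3.0):
    print(x, empirical_cdf(lifetimes, x), ew_cdf(x, alpha, gamma, beta))
```

The agreement follows from independence: the system has failed by time x only if all β components have, so the system CDF is the component CDF raised to the power β.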

Conclusions
The added parameter λ of the GMW distribution over the EW distribution did not improve the maximum likelihood function for the radiotherapy data. Also, the GMW distribution did not provide a better fit than the EW distribution for the infected pigs data. Both data sets have modified unimodal shaped HFs. Based on this, there is no evidence that the GMW distribution can provide a better fit than the EW distribution for data sets exhibiting modified unimodal HFs.
We have pointed out some incorrect results in Carrasco et al. (2008). The fitted HF for the serum-reversal data set using the GMW distribution is incorrect because it is plotted using a negative value of λ. For the radiotherapy data set, although Carrasco et al. (2008) reported a positive value for λ, they used a negative value to plot the fitted SF and HF. Also, the AIC, the BIC, the AICc and the CAIC were calculated using the same negative value.

Figure 1: (a) TTT-transform plot for the serum-reversal data, (b) TTT-transform plot for the radiotherapy data.
The measures used to compare the fits are:
• the p-value of the goodness of fit test based on the Kolmogorov-Smirnov (KS) statistic;
• the p-value of the goodness of fit test based on the Anderson-Darling (AD) statistic;
• the p-value of the goodness of fit test based on the Cramér-von Mises (CvM) statistic;
• the AIC due to Akaike (1974) defined by $\mathrm{AIC} = 2k - 2 \ln L(\hat{\phi}, x_i)$;
• the BIC due to Schwarz (1978) defined by $\mathrm{BIC} = k \ln n - 2 \ln L(\hat{\phi}, x_i)$;
• the consistent Akaike information criterion (CAIC) due to Bozdogan (1987) defined by $\mathrm{CAIC} = -2 \ln L(\hat{\phi}, x_i) + k (\ln n + 1)$;
• the corrected Akaike information criterion (AICc) due to Hurvich and Tsai (1989) defined by $\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}$,
where ϕ is a vector of unknown parameters of length k, ϕ̂ is the MLE of ϕ, x_i, i = 1, 2, . . ., n are the observed data of size n, and L(ϕ̂, x_i) is the likelihood function.
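The four information criteria can be computed directly from the maximized log-likelihood, the number of parameters k and the sample size n. The sketch below assumes the standard forms of the criteria attributed to Akaike (1974), Schwarz (1978), Bozdogan (1987) and Hurvich and Tsai (1989); the function name is hypothetical. As an illustration it uses the negative log-likelihood of 398.201 reported for the EW fit to the infected pigs data, with k = 3 and n = 72.

```python
import math

def information_criteria(neg_loglik, k, n):
    """Return AIC, BIC, CAIC and AICc from a negative log-likelihood.

    neg_loglik : minimized negative log-likelihood, -ln L
    k          : number of estimated parameters
    n          : sample size
    """
    minus_two_loglik = 2.0 * neg_loglik
    aic = minus_two_loglik + 2.0 * k
    bic = minus_two_loglik + k * math.log(n)
    caic = minus_two_loglik + k * (math.log(n) + 1.0)    # assumed Bozdogan form
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)     # assumed Hurvich-Tsai form
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "AICc": aicc}

# EW fit to the infected pigs data: three parameters, seventy-two observations.
criteria = information_criteria(398.201, k=3, n=72)
for name, value in criteria.items():
    print(name, round(value, 4))
```

All four criteria penalize the same maximized log-likelihood and differ only in how heavily they charge for each parameter, which is why a model with an extra parameter must raise the log-likelihood noticeably to be preferred.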
Figure 4c of Carrasco et al. (2008) appears incorrect because it is well known that the HF must be nonnegative everywhere. It appears Carrasco et al. (2008) plotted the HF for the fitted GMW distribution using λ = −0.023, an invalid value for λ, see Section 1. The MLE of λ reported in Table 1 of Carrasco et al. (2008) is 0.023. The HF for the fitted GMW distribution when λ = −0.023 is plotted in Figure 2a, just to show that Figure 4c in Carrasco et al. (2008) was plotted using this negative value. Figure 2b presents the non-parametric HF of the data and the HF for the fitted GMW distribution using the MLEs in Table 1 of Carrasco et al. (2008).

Figure 2: For the serum-reversal data: (a) HF presented in Carrasco et al. (2008), (b) Non-parametric HF and the HF for the fitted GMW distribution using estimates in Table 1 of Carrasco et al. (2008).

Figure 3: For radiotherapy data: (a) SF presented in Carrasco et al. (2008), (b) Our SF based on the MW distribution, (c) HF presented in Carrasco et al. (2008), (d) Our HF based on the MW distribution.
Furthermore, the p-values of the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises statistics for the fitted GMW distribution (which is an EW distribution) are larger than the p-values corresponding to the estimates in Carrasco et al. (2008). The p-values for the fitted MW distribution (which is a Weibull distribution) are also larger than the p-values corresponding to the estimates in Carrasco et al. (2008).
Figures 4a and 4c plot the fitted GMW distribution both for λ̂ = 0.0002, the MLE reported in Table 2 of Carrasco et al. (2008), and for λ̂ = −0.0002. The curves for λ̂ = −0.0002 are drawn as red solid lines; the curves for λ̂ = 0.0002 are drawn as a blue dashed line for the SF and as blue open circles for the HF. The SF and the HF for the fitted GMW distribution with λ̂ = −0.0002 appear to be the same as Figures 5b and 5c in Carrasco et al. (2008).

Figure 4: For the radiotherapy data: (a) SF presented in Carrasco et al. (2008), (b) Our SF based on the GMW distribution, (c) HF presented in Carrasco et al. (2008), (d) Our HF based on the GMW distribution.

Figure 5: For the radiotherapy data: the profile log likelihood functions of the four parameters of the GMW distribution.

Figure 6: QQ plot for the fits of GMW and MW distributions for the radiotherapy data.

Table 3: MLEs of parameters, standard errors, AIC, BIC, AICc, CAIC, KS, AD and CvM for the distributions fitted to the infected pigs data set.
Table 3 shows the MLEs of the parameters, their standard errors, AIC values, BIC values, AICc values, CAIC values and p-values of the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises statistics for the fitted GMW and EW distributions. The negative log-likelihood for the fitted GMW distribution is 398.200. The negative log-likelihood for the fitted EW distribution is 398.201. The AIC, BIC, CAIC and AICc values for the fitted GMW and EW distributions are 802.4004, 809.2304, 812.2304, 802.7533 and 802.4021, 809.2321, 812.2321, 802.7551, respectively. The p-values of the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises statistics for the two distributions are 0.062, 0.069, 0.070 and 0.059, 0.060, 0.061, respectively.

Figure 8 (a) shows the histogram of the data and the fitted PDFs. Figure 8 (b) shows the empirical SF of the data and the fitted SFs. Figure 8 (d) shows the non-parametric HF of the data and the fitted HFs.

Figure 7: TTT-transform plot for the infected pigs data.
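A TTT-transform plot such as this one can be produced from the ordered sample. The sketch below is a minimal illustration assuming a complete (uncensored) sample and the usual scaled total-time-on-test transform, T(r/n) = [x_(1) + ... + x_(r) + (n − r) x_(r)] / (x_1 + ... + x_n); the function name is hypothetical and the three-point sample is made up for the example.

```python
def scaled_ttt(sample):
    """Return the points (r/n, T(r/n)) of the scaled TTT transform.

    Assumes a complete sample of positive lifetimes (no censoring).
    """
    xs = sorted(sample)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0  # sum of the r smallest observations so far
    for r, x in enumerate(xs, start=1):
        running += x
        # total time on test up to the r-th failure, scaled by the overall total
        points.append((r / n, (running + (n - r) * x) / total))
    return points

# Made-up sample for illustration only.
for u, t in scaled_ttt([3.0, 1.0, 2.0]):
    print(round(u, 3), round(t, 3))
```

Plotting T(r/n) against r/n and reading the sequence of concave and convex segments gives the qualitative shape of the HF, which is how the shapes described for the three data sets are identified.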

Figure 9: For infected pigs data: the profile log likelihood functions of the three parameters of the EW distribution.

Figure 10: QQ plot for the fits of GMW and EW distributions for the infected pigs data.