A Two-Parameter Distribution with Increasing and Bathtub Hazard Rate

In this paper a new two-parameter distribution is proposed. This new model provides more ﬂexi-bility to modeling data with increasing and bathtub hazard rate function. Several statistical and reliability properties of the proposed model are also presented in this paper, such as moments, moment generating function, order statistics and stress-strength reliability. The maximum likelihood estimators for the parameters are discussed as well as a bias corrective approach based on bootstrap techniques. A numerical simulation is carried out to examine the bias and the mean square error of the proposed estimators. Finally, an application using a real data set is presented to illustrate our model.


Introduction
Mixture models have been playing an important role in distribution theory (Patil and Rao, 1978). In recent years, there has been a renewed interested in proposing new models based on mixture distribution. A simple case can be considered where new models are generated by a two-component mixture f (t|Λ 1 , Λ 2 , p) = pf 1 (t|Λ 1 ) + (1 − p)f 2 (t|Λ 2 ) , where 0 ≤ p ≤ 1 is mixing proportion (MP) and Λ 1 , Λ 2 are the parameters related to the probability density function (PDF) f 1 (·) and f 2 (·). Carta and Ramirez (2007) considered (1) based on a mixture of two Weibull distribution where the MP is a free unknown parameter to be estimated. Ramos and Louzada (2019) proposed a new one parameter distribution based on the mixture of a gamma and an exponential distribution. Ghitany et al. (2011) proposed a weighted Lindley model based on the mixture of two gamma distribution, where the MP is p = θ/(λ + θ), λ > 0 and θ > 0. Further, Ghitany et al. (2013) considered λ = 1 for MP and two Weibull distributions. The obtained model was named as power Lindley distribution. Ramos and Louzada (2016) unified these models by considering the mixture of two generalized gamma distributions. The main reason for the authors considered such MP comes from the fact that the new distributions are generalizations of the well known Lindley distribution. Other generalizations of the Lindley distribution can be view in Ekhosuehi et al. (2018); Yassmen (2019); Kwong and Nadarajah (2019).
On the other hand, many different MPs can be considered instead. For instance, let p = (λ − 2θ)/(λ − θ), then 0 ≤ p ≤ 1 if λ ≥ 0 and 0 ≤ θ ≤ λ/2. Here, we considered this MP based on two gamma distribution. The obtained distribution has PDF given by where 0 ≤ θ ≤ λ/2 and λ > 0 are the parameters and Γ(θ) = ∞ 0 e −x x θ−1 dx is the gamma function. Although we have considered two gamma distribution as f 1 (·) and f 2 (·), the same idea can be extended for other distributions such as Weibull, Lognomal, Exponentied Exponential among others. To the best of our knowledge, this distribution has not yet been presented in the literature. We also have not encountered its respective one parameter particular cases.
Mixture of gamma density functions has been used to describe heterogeneity (see, Mayrose et al. (2005)). In this paper, a significant account of mathematical properties has been presented for the new distribution such as moments, survival properties and its entropy function. For the new distribution the hazard function has increasing or bathtub shape, depending on the values of the parameters. This property plays an important role to describe lifetime data (Chen, 2000;Wang et al., 2002). The stress-strength parameter R = P (T 2 < T 1 ) where T 1 and T 2 have PDFs given by f (t|θ 1 , λ) and f (t|θ 2 , λ), respectively. The maximum likelihood estimators (MLEs) of the parameters and its asymptotic properties are discussed. Further, we also present a bias corrective approach for the MLEs based on bootstrap techniques. A simulation study is conducted to examine the performance of the proposed estimators. Finally, use our model to describe a data set related to 30 patients with brain cancer receiving radiotherapy.
The paper is organized as follows. Section 2 introduces our new distribution and its properties such as moments, moment generating function, order statistics, survival properties and stress-strength reliability. Section 3 presents the estimators of the unknown parameters based on MLEs. In Section 4, a simulation study to verify the performance of the MLEs is reported. Section 5 illustrates the relevance of our proposed methodology for a real lifetime data. Section 6 summarizes the present study.

Properties
The proposed distribution can be expressed as a two-component mixture where is a Gamma density given by After some algebraic manipulation in (3) the new non-negative random variable has PDF given by (5). Ghitany et al. (2011) followed a similar way but considering that the MP is p = θ/(λ + θ).
The behavior of the PDF (1) when t → 0 and t → ∞ are, respectively, given by   The cumulative distribution function from the new distribution is given by where γ(x, y) = x 0 w y−1 e −y dw is the lower incomplete gamma function. The proposed model may be considered a generalization of the one parameter case given by in which is also a new distribution.

Moments and Moment Generating Function
Moments play an important role in statistical theory, in this section we provide the r-th moment, the mean, variance and the moment generating function for our distribution. Proposition 1. For the random variable T with new distribution, the r-th moment is given by Proof. Note that if X ∼ Gamma(θ, λ) distribution then the r-th is given by Since the proposed model can be expressed as a two-component mixture, we have Proposition 2. The r-th central moment for the random variable T is given by Proof. The result follows directly from the Proposition 1.
Proposition 3. A random variable T with PDF (5) has the mean and variance given by Proof. From (6) and considering r = 1, it follows that µ 1 = µ. The second result follows from (7) considering r = 2 and with some algebra the proof is completed.
Proposition 4. For the random variable T , the Moment Generating Function is given by , and after some algebric computations the proof is completed.

Order Statistics
Let X 1 , X 2 , . . . X n be a random sample from (5) and X 1:n ≤ X 2:n ≤ . . . ≤ X n:n denote the the corresponding order statistics. It is well known that the probability density function and the cumulative distribution function of the of r−th order statistic say X r:n , 1 ≤ r ≤ n are given by and respectively, for k = 1, 2, . . . , n. It follows from (10) and (11) that

Survival Properties
The survival function of T representing the probability of an observation does not fail until a specified time t is given by where Γ(y, x) = ∞ x w y−1 e −w dw is the uper incomplete gamma function. The hazard function of T is given by This model has increasing and bathtub hazard rate. The following Lemma is useful to prove such result. Lemma 1. Glaser (1980): Let T be a non-negative continuous random variable with twice differentiable PDF f (t) and hazard rate function h(t) and h(t) has bathtub. Theorem 1. The hazard function (13) is bathtub if θ < 1 and increasing if θ ≥ 1.

Proof. Let
it follows that From (15) we observe that, if θ ≥ 1 then η (t) > 0 and consequently the hazard function is increasing ∀t > 0. By other hand, if θ < 1 the function η(t) has a global minimum given by , that is, η(t) has a bathtub shape and since that implies that, h(t) has a bathtub shape.

Inference
In this section we present the maximum likelihood estimator of the parameters θ and λ of the proposed distribution as well as a bias corrective approach.

Maximum Likelihood Estimation
Let T 1 , . . . , T n be a random sample such that T has PDF given in (1). In this case, the likelihood function from (5) is given by The log-likelihood function l(Θ; t) = log L(Θ; t) is given by From the expressions ∂ ∂θ l(Θ; t) = 0, ∂ ∂λ l(Θ; t) = 0, we get the likelihood equations where ψ(k) = ∂ ∂k log Γ(k) = Γ (k) Γ(k) is the digamma function. Under mild conditions (see Migon et al. (2014)) the ML estimates are asymptotically normal distributed with a bivariate normal distribution given by where the elements of the Fisher information matrix given by and ψ (k) = ∂ ∂ 2 k log Γ(k) is the trigamma function.

Bootstrap resampling method
In this section, we considered an corrective approach to reduce the bias of the MLEs. For our proposed model closed-form expressions for the bias are not possible since the higher-order derivatives do not have closed-form expression. To overcome this problem we consider the bootstrap resampling method proposed by Efron (1992) is a power alternative to reduce the bias of the MLEs specially in cases where the it is difficult to derive the analitical expression of the bias (see Cox and Snell (1968)). This method consists in generating pseudo-samples from the original to estimate the bias of the MLEs. Further, bias-corrected MLEs are obtained by subtraction of the estimated bias in relation to the original MLEs.
Here, we follow the same steps as describe in Reath et al. (2018). Let t = (t 1 , . . . , t n ) be a sample with size n randomly selected from the random variable T and has the distribution function F = F η (t). Also, let the parameter η be a function of F given by η = t(F ) andη be an estimator of η based on t, that is,η = s(t). The pseudo-samples t * = (t * 1 , . . . , t * n ) are obtained by resampling with replacement the original sample t. The bootstrap replicates ofη is calculated, whereη * = s(t * ) and the empirical cdf (ecdf) ofη * is used to estimate Fη (cdf ofη). Let B F (η, η) be the bias of the estimatorη = s(t) given by Note that the expectation is obtained in respect to F . The bootstrap estimators of the bias were obtained by replacing F with Fη, where F is generated from the original sample. Therefore, the bootstrap bias estimate isB By taking M bootstrap samples (t * (1) , t * (2) , . . . , t * (M ) ) that are generated independently from the original sample t and the respective bootstrap estimates (η * (1) ,η * (2) , . . . ,η * (M ) ) are calculated, then we can easily obtain the approximately bootstrap expectations E Fη [η * ] bŷ Hence, the obtained bias estimate based on M replications ofη isB F (η, η) =η * (.) −η, this implies that the bias corrected estimators obtained through by bootstrap resampling method can be obtained by For the proposed model, we have η B denoted byθ BOOT = (φ BOOT ,λ BOOT ) .

Simulation Analysis
In this section, a simulation study is presented to compare the efficiency of the maximum likelihood method with the bias correction approach in the presence of complete and censored data. These comparisons are performed by computing the Bias and the mean square errors (MSE) given by where N is the number of estimates and Θ = (θ, λ). The data set is generated as follows: 1. Generate U i ∼ Uniform(0, 1), i = 1, . . . , n; 2. Generate X i ∼ Gamma(θ, λ), i = 1, . . . , n; 3. Generate Y i ∼ Gamma(θ + 1, λ), i = 1, . . . , n;

If
The simulation study is performed under the assumption of n = (20, 35, . . ., 130), (θ, λ) = ((1.5, 3), (0.5, 3)) and N = 50, 000 . It is important to point out that, the results of this simulation study were similar for different choices of θ and λ. Following Reath et al. (2018) we considered M = 1, 000 to compute the bootstrap method. The programs can be obtained, upon request. Figures 4-5 present the Bias, the MSE and the coverage probability with a 95% confidence level of the estimates obtained through the MLEs and the Bias corrected MLES for different samples of size.
From the obtained results, we can conclude that as there is an increase of n both Bias and MSE tend to zero, i.e., the estimators are asymptotic efficiency. Moreover, the coverage probability of the confidence levels tend to the nominal value assumed 0.95. Therefore, the MLE showed to be a good estimator for the parameters of the proposed distribution.

An Application
In this section, the proposed distribution is fully applied in an important data set. The data have been presented by the Medical Research Council Working Party on Misonidazole in Gliomas (MRC Working Party on Misonidazole in Gliomas, 1983). Table 1 consists of the time in day (divided per 1000) of 30 patients with brain cancer receiving radiotherapy alone or radiotherapy   (Gupta and Kundu, 2001).
The discrimination among the models are conducted by different discrimination criteria based on log likelihood function. Such discrimination criterion methods are respectively: • Akaike information criterion AIC = −2l(θ; x) + 2k; • Corrected Akaike information criterion AICC = AIC +(2 k (k + 1))/(n − k − 1); • Hannan-Quinn information criterion HQIC = −2 l(θ; x) + 2 k log (log(n)); • Consistent Akaike information criterion CAIC = −2 l(θ; x) + k (log(n) + 1), where k is the number of parameters to be fitted andθ is the MLEs of θ,  The best model is the one which provides the minimum values of these criteria. The Kolmogorov-Smirnov (KS) test is also considered in order to check the goodness of the fit for the models. This procedure depends on the KS statistic D n = sup x |F n (x) − F (x; θ)|, where sup x is the supremum of the set of distances, F n (x) is the empirical distribution function and F (x; θ) is the cdf of the fitted distribution. Considering a significance level of 5%, if the data comes from F (x; θ) (null hypothesis), then hypothesis is rejected if the p-value is smaller than 0.05. Table 2 presents the results of AIC, AICc, HQIC, CAIC criteria, for the compared distributions. Table 3 displays the MLEs, standard-error and 95% confidence intervals for θ and λ.
In Figure 6, we have the survival function adjusted by the compared distributions and the non-parametric survival function.
Comparing the empirical survival function with the adjusted models we observe a goodness of the fit for the proposed model, which is confirmed from different discrimination criterion methods as the new distribution has the minimum value for all statistics and the largest for the P-value. Consequently, we conclude that the data related to 30 patients with brain cancer receiving radiotherapy alone or radiotherapy plus the radiosensitizer misonidazol can be described by our new distribution.

Concluding Remarks
In this paper, we introduce a new two-parameter distribution. Further, its mathematical properties were studied in detail. The hazard function of this distribution showed increasing and bathtub hazard rate, which is uncommon for simple two-parameters distribution. The MLEs for the parameters, its asymptotic properties and a bias corrective approach based on bootstrap techniques were discussed. The simulation study showed that as the samples size increase, both Bias and MSE tend to zero, i.e., the estimators are asymptotic efficiency. Finally, the practical importance of our model was reported in a real application, the goodness of fit for the proposed data set showed that our model returned better fitting in comparison with other well-known  Figure 6: Survival function adjusted by the compared distributions and a non-parametric method considering the data sets related to 30 patients with brain cancer receiving radiotherapy alone or radiotherapy plus the radiosensitiser misonidazol. distributions.