Bivariate Geometric ( Maximum ) Generalized Exponential Distribution

In this paper we propose a new five parameter bivariate distribution obtained by taking geometric maximum of generalized exponential distributions. Several properties of this new bivariate distribution and its marginals have been investigated. It is observed that the maximum likelihood estimators of the unknown parameters cannot be obtained in closed form. Five non-linear equations need to be solved simultaneously to compute the maximum likelihood estimators of the unknown parameters. We propose to use the EM algorithm to compute the maximum likelihood estimators of the unknown parameters, and it is computationally quite tractable. We performed extensive simulations study to see the effectiveness of the proposed algorithm, and the performance is quite satisfactory. We analyze one data set for illustrative purposes. Finally we propose some open problems.


Introduction
Generalized exponential (GE) distribution has received some attention in recent years in the statistical literature.It has been introduced by Gupta and Kundu (1999) as an alternative to gamma or Weibull distributions.A two-parameter GE distribution can be used quite effectively for analyzing lifetime data in place of two-parameter Weibull or two-parameter gamma distributions.It is observed that the probability density function (PDF) and the hazard function (HF) of a GE distribution are very similar to the corresponding PDF and HF of a gamma or a Weibull distribution.Since the cumulative distribution function of a GE distribution can be expressed in explicit form, this model can be used to analyze censored data quite conveniently.A brief review of GE distribution has been provided in Section 2. Marshall and Olkin (1997) in their classical paper introduced a method to add an extra parameter to a family of distributions, and discussed in details in particular the generalization of exponential and Weibull families.Due to presence of an extra parameter, the proposed class of distributions is more flexible than the exponential or Weibull class, respectively.Since then extensive work has been done along that line, and many researchers have investigated the same approach for different other distributions, see for example, Ghitany et al. (2007), Louzada et al. (2014), Ristic and Kundu (2015) and the references cited therein.Marshall and Olkin (1997) in their paper also indicated about the bivariate extension of the model.
Although, the proposed class of distributions is a more flexible than the original class of distributions, they did not discuss any properties or inference related issues of the bivariate model.It seems the problem becomes analytically quite intractable in its general set up.Special attention may be needed for specific distribution.This is an attempt towards that direction.
The main aim of this paper is to introduce a bivariate distribution obtained by geomet-ric maximum of generalized exponential distribution.In this case the method proposed by Marshall and Olkin (1997) may not produce the bivariate distribution in such a tractable form.In this paper, instead of minimization approach, as suggested by Marshall and Olkin (1997), we have taken the maximization approach, which produces a new class of bivariate distributions which are analytically quite tractable.Different properties of this new distribution have been investigated.Due to presence of five parameters, it is a very flexible model, and the joint PDF can take different shapes.Hence, it can be used quite effectively to analyze bivariate data.Moreover, it has some physical interpretations also.The marginals are very flexible, and we explore different properties of the marginals.Hazard function of the marginals can take all different shapes namely (i) increasing, (ii) decreasing, (iii) unimodal and (iv) bath-tub shaped.It is observed that the generation of random samples from the proposed bivariate model is very simple, hence simulation experiments can be performed quite conveniently.The proposed model has a simple copula structure, and we obtain different dependency properties and also computed different dependency measures using the copula structure.
The proposed bivariate distribution has five parameters.The maximum likelihood estimators (MLEs) of the unknown parameters can be obtained by solving five non-linear equations simultaneously.Computationally it becomes a challenging problem.Newton-Raphson or Gauss-Newton type algorithm iterative procedure is needed to solve these non-linear equations.Moreover, the choice of initial guesses and the convergence of the iterative algorithm are important issues.To avoid these problems, we treat this problem as a missing value problem, and propose to use the expectation maximization (EM) algorithm to compute the MLEs.In this case at each 'E'-step we need to solve two one-dimensional non-linear optimization problems.Therefore, the implementation of the proposed EM algorithm is very simple.Since it is a very flexible model and the implementation is also quite simple, it gives the practitioner a choice of an alternative bivariate model, which may provide a better fit than the existing models.For illustrative purposes, we analyze one bivariate data set using this model, and the performance is quite satisfactory.Further, we provide two generalization of the proposed bivariate model, and propose some open problems.It is observed that the GE distribution can be replaced by any other proportional reversed hazard model, and multivariate generalization is also quite straightforward.
It will be interesting to investigate different properties and develop estimation procedures in these general cases.
It may be mentioned that several other bivariate generalized exponential distributions are available in the literature.We briefly describe them now.Kundu and Gupta (2009) introduced bivariate generalized exponential distribution whose marginals are generalized exponential distributions.It has been obtained by using the trivariate reduction method similarly as the bivariate exponential distribution of Marshall and Olkin (1967).The bivariate generalized exponential distribution of Kundu and Gupta (2009) has four parameters, and it has a singular component along the line x = y.This model can be used quite effectively when there are ties in the data, and when the marginals have monotone hazard functions.In a subsequent paper Kundu and Gupta (2011) introduced absolute continuous bivariate generalized exponential distribution by removing the singular components from the bivariate generalized exponential distribution of Kundu and Gupta (2009), similarly as the bivariate generalized exponential distribution of Block and Basu (1974).This model also has four parameters, and it is quite useful to analyze data when there are no ties.In this case also the hazard functions of the marginals are monotone.Very recently Mirhosseini et al.Rest of the manuscript is organized as follows.In Section 2, we provide a brief review of the GE distribution.Bivariate geometric maximum of generalized exponential distribution and its properties are discussed in Section 3. In Section 4, we provide the statistical inference of the unknown parameters.In Section 5, we provide the results of the simulation experiments, and the analysis of a real data set.Finally we propose some open problems, and conclude the paper in Section 6.

Generalized Exponential Distribution
The random variable X is said to be a GE random variable with parameters α > 0 and λ > 0, if the cumulative distribution function (CDF) of X is and 0, otherwise.It will be denoted by GE(α, λ), and GE(α, 1) will be denoted by GE(α).
If X ∼ GE(α, λ), the corresponding PDF and HF become and respectively.Here α is the shape parameter, and λ is the scale parameter.For α ≤ 1, the PDF is a decreasing function, and for α > 1 it is an unimodal function.It is clear that for α = 1, it is an exponential distribution function.For, α < 1, the hazard function is a decreasing function, and for α > 1, it is an increasing function.When α = 1, it is constant.If X ∼ GE(α), then the moment generating function and the moments are obtained as follows, see Gupta and Kundu (1999) where ψ(x) = d dx Γ(x), the digamma function, and ψ ′ (•) is its derivative.
It is observed that the GE distribution behaves very similarly as the two-parameter gamma or two-parameter Weibull distributions.All the three distributions are extensions of the exponential distribution, but in different manners.Because of the explicit expression of the CDF, it can be used quite effectively to analyze censored data also.It is further observed that it is very difficult to discriminant between GE distribution and Weibull distribution or gamma distribution, particularly, if the shape parameter is very close to 1.
The GE distribution was first introduced by Gupta and Kundu (1999), as a special case of a more general three-parameter exponentiated Weibull distribution, originally proposed by Mudholkar and Srivastava (1993), see also Mudholkar et al. (1995) in this respect.
Extensive work has been done on GE distribution regarding different estimation and inference procedures.Interested reader may refer to the review articles by Gupta and Kundu (2007) or Nadarajah (2011) regarding different developments of this distribution.

Model Formulation
Consider two sequences of random variables X 1 , X 2 , . . .and Y 1 , Y 2 , . ... It is assumed that X i 's are independent and identically distributed (i.i.d.) GE(α 1 , λ 1 ) random variables, Y i 's are GE(α 2 , λ 2 ) random variables, and X i 's and Y j 's are independent.Let N be a geometric random variable with probability mass function N denotes the set of all positive integers, and 0 < p < 1.From now on, it will be denoted by GM(p).Moreover, N is independent of X i 's and Y j 's.Consider the following bivariate random variable (X, Y ), where We call (X, Y ) as the bivariate geometric (maximum) generalized exponential (BGGE) distribution, with parameters (α 1 , α 2 , p, λ 1 , λ 2 ), and it will be denoted by BGGE(α 1 , α 2 , p, λ 1 , λ 2 ).
Random Stress Model: Suppose, a system has two components.Each component is subject to random number of individual independent stresses, say {X 1 , X 2 , . ..} and {Y 1 , Y 2 , . ..}, respectively.If N is the number of stresses, then the observed stresses at the two components Parallel Systems: Consider two systems, say 1 and 2, each having N number of independent and identical components attached in parallel.Here N is a random variable.If X 1 , X 2 , . . .denote the lifetime of the components of system 1, and Y 1 , Y 2 , . . .denote the lifetime of the components of system 2, then the lifetime of the two systems become (X, Y ), where X = max{X 1 , . . ., X N } and Y = max{Y 1 , . . ., Y N }.
The joint CDF of (X, Y ) can be obtained as The joint PDF of (X, Y ) can be obtained as Here α 1 , α 2 are the shape parameters.The parameter p plays the role of the correlation parameter, and λ 1 and λ 2 are scale parameters.The joint PDF ( 8) is very flexible, it can take different shapes depending on the values of α 1 , α 2 and p.In Figure 1, we provide the surface plots of ( 8) for different parameter values α 1 , α 2 and p.It is clear that it can take variety of shapes depending on the parameter values.When p = 1, i.e.X and Y become independent.
It may be observed that the generation from a BGGE distribution is very straight forward using the definition of the model.First generate N from a geometric distribution, and once N = n is observed, X and Y can be generated from GE(nα 1 , λ 1 ) and GE(nα 2 , λ 2 ) respectively.
Rest of this section we discuss different properties of this distribution, hence without loss of generality it is assumed that λ 1 = λ 2 = 1.

Marginal Distribution
In this section we obtain the marginal distributions of X and Y .We provide the results for X, and for Y , it can be obtained along the same line.Consider the following bivariate random variable (X, N ), where X and N are same as defined before.The joint density function, f X,N (x, n), of (X, N ) is given by Therefore, the joint distribution function of (X, N ) is From (11), one can obtain Note that ( 12) can be obtained directly from (7) also, by taking y → ∞.The PDF of X becomes It is clear that the PDF of X can be written as the weighted GE distribution, with the weight function It is an increasing function and it increases from p to 1/p, as x varies from 0 to ∞.The PDF of X can take different shapes.The PDF can be a decreasing function or unimodal, it can have a thicker tail than the GE distribution for certain choice of the parameter values.
From (12), we obtain that for any fixed x, as p → 0, P (X > x) → 1.Therefore, it becomes heavy tail distribution.It is clear that if p = 1, X has GE(α 1 ).For p close to 1, the shape of PDF of X is very close to the shape of the PDF of GE distribution.
This model has a close resemblance with the model recently proposed by Louzada et al.
(2014).They proposed the model which is geometric minimum of generalized exponential distributions and it has the PDF for α > 0, λ > 0 and 0 < p < 1.Since GE distribution is closed under maximum, our model is a natural generalization of the GE model than the model proposed by Louzada et al. (2014).Moreover, many treatments and properties developed in this model can be developed along the same line for the model (15), and vice versa.For example, the EM algorithm developed for the model ( 13) can be developed along the same line for the model ( 15) also.
The hazard function of X can be written as and The weight function w 1 (x) is an increasing function, and it increases from p to 1, as x ranges from 0 to ∞.We have the following result regarding the shape of the hazard function of X.
Proof: Considering the logarithm of the hazard function, the proof can be obtained.The details are avoided.
If a random variable X has the CDF (12) or the PDF ( 13), we call it as the geometric maximum of GE (GGE) distribution, and it will be denoted by GGE(α 1 , p).If U ∼ GGE(α, p), the γ-th percentile point of U is Moreover, for fixed p and x, as α increases, P (U ≤ x) increases.It implies that for fixed p, GGE(α, p) family has a stochastic ordering in terms of α.
The following result will be useful in developing EM algorithm for GGE distribution.
From (10), it follows after some calculations that and The following result indicates that GGE is closed under geometric maximum.
Result 1: Suppose, {U i ; i ≥ 1} is a sequence of i.i.d.GGE(α, p) random variables, and M ∼ GM(q), for 0 < q < 1.Moreover, U i 's and M are independent.Consider a new random variable then V ∼ GGE(α, pq). Proof: The moment generating function cannot be obtained in explicit forms, it is obtained as an infinite series.If X ∼ GGE(α 1 , p), then the moment generating function of X is Different moments of X also cannot be obtained in explicit forms, they can be obtained from (23) as infinite series.Moreover, the PDF of X can be written as infinite mixture of GE distributions.(b)

BGGE: Basic Properties
. (e) Proof: The proof of (a), (b) and (c) can be obtained in a routine matter.We provide the proof of (d) and (e) only.Proof of (d): Proof of (e): The joint density function of (X, Y, N ), f X,Y,N (x, y, n) is given by Therefore, where Now using the fact that if M ∼ GM(1 − a), then E(M 3 ) = (a 2 + 4a + 1)/(1 − a) 3 , we obtain The above result (26) will be useful for developing the EM algorithm.
The following result is the bivariate extension of Result 1.It indicates that BGGE distribution is also closed under geometric maximum.The proof is very similar to Theorem 1, and it is avoided.
ables, and M ∼ GM(q), for 0 < q < 1.Moreover, (U i , V i )'s and M are independent.Consider a new random variable The joint moment generating function of (X, Y ) cannot be obtained in explicit forms, as expected.If (X, Y ) ∼ BGGE(α 1 , α 2 , p), then the moment generating function of (X, Y ) is Different cross moments of X and Y also cannot be obtained in explicit forms, they can be obtained from (28) as infinite series.

BGGE: Dependence Properties
It is known, Nelsen (2006), that every bivariate distribution function, F X,Y (•, •) with continuous marginals F X (•) and Conversely, the copula C(u, v) can be recovered from the joint distribution function F X,Y (•, •) as follows; It can be shown by some calculation that if (X, Y ) ∼ BGGE(α 1 , α 2 , p), then the correspond- The copula (31) is known as the Ali-Mikhail-Haq copula, see Ali, Mikhail and Haq (1978).
The authors provided some nice interpretation of the above copula in terms of bivariate odds ratio.
Let us recall the following definitions, see for Nelsen (2006).Suppose X and Y are random variables with absolute continuous joint distribution function.
Definition 1: X is stochastically increasing in Y if P (X > x|Y = y) is a non-decreasing function of y for all x.
Definition 2: X and Y are positively quadrant dependent (PQD) if for all (x, y) ∈ R 2 is a non-increasing function of y for all x.
Definition 4: A function f : R 2 → R is said to be a total positivity of order two (TP 2 ), if f (x, y) ≥ 0 for all (x, y) ∈ R 2 , and whenever x ≤ x ′ , and y ≤ y ′ , The random variables (X, Y ) is said to be TP 2 , if the joint cumulative distribution function of (X, Y ) is TP 2 .
We can establish the following properties using the above copula structure.
Proof: Using the copula function, it follows that Y is stochastically increasing in X, if and only if for any v ∈ [0, 1], C(u, v) is a concave function of u, see Nelsen (2006).In case of Ali-Mikhail-Haq copula, note that therefore, the result follows.Proof: It is known that PQD is a copula property, and two random variables X and Y are PQD if and only if, the corresponding copula, C(u, v), satisfies In case of Ali-Mikhail-Haq copula, it can be easily seen that it satisfies (33), and the result immediately follows.
Proof: Since 'left tail decreasing' property is a copula property, and X is left tail decreasing in Y if and only if for any u ∈ [0, 1], C(u, v)/v is non-increasing in v.In case of Ali-Mikhail-Haq copula, it is true, and the result follows.Proof: The joint CDF of (X, Y ) is TP 2 if and only if Ali-Mikhail-Haq copula is TP 2 , as TP 2 property is also a copula property.For u < u ′ and v < v ′ it can be seen after some calculation that Hence, the result follows.

BGGE: EM Algorithm
Let us assume that we have a random sample {(x 1 , y 1 ), . . ., (x m , y m )} from BGGE(α 1 , α 2 , p, λ 1 , λ 2 ), i.e. it has the PDF (8).The log-likelihood function based on the observation becomes; Now to compute the MLEs of the unknown parameters, we need to maximize (35) with respect to the unknown parameters.It is clear that we need to solve five dimensional optimization problem to compute the MLEs of the unknown parameters.We need to use some iterative algorithm like Newton-Raphson or Gauss-Newton, to solve these non-linear equations.Finding initial guesses for solving five dimensional non-linear equations is not a trivial issue.To avoid that we propose to use this problem as a missing value problem, and use the EM algorithm to compute the MLEs.
First it is assumed that p is known.In developing the EM algorithm, we treat this as a missing value problem.It is assumed that the complete observation is as follows: {(x 1 , y 1 , n 1 ), . . ., (x m , y m , n m )}.Here n i is missing corresponds to (x i , y i ), and it is obtained from N ∼ GM(p).Based on the complete observation, the complete log-likelihood function without the additive constant (involving only the unknown parameters) becomes; Now based on the complete observations, the MLEs of the unknown parameters can be obtained as follows: For given λ 1 and λ 2 , the MLEs of α 1 and α 2 can be obtained as , and respectively.The MLEs of λ 1 and λ 2 can be obtained by maximizing and with respect to λ 1 and λ 2 , respectively.It can be shown, see Gupta and Kundu (2002) for details, that under certain restrictions, h 1 (λ) and h 2 (λ) are unimodal functions, hence they have unique maximum.Although, because of the complicated nature of the log-likelihood function, general results cannot be established.Empirically it has been observed that they have unique maximum in all the cases considered.Finally the MLE of p can be obtained by maximizing the profile log-likelihood function with respect to p. Now we are in a position to develop EM algorithm, and it can be developed as follows.
At the k-stage, suppose the values of α 1 , α 2 , λ 1 and λ 2 are α 2 , respectively, then at the k + 1-stage, 'E'-step involves forming the pseudo log-likelihood function without the additive constant becomes Therefore, 'M'-step involves maximizing (40) with respect to the unknown parameters to ), and they can be obtained as follows: λ can be obtained by maximizing with respect to λ 1 , where ).

Testing of Hypotheses
It has already been mentioned that when p = 1, the two marginals are independent.Therefore, one of the natural tests of hypotheses problem will be to test the following: In this case the since p is in the boundary under the null hypothesis, the standard the results do not work.Using Theorem 3 of Self and Liang (1987), it follows that Here α 1 , α 2 , λ 1 , λ 2 , p are the MLEs of the corresponding parameters without any restriction, and ( α 1 , α 2 , λ 1 , λ 2 ) are the MLEs under the restriction p = 1.
5 Numerical Experiments and Data Analysis

Numerical Experiments
In this section we present some simulation results to show how the proposed EM algorithm performs for different sample sizes and for different parameter values.We have taken the following sets of parameter values (i) We fit the BGGE model to the simulated data set, and to compute the MLEs of the unknown parameters we use the EM algorithm as suggested in the previous section.For each p, we compute α 1 (p), α 2 (p), λ 1 (p), λ 2 (p), using EM algorithm.In each case we started the EM algorithm with α 2 = 0.5, and the iteration stops when the absolute value of the difference of the two consecutive iterates for all the four parameters are less than 10 −5 .We replicate the process 1000 times, and report the average estimates and the associated mean squared errors (MSEs) within brackets below.We also report the median number of iterations (MNI) needed for the EM algorithm to converge.All the results are reported in Tables 1 -1 Table 1: Average estimates, the associate MSEs of the different parameters and the median number of iterations needed for the algorithm to converge.Here α 1 = α 2 = 2.0, λ 1 = λ 2 = 1.0, p = 0.5.Some of the points are quite clear from the simulation results.First of all, it is observed in all cases that as the sample size increases, the biases and the mean squares errors decrease.It verifies the consistency properties of the MLEs.In all the cases the EM algorithm converges Table 2: Average estimates, the associate MSEs of the different parameters and the median number of iterations needed for the algorithm to converge.Here α 1 = α 2 = 2.5, λ 1 = λ 2 = 1.0, p = 0.5.  2 it is clear that if the shape parameters change, the performance of the MLEs of the scale (λ 1 and λ 2 ) and the correlation (p) parameters do not change.Comparing 1, Table 3 and 4 it is observed that if the correlation parameter changes the performances of the MLEs of the scale parameters do not change.In case of the shape parameters, the performance becomes better as p increases.
We have used the EM algorithm with some other initial values also.In all these cases the results remain the same, except the MNI changes.Based on the simulation results, we can conclude that the proposed EM algorithm is working quite well, and it can be used quite effectively for data analysis purposes.

Real Data Set
In this section we analyze one real data set using the BGGE model.This data set represents the two different measurements of stiffness, 'Shock' and 'Vibration' of each of 30 boards.The first measurement (Shock) involves sending a shock wave down the board and the second measurement (Vibration) is determined while vibrating the board.The data set was originally from William Galligan, and it has been reported in Johnson and Wichern (1992), and for convenience it is presented in Table 5.
Before progressing further, first we plot the scaled-TTT plots, see Aarset (1987) for details, of the marginals in Figure 2. Since both are concave functions, it can be assumed that the hazard function of the marginals are increasing functions.Therefore, BGGE may be used for analyzing this data set.
We have used EM algorithm to compute the MLEs of the unknown parameters.It is observed that the initial guesses of the unknown parameters do not create any problem in this case regarding the convergence of the EM algorithm.We have used the same convergence criterion as it has been used in the previous example.We have verified with different starting values, but it provides the same estimates in all cases.The MLEs of the unknown parameters are α 1 = 8.4496, α 2 = 3.5918, λ 1 = 6.4402, λ 2 = 6.7102, p = 0.0210, (49)

Conclusions
In this section we propose a new absolute continuous bivariate distribution by taking the geometric maximum of generalized exponential distributions.Several properties of this new bivariate distribution have been established.It is further observed that the proposed distribution can be obtained from a well known Ali-Mikhail-Haq copula, hence several properties can be obtained using the copula properties.We have suggested to use the EM algorithm to compute the MLEs of the unknown parameters, and it is observed that the proposed EM algorithm works quite well in practice.Along the same line EM algorithm for GGE model and also the model recently proposed by Louzada (2014), also can be obtained.Now we provide some open problems.Suppose F 0 (•) is a distribution function with support on the positive real axis, then the following class of distribution functions; F (t; α) = (F 0 (t)) α ; t > 0, (50) for α > 0 is known as the proportional reversed hazard class.Now along the same line bivariate geometric maximum of proportional reversed hazard distribution can be obtained.
Further, multivariate geometric maximum of proportional reversed models also can be obtained along the same line.It will be interesting to obtain different properties of this new class of distributions.More work is needed along these directions.

(
2015) introduced a new three parameter absolute continuous bivariate generalized exponential distribution whose marginals follow generalized exponential distributions.The absolute continuous bivariate generalized exponential distribution of Mirhosseini et al. (2015) has been obtained using exponential distributions.It has been observed that it is quite close to the absolute continuous Block and Basu bivariate exponential distribution, and the marginals have decreasing hazard functions.The proposed bivariate generalized exponential distribution has five parameters, and each marginal has three parameters.Due to presence of five parameters, the marginals and the joint probability density functions can take variety of shapes.The hazard functions of the marginals can be monotone, unimodal or bath-tub shaped.It has an absolute continuous joint probability density function for all parameter values.The proposed bivariate generalized exponential distribution is more flexible than any of the existing bivariate generalized exponential distributions.

H 0 ,
the MLEs of α 1 , α 2 , λ 1 and λ 2 become 14.9564, 8.4966, 3.6301, 3.6701, respectively, and the corresponding log-likelihood value is -37.6374.The value of theSelf and Liang (1987) test statistic (48) is 24.8202 and the associated p < 0.0001.Hence, two independent GE distributions cannot be used to analyze this data set.Recently, four-parameter bivariate Pareto distribution has been used to analyze this stiffness data set by Sankaran and Kundu (2014).It is observed that bivariate Pareto provides a good fit to this data set.The MLEs of the four parameters, α 0 , α 1 , α 2 and θ, see Sankaran and Kundu (2014) for details, are 0.0154, 0.0321, 0.0292 and 18.3488, respectively.The associated log-likelihood value is -96.5098.Now comparing five-parameter BGGE model and four-parameter bivariate Pareto model, based on both AIC or BIC we prefer to choose BGGE compared to bivariate Pareto model in this case.

Figure 2 :
Figure 2: Scaled TTT plots of the marginals.(a) Shock, (b) Vibration It indicates that the proposed EM algorithm is working well in this case.Now we would like to compare the performances of the MLEs for different sets of parameter values based on the biases and MSEs.Comparing Table1 and Table