The Long-Term Bivariate Survival FGM Copula Model: An Application to a Brazilian HIV Data

: In this paper we propose a new bivariate long-term distribution based on the Farlie-Gumbel-Morgenstern copula model. The proposed model allows for the presence of censored data and covariates in the cure parameter. For inferential purpose a Bayesian approach via Markov Chain Monte Carlo (MCMC) is considered. Further, some discussions on the model selection criteria are given. In order to examine outlying and inﬂuential observations, we develop a Bayesian case deletion inﬂuence diagnostics based on the Kullback-Leibler divergence. The newly developed procedures are illustrated on artiﬁcial and real HIV data.


Introduction
Multivariate survival data are present in various areas.Considering particularly bivariate survival data, we can observe two lifetimes for a same client, patient or equipment.For instance, in the medical area we can have interest in studying the lifetimes of paired human organs, as kidneys and eyes, and double recurrence of a certain disease.In industrial applications this type of data can occur for systems whose duration times depend on the durability of two components.For instance, the damages of dual generators in a power plant or the lifetimes of motors in a twin-engine airplane.In financial applications we may have bivariate lifetimes until default.
In general, multivariate survival data are correlated and the study of that dependence has been focus of many researches.In literature, an extensive list of papers on modeling multivariate survival data can be found, for example Aslanidou et al. (1998), Gao et al. (2006), Hougaard (2000) and Hanagal (2011).Clayton (1978), Vaupel et al. (1979), Hougaard (1986), Oakes (1989), Wienke (2011) and Hanagal (2011) consider frailty models in that one or more random effects are included in the model in order to model the dependence between the observations.More recently, the copula models (see for instance, Embrechts et al. (2003), Trivedi and Zimmer (2005), Chiou and Tsay (2008), Nelsen (2006), Zhang et al. (2010) and Jaworski (2010)) have become a popular tool to model the dependence of multivariate data, especially in biological areas, actuarial sciences and finances.A copula is a function that connects the marginal distributions to restore the joint distribution.Different copula functions represent different dependence structures between variables (Nelsen, 2006).By comparing to the joint distribution approach, a copula model is a more convenient tool in studying the dependence structure.Indeed it is more flexible in applications, since, when the scatter of the data does not fit any known family of joint distributions, it may be difficult to specify the joint distribution.Using copulas, however, we can first estimate the marginal distributions and then estimate the copula.Another advantage of the copula modeling is its relatively mathematical simplicity.Also, it is possible to build a variety of dependence structures based on parametric or non-parametric models for the marginal distributions.
In survival analysis, models based on copulas are considered in Hougaard (1989), Oakes (1989), Shih and Louis (1995) and Gustafson et al. (2003).Particularly, from a Bayesian perspective, Romeo, Tanaka and Pedroso de Lima ( 2006) considered an application of the Archimedian copula family for modeling the dependence of bivariate survival.In their analysis, the authors considered a Weibull distribution for the marginal distributions of the bivariate lifetime components.
A difficulty arises if a part of the population is not susceptible to the event of interest.For instance, in bivariate clinical studies a population can respond favorably to a treatment, being considered cured.Univariately, models which consider that part of the population is cured have been widely developed and are usually called long term survival models.Perhaps the most popular type of cure rate model is the mixture model introduced by Boag (1949) and Berkson and Gage (1952).In this model, it is assumed that a certain proportion of the patients, say p, are cured, in the sense that they do not present the event of interest during a long period of time and can be seen as to be immune to the cause of failure under study.Later on, the literature on mixture long-term model is extensive and and interested readers can refer to Maller and Zhou (1996), Peng et al. (1998), Rodrigues et al. (2009), Mazucheli et al. (2009) and Perdona and Louzada-Neto (2011) amongst others.
In this paper, considering a parallel path to the work of Romeo, Tanaka and Pedroso de Lima ( 2006), but differently from them, for the first time, we consider a bivariate distribution based on the Farlie-Gumbel-Morgenstern (FGM) copula (see, Conway (1983)) with long term survival models as marginal distributions.
The main objective of the paper is to present the FGM long-term bivariate survival copula model and to develop diagnostic measures from a Bayesian perspective based on the Kullback-Leibler (K-L) divergence as proposed by Peng and Dey (1995).For each individual lifetime variable we consider long-term Weibull mixture models with covariates included in the cure proportion.For inferential purpose joint estimation was performed and a sampling based approach via Markov Chain Monte Carlo (MCMC) was considered.Applications on artificial and on a real HIV data illustrate our approach.
The paper is organized as follows.Section 2 presents a survival bivariate model by considering a FGM copula distribution.Section 3 presents the Bayesian inferential procedure as well as some discussions on the model selection criteria are given.Section 4 a Bayesian case deletion influence diagnostics based on the K-L divergence is presented.Section 5 illustrates our approach on an artificial dataset generated according to the proposed bivariate model and on a real HIV dataset, where we also compare the fitting of our model with usual copula models, namely, the Positive Stable, Frank and Clayton ones, as well as with the independence case.Some final remarks in Section 6 ends the paper.

The Bivariate Long-Term Survival Model Based on the FGM Copula
In order to define a copula we first suppose that C φ is a joint survival function with density function c φ on [0, 1] 2 for φ ∈ R.Then, let (T 1 , T 2 ) denote the paired failure times, S pop j and f pop j denote, respectively, the marginal long-term survival functions and the marginal long-term density function of T j , j = 1, 2. Consider (T 1 , T 2 ) comes from the C φ copula for some φ then the joint survival and density function of (T 1 , T 2 ) are given by and respectively.Notice that the marginal distributions and the dependence structure can be visualized separately and this dependence structure is represented by a copula.
The marginal distributions of T j , j = 1, 2, were assumed to be driven by a mixture model (Maller and Zhou, 1996).The idea is to enable modeling a proportion of long-term survivors.Mixture models for long-term survivors have been widely used for fitting data where some individuals may never suffer the cause of failure under study (Maller and Zhou, 1996).In this type of modeling, it is assumed that, due to some unobserved prognostic factors, a certain fraction p j of the population is immune to the cause of failure under study or a long-term survivor.The survivor function for the entire population can be written as Following Conway (1983), considering the Farlie-Gumbel-Morgenstern distribution, hereafter FGM copula, we have the copula distribution function given as where 0 ≤ u, v ≤ 1 and −1 ≤ φ ≤ 1.For φ > 0 we have evidence for positive dependence between u and v.For φ < 0 we have evidence for negative dependence between u and v. Consider (T 1 , T 2 ) comes from the FMG copula (4) then the joint long-term survival of (T 1 , T 2 ) is given by where φ parameter measures the intensity of the dependence between the lifetimes.Observe that when φ = 0, S pop (t 1 , t 2 ) = S pop 1 (t 1 )S pop 2 (t 2 ), leading to the conclusion that the random variables T 1 and T 2 are independent.

Inference
For inference, we adopt a full Bayesian approach.The likelihood function, prior distributions for the parameters in the model, details of the MCMC algorithm and the model comparison are presented below.
Let (T i1 , T i2 ) denote the i -th bivariate lifetime and (C i1 , C i2 ) the censored bivariate lifetime, for i = 1, • • • , n. Suppose that (T i1 , T i2 ) and (C i1 , C i2 ) are independent.For each individual i observed quantities are represented by the random variables t ij = min(T ij , C ij ) and δ ij = I(t ij = T ij ), which denotes a censorship indicator, j = 1, 2.
Let S pop (t 1 |γ 1 ) and S pop (t 2 |γ 1 ) be the survival functions of T i1 and T i2 , respectively, where γ 1 and γ 2 are parameter vectors of q 1 and q 2 elements associated to each one of the marginal distributions.

Prior and Posterior Densities
The use of the Bayesian method besides being an alternative statistical approach, it allows the incorporation of previous knowledge of the parameters through an informative prior distribution.When there is not such previous knowledge one may consider a non-informative prior structure.For instance, in order to carry out a Bayesian inference procedure for the bivariate survival model of giving in (5), consider as in Section 2, that the parametric distribution family of the marginal lifetimes T 1 and T 2 are known and indexed by the parameter vectors, γ 1 and γ 2 , respectively.In order to guarantee proper posteriors, we adopt proper priors with known hyper-parameters.Thus, we consider a prior distribution of θ = (φ, γ 1 , γ 2 ) given by (7) implies that (1−φ)/2 follows a Beta(r 1 , r 2 ) distribution, π(γ j ) is the prior distribution of γ j and that γ 1 , γ 2 and φ are mutually independent.Besides, φ ∈ (−1, 1).
For parametric specification of the marginals it is possible to introduce covariates in each one of the components γ j , j = 1, 2. For simplicity and to avoid restrictions in the parametric space, we consider p j = exp(x j β j )/(1 + exp(x j β j )), where β j is corresponding to the vector of unknown coefficients of order m j × 1, associated to the covariates x j , j = 1, 2.

Computation
In the Bayesian approach, the target distribution for inference is the posterior of the parameters of interest.For this, we need to obtain the marginal posterior densities of each parameter, which are obtained by integrating the joint posterior density with respect to each parameter.
We point out that for any marginal distribution of T 1 and T 2 the joint posterior distribution is not tractable analytically but Markov chain Monte Carlo (MCMC) methods such as the Gibbs sampler, can be used to draw samples, from which features of the marginal posterior distributions of interest can be inferred (Gilks, Richardson and Spiegelhater, 1996).For the estimation procedure we consider joint estimation where all the model parameters are estimated simultaneously in the MCMC algorithm.The Gibbs sampler is an iterative procedure of a broad class of methods generically named Markov Chain Monte Carlo (MCMC).Many practical aspects of the MCMC methodology are described in Gelfand and Smith (1990) and Gamerman and Lopes (2006).This method is applicable in situations where one is not able to generate samples directly from the joint posterior density.It however requires the full conditional densities for generating samples.
Combining the likelihood function ( 6) and the prior distribution ( 7), we can obtain the joint posterior density of all unobservables which is not tractable analytically but MCMC methods such as the Gibbs samples, can be used to draw samples, from which features of marginal posterior distributions of interest can be inferred.The full conditional posterior densities for each parameter is given by π(φ|D, θ , and (1 − S popj (t ji )), and The conditional densities above do not belong to any known parametric density family.In order to generate our samples we then implement an Metropolis-Hasting algorithm within Gibbs iterations (Chib and Greenberg, 1995).The simulations were performed using the OpenBUGS software (Spiegelhalter et al., 2007).OpenBUGS codes are available by mailing to one of the authors.

Model Comparison Criteria
In the literature, there are various methodologies which intend to analyze the suitability of a distribution, as well as selecting the best fit among a collection of distributions.In this paper we shall inspect some of the Bayesian model selection criteria; namely, the deviance information criterion (DIC) proposed by Spiegelhalter et al. (2002), the expected Akaike information criterion (EAIC) by Brooks (2002), and the expected Bayesian (or Schwarz) information criterion (EBIC) by Carlin and Louis (2001) was used.These criteria are based on the posterior mean of the deviance, E{D(θ)}, which is also a measure of fit and can be approximated from the MCMC output by where the index v indicates the v-th realization of a total of V realizations and where g(•) is a likelihood function given by , where f popj (•) and S popj (•), j = 1, 2, are Weibull mixture density function and Weibull mixture survival function, respectively.The EAIC, EBIC and DIC criteria can be calculated using the MCMC output by means of EAIC = Dbar + 2q, EBIC = Dbar + q log(n) and DIC = Dbar + ρ D = 2Dbar − Dhat, respectively, where q is the number of parameters in the model and ρ D is the effective number of parameters, defined as E{D(θ)} − D{E(θ)}, where D{E(θ)} is the deviance evaluated at the expected values of the posterior distributions, which can be estimated as v) .
Comparing alternative models, the preferred model is the one with the smallest criteria values.
Another criteria which is one of the most used in applied works is derived from the conditional predictive ordinate (CPO) statistics.For a detailed discussion on the CPO statistics and its applications to model selection, see Gelfand, Dey and Chang (1992).Let D the full data and D (−i) denote the data with the i -th observation deleted.We denote the posterior density of θ given D (−i) by π(θ|D For the i -th observation, the CPO i (Chen, Ibrahim and Sinha, 2001) can be written as For the proposed model a closed form of the CPO i is not available.However, a Monte Carlos estimate of CPO i can be obtained by using a single MCMC sample from the posterior distribution π(θ|D).Let θ 1 , θ 2 , • • • , θ Q be a sample of size Q of π(θ|D) after the burn-in.A Monte Carlo approximation of CPO i (Chen, Shao and Ibrahim, 2000) is given by The larger is the value of B, the better is the fit of the model.

Bayesian Case Influence Diagnostics
The best known perturbation schemes are based on case deletion (Cook and Weisberg, 1982), in which the effects are studied instead of completely removing cases from the analysis.This reasoning shall form the basis for our Bayesian global influence methodology.Then, it will be possible to determine which subjects might be influential for the analysis.In order to investigate if some of the observations are influential for the analysis, we considered a Bayesian case influence diagnostic procedure based on the Kullback-Leibler (K-L) divergence between P and P (−i) , denoted by K(P, P (−i) ), where P denotes the posterior distribution of θ for the full data, and P (−i) denotes the posterior distribution of θ dropping the i -th observation.Specifically, dθ. (8) The K(P, P (−i) ) measures the effect of deleting of i -th observation from the full data on the joint posterior distribution of θ.As pointed by Peng and Dey (1995) and Weiss (1996), it may be difficult for a practitioner to judge the cutoff point of the divergence measure so as to determine whether a small subset of observations is influential or not.In this context, we will use the proposal given by Peng and Dey (1995) and Weiss (1996) by considering the following.Consider a biased coin, which has success probability p. Then the divergence between the biased and an unbiased coin is where then it can be easily checked that d K-L , satisfies the following equation It is not difficult to see that d K-L increases as p moves away from 0.5.In addition, d K-L (p) is symmetric about p = 0.5 and d K-L achieves its minimum at p = 0.5.In this point, d K-L (0.5) = 0 and f 0 = f 1 .Therefore, if we consider p > 0.80 (or p ≤ 0.20) as a strong bias in a coin, then d K-L (0.80) = 0.223.Then, this equation implies that i -th case is considered influential when K(P, P (−i) ) > 0.223.
Similarly, the calibration value of K(P, P (−i) ) for the i -th case can be obtained by solving the equation K(P, P (−i) ) = − log[4p i (1−p i )]/2 for p i .After some algebraic manipulation it can be shown that This equation implies that 0.5 ≤ p i ≤ 1.That is, if p i > 0.8 then the i -th case is considered influential.
For our case it can be shown that ( 8) can be expressed as a posterior expectation where E θ|D (•) denotes the expectation with respect to the joint posterior π(θ|D).Thus ( 11) can be estimated by sampling from the posterior distribution of θ via MCMC methods.Let Then, a Monte Carlo estimate of K(P, P (−i) ) is given by

Application
In this section, results from simulation studies and a real data example are presented in order to illustrate the performance of the proposed methodology.

Artificial Data
In this section we consider an artificial sample generated according to (5), assuming that the observed individual lifetime T j has a Weibull distribution with parameters α j and λ j with probability density function given by The artificial bivariate data (T i1 , T i2 ), i = 1, • • • , n, was generated assuming n = 160 according to the following steps: First we generated where u i1 ∼ U (0, 1).Next T i2 was generated using a random variable u i2 ∼ U (0, 1) and the solution of the nonlinear equation, w i +φ(2u 1i −1)(w 2 i −w i )−u i2 = 0, considering T i2 = (−log(1 − w i )/λ 2 ) 1/α 2 .The covariates x i were generated from a Bernoulli distribution with parameter 0.5.For each lifetime t j , for each covariate level, x (0 or 1), we have a proportion of long-term survivals.We fixed the p kj values, representing the cure rate for the level k for the lifetime j, for k = 0, 1 and j = 1, 2. For each covariate level from each generated lifetime t j we fixed a cutoff point t * kj , which was chosen so that the higher times represent the amount of p kj % of long-term survivals.The following values were considered: p 01 = 0.5, p 11 = 0.1, p 02 = 0.3, p 12 = 0.2, α 1 = 2, α 2 = 1.5, λ 1 = 0.5, λ 2 = 0.1 and φ = 0.6.
For sake of illustration we also compare our FGM bivariate copula model with some bivariate survival models induced by some well known copula models, namely, Positive Stable Frailty (PSF), Clayton and Frank copulas (Hougaard (1986), Clayton (1978) and Frank (1979)).The joint survival function of the PSF model is given by (Hougaard, 1986), When φ → 1, we obtain S pop (t 1 , t 2 ) = S pop (t 1 )S pop (t 2 ).
The joint survival function induced by the Clayton copula is given by, When φ → 0, we obtain S pop (t 1 , t 2 ) = S pop1 (t 1 )S pop2 (t 2 ).While, the joint survival function induced by the Frank copula is given by, When φ → 1, we obtain S pop (t 1 , t 2 ) = S pop1 (t 1 )S pop2 (t 2 ).For more details on the PSF, Clayton and Frank copulas interested readers can refer to Nelsen (2006).
The simulations were performed using the OpenBUGS software (Spiegelhalter et al., 2007).The Bayes estimates were based on posterior samples recorded every 20-th iteration from 50, 000 Gibbs samples after a burn-in of 10, 000 samples.The MCMC convergence was monitored according to the methods recommended by Cowless and Carlim (1996) (CODA package).The number of iterations is considered sufficient for the approximate convergence since in all cases the Gelman-Rubin diagnostic is very close to 1.
Table 1 presents the summary for the FGM bivariate long-term survival model parameters.Table 2 presents the Bayesian criteria for the FGM, PSF, Frank and Clayton models.The FGM model outperforms is better when compared to the other models in all considered criteria.In Table 1, the 95% HPD intervals for φ are large.In order to verify what conditions are needed to get narrower intervals we performed a simulation study in which we vary the φ parameter values and the sample size n, with φ = 0.2, 0.6, 0.8 and n = 100, 200, 300, 400, 500.The other parameters were fixed the same values as in Subsection 5.1: p 01 = 0.5, p 11 = 0.1, p 02 = 0.3, p 12 = 0.2, α 1 = 2, α 2 = 1.5, λ 1 = 0.5 and λ 2 = 0.1.For each set up 1,000 generated samples were considered, from which we fitted our modeling and retrieve the amplitude of the 95% HPD interval for φ.The results of this study were condensed in Table 3, where we can observe that the amplitude 95% HPD interval becomes narrower with the sample size.This fact is more evident for large values of φ.

Influence of Outlying Observations
One of our main goals in this study is to show the need for robust models to deal with the presence of outliers in the data.In order to do so, we use the same sample previously simulated.We selected cases 5, 35, 95 (both observed lifetimes) and 142 (lifetime 1 is observed and lifetime 2 is censored) for perturbation.The perturbation scheme were structured as following.To create influential observation artificially in the dataset, we choose one, two or three of these selected cases.For each case we perturbed one or both lifetimes as follows t i = t i + 5S t , i = 1, 2, where S t is the standard deviations of the t i 's.For case 5 we perturbed only the lifetime t 1 and for case 35, the lifetime t 2 , and for cases 95 and 142, both lifetimes were perturbed.
The MCMC computations were made in a similar fashion to those in the last section and further to monitor the convergence of the Gibbs samples we also used the methods recommended by Cowless and Carlim (1996).Table 4 shows that the posterior inferences are sensitive to the perturbation of the selected case(s).In Table 4, Dataset (a) denotes the original simulated data set with no perturbation, and Datasets (b) to (g) denote datasets with perturbed cases.
Table 5 displays the fit of different cases of the perturbed data set.We can observe that the original simulated data (Dataset (a)) had the best fit.Now we consider the sample from the posterior distributions of the parameters of the model based on the FGM bivariate survival model to compute the K-L divergence and calibration of this divergence, as describe in Section 4. The results in Table 6 show, before perturbation (Dataset (a)), that all the selected cases  are not influential with small K(P, P (−i) ) and related calibration close to 0.622.However, after perturbations (Datasets (b) to (g)), the K(P, P (−i) ) increases and the corresponding calibrations become larger than 0.80, indicating those cases are influential.
In the Figure 1 shows the K(P, P (−i) ) for the FGM bivariate survival model.Clearly we can see that K(P, P (−i) ) performed well to identifying influential case(s), providing larger K(P, P (−i) ) when compared to the other cases.

HIV Dataset
Opportunistic infections are important causes of morbidity and mortality in people with Human immunodeficiency virus (HIV), and deaths and hospitalizations are the major events resulting from these infections.In addition, opportunistic infections are important indicators of the impact of interventions in the HIV-infected population (Candiani et al., 2007).Moreover, HIV-infected persons are subject to numerous infectious and neoplastic complications related to their disease.In general, apart from infections acquired in living outside the hospital when hospitalized, the HIV-infected individual is also more susceptible to hospital infections by his possible poor health condition.Some of these risk factors for nosocomial infections, particularly those of nosocomial bloodstream are the use of invasive devices, antibiotic exposure and length of stay in hospital (Tumbarello et al., 1998).
Overall, the general health condition of the patient admitted to the service as well as the occurrence of an infection acquired during hospitalization, can further extend his length of stay in hospital.In addition, the HIV-infected patient may require one or more hospital admissions and length of stay in this environment can contribute significantly to serious complications, which may lead to death of the patient.Moreover, the timing of a given hospitalization may somehow depend on the time the patient been hospitalized previously.So it seems reasonable to consider a long-term modeling to assess two times of hospitalization with a copulating dependence structure according to the variable gender, in order to see whether gender impacts the times of hospitalization.
In this context, we consider here 135 patients older than 18 years of age with HIV seen at the Serviço de Doenças Infecciosas e Parasitárias (DIP), Universidade Federal do Triângulo Mineiro (UFTM), Brazil, diagnosed with HIV between January 1996 and December 1998.The times of first (T 1 ) and second (T 2 ) hospitalization (in days) were modeled according to the gender of the patient.The HIV data is an unpublished dataset and it was not statistically analyzed by other works in the best of our knowledge.
We then applied the proposed methodology to the HIV dataset.Figure 2 shows the scatter plot between (T 1 ) and (T 2 ), the index plot of K(P, P (−i) ) from fitting the bivariate survival model based on FGM copula and the Kaplan-Meier curves for the variables T 1 and T 2 , dichotomized by the gender, along with the fit of the FGM Copula model with Weibull mixture marginals to real data, which will be presented later.Focusing only on the empirical curves, we clearly observe the presence of long-term survivals, since some of the curves stabilize above zero, leading to an indicative of long-term survivals.q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 0 10 20 30 40 The data was analyzed though our approach by adopting the model in ( 5) with Weibull mixture marginals as in (13).The Bayes estimates were based on posterior samples recorded every 20-th iteration from 50, 000 Gibbs samples after a burn-in of 10, 000 samples.We choose the following independent prior distributions: α j ∼ Gamma(1, 0.001), λ j ∼ Gamma(1, 0.001), β kj ∼ N (0, 10 3 ), k = 0, 1, j = 1, 2 and (1 − φ)/2 ∼ Beta(1, 1), such choices guarantee that φ ∈ (−1, 1) and ensure non-informativeness.To monitor the convergence of the Gibbs samples we uses the methods recommended by Cowless and Carlim (1996).
In Table 7 we report posterior summaries for the parameters of the FGM long-term bivariate survival copula model considering Weibull marginal mixture distribution.Following Gelman, Carlin and Rubin (2006), we have checked the sensitivity of the routine use for gamma prior on the variance components and found that results are fairly robust under different priors.We also checked the sensitivity of the analysis for the variance component parameters for various choices of prior parameters by changing only on a parameter at a time and keeping all other parameters constant at their default values.The posterior summaries of the parameters do not present remarkable difference and not impair the results in Table 7.The HPD confidence intervals for β 11 and β 12 containing zero speaks against the gender impact in the times of hospitalization.These results are corroborated by the similarity between fitted bivariate survival FGM copula model curves under the Kaplan-Meier curves for the variables T 1 and T 2 , dichotomized by the gender presented in Figure 2.Although we have reanalyzed the data of HIV considering only the parameters that were significant (in Table 7), the amplitude of 95% HPD interval for φ showed no significant change.This fact was already expected, since the parameter φ is estimated regardless of the choice of the marginal.As pointed out by Romeo, Tanaka and Pedroso de Lima (2006), since the choice of the marginal distributions does not depend on a particular choice of a copula, it makes sense to consider the estimation of the marginals and the dependence parameters separately.
Table 8 presents the K-L divergences and related calibrations for the five observations which present the largest calibration values.Clearly we observe the procedure identifies the lifetime 61 as a possible influential case (corresponding calibration greater than 0.8).The result is corroborated by the K(P, P (−i) ) for the FGM bivariate survival model shown in upper right panel of Figure 2.Besides the proposed FGM bivariate survival model, we also fitted to the data the bivariate survival models induced by a PSF, Clayton and Frank copulas, as presented in Section 5.1.Table 9 presents the model comparison criteria discussed in Section 3 for comparing the FGM long-term bivariate survival copula model with Weibull mixture marginal distribution with its particular independence case, as well as the long-term bivariate survival models induced by the PSF, Frank and Clayton copulas considering Weibull mixture distributions as marginals.Based on all Bayesian criteria, there is positive evidence in favor The FGM modeling, indicating that the FGM long-term bivariate survival copula model can be seen as a competitor to the well known bivariate survival models induced by the PSF, Frank and Clayton copulas commonly used in literature for fitting bivariate life time data.

Some Final Remarks
In this paper, we present the FGM long-term bivariate survival copula model.Parameter estimation is based on a Bayesian approach via MCMC.We propose a Bayesian case influence diagnostic based on the Kullback-Leibler divergence in order to study the sensitivity of the Bayesian estimates under perturbations in the model/data.Finally, we illustrate our approach with an artificial and a real dataset.
In the two-step approach, the marginal parameters are estimated first and then, the copula parameter is estimated in a second step.This approach provides consistent, but not efficient estimators using a frequentist approach.However, from the Bayesian perspective, one can estimate all the model parameters simultaneously in the MCMC algorithm such that the assumption of independence in the first step is avoided.We however considered a two-step estimation method.We omitted the results since they are similar to those obtained by considering the joint estimation approach.

Figure 1 :
Figure 1: Artificial data.Index plot of K(P, P (−i) ) from fitting a bivariate survival model based on FGM copula

Figure 2 :
Figure 2: Real data.Scatter plot of T 1 vs T 2 (upper left panel).Index plot of K(P, P (−i) ) from fitting the bivariate survival model based on FGM copula (upper right panel).Kaplan-Meier curves for the variables T 1 (lower left panel) and T 2 (lower right panel) along with the fit of the FGM Copula model with Weibull mixture marginal

Table 1 :
Simulated data.Posterior mean, standard deviation (SD) and HPD (95%) interval for the FGM bivariate long-term survival model parameters

Table 3 :
Amplitude of the 95% HPD interval for parameter φ

Table 4 :
Simulated data.Posterior means and standard deviations (SD) for the FGM bivariate survival model parameters according to different perturbation schemes

Table 5 :
Simulated data.Bayesian criteria for each perturbed version fitting bivariate model based on the FGM bivariate survival model parameters according to different perturbation schemes

Table 6 :
Simulated data.Case influence diagnosticsData names Case number K(P, P (−i) ) Calibration

Table 7 :
HIV data.Summary results from the posterior distribution, mean and standard deviation (SD) for parameter under model based on FGM copula

Table 8 :
Real data.Bayesian case influence diagnostic

Table 9 :
Real data.Bayesian criteria