Modeling Time-to-Recurrent Relapse in Patients with Bipolar Disorder

One of the main features of bipolar disorder is repletion of relapse overtime. Many studies have focused on time-to-ﬁrst relapse using the most popular Cox proportional hazard model which discards subsequent information on recurrent relapses. The aim of this study was to identify some risk factors of time-to-recurrent relapses in bipolar disorder inpatients by using appropriate recurrent event model. Data on 206 inpatients, available at Amanuel mental specialized hospital, were collected by reviewing the medical records from September 11, 2013 to March 12, 2019. Diﬀerent extended cox proportional hazard models including AG, PWP-TT, PWP-GT and semiparametric shared gamma frailty models were used. R package FrailtyEM package used to ﬁt semi-parametric shared gamma frailty models through EM algorithm. The mean age of the patients was 33.33 years. Within the study time, a total of 418 inpatient admissions (relapses) were registered for 206 inpatients. Among these admissions, about 49.3% of the patients had ﬁrst relapse and 50.7% of the patients had more than one relapses. The likelihood test results indicated that the appropriate model is the gap-time based semi-parametric shared gamma frailty model and the important risk factors that have eﬀect on time since the end of the most recent relapse to the start of the next relapses are marital status, substance abuse, employment status and residence. Recurrent relapse may be reduced by giving more intensive forms of treatment and creating awareness on each risk factor.


Background of the Study
Bipolar disorder is severe mental disorder that causes several problems in a person's psychosocial functioning (Amini et al., 2017). The common types of bipolar disorder are bipolar I and bipolar II characterized by alternating episodes of mania, depression, hypomania and mixtures of them (Bobo, 2017). Patients with bipolar disorder are marked by extreme change in mood, thought, energy, activity level, behavior and ability to carry day-to-day tasks (Biglu and Biglu, 2013). The bipolar I disorder patients have the recurrence episodes of mania and depression while the bipolar II disorder patients never develop severe mania but instead experience milder episodes of hypomania that alternate with depression. Sometimes, patients of bipolar disorder experience more than 4 episodes within 12-months period, called rapid-cycling bipolar disorder (Parikh, 2013).
During depressive episode, the patients are characterized by loss of interest and pleasure in activities enjoyed before; overwhelming sadness; withdrawing from friends and avoiding social activities; ceasing self-care tasks like shopping and showering; changes to appetite and sleep patterns; lack of concentration, extreme tiredness, and feelings of guilt or worthlessness. During manic episodes, they are described by the most severe state of extreme elation and over-activity. The Common symptoms of mania include: elevated mood, increased energy and over-activity, reduced need for sleep, irritability, rapid thinking and speech, recklessness, grandiose plans and beliefs and lack of insight (Jones et al., 2010).
One important feature of Bipolar I disorder is the repetition of relapses over time and more than 90% of bipolar disorder patients have at least one relapse in their life time (Kessler et al., 1994;Bebbington and Ramana, 1995;Merikangas et al., 2007;Grant et al., 2005). Many factors, including demographic and illness characteristic might cause the recurrence of bipolar disorder. Therefore, identifying these factors can help in preventing or reducing the incidence of recurrence of this disease (Vejdani-aram et al., 2017).
Time-to-relapse in bipolar disorder is characterized by a cycling from depression to mania, and back again over time. This type of structure of data is known as recurrent relapse because the relapses happen subsequently on the same patients over time.
Patients with bipolar disorder and history of childhood maltreatment had greater severity of mania, depression and psychosis, higher risk of comorbidity, earlier age of onset, higher risk of rapid cycling, greater number of manic or depressive episodes and higher risk of suicide attempt compared with those bipolar patients without childhood maltreatment (Rowland and Marwaha, 2018).
The recurrent nature of bipolar disorder needs special methods of modeling, analyzing and identifying the influential factors that cause the relapses. Recurrent event is an event in which the event of interest can occur more than once on a subject under the assumption that the same type or different types of events can occur multiple times on the same individual. Examples of such events include recurrence in breast cancer (Rondeau, 2010), asthma attack (Duchateau et al., 2003), lower respiratory tract infection (Amorim and Cai, 2015), sport injury (Ullah et al., 2014), Chronic Renal Insufficiency (Yang et al., 2017) and relapse in bipolar disorder (Kessing et al., 1999).
Bipolar disorder is also known as mood disorder. It is a common chronic recurrence disease with various causes responsible for its recurrence. Bipolar disorder is a common, chronic and recurrent disease with various causes responsible for its recurrence. Therefore, information on the clinical course of bipolar disorder, including time-to-recurrent relapse provide valuable tool for planning and evaluating the health-outcome results of treatment. However, there is no known exact cause of bipolar disorder. Researchers have yet to find the exact genes that contribute to the disorder or understand exactly how the brain physically changes when the disorder is present (Smith, 2018). Time-to-recurrent relapses of these patients are rarely studied particularly considering heterogeneity between individuals and dependency between event times recurrent relapse (Taheri et al., 2016).
Many studies have focused on the time-to-first relapse in bipolar disorder and used Cox proportional hazard model, but few of them have been carried out to include most appropriate feature of bipolar disorder but have discarded subsequent information on recurrent relapse.
Moreover, ordinary methods in survival analysis implicitly assume that populations are homogenous. But heterogeneity, often referred to as variability is generally recognized as one of the most important sources of variability in medical and biological applications (Wienke, 2010).
A number of studies have failed to account for such heterogeneity assumption, the repeated feature of relapses and within subjects correlation caused by recurrent events. Estimating the individual hazard rate without taking into account the dependence and heterogeneity leads to underestimate of the hazard function to an increasingly greater extent; as time goes by (Wienke, 2010).
Studies on the factors responsible for the time-to-recurrent relapse in patients with bipolar disorder in Ethiopia are scanty. Hence, this study is an attempt to investigate and identify the most important factors associated with time-to-recurrent relapse in patients with bipolar disorder using multivariate survival recurrent event models.
Furthermore, many studies have also failed to account for recurrent event analyzing techniques of survival analysis in order to study time to relapses, with respect to repeated characteristic of bipolar disorder.
Therefore, this study has mainly attempted to reduce the identified research gap and answer the following basic questions. 1. What are the appropriate models that best fit the bipolar disorder patient data? 2. Which factors have an effect on the time to recurrent relapse in bipolar disorder? 3. Is there any dependency between the recurrent event times within individuals?

Data and Study Population
The data on inpatients with bipolar disorder relevant for this study were collected from medical charts available at Amanuel mental specialized hospital, which is the only mental specialized hospital in Ethiopia. Trained psychiatric nurses collected the data using structured questionnaire designed for this purpose.
The patients were diagnosed and treated as bipolar disorder patients based on the criteria of the fifth version of Diagnostic and Statistical manual for Mental disorder (DSM-IV) and the 10 th version of the international classification of diseases (ICD 10). All bipolar disorder patients who were clinically admitted from September 11, 2013 to March 12, 2019 for bipolar disorder cases and those who have experienced at least one event of relapse after onset of the disease were included in the study. For each recurrent relapse the start date of relapse and the discharge date of relapse were recorded. Since we used gap-time based approach, it was measured by the duration of the time since the end of the previous event to the start of next events. Data on the dependent variable was recorded as the date of admission and date of discharge for any severe mental disorder for each relapse that occurred within patients.

Variables of the Study
The dependent variable for our study was time-to-recurrent relapse of bipolar disorder. The periods between two successive relapses of bipolar disorder are called episodes. Episodes, to be considered as two separate episodes, must be separated by at least 8 consecutive weeks without mood of disturbance (Kessing et al., 1999).
Potential factors that have effect on time-to-recurrent relapse in bipolar disorder included in the study are family history of disorder, seasonal based relapse (spring, summer, autumn and winter), having other chronic disease (cardiovascular, diabetes and obesity), relation breakdown, death of significant others, type of treatment, stopping medication or using it erratically, substance abuse, high stressful life events, childhood stress factors, number of history prior relapse, number of relapse within the study time and total number of relapses. In addition, socio-demographic characteristics of the patients including sex, age at start of first admission, marital status, employment status, religion, place of residence and region were included.

Sampling Techniques and Sample Size Determination
Since we are interested to test the null hypothesis of homogeneity between the recurrent relapse times, the sample size (total number of patients) required for one sided test necessary to achieve the power of the test, = 1 − β , is given by where f t is median follow-up time, θ is the variance of frailty, max k i is the maximum number of events per patient, E m,T m+1 C m (t) − C 2 m (t) is the expectation with t 1 , . . . , f m+1 , m is the noncensored event from the maximum number of events max k i . 0 ≤ m < max k i . We can evaluate E m,T m+1 C m (t) − C 2 m (t) by using the r statistical package called cubature. Using α = 0.05 and β = 0.2, the tabulated values obtained from z table are z 0.2 = 0.84 and z 1−0.05 = 1.645. ψ, is integrated from the r software. However, for moderately correlated events its proposed values range from 0.5 to 0.75. This method of sample size determination for recurrent events in shared frailty models is widely discussed in Chen et al. (2014). Using ψ = 0.75, we have determined the sample size to be n = (0.84 + 1.645) 2 (0.2) 2 × 0.75 = 6.18 0.03 = 206.
A list of 822 inpatients that experienced at least one bipolar disorder relapse in all bipolar disorder wards was used as a sampling frame. From this sampling frame, we selected 206 patients using systematic random sampling method.

Methodology
Modeling time-to-recurrent relapse of inpatients with bipolar disorder have more than one event time for each patient and hence, requires multivariate survival data analysis methods.

General Data Layout for Recurrent Event Times
To represent recurrent event data, we used the following notations; N = total number of patients, T ij denotes the event time with t ij0 representing start time and t ij1 discharge time of the j th event for the i th subject. Patients that have not experienced relapse are represented by censored times C ij = (δ i0 , δ i1 , . . . , δ ir i ). If a patient has r i recurrent events, we observe information on recurrent events for patient i by representing r i triplets where, δ ij represent the status of event (1 = if the event has occurred, 0 = if the event did not occur), each event time has X ijk covariates, where i = 1, 2, . . . , N , j = 1, 2, . . . , n i , k = 1, 2, . . . , p.
In order to make an inference about recurrent relapse in bipolar disorder and for assessing the effects of covariates on the survival times between different recurrent relapses, we use different recurrent event models.

Key Components to Characterize the Recurrent Events Models
Deciding on risk interval, risk set, assumption of baseline hazard and within subject correlation independent increment, conditional, marginal and shared frailty models are proposed for recurrent events (Therneau and Grambsch, 2000).

Anderson and Gill Independent Increment Model
This model, also known as A-G independent increment model, is first proposed by Anderson and Gill (1982). It is particularly useful when interest centers on modelling the overall recurrence rate of the event of interest (Collett, 2015).
The AG model for the hazard of an event occurring at time t for the i th individual is given by where Y ij (t) denotes whether or not the i th individual is at risk of experiencing an event at time t, β represents the vector of parameters to be estimated, X ij (t) represent time-dependent explanatory variables, and h 0 (t) represent a baseline hazard function.

Prentice, William and Peterson (PWP) Conditional Model
This model is proposed by Prentice, William and Peterson (Prentice et al., 1981) and is the most flexible model in which separate strata are defined for each ordered event. PWP models assume that recurrent events within subject are related and baseline hazard vary from event to event and occurrence of subsequent event is affected by previous event (Yadav et al., 2018). In the PWP model, the hazard of the j th occurrence, j = 1, 2, . . . , of an event at time t in the i th of n individuals is Where Y ij (t) is unity until the (j − 1) th recurrent event and zero otherwise, β j is the vector of coefficients of the p explanatory variables for the j th recurrence time, X i (t) is the vector of the explanatory variables, and h 0j (t) is the event specific baseline hazard for the j th recurrence. In this model, the risk set for the j th recurrence is restricted to individuals who have experienced the previous (j − 1) th recurrences.
The PWP model has two forms: PWP-Total time and PWP-Gap time model, the basic interest is knowing the effect of intervention on the outcome variable from the beginning of the study and knowing the effect from previous event respectively (Yadav et al., 2018).

The Frailty Model
The term frailty is first introduced by Vaupel et al. (1979) and it is the convenient way to handle and describe the dependency of the observation within an individual and/or the heterogeneity between individuals in the recurrent events (Duchateau and Janssen, 2007).
A frailty model in statistical terms is called a random effect model for time -to-event data; this random effect has a multiplicative effect on the baseline hazard function. Its aim is to model and account dependency in cluster (recurrent) survival time through the introduction of a cluster-specific random effect, in which the survival times are conditionally independent given frailty (Wienke, 2010).
The extended Cox proportional hazard model with frailty term w i is specified as where h 0 (t) denotes baseline hazard function, X ij (t) denotes the vector of observed covariates, β the vector of regression coefficients and w i is the frailty term. If u i = e w i , the frailty model is When modeling time to recurrent relapse, the relapse is clustered within patients and the patients share a common frailty. This common frailty is termed as shared frailty (Hanagal, 2011).
The conditional multivariate shared frailty survival model for observed covariates X ij1 , X ij2 , . . . , X in i k is given by t ij0 h 0 (s) ds denotes the cumulative baseline hazard function and X ijk = X ij1 , . . . , X in i k is the covariate matrix in the i th cluster.
The unconditional multivariate shared frailty survival function is derived by using Laplace transform from conditional survival function, called marginal survival function is given by where L is the Laplace transform of frailty variable.
The joint survival function for all recurrent event-time data is the product of the survival functions of all the clusters because of the assumption about independence between clusters.
The gamma distribution has been widely applied for the choice of frailty distribution in different literature from computational and analytical point of view and availability of software. It is a flexible distribution that takes a variety of shapes. When, θ = 1, it is an exponential distribution and when θ is large, it takes a bell-shape identical to normal distribution (Wienke, 2010). Therefore, if we assume u i = u 1 , u 2 , . . . , u N are independent and identically distributed gamma random variables with density function The gamma frailty distribution has E (u i ) = 1 and V ar (u i ) = θ, where u i is used to measure degree of heterogeneity among subjects and the dependence (association) within subjects. If the sample size is large enough, a semi-parametric approach is often preferred because it does not rely on assumptions that are difficult to verify. To adapt this approach to proportional hazard models with frailty, the EM algorithm (Expectation-Maximization algorithm) can be used (Dempster et al., 1977).
An alternative approach to fit semi-parametric gamma frailty models based on penalized partial likelihood maximization leads to the same estimates as the EM algorithm. This technique, however, can be extended to a semi-parametric model with normally distributed random effects (Duchateau and Janssen, 2007).

Descriptive Analysis
Of a total of 822 inpatients with bipolar disorder, a sample of 206 patients who were hospitalized from September 11, 2013 to March 12, 2019 and those who have experienced at least one event of relapse of bipolar disorder were included in the study. To select the sample, Health information management system (HIMS) books available at all bipolar wards of Amanuel mental specialized Hospital were reviewed and 206 inpatients were selected randomly.
Table 2 also shows that a total of 418 inpatient admissions (relapses) were registered for the 206 inpatients under study within the study time from September 11, 2013 to March 12, 2019. Among these admissions, 161 (38.6%) admissions were for male patients, 333 (79.7%) admissions were for patients from urban areas, 220 (52.6%) admissions were for patients from Addis Ababa, only 2 (0.5%) were for patients from Benishangul-Gumuz, 325 (77.8%) admissions were for unmarried patients, 372 (89%) admissions were for unemployed patients and 364 (87.1%) admissions were registered for uneducated patients as shown in the fifth and sixth columns of Table 2.
The median and mean follow up times were 13.875 months and 20.366 months respectively as shown in Table 3. The average age of the patients at first admission was 33.33 years. The

Results from Anderson and Gill independent increment model (AG), total-time and gap-time based Prentice William and Peterson conditional models (PWP-TT and PWP-GT) and
Semiparametric shared gamma frailty model are displayed in the appendix. Coefficients of covariates, hazard ratios, robust standard errors (for non-frailty recurrent models), adjusted standard errors (for frailty model), z-values and p-values have been computed. The estimated regression coefficients for residence, religion (Protestant and Muslim), substance abuse, treatment type (therapy and both drug and therapy) and stopping medication or using it erratically were negative for all models. Conversely, the estimated coefficients for marital status, employment status, having history of previous relapse and childhood stressful life events are positive.

Model comparison
The likelihood ratio test (LRT) was used to compare AG, PWP-TT, PWP-GT and semiparametric shared gamma frailty models and the result in Table 4 show that the semi-parametric shared gamma frailty model with the smallest LRT is the best fitting model to our data.

Results from semi-parametric shared gamma frailty model
As can be seen in the second column of Table 5, the hazard ratio corresponding to each of the variables: age, marital status, region, employment status, previous relapse, childhood stress factors, stressful life events and seasonal case is greater than one while the hazard ratio for each of the remaining variables is below one.   The last column gives p-values for testing the significance of each coefficient. Thus, Marital status, with p=0.01; employment status (p=0.02), residence (p-value=0.02) and substance abuse (p-value=0.05) have significant effect on the response variable, time-to-recurrent relapse. The hazard ratio for unemployed patients is greater than one (exp (coefficient) =1.52678).

Results on measure of dependency and frailty variance
The results in table 6 below describe the estimated measures of dependency between recurrent relapses within patients. The results are part of the estimation of the semi-parametric shared gamma frailty model obtained by using EM algorithm.
Gamma distributed estimated frailty variance; Var [U] is 0.079 with 95% confidence interval [0.00, 0.297]. Unobserved estimated frailty for each individual, 12.605, which is shared by all recurrent observations within the same individual, is significantly greater than one. Table 6 also presents estimated values of 0.038 and 0.037 for Kendall's tau and Median concordance respectively.

Results for heterogeneity test
The log-likelihood result for frailty (full) and no-frailty (null) models are −2046.511 and −2046.014 respectively. This gives the observed likelihood ratio test value 0.994 presented in Table 4. The corresponding p-value for one-sided likelihood ratio test equals 0.16 while the p-value for the Commenges-Anderson test for heterogeneity is 0.31. Hence, one-sided test of heterogeneity for the given sample size and number of relapses per patient is not supported.

Discussion
Before analyzing our data, we placed two different data frames based on total time approach for AG and PWP-total time models and based on gap time for PWP-Gap time and for semiparametric shared gamma frailty models. This structure for data frame was similarly used by Ozga et al. (2018) for comparison of recurrent event models for composite end points. Summary of the fit results of the four models are presented in the appendix. The estimated coefficients and other results showed disagreement among the four recurrent event models. As stated by Amorim and Cai (2015), this disagreement is expected.
Based on the results of the LRT, we used semi-parametric shared gamma frailty model to analyze our recurrent relapse data. The same was used by Duchateau et al. (2003) to analyze recurrent asthma event rate. Rondeau (2010) has also used shared frailty model for recurrent breast cancer events.
The fit results of this model revealed that marital status has effect on time since the end of the preceding relapse to the start of the next relapse. Its hazard ratio was 1.46320 (Table 5). This implies that unmarried patients have higher risk of shorter period of time for having the next relapse than married patients on the same frailty level. In other words, unmarried patients have 1.46320 times shorter period of time between two successive relapses than married patients having the same level of frailty controlling for other variables in the model.
Similarly we have identified employment status as one of the main significant factors with hazard ratio 1.52678. This means unemployed patients have 1.52678 times higher risk of a shorter time-to-relapse than employed bipolar disorder patients conditioning on the same frailty.
Another finding of the present study is that rural patients had 0.67013 times less risk of a shorter time of relapse than urban patients at the same level of frailty adjusting for other covariates.
Similarly, patients who did not abuse substance had 0.76585 times lower risk of next relapse than patients with substance use disorder (substance abuse) given the same frailty and controlling for other variables. The result of the present study is consistent with the findings of Taheri et al. (2016). They used Semi-parametric penalized frailty model and found that substance abuse have significant effect on recurrent relapse in bipolar disorder patients. Our finding is also in agreement with the findings by Shim et al. (2017) that marital status and employment status are significant.
The present study also found that age and type of treatments have no significant effect on time-to-recurrent relapse. This result does not correspond with the findings of previous studies by Vejdani-aram et al. (2017).
The present findings about the effect of employment status and residence are also similar to those of previous studies conducted by Najafi-Vosough et al., (2016). But our results on age, sex, marital status, season, family history and type of treatment are not consistent with their results on these variables.
The model used for analyzing our recurrent bipolar disorder data is different from those in the stated studies. For example, Taheri et al. (2016) used semi-parametric penalized frailty model, Vejdani-aram et al. (2017) used fragmented frailty model, Kessing et al. (1998) used cox proportional hazard model and the others used logistic regression to analyze recurrent relapse in bipolar disorder patients.
After identifying the effect of observed covariates, we have presented adjusted parameters for correlation in Table 6 and the frailty variance is 0.079. This value is interpreted as a measure of the correlation between the survival times in the recurrent relapse in patients with bipolar disorder. This result is expected as suggested by Wienke (2010). The interpretation is different from univariate frailty case, in which the value of the frailty variance indicate the measure of unobserved heterogeneity in the study population. Our model is appropriate to handle the correlation between survival times since the value for frailty variance is not zero.
The additional measure of variance of the frailty variable is dependency. As presented in table 6, the value is greater than zero, suggesting that there is dependency between the survival times of the recurrent relapse in patients with bipolar disorder.
Another value in Table 6 is the value 12.605 for the frailty variable U, which follows gamma frailty distribution. As expected, the value is greater than one meaning that the value of the same frailty variable is shared by all recurrent events in patients.
The frailty variable is responsible for creating dependency between survival times of relapse in patients. We found the dependency between the event times to be 0.038 as shown in table 6. As proposed by Hougaard (2012), the dependency between survival times of recurrent relapse is measured by Kendall's tau. This is also an expected result and we conclude that the event times are dependent since the value is greater than zero.
Finally, the null hypothesis of no heterogeneity between survival times against the alternative that there is heterogeneity is tested by using likelihood ratio test, which follows 50:50 mixture of χ 2 0 and χ 2 1 . The heterogeneity test was not significant at 5% level of significance. Additionally, we used Commenges-Andersen test for heterogeneity, which also follows a mixture of chi-square distribution as likelihood ratio.
We failed to reject the null hypothesis of no heterogeneity based on the Commenges-Andersen test and likelihood ratio test (Table 7). Both tests indicated that there is no evidence to reject the null hypothesis of no heterogeneity in the recurrent event times between patients. This does not mean that frailty variance is zero and there is no correlation between survival times within patients. This is because in frailtyEM, Commenges-Andersen test is performed before the actual maximization of the likelihood, as it does not depend on the frailty distribution and it does not require the actual estimation of the frailty model (Balan and Putter, 2019).
The asymptotic behavior Likelihood ratio test converges to mixture of chi-square at point mass zero and with one degree of freedom when the sample size goes to infinity (Zhi et al., 2005). The reason for the unexpected result may be due to the sample size, number of relapses within patients and magnitudes of frailty variance.
According to the final suggestion made in the above study, the likelihood test for homogeneity is conservative and open problem if the sample size is for recurrent event and the number cluster for other events is finite. Claeskens et al. (2008) discussed that not only likelihood ratio test but also Commenges-Andersen test (score test) is still an interesting open problem to test homogeneity in the null hypothesis for semi-parametric frailty model.

Conclusion
The gap-time based semi-parametric shared gamma frailty model was found to be the most appropriate model to analyze time-to-recurrent relapse in inpatients with bipolar disorder. This model was used to identify some important risk factors that are responsible for further relapse. The appropriate multivariate survival recurrent event model used in this study was the gap-time based semi-parametric shared gamma frailty model. Using EM algorithm on this model, the important risk factors identified to have effect on time since the end of last relapse to the start of the next relapses are marital status, substance abuse, employment status and residence. The value of Kendall's' tau showed that there is dependency between survival times of recurrent relapses.
In many studies, cox proportional model and logistic regression models were used to analyze recurrent event data. These lead to underestimate and loosing most important information about latter events that occur after the first event. Based on our findings, we recommend researchers to use more appropriate recurrent event models such as AG, PWP and shared frailty models based on their objectives and research questions.
Furthermore, we recommend not only treating bipolar patients within family members and within the community but also educating them and the community about the features of disorder and creating awareness on each risk factor so as to increase the gap time between the ends of prior relapses to the next relapse.