Modelling location , scale and shape parameters of the birnbaum-saunders generalized t distribution

Abstract: The Birnbaum-Saunders generalized t (BSGT) distribution is a very flflexible family of distributions that admits different degrees of skewness and kurtosis and includes some important special or limiting cases available in the literature, such as the Birnbaum-Saunders and BirnbaumSaunders t distributions. In this paper we provide a regression type model to the BSGT distribution based on the generalized additive models for location, scale and shape (GAMLSS) framework. The resulting model has high flflexibility and therefore a great potential to model the distribution parameters of response variables that present light or heavy tails, i.e. platykurtic or leptokurtic shapes, as functions of explanatory variables. For different parameter settings, some simulations are performed to investigate the behavior of the estimators. The potentiality of the new regression model is illustrated by means of a real motor vehicle insurance data set.


Introduction
The Birnbaum-Saunders (BS) distribution is the most popular model used to describe the lifetime process under fatigue.It was proposed by Birnbaum and Saunders (1969) due to the problems of vibration in commercial aircraft that caused fatigue in the materials.This distribution is also known as the fatigue life distribution and can be used to represent failure time in various scenarios.As reported by Pescim et al. (2014), the BS distribution has received wide ranging applications in past years that include: modelling of hourly SO2 concentrations at ten monitoring stations located in different zones in Santiago (Leiva et al., 2008); modelling of diameter at breast height distributions of near-natural complex structure silver fir-European beech forests (Podlaski, 2008); modelling of hourly dissolved oxygen (DO) concentrations observed at four monitoring stations located at different areas of Santiago (Leiva et al., 2009;Vilca et al., 2010); statistical analysis of redundant systems with one warm stand-by unit (Nikulin and Tahir, 2010), among others.
Because of the widespread study and applications of the BS distribution, there is a need for new generalizations of this distribution.Daz-Garca and Leiva (2005)  Despite some of those BS generalized distributions induce asymmetry, symmetry and promote weight variation of the tail, they do not provide all these shapes in the same density function.A highly flexible model which admits light or heavy tails, shaper or flflatter peaked shape and it has some important special and/or limiting cases, is the Birnbaum-Saunders generalized t (BSGT) distribution pro-posed by Genc (2013).This generalization of the BS distribution contains some models previously studied in the literature and, therefore, the BSGT enables to study and t various types of data with different shapes by a unifed approach.
In many practical applications, the responses are affected by explanatory variables such as the socioeconomics and school levels, blood pressure, cholesterol level, soil quality, climate, among many others.BS regression models are widely used to estimate the reliability or predict the durability of non-repairable copies of materials.Among them, Rieck and Nedelman (1991) proposed a log-linear regression model based on the BS distribution.Diagnostic analysis for the BS regression model were developed by Galea et al. (2004), Leiva et al. (2007) and Xie and Wei (2007), while the Bayesian inference was introduced by Tisionas (2001).Barros et al. (2008) proposed a class of lifetime regression models that includes the log-Birnbaum-Saunders-t (BS-t) regression models as special case.Furthermore, Lemonte and Cordeiro (2009) and Villegas et al. (2011) studied the BS nonlinear and BS mixed models, respectively.However, those BS regression models follow the same idea of many previous regression type models in the literature such as generalized linear models (Nelder and Wedderburn, 1972), generalized additive models (Hastie and Tibshirani, 1990) and log location-scale models (Lawless, 2003;Cancho et al., 2009;Roman et al., 2012;Pascoa et al., 2013).These models use only the location parameter of the distribution of the response variable which is a major limitation since other parameters may need to be modelled.
In this context, Rigby and Stasinopoulos (2005) developed the generalized additive models for location, scale and shape (GAMLSS), a very general class of univariate regression models whose main advantage is that all parameters of a given distribution (that does not necessarily belong to the exponential family) can be modelled as parametric and/or additive nonparametric smooth functions of explanatory variables, which can lead to a simpler distribution for a given response variable Y, simplifying the interpretation of the problem in study.Within GAMLSS the shape of the conditional distribution of the response variable can vary according to the values of the explanatory variables, allowing great modelling flexibility.
In this paper, we introduce the BSGT distribution into the GAMLSS frame-work in order to provide a very flexible regression model for this family, modelling all of its four parameters using explanatory variables.The new regression model may be fitted to a data set with light or heavy tails, i.e. a platykurtic or leptokurtic response variable as, for example, the total claim amount of an insurance company.The rest of this paper is outlined as follows.Section 2 provides a brief review of the BSGT family of distributions.The BSGT is introduced into the GAMLSS framework in Section 3. Section 4 shows a simulation study with different values of the parameters.A real data set application regarding insurance is provided in Section 5 to show the BSGT flexibility, and comparing it with well-known models.Section 6 ends the paper with some concluding remarks.

The BSGT distribution -a brief review
Daz-Garca and Leiva (2005) proposed the GBS family of distributions defined by transformation from any random variable Z with symmetric distribution S (Gupta and Varga, 1993), with density given by where Z ~S( = 0, = 1, ) , δ correponds to the parameters inherited from the baseline distribution, c is the normalizing constant such that fY(y) is a proper density, K(⋅) is the kernel of the density of Z, α > 0 represents the shape parameter and β > 0 is the scale parameter and is also the median of the distribution.As α →0, the GBS distribution becomes symmetrical around β , whereas when α grows the distribution becomes increasingly positively skewed.Its probability density function (pdf) can be expressed as for y > 0.
If Z has a GT distribution, Z~GT(0,1, ν, τ), with pdf given by where −∞ < z < ∞ then the random variable Y obtained from transformation (1) has a BSGT distribution with pdf given by where y > 0, α > 1, β > 0, υ > 0 and τ > 0. As in (1) if the shape parameter α → 0, the distribution becomes near symmetrical around β and when α grows the model becomes increasingly positively skewed; β is a scale parameter and is also the median of the distribution; and ν and τ are the parameters related to the peak and tails of the distribution.
Note that as y ⟶ ∞ then ( | , , , ) = ( ) , the same order as a t distribution with ντ/2 degrees of freedom.Hence, small values of the product of and result in a heavier upper tail.Similarly, larger values of the product of ν and τ will result in a lighter upper tail.Parameter τ also affects the peak of the distribution: τ ≤ 1 results in a spiked peak in the distribution, with a sharp spike (i.e.infinite derivative) if τ < 1, while a larger τ results in an increasingly flatter peak.
Despite its flexibility that can combine symmetrical/asymmetrical shapes with light or heavy tails (i.e.leptokurtic or platykurtic densities), the BSGT model is important since it has some special or limiting cases already proposed in the literature such as the BS, BS-t, BS-Laplace, BS-Cauchy and BS-power exponential (BSPE) distributions, as displayed in figure1.

GAMLSS model for the BSGT distribution
GAMLSS are semi-parametric regression models that involve a distribution for the response variable (parametric part) and may involve non-parametric smoothing terms when modelling parameters of the distribution as functions of explanatory variables.The GAMLSS R implementation, called the gamlss package, includes distributions with up to four parameters that are commonly represented by μ for location, σ for scale and ν and τ for shape (Rigby and Stasinopoulos, 2005).Hence, for the BSGT distribution, we consider μ = β, σ = α, ν = ν and τ = τ to obey the established notation in GAMLSS framework in R software (Stasinopoulos and Rigby, 2007).Moreover, from this point, we say that a random variable Y follows a BSGT distribution, denoted by Y~BSGT(μ, σ, ν, τ).
The GAMLSS model for the BSGT distribution assumes that conditional on its parameters (μ, σ, ν and τ), observations are independent BSGT(μ, σ, ν, τ) variables with pdf given in (2), and can be expressed as where (•), k = 1,2,3,4, are the link functions, = ( , … , ) denotes the parameter vector associated to explanatory variables with design matrix and each ℎ function is a smooth non-parametric function of an explanatory variable , being typically a smoothing spline (for more details, see e.g.Hastie and Tibshirani, 1990) or P-spline (Eilers and Marx, 1996).

Selecting the response variable distribution and diagnostics
Two different stages comprehend the strategy to t a GAMLSS model: fitting and diagnostics.In the fitting stage, we t different models using a generalized Akaike information criterion (GAIC, for more information, see Voudouris et al., 2012) to compare them (the model with the smallest GAIC is selected).The Akaike information criterion (AIC; Akaike, 1974) and Schwarz Bayesian criterion (SBC, Schwarz, 1978) are special cases of the GAIC(k) when k = 2 and k = log(n), respectively.
In the diagnostic stage, we use the normalized (randomized) quantile residuals (Dunn and Smyth, 1996) that are defined by Where Φ is the inverse cumulative distribution function of a standard normal variable and = ( | ) is the fitted cumulative distribution function.The main advantage of this type of residual is that its true values , = 1, … , always have a standard normal distribution given the assumption that the model is correct, whatever the distribution of the response variable, i.e. if the model for the response variable is correct, the residuals have a standard normal distribution.

Selecting the explanatory variables
In order to select the explanatory variables for the BSGT model, we use a backward/forward algorithm implemented in gamlss package called StepGAICAll.A: 1) Step 1: select a model for μ using a forward GAIC selection procedure and fixing σ , ν and τ; 2) Step 2: select a model for σ using a forward GAIC selection procedure given the model for μ in Step 1 and fixing ν and τ; 3) Step 3: select a model for ν using a forward GAIC selection procedure given the models for μ and σ obtained in Steps 1 and 2, respectively, and fixing ;

4)
Step 4: select a model for τ using a forward GAIC selection procedure given the models for μ, σ and ν obtained in Steps 1, 2 and 3, respectively;

5)
Step 5: perform a backward GAIC selection procedure to select a model for ν given the models for μ, σ, and τ obtained from Steps 1, 2 and 4, respectively;

6)
Step 6: perform a backward GAIC selection procedure to select a model for σ given the models for μ, ν, and tau obtained from Steps 1, 5 and 4, respectively;

7)
Step 7: perform a backward GAIC selection procedure to select a model for μ given the models for σ, ν, and τ obtained from Steps 6, 5 and 4, respectively.The resulting model may contain different terms for each of the parameters μ, σ, ν and τ.

Computational functions
In order to perform a simulation study with the BSGT distribution and a real data set application using the BSGT regression model, we implemented this family into the gamlss package in R (for more details about GAMLSS framework estimation processes, see Rigby and Stasinopoulos, 2005) and the following functions will be available in the gamlss.distpackage: 1) dBSGT() gives the BSGT probability density function; 2) pBSGT() gives the BSGT cumulative distribution function (cdf); 3) qBSGT() gives the BSGT quantile function, i.e. inverse cdf; and 4) rBSGT() is the BSGT random number generator.It is noteworthy that we can also t the special and/or limiting cases of the BSGT distribution in gamlss, e.g. in order to fit a BS-t distribution, we use the following arguments within the main function gamlss(): tau.fix = TRUE and tau.start = 2 in order to fit τ = 2 in BSGT(μ, σ, ν, τ) model.

Simulation study
We performed a simulation study generating 12 different scenarios with two sample sizes (n = 100 and n = 500) using the rBSGT() function.The scenarios were chosen in such a way that all possible density shapes could be covered, using as true parameter values μ = 50 and 1) σ = 0.5 for near symmetrical (Table 1) and σ = 1.5 for very asymmetrical shapes (Table 2); 2) ν = 1.0 for heavy-tailed and ν = 5.0 for less heavy tailed; 3) τ = 1.5, τ = 2.0 and τ = 10.0 since a low value of τ tends to a sharper peaked shape (leptokurtic), while a high value of tends to a flatter peaked shape (platykurtic).
The simulation study was performed in a HP Proliant M530e Gen8 Computer under a Debian Linux operating system.Tables 1 and 2 present the true simulated parameter values, average estimates (AE) and standard deviations (SD) for the estimated parameters for near symmetrical ( σ = 0.5) and very asymmetrical ( σ = 1.5) scenarios, respectively.The required numerical evaluations are implemented in R software through the gamlss function (Stasinopoulos and Rigby, 2007).As expected, we observe (from Tables 1 and 2) that when n = 500 we obtain closer estimates compared to the true generating value and the SD values decrease.Moreover, we can note that parameters ν and τ are slightly more imprecise than μ and σ which could be happening since they are often high correlated.Also, distribution of the parameter estimators of ν and τ are highly positively skewed.

Application: motor vehicle insurance data
In this Section, we illustrate the usefulness of the BSGT regression model, using the GAMLSS framework, to the total claim amount (response variable, Y) from motor vehicle insurance policies over a twelve-month period in 2004−2005 (De Jong and Heller, 2008, p. 15).The original data set was composed of approximately 68,000 policies, but here, we used only those with at least one claim (totalling 3,911 policies).Using this reduced data set, Y ranges from 1.09 to 55,720.00, with mean=2,145.00,median=844.70,standard deviation=3,765.86,skewness=4.74 and kurtosis=38.58.
Since Y is a very positively skewed variable we used four different distributions besides the BSGT distribution which are possible suitable candidates for the response variable: the Box-Cox t (BCTo), generalized gamma (GG), inverse Gaussian (IG) distributions that are already available in gamlss.distpackage and the BS distributions which is a special case of the BSGT distribution.The co-variates used to build the models in order to explain Y are displayed in Table 3.
Here, we replaced by * = log(X + 1) to modify the high skewness exhibited by this variable.After some previous analysis, we excluded six observations that presented = 0, i.e. the vehicles with value equals zero and the only two observations with = 4, i.e. when there were four claims, since they were considered as outliers.Finally, we fitted several regression models using the backward/forward algorithm available in Section 3.Moreover, we considered a P-spline (pb; for more details, see Eilers and Marx, 1996) in both quantitative covariates ( X * and X ).Appropriate link functions for each of the parameters were chosen in all five distributions: when a distribution parameter θ has range −∞ < θ < ∞, we used the identity link function, whereas, when θ > 0 the logarithm link function was adopted.
A backward/forward selection of explanatory terms as described in Section 3 was performed for all parameters through stepGAICAll.A function in gamlss package (Stasinopoulos and Rigby, 2007) and values of global deviance (GD), Akaike information criterion (AIC) and Schwarz Bayesian criterion (SBC) were computed in order to compare all fitted models.Table 4 displays those statistics from the best tted models for each used distribution, and so, the BSGT regression model could be chosen as the more suitable model since it returned the smallest GD, AIC and SBC values (65,534.1,65,607.1 and 65,836.0,respectively).
We can note that the four covariates were considered on the location parameter μ in the final BSGT model and both of the quantitative ones did not require any smoothing function.From the model for the median μ in (4), we observe that the higher is the vehicle value the lower is the median total claim amount which is somewhat unexpected.The same occurs with the exposure during the year.From the other two covariates considered in μ, we can say analyzing Figures 2(a) and (b) that the greater is the number of claims, greater will be the median total claim amount and that people living in areas E and F tend to have higher median claim amounts, respectively.As it was observed in model (4), we jusy need a smoothing function to model the covariate * in σ.This relationship is showed in Figure 2(c) and we can note that for lower vehicle values there is a positive effect on the dispersion and after a certain point this relation becomes negative.Figures (d) and (e) present the relationship between the number of claims and driver's area of residence, respectively, with the dispersion.Further, the exposure during the year has a negative linear effect on dispersion.Figures 2(f)−(h) represent the relationships between selected covariates with the tails of the distribution.
Finally, the histogram and Q-Q plot of the normalized quantile residuals (Dunn and Smyth, 1996) of model ( 4) are displayed in Figure 3.

Concluding remarks
In this paper, we used the Birnbaum-Saunders generalized t (BSGT) distribution proposed by Genc (2013) which admits light or heavy tails, shaper or flatter peaked shape and it has some important special cases studied in the literature.Based on this distribution, we proposed a BSGT regression model using the flexibility of the GAMLSS framework (Rigby and Stasinopoulos, 2005).The new regression model can be used as an alternative to model light and heavy-tailed response variables as parametric and/or additive nonparametric smooth functions of explanatory variables.Hence, this extended regression model is very flexible in many practical situations.Moreover, we conducted a simulation study using 12 different scenarios in order to cover all possible BSGT density shapes: near symmetrical and very asymmetrical, light and heavy-tailed (i.e.platykurtic and leptokurtic).We also discussed model checking analysis using the normalized quantile residuals in the new regression model fitted to a real data.An application to insurance data set demonstrated that it can be used quite effectively to provide better fits than others flexible regression models.
proposed a family of generalized Birnbaum-Saunders (GBS) distributions based on countoured elliptical distributions such as Pearson VII and Student's t distributions, Vilca and Leiva (2006) introduced a BS model based on skew normal distributions.Gomez et al. (2009) extended the BS distribution from the slash-elliptic model.Vilca et al. (2010) and Castillo et al. (2011) developed the epsilon-skew Birnbaum-Saunders distribution.More recently, Cordeiro and Lemonte (2011) and Pescim et al. (2014) dened the beta Birnbaum-Saunders and the Kummer beta Birnbaum-Saunders models, respectively.

Figure 1 :
Figure 1: Relationships of the BSGT special models

Figure 3 (
a), apart from one outlier, show us that the residuals adequately follow a normal distribution.

Table 1 :
Real parameter value, average estimates (AE) and standard deviations (SD) based on 1,000 simulations of the near symmetrical version of the BSGT distribution