A new extension of the normal distribution

Abstract: Providing a new distribution is always valuable for statisticians. A new three-parameter distribution called the gamma normal distribution is defined and studied. Various structural properties of the new distribution are derived, including some explicit expressions for the moments, quantile and generating functions, mean deviations, probability weighted moments and two types of entropy. We also investigate the order statistics and their moments. Maximum likelihood techniques are used to fit the new model and to show its potentiality by means of two examples of real data. Based on three criteria, the proposed distribution provides a better fit than the skew-normal distribution.


Introduction
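In statistics, the normal distribution is the most popular model in applications to real data. When the number of observations is large, it can serve as an approximate distribution for other models. The probability density function (pdf) (for x ∈ R) of the normal N(µ,σ) distribution is

$$g(x;\mu,\sigma) = \frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \tag{1}$$

where −∞ < µ < ∞ is a location parameter, σ > 0 is a scale parameter and φ(·) is the standard normal pdf. Its cumulative distribution function (cdf) is given by

$$G(x;\mu,\sigma) = \Phi\!\left(\frac{x-\mu}{\sigma}\right), \tag{2}$$

where Φ(·) is the standard normal cdf. A family of univariate distributions generated by gamma random variables was proposed by Zografos and Balakrishnan (2009) and Ristic and Balakrishnan (2011). They defined the gamma-G ("GG" for short) distribution from any baseline cdf G(x), x ∈ R, using an additional shape parameter a > 0, by the pdf and cdf

$$f(x) = \frac{1}{\Gamma(a)}\,\{-\log[1-G(x)]\}^{a-1}\, g(x) \tag{3}$$

and

$$F(x) = \frac{\gamma(a, -\log[1-G(x)])}{\Gamma(a)} = \gamma_1(a, -\log[1-G(x)]), \tag{4}$$

respectively, where g(x) = dG(x)/dx, $\Gamma(a) = \int_0^\infty t^{a-1} e^{-t}\,dt$ is the gamma function, and $\gamma(a,z) = \int_0^z t^{a-1} e^{-t}\,dt$ and γ1(a,z) = γ(a,z)/Γ(a) are the incomplete gamma function and the incomplete gamma function ratio, respectively.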
Each new GG distribution can be obtained from a specified G distribution. For a = 1, the G distribution is a basic exemplar with a continuous crossover towards cases with different shapes (for example, a particular combination of skewness and kurtosis). Zografos and Balakrishnan (2009) motivated the GG distribution as follows. Let X(1),...,X(n) be upper record values from a sequence of i.i.d. random variables from a population with pdf g(x). Then, the pdf of the nth upper record value is given by (3) with a = n. A logarithmic transformation of the baseline distribution G transforms the random variable X with density function (3) to a gamma distribution. In other words, if X has the density (3), then the random variable Z = −log[1 − G(X)] has a gamma density $\pi(z;a) = \Gamma(a)^{-1} z^{a-1} e^{-z}$, z > 0, say Z ∼ G(a,1). The converse is also true: if Z ∼ G(a,1), then the random variable $X = G^{-1}(1 - e^{-Z})$ has the GG density function (3). Nadarajah et al. (2013) derived some mathematical properties of (3) in the most simple, explicit and general forms for any G distribution.
In this paper, we study some structural properties of the gamma normal (GN) distribution, which generalizes the normal distribution. In Section 2, we introduce the GN distribution and provide plots of its density function. We derive expansions for the pdf and cdf (Section 3) and explicit expressions for the quantile function (Section 4), ordinary and incomplete moments and Bonferroni and Lorenz curves (Section 5), generating function (Section 6) and entropies (Section 7). In Section 8, we investigate the order statistics and their moments. The estimation of the model parameters is performed by maximum likelihood in Section 9 and two applications are provided in Section 10. Concluding remarks are given in Section 11.

The GN distribution
By taking the pdf (1) and cdf (2) of the normal distribution with location parameter µ ∈ R and dispersion parameter σ > 0, the pdf and cdf of the GN distribution are obtained from equations (3) and (4) (for x ∈ R) as

$$f(x) = \frac{1}{\sigma\,\Gamma(a)}\left\{-\log\left[1-\Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]\right\}^{a-1}\phi\!\left(\frac{x-\mu}{\sigma}\right) \tag{5}$$

and

$$F(x) = \gamma_1\!\left(a, -\log\left[1-\Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]\right). \tag{6}$$

Evidently, the GN distribution is defined by a simple transformation: if Z ∼ G(a,1), then the random variable $X = \mu + \sigma\,\Phi^{-1}(1 - e^{-Z})$ has the density function (5). Hereafter, a random variable X following (5) is denoted by X ∼ GN(a,µ,σ). The density function (5) does not involve any complicated function and the normal distribution arises as the basic exemplar for a = 1. This is an advantage of the current generalization. We motivate the paper by comparing the performances of the GN, normal and skew-normal models applied to two real data sets.
In Figure 1, we display some possible shapes of the density function (5) for some parameter values. It is evident that the GN distribution is much more flexible than the normal distribution.
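As a numerical check on the density, the following minimal Python sketch evaluates the GN pdf on the log scale, assuming it is the gamma-G density (3) with normal baseline, that is, f(x) = {−log[1 − Φ(z)]}^(a−1) φ(z)/(σΓ(a)) with z = (x − µ)/σ. The function names `neg_log_survival`, `gn_pdf` and `trapezoid` are illustrative and not part of the paper; only the Python standard library is used.

```python
import math

def neg_log_survival(z):
    """Return w = -log[1 - Phi(z)], with Phi the standard normal cdf."""
    if z < 0.0:
        # Phi(z) is small here; log1p keeps precision when 1 - Phi(z) is near 1
        return -math.log1p(-0.5 * math.erfc(-z / math.sqrt(2.0)))
    # 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)) stays accurate far into the upper tail
    return -math.log(0.5 * math.erfc(z / math.sqrt(2.0)))

def gn_pdf(x, a, mu=0.0, sigma=1.0):
    """GN(a, mu, sigma) density, evaluated through its logarithm."""
    z = (x - mu) / sigma
    if abs(z) > 37.0:  # normal tails are not representable in double precision here
        return 0.0
    w = neg_log_survival(z)
    log_f = ((a - 1.0) * math.log(w) - math.lgamma(a)
             - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma))
    return math.exp(log_f)

def trapezoid(f, lo, hi, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (hi - lo) / n
    return h * (0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n)))

# The density should integrate to one; a = 2.5 is an arbitrary illustrative value.
total = trapezoid(lambda x: gn_pdf(x, a=2.5), -12.0, 12.0, 2400)
print(f"integral of the GN(2.5, 0, 1) density: {total:.6f}")
```

For a = 1 the factor {−log[1 − Φ(z)]}^(a−1) equals one and `gn_pdf` reduces to the normal density, which gives a simple sanity check of the implementation.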
The new distribution is easily simulated as follows: if V is a gamma random variable with parameter a, then

$$X = \mu + \sigma\,\Phi^{-1}\!\left(1 - e^{-V}\right)$$

has the GN(a,µ,σ) distribution. This scheme is useful because fast generators for gamma random variables exist and the standard normal quantile function is available in most statistical packages.
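The simulation scheme can be sketched in a few lines of Python using only the standard library. The function name `rgn`, the constant `TINY` and the seed are illustrative assumptions. The sketch uses the identity Φ^(−1)(1 − q) = −Φ^(−1)(q) with q = e^(−V), so that the upper tail is not lost to rounding when 1 − e^(−V) is close to one.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

STD = NormalDist()              # standard normal; STD.inv_cdf is Phi^{-1}
TINY = 2.2250738585072014e-308  # smallest positive normal double, guards inv_cdf(0)

def rgn(a, mu=0.0, sigma=1.0, rng=random):
    """Draw one GN(a, mu, sigma) variate as mu + sigma * Phi^{-1}(1 - exp(-V))."""
    v = rng.gammavariate(a, 1.0)   # V ~ gamma with shape a and unit scale
    q = math.exp(-v)               # q = 1 - Phi(z)
    if q < 0.5:
        # upper half: Phi^{-1}(1 - q) = -Phi^{-1}(q) avoids forming 1 - q
        z = -STD.inv_cdf(max(q, TINY))
    else:
        # lower half: p = 1 - exp(-V) computed accurately with expm1
        z = STD.inv_cdf(max(-math.expm1(-v), TINY))
    return mu + sigma * z

rng = random.Random(2024)  # fixed seed, for reproducibility only
sample = [rgn(1.0, mu=2.0, sigma=3.0, rng=rng) for _ in range(20000)]
print(f"a = 1: sample mean {fmean(sample):.3f}, sample sd {pstdev(sample):.3f}")
```

With a = 1 the variable V is standard exponential, so 1 − e^(−V) is uniform on (0,1) and the draws are N(µ,σ); the sample mean and standard deviation should therefore be close to µ and σ.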

Useful expansions
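Expansions for equations (5) and (6) can be derived using the concept of exponentiated distributions. Consider the exponentiated normal (EN) distribution with power parameter a > 0 defined by Y ∼ EN(a,µ,σ), with cdf and pdf given by $H_a(y) = \Phi\!\left(\frac{y-\mu}{\sigma}\right)^{a}$ and $h_a(y) = \frac{a}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)\Phi\!\left(\frac{y-\mu}{\sigma}\right)^{a-1}$, respectively.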
The properties of several exponentiated distributions have been studied by some authors; see Mudholkar and Srivastava (1993). Then, equation (5) can be expressed as , (7) where $h_{a+k}(x)$ denotes the EN(a + k,µ,σ) density function. The cdf corresponding to (7) becomes where $H_{a+k}(x)$ denotes the EN cdf with parameters a + k, µ and σ.
If a > 0 is a real number, we can expand as where Combining equations (8) and (9), we obtain .
By differentiating the previous equation and changing indices, we can write where . Clearly, . Equation (11) is the main result of this section.
It reveals that the GN density function is a linear combination of EN densities. So, several properties of the GN distribution can be obtained from the corresponding properties of the EN distribution.

Quantile Function
The GN quantile function, say $Q(u) = F^{-1}(u)$, can be expressed in terms of the normal quantile function $Q_N(\cdot)$. The normal quantile function is given by $x = Q_N(u) = \sigma\Phi^{-1}(u) + \mu$. Inverting equation (6), we obtain the quantile function of X as

$$Q(u) = Q_N\!\left\{1 - \exp\!\left[-Q^{-1}(a, 1-u)\right]\right\} \tag{12}$$

for 0 < u < 1, where $Q^{-1}(a,u)$ is the inverse function of Q(a,z) = 1 − γ(a,z)/Γ(a). Quantities of interest can be obtained from (12) by substituting appropriate values for u. Further, the normal quantile function can be expressed as a power series (Steinbrecher, 2002); see equation (43) in Appendix A. Further, after some algebra (see Appendix A), we obtain where and the quantity dk was defined in Section 3.
By expanding the exponential function and using (14), we have (see Appendix A) , where the $p_r$'s are defined there. We can write .
By using equations (13) and (14), we can obtain from (12) where and Some algebraic details about (16) and other quantities of interest are given in Appendix A. Equations (13)-(15) are the main results of this section.
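As a numerical counterpart to the power series, the quantile function can also be evaluated directly. The sketch below assumes the form Q(u) = µ + σΦ^(−1){1 − exp[−Q^(−1)(a, 1 − u)]}, which follows from inverting the cdf F(x) = γ1(a, −log[1 − Φ(z)]); equivalently, w = Q^(−1)(a, 1 − u) solves γ1(a, w) = u. The names `gamma_p`, `gn_cdf` and `gn_quantile` are illustrative. The incomplete gamma ratio is computed from its standard series and inverted by bisection, which is a simple substitute for the expansions of this section and is intended for moderate values of a only.

```python
import math
from statistics import NormalDist

STD = NormalDist()
TINY = 2.2250738585072014e-308  # guards inv_cdf against an argument of exactly 0

def gamma_p(a, z):
    """Incomplete gamma ratio gamma_1(a, z) via its series z^a e^{-z} sum z^k / Gamma(a+k+1)."""
    if z <= 0.0:
        return 0.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-17 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return min(total, 1.0)

def gn_cdf(x, a, mu=0.0, sigma=1.0):
    """GN cdf: gamma_1(a, -log[1 - Phi(z)]) with z = (x - mu) / sigma."""
    s = 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))  # 1 - Phi(z)
    return 1.0 if s <= 0.0 else gamma_p(a, -math.log(s))

def gn_quantile(u, a, mu=0.0, sigma=1.0):
    """GN quantile: solve gamma_1(a, w) = u by bisection, then map w to the normal scale."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while gamma_p(a, hi) < u and hi < 512.0:  # bracket the root (moderate a assumed)
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_p(a, mid) < u:
            lo = mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    q = math.exp(-w)  # q = 1 - Phi(z)
    if q < 0.5:
        z = -STD.inv_cdf(max(q, TINY))
    else:
        z = STD.inv_cdf(max(-math.expm1(-w), TINY))
    return mu + sigma * z

median = gn_quantile(0.5, a=2.5)
print(f"median of GN(2.5, 0, 1): {median:.4f}; cdf at the median: {gn_cdf(median, 2.5):.6f}")
```

For a = 1 the ratio γ1(1, w) equals 1 − e^(−w), so the routine returns the ordinary normal quantile, which provides a convenient check.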

Moments
Here, we obtain the ordinary and incomplete moments of X. They can be immediately derived from the moments of Y following the EN(a,µ,σ) distribution. Hereafter, let Z be the standard GN(a,0,1) random variable. First, we obtain the moments of Z. Thus, we can write from (7) Further, we can express in terms of $Q_N(u)$ as Using (13) and (14), we can rewrite as where the quantities $e_{n,s}$ are determined from (13)-(15). The moments of X immediately follow from the moments of Z as $E(X^n) = E[(\mu + \sigma Z)^n] = \sum_{r=0}^{n}\binom{n}{r}\mu^{n-r}\sigma^{r}E(Z^r)$. The second representation for µ′n is based on the (n,r)th probability weighted moment (PWM) (for n and r positive integers) of the standard normal distribution given by where s(·) is given by (10) and τ can be expressed as (Nadarajah, 2008) where is the Lauricella function of type A (Exton, 1978) and the Pochhammer symbol $(a)_k = a(a+1)\cdots(a+k-1)$ indicates the kth rising factorial power of a with the convention $(a)_0 = 1$.
A third representation for $T_n(y)$ is based on the normal quantile function. Thus, equation (21) becomes After some algebra, using (13) and (14), we have where $e_{n,s}$ is given before. More details about (24) are given in Appendix B. The nth incomplete moment of X follows after a binomial expansion .
We can derive the mean deviations of Z about the mean and about the median M in terms of its first incomplete moment. They can be expressed as where and . The quantity $T_1(q)$ can be obtained from (22) (or (23) or (24)) with n = 1 and the measures δ1 and δ2 in (25) are immediately determined by setting and q = M, respectively.
For a positive random variable X, the Bonferroni and Lorenz curves are defined by $B(\pi) = T_1(q)/(\pi\mu'_1)$ and $L(\pi) = T_1(q)/\mu'_1$, respectively, where $q = F^{-1}(\pi) = Q_{GN}(\pi)$ comes from the quantile function (12) for a given probability π.
Next, we obtain the probability weighted moments (PWMs) of Z. They cover the summarization and description of theoretical probability distributions. The primary use of these moments is to estimate the parameters of a distribution whose inverse cannot be expressed explicitly. The (s,p)th PWM of Z is formally defined as Using (8), (11) and (14), we obtain (26) where for ≥ 1, , and Equations (17)-(19), (22)-(24) and (26) are the main results of this section. Some algebraic details are given in Appendix B.
The skewness and kurtosis measures can be calculated from the ordinary moments using well-known relationships. Plots of the skewness and kurtosis for selected parameter values as functions of a are displayed in Figure 2. In the plots of Figures 2a and 2c, σ = 10.50, whereas in those of Figures 2b and 2d

Generating function
A second representation for M(t) can be based on the quantile function. We have Expanding the exponential function, using (16) and after some algebra, we obtain Equations (27) and (28) are the main results of this section. The mgf of X is simply given by $M_X(t) = e^{\mu t} M(\sigma t)$. The characteristic function (chf) has many useful and important properties which give it a central role in statistical theory. Its approach is particularly useful in the analysis of linear combinations of independent random variables. Clearly, a simple representation for the chf $\phi_X(t) = M_X(it)$ of X, where $i = \sqrt{-1}$, is given by $\phi_X(t) = \int_{-\infty}^{\infty}\cos(tx) f(x)\,dx + i\int_{-\infty}^{\infty}\sin(tx) f(x)\,dx$.
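From the expansions $\cos(tx) = \sum_{r=0}^{\infty}\frac{(-1)^r (tx)^{2r}}{(2r)!}$ and $\sin(tx) = \sum_{r=0}^{\infty}\frac{(-1)^r (tx)^{2r+1}}{(2r+1)!}$, we obtain $\phi_X(t) = \sum_{r=0}^{\infty}\frac{(-1)^r t^{2r}}{(2r)!}E(X^{2r}) + i\sum_{r=0}^{\infty}\frac{(-1)^r t^{2r+1}}{(2r+1)!}E(X^{2r+1})$.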

Entropies
An entropy is a measure of variation or uncertainty of a random variable X. Two popular entropy measures are the Rényi and Shannon entropies (Shannon, 1951; Rényi, 1961). Here we consider the random variable Z ∼ GN(a,0,1). Thus, the Rényi entropy is defined as $I_R(\gamma) = \frac{1}{1-\gamma}\log\left\{\int_{-\infty}^{\infty} f^{\gamma}(x)\,dx\right\}$ for γ > 0 and γ ≠ 1.
Setting √(n + 1)x = y, we can easily determine the last integral and then rewrite ρn as By expanding the binomial term in (31), we can obtain an explicit expression for $I_R(\gamma)$, which holds for any real positive γ with γ ≠ 1, given by where ρk is determined from (32). Algebraic details can be found in Appendix D. Next, the Shannon entropy of a random variable Z is defined by E{−log[f(Z)]}. It is a special case of the Rényi entropy when γ ↑ 1. Taking the limit in equation (30) is very complicated, so we derive an explicit expression for the Shannon entropy from its definition. We can write where comes from (17) or (18) with n = 2.
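Both entropies can be cross-checked by direct numerical integration. The sketch below assumes the standard definitions, I_R(γ) = (1 − γ)^(−1) log ∫ f^γ(z) dz for the Rényi entropy and E{−log f(Z)} for the Shannon entropy, with f the standard GN(a,0,1) density. The function names, the integration range and the step size are illustrative choices, and the trapezoidal rule replaces the series expansions of this section; the range is adequate for γ not too close to zero.

```python
import math

def gn_logpdf(z, a):
    """Log-density of the standard GN(a, 0, 1) distribution."""
    if abs(z) > 37.0:
        return -math.inf
    if z < 0.0:
        w = -math.log1p(-0.5 * math.erfc(-z / math.sqrt(2.0)))  # -log[1 - Phi(z)]
    else:
        w = -math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return ((a - 1.0) * math.log(w) - math.lgamma(a)
            - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi))

def integrate(f, lo=-30.0, hi=30.0, n=6000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * (0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n)))

def renyi_entropy(a, gamma):
    """Renyi entropy of order gamma (gamma > 0, gamma != 1)."""
    integral = integrate(lambda z: math.exp(gamma * gn_logpdf(z, a)))
    return math.log(integral) / (1.0 - gamma)

def shannon_entropy(a):
    """Shannon entropy E{-log f(Z)}."""
    def integrand(z):
        lf = gn_logpdf(z, a)
        return 0.0 if lf == -math.inf else -math.exp(lf) * lf
    return integrate(integrand)

print(f"a = 2.5: Renyi(2) = {renyi_entropy(2.5, 2.0):.4f}, Shannon = {shannon_entropy(2.5):.4f}")
```

For a = 1 the results can be compared with the known normal values, log(2π)/2 + log(γ)/{2(γ − 1)} for the Rényi entropy and log(2πe)/2 for the Shannon entropy, and for γ close to 1 the Rényi value should approach the Shannon value.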

Order statistics
Order statistics have been used in a wide range of problems, including robust statistical estimation and detection of outliers, characterization of probability distributions and goodness-of-fit tests, entropy estimation, analysis of censored samples, reliability analysis, quality control and strength of materials. Suppose Z1,...,Zn is a random sample from the standard GN distribution and let Z1:n < ··· < Zn:n denote the corresponding order statistics. Using (7) and (8), the pdf of Zi:n can be expressed as .
Based on equations (14) and (15) Equation (38) is the main result of this section. It reveals that the pdf of the standard GN order statistics is a triple linear combination of EN densities with parameters (i+j)a+k+r, µ = 0 and σ = 1. So, several mathematical quantities of the GN order statistics such as ordinary and incomplete moments, mgf and mean deviations can be immediately obtained from those quantities of the EN distribution. It gives the density function of the GN order statistics as a power series of the standard normal cumulative function multiplied by the standard normal density function.
As an application of (37), the sth ordinary moment of Zi:n becomes , where $\tau_{s,(i+j)a+k+r-1}$ can be obtained from (19).
Another closed-form expression for can be derived using a result due to Barakat and Abdelkader (2004) applied to the independent and identically distributed case. Thus, . By expanding $[1-F(z)]^{j}$ and using (8), we obtain $J_j(s)$. For any real a > 0, we can write from equations (8) and (15) where $d_{m,k}$ is defined in Section 6 and the quantities $\tau_{n,r}$ are given in equation (19).

Estimation
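Here, we consider estimation of the unknown parameters of the GN distribution by the method of maximum likelihood. Let x1,...,xn be a random sample of size n from the GN(a,µ,σ) distribution. The log-likelihood function for the vector of parameters $\theta = (a,\mu,\sigma)^T$ can be expressed as

$$\ell(\theta) = -n\log\sigma - n\log\Gamma(a) - \frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n} z_i^2 + (a-1)\sum_{i=1}^{n}\log\{-\log[1-\Phi(z_i)]\}, \tag{39}$$

where $z_i = (x_i-\mu)/\sigma$. The components of the score vector U(θ) are given by

$$U_a(\theta) = -n\psi(a) + \sum_{i=1}^{n}\log\{-\log[1-\Phi(z_i)]\},$$

$$U_\mu(\theta) = \frac{1}{\sigma}\sum_{i=1}^{n} z_i + \frac{a-1}{\sigma}\sum_{i=1}^{n}\frac{\phi(z_i)}{[1-\Phi(z_i)]\log[1-\Phi(z_i)]},$$

$$U_\sigma(\theta) = -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} z_i^2 + \frac{a-1}{\sigma}\sum_{i=1}^{n}\frac{z_i\,\phi(z_i)}{[1-\Phi(z_i)]\log[1-\Phi(z_i)]},$$

where ψ(·) is the digamma function. Setting these expressions to zero and solving them simultaneously yields the maximum likelihood estimates (MLEs) of the three parameters. We use the matrix programming language Ox (MaxBFGS subroutine; see, for example, Doornik (2006)) and the procedure NLMixed in SAS to compute the MLE $\hat\theta$. For interval estimation of the model parameters, we require the expected information matrix. The 3 × 3 total observed information matrix J(θ) is given by $J(\theta) = -\partial^2\ell(\theta)/\partial\theta\,\partial\theta^T$, whose elements are listed in Appendix E. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, the asymptotic distribution of $\sqrt{n}(\hat\theta-\theta)$ is $N_3(0,K(\theta)^{-1})$, where K(θ) = E{J(θ)} is the expected information matrix. The multivariate normal $N_3(0,J(\hat\theta)^{-1})$ distribution can be used to construct approximate confidence intervals for the parameters.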
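The log-likelihood is straightforward to code. The following Python sketch assumes the form implied by the GN density, ℓ(θ) = −n log σ − n log Γ(a) − (n/2) log(2π) − (1/2)Σ z_i² + (a − 1)Σ log{−log[1 − Φ(z_i)]} with z_i = (x_i − µ)/σ. The function name `gn_loglik` and the small data vector are hypothetical; the paper itself maximizes the likelihood with Ox and SAS, and any general-purpose optimizer could be applied to this function.

```python
import math

def gn_loglik(theta, xs):
    """Log-likelihood of GN(a, mu, sigma) for observations xs; theta = (a, mu, sigma)."""
    a, mu, sigma = theta
    if a <= 0.0 or sigma <= 0.0:
        return -math.inf  # outside the parameter space
    n = len(xs)
    total = -n * math.log(sigma) - n * math.lgamma(a) - 0.5 * n * math.log(2.0 * math.pi)
    for x in xs:
        z = (x - mu) / sigma
        if abs(z) > 37.0:
            return -math.inf  # observation numerically impossible under theta
        if z < 0.0:
            w = -math.log1p(-0.5 * math.erfc(-z / math.sqrt(2.0)))  # -log[1 - Phi(z)]
        else:
            w = -math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
        total += (a - 1.0) * math.log(w) - 0.5 * z * z
    return total

# Hypothetical data, used only to show the call signature.
xs = [0.1, -0.5, 1.2, 2.0, 0.7]
print(f"loglik at (a, mu, sigma) = (1.5, 0.2, 1.1): {gn_loglik((1.5, 0.2, 1.1), xs):.4f}")
```

Setting a = 1 removes the last sum and the function returns the ordinary normal log-likelihood, which is the restricted fit needed for the likelihood ratio comparison with the normal model.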
The likelihood ratio (LR) can be used for testing the goodness of fit of the GN distribution and for comparing this distribution with the normal model. We can compute the maximum values of the unrestricted and restricted log-likelihoods to construct LR statistics for testing some sub-models of the GN distribution. For example, we may use the LR statistic to check if the fit using the new distribution is statistically "superior" to a fit using the normal distribution for a given data set. In any case, hypothesis tests of the type H0 : ψ = ψ0 versus H : ψ ≠ ψ0, where ψ is a vector formed with some components of θ and ψ0 is a specified vector, can be performed using LR statistics. For example, the test of H0 : a = 1 versus H : H0 is not true is equivalent to comparing the GN and normal distributions, and the LR statistic then reduces to $w = 2\{\ell(\hat a,\hat\mu,\hat\sigma) - \ell(1,\tilde\mu,\tilde\sigma)\}$, where $\hat a$, $\hat\mu$ and $\hat\sigma$ are the MLEs under H and $\tilde\mu$ and $\tilde\sigma$ are the estimates under H0.
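For the test of H0 : a = 1, the statistic w is referred to the chi-square distribution with one degree of freedom, whose upper tail probability equals erfc(√(w/2)). A minimal Python sketch follows; the function name `lr_test_one_df` is illustrative, and the two log-likelihood values are assumed to be −63.05 (GN) and −65.20 (normal), the values consistent with the statistic w = 4.30 reported for the carbohydrates data.

```python
import math

def lr_test_one_df(loglik_full, loglik_null):
    """LR statistic w = 2(l_full - l_null) and its chi-square(1) p-value."""
    w = 2.0 * (loglik_full - loglik_null)
    # P(chi2_1 > w) = P(|N(0,1)| > sqrt(w)) = erfc(sqrt(w / 2))
    p_value = math.erfc(math.sqrt(max(w, 0.0) / 2.0))
    return w, p_value

w, p = lr_test_one_df(-63.05, -65.20)
print(f"w = {w:.2f}, p-value = {p:.4f}")
```

The one-degree-of-freedom reference distribution applies here because a = 1 lies in the interior of the parameter space a > 0.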

Applications
In this section, the potentiality of the GN model is illustrated in two applications to real data. An alternative analysis of these data can be performed using the normal distribution. The beta-normal (BN) (Eugene et al., 2002) and Kumaraswamy-normal (KwN) models extend the normal model and they can also be used to fit data that come from a distribution with heavy tails, reducing the influence of aberrant observations.

The BN distribution
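The BN density function with parameters µ and σ and two extra shape parameters α > 0 and β > 0 is given by

$$f(x) = \frac{1}{\sigma B(\alpha,\beta)}\,\Phi\!\left(\frac{x-\mu}{\sigma}\right)^{\alpha-1}\left[1-\Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{\beta-1}\phi\!\left(\frac{x-\mu}{\sigma}\right), \tag{40}$$

where B(α,β) is the beta function.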

Kumaraswamy-normal (KwN) distribution
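The KwN density function with parameters µ and σ and two extra shape parameters a > 0 and b > 0 is given by

$$f(x) = \frac{a\,b}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\frac{x-\mu}{\sigma}\right)^{a-1}\left[1-\Phi\!\left(\frac{x-\mu}{\sigma}\right)^{a}\right]^{b-1}. \tag{41}$$

For a = b = 1, we have the normal distribution. Clearly, equation (41) is much simpler than (40).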

Application 1: Carbohydrates data
The first example refers to data from agronomic experiments (Matsuo, 1986) conducted at the Federal University of Paraná. The main objective was to verify the content of carbohydrates (in %) of the corn farms. Some summary statistics for the carbohydrates data are: mean = 66.34, median = 66.64, minimum = 62.35 and maximum = 68.46.
The parameters of each model are estimated by maximum likelihood (Section 9) using the subroutine NLMixed in SAS. We report the MLEs (and the corresponding standard errors in parentheses) of the parameters and the values of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC) and Bayesian Information Criterion (BIC) in Table 1. The lower the values of these criteria, the better the fit. Since the values of these statistics are smaller for the GN distribution compared to their values for the other three models, we can conclude that the new distribution is the best model among the four to explain the current data. An analysis under the GN model also provides a check on the appropriateness of the normal model and indicates the extent to which inferences depend upon the model. For example, the LR statistic for testing the hypothesis H0 : a = 1 versus H : H0 is not true, i.e. to compare the GN and normal models, is w = 2{−63.05 − (−65.20)} = 4.30 (p-value = 0.0381), which supports the new model.

Application 2: CO data
The CO data set can be found at http://home.att.net/rdavis2/cigra.html. The data include n = 384 records of CO measurements, in milligrams, in cigarettes of several brands. Some summary statistics for the CO data are: mean = 11.34, median = 12.00, minimum = 0.05 and maximum = 22.00. In each case, the parameters are estimated by maximum likelihood using the subroutine NLMixed in SAS. We report the MLEs (and the corresponding standard errors in parentheses) of the parameters and the values of the AIC, CAIC and BIC statistics in Table 2. Since the values of these statistics are smaller for the GN and KwN distributions compared to those values for the other models, the new distribution

Concluding remarks
In this paper, we propose a new model called the gamma-normal distribution which extends the normal distribution. The proposed distribution is versatile for fitting real data and could be a good alternative to the normal distribution and two of its recent generalizations. We study some of its structural properties. We provide explicit expressions for the ordinary and incomplete moments, quantile and generating functions, mean deviations, Rényi entropy, Shannon entropy, order statistics and their moments. We derive a power series expansion for its quantile function which is useful to obtain alternative formulae for several mathematical measures. The model parameters are estimated by maximum likelihood and the observed information matrix is determined. The potentiality of the new model is illustrated by means of two examples.

Appendix A: Quantile function
We derive a power series for $Q_{GN}(u)$ in the following way. First, we use a known power series for $Q^{-1}(a,1-u)$. Second, we obtain a power series for the argument $1 - \exp[-Q^{-1}(a,1-u)]$. Third, we consider the power series for the normal quantile function given in Steinbrecher (2002) to obtain a power series for $Q_{GN}(u)$.
We introduce the following quantities defined by Cordeiro and Lemonte (2011). Let $Q^{-1}(a,z)$ be the inverse function of Q(a,z) = 1 − γ(a,z)/Γ(a). The inverse function $Q^{-1}(a,1-u)$ is given on the Wolfram website.

Figure 1: Plots of the new density function for some parameter values. (a) For different values of a with µ = 0 and σ = 1. (b) For different values of a and σ with µ = 0. (c) For different values of a, µ and σ.

Figure 2: (a) Skewness of X as function of a for some values of µ. (b) Skewness of X as function of a for some values of σ. (c) Kurtosis of X as function of a for some values of µ. (d) Kurtosis of X as function of a for some values of σ.
pj and hj,r are given in Section 4.

For α = β = 1, the BN density reduces to the normal distribution. Recently, Alexander et al. (2012) and Cordeiro et al. (2012) proposed the generalized beta-generated and McDonald normal distributions, respectively. The first generated model contains, as special cases, several important distributions discussed in the literature such as the normal, exponentiated normal, BN and KwN distributions, among others.

Figure 3 displays the estimated densities and cumulative functions and the empirical cdf for the GN and normal models. These plots reveal a better GN fit to these data.

Figure 3: (a) Estimated densities of the GN and normal models for the carbohydrates data. (b) Estimated cumulative functions and the empirical cdf for the carbohydrates data.
is a very competitive model to explain these data and it is more parsimonious. The LR statistic for comparing the GN and normal models is w = 2{−962.9 − (−973.2)} = 20.6 (p-value < 0.0001), which yields favorable support toward the first model.

Figure 4 displays the estimated densities and estimated cumulative functions and the empirical cdf for the BN and normal models. So, the proposed model provides a better fit to these data.

Table 1: MLEs and information criteria.

Table 2: MLEs and information criteria.