Statistical Analysis of Market Penetration in a Mandatory Privatized Pension Market Using Generalized Logistic Curves

In this paper we analyze market penetration of the Mexican Pension System. This market is unique in two respects: it is mandatory and it is private. Very few markets in the world have these two characteristics. In Mexico, the pension system became privatized in July 1997. By the end of 1999, more than 98% of workers (in the formal sector) had affiliated themselves with some private pension provider. We used two simple statistical methods of analysis of market share to draw some conclusions about how market share unfolds in a mandatory but privatized market. Our first descriptive analysis is based on a generalized logistic growth curve and the second on some simple linear regression fitting. Our results show that individual pension funds did not have similar growth patterns. Early market leaders (in terms of market share) did not necessarily stay leaders in the end. However, the first 12 months turned out to be critical for market share.


Introduction
A mandatory privatized pension system provides a unique opportunity to study market penetration. It is unique in two respects. (1) It is mandatory. Therefore, there is no choice about "buying". (2) It is private. Therefore, a worker can affiliate himself or herself with any of the providers. Very few markets in the world have these two characteristics.
The objective of this paper is to provide a more general method (than fitting standard logistic growth curves) of empirically estimating market saturation. This method provides us with a way of classifying the different companies in the market. One potential use of the model is to predict later growth in market share for every company based on early trends. Since staying in this market is very costly at the outset (because of high initial fixed costs), a model that predicts market share from early behavior can be very useful. For example, companies can use this model to decide whether to "stick around" or cut their losses and exit from the market.
What is so special about mandatory participation in the market? Consider the market penetration for television sets (Young and Ord, 1989). It is customary to talk about market penetration as if every person would buy only one set. That does not happen. On the other hand, mandatory pension schemes are unique: all workers have to have an account and each person can have only one. The law does not allow multiple accounts. Thus, market saturation is a better-defined concept in the context of a pension market than in the context of a market for television sets (or any other consumer good). Several qualifications need to be mentioned here. If a worker is in the informal sector of the labor market, he or she is not covered by social security at all. The informal sector is the part of the labor market that is not registered with the government. This sector pays no tax to the government. Most activities in the sector are paid for in cash only, thus bypassing the entire banking network and making them all but impossible to track. In Mexico, the informal sector accounts for more than 50% of the labor force. Therefore, true market saturation would mean coverage of all workers in both the informal and the formal sectors. But since the informal sector is not covered by the system of social security in Mexico, we leave it out of our discussion.
It might be argued that the informal market is getting formalized over time. There is no empirical evidence to that effect in Mexico over the past decade (see Sinha, 2000, p. 209). However, with panel data from a number of countries in Latin America, Packard (2002) finds evidence that private pension systems are providing incentives for labor markets to be formalized. Strictly speaking, market saturation will exist only when the number of workers entering the pool of contributors (either from the informal labor market or as new entrants) is equal to the number of workers leaving the formal labor market (through retirement, death or moving to the informal sector).
Why does a private market make any difference? In most countries around the world (especially in other OECD countries), the pension system is by and large public (if it is mandatory) and pay-as-you-go (the term commonly employed to indicate that current contributions are used to pay today's pension right holders). A private market allows us to study market share in an unfettered condition. In some countries with private pensions (such as Uruguay), government-affiliated companies have been allowed to dominate the market (for a broad analysis of the Latin American privatization, see Sinha, 2000). This is not so in Mexico.

The Mexican privatized pension market
In December 1995, the Mexican Congress passed the New Social Security Law (Ley del Seguro Social), paving the way for the new system. A second set of laws was passed in April 1996 to put in place the reforms to the pension system (Ley de los Sistemas de Ahorro para el Retiro). This law allowed privatized management of the country's pension system. It approved the operation of investment management companies (Administradoras de Fondos para el Retiro or AFOREs). Each AFORE can now manage individual retirement funds (Sociedades de Inversion Especializadas en Fondos para el Retiro or SIEFOREs). Each AFORE account holder contributes 6.5% of wages to the system (paid for by the employer). In addition, there is a "government contribution" of 5.5% of the minimum wage (for more details, see Sinha, 2002).
Workers can choose any AFORE for their contributions. Once an AFORE is chosen, no change can be made for one year. It is possible to choose a different AFORE every year without any financial penalty. AFOREs are allowed to charge "management fees" either as a percentage of contributions, or as a percentage of the value accumulated, or any combination thereof. Most AFOREs charge a percentage of contributions. AFOREs have to inform each affiliate (an affiliate is a worker who has an account with a particular AFORE) about his or her account at least once a year. The statement includes information about the accumulated value, the contributions during the year and any charges the account has incurred. The AFOREs started to collect compulsory and voluntary contributions in February 1997. Contribution to the new system became compulsory for all private sector workers in September 1997. Note that not all formal sector workers are private sector workers, because approximately 23% of formal sector workers work for the federal, state or municipal governments. Among the workers in the formal sector, very few agricultural workers or self-employed workers are included.
CONSAR, the regulatory body of the AFOREs in Mexico, had issued 17 licenses by the end of 1997. Three of the AFOREs have merged with others. Consequently, as of February 28, 1999 there are 14 AFOREs left in the market. All of these mergers had to be approved by CONSAR as per regulation. After the rearrangement, the fourteen AFOREs that are left in the market are Banamex, Bancomer, Bancrecer, Banorte, Bital, Garante, Genesis, Inbursa, Principal, Profuturo GNP, Santander, Tepeyac, Siglo XXI, and Zurich. The merits and demerits of the system are analyzed in Espinosa and Sinha (2000).

Growth curves
Let $Z_t$ be a variable that indicates the trend of market penetration at time $t = 1, \ldots, N$. Thus $Z_t$ is measured as the number of elements of the population owning a particular item (in this case, membership of an AFORE). We assume that the saturation level of the market is the constant $a < \infty$, so that $0 < Z_t < a$ and $Z_t \to a$ as $t \to \infty$. There are three commonly used models of market penetration (see Meade, 1984): (i) Modified exponential, (ii) Logistic and (iii) Gompertz. Here we shall employ an extension of the logistic curve, obtained as follows. First, note that the equation

$$Z_t = \frac{a}{1 + c\,e^{-\beta t}} \tag{2.1}$$

with $c, \beta > 0$, can also be expressed as

$$\log\left(\frac{Z_t}{a - Z_t}\right) = \alpha + \beta t, \qquad \alpha = -\log c. \tag{2.2}$$

Taking logs is possible here by the assumption that $Z_t$ is positive and bounded above by $a$. Thus, the (natural) logarithm is employed basically as a linearizing transformation of the variable

$$Z^*_t = \frac{Z_t}{a - Z_t}, \tag{2.3}$$

which can be interpreted as the odds of the event of owning a particular item of interest and whose empirical probability of success is $Z_t/a$. Therefore, when $Z^*_t > 1$ we can say that the probability of the event is greater than that of its complement. This change of variable is used to extend the range of the original variable from $0 < Z_t < a$ to $0 < Z^*_t < \infty$. Rather than imposing the use of logarithms, we shall let the data lead us to an appropriate linearizing transformation chosen within a power family that includes the logarithm as a special case. That is, we suggest selecting the data transformation $T_\lambda(Z^*_t)$ within the family indexed by the value of $\lambda$ in (2.4). This power transformation family is similar to the Box-Cox family of transformations, defined for $Z^*_t > 0$ as $(Z^{*\lambda}_t - 1)/\lambda$ if $\lambda \neq 0$ and as $\log Z^*_t$ if $\lambda = 0$. In fact, both (2.4) and the Box-Cox transformation keep the same trend direction as the variable $Z^*_t$. However, (2.4) is discontinuous at $\lambda = 0$, but continuity of $T_\lambda(Z^*_t)$ in $\lambda$ is not required by the method that will be employed here. Guerrero (2000) proposed a method to select the index of the transformation (2.4) that most adequately validates the assumption that

$$T_\lambda(Z^*_t) = \alpha + \beta t + \varepsilon_t \tag{2.5}$$

for $t = 1, 2, \ldots, N$, where $\{\varepsilon_t\}$ is a sequence of zero-mean random error terms with constant variance. The method was designed as a data description device applicable to (short) time series. It allows $\{\varepsilon_t\}$ to be integrated of order 0 (stationary) or integrated of order 1 (nonstationary), so that 0 or 1 differences are needed to make the sequence $\{\varepsilon_t\}$ stationary, and it employs the corresponding optimal linear estimator of $\beta$ in each case. A useful reference for model (2.5) with different orders of integration (including fractional integration) for the random sequence $\{\varepsilon_t\}$ is Deo and Hurvich (1998). When $\{\varepsilon_t\}$ is integrated of order 1 we need the additional assumption that the sequence started at some finite point in the past with a fixed initial condition given by the value 0. By substituting (2.3) in (2.4) and using (2.5) without the random error, we get the extended logistic curve, which in implicit form is

$$T_\lambda\left(\frac{Z_t}{a - Z_t}\right) = \alpha + \beta t. \tag{2.6}$$

This is a more flexible family of growth curves than (2.1), which is obtained as a special case of (2.6) when $\lambda = 0$. The index $\lambda$ could be estimated by Nonlinear Least Squares if we were willing to assume a particular behavior for $\{\varepsilon_t\}$ (e.g. that the errors are independent and identically distributed or that they follow an autoregressive process of order 1), but we will not pursue this line of investigation here. Instead we shall apply a method that leaves the structure of $\{\varepsilon_t\}$ to be determined by the data themselves, insofar as it is integrated of order 0 or 1.
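As an illustration of how the odds transformation and the extended logistic curve fit together, the following is a minimal Python sketch. It assumes, for illustration only, that the power family takes the form $T_\lambda(z) = z^\lambda$ for $\lambda > 0$ and $\log z$ for $\lambda = 0$; negative values of $\lambda$ are not covered. The function names are hypothetical and are not taken from Guerrero (2000).

```python
import math


def odds(z, a):
    """Odds Z* = Z / (a - Z) for a level z with saturation level a."""
    if not 0 < z < a:
        raise ValueError("z must lie strictly between 0 and a")
    return z / (a - z)


def power_transform(z_star, lam):
    """Assumed power family: log for lam == 0, plain power for lam > 0."""
    if z_star <= 0:
        raise ValueError("odds must be positive")
    if lam < 0:
        raise ValueError("negative lam is not covered by this sketch")
    return math.log(z_star) if lam == 0 else z_star ** lam


def inverse_power_transform(y, lam):
    """Map a value on the transformed scale back to the odds scale."""
    if lam < 0:
        raise ValueError("negative lam is not covered by this sketch")
    if lam == 0:
        return math.exp(y)
    if y <= 0:
        raise ValueError("alpha + beta * t must be positive when lam > 0")
    return y ** (1.0 / lam)


def extended_logistic(t, a, alpha, beta, lam):
    """Level Z_t implied by T_lam(Z_t / (a - Z_t)) = alpha + beta * t."""
    z_star = inverse_power_transform(alpha + beta * t, lam)
    return a * z_star / (1.0 + z_star)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of the curve.
    for t in (1, 12, 24, 36):
        print(t, round(extended_logistic(t, a=100.0, alpha=0.5, beta=0.3, lam=1), 2))
```

With `lam=0` the function reduces to the ordinary logistic curve with $c = e^{-\alpha}$, and with `lam=1` the odds themselves grow linearly in time.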

Selection of the linearizing transformation
Guerrero's (2000) method consists of calculating, by means of equation (2.7), a sequence of values $\lambda_{t+2}$, for $t = 1, 2, \ldots, N-2$, that trace the concavity or convexity in $\{Z^*_1, Z^*_2, \ldots, Z^*_N\}$, where $\lambda_{t+2} > 0$ implies (local) concavity and $\lambda_{t+2} < 0$ (local) convexity in the data around time $t+2$. To summarize the global behavior of the data, the following measures of central tendency can be used, and they become natural estimates of $\lambda$:

$$\bar{\lambda} = \frac{1}{N-2}\sum_{t=1}^{N-2}\lambda_{t+2} \tag{2.8}$$

and

$$\lambda_{Med} = \mathrm{median}\{\lambda_3, \lambda_4, \ldots, \lambda_N\}. \tag{2.9}$$

Selection of (2.8) or (2.9) can be made on the basis of the absence or presence of extreme values, respectively. For each estimated value, two estimates of the slope parameter are calculated, the Ordinary Least Squares (OLS) estimator

$$\hat{\beta}_{OLS} = \frac{\sum_{t=1}^{N}\left[t - (N+1)/2\right]\left[T_\lambda(Z^*_t) - \bar{T}_\lambda(Z^*)\right]}{\sum_{t=1}^{N}\left[t - (N+1)/2\right]^2} \tag{2.10}$$

and the average difference estimator

$$\hat{\beta}_{DIF} = \frac{1}{N-1}\sum_{t=2}^{N}\left[T_\lambda(Z^*_t) - T_\lambda(Z^*_{t-1})\right] = \frac{T_\lambda(Z^*_N) - T_\lambda(Z^*_1)}{N-1}, \tag{2.11}$$

with $\bar{T}_\lambda(Z^*)$ the average of $\{T_\lambda(Z^*_1), T_\lambda(Z^*_2), \ldots, T_\lambda(Z^*_N)\}$ and $\hat{\alpha} = \bar{T}_\lambda(Z^*) - \hat{\beta}_{OLS}(N+1)/2$ the intercept estimate that accompanies $\hat{\beta}_{OLS}$. These linear estimates are the most efficient in a statistical sense whenever the order of integration is 0 or 1, respectively (see Deo and Hurvich, 1998). Discrimination between the orders 0 and 1 is performed by comparing the coefficients of determination, $R^2$, they produce. To this end, we should recall that only the following expression works well when dealing with transformed data (see Kvålseth, 1985):

$$R^2 = 1 - \frac{\sum_t \left(Z^*_t - \hat{Z}^*_t\right)^2}{\sum_t \left(Z^*_t - \bar{Z}^*\right)^2}, \tag{2.12}$$

where $\hat{T}_\lambda(Z^*_t)$ denotes an estimated value in the transformed scale and $\hat{Z}^*_t = T_\lambda^{-1}[\hat{T}_\lambda(Z^*_t)]$ is the corresponding estimated value brought back to the original scale of $Z^*_t$. Thus, with respect to the choices (2.10) and (2.11) we have that

$$\hat{T}_\lambda(Z^*_t) = \hat{\alpha} + \hat{\beta}_{OLS}\,t \tag{2.13}$$

is the adjusted value when OLS is used, whereas in the other case the adjusted value is given by (2.14). If the true value of the saturation level were known (as happens when $Z_t$ represents a proportion, in which case $a = 1$), Guerrero's method could be applied as originally proposed. Nevertheless, in many applications the saturation level should be considered as an unknown parameter that has to be estimated from the available data. In that case, we suggest searching for the value of the saturation level that maximizes $R^2$ over a grid of "a" values.
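The estimation steps described above can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the power family is taken to be $z^\lambda$ for $\lambda > 0$ and $\log z$ for $\lambda = 0$, the fitted value paired with the average difference estimator is assumed to be the previous transformed observation plus the estimated slope, and all function names are hypothetical. The $R^2$ is computed on the scale of the odds, as the text requires.

```python
import math


def transform(z_star, lam):
    # Assumed convention: log for lam == 0, plain power for lam > 0.
    return math.log(z_star) if lam == 0 else z_star ** lam


def back_transform(y, lam):
    # Inverse of transform(); lam == 1 is the identity.
    if lam == 0:
        return math.exp(y)
    if lam == 1:
        return y
    if y <= 0:
        raise ValueError("fitted value cannot be mapped back to the odds scale")
    return y ** (1.0 / lam)


def beta_ols(y):
    """OLS intercept and slope of y on t = 1..N."""
    n = len(y)
    t_bar = (n + 1) / 2
    y_bar = sum(y) / n
    num = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y, start=1))
    den = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    beta = num / den
    return y_bar - beta * t_bar, beta


def beta_dif(y):
    """Average of the first differences of y."""
    return (y[-1] - y[0]) / (len(y) - 1)


def r_squared(observed, fitted):
    """R^2 = 1 - SSE/SST, computed on the scale of the observations."""
    mean = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sst = sum((o - mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def best_r_squared(z, a, lam):
    """Larger of the two R^2 values (OLS fit, difference fit) for saturation a."""
    z_star = [v / (a - v) for v in z]
    y = [transform(v, lam) for v in z_star]
    scores = []
    try:
        alpha, b = beta_ols(y)
        fitted = [back_transform(alpha + b * t, lam) for t in range(1, len(y) + 1)]
        scores.append(r_squared(z_star, fitted))
    except ValueError:
        pass  # this fit cannot be expressed on the odds scale; skip it
    try:
        d = beta_dif(y)
        # Assumed one-step fitted values: previous observation plus slope.
        fitted = [back_transform(prev + d, lam) for prev in y[:-1]]
        scores.append(r_squared(z_star[1:], fitted))
    except ValueError:
        pass
    return max(scores, default=float("-inf"))


def search_saturation(z, grid, lam):
    """Grid value of a (above max(z)) that maximizes R^2."""
    feasible = [a for a in grid if a > max(z)]
    return max(feasible, key=lambda a: best_r_squared(z, a, lam))
```

A call such as `search_saturation(accounts, range(15000, 25000, 100), lam=1)`, with `accounts` a hypothetical list of monthly totals in thousands, would return the grid value with the best fit. Note that the two $R^2$ values in this sketch are computed over $N$ and $N-1$ points, respectively.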

Data Analysis
The data set consists of the number of accounts recorded in each of the 14 existing AFOREs. At the beginning of 1999, seven of the 14 had managed to capture about twelve million accounts (out of a market size of 14 million accounts). The rest together have only two million accounts. For the purpose of this analysis, we shall group the 7 smallest AFOREs into one called Others. Thus we shall analyze here the data on the following AFOREs: (1) Banamex, (2) Bancomer, (3) Banorte, (4) Bital, (5) Garante, (6) Profuturo, (7) Santander, (8) Others and (9) the Total number of accounts. The corresponding data are shown in the Appendix (expressed in thousands of accounts from February 1997 to February 2000). We decided to work first with the variable $Z$ = Total Accounts by applying the methodology described in the previous section. By so doing we found that the optimum saturation level (providing the largest $R^2$ over a grid of "a" values) was $\tilde{a} = 21900$. Correspondingly, the concavity or convexity in Total Accounts was numerically evaluated as indicated in Table 1 (see also Figure 1 to appreciate it visually), where the values shown were calculated using equation (2.7). The global values are $\bar{\lambda} = 0.76$ and $\lambda_{Med} = 1.02$.
Table 1 allows us to say that the behavior of the odds is convex from April to September, 1997 (except perhaps in August), indicating a greater than linear growth during that period; it then becomes concave from October, 1997 to April, 1998 (except February). From May 1998 onwards, the pattern in the odds is unclear since it changes several times from convex to concave and vice versa, so a linear growth may be deemed reasonable. Since there are many extreme values in Table 1, the best summary is provided by the median of the values, that is $\lambda_{Med} = 1.02$, whose proximity to unity indicates that the global behavior of the odds may be well described as linear.
Thus, a reasonable linearizing power transformation for Total Accounts has index $\lambda = 1$. With this value already fixed we searched again for the best estimate of the saturation level, obtaining as a result $\hat{a} = 20400$. This number is very close to 20139.34, which is the estimate provided by IMSS for the "potential formal labor market size" (quoted in CONSAR, 1999). Using these values of $\lambda$ and $\hat{a}$ we obtained the fit shown in Figure 1 for the original variable $Z_t$ and in Figure 2 for the odds $Z^*_t$, calculated using equation (2.4). It should be clear that the estimated saturation level was obtained through an exploratory data analysis that was aimed only at obtaining a satisfactory fit to the observed data. This rationale is in line with the statistical requirement of a saturation level mentioned by Meade (1984), although put forward by other authors: "The primary objective should be to fit recent data satisfactorily, rather than forcing the curve to pass through some pre-ordained point in 50 years time". Moreover, Meade proposed that: "If the estimation of the saturation level is consistent over time, this is satisfactory. However, if the estimated saturation level is infeasible or fluctuates widely, it would seem reasonable to estimate the saturation level as a function of appropriate explanatory variables, given sufficient degrees of freedom". Due to the short length of the time series under study, we considered only two additional estimation exercises of the saturation level, first with data up to January, 2000 and then with data up to December, 1999. In both exercises we obtained the estimated value $\hat{a} = 20200$, which is very close to the aforementioned estimate obtained with all the available data (up to February, 2000). Therefore we did not deem it necessary to employ a model with explanatory variables for the saturation level. Besides, we are convinced that the estimated saturation level already takes into account population (formal labor market) growth implicitly, since the data correspond to consecutive time periods with different population sizes. An implicit assumption when estimating the saturation level is that the population growth will remain basically unchanged in the future.
In Table 2 we present some estimation results of the extended logistic growth curve introduced in the previous section. Let us recall that $\hat{\beta}_{OLS}$ and $\hat{\beta}_{DIF}$ are given by equations (2.10) and (2.11) and that the estimated intercept $\hat{\alpha}$ is only required in association with $\hat{\beta}_{OLS}$.
Just for the sake of completeness, we also included $\lambda = 0$ in Table 2, because that value leads to the use of an ordinary logistic growth curve, which is evidently not recommended in the present situation, due to its $R^2$ value and corresponding saturation level. Thus, in this case there is some gain in goodness of fit by using the extended logistic curve.
In order to make the results of the growth curve fitting for the different AFOREs comparable with each other, it was decided to use the same power transformation found earlier ($\lambda = 1$) for all the AFOREs. The odds were fitted by growth curves with their respective optimal saturation levels. In all cases, the largest coefficient of determination was found with $\hat{\beta}_{DIF}$. Table 3 shows the saturation levels, estimated slopes, $R^2$ values, dates at which the median level (50% of the saturation level) was attained and percentage of the saturation level reached in February, 2000, which is the end-of-sample month. The estimated coefficients $\hat{\beta}_{DIF}$ in Table 3 are directly linked to the end-of-sample level reached, since the larger the estimated slope, the larger the percentage level reached in February, 2000. Thus, in accordance with these values we have the following categories for the AFOREs.
1. Accelerated growth (characterized by a slope larger than 0.2 and end-of-
Next, if we define a market as reaching maturity when its level equals or exceeds 50% of its saturation level, then we may say that this happened in November, 1997 for Total Accounts. For the individual AFOREs, maturity started as early as August, 1997 (Bancomer and Profuturo) and no later than October, 1998 (Bital). On the other hand, it is interesting to notice that the sum of the individual saturation levels is 20600. This figure is close to the estimated saturation level of Total Accounts. Thus we are led to believe that if a joint estimation of the whole set of growth curves had been carried out, taking into account the restriction that the sum of the saturation levels be equal to 20400, we would have obtained similar results to those reported in Table 3. However, carrying out that kind of joint estimation is beyond the scope of this paper.
Another type of descriptive analysis consists of studying the proportion of accounts held by each AFORE with respect to Total Accounts. In Figure 3 we present such proportions. There we can observe that each proportion has its own trend, but all the trends are very close to linear from the beginning of 1998. Thus, it is natural to fit a linear trend to the proportion of each AFORE to represent its behavior once it became a mature product. That is, the fit for all the proportions started at the average time of maturity (the average of August 1997, October 1998 and November 1997), which is February 1998.
The fitted trend lines for the proportion of each AFORE are shown in Figure 4. There we can see that four AFOREs have proportions trending downward (Bancomer, Santander, Profuturo and Garante), whereas Banamex, Bital, Banorte and Others trend upwards. Of course this empirical regularity is valid only during the sample period and it should not be expected to hold true in the long run.
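The second descriptive analysis amounts to fitting a straight line to each AFORE's share of Total Accounts and reading off the sign of the slope. A minimal Python sketch follows; the function names and the example figures are hypothetical and are not taken from the Appendix data.

```python
def shares(accounts, totals):
    """Proportion of total accounts held by one AFORE in each month."""
    return [x / total for x, total in zip(accounts, totals)]


def linear_trend(values):
    """OLS intercept and slope of values on the time index 0, 1, ..., n-1."""
    n = len(values)
    t_bar = (n - 1) / 2
    v_bar = sum(values) / n
    num = sum((t - t_bar) * (v - v_bar) for t, v in enumerate(values))
    den = sum((t - t_bar) ** 2 for t in range(n))
    slope = num / den
    return v_bar - slope * t_bar, slope


def trend_direction(values):
    """Label a share series by the sign of its fitted slope."""
    _, slope = linear_trend(values)
    if slope > 0:
        return "upward"
    if slope < 0:
        return "downward"
    return "flat"


if __name__ == "__main__":
    # Hypothetical monthly figures in thousands of accounts.
    afore = [1500.0, 1560.0, 1610.0, 1650.0]
    total = [12000.0, 12600.0, 13100.0, 13500.0]
    print(trend_direction(shares(afore, total)))
```

In the setting of the paper, the series passed to `linear_trend` would start at the chosen maturity date (February 1998) rather than at the beginning of the sample.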
One particular application of the previous analysis is that of predicting the future number of accounts in each AFORE. To do that we can use the estimated slope for the odds of the AFORE, $Z^*_t = Z_t/(\hat{a} - Z_t)$, and add it to the most recent odds in order to get a forecast of $Z^*_{t+1}$. Then we must bring this forecast back to the original scale of the variable by means of the formula $Z_{t+1} = \hat{a} Z^*_{t+1}/(1 + Z^*_{t+1})$. Just for illustrative purposes, since in our case there was no real need to get forecasts, we calculated three monthly forecasts for Banamex. Since the most recent observation of the AFORE accounts in Banamex was that of February, 2000 ($Z_{Feb} = 1927$), we first calculated the corresponding odds, then obtained forecasts of the odds for the following three months, and finally brought these forecasts back to the original scale.
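The forecasting recipe can be sketched in Python as follows. Only the February 2000 level for Banamex (1927 thousand accounts) comes from the text; the saturation level and slope used in the example call are hypothetical placeholders, since the Table 3 estimates are not reproduced here, and the function name is likewise an assumption.

```python
def forecast_accounts(z_last, a_hat, slope, steps):
    """Forecast account levels by adding the slope to the odds step by step.

    z_last : most recent observed level (same units as a_hat)
    a_hat  : estimated saturation level
    slope  : estimated slope of the odds (lambda = 1 case)
    steps  : number of periods to forecast
    """
    if not 0 < z_last < a_hat:
        raise ValueError("z_last must lie strictly between 0 and a_hat")
    odds = z_last / (a_hat - z_last)  # Z* = Z / (a - Z)
    forecasts = []
    for _ in range(steps):
        odds += slope  # forecast of the next odds
        if odds <= 0:
            raise ValueError("forecast odds became non-positive")
        forecasts.append(a_hat * odds / (1.0 + odds))  # back to original scale
    return forecasts


if __name__ == "__main__":
    # a_hat=2500.0 and slope=0.1 are hypothetical values for illustration.
    for month, level in enumerate(forecast_accounts(1927.0, 2500.0, 0.1, 3), start=1):
        print(f"month +{month}: {level:.1f}")
```

Because the odds grow linearly while the level is bounded by the saturation level, successive forecasts increase by smaller and smaller amounts as they approach `a_hat`.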

Conclusions
We present two statistical descriptions of the data on the AFOREs. The first one was based on an extended logistic model for market penetration. We were able to classify the AFOREs objectively, according to their observed behavior, without any preconception of the factors that we should take into consideration. For example, some early observers claimed that the fund Siglo XXI would win a large share of the market simply because it has the backing of the IMSS. It did not turn out that way. The data themselves led us to a classification scheme that emphasized the speed of growth of market penetration as well as product maturity.
The second descriptive analysis focused on the relative behavior of each AFORE with respect to the Total number of accounts. A simple fitting of regression lines, from the point of maturity onwards, allowed us to see the trend of each AFORE's market penetration. We believe that these trend lines could be used by the business managers of the AFOREs as a reference, in order to make plans about future investments, management fees and marketing in other markets. It is critical to emphasize that once the "first phase" of growth of market share is over, there is not much that can be done to change the market share, as a saturated market can offer additional workers to an AFORE only at the cost of cannibalizing other AFOREs.
The analysis has important implications for the business strategy of pension funds. In many countries around the world, social security systems are being privatized. A company planning to enter the market needs to know early in the game whether it should stay (when it has initially captured a small market share) or cut its losses and leave the market. Our analysis shows that the first twelve months are the most important time frame (the first phase). If a company has not made significant inroads into the market by then, it is likely never to get a significant market share.
There are several caveats to our analyses here. (i) Market shares can change dramatically as consolidation takes place. In fact, when we analyzed the data, the Spanish bank BBV had its own AFORE called Probursa. However, in June 2000, BBV took over the second largest bank in Mexico, Bancomer. Bancomer also had the largest AFORE. Thus, BBV had to sell off its own holding of the AFORE (to Profuturo). In early 2002, another AFORE, Genesis, was sold to Principal. Thus, there was a realignment of market sizes. (ii) The point forecasts derived from the second descriptive analysis are only a first approximation to optimal forecasts. This is so because there is no measure of uncertainty associated with the forecasts and no optimality criterion was used to obtain them. However, if the need for optimal forecasts arises, then the point forecasts can be used as a starting point to improve upon.

Table 3: Estimation results for the growth curves of the AFOREs