The Performance of Hybrid Artificial Neural Network Models for Option Pricing during Financial Crises

Abstract: This paper provides a novel study of the pricing ability of hybrid ANNs based upon Hang Seng Index Options over the period from Nov, 2005 to Oct, 2011, during which the 2007-2008 financial crisis developed. We study the performance of two hybrid networks, integrated with the Black-Scholes model and the Corrado and Su model respectively. We find that hybrid neural networks trained on financial data from a booming period of a market do not predict option prices well for a period that undergoes a financial crisis (a tumbling period in the market); researchers and practitioners should therefore be cautious when predicting option prices with hybrid ANNs. Our findings offer a likely answer to the recent puzzle about the counterintuitive performance of NN models for option pricing during financial crises. They suggest that the incompetence of NN models for option pricing is likely due to the fact that NN models may have been trained by using data from improper periods of market cycles (regimes), and is not necessarily due to the learning ability and the flexibility of NN models.


Introduction
Since it was published in 1973, the Black-Scholes model has been regarded as the foundation of a large number of conventional pricing models. Widely accepted though it is, researchers have also found that the Black-Scholes model has significant flaws because its assumptions do not hold in real financial markets. According to empirical evidence, the distribution of option returns has a higher peak and heavier tails than the normal distribution, and the implied volatility generated from the Black-Scholes model forms a convex curve, widely known as the "volatility smile", rather than the horizontal line the model assumes (Kou, 2002). To a large extent, this leads to a systematic bias when pricing in-the-money and out-of-the-money options, which has been documented in a multitude of studies (Black, 1975; MacBeth and Merville, 1979), and the biases might even change over time (Rubinstein, 1985). Therefore, later researchers focused much of their attention on a variety of modified models that relax some of the assumptions of the Black-Scholes model in order to better fit real financial transactions (Kou, 2002; Saurabha and Tiwari, 2007; Bakshi, Cao, and Chen, 1997). However, most of [...] training and testing of the BP neural network model, the hybrid ANN with the BS model, and the hybrid ANN with the CS model will be conducted using the collected data of the Hang Seng Index Option. In the Results section, in-sample forecasting results from each model will be provided. Finally, we will draw a brief conclusion of this research.

Methodology

Artificial Neural Network
Among the various kinds of neural networks, back-propagation (BP) networks have been widely applied, especially in modeling and forecasting [16]. Therefore, we adopt BP networks for this research.

Network Structure
There are four input nodes and a single output node, which denotes the difference between the theoretical price and the market price of the option, divided by the corresponding strike price.
Several criteria have been used to determine the number of nodes in the hidden layer, such as the mean squared error (MSE), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). The BIC expression was developed independently by Kashyap [17] and Schwarz [18]; the AIC formula was proposed by Akaike [19] and later adjusted by Box and Jenkins [20].
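As an illustration of how such criteria can guide the choice of hidden-layer size, the sketch below uses one common least-squares form of the AIC and BIC, which may differ from the adjusted forms in [17]-[20]. The function names and the MSE values in the usage note are hypothetical; only the four-input, one-output layout comes from the text.

```python
import math

def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed-forward network.

    layer_sizes lists the node counts per layer, e.g. [4, 5, 1].
    """
    return sum(n_out * (n_in + 1)
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def aic(mse, n_params, n_samples):
    # Assumed form: M * ln(MSE) + 2 * P
    return n_samples * math.log(mse) + 2 * n_params

def bic(mse, n_params, n_samples):
    # Assumed form: M * ln(MSE) + P * ln(M)
    return n_samples * math.log(mse) + n_params * math.log(n_samples)

def select_hidden_nodes(mse_by_hidden, n_inputs, n_samples, criterion=bic):
    """Return the hidden-layer size with the lowest criterion value, plus all scores."""
    scores = {
        m: criterion(mse, count_parameters([n_inputs, m, 1]), n_samples)
        for m, mse in mse_by_hidden.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

For example, `select_hidden_nodes({2: 0.040, 5: 0.020, 10: 0.0199}, n_inputs=4, n_samples=1000)` compares three candidate sizes using made-up training MSEs. Both criteria penalize the extra parameters of a larger hidden layer, so a marginal drop in MSE does not automatically justify more nodes.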

Transfer Function
In BP multilayer networks, three kinds of transfer function are usually used: the log-sigmoid transfer function, the tan-sigmoid transfer function, and the linear transfer function. It is essential for these transfer functions to be differentiable in BP networks.
The log-sigmoid function: $f(x) = \frac{1}{1 + e^{-x}}$
The tan-sigmoid function: $f(x) = \frac{2}{1 + e^{-2x}} - 1$
The linear function: $f(x) = x$
In this research, the linear transfer function is used in the output layer because option prices should not be restricted to a specific range such as (0,1) for the log-sigmoid function or (-1,1) for the tan-sigmoid function. The tan-sigmoid function is used in the hidden layer.
The output of layer 2 (the option price) can be expressed as
$$O = h\left(\sum_{j=1}^{m} w_j\, f_j\left(\sum_{i=1}^{n} w_{ij}\, x_i + b_j^{1}\right) + b^{2}\right)$$
where $n$ is the total number of nodes in layer 0 (input layer); $m$ is the total number of nodes in layer 1 (hidden layer); $x_i$ is the $i$-th input and $w_{ij}$ is the weight connecting input node $i$ to the $j$-th node in layer 1; $b_j^{1}$ is the bias of the $j$-th node in layer 1; $f_j$ is the tan-sigmoid transfer function of the $j$-th node in layer 1; $w_j$ is the weight assigned to the output of the $j$-th node in layer 1 by the single node in layer 2 (output layer); $b^{2}$ is the bias of the single node in layer 2; and $h$ is the linear transfer function of the single node in layer 2.
$O$ is the final output from layer 2, which is the forecasted option price.
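The layer-by-layer computation described above can be sketched in a few lines. This is a minimal illustration with a tan-sigmoid hidden layer and a linear output node, not the MATLAB implementation used in the study; the function and argument names are assumptions.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network.

    x        : list of n inputs
    w_hidden : m rows of n weights (input layer -> hidden layer)
    b_hidden : m hidden-layer biases
    w_out    : m weights (hidden layer -> single output node)
    b_out    : bias of the output node
    """
    # Hidden layer: tan-sigmoid transfer, which is mathematically tanh
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(w_row, x)) + b)
        for w_row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: linear transfer, so the weighted sum is returned unchanged
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

Because the output node is linear, the returned value is unbounded, which matches the reason given for not using a sigmoid in the output layer.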

Hybrid Neural Net with Black-Scholes Model (BS-ANN)
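The Black-Scholes formula for the price of a call option is
$$C_{BS} = S_0 N(d_1) - K e^{-rT} N(d_2)$$
where
$$d_1 = \frac{\ln(S_0/K) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T},$$
$S_0$ is the current price of the underlying, $K$ is the strike price, $T$ is the time to maturity, $r$ is the risk-free rate, $\sigma$ is the volatility, and $N(\cdot)$ is the standard normal cumulative distribution function. The final forecasted result is the sum of the theoretical value of the BS model and the product of the difference predicted by the ANN and its corresponding strike price $K$, which can be expressed as
$$\hat{C} = K \cdot O + C_{BS}$$
where $O$ is the network output, trained to approximate the difference between the market price and $C_{BS}$ divided by $K$.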
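As a rough illustration of how the residual target and the final forecast fit together, the sketch below prices a European call with the standard Black-Scholes formula and recombines a predicted residual with it. The function names and the sign convention (market price minus model price) are assumptions made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, t, r, sigma):
    """Black-Scholes price of a European call (no dividends assumed)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def hybrid_target(market_price, model_price, k):
    """Training target for the network: pricing residual scaled by the strike."""
    return (market_price - model_price) / k

def hybrid_forecast(predicted_residual, model_price, k):
    """Final forecast: rescale the predicted residual and add the model price."""
    return k * predicted_residual + model_price
```

The same two helper functions would apply to the CS-ANN model by passing the Corrado and Su price as `model_price`. Dividing by the strike keeps the targets on a comparable scale across options with very different strikes.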

Hybrid Neural Net with Corrado and Su Model (CS-ANN)
Corrado and Su [10] adjusted the Black-Scholes model with a Gram-Charlier series expansion in order to take into consideration the skewness and kurtosis deviations from normality in stock returns. Although it is similar to the semi-parametric option pricing model proposed by Jarrow and Rudd [21], Corrado and Su's model is easier to report and interpret [10]. Evidence shows that this semi-parametric model performs significantly better than the Black-Scholes model [5]. Hence, Corrado and Su's model is used in this paper.
Corrado and Su [10] modified the standard normal density function by a Gram-Charlier series expansion. The modified density function is
$$g(z) = n(z)\left[1 + \frac{\mu_3}{3!}\left(z^3 - 3z\right) + \frac{\mu_4 - 3}{4!}\left(z^4 - 6z^2 + 3\right)\right]$$
where $n(z)$ is the standard normal probability density function, $\mu_3$ is the skewness and $\mu_4$ is the kurtosis. The option pricing formula after modification is
$$C_{CS} = C_{BS} + \mu_3 Q_3 + (\mu_4 - 3)\, Q_4 \qquad (3)$$
where $Q_3$ and $Q_4$ represent the marginal effects of non-normal skewness and kurtosis on the option price, as defined in [10]. Similarly, the differences between the market values and the values forecasted by the Corrado and Su model, divided by their corresponding strike prices, are used as the network targets in the hybrid model of the artificial neural network with the Corrado and Su model. The final forecasted result from this model is
$$\hat{C} = K \cdot O + C_{CS}$$
where $O$ is the network output.

Training Process for option pricing
Among all kinds of training algorithms, the Levenberg-Marquardt (LM) algorithm is the fastest and is also the default training function in MATLAB [21]. Because of the large amount of memory it requires, the LM algorithm is usually used in small or medium-sized networks. Considering that the size of the network here is not large, the LM training algorithm is used in this research.
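To train the network, the weights and biases are first initialized, with the initial values generated automatically by the network. With these weights and biases, the network calculates the option price and the network error $E$, a sum-of-squares measure of the differences between the network outputs and their targets over the $M$ input-output pairs, where $M$ is the total number of input and output pairs as defined previously.
Then, the weights and biases are adjusted after each iteration following
$$w_{new} = w_{old} - \eta\, \frac{\partial E}{\partial w}$$
where $\eta$ is the learning rate. As explained by Hagan and Menhaj [23], the derivatives are expressed by the Jacobian matrix in the LM algorithm:
$$J(X) = \left[\frac{\partial e_k(X)}{\partial x_i}\right]$$
where $e_k$ is the error of the $k$-th input-output pair and $x_i$ represents the $i$-th element of the parameter column vector $X$ containing the weights and biases. Hence, up to a constant factor,
$$\nabla E(X) = J^{T}(X)\, e(X)$$
and
$$\nabla^2 E(X) \approx J^{T}(X)\, J(X).$$
So, the parameter vector is adjusted through
$$X_{new} = X_{old} - \left[J^{T} J + \mu I\right]^{-1} J^{T} e$$
in each iteration, where $\mu$ is a learning parameter.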
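To make the Levenberg-Marquardt update concrete, the sketch below applies it to a toy two-parameter curve fit rather than to the option pricing network, so that the damped normal equations can be solved by hand. The model `a * exp(b * x)`, the function names, and the factor-of-ten damping schedule are illustrative assumptions, not details taken from the paper or from MATLAB.

```python
import math

def residuals(params, xs, ys):
    """Errors e_k = model(x_k) - y_k for the toy model a * exp(b * x)."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(params, xs):
    """Rows of the Jacobian: [d e_k / d a, d e_k / d b]."""
    a, b = params
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]

def sse(params, xs, ys):
    """Sum of squared errors."""
    return sum(e * e for e in residuals(params, xs, ys))

def lm_step(params, xs, ys, mu):
    """One update X_new = X_old - (J^T J + mu I)^(-1) J^T e, for two parameters."""
    e = residuals(params, xs, ys)
    jac = jacobian(params, xs)
    # Entries of the damped 2x2 matrix J^T J + mu I
    a11 = sum(r[0] * r[0] for r in jac) + mu
    a12 = sum(r[0] * r[1] for r in jac)
    a22 = sum(r[1] * r[1] for r in jac) + mu
    # Entries of the vector J^T e
    g1 = sum(r[0] * ek for r, ek in zip(jac, e))
    g2 = sum(r[1] * ek for r, ek in zip(jac, e))
    # Closed-form solution of the 2x2 linear system
    det = a11 * a22 - a12 * a12
    d1 = (a22 * g1 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det
    return [params[0] - d1, params[1] - d2]

def fit(params, xs, ys, mu=0.01, iters=100):
    """Repeat LM steps, adapting mu after each trial step."""
    for _ in range(iters):
        trial = lm_step(params, xs, ys, mu)
        if sse(trial, xs, ys) < sse(params, xs, ys):
            params, mu = trial, mu / 10.0  # accept: behave more like Gauss-Newton
        else:
            mu *= 10.0                     # reject: behave more like gradient descent
    return params
```

A small `mu` makes the step close to a Gauss-Newton step, while a large `mu` shrinks it toward a short gradient-descent step. For a network, the same update applies with `X` holding all weights and biases, which is why the memory needed for the Jacobian grows with network size.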

Hong Kong (HK) Hang Seng Index Options
The data adopted in this research range from 1st Nov, 2005 to 31st Oct, 2011 and cover the subprime mortgage crisis in 2008, during which the prices of a large number of stocks slumped [27]. The financial option considered in this research is the Hang Seng Index Option traded on the Hong Kong Stock Exchange, which is based upon the Hang Seng stock market index.

Parameters: T and S/K
The performances of the aforementioned three models are tested separately in several groups, because the pricing bias of the BS model differs across maturity and moneyness [6]. Based upon their time to maturity, the data are first separated into maturity groups (long-term through short-term options), and each maturity group is then divided by moneyness, $S_0/K$. Consequently, a total of 9 groups of data, consisting of 116,467 observations, are summarized in Table 1.

Parameters: risk free rate, historical volatility, skewness and kurtosis
The one-year deposit rate over the time period of the collected data ranges from 2.25% to 4.14%. To simplify the model, a risk-free rate of 3% is used in this paper. The annualized historical volatility of the returns over the past 60 days (including the present day) is taken as the volatility of the present day. The skewness and kurtosis of the returns are also calculated using historical data of the past 60 days. It is assumed that there are 252 trading days per year.

Pricing options under financial crisis
One of the main purposes of this research is to test the forecast performance of neural network models for option prices during the 2008 financial crisis. Therefore, the data from each of the nine groups above are used for training the networks, but are also divided into different periods of time that are consistent with the developing stages of the subprime mortgage crisis in 2008, as shown in Figure 1. We roughly split the time under consideration into four stages reflecting different periods of development of the 2008 financial crisis in Hong Kong; these can be seen in Table 2.
We have designed a total of 6 simulation sets (as shown in Table 3) for the nine groups of index options mentioned earlier. In each of Sets 1, 2 and 3, data from Stage 1 are used for training. For example, when using the BS-ANN model, the input data for the model are obtained from Stage 1, and the targeted outputs are also from the same stage. After training, the network is then used as an option pricing tool for Stages 2, 3, and 4 separately. For Simulation Sets 4, 5, and 6, we combine two or three stages to train the neural network models, and the trained models are then used to predict the option prices for the other stage(s). For instance, in Simulation Set 6, data from Stages 1, 2, and 3 are used together to train the network, which is later used to simulate the option prices in Stage 4.
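The historical inputs described in the parameter settings (a 60-day window and 252 trading days per year) could be computed along the following lines. This is a sketch under stated assumptions: the text does not say whether log or simple returns are used, nor which estimators are applied, so log returns, a sample standard deviation, and moment-based skewness and kurtosis are choices made here for illustration.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # stated in the text
WINDOW = 60                  # stated in the text

def log_returns(prices):
    """Daily log returns (assumption: the text does not specify the return type)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def rolling_inputs(returns, window=WINDOW):
    """Annualized volatility, skewness and kurtosis of the last `window` returns."""
    r = returns[-window:]
    n = len(r)
    mean = sum(r) / n
    # Central moments of the window
    m2 = sum((x - mean) ** 2 for x in r) / n
    m3 = sum((x - mean) ** 3 for x in r) / n
    m4 = sum((x - mean) ** 4 for x in r) / n
    daily_vol = math.sqrt(m2 * n / (n - 1))  # sample standard deviation
    annual_vol = daily_vol * math.sqrt(TRADING_DAYS_PER_YEAR)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2                  # equals 3 for a normal distribution
    return annual_vol, skewness, kurtosis
```

The kurtosis is returned in raw form, not as excess kurtosis, so that a normal distribution gives 3.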

Results and discussions
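To provide a thorough comparison among the three models, four different error measures are used: the mean error (ME), the root mean squared error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). In these measures, $N$ is the number of data pairs used in the simulation; $F_i$ is the forecasted option price; and $A_i$ is the actual price from the market.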
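For reference, the four measures named in the conclusions (ME, RMSE, MAE, and MAPE) can be computed as in the sketch below, using their standard definitions. The sign convention for ME (forecast minus actual) and the reporting of MAPE as a fraction, not a percentage, are assumptions, since the original formulas are not shown here.

```python
import math

def error_measures(forecasts, actuals):
    """Return ME, RMSE, MAE and MAPE for paired forecasts F_i and actual prices A_i."""
    n = len(forecasts)
    errors = [f - a for f, a in zip(forecasts, actuals)]
    return {
        "ME": sum(errors) / n,                             # signed bias
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # Relative error; undefined when an actual price is zero
        "MAPE": sum(abs(e) / abs(a) for e, a in zip(errors, actuals)) / n,
    }
```

ME can be close to zero even when individual errors are large, because positive and negative errors cancel, which is why it is usually read alongside RMSE or MAE.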

Preliminary results
The first purpose of this research is to test the prediction performance of the models in each stage of the financial crisis. Firstly, we want to know whether learning prices from a certain stage (or stages) would have any side effect on the prediction of prices. Secondly, we want to find out whether hybrid networks possess any advantages over the traditional network. The error measures (RMSE) of each simulation set are extracted and summarized in Tables 4, 5 and 6. It is clear that the worst-performing exercises lie with Simulation Set 1 (shown in Table 3). In this set, the booming stage before the financial crisis was used for training the networks, which were then used for predicting the prices for the tumbling period of the market. This phenomenon can be seen in all three networks, regardless of whether the network is hybrid or not: the largest errors always occurred in Set 1.
The pricing performance generally becomes better when the networks trained on Stage 1 are used to predict option prices for Stages 3 and 4, which represent a recovering economy after the financial crisis, though there are some large errors when predicting OTM options. The performance improves because the later stages in the market bear similar features to those in Stage 1, namely that market prices were climbing in one way or another.
When we used more stages as training sets, for instance Stages 1, 2 and 3 as a whole, the prediction of option prices improved. Take the RMSE of BS-ANN in the long-term, out-of-the-money group as an example. The RMSE of Stage 4 with a training set of Stage 1 is 829.50198, and that of the same stage with training data from Stage 1 and Stage 2 together is 123.92778, a difference of more than 700. The RMSEs obtained with a training set of Stages 1 and 2, or of Stages 1, 2, and 3 altogether, are much smaller. This holds true for almost all the groups.
On the other hand, the network trained only on Stage 1 can generate better results (smaller errors) when used to predict option prices for Stage 3 or Stage 4, though the results are a little worse than those obtained by the networks trained on multiple stages. All these phenomena indicate that the choice of a training set for a network is crucial for its prediction performance. Researchers need to pay attention to which period of the economic cycle the training data are drawn from, and which period is to be predicted. Of course, a larger quantity of training data could generate smaller errors; however, the improvement could be limited, as seen in our simulation, where a network trained on a combination of Stages 1, 2 and 3 did not much outperform a network trained on Stages 1 and 2. In our view, it was the diluting effect of combining a booming period (S1) and a crashing period (S2) in the market that resulted in better prediction for a stable period (S4), not necessarily the larger amount of data, although the latter certainly helps.
With the other objective in mind, we now pick Simulation Set 1 (shown in Table 7) as an example (the comparisons for the other Simulation Sets can be found in the Appendix). Understandably, this is the worst case we have had, with all the networks having performed badly for the period during which a regime switch in the financial market occurred. As shown before, in this simulation the booming stage before the 2008 financial crisis was used for training the networks, which were then used for predicting the prices for the tumbling period of the market. What we have observed here is that the hybrid CS-ANN and BS-ANN models performed a little better than the NN model overall, while the BS-ANN performed worst for OTM options with long-term maturity. This is somewhat related to the fact that Black-Scholes mispricing worsens when the underlying volatility is high, especially for deep out-of-the-money options [15].
Although network models can make a significant contribution to improving forecast precision, all three models share a common and significant deficiency: they might generate negative prices for out-of-the-money options. A small moneyness ratio $S_0/K$ usually leads to a relatively low option price in the real market, as shown in Table 1. In the standard neural network model, it is hard for the network to control how close the option price is to zero. Similarly, it is hard for the hybrid networks to control the difference between the price predicted by the BS (or CS) model and the real price from the market. This deficiency in simulation would probably lead to a negative sum of the strike-scaled model price and the predicted difference, and further to a negative predicted option price.

Discussions and conclusions
Recent research on non-parametric models for option pricing has shown NN models to be less effective than parametric models (such as the BS model), especially during periods of financial crisis. After pricing options for 1987 and 2008 by using NN models, Lento and Gradojevic [24] concluded that "the very advantages of non-parametric models over their parametric counterparts such as learning ability and the flexibility of functional forms largely contribute to the poor performance of non-parametric models when markets are highly volatile and experience a regime shift." However, our research suggests that the incompetence of NN models for option pricing is likely due to the fact that NN models may have been trained by using data from improper periods of market cycles (regimes), and is not necessarily due to the learning ability and the flexibility of NN models. Further, one should be aware that this research is not only about traditional neural networks; it is mainly about hybrid ANN models for option pricing, i.e. ANNs coupled with parametric option pricing models. The research is therefore novel in this respect, and the related results should mainly be read as applying to hybrid ANN models for options.
In conclusion, using data on Hang Seng Index Options from Nov, 2005 to Oct, 2011, this paper tests the performance of a standard neural network and two hybrid neural networks in option pricing. In order to test their performance in different stages of the financial crisis, the data are separated into several groups, which are forecasted separately. Four error measures, namely ME, RMSE, MAE, and MAPE, are used to compare the performances. Two conclusions can be drawn based upon our research: 1) A neural network trained on financial data from a booming period of a market does not predict option prices well for a period that undergoes a financial crisis (a tumbling period in the market); researchers and practitioners should therefore be cautious when predicting option prices by using either a standard ANN or a hybrid ANN. 2) Hybrid ANNs performed slightly better than traditional ANNs for option prices in the course of the financial crisis, except for predicting OTM option prices, where BS-ANN performed worst for OTM options with long-term maturity.

$P$ is the number of weights and biases in the network; $L$ is the output layer ($L = 2$ in this research); $N_i$ is the number of nodes in the $i$-th layer; $M$ is the number of data pairs used in the training process.

$n(z)$ is the standard normal probability density function; $S_t$ is a random stock price at time $t$; $\mu_3$ is the non-normal skewness; $\mu_4$ is the non-normal kurtosis.
Short-term options are those with $T < 60$ days. Further, based upon the moneyness of the option, $S_0/K$, each of the three maturity groups is separated into three subgroups, as in Bakshi, Cao and Chen's research [6], beginning with out-of-the-money (OTM) options.

Fig. 1 Trend of Hang Seng Index from Nov, 2005 to Oct, 2011

Table 1. Hong Kong index options

Table 2. Stages of development of the financial crisis

Table 3. The simulation sets

Table 4. RMSEs of each simulation set for the artificial neural network

Table 5. RMSEs of each simulation set for Hybrid Model 1 (BS-ANN)

Table 6. RMSEs of each simulation set for Hybrid Model 2 (CS-ANN)

Table 7. Error measures of Simulation Set 1