April 7, 2006
Analyzing Spatial Panel Data of Cigarette Demand: A Bayesian Hierarchical Modeling Approach

Analysis of spatial panel data is of great importance and interest in spatial econometrics. Here we consider cigarette demand in a spatial panel of 46 states of the US over a 30-year period. We construct a demand equation to examine the elasticity of per pack cigarette price and per capita disposable income. The existing spatial panel models account for both spatial autocorrelation and state-wise heterogeneity, but fail to account for temporal autocorrelation. Thus we propose new spatial panel models and adopt a fully Bayesian approach for model parameter inference and prediction of cigarette demand at future time points using MCMC. We conclude that the spatial panel model that accounts for state-wise heterogeneity, spatial dependence, and temporal dependence clearly outperforms the existing models. Analysis based on the new model suggests a negative cigarette price elasticity but a positive income elasticity.


Introduction
In econometric terms, panel data are observations aggregated on a cross-section over multiple time periods. In the case of cross-sections of spatial regions such as counties in a state, states in a country, or countries around the world, panel data are referred to as spatial panels (Anselin, 1988; Baltagi, 2005). Here we examine a demand equation that relates cigarette consumption to cigarette price and per capita disposable income. The federal government of the US has made serious efforts through major policy interventions to reduce the consumption of cigarettes since the 1960s. However, some of the major policy interventions are not without controversy, an example being the Congressional ban of broadcast advertising of cigarettes in 1971 after the application of the Fairness Doctrine Act to cigarette advertising in 1967. The banning of pro-smoking messages from television and radio has also eliminated anti-smoking messages, as it is no longer a requirement for these stations to adhere to the Fairness Doctrine Act. If anti-smoking messages are more effective than pro-smoking messages, then the net effect of the 1971 advertising ban may actually be an increase in cigarette consumption. To assess this net effect and the effectiveness of other major policy interventions, it is important to properly construct and estimate the cigarette demand equation (Hamilton, 1972; Baltagi and Levin, 1986).
The data set consists of a panel of 46 states of the United States from 1963 to 1992. The response variable here is the real per capita sales of cigarettes by persons of smoking age (14 years and older) measured in packs per person on the log scale (henceforth "cigarette sales"). See Figure 1 for maps of cigarette sales in a few selected years. Two explanatory variables are considered for constructing the demand equation. The first explanatory variable is the average retail price of a pack of cigarettes measured in real terms on the log scale (henceforth "cigarette price"). The second explanatory variable is the real per capita disposable income of each state on the log scale (henceforth "income"). Following the convention in econometrics, all three variables, cigarette sales, cigarette price, and income, are on the log scale, so that the coefficients in the demand equation represent elasticities.
[Figure 1: Maps of cigarette sales in the years 1963, 1973, 1983 and 1992.]
While cigarette taxation is meant to deter cigarette consumption, it is still difficult to quantify the price elasticity of demand for cigarettes. One challenge is to account for spatial dependence between neighboring states, which may be attributed to economic phenomena such as bootlegging, when consumers in a state with a higher cigarette tax attempt to buy cigarettes in a bordering state with a lower cigarette tax. Other challenges are to account for differences among states or among time points, which we will refer to as heterogeneity across states or heterogeneity over time. The effect of income on cigarette consumption can be ambiguous. On one hand, if cigarettes are a normal good, higher income would lead to more consumption. On the other hand, more education, usually positively associated with higher income, would lead to less cigarette consumption. See Baltagi and Levin (1986) and Baltagi and Li (2004) for more details.
Baltagi and Levin (1986) were the first to consider this spatial panel from 1963 to 1980 and constructed a dynamic demand equation for cigarettes to address several major policy issues. Their data analysis yielded a significant negative effect of cigarette price on cigarette consumption with a price elasticity of −0.2, while there was no effect of income on cigarette consumption, with an insignificant income elasticity. Furthermore, they modeled the bootlegging effect directly by incorporating the lowest cigarette price in neighboring states as an additional explanatory variable and found the effect of bootlegging to be statistically significant. In light of these results, they concluded that cigarette taxation was an effective tool for generating revenues despite the spillover effects to neighboring states where bootlegging was significant.
More recently, Baltagi and Li (2004) considered the same spatial panel with the time period updated to range from 1963 to 1992. They constructed a simple demand equation for cigarettes to examine the elasticity of cigarette price and that of income. They also modeled the bootlegging effect but as a part of spatial dependence, which may be thought of as an improvement of the explanatory variable approach taken in Baltagi and Levin (1986). In addition, depending on whether heterogeneity across states or over time was accounted for explicitly, the resulting models were spatially and temporally homogeneous, or temporally heterogeneous, or spatially heterogeneous. Based on the performance of prediction, they concluded that it was important to take into account spatial dependence and spatial heterogeneity.
There are, however, some unresolved issues in Baltagi and Li (2004). First, the models assume independence over time while allowing for dependence across space. It would be interesting and, as it turns out, necessary to account for both spatial dependence and temporal dependence for the cigarette demand data. One of the main reasons that temporal dependence is not accounted for in these spatial panel models is the computational difficulty of optimizing over a high-dimensional parameter space to obtain the maximum likelihood estimates (MLE). Chen and Conley (2001) considered semiparametric estimation of spatial panel data models. Kapoor et al. (2006) proposed a method of moments estimator for a spatial panel data model. Driscoll and Kraay (1998) considered estimation of such models with unspecified spatial correlation. However, none of these models accounted for spatial dependence and temporal dependence simultaneously. See Elhorst (2003) and Anselin (2006) for a survey of spatial panel data models.
With the advances in computing technology and in Markov chain Monte Carlo (MCMC) algorithms, there is now ample opportunity to resolve these computational difficulties in spatial panel data analysis and consider more complex and thus more realistic spatial panel models. Here we consider generalizing the spatial panel models in Baltagi and Li (2004) to account for not only spatial dependence but also temporal dependence. In statistical terms, spatial panel data may be viewed as repeated measures of spatially aggregated data. Thus the models proposed here can be viewed as spatial-temporal statistical models for spatial panel data. For statistical inference, we use Bayesian hierarchical models as an alternative to maximum likelihood. We develop MCMC algorithms for obtaining the posterior distributions of model parameters, as well as posterior predictive distributions of cigarette demand at future time points.
Our contribution is that, for the cigarette demand data, our general spatial-temporal models provide a significant improvement over the existing spatial panel models in terms of model fitting as well as predictive power. We also show that the analysis results are comparable between MLE and Bayesian inference in the existing spatial panel models, while our Bayesian inference is still computationally feasible for the more complex spatial-temporal models. Furthermore, our approach provides direct statistical inference of spatial and temporal dependence in a panel data model, which, to our knowledge, has not been accomplished before. While Baltagi et al. (2006) considered Lagrange multiplier tests for similar models, these tests rely on the restricted model only and thus cannot infer the unrestricted spatial-temporal model. In contrast, we provide the posterior distribution of spatial and temporal dependence in the unrestricted model. Finally, although spatial-temporal statistics have been applied to many disciplines such as ecology (Zhu et al., 2005, 2007) and epidemiology (Waller et al., 1997), we believe that our spatial-temporal models and the Bayesian inference are novel for the analysis of spatial panel data in general. With greater availability of and a growing interest in spatial panel data, the work presented here would advance the capability of analyzing spatial panel data in practice and thus impact the field of spatial econometrics.
The remainder of the paper is organized as follows. We review the existing spatial panel models and propose new Bayesian inference in Section 2. Then in Section 3, we develop new spatial-temporal models for spatial panel data and again propose Bayesian inference. Analysis results of the cigarette demand data using both the existing and the new models are shown in Section 4. Model comparisons based on an information criterion and prediction performance are given in Section 5, followed by a brief conclusion.

Bayesian Inference for the Existing Spatial Panel Models
Here we review the spatial panel models considered in Baltagi and Li (2004) and propose a Bayesian hierarchical model for statistical inference. Following the notation in Baltagi and Li (2004), we let $y_{it}$, $x_{it1}$, $x_{it2}$ denote the variables of cigarette sales, cigarette price, and income, respectively, in the $i$th state and the $t$th year, where $i = 1, \ldots, N$ indexes the $N = 46$ states and $t = 1, \ldots, T$ indexes the years starting from 1963. We will use the first $T = 25$ years (1963-1987) for model building and set aside the last 5 years (1988-1992) for prediction and model comparisons. The existing spatial panel model for cigarette demand is expressed as,
$$y_{it} = x_{it}'\beta + \epsilon_{it}, \tag{2.1}$$
where $x_{it} = (x_{it1}, x_{it2})'$ denotes the two explanatory variables, $\beta = (\beta_1, \beta_2)'$ denotes the corresponding regression coefficients, and $\epsilon_{it}$ is such that
$$\epsilon_t = \mu_t + \phi_t, \tag{2.2}$$
where, at time $t$ and in all states, $\epsilon_t = (\epsilon_{1t}, \ldots, \epsilon_{Nt})'$, $\mu_t = (\mu_{1t}, \ldots, \mu_{Nt})'$ denotes the vector of state effects, and $\phi_t = (\phi_{1t}, \ldots, \phi_{Nt})'$ denotes the vector of disturbances, which are assumed to be independent of $\mu_t$. Further, $\phi_t$ follows a spatial dependence model,
$$\phi_t = \lambda W \phi_t + \nu_t, \tag{2.3}$$
where $W$ denotes an $N \times N$ matrix of known normalized spatial weights based on a pre-specified neighborhood structure, $\lambda$ denotes a spatial autocorrelation coefficient that lies between the inverse of the smallest eigenvalue of $W$ and 1, and $\nu_t = (\nu_{1t}, \ldots, \nu_{Nt})'$ denotes white noise whose components are iid $N(0, \sigma_\nu^2)$ and are independent of $\phi_t$ and $\mu_t$. The spatial dependence model in (2.3) is also known as the simultaneous autoregressive (SAR) model (Cressie, 1993). In Sections 2.1-2.4, we will consider different specifications of $\mu_t$ that give rise to four types of spatial panel models, which we will refer to as homogeneous model, heterogeneous model, fixed-effects model, and random-effects model, similar to Baltagi and Li (2004).
Now let $y_t = (y_{1t}, \ldots, y_{Nt})'$ denote the vector of the response variables and let $X_t = [x_{1t}, \ldots, x_{Nt}]'$ denote the matrix of the explanatory variables in all states at time $t$. Further let $y = (y_1', \ldots, y_T')'$ and $X = [X_1', \ldots, X_T']'$ denote the response and explanatory variables in all states and at all times. Then we have the following distribution for the data at time $t$, Under the assumption of temporal independence, the distribution for the data at all time points is, where $\mu = (\mu_1', \ldots, \mu_T')'$. In Baltagi and Li (2004), statistical inference for the spatial panel models is via maximum likelihood and the maximization is performed using the procedure optmum in the software package GAUSS. Here we propose a Bayesian hierarchical model and devise MCMC algorithms for statistical inference instead.
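To make the SAR disturbance in (2.3) concrete, the following minimal sketch simulates one vector of disturbances satisfying $\phi_t = \lambda W \phi_t + \nu_t$. It is only an illustration under stated assumptions: the ring-shaped neighborhood (each region has exactly two neighbors), the values of `lam` and `sigma`, and the fixed-point solver are all hypothetical choices made here, not the state adjacency or estimation procedure used in the paper.

```python
import random

def ring_weights(n):
    """Row-normalised weights: each of n regions has two neighbours on a ring."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][(i - 1) % n] = 0.5
        W[i][(i + 1) % n] = 0.5
    return W

def matvec(W, v):
    """Matrix-vector product W v using plain lists."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def simulate_sar(W, lam, sigma, rng, n_iter=200):
    """Solve phi = lam * W phi + nu by fixed-point iteration.

    The iteration converges for |lam| < 1 because the rows of W sum to one.
    """
    nu = [rng.gauss(0.0, sigma) for _ in W]
    phi = list(nu)
    for _ in range(n_iter):
        phi = [lam * wp + e for wp, e in zip(matvec(W, phi), nu)]
    return phi, nu

rng = random.Random(0)
W = ring_weights(46)   # 46 regions; the ring layout is a hypothetical stand-in
phi, nu = simulate_sar(W, lam=0.6, sigma=0.1, rng=rng)

# Largest violation of phi = lam * W phi + nu; should be at round-off level.
residual = max(abs(p - 0.6 * wp - e) for p, wp, e in zip(phi, matvec(W, phi), nu))
print(residual)
```

With $\lambda = 0$ the loop leaves `phi` equal to `nu`, which matches the white-noise special case of the disturbance model.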
For the prior distributions, we let where $w$ denotes the smallest eigenvalue of $W$ so that the resulting spatial variance-covariance matrix is positive definite. We will specify the prior distribution for $\{\mu_t\}$ in Sections 2.1-2.4. Even though conjugate priors are used here for $\beta$, $\sigma_\nu^2$, and $\{\mu_t\}$, the hyperparameters are chosen such that the priors are diffuse. We perform sensitivity analysis to ensure that the posterior distribution is not sensitive to these choices.
We use a Gibbs sampler here for simulating from the posterior distributions. For brevity, we present the full conditional distributions for the Gibbs sampler while omitting the detailed derivation. The full conditional distribution of $\beta$ is, where The full conditional distribution of $\lambda$ is not in closed form and thus we use a Metropolis-Hastings algorithm with a normal distribution as the proposal distribution. In the case $\lambda = 0$, the disturbance model does not account for spatial dependence and reduces to white noise. The same Gibbs sampler above can be applied, with omission of the Metropolis-Hastings algorithm for simulating $\lambda$. We will specify the corresponding posterior distribution of $\mu_t$ in Sections 2.1-2.4.
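The Metropolis-Hastings step for $\lambda$ can be sketched as follows. This is a simplified illustration rather than the sampler used in the paper: it assumes the disturbances $\phi_t$ are observed directly, that $\sigma_\nu^2$ is known, and that $W$ is the row-normalised weight matrix of a hypothetical ring of regions, whose eigenvalues $\cos(2\pi k/N)$ are available in closed form. The target is the SAR likelihood implied by $\phi_t = \lambda W \phi_t + \nu_t$ combined with a uniform prior on the interval between the inverse of the smallest eigenvalue of $W$ and 1. All constants and function names are illustrative.

```python
import math
import random

N, T, TRUE_LAM, SIGMA2 = 20, 25, 0.5, 1.0   # hypothetical sizes and values
EIG = [math.cos(2.0 * math.pi * k / N) for k in range(N)]  # eigenvalues of ring W
LOWER = 1.0 / min(EIG)                       # inverse of smallest eigenvalue

def w_times(phi):
    """W phi for ring weights (two neighbours, weight 1/2 each)."""
    n = len(phi)
    return [0.5 * (phi[i - 1] + phi[(i + 1) % n]) for i in range(n)]

def log_post(lam, phis):
    """Log posterior of lam up to a constant, under a uniform(LOWER, 1) prior."""
    if not LOWER < lam < 1.0:
        return -math.inf
    logdet = sum(math.log(1.0 - lam * w) for w in EIG)   # log det(I - lam W)
    ss = sum((p - lam * wp) ** 2
             for phi in phis for p, wp in zip(phi, w_times(phi)))
    return len(phis) * logdet - ss / (2.0 * SIGMA2)

def simulate(rng):
    """One draw of phi = lam W phi + nu, solved by fixed-point iteration."""
    nu = [rng.gauss(0.0, math.sqrt(SIGMA2)) for _ in range(N)]
    phi = list(nu)
    for _ in range(200):
        phi = [TRUE_LAM * wp + e for wp, e in zip(w_times(phi), nu)]
    return phi

def run_chain(phis, rng, n_iter=3000, step=0.1):
    """Random-walk Metropolis with a normal proposal of standard deviation step."""
    lam, cur, draws, accepted = 0.0, log_post(0.0, phis), [], 0
    for _ in range(n_iter):
        prop = lam + rng.gauss(0.0, step)
        new = log_post(prop, phis)
        if rng.random() < math.exp(min(0.0, new - cur)):
            lam, cur, accepted = prop, new, accepted + 1
        draws.append(lam)
    return draws, accepted / n_iter

rng = random.Random(1)
phis = [simulate(rng) for _ in range(T)]
draws, acc_rate = run_chain(phis, rng)
kept = sorted(draws[500:])                   # drop a short burn-in
post_median = kept[len(kept) // 2]
print("posterior median of lambda:", post_median, "acceptance rate:", acc_rate)
```

In a full Metropolis-within-Gibbs sampler, `phis` would be recomputed at each iteration as the current residuals $y_t - X_t\beta - \mu_t$, and `step` would be tuned to reach the desired acceptance rate.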

Homogeneous model
In the homogeneous model, the state effects are assumed to be state-homogeneous and time-homogeneous with $\mu_t = \mu 1_N$, where $\mu$ is an overall mean and $1_N$ denotes an $N$-dimensional vector of 1's. We use the following prior distribution for $\mu$,
$$\mu \sim \mathrm{normal}(\mu_0, \gamma_{\mu 0}). \tag{2.9}$$
Thus the full conditional distribution of $\mu$ is, where When $\lambda = 0$, the model assumes spatial independence, which we will refer to as "homogeneous independent model". Otherwise, when $\lambda \neq 0$, we will refer to the model as "homogeneous spatial model".

Heterogeneous model
In the heterogeneous model, the state effects are assumed to be state-homogeneous but time-inhomogeneous with $\mu_t = \mu_t 1_N$, where $\mu_t$ is a mean across states at time $t$. We let $\mu = (\mu_1, \ldots, \mu_T)'$ denote the vector of these means over time. Also, the regression coefficients and the spatial autocorrelation coefficient are assumed to be state-homogeneous and time-inhomogeneous with $\beta = (\beta_1', \ldots, \beta_T')'$ and $\lambda = (\lambda_1, \ldots, \lambda_T)'$, where $\beta_t = (\beta_{1t}, \beta_{2t})'$. Then the distribution for the data at time $t$ is, Thus the distribution for the data at all time points is, (2.12) where $V = \mathrm{diag}\{\lambda_t W\}_{t=1}^{T}$ and $\tilde{X} = (I_T \otimes J_N) X$ with $J_N$ an $N \times 2$ matrix of all 1's. Let $\tilde{X}_t = (l_t \otimes J_N) X_t$ where $l_t$ denotes the $t$th row of $I_T$. Thus $\tilde{X} = [\tilde{X}_1', \ldots, \tilde{X}_T']'$. With the following prior distributions for $\beta$ and $\sigma_\nu^2$, where We let $\lambda_t \sim \mathrm{uniform}(w^{-1}, 1)$. We use the following prior distribution for $\mu$,
$$\mu \sim \mathrm{normal}(\mu_0, \Gamma_{\mu 0}). \tag{2.16}$$
Thus the full conditional distribution of $\mu$ is, where When $\lambda = 0$, the model assumes spatial independence, which we will refer to as "heterogeneous independent model". Otherwise, when $\lambda \neq 0$, we will refer to the model as "heterogeneous spatial model".

Fixed-effects model
In the fixed-effects model, the state effects are assumed to be time-homogeneous but state-inhomogeneous with $\mu_t = \tilde{\mu} = (\mu_1, \ldots, \mu_N)'$, where $\mu_i$ is a mean over time for state $i$. We use the following prior distribution for $\tilde{\mu}$,
$$\tilde{\mu} \sim \mathrm{normal}(\mu_0, \Gamma_{\mu 0}). \tag{2.18}$$
where When $\lambda = 0$, the model assumes spatial independence, which we will refer to as "fixed-effects independent model". Otherwise, when $\lambda \neq 0$, we will refer to the model as "fixed-effects spatial model".

Random-effects model
In the random-effects model, the state effects are assumed to be time-homogeneous but state-inhomogeneous with $\mu_t = \tilde{\mu} = (\mu_1, \ldots, \mu_N)'$, where the $\mu_i$'s are random variables following iid $\mathrm{normal}(\mu^*, \sigma_\mu^2)$. Then the full conditional distribution of $\tilde{\mu}$ is, where We use the following prior distribution for $\mu^*$ and $\sigma_\mu^2$, (2.21) The full conditional distribution of $\mu^*$ is, where (2.23)
When $\lambda = 0$, the model assumes spatial independence, which we will refer to as "random-effects independent model". Otherwise, when $\lambda \neq 0$, we will refer to the model as "random-effects spatial model".

New Spatial Panel Models and Bayesian Inference
The existing spatial panel models assume temporal independence of cigarette sales $y_t$, which may not be appropriate. To account for potential temporal dependence, we propose a spatial-temporal statistical model. Using the same notation as in Section 2, we consider the same data model as (2.1) and the same model for the state effect and disturbance as (2.2). But now, to account for temporal dependence, we let $\phi_t$ follow a spatial-temporal dependence model, where the new parameter $\eta$ denotes a temporal autocorrelation coefficient and, in addition, $\nu_t$ is white noise iid $N(0, \sigma_\nu^2)$ and is independent of $\phi_{t-1}$. In Sections 3.1-3.4, we will consider different specifications of $\mu_t$ that give rise to four new spatial panel models, corresponding to the homogeneous model, heterogeneous model, fixed-effects model, and random-effects model in Section 2.
Due to (3.1), we have the following distribution for the data at time $t$, where The spatial-temporal model may be viewed as a temporal process with the transition probability (3.2). When $\eta = 0$, model (3.1) reduces to (2.3). Thus the distribution for the data at all time points is, where $\mu = (\mu_1', \ldots, \mu_T')'$ and Then as in Section 2, we propose a Bayesian hierarchical model and devise MCMC algorithms for statistical inference. We use the same prior distributions as in Section 2 for $\beta$, $\sigma_\nu^2$ and $\lambda$ (see (2.6)) and a uniform prior for $\eta$,
$$\eta \sim \mathrm{uniform}(-1, 1). \tag{3.5}$$
We will specify the prior distribution for $\{\mu_t\}$ in Sections 3.1-3.4. Again we choose hyperparameters that ensure a diffuse prior for $\beta$, $\sigma_\nu^2$ and $\{\mu_t\}$. We also perform sensitivity analysis to ensure that the posterior distribution is not sensitive to these choices.
We use a Gibbs sampler here for simulating from the posterior distributions. Again for brevity we omit the derivation of the full conditional distributions. The full conditional distribution of $\beta$ is, where where We use a Metropolis-Hastings algorithm with a normal distribution as the proposal distribution to update $\lambda$ and $\eta$. In the case $\lambda = 0$, the disturbance model does not account for spatial dependence, but allows for temporal dependence. The corresponding posterior distribution of $\mu_t$ will be specified in Sections 3.1-3.4 using notation that is consistent with that in Section 2.
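As a worked illustration of how a temporal term changes the disturbance model, the block below writes out one form that is consistent with the description in this section: it adds a first-order autoregressive term with coefficient $\eta$ to the SAR model and reduces to (2.3) when $\eta = 0$. This specific form, and the shorthand $B = I_N - \lambda W$, are assumptions made here for illustration and should not be read as the exact statement of (3.1) or (3.2).

```latex
% Assumed form of the spatial-temporal disturbance model (illustrative only).
% B = I_N - \lambda W is shorthand introduced here.
\begin{align*}
\phi_t &= \lambda W \phi_t + \eta\, \phi_{t-1} + \nu_t,
  \qquad \nu_t \sim N(0, \sigma_\nu^2 I_N), \\
B\, \phi_t &= \eta\, \phi_{t-1} + \nu_t,
  \qquad B = I_N - \lambda W, \\
\phi_t \mid \phi_{t-1} &\sim
  N\!\left(\eta\, B^{-1} \phi_{t-1},\; \sigma_\nu^2\, (B'B)^{-1}\right).
\end{align*}
```

The last line follows because $\phi_t = B^{-1}(\eta\,\phi_{t-1} + \nu_t)$ and $\mathrm{Var}(B^{-1}\nu_t) = \sigma_\nu^2 B^{-1}(B^{-1})' = \sigma_\nu^2 (B'B)^{-1}$. Under this assumed form, the conditional distribution of $\phi_t$ given $\phi_{t-1}$ plays the role of a transition probability for the temporal process.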

Homogeneous model
In the homogeneous model, the state effects are $\mu_t = \mu 1_N$ with an overall mean $\mu$. We use the same prior distribution as (2.9) for $\mu$. Thus the full conditional distribution of $\mu$ is, where When $\lambda = 0$, the model assumes spatial independence, which we will refer to as "homogeneous temporal model". Otherwise, when $\lambda \neq 0$, we will refer to the model as "homogeneous spatial-temporal model".

Heterogeneous model
In the heterogeneous model, the state effects are $\mu_t = \mu_t 1_N$ with a mean across states $\mu_t$ at time $t$. We let $\mu = (\mu_1, \ldots, \mu_T)'$ denote the vector of these means over time. Also, the regression coefficients, spatial autocorrelation coefficient and temporal autocorrelation coefficient are assumed to be state-homogeneous but time-inhomogeneous with $\beta = (\beta_1', \ldots, \beta_T')'$, $\lambda = (\lambda_1, \ldots, \lambda_T)'$ and $\eta = (\eta_1, \ldots, \eta_{T-1})'$, where $\beta_t = (\beta_{1t}, \beta_{2t})'$. Then the distribution for the data at time $t$ is, where Thus the distribution for the data at all time points is, where

The existing spatial panel models
Table 1 summarizes the 2.5%, 50%, 97.5% percentiles of the posterior distributions of the parameters in the existing spatial panel models. For the heterogeneous models, the time-inhomogeneous parameter estimates averaged over time are shown as in Table 3 of Baltagi and Li (2004). For the four spatial models, details of implementation of the Metropolis within Gibbs algorithm are as follows.
For the normal proposal distributions, we tune the standard deviations to reach an average Hastings ratio between 0.2 and 0.7 as recommended in Gelman et al. (2004). The total number of iterations in the Metropolis within Gibbs run is 500,000 with a burn-in length of 20,000. Then every 1,000th of the Monte Carlo samples is used to form a Monte Carlo sample of size 500.
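The burn-in and thinning arithmetic can be checked with a short sketch. A thinned sample of exactly 500 is obtained if the 500,000 iterations are counted after the 20,000 burn-in iterations; if the burn-in were counted within the 500,000, thinning would leave 480 draws. The sketch below assumes the former reading and uses a stand-in chain of independent normal draws, with hypothetical mean and spread, in place of real MCMC output. The helper names `thin` and `percentile` are illustrative.

```python
import math
import random

def thin(chain, burn_in, every):
    """Discard the first burn_in draws, then keep every `every`-th draw."""
    return chain[burn_in + every - 1::every]

def percentile(sorted_draws, p):
    """Nearest-rank percentile of an already sorted list, for p in (0, 100]."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_draws)))
    return sorted_draws[rank - 1]

rng = random.Random(0)
# Stand-in for MCMC output: 20,000 burn-in draws followed by 500,000 more.
chain = [rng.gauss(-0.32, 0.03) for _ in range(520_000)]

kept = sorted(thin(chain, burn_in=20_000, every=1_000))
summary = [percentile(kept, p) for p in (2.5, 50.0, 97.5)]

# Alternative reading: burn-in counted inside the 500,000 iterations.
alt = thin(chain[:500_000], burn_in=20_000, every=1_000)

print(len(kept), len(alt), summary)
```

The three values in `summary` correspond to the 2.5%, 50%, and 97.5% percentiles of the kind reported in Tables 1 and 2, here computed for the stand-in chain only.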
The primary model parameters of interest are the elasticities of cigarette price and income. In the case of price elasticity, the 95% credible intervals based on the posterior distributions of the existing spatial panel models all suggest a significant negative effect of cigarette price on cigarette sales. Specifically, for the independent models, the homogeneous independent model gives a posterior median price elasticity of -0.618, whereas the heterogeneous independent model yields an average posterior median price elasticity of -1.172. The posterior distributions of price elasticity in the fixed-effects independent model and the random-effects independent model are similar, with posterior medians of -0.464 and -0.460 respectively, which are smaller in absolute value than those in the homogeneous and heterogeneous independent models. For the corresponding four spatial models, the posterior median price elasticities are -0.882, -1.236, -0.751 and -0.747, respectively. We note that the posterior median price elasticities are larger in absolute value when spatial dependence is accounted for, which suggests that cigarette price then has a stronger effect on cigarette sales.
In the case of income elasticity, it is not quite clear what the relation is between cigarette sales and income. For the independent models, the posterior median income elasticity in the homogeneous independent model is 0.112 and the average posterior median income elasticity in the heterogeneous independent model is 0.420. There is a significant positive effect of income on cigarette sales based on the 95% credible intervals. However, the posterior median income elasticities in the fixed-effects independent model and the random-effects independent model are -0.252 and -0.250 respectively, indicating a negative relation between cigarette sales and income. Based on the 95% credible intervals, these negative income elasticities are also significant. When spatial autocorrelation is accounted for, the posterior median income elasticities become larger in the homogeneous spatial model and the heterogeneous spatial model, with values 0.287 and 0.533 respectively. However, in the fixed-effects spatial model and the random-effects spatial model, the income elasticities are no longer significant, based on the 95% credible intervals.
The results of the spatial autocorrelation coefficient suggest strong positive spatial dependence in the disturbance. The posterior median spatial autocorrelation coefficient in the homogeneous spatial model is 0.410. In the heterogeneous spatial model, the average posterior median spatial autocorrelation coefficient is 0.244, which is somewhat smaller. In the fixed-effects spatial model and the random-effects spatial model, the posterior median spatial autocorrelation coefficients are 0.614 and 0.611 respectively, suggesting a stronger spatial dependence. Based on the 95% credible intervals, the spatial dependence is significant. The posterior medians of 0.614 and 0.611 are close to the MLEs of 0.61 and 0.65 in Baltagi and Li (2004), where statistical inference is via maximum likelihood.

The new spatial panel models
Table 2 summarizes the 2.5%, 50%, 97.5% percentiles of the posterior distributions of the new spatial panel models. For all the new models, details of the implementation of the Metropolis within Gibbs algorithm are similar to Section 4.1.
Table 2: The central 95% credible intervals and the medians for the parameters in (a) the temporal models that ignore spatial dependence; (b) the spatial-temporal models that account for spatial dependence and temporal dependence. The abbreviations are homo = homogeneous model, hetero = heterogeneous model, fixed = fixed-effects model, random = random-effects model. Also reported is the deviance information criterion (DIC) for each model fit.

                        (a) Temporal Models                    (b) Spatial-Temporal Models
parameter               homo      hetero    fixed     random    homo      hetero    fixed     random
price β_1     2.5%      -0.460    -0.502    -0.370    -0.364    -0.451    -0.510    -0.373    -0.374
              50%       -0.413    -0.429    -0.330    -0.325    -0.402    -0.430    -0.320    -0.336
              97.5%     -0.365    -0.350    -0.292    -0.
DIC                     -3534.5   -3772.1   -4177.1   -3804.4   -3604.6   -3744.2   -4248.8   -4138.7

In the case of price elasticity, the 95% credible intervals based on the posterior distributions of the new spatial panel models all suggest a significant negative effect of cigarette price on cigarette sales. Specifically, for the temporal models without spatial dependence, the homogeneous temporal model gives a posterior median price elasticity of -0.413. The heterogeneous temporal model yields an average posterior median price elasticity of -0.429, which is slightly larger in absolute value than that in the homogeneous temporal model. The posterior distributions of price elasticity for the fixed-effects temporal model and the random-effects temporal model are similar, with posterior medians of -0.330 and -0.325 respectively, which are smaller in absolute value than those in the homogeneous and heterogeneous temporal models. For the spatial-temporal models accounting for both spatial and temporal dependence, the posterior median price elasticities are -0.402, -0.430, -0.320, and -0.336, respectively, which are very close to the posterior median price elasticities in the temporal models without spatial dependence. Furthermore, comparing the posterior median price elasticities in the new spatial panel models to those in the existing spatial panel models, we note that the posterior median price elasticities are smaller in absolute value when the temporal dependence is accounted for. That is, when the temporal dependence is accounted for, cigarette price seems to have less effect on cigarette sales.
In the case of income elasticity, among the temporal models without spatial dependence, the posterior median income elasticity in the homogeneous temporal model is 0.377 and the average posterior median income elasticity in the heterogeneous temporal model is 0.270. There is a significant positive effect of income on cigarette sales based on the 95% credible intervals. However, in the fixed-effects temporal model and the random-effects temporal model, the income elasticities are not significant based on the 95% credible intervals. For the spatial-temporal models accounting for both spatial and temporal dependence, the posterior median income elasticities are 0.448, 0.265, 0.137, and 0.159, respectively. The 95% credible intervals based on these four models all suggest a significant positive effect of income on cigarette sales. Furthermore, comparing the posterior median income elasticities in the new spatial panel models to those in the existing spatial panel models, we note that the posterior median income elasticities in the homogeneous models and heterogeneous models all have positive signs. However, the posterior median income elasticities are negative in the fixed-effects independent model and the random-effects independent model, insignificant in the fixed-effects spatial model, the random-effects spatial model, the fixed-effects temporal model and the random-effects temporal model, and positive in the fixed-effects spatial-temporal model and the random-effects spatial-temporal model.
As for spatial dependence, the posterior median spatial autocorrelation coefficients in the homogeneous spatial-temporal model, fixed-effects spatial-temporal model and random-effects spatial-temporal model are 0.062, 0.078 and 0.080, respectively. The average posterior median spatial coefficient in the heterogeneous spatial-temporal model is 0.022. Based on the 95% credible intervals, the spatial dependence is significant in all four models. Comparing the posterior median spatial autocorrelation coefficients in the new spatial panel models to those in the existing spatial panel models, we note that the spatial dependence is much weaker when the temporal dependence is accounted for. On the other hand, the results of the temporal autocorrelation coefficient suggest a very strong positive temporal dependence in the disturbance. The posterior median temporal autocorrelation coefficients in all the new spatial panel models are very similar. Based on the 95% credible intervals, the temporal dependence is significant. Given the persistent nature of cigarette consumption behavior, it is important to take temporal dependence into account. Indeed, our results suggest that it is more important to incorporate the temporal dependence than the spatial dependence.

Model selection based on DIC
Here we use the deviance information criterion (DIC) to compare and select models (Spiegelhalter et al., 2002). Let $\theta$ denote the parameters in a given model. Then the DIC is given by $\mathrm{DIC} = \bar{D} + p_D$, where $\bar{D}$ denotes the posterior mean of the deviance $D(\theta) = -2 \log p(y \mid \theta)$ and $p_D = \bar{D} - D(\bar{\theta})$ denotes the effective number of parameters, with $\bar{\theta}$ the posterior mean of $\theta$. Smaller DIC values indicate a better model.
Table 1 provides the DIC values for the existing spatial panel models. Among the independent models, the homogeneous independent model has the largest DIC value and the heterogeneous independent model has the second largest DIC value. A substantial improvement occurs when state-wise heterogeneity is accounted for, because the fixed-effects independent model and the random-effects independent model give much smaller DIC values. Overall the fixed-effects independent model has the smallest DIC value and is thus the best among the independent models. Similar statements can be made about the DIC values of the spatial models. Furthermore, comparing the DIC values to those of the independent models, we note that the models have somewhat lower DIC values when spatial dependence is accounted for.
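As a minimal sketch of how a DIC value can be computed from posterior draws, the example below uses the standard definition of Spiegelhalter et al. (2002), $\mathrm{DIC} = \bar{D} + p_D$ with $p_D = \bar{D} - D(\bar{\theta})$, on a deliberately simple toy model: iid normal data with unknown mean and known unit variance. The toy model, the flat prior, and the direct sampling from the posterior are assumptions made for illustration; for the spatial panel models the deviance would instead be evaluated from the corresponding data distribution at each MCMC draw.

```python
import math
import random

def deviance(theta, y, sigma=1.0):
    """D(theta) = -2 log p(y | theta) for iid normal(theta, sigma^2) data."""
    n = len(y)
    ss = sum((v - theta) ** 2 for v in y)
    return n * math.log(2.0 * math.pi * sigma ** 2) + ss / sigma ** 2

def dic(draws, y):
    """Return (DIC, p_D) computed from posterior draws of theta."""
    d_bar = sum(deviance(t, y) for t in draws) / len(draws)  # posterior mean deviance
    theta_bar = sum(draws) / len(draws)                       # posterior mean of theta
    p_d = d_bar - deviance(theta_bar, y)                      # effective number of parameters
    return d_bar + p_d, p_d

rng = random.Random(2)
y = [rng.gauss(1.0, 1.0) for _ in range(200)]   # hypothetical data
y_bar = sum(y) / len(y)

# Under a flat prior the posterior of theta is normal(y_bar, 1/n); sample it directly.
draws = [rng.gauss(y_bar, 1.0 / math.sqrt(len(y))) for _ in range(5000)]

dic_value, p_d = dic(draws, y)
print(dic_value, p_d)
```

In this toy model there is one free parameter, so `p_d` should come out close to 1, which gives a quick sanity check on the implementation.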
Table 2 provides the DIC values for the new spatial panel models. Among the temporal models without spatial dependence, the homogeneous temporal model has the largest DIC value and the heterogeneous temporal model has the second largest DIC value. Accounting for the state-wise heterogeneity improves the models slightly. The fixed-effects temporal model and the random-effects temporal model give smaller DIC values. Overall the fixed-effects temporal model has the smallest DIC value and is thus the best among the temporal models. Similar observations are made about the DIC values of the spatial-temporal models. Comparing Table 2 to Table 1, the DIC values of the new spatial panel models are much smaller than those of the existing spatial panel models, which suggests a substantial improvement of model fitting when temporal dependence is accounted for.
We show in Figure 2 the posterior distributions of the regression coefficients, the spatial autocorrelation coefficient, and the temporal autocorrelation coefficient for the fixed-effects spatial-temporal model, which is the best among all the models. These posterior distributions show a significant negative relation between cigarette sales and cigarette price with a posterior median price elasticity of -0.320, and a significant positive relation between cigarette sales and income with a posterior median income elasticity of 0.137. According to economic theory, if cigarettes are considered a normal good, then a consumer with a higher income would tend to buy more, which gives a positive income elasticity. On the other hand, income is highly correlated with education. Since those with more education are less likely to smoke, the income elasticity may be negative. The net effect, according to this spatial panel of data, seems to be a positive income elasticity. The temporal dependence is very strong with a posterior median temporal autocorrelation coefficient of 0.986, while spatial dependence is relatively weak with a posterior median spatial autocorrelation coefficient of 0.078.
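To illustrate what these elasticities mean in practice, the sketch below converts an elasticity into the percent change in cigarette sales implied by a given percent change in price or income. It relies only on the log-log form of the demand equation, plugs in the posterior medians quoted above as point values, holds everything else fixed, and ignores posterior uncertainty. The function name `pct_change` and the 10% scenario are hypothetical choices for illustration.

```python
def pct_change(elasticity, pct_change_in_x):
    """Percent change in sales implied by a constant-elasticity (log-log) model.

    If log(sales) = elasticity * log(x) + other terms, then scaling x by a
    factor r scales sales by r ** elasticity.
    """
    ratio = (1.0 + pct_change_in_x / 100.0) ** elasticity
    return 100.0 * (ratio - 1.0)

# Posterior medians from the fixed-effects spatial-temporal model.
price_effect = pct_change(-0.320, 10.0)    # 10% higher price
income_effect = pct_change(0.137, 10.0)    # 10% higher income

print(round(price_effect, 2), round(income_effect, 2))
```

Under these assumptions, a 10% price increase corresponds to roughly a 3% drop in per capita sales, and a 10% income increase to roughly a 1.3% rise.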
For the new spatial panel models, prediction is made using the transition probabilities specified in (3.2). In particular, for each MCMC sample of θ and [...]. Table 4 gives the RMSEs for the prediction of cigarette demand for each year from 1988 to 1992, along with the RMSE for all five years, based on the new spatial panel models. Among the temporal models, the RMSEs for the homogeneous temporal model, the fixed-effects temporal model, and the random-effects temporal model are 0.1025, 0.0985, and 0.0775, respectively, which are quite close in value. The heterogeneous temporal model gives a slightly larger RMSE of 0.1193. Again, accounting for the spatial dependence yields smaller RMSEs. Among the spatial-temporal models, the RMSEs for the homogeneous spatial-temporal model, the fixed-effects spatial-temporal model, and the random-effects spatial-temporal model are 0.0651, 0.0705, and 0.0696, respectively, which are again quite close. The heterogeneous spatial-temporal model also gives a slightly larger RMSE of 0.1098. Overall, the homogeneous spatial-temporal model, the fixed-effects spatial-temporal model, and the random-effects spatial-temporal model perform well in predicting cigarette demand. Comparing Table 4 to Table 3, the RMSEs for the new spatial panel models are much smaller than those for the existing spatial panel models, which suggests that accounting for temporal dependence improves prediction.
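The per-year and pooled RMSEs reported in such a comparison can be computed as in the sketch below. It assumes that a single point prediction per state and year is already available (for instance a posterior predictive mean, which is an assumption here, not something stated above), and the names `rmse` and `rmse_by_year` as well as the toy numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_year(observed, predicted):
    """Return ({year: RMSE}, pooled RMSE over all years).

    `observed` and `predicted` map each year to a list of values,
    one per state, in the same state order.
    """
    per_year = {yr: rmse(observed[yr], predicted[yr]) for yr in observed}
    all_obs = [v for yr in observed for v in observed[yr]]
    all_pred = [v for yr in observed for v in predicted[yr]]
    return per_year, rmse(all_obs, all_pred)


# Toy values for two states in two years (not data from the paper).
obs = {1988: [1.0, 2.0], 1989: [3.0, 4.0]}
pred = {1988: [1.0, 2.0], 1989: [4.0, 5.0]}
per_year, pooled = rmse_by_year(obs, pred)
```

Note that the pooled RMSE is the square root of the average squared error over all state-year pairs, so it is not in general the average of the yearly RMSEs.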

Conclusion and Discussion
In this article, we considered a demand equation to examine the effect of price and income on cigarette demand, based on a spatial panel of 46 states of the US over a 30-year period. We generalized the existing spatial panel models in Baltagi and Li (2004) to account for not only spatial dependence but also temporal dependence. For statistical inference, we adopted a fully Bayesian approach and developed MCMC algorithms to obtain posterior distributions of model parameters and posterior predictive distributions of cigarette demand at future time points. The method proposed here overcomes the computational obstacles in other approaches for similar models. We showed that the analysis results are comparable between MLE in Baltagi and Li (2004) and our Bayesian inference, based on the existing spatial panel models that do not account for temporal dependence. Moreover, our new spatial panel models provide a substantial improvement over the existing spatial panel models in terms of model fit and out-of-sample predictive power. Based on the best model, the fixed-effects spatial-temporal model, we found a negative price elasticity as in Baltagi and Li (2004), but a positive income elasticity contrary to Baltagi and Li (2004). This further suggests the importance of accounting for spatial and temporal dependence simultaneously. We also found that the temporal dependence of cigarette demand appears stronger than the spatial dependence. Finally, the methodology proposed here is suitable for spatial panel data analysis in general and may be applied to a wider range of data sets in spatial econometrics.
Since the demand equation is of primary interest and the autocorrelation in the disturbance is of secondary interest, we have used a relatively simple spatial-temporal model for the disturbance. Our modeling framework can be extended to accommodate more complex spatial-temporal dependence structures. For example, (3.1) can be generalized from an autoregressive order of 1 to p to account for stronger time dependence. Similarly, the temporal term $\eta I_N$ in (3.4) can be extended to more general forms to feature spatial-temporal interactions. For future research, it would be worthwhile to develop more general classes of spatial-temporal models for the disturbance, while ensuring that the computation remains feasible.
DIC $= \bar{D}(y) + p_D$, where $\bar{D}(y)$ is the deviance $D(y, \theta)$ averaged over the range of possible parameter values (here all MCMC samples of $\theta$ from the posterior distribution), and $D(y, \theta)$ is the Bayesian deviance defined as $D(y, \theta) = -2 \log p(y \mid \theta)$, where $p(y \mid \theta)$ is the likelihood function. The average deviance $\bar{D}(y)$ can be interpreted as a Bayesian measure of model fit. The term $p_D$ is the effective number of parameters, defined as $p_D = \bar{D}(y) - D(y, \bar{\theta})$, where $\bar{\theta}$ is the average of the MCMC samples of $\theta$. The term $p_D$ can be interpreted as a measure of model complexity. A smaller value of $\bar{D}(y)$ indicates a better model fit, while a smaller value of $p_D$ indicates a more parsimonious model. With the two terms together, a smaller value of DIC indicates a better model.
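A minimal sketch of this calculation from MCMC output is given below: the deviance is averaged over the samples, evaluated once more at the sample mean of the parameters, and the two quantities are combined. The toy normal likelihood, the function names `normal_loglik` and `dic`, and the two hand-picked samples are assumptions for illustration only; the spatial panel likelihood of the paper would take the place of `normal_loglik`.

```python
import math


def normal_loglik(y, mu, sigma):
    """Log-likelihood of i.i.d. N(mu, sigma^2) data (toy stand-in model)."""
    n = len(y)
    sse = sum((yi - mu) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


def dic(y, theta_samples, loglik):
    """Return (DIC, average deviance, p_D) from MCMC samples.

    `theta_samples` is a list of parameter tuples; `loglik(y, *theta)`
    returns log p(y | theta).
    """
    m = len(theta_samples)
    deviances = [-2.0 * loglik(y, *theta) for theta in theta_samples]
    d_bar = sum(deviances) / m                      # average deviance
    k = len(theta_samples[0])
    theta_bar = tuple(sum(t[j] for t in theta_samples) / m for j in range(k))
    d_at_mean = -2.0 * loglik(y, *theta_bar)        # deviance at the sample mean
    p_d = d_bar - d_at_mean                         # effective number of parameters
    return d_bar + p_d, d_bar, p_d


# Toy data and two hypothetical "posterior samples" of (mu, sigma).
y = [0.0, 1.0, 2.0]
samples = [(0.5, 1.0), (1.5, 1.0)]
dic_value, d_bar, p_d = dic(y, samples, normal_loglik)
```

In this toy case the two samples have equal deviance and their mean fits the data better than either, so `p_d` is positive (0.75), which is the typical situation.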

Table 1: The central 95% credible intervals and the medians for the parameters in (a) the independent models that ignore spatial and temporal dependence; (b) the spatial models that account for spatial dependence. The abbreviations are homo = homogeneous model, hetero = heterogeneous model, fixed = fixed-effects model, random = random-effects model. Also reported is the deviance information criterion (DIC) for each fitted model.

Table 3: Root mean squared error of prediction using the existing spatial panel models.