Analysis of Covariance Structures in Time Series

Longitudinal data often arise in clinical trials when measurements are taken from subjects repeatedly over time so that data from each subject are serially correlated. In this paper, we seek some covariance matrices that make the regression parameter estimates robust to misspecification of the true dependency structure between observations. Moreover, we study how this choice of robust covariance matrices is affected by factors such as the length of the time series and the strength of the serial correlation. We perform simulation studies for data consisting of relatively short (N=3), medium (N=6) and long time series (N=14) respectively. Finally, we give suggestions on the choice of robust covariance matrices under different situations.


Introduction
In ordinary regression, the error terms are assumed to be independently and identically distributed with a constant variance.However, these assumptions are not always true in practice.For example, when subjects are measured repeatedly over time in longitudinal studies, observations from each subject are usually serially correlated.Such time series are common in clinical trials and stock markets.Then, it is customary to consider a generalized regression model in which errors are modeled by a covariance matrix with non-constant variances and nonzero covariances.Lindsey (1993) points out the importance of estimating such a covariance matrix.However, the number of unknown elements in the matrix can increase quadratically with its dimension causing considerable difficulties in estimation.Various approaches have been suggested, see for example, Efron and Morris (1976) who estimated the inverse of a covariance matrix using a loss function, Yang and Berger (1994) who applied a spectral decomposition to a covariance matrix, Chiu et al. (1996) who applied a matrix exponential transformation to a covariance matrix and estimated the parameters in the transformed matrix as well as the mean model using an ML approach and Barnard et al. (2000) who modeled the covariance matrix in terms of its standard deviations and correlation matrix and estimated the parameters using a Bayesian approach.Instead of resorting to this complicated techniques, we propose the generalized regression models in which the covariance matrices are based on widely used forms, and we seek those structures that are robust to misspecification of the true structures which are often unknown in practice.
We use the PROC MIXED procedure in SAS, the Statistical Analysis Software, to fit the generalized regression models for continuous outcomes, incorporating different choices of covariance structures with parameters estimated from the restricted maximum likelihood (REML) method (Patterson and Thompson, 1971).Fortran programs are used to simulate data sets of time series from multivariate normal distributions with different covariance matrices.Then the simulation experiments are carried out by fitting models adopting different covariance matrices repeatedly using the PROC MIXED procedures written in SAS macros.
In section 2, we describe the model and estimation procedures.We introduce some common covariance structures and their interpretation in terms of dependency structures between observations.Three criteria for assessing goodness-of-fit are also given.Performance of the models adopting different covariance matrices can be compared through simulation.Since the length of the time series and the strength of the serial correlation may affect the choice of robust covariance structures, simulation experiments are performed in section 3, using two data sets: the blood glucose data for relatively short and medium time series with weak serial correlation (N =3 and 6 and ρ = 0.36 in the covariance matrix of AR1 type), and the plasma citrate concentration data available for relatively longer time series with stronger serial correlation (N =14 and ρ = 0.70).True values of parameters in the simulation experiments are set equal to the estimates obtained from fitting the two data sets.Moreover we also consider other sets of the parameter values in the covariance matrix in order to study the interaction effect of the two factors: the length of the time series and the strength of the serial correlation on the choice of robust covariance structure.Results from simulations are analyzed and compared in section 4. Finally, in section 5, conclusions are drawn on the best covariance structures that give consistently the best fit, regardless of the true covariance structures.

The Model
We assume a generalized regression model of the form T is a vector of M N observed outcomes based on M time series each containing responses from one subject on N equally spaced time points, β = (β 0 , β 1 , . . ., β P b ) T is a vector of (P b + 1) parameters, X is a M N ×(P b +1) design matrix, ε is a vector of M N residuals such that ε ∼ N (0, Σ) and Σ is the block diagonal matrix with covariance matrix Σ 0 for each subject.The covariance matrix Σ 0 with P v variance parameters and P c covariance or correlation parameters describes the relationship between observations within each subject; we assume that the correlations between observations from different subjects are zero.There are P = P b + P v + P c + 1 parameters in total; emphasis is on the estimation of β.
First-order autoregressive (AR1) The restricted maximum likelihood (REML) estimates of the P v + P c parameters in Σ 0 can be obtained by maximizing the reduced log-likelihood and the REML estimate of β is where Σ is obtained by substituting the block diagonal matrix Σ 0 by Σ 0 .As the REML estimate, Σ 0 using (2.1), depends on β and the REML estimate, β using (2.2), depends on Σ 0 in Σ, iterations are required.Moreover, as β depends on Σ 0 , we seek some covariance matrices for Σ 0 that consistently give the best estimate of β and the best fit to data if the true dependency structure is misspecified.
For short time series of length N = 3 say, the covariance matrix Σ 0 can be modeled by J = 7 types, as listed in Table 1.
Note that we discard the unstructured (UN) covariance matrix with no constraints imposed on its entities σ ij because it contains too many parameters and offers no summary of information.Practically, correlation between observations with large time lag is very small and hence it can be well approximated by UN2 especially for time series of medium to long in length.For time series of medium length, N = 6 say, SIM and EC can be represented by σ 2 I 6 (P v = 1, P c = 0) and σ 2 (1 − ρ)I 6 + σ 2 ρJ 6 (P v = 1, P c = 1) respectively where I 6 and J 6 denote respectively a 6 × 6 identity matrix and a 6 × 6 matrix with all elements being 1.The remaining matrices are: For time series of longer length, N = 14 say, the covariance matrices of J = 7 types are similarly defined as in the cases of N = 3 and N = 6.Some of these covariance matrices can be directly linked to particular models.Let Y mn denote the outcome of the n-th measurement from subject m.Then EC corresponds to a random intercept model Diggle et al. 1996) On the other hand, AR1 (see Diggle et al. 1996) corresponds to the model Y mn = µ mn +ε mn and ε mn = ρε m,n−1 +e mn where e mn ∼ N (0, σ 2 (1−ρ 2 )), V ar(ε mn ) = σ 2 and Cov(ε mn , ε m,n−1 ) = σ 2 ρ and is one of the exponential correlation models.
Here σ mn , denoting the mn-th element in Σ 0 , is defined to be σ 2 exp(−φ|m − n|) = σ 2 ρ |m−n| where φ is a constant showing the rate of decay.It implies that covariance between a pair of measurements on the same subject decays to zero as the time between measurements increases.

Goodness-of-fit Tests
In order to facilitate comparison between different covariance structures, data set simulated from model adopting the i-th (i = 1, . . ., I) true covariance structures is fitted to models adopting each of the j-th (j = 1, . . ., J) structures and there are K = 200 replicates for each (i, j) combination (I = J = 7).Then we assess the goodness-of-fit of the J models adopting different covariance structures based on three criteria R hj , h = 1, 2, 3: the estimated parameters, their standard errors and the Akaike's Information Criterion (AIC) (Akaike, 1973).Let x ij denote the value of any of the three criteria in each (i, j) combination.For each criterion, ranking is applied twice; firstly across the J fitted structures for each of the i-th true structure as given by Rank j (x ij ) such that Rank j (x ij ) = 1 when x ij < x ij , for all j = j and Rank j (x ij ) = J when x ij > x ij .When there are ties, we take the average of ranks.Then Rank j (x ij ) is summed over i and second ranking R hj is applied across the sums, ∑ i Rank j (x ij ), for the J structures in the fitted model.The first ranking Rank j (x ij ) is necessary because ∑ i Rank j (x ij ) rather than ∑ i x ij will not be affected by the scale of x ij as well as any outlying x ij .If model with covariance structure j gives the smallest R j , it provides the best fit to data of different true covariance structures and hence is considered the most robust covariance structure.
C1 Rank of mean squared error(RMSE): where and β ijkp is the parameter estimate for β p in the k-th replicated data set which is simulated from model adopting the i-th true covariance structure and is fitted to model adopting the j-th covariance structure.This is a useful criterion when our objective is parameter estimation.
C2 Ratio of standard error and standard derivation: where and Note that SE( β ijkp ) is the standard error of β ijkp .The ratio Ratio ij between SD ijp and SE ijp measures the relative magnitude between the standard error (SE) and the standard derivation (SD) based on β ijkp .We expect Ratio ij to be close to 1 (SD close to SE) if the models fit the data well.
C3 Average of AIC: where and AIC ijk is the AIC value in the k-th replicated data set when the fitting involved the i-th true covariance structure and the j-th fitted structure.This is a useful criterion when our objective is data fitting.
An overall measure of the goodness-of-fit of models adopting the J structures based on these three criteria R hj , h = 1, 2, 3 are

Blood glucose data
Data of inter and intra individual variation of blood glucose levels (Andrews and Herzberg 1985, P.211) obtained from registrants for pre-natal care at Boston City Hospital, USA, are used to estimate the variation of blood glucose for women on the pregnant and non-pregnant states.The data contain two parts.There are 53 women in non-pregnant state.Each of them undertook an annual glucose tolerance test over a period of six years.In each year, a fasting blood glucose test and an one-hour-post-fasting blood glucose test were conducted.There are also 52 women in pregnant state, each having three fasting blood  2 shows that all parameter estimates are significant and the best model is EC according to AIC.By using the parameter estimates in Table 2 as true values for each of the J = 7 covariance structures, we simulate K = 200 data sets (or totally 1400 data sets) each having 105 trivariate normal vectors or 315 observations.Then we fit each of the 1400 data sets to J = 7 covariance structures.The fittings are done automatically using a SAS macro program.Moreover, in order to study the interaction effect of the two factors: the length of the time series and the strength of the serial correlation on the choice of robust covariance structure, we repeat the whole simulation experiment with an adjusted set of true values of which β remain the same but the level of serial correlation is higher: ρ in AR1 and EC and ρ 1 in TOEP and TOEP2 equal to 0.7, σ 21 = 0.7σ 2 σ 1 and σ 32 = 0.7 ρ 32 ρ 21 σ 3 σ 2 in UN2 and ρ 2 = 0.7 ρ 2 ρ 1 in TOEP.Obviously, models fitting to data simulated from the same covariance structure are generally the best according to AIC.On the other hand, Table 3 reports the sum of ranks ( AIC ij ) and the ranks R hj , h = 1, 2, 3 respectively.Across levels of ρ (0.36 and 0.7 in AR1 say), the ranks R 2j according to ratio of SD and SE are identical, the ranks R 1j according to M SE are similar but the ranks R 3j according to AIC are quite different.Table 3 also reports the overall rank R j for each structure according to the three criteria.The choices are not the same across levels of ρ but AR1 generally performs well according to both M SE and AIC.Hence AR1 is a good choice of robust covariance structure for both data fitting and parameter estimation.

Data of non-pregnant state women
This data set containing times series of medium length (N = 6) is obtained by including all blood glucose tests from the six fasting and one-hour-postfasting from 53 non-pregnant women.There are 318 (6 × 53) observations coming from M = 53 subjects each repeatedly measured N = 6 times.The dependent variable is again Y and the only independent variable is X 1 .The means for Y and X 1 over 318 observations are -14.39 and 79.72 respectively.Table 4 reports that parameter estimates for models with J = 7 different covariance matrices are all significant and the best model is TOEP according to AIC.Similarly by setting the parameter estimates as true values for I = 7 different covariance matrices, we simulate K = 200 data sets (or totally 1400 data sets) each having 53 normal vectors of N = 6 in length.Next, we fit each of the 1400 data sets to models with J = 7 different covariance matrices.From Table 5, the ranks R 1j and R 3j are similar to those of the previous data with N = 3 and low level of ρ, according to M SE and AIC respectively but they are quite different according to ratio of SD and SE.Moreover the first and second choices of robust covariance structure are identical to those of the previous data.Again, AR1 generally performs well according to both M SE and AIC.

Plasma citrate concentration data
An experiment involving 10 subjects (Andrews and Herzberg, 1985, P.237) was carried out to study the variation of plasma citrate concentration during a day.The concentration of citrate in plasma (in µmol per litre) of each subject was measured hourly from 8am to 9pm (N = 14) during a day.Meals were given at 8am, at noon and at 5pm.The mean concentration over the complete set of The regression models with 2 covariates, namely indicator of meal time (Meal) and time from 1 to 14 (Time), and using each of the J = 7 covariance structures are fitted to the data.Table 6 shows that all parameter estimates are significant and the best model is again TOEP according to AIC.Using the I = 7 sets of parameter estimates as true values, we simulate K = 200 data sets (or totally 1400 data sets) each having 10 normal vectors of N = 14 in length.Lastly we fit each of the 1400 data sets to models with J = 7 different covariance structures.
Again, we repeat the whole simulation experiment with an adjusted set of true values of which β remain the same but the level of serial correlation is lower: ρ in AR1 and EC and ρ 1 in TOEP and TOEP2 equal to 0.3 and σ ij in UN2 and ρ i in TOEP are similarly calculated as in the blood glucose data when N = 3.From Table 7, across levels of ρ (0.7 and 0.7 in AR1 say), the ranks R 1j and R 2j are very similar according to M SE and ratio of SD and SE but are quite different according to AIC.The table also reports the first and second choices of robust covariance structure according to the total rank R j using all the three criteria.
Across levels of ρ, the choices are not the same but SIM and EC generally perform well according to M SE.Hence when our objective is parameter estimation, SIM and EC are good choices of robust covariance structure.

Results
The first two simulation experiments using the blood glucose data consider time series of short and medium length and of weaker serial correlation (ρ = 0.36 and 0.38 in AR1 for N = 3 and 6 respectively) whereas the third simulation experiment using plasma citrate concentration data considers longer time series of stronger serial correlation (N = 14 and ρ = 0.70 in AR1).In order to study the interaction effect of the length of the time series and the strength of the serial correlation on the choice of robust covariance structure, two more simulation studies have been done with level of ρ in AR1 set to 0.7 and 0.35 for the two data sets containing time series of N = 3 and N = 14 respectively.Table 8 summarizes the first and second choices of robust covariance structure according to all the three criteria including M SE, ratio of SD to SE and AIC in a 2×2 contingency table.For data of relatively shorter time series and with weaker serial correlation, AR1 is the most robust structure and EC is also good.Both structures assume a constant variance and a consistent correlation between pairs of lag-k observations, given by ρ k and ρ respectively.Models adopting the two covariance structures describe well the gradual decline of serial correlation across 'lag'.For data with stronger serial correlation, TOEP and AR1 (or UN2) rank the first and second respectively.Their lag-k correlations, ρ k and ρ k (or non-zero lag-1) respectively, drop to zero across 'lag' at a rate which depends more on the data.For data of relatively longer time series with stronger serial correlation, AR1 and EC are the first and second choices of robust covariance structure.They are exactly identical to those of shorter time series with weaker serial correlation.Their lag-k correlations, ρ k and ρ respectively, are simple with less parameters and non-zero with constant (ρ) or consistent (ρ k ) correlations between pairs of observations.On the other hand, for data with weaker serial correlation, EC is the most robust structure and TOEP2 is also good.The two correlation structures model correlation between pairs of lag-k observations by a constant ρ and a non-zero lag-1 correlation (ρ 1 = 0) respectively.They describe the decline of correlation across 'lag'.Note also that SIM, UN1 and UN2 are usually worse especially for data containing longer time series or with stronger serial correlation because they assume non-constant variances as well as zero or non-constant covariances resulting in a large number of parameters.On the other hand, robust structures generally assume constant variances and consistent covariances between observations of a given time lag.

Conclusion
Results suggest that both the length of time series and the strength of serial correlation affect the choice of robust covariance structure.For data containing relatively shorter time series, AR1 is the choice of the most robust covariance structure irrespective of the strength of serial correlation.AR1 is simple with only two parameters, the constant variance σ 2 and the auto-regressive parameter ρ which describes the strength of serial correlation.Moreover, EC and TOEP are also good choices for data of weak and strong serial correlation respectively.AR1, EC and TOEP all have P v = P c = 1 but with different assumptions on lag-k correlation, namely ρ k , ρ and ρ k respectively.For data of longer time series, modeling the covariance structure becomes more difficult because of the possibility of more complicate covariance structures.Simulation experiments show that EC is the most robust choice irrespective of the strength of serial correlation.Moreover, AR1 and TOEP2 are also good choices for data of strong and weak serial correlation respectively.The three chosen structures, namely EC, AR1 and TOEP2 are similar to those of short time series except that TOEP is replaced by TOEP2 with less parameters.In general, robust covariance structures for data containing longer time series have constant variances and covariances for observations of equal time lag and zero covariances for observations of higher time lag.
glucose tests and three one-hour-post-fasting blood glucose tests.Outcomes are differences in blood glucose concentration during fasting and one hour post fasting.Measurements are in mg/100 ml.Variables in the analyses include Dependent variable: Y : the difference between their fasting and one hour post blood glucose level, Independent variables: X 1 : the fasting blood glucose level, X 2 : the indicator of pregnancy.1.Data of pregnant and non-pregnant states women Data from both pregnant and non-pregnant states women are used in the analysis of short time series.It contains 315 (3 × (53 + 52)) observations coming from M = 105 subjects each repeatedly measured N = 3 times.For the 53 non-pregnant women, only the first three fasting and one-hour-postfasting blood glucose levels are used in the analysis.The overall means of Y and X 1 are respectively -21.72 and 76.08 whereas the corresponding means for the pregnant group are -34.94 and 72.88 and for the non-pregnant group are -8.74 and 79.21.The generalized regression model has P b = 2.We use the PROC MIXED in SAS to fit the generalized regression models with 7 different covariance matrices.Table

Table 1 :
The seven types of the covariance matrix

Table 2 :
Parameter estimates for models with different covariance structures for the blood glucose data (N = 3 with an original low level and an adjusted high level of ρ)

Table 3 :
Ranks of the three criteria for the blood glucose data (N = 3)

Table 4 :
Parameter estimates for models with different covariance structures for the blood glucose data (N = 6 with an original low level of ρ)

Table 5 :
Ranks of the three criteria for the blood glucose data (N = 6)

Table 6 (
continued): Parameter estimates for models with different covariance structures for the citrate concentration data (N = 14 with an original low level and an adjusted high level of ρ)

Table 7 :
Ranks of the three criteria for the citrate concentration data(N = 14)

Table 8 :
Summary of choices for robust covariance structure cross-classified by the length of time series and strength of serial correlation