FUNCTIONAL VARYING COEFFICIENT MODEL WITH TIME-INDEPENDENT COVARIATE AND LONGITUDINAL RESPONSE

: In this paper, we consider functional varying coefficient model in present of a time invariant covariate for sparse longitudinal data contaminated with some measurement errors. We propose a regularization method to estimate the slope function based on a reproducing kernel Hilbert space approach. As we will see, our procedure is easy to implement. Our simulation results show that the procedure performs well, especially when either sampling frequency or sample size increases. Applications of our method are illustrated in an analysis of a longitudinal CD4+ count dataset from an HIV study.


Introduction
In functional data, unlike multivariate data, the observations are naturally curves. In fact they are independent and identically distributed realizations of a stochastic process. See Silverman (2002, 2005) for an overview of methods and applications. See also Ferraty and Vieu (2006) and, Horváth and Kokoszka (2012). In many experiments realizations of involved trajectories are not directly observable. Instead, the observed data are obtained at discrete location points. These type of data are usually sparsely and irregularly sampled on random time points and are noise-contaminated. The aforementioned situation often occurs in many longitudinal experiments, for example in the most of biological, biomedical and medical studies.
Varying coefficient models which are extension of parametric regression models, became popular after the works of Cleveland et al. (1991) and, Hastie and Tibshirani (1993). These models have been extensionally studied in the literature. Most of existing approach are based on polynomial spline, smoothing splines and local polynomial smoothing. See for example Hoover et al. (1998), Wu et al. (1998), Kauermann and Tutz (1999), Chiang et al. (2001), Wu and Chiang (2000) and Huang et al. (2002Huang et al. ( , 2004. See Fan and Zhang (2008) to review the collection of approaches and applications up-to that date. In the case of sparse designs and noisy measurement, we refer the readers to Sentürk and Müller (2008) In this paper, we consider the following functional varying coefficient model where ⊂ ℝ is a compact set, ( ) is a square integrable response with mean function ( ) and is a random variable with finite variance 2 . Also, 0 ( ) and 0 ( ) are smoothed intercept and slope functions, respectively, and ( ) is a mean zero random processes, independent of . We want to estimate 0 ( ) and 0 ( ) in the situation that the observations are sparse and irregular longitudinal data and contaminated with some measurement errors. Let denote the observations of the random function at the random times , contaminated with measurement errors which are assumed to be independent and identically distributed with means zero and finite variance, and independent of random function . Then the model that we consider is The requirement that the number of time points in each curve are equal is not essential and is placed for easy illustration of the methodology.
In the present paper, we use a reproducing kernel Hilbert space (RKHS) framework for estimating the slope function. We assume that the slope function 0 resides in an RKHS, a subspace of the collection of square integrable functions on . Then we investigate the method of regularization for estimating 0 . By obtaining an estimate of 0 , say ̂0 , we estimate the intercept function 0 via The paper is organized as follows. In section 2, we introduce our methodology for estimating the slope function 0 . Section 3 presents results of some simulation studies. An application to a longitudinal CD4+ count dataset are demonstrated in section 4. Finally, concluding remarks and possible extensions of our work are presented in Section 5.

Estimation Method
In this section, we introduce a regularization method for estimating the slope function. First, we present the general definitions and common properties of RKHS required in this paper. Verification of these results as well as more detailed discussions of RKHS theory can be found, for example, in Aronszajn (1950).
Let ℋ be a Hilbert space of functions on some set and denote by 〈⋅,⋅〉 ℋ the inner product in ℋ. A bivariate function on × is said to be a reproducing kernel for ℋ if (i) (⋅, ) ∈ ℋ for every ∈ , and (ii) ( ) = 〈 , (⋅, )〉 ℋ for every ∈ and ∈ ℋ. When (i) and (ii) hold, ℋ is said to be a reproducing kernel Hilbert space (RKHS) with reproducing kernel . Property (ii) is called the reproducing property. If is a reproducing kernel of ℋ then, it can be shown that, is the unique reproducing kernel and is a symmetric and positive definite function: that is, ( , ) = ( , ) for , ∈ and for any real 1 , … , and 1 , … , ∈ , Conversely, if is a symmetric and positive definite function on × , a unique RHKS of functions on with as the reproducing kernel can be constructed. The notation ℋ( ) will be used to denote the RKHS having the reproducing kernel .
Assume that the reproducing kernel is square integrable, that is, Then Mercer's theorem (see Riesz and Sz.-Nagy 1955) states that admits the following eigenvalue decomposition: where 1 , 2 , … are constants and 1 , 2 , … are orthonormal basis for 2 ( ), that is where is Kronecker's delta. From Lemma 1.1.1 in Wahba (1990), we know that any squared integrable function on belongs to ℋ( ) if and only if where = 1,2, … for = 〈 , 〉 2 ( ) . Now, consider functional varying coefficient model (1). We investigate the method of regularization for estimating 0 . We assume that the slope function 0 resides in an RKHS ℋ( ), a subspace of the collection of square integrable functions on . The method of regularization estimates 0 by measures how well fits the data and ∥ ∥ ℋ( ) 2 is an RKHS norm that measures smoothness of , and ≥ 0 is a tuning parameter that balances the fidelity to the data and the smoothness.

Remark 1 In general, ℓ is chosen such that it is convex in and
Define dimensional vectors and respectively as = is the dimensional vector with all elements equal to 1 and = [ 11 , … , 1 , 21 , … , ] ′ . Suppose be an × matrix such that (the th column of ) = (the th column of ) ∘ ( ∘ )), = 1,2, … , where ∘ is the Hadamard product of two matrices and . In the following theorem, we give solution of the minimization problem (5). Theorem 1 The function ̂ minimizing the regularized empirical error over ∈ ℋ( ), may be expressed as where = [ 11 , … , 1 , 21 , … , ] ′ is the unique solution of the well-posed linear system in ℝ For any ≥ 1, we have If is a minimizer of , then for each , we must have . Thus, where the last equality follows from (3). Replacing ( ) in the definition of above to obtain By multiplying both sides in and writing the results in matrix form, we obtain ( + ) = ∘ . This system is well-posed since is positive definite and the addition of a positive definite matrix and identity matrix is strictly positive definite.

Simulation Studies
To demonstrate the performance of the proposed estimator, we carried out a set of simulation studies with different combinations of sampling frequency and sample size. Let = [0,1]. The true slope function 0 is fixed as It is clear that, 0 ∈ 2 2 where 2 is the th order Sobolev-Hilbert space:    We estimate based on these generated data according to Theorem 1. The integrated squared error, ∥̂− 0 ∥ 2 [0,1] 2 = ∫ 1 0 (̂( ) − 0 ( )) 2 , as a function of the tuning parameter is given in Figure 2. The smoothing parameter is set to yield the smallest integrated squared error and therefore reflect the best performance of the estimating procedure.
The next numerical experiment intends to demonstrate the effect of sample size and sampling frequency . We repeat the experiment with varying combinations of = 25, 50, 100 or 200, and = 1, 3 or 5. The obtained value of the smoothing parameter, for each simulated data set, is reported in Table 1. As we see in the Table, the value of smoothing parameters decreases whenever either or increases. That is because, when number of observations increase we just need a little smoothing. By 200 runs replications the average of integrated squared errors and variance of ̂ for each of the settings are reported in Table 2. It can be seen from the table that increasing either or leads to improved estimates, whereas such improvement is more visible for small values of . In addition, the average integrated squared error and the variance of estimated slope function decrease as either or increases.

Application
CD4+ cells are a type of white blood cell that fights infection. CD4+ cells move throughout body, helping to identify and destroy germs such as bacteria and viruses. It is a well known fact that the human immune deficiency virus (HIV) causes AIDS by attacking CD4+ cells. HIV infects components of CD4+ cells. It directly and indirectly destroys CD4+ cells. The CD4+ count measures the number of CD4+ cells in a sample of blood. CD4+ counts are reported as the number of cells in a cubic millimeter of blood. A normal CD4+ count is around 1100 cells per cubic millimetre of blood. In general, HIV disease is progressing if the CD4+ count is going down. This means the immune system is getting weaker and HIV infected persons are more likely to get sick. In some persons, CD4+ counts can drop dramatically, even going down to zero.
The dataset considered here is from the Multicenter AIDS Cohort Study. The data contains CD4+ cell counts for a total of 369 infected men along with other variables (see Kaslow et al, 1987). In this study, the patients were scheduled to have their measurements made twice a year, however some subjects missed some of their scheduled visits. So, the actual times of measurement is random, irregular and sparse. The number of measurements for each individual varies from 1 to 12 yielding a total of 2376 records.
The objective here is to evaluate the effects of centred age at HIV infection on CD4+ counts based on model (1). In this dataset, centred age at HIV infection is time-independent covariate variable and CD4+ counts are considered as response functions ( ) of time since seroconversion (time when HIV becomes detectable). Figure 3 displays individual trajectory of CD4+ count for the included 369 subjects along with the smooth estimated mean function. As we see in the figure, mean of CD+ counts decreases to below 1100 cells when time runs up. We calculate the smoothing parameter of ̂ using the following -fold cross-validation. The 369 subjects is randomly partitioned into roughly equal part. Of the parts, a single part is retained as the validation data for testing the model, and the remaining − 1 parts are used as training data. For each = 1,2, … , , we fit the model with parameter to the other − 1 parts, giving ̂− . Then we compute its error in predicting the th part as follows: The cross-validation error is defined by We compute CV(λ) for many values of λ and choose the value of λ that makes CV(λ) smallest. The estimated smoothing parameter λ= 0.1 was obtained by 5-fold cross-validation.
The estimated slope function, ̂ is shown in the left panel of Figure 4. The right panel of Figure 4 shows the estimated mean surface [ ( )| ]. The estimated slope function, ̂ initially increases and then after seroconversion decreases rapidly. Note that the surface of Figure 4 shows that the age at HIV infection has significant effect on the mean of CD4+ cell numbers. From this surface, we observe that for older peoples the means of CD4+ cell numbers are increasing function before seroconversion and decreasing after that. In contrast, for younger peoples these means function decrease until one year after seroconversion and then increase slowly. But for middle age the means of CD4+ cell numbers fluctuate around 1000 cells before seroconversion and during the second year after seroconversion decrease rapidly, and then increase slowly. In general, for older people, when time runs up, HIV infections spread faster. On the other hand, the situation for younger people before seroconversion is normal.

Discussions
We have developed a regularization method for slope function estimation in functional varying coefficient model with longitudinal response data and scalar covariate. Since longitudinal data arise in many fields of science, the proposed method can be applicable in these fields. We note that although the methodology is proposed for irregular and sparsely sampled functional data, such as longitudinal data, it can be also applied for regularly and densely sampled functional data. Our simulation study shows that the procedure performs well. Also the estimator of the slope function introduced in closed form that makes it computationally very tractable.
The theoretical properties of the proposed estimator can be studied. For example, one can study convergence rate and asymptotically distribution of the slope function estimator. On the other hand, the existence of only one covariate is the limitation of our proposed model, that can be resolved by extension the model to the case with combination of longitudinal and scaler covariates. These ideas will be explored in future works.